Dear Bioconductor users and package maintainer Guangchuang,
I was running a GO enrichment analysis on a set of DEGs using enrichGO function and simplify function from the R/Bioconductor package clusterProfiler and got an error message when the argument "ALL" was used in enrichGO. The simplify function works fine if the individual GO subcategory ("BP", "MF", or "CC") was used . I tried to reproduce the error with a builtin data set (gsSample) and got the same error (please see below). I was wondering if anyone has the same experience or I did anything wrong. Thanks for the help!
Best,
Lei
> library(clusterProfiler)
Loading required package: DOSE
DOSE v3.4.0 For help: https://guangchuangyu.github.io/DOSE
If you use DOSE in published research, please cite:
Guangchuang Yu, Li-Gen Wang, Guang-Rong Yan, Qing-Yu He. DOSE: an R/Bioconductor package for Disease Ontology Semantic and Enrichment analysis. Bioinformatics 2015, 31(4):608-609
clusterProfiler v3.6.0 For help: https://guangchuangyu.github.io/clusterProfiler
If you use clusterProfiler in published research, please cite:
Guangchuang Yu., Li-Gen Wang, Yanyan Han, Qing-Yu He. clusterProfiler: an R package for comparing biological themes among gene clusters. OMICS: A Journal of Integrative Biology. 2012, 16(5):284-287.
> library(org.Hs.eg.db)
Loading required package: AnnotationDbi
Loading required package: stats4
Loading required package: BiocGenerics
Loading required package: parallel
Attaching package: ‘BiocGenerics’
The following objects are masked from ‘package:parallel’:
clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
clusterExport, clusterMap, parApply, parCapply, parLapply,
parLapplyLB, parRapply, parSapply, parSapplyLB
The following objects are masked from ‘package:stats’:
IQR, mad, sd, var, xtabs
The following objects are masked from ‘package:base’:
anyDuplicated, append, as.data.frame, cbind, colMeans, colnames,
colSums, do.call, duplicated, eval, evalq, Filter, Find, get, grep,
grepl, intersect, is.unsorted, lapply, lengths, Map, mapply, match,
mget, order, paste, pmax, pmax.int, pmin, pmin.int, Position, rank,
rbind, Reduce, rowMeans, rownames, rowSums, sapply, setdiff, sort,
table, tapply, union, unique, unsplit, which, which.max, which.min
Loading required package: Biobase
Welcome to Bioconductor
Vignettes contain introductory material; view with
'browseVignettes()'. To cite Bioconductor, see
'citation("Biobase")', and for packages 'citation("pkgname")'.
Loading required package: IRanges
Loading required package: S4Vectors
Attaching package: ‘S4Vectors’
The following object is masked from ‘package:base’:
expand.grid
>
> data(gcSample)
>
> # use only BP
> x <- enrichGO(gcSample[[2]], ont = 'BP', OrgDb = 'org.Hs.eg.db')
> y <- simplify(x, measure = 'Wang', semData = NULL)
>
> # use ALL
> x <- enrichGO(gcSample[[2]], ont = 'ALL', OrgDb = 'org.Hs.eg.db')
> y <- simplify(x, measure = 'Wang', semData = NULL)
Error in .local(x, ...) :
simplify only applied to output from enrichGO...
>
> # session info
> sessionInfo()
R version 3.4.3 (2017-11-30)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS Sierra 10.12.6
Matrix products: default
BLAS: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRlapack.dylib
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] parallel stats4 stats graphics grDevices utils datasets
[8] methods base
other attached packages:
[1] org.Hs.eg.db_3.5.0 AnnotationDbi_1.40.0 IRanges_2.12.0
[4] S4Vectors_0.16.0 Biobase_2.38.0 BiocGenerics_0.24.0
[7] clusterProfiler_3.6.0 DOSE_3.4.0
loaded via a namespace (and not attached):
[1] Rcpp_0.12.14 plyr_1.8.4 compiler_3.4.3
[4] tools_3.4.3 digest_0.6.13 bit_1.1-12
[7] lubridate_1.7.1 RSQLite_2.0 memoise_1.1.0
[10] tibble_1.3.4 gtable_0.2.0 pkgconfig_2.0.1
[13] rlang_0.1.4 igraph_1.1.2 fastmatch_1.1-0
[16] DBI_0.7 rvcheck_0.0.9 gridExtra_2.3
[19] fgsea_1.4.0 stringr_1.2.0 tidyselect_0.2.3
[22] autoinst_0.0.0.9000 bit64_0.9-7 grid_3.4.3
[25] glue_1.2.0 qvalue_2.10.0 data.table_1.10.4-3
[28] BiocParallel_1.12.0 GOSemSim_2.4.0 rematch2_2.0.1
[31] purrr_0.2.4 tidyr_0.7.2 reshape2_1.4.3
[34] GO.db_3.5.0 DO.db_2.9 ggplot2_2.2.1
[37] blob_1.1.0 magrittr_1.5 splines_3.4.3
[40] scales_0.5.0 colorspace_1.3-2 stringi_1.1.6
[43] lazyeval_0.2.1 munsell_0.4.3
>
FWIW this problem similarly "stumped" me for a little while.... I do hope it is resolved in the future... tx
sorry, I post my question here, maybe not related to the topic. Now I'm doing enrichment using clusterprofiler and WebGestalR. Here, x <- unique(unlist(as.list(org.Bt.egGO2ALLEGS))), x is 5586. And clusterprofiler is based on GO.db. The following is using WebGestalR: enrichD_BP <- loadGeneSet(organism = "btaurus",enrichDatabase = "geneontology_Biological_Process_noRedundant") geneSet_BP <- enrichD_BP$geneSet length(unique(geneSet_BP$gene))#9011
enrichD_CC <- loadGeneSet(organism = "btaurus",enrichDatabase = "geneontology_Cellular_Component_noRedundant") geneSet_CC <- enrichD_CC$geneSet length(unique(geneSet_CC$gene))#6224
enrichD_MF <- loadGeneSet(organism = "btaurus",enrichDatabase = "geneontology_Molecular_Function_noRedundant") geneSet_MF <- enrichD_MF$geneSet length(unique(geneSet_MF$gene))#7960
we can see WebGestalR has more gene set than clusterprofiler. So which one is the most up to date and same with online GO database? But how to get the gene set from online GO database. This question has puzzled me these day. Becasue I think they two are powerful and should have the same results