Hi all,
I ran a GO enrichment analysis with clusterProfiler's enrichGO. My input is the results table from DESeq2, together with the Homo sapiens OrgDb:
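library(dplyr)
library(clusterProfiler)
library(org.Hs.eg.db)

sigLevel <- 0.05
univ <- Results %>% pull(ENSEMBL)    # background: all tested genes
geneset <- Results %>%
  filter(padj <= sigLevel & log2FoldChange >= 2) %>%    # significant, strongly up-regulated
  pull(ENSEMBL)
ggo <- enrichGO(gene = geneset,
                universe = univ,
                OrgDb = org.Hs.eg.db,
                keyType = "ENSEMBL",
                ont = "BP",
                pvalueCutoff = sigLevel)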
This works nicely, but when I check the output table from the enrichResult instance, I find some inconsistencies. For example, one of the enriched GO terms is:
| ID | Description | GeneRatio | BgRatio | geneID | Count |
|---|---|---|---|---|---|
| GO:0090148 | membrane fission | 2/75 | 11/14925 | ENSG00000183486/ENSG00000157601 | 2 |
However, when I look up the two genes, either in the org.Hs.eg.db dataset or on the ENSEMBL site, neither of them is actually associated with this GO term. I see the same thing for several of the enriched terms, and I don't understand how this can happen or what I need to do to fix it.
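For reference, the lookup described above can be sketched roughly as follows. This is a minimal sketch, not part of the original analysis: the variable names (`genes`, `term`, `direct`, `all_go`, `offspring`) are made up for illustration, and it assumes that comparing the OrgDb's direct annotations (the `GO` column) with its annotations that also include less specific, inherited terms (the `GOALL` column) is relevant here. If a gene is annotated only to a more specific child of a term, it appears under `GOALL` for that term but not under `GO`.

```r
library(AnnotationDbi)
library(org.Hs.eg.db)
library(GO.db)

# The two genes and the GO term from the table above
genes <- c("ENSG00000183486", "ENSG00000157601")
term  <- "GO:0090148"

# Direct annotations only
direct <- AnnotationDbi::select(org.Hs.eg.db,
                                keys = genes,
                                keytype = "ENSEMBL",
                                columns = c("GO", "ONTOLOGY"))

# Direct annotations plus less specific (ancestor) terms
all_go <- AnnotationDbi::select(org.Hs.eg.db,
                                keys = genes,
                                keytype = "ENSEMBL",
                                columns = c("GOALL", "ONTOLOGYALL"))

# Is the term attached directly, or only via inheritance?
in_direct <- term %in% direct$GO
in_all    <- term %in% all_go$GOALL

# More specific BP terms below the term of interest, and which of
# them the genes are directly annotated to (if any)
offspring <- as.list(GO.db::GOBPOFFSPRING[term])[[1]]
direct[direct$GO %in% offspring, c("ENSEMBL", "GO")]
```

If `in_direct` is FALSE while `in_all` is TRUE, the genes reach the term through one of its child terms rather than through a direct annotation.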
Can anyone help?
Thanks a ton!
My session:
> sessionInfo()
R version 3.6.2 (2019-12-12)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1
Matrix products: default
locale:
[1] LC_COLLATE=Dutch_Netherlands.1252 LC_CTYPE=Dutch_Netherlands.1252 LC_MONETARY=Dutch_Netherlands.1252 LC_NUMERIC=C
[5] LC_TIME=Dutch_Netherlands.1252
attached base packages:
[1] parallel stats4 stats graphics grDevices utils datasets methods base
other attached packages:
[1] clusterProfiler_3.14.3 org.Hs.eg.db_3.10.0 AnnotationDbi_1.48.0 RColorBrewer_1.1-2 pheatmap_1.0.12
[6] DESeq2_1.26.0 SummarizedExperiment_1.16.1 DelayedArray_0.12.2 BiocParallel_1.20.1 matrixStats_0.55.0
[11] Biobase_2.46.0 GenomicRanges_1.38.0 GenomeInfoDb_1.22.0 IRanges_2.20.2 S4Vectors_0.24.3
[16] BiocGenerics_0.32.0 forcats_0.4.0 stringr_1.4.0 dplyr_0.8.3 purrr_0.3.3
[21] readr_1.3.1 tidyr_1.0.0 tibble_2.1.3 ggplot2_3.2.1 tidyverse_1.3.0
loaded via a namespace (and not attached):
[1] readxl_1.3.1 backports_1.1.5 Hmisc_4.3-0 fastmatch_1.1-0 plyr_1.8.5 igraph_1.2.4.2
[7] lazyeval_0.2.2 splines_3.6.2 urltools_1.7.3 digest_0.6.23 htmltools_0.4.0 GOSemSim_2.12.0
[13] viridis_0.5.1 GO.db_3.7.0 fansi_0.4.1 magrittr_1.5 checkmate_1.9.4 memoise_1.1.0
[19] cluster_2.1.0 annotate_1.64.0 graphlayouts_0.5.0 modelr_0.1.5 enrichplot_1.6.1 prettyunits_1.1.1
[25] jpeg_0.1-8.1 colorspace_1.4-1 blob_1.2.1 rvest_0.3.5 ggrepel_0.8.1 haven_2.2.0
[31] xfun_0.12 crayon_1.3.4 RCurl_1.98-1.1 jsonlite_1.6 genefilter_1.68.0 zeallot_0.1.0
[37] survival_3.1-8 glue_1.3.1 polyclip_1.10-0 gtable_0.3.0 zlibbioc_1.32.0 XVector_0.26.0
[43] scales_1.1.0 DOSE_3.12.0 DBI_1.1.0 Rcpp_1.0.3 viridisLite_0.3.0 xtable_1.8-4
[49] progress_1.2.2 htmlTable_1.13.3 gridGraphics_0.4-1 foreign_0.8-75 bit_1.1-15.1 europepmc_0.3
[55] Formula_1.2-3 htmlwidgets_1.5.1 httr_1.4.1 fgsea_1.12.0 ellipsis_0.3.0 acepack_1.4.1
[61] pkgconfig_2.0.3 XML_3.99-0.3 farver_2.0.3 nnet_7.3-12 dbplyr_1.4.2 locfit_1.5-9.1
[67] labeling_0.3 ggplotify_0.0.4 tidyselect_0.2.5 rlang_0.4.2 reshape2_1.4.3 munsell_0.5.0
[73] cellranger_1.1.0 tools_3.6.2 cli_2.0.1 generics_0.0.2 RSQLite_2.2.0 ggridges_0.5.2
[79] broom_0.5.3 knitr_1.27 bit64_0.9-7 fs_1.3.1 tidygraph_1.1.2 ggraph_2.0.0
[85] nlme_3.1-143 DO.db_2.9 xml2_1.2.2 compiler_3.6.2 rstudioapi_0.10 png_0.1-7
[91] reprex_0.3.0 tweenr_1.0.1 geneplotter_1.64.0 stringi_1.4.4 lattice_0.20-38 Matrix_1.2-18
[97] vctrs_0.2.1 pillar_1.4.3 lifecycle_0.1.0 BiocManager_1.30.10 triebeard_0.3.0 data.table_1.12.8
[103] cowplot_1.0.0 bitops_1.0-6 qvalue_2.18.0 R6_2.4.1 latticeExtra_0.6-29 gridExtra_2.3
[109] MASS_7.3-51.5 assertthat_0.2.1 withr_2.1.2 GenomeInfoDbData_1.2.0 hms_0.5.3 grid_3.6.2
[115] rpart_4.1-15 rvcheck_0.1.7 ggforce_0.3.1 lubridate_1.7.4 base64enc_0.1-3
Dear James, thanks a lot! I was really confused, but that does make sense.