I have been using edgeR and limma for years, but two weeks ago I began encountering an issue with limma’s kegga() function that I haven’t seen before and haven’t been able to resolve.
Every entry in the ‘Pathway’ column of the data frame returned by this function is NA, even though the row names of that data frame show the correct pathway IDs, e.g. path:mmu00010, path:mmu00220, etc.
Here is an example where results_all is the output from edgeR::topTags on a glmQLFTest object:
> ensembl=useMart("ensembl")
> ensembl=useDataset("mmusculus_gene_ensembl", mart=ensembl)
> genes.with.id <- getBM(attributes = c("external_gene_name", "entrezgene_id"), filters = "external_gene_name", values = rownames(results_all), mart = ensembl)
> genes <- results_all[results_all$table$PValue <= 0.05,]
> up_de <- rownames(genes[genes$table$logFC > 0,])
> up_de.id <- getBM(attributes = c("external_gene_name", "entrezgene_id"), filters ="external_gene_name", values = up_de, mart = ensembl)
> keg_up <- kegga(up_de.id$entrezgene_id, universe = genes.with.id$entrezgene_id, coef = ncol(results_all), species = "Mm")
> topKEGG(keg_up, n=10)
              Pathway    N DE         P.DE
path:mmu04260    <NA>   70 11 3.692187e-06
path:mmu00280    <NA>   46  9 4.405234e-06
path:mmu04261    <NA>  130 14 1.736579e-05
path:mmu00250    <NA>   26  6 6.808361e-05
path:mmu01100    <NA> 1196 55 1.829979e-04
path:mmu05415    <NA>  181 15 1.846285e-04
path:mmu01230    <NA>   59  8 2.306363e-04
path:mmu00220    <NA>   13  4 3.604337e-04
path:mmu05414    <NA>   82  9 4.807895e-04
path:mmu01210    <NA>   15  4 6.580423e-04
This now occurs even when I re-analyse a set of results from several months ago that previously worked correctly. The goana() function does not have the same problem. I initially suspected that the issue was with the KEGG database, but getKEGGPathwayNames() appears to work (see the chunk below). Is anyone able to replicate this behaviour, or can anyone suggest why it is happening?
> head(getKEGGPathwayNames("mmu"), 10)
PathwayID Description
1 mmu01100 Metabolic pathways - Mus musculus (house mouse)
2 mmu01200 Carbon metabolism - Mus musculus (house mouse)
3 mmu01210 2-Oxocarboxylic acid metabolism - Mus musculus (house mouse)
4 mmu01212 Fatty acid metabolism - Mus musculus (house mouse)
5 mmu01230 Biosynthesis of amino acids - Mus musculus (house mouse)
6 mmu01232 Nucleotide metabolism - Mus musculus (house mouse)
7 mmu01250 Biosynthesis of nucleotide sugars - Mus musculus (house mouse)
8 mmu01240 Biosynthesis of cofactors - Mus musculus (house mouse)
9 mmu00010 Glycolysis / Gluconeogenesis - Mus musculus (house mouse)
10 mmu00020 Citrate cycle (TCA cycle) - Mus musculus (house mouse)
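Comparing the two outputs above, the kegga() row names carry a "path:" prefix while the PathwayID values from getKEGGPathwayNames() do not. Assuming that this mismatch is what leaves the Pathway column as NA (an assumption drawn only from the outputs shown here, not confirmed against the limma source), a minimal workaround sketch is to strip the prefix and look the names up by hand. The helper name fill_pathway_names is hypothetical, and the toy data frames simply copy two rows from the outputs above.

```r
# Fill the Pathway column of a kegga()-style result from a table of pathway names.
# keg:           data frame whose row names are pathway IDs, possibly prefixed "path:"
# pathway_names: data frame with columns PathwayID and Description
# (fill_pathway_names is a hypothetical helper, not part of limma.)
fill_pathway_names <- function(keg, pathway_names) {
  # Drop the "path:" prefix so the IDs match the PathwayID column
  ids <- sub("^path:", "", rownames(keg))
  keg$Pathway <- pathway_names$Description[match(ids, pathway_names$PathwayID)]
  keg
}

# Toy inputs copied from the outputs shown above
keg <- data.frame(
  Pathway = NA_character_,
  N = c(1196, 15),
  DE = c(55, 4),
  row.names = c("path:mmu01100", "path:mmu01210")
)
pathway_names <- data.frame(
  PathwayID = c("mmu01100", "mmu01210"),
  Description = c(
    "Metabolic pathways - Mus musculus (house mouse)",
    "2-Oxocarboxylic acid metabolism - Mus musculus (house mouse)"
  )
)

fixed <- fill_pathway_names(keg, pathway_names)
print(fixed)
```

With the real objects, the same helper would presumably be applied as fill_pathway_names(keg_up, getKEGGPathwayNames("mmu")) before calling topKEGG(). This only patches the symptom in one session.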
Thank you in advance and here is my sessionInfo().
> sessionInfo()
R version 4.2.2 (2022-10-31 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19044)
Matrix products: default
locale:
[1] LC_COLLATE=English_Australia.utf8 LC_CTYPE=English_Australia.utf8 LC_MONETARY=English_Australia.utf8
[4] LC_NUMERIC=C LC_TIME=English_Australia.utf8
attached base packages:
[1] grid stats graphics grDevices utils datasets methods base
other attached packages:
[1] gridExtra_2.3 Glimma_2.6.0 biomaRt_2.52.0 Rcpp_1.0.10 ReactomePA_1.40.0 ComplexHeatmap_2.12.1
[7] circlize_0.4.15 rmarkdown_2.20 htmlTable_2.4.1 RColorBrewer_1.1-3 reshape2_1.4.4 ggplot2_3.4.1
[13] gplots_3.1.3 edgeR_3.38.4 limma_3.52.4 statmod_1.5.0
loaded via a namespace (and not attached):
[1] shadowtext_0.1.2 backports_1.4.1 fastmatch_1.1-3 BiocFileCache_2.4.0
[5] plyr_1.8.8 igraph_1.4.1 lazyeval_0.2.2 splines_4.2.2
[9] BiocParallel_1.30.4 GenomeInfoDb_1.32.4 digest_0.6.31 foreach_1.5.2
[13] yulab.utils_0.0.6 htmltools_0.5.4 GOSemSim_2.22.0 viridis_0.6.2
[17] GO.db_3.15.0 fansi_1.0.4 magrittr_2.0.3 checkmate_2.1.0
[21] memoise_2.0.1 cluster_2.1.4 doParallel_1.0.17 annotate_1.74.0
[25] Biostrings_2.64.1 graphlayouts_0.8.4 matrixStats_0.63.0 prettyunits_1.1.1
[29] enrichplot_1.16.2 colorspace_2.1-0 blob_1.2.4 rappdirs_0.3.3
[33] ggrepel_0.9.3 xfun_0.37 dplyr_1.1.0 crayon_1.5.2
[37] RCurl_1.98-1.10 jsonlite_1.8.4 graph_1.74.0 scatterpie_0.1.8
[41] genefilter_1.78.0 survival_3.4-0 iterators_1.0.14 ape_5.7-1
[45] glue_1.6.2 polyclip_1.10-4 gtable_0.3.3 zlibbioc_1.42.0
[49] XVector_0.36.0 DelayedArray_0.22.0 GetoptLong_1.0.5 graphite_1.42.0
[53] shape_1.4.6 BiocGenerics_0.42.0 scales_1.2.1 DOSE_3.22.1
[57] DBI_1.1.3 xtable_1.8-4 progress_1.2.2 viridisLite_0.4.1
[61] clue_0.3-64 gridGraphics_0.5-1 tidytree_0.4.2 bit_4.0.5
[65] reactome.db_1.81.0 stats4_4.2.2 htmlwidgets_1.6.2 httr_1.4.5
[69] fgsea_1.22.0 XML_3.99-0.14 pkgconfig_2.0.3 farver_2.1.1
[73] dbplyr_2.3.2 locfit_1.5-9.7 utf8_1.2.3 ggplotify_0.1.0
[77] tidyselect_1.2.0 rlang_1.1.0 AnnotationDbi_1.58.0 munsell_0.5.0
[81] tools_4.2.2 cachem_1.0.7 cli_3.6.0 generics_0.1.3
[85] RSQLite_2.3.0 evaluate_0.20 stringr_1.5.0 fastmap_1.1.1
[89] yaml_2.3.7 ggtree_3.4.4 knitr_1.42 bit64_4.0.5
[93] tidygraph_1.2.3 caTools_1.18.2 purrr_1.0.1 KEGGREST_1.36.3
[97] ggraph_2.1.0 nlme_3.1-160 aplot_0.1.10 xml2_1.3.3
[101] DO.db_2.9 compiler_4.2.2 rstudioapi_0.14 filelock_1.0.2
[105] curl_5.0.0 png_0.1-8 treeio_1.20.2 geneplotter_1.74.0
[109] tibble_3.2.1 tweenr_2.0.2 stringi_1.7.12 lattice_0.20-45
[113] Matrix_1.5-3 vctrs_0.6.0 pillar_1.8.1 lifecycle_1.0.3
[117] GlobalOptions_0.1.2 data.table_1.14.8 bitops_1.0-7 GenomicRanges_1.48.0
[121] patchwork_1.1.2 qvalue_2.28.0 R6_2.5.1 KernSmooth_2.23-20
[125] IRanges_2.30.1 codetools_0.2-18 MASS_7.3-58.1 gtools_3.9.4
[129] SummarizedExperiment_1.26.1 DESeq2_1.36.0 rjson_0.2.21 withr_2.5.0
[133] S4Vectors_0.34.0 GenomeInfoDbData_1.2.8 hms_1.1.3 parallel_4.2.2
[137] ggfun_0.0.9 tidyr_1.3.0 MatrixGenerics_1.8.1 ggforce_0.4.1
[141] Biobase_2.56.0
Thank you Gordon for the quick response. I saw something along those lines in the NEWS notes so did try
But that didn't resolve the problem, which is why I made the thread. There must have been some other conflict going on, because I have now resolved the issue.
For anyone else with the same problem, my only solution was to completely reinstall R and all packages. I am now getting the expected data frame result full of pathway names.
Cheers!
Installing limma didn't solve the problem because you were using Bioconductor 3.15 rather than the current release, which is Bioconductor 3.16. So when you installed limma, you got the Bioconductor 3.15 version again. You only needed to upgrade to Bioconductor 3.16; there was no need to reinstall R.
Also, there is no need for force = TRUE. You only use that to install an older version of a package.
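For anyone following along, the upgrade described above can be sketched as follows, assuming BiocManager is the installer in use. The helper names needs_bioc_upgrade and upgrade_bioc are hypothetical wrappers written for this example; only BiocManager::version() and BiocManager::install(version = ...) are real BiocManager calls.

```r
# TRUE when the installed Bioconductor release is older than the target release.
# (needs_bioc_upgrade and upgrade_bioc are hypothetical helpers for illustration.)
needs_bioc_upgrade <- function(current, target = "3.16") {
  package_version(current) < package_version(target)
}

# Move the whole installation to the target Bioconductor release.
# Nothing is installed until this function is called.
upgrade_bioc <- function(target = "3.16") {
  if (!requireNamespace("BiocManager", quietly = TRUE)) {
    install.packages("BiocManager")
  }
  current <- as.character(BiocManager::version())
  if (needs_bioc_upgrade(current, target)) {
    # Switches the Bioconductor release and offers to update installed packages
    BiocManager::install(version = target)
  }
  invisible(BiocManager::version())
}
```

Calling upgrade_bioc() in a fresh session, then restarting R and checking packageVersion("limma"), should show whether limma now comes from the newer release.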