[limma::kegga] Pathway names all NA
andrewsm • 0
Australia

I have been using edgeR and limma for years but unfortunately two weeks ago I began encountering an issue with limma’s kegga() function that I haven’t seen before and haven’t been able to resolve.

The data frame created by this function is filled with NA values for the ‘Pathway’ column, even though rownames of the resulting data frame show the correct pathways, e.g. path:mmu00010, path:mmu00220, etc.

Here is an example where results_all is the output from edgeR::topTags on a glmQLFTest object:

> ensembl=useMart("ensembl")
> ensembl=useDataset("mmusculus_gene_ensembl", mart=ensembl)

> genes.with.id <- getBM(attributes = c("external_gene_name", "entrezgene_id"), filters = "external_gene_name", values = rownames(results_all), mart = ensembl)
> genes <- results_all[results_all$table$PValue <= 0.05,]
> up_de <- rownames(genes[genes$table$logFC > 0,])
> up_de.id <- getBM(attributes = c("external_gene_name", "entrezgene_id"), filters ="external_gene_name", values = up_de, mart = ensembl)

> keg_up <- kegga(up_de.id$entrezgene_id, universe = genes.with.id$entrezgene_id, coef = ncol(results_all), species = "Mm")
> topKEGG(keg_up, n=10)
              Pathway    N DE         P.DE
path:mmu04260    <NA>   70 11 3.692187e-06
path:mmu00280    <NA>   46  9 4.405234e-06
path:mmu04261    <NA>  130 14 1.736579e-05
path:mmu00250    <NA>   26  6 6.808361e-05
path:mmu01100    <NA> 1196 55 1.829979e-04
path:mmu05415    <NA>  181 15 1.846285e-04
path:mmu01230    <NA>   59  8 2.306363e-04
path:mmu00220    <NA>   13  4 3.604337e-04
path:mmu05414    <NA>   82  9 4.807895e-04
path:mmu01210    <NA>   15  4 6.580423e-04

This now occurs even when I re-analyse a set of results from several months ago that previously worked correctly. The goana() function does not have the same problem. I initially suspected an issue with the KEGG database, but getKEGGPathwayNames() appears to work (see the chunk below). Can anyone replicate this behaviour or suggest why it is happening?

> head(getKEGGPathwayNames("mmu"), 10)
   PathwayID                                                    Description
1   mmu01100                Metabolic pathways - Mus musculus (house mouse)
2   mmu01200                 Carbon metabolism - Mus musculus (house mouse)
3   mmu01210   2-Oxocarboxylic acid metabolism - Mus musculus (house mouse)
4   mmu01212             Fatty acid metabolism - Mus musculus (house mouse)
5   mmu01230       Biosynthesis of amino acids - Mus musculus (house mouse)
6   mmu01232             Nucleotide metabolism - Mus musculus (house mouse)
7   mmu01250 Biosynthesis of nucleotide sugars - Mus musculus (house mouse)
8   mmu01240         Biosynthesis of cofactors - Mus musculus (house mouse)
9   mmu00010      Glycolysis / Gluconeogenesis - Mus musculus (house mouse)
10  mmu00020         Citrate cycle (TCA cycle) - Mus musculus (house mouse)

Thank you in advance and here is my sessionInfo().

> sessionInfo()
R version 4.2.2 (2022-10-31 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19044)

Matrix products: default

locale:
[1] LC_COLLATE=English_Australia.utf8  LC_CTYPE=English_Australia.utf8    LC_MONETARY=English_Australia.utf8
[4] LC_NUMERIC=C                       LC_TIME=English_Australia.utf8    

attached base packages:
[1] grid      stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] gridExtra_2.3         Glimma_2.6.0          biomaRt_2.52.0        Rcpp_1.0.10           ReactomePA_1.40.0     ComplexHeatmap_2.12.1
 [7] circlize_0.4.15       rmarkdown_2.20        htmlTable_2.4.1       RColorBrewer_1.1-3    reshape2_1.4.4        ggplot2_3.4.1        
[13] gplots_3.1.3          edgeR_3.38.4          limma_3.52.4          statmod_1.5.0        

loaded via a namespace (and not attached):
  [1] shadowtext_0.1.2            backports_1.4.1             fastmatch_1.1-3             BiocFileCache_2.4.0        
  [5] plyr_1.8.8                  igraph_1.4.1                lazyeval_0.2.2              splines_4.2.2              
  [9] BiocParallel_1.30.4         GenomeInfoDb_1.32.4         digest_0.6.31               foreach_1.5.2              
 [13] yulab.utils_0.0.6           htmltools_0.5.4             GOSemSim_2.22.0             viridis_0.6.2              
 [17] GO.db_3.15.0                fansi_1.0.4                 magrittr_2.0.3              checkmate_2.1.0            
 [21] memoise_2.0.1               cluster_2.1.4               doParallel_1.0.17           annotate_1.74.0            
 [25] Biostrings_2.64.1           graphlayouts_0.8.4          matrixStats_0.63.0          prettyunits_1.1.1          
 [29] enrichplot_1.16.2           colorspace_2.1-0            blob_1.2.4                  rappdirs_0.3.3             
 [33] ggrepel_0.9.3               xfun_0.37                   dplyr_1.1.0                 crayon_1.5.2               
 [37] RCurl_1.98-1.10             jsonlite_1.8.4              graph_1.74.0                scatterpie_0.1.8           
 [41] genefilter_1.78.0           survival_3.4-0              iterators_1.0.14            ape_5.7-1                  
 [45] glue_1.6.2                  polyclip_1.10-4             gtable_0.3.3                zlibbioc_1.42.0            
 [49] XVector_0.36.0              DelayedArray_0.22.0         GetoptLong_1.0.5            graphite_1.42.0            
 [53] shape_1.4.6                 BiocGenerics_0.42.0         scales_1.2.1                DOSE_3.22.1                
 [57] DBI_1.1.3                   xtable_1.8-4                progress_1.2.2              viridisLite_0.4.1          
 [61] clue_0.3-64                 gridGraphics_0.5-1          tidytree_0.4.2              bit_4.0.5                  
 [65] reactome.db_1.81.0          stats4_4.2.2                htmlwidgets_1.6.2           httr_1.4.5                 
 [69] fgsea_1.22.0                XML_3.99-0.14               pkgconfig_2.0.3             farver_2.1.1               
 [73] dbplyr_2.3.2                locfit_1.5-9.7              utf8_1.2.3                  ggplotify_0.1.0            
 [77] tidyselect_1.2.0            rlang_1.1.0                 AnnotationDbi_1.58.0        munsell_0.5.0              
 [81] tools_4.2.2                 cachem_1.0.7                cli_3.6.0                   generics_0.1.3             
 [85] RSQLite_2.3.0               evaluate_0.20               stringr_1.5.0               fastmap_1.1.1              
 [89] yaml_2.3.7                  ggtree_3.4.4                knitr_1.42                  bit64_4.0.5                
 [93] tidygraph_1.2.3             caTools_1.18.2              purrr_1.0.1                 KEGGREST_1.36.3            
 [97] ggraph_2.1.0                nlme_3.1-160                aplot_0.1.10                xml2_1.3.3                 
[101] DO.db_2.9                   compiler_4.2.2              rstudioapi_0.14             filelock_1.0.2             
[105] curl_5.0.0                  png_0.1-8                   treeio_1.20.2               geneplotter_1.74.0         
[109] tibble_3.2.1                tweenr_2.0.2                stringi_1.7.12              lattice_0.20-45            
[113] Matrix_1.5-3                vctrs_0.6.0                 pillar_1.8.1                lifecycle_1.0.3            
[117] GlobalOptions_0.1.2         data.table_1.14.8           bitops_1.0-7                GenomicRanges_1.48.0       
[121] patchwork_1.1.2             qvalue_2.28.0               R6_2.5.1                    KernSmooth_2.23-20         
[125] IRanges_2.30.1              codetools_0.2-18            MASS_7.3-58.1               gtools_3.9.4               
[129] SummarizedExperiment_1.26.1 DESeq2_1.36.0               rjson_0.2.21                withr_2.5.0                
[133] S4Vectors_0.34.0            GenomeInfoDbData_1.2.8      hms_1.1.3                   parallel_4.2.2             
[137] ggfun_0.0.9                 tidyr_1.3.0                 MatrixGenerics_1.8.1        ggforce_0.4.1              
[141] Biobase_2.56.0
@gordon-smyth
WEHI, Melbourne, Australia

This is due to a change in the KEGG website. I updated limma a month ago to fix the problem, so just reinstall limma and the problem will be gone.

The reason for the problem is that the KEGG website is no longer consistent: some pages add path: at the start of the pathway ID and some don't. So limma was trying to match mmu01100 with path:mmu01100 and treating them as different pathways.
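
To make the mismatch concrete, here is a minimal base-R sketch. It is not limma's actual code; the variable names and the strip_prefix() helper are made up for illustration, and the IDs and descriptions are copied from the output shown in the question.

```r
# Pathway IDs as they appear in the row names of the kegga() result.
result_ids <- c("path:mmu01100", "path:mmu01230", "path:mmu01210")

# A few rows of the name table as printed by getKEGGPathwayNames("mmu").
pathway_names <- data.frame(
  PathwayID = c("mmu01100", "mmu01210", "mmu01230"),
  Description = c(
    "Metabolic pathways - Mus musculus (house mouse)",
    "2-Oxocarboxylic acid metabolism - Mus musculus (house mouse)",
    "Biosynthesis of amino acids - Mus musculus (house mouse)"
  ),
  stringsAsFactors = FALSE
)

# Matching the raw IDs finds nothing because of the "path:" prefix,
# so every looked-up name is NA.
naive <- pathway_names$Description[match(result_ids, pathway_names$PathwayID)]

# Hypothetical helper: drop the prefix if it is present.
strip_prefix <- function(x) sub("^path:", "", x)

# Stripping the prefix from both sides makes the lookup work
# whichever form each source happens to use.
fixed <- pathway_names$Description[
  match(strip_prefix(result_ids), strip_prefix(pathway_names$PathwayID))
]

print(naive)
print(fixed)
```

The same prefix-stripping lookup could in principle be applied to the row names of an affected kegga() result to fill in the Pathway column by hand, but upgrading limma is the cleaner fix.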


Thank you Gordon for the quick response. I saw something along those lines in the NEWS notes so did try

BiocManager::install("limma", force=TRUE)

But that didn't resolve the problem which is why I made the thread. There must have been some other conflict going on because I have now resolved the issue.

For anyone else with the same problem, my only solution was to completely reinstall R and all packages. I am now getting the expected data frame result full of pathway names.

Cheers!


Reinstalling limma didn't solve the problem because you were using Bioconductor 3.15 rather than the current release, Bioconductor 3.16, so you got the Bioconductor 3.15 version of limma again. You only needed to upgrade to Bioconductor 3.16; there was no need to reinstall R.

Also no need for force = TRUE. You only use that to install an older version of a package.
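
For anyone in the same situation, the upgrade described above might look roughly like this. It is a sketch that assumes the BiocManager package is already installed and that your R version supports Bioconductor 3.16 (the R 4.2.2 in the question does); it needs network access and will update many packages, not just limma.

```r
# Which Bioconductor release is this library tied to?
# (The package versions in the sessionInfo() above correspond to 3.15.)
BiocManager::version()

# Which limma version is currently installed?
packageVersion("limma")

# Move the whole library to Bioconductor 3.16, which brings in the updated limma.
BiocManager::install(version = "3.16")
```

After restarting R, BiocManager::version() should report 3.16 and packageVersion("limma") should show a newer limma than the 3.52.4 listed in the question.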
