How to use GAGE with my own metagenomic KOFam data
@a300472c

I want to run GAGE on my own metagenomic data, which is functionally annotated with KOFam and covers a wide range of species from all domains of life. I can run the test data, but I don't understand most of the tutorials (listed below), so I can't adapt them to my own data.

https://bioconductor.org/packages/release/bioc/vignettes/gage/inst/doc/dataPrep.pdf
https://rdrr.io/bioc/gage/man/kegg.gsets.html
https://bioconductor.org/packages/release/bioc/vignettes/gage/inst/doc/gage.pdf

First, I know I have to use species = "ko" to look up all KEGG IDs rather than those of one specific species, but I'm not sure how to use it (see the code below), at what point in the workflow it belongs, or what the remaining steps are.

Another problem is that my IDs are Pathway-KO-IDs or Brite-KO-IDs, not KO-IDs. For example, the KO pathway ID ko00010 (Glycolysis / Gluconeogenesis) is a different thing from the KO ID K00010 (myo-inositol 2-dehydrogenase / D-chiro-inositol 1-dehydrogenase).
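To make the difference concrete, here is a minimal base-R sketch that sorts identifiers by their pattern. The helper name `classify_kegg_id` and the example vector are made up for illustration, and the patterns simply assume that a KO ID is `K` plus five digits and a KO pathway ID is `ko` plus five digits, as in the two examples above.

```r
# Hypothetical helper: label each identifier by its pattern.
classify_kegg_id <- function(ids) {
  out <- rep("other", length(ids))
  out[grepl("^K[0-9]{5}$", ids)] <- "KO"           # e.g. K00010
  out[grepl("^ko[0-9]{5}$", ids)] <- "KO pathway"  # e.g. ko00010
  out
}

# Made-up example identifiers, including one label that carries no ID.
example_ids <- c("ko00010", "K00010", "Enzymes with EC numbers")
id_type <- classify_kegg_id(example_ids)
print(data.frame(id = example_ids, type = id_type))
```

Running something like this over the ID column of an annotation table gives a quick count of how many entries are KO IDs, how many are pathway IDs, and how many have no usable ID at all.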

Also, some of the more generic, higher-level (1) KOFam annotations in my data don't come with a Pathway/Brite-KO-ID; e.g. "Enzymes with EC numbers" has no ID number. So I don't know whether I can use my KOFam output for GAGE with KEGG. I think the Bioconductor package manuals say that you can change the IDs in your dataset to match the KEGG ones, but I can't figure out how.
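One possible way to handle entries without an ID is to drop them before building the input matrix. The sketch below is base R only and rests on assumptions: it supposes the annotation output also provides per-gene KO IDs (K numbers), and the table `annot` with columns `sample`, `ko` and `count` is invented for illustration, not the real KOFam output format.

```r
# Made-up long-format annotation table; the column names are assumptions.
annot <- data.frame(
  sample = c("S1", "S1", "S2", "S2", "S2"),
  ko     = c("K00010", "K00844", "K00010", NA, "K00844"),
  count  = c(5, 2, 3, 7, 4),
  stringsAsFactors = FALSE
)

# Keep only rows that carry a KO-style identifier (K + five digits).
has_ko <- !is.na(annot$ko) & grepl("^K[0-9]{5}$", annot$ko)
annot_ko <- annot[has_ko, ]

# Sum counts per KO and sample: rows = KO IDs, columns = samples.
ko_mat <- tapply(annot_ko$count, list(annot_ko$ko, annot_ko$sample), sum)
ko_mat[is.na(ko_mat)] <- 0  # KO/sample combinations never observed

print(ko_mat)
```

A matrix shaped like `ko_mat` (features as rows, samples as columns) is the form that `gage()` takes as its `exprs` argument; its row names have to use the same kind of ID as the members of the gene sets, otherwise nothing will match.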

Can someone please explain in detail, step by step, how to use my own data for KEGG-GAGE? I can share my data if necessary. Thanks so much!

```r
library(gage)
kg.ko <- kegg.gsets(species = "ko", id.type = "kegg", check.new = FALSE)  # KO-based gene sets
```

```
sessionInfo()
R version 4.4.1 (2024-06-14 ucrt)
Platform: x86_64-w64-mingw32/x64
Running under: Windows 10 x64 (build 19045)

Matrix products: default

locale:
[1] LC_COLLATE=English_Hong Kong SAR.utf8
[2] LC_CTYPE=English_Hong Kong SAR.utf8
[3] LC_MONETARY=English_Hong Kong SAR.utf8
[4] LC_NUMERIC=C
[5] LC_TIME=English_Hong Kong SAR.utf8

time zone: Asia/Hong_Kong
tzcode source: internal

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods
[7] base

other attached packages:
[1] gage_2.54.0         rain_1.38.0         multtest_2.60.0
[4] Biobase_2.64.0      BiocGenerics_0.50.0 gmp_0.7-4
[7] pracma_2.4.4        compositions_2.0-8

loaded via a namespace (and not attached):
[1] KEGGREST_1.44.1 SummarizedExperiment_1.34.0
[3] gtable_0.3.5 tensorA_0.36.2.1
[5] ggplot2_3.5.1 lattice_0.22-6
[7] vctrs_0.6.5 tools_4.4.1
[9] generics_0.1.3 curl_5.2.1
[11] stats4_4.4.1 parallel_4.4.1
[13] RSQLite_2.3.7 AnnotationDbi_1.66.0
[15] tibble_3.2.1 fansi_1.0.6
[17] blob_1.2.4 DEoptimR_1.1-3
[19] pkgconfig_2.0.3 Matrix_1.7-0
[21] S4Vectors_0.42.0 graph_1.82.0
[23] lifecycle_1.0.4 GenomeInfoDbData_1.2.12
[25] compiler_4.4.1 Biostrings_2.72.1
[27] munsell_0.5.1 DESeq2_1.44.0
[29] codetools_0.2-20 GenomeInfoDb_1.40.1
[31] GO.db_3.19.1 pillar_1.9.0
[33] crayon_1.5.3 MASS_7.3-61
[35] BiocParallel_1.38.0 cachem_1.1.0
[37] DelayedArray_0.30.1 abind_1.4-5
[39] robustbase_0.99-3 tidyselect_1.2.1
[41] locfit_1.5-9.10 dplyr_1.1.4
[43] splines_4.4.1 fastmap_1.2.0
[45] grid_4.4.1 colorspace_2.1-0
[47] cli_3.6.3 SparseArray_1.4.8
[49] magrittr_2.0.3 S4Arrays_1.4.1
[51] survival_3.7-0 utf8_1.2.4
[53] scales_1.3.0 UCSC.utils_1.0.0
[55] bit64_4.0.5 XVector_0.44.0
[57] httr_1.4.7 matrixStats_1.3.0
[59] bit_4.0.5 png_0.1-8
[61] memoise_2.0.1 GenomicRanges_1.56.1
[63] IRanges_2.38.0 rlang_1.1.4
[65] Rcpp_1.0.12 DBI_1.2.3
[67] glue_1.7.0 bayesm_3.1-6
[69] rstudioapi_0.16.0 jsonlite_1.8.8
[71] R6_2.5.1 MatrixGenerics_1.16.0
[73] zlibbioc_1.50.0
```
