GOstats KEGG analysis in wheat - help - none of the objects in the first column are legitimate KEGG IDs (but they are)
@8968bc6b
Australia

Hello,

I'm attempting to use the GOstats package to do a differential KEGG pathway analysis in wheat (non-model organism without an organism package).

I'm following the GOstats guidelines for unsupported organisms here. The problem is at step 1.3, calling KEGGFrame on a KEGG ID/gene ID data frame: the function returns an error saying the KEGG IDs are not valid. The same error occurs when I replace "taes" with "eco" in keggLink and pass that data to KEGGFrame.

The fact that the KEGG IDs were obtained using KEGGREST suggests that they are legitimate KEGG IDs (and searching for them on the KEGG website returns valid genes). Has anyone seen this before, and does anyone have ideas on how to get around it? I've also tried leaving the prefix text on the path IDs instead of removing it with stringr, but that didn't help.

I've just freshly installed R 4.4 and updated all packages in RStudio, although I write my code in Microsoft Visual Studio Code.

Hope someone is able to help. Best regards, GV Yoshikawa


> keggpath <- keggLink("pathway", "taes")
> keggpath_df <- data.frame(path_id = keggpath, 
+                  kegg_id = names(keggpath),
+                  stringsAsFactors = FALSE)
> head(keggpath_df)
         path_id        kegg_id
1 path:taes00010 taes:100037593
2 path:taes00010 taes:100038341
3 path:taes00010 taes:100125727
4 path:taes00010 taes:100415821
5 path:taes00010 taes:100415882
6 path:taes00010 taes:100682413
> keggframeData <- keggpath_df %>% 
+                 dplyr::select(kegg_id, path_id) %>%
+                 mutate(path_id = str_remove(path_id, "^.*:"),
+                         kegg_id = str_remove(kegg_id, "^.*:"))
> head(keggframeData)
    kegg_id   path_id
1 100037593 taes00010
2 100038341 taes00010
3 100125727 taes00010
4 100415821 taes00010
5 100415882 taes00010
6 100682413 taes00010
> keggframeData$kegg_id <- as.character(keggframeData$kegg_id)
> keggframeData$path_id <- as.character(keggframeData$path_id)
> keggFrame <- KEGGFrame(keggframeData)
Error in KEGGFrame(keggframeData) : 
  None of elements in the 1st column of your data.frame object are legitimate KEGG IDs.
>

> sessionInfo()
R version 4.4.1 (2024-06-14 ucrt)
Platform: x86_64-w64-mingw32/x64
Running under: Windows 10 x64 (build 19045)

Matrix products: default


locale:
[1] LC_COLLATE=English_Australia.utf8  LC_CTYPE=English_Australia.utf8
[3] LC_MONETARY=English_Australia.utf8 LC_NUMERIC=C
[5] LC_TIME=English_Australia.utf8

time zone: Australia/Adelaide
tzcode source: internal

attached base packages:
[1] stats4    stats     graphics  grDevices utils     datasets  methods
[8] base

other attached packages:
 [1] stringr_1.5.1        GSEABase_1.68.0      annotate_1.84.0
 [4] XML_3.99-0.17        GOstats_2.72.0       graph_1.84.0
 [7] Category_2.72.0      Matrix_1.7-1         AnnotationDbi_1.68.0
[10] IRanges_2.40.0       S4Vectors_0.44.0     Biobase_2.66.0
[13] BiocGenerics_0.52.0  dplyr_1.1.4          KEGGREST_1.46.0

loaded via a namespace (and not attached):
 [1] utf8_1.2.4              generics_0.1.3          bitops_1.0-9
 [4] stringi_1.8.4           RSQLite_2.3.7           lattice_0.22-6
 [7] magrittr_2.0.3          grid_4.4.1              genefilter_1.88.0
[10] GO.db_3.20.0            fastmap_1.2.0           blob_1.2.4
[13] jsonlite_1.8.9          GenomeInfoDb_1.42.0     DBI_1.2.3
[16] survival_3.7-0          httr_1.4.7              fansi_1.0.6
[19] UCSC.utils_1.2.0        Rgraphviz_2.50.0        Biostrings_2.74.0
[22] cli_3.6.2               rlang_1.1.3             crayon_1.5.3
[25] XVector_0.46.0          AnnotationForge_1.48.0  splines_4.4.1
[28] bit64_4.5.2             withr_3.0.2             cachem_1.1.0
[31] tools_4.4.1             memoise_2.0.1           GenomeInfoDbData_1.2.13
[34] curl_5.2.3              vctrs_0.6.5             R6_2.5.1
[37] png_0.1-8               matrixStats_1.4.1       lifecycle_1.0.4        
[40] zlibbioc_1.52.0         RBGL_1.82.0             bit_4.5.0
[43] pkgconfig_2.0.3         pillar_1.9.0            glue_1.8.0
[46] tibble_3.2.1            tidyselect_1.2.1        MatrixGenerics_1.18.0
[49] xtable_1.8-4            compiler_4.4.1          RCurl_1.98-1.16
>
@james-w-macdonald-5106
United States

Check the help page for KEGGFrame. Your columns are out of order.
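
For reference, the column order this refers to (pathway IDs in column 1, gene IDs in column 2) can be sketched on a toy data frame. The three rows are copied from the output in the question; the object names `gene_first` and `path_first` are made up for this illustration, and the sketch only rearranges columns without calling KEGGFrame.

```r
# Toy data in the order used in the question: gene IDs first, pathway IDs second.
gene_first <- data.frame(
  kegg_id = c("100037593", "100038341", "100125727"),
  path_id = c("taes00010", "taes00010", "taes00010"),
  stringsAsFactors = FALSE
)

# Reorder so that pathway IDs are column 1 and gene IDs are column 2.
path_first <- gene_first[, c("path_id", "kegg_id")]

# Inspect the layout before passing the frame to anything else.
str(path_first)
```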


Thank you for the reply. I tried swapping the columns before asking, but it made no difference.

> head(keggframeData)
    path_id   kegg_id
1 taes00010 100037593
2 taes00010 100038341
3 taes00010 100125727
4 taes00010 100415821
5 taes00010 100415882
6 taes00010 100682413
> keggframeData$kegg_id <- as.character(keggframeData$kegg_id)
> keggframeData$path_id <- as.character(keggframeData$path_id)
> #keggframeData = data.frame(frame$path_id, frame$gene_id)
> keggFrame <- KEGGFrame(keggframeData)
Error in KEGGFrame(keggframeData) : 
  None of elements in the 1st column of your data.frame object are legitimate KEGG IDs.
>
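
One thing that may be worth checking, offered as an assumption rather than a confirmed diagnosis: KEGG pathway maps also have organism-independent five-digit numbers (the "00010" in "taes00010"). If the validation compares column 1 against IDs of that bare form, "taes00010" would be rejected even with the columns in the right order. A minimal sketch of stripping the organism code with base R, on toy rows copied from the output above (the object name `toy` is made up here):

```r
# Toy rows copied from the output above (pathway ID first, gene ID second).
toy <- data.frame(
  path_id = c("taes00010", "taes00010", "taes00010"),
  kegg_id = c("100037593", "100038341", "100125727"),
  stringsAsFactors = FALSE
)

# Drop the leading lowercase organism code, keeping only the digits.
toy$path_id <- sub("^[a-z]+", "", toy$path_id)

# Sanity check: every pathway ID should now be exactly five digits.
stopifnot(all(grepl("^[0-9]{5}$", toy$path_id)))

unique(toy$path_id)
# [1] "00010"
```

Whether KEGGFrame then accepts the frame on a current Bioconductor release has not been verified in this thread.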

I'm aware that KEGGFrame is deprecated, and that wheat may have been added to the database after the fact, which would explain why it doesn't recognise the IDs. I was hoping someone had already figured out a way around the issue before I spend days on this.
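
If KEGGFrame cannot be made to accept the IDs, one possible way around it is to run an over-representation test directly on the pathway-to-gene table, since a hypergeometric test needs only base R. The sketch below is not the GOstats method and will not reproduce its output format. The function name `kegg_ora`, its arguments, and the toy pathway and gene IDs are all hypothetical.

```r
# Hypothetical helper: one-sided hypergeometric over-representation test per pathway.
#   path2gene: data.frame with columns path_id and kegg_id (one row per pair)
#   selected:  character vector of gene IDs of interest (e.g. DE genes)
#   universe:  character vector of all gene IDs that were tested
kegg_ora <- function(path2gene, selected, universe) {
  # Keep only annotations for genes in the universe, and restrict the
  # universe to genes that have at least one pathway annotation.
  path2gene <- unique(path2gene[path2gene$kegg_id %in% universe, ])
  annotated <- unique(path2gene$kegg_id)
  selected  <- intersect(selected, annotated)

  gene_sets <- split(path2gene$kegg_id, path2gene$path_id)
  n_univ <- length(annotated)
  n_sel  <- length(selected)

  res <- do.call(rbind, lapply(names(gene_sets), function(p) {
    set  <- gene_sets[[p]]
    hits <- sum(set %in% selected)
    # P(X >= hits) when n_sel genes are drawn from n_univ,
    # of which length(set) belong to the pathway.
    pval <- phyper(hits - 1, length(set), n_univ - length(set), n_sel,
                   lower.tail = FALSE)
    data.frame(path_id = p, size = length(set), count = hits,
               pvalue = pval, stringsAsFactors = FALSE)
  }))
  res[order(res$pvalue), ]
}

# Toy example: two made-up pathways with three genes each.
toy_map <- data.frame(
  path_id = c("A", "A", "A", "B", "B", "B"),
  kegg_id = c("g1", "g2", "g3", "g4", "g5", "g6"),
  stringsAsFactors = FALSE
)
res <- kegg_ora(toy_map,
                selected = c("g1", "g2", "g3"),
                universe = paste0("g", 1:6))
res
```

With real data, `path2gene` would be the cleaned keggLink table and the raw p-values would still need multiple-testing adjustment, for example with `p.adjust`.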
