pathview - a species in KEGG pathway but not in 'korg'

Hello,

I work on RNA-seq data from Oryza sativa Japonica (TaxID: 39947).

There is only one genome sequence for this rice, but there are two main gene models. One is known as 'osa' in KEGG; it comes from MSU-RGAP (http://rice.uga.edu; IDs like 'LOC_Os03g02939'), and NCBI and Phytozome (DOE) use this gene model. The other is known as 'dosa' in KEGG; it comes from RAP-DB (https://rapdb.dna.affrc.go.jp/; IDs like 'Os03g0121300'), and RAP-DB and Ensembl use this gene model.

Many resources exist for the MSU gene model because NCBI Entrez also uses it. KEGG added the 'osa' pathways in 2003 and the 'dosa' pathways in 2012.
Genome info for dosa: https://www.genome.jp/kegg-bin/show_organism?org=dosa
Pathway maps for dosa: https://www.genome.jp/kegg-bin/show_organism?menu_type=pathway_maps&org=dosa

As you can guess, I chose the gene model from RAP-DB, so I use 'dosa' for KEGG. I am trying to use pathview to visualize a list of interesting genes on the 'dosa' KEGG pathway maps.

When I searched for 'dosa' in 'korg' from the 'pathview' package, it returned no record, meaning that 'korg' does not contain 'dosa'. The table has 8282 records in pathview_1.38.0, so I think pathview does not include information for 'dosa'.

> data(korg, package="pathview")
> korg[korg[,3]=="dosa",]
     ktax.id tax.id kegg.code scientific.name common.name entrez.gnodes kegg.geneid ncbi.geneid ncbi.proteinid uniprot
> korg[korg[,3]=="osa",]
                ktax.id                  tax.id               kegg.code         scientific.name             common.name           entrez.gnodes 
               "T01015"                  "4530"                   "osa" "Oryza sativa japonica"         "Japanese rice"                     "1" 
            kegg.geneid             ncbi.geneid          ncbi.proteinid                 uniprot 
              "4351353"               "4351353"          "XP_015620368"                "Q6ATB4" 
> dim(korg)
[1] 8282   10

I also tried to build my own korg object, following the thread titled 'pathview says my nonmodel species is unknown: "species invalid"'. However, it did not work.

>   korg <- cbind("ktax.id" = "T02163", "tax.id" = "39947", "kegg.code" = "dosa",
                        "scientific.name" = "Oryza sativa japonica", "common.name" = "Japanese rice",
                        "entrez.gnodes" = NA, "kegg.geneid" = NA, "ncbi.geneid" = NA,
                        "ncbi.proteinid" = NA, "uniprot" = NA)
> dosa00940 <- pathview(gene.data  = diff_cdsList,
                     pathway.id = "dosa00940", species = "dosa",
                     gene.idtype="KEGG",
                     limit      = list(gene=max(abs(diff_cdsList)), cpd=1))
Error in pathview(gene.data = diff_cdsList, pathway.id = "dosa00940",  : 
  This species is not annotated in KEGG!
> dosa00940 <- pathview(gene.data  = diff_cdsList,
                     pathway.id = "00940", species = "dosa",
                     gene.idtype="KEGG",
                     limit      = list(gene=max(abs(diff_cdsList)), cpd=1))
Error in pathview(gene.data = diff_cdsList, pathway.id = "00940", species = "dosa",  : 
  This species is not annotated in KEGG!
> 
> osa00940 <- pathview(gene.data  = diff_cdsList,
                     pathway.id = "00940", species = "osa",
                     gene.idtype="KEGG",
                     limit      = list(gene=max(abs(diff_cdsList)), cpd=1))
Warning: None of the genes or compounds mapped to the pathway!
Argument gene.idtype or cpd.idtype may be wrong.
Warning: No annotation package for the species osa, gene symbols not mapped!
Info: Working in directory /xxxx/xxxx
Info: Writing image file osa00940.pathview.png
>
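
The last call above runs but maps nothing, which is consistent with the input names not being in the ID namespace pathview expects for 'osa'. One quick check is to match the names against the ID shapes quoted in this thread. The sketch below is a minimal base-R helper. The function name classify_rice_ids and the regular expressions are assumptions inferred only from the example IDs in this thread (LOC_Os03g02939, Os03g0121300, Os01t0100100-01, 4351353), not from any official specification.

```r
# Hypothetical helper: guess the ID namespace of rice gene identifiers.
# The patterns are inferred from example IDs quoted in this thread only.
classify_rice_ids <- function(ids) {
  ids <- as.character(ids)
  out <- rep("unknown", length(ids))
  out[grepl("^LOC_Os[0-9]{2}g[0-9]{5}$", ids)] <- "MSU locus"
  out[grepl("^Os[0-9]{2}g[0-9]{7}$", ids)] <- "RAP-DB gene"
  out[grepl("^Os[0-9]{2}t[0-9]{7}-[0-9]{2}$", ids)] <- "RAP-DB transcript (KEGG dosa style)"
  out[grepl("^[0-9]+$", ids)] <- "numeric (Entrez-like)"
  out
}

example_ids <- c("LOC_Os03g02939", "Os03g0121300", "Os01t0100100-01", "4351353")
id_classes <- classify_rice_ids(example_ids)
print(table(id_classes))
```

Running it on names(diff_cdsList) should show at a glance whether the input is in the Entrez-like form or in one of the RAP-DB forms.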

Before I dig into the pathview source code, I would like some help with this case, where an additional genome annotation for a model species is not included in pathview.

More information about the pathways of dosa:
https://www.genome.jp/kegg-bin/show_organism?menu_type=pathway_maps&org=dosa
https://rest.kegg.jp/list/pathway/dosa
https://rest.kegg.jp/link/dosa/pathway
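
The REST URLs above return plain tab-separated text, so the gene-to-pathway links for 'dosa' can be read into R directly. Below is a minimal sketch. parse_kegg_link is a hypothetical helper name, and it assumes each line has the form "path:dosa00940<TAB>dosa:Os02t0626600-00" (pathway first, gene second). The two sample lines reuse gene IDs that the reply in this thread takes from pathway 00940, and the network call is left commented out.

```r
# Hypothetical helper: parse the text returned by a KEGG REST "link" query
# into a data.frame of pathway / gene pairs.
# Assumption: two tab-separated columns, "path:<pathway>" then "<org>:<gene>".
parse_kegg_link <- function(lines) {
  lines <- lines[nzchar(lines)]
  parts <- strsplit(lines, "\t", fixed = TRUE)
  data.frame(
    pathway = sub("^path:", "", vapply(parts, `[`, character(1), 1)),
    gene    = sub("^[^:]+:", "", vapply(parts, `[`, character(1), 2)),
    stringsAsFactors = FALSE
  )
}

# Offline example using IDs quoted in this thread
sample_lines <- c("path:dosa00940\tdosa:Os02t0626600-00",
                  "path:dosa00940\tdosa:Os07t0638300-01")
links <- parse_kegg_link(sample_lines)
print(links)

# Live query (needs internet access):
# links <- parse_kegg_link(readLines("https://rest.kegg.jp/link/dosa/pathway"))
# genes_00940 <- links$gene[links$pathway == "dosa00940"]
```

The resulting gene column can be compared with the names of the input vector to see how many genes of interest fall in a given pathway before plotting.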

Thank you for your help,

Jiyoung


From your other posts in this thread I see that you have meanwhile accomplished your task in other ways, but it is also possible with the pathview 'hack'. The key is that you still have to set "entrez.gnodes" = "1", even though the input IDs are obviously not Entrez IDs!

> library(pathview)
> 
> ## Create a single-row korg object.
> ## Keep "entrez.gnodes" = "1"; for "kegg.geneid" and "uniprot",
> ## use the IDs of an arbitrary gene (to be sure).
> 
> korg <- cbind("ktax.id" = "T02163", "tax.id" = "39947", "kegg.code" = "dosa",
+               "scientific.name" = "Oryza sativa japonica", "common.name" = "Japanese rice",
+               "entrez.gnodes" = "1", "kegg.geneid" = "Os01t0100100-01", "ncbi.geneid" = NA,
+               "ncbi.proteinid" = NA, "uniprot" = "Q0JRI1")
> 
> korg
     ktax.id  tax.id  kegg.code scientific.name         common.name    
[1,] "T02163" "39947" "dosa"    "Oryza sativa japonica" "Japanese rice"
     entrez.gnodes kegg.geneid       ncbi.geneid ncbi.proteinid uniprot 
[1,] "1"           "Os01t0100100-01" NA          NA             "Q0JRI1"
> 
> ## Select 7 random dosa gene ids from the pathway you are interested in (00940)
> dosa.ids <- c("Os02t0626600-00", "Os07t0638300-01", "Os01t0901500-01", "Os06t0184900-01",
+               "Os06t0165800-01", "Os02t0467000-00", "Os05t0494000-01")
> 
> ## For these genes generate some random logFC values (used in visualization)
> data.logFC <- runif(n = length(dosa.ids), min = -5, max = 5)
> names(data.logFC) <- dosa.ids
> data.logFC[1:5]
Os02t0626600-00 Os07t0638300-01 Os01t0901500-01 Os06t0184900-01 Os06t0165800-01 
     -0.9160603      -1.5006204       2.3597816       1.9713893       2.9083099 
> 
> ## Create pathway, and save in working directory.
> pv.out <- pathview(gene.data =data.logFC, pathway.id = "00940",
+                    species = "dosa", out.suffix = "Japanese.rice",
+                    kegg.native = TRUE)
Note: Mapping via KEGG gene ID (not Entrez) is supported for this species,
it looks like "Os01t0100100-01"!
Info: Working in directory E:/000test
Info: Writing image file dosa00940.Japanese.rice.png
> 
> ## sessionInfo()
> 
> sessionInfo()
R version 4.3.0 (2023-04-21 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19042)

Matrix products: default


locale:
[1] LC_COLLATE=English_United States.utf8 
[2] LC_CTYPE=English_United States.utf8   
[3] LC_MONETARY=English_United States.utf8
[4] LC_NUMERIC=C                          
[5] LC_TIME=English_United States.utf8    

time zone: Europe/Amsterdam
tzcode source: internal

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] pathview_1.40.0

loaded via a namespace (and not attached):
 [1] crayon_1.5.2            vctrs_0.6.3             httr_1.4.6             
 [4] cli_3.6.1               rlang_1.1.1             DBI_1.1.3              
 [7] KEGGgraph_1.60.0        png_0.1-8               bit_4.0.5              
[10] S4Vectors_0.38.1        RCurl_1.98-1.12         Biostrings_2.68.1      
[13] XML_3.99-0.14           graph_1.78.0            org.Hs.eg.db_3.17.0    
[16] stats4_4.3.0            KEGGREST_1.40.0         Biobase_2.60.0         
[19] grid_4.3.0              fastmap_1.1.1           bitops_1.0-7           
[22] IRanges_2.34.0          GenomeInfoDb_1.36.0     memoise_2.0.1          
[25] compiler_4.3.0          RSQLite_2.3.1           blob_1.2.4             
[28] pkgconfig_2.0.3         XVector_0.40.0          Rgraphviz_2.44.0       
[31] R6_2.5.1                GenomeInfoDbData_1.2.10 AnnotationDbi_1.62.1   
[34] tools_4.3.0             bit64_4.0.5             zlibbioc_1.46.0        
[37] cachem_1.0.8            BiocGenerics_0.46.0    
> 
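
The single-row korg object in the transcript above can be wrapped in a small function so the same approach is reusable for other KEGG organisms missing from pathview's bundled table. This is only a sketch: make_korg_row is a hypothetical name, the column names are copied from the korg output shown in this thread, and it assumes, as the transcript suggests, that pathview uses a korg object found in the workspace.

```r
# Hypothetical helper: build a one-row, korg-style character matrix.
# Column names are copied from the korg object printed in this thread.
make_korg_row <- function(ktax.id, tax.id, kegg.code, scientific.name,
                          common.name, kegg.geneid, uniprot = NA,
                          ncbi.geneid = NA, ncbi.proteinid = NA) {
  cbind("ktax.id" = ktax.id, "tax.id" = tax.id, "kegg.code" = kegg.code,
        "scientific.name" = scientific.name, "common.name" = common.name,
        # Kept at "1" as in the reply, even though the IDs are not Entrez IDs
        "entrez.gnodes" = "1",
        "kegg.geneid" = kegg.geneid, "ncbi.geneid" = ncbi.geneid,
        "ncbi.proteinid" = ncbi.proteinid, "uniprot" = uniprot)
}

korg <- make_korg_row("T02163", "39947", "dosa", "Oryza sativa japonica",
                      "Japanese rice", kegg.geneid = "Os01t0100100-01",
                      uniprot = "Q0JRI1")
print(korg)
```

Fixing "entrez.gnodes" inside the function keeps the one setting the reply singles out as essential from being dropped by accident.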


Hello Dr. Hooiveld,

Thank you so much for your help. I modified the korg object as in your answer, and everything works perfectly! Your R code and comments are clear. My problem is solved.

Thank you again, Jiyoung


Meanwhile, I found an alternative method in clusterProfiler, which opens a KEGG URL with browseURL(url). We can open a browser and directly visualize our genes on a desired pathway map.
URL pattern: http://www.kegg.jp/kegg-bin/show_pathway?[map_id]/[gene list separated by "/"]
Example: https://www.kegg.jp/kegg-bin/show_pathway?dosa00940/Os02t0626600-00/Os07t0638300-01
Genes in the URL are highlighted.
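
The URL pattern described above is easy to assemble in R. A minimal sketch follows. kegg_highlight_url is a hypothetical helper name, and only base R's paste0() and utils::browseURL() are used.

```r
# Hypothetical helper: build a KEGG show_pathway URL that highlights genes.
# Pattern taken from this thread: show_pathway?<map_id>/<gene1>/<gene2>/...
kegg_highlight_url <- function(map_id, genes) {
  paste0("https://www.kegg.jp/kegg-bin/show_pathway?",
         map_id, "/", paste(genes, collapse = "/"))
}

url <- kegg_highlight_url("dosa00940",
                          c("Os02t0626600-00", "Os07t0638300-01"))
print(url)

# Open in the default browser only when running interactively
if (interactive()) utils::browseURL(url)
```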

One remaining problem is that I cannot save the map image directly. I copied the link address of the "Download" icon (https://www.kegg.jp/kegg-bin/show_pathway?dosa00940/Os02t0626600-00/Os07t0638300-01#downloadImage1x) and ran wget on it in a terminal (wget https://www.kegg.jp/kegg-bin/show_pathway?dosa00940/Os02t0626600-00/Os07t0638300-01#downloadImage1x), but I got an HTML file rather than the image.

I also tried wget with the -r -p options, but that downloaded many other files, and the png file ended up in 'www.kegg.jp/tmp/mark_pathway168688157088208/'. Too much trouble!

I still think pathview::pathview is the easiest solution, if 'dosa' can be included in pathview!

Additionally, I found this information from http://yulab-smu.top/biomedical-knowledge-mining-book/clusterprofiler-kegg.html


It still downloads other files, but I found a better version using wget options.

Example: downloading the dosa00940.png map with two highlighted genes (Os02t0626600-00/Os07t0638300-01).

wget -nd -r -P dosa -A "dosa*.png" "https://www.kegg.jp/kegg-bin/show_pathway?dosa00940/Os02t0626600-00/Os07t0638300-01#downloadImage1x"