Hello pathview users and developers in the Bioconductor community,
This is my first post asking for help with an R package, so I apologize in advance if anything basic is missing from my question.
My issue is the following: I am trying to run pathview with expression data from potato RNA-seq. My RNA-seq data were originally mapped to PGSC identifiers, which I can convert to NCBI GeneID (Entrez) identifiers using gprofiler (https://biit.cs.ut.ee/gprofiler/gost).
An example data file can be found in my GitHub repository (https://github.com/andrebertran/andrebertran/blob/main/pathviewdata.csv). To prepare my file for pathview I did the following in R:
# Read the table in the correct format. The first column is read as character, not numeric.
library(readr)
pathviewdata <- read_csv("pathviewdata.csv",
                         col_types = cols(...1 = col_character()))
View(pathviewdata)
# Assign the Entrez identifiers as row names
library(tidyverse)
pathviewdata <- column_to_rownames(pathviewdata, var = "...1")
# First trial, using entrez as gene.idtype
library(pathview)
pathview(gene.data = pathviewdata, pathway.id = "04075", gene.idtype = "entrez", species = "sot", limit = list(gene = 7, cpd = 7), out.suffix = "test27")
# Second trial, using kegg as gene.idtype
pathview(gene.data = pathviewdata, pathway.id = "04075", gene.idtype = "kegg", species = "sot", limit = list(gene = 7, cpd = 7), out.suffix = "test28")
And I get, in both instances, the same warning message from the package:
#Warning: No annotation package for the species sot, gene symbols not mapped!
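In case the input shape matters, this is roughly how I understand the gene data should look before it goes into pathview: a numeric matrix whose row names are Entrez IDs. A minimal sketch with made-up values (the `toy` data frame, its column names, and all IDs are hypothetical, not my real data):

```r
# Hypothetical toy data standing in for pathviewdata.csv (IDs and values are made up)
toy <- data.frame(
  id    = c("1001", "1002", "1003"),
  logFC = c(1.5, -2.0, 0.3),
  stringsAsFactors = FALSE
)

# Move the ID column into the row names (base R equivalent of column_to_rownames)
gene_mat <- as.matrix(toy[, "logFC", drop = FALSE])
rownames(gene_mat) <- toy$id

# Sanity checks: values must be numeric, IDs must be unique and digits only
is_numeric     <- is.numeric(gene_mat)
ids_unique     <- !anyDuplicated(rownames(gene_mat))
ids_all_digits <- all(grepl("^[0-9]+$", rownames(gene_mat)))
```

If any of the three checks is FALSE for the real data, the problem would be in my input rather than in the species annotation.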
I have checked data(korg) to make sure that my species of interest is supported by KEGG, and everything seems to check out. I realize that my question is similar to those in previous posts (Pathview with minor species and http://seqanswers.com/forums/showthread.php?t=35472#6), but I honestly could not follow the answers given there and would really appreciate more detailed help with this issue!
I have a vague idea that I need to download some kind of table from the NCBI FTP site in which the Entrez gene IDs of potato are mapped to gene names (the gene abbreviations KEGG uses in its pathway maps), and then supply this table to pathview as a reference, but I don't know how to do this. The threads mentioned above did not make it clear enough for me.
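To make that idea concrete, here is a minimal sketch of the lookup I imagine, in base R. The `id_to_symbol` table, its column names, and all IDs and symbols are hypothetical placeholders. I do not know whether pathview can take such a table as a reference, so this only shows the mapping step itself:

```r
# Hypothetical lookup table: Entrez gene ID -> gene symbol (placeholder values)
id_to_symbol <- data.frame(
  entrez = c("1001", "1002", "1003"),
  symbol = c("geneA", "geneB", "geneC"),
  stringsAsFactors = FALSE
)

# Entrez IDs as they would appear in the row names of my expression data
my_ids <- c("1003", "1001", "9999")

# match() gives the row of the lookup table for each ID, or NA if the ID is absent
symbols <- id_to_symbol$symbol[match(my_ids, id_to_symbol$entrez)]
names(symbols) <- my_ids

# IDs that have no symbol in the lookup table
unmapped <- my_ids[is.na(symbols)]
```

Here `symbols` keeps the order of `my_ids`, and `unmapped` lists the IDs that would still need a name from some other source.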
Any help is appreciated! Perhaps you could also point me to other software like pathview that can take PGSC identifiers directly?
Cheers,
André
sessionInfo()
R version 4.1.2 (2021-11-01)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Monterey 12.2.1
Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRlapack.dylib
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] forcats_0.5.1 stringr_1.4.0 dplyr_1.0.8 purrr_0.3.4 tidyr_1.2.0 tibble_3.1.6
[7] ggplot2_3.3.5 tidyverse_1.3.1 readr_2.1.2 pathview_1.34.0 writexl_1.4.0
loaded via a namespace (and not attached):
[1] Biobase_2.54.0 httr_1.4.2 bit64_4.0.5 vroom_1.5.7
[5] jsonlite_1.7.3 modelr_0.1.8 assertthat_0.2.1 stats4_4.1.2
[9] blob_1.2.2 GenomeInfoDbData_1.2.7 cellranger_1.1.0 pillar_1.7.0
[13] RSQLite_2.2.9 backports_1.4.1 glue_1.6.1 XVector_0.34.0
[17] rvest_1.0.2 colorspace_2.0-2 XML_3.99-0.8 pkgconfig_2.0.3
[21] broom_0.7.12 haven_2.4.3 zlibbioc_1.40.0 scales_1.1.1
[25] tzdb_0.2.0 KEGGREST_1.34.0 generics_0.1.2 IRanges_2.28.0
[29] ellipsis_0.3.2 cachem_1.0.6 withr_2.4.3 BiocGenerics_0.40.0
[33] cli_3.2.0 magrittr_2.0.2 crayon_1.5.0 readxl_1.3.1
[37] memoise_2.0.1 KEGGgraph_1.54.0 fs_1.5.2 fansi_1.0.2
[41] xml2_1.3.3 graph_1.72.0 tools_4.1.2 hms_1.1.1
[45] org.Hs.eg.db_3.14.0 lifecycle_1.0.1 S4Vectors_0.32.3 munsell_0.5.0
[49] reprex_2.0.1 AnnotationDbi_1.56.2 Biostrings_2.62.0 compiler_4.1.2
[53] GenomeInfoDb_1.30.1 rlang_1.0.1 grid_4.1.2 RCurl_1.98-1.6
[57] rstudioapi_0.13 bitops_1.0-7 gtable_0.3.0 DBI_1.1.2
[61] R6_2.5.1 lubridate_1.8.0 fastmap_1.1.0 bit_4.0.4
[65] utf8_1.2.2 Rgraphviz_2.38.0 stringi_1.7.6 parallel_4.1.2
[69] Rcpp_1.0.8 vctrs_0.3.8 png_0.1-7 dbplyr_2.1.1
[73] tidyselect_1.1.1
Dear André,
I am facing the same problem as you mentioned here. Were you able to solve it?
Thank you in advance. Regards,
Rocío