Dear all,
I am fairly new to the pathview package (v1.34.00) and to plotting KEGG pathways. A high fraction (~30%-50%) of genes are plotted as NA when I plot KEGG pathways using TAIR or ENTREZ IDs. However, when I check the KEGG pages for these genes, they appear to have corresponding TAIR IDs. For instance:
KEGG "Photosynthesis" pathway [ath00195]:
1. Plotting gene psbA using KEGG ID:
library(pathview)
ath00195 <- pathview(gene.data = c("ArthCp002"), pathway.id = "ath00195", species = "ath", gene.idtype = "KEGG", na.col = "purple" )
Returns plot with psbA in red
2. Plotting gene psbA using TAIR ID:
ath00195 <- pathview(gene.data = c("ATCG00020"), pathway.id = "ath00195", species = "ath", gene.idtype = "TAIR", na.col = "purple" )
Returns error: "Error in mol.sum(gene.data, gene.idmap) : no ID can be mapped!"
3. KEGG page for psbA appears to list ATCG00020 as its TAIR ID:
https://www.genome.jp/dbget-bin/www_bget?ath:ArthCp002
Many thanks in advance for your reply and help,
Hello James, thank you for your reply, and apologies for my delayed response. I can follow your explanation that the key conversion fails in this example. But wouldn't the NCBI Gene ID listed on the KEGG page (here, 844802) be the ID we are looking for? In other words, is this a dictionary update issue, or are there other factors in play? Doesn't KEGG provide dictionaries that can be used for this conversion? Thanks again,
You can get the mapping from KEGG, and perhaps that's how pathview should do it. But for now it uses the org.At.tair.db package, which is built using data we can download from arabidopsis.org. And if you go to arabidopsis.org and search on that ID, there doesn't appear to be an NCBI Gene ID listed. It may be that KEGG maps the TAIR ID to UniProt and then to NCBI Gene ID, but that is way more complicated than we have the bandwidth to attempt. As it stands, generating the annotation packages the way we do right now is somewhere around 80 hours of work, and it's hard to come by the FTE to do that right before each release, which is a busy time already.
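If you want to try the KEGG route yourself in the meantime, here is a minimal sketch, assuming the KEGGREST package is installed and KEGG is reachable from your machine. keggConv() is an existing KEGGREST function; the helper names strip_prefix, kegg_conv_to_table and fetch_ath_entrez_to_kegg are made up for illustration, and none of this is how pathview does its mapping internally.

```r
# Remove the database prefix from KEGG-style identifiers,
# e.g. "ath:ArthCp002" -> "ArthCp002".
strip_prefix <- function(x) sub("^[^:]+:", "", x)

# keggConv() returns a named character vector: the names are the source IDs
# ("ncbi-geneid:<id>") and the values are the target IDs ("ath:<id>").
# This hypothetical helper turns that vector into a two-column lookup table.
kegg_conv_to_table <- function(conv) {
  data.frame(
    entrez = strip_prefix(names(conv)),
    kegg   = strip_prefix(unname(conv)),
    stringsAsFactors = FALSE
  )
}

# Hypothetical wrapper: download the NCBI Gene ID -> KEGG gene ID map
# for Arabidopsis ("ath"). Needs network access and KEGGREST.
fetch_ath_entrez_to_kegg <- function() {
  conv <- KEGGREST::keggConv("ath", "ncbi-geneid")
  kegg_conv_to_table(conv)
}
```

With tab <- fetch_ath_entrez_to_kegg(), a lookup such as tab$kegg[tab$entrez == "844802"] should, going by the KEGG page linked in the question, give the KEGG ID for psbA. You could then pass the translated IDs to pathview() with gene.idtype = "KEGG", which is the call that already worked in your first example. Note that this only covers NCBI Gene IDs; TAIR IDs such as ATCG00020 would still need a separate lookup.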