KEGGSOAP, hgu95av2.db: limited functionality


Ludwig Geistlinger ▴ 70

@ludwig-geistlinger-3939

Last seen 2.7 years ago

USA/Boston/HMS

Dear BioC developers,

I intend to map gene expression data onto KEGG pathways. In more detail, I performed a DE analysis on gene expression data from an hgu95av2 chip and want to color particular genes in the corresponding pathways.

I found that the KEGGSOAP package already implements very good access to the KEGG API, and I honestly appreciate the work that has been done here. However, the function mark.pathway.by.objects requires KEGG gene IDs or at least KEGG orthology (KO) terms, while there is no way to map hgu95av2 probe IDs to KEGG gene IDs or KO terms (not in hgu95av2.db, keggorth, KEGG.db, etc.).

I wondered why only selected functions of the KEGG API are integrated in the KEGGSOAP package, and especially why the "bconv" utility, which maps foreign identifiers to KEGG identifiers, is not. With "bconv" it would be easy for me to map hgu95av2 probe IDs to ENSEMBL/UNIGENE/UNIPROT/etc. IDs (via hgu95av2.db) and then to KEGG IDs (via bconv). In addition, the original mark.pathway.by.objects function from the KEGG API accepts EC numbers, which the corresponding KEGGSOAP function does not support.

Could you please explain why these limitations exist and how the KEGGSOAP package could be extended to cover all of the functions of the KEGG API?

Currently, my workaround is as follows:
(1) Map the probe IDs of the selected genes to ENSEMBL IDs (using hgu95av2.db).
(2) Retrieve all KEGG entries for the particular pathway using "get.genes.by.pathway" and "bget" from KEGGSOAP.
(3) Parse each of these entries for its ENSEMBL ID and KO ID to create a dictionary ENSEMBL -> KO.
(4) Map the IDs from (1) to KO using the dictionary from (3).

This works, but it is cumbersome and, above all, time-consuming (because of (3)).

Yours faithfully,
Ludwig Geistlinger
(Research for an ongoing diploma thesis)
(University of Cape Town, Institute of Infectious Diseases)

Pathways hgu95av2 probe KEGGSOAP • 1.7k views

updated 15.1 years ago by Marc Carlson ★ 7.2k • written 15.1 years ago by Ludwig Geistlinger ▴ 70


Marc Carlson ★ 7.2k

@marc-carlson-2264

Last seen 8.7 years ago

United States

Hi Ludwig,

It is a little difficult for me to be sure that I am helping you sufficiently, because you have not posted a specific example of what you would like to see happen. As a result, I have no way to verify that what I am suggesting will answer your question. But it occurs to me that it might help you to look at the following few example genes and see how the KEGG Gene ID compares with the Entrez Gene ID:

KEGG Gene ID: hsa:8355 <http://www.genome.jp/dbget-bin/www_bget?hsa:8355>
Entrez Gene ID: 8355 <http://www.ncbi.nih.gov/entrez/query.fcgi?db=gene&cmd=Retrieve&dopt=Graphics&list_uids=8355>

KEGG Gene ID: hsa:9081 <http://www.genome.jp/dbget-bin/www_bget?hsa:9081>
Entrez Gene ID: 9081 <http://www.ncbi.nih.gov/entrez/query.fcgi?db=gene&cmd=Retrieve&dopt=Graphics&list_uids=9081>

KEGG Gene ID: hsa:51054 <http://www.genome.jp/dbget-bin/www_bget?hsa:51054>
Entrez Gene ID: 51054 <http://www.ncbi.nih.gov/entrez/query.fcgi?db=gene&cmd=Retrieve&dopt=Graphics&list_uids=51054>

etc.

I am sure that you see a pattern here. ;) And so I suspect that you might be making this more difficult than it needs to be. You can get the Entrez Gene ID from the hgu95av2ENTREZID mapping.

Does this help?

Marc
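For readers following along, here is a minimal sketch of the mapping suggested above: look up the Entrez Gene ID for each probe in hgu95av2ENTREZID and prefix it with the organism code to obtain a KEGG gene ID. The helper name entrez_to_kegg and the example probe IDs are illustrative assumptions, not part of any package, and the annotation lookup only runs if hgu95av2.db is installed.

```r
# Build KEGG gene IDs from Entrez Gene IDs by prefixing the organism code
# (hypothetical helper; "hsa" is the KEGG code for human).
entrez_to_kegg <- function(entrez, org = "hsa") {
  entrez <- as.character(entrez)
  # Keep missing annotations as NA instead of producing "hsa:NA"
  ifelse(is.na(entrez), NA_character_, paste0(org, ":", entrez))
}

# Illustrative probe IDs; replace with the probes selected by the DE analysis
probes <- c("1000_at", "1001_at")

if (requireNamespace("hgu95av2.db", quietly = TRUE)) {
  library(hgu95av2.db)
  # mget() returns a named list with one Entrez Gene ID (or NA) per probe
  entrez <- unlist(mget(probes, hgu95av2ENTREZID, ifnotfound = NA))
  kegg_ids <- entrez_to_kegg(entrez)
  print(kegg_ids)
}
```

The resulting IDs have the "hsa:<Entrez ID>" form shown in the examples above, which is the kind of KEGG gene ID the question says mark.pathway.by.objects requires.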

written 15.1 years ago by Marc Carlson ★ 7.2k
