hs.Mm.inp.db problem
@iain-gallagher-2532
Last seen 9.4 years ago
United Kingdom
Hello List

I am trying to map ~5000 mouse genes to human genes using the inparanoid package and I am failing miserably!

Having followed the example in the documentation I can't get any of my 5000 mouse genes converted to human EG ids.

Example follows with 3 genes only:

    rm(list=ls())

    library(hom.Mm.inp.db)
    library(org.Mm.eg.db)
    library(org.Hs.eg.db)

    #mouse genes in as symbols
    dataIn <- c('Ints7', 'Upp1', 'Cdc2a')

    #map these to mouse EG ids
    egIds <- revmap(org.Mm.egSYMBOL)
    mapped <- mappedkeys(egIds)
    egIds <- as.list(egIds[mapped])
    ind <- which(names(egIds)%in%dataIn)
    egIdsIn <- egIds[ind]

    #map these IDs to ENSEMBL protein Ids as used for the inparanoid mapping
    mouseProtIds <- mget(unlist(egIdsIn),org.Mm.egENSEMBLPROT)
    mouseProtIds <- mouseProtIds[!is.na(mouseProtIds)]

    #this is the point of failure!
    rawHumanProtIds <- mget(unlist(mouseProtIds),hom.Mm.inpHOMSA,ifnotfound=NA)

The returned list is full of NA.

Using biomart on the Ensembl site I can get:

    Ensembl Transcript ID    Human Ensembl Protein ID
    ENSMUST00000020099       ENSP00000397973

For example, for Cdc2a, so I know there are homologs there, but for some reason the inparanoid package is not working for me. Using the example in the documentation it does work though, so I'm assuming the mistake is with me.

Can anyone help with this (more curiosity now - I can get the data through biomart)?

Cheers

Iain
@iain-gallagher-2532
Last seen 9.4 years ago
United Kingdom
Hi - Just a follow up post.

The title should of course be hom.Mm.inp.db problem and session info is below:

    > sessionInfo()
    R version 2.9.0 (2009-04-17)
    x86_64-pc-linux-gnu

    locale:
    LC_CTYPE=en_GB.UTF-8;LC_NUMERIC=C;LC_TIME=en_GB.UTF-8;LC_COLLATE=en_GB.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_GB.UTF-8;LC_PAPER=en_GB.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_GB.UTF-8;LC_IDENTIFICATION=C

    attached base packages:
    [1] stats     graphics  grDevices utils     datasets  methods   base

    other attached packages:
    [1] org.Hs.eg.db_2.2.11  org.Mm.eg.db_2.2.11  hom.Mm.inp.db_2.2.11
    [4] RSQLite_0.7-1        DBI_0.2-4            AnnotationDbi_1.6.0
    [7] Biobase_2.4.1

Thanks

Iain
Hi Iain,

The trouble you are having is because inparanoid uses Jackson lab IDs (MGI) instead of ensembl protein IDs when representing mouse.

So this script should work better:

    library(hom.Mm.inp.db)
    library(org.Mm.eg.db)
    library(org.Hs.eg.db)

    dataIn <- c('Ints7', 'Upp1', 'Cdc2a')
    egs <- mget(dataIn,revmap(org.Mm.egSYMBOL))

    ## this is what you want right here:
    mouseProtIds <- mget(unlist(egs),org.Mm.egMGI)
    mouseProtIds <- mouseProtIds[!is.na(mouseProtIds)]

    rawHumanProtIds <- mget(unlist(mouseProtIds),hom.Mm.inpHOMSA,ifnotfound=NA)

    ##etc.

Hope this helps,

Marc
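The failure mode Marc describes (querying a map with the wrong kind of key) can be reproduced without any annotation packages. The following is only a minimal sketch in base R: `toy_hom_map`, the helper `key_overlap`, and every ID in it are invented stand-ins, not contents of hom.Mm.inp.db. It shows that `mget(..., ifnotfound=NA)` silently returns NA for every key that is not in the map, which is why the original script produced a list full of NA rather than an error.

```r
# Toy stand-in for a homology map; all IDs below are invented for illustration.
# Its keys are MGI-style IDs, mirroring the explanation above.
toy_hom_map <- new.env()
assign("MGI:0000001", "ENSP_TOY_A", envir = toy_hom_map)
assign("MGI:0000002", "ENSP_TOY_B", envir = toy_hom_map)

# Wrong key type: Ensembl-protein-style IDs are not keys of the map,
# so every lookup falls through to the ifnotfound value.
wrong_keys <- c("ENSMUSP_TOY_1", "ENSMUSP_TOY_2")
res_wrong <- mget(wrong_keys, envir = toy_hom_map, ifnotfound = NA)

# Right key type: MGI-style IDs are found.
right_keys <- c("MGI:0000001", "MGI:0000002")
res_right <- mget(right_keys, envir = toy_hom_map, ifnotfound = NA)

# Hypothetical diagnostic helper: fraction of query keys present in the map.
key_overlap <- function(keys, map) mean(keys %in% ls(map))

print(all(is.na(res_wrong)))
print(key_overlap(wrong_keys, toy_hom_map))
print(key_overlap(right_keys, toy_hom_map))
```

With the real packages, a comparable sanity check would presumably compare the query IDs against `mappedkeys()` of the map (a function the original script already uses) before calling `mget`; an overlap of zero points to a key-type mismatch rather than missing homologs.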
Thanks Marc

Works a treat.

Iain
Following up on this question, I am trying to get human homolog genes for some mouse genes on the Illumina mouse v2 array platform. What is the difference in results between using getLDS in biomaRt and the hom.Mm.inp.db package? Do both methods use similar source information?

Thanks in advance.

Di
Hi Di,

I can't speak for the origins of the biomaRt homolog information, so I can only answer half of your question.

The inparanoid packages use data directly from inparanoid. All of the relevant data from inparanoid is included in the database for these packages, but only the data that is predicted by this algorithm as scoring 100% is used in the actual mapping. For popular organisms like mouse, human and flies, we have made sure to include enough other data in the relevant organism packages so that you can patch through the appropriate inparanoid packages to retrieve homologs.

Also, inparanoid recently updated their data sources and this was a pretty major revision (meaning it unintentionally breaks some things for us in terms of updating our inparanoid data). So if these packages are finally starting to get some use, please let me know so that I can prioritize getting our sources updated to their newer version accordingly.

Marc
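As a rough illustration of the "only pairs scoring 100% are used" rule, here is a small base R sketch. The table `pairs`, its column names, the IDs, and the scores are all hypothetical; this is not the schema or the filtering code of the inparanoid packages, only the general idea of keeping top-scoring pairs and building a lookup from them.

```r
# Hypothetical ortholog table; column names, IDs and scores are invented.
pairs <- data.frame(
  mouse_id = c("MGI:0000001", "MGI:0000002", "MGI:0000003"),
  human_id = c("ENSP_TOY_A", "ENSP_TOY_B", "ENSP_TOY_C"),
  score    = c(1.00, 0.87, 1.00),
  stringsAsFactors = FALSE
)

# Keep only pairs with the top score (1.00 here stands for "100%").
kept <- pairs[pairs$score == 1, ]

# Build a named lookup vector (mouse ID -> human ID) from the retained pairs.
lookup <- setNames(kept$human_id, kept$mouse_id)

# A mouse ID whose best pair scored below 100% has no entry in the lookup.
print(lookup["MGI:0000001"])
print(is.na(unname(lookup["MGI:0000002"])))
```

Under such a filter, a gene can have a homolog reported by another resource and still be absent from the mapping, which is one plausible source of differences between tools.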
Thanks, Marc.

What you said helps to understand. I will let you know if I need your help on the code or the updated source.

Di