org.*.eg.db problem
@arnemuellernovartiscom-2205
Last seen 9.2 years ago
Switzerland
Hello,

the org.*.eg.db environments cannot be used in a generic way :-( . Let's say I'm writing a function that needs an entire org.*.eg.db environment as an argument, and the function doesn't care whether it's human, mouse, rat or jellyfish. Inside my function I'd need to access the maps (e.g. for chromosomal location) without knowing the species. The problem is that you do need to know the species, because the mapping names use the species abbreviation:

> org.Mm.egCHRLOC
CHRLOC map for Mouse (object of class "AnnDbMap")

Why isn't this more generic, so that one could just call egCHRLOC instead of org.Mm.egCHRLOC? As it stands, code that uses this annotation has to know about the organism. Why does it have to be hard coded? Ideally I'd like to be able to do the following:

> library(org.Mm.eg.db)
> myGenomeAnnotationFunction(org.Mm.eg.db) { # pass in as an environment?
        # use the annotation environment to extract whatever information ...
}

How would you solve this when having to work with several species (if else ... ???)

thanks a lot for your help,

Arne
Annotation Organism • 1.5k views
@martin-morgan-1513
Last seen 5 months ago
United States
On 12/01/2010 08:48 AM, arne.mueller at novartis.com wrote:
> How would you solve this when having to work with several species (if else ... ???)

Hi Arne,

For many cases,

  library(annotate)
  map <- getAnnMap("CHRLOC", "org.Mm.eg.db")

which will take care of loading the org package as well.

Martin
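A minimal sketch of how this suggestion might be wrapped into a species-agnostic function. The helper name `chrloc_for` and its arguments are hypothetical, and the sketch assumes the annotate package and the requested org package are installed; `getAnnMap()` is the call from the answer above.

```r
# Hypothetical helper: look up chromosomal locations without hard-coding the species.
# org_pkg is the name of an org package, e.g. "org.Mm.eg.db" or "org.Hs.eg.db".
# ids is a character vector of Entrez Gene IDs.
chrloc_for <- function(org_pkg, ids) {
  # getAnnMap() resolves the species-specific map (e.g. org.Mm.egCHRLOC)
  # and loads the org package if it is not attached yet.
  map <- annotate::getAnnMap("CHRLOC", org_pkg)
  # Assumption: once the org package is attached, mget() dispatches to the
  # method for annotation maps; IDs without an entry come back as NA.
  mget(ids, map, ifnotfound = NA)
}

# Usage, with the Bioconductor packages installed (my_entrez_ids is a placeholder):
# chrloc_for("org.Mm.eg.db", my_entrez_ids)
# chrloc_for("org.Hs.eg.db", my_entrez_ids)
```

The caller passes the package name as a string rather than an environment, so the same function body serves every organism for which a CHRLOC map exists.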
On 01/12/10 17:50, Martin Morgan wrote:
> On 12/01/2010 08:48 AM, arne.mueller at novartis.com wrote:
>> Why isn't this more generic so that one could just call egCHRLOC instead
>> of org.Mm.egCHRLOC which makes code that uses this annotation having to
>> know about the organism - why does it have to be hard coded?

This is likely a design decision forced by R's scoping rules for attached packages: attaching a package brings its (exported) contents onto the search path.

In other words, if you had a package org.Mm.eg.db and a package org.Hs.eg.db, both containing an object egCHRLOC, the symbol egCHRLOC would resolve to two different things depending on which of the two packages was attached last.

This unfortunately prevents some of the patterns frequently found in other languages from being (safely) implemented.
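The masking behaviour described in this reply can be reproduced with base R alone. The sketch below uses two made-up lists as stand-ins for org packages (they are not real annotation objects) that both define `egCHRLOC`.

```r
# Stand-ins for two packages that export the same symbol (hypothetical data).
fake_mouse <- list(egCHRLOC = "mouse map")
fake_human <- list(egCHRLOC = "human map")

# attach() places each one on the search path, most recent first.
# R reports that egCHRLOC is masked when the second one is attached.
attach(fake_mouse, name = "fake_mouse")
attach(fake_human, name = "fake_human")

# The bare symbol now resolves to whichever stand-in was attached last,
# here the human one.
resolved <- get("egCHRLOC")
print(resolved)

# Clean up the search path.
detach("fake_human")
detach("fake_mouse")
```

Swapping the two `attach()` calls changes what `egCHRLOC` resolves to, which is the order dependence that species-prefixed names avoid.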
You could write in a check and use 'with'.

if 'mouse'

  yourFunction (with (org.Mm.eg.db... etc etc))

if 'human'

  yourFunction (with (org.Hs.eg.db... etc etc))

This might not be optimal, as I'm not a programmer by any measure (!), but it's one way to get what I think you want.

cheers

iain

--- On Wed, 1/12/10, Laurent Gautier <laurent at cbs.dtu.dk> wrote:
> Subject: Re: [BioC] org.*.eg.db problem
> Date: Wednesday, 1 December, 2010, 17:28
>
> This is likely a design decision forced by R's scoping rule for attached
> packages: attaching a package will bring its (exported) content into the
> search path.
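One way to keep such a species check in a single place is a small lookup that returns the org package name, so the rest of the code never branches on the organism. This is only a sketch: the function name `org_pkg_for` is hypothetical, and the three organisms listed are just examples.

```r
# Hypothetical lookup from a species label to the matching org package name.
org_pkg_for <- function(species) {
  switch(species,
         mouse = "org.Mm.eg.db",
         human = "org.Hs.eg.db",
         rat   = "org.Rn.eg.db",
         # Unnamed last argument is the default branch: fail loudly.
         stop("no org package registered for species: ", species))
}

pkg <- org_pkg_for("mouse")
print(pkg)
```

The returned string can then be handed to whatever generic accessor is used downstream, so adding a species means adding one line to the lookup.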
Even shorter, one could use the "::" operator and write org.Mm.eg.db::egCHRLOC.

The presumed issue is rather that one could import (attach) several org.X.eg.db packages, still write egCHRLOC, and get a result that depends on the order in which the packages were attached (hence the note about implementing the pattern "safely").

The only way around this that I can think of would be to not export names from the package namespace, forcing the use of ":::", but that's twisting things a little.

L.

On 12/1/10 7:37 PM, Iain Gallagher wrote:
> You could write in a check and use 'with'.
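The `pkg::name` form can also be written programmatically with base R's `getExportedValue()`, which does not depend on the order in which packages were attached. A sketch only: the wrapper name `qualified_get` is hypothetical, and it is demonstrated on the stats package so that it runs without Bioconductor. Note that the org packages as they actually ship export species-prefixed names such as `org.Mm.egCHRLOC`, as shown in the original question.

```r
# Hypothetical wrapper: fetch an exported object from a named package,
# the programmatic equivalent of writing pkg::name.
qualified_get <- function(pkg, name) {
  getExportedValue(pkg, name)
}

# Demonstration on a base package: both routes return the same object.
same <- identical(qualified_get("stats", "median"), stats::median)
print(same)

# With an org package installed, the same call would look like:
# qualified_get("org.Mm.eg.db", "org.Mm.egCHRLOC")
```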
Hi Arne,

Martin's proposed solution will allow you to generalize the code and is probably what you want to be doing.

But you also have to be mindful that, independent of design decisions made by the Bioconductor team, the universe we live in simply does not provide annotations equally for every species. For example, OMIM annotations only exist for humans (by definition), and there are no FlyBase IDs in mouse. You might reasonably expect that you should always be able to find an annotation such as 'CHRLOC' for any given species. But even 'CHRLOC' may not exist for some species, simply because the source that generates that annotation might not have done so for the species you are interested in. All of these things are well outside of Bioconductor's control, so we simply cannot guarantee that all these mappings will be available in every situation.

So these less generic mapping names are actually telling you something that you need to know: that a kind of information exists for a particular species. When you write your function, you also have to consider whether or not the mapping can even exist for the species in question. getAnnMap() will tell you this by throwing an error when the mapping in question does not exist.

Hope this helps,

Marc

On 12/01/2010 08:50 AM, Martin Morgan wrote:
> For many cases,
>
> library(annotate)
> map <- getAnnMap("CHRLOC", "org.Mm.eg.db")
>
> which will take care of loading the org package as well.
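Since getAnnMap() signals a missing mapping by throwing an error, a generic function may want to catch it and decide what to do. The following is a sketch under that assumption; the wrapper name `safe_ann_map` and the choice to return NULL are illustrative, not part of the annotate package.

```r
# Hypothetical wrapper: return the requested map, or NULL if it cannot be obtained
# (the mapping does not exist for this organism, or a package is not installed).
safe_ann_map <- function(map, org_pkg) {
  tryCatch(
    annotate::getAnnMap(map, org_pkg),
    error = function(e) {
      message("No '", map, "' map available for ", org_pkg, ": ",
              conditionMessage(e))
      NULL
    }
  )
}

# Usage, with the Bioconductor packages installed:
# omim <- safe_ann_map("OMIM", "org.Mm.eg.db")
# if (is.null(omim)) {
#   # skip this annotation for this species, or fall back to something else
# }
```

Returning NULL keeps the species-specific availability visible to the caller instead of hiding it, which matches the point made in the reply above.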
@vincent-j-carey-jr-4
Last seen 2 minutes ago
United States
I don't think this question is extremely clear, but you could probably get around your concern by using get() while pasting the organism-identifying token into a string template:

  get(paste("org.", [token1], ".eg", [token2], sep=""))

This will retrieve a map for the organism identified by token1 with targets of the type identified by token2; examples are "Mm" and "CHRLOC" for these tokens, respectively.

There are undoubtedly more elegant ways to proceed, depending on your specific concern.

On Wed, Dec 1, 2010 at 11:48 AM, arne.mueller at novartis.com wrote:
> How would you solve this when having to work with several species (if else
> ... ???)
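The string template above can be sketched as a small helper. The function name `org_map_name` is hypothetical; only base R's `paste()` and `get()` are involved, and the final `get()` call is left commented out because it needs the org package to be attached.

```r
# Hypothetical helper: build the species-specific map name from the two tokens.
org_map_name <- function(organism, map) {
  paste("org.", organism, ".eg", map, sep = "")
}

map_name <- org_map_name("Mm", "CHRLOC")
print(map_name)

# With org.Mm.eg.db attached, the map itself could then be retrieved with:
# chrloc <- get(map_name)
```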