How to understand the concept of metadata in Bioconductor
wang peter ★ 2.0k
@wang-peter-4647
Dear all,

In Bioconductor, is "metadata" equivalent to "annotation information"?

-- 
shan gao
Room 231 (Dr. Fei lab)
Boyce Thompson Institute for Plant Research
Cornell University
Tower Road, Ithaca, NY 14853-1801
Office phone: 1-607-254-1267 (day)
Official email: sg839 at cornell.edu
Facebook: http://www.facebook.com/profile.php?id=100001986532253
@steve-lianoglou-2771
United States
Hi,

On Tue, Jan 29, 2013 at 2:38 PM, Wang Peter <wng.peter at gmail.com> wrote:
> dear all:
>
> in bioconductor, is the metadata equally to "annotation information"?

Can you provide more context on this question? What metadata are you referring to?

-steve

-- 
Steve Lianoglou
Graduate Student: Computational Systems Biology
| Memorial Sloan-Kettering Cancer Center
| Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact
What I mean is: generally, does metadata == "annotation information" in any Bioconductor context?

shan
Metadata is, generically, data about data. Same in BioC as elsewhere.

So, concretely, let's look at TxDb.Dmelanogaster.UCSC.dm3.ensGene. The annotations of gene/transcript locations and their IDs are the actual data. But if you pull up the TranscriptDb object, there's a lot more information, too:

R> TxDb.Dmelanogaster.UCSC.dm3.ensGene
TranscriptDb object:
| Db type: TranscriptDb
| Supporting package: GenomicFeatures
| Data source: UCSC
| Genome: dm3
| Genus and Species: Drosophila melanogaster
| UCSC Table: ensGene
| Resource URL: http://genome.ucsc.edu/
| Type of Gene ID: Ensembl gene ID
| Full dataset: yes
| miRBase build ID: NA
| transcript_nrow: 23017
| exon_nrow: 69155
| cds_nrow: 59573
| Db created by: GenomicFeatures package from Bioconductor
| Creation time: 2012-09-10 13:00:23 -0700 (Mon, 10 Sep 2012)
| GenomicFeatures version at creation time: 1.9.39
| RSQLite version at creation time: 0.11.1
| DBSCHEMAVERSION: 1.0

The information about where the transcripts came from (Ensembl gene models, via the UCSC ensGene table), what organism they are for (Drosophila melanogaster), what genomic assembly this corresponds to (dm3), when the package was built (9/10/2012), what version of GenomicFeatures built it... these are all metadata. Hence, data about the data. It is useful for resolving issues when the data itself conflicts across sources, assemblies, builds, or simply between lab groups.

-- 
*A model is a lie that helps you see the truth.*
Howard Skipper <http://cancerres.aacrjournals.org/content/31/9/1173.full.pdf>
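For anyone following along at the console, here is a minimal sketch of reading both the primary data and the metadata from that same object. It assumes the Bioconductor packages GenomicFeatures and TxDb.Dmelanogaster.UCSC.dm3.ensGene are installed; the variable names (txdb, tx, md) are only for illustration, and the exact field names and values you see may differ from the 2012 printout above depending on the package version.

```r
# Assumption: both packages are installed from Bioconductor.
library(GenomicFeatures)
library(TxDb.Dmelanogaster.UCSC.dm3.ensGene)

# The annotation object (printed as "TranscriptDb" in the post above;
# newer releases name the class "TxDb").
txdb <- TxDb.Dmelanogaster.UCSC.dm3.ensGene

# Primary data: the transcript ranges and their IDs.
tx <- transcripts(txdb)
head(tx)

# Metadata: a data.frame of name/value pairs describing provenance
# (data source, genome build, creation time, and so on).
md <- metadata(txdb)
md[md$name %in% c("Data source", "Genome", "Creation time"), ]
```

The split mirrors the point made above: transcripts() returns the annotations themselves, while metadata() returns the description of where those annotations came from.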
Thank you very much, Steve and Tim.

But my question is quite simple: is metadata == "annotation information" in Bioconductor?

shan
By definition, if it's primary data (e.g. an annotation package's annotations), it's not metadata. Data about the data is metadata. Depending on what you are treating as primary, there are times when the term can be highly ambiguous.

However, the only times I have encountered anything explicitly labeled 'metadata' in BioC are when constructing or using annotations, and in these instances it has always referred to information about provenance (where the annotations came from). I have to believe this is an effort to reduce confusion.

Hope this makes sense.

--t
Hi,

On Tue, Jan 29, 2013 at 5:25 PM, Tim Triche, Jr. <tim.triche at gmail.com> wrote:
> By definition, if it's primary data (e.g. an annotation package's
> annotations), it's not metadata. Data about the data is metadata. Depending
> on what you are treating as primary, there are times when the term can be
> highly ambiguous.
>
> However, the only times I have encountered anything explicitly labeled
> 'metadata' in BioC is when constructing or using annotations, and in these
> instances, it has always referred to information about provenance (where the
> annotations came from). I have to believe this is an effort to reduce
> confusion.

In order to clarify (aka "muddy") things even more: there is also the "metadata" that is accessible by `mcols` on various IRanges-derived classes. Thus my initial call for clarification re: context.

There is no universal answer to this question, except the one that Tim already provided: metadata is data about meta.

Shan Gao: where are you seeing "metadata"? Please provide a link to some documentation or something similar, so someone can answer this for you if you're still confused.

-steve
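As a rough illustration of that second sense of the word, here is a small sketch using a GRanges object. It assumes the GenomicRanges package is installed; the coordinates, gene IDs and scores are made up purely for the example.

```r
# Assumption: GenomicRanges is installed from Bioconductor.
library(GenomicRanges)

# Two toy ranges (coordinates are invented for illustration).
gr <- GRanges(
  seqnames = c("chr2L", "chr2L"),
  ranges   = IRanges(start = c(100, 500), end = c(200, 650)),
  strand   = c("+", "-")
)

# "Metadata columns": one value per range, stored alongside the ranges.
mcols(gr)$gene_id <- c("geneA", "geneB")  # hypothetical IDs
mcols(gr)$score   <- c(0.9, 0.4)          # hypothetical scores
mcols(gr)                                 # a DataFrame with one row per range

# Object-level metadata: a free-form list describing the object as a whole.
metadata(gr) <- list(genome = "dm3", source = "made-up example")
metadata(gr)
```

So even within one object, "metadata" can mean per-range columns (mcols) or a description of the whole object (metadata), which is why the context of the question matters.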
I meant to say:

On Tue, Jan 29, 2013 at 5:33 PM, Steve Lianoglou <mailinglist.honeypot at gmail.com> wrote:
> There is no universal answer to this question, except the one that Tim
> already provided: metadata is data about meta.

metadata is data about *data*

I'm not sure what "metadata is data about meta" would mean, but I think it must have something to do with why people first "invented" recursion.

-steve
Please see this website. In the upper right region, metadata = annotation packages:

http://www.bioconductor.org/packages/release/data/annotation/

shan
(meant to send to list as well)

That link could certainly stand to be renamed, and thereby reduce confusion.

It's not that annotation isn't data about collected measurement data (it is), but rather there's a handy name for such metadata within BioC: 'annotations'.

Now I can understand better the point of confusion.
Anyway, if Bioconductor uses "metadata" to refer to all the information for annotation, that makes sense.

shan
"Data about data" doesn't have a better general name than just "metadata". "Data about genes" does -- i.e., "annotations" -- so why not use that name? Usually, the most specific word available is the right one.

Since the word "metadata" is fundamentally ambiguous, the only appropriate time to use it (IMHO) is when there exists no specific alternative :-/

Just my $0.02.
On 01/29/2013 02:58 PM, Tim Triche, Jr. wrote:
> (meant to send to list as well)
>
> That link could certainly stand to be renamed, and thereby reduce confusion.
>
> It's not that annotation isn't data about collected measurement data (it
> is), but rather there's a handy name for such metadata within BioC:
> 'annotations'
>
> Now I can understand better the point of confusion.

Not sure that I do, but the links have been renamed 'Annotation Data', consistent with the biocViews term.

Martin

-- 
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109
Location: Arnold Building M1 B861
Phone: (206) 667-2793