Getting annotation into eset

Timothy Wu ▴ 120

Hi,

I noticed that when I obtain an eset from a GEO matrix file via GEOquery, the results of a limma analysis include the annotation information. However, an eset obtained from CEL files via ReadAffy() and GCRMA threestep() does not include those annotations. How do I add the annotations to the eset (preferred) before the limma analysis? Or should they be added to the limma results instead, if that is more appropriate? The platform is U133 Plus 2.

Any help is appreciated. Thanks in advance.

Timothy

limma GEOquery • 2.8k views

updated 14.5 years ago by Sean Davis • written 14.5 years ago by Timothy Wu

Sean Davis 21k

On Thu, Oct 28, 2010 at 5:44 AM, Timothy Wu wrote:
> How do I infuse the annotations into eset (preferred) before limma analysis?

Hi, Timothy.

The easiest way to do this will depend on the details of the workflow you are using. If you can provide those details, answering your question will be simpler. To answer directly, though, you simply need to fill the featureData slot of the eSet with the annotation information.

Sean

written 14.5 years ago by Sean Davis
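
Sean's suggestion to fill the featureData slot can be sketched as follows. This is a minimal illustration, not the poster's actual workflow: the expression matrix, the probeset IDs (ps_1, ps_2, ps_3) and the annotation table are made-up toy values. Only the Biobase calls (ExpressionSet(), AnnotatedDataFrame(), featureData<-, featureNames(), fData()) are real.

```r
library(Biobase)

# Toy stand-in for the eset produced from CEL files (values are invented).
expr <- matrix(
  c(7.1, 8.3, 5.2, 7.4, 8.0, 5.9),
  nrow = 3,
  dimnames = list(c("ps_1", "ps_2", "ps_3"), c("sample_A", "sample_B"))
)
eset <- ExpressionSet(assayData = expr)

# Hypothetical annotation table, deliberately in a different row order.
ann <- data.frame(
  SYMBOL = c("GENE3", "GENE1", "GENE2"),
  row.names = c("ps_3", "ps_1", "ps_2"),
  stringsAsFactors = FALSE
)

# Reorder the rows to match the eset, then store them as featureData.
ann <- ann[featureNames(eset), , drop = FALSE]
featureData(eset) <- AnnotatedDataFrame(ann)

# The annotation now travels with the expression data.
fData(eset)
```

With the annotation stored in featureData, limma's lmFit() should carry these columns into the fit object, so they appear next to the statistics in the topTable() output.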

On Thu, Oct 28, 2010 at 6:34 PM, Sean Davis wrote:
> To answer directly, though, you simply need to fill the featureData slot of the eSet with the annotation information.

Hi Sean,

I am not sure which workflow you are referring to. This is how I obtain the eset:

==
library(affy)
library(affyPLM)

affybatch <- ReadAffy()
eset <<- threestep_gcrma(affybatch)
==

Is there a package that I can use to fill the featureData? I am not very skilled in R, and what you describe looks intimidating.

Timothy

written 14.5 years ago by Timothy Wu

On Thu, Oct 28, 2010 at 7:44 AM, Timothy Wu wrote:
> Are there package that I can use to fill the featureData? I am not very skilled in R and what you says looks intimidating.

Hi, Timothy.

You could look at the AnnotationDbi package vignette, "AnnotationDbi: How to use the ".db" annotation packages". Also, you should look at the annotate Bioconductor package.

Sean

written 14.5 years ago by Sean Davis
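
The kind of lookup the AnnotationDbi vignette describes might look like the sketch below for this platform. It assumes the hgu133plus2.db annotation package is installed and an AnnotationDbi version that provides select(); the three probeset IDs are only example keys, and in a real analysis the keys would be featureNames(eset).

```r
library(AnnotationDbi)
library(hgu133plus2.db)

# Example keys; in a real analysis use featureNames(eset) instead.
probe_ids <- c("1007_s_at", "1053_at", "117_at")

# Fetch gene-level annotation for each probeset.
ann <- AnnotationDbi::select(
  hgu133plus2.db,
  keys    = probe_ids,
  columns = c("SYMBOL", "GENENAME", "ENTREZID"),
  keytype = "PROBEID"
)

# A probeset can map to several genes; keep the first hit so that
# there is exactly one row per probeset.
ann <- ann[!duplicated(ann$PROBEID), ]
rownames(ann) <- ann$PROBEID
```

The resulting data frame has one row per probeset, keyed by probeset ID, which is the shape needed for the featureData of the eset.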

Hi Timothy,

It sounds like you might just be asking how to set the annotation slot on the ExpressionSet object? You have not given us a lot of information, so it is hard to know. But if that is the case, then you might also try:

library(Biobase)
openVignette()

Or just go here:

http://www.bioconductor.org/help/bioc-views/release/bioc/html/Biobase.html

Or look at the manual page for ExpressionSets:

?ExpressionSet

And learn about the annotation() accessor method that is defined for these.

Marc

written 14.5 years ago by Marc Carlson
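
As a small illustration of the annotation() accessor Marc mentions, the sketch below sets and reads the slot on a toy ExpressionSet (the matrix values and names are invented). Note that this slot only records the name of the chip annotation; it does not by itself add gene symbols or other per-probe columns.

```r
library(Biobase)

# Toy ExpressionSet: 2 probesets x 2 samples, invented values.
expr <- matrix(
  c(7.1, 8.3, 7.4, 8.0),
  nrow = 2,
  dimnames = list(c("ps_1", "ps_2"), c("sample_A", "sample_B"))
)
eset <- ExpressionSet(assayData = expr)

# Record which chip annotation the data belongs to, then read it back.
annotation(eset) <- "hgu133plus2"
annotation(eset)
```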

Dear BioC users,

I performed several hypergeometric tests on my data for GO terms and KEGG pathways. With the GO results I can use the probeSetSummary function to retrieve the gene list associated with each significant category. However, this function does not work if I run the HG test using KEGGHyperGParams, because the results are not of the GOHyperGResult class. Is there an equivalent function for KEGG to get those gene lists?

Thanks in advance for your help.

Clémentine

--
Clémentine Dressaire
Post-doctoral research fellow
Control of gene expression lab
ITQB - Instituto de Tecnologia Química e Biológica
Apartado 127, Av. da República
2780-157 Oeiras
Portugal

written 14.5 years ago by Clémentine Dressaire

Hi Clémentine,

I don't know whether such a function exists. I use two little helper functions to retrieve the probe IDs or gene symbols of the genes in a gene list that are associated with a KEGG ID:

KEGG2genes = function(KEGGID, genelist, db){
  # load the annotation package, e.g. hgu133plus2.db
  require(paste(db, "db", sep="."), character.only = TRUE)
  l = vector("list")
  for (i in seq_along(KEGGID)){
    # all probes annotated to this KEGG pathway
    kegg = as.matrix(unlist(mget(KEGGID[i], get(paste(db, "PATH2PROBE", sep="")), ifnotfound=NA)))
    # keep only the probes that are also in the gene list
    l[[i]] = genelist[is.element(genelist, kegg[,1])]
  }
  names(l) = KEGGID
  l
}

KEGG2symbol = function(KEGGID, genelist, db){
  l = vector("list")
  for (i in seq_along(KEGGID)){
    id = unlist(KEGG2genes(KEGGID=KEGGID[i], genelist=genelist, db=db))
    # look up the gene symbol of each matching probe
    l[[i]] = as.matrix(mget(id, get(paste(db, "SYMBOL", sep="")), ifnotfound=NA))
  }
  names(l) = KEGGID
  l
}

Here "KEGGID" is a character vector of the KEGG ID(s) you are interested in, "genelist" is a character vector containing the probe/probeset IDs of the gene list you used to create the KEGGHyperGResult, and "db" is a character string naming the annotation database for your array without the .db extension (e.g. db="hgu133plus2" for the Affymetrix U133 Plus 2.0 array). The result is a list with one matrix per KEGG ID, containing the probe IDs and gene symbols. It might not be the most elegant way, but it works.

Kind regards,

Mike

-----Original Message-----
From: Clémentine Dressaire
Sent: 29.10.2010 13:27:44
Subject: [BioC] retrieve genes names after KEGG hypergeometric test
> Is there any equivalent KEGG function to get those genes list?

written 14.5 years ago by Mike Walter
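
The filtering logic in Mike's helpers can also be written without explicit loops. The sketch below is an illustration only: it replaces the annotation package maps with two small made-up lookup objects (path2probe, probe2symbol) so that it runs in base R, and the function names kegg_to_probes and kegg_to_symbols are hypothetical.

```r
# Made-up stand-ins for the <db>PATH2PROBE and <db>SYMBOL maps.
path2probe <- list(
  path_1 = c("probe_A", "probe_B", "probe_C"),
  path_2 = c("probe_C", "probe_D")
)
probe2symbol <- c(probe_A = "GENE1", probe_B = "GENE2",
                  probe_C = "GENE3", probe_D = "GENE4")

# For each pathway ID, keep the probes from genelist annotated to it.
kegg_to_probes <- function(kegg_ids, genelist, path2probe) {
  sapply(kegg_ids, function(id) {
    genelist[genelist %in% path2probe[[id]]]
  }, simplify = FALSE)
}

# Translate the matching probes of each pathway into gene symbols.
kegg_to_symbols <- function(kegg_ids, genelist, path2probe, probe2symbol) {
  lapply(kegg_to_probes(kegg_ids, genelist, path2probe),
         function(probes) probe2symbol[probes])
}

genelist <- c("probe_A", "probe_C", "probe_D")
hits <- kegg_to_symbols(c("path_1", "path_2"), genelist,
                        path2probe, probe2symbol)
```

As in the original helpers, the result is a list named by pathway ID; each element here is a character vector of symbols named by probe ID.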

Hi Mike,

Could you explain the difference between the db and "db" you are using? If db is the character vector with the annotation database for your array without the .db extension, then what does "db" represent?

Thanks again for your help,

Clémentine

On Fri, 29 Oct 2010 14:23:00 +0200 (CEST), "Mike Walter" wrote:
> require(paste(db, "db", sep="."), character.only = TRUE)

written 14.5 years ago by Clémentine Dressaire
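
On the question of db versus "db": in Mike's call paste(db, "db", sep="."), the unquoted db is the function argument holding the chip name, while the quoted "db" is a literal string that supplies the package suffix. A short base R sketch, using "hgu133plus2" as an example value for the argument:

```r
db <- "hgu133plus2"   # value the user passes in as the db argument

# Unquoted db is the variable; quoted "db" is just the text "db".
pkg <- paste(db, "db", sep = ".")              # "hgu133plus2.db"

# The same variable is reused to build the names of the annotation maps.
map_name <- paste(db, "PATH2PROBE", sep = "")  # "hgu133plus2PATH2PROBE"
```

So require() loads the package named in pkg, and get() fetches the map object named in map_name from that package.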
