topTable (fit) annotation
0
0
Entering edit mode
Jing Huang ▴ 380
@jing-huang-4737
Last seen 10.4 years ago
Sorry. My problem is the colnames(topTable(fit)) that I generated by GEO GSE dataset only provides a few items such as pValue... I would like to have a full table just like GEO GDS dataset generated. I describe the problem at my original email: Different dataset of GEO provides different patten on topTable(fit). This generated with GDS dataset >colnames(topTable(fit)) [1] "ID" "Gene.title" "Gene.symbol" [4] "Gene.ID" "UniGene.title" "UniGene.symbol" [7] "UniGene.ID" "Nucleotide.Title" "GI" [10] "GenBank.Accession" "Platform_CLONEID" "Platform_ORF" [13] "Platform_SPOTID" "Chromosome.location" "Chromosome.annotation" [16] "GO.Function" "GO.Process" "GO.Component" [19] "GO.Function.1" "GO.Process.1" "GO.Component.1" [22] "CTRL" "HIF1a" "HIF2a" [25] "HIF1a2a" "AveExpr" "F" [28] "P.Value" "adj.P.Val" This generated with GSE dataset >colnames(topTable(fit) [1] "ID" "mir210" "CTRL2" "AveExpr" "F" "P.Value" "adj.P.Val" I hope this is clear. Jing On 8/31/11 9:38 AM, "Freudenberg, Johannes (NIH/NIEHS) [E]" <johannes.freudenberg at="" nih.gov=""> wrote: >Hi Jing, > >I'm not quite sure I understand your question. Why can't you use the >data for further analysis? Maybe the issue is that getGEO() returns a >list? Have you tried something like: > >> gse16962 <- getGEO("GSE16962") >> eset <- gse16962[[1]] >> head(fData(eset)) > ID GB_ACC SPOT_ID Species.Scientific.Name Annotation.Date >1007_s_at 1007_s_at U48705 <na> Homo sapiens Mar 11, 2009 >1053_at 1053_at M87338 <na> Homo sapiens Mar 11, 2009 >117_at 117_at X51757 <na> Homo sapiens Mar 11, 2009 >121_at 121_at X69699 <na> Homo sapiens Mar 11, 2009 >1255_g_at 1255_g_at L36861 <na> Homo sapiens Mar 11, 2009 >1294_at 1294_at L13852 <na> Homo sapiens Mar 11, 2009 > Sequence.Type Sequence.Source >1007_s_at Exemplar sequence Affymetrix Proprietary Database >1053_at Exemplar sequence GenBank >117_at Exemplar sequence Affymetrix Proprietary Database >121_at Exemplar sequence GenBank >1255_g_at Exemplar sequence Affymetrix Proprietary Database >1294_at Exemplar sequence GenBank > >Etc. > >--Johannes > > > >-----Original Message----- >From: Jing Huang [mailto:huangji at ohsu.edu] >Sent: Wednesday, August 31, 2011 12:21 PM >To: Davis, Sean (NIH/NCI) [E] >Cc: bioconductor at r-project.org >Subject: Re: [BioC] topTable (fit) annotation > >Thank YOU Sean for responding my question. I am not sure where I should >add the platform annotation in. > >Here are what I observed: > >If I extract "GDS2162" by typing in > >> gds=getGEO("GDS2162") >> eset=GDS2eSet(gds,do.log2=T) > >>eset > >ExpressionSet (storageMode: lockedEnvironment) >assayData: 45101 features, 16 samples > element names: exprs >protocolData: none >phenoData > sampleNames: GSM67339 GSM67343 ... GSM67352 (16 total) > varLabels: sample genotype/variation agent description > varMetadata: labelDescription >featureData > featureNames: 1415670_at 1415671_at ... AFFX-TrpnX-M_at (45101 total) > fvarLabels: ID Gene.title ... GO.Component.1 (21 total) > > fvarMetadata: Column labelDescription >experimentData: use 'experimentData(object)' > pubMedIds: 16237459 >Annotation: > >Most of annotation is in. > > > >If I extract "GSE16962" by typing in > >>gse=getGEO("GSE16962") > >> gse >$GSE16962_series_matrix.txt.gz >ExpressionSet (storageMode: lockedEnvironment) >assayData: 54675 features, 12 samples > element names: exprs >protocolData: none >phenoData > sampleNames: GSM424759 GSM424760 ... GSM424770 (12 total) > varLabels: title geo_accession ... data_row_count (34 total) > varMetadata: labelDescription >featureData > featureNames: 1007_s_at 1053_at ... AFFX-TrpnX-M_at (54675 total) > fvarLabels: ID GB_ACC ... Gene.Ontology.Molecular.Function (16 total) > fvarMetadata: Column Description labelDescription >experimentData: use 'experimentData(object)' >Annotation: GPL570 > >It looks to me it includes annotation package such as GO term.... >But I can't use this eset data to do further analysis (such as fit table) >what I need. > >If I extract ("GSE16962") by typing in > >>gse=getGEO("GSE16962", GSEMatrix=F) > >Then following GEOquery package, I can generate eset2, which I can use to >do analysis what I need. But the eset2 looks like this: > >> eset2 >ExpressionSet (storageMode: lockedEnvironment) >assayData: 54675 features, 12 samples > element names: exprs >protocolData: none >phenoData > sampleNames: GSM424759 GSM424760 ... GSM424770 (12 total) > varLabels: samples > varMetadata: labelDescription >featureData: none >experimentData: use 'experimentData(object)' >Annotation: > > > >Here are the scripts that I use to generate eset2: > >>probesets <- Table(GPLList(gse)[[1]])$ID data.matrix <- >>do.call("cbind", lapply(GSMList(gse), function(x) { >+ tab <- Table(x) >+ mymatch <- match(probesets,tab$ID_REF) >+ return(tab$VALUE[mymatch]) >+ })) >> data.matrix <- apply(data.matrix, 2, function(x) { >+ as.numeric(as.character(x)) >+ }) >> require(Biobase) >> rownames(data.matrix) <- probesets >> colnames(data.matrix) <- names(GSMList(gse)) pdata <- >> data.frame(samples=names(GSMList(gse))) >> rownames(pdata) <- names(GSMList(gse)) pheno <- >> as(pdata,"AnnotatedDataFrame") >> eset2 <- new('ExpressionSet',exprs=data.matrix,phenoData=pheno) > > >At which step, I should add > >>gplannot = getGEO("GPL96", AnnotGPL=TRUE) > > >Many Many thanks > >Jing > > > > > > > > > > > > > > > >On 8/30/11 6:01 PM, "Sean Davis" <sdavis2 at="" mail.nih.gov=""> wrote: > >>Hi, Jing. >> >>NCBI GEO maintains two types of GPL records. The normal variant is >>just supplied by the submitter. However, when a GEO Series is curated >>by NCBI GEO into a GEO DataSet (GDS), they create a so-called >>"Annotation GPL". These have a relatively standard set of columns. I >>have not made the change to GEOquery yet to grab this annotation GPL >>when getting Series Matrix files. But, you can get them yourself by >>specifying: >> >>gplannot = getGEO("GPL96", AnnotGPL=TRUE) >> >>You can always replace the feature data of the ExpressionSets with the >>information in the retrieved Annotation GPL. >> >>I hope that is clear. >> >>Sean >> >> >>On Tue, Aug 30, 2011 at 5:01 PM, Jing Huang <huangji at="" ohsu.edu=""> wrote: >>> Dear All members, >>> >>> I have been extracting data from GEO (GEO package) and do some >>>analysis on them by using limma package. What I discover is the >>>components of >>>topTable(fit) are different from the dataset GDS and GSE. >>> >>> If the data is from GDS, then the colnames of topTable (fit) looks >>>like this. >>> >>>> colnames(topTable(fit)) >>> [1] "ID" "Gene.title" "Gene.symbol" >>> [4] "Gene.ID" "UniGene.title" "UniGene.symbol" >>> [7] "UniGene.ID" "Nucleotide.Title" "GI" >>> [10] "GenBank.Accession" "Platform_CLONEID" "Platform_ORF" >>> [13] "Platform_SPOTID" "Chromosome.location" >>>"Chromosome.annotation" >>> [16] "GO.Function" "GO.Process" "GO.Component" >>> [19] "GO.Function.1" "GO.Process.1" "GO.Component.1" >>> [22] "CTRL" "HIF1a" "HIF2a" >>> [25] "HIF1a2a" "AveExpr" "F" >>> [28] "P.Value" "adj.P.Val" >>> >>> If the data is from GSE, then the colnames of topTable(fit) looks >>>like this: >>> >>>>colnames(topTable(fit) >>> >>> [1] "ID" "mir210" "CTRL2" "AveExpr" "F" >>>"P.Value" "adj.P.Val" >>> >>> I am trying to add some term into this table by doing following one >>>by >>>one: the data is generated by Affymetrix human U133 platform: >>> >>>>Library(hgu133plus2.db) >>>>x=hgu133plus2SYMBOL >>>>y=topTable(fit) >>>>y$SYMBOL=unlist(as.list(x[y$ID])) >>> >>> It works but I need to add ENTREZID,SYMBOL,CHR, CHRloc, and GO >>>annotations as well. I like to have the topTable more like the >>>topTable(fit) generated at top by data GEO GDS data >>> >>> I am wondering if there is an easy way to annotate all once. >>> >>> In addition, I am having a trouble to annotate GO term. >>> >>> Many Thanks >>> >>> Jing >>> >>> >>> >>> >>> [[alternative HTML version deleted]] >>> >>> _______________________________________________ >>> Bioconductor mailing list >>> Bioconductor at r-project.org >>> https://stat.ethz.ch/mailman/listinfo/bioconductor >>> Search the archives: >>>http://news.gmane.org/gmane.science.biology.informatics.conductor >>> > >_______________________________________________ >Bioconductor mailing list >Bioconductor at r-project.org >https://stat.ethz.ch/mailman/listinfo/bioconductor >Search the archives: >http://news.gmane.org/gmane.science.biology.informatics.conductor
Annotation GO Homo sapiens annotate limma GEOquery Annotation GO Homo sapiens annotate • 1.2k views
ADD COMMENT

Login before adding your answer.

Traffic: 559 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6