pd.hugene.1.0.st.v1


Mark Robinson ★ 1.1k


Hi Vince.

Thanks for the reply.

That's good to know. But it only allows me to access the indices, not to actually compute gene-level summaries, right? Is there any way to do that without building the package from scratch?

Cheers,
Mark

On 31/07/2009, at 10:10 PM, Vincent Carey wrote:

> On Fri, Jul 31, 2009 at 12:48 AM, Mark Robinson <mrobinson at wehi.edu.au> wrote:
>> Hi all.
>>
>> I wonder if it makes more sense to have the *transcript* version of this
>> package, instead of the *probeset* version, available when you install via:
>>
>
> This merits further discussion. Note that under the current approach you can
> obtain the transcript cluster indices for summarization using fData on the
> output of rma:
>
> > class(tismix)
> [1] "GeneFeatureSet"
> attr(,"package")
> [1] "oligoClasses"
> > class(tismixRMA)
> [1] "ExpressionSet"
> attr(,"package")
> [1] "Biobase"
> > fData(tismixRMA)[1:4,]
>          fsetid  exon_id transcript_cluster_id level crosshyb_type chrom
> 7896737 7896737 96595542               7896736    NA             3     1
> 7896739 7896739 96595544               7896738    NA             3     1
> 7896741 7896741 96595546               7896740    NA             3     1
> 7896743 7896743 96595548               7896742    NA             3     1
>         accessions
> 7896737 <NA>
> 7896739 <NA>
> 7896741 BC136848,BC136907,ENST00000318050,ENST00000326183,ENST00000335137,NM_001004195,NM_001005240,NM_001005484
> 7896743 BC118988,ENST00000279067
> > dim(fData(tismixRMA))
> [1] 253002      7
> > dim(exprs(tismixRMA))
> [1] 253002     33
>
> Annotation packages are available at both the probeset and transcript cluster
> level, thanks to folks at City of Hope (e.g.,
> http://www.bioconductor.org/packages/release/data/annotation/html/hugene10sttranscriptcluster.db.html)
>
>> source("http://bioconductor.org/biocLite.R")
>> biocLite("pd.hugene.1.0.st.v1")
>>
>> It seems like, as a default, more people would want gene-level summaries for
>> these arrays ... especially since ~200k (~80%) of the probesets have 3
>> probes or fewer.
>>
>> Of course I (and everyone around the world) could build this package locally
>> from scratch using the transcript CSV, but it seems like there would be
>> enough demand for this to make it available direct from BioC. Just a thought.
>> Does anyone agree?
>>
>> Or am I missing something that will allow me to do gene-level analysis from
>> this package?
>>
>> My session is below.
>>
>> Thanks in advance.
>> Mark
>>
>> ----------------------
>> mac1618:Desktop mrobinson$ wc -l HuGene-1_0-st-v1.na29.*.csv
>>   257449 HuGene-1_0-st-v1.na29.hg18.probeset.csv
>>    33317 HuGene-1_0-st-v1.na29.hg18.transcript.csv
>> ----------------------
>>
>> ----------------------
>> > library(oligo)
>> Loading required package: oligoClasses
>> Loading required package: Biobase
>>
>> Welcome to Bioconductor
>>
>>   Vignettes contain introductory material. To view, type
>>   'openVignette()'. To cite Bioconductor, see
>>   'citation("Biobase")' and for packages 'citation(pkgname)'.
>>
>> Loading required package: preprocessCore
>> Welcome to oligo version 1.8.1
>> > cf <- dir(celPath,"CEL")
>> > fs <- read.celfiles( file.path(celPath,cf) )
>> Loading required package: pd.hugene.1.0.st.v1
>> Loading required package: RSQLite
>> Loading required package: DBI
>> Platform design info loaded.
>> Reading in : rawData/cell_line/HuGene-1_0-st-v1//cancer1.CEL
>> Reading in : rawData/cell_line/HuGene-1_0-st-v1//cancer2.CEL
>> Reading in : rawData/cell_line/HuGene-1_0-st-v1//normal1.CEL
>> Reading in : rawData/cell_line/HuGene-1_0-st-v1//normal2.CEL
>> > rmaOligo <- oligo::rma(fs)
>> Background correcting
>> Normalizing
>> Calculating Expression
>> dmOligo <- exprs(rmaOligo)
>> dim(rmaOligo)
>> > dmOligo <- exprs(rmaOligo)
>> > dim(rmaOligo)
>> Features  Samples
>>   253002        4
>> > sessionInfo()
>> R version 2.9.0 (2009-04-17)
>> i386-apple-darwin8.11.1
>>
>> locale:
>> en_AU.UTF-8/en_AU.UTF-8/C/C/en_AU.UTF-8/en_AU.UTF-8
>>
>> attached base packages:
>> [1] stats graphics grDevices utils datasets methods base
>>
>> other attached packages:
>> [1] pd.hugene.1.0.st.v1_2.4.1 RSQLite_0.7-1
>> [3] DBI_0.2-4                 oligo_1.8.1
>> [5] preprocessCore_1.6.0      oligoClasses_1.6.0
>> [7] Biobase_2.4.1
>>
>> loaded via a namespace (and not attached):
>> [1] affxparser_1.15.6 affyio_1.12.0 Biostrings_2.12.1 IRanges_1.2.2
>> [5] splines_2.9.0
>> ----------------------
>>
>> ------------------------------
>> Mark Robinson, PhD (Melb)
>> Epigenetics Laboratory, Garvan
>> Bioinformatics Division, WEHI
>> e: m.robinson at garvan.org.au
>> e: mrobinson at wehi.edu.au
>> p: +61 (0)3 9345 2628
>> f: +61 (0)3 9347 0852
>>
>> _______________________________________________
>> Bioconductor mailing list
>> Bioconductor at stat.math.ethz.ch
>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>> Search the archives:
>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>
> --
> Vincent Carey, PhD
> Biostatistics, Channing Lab
> 617 525 2265

------------------------------
Mark Robinson, PhD (Melb)
Epigenetics Laboratory, Garvan
Bioinformatics Division, WEHI
e: m.robinson at garvan.org.au
e: mrobinson at wehi.edu.au
p: +61 (0)3 9345 2628
f: +61 (0)3 9347 0852
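
As a stopgap with the transcript cluster indices mentioned in this thread, here is a minimal sketch in base R that collapses probeset-level values to one row per transcript cluster by taking the per-sample median. It is only a rough stand-in under stated assumptions, not what RMA would produce if it were run at the transcript-cluster level (that would summarize the probe-level data directly). The toy matrix and IDs are made up, and `collapse_to_clusters` is a hypothetical helper name; with real data one would presumably pass `exprs(rmaOligo)` and `fData(rmaOligo)$transcript_cluster_id`, assuming the feature data carries that column as in the fData output quoted in this thread.

```r
# Toy stand-in for exprs(rmaOligo): 6 probesets x 2 samples (made-up values)
probeset_expr <- matrix(
  c(5.0, 5.2,
    5.4, 5.6,
    7.0, 7.1,
    7.4, 7.3,
    7.2, 7.5,
    3.0, 3.1),
  ncol = 2, byrow = TRUE,
  dimnames = list(paste0("ps", 1:6), c("sample1", "sample2"))
)

# Toy stand-in for fData(rmaOligo)$transcript_cluster_id (made-up IDs)
cluster_id <- c("tc1", "tc1", "tc2", "tc2", "tc2", "tc3")

# Hypothetical helper: per-sample median over the probesets of each cluster.
# Probesets with a missing cluster ID are dropped.
collapse_to_clusters <- function(expr, cluster) {
  keep <- !is.na(cluster)
  groups <- split(seq_len(nrow(expr))[keep], cluster[keep])
  do.call(rbind, lapply(groups, function(idx) {
    apply(expr[idx, , drop = FALSE], 2, median)
  }))
}

gene_expr <- collapse_to_clusters(probeset_expr, cluster_id)
print(gene_expr)
```

The result has one row per transcript cluster and one column per sample. Since so many probesets on this array have only a few probes, a median of probeset summaries is likely to be noisier than a proper probe-level fit, so treat it as exploratory.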

Epigenetics Annotation oligo

written 15.8 years ago by Mark Robinson • updated 15.8 years ago by Benilton Carvalho


Benilton Carvalho ★ 4.3k


Brazil/Campinas/UNICAMP

Mark,

I'm planning on providing an updated version of the annotation packages that will allow gene-level summarization in about 1 week (maybe earlier).

b

On Jul 31, 2009, at 7:20 PM, "Mark Robinson" <mrobinson at wehi.edu.au> wrote:

> Hi Vince.
>
> Thanks for the reply.
>
> That's good to know. But, it only allows me to access the indices,
> not to actually compute gene-level summaries, right? Any way to do
> that without building the package from scratch?
>
> Cheers,
> Mark

written 15.8 years ago by Benilton Carvalho


cstrato ★ 3.9k


Austria

Dear Mark,

I am not sure, but maybe you could use the old annotation package, which I believe was built for release 3 of the HuGene array; see:
http://www.bioconductor.org/packages/2.3/data/annotation/html/hugene10st.db.html

Alternatively, you could use the package xps, which allows you to compute both gene-level and probeset-level summaries.

Best regards
Christian

_._._._._._._._._._._._._._._._._._
C.h.r.i.s.t.i.a.n S.t.r.a.t.o.w.a
V.i.e.n.n.a A.u.s.t.r.i.a
e.m.a.i.l: cstrato at aon.at
_._._._._._._._._._._._._._._._._._

Mark Robinson wrote:

> Hi Vince.
>
> Thanks for the reply.
>
> That's good to know. But, it only allows me to access the indices,
> not to actually compute gene-level summaries, right? Any way to do
> that without building the package from scratch?
>
> Cheers,
> Mark

written 15.8 years ago by cstrato
