Human Gene ST 1.0 probeset controls
@javier-perez-florido-3121
Dear list, I would like to know if the GeneChip Human Gene ST 1.0 array has some gene controls (like AFFX genes in other Affymetrix technologies). Thanks in advance, Javier
Guido Hooiveld ★ 4.1k
@guido-hooiveld-2020
Wageningen University, Wageningen, the …
Hi Javier,

In our facility we add the hybridization and spike-in controls to the samples when running them on Mouse Gene ST arrays (as recommended by Affymetrix), but I noticed that these are indeed not listed in the annotation files. Something for Affymetrix to answer?

Regards,
Guido

Guido Hooiveld, PhD
Nutrition, Metabolism & Genomics Group
Division of Human Nutrition, Wageningen University
In addition to my previous mail: I just noticed there are 21 so-called 'control->affx' probesets (column R in Affymetrix's annotation file), but these probesets have no annotation other than this label. NetAffx is not more informative either. Maybe these represent the hyb/spike-in controls, but I am not at all sure about this.

G
cstrato ★ 3.9k
@cstrato-908
Austria
Dear Javier,

When you open the Affymetrix annotation files for the HuGene ST 1.0 array you will see that it does contain 13 AFFX controls and a number of "other_spike" controls, in both the transcript and the probeset annotation files. The MoGene array contains 22 "control->affx" probesets, including 13 AFFX controls (bac_spike, polya_spike).

Best regards
Christian

Christian Stratowa
Vienna, Austria
Thanks to everybody,

I'm new to working with HuGene ST 1.0 and have some questions:

* I have normalized some CEL files using the oligo package, and the annotation package used by default is pd.hugene.1.0.st.v1. How can I access this annotation to check the type of control probe sets used? I've tried:

    conn <- db(pd.hugene.1.0.st.v1)
    dbListTables(conn)
    [1] "bgfeature"  "chrom_dict" "core_mps"   "featureSet" "level_dict"
    [6] "pmfeature"  "table_info" "type_dict"
    dbListFields(conn, "featureSet")
    [1] "fsetid"                "strand"                "start"                 "stop"
    [5] "transcript_cluster_id" "exon_id"               "crosshyb_type"         "level"
    [9] "chrom"                 "type"
    sql = "SELECT fsetid,type FROM featureSet"
    dbGetQuery(conn, sql)

  But I get integer codes (1, 2, 3, ...) in the type field instead of the "AFFX*", "other-spike", etc. control probe set labels used in the annotation file. How can I get this information?

* What is the difference between hugene10stprobeset.db and hugene10sttranscriptcluster.db? What is the difference between summarizing at the probe set level and at the gene level?

Thanks again,
Javier

P.S. If you know of any document that could help me with these arrays, that would be great.

R version 2.10.0 (2009-10-26)
i386-pc-mingw32

locale:
[1] LC_COLLATE=Spanish_Spain.1252 LC_CTYPE=Spanish_Spain.1252 LC_MONETARY=Spanish_Spain.1252
[4] LC_NUMERIC=C LC_TIME=Spanish_Spain.1252

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] annotate_1.24.0 AnnotationDbi_1.8.0 pd.hugene.1.0.st.v1_3.0.0 RSQLite_0.7-3
[5] DBI_0.2-4 oligo_1.10.0 preprocessCore_1.8.0 oligoClasses_1.8.0
[9] Biobase_2.6.0

loaded via a namespace (and not attached):
[1] affxparser_1.18.0 affyio_1.14.0 Biostrings_2.14.0 IRanges_1.4.0 splines_2.10.0 tools_2.10.0
[7] xtable_1.5-5
Hi Javier,

This is what you want to do:

    info = dbGetQuery(conn, paste("SELECT DISTINCT meta_fsetid as transcript_id, type_id",
                                  "FROM featureSet, core_mps, type_dict",
                                  "WHERE featureSet.fsetid=core_mps.fsetid",
                                  "AND featureSet.type=type_dict.type"))

I'll make sure that, in the next releases, users are not expected to figure out queries like this.

Using a simplistic description: the probeset db is at the exon level; the transcript db is at the gene level.

b
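The join in the query above can be tried out without the annotation package. What follows is only a minimal sketch, assuming the DBI and RSQLite packages are installed: it builds an in-memory SQLite database whose table and column names (featureSet, core_mps, type_dict, fsetid, meta_fsetid, type, type_id) are taken from the thread, while every row is made up for illustration and does not come from pd.hugene.1.0.st.v1.

```r
library(DBI)
library(RSQLite)

# In-memory stand-in for the annotation database; all rows are hypothetical.
con <- dbConnect(RSQLite::SQLite(), ":memory:")

# Lookup table: integer type code -> readable label
dbWriteTable(con, "type_dict", data.frame(
  type    = 1:3,
  type_id = c("main", "control->affx", "control->chip"),
  stringsAsFactors = FALSE
))

# Probesets (exon level), each carrying only the integer type code
dbWriteTable(con, "featureSet", data.frame(
  fsetid = c(101L, 102L, 103L, 104L),
  type   = c(1L, 1L, 2L, 2L)
))

# Mapping from probesets to transcript clusters (gene level)
dbWriteTable(con, "core_mps", data.frame(
  meta_fsetid = c(9001L, 9001L, 9002L, 9003L),
  fsetid      = c(101L, 102L, 103L, 104L)
))

# Same join as in the thread, with an ORDER BY added for a stable result
info <- dbGetQuery(con, paste(
  "SELECT DISTINCT meta_fsetid AS transcript_id, type_id",
  "FROM featureSet, core_mps, type_dict",
  "WHERE featureSet.fsetid = core_mps.fsetid",
  "AND featureSet.type = type_dict.type",
  "ORDER BY transcript_id"
))
print(info)

dbDisconnect(con)
```

The two probesets that share transcript cluster 9001 collapse into a single row because of DISTINCT, which is why the result is at the transcript (gene) level even though featureSet is at the probeset level.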
OK,

So, to sum up (and to check that I understand the Human Gene ST 1.0 array): summarizing to the gene level means that several probesets together make up a gene. To summarize to the gene level when normalizing, I executed:

    OligoEset <- rma(OligoRaw, target="core")

and I got 33297 genes (transcript ids).

Using the following query on pd.hugene.1.0.st.v1:

    dbListTables(conn)
    dbListFields(conn, "type_dict")
    info2 <- "SELECT * from type_dict"
    result <- dbGetQuery(conn, info2)

I got:

    #  type type_id
    #1    1 main
    #2    2 control->affx
    #3    3 control->chip
    #4    4 control->bgp->antigenomic
    #5    5 control->bgp->genomic
    #6    6 normgene->exon
    #7    7 normgene->intron
    #8    8 rescue->FLmRNA->unmapped

I also executed the following query:

    conn <- db(pd.hugene.1.0.st.v1)
    dbListTables(conn)
    info = dbGetQuery(conn, paste("SELECT DISTINCT meta_fsetid as transcript_id, type_id",
                                  "FROM featureSet, core_mps, type_dict",
                                  "WHERE featureSet.fsetid=core_mps.fsetid",
                                  "AND featureSet.type=type_dict.type"))

I have a complete processed example (summarized to the gene level, with the accession number, symbol, etc. for each transcript id). I wanted to reproduce the example myself from the raw data. Matching the transcript_id field returned by the above query against the transcript_id in the example data set gives the following:

* control->affx corresponds to the other-spike and AFFX probe sets (57 probe sets)
* normgene->exon corresponds to 1195 pos_control probe sets
* normgene->intron corresponds to 2904 neg_control probe sets

So I suppose that there are about 4156 control transcripts.

Since I summarized to the gene level, I used the annotation package hugene10sttranscriptcluster.db. I tried to get the accession number and the symbol for some transcript ids, to check whether the results match the example I have. For example,

    hugene10sttranscriptclusterACCNUM[["7912580"]]

returns "NM_001136561", but in the example the accession number is XM_001714578. However, I get the same symbol, LOC440563, from:

    hugene10sttranscriptclusterSYMBOL[["7912580"]]

Why are there different accession numbers for the same transcript_id?

Thanks again,
Javier

> sessionInfo()
R version 2.10.0 (2009-10-26)
i386-pc-mingw32

locale:
[1] LC_COLLATE=Spanish_Spain.1252 LC_CTYPE=Spanish_Spain.1252
[3] LC_MONETARY=Spanish_Spain.1252 LC_NUMERIC=C
[5] LC_TIME=Spanish_Spain.1252

attached base packages:
[1] tools tcltk stats graphics grDevices utils datasets
[8] methods base

other attached packages:
[1] pd.hugene.1.0.st.v1_3.0.0 oligoClasses_1.8.0
[3] hugene10stprobeset.db_4.0.1 hugene10sttranscriptcluster.db_4.0.1
[5] org.Hs.eg.db_2.3.6 oneChannelGUI_1.12.0
[7] preprocessCore_1.8.0 GOstats_2.12.0
[9] RSQLite_0.7-3 DBI_0.2-4
[11] graph_1.24.0 Category_2.12.0
[13] AnnotationDbi_1.8.0 tkWidgets_1.24.0
[15] DynDoc_1.24.0 widgetTools_1.24.0
[17] affylmGUI_1.20.0 affyio_1.14.0
[19] affy_1.24.0 limma_3.2.1
[21] Biobase_2.6.0

loaded via a namespace (and not attached):
[1] annotate_1.24.0 Biostrings_2.14.0 genefilter_1.28.0 GO.db_2.3.5
[5] GSEABase_1.8.0 IRanges_1.4.0 RBGL_1.20.0 splines_2.10.0
[9] survival_2.35-7 XML_2.6-0 xtable_1.5-5
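The control tally in the message above can be checked with a few lines of base R. This is only a sketch: the three counts are copied from the post rather than recomputed from the array, and the small `info` data frame is a hypothetical stand-in for the result of the type query, used to show how `table()` would produce such per-type counts.

```r
# Control counts as reported in the thread (copied, not recomputed)
reported <- c("control->affx"    = 57,
              "normgene->exon"   = 1195,
              "normgene->intron" = 2904)
total_controls <- sum(reported)
print(total_controls)

# Hypothetical stand-in for the `info` data frame returned by the query
info <- data.frame(
  transcript_id = c(1L, 2L, 3L, 4L, 5L),
  type_id = c("main", "main", "control->affx",
              "normgene->exon", "normgene->intron"),
  stringsAsFactors = FALSE
)

# Number of transcript ids per type label
counts <- table(info$type_id)
print(counts)

# Logical flag for everything that is not a "main" transcript
is_control <- info$type_id != "main"
print(sum(is_control))
```

On the real query result, `table(info$type_id)` would give the per-category counts directly, so the matching against a processed example data set would not be needed just to count controls.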
XM_001714578 was replaced by NM_001136561.

http://www.ncbi.nlm.nih.gov/nuccore/NM_001136561?log$=seqview_status

b
Thanks Prof. Carvalho, Is the rest of my e-mail correct? Thanks, Javier Benilton Carvalho escribi?: > XM_001714578 was replaced by NM_001136561. > > http://www.ncbi.nlm.nih.gov/nuccore/NM_001136561?log$=seqview_status > > b > > On Nov 9, 2009, at 9:19 AM, Javier P?rez Florido wrote: > >> >> OK, >> So, to sum up (and check if I understand the Human Gene ST 1.0 >> array), when summarizing to the gene level means that there are >> several probesets that compose a gene. To summarize to the gene level >> when normalizing, I executed: >> OligoEset<-rma(OligoRaw,target="core") >> and I got 33297 genes (transcript ids). >> >> Using the following query on pd.hugene.1.0.st.v1: >> dbListTables(conn) >> dbListFields(conn,"type_dict") >> info2<-"SELECT * from type_dict" >> result<-dbGetQuery(conn,info2) >> >> I got: >> # type type_id >> #1 1 main >> #2 2 control->affx >> #3 3 control->chip >> #4 4 control->bgp->antigenomic >> #5 5 control->bgp->genomic >> #6 6 normgene->exon >> #7 7 normgene->intron >> #8 8 rescue->FLmRNA->unmapped >> >> I also executed the following query: >> conn<-db(pd.hugene.1.0.st.v1) >> dbListTables(conn) >> info = dbGetQuery(conn, paste("SELECT DISTINCT meta_fsetid as >> transcript_id, type_id", >> "FROM featureSet, core_mps, type_dict", >> "WHERE featureSet.fsetid=core_mps.fsetid", >> "AND featureSet.type=type_dict.type")) >> >> I have a complete processed example (it is summarized to the gene >> level, and it has the ACC number, Symbol information, etc for each >> transcript id). I wanted to reproduce the example by myself using the >> raw data. When matching the transcript_id field given by the above >> query and the transcript_id given by the example data set, the >> following information can be extracted: >> ? control->affx are related to other-spike y AFFX probe sets (57 >> probe sets) >> ? normgene->exon are related to 1195 pos_control probe sets >> ? 
normgene->intron are related to 2904 neg_control probe sets >> So, I suppose that there are about 4156 control transcripts. >> >> Since I summarized to the gene level, I have used the annotation file >> "hugene10sttranscriptcluster.db". I've tried to get the ACC number >> and the Symbol for some transcript_id. The idea was to check if the >> results given were the same as the example I have. For example: >> >> hugene10sttranscriptclusterACCNUM[["7912580"]] >> I get "NM_001136561", but in the example, the accession number is >> XM_001714578. However, I get the same result for Symbol: >> hugene10sttranscriptclusterSYMBOL[["7912580"]] : LOC440563 >> >> Why are there different Accession Number for the same transcript_id? >> >> Thanks again, >> Javier >> >> > sessionInfo() >> R version 2.10.0 (2009-10-26) >> i386-pc-mingw32 >> >> locale: >> [1] LC_COLLATE=Spanish_Spain.1252 LC_CTYPE=Spanish_Spain.1252 >> [3] LC_MONETARY=Spanish_Spain.1252 LC_NUMERIC=C >> [5] LC_TIME=Spanish_Spain.1252 >> >> attached base packages: >> [1] tools tcltk stats graphics grDevices utils datasets >> [8] methods base >> >> other attached packages: >> [1] pd.hugene.1.0.st.v1_3.0.0 oligoClasses_1.8.0 >> [3] hugene10stprobeset.db_4.0.1 >> hugene10sttranscriptcluster.db_4.0.1 >> [5] org.Hs.eg.db_2.3.6 oneChannelGUI_1.12.0 >> [7] preprocessCore_1.8.0 GOstats_2.12.0 >> [9] RSQLite_0.7-3 DBI_0.2-4 >> [11] graph_1.24.0 Category_2.12.0 >> [13] AnnotationDbi_1.8.0 tkWidgets_1.24.0 >> [15] DynDoc_1.24.0 widgetTools_1.24.0 >> [17] affylmGUI_1.20.0 affyio_1.14.0 >> [19] affy_1.24.0 limma_3.2.1 >> [21] Biobase_2.6.0 >> >> loaded via a namespace (and not attached): >> [1] annotate_1.24.0 Biostrings_2.14.0 genefilter_1.28.0 GO.db_2.3.5 >> [5] GSEABase_1.8.0 IRanges_1.4.0 RBGL_1.20.0 >> splines_2.10.0 >> [9] survival_2.35-7 XML_2.6-0 xtable_1.5-5 >> > >> >> >> Benilton Carvalho escribi?: >>> >>> Hi Javier, >>> >>> This is what you want to do: >>> >>> info = dbGetQuery(conn, paste("SELECT DISTINCT meta_fsetid as 
>>> transcript_id, type_id",
>>> "FROM featureSet, core_mps, type_dict",
>>> "WHERE featureSet.fsetid=core_mps.fsetid",
>>> "AND featureSet.type=type_dict.type"))
>>>
>>> I'll make sure that, in the next releases, the users are not
>>> expected to figure out queries like this.
>>>
>>> Using a simplistic description: The probeset db is at the exon
>>> level; Transcript db is at the gene level.
>>>
>>> b
>>>
>>> On Nov 5, 2009, at 8:59 AM, Javier Pérez Florido wrote:
>>>
>>>> Thanks to everybody,
>>>> I'm new to working with HuGene ST 1.0 and have some questions:
>>>>
>>>> * I have normalized some CEL files using the oligo package and the
>>>> annotation file used, by default, is the pd.hugene.1.0.st.v1. How
>>>> can I access this annotation file to check the type of control
>>>> probe sets used? I've tried:
>>>>
>>>> conn<-db(pd.hugene.1.0.st.v1)
>>>> dbListTables(conn)
>>>> [1] "bgfeature" "chrom_dict" "core_mps" "featureSet" "level_dict"
>>>> [6] "pmfeature" "table_info" "type_dict"
>>>> dbListFields(conn,"featureSet")
>>>> [1] "fsetid" "strand" "start" "stop"
>>>> [5] "transcript_cluster_id" "exon_id" "crosshyb_type" "level"
>>>> [9] "chrom" "type"
>>>> sql="SELECT fsetid,type FROM featureSet"
>>>> dbGetQuery(conn,sql)
>>>>
>>>> But I get integer numbers (1,2,3...) for the type field instead
>>>> of the "AFFX*", "other-spike", etc. control probe set labels used
>>>> in the annotation file. How can I get this information?
>>>>
>>>> * What is the difference between hugene10stprobeset.db and
>>>> hugene10sttranscriptcluster.db? What is the difference between
>>>> summarizing at the probe set level and at the gene level?
>>>>
>>>> Thanks again,
>>>> Javier
>>>> P.S. If you know of any document that could help me with these
>>>> arrays, it would be great.
>>>>
>>>> R version 2.10.0 (2009-10-26)
>>>> i386-pc-mingw32
>>>>
>>>> locale:
>>>> [1] LC_COLLATE=Spanish_Spain.1252 LC_CTYPE=Spanish_Spain.1252 LC_MONETARY=Spanish_Spain.1252
>>>> [4] LC_NUMERIC=C LC_TIME=Spanish_Spain.1252
>>>>
>>>> attached base packages:
>>>> [1] stats graphics grDevices utils datasets methods base
>>>>
>>>> other attached packages:
>>>> [1] annotate_1.24.0 AnnotationDbi_1.8.0 pd.hugene.1.0.st.v1_3.0.0 RSQLite_0.7-3
>>>> [5] DBI_0.2-4 oligo_1.10.0 preprocessCore_1.8.0 oligoClasses_1.8.0
>>>> [9] Biobase_2.6.0
>>>>
>>>> loaded via a namespace (and not attached):
>>>> [1] affxparser_1.18.0 affyio_1.14.0 Biostrings_2.14.0 IRanges_1.4.0 splines_2.10.0 tools_2.10.0
>>>> [7] xtable_1.5-5
>>>>
>>>> cstrato wrote:
>>>>> Dear Javier,
>>>>>
>>>>> When you open the Affymetrix annotation files for the HuGene ST 1.0
>>>>> array you will see that it does contain 13 AFFX controls and a
>>>>> number of "other_spike" controls for both the transcript and the
>>>>> probeset annotation files. The MoGene array contains 22
>>>>> "control->affx" probesets including 13 AFFX controls (bac_spike,
>>>>> polya_spike).
>>>>>
>>>>> Best regards
>>>>> Christian
>>>>> _._._._._._._._._._._._._._._._._._
>>>>> C.h.r.i.s.t.i.a.n S.t.r.a.t.o.w.a
>>>>> V.i.e.n.n.a A.u.s.t.r.i.a
>>>>> e.m.a.i.l: cstrato at aon.at
>>>>> _._._._._._._._._._._._._._._._._._
>>>>>
>>>>> Javier Pérez Florido wrote:
>>>>>> Dear list,
>>>>>> I would like to know if the GeneChip Human Gene ST 1.0 array has
>>>>>> some gene controls (like AFFX genes in other Affymetrix technologies).
>>>>>> Thanks in advance,
>>>>>> Javier
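As a rough illustration of what the join in the quoted query does, the sketch below rebuilds the three tables in a tiny in-memory SQLite database and runs the same SELECT. The table and column names come from the thread; the rows (fsetid 101 to 104, meta_fsetid 9001 to 9003) are invented for the example and are not taken from pd.hugene.1.0.st.v1. It assumes the DBI and RSQLite packages are installed.

```r
library(DBI)

# Toy stand-ins for three tables of the annotation database.
# All row values are made up for illustration only.
con <- dbConnect(RSQLite::SQLite(), ":memory:")

dbWriteTable(con, "featureSet", data.frame(
  fsetid = c(101L, 102L, 103L, 104L),
  type   = c(1L, 1L, 2L, 7L)
))
dbWriteTable(con, "core_mps", data.frame(
  meta_fsetid = c(9001L, 9001L, 9002L, 9003L),
  fsetid      = c(101L, 102L, 103L, 104L)
))
dbWriteTable(con, "type_dict", data.frame(
  type    = 1:8,
  type_id = c("main", "control->affx", "control->chip",
              "control->bgp->antigenomic", "control->bgp->genomic",
              "normgene->exon", "normgene->intron",
              "rescue->FLmRNA->unmapped"),
  stringsAsFactors = FALSE
))

# Same join as in the thread; ORDER BY is added so the row order is stable.
sql <- paste(
  "SELECT DISTINCT meta_fsetid AS transcript_id, type_id",
  "FROM featureSet, core_mps, type_dict",
  "WHERE featureSet.fsetid = core_mps.fsetid",
  "AND featureSet.type = type_dict.type",
  "ORDER BY transcript_id"
)
info <- dbGetQuery(con, sql)
dbDisconnect(con)

print(info)
```

The two probesets that share meta_fsetid 9001 collapse into one row because of DISTINCT, which is how the integer codes in featureSet end up as one readable label per transcript cluster.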
Yes. Everything but the "Prof" part is correct. ;)

b

On Nov 9, 2009, at 1:17 PM, Javier Pérez Florido wrote:

> Thanks Prof. Carvalho,
> Is the rest of my e-mail correct?
> Thanks,
> Javier
>
> Benilton Carvalho wrote:
>> XM_001714578 was replaced by NM_001136561.
>>
>> http://www.ncbi.nlm.nih.gov/nuccore/NM_001136561?log$=seqview_status
>>
>> b
>>
>> On Nov 9, 2009, at 9:19 AM, Javier Pérez Florido wrote:
>>
>>> OK,
>>> So, to sum up (and to check that I understand the Human Gene ST 1.0
>>> array): summarizing to the gene level means that several probesets
>>> are combined into one gene. To summarize to the gene level when
>>> normalizing, I executed:
>>> OligoEset<-rma(OligoRaw,target="core")
>>> and I got 33297 genes (transcript ids).
>>>
>>> Using the following query on pd.hugene.1.0.st.v1:
>>> dbListTables(conn)
>>> dbListFields(conn,"type_dict")
>>> info2<-"SELECT * from type_dict"
>>> result<-dbGetQuery(conn,info2)
>>>
>>> I got:
>>> #   type type_id
>>> # 1    1 main
>>> # 2    2 control->affx
>>> # 3    3 control->chip
>>> # 4    4 control->bgp->antigenomic
>>> # 5    5 control->bgp->genomic
>>> # 6    6 normgene->exon
>>> # 7    7 normgene->intron
>>> # 8    8 rescue->FLmRNA->unmapped
>>>
>>> I also executed the following query:
>>> conn<-db(pd.hugene.1.0.st.v1)
>>> dbListTables(conn)
>>> info = dbGetQuery(conn, paste("SELECT DISTINCT meta_fsetid as transcript_id, type_id",
>>> "FROM featureSet, core_mps, type_dict",
>>> "WHERE featureSet.fsetid=core_mps.fsetid",
>>> "AND featureSet.type=type_dict.type"))
>>>
>>> I have a complete processed example (it is summarized to the gene
>>> level, and it has the ACC number, Symbol information, etc. for each
>>> transcript id). I wanted to reproduce the example by myself using
>>> the raw data. When matching the transcript_id field given by the
>>> above query against the transcript_id given by the example data set,
>>> the following information can be extracted:
>>> * control->affx corresponds to the other-spike and AFFX probe sets
>>>   (57 probe sets)
>>> * normgene->exon corresponds to 1195 pos_control probe sets
>>> * normgene->intron corresponds to 2904 neg_control probe sets
>>> So, I suppose that there are about 4156 control transcripts.
>>>
>>> Since I summarized to the gene level, I have used the annotation
>>> file "hugene10sttranscriptcluster.db". I've tried to get the ACC
>>> number and the Symbol for some transcript_id. The idea was to check
>>> whether the results were the same as in the example I have. For
>>> example:
>>>
>>> hugene10sttranscriptclusterACCNUM[["7912580"]]
>>> I get "NM_001136561", but in the example, the accession number is
>>> XM_001714578. However, I get the same result for Symbol:
>>> hugene10sttranscriptclusterSYMBOL[["7912580"]] : LOC440563
>>>
>>> Why are there different accession numbers for the same transcript_id?
>>>
>>> Thanks again,
>>> Javier
>>>
>>> > sessionInfo()
>>> R version 2.10.0 (2009-10-26)
>>> i386-pc-mingw32
>>>
>>> locale:
>>> [1] LC_COLLATE=Spanish_Spain.1252 LC_CTYPE=Spanish_Spain.1252
>>> [3] LC_MONETARY=Spanish_Spain.1252 LC_NUMERIC=C
>>> [5] LC_TIME=Spanish_Spain.1252
>>>
>>> attached base packages:
>>> [1] tools tcltk stats graphics grDevices utils datasets
>>> [8] methods base
>>>
>>> other attached packages:
>>> [1] pd.hugene.1.0.st.v1_3.0.0 oligoClasses_1.8.0
>>> [3] hugene10stprobeset.db_4.0.1 hugene10sttranscriptcluster.db_4.0.1
>>> [5] org.Hs.eg.db_2.3.6 oneChannelGUI_1.12.0
>>> [7] preprocessCore_1.8.0 GOstats_2.12.0
>>> [9] RSQLite_0.7-3 DBI_0.2-4
>>> [11] graph_1.24.0 Category_2.12.0
>>> [13] AnnotationDbi_1.8.0 tkWidgets_1.24.0
>>> [15] DynDoc_1.24.0 widgetTools_1.24.0
>>> [17] affylmGUI_1.20.0 affyio_1.14.0
>>> [19] affy_1.24.0 limma_3.2.1
>>> [21] Biobase_2.6.0
>>>
>>> loaded via a namespace (and not attached):
>>> [1] annotate_1.24.0 Biostrings_2.14.0 genefilter_1.28.0 GO.db_2.3.5
>>> [5] GSEABase_1.8.0 IRanges_1.4.0 RBGL_1.20.0 splines_2.10.0
>>> [9] survival_2.35-7 XML_2.6-0 xtable_1.5-5
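The tally of control transcripts in Javier's message can be sketched as follows. The `info` data frame here is a made-up stand-in for the result of the query in the thread, and treating exactly these three categories as controls follows Javier's grouping rather than any Affymetrix definition. Only base R is used.

```r
# Made-up stand-in for the query result (transcript_id plus type label).
info <- data.frame(
  transcript_id = c(9001L, 9002L, 9003L, 9004L, 9005L),
  type_id = c("main", "control->affx", "normgene->exon",
              "normgene->intron", "normgene->intron"),
  stringsAsFactors = FALSE
)

# Categories counted as controls in the thread (an assumption of this sketch).
control_types <- c("control->affx", "normgene->exon", "normgene->intron")

# Number of transcript ids per category, and number of controls overall.
per_type  <- table(info$type_id)
n_control <- sum(info$type_id %in% control_types)

# Arithmetic check of the per-category counts reported in the thread.
thread_counts <- c("control->affx" = 57, "normgene->exon" = 1195,
                   "normgene->intron" = 2904)
thread_total <- sum(thread_counts)

print(per_type)
print(n_control)
print(thread_total)
```

Applied to the three counts quoted in the message, the sum is 4156, which matches the figure Javier gives for the control transcripts.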