Detection Above Background (DABG) on Gene Level for Exon and Gene Arrays
@pascal-gellert-4249
Last seen 10.2 years ago
Hi all,

The detection above background (DABG) algorithm calculates a p-value for each probe set, indicating whether the probe set is expressed or lies within the background noise. This is similar to the MAS5 detection calls, but Exon 1.0 ST and Gene 1.0 ST arrays don't have mismatch probes, so MAS5 cannot be used.

According to Affymetrix, DABG is not valid at the gene level:

"There is a strong assumption in DABG that all the probes are measuring the same thing (i.e., the same transcript). This is not the case at the gene level due to alternative splicing. For example, probes for a cassette exon that is skipped will contribute to a misleadingly insignificant p-value."

To decide whether a gene is expressed, often all probe sets of the gene are used: if fewer than, e.g., 50% of the exons of the gene are above a DABG threshold, the gene is considered not expressed.

Nevertheless, the xps package supports DABG at the gene level. Does anyone have experience with DABG at the gene level?

Thanks,
Pascal Gellert
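The exon-fraction rule described in the post can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the p-value cutoff of 0.05, and the input layout (one DABG p-value per exon-level probe set of a gene) are hypothetical, and only the 50% fraction comes from the post.

```python
from typing import Sequence


def gene_is_expressed(probeset_pvalues: Sequence[float],
                      p_threshold: float = 0.05,   # assumed cutoff, not from the post
                      min_fraction: float = 0.5) -> bool:
    """Call a gene expressed if enough of its probe sets pass DABG.

    probeset_pvalues holds one DABG p-value per exon-level probe set.
    A probe set counts as detected when its p-value is at or below
    p_threshold; the gene is called expressed when the detected
    fraction is at least min_fraction.
    """
    if not probeset_pvalues:
        return False  # no probe sets, no evidence of expression
    detected = sum(p <= p_threshold for p in probeset_pvalues)
    return detected / len(probeset_pvalues) >= min_fraction


# Hypothetical gene with four exon-level probe sets, two of them detected.
print(gene_is_expressed([0.001, 0.2, 0.01, 0.9]))
```

The appeal of this rule is that each exon is judged on its own, so a skipped cassette exon lowers the detected fraction by one probe set instead of distorting a single combined p-value.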
cstrato ★ 3.9k
@cstrato-908
Last seen 6.1 years ago
Austria
Dear Pascal,

Due to its design, xps supports not only DABG calls at both the probeset and the transcript level, but also MAS5 detection calls for both Exon 1.0 ST and Gene 1.0 ST arrays.

To answer your question whether DABG is valid at the transcript level, I think you need to distinguish between HuExon and HuGene arrays:

For HuExon arrays the statement of Affymetrix is probably true for genes where alternative splicing occurs. However, HuGene arrays were originally designed by Affymetrix as arrays measuring the transcript level, and thus I assume they have tried to select mainly probes which are not affected by alternative splicing, so the statement may not apply.

I would also be interested to hear about user experiences, not only with DABG at the transcript level but also with MAS5 calls on Exon ST and Gene ST arrays. In my own experience, the p-values from DABG and MAS5 calls are almost identical for very low p-values but tend to differ for larger p-values.

Best regards
Christian

Christian Stratowa
Vienna, Austria
email: cstrato at aon.at
Dear Christian & Mark,

Thanks for your answers.

> HuGene arrays were originally designed by Affymetrix as arrays
> measuring the transcript level and thus I assume they have tried to
> select mainly probes which are not affected by alternative splicing,
> so the statement may not apply.

I am not sure, but at this point I think I have to disagree, because HuGene and HuExon have many (~65%) identical probes. This is from the Affymetrix Technical Note for HuGene:

"The Human Gene 1.0 ST Array design, wherever possible, uses a subset of the same probes on the Human Exon 1.0 ST Array to interrogate the more focused, better-annotated content at the gene level."

So this sounds like HuGene is designed from the core HuExon content (but with only two probes per probe set). It got additional probe sets for genes with few exons, but I am not sure whether they avoided regions which are likely to be spliced.

> I guess to be really thorough, one could use the Affymetrix sample
> data that's been run on 133+2, Gene ST and Exon ST platforms to see
> how DABG compares to the old mas5calls on the same RNA

I think I really have to do this. It would also be interesting to see whether MAS5 for HuGene and HuExon with the xps package performs better than DABG.

Best regards,
Pascal
Hello Pascal,

I asked Affy support (who are very responsive) the same question 6 months ago, and here is the answer I got (the last part is the most relevant):

> Hello Dr. Imbeault,
>
> Thank you for contacting Affymetrix Technical Support. I have attached a paper that you will find useful.
>
> As far as DABG, DABG is an acronym for Detected Above Background. It is an algorithm that was intended to serve as a confidence score in lieu of a Detection call ("Present" or "Absent"). The first official application of DABG was introduced in the ExACT software for WT Exon Array data analysis. DABG is also represented in Expression Console. However, based on an internal assessment of performance by our Bioinformatics team (Alan Williams primarily), DABG is not considered to be a very informative or robust metric. Customers should utilize signal estimates and secondary/tertiary analysis methods to determine robust confidence scores and ultimately gene expression results.
>
> DABG Calculations
>
> The individual probe p-values are computed based on the rank order against the background probe set intensities (probes in the .BGP file). The probe level p-values are combined into a probe set level p-value using the Fisher equation. There are DABG options available in APT to use a percentile (i.e. median) rather than the Fisher equation.
>
> DABG Limitations in gene-level analyses?
>
> The DABG algorithm evaluates detection by combining probe-level p-values that are assumed to be monitoring the same region of a transcript. This assumption is met for an exon-level probe. However, when combining all probes across a transcript in a gene-level analysis, this assumption is not guaranteed. Some of the probes may hit parts of the gene which are expressed while others may not, and yet the gene is still expressed. Since this type of scenario could generate misleading detection calls, DABG is not considered a robust gene-level metric. (from the WT QC WRC).
>
> Please let me know if you have any other questions.
>
> Regards,
> Shireen
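The calculation described in the Affymetrix reply (rank-based probe p-values, combined with Fisher's method) can be sketched as follows. This is a simplified illustration, not the APT implementation: it uses a single pool of background intensities (as far as I know, APT matches background probes to each probe by GC content), it adds a +1 pseudo-count so that no p-value is exactly zero, and all names and numbers are hypothetical.

```python
import math
from bisect import bisect_left
from typing import Sequence


def probe_pvalue(intensity: float, sorted_background: Sequence[float]) -> float:
    """Rank-based p-value: share of background probes at least as bright.

    sorted_background must be sorted in ascending order. The +1 terms
    are an assumed pseudo-count that keeps the p-value above zero.
    """
    n = len(sorted_background)
    n_at_least = n - bisect_left(sorted_background, intensity)
    return (n_at_least + 1) / (n + 1)


def fisher_combine(pvalues: Sequence[float]) -> float:
    """Combine k p-values with Fisher's method.

    X = -2 * sum(ln p) follows a chi-square distribution with 2k degrees
    of freedom under the null. For even degrees of freedom the upper tail
    has the closed form exp(-X/2) * sum_{i<k} (X/2)^i / i!.
    """
    k = len(pvalues)
    half_x = -sum(math.log(p) for p in pvalues)  # this is X/2
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half_x / i
        total += term
    return math.exp(-half_x) * total


# Hypothetical data: 100 background probes with intensities 1..100.
background = list(range(1, 101))
expressed_exon = [probe_pvalue(x, background) for x in (500, 480, 510, 495)]
skipped_exon = [probe_pvalue(x, background) for x in (50, 45, 55, 52)]

p_exon_level = fisher_combine(expressed_exon)
p_gene_level = fisher_combine(expressed_exon + skipped_exon)
print(p_exon_level, p_gene_level)
```

In this toy case the gene-level p-value is larger (less significant) than the p-value of the expressed exon alone, because the probes of the skipped exon sit at background level. That is the effect the Affymetrix statement about cassette exons warns about.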
Mark Cowley ▴ 910
@mark-cowley-2951
Last seen 10.2 years ago
Hi Pascal,

I often use DABG at the gene level. I use a p-value threshold of 0.00000, which is the minimum value that APT reports. This gives similar proportions of expressed genes per sample as mas5calls did on similar tissues, i.e. 35-60%, depending on the transcriptional complexity of the samples.

I had a post on the Affymetrix forums with their developers about this, but I can't for the life of me find the Affymetrix developer forums at this late hour.

I guess to be really thorough, one could use the Affymetrix sample data that's been run on the 133+2, Gene ST and Exon ST platforms to see how DABG compares to the old mas5calls on the same RNA.

I hope that helps,
Mark

-----------------------------------------------------
Mark Cowley, PhD
Peter Wills Bioinformatics Centre
Garvan Institute of Medical Research, Sydney, Australia
-----------------------------------------------------
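The per-sample check Mark describes (what proportion of genes pass the strictest DABG threshold) could be computed along these lines. This is a rough sketch: the function name and the example p-values are hypothetical, and it assumes the p-values are compared after rounding to five decimals, as they would appear in a report that prints 0.00000.

```python
from typing import Sequence


def fraction_detected(gene_pvalues: Sequence[float],
                      threshold: float = 0.0,
                      decimals: int = 5) -> float:
    """Fraction of genes whose rounded DABG p-value is at or below threshold.

    Rounding to `decimals` places mimics a report that prints p-values
    with five decimals, so very small values count as 0.00000.
    """
    if not gene_pvalues:
        raise ValueError("need at least one p-value")
    detected = sum(round(p, decimals) <= threshold for p in gene_pvalues)
    return detected / len(gene_pvalues)


# Hypothetical sample with four genes; two of them round to 0.00000.
print(fraction_detected([0.0, 0.000004, 0.01, 0.5]))
```

Running this per sample and comparing the result with the 35-60% range mentioned above would be one way to sanity-check a chosen threshold.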