Question

Re: replicates and low expression levels

Eric Blalock ▴ 250

Hi,

To add to what Rafael Irizarry said: when we had multiple subjects/chips per treatment group in our recent publication (Blalock et al., 2003, J Neurosci.), we used P/A filtering to determine which probe sets to include in the 'final' statistical analysis. We did this on an entire-record basis. That is, a probe set was removed from further consideration if it had 'too many' absence calls (the cutoff was set arbitrarily at 40% presence calls in at least one treatment group). Because whole probe sets were removed and the values of the remaining ones were not altered, the F-statistics for each gene that remained were unchanged.

However, this filtering has a huge effect on the multiple-testing error when using the 'MAS' algorithms, because part of what is removed is the unexpressed probe set contingent: that fairly large group of probe sets (in our case nearly 50%) that are not detectable/not expressed in the tissue of interest. I'd guess that this will be an issue with any 'general purpose' array designed to measure genome-wide expression.

Affy is as much as telling you that they are not confident in the average difference score (ADS) and signal intensity (SI) numbers their algorithms produce if the probe sets are rated absent. My current understanding is that the MAS metrics are not 'stand alone'. Although Affy intends ADS and SI to be their quantitative measures of mRNA level, these measures go hand in glove with their respective absence calls.

As for what the absence calls mean, there appears to be a shell game (three-card monte) going on with the 'why' of absence calls. You are correct that many probe sets are called absent because they have insufficient signal, but many probe sets are also called 'absent' because, although there is sufficient signal intensity, there is too much disagreement among probe pairs. Thus there are two reasons probe sets get called absent: 1) the signal is too dim, and 2) the probe set is not working the way the algorithm expects. Oh, and add an interaction of those two as well.

So if you are using another algorithm like RMA to look at your data, the presence/absence calls could be dangerous, because they take out probe sets that didn't work well for MAS even though those probe sets may have done just fine with RMA.

Hope that helps,
-E

>Message: 4
>Date: Fri, 30 May 2003 17:28:45 +0100
>From: "Crispin Miller" <cmiller@picr.man.ac.uk>
>Subject: [BioC] replicates and low expression levels
>To: <bioconductor@stat.math.ethz.ch>
>Message-ID:
> <baa35444b19ad940997ed02a6996aae00b1448@sanmail.picr.man.ac.uk>
>Content-Type: text/plain; charset="iso-8859-1"
>
>Hi,
>Just a quick question about low expression levels on Affy systems - I hope
>it's not too off-topic; it is about normalisation and data analysis...
>I've heard a lot of people advocating that it's a good idea to perform an
>initial filtering on either Present Marginal or Absent calls, or on
>gene-expression levels (so that only genes with an expression > 40, say,
>after scaling to a TGT of 100 using the MAS5.0 algorithm, are part of the
>further analysis). Firstly, am I right in thinking that this is to
>eliminate data that are too close to the background noise level of the system.
>
>I wanted to canvas opinion as to whether people feel we need to do this if
>we have replicates and are using statistical tests - rather than just
>fold-changes - to identify 'interesting' genes. Does the statistical
>testing do this job for us?
>
>Crispin
>
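For anyone who wants to try the per-group presence-call filter described in the reply, here is a minimal sketch in Python. It is not the code used in the publication. The function names, the group labels, the toy call table, the choice to count only "P" (not marginal "M") calls as present, and the reading of the 40% cutoff as "at least 40%" are all assumptions made for illustration. The last lines use a simple Bonferroni threshold to show why dropping whole probe sets changes the multiple-testing burden while leaving each remaining test untouched.

```python
from collections import defaultdict


def keep_probe_set(calls, groups, min_present_fraction=0.4):
    """Return True if any treatment group reaches the presence-call cutoff.

    calls  -- detection call per chip, e.g. ["P", "A", "M", ...]
    groups -- treatment-group label per chip, in the same order as calls
    """
    present = defaultdict(int)
    total = defaultdict(int)
    for call, group in zip(calls, groups):
        total[group] += 1
        # Assumption: only "P" counts as present; marginal ("M") does not.
        if call == "P":
            present[group] += 1
    return any(present[g] / total[g] >= min_present_fraction for g in total)


def filter_probe_sets(call_table, groups, min_present_fraction=0.4):
    """Keep or drop whole probe sets; retained rows are not modified."""
    return [
        probe_set
        for probe_set, calls in call_table.items()
        if keep_probe_set(calls, groups, min_present_fraction)
    ]


# Hypothetical design: three chips per treatment group.
groups = ["control"] * 3 + ["treated"] * 3

# Hypothetical detection calls, one row per probe set.
call_table = {
    "probe_set_1": ["P", "P", "A", "A", "A", "A"],  # 2 of 3 present in control
    "probe_set_2": ["A", "A", "A", "A", "M", "A"],  # never called present
    "probe_set_3": ["A", "A", "P", "A", "A", "P"],  # 1 of 3 present in each group
}

kept = filter_probe_sets(call_table, groups)

# Fewer tests means a less severe per-test Bonferroni threshold.
alpha = 0.05
threshold_before = alpha / len(call_table)
threshold_after = alpha / len(kept)

print("kept:", kept)
print("per-test threshold before/after filtering:", threshold_before, threshold_after)
```

Note that the filter looks only at the detection calls, never at the expression values, so the test statistic computed for a retained probe set is the same with or without filtering. As the reply cautions, calls produced by MAS may not be a sound filter for values produced by a different algorithm such as RMA.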

probe affy • 885 views

21.9 years ago Eric Blalock ▴ 250