I knew when I saw your reply in my mailbox that I was in trouble :/
Thanks for the clarification - I hadn't looked carefully at the PPV
references, but I believe I may have done the authors a disservice by
implying that they had developed PPV rather than adopting it (this was
my mistaken impression and not their claim). I'd leave it to the
authors to address the shortfalls you point out, and I agree that the
paper may inappropriately or unfairly compare Logit-t to RMA and dChip.
My enthusiasm regarding the paper is that it is the first one (to my
knowledge) to interrogate probe-set differences at the probe level
across groups (rather than pairwise), especially since the authors
reported what I suspected: that the probe-level data do a better job
than the expression values generated by a probe-level algorithm (I
could still be totally wrong, but it IS what I have been suspecting).
This may be more important for discerning meaningful differences than
the transforms themselves. If it is possible to interrogate the data
at this level, why bother with probe-level algorithms, which may lose
information in the process of cooking 11-32 intensity values into a
single number? Personally, I'd be willing to tolerate the extra
statistical complexity to get results that more accurately reflect the
biological processes under investigation.
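
As a toy illustration of what I mean (made-up numbers in R, and
emphatically not the Logit-t procedure itself), compare testing one
summarized value per chip against testing the per-probe differences
directly:

    ## one hypothetical probe set: 16 probes, 3 chips per group,
    ## probe-specific affinities plus a modest true group shift
    set.seed(1)
    affinity <- rnorm(16, mean = 8, sd = 1)
    grp1 <- sapply(1:3, function(i) affinity + rnorm(16, sd = 0.3))
    grp2 <- sapply(1:3, function(i) affinity + 0.5 + rnorm(16, sd = 0.3))

    ## (a) cook the probes down to one number per chip, then 3-vs-3 t-test
    t.test(colMeans(grp2), colMeans(grp1))$p.value

    ## (b) keep the probes: per-probe group differences, tested across probes
    t.test(rowMeans(grp2) - rowMeans(grp1))$p.value

Version (b) obviously overstates the degrees of freedom, since probes
within a set share the same chips (which is exactly why I keep coming
back to a repeated-measures design), but it conveys why discarding the
probe-level values feels wasteful.
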
I realize that the data set you had to work with was relatively small
(n = 3 per group), but would you still advocate average log fold change
as a discriminator if you had, say, 10 chips in each group? While it
would probably still work well for spike-in data, how would it do on
real, live biological samples? We are working very hard to dispel the
notion that one can find accurate microarray results with an n of 3 per
group, particularly in animal studies. This has never been acceptable
in univariate work (at least in our field of research), and there is
nothing magical about microarray technology that makes GeneChips more
capable of assessing biological variance when only a few biological
replicates are present. In our grant writing, comments to other
researchers in the neurosciences, and advice from our microarray core,
we are strongly advocating sufficient replication and statistical
determination of significant differences.
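
A quick back-of-the-envelope power calculation in R (a generic
two-sample t-test with an illustrative effect size of one within-group
SD, nothing microarray-specific) makes the point:

    ## power of a two-sample t-test at a gene-wise alpha of 0.01,
    ## for a difference of one within-group standard deviation
    power.t.test(n = 3,  delta = 1, sd = 1, sig.level = 0.01)$power
    power.t.test(n = 10, delta = 1, sd = 1, sig.level = 0.01)$power

    ## n per group needed for 80% power at that same effect size
    power.t.test(delta = 1, sd = 1, sig.level = 0.01, power = 0.8)$n

And with any realistic multiple-testing correction the gene-wise alpha
would be far smaller than 0.01, so numbers like these flatter the
n = 3 design if anything.
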
Cheers,
-E
At 01:13 AM 9/26/2003 -0400, you wrote:
>i haven't had time to read this paper carefully, but here are some
>minor comments from what i saw:
>
>1) if i understood correctly, they compare their test to the t-test
>(for RMA, dChip, and MAS), which in this data set implies they are
>doing 3 versus 3 comparisons (is this right?). With an N=3 the t-test
>has very little power. In fact, we find that with N=3, in these data
>(affy spikein etc...), average log fold change outperforms the t-test
>dramatically. the SAM statistic does even better:
>
>http://biosun01.biostat.jhsph.edu/~ririzarr/badttest.png
>
>notice in the posted figure that for high specificity (around 100
>false positives), avg log fc gives twice as many true positives.
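>
>a quick toy simulation (invented numbers, not the spikein data itself)
>shows the mechanism: with only 3 arrays per group the SE in the t-test
>denominator is estimated with 4 df, so genes with tiny variance by
>chance dominate the ranking:
>
>  ## 10000 genes, 3 vs 3 arrays, 100 genes truly shifted by 1 unit
>  set.seed(123)
>  G <- 10000; changed <- rep(c(TRUE, FALSE), c(100, G - 100))
>  x <- matrix(rnorm(G * 3), G, 3) + ifelse(changed, 1, 0)
>  y <- matrix(rnorm(G * 3), G, 3)
>  fc <- rowMeans(x) - rowMeans(y)                      # avg log fc
>  se <- sqrt(apply(x, 1, var)/3 + apply(y, 1, var)/3)  # noisy with N=3
>  tstat <- fc / se
>  ## true positives among the top 100 genes of each ranking
>  sum(changed[order(-abs(fc))][1:100])
>  sum(changed[order(-abs(tstat))][1:100])
>
>on runs like this the fold-change ranking recovers noticeably more of
>the changed genes. a moderated statistic like SAM's does better still
>because it stabilizes that denominator.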
>
>so their conclusion should not be that logit-t is better than RMA but
>rather that logit-t is a better test than the t-test when N=3,
>regardless of expression measure. not an impressive feat. RMA is not a
>test, it's an expression measure. one can build tests with RMA; some
>will be better than others. judging by their ROC, RMA using the SAM
>stat or simply the avg log fc stat would outperform the logit-t.
>
>2) another problem i found is the use of the PPV at just one cut-off
>as an assessment. ROC curves where both true positives (TP) and false
>positives (FP) are shown are much more informative (notice TP and FP
>can be calculated easily from the rates if one knows the number of
>spiked-in genes and the total number of genes on the array). the PPV
>can be computed for any cutoff or point on the ROC curve: TP/(FP+TP).
>In affycomp (http://affycomp.biostat.jhsph.edu) we show ROC curves
>where the FP go only up to 100, since having lists with more than 100
>FP is not practical. see Figure 5 here:
>
>http://biosun01.biostat.jhsph.edu/~ririzarr/papers/affycomp.pdf
>
>When one computes a t-test and uses a p-value of 0.01 as the
>threshold, one is way outside this bound. So IMHO, Table 1 in their
>paper is misleading: because the ROC curve flattens very quickly, if
>one changed the p-value cut-off to 0.001 the FPs for both RMA and
>dChip would drop dramatically but the TPs would not drop much. this is
>why it is more informative to show ROC curves as opposed to just a
>number based on one point in the ROC curve. if a one-number summary is
>needed, the area under the ROC curve or the ROC convex hull are much
>better summaries than just one PPV.
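>
>as a toy example of the kind of summary i mean (invented numbers, not
>the actual spikein calls):
>
>  ## ranked scores for 10000 genes, 100 of them "spiked in"
>  set.seed(1)
>  spiked <- rep(c(TRUE, FALSE), c(100, 9900))
>  score  <- rnorm(10000) + ifelse(spiked, 2, 0)
>  o  <- order(-score)
>  TP <- cumsum(spiked[o]); FP <- cumsum(!spiked[o])
>  TP[200] / (TP[200] + FP[200])  # PPV at one arbitrary cutoff (top 200)
>  max(TP[FP <= 100])             # TPs achieved before reaching 100 FPs
>  ## the part of the curve that matters:
>  ## plot(FP[FP <= 100], TP[FP <= 100], type = "s")
>
>one number from one cutoff hides how fast that curve is still climbing.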
>
>the ROC curves shown in this paper go up to rates of 0.4 (5000 FP).
>for such tight comparisons, this should really go up to around 0.01
>(100 FP) so one can see the area of interest. in our NAR paper we show
>rates up to 1.0, but this is because the comparisons were not tight at
>all.
>3) a minor mistake is that they incorrectly state that affy's spikein
>was done on the hgu95av2 chip. it was done on the hgu95a chip.
>
>4) finally:
>
>On Thu, 25 Sep 2003, Eric wrote:
><snip>
> > Lemon et al. developed an interesting and possibly improved gauge
> > of confidence called the positive predictive value (PPV) that may
> > be useful
>
>the positive predictive value (PPV) is a term that has been around for
>decades. in medical language: "the positive predictive value of a test
>is the probability that the patient has the disease when restricted to
>those patients who test positive." a simple estimate is TP/(TP+FP).
>
>hope this helps,
>rafael
> > for future scientists looking to test their low-level algorithms
> > on known data sets, but the heart of the paper has to do with their
> > idea on transforming the intensity values.
> >
> > The authors set out, using a variation on Langmuir's adsorption
> > isotherm (that is, the classic semi-log sigmoidal dose-response
> > relationship), to transform the intensity values of individual
> > probes on the array. To me, this makes more biological sense than
> > some other procedures because it is based on the ligand-receptor
> > relationship between the probes and the mRNA species to which they
> > are designed to hybridize.
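> >
> > (I am paraphrasing from memory here; the Methods section has the
> > exact estimators, but the gist of a logit-style transform of a
> > probe intensity y, bounded by chip-specific background and
> > saturation levels, is something like the following, with lo and hi
> > as placeholders rather than the paper's actual estimates:)
> >
> >     logit.intensity <- function(y, lo, hi) log((y - lo) / (hi - y))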
> >
> > However, when the authors combined their transformed feature-level
> > information into a single measure per probe set, they found that
> > their procedures (Logit-Exp and Logit-ExpR) performed no better
> > than RMA or dChip.
> >
> > Interestingly, if they DID NOT collapse their probe-level data into
> > a single probe-set value, and instead tested across all probes
> > (Logit-t), their transformed data did a much better job of
> > winnowing the wheat from the chaff. They concluded that "...the
> > modeling paradigm may cause the loss of information from the
> > probe-level data".
> >
> > This seems critical to me: there is a huge discrepancy between the
> > significant gene lists generated with different probe-level
> > algorithms, and I don't believe we'll be able to understand why
> > that dichotomy exists until we look at the underlying probe-level
> > information.
> >
> > I have been pleading with our stats department for over a year (I
> > am just a neuroscientist and I write code like a hippopotamus
> > roller skates) to employ a 2-way ANOVA on repeated measures at the
> > probe level to test for significance, and in fact went so far as to
> > put the notion (with some sample data) into a book chapter I
> > authored earlier this year (Chapter 6 in A Beginner's Guide to
> > Microarrays).
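> >
> > A rough sketch of what I have in mind, for a single probe set with
> > simulated values (real data would be the log-scale probe
> > intensities):
> >
> >     ## 16 probes x 6 chips, 3 chips per group (invented numbers)
> >     d <- expand.grid(probe = factor(1:16), chip = factor(1:6))
> >     d$group   <- factor(ifelse(as.integer(d$chip) <= 3, "ctl", "trt"))
> >     d$log2int <- rnorm(nrow(d), mean = 8) +
> >                  ifelse(d$group == "trt", 0.5, 0)
> >     ## group is tested against between-chip (biological) variation;
> >     ## probe and group:probe against within-chip variation
> >     summary(aov(log2int ~ group * probe + Error(chip), data = d))
> >
> > The F test for group in the chip stratum is the one that respects
> > the biological n of 3 per group, and the group:probe interaction is
> > where probe-level disagreement between groups would show up.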
> >
> > The authors state that "the combination of logit transformation and
> > probe-level statistical testing provides a means for greatly
> > improved PPV...". I would agree, but add the caveat that the
> > comparison at the probe level on untransformed values has yet to be
> > done; thus the probe-level idea may matter more to the improved PPV
> > than the transformation itself. Other methods have looked at the
> > probe-level information (e.g., Liu et al. 2002, the Affymetrix
> > multiple pairwise comparison, although their use of the
> > feature-level data as the biological n may be inappropriate; and
> > Zhang et al. 2002, although their method was intended only for a
> > two-chip comparison).
> >
> > I believe that it is unfortunate that the authors resort to fold
> > change as a final discriminator after all of that hard work, rather
> > than a formal statistical test. I still feel that 2-way ANOVA on
> > repeated measures is the right test for this, but would love to
> > hear from others.
> >
> > -E
> >
> > P.S. My apologies to Lemon et al. if I have misrepresented or
> > misunderstood your work. I will gladly retract or correct this (or
> > any part of it) at your request.
> >
> > At 12:00 PM 9/25/2003 +0200, you wrote:
> >
> > Message: 1
> > Date: Wed, 24 Sep 2003 19:46:04 +0200
> > From: "Dario Greco" <greco@biogem.it>
> > Subject: [BioC] ...Logit-t vs RMA...
> > To: "Bioconductor" <bioconductor@stat.math.ethz.ch>
> > Message-ID: <002601c382c3$bd4a42a0$ce3ca48c@neo>
> > Content-Type: text/plain; charset="us-ascii"
> >
> > Hi everybody,
> > A few days ago I read the new paper on the "logit-t" method for
> > analyzing affy chips.
> >
> >
> > --------------------------------------------------------------
> > "A high performance test of differential gene expression for
> > oligonucleotide arrays"
> >
> > William J Lemon, Sandya Liyanarachchi and Ming You
> >
> > Genome Biology 2003, 4:R67
> >
> > --------------------------------------------------------------
> >
> > What do you think about this?
> >
> > Regards
> > Dario
> >
> > --------------------------------------------
> > Dario Greco
> > Institute of Genetics and Biophysics
> > "Adriano Buzzati Traverso" - CNR
> > 111, Via P.Castellino
> > 80131 Naples, Italy
> > phone +39 081 6132 367
> > fax +39 081 6132 350
> > email: greco@igb.cnr.it; greco@biogem.it
> >
> > Eric Blalock, PhD
> > Dept Pharmacology, UKMC
> > 859 323-8033
> >
Eric Blalock, PhD
Dept Pharmacology, UKMC
859 323-8033