ttest or fold change
Jason Hipp ▴ 40
@jason-hipp-557
Last seen 10.2 years ago
I am comparing a relatively homogeneous cell culture to another that has been treated, and am using RMA. I only have 3 replicates of each. Would you recommend a two-tailed, equal-variance t-test? I also thought I read that with so few replicates, a fold change would be better than a t-test? If I get a t-test p-value of 0.0001 and a fold change of 1.2, is this a reliable change using RMA?

Thanks,
Jason
@rafael-a-irizarry-205
Last seen 10.2 years ago
In my experience with Affymetrix microarrays the t-test is close to powerless with only 3 replicates; fold change works much better. Things like SAM (see the siggenes package for the SAM statistic and the limma package for SAM-like statistics) are much better as well. If you do use a t-test, make sure to look at a volcano plot (-log p-value versus average log fold change). Typically you see many, many genes with very small p-values but log fold changes very close to 0; these are likely not of interest (the denominator of the t-test was small by chance).
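As an illustration, here is a minimal base-R sketch of the volcano plot Rafael describes; the matrix eset of RMA log2 expression values and its column layout (three control, three treated) are assumptions, not anything posted in the thread.

## Hypothetical setup: 'eset' is a matrix of RMA log2 expression values,
## columns 1-3 untreated, columns 4-6 treated.
grp <- rep(c("control", "treated"), each = 3)

## ordinary equal-variance t-test, one gene (row) at a time
pvals <- apply(eset, 1, function(x)
    t.test(x[grp == "control"], x[grp == "treated"], var.equal = TRUE)$p.value)

## average log2 fold change (treated minus control)
logFC <- rowMeans(eset[, grp == "treated"]) - rowMeans(eset[, grp == "control"])

## volcano plot: genes in the top middle (tiny p-value, near-zero fold
## change) are the ones Rafael warns about
plot(logFC, -log10(pvals), pch = ".",
     xlab = "average log2 fold change", ylab = "-log10(p-value)")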
A.J. Rossini ▴ 810
@aj-rossini-209
Last seen 10.2 years ago
"Jason Hipp" <jhipp@wfubmc.edu> writes: > I am comparinga relatively homogeneous cell culture to another that has been treated, and am using RMA. > > I only have 3 replicates of each. Would you recommend a 2 tailed equal variance t test? > I also thought I read that with such few replicates, a fold change would be better than a t test? > If I get a t test of .0001, and a fold change of 1.2, is this a reliable change using RMA? You need to consider: 1. what is that value compared to the others from the experiment 2. are you doing an exploratory analysis to be confirmed later or is this part of the final scientific justification? 3. you can't treat it like a black box. Context-free science is pretty much content-free (i.e. what cell culture, what treatment, apriori should the differences be large, etc.. etc...) If you are just generating hypotheses, probably reasonable, based on MY ASSUMPTIONS. Of course, you are probably violating those, since I'm not telling you what they are and you have no clue as to whether I'm reasonable or insane this morning. (think about that for a bit and re-read what you wrote... how can we know what you've done from the above? how can you even come close to justifying equal variance? Are you just looking for a way to order the results?) best, -tony -- rossini@u.washington.edu http://www.analytics.washington.edu/ Biomedical and Health Informatics University of Washington Biostatistics, SCHARP/HVTN Fred Hutchinson Cancer Research Center UW (Tu/Th/F): 206-616-7630 FAX=206-543-3461 | Voicemail is unreliable FHCRC (M/W): 206-667-7025 FAX=206-667-4812 | use Email CONFIDENTIALITY NOTICE: This e-mail message and any attachme...{{dropped}}
Ramon Diaz ★ 1.1k
@ramon-diaz-159
Last seen 10.2 years ago
Dear Jason,

First, I think you should recognize that three replicates are very few, and thus conclusions will not be particularly trustworthy. I assume this is a first round of screening for relevant genes for subsequent studies.

Second, I think the fold-change vs. t-test issue often muddles two different questions: a) is there statistical evidence of differential expression; b) is the expression of gene X altered in a biologically relevant way (where biologically relevant means more than Z-fold). If you had a large number of samples you might be able to detect as "statistically significant" very small log-ratio changes (which might, or might not, be biologically relevant); conversely, what if the fold change is large but the variance is huge? For reasons I don't understand, the two-fold change sometimes has a sacrosanct status, but it is my understanding that other fold changes (say 1.3 or 3.5) could, in certain cases, be much more biologically relevant; this, of course, depends on the context.

In your case, the t-test has an additional potential problem with the denominator. I would suggest using some procedure, such as the empirical Bayes one in limma, that uses a modified expression for the denominator and saves you from finding some very small p-values just because a gene has, by chance, an artificially small variance. So I would use limma (or something like it) and also filter by some criterion that biologists tell you is relevant for them (say, we only want genes that are overexpressed at least 5 times, or whatever).

Best,

R.
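As a sketch of that suggestion (not code Ramón posted), the limma route with a current version of the package might look like the following; eset and its 3 + 3 column layout are hypothetical.

library(limma)

## hypothetical: 'eset' holds RMA log2 values, 3 control then 3 treated columns
design <- model.matrix(~ factor(rep(c("control", "treated"), each = 3)))

fit <- eBayes(lmFit(eset, design))   # moderated (empirical Bayes) t-statistics
tab <- topTable(fit, coef = 2, number = Inf)

## combine statistical evidence with a biological-relevance filter,
## e.g. at least 2-fold change in either direction
tab[tab$adj.P.Val < 0.05 & abs(tab$logFC) >= 1, ]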
@michael-watson-iah-c-378
Last seen 10.2 years ago
Why not try the non-parametric t-tests available?

I know all the arguments about a "loss of power" etc., but at the end of the day, as statisticians and bioinformaticians, sometimes biologists come to us with small numbers of replicates (for very understandable reasons) and it is our job to get some meaning out of that data. Trying to fit any kind of statistic involving a p-value to such data is a difficult and risky task, and trying to explain those results to the biologist is often very difficult.

So here's what happens with the non-parametric tests based on ranking. The genes with the highest |t| are those where all the replicates of one condition are greater than all the replicates of the other condition. The next highest |t| is where all but one of the replicates of one condition are greater than all the replicates of the other condition, etc.

OK, so some of these differences could occur by chance, but we're dealing with often millions of data points and I really don't think it's possible to make no mistakes. And curse me if you like, but if I have a gene expression measurement, replicated 5 times in two conditions, and in one condition all five replicates are higher than the five replicates of the other condition, then I believe that that gene is differentially expressed. And that's easy to find with a non-parametric t, and it is easy to explain to a biologist, and at the end of the day, is it really wrong to do that?
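A minimal sketch of that rank-based screen, assuming a hypothetical log2 matrix eset with five replicates per condition; the perfect-separation case Michael describes is exactly the most extreme Wilcoxon rank-sum outcome.

grp <- rep(c("A", "B"), each = 5)

wp <- apply(eset, 1, function(x)
    wilcox.test(x[grp == "A"], x[grp == "B"], exact = TRUE)$p.value)

## with 5 vs 5 replicates and no ties, the smallest attainable two-sided
## p-value is 2/choose(10, 5) = 0.0079: every replicate of one condition
## above every replicate of the other
separated <- wp <= 2 / choose(10, 5)
sum(separated)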
@michael-newton-456
Last seen 10.2 years ago
Hi,

My own calculations have also shown a lack of sensitivity of both t-testing and other approaches when we have few replicates. You might be interested in a mixture approach that seems promising. See http://www.stat.wisc.edu/~newton/papers/abstracts/tr1074a.html for code and a paper.

Michael Newton

P.S. That site contains a major revision of the report I released last January, with code etc. recently updated; aiming for Bioconductor soon!
You may try LPE (under the developmental packages on Bioconductor), which is suited to significance analysis with low numbers of replicates.

Regards,
-Nitin
@baker-stephen-469
Last seen 10.2 years ago
Two POINTS.

First, I think people are missing the point about statistics. When you look at tens of thousands of genes, random chance plays a BIG role. This means that large fold changes and very small p-values will often occur by chance alone. The investigator doesn't know which large fold changes or small p-values are due to chance and which are "true" results. The best we can do is to try to apply statistics (and biology) with these things in mind; that's what statistics is all about (one of the things, anyway), i.e. applying methods in such a way as to get results that are meaningful and for which the probabilities of making different types of errors are known.

With respect to just looking at the rankings: with small samples the distributions are well known, very discrete, and have very few distinct probabilities. The situation that Mike describes, i.e. 10 samples with 5 per group, is one for which there are 252 possible outcomes (10 choose 5, combinatorially). If you consider both up- and down-regulation as outcomes of interest, there are 126 distinct outcomes, so the probability of perfect separation occurring by chance alone is 1 in 126, or 0.00794. This seems small, but on a microarray with thousands of genes it easily produces a bunch of false positives. I looked at 10 chips from a real control group, arbitrarily labeling 5 chips as control and 5 as experimental. By theory I would expect 35 false positives, and I got exactly 32, that is, 32 situations in which all the low ranks were in one group and the high ranks in the other. For a chip with 22,000 genes, you would expect 175 false positive results by this criterion. Standard statistical methods would give you a specified type I error rate that you can count on; they would have found NONE of the genes significant (i.e. with a Bonferroni adjustment). This same set of control chips produced 50 false positive results using a 2-fold change criterion. Again, these are ALL false positives.

Second, with respect to t-tests, a couple of people have mentioned "the problem" that the t-test denominator may be accidentally "too small". This is because the t-test uses an ESTIMATE of the variance from the sample itself. This is what William Sealy Gosset, otherwise known as "Student", discovered, and it prompted him to develop the t-distribution and t-test. Gosset was a brewmaster for the Guinness brewery in Dublin; doing experiments with hops and things, he discovered that the well-known normal distribution was inaccurate when the variance was estimated from a sample. He developed, empirically, the t-distribution, which takes the variability of the variance estimate into account, so the t-test is ALREADY ADJUSTED to compensate for weird values in the denominator due to random sampling. One thing that I think is too often ignored is that different genes have different variances; the fact that one gene appears to have a smaller variance than its neighbors (or a larger one) could mean that it ACTUALLY DOES have a larger or smaller variance, OR it may be due to sampling variability. The t-test assumes the former but adjusts for the latter possibility. It worked then and it works now; it is NOT a problem. Student's friend, the genius R. A. Fisher, took Student's empirical result and worked out the theory on which analysis of variance is based. This theory has withstood the test of time; it is about 100 years old and still holds. Given that the assumptions are correct, t-tests and ANOVA are still uniformly most powerful tests.

Stephen P. Baker, Sr. Biostatistician, University of Massachusetts Medical School, Worcester
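The combinatorics in the first point can be checked in a few lines of R (the 22,000-gene figure is from the post; the rest is plain arithmetic):

choose(10, 5)                    # 252 equally likely 5/5 rank splits
p.extreme <- 2 / choose(10, 5)   # perfect separation, up or down
p.extreme                        # 0.00794, i.e. 1 in 126
22000 * p.extreme                # ~175 expected false positives on a 22k chip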
Dr. Baker,

You wrote about "the problem" that the t-test denominator may be accidentally "too small", and you say that this issue has been solved within the t-test. It is my belief that this problem has only been partially solved: it has been solved for a single hypothesis test within the t-test, but it has not been solved for microarray data analysis as a whole.

It is possible to gain power by using local estimates of variance based upon more than one gene. This sort of approach is extremely useful for experiments with only a few replicates because it deals with the situation where the within-group variance for a single gene happens to be very small. This is the approach implemented in Cyber-T (http://visitor.ics.uci.edu/genex/cybert/). By looking at the dataset as a whole, rather than one gene at a time, it is possible to eliminate false positives that arise as a result of coincidentally low within-group variance. Do you agree?

Other than this minor point, I think you did a wonderful job putting into words the statistical concepts that so many struggle with.

Garrett Frampton
Research Associate
Boston University School of Medicine - Microarray Resource
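To make the idea concrete, here is a rough sketch of that kind of variance regularization; this is the general scheme behind Cyber-T-style tests, not Cyber-T's actual code, and eset, the window width, and the weight v0 are all made up for illustration.

grp <- rep(c("control", "treated"), each = 3)
n <- 3     # replicates per group
v0 <- 10   # pseudo-replicates: how much to trust the background estimate

## per-gene variance in the control group, plus a "background" variance
## from genes of similar expression level (running median over a window)
s2 <- apply(eset[, grp == "control"], 1, var)
o <- order(rowMeans(eset[, grp == "control"]))
bg <- runmed(s2[o], k = 101)[order(o)]

## weighted combination: with n = 3 the background dominates, and as n
## grows the ordinary per-gene estimate takes over
s2.reg <- (v0 * bg + (n - 1) * s2) / (v0 + n - 1)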
@michael-watson-iah-c-378
Last seen 10.2 years ago
> This seems small but with a microarray with thousands of genes, this
> easily produces a bunch of false positives. [...] Standard statistical
> methods would give you a specified type I error rate that you can count
> on; they would have found NONE of the genes significant (i.e. with a
> Bonferroni adjustment).

A truly excellent reply, and one which I will no doubt refer to frequently; I am still very much a novice statistician.

However, and please correct me if I am wrong, I presume that some scientists are as afraid of false negatives as of false positives? I.e., if we are so conservative that we try to ENSURE there are NO false positives, we may throw away genes as not differentially expressed when in reality they are. It would be interesting to have a discussion on this: is it possible, using statistics, to guarantee both no false positives and no false negatives? If not, then surely the investigator must decide which is relevant to the study in question before going on to decide which stats to use.
@baker-stephen-469
Last seen 10.2 years ago
Of course investigators don't want false negatives as well as false positives, but you can promise neither no false positives nor no false negatives, except in the trivial case where one classifies 100% as positive or negative. The best you can do is to quantify the probabilities, then trade off one for the other: decreasing the probability of one type of error increases the probability of the other. However, there is an arbitrary but real de facto standard for type I error of 5%, which limits how much the tradeoff can be manipulated. New approaches such as mixture models offer some promise of improvement, and use of the False Discovery Rate can make a very big difference in the number of regulated genes detected; I think this is underutilized.

Fortunately the REAL answer is under the control of the investigator, with the help of the statistician. That is the process of power analysis: the statistician can help the investigator calculate the number of microarrays needed to provide a desired probability of detecting an effect of a specified size (in fold changes). There is no effect that is too subtle to detect with enough data. Of course, there is no such thing as a free lunch, and microarrays are still expensive (but getting cheaper); then again, compared with any other technology, microarrays are incredibly inexpensive considering the amount of data they produce. Imagine the cost in materials and labor to do PCR on 10,000 or 20,000 genes!

The studies we are seeing are getting larger and larger. Funding agencies are funding well-prepared proposals for large studies with many microarrays (i.e. enough to detect meaningful effects) based on small studies of a few microarrays. These small studies are then pilot studies, and pilot studies do not need to be "definitive" to be useful; they just may not be publishable on their own.

Stephen P. Baker, Senior Biostatistician, University of Massachusetts Medical School
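The power-analysis step is essentially one line in R; the per-gene SD of 0.5 on the log2 scale below is an invented figure, and the Bonferroni-style alpha is just one way to account for a 22,000-gene chip.

## how many arrays per group to detect a 2-fold change (delta = 1 on the
## log2 scale) with 80% power, at a Bonferroni-adjusted 5% level?
power.t.test(delta = log2(2), sd = 0.5,
             sig.level = 0.05 / 22000, power = 0.8)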
@stephen-henderson-71
Last seen 7.6 years ago
Yes, this is exactly the idea behind the False Discovery Rate (FDR) algorithms for adjusting p-values, which you can find described in both the multtest and limma packages.
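For example (a sketch, with pvals standing in for whatever vector of raw per-gene p-values the chosen test produced):

## Benjamini-Hochberg FDR adjustment in base R
padj <- p.adjust(pvals, method = "fdr")
sum(padj < 0.05)   # genes called significant at a 5% FDR

## the multtest equivalent
library(multtest)
adj <- mt.rawp2adjp(pvals, proc = "BH")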
Crispin Miller ★ 1.1k
@crispin-miller-264
Last seen 10.2 years ago
Hi,

I was wondering what people made of the following, hypothetical, experiment; I think it raises some interesting issues about where arrays sit in the context of a complete experiment at the bench.

The old hgu95 Affy chips had about 12k probesets on them, the u133a chips have about 22k probesets, and the new plus2 chips have about 56k probesets. When I did my first experiment on u95 arrays I did multiple testing correction and found one gene that was particularly interesting, with a corrected p-score of 0.01. Then I repeated the experiment on u133a arrays and found the same gene, but because there were nearly twice as many probesets on the array, the chance of false positives nearly doubled (ish), so my p-score dropped. Now, on the plus2 arrays, my gene has a p-score of >0.05 and it's not significant anymore.

What troubles me is that the gene I chose to work on 6 months ago with my collaborators now falls through the filter I've set. I guess this means that I can't say it isn't just a false positive, and so I need to do some follow-up to confirm it by other means. The trouble is that if I did that with a Northern, or real-time PCR, or even a Southern, shouldn't I be applying multiple testing corrections to these too, based on the other Northerns or Southerns I could have run in parallel?

Crispin
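The arithmetic behind the worry, under a straight Bonferroni correction (the raw p-value below is back-calculated from the corrected 0.01 on the 12k array; the real analyses would of course differ in more than just the number of tests):

rawp <- 0.01 / 12000   # raw p behind a corrected 0.01 on ~12k probesets
pmin(1, rawp * c(hgu95 = 12000, u133a = 22000, plus2 = 56000))
##  hgu95  u133a  plus2
##  0.010  0.018  0.047   # the same evidence drifts toward the 0.05 line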
@james-w-macdonald-5106
Last seen 14 hours ago
United States
You only have to adjust for the multiple comparisons you have made, not those you could have made.

Best,

Jim

James W. MacDonald
Affymetrix and cDNA Microarray Core
University of Michigan Cancer Center
Crispin Miller ★ 1.1k
@crispin-miller-264
Last seen 10.2 years ago
Hi Jim,

> You only have to adjust for the multiple comparisons you have made,
> not those you could have made.

I take your point, but I'm still concerned that a hypothesis I would have accepted six months ago is now something I'd reject. If nothing else, because I have to explain why :-)

I think there is a wider issue about the logical decisions that need to be made about how one groups the data in order to apply a correction. Another example: Scientist A is interested in what is up-regulated by his transcription factor. He does a real-time experiment with replicates and finds a gene with significant induction. Scientist B is also interested in the same transcription factor. She does a similar real-time experiment on a different gene (but with the same transcription factor). Does she do multiple testing correction to take into account the previous work by A? (I think this is very similar to the new-vs-old chips question, but with different numbers.)

I did a quick search on the web and this sort of thing appears to have been discussed quite a lot before, e.g. "To Bonferroni or not to Bonferroni..." by Cabin and Mitchell (2000), which is cited a fair number of times by articles in ecology journals.

Crispin
@baker-stephen-469
Last seen 10.2 years ago
Garrett et al.,

The t-test (or ANOVA) does not have a problem with "accidentally too small" variances, either with one or with more than one outcome of interest. The estimate of the error variance in t-tests and ANOVA is a least-squares estimate: it is the UNBIASED ESTIMATOR that also attains the lower bound on the variance for the "best" (minimum variance) linear unbiased estimator (BLUE) of the effects being tested (see Graybill 1976).

Some Bayesian methods can generate smaller estimates of variances by biasing the estimate toward some overall measure, such as the average of the variances for nearby genes. These are BIASED estimates based on an assumption that a particular gene should really be like genes that are "nearby" in some sense, for example genes with similar expression levels. You would have to present a lot of data to convince me that any randomly selected gene should have a variance like some other set of genes, especially when I have an unbiased estimate at hand that is non-controversial, requires no defense, and uses methods that have withstood 100 years of review and scrutiny. I'm familiar with shrunken estimates of effects that can have a smaller mean squared error, but those are random effects, not the variances that control the power and the type I error rate.

These approaches, in addition to producing biased estimates, sometimes require analysts to impose their own particular biases, called "prior beliefs" or "priors", by specifying how much weight is given to the data from that gene and how much to the other set the gene is supposed to "be more like". Again, it would take some pretty strong arguments to convince me about any particular analyst's prior beliefs on how the data for a gene and the data from other genes should be weighted. I would be concerned about how much convincing a readership, reviewer, or study group would need if they ever decided to "open the black box" and ask me to explain why such an approach is reasonable or justifiable.

The program Garrett mentioned, Cyber-T, uses such an approach. To quote the Cyber-T manual: "...This weighting factor IS CONTROLLED BY THE EXPERIMENTER AND WILL DEPEND ON HOW CONFIDENT THE EXPERIMENTER IS that the background variance of a closely related set of genes approximates the variance of the gene under consideration." Now, if one were looking at just ONE gene, it makes sense that someone might put a lot of thought into it, have looked at a lot of similar genes or other data, come to the conclusion that the gene should be like some other genes, and THEN use this approach. But this is not the case when you have 10,000 or 22,000 genes, at least not in the world I'm familiar with.

I use empirical Bayes methods for fitting general linear mixed models, where the priors are objective, not my own opinion. Cyber-T does offer the option of setting low confidence in the prior, which is an objective prior, but the manual points out that this results in the standard Student t-test! Another feature of Cyber-T is that when you have "enough" data, the weighted approach converges to the standard t-test as well.

The real problem that researchers face with microarrays is NOT that their t-test variances are too small, but that they often have insufficient sample to detect the differences they need to detect. The ready solution is to get enough data.

Stephen P. Baker, Sr. Biostatistician, University of Massachusetts Medical School, Worcester
Dear Stephen,

Thank you for your detailed comments. Two points:

1. It is my understanding that there are other issues at stake besides the unbiasedness of the error variance estimate, but I'll leave the technical discussion to others who are much more capable. However, the results in, for example, Lönnstedt & Speed (2002, Statistica Sinica, 12:31-46), Smyth (2003, http://www.statsci.org/smyth/pubs/ebayes.pdf), or Qin & Kerr at the IMA Workshop (http://www.ima.umn.edu/talks/workshops/9-29-10-3.2003/kerr/KerrIMA.pdf) seem to indicate, with both simulated and "wet lab" data, that we can do much better (in terms of false positives and false negatives) using t-like tests that combine information across genes than with the standard t-test.

2.

> The real problem that researchers face with microarrays is NOT that
> their t-test variances are too small, but that they often have
> insufficient sample to detect the differences they need to detect. The
> ready solution is to get enough data.

I do agree with the general point. In a previous incarnation I used to do behavioral ecology and help field biologists with their data. It was not unheard of (in areas with a lot less funding than molecular biology) to spend two years in the field following some creatures to try to get decent sample sizes (and maybe one or two papers out). The answer to small sample sizes was often more field seasons, not shortcuts in the data analysis. I often interact with molecular biologists and MDs who persevere in using tiny sample sizes for "serious stuff". This concerns me a lot (as both a statistician and a potential patient who might one day seek treatment!).

Best,

R.

Ramón Díaz-Uriarte
Bioinformatics Unit
Centro Nacional de Investigaciones Oncológicas (CNIO)
@garrett-frampton-434
Last seen 10.2 years ago
Dr. Baker, Thank you very much for the reply. It was quite enlightening and I agreed with almost everything. Particularly the idea that the is no substitute for collecting enough data to have the power to see that changes that you are looking for. Nevertheless, it will be along time before we can get away from analyzing small datasets (3 vs 3 for example). It is often important to perform a small study in order to get preliminary data for a larger one. In fact, in most cases this would be advisable in order to get an idea of technical and biological variability prior to designing the larger study. Consequently, it is important to be able to analyze small datasets. Suppose that we have a large dataset from a study with two experimental conditions (100 vs 100). Assume that there are large, reproducible differences (many fold, many standard deviations) between the conditions for a number of genes (1-5% of the data). A T-test can be used on this dataset to define a group of differentially expressed genes. Select 3 samples at random from each group and use two statistical tests, a T-test and the Bayesian T-test implemented in Cyber-T. At any significance cut-off, the genes found to be differentially expressed by the Bayesian T-test will be in much better agreement with the genes found by a T-test from the 100 samples than the regular T-test will be. I think that this is at odds with your conclusion. GMF ----- Original Message ----- From: "Baker, Stephen" <stephen.baker@umassmed.edu> To: <bioconductor@stat.math.ethz.ch> Sent: Tuesday, December 16, 2003 5:45 PM Subject: RE: [BioC] ttest or fold change > Garrett et al, > > The t-test (or ANOVA) does not have a problem with "accidentally too > small" variances, either with one or more than one outcome of interest. > The estimate of the error variance by t-tests and ANOVA is a Least > Squares estimate and is the UNBIASED ESTIMATOR that is also the lower > bound on the variance for the "best" (minimum variance) linear unbiased > estimator (BLUE) of the effects being tested (see Graybill 1976). > > Some bayesian methods can generate smaller estimates of variances by > biasing the estimate toward some overall measure such as the average of > variances for nearby genes. These are BIASED estimates based on an > assumption that a particular gene should really be like genes that are > "nearby" in some sense, such as they have similar expression levels. > You would have to present a lot of data to me to convince me that any > randomly selected gene should have a variance like some other set of > genes, especially when I have an unbiased estimate at hand that is > non-controversial, requires no defense, and uses methods that have > withstood 100 years of review and scrutiny. I'm familiar with shrunken > estimates of effects that can have a smaller "mean squared error", but > these are random effects, not variances which control the power and type > I error rate. > > These approaches, in addition to producing biased estimates sometimes > require the analyst to impose his or her own particular biases, called > "prior beliefs" or "priors" on as to how much these estimates should be > biased by requiring that the analyst input how much weight is given to > the data from that gene and how much weight is given to the other set > that the gene is supposed to "be more like". 
Again, it would take some > pretty strong arguments to convince me to accept any particular analyst's > prior beliefs about how much the data for a gene, or data from other > genes, should or should not be weighted. I would be concerned about how > much convincing a readership, reviewer, or study group would need if > they ever decide to "open the black box" and ask me to explain why such > an approach is reasonable/justifiable. > > The program Garrett mentioned, Cyber-T, uses such an approach. To quote > the Cyber-T manual: "...This weighting factor IS CONTROLLED BY THE > EXPERIMENTER AND WILL DEPEND ON HOW CONFIDENT THE EXPERIMENTER IS that > the background variance of a closely related set of genes approximates > the variance of the gene under consideration". Now if one were looking > at just ONE gene, it makes sense that someone might put a lot of > thought into it, have looked at a lot of similar genes or other data, and > come to the conclusion that the gene should be like some other genes, and > THEN use this approach. But this is not the case when you have 10,000 > or 22,000 genes, at least not in the world I'm familiar with. > > I use empirical Bayes methods for fitting general linear mixed models, > where the priors are objective, not my own opinion. Cyber-T does offer > the option of setting low confidence in the prior, which is an objective > prior, but the manual points out that this results in the standard > Student t-test! Another feature of Cyber-T is that when you have > "enough" data, the weighted approach converges to the standard t-test > as well. > > The real problem that researchers face with microarrays is NOT that > their t-test variances are too small, but that they often have > an insufficient sample to detect the differences they need to detect. The > ready solution is to get enough data. > > -.- -.. .---- .--. ..-. > Stephen P. Baker, MScPH, PhD (ABD) (508) 856-2625 > Sr. Biostatistician- Information Services > Lecturer in Biostatistics (775) 254-4885 fax > Graduate School of Biomedical Sciences > University of Massachusetts Medical School, Worcester > 55 Lake Avenue North stephen.baker@umassmed.edu > Worcester, MA 01655 USA > > ------------------------------ > > Message: 6 > Date: Tue, 16 Dec 2003 10:24:31 -0500 > From: "Garrett Frampton" <gmframpt@bu.edu> > Subject: RE: [BioC] ttest or fold change > To: <bioconductor@stat.math.ethz.ch> > Message-ID: <00b801c3c3e8$b3ed2cc0$e1be299b@GARRETT> > Content-Type: text/plain; charset="US-ASCII" > > Dr. Baker, > > You wrote about "the problem" that the t-test denominator may be > accidentally "too small". You say that this issue has been solved > within the T-test. It is my belief that this problem has only been > partially solved. It is true that this "problem" has been solved for a > single hypothesis test within the T-test, but it has not been solved for > microarray data analysis as a whole. > > It is possible to gain power by using local estimates of variance based > upon more than one gene. This sort of approach is extremely useful for > experiments with only a few replicates because it deals with the > situation where the within-group variance for a single gene happens to > be very small. This is the approach implemented in Cyber-T: > http://visitor.ics.uci.edu/genex/cybert/. By looking at the dataset as > a whole, rather than one gene at a time, it is possible to eliminate > false positives that arise as a result of coincidentally low > within-group variance. > > Do you agree? 
> Other than this minor point I think you did a wonderful job putting the > statistical concepts that so many struggle with into words. > > > Garrett Frampton > Research Associate > Boston University School of Medicine - Microarray Resource > > _______________________________________________ > Bioconductor mailing list > Bioconductor@stat.math.ethz.ch > https://www.stat.math.ethz.ch/mailman/listinfo/bioconductor >
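A minimal R sketch of the subsampling comparison Garrett describes, using simulated data, with limma's moderated (empirical Bayes) t-statistic standing in for Cyber-T's Bayesian t-test (both replace the per-gene variance with a weighted combination of the gene's own variance and a pooled background variance); the matrix dimensions, effect sizes, and object names are hypothetical placeholders, not results from a real study:

    library(limma)

    ## Hypothetical stand-in for a 100 vs 100 study: 1000 genes on the
    ## log2 scale, the first 50 truly differentially expressed.
    set.seed(1)
    ngenes <- 1000
    x <- matrix(rnorm(ngenes * 200), nrow = ngenes)
    x[1:50, 101:200] <- x[1:50, 101:200] + 2
    grp <- factor(rep(c("control", "treated"), each = 100))

    ## Reference list: ordinary t-test on the full 100 vs 100 data.
    p.full <- apply(x, 1, function(y) t.test(y ~ grp, var.equal = TRUE)$p.value)
    ref <- rank(p.full) <= 50

    ## Draw 3 arrays at random from each condition.
    keep <- c(sample(1:100, 3), sample(101:200, 3))
    xs <- x[, keep]
    grps <- droplevels(grp[keep])

    ## Ordinary t-test on the 3 vs 3 subset.
    p.t <- apply(xs, 1, function(y) t.test(y ~ grps, var.equal = TRUE)$p.value)

    ## Moderated t-test on the same subset: each gene's variance is
    ## shrunk toward a value estimated from all genes before testing.
    design <- model.matrix(~ grps)
    fit <- eBayes(lmFit(xs, design))
    p.mod <- fit$p.value[, "grpstreated"]

    ## Overlap of each 3 vs 3 top-50 list with the full-data top-50 list.
    c(ordinary = sum(ref & rank(p.t) <= 50),
      moderated = sum(ref & rank(p.mod) <= 50))

On data like this the moderated list usually agrees better with the full-data list, which is the pattern Garrett reports for Cyber-T.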
ADD COMMENT
0
Entering edit mode
Hello all, I have been following the recent exchanges Re: ttest or fold change with a lot of interest, particularly the limitations of small sample sizes (i.e., 3 chips per treatment). The question I would like to raise is: for Affy chips, why not use the probe-level values instead of summary values for statistical tests, as Chu, Weir & Wolfinger (2002, Math. Biosci. 176:35-51) suggest? It seems like you throw away a lot of power to detect differences when you summarize 14 or 20 PM probes into one summary value. In fact, the median polish used by RMA to summarize is a linear additive model somewhat similar to the mixed model used by Chu et al.; RMA only considers the probes from one chip, while Chu et al.'s model uses the probes from all chips, along with cell line, treatment, and interaction effects (I still use the gcrma background correction and normalization on PM values). I'm not suggesting that this is a good substitute for conducting more replicates (I, too, am from a behavioral ecology background and tend to think an adequate sample size is at least 15-20), but I think it is a way to get more accurate information on differential expression from only a few replicates. I would like to get your thoughts on whether this is or isn't a valid method for analysis, and why. Thanks, Jenny Jenny Drnevich, Ph.D. Department of Animal Biology University of Illinois 515 Morrill Hall 505 S Goodwin Ave Urbana, IL 61801 USA ph: 217-244-6826 fax: 217-244-4565 e-mail: drnevich@uiuc.edu
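As a rough illustration of the probe-level idea, here is a sketch of a gene-by-gene model fit to simulated log2 PM values for a single probeset; the data frame, column names, and values are all hypothetical, and this is a simplification of the Chu et al. approach (their random chip effect is approximated with nlme::lme in the last step, not reproduced exactly):

    ## Hypothetical PM-level data for one probeset: 14 probes on 6 chips,
    ## 3 control and 3 treated (values simulated for illustration only).
    set.seed(1)
    pmdat <- data.frame(
      log2pm    = rnorm(14 * 6, mean = 10),
      probe     = factor(rep(paste("p", 1:14, sep = ""), times = 6)),
      chip      = factor(rep(paste("c", 1:6, sep = ""), each = 14)),
      treatment = factor(rep(c("control", "treated"), each = 14 * 3))
    )

    ## Fixed-effects version: probe affinities plus a treatment effect,
    ## using all 84 PM readings rather than 6 summary values per gene.
    fit <- lm(log2pm ~ probe + treatment, data = pmdat)
    summary(fit)$coefficients["treatmenttreated", ]

    ## Closer to Chu et al.: treat chip as a random effect, so the
    ## treatment contrast is tested against between-chip variation.
    library(nlme)
    fit.lme <- lme(log2pm ~ probe + treatment, random = ~ 1 | chip,
                   data = pmdat)
    summary(fit.lme)$tTable["treatmenttreated", ]

Note that the fixed-effects fit counts every probe reading as an independent replicate, which overstates the effective sample size when probes within a chip are correlated; the random-chip model guards against exactly that.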
ADD REPLY
0
Entering edit mode
I am currently preparing a manuscript discussing this issue in relation to the RMA model (or variants of it). Some of this is implemented in the affyPLM package. I am a little perplexed by your statement that RMA only considers probes from individual chips. Perhaps you need to take a closer look at the model. Ben On Wed, 2003-12-17 at 09:03, Jenny Drnevich wrote: > Hello all, > > I have been following the recent exchanges Re: ttest or fold change with a > lot of interest, particularly the limitations of small sample sizes (i.e., > 3 chips per treatment). The question I would like to raise is: for Affy > chips, why not use the probe-level values instead of summary values for > statistical tests, as Chu, Weir & Wolfinger (2002, Math. Biosci. 176:35-51) > suggest? It seems like you throw away a lot of power to detect differences > when you summarize 14 or 20 PM probes into one summary value. In fact, > the median polish used by RMA to summarize is a linear additive model > somewhat similar to the mixed model used by Chu et al.; RMA only considers > the probes from one chip, while Chu et al.'s model uses the probes from all > chips, along with cell line, treatment, and interaction effects (I still use the > gcrma background correction and normalization on PM values). I'm not > suggesting that this is a good substitute for conducting more replicates > (I, too, am from a behavioral ecology background and tend to think an > adequate sample size is at least 15-20), but I think it is a way to get > more accurate information on differential expression from only a few > replicates. I would like to get your thoughts on whether this is or isn't a > valid method for analysis, and why. > > Thanks, > Jenny > > > > > > Jenny Drnevich, Ph.D. > Department of Animal Biology > University of Illinois > 515 Morrill Hall > 505 S Goodwin Ave > Urbana, IL 61801 > USA > > ph: 217-244-6826 > fax: 217-244-4565 > e-mail: drnevich@uiuc.edu > > _______________________________________________ > Bioconductor mailing list > Bioconductor@stat.math.ethz.ch > https://www.stat.math.ethz.ch/mailman/listinfo/bioconductor
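For anyone who wants to follow Ben's pointer, a minimal affyPLM sketch along the lines of the package vignette, assuming the fitPLM interface and using the Dilution example data from the affydata package:

    library(affyPLM)
    library(affydata)   # provides the small Dilution example AffyBatch
    data(Dilution)

    ## Fit a robust probe-level model (probe effects plus chip effects)
    ## to every probeset; background correction and normalization are
    ## applied by default.
    pset <- fitPLM(Dilution)

    ## Chip-level expression estimates and their standard errors:
    ## one row per probeset, one column per array.
    coefs(pset)[1:5, ]
    se(pset)[1:5, ]

    ## Model-based quality assessment plots.
    NUSE(pset)   # normalized unscaled standard error boxplots
    RLE(pset)    # relative log expression boxplots

Because the model is fit across all chips at once, it uses probe-level information from every array, which is the point Ben raises about the RMA model.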
ADD REPLY
0
Entering edit mode
>I am a little perplexed by your statement that RMA only considers probes >from individual chips. Perhaps you need to take a closer look at the >model. > >Ben > Oops, you're right. My brain is starting to shut down in preparation for the holidays! I haven't yet looked at affyPLM, but I will now... Jenny Drnevich, Ph.D. Department of Animal Biology University of Illinois 515 Morrill Hall 505 S Goodwin Ave Urbana, IL 61801 USA ph: 217-244-6826 fax: 217-244-4565 e-mail: drnevich@uiuc.edu
ADD REPLY
