sample size for microarray experiments having 2 factors with one random effect


Naomi Altman ★ 6.0k

@naomi-altman-380

Last seen 3.6 years ago

United States

This appears to be a randomized complete block design. The way I would compute the sample size is:

Use a routine that computes sample size for a randomized complete block design.

If you are planning to use log2 expression, then enter "1" as the size of the difference you want to detect. That corresponds to 2-fold.

Using someone else's data for the same experiment, compute the sd for each gene (e.g. using Limma). Use the 70th or 80th percentile of SD as the SD for computing sample size. (This will be somewhat anti-conservative, but software for RCB sample size will not include EBayes computations which boost the power for any sample size, which is like decreasing the SD.)

Then just enter the p-value and power that you want. Again, you might want to consider using a smaller p-value to adjust for multiple comparisons. If so, you could look at the q-value versus p-value plot for the data you used to compute SD, and pick the p-value corresponding to your desired q-value.

The number of replicates in any experiment should be at least 3. (Those of you working in the medical field will think this is ridiculously small, but in underfunded areas of biology we are happy if we have funds for more than 3 reps.)

There is also software from uab.edu called PowerAtlas. I haven't looked recently, but I think it is primarily for completely randomized designs.

--Naomi

At 10:09 PM 4/23/2009, shirley zhang wrote:
>Dear list,
>
>I have the following affymetrix microarray experiment:
>
>2 fixed effects, each factor has two levels
>1 random effect (patient)
>
>Can anybody tell me how to calculate the sample size for it?
>
>Thanks,
>Shirley
>
>_______________________________________________
>Bioconductor mailing list
>Bioconductor at stat.math.ethz.ch
>https://stat.ethz.ch/mailman/listinfo/bioconductor
>Search the archives:
>http://news.gmane.org/gmane.science.biology.informatics.conductor

Naomi S. Altman                814-865-3791 (voice)
Associate Professor
Dept. of Statistics            814-863-7114 (fax)
Penn State University          814-865-1348 (Statistics)
University Park, PA 16802-2111
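The recipe above can be sketched in a few lines of Python. This is only a rough illustration, not a replacement for dedicated RCB sample size software: it assumes two treatments per block, treats the chosen SD as the residual SD after block effects are removed, and uses a normal approximation instead of the t distribution, so it will tend to understate the number of blocks when that number is small. The function names and the example values are hypothetical.

```python
import math
from statistics import NormalDist, quantiles


def sd_percentile(gene_sds, pct=75):
    """Return the pct-th percentile (integer 1..99) of per-gene SDs
    taken from a comparable earlier experiment."""
    cuts = quantiles(gene_sds, n=100, method="inclusive")
    return cuts[pct - 1]


def blocks_needed(delta, sd, alpha, power):
    """Approximate number of blocks (replicates) needed to detect a
    difference `delta` between two treatments in a blocked design.

    Assumption: normal approximation, n = 2 * (sd * (z_a + z_b) / delta)^2,
    where sd is the residual SD on the same scale as delta (e.g. log2).
    The result is never allowed to drop below 3 replicates.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided test
    z_power = NormalDist().inv_cdf(power)
    n = 2 * (sd * (z_alpha + z_power) / delta) ** 2
    return max(3, math.ceil(n))


# Example: detect a 2-fold change (delta = 1 on the log2 scale) with a
# hypothetical residual SD of 0.5, a small p-value threshold to allow for
# multiple comparisons, and 80% power.
print(blocks_needed(delta=1.0, sd=0.5, alpha=0.001, power=0.8))
```

In practice `sd` would come from `sd_percentile` applied to the per-gene SDs of the earlier data set, with `pct` set to 70 or 80 as suggested above, and `alpha` would be the p-value read off the q-value versus p-value plot.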

Microarray • 964 views

15.6 years ago Naomi Altman ★ 6.0k


shirley zhang ★ 1.0k

@shirley-zhang-2038

Last seen 10.2 years ago

Dear Dr. Altman,

Thanks for your quick response. Is the method you suggested similar to what Dr. Kevin Dobbin and Richard Simon proposed in Biostatistics 2005? (http://biostatistics.oxfordjournals.org/cgi/reprint/6/1/27)

Sorry that I did not make my experimental design clear.

There are 2 fixed effects (tissue and status). We got two different tissues from the same patient. Patients are grouped into two categories based on their status. Here we are interested in finding genes commonly changed by status across different tissue types. We were advised to use the lme function in the nlme package, treating tissue and status as fixed effects and patient as a random effect. Is my experiment still a randomized complete block design?

Thanks again for your help,
Shirley

--
Xiaoling

15.6 years ago shirley zhang ★ 1.0k


Naomi Altman ★ 6.0k

@naomi-altman-380

Last seen 3.6 years ago

United States

Dear Shirley,

This is called a split plot design. Status is the whole plot factor. Tissue is the subplot factor. Sample size computations are harder than for a randomized complete block design. You will need software for split plot designs.

--Naomi
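To illustrate why the split plot case differs, here is a rough Python sketch under strong assumptions: a balanced design with n patients per status group and both tissues measured on every patient, a patient random effect with SD `sd_patient`, a residual SD `sd_resid`, and a normal approximation in place of the t distribution. Under those assumptions the status effect (averaged over tissues) is estimated with variance (2*sd_patient^2 + sd_resid^2)/n, while the tissue effect is estimated within patients with variance sd_resid^2/n. The function names and numbers are hypothetical, and this is not a substitute for proper split plot software.

```python
import math
from statistics import NormalDist


def _z_sum(alpha, power):
    """z quantiles for a two-sided test at level alpha and the given power."""
    return NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)


def patients_per_group_for_status(delta, sd_patient, sd_resid, alpha, power):
    """Patients per status group to detect a status difference `delta`
    (whole plot factor). Patient-to-patient variation does not cancel."""
    var_times_n = 2 * sd_patient ** 2 + sd_resid ** 2
    n = var_times_n * (_z_sum(alpha, power) / delta) ** 2
    return max(3, math.ceil(n))


def patients_per_group_for_tissue(delta, sd_resid, alpha, power):
    """Patients per status group to detect a tissue difference `delta`
    (subplot factor). The patient effect cancels within each patient."""
    n = sd_resid ** 2 * (_z_sum(alpha, power) / delta) ** 2
    return max(3, math.ceil(n))


# Hypothetical variance components on the log2 scale.
print(patients_per_group_for_status(1.0, 0.5, 0.4, alpha=0.001, power=0.8))
print(patients_per_group_for_tissue(1.0, 0.4, alpha=0.001, power=0.8))
```

The comparison shows the main practical point: because status is assigned at the patient level, its sample size is driven by between-patient variability, which a single residual SD from a blocked analysis would not capture.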

15.6 years ago Naomi Altman ★ 6.0k
