limma small vs large number of samples

0

Entering edit mode

Giovanni Bucci ▴ 60

@giovanni-bucci-6524

Last seen 10.6 years ago

Thank you, Jim. One more question. The eBayes step adjusts the variance across genes. Since there is already a good estimate for the variance due to the large number of samples does the eBayes step shrink the variance even further? Thank you, Giovanni On Tue, Apr 29, 2014 at 9:20 AM, James W. MacDonald <jmacdon@uw.edu> wrote: > Hi Giovanni, > > > On 4/28/2014 8:54 PM, Giovanni Bucci wrote: > >> Hello everybody, >> >> I have 32 samples, 4 factors with 2 levels each. Each level has 2 >> replicates. >> >> str(gxexprs) >>> >> num [1:15584, 1:32] 7.94 6.67 9.93 9.62 12.19 ... >> >> Group >>> >> [1] R52VQ R52VQ R52VE R52VE R52EQ R52EQ R52EE R52EE R95VQ R95VQ R95VQ >> R95VE >> [13] R95VE R95VE R95EQ R95EQ R95EQ R95EE R95EE R95EE R97VQ R97VQ R97VQ >> R97VE >> [25] R97VE R97VE R97EQ R97EQ R97EQ R97EE R97EE R97EE >> 16 Levels: R52VQ R52VE R52EQ R52EE R95VQ R95VE R95EQ R95EE R97VQ ... R97EE >> >> >> design <- model.matrix(~0+Group) >> fit <- lmFit(gxexprs, design) >> >> contrast.matrix <- makeContrasts(contrasts="R52VQ - R52VE",levels=design) >> fit2 <- contrasts.fit(fit, contrast.matrix) >> fit2 <- eBayes(fit2) >> >> TTable = topTable(fit2) >> >> global_p_val = TTable$P.Val >> >> >> gxexprs = gxexprs[, 1:4] >> >> >> ## same code as above but the expression matrix has only the first 4 >> columns which represent the contrast tested above >> >> design <- model.matrix(~0+Group) >> fit <- lmFit(gxexprs, design) >> >> contrast.matrix <- makeContrasts(contrasts="R52VQ - R52VE",levels=design) >> fit2 <- contrasts.fit(fit, contrast.matrix) >> fit2 <- eBayes(fit2) >> >> TTable = topTable(fit2) >> >> local_p_val = TTable$P.Val >> >> local_p_val has much greater values than global_p_val even though they >> represent the same comparison. >> >> What is the explanation for this? >> > > The denominator of your t-statistic is based on the mean square error of > the model (which is based on the intra-group variance of all groups). When > you have all the other groups in the model, the number of observations used > to estimate variance is larger, so you get more degrees of freedom for you > test (and the variance estimate is more accurate), so you get smaller > p-values in general. > > Best, > > Jim > > > >> Can you point to some diagnostic functions that will show the difference? >> >> Thank you, >> >> Giovanni >> >> [[alternative HTML version deleted]] >> >> _______________________________________________ >> Bioconductor mailing list >> Bioconductor@r-project.org >> https://stat.ethz.ch/mailman/listinfo/bioconductor >> Search the archives: http://news.gmane.org/gmane. >> science.biology.informatics.conductor >> > > -- > James W. MacDonald, M.S. > Biostatistician > University of Washington > Environmental and Occupational Health Sciences > 4225 Roosevelt Way NE, # 100 > Seattle WA 98105-6099 > > [[alternative HTML version deleted]]

• 848 views

ADD COMMENT • link 11.0 years ago Giovanni Bucci ▴ 60

Login before adding your answer.