limma small vs large number of samples
2
0
Entering edit mode
@giovanni-bucci-6524
Last seen 10.1 years ago
Hello everybody, I have 32 samples, 4 factors with 2 levels each. Each level has 2 replicates. >str(gxexprs) num [1:15584, 1:32] 7.94 6.67 9.93 9.62 12.19 ... >Group [1] R52VQ R52VQ R52VE R52VE R52EQ R52EQ R52EE R52EE R95VQ R95VQ R95VQ R95VE [13] R95VE R95VE R95EQ R95EQ R95EQ R95EE R95EE R95EE R97VQ R97VQ R97VQ R97VE [25] R97VE R97VE R97EQ R97EQ R97EQ R97EE R97EE R97EE 16 Levels: R52VQ R52VE R52EQ R52EE R95VQ R95VE R95EQ R95EE R97VQ ... R97EE design <- model.matrix(~0+Group) fit <- lmFit(gxexprs, design) contrast.matrix <- makeContrasts(contrasts="R52VQ - R52VE",levels=design) fit2 <- contrasts.fit(fit, contrast.matrix) fit2 <- eBayes(fit2) TTable = topTable(fit2) global_p_val = TTable$P.Val gxexprs = gxexprs[, 1:4] ## same code as above but the expression matrix has only the first 4 columns which represent the contrast tested above design <- model.matrix(~0+Group) fit <- lmFit(gxexprs, design) contrast.matrix <- makeContrasts(contrasts="R52VQ - R52VE",levels=design) fit2 <- contrasts.fit(fit, contrast.matrix) fit2 <- eBayes(fit2) TTable = topTable(fit2) local_p_val = TTable$P.Val local_p_val has much greater values than global_p_val even though they represent the same comparison. What is the explanation for this? Can you point to some diagnostic functions that will show the difference? Thank you, Giovanni [[alternative HTML version deleted]]
• 872 views
ADD COMMENT
0
Entering edit mode
@ryan-c-thompson-5618
Last seen 2 days ago
Icahn School of Medicine at Mount Sinaiā€¦
Hi Giovanni, The explanation is quite simple. When you include all the data, limma can arrive at a much more robust estimate of the variance, so it can more confidently call differentially expressed genes. In other words, regardless of which specific contrast you are testing, the variance is estimated from all of the data included in the model. -Ryan On Mon Apr 28 17:54:32 2014, Giovanni Bucci wrote: > Hello everybody, > > I have 32 samples, 4 factors with 2 levels each. Each level has 2 > replicates. > >> str(gxexprs) > num [1:15584, 1:32] 7.94 6.67 9.93 9.62 12.19 ... > >> Group > [1] R52VQ R52VQ R52VE R52VE R52EQ R52EQ R52EE R52EE R95VQ R95VQ R95VQ R95VE > [13] R95VE R95VE R95EQ R95EQ R95EQ R95EE R95EE R95EE R97VQ R97VQ R97VQ R97VE > [25] R97VE R97VE R97EQ R97EQ R97EQ R97EE R97EE R97EE > 16 Levels: R52VQ R52VE R52EQ R52EE R95VQ R95VE R95EQ R95EE R97VQ ... R97EE > > > design <- model.matrix(~0+Group) > fit <- lmFit(gxexprs, design) > > contrast.matrix <- makeContrasts(contrasts="R52VQ - R52VE",levels=design) > fit2 <- contrasts.fit(fit, contrast.matrix) > fit2 <- eBayes(fit2) > > TTable = topTable(fit2) > > global_p_val = TTable$P.Val > > > gxexprs = gxexprs[, 1:4] > > > ## same code as above but the expression matrix has only the first 4 > columns which represent the contrast tested above > > design <- model.matrix(~0+Group) > fit <- lmFit(gxexprs, design) > > contrast.matrix <- makeContrasts(contrasts="R52VQ - R52VE",levels=design) > fit2 <- contrasts.fit(fit, contrast.matrix) > fit2 <- eBayes(fit2) > > TTable = topTable(fit2) > > local_p_val = TTable$P.Val > > local_p_val has much greater values than global_p_val even though they > represent the same comparison. > > What is the explanation for this? > > Can you point to some diagnostic functions that will show the difference? > > Thank you, > > Giovanni > > [[alternative HTML version deleted]] > > _______________________________________________ > Bioconductor mailing list > Bioconductor at r-project.org > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
ADD COMMENT
0
Entering edit mode
@james-w-macdonald-5106
Last seen 1 day ago
United States
Hi Giovanni, On 4/28/2014 8:54 PM, Giovanni Bucci wrote: > Hello everybody, > > I have 32 samples, 4 factors with 2 levels each. Each level has 2 > replicates. > >> str(gxexprs) > num [1:15584, 1:32] 7.94 6.67 9.93 9.62 12.19 ... > >> Group > [1] R52VQ R52VQ R52VE R52VE R52EQ R52EQ R52EE R52EE R95VQ R95VQ R95VQ R95VE > [13] R95VE R95VE R95EQ R95EQ R95EQ R95EE R95EE R95EE R97VQ R97VQ R97VQ R97VE > [25] R97VE R97VE R97EQ R97EQ R97EQ R97EE R97EE R97EE > 16 Levels: R52VQ R52VE R52EQ R52EE R95VQ R95VE R95EQ R95EE R97VQ ... R97EE > > > design <- model.matrix(~0+Group) > fit <- lmFit(gxexprs, design) > > contrast.matrix <- makeContrasts(contrasts="R52VQ - R52VE",levels=design) > fit2 <- contrasts.fit(fit, contrast.matrix) > fit2 <- eBayes(fit2) > > TTable = topTable(fit2) > > global_p_val = TTable$P.Val > > > gxexprs = gxexprs[, 1:4] > > > ## same code as above but the expression matrix has only the first 4 > columns which represent the contrast tested above > > design <- model.matrix(~0+Group) > fit <- lmFit(gxexprs, design) > > contrast.matrix <- makeContrasts(contrasts="R52VQ - R52VE",levels=design) > fit2 <- contrasts.fit(fit, contrast.matrix) > fit2 <- eBayes(fit2) > > TTable = topTable(fit2) > > local_p_val = TTable$P.Val > > local_p_val has much greater values than global_p_val even though they > represent the same comparison. > > What is the explanation for this? The denominator of your t-statistic is based on the mean square error of the model (which is based on the intra-group variance of all groups). When you have all the other groups in the model, the number of observations used to estimate variance is larger, so you get more degrees of freedom for you test (and the variance estimate is more accurate), so you get smaller p-values in general. Best, Jim > > Can you point to some diagnostic functions that will show the difference? > > Thank you, > > Giovanni > > [[alternative HTML version deleted]] > > _______________________________________________ > Bioconductor mailing list > Bioconductor at r-project.org > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor -- James W. MacDonald, M.S. Biostatistician University of Washington Environmental and Occupational Health Sciences 4225 Roosevelt Way NE, # 100 Seattle WA 98105-6099
ADD COMMENT

Login before adding your answer.

Traffic: 306 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6