Limma package: problem with p-value and t
1
0
Entering edit mode
Julie Zhu ★ 4.3k
@julie-zhu-3596
Last seen 11 months ago
United States
Dear Gordon, I noticed that there is inconsistency between the t and p.value on the topTable output using Limma package to analyze a dye swap two channel array experiment. logFC AveExpr t P.Value adj.P.Val B 1401 3.216254 11.74137 22.49546 1.041470e-05 0.07874703 3.435097 1537 3.344735 11.88007 18.07448 2.702061e-05 0.07874703 2.951114 The two-tailed p-value for t=22.49546 with df=1 would be 0.028 2*pt(-abs(22.49546),df=1), but the P.Value in the above output is 1.04e-05. Did I miss something? Could you please explain the inconsistency? Thanks! Here is the design matrix and relevant code snippets. Dye mutant 1 -1 1 -1 1 1 1 1 design = cbind(Dye=1, mutant = c(-1,-1,1,1)) fit = lmFit(MA, design) ### MA is a MAlist object fit3 = eBayes(fit) topTable(fit3, number=2, coef="mutant", adjust="fdr") Here is my R session information. R version 2.9.0 (2009-04-17) i386-apple-darwin8.11.1 locale: en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8 attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] limma_2.18.0 Thank you very much for your help! Best regards, Julie
limma limma • 663 views
ADD COMMENT
0
Entering edit mode
@james-w-macdonald-5106
Last seen 1 day ago
United States
Julie Zhu wrote: > Dear Gordon, > > I noticed that there is inconsistency between the t and p.value on the > topTable output using Limma package to analyze a dye swap two channel array > experiment. > logFC AveExpr t P.Value adj.P.Val B > 1401 3.216254 11.74137 22.49546 1.041470e-05 0.07874703 3.435097 > 1537 3.344735 11.88007 18.07448 2.702061e-05 0.07874703 2.951114 > > The two-tailed p-value for t=22.49546 with df=1 would be 0.028 > 2*pt(-abs(22.49546),df=1), but the P.Value in the above output is 1.04e-05. > Did I miss something? Could you please explain the inconsistency? Thanks! The degrees of freedom will be somewhere around 3, but a bit larger because of the ebayes fit. You can get the actual value from your MArrayLM object, so what you really want is 2*pt(-abs(22.49546), fit$df.prior + fit$df.residual[1]) assuming you did something like fit <- lmFit(<stuff>) fit <- eBayes(fit) Best, Jim > > Here is the design matrix and relevant code snippets. > > Dye mutant > 1 -1 > 1 -1 > 1 1 > 1 1 > > design = cbind(Dye=1, mutant = c(-1,-1,1,1)) > fit = lmFit(MA, design) > ### MA is a MAlist object > fit3 = eBayes(fit) > topTable(fit3, number=2, coef="mutant", adjust="fdr") > > Here is my R session information. > > R version 2.9.0 (2009-04-17) > i386-apple-darwin8.11.1 > > locale: > en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8 > > attached base packages: > [1] stats graphics grDevices utils datasets methods base > > other attached packages: > [1] limma_2.18.0 > > Thank you very much for your help! > > Best regards, > > Julie > > _______________________________________________ > Bioconductor mailing list > Bioconductor at stat.math.ethz.ch > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor -- James W. MacDonald, M.S. Biostatistician Douglas Lab University of Michigan Department of Human Genetics 5912 Buhl 1241 E. Catherine St. Ann Arbor MI 48109-5618 734-615-7826
ADD COMMENT
0
Entering edit mode
Hi James, Thanks you so much for helping me solving the mystery! Could you please explain why the degree freedom is so large? I only have 4 arrays, so total 3 df (n-1). One is used for estimating dye effect, 1 is used for estimating mutant effect, so the residual is 1 df. Why is there an additional of 2.397083 df added? Is it because the information borrowing from the neighboring genes? How is the df.prior calculated? Thanks so much for sharing your knowledge! Best regards, Julie On 9/17/09 4:50 PM, "James W. MacDonald" <jmacdon at="" med.umich.edu=""> wrote: > > > Julie Zhu wrote: >> Dear Gordon, >> >> I noticed that there is inconsistency between the t and p.value on the >> topTable output using Limma package to analyze a dye swap two channel array >> experiment. >> logFC AveExpr t P.Value adj.P.Val B >> 1401 3.216254 11.74137 22.49546 1.041470e-05 0.07874703 3.435097 >> 1537 3.344735 11.88007 18.07448 2.702061e-05 0.07874703 2.951114 >> >> The two-tailed p-value for t=22.49546 with df=1 would be 0.028 >> 2*pt(-abs(22.49546),df=1), but the P.Value in the above output is 1.04e-05. >> Did I miss something? Could you please explain the inconsistency? Thanks! > > The degrees of freedom will be somewhere around 3, but a bit larger > because of the ebayes fit. You can get the actual value from your > MArrayLM object, so what you really want is > > 2*pt(-abs(22.49546), fit$df.prior + fit$df.residual[1]) > > assuming you did something like > > fit <- lmFit(<stuff>) > fit <- eBayes(fit) > > Best, > > Jim > > >> >> Here is the design matrix and relevant code snippets. >> >> Dye mutant >> 1 -1 >> 1 -1 >> 1 1 >> 1 1 >> >> design = cbind(Dye=1, mutant = c(-1,-1,1,1)) >> fit = lmFit(MA, design) >> ### MA is a MAlist object >> fit3 = eBayes(fit) >> topTable(fit3, number=2, coef="mutant", adjust="fdr") >> >> Here is my R session information. >> >> R version 2.9.0 (2009-04-17) >> i386-apple-darwin8.11.1 >> >> locale: >> en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8 >> >> attached base packages: >> [1] stats graphics grDevices utils datasets methods base >> >> other attached packages: >> [1] limma_2.18.0 >> >> Thank you very much for your help! >> >> Best regards, >> >> Julie >> >> _______________________________________________ >> Bioconductor mailing list >> Bioconductor at stat.math.ethz.ch >> https://stat.ethz.ch/mailman/listinfo/bioconductor >> Search the archives: >> http://news.gmane.org/gmane.science.biology.informatics.conductor
ADD REPLY
0
Entering edit mode
Hi Julie, Your calculation of df is off. You lose one df for each column of your design matrix, so you have 2 residual df. The computation of the prior df is outlined (I believe) in the first paper cited in the limma User's Guide: http://www.bepress.com/sagmb/vol3/iss1/art3 Best, Jim Julie Zhu wrote: > Hi James, > > Thanks you so much for helping me solving the mystery! Could you please > explain why the degree freedom is so large? I only have 4 arrays, so total 3 > df (n-1). One is used for estimating dye effect, 1 is used for estimating > mutant effect, so the residual is 1 df. Why is there an additional of > 2.397083 df added? Is it because the information borrowing from the > neighboring genes? How is the df.prior calculated? Thanks so much for > sharing your knowledge! > > Best regards, > > Julie > > > On 9/17/09 4:50 PM, "James W. MacDonald" <jmacdon at="" med.umich.edu=""> wrote: > >> >> Julie Zhu wrote: >>> Dear Gordon, >>> >>> I noticed that there is inconsistency between the t and p.value on the >>> topTable output using Limma package to analyze a dye swap two channel array >>> experiment. >>> logFC AveExpr t P.Value adj.P.Val B >>> 1401 3.216254 11.74137 22.49546 1.041470e-05 0.07874703 3.435097 >>> 1537 3.344735 11.88007 18.07448 2.702061e-05 0.07874703 2.951114 >>> >>> The two-tailed p-value for t=22.49546 with df=1 would be 0.028 >>> 2*pt(-abs(22.49546),df=1), but the P.Value in the above output is 1.04e-05. >>> Did I miss something? Could you please explain the inconsistency? Thanks! >> The degrees of freedom will be somewhere around 3, but a bit larger >> because of the ebayes fit. You can get the actual value from your >> MArrayLM object, so what you really want is >> >> 2*pt(-abs(22.49546), fit$df.prior + fit$df.residual[1]) >> >> assuming you did something like >> >> fit <- lmFit(<stuff>) >> fit <- eBayes(fit) >> >> Best, >> >> Jim >> >> >>> Here is the design matrix and relevant code snippets. >>> >>> Dye mutant >>> 1 -1 >>> 1 -1 >>> 1 1 >>> 1 1 >>> >>> design = cbind(Dye=1, mutant = c(-1,-1,1,1)) >>> fit = lmFit(MA, design) >>> ### MA is a MAlist object >>> fit3 = eBayes(fit) >>> topTable(fit3, number=2, coef="mutant", adjust="fdr") >>> >>> Here is my R session information. >>> >>> R version 2.9.0 (2009-04-17) >>> i386-apple-darwin8.11.1 >>> >>> locale: >>> en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8 >>> >>> attached base packages: >>> [1] stats graphics grDevices utils datasets methods base >>> >>> other attached packages: >>> [1] limma_2.18.0 >>> >>> Thank you very much for your help! >>> >>> Best regards, >>> >>> Julie >>> >>> _______________________________________________ >>> Bioconductor mailing list >>> Bioconductor at stat.math.ethz.ch >>> https://stat.ethz.ch/mailman/listinfo/bioconductor >>> Search the archives: >>> http://news.gmane.org/gmane.science.biology.informatics.conductor > > _______________________________________________ > Bioconductor mailing list > Bioconductor at stat.math.ethz.ch > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor -- James W. MacDonald, M.S. Biostatistician Douglas Lab University of Michigan Department of Human Genetics 5912 Buhl 1241 E. Catherine St. Ann Arbor MI 48109-5618 734-615-7826
ADD REPLY

Login before adding your answer.

Traffic: 271 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6