Question

how does limma calculate the coefficients with using different design matrix?

0

Entering edit mode

Gordon Smyth 52k

@gordon-smyth

Last seen 8 hours ago

WEHI, Melbourne, Australia

Dear Xiaokuan, You haven't told us about your data, but it must contain missing values or weights, because otherwise the two topTable results you give would have been identical. In the absence of missing values or weights, the results would have been as you expected. With missing values or weights, the first topTable result you give (the one with designcontr) is exact whereas the second result is approximate. The fact that contrast.fit() gives approximate results in the presence of missing values or weights is mentioned on the documentation page for contrast.fit, and has been discussed a few times on this list, see https://www.stat.math.ethz.ch/pipermail/bioconductor/2009-December/030 947.html Professor Albyn Jones is trying to pursuade me to implement something similar to contrasts.fit() that would be exact in all cases. The only way I could do this would be to add an argument 'contrasts' to lmFit. I may yet do that! Best wishes Gordon > Date: Sat, 14 Aug 2010 12:30:57 -0700 (PDT) > From: Xiaokuan Wei <weixiaokuan at="" yahoo.com=""> > To: bioconductor <bioconductor at="" stat.math.ethz.ch=""> > Subject: [BioC] how does limma calculate the coefficients with using > different design matrix? > > Dear list, > > I have two design matrixes: > > design<-model.matrix(~-1+factor(grp)) > designcontr<-model.matrix(~factor(grp)) > > then make contrast matrix: > cont.matrix<-rbind(rep(-1,9),diag(9)) > cont.matrix_contr <- rbind(rep(0,9),diag(9)) > > after using topTable to get the specific probe values, I found: > e.g. > using designcontr and cont.matrix_contr: > ID Coef1 Coef2 Coef3 Coef4 Coef5 Coef6 Coef7 Coef8 Coef9 AveExpr F P.Value > adj.P.Val > > feature1 -0.07976 2.900216 3.06401 3.210565 0.229553 1.545972 2.433364 2.62326 > 2.719911 13.18788 1133.135 1.60E-15 7.20E-11 > > > using design and cont.matrix: > ID Coef1 Coef2 Coef3 Coef4 Coef5 Coef6 Coef7 Coef8 Coef9 AveExpr F P.Value > adj.P.Val > > feature1 -0.07976 0.229553 1.545972 2.433364 2.62326 2.719911 2.900216 3.06401 > 3.210565 13.18788 1133.135 1.60E-15 7.20E-11 > > > > I try to understand why there is difference between cont.matrix_contr and > cont.matrx results, since both of them is try to get the difference between each > grp(grp2 to grp10) and grp1. i.e. grp[i](i=2-10) - grp[1] > > even I used the different contrasts, the coef should? be the same. I am > confused. > > > Thank you for your help. > > Regards, > Xiaokuan

probe limma probe limma • 1.5k views

ADD COMMENT • link updated 14.7 years ago by Xiaokuan Wei ▴ 230 • written 14.7 years ago by Gordon Smyth 52k

score 0 · Answer 1 · 2010-08-15

I would be happy to share my version of lmFit and related functions, which include a contrast argument. I haven't yet since I am not ready to guarantee that they respect all the class definitions and downstream analytical functions, though they work in my context. I have no doubt that Gordon has a better grasp of those issues than I do. Accurate computation should be the default behavior whenever possible and practical, as it seems to be here! albyn Quoting Gordon K Smyth <smyth at="" wehi.edu.au="">: > Dear Xiaokuan, > > You haven't told us about your data, but it must contain missing > values or weights, because otherwise the two topTable results you > give would have been identical. In the absence of missing values or > weights, the results would have been as you expected. > > With missing values or weights, the first topTable result you give > (the one with designcontr) is exact whereas the second result is > approximate. > > The fact that contrast.fit() gives approximate results in the > presence of missing values or weights is mentioned on the > documentation page for contrast.fit, and has been discussed a few > times on this list, see > > https://www.stat.math.ethz.ch/pipermail/bioconductor/2009-December/0 30947.html > > Professor Albyn Jones is trying to pursuade me to implement > something similar to contrasts.fit() that would be exact in all > cases. The only way I could do this would be to add an argument > 'contrasts' to lmFit. I may yet do that! > > Best wishes > Gordon > >> Date: Sat, 14 Aug 2010 12:30:57 -0700 (PDT) >> From: Xiaokuan Wei <weixiaokuan at="" yahoo.com=""> >> To: bioconductor <bioconductor at="" stat.math.ethz.ch=""> >> Subject: [BioC] how does limma calculate the coefficients with using >> different design matrix? >> >> Dear list, >> >> I have two design matrixes: >> >> design<-model.matrix(~-1+factor(grp)) >> designcontr<-model.matrix(~factor(grp)) >> >> then make contrast matrix: >> cont.matrix<-rbind(rep(-1,9),diag(9)) >> cont.matrix_contr <- rbind(rep(0,9),diag(9)) >> >> after using topTable to get the specific probe values, I found: >> e.g. >> using designcontr and cont.matrix_contr: >> ID Coef1 Coef2 Coef3 Coef4 Coef5 Coef6 Coef7 Coef8 Coef9 AveExpr F P.Value >> adj.P.Val >> >> feature1 -0.07976 2.900216 3.06401 3.210565 0.229553 1.545972 >> 2.433364 2.62326 >> 2.719911 13.18788 1133.135 1.60E-15 7.20E-11 >> >> >> using design and cont.matrix: >> ID Coef1 Coef2 Coef3 Coef4 Coef5 Coef6 Coef7 Coef8 Coef9 AveExpr F P.Value >> adj.P.Val >> >> feature1 -0.07976 0.229553 1.545972 2.433364 2.62326 2.719911 >> 2.900216 3.06401 >> 3.210565 13.18788 1133.135 1.60E-15 7.20E-11 >> >> >> >> I try to understand why there is difference between cont.matrix_contr and >> cont.matrx results, since both of them is try to get the difference >> between each >> grp(grp2 to grp10) and grp1. i.e. grp[i](i=2-10) - grp[1] >> >> even I used the different contrasts, the coef should? be the same. I am >> confused. >> >> >> Thank you for your help. >> >> Regards, >> Xiaokuan

score 0 · Answer 2 · 2010-08-16

Dear Gordon, Thank you for your detailed explaination. In fact, my data don't have missing values and in function of lmFit, I didn't use weights argument. I dig deeper for this issue trying to see what the coefficients are after lmFit. When use designcontr: fit<-lmFit(dat,designcontr) tp1<-topTable(eBayes(fit)) tp1<-topTable(eBayes(fit),n=Inf) tp1[tp1$ID=='feature1',] ID g0 g1 g2 g3 g4 g5 g6 g7 g8 g9 AveExpr F 1 feature1 11.32317 -0.07976296 2.900216 3.06401 3.210565 0.2295532 1.545972 2.433364 2.62326 2.719911 13.18788 111975.0 P.Value adj.P.Val 1 5.284493e-27 5.296354e-25 When use design: fit<-lmFit(dat,design) tp2<-topTable(eBayes(fit),n=Inf) tp2[tp2$ID=='feature1',] ID g0 g1 g2 g3 g4 g5 g6 g7 g8 g9 AveExpr F 1 feature1 11.32317 11.24341 11.55272 12.86914 13.75653 13.94643 14.04308 14.22338 14.38718 14.53373 13.18788 111975.0 P.Value adj.P.Val 1 5.284493e-27 5.296354e-25 So, as you said, if I use contrasts.fit with designcontr and cont.matrix_contr, I get exact the coefficents of g1 to g9 for coef1 to coef9. but when I used design and contr.matrix, only the coef1 is g1-g0=11.24341-11.32317=-0.07976 which is the same vaule as using designcontr and cont.matrix_contr. however, the other coefficents from coef2 to coef9 are not the difference between each g[i] with g0. Do you have any comments on this? Thank you very much. Xiaokuan ________________________________ From: Gordon K Smyth <smyth@wehi.edu.au> Cc: Bioconductor mailing list <bioconductor@stat.math.ethz.ch> Sent: Sun, August 15, 2010 10:40:58 PM Subject: [BioC] how does limma calculate the coefficients with using different design matrix? Dear Xiaokuan, You haven't told us about your data, but it must contain missing values or weights, because otherwise the two topTable results you give would have been identical. In the absence of missing values or weights, the results would have been as you expected. With missing values or weights, the first topTable result you give (the one with designcontr) is exact whereas the second result is approximate. The fact that contrast.fit() gives approximate results in the presence of missing values or weights is mentioned on the documentation page for contrast.fit, and has been discussed a few times on this list, see https://www.stat.math.ethz.ch/pipermail/bioconductor/2009-December/030 947.html Professor Albyn Jones is trying to pursuade me to implement something similar to contrasts.fit() that would be exact in all cases. The only way I could do this would be to add an argument 'contrasts' to lmFit. I may yet do that! Best wishes Gordon > Date: Sat, 14 Aug 2010 12:30:57 -0700 (PDT) > To: bioconductor <bioconductor@stat.math.ethz.ch> > Subject: [BioC] how does limma calculate the coefficients with using > different design matrix? > > Dear list, > > I have two design matrixes: > > design<-model.matrix(~-1+factor(grp)) > designcontr<-model.matrix(~factor(grp)) > > then make contrast matrix: > cont.matrix<-rbind(rep(-1,9),diag(9)) > cont.matrix_contr <- rbind(rep(0,9),diag(9)) > > after using topTable to get the specific probe values, I found: > e.g. > using designcontr and cont.matrix_contr: > ID Coef1 Coef2 Coef3 Coef4 Coef5 Coef6 Coef7 Coef8 Coef9 AveExpr F P.Value > adj.P.Val > > feature1 -0.07976 2.900216 3.06401 3.210565 0.229553 1.545972 2.433364 2.62326 > 2.719911 13.18788 1133.135 1.60E-15 7.20E-11 > > > using design and cont.matrix: > ID Coef1 Coef2 Coef3 Coef4 Coef5 Coef6 Coef7 Coef8 Coef9 AveExpr F P.Value > adj.P.Val > > feature1 -0.07976 0.229553 1.545972 2.433364 2.62326 2.719911 2.900216 3.06401 > 3.210565 13.18788 1133.135 1.60E-15 7.20E-11 > > > > I try to understand why there is difference between cont.matrix_contr and > cont.matrx results, since both of them is try to get the difference between >each > grp(grp2 to grp10) and grp1. i.e. grp[i](i=2-10) - grp[1] > > even I used the different contrasts, the coef should be the same. I am > confused. > > > Thank you for your help. > > Regards, > Xiaokuan [[alternative HTML version deleted]]