help request. repost Define alternate design matrix in R
1
0
Entering edit mode
@gordon-smyth
Last seen 5 hours ago
WEHI, Melbourne, Australia
Dear Josquin, The design matrix that you give is actually over-determined, so I don't think it could have come out of model.matrix() unless you've added columns to it later. Are you seeking tissue-specific treatment effects, or do you want to look for treatment effects that are consistent across the two tissues? I'll assume the latter. If you want to get genes that are DE for Treat2 vs Treat1, and genes that are DE for Treat3 vs Treat1, you can proceed like this: design <- model.matrix(~Treat+Tissue) I'll assume you have formed a DGEList object y, and have estimated the dispersion in some way using the above design matrix. Then: fit <- glmFit(y,design) lrt <- glmLRT(y,fit,coef=2) topTags(lrt) gives you DE genes between Treat2 and Treat1, and lrt <- glmLRT(y,fit,coef=3) topTags(lrt) gives you DE genes between Treat3 and Treat1. Best wishes Gordon >> From: josquin.tibbits at dpi.vic.gov.au >> Date: May 11, 2011 11:46:21 AM GMT+10:00 >> To: bioconductor at r-project.org >> Subject: [BioC] help request. repost Define alternate design matrix in R >> >> Hi, >> I am using EdgeR to find DE genes in an experiment with two factors >> (Tissue (2 levels) and Treatment (3 levels). Currently the design matrix >> output by model.matrix() looks like this: >> >> Int Tissue Treat1 Treat2 Treat3 >> S1 1 0 0 0 1 >> S2 1 0 0 0 1 >> S3 1 0 0 1 0 >> S4 1 0 0 1 0 >> S5 1 0 1 0 0 >> S6 1 0 1 0 0 >> S7 1 1 0 0 1 >> S8 1 1 0 0 1 >> S9 1 1 0 1 0 >> S10 1 1 0 1 0 >> S11 1 1 1 0 0 >> S12 1 1 1 0 0 >> attr(,"assign") >> [1] 0 1 2 2 2 >> attr(,"contrasts") >> attr(,"contrasts")$Tissue >> [1] "contr.treatment" >> attr(,"contrasts")$Treat >> [1] "contr.treatment" >> >> and the model is a comparison of this design and the model without the >> last column. >> >> I would like to specify a design matrix where a) I drop each of the >> Treat levels (in separate runs of EdgeR) to find the DE lists >> corresponding to each level of the Treatment and to b) specify a model >> where level1 of the >> >> Treatment is the reference against which the other levels are compared. >> >> I am stuck on how to specify the alternate design matrix and how to >> pass this to the model.matrix() command (or even if I have to do this). >> I am aware this is a general R command and have read the manual page >> and the manual pages for the contrasts() function. My inability to >> make logical sense of these is no doubt testament to my lack of R >> expertise and some help would be appreciated. >> >> In reading the limma documentation it indicates that a reference can be >> specified by the option ref=" ". Would this also work for me e.g. >> model.matrix(targets, ref="Treat1")? >> >> For specifying the alternate contrasts would this also be achieved in a >> similar way to that shown in the limma manual by setting up a contrasts >> matrix e.g. contrast.matrix <- makeContrasts(Treat2,levels=design) to >> achieve also the contrast of Treat2 (still using Treat1 as a >> reference)? >> >> My next question is then the obvious, how is this then used in the >> testing. Do I specify it in the glmFit() or using the glmLRT() function >> on the original output from the glmFit but specifying the contrasts >> matrix as >> >> the option? e.g >> glm1 <- glmFit(DGEList.object, design, dispersion = common.dispersion) # >> fits the model >> lrt1 <- glmLRT(DGEList.object, glm1) ## fits the original design contrast >> lrt2 <- glmLRT(DGEList.object, glm1, contrast.matrix) ## fits the second >> contrast >> >> or, am I way off track...... >> looking forward to being set straight. >> >> Josquin --------------------------------------------- Professor Gordon K Smyth, Bioinformatics Division, Walter and Eliza Hall Institute of Medical Research, 1G Royal Parade, Parkville, Vic 3052, Australia. smyth at wehi.edu.au http://www.wehi.edu.au http://www.statsci.org/smyth ______________________________________________________________________ The information in this email is confidential and intend...{{dropped:4}}
limma edgeR limma edgeR • 1.5k views
ADD COMMENT
0
Entering edit mode
@josquintibbitsdpivicgovau-4311
Last seen 10.4 years ago
Hi Gordon, you are correct that the matrix did not come out of model.matrix(). I put it together to illustrate rather than the one I get which is much bigger. I am also interested in finding tissue-specific treatment effects and was planning to just analyse the tissues separately, but if there is a way of specifying this as well that would also be most helpful. Thanks for your help. Josquin ========================================= Dr Josquin Tibbits Senior Research Scientist Victorian AgriBiosciences Centre La Trobe University R&D Park 1 Park Drive Bundoora VIC 3083 Phone: +61 3 9032 7055 Email: Josquin.Tibbits@dpi.vic.gov.au From: Gordon K Smyth <smyth@wehi.edu.au> To: josquin.tibbits@dpi.vic.gov.au Cc: Bioconductor mailing list <bioconductor@r-project.org> Date: 11/05/2011 02:31 PM Subject: [BioC] help request. repost Define alternate design matrix in R Dear Josquin, The design matrix that you give is actually over-determined, so I don't think it could have come out of model.matrix() unless you've added columns to it later. Are you seeking tissue-specific treatment effects, or do you want to look for treatment effects that are consistent across the two tissues? I'll assume the latter. If you want to get genes that are DE for Treat2 vs Treat1, and genes that are DE for Treat3 vs Treat1, you can proceed like this: design <- model.matrix(~Treat+Tissue) I'll assume you have formed a DGEList object y, and have estimated the dispersion in some way using the above design matrix. Then: fit <- glmFit(y,design) lrt <- glmLRT(y,fit,coef=2) topTags(lrt) gives you DE genes between Treat2 and Treat1, and lrt <- glmLRT(y,fit,coef=3) topTags(lrt) gives you DE genes between Treat3 and Treat1. Best wishes Gordon >> From: josquin.tibbits@dpi.vic.gov.au >> Date: May 11, 2011 11:46:21 AM GMT+10:00 >> To: bioconductor@r-project.org >> Subject: [BioC] help request. repost Define alternate design matrix in R >> >> Hi, >> I am using EdgeR to find DE genes in an experiment with two factors >> (Tissue (2 levels) and Treatment (3 levels). Currently the design matrix >> output by model.matrix() looks like this: >> >> Int Tissue Treat1 Treat2 Treat3 >> S1 1 0 0 0 1 >> S2 1 0 0 0 1 >> S3 1 0 0 1 0 >> S4 1 0 0 1 0 >> S5 1 0 1 0 0 >> S6 1 0 1 0 0 >> S7 1 1 0 0 1 >> S8 1 1 0 0 1 >> S9 1 1 0 1 0 >> S10 1 1 0 1 0 >> S11 1 1 1 0 0 >> S12 1 1 1 0 0 >> attr(,"assign") >> [1] 0 1 2 2 2 >> attr(,"contrasts") >> attr(,"contrasts")$Tissue >> [1] "contr.treatment" >> attr(,"contrasts")$Treat >> [1] "contr.treatment" >> >> and the model is a comparison of this design and the model without the >> last column. >> >> I would like to specify a design matrix where a) I drop each of the >> Treat levels (in separate runs of EdgeR) to find the DE lists >> corresponding to each level of the Treatment and to b) specify a model >> where level1 of the >> >> Treatment is the reference against which the other levels are compared. >> >> I am stuck on how to specify the alternate design matrix and how to >> pass this to the model.matrix() command (or even if I have to do this). >> I am aware this is a general R command and have read the manual page >> and the manual pages for the contrasts() function. My inability to >> make logical sense of these is no doubt testament to my lack of R >> expertise and some help would be appreciated. >> >> In reading the limma documentation it indicates that a reference can be >> specified by the option ref=" ". Would this also work for me e.g. >> model.matrix(targets, ref="Treat1")? >> >> For specifying the alternate contrasts would this also be achieved in a >> similar way to that shown in the limma manual by setting up a contrasts >> matrix e.g. contrast.matrix <- makeContrasts(Treat2,levels=design) to >> achieve also the contrast of Treat2 (still using Treat1 as a >> reference)? >> >> My next question is then the obvious, how is this then used in the >> testing. Do I specify it in the glmFit() or using the glmLRT() function >> on the original output from the glmFit but specifying the contrasts >> matrix as >> >> the option? e.g >> glm1 <- glmFit(DGEList.object, design, dispersion = common.dispersion) # >> fits the model >> lrt1 <- glmLRT(DGEList.object, glm1) ## fits the original design contrast >> lrt2 <- glmLRT(DGEList.object, glm1, contrast.matrix) ## fits the second >> contrast >> >> or, am I way off track...... >> looking forward to being set straight. >> >> Josquin --------------------------------------------- Professor Gordon K Smyth, Bioinformatics Division, Walter and Eliza Hall Institute of Medical Research, 1G Royal Parade, Parkville, Vic 3052, Australia. smyth@wehi.edu.au http://www.wehi.edu.au http://www.statsci.org/smyth ______________________________________________________________________ The information in this email is confidential and intend...{{dropped:22}}
0
Entering edit mode
Hi Josquin, If you want tissue-specific treatment effects, I suggest you define one treatment factor that defines all your distinct treatment groups, then just use the original edgeR approach to make pairwise comparisons. In microarray work, which is analogous here, most biologists have found that to be the simplest and clearest way to handle complicated experiments, rather than to use the linear model formulas. Classic edgeR is setup for that approach. Best wishes Gordon On Wed, 11 May 2011, josquin.tibbits at dpi.vic.gov.au wrote: > Hi Gordon, > > you are correct that the matrix did not come out of model.matrix(). I put > it together to illustrate rather than the one I get which is much bigger. > I am also interested in finding tissue-specific treatment effects and was > planning to just analyse the tissues separately, but if there is a way of > specifying this as well that would also be most helpful. Thanks for your > help. > > Josquin > > ========================================= > Dr Josquin Tibbits > Senior Research Scientist > Victorian AgriBiosciences Centre > La Trobe University R&D Park > 1 Park Drive > Bundoora VIC 3083 > > Phone: +61 3 9032 7055 > Email: Josquin.Tibbits at dpi.vic.gov.au > > > > > From: Gordon K Smyth <smyth at="" wehi.edu.au=""> > To: josquin.tibbits at dpi.vic.gov.au > Cc: Bioconductor mailing list <bioconductor at="" r-project.org=""> > Date: 11/05/2011 02:31 PM > Subject: [BioC] help request. repost Define alternate design matrix > in R > > > > Dear Josquin, > > The design matrix that you give is actually over-determined, so I don't > think it could have come out of model.matrix() unless you've added columns > > to it later. > > Are you seeking tissue-specific treatment effects, or do you want to look > for treatment effects that are consistent across the two tissues? I'll > assume the latter. > > If you want to get genes that are DE for Treat2 vs Treat1, and genes that > are DE for Treat3 vs Treat1, you can proceed like this: > > design <- model.matrix(~Treat+Tissue) > > I'll assume you have formed a DGEList object y, and have estimated the > dispersion in some way using the above design matrix. Then: > > fit <- glmFit(y,design) > lrt <- glmLRT(y,fit,coef=2) > topTags(lrt) > > gives you DE genes between Treat2 and Treat1, and > > lrt <- glmLRT(y,fit,coef=3) > topTags(lrt) > > gives you DE genes between Treat3 and Treat1. > > Best wishes > Gordon > > >>> From: josquin.tibbits at dpi.vic.gov.au >>> Date: May 11, 2011 11:46:21 AM GMT+10:00 >>> To: bioconductor at r-project.org >>> Subject: [BioC] help request. repost Define alternate design matrix in > R >>> >>> Hi, >>> I am using EdgeR to find DE genes in an experiment with two factors >>> (Tissue (2 levels) and Treatment (3 levels). Currently the design > matrix >>> output by model.matrix() looks like this: >>> >>> Int Tissue Treat1 Treat2 Treat3 >>> S1 1 0 0 0 1 >>> S2 1 0 0 0 1 >>> S3 1 0 0 1 0 >>> S4 1 0 0 1 0 >>> S5 1 0 1 0 0 >>> S6 1 0 1 0 0 >>> S7 1 1 0 0 1 >>> S8 1 1 0 0 1 >>> S9 1 1 0 1 0 >>> S10 1 1 0 1 0 >>> S11 1 1 1 0 0 >>> S12 1 1 1 0 0 >>> attr(,"assign") >>> [1] 0 1 2 2 2 >>> attr(,"contrasts") >>> attr(,"contrasts")$Tissue >>> [1] "contr.treatment" >>> attr(,"contrasts")$Treat >>> [1] "contr.treatment" >>> >>> and the model is a comparison of this design and the model without the >>> last column. >>> >>> I would like to specify a design matrix where a) I drop each of the >>> Treat levels (in separate runs of EdgeR) to find the DE lists >>> corresponding to each level of the Treatment and to b) specify a model >>> where level1 of the >>> >>> Treatment is the reference against which the other levels are compared. >>> >>> I am stuck on how to specify the alternate design matrix and how to >>> pass this to the model.matrix() command (or even if I have to do this). > >>> I am aware this is a general R command and have read the manual page >>> and the manual pages for the contrasts() function. My inability to >>> make logical sense of these is no doubt testament to my lack of R >>> expertise and some help would be appreciated. >>> >>> In reading the limma documentation it indicates that a reference can be > >>> specified by the option ref=" ". Would this also work for me e.g. >>> model.matrix(targets, ref="Treat1")? >>> >>> For specifying the alternate contrasts would this also be achieved in a > >>> similar way to that shown in the limma manual by setting up a contrasts > >>> matrix e.g. contrast.matrix <- makeContrasts(Treat2,levels=design) to >>> achieve also the contrast of Treat2 (still using Treat1 as a >>> reference)? >>> >>> My next question is then the obvious, how is this then used in the >>> testing. Do I specify it in the glmFit() or using the glmLRT() function > >>> on the original output from the glmFit but specifying the contrasts >>> matrix as >>> >>> the option? e.g >>> glm1 <- glmFit(DGEList.object, design, dispersion = common.dispersion) > # >>> fits the model >>> lrt1 <- glmLRT(DGEList.object, glm1) ## fits the original design > contrast >>> lrt2 <- glmLRT(DGEList.object, glm1, contrast.matrix) ## fits the > second >>> contrast >>> >>> or, am I way off track...... >>> looking forward to being set straight. >>> >>> Josquin > > --------------------------------------------- > Professor Gordon K Smyth, > Bioinformatics Division, > Walter and Eliza Hall Institute of Medical Research, > 1G Royal Parade, Parkville, Vic 3052, Australia. > smyth at wehi.edu.au > http://www.wehi.edu.au > http://www.statsci.org/smyth > > ______________________________________________________________________ > The information in this email is confidential and inte...{{dropped:25}}
ADD REPLY

Login before adding your answer.

Traffic: 409 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6