Hi
So I am trying to work out why these two bits of code give different results. The overall aim is to look for genes that are changing differently between responders and non-responders (to therapy) between two different treatment groups (baseline only data i.e. a single timepoint). However, I am also interested if there are genes that are changing differently between the two treatment groups aside from responder and non-responder, kind of as a control to make sure the two groups are similar.
This code is (I believe) for answering the last question, which is to look for genes that are changing depending on medication assignment:
code 1
design <- model.matrix(~0+meta$Current.Medication+meta$Age+meta$Gender+meta$Ethnicity)
colnames(design) <- c('Drug1','Drug2','Age','Gender','EthnicityA','EthnicityC','Ethnicity_O')
contrast.matrix <- makeContrasts('Drug1-Drug2', levels = design)
dge <- DGEList(counts=countsn)
keep <- filterByExpr(dge, design)
dge <- dge[keep,,keep.lib.sizes=FALSE]
dge <- calcNormFactors(dge)
v <- voom(dge, design, plot=TRUE)
fit <- lmFit(v,design)
fit2 <- contrasts.fit(fit, contrast.matrix)
fit2 <- eBayes(fit2)
top2 <- topTable(fit2,coef=1,number=Inf,sort.by="P")
qs <- qvalue::qvalue(p = top2$P.Value)
top2$q.Val <- qs$qvalues
nrow(subset(top2, top2$q.Val < 0.05))
[1] 0
So 0 DEGs are found.
This next code can test the interaction of medication*response, but also can (I think) compare these two medication groups Drug1-Drug2 like I have done above. However, the results for the medication only comparison below are different.
code 2
design <- model.matrix(~meta$Current.Medication*meta$DAS28.CRP.EULARresp.V7+meta$Age+meta$Gender+meta$Ethnicity)
colnames(design) <- c('Intercept','Medication','Response','Response:Medication','Age','Gender','EthnicityA','EthnicityC','Ethnicity_O')
dge <- DGEList(counts=countsn)
keep <- filterByExpr(dge, design)
dge <- dge[keep,,keep.lib.sizes=FALSE]
dge <- calcNormFactors(dge)
v <- voom(dge, design, plot=TRUE)
fit <- lmFit(v,design)
cont.matrix <- cbind(Medication=c(0,1,0,0,0,0,0,0,0),Response=c(0,0,1,0,0,0,0,0,0),Diff=c(0,0,0,1,0,0,0,0,0))
fit2 <- contrasts.fit(fit, cont.matrix)
fit2 <- eBayes(fit2)
top2 <- topTable(fit2,coef=1,number=Inf,sort.by="P")
qs <- qvalue::qvalue(p = top2$P.Value)
top2$q.Val <- qs$qvalues
nrow(subset(top2, top2$q.Val < 0.05))
[1] 14451
So 14451 DEGs instead of 0, seems like a different test has been done, but I thought these would be the same.
Why do I see such dramatic differences? Can some one help?
Thanks,
Chris
OK thanks Gordon, so how would I modify the second block of code to test the same hypothesis as the first?
You can't. The fact that you have changed the model, adding in extra parameters, changes the hypotheses for all the coefficients.
If the number of responders and non-responders is equal, then you can roughly approximate the Code 1 test using the second model by testing the contrast
c(0,2,0,1,0,0,0,0,0)
.Unless you have a really intimate understanding of the parametrization of interaction models in R, it is much easier to recode the interaction as single factor, as I suggest in the limma User's Guide.
Also, beware that it doesn't usually make scientific sense to test the "main effect" part of an interaction model, which is what you are doing here.
OK Gordon, I actually have already done as you suggest in the manual and used the single factor method by pasting the groups together. So your saying the above when you test response*drug is not the same interaction as I would be testing by using (responderDrug1-non-responderDrug1)-(responderDrug2-non-responderDrug2). Well that is interesting because they did give me different results and I was wondering about that as well. It seems I will just stick with the suggested method. Thanks once again for your help.