Hi there! I am trying to perform a particular comparison after correcting for individual batch effects in DESeq2. The experiment involves the following samples:
condition individual individual_by_group side
di A A L
di A A R
di B B L
di B B R
di C C L
di C C R
est D A L
est D A R
est E B L
est E B R
est F C L
est F C R
What I would like to do is to compare di L samples against est L samples. However, because of tremendous differences between individuals due to inherent variability (as is known to be an issue with this experiment and as seen by PCA), I don't get very many DEGs. Therefore, I need to first correct for individual batch effects. Indeed, if I correct for the individual batch effect using the design ~individual + side, I can then successfully pull out DEGs between di L and di R (so two sides of the same animals, which differ by treatment condition). However, I can't figure out how to do this for di L vs. est L.
Based on the DESeq2 manual, the section "Group-specific condition effects, individuals nested within groups" is most relevant. They provide the following example:
## grp ind cnd ind.n
## 1 X 1 A 1
## 2 X 1 B 1
## 3 X 2 A 2
## 4 X 2 B 2
## 5 X 3 A 3
## 6 X 3 B 3
## 7 Y 4 A 1
## 8 Y 4 B 1
## 9 Y 5 A 2
## 10 Y 5 B 2
## 11 Y 6 A 3
## 12 Y 6 B 3
Based on this, the equivalent design for me would be ~condition + condition:individual_by_group + condition:side
. Using this, however, I cannot address the di L vs. est L question. Again from the manuals' example: "Above, the terms grpX.cndB and grpY.cndB give the group-specific condition effects, in other words, the condition B vs A effect for group X samples, and likewise for group Y samples. These terms control for all of the six individual effects. These group-specific condition effects can be extracted using results with the name argument." Therefore, conditiondi.sideL vs. conditionest.sideL I think actually tells me (di L vs. di R) vs. (est L vs. est R), not di L vs. est L.
I've tried all sorts of other designs like ~condition + condition:individual_by_group + side or ~condition:individual_by_group + condition + condition:side
, which have either not worked or thrown an error: "Error in checkContrast(contrast, resNames) :
all elements of the contrast as a list of length 2 should be elements of 'resultsNames(object)'".
I also cannot do the straightforward ~individual + condition:side
or something like that because that would lead to a "model matrix not full rank" issue (because individual can predict di vs. est).
Any suggestions? Thanks so much!!
Best, Elisa
Thanks, Kevin! Much appreciated! I agree that your suggestion would make the most direct sense. I did try that already, but it failed because diL and estL can be inferred based on which individual, and since they are thus not wholly independent, the model matrix not full rank error occurred. The closest I could get might be using the ind.n animal designation in the second example above, or in my case, the "individiualbygroup" numbering, but that doesn't really do the right kind of batch correction I'd need. Any ideas on how to get around this? Thanks so much!
Hi,
In DESeq2, we can't compare base levels of di vs est because this is confounded with your fixed effects blocking the individual differences.
The only way to make these comparisons is to model the individual effects as random effects, which can be accomplished with limma using duplicateCorrelation. As far as I know that is the only way to compare across these groups while controlling for the individual correlations within groups.