Question

DESeq2 nested interaction and individual batch correction question

0

Entering edit mode

elisa.t.zhang • 0

@elisatzhang-23830

Last seen 4.1 years ago

Hi there! I am trying to perform a particular comparison after correcting for individual batch effects in DESeq2. The experiment involves the following samples:

condition   individual  individual_by_group side

di  A   A   L
di  A   A   R
di  B   B   L
di  B   B   R
di  C   C   L
di  C   C   R
est D   A   L
est D   A   R
est E   B   L
est E   B   R
est F   C   L
est F   C   R

What I would like to do is to compare di L samples against est L samples. However, because of tremendous differences between individuals due to inherent variability (as is known to be an issue with this experiment and as seen by PCA), I don't get very many DEGs. Therefore, I need to first correct for individual batch effects. Indeed, if I correct for the individual batch effect using the design ~individual + side, I can then successfully pull out DEGs between di L and di R (so two sides of the same animals, which differ by treatment condition). However, I can't figure out how to do this for di L vs. est L.

Based on the DESeq2 manual, the section "Group-specific condition effects, individuals nested within groups" is most relevant. They provide the following example:

##    grp ind cnd ind.n
## 1    X   1   A     1
## 2    X   1   B     1
## 3    X   2   A     2
## 4    X   2   B     2
## 5    X   3   A     3
## 6    X   3   B     3
## 7    Y   4   A     1
## 8    Y   4   B     1
## 9    Y   5   A     2
## 10   Y   5   B     2
## 11   Y   6   A     3
## 12   Y   6   B     3

Based on this, the equivalent design for me would be ~condition + condition:individual_by_group + condition:side. Using this, however, I cannot address the di L vs. est L question. Again from the manuals' example: "Above, the terms grpX.cndB and grpY.cndB give the group-specific condition effects, in other words, the condition B vs A effect for group X samples, and likewise for group Y samples. These terms control for all of the six individual effects. These group-specific condition effects can be extracted using results with the name argument." Therefore, conditiondi.sideL vs. conditionest.sideL I think actually tells me (di L vs. di R) vs. (est L vs. est R), not di L vs. est L.

I've tried all sorts of other designs like ~condition + condition:individual_by_group + side or ~condition:individual_by_group + condition + condition:side, which have either not worked or thrown an error: "Error in checkContrast(contrast, resNames) : all elements of the contrast as a list of length 2 should be elements of 'resultsNames(object)'". I also cannot do the straightforward ~individual + condition:side or something like that because that would lead to a "model matrix not full rank" issue (because individual can predict di vs. est).

Any suggestions? Thanks so much!!

Best, Elisa

deseq2 • 656 views

ADD COMMENT • link updated 4.2 years ago by Kevin Blighe ★ 4.0k • written 4.2 years ago by elisa.t.zhang • 0

score 1 · Answer 1 · 2020-07-09

1

Entering edit mode

Kevin Blighe ★ 4.0k

@kevin

Last seen 4 weeks ago

Republic of Ireland

I would go about this by merging the condition and side columns to create a new column, condition_side, and then trying the formula:

~ individual + condition_side

Then, the comparison would be di_L - est_L

For downstream analyses, you could then, later, remove the individual effects from the variance-stabilised or regularised log expression levels via limma::removeBatcheffect()

ADD COMMENT • link 4.2 years ago Kevin Blighe ★ 4.0k

0

Entering edit mode

Thanks, Kevin! Much appreciated! I agree that your suggestion would make the most direct sense. I did try that already, but it failed because diL and estL can be inferred based on which individual, and since they are thus not wholly independent, the model matrix not full rank error occurred. The closest I could get might be using the ind.n animal designation in the second example above, or in my case, the "individiualbygroup" numbering, but that doesn't really do the right kind of batch correction I'd need. Any ideas on how to get around this? Thanks so much!

ADD REPLY • link 4.2 years ago elisa.t.zhang • 0

1

Entering edit mode

Hi,

In DESeq2, we can't compare base levels of di vs est because this is confounded with your fixed effects blocking the individual differences.

The only way to make these comparisons is to model the individual effects as random effects, which can be accomplished with limma using duplicateCorrelation. As far as I know that is the only way to compare across these groups while controlling for the individual correlations within groups.

ADD REPLY • link 4.2 years ago Michael Love 42k