Question

Modelling batch effect in DESeq2 - how to interpret the results?

annamariabugaj ▴ 10

My experiment consists of 4 clones (3 mutants and 1 WT), with 3 samples per clone. The samples were collected on 3 different days, which I treat as batches. I am comparing each mutant clone to the WT.

pData_1$condition <- factor(pData_1$condition, 
                        levels = c("WT", "C3", "C6","C10"))

pData_1$condition <- relevel(pData_1$condition, ref = "WT") # explicitly set WT as the reference (control) level

I include the batch information in my design:

# build the DESeqDataSet from the count matrix
dds <- DESeqDataSetFromMatrix(countData = data_count, 
                              colData = pData_1, 
                              design = ~ batch + condition)

The design matrix looks as follows:

> # show the corresponding Design Matrix that DESeq2 will use
> model.matrix(~ batch + condition, pData_1)
   (Intercept) batchb_2 batchb_3 conditionC3 conditionC6 conditionC10
1            1        0        0           0           0            1
2            1        1        0           0           0            1
3            1        0        1           0           0            1
4            1        0        0           1           0            0
5            1        1        0           1           0            0
6            1        0        1           1           0            0
7            1        0        0           0           1            0
8            1        1        0           0           1            0
9            1        0        1           0           1            0
10           1        0        0           0           0            0
11           1        1        0           0           0            0
12           1        0        1           0           0            0
attr(,"assign")
[1] 0 1 1 2 2 2
attr(,"contrasts")
attr(,"contrasts")$batch
[1] "contr.treatment"

attr(,"contrasts")$condition
[1] "contr.treatment"

The results names I get are:

> resultsNames(dds)
[1] "Intercept"           "batch_b_2_vs_b_1"    "batch_b_3_vs_b_1"   
[4] "condition_C3_vs_WT"  "condition_C6_vs_WT"  "condition_C10_vs_WT"

Now my question is if I understand correctly that my interesting results would just be the names [4] [5] [6] and I don't need to pull the contrasts combining the condition effect with batch effect, right? The batch is already accounted for in the "condition" results?

And another question - what would be the correct way of testing difference between all mutant clones as a group vs WT? Is just grouping them as "mutants" and "wild type" correct? Or is the unequal number of samples a problem (9 mutants vs 3 WT)?

score 1 · Answer 1 · 2022-10-05

Now my question is if I understand correctly that my interesting results would just be the names [4] [5] [6] and I don't need to pull the contrasts combining the condition effect with batch effect, right? The batch is already accounted for in the "condition" results?

Yes. Because batch is a term in the design, the "condition" coefficients are already adjusted for it. Genes that differ between batches are usually not biologically interesting, so you can ignore the batch contrasts.
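As a rough illustration of pulling just those three coefficients, here is a minimal sketch. It runs on simulated counts from DESeq2's makeExampleDESeqDataSet() rather than the real data_count, and the object names res_C3, res_C6 and res_C10 are made up for the example. The sample layout is an assumption that mirrors the design matrix in the question.

```r
library(DESeq2)

set.seed(1)
# Simulated stand-in for the real counts: 500 genes x 12 samples (assumption)
dds <- makeExampleDESeqDataSet(n = 500, m = 12)

# Recreate the layout from the question: 4 clones, each sampled in 3 batches
dds$condition <- factor(rep(c("C10", "C3", "C6", "WT"), each = 3),
                        levels = c("WT", "C3", "C6", "C10"))
dds$batch <- factor(rep(c("b_1", "b_2", "b_3"), times = 4))
design(dds) <- ~ batch + condition

dds <- DESeq(dds)

# Each mutant-vs-WT comparison is a single coefficient, already adjusted for batch
res_C3  <- results(dds, name = "condition_C3_vs_WT")
res_C6  <- results(dds, name = "condition_C6_vs_WT")
res_C10 <- results(dds, name = "condition_C10_vs_WT")

# The same comparison can also be requested by factor levels:
# results(dds, contrast = c("condition", "C3", "WT"))
```

With the real data, only the last block should be needed once DESeq(dds) has been run on the dataset built in the question.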

And another question - what would be the correct way of testing difference between all mutant clones as a group vs WT?

There are a couple of ways, but a simple, readable one is to add a new column to colData that contains just "mutant" and "WT", and use that column in your design instead of condition.
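A minimal sketch of that grouping step, in base R only. The column name group is an assumption made for illustration, and pData_1 is rebuilt here as a small stand-in with the same layout as in the question.

```r
# Stand-in for the sample table in the question (assumed layout)
pData_1 <- data.frame(
  condition = factor(rep(c("C10", "C3", "C6", "WT"), each = 3),
                     levels = c("WT", "C3", "C6", "C10")),
  batch = factor(rep(c("b_1", "b_2", "b_3"), times = 4))
)

# Collapse the three mutant clones into one level, keeping WT as the reference
pData_1$group <- factor(ifelse(pData_1$condition == "WT", "WT", "mutant"),
                        levels = c("WT", "mutant"))

# 9 mutant samples vs 3 WT samples
table(pData_1$group)

# Design matrix for the grouped model: one mutant-vs-WT column plus batch
mm <- model.matrix(~ batch + group, pData_1)
colnames(mm)
```

Passing this table to DESeqDataSetFromMatrix with design = ~ batch + group should then give a single coefficient, expected to be named "group_mutant_vs_WT" in resultsNames(dds), for the pooled comparison.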