My experiment consists of 4 clones: 3 mutants and 1 WT, with 3 samples per clone. The samples were collected on 3 different days, which I treat as batches. I am comparing each mutant clone to the WT.
pData_1$condition <- factor(pData_1$condition,
levels = c("WT", "C3", "C6","C10"))
I include the batch information in my design:
# make WT the reference level explicitly (it is already the first level set above)
pData_1$condition <- relevel(pData_1$condition, ref = "WT")
# build the DESeq2 dataset from a count matrix
dds <- DESeqDataSetFromMatrix(countData = data_count,
                              colData = pData_1,
                              design = ~ batch + condition)
The design matrix looks as follows:
> # show the corresponding Design Matrix that DESeq2 will use
> model.matrix(~ batch + condition, pData_1)
   (Intercept) batchb_2 batchb_3 conditionC3 conditionC6 conditionC10
1            1        0        0           0           0            1
2            1        1        0           0           0            1
3            1        0        1           0           0            1
4            1        0        0           1           0            0
5            1        1        0           1           0            0
6            1        0        1           1           0            0
7            1        0        0           0           1            0
8            1        1        0           0           1            0
9            1        0        1           0           1            0
10           1        0        0           0           0            0
11           1        1        0           0           0            0
12           1        0        1           0           0            0
attr(,"assign")
[1] 0 1 1 2 2 2
attr(,"contrasts")
attr(,"contrasts")$batch
[1] "contr.treatment"
attr(,"contrasts")$condition
[1] "contr.treatment"
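In case it helps to reproduce this layout without my data, here is a minimal base R sketch. The sample table in it is a hypothetical reconstruction: the row order and the batch labels (b_1, b_2, b_3) are inferred from the output shown here, not copied from my real pData_1. It only rebuilds the design matrix and checks that it is full rank, meaning that batch and condition are not confounded in this layout.

```r
# Hypothetical reconstruction of the sample table (not the real pData_1):
# rows 1-3 are C10, 4-6 are C3, 7-9 are C6, 10-12 are WT,
# and each clone has one sample in each of the three batches.
pData_sketch <- data.frame(
  condition = factor(rep(c("C10", "C3", "C6", "WT"), each = 3),
                     levels = c("WT", "C3", "C6", "C10")),
  batch     = factor(rep(c("b_1", "b_2", "b_3"), times = 4))
)

# Same formula as in the DESeq2 design
mm <- model.matrix(~ batch + condition, pData_sketch)

# Full rank: every coefficient in the model can be estimated
stopifnot(qr(mm)$rank == ncol(mm))

colnames(mm)
```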
The results names I get are:
> resultsNames(dds)
[1] "Intercept" "batch_b_2_vs_b_1" "batch_b_3_vs_b_1"
[4] "condition_C3_vs_WT" "condition_C6_vs_WT" "condition_C10_vs_WT"
My question: am I right that the results I am interested in are simply names [4], [5] and [6], and that I don't need to build contrasts combining the condition effect with the batch effect? Is the batch already accounted for in the "condition" results?
A second question: what would be the correct way to test for a difference between all mutant clones as a group and the WT? Is simply grouping the samples as "mutant" and "wild type" correct, or is the unequal number of samples (9 mutant vs 3 WT) a problem?
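To make clear what I mean by grouping, here is a minimal base R sketch. The sample layout and the variable names (group, mm_group) are hypothetical, and I am assuming the corresponding DESeq2 design would be ~ batch + group. I am not claiming this is the right approach, it is only the option I am asking about.

```r
# Hypothetical sample layout: 4 clones x 3 batches, as in my experiment
condition <- factor(rep(c("C10", "C3", "C6", "WT"), each = 3),
                    levels = c("WT", "C3", "C6", "C10"))
batch <- factor(rep(c("b_1", "b_2", "b_3"), times = 4))

# Collapse the three mutant clones into a single "mutant" level,
# keeping WT as the reference level
group <- factor(ifelse(condition == "WT", "WT", "mutant"),
                levels = c("WT", "mutant"))

# Unbalanced group sizes: 3 WT samples vs 9 mutant samples
table(group)

# Design matrix for the grouped comparison (assumed design: ~ batch + group)
mm_group <- model.matrix(~ batch + group)
colnames(mm_group)
```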
Thank you!