Hi,
I always get confused by the design matrix, and even when I figure out a solution I am not very confident in it. So I thought of asking for help.
I have 3 experimental groups: Control, Wild and Knockout; 2 time points: 2 hr and 4 hr; and 3 replicates for each combination.
I am not sure which of the following design matrices and contrasts is best suited to get DEGs for the scenarios specified in the comments.
Case 1:
dds <- DESeqDataSet(se, design = ~Time + Experiment + Time*Experiment)
dds <- dds[rowSums(counts(dds))>1,]
dds$Experiment <- relevel(dds$Experiment, ref="Control")
dds <- DESeq(dds, parallel = TRUE)
resultsNames(dds)
[1] "Intercept" "Time_4_vs_2" "Experiment_Wild_vs_Control" "Experiment_Knockout_vs_Control" "Time4.ExperimentWild" "Time4.ExperimentKnockout"
res.05 <- results(dds,c(0,0,-1,1,-1,1), alpha=0.05) #For knockout vs wild (both 2hr and 4hr)
res.05 <- results(dds,c(0,0,0,1,0,1), alpha=0.05) #For knockout vs control (both 2hr and 4hr)
res.05 <- results(dds,c(0,0,1,0,0,0), alpha=0.05) #For wild vs control 2hr
res.05 <- results(dds,c(0,0,0,-1,0,1), alpha=0.05) #For knockout 4h vs knockout 2hr
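One way to check numeric contrasts like those in Case 1 is to build the design matrix in base R and subtract the design rows of the two groups being compared. The sketch below assumes a hypothetical sample sheet (`coldata`) matching the description in the post, and assumes the coefficient order matches the `resultsNames(dds)` output shown for Case 1; the `cell` helper is made up for illustration and is not part of DESeq2. Under this coding, the vector labelled "knockout vs wild (both 2hr and 4hr)" appears to correspond to the 4hr comparison only, and the "knockout 4h vs knockout 2hr" vector does not match the derived one.

```r
# Hypothetical sample sheet: 3 groups x 2 time points x 3 replicates
coldata <- data.frame(
  Time       = factor(rep(c("2", "4"), each = 3, times = 3)),
  Experiment = factor(rep(c("Control", "Wild", "Knockout"), each = 6),
                      levels = c("Control", "Wild", "Knockout"))
)

# Same model as Case 1 (Time*Experiment expands to these three terms)
mm <- model.matrix(~ Time + Experiment + Time:Experiment, data = coldata)

# Design row shared by all replicates of one group/time combination
cell <- function(experiment, time) {
  rows <- coldata$Experiment == experiment & coldata$Time == time
  colMeans(mm[rows, , drop = FALSE])
}

# Knockout vs Wild at 2hr:  0 0 -1 1 0 0
ko_vs_wt_2h <- cell("Knockout", "2") - cell("Wild", "2")

# Knockout vs Wild at 4hr:  0 0 -1 1 -1 1
ko_vs_wt_4h <- cell("Knockout", "4") - cell("Wild", "4")

# Knockout 4hr vs Knockout 2hr:  0 1 0 0 0 1
ko_4h_vs_2h <- cell("Knockout", "4") - cell("Knockout", "2")

# Knockout vs Wild averaged over both time points:  0 0 -1 1 -0.5 0.5
ko_vs_wt_avg <- (ko_vs_wt_2h + ko_vs_wt_4h) / 2

# Any of these could then be passed as a numeric contrast, e.g.
# results(dds, contrast = ko_vs_wt_avg, alpha = 0.05)
```

Deriving the vector from the design rows avoids having to reason about which interaction terms to switch on by hand.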
Case 2:
dds <- DESeqDataSet(se, design = ~Time + Experiment)
dds <- dds[rowSums(counts(dds))>1,]
dds$Experiment <- relevel(dds$Experiment, ref="Control")
dds <- DESeq(dds, parallel = TRUE)
resultsNames(dds)
[1] "Intercept" "Time2" "Time4" "ExperimentControl" "ExperimentWild" "ExperimentKnockout"
res.05 <- results(dds,c(0,0,0,0,-1,1), alpha=0.05) #For knockout vs wild (both 2hr and 4hr)
res.05 <- results(dds,c(0,0,0,-1,0,1), alpha=0.05) #For knockout vs control (both 2hr and 4hr)
res.05 <- results(dds,c(0,1,0,-1,1,0), alpha=0.05) #For wild vs control 2hr
res.05 <- results(dds,c(0,-1,1,0,0,1), alpha=0.05) #For knockout 4h vs knockout 2hr
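For the additive design in Case 2, a small base R check shows that the model has a single knockout vs wild effect shared by both time points, so there is no separate 2hr or 4hr version of that comparison. This is only a sketch: `coldata` and `cell` are hypothetical, and it uses R's standard treatment coding (four coefficients), not the six-coefficient expanded matrix shown in the Case 2 output.

```r
# Hypothetical sample sheet: 3 groups x 2 time points x 3 replicates
coldata <- data.frame(
  Time       = factor(rep(c("2", "4"), each = 3, times = 3)),
  Experiment = factor(rep(c("Control", "Wild", "Knockout"), each = 6),
                      levels = c("Control", "Wild", "Knockout"))
)

# Additive model: no Time:Experiment interaction
mm_add <- model.matrix(~ Time + Experiment, data = coldata)

# Design row shared by all replicates of one group/time combination
cell <- function(experiment, time) {
  rows <- coldata$Experiment == experiment & coldata$Time == time
  colMeans(mm_add[rows, , drop = FALSE])
}

ko_vs_wt_2h <- cell("Knockout", "2") - cell("Wild", "2")
ko_vs_wt_4h <- cell("Knockout", "4") - cell("Wild", "4")

# Both are 0 0 -1 1: the comparison does not depend on time
same_at_both_times <- isTRUE(all.equal(ko_vs_wt_2h, ko_vs_wt_4h))

# With a fitted DESeq2 object, the equivalent named contrast would be
# results(dds, contrast = c("Experiment", "Knockout", "Wild"), alpha = 0.05)
```

The three-element character form of `contrast` names the factor, the numerator level and the denominator level, which avoids counting coefficient positions.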
Can anyone explain which case and contrast to use for the scenarios above? A short explanation would be very much appreciated.
Thanks
Nitesh
Hi Michael,
By "knockout vs wild (both 2hr and 4hr)" I meant testing for DEGs between knockout and wild while considering both time points (2hr and 4hr), so it is the overall change in expression pattern between the experiments. The null hypothesis would be that there is no significant difference in gene expression between knockout and wild, considering both time points, so the test takes into account all the samples I have. Does this make sense?
Thanks
Nitesh
A change at both time points should count. Is that the same as taking the union of the DEGs between knockout and wild at 2hr and at 4hr? I thought testing both at the same time would have more statistical power.
(1) Requiring DE at both time points has less power than (2) DE at either time point or both. Can you say which of 1 or 2 you are interested in?
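The two options can be made concrete with set operations on the per-time-point gene lists. The adjusted p-values below are invented purely for illustration; in practice they would come from the `padj` column of two separate results tables, one per time point.

```r
# Made-up adjusted p-values for five genes at each time point
padj_2h <- c(geneA = 0.01, geneB = 0.20, geneC = 0.03, geneD = 0.50, geneE = NA)
padj_4h <- c(geneA = 0.02, geneB = 0.01, geneC = 0.30, geneD = 0.60, geneE = 0.04)
alpha <- 0.05

# which() silently drops NA values, so genes without a padj are never called DE
sig_2h <- names(which(padj_2h < alpha))
sig_4h <- names(which(padj_4h < alpha))

# (1) DE at both time points: intersection of the two lists
de_both <- intersect(sig_2h, sig_4h)

# (2) DE at either time point or both: union of the two lists
de_either <- union(sig_2h, sig_4h)
```

Here `de_both` contains only geneA, while `de_either` contains geneA, geneB, geneC and geneE, which illustrates why requiring DE at both time points is the stricter criterion.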