I am having issues with the design formula for a DESeq2 analysis of time-course data. I have a pilot experiment with 6 blood samples: 2 conditions with 3 samples each, taken at the time-points 0, 12 and 24 hrs, with no replicates. I've gathered from the DESeq2 vignette and from reading forums that I should be using the likelihood ratio test (LRT) for this type of analysis, but I'm unsure what design formula to specify. The questions I want to address are:
- What are the differences in gene expression between timepoints 0 vs 12, 12 vs 24 and 0 vs 24 hrs?
- What are the condition-specific differences in gene expression for each of these timepoint comparisons?
- Do the gene expression differences between time-points change between a model that does not take cell numbers per sample into account and a model that includes cell numbers as additional variables?
An example of my phenotype data is:
   timepoint condition granulocytes lymphocytes monocytes
s1         0      con1           60          36         3
s2        12      con1           46          47         5
s3        24      con1           33          59         6
s4         0      con2           61          34         3
s5        12      con2           49          50         6
s6        24      con2           30          60         7
And my code looks like this:
library(DESeq2)
# Model 1: condition and timepoint only
ddsMat <- DESeqDataSetFromMatrix(countData = counts,
                                 colData = pheno,
                                 design = ~ condition + timepoint)
# Model 2: cell numbers added as covariates
ddsMat2 <- DESeqDataSetFromMatrix(countData = counts,
                                  colData = pheno,
                                  design = ~ condition + timepoint + granulocytes + lymphocytes + monocytes)
# LRT of each full model against a reduced model containing timepoint only
ddsTC <- DESeq(ddsMat, test = "LRT", reduced = ~ timepoint)
ddsTC2 <- DESeq(ddsMat2, test = "LRT", reduced = ~ timepoint)
# Wald contrasts from model 1
t0vs12 <- results(ddsTC, contrast = c("timepoint", "12", "0"), alpha = 0.05, test = "Wald")
t12vs24 <- results(ddsTC, contrast = c("timepoint", "24", "12"), alpha = 0.05, test = "Wald")
con1vscon2 <- results(ddsTC, contrast = c("condition", "con1", "con2"), alpha = 0.05, test = "Wald")
# Wald contrasts from model 2 (separate names so the model 1 results are not overwritten)
t0vs12_cells <- results(ddsTC2, contrast = c("timepoint", "12", "0"), alpha = 0.05, test = "Wald")
t12vs24_cells <- results(ddsTC2, contrast = c("timepoint", "24", "12"), alpha = 0.05, test = "Wald")
con1vscon2_cells <- results(ddsTC2, contrast = c("condition", "con1", "con2"), alpha = 0.05, test = "Wald")
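Before running DESeq2 it can help to check whether each design is estimable with only 6 samples. The following is a minimal base R sketch, not a definitive diagnostic. It rebuilds the phenotype table from the post, assumes timepoint and condition are stored as factors (the timepoint contrasts need a factor rather than a numeric column), and uses a hypothetical helper, design_summary, to compare the number of coefficients with the number of samples for both designs.

```r
# Phenotype table from the post, with timepoint and condition as factors
pheno <- data.frame(
  timepoint    = factor(c(0, 12, 24, 0, 12, 24)),
  condition    = factor(c("con1", "con1", "con1", "con2", "con2", "con2")),
  granulocytes = c(60, 46, 33, 61, 49, 30),
  lymphocytes  = c(36, 47, 59, 34, 50, 60),
  monocytes    = c(3, 5, 6, 3, 6, 7),
  row.names    = paste0("s", 1:6)
)

# Hypothetical helper (not part of DESeq2): summarise a design matrix
design_summary <- function(formula, data) {
  mm <- model.matrix(formula, data)
  rank <- qr(mm)$rank
  c(samples      = nrow(mm),
    coefficients = ncol(mm),
    rank         = rank,
    residual_df  = nrow(mm) - rank)
}

simple <- design_summary(~ condition + timepoint, pheno)
cells  <- design_summary(
  ~ condition + timepoint + granulocytes + lymphocytes + monocytes, pheno)

print(rbind(simple, cells))
```

The design with cell numbers asks for 7 coefficients from 6 samples, so its model matrix cannot be full rank, and DESeq2 requires a full-rank design. The simpler design leaves only 2 residual degrees of freedom for estimating dispersion.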
If the analysis cannot be done without replicates, then for the purposes of this experiment I can treat the 2 conditions as replicates of the same sample. I am also planning to repeat this with replicates in the future, so I will need to understand what design formula to use in that case.
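For the replicated follow-up, the DESeq2 vignette's time-series approach is an LRT of a full design with a condition:timepoint interaction against a reduced design without it, which picks out genes whose time profile differs between conditions. The sketch below assumes a hypothetical layout of 3 replicates per condition and timepoint, and uses simulated counts from makeExampleDESeqDataSet as a stand-in for real data. The DESeq2 part only runs if the package is installed.

```r
# Hypothetical replicated layout: 2 conditions x 3 timepoints x 3 replicates
pheno_rep <- data.frame(
  condition = factor(rep(c("con1", "con2"), each = 9)),
  timepoint = factor(rep(rep(c(0, 12, 24), each = 3), times = 2))
)

full_design    <- ~ condition + timepoint + condition:timepoint
reduced_design <- ~ condition + timepoint

# Base R check: which coefficients the LRT removes, and the residual df left
full_mm      <- model.matrix(full_design, pheno_rep)
reduced_mm   <- model.matrix(reduced_design, pheno_rep)
tested_terms <- setdiff(colnames(full_mm), colnames(reduced_mm))
residual_df  <- nrow(full_mm) - qr(full_mm)$rank
print(tested_terms)
print(residual_df)

if (requireNamespace("DESeq2", quietly = TRUE)) {
  library(DESeq2)
  set.seed(1)
  # Simulated counts only (assumption): replace with the real count matrix
  sim <- makeExampleDESeqDataSet(n = 500, m = nrow(pheno_rep))
  counts_sim <- counts(sim)
  rownames(pheno_rep) <- colnames(counts_sim)

  dds <- DESeqDataSetFromMatrix(countData = counts_sim,
                                colData = pheno_rep,
                                design = full_design)

  # LRT: genes whose change over time differs between the two conditions
  dds <- DESeq(dds, test = "LRT", reduced = reduced_design)
  res_interaction <- results(dds, alpha = 0.05)

  # Wald test for one coefficient: 12 vs 0 hrs in the reference condition (con1)
  res_12_vs_0 <- results(dds, name = "timepoint_12_vs_0", test = "Wald")
}
```

With the unreplicated pilot, the same interaction design would need 6 coefficients for 6 samples, which leaves no residual degrees of freedom for estimating dispersion. That is why the interaction test needs the replicated experiment.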
Thanks in advance for help!
Thanks for your reply! I thought the lack of samples might be the issue, but wanted to check if there was any way to answer these questions regardless.