Dear community,
I am having trouble with the DESeq2 analysis of a time course experiment and was hoping someone could point me in the right direction.
I have 3 different conditions. For conditions 1 and 2 there are time points 0, 8, 16, 24, 48, 72 hours. For condition 3 there are time points 0, 8, 24, 48, 72, 96, 120 hours.
While trying to specify a model
dds <- DESeqDataSet(dataset, design = ~ condition + time + condition:time)
as described here (in the time course section), DESeq2 throws an error:
"..the model matrix is not full rank, so the model cannot be fit as specified.
Levels or combinations of levels without any samples have resulted in
column(s) of zeros in the model matrix".
I guess this is because some of the conditions do not have all the time points?
I tried building the model matrix myself:
model <- model.matrix(~ condition + time + condition:time, colData(data_dds))
and then removing the columns that contain only zeros. However, I do not see all the conditions in this model matrix (the first one is always hidden), and supplying this matrix to DESeqDataSet gives the same error.
Is there a way to make such a model?
Are there any alternative ways to test condition-specific and time/timepoint-specific effects?
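On the model-matrix attempt described above, here is a minimal sketch in base R of what may be going on. The sample table is made up to mirror the time points listed in the question; the two replicates per condition/time combination and the object names (`coldata`, `mm`, `mm_reduced`, `mm_full_rank`) are assumptions for illustration only. It suggests that dropping the all-zero columns is not enough for this layout, because the reference condition has no samples at 96 and 120 hours, so the `time96` and `time120` columns are exact copies of the matching condition 3 interaction columns.

```r
# Toy sample table mirroring the layout in the question.
# Two replicates per condition/time combination is an assumption.
times_12 <- c(0, 8, 16, 24, 48, 72)        # conditions 1 and 2
times_3  <- c(0, 8, 24, 48, 72, 96, 120)   # condition 3
coldata <- rbind(
  data.frame(condition = "1", time = rep(times_12, each = 2)),
  data.frame(condition = "2", time = rep(times_12, each = 2)),
  data.frame(condition = "3", time = rep(times_3,  each = 2))
)
coldata$condition <- factor(coldata$condition)
coldata$time <- factor(coldata$time)       # time as a factor, as in the question

mm <- model.matrix(~ condition + time + condition:time, coldata)

# Step 1: drop columns that are all zero (combinations without samples).
all_zero <- apply(mm, 2, function(x) all(x == 0))
colnames(mm)[all_zero]                     # three interaction columns
mm_reduced <- mm[, !all_zero]

# Step 2: the reduced matrix is still rank deficient.
ncol(mm_reduced)                           # 21 columns
qr(mm_reduced)$rank                        # rank 19 (one per non-empty combination)

# The culprits: these pairs of columns are identical.
all(mm_reduced[, "time96"]  == mm_reduced[, "condition3:time96"])
all(mm_reduced[, "time120"] == mm_reduced[, "condition3:time120"])

# Step 3: drop one column of each identical pair.
drop <- c("condition3:time96", "condition3:time120")
mm_full_rank <- mm_reduced[, !colnames(mm_reduced) %in% drop]
qr(mm_full_rank)$rank == ncol(mm_full_rank)   # TRUE

# Not run here: the DESeq2 vignette describes supplying such a matrix
# through the `full` argument of DESeq(), e.g.
# dds <- DESeq(dds, full = mm_full_rank)
```

With a matrix reduced like this, the 96 and 120 hour effects can only be estimated within condition 3, so no condition-by-time comparison is available at those time points; the same holds for condition 3 at 16 hours.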
If you have a specific 'shape' of time profile that you want to investigate, then treating time as a continuous variable might let you compensate for the 'missing' timepoints. If you keep the time term linear, you may be able to detect genes whose gradient differs between conditions (on a log-like scale). Shifting the time variable by a fixed offset would mean that the difference in intercepts between conditions becomes the difference between the fitted curves at the timepoint corresponding to the (negative) offset, effectively interpolating at that timepoint. I've tried this unsuccessfully in the past, in cases with fewer timepoints, but it might be worth a try.
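As a rough illustration of the continuous-time idea, here is a base R sketch. Everything in it is assumed for the example: the toy sample table, the noise-free "log-scale expression" values `y`, the intercepts and slopes, and the choice of 16 hours as the offset `t0`. It uses `lm()` rather than DESeq2, purely to show how the coefficients would be read.

```r
# Toy sample table with the time points from the question (one sample each).
coldata <- rbind(
  data.frame(condition = "1", time = c(0, 8, 16, 24, 48, 72)),
  data.frame(condition = "2", time = c(0, 8, 16, 24, 48, 72)),
  data.frame(condition = "3", time = c(0, 8, 24, 48, 72, 96, 120))
)
coldata$condition <- factor(coldata$condition)

# Shift time so that zero corresponds to 16 h, a time point that
# condition 3 was never sampled at.
t0 <- 16
coldata$time_c <- coldata$time - t0

# With numeric time the design has only 6 columns and is full rank,
# even though the conditions were sampled at different time points.
mm <- model.matrix(~ condition + time_c + condition:time_c, coldata)
qr(mm)$rank == ncol(mm)                    # TRUE

# Made-up, noise-free straight lines per condition (hypothetical values).
intercepts <- c("1" = 1, "2" = 2, "3" = 3)
slopes     <- c("1" = 0.10, "2" = 0.10, "3" = 0.05)
cond <- as.character(coldata$condition)
coldata$y <- unname(intercepts[cond] + slopes[cond] * coldata$time)

fit <- lm(y ~ condition + time_c + condition:time_c, data = coldata)

# Difference between the condition 3 and condition 1 lines at 16 h:
# (3 + 0.05 * 16) - (1 + 0.10 * 16) = 1.2
coef(fit)["condition3"]

# Difference in slope between condition 3 and condition 1: 0.05 - 0.10
coef(fit)["condition3:time_c"]

# Not run here: in DESeq2 the same formula could be used as the design,
# with a likelihood ratio test against a model without the interaction:
# design(dds) <- ~ condition + time_c + condition:time_c
# dds <- DESeq(dds, test = "LRT", reduced = ~ condition + time_c)
```

Whether a straight line on the log scale is a reasonable description of the time profiles is a separate question that this sketch does not address.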
Yes, I was considering using time as a continuous variable. But since I am just starting to learn bioinformatics, I am not yet knowledgeable enough to do this on my own :) Maybe I'll try if I find a good tutorial.