Hi,
I am working with time-series RNA-seq data, and would like to make sure I have the right design for my purpose.
The experiments like:
Time = rep(c("0h", "2h", "4h", "6h", "8h"), each = 2) Treat = rep(c("Control", "Treat"), each = 10) nameD <- paste(Treat, Time, c(rep(LETTERS[1:2], 5), rep(LETTERS[3:4], 5)), sep = "_") coldata <- data.frame(row.names = nameD, Time = Time, Treat = Treat)
> coldata Time Treat Control_0h_A 0h Control Control_0h_B 0h Control Control_2h_A 2h Control Control_2h_B 2h Control Control_4h_A 4h Control Control_4h_B 4h Control Control_6h_A 6h Control Control_6h_B 6h Control Control_8h_A 8h Control Control_8h_B 8h Control Treat_0h_C 0h Treat Treat_0h_D 0h Treat Treat_2h_C 2h Treat Treat_2h_D 2h Treat Treat_4h_C 4h Treat Treat_4h_D 4h Treat Treat_6h_C 6h Treat Treat_6h_D 6h Treat Treat_8h_C 8h Treat Treat_8h_D 8h Treat
This is longitudinal experiments, two biological replicates for treat and control respectively, and each sample is repeated sequenced in different time. Besides, treatments started before 0 hour, 0 hour only mean when the samples were sequenced.
I would like to find genes showing difference in any time point between Treat and control. I think the likelihood test is right choice.
The full model I used as
design(dds) <- formula(~ Time + Treat + Time:Treat)
the reduced model I used is different from Vignettes of DESeq2, I used
reduced_1 = formula(~ Time)
rather than
reduced_2 = formula(~ Time + Treat)
I think likelihood test with reduced_2 could not give small pvalue to genes showing consistent difference between treat and control, i.e., genes expression are parallel between two condition.
A few questions:
- Am I right to use reduced_1? (From the discussion and comments I think it fits my purpose).
- Because it is repeated measurements for each sample, the genes expression also correlate. Would it be a problem for the test?
- I have tried two test so far:
- Setting one (do not consider interaction): full = formula(~ Time + Treat), reduced = formula(~ Time)
- Setting two (consider interaction): full = formula(~ Time + Treat + Time:Treat), reduced = formula(~ Time)
- To my surprise, setting one actually find more DE genes than setting two basing on FDR <= 0.01. I noticed that fold change is small between treat and control, is it connected to the model complexity and lead to less DE genes? or better reasons?
Thanks in advance.
But cshao has no sample groups in his design table, hence this won't work that way, I think. Hence my question below.
True, it depends on the experiment. Even if not the same samples, another reason to control for time 0 would be: individuals in control group kept in container 1, individuals in treatment group kept in container 2. Context can be more or less important depending on the details of experiment.
Good point, I edited my posts.
I don't think one can control for the sample effect here, because the comparison of interest is across treatment group, which contain different samples. Maybe Simon sees a way though. Given that at time 0, you expect a treatment effect already, I would use your "setting 2" above, with the interaction term in the full model to allow for varying treatment effect at different times, and then a reduced of
~ time
."I noticed that fold change is small between treat and control"
Read the section in ?results which talks about results tables for the likelihood ratio test. Briefly, there is a single likelihood ratio statistic and p-value, but there are many possible log fold changes to show. The results table prints the last one by default, and you need to use the
name
argument to pull out others. SeeresultsNames(dds)
for all of the log fold changes fit by the DESeq() function.I don't typically like to focus on the total number of genes with a small FDR; the important thing is to build the correct set for your question of interest.
Thanks. The fold change I talked about is the largest fold change in any time points for each genes, I did the calculation myself. About the second question, i.e., the gene expression correlation among time points, how do you think about it?