I'm trying to figure out how to interpret a model design with 3 factors plus interactions.
I have young and old mice, untreated and treated, in two timepoints. I'm interested to see the treatment effect and the age effect.
> head(sample.summary)
sample age treatment timepoint
1 s1 young untreated 1
2 s2 young treated 1
3 s3 young untreated 3
4 s4 young treated 3
5 s5 old untreated 1
6 s6 old treated 1
7 s7 old untreated 3
8 s8 old treated 3
I am treating it as a time-series analysis, with 3 categorical factors plus interactions.
dds <- DESeqDataSetFromTximport(all.txi.kallisto, sample.summary, ~ age*treatment*timepoint)
dds <- DESeq(dds, test = "LRT", reduced = ~ age+treatment+timepoint)
resultsNames(dds)
[1] "Intercept" "age_young_vs_old" "treatment_treated_vs_untreated"
[4] "timepoint_3_vs_1" "ageyoung.treatmenttreated" "ageyoung.timepoint3"
[7] "treatmenttreated.timepoint3" "ageyoung.treatmenttreated.timepoint3"
1) Do just two timepoints qualify the analysis as a time-series?
2) What is the best practice? One model with 3 factors (like above) or four models with 2-factors (one model for young, one for old, one for untreated, one for treated)?
3) How should we interpret the resultsNames?
- "treatmenttreatedvs_untreated" is the "treatment effect for old in 1mo" or the "treatment effect for old"?
- "ageyoung.treatmenttreted" is the "treatment effect for young compared to old, in 1mo" or the "treatment effect for young compared to old"?
- "treatmenttreated.timepoint3" is the "timepoint effect for treated compared to untreated" or "timepoint effect for treated compared to untreated, in old" ?
- "ageyoung.treatmenttreated.timepoint3" is the "timepoint effect for treated vs untreated, and young vs old"?
4) To get the total treatment effect, I would need to combine all the comparisons that contain the treatment factor, correct? So, "treatmenttreatedvs_untreated", "ageyoung.treatmenttreated", "treatmenttreated.timepoint3", "ageyoung.treatmenttreated.timepoint3"
5) How would I get the "treatment effect for old in 3mo" or the "age effect for untreated in 0mo" and so on?
The head of my sessionInfo is:
R version 3.6.1 (2019-07-05)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS Mojave 10.14.6
other attached packages: DESeq2_1.24.0
Any insight or material that could help me unpuzzle would be so greatly appreciated! Thank you!
Thank you so much for your time Mike. I have been reading these resources again and again, but my puzzling persists. The basis of my confusion is how the introduction of the third factor changes the interpretation of each comparison. For example:
For a design with one factor (~ treatment), we see the treatment effect on gene expression.
If we add one more factor (~ age + treatment), we see the treatment effect on expression and the age effect on expression.
If we add their interaction (~ age + treatment + age:treatment), then the main effect (treatmenttreatedvs_untreated) changes by adding the age control; and we see the treatment effect for the "old" level of age. Which makes me interpret it like each effect (resultName) is a combination of both factors.
If this is correct then if I add one more factor plus all the interactions (~ treatment + age + timepoint + treatment:age + treatment:timepoint + age:timepoint + treatment:age:timepoint), then all the effects will be a combination of all three factors. So for example, "treatmenttreatedvs_untreated" is the treatment effect for old in 1mo, because old is the control level for age and 1mo is the control level for timepoint. Does this make sense?
PS: Just to make sure I haven't missed sth, this is the vignette you are referring to, correct?
I'd recommend meeting with a statistician or someone familiar with linear models to discuss the interpretation of the coefficients in a design with second order interactions. It's not very easy to explain in a comment box / via email, but really helps from a face-to-face meeting.
Re: vignette, that one is very old. To find the most up-to-date vignette for a particular package, go to https://bioconductor.org/pacakges/DESeq2 and then click on HTML under the header Documentation. Or from within R you can do
vignette("DESeq2")
.