Hi all,
I have an RNA-seq data set of samples that I wish to use for analysis of differential expression with DESeq2.
The data have three time points (day 1, day 5 and day 10), with two tissues sampled at each time point (Tissue A, the reference, and Tissue B) and 7 replicates of each.
Initially, to test differences between two time points or tissues, I created a new factor (group) that combined tissue and day (tissue_days) and ran DESeq2 with the design ~group. However, what I am ideally after is differential expression between the tissues over time. I am interested in events at all time points, including day 1, so I don't want day 1 to be ignored or modelled out.
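For reference, the combined-factor approach looked roughly like the sketch below. It uses simulated counts from makeExampleDESeqDataSet in place of my real data, so the object names, the sample layout and the level names such as "B_1" are stand-ins chosen for illustration.

```r
library(DESeq2)

# Simulated counts stand in for the real data (assumption):
# 2 tissues x 3 time points x 7 replicates = 42 samples.
set.seed(1)
dds <- makeExampleDESeqDataSet(n = 500, m = 42)
dds$tissue <- factor(rep(c("A", "B"), each = 21))
dds$time <- factor(rep(rep(c("1", "5", "10"), each = 7), times = 2),
                   levels = c("1", "5", "10"))

# Combine tissue and day into a single factor, e.g. "A_1", "B_10".
dds$group <- factor(paste(dds$tissue, dds$time, sep = "_"))
design(dds) <- ~ group
dds <- DESeq(dds)

# Tissue B vs Tissue A at day 1 only.
res_day1 <- results(dds, contrast = c("group", "B_1", "A_1"))
head(res_day1)
```

Any other pair of groups can be compared by changing the last two elements of the contrast.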
I have been following the helpful information at http://www.bioconductor.org/help/workflows/rnaseqGene/#time-course-experiments, however I do not want to remove any tissue-specific differences over time, as they are important to the experiment.
I have used a design formula of ~ time + tissue + tissue:time and ended up with the resultsNames(dds) of:
[1] "Intercept" "Tissue_B_vs_A" "time_5_vs_1" "time_10_vs_1" "TissueB.time5"
[6] "TissueB.time10"
From this I have taken the following:
Tissue_B_vs_A - results between the two tissues at the reference time point, day 1. Is it correct that day 1 is the reference time point here? The count plots at day 1 do not seem to agree.
time_5_vs_1 - results between two time points day 5 and 1 for the reference tissue (A)
time_10_vs_1 - results between two time points day 10 and 1 for the reference tissue (A)
TissueB.time5 - the difference between Tissue B and Tissue A in the change from day 1 to day 5, which is one of the results I am definitely after. To get the results for Tissue B vs A at time point 5 vs 1, I believe I need to call results with contrast=list(c("time_5_vs_1", "TissueB.time5"))?
TissueB.time10 - results as for TissueB.time5 but between time points day 10 and day 1 - also what I am after.
However, I would also like to see the between-tissue changes for day 10 vs day 5. Would I need to relevel the reference time point from day 1 to day 5 and run the analysis again, or can I get this through a better design formula? Likewise, I am interested in the time_10_vs_5 result, which is not listed; would releveling also provide this?
I am not sure I have set the design correctly for this experiment, so I would appreciate your feedback.
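To make the releveling idea concrete, here is a minimal base-R sketch (no DESeq2 involved) of how changing the reference level changes which coefficients the design produces. The toy sample table is an assumption, not my real colData, and the column names are those of model.matrix rather than the names DESeq2 reports.

```r
# Toy sample table (assumption): one row per tissue/time combination.
samples <- expand.grid(time = c("1", "5", "10"), tissue = c("A", "B"))
samples$time <- factor(samples$time, levels = c("1", "5", "10"))
samples$tissue <- factor(samples$tissue, levels = c("A", "B"))

# Reference levels: day 1 and Tissue A.
cols_ref1 <- colnames(model.matrix(~ time + tissue + tissue:time, samples))
print(cols_ref1)

# Make day 5 the reference level and rebuild the design matrix.
samples$time <- relevel(samples$time, ref = "5")
cols_ref5 <- colnames(model.matrix(~ time + tissue + tissue:time, samples))
print(cols_ref5)
```

With day 5 as the reference, the "time10" column is the day 10 vs day 5 effect in Tissue A, and "time10:tissueB" is how that effect differs in Tissue B.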
Questions I am trying to answer:
- Are there differentially expressed genes between tissues at time point 1, 5 or 10? I think this is answered with the combined factor, i.e. Tissue B vs A at day 1, which does not take into account any other changes or whether those genes change at other time points.
- Are there tissue-specific changes over time? Which genes change in Tissue B over the time points but not in Tissue A, or vice versa? Is this possible with a single design?
- Should I be using LRT with a reduced formula instead?
Many thanks for your time, I do appreciate any feedback.
Thank you, Gavin, for taking the time to reply. I appreciate your insight. (Snipped, as this comment was moved to under Gavin's post. Sorry.)
Take a look at ?results.
You need to specify test="Wald" in the call to results in order to generate p-values for a specific coefficient. Otherwise, the results table reports the LRT defined by the 'full' and 'reduced' arguments you gave when you ran the analysis with test="LRT".
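A minimal sketch of that difference, using simulated counts from makeExampleDESeqDataSet rather than your data. The sample layout, the reduced formula and the choice of the last coefficient in resultsNames(dds) are assumptions made for illustration.

```r
library(DESeq2)

# Simulated counts stand in for the real data (assumption):
# 2 tissues x 3 time points x 7 replicates = 42 samples.
set.seed(1)
dds <- makeExampleDESeqDataSet(n = 500, m = 42)
dds$tissue <- factor(rep(c("A", "B"), each = 21))
dds$time <- factor(rep(rep(c("1", "5", "10"), each = 7), times = 2),
                   levels = c("1", "5", "10"))
design(dds) <- ~ time + tissue + tissue:time

# LRT: full model vs a reduced model without the interaction terms.
dds <- DESeq(dds, test = "LRT", reduced = ~ time + tissue)

# Default table: the p-values come from the LRT, whichever coefficient
# the log2 fold change column happens to show.
res_lrt <- results(dds)

# Wald test for one coefficient; here simply the last one listed.
coef_name <- tail(resultsNames(dds), 1)
res_wald <- results(dds, name = coef_name, test = "Wald")

# The column descriptions record which test produced each table.
mcols(res_lrt)$description
mcols(res_wald)$description
```

Checking the descriptions this way is a quick safeguard against reading LRT p-values as if they belonged to a single coefficient.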