Hi,
I have been trying to learn DESeq2 and managed to work through a two-factor design (I followed a similar post, "DESeq2 Model Design", thanks to Michael). I'm now working on a three-factor time-series design and would appreciate some help designing my model and specifying contrasts. I have read the vignette and many forum questions but am still confused.
Here is my design: group: A and B; treatment: control and treated; time: 0, 1, 2, 4; 3 replicates each (T0 has control samples only).
group treatment timepoints
1 A control T0
2 A control T0
3 A control T0
4 B control T0
5 B control T0
6 B control T0
7 A control T1
8 A control T1
9 A control T1
10 B control T1
11 B control T1
12 B control T1
13 A treated T1
14 A treated T1
15 A treated T1
16 B treated T1
17 B treated T1
18 B treated T1
19 A control T2
20 A control T2
21 A control T2
22 B control T2
23 B control T2
24 B control T2
25 A treated T2
26 A treated T2
27 A treated T2
28 B treated T2
29 B treated T2
30 B treated T2
31 A control T4
32 A control T4
33 A control T4
34 B control T4
35 B control T4
36 B control T4
37 A treated T4
38 A treated T4
39 A treated T4
40 B treated T4
41 B treated T4
42 B treated T4
Key questions for DE:
1) time-series DE within a specific group
2) time-series DE within a specific treatment
3) group vs treatment time-series combinations
Different model designs tried:
~group+treatment+timepoints+treatment:group
~group+treatment+timepoints+group:timepoints
~group+group:treatment+group:timepoints - test="LRT", reduced = ~group+group:timepoints
I am having a hard time understanding which design I should use for a complete analysis (I should add that I am new to R, so I am learning as I go).
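One way to compare candidate designs like the three listed above is to check whether each one gives a full-rank model matrix on the sample table, since DESeq2 will not fit a design whose model matrix is not full rank. The sketch below uses base R only. The table is rebuilt from the post, and the helper `is_full_rank` and the short names `d1`, `d2`, `d3` and `full` are made up for illustration. It is a rough check under those assumptions, not a recommendation of any particular design.

```r
# Rebuild the sample table from the post: 42 samples, no treated samples at T0.
coldata <- expand.grid(
  replicate  = 1:3,
  group      = c("A", "B"),
  treatment  = c("control", "treated"),
  timepoints = c("T0", "T1", "T2", "T4"),
  stringsAsFactors = TRUE
)
coldata <- subset(coldata, !(timepoints == "T0" & treatment == "treated"))

# Hypothetical helper: TRUE if the design's model matrix has full column rank.
is_full_rank <- function(design, data) {
  mm <- model.matrix(design, data)
  qr(mm)$rank == ncol(mm)
}

# The three designs from the post, plus the full factorial for comparison.
designs <- list(
  d1   = ~ group + treatment + timepoints + treatment:group,
  d2   = ~ group + treatment + timepoints + group:timepoints,
  d3   = ~ group + group:treatment + group:timepoints,
  full = ~ group * treatment * timepoints
)

full_rank <- sapply(designs, is_full_rank, data = coldata)
print(full_rank)
```

On this table the three listed designs come out full rank, while the full factorial does not, because there are no treated samples at T0 to estimate the treatment-by-time terms against.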
> sessionInfo()
R version 3.2.2 (2015-08-14)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1
locale:
[1] LC_COLLATE=English_New Zealand.1252 LC_CTYPE=English_New Zealand.1252
[3] LC_MONETARY=English_New Zealand.1252 LC_NUMERIC=C
[5] LC_TIME=English_New Zealand.1252
attached base packages:
[1] parallel stats4 stats graphics grDevices utils datasets methods base
other attached packages:
[1] ggplot2_2.2.0 DESeq2_1.10.1 RcppArmadillo_0.7.500.0.0
[4] Rcpp_0.12.8 SummarizedExperiment_1.0.2 Biobase_2.30.0
[7] GenomicRanges_1.22.4 GenomeInfoDb_1.6.3 IRanges_2.4.8
[10] S4Vectors_0.8.11 BiocGenerics_0.16.1
loaded via a namespace (and not attached):
[1] genefilter_1.52.1 locfit_1.5-9.1 splines_3.2.2 lattice_0.20-34
[5] colorspace_1.3-1 htmltools_0.3.5 survival_2.40-1 XML_3.98-1.5
[9] foreign_0.8-67 DBI_0.5-1 BiocParallel_1.4.3 RColorBrewer_1.1-2
[13] lambda.r_1.1.9 plyr_1.8.4 stringr_1.1.0 zlibbioc_1.16.0
[17] munsell_0.4.3 gtable_0.2.0 futile.logger_1.4.3 memoise_1.0.0
[21] latticeExtra_0.6-28 knitr_1.15.1 geneplotter_1.48.0 AnnotationDbi_1.32.3
[25] htmlTable_1.7 acepack_1.4.1 xtable_1.8-2 openssl_0.9.5
[29] scales_0.4.1 base64_2.0 Hmisc_4.0-1 annotate_1.48.0
[33] XVector_0.10.0 gridExtra_2.2.1 digest_0.6.10 stringi_1.1.2
[37] grid_3.2.2 tools_3.2.2 magrittr_1.5 lazyeval_0.2.0
[41] tibble_1.2 RSQLite_1.1 Formula_1.2-1 cluster_2.0.5
[45] futile.options_1.0.0 Matrix_1.2-7.1 data.table_1.10.0 assertthat_0.1
[49] rpart_4.1-10 nnet_7.3-12
Thanks everyone for your time!!!
Dear Michael,
I'm analysing an RNA-seq dataset with a design very similar to the one described by Sukhi, except that I do not have control or infected samples at time 0 (I only have control and infected samples at T1, T2 and T3).
The approach I took was combining all 3 factors (Group, Treatment, Time) into a single factor and using design=~ Donor + Group_Treatment_Time. Would this be a good strategy?
I'm interested in time-specific differences:
i) between control and infection for each group
ii) between groups for control (or infection)
I'm also interested in finding genes with a difference in baseline expression (a main effect), i.e. lines moving in parallel:
iii) between control and infection for each group
iv) between groups for control (or infection)
Thank you for your time.
I'd recommend you partner with a local statistician as well. You have donor, group, treatment and time variables, and there are many ways to do the modeling. Note that this is not DESeq2-specific: any model you can fit with linear models in R you can also fit with DESeq2.
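To make the "ordinary R linear models" point concrete, here is a minimal base R sketch of the combined-factor idea on a hypothetical layout modelled on the second question: 2 groups, control and infected, T1 to T3, 3 replicates per cell. Donor is left out because its structure is not described in the thread. The column name `cell` and the helper `make_contrast` are made up for illustration. The resulting weight vectors are the kind of numeric contrast that DESeq2's `results()` accepts through its `contrast` argument, assuming the fitted coefficients line up with these columns.

```r
# Hypothetical sample table: group x treatment x time, 3 replicates per cell.
coldata <- expand.grid(
  replicate = 1:3,
  group     = c("A", "B"),
  treatment = c("control", "infected"),
  time      = c("T1", "T2", "T3")
)

# Combine the three factors into one, e.g. "A_infected_T1".
coldata$cell <- factor(paste(coldata$group, coldata$treatment, coldata$time, sep = "_"))

# Cell-means design: one column per combination, no intercept.
mm <- model.matrix(~ 0 + cell, data = coldata)
colnames(mm) <- levels(coldata$cell)

# Hypothetical helper: average of the 'plus' cells minus average of the 'minus' cells.
make_contrast <- function(plus, minus, cells = colnames(mm)) {
  stopifnot(all(c(plus, minus) %in% cells))
  w <- setNames(numeric(length(cells)), cells)
  w[plus]  <- 1 / length(plus)
  w[minus] <- -1 / length(minus)
  w
}

# Time-specific question: infected vs control in group A at T1.
inf_vs_ctrl_A_T1 <- make_contrast("A_infected_T1", "A_control_T1")

# Average infected vs control difference in group A across all time points.
inf_vs_ctrl_A_avg <- make_contrast(
  paste0("A_infected_T", 1:3),
  paste0("A_control_T", 1:3)
)

print(round(inf_vs_ctrl_A_avg, 3))
```

Whether an averaged contrast like the second one is the right way to express a "main effect" here, and how Donor should enter the design, are exactly the kind of questions worth taking to a statistician.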