Hello,
I am new to time-series and multifactorial designs. I have read the forum discussions and the DESeq2 vignette sections that deal with multifactorial designs, and I have built my model. However, I have a few questions that would help me understand and interpret the output.
My data contain four time points (T0, T8, T16, T24), two genotypes (m3, p5) and three conditions (control, susceptible and resilient), as shown below:
samples time genotype condition
T0_1 T0 m3 control
T0_2 T0 m3 control
T0_3 T0 m3 control
T0_1 T0 m3 resilient
T0_2 T0 m3 resilient
T0_3 T0 m3 resilient
T0_1 T0 m3 susceptible
T0_2 T0 m3 susceptible
T0_3 T0 m3 susceptible
T0_1 T0 p5 control
T0_2 T0 p5 control
T0_3 T0 p5 control
T0_1 T0 p5 resilient
T0_2 T0 p5 resilient
T0_3 T0 p5 resilient
T0_1 T0 p5 susceptible
T0_2 T0 p5 susceptible
T0_3 T0 p5 susceptible
T8_1 T8 m3 control
T8_2 T8 m3 control
T8_3 T8 m3 control
T8_1 T8 m3 resilient
T8_2 T8 m3 resilient
T8_3 T8 m3 resilient
T8_1 T8 m3 susceptible
T8_2 T8 m3 susceptible
T8_3 T8 m3 susceptible
T8_1 T8 p5 control
T8_2 T8 p5 control
T8_3 T8 p5 control
T8_1 T8 p5 resilient
T8_2 T8 p5 resilient
T8_3 T8 p5 resilient
T8_1 T8 p5 susceptible
T8_2 T8 p5 susceptible
T8_3 T8 p5 susceptible
T16_1 T16 m3 control
T16_2 T16 m3 control
T16_3 T16 m3 control
T16_1 T16 m3 resilient
T16_2 T16 m3 resilient
T16_3 T16 m3 resilient
T16_1 T16 m3 susceptible
T16_2 T16 m3 susceptible
T16_3 T16 m3 susceptible
T16_1 T16 p5 control
T16_2 T16 p5 control
T16_3 T16 p5 control
T16_1 T16 p5 resilient
T16_2 T16 p5 resilient
T16_3 T16 p5 resilient
T16_1 T16 p5 susceptible
T16_2 T16 p5 susceptible
T16_3 T16 p5 susceptible
T24_1 T24 m3 control
T24_2 T24 m3 control
T24_3 T24 m3 control
T24_1 T24 m3 resilient
T24_2 T24 m3 resilient
T24_3 T24 m3 resilient
T24_1 T24 m3 susceptible
T24_2 T24 m3 susceptible
T24_3 T24 m3 susceptible
T24_1 T24 p5 control
T24_2 T24 p5 control
T24_3 T24 p5 control
T24_1 T24 p5 resilient
T24_2 T24 p5 resilient
T24_3 T24 p5 resilient
T24_1 T24 p5 susceptible
T24_2 T24 p5 susceptible
T24_3 T24 p5 susceptible
I want to see the effect of condition on genotype over the four time points. For this, I built the following full model:
library(DESeq2)
readcounts <- read.csv("data-2022-10-20_2.csv", row.names = 1)
metadata <- read.csv("metadata_1.csv", row.names = 1)
dds <- DESeqDataSetFromMatrix(countData = readcounts, colData = metadata, design = ~ time + genotype + condition + time:condition)
And the reduced model, fitted with the likelihood ratio test:
dds_reduced <- DESeq(dds, test="LRT", reduced = ~ time + genotype + condition)
My first question: is my model correct?
Next, when I run the following to list the coefficient names:
resultsNames(dds_reduced)
I get the following combinations:
[1] "Intercept" "time_T16_vs_T8" "time_T24_vs_T8"
[4] "time_T0_vs_T8" "genotype_p5_vs_m3" "condition_resilient_vs_control"
[7] "condition_susceptible._vs_control" "timeT16.conditionresilient" "timeT24.conditionresilient"
[10] "timeT0.conditionresilient" "timeT16.conditionsusceptible." "timeT24.conditionsusceptible."
[13] "timeT0.conditionsusceptible."
So, my second question: why do I get only these 12 combinations (plus the intercept)? What about other possible combinations, for example "time_T24_vs_T0", "time_T24_vs_T16", "timeT0.conditioncontrol" and "timeT8.conditionresilient", which are not listed?
I would like to know whether my model or script is not working properly, or whether this is simply how a time-series multifactorial design works. Is there any logic behind getting this particular set of combinations?
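For reference, here is a minimal sketch in base R (no DESeq2 needed) of how I tried to reproduce the list. The `meta` table is a hypothetical reconstruction of my metadata, and the factor level order (T8, m3 and control first, i.e. as reference levels) is an assumption chosen to match the output above. If I read it correctly, the design formula expands into 13 treatment-coded columns, and a comparison such as T24 vs T0 in control would be a difference of two of those columns rather than a column of its own, but please correct me if this is wrong.

```r
# Hypothetical reconstruction of the sample table above:
# 4 times x 2 genotypes x 3 conditions x 3 replicates.
# The level order (first level = reference) is an assumption
# chosen to match the resultsNames() output.
meta <- expand.grid(
  replicate = 1:3,
  condition = c("control", "resilient", "susceptible"),
  genotype  = c("m3", "p5"),
  time      = c("T8", "T16", "T24", "T0")
)

# Same formula as the full model; base R builds the
# treatment-coded design matrix from it.
mm <- model.matrix(~ time + genotype + condition + time:condition, data = meta)
print(colnames(mm))
# 1 intercept + 3 time + 1 genotype + 2 condition + 6 interaction columns
print(ncol(mm))

# A comparison with no column of its own, e.g. T24 vs T0 in the
# reference condition (control), written as a difference of two columns.
t24_vs_t0 <- setNames(numeric(ncol(mm)), colnames(mm))
t24_vs_t0["timeT24"] <- 1
t24_vs_t0["timeT0"]  <- -1
print(t24_vs_t0[t24_vs_t0 != 0])
```

The column names here use ":" where resultsNames() shows ".", which I assume is only a renaming.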
Many thanks in advance!