Hi everyone,
I am trying to analyse some RNA-Seq data in DESeq2, but am struggling a little conceptually in understanding how exactly to investigate what genotype specific changes I have.
My data consists of 2 genotypes, 3 timepoints and 2 treatments (treated/untreated) - 48 samples in total (4 replicates per). Therefore, in the sample information file, I have formatted it like so:
Sample Genotype Timepoint Treatment. 1 Hi 18 Drug 2 Hi 18 DMSO 3 Lo 18 Drug 4 Lo 18 DMSO And so on, for timepoints 2, 6, and 18h.
First, I compared the effect of the treatment for each genotype at each timepoint:
dds <- DESeqDataSetFromMatrix(countData = countdata, colData = coldata, design = ~Genotype+ Timepoint + Treatment)
dds$group <- factor(paste0(dds$Genotype, dds$Timepoint, dds$Treatment))
design(dds) <- ~group
dds <- DESeq(dds)
A_Genotype1_18h_TreatedvsWT<- lfcShrink(dds, contrast=c("group", "HiT18NMTi", "HiT18DMSO"), type="ashr")
I am confident that this gives the correct comparisons, as I am just comparing two groups made of combinations of the variables.
However, I now want to see which genes are differentially affected by the drug treatment based on genotype (Hi/Lo), at each timepoint. This is where my uncertainty comes in. I have read DESeq2 interaction design with factor containing multiple levels which has a similar issue, as well was ?results and the vignette, but I am still not sure how to do it using the current dds object/design above, so I was thinking of doing it as follows:
dds <- DESeqDataSetFromMatrix(countData = countdata, colData = coldata, design = ~Genotype+ Timepoint + Treatment)
dds$group <- factor(paste0(dds$Genotype, dds$Timepoint))
design(dds) <- ~group + Treatment + group:Treatment
dds <- DESeq(dds)
18h_comparison <- results(dds, contrast=list("Hi18:Drug", "Lo18:Drug")
6h_comparison <- results(dds, contrast=list("Hi6:Drug", "Lo6:Drug")
2h_comparison <- results(dds, contrast=list("Hi2:Drug", "Lo2:Drug")
Would this work?