I have an RNA-Seq dataset with 3 variables and 45 samples:
muscle (A or B)
genotype (WT or TG)
region (1 or 2)
So I want to compare e.g. WT/A/1 against WT/A/2, but also all other possible comparisons.
So far I've used the following design:
dds <- DESeqDataSetFromMatrix(counttable, colData = coldata, design= ~ genotype + muscle + region)
dds <- dds[rowSums(counts(dds)) > 1, ] # Removes rows with zero reads in all samples
dds <- DESeq(dds) # Runs DESeq2
dds2 <- dds # Copy dds to dds2
dds2$group <- factor(paste0(dds2$genotype, dds2$muscle, dds2$region))
design(dds2) <- ~ group
dds2 <- DESeq(dds2)
resultsNames(dds2) # Here I get all the possible groups and then I could finally contrast them
results_WT/A/1_vs_WT/A/2 <- results(dds2, contrast=c("group", "WT/A/1", "WT/A/2"))
However, if I do the whole analysis with only one set of conditions I'd liek to compare, my differential expression list will look quite a lot different. e.g. I only use featureCounts on those .bam files that correspond to WT/A for region 1 and 2, so I can easily contrast WT/A/1 vs. WT/A/2. Just the result is quite different from using the design I showed above.
Am I doing this correctly in general, and which method should I use to correctly interpret my RNA-Seq dataset specifically (~ group design or doing the comparisons one by one with a simple design e.g. design = ~ region ... and only using the files for one condition).
Thanks.
Can you give more description on what region?
If you consider only Muscle type and genotype this design will suite good design=~ genotype+ muscle +muscle:genotype
You will have the expression across the genotypes in different muscle groups
The region is the most important part, as it is laser-capture microdissected muscle tissue.
It would be ideal to be able to compare the two different types of muscle at the region of interest vs. the control region. The genotype would be maybe a little secondary for now, but still I want to see whether the genotype brings changes in the region of interest and between the muscles.