I am analysing an experiment where three mice of one genotype and 5 of another have been treated with a drug, and samples taken before and after treatment. This is thus a "between and within subject" design, with the design ~genotype + genotype:mouse + genotype:treatment
i think.
We wish to ask: What is the effect of the transgene? What is the effect of the treatment? Is the effect of the treatment different between the two genotypes?
The PCA suggests PC1 is the identity of the mouse the sample came from, PC2 is the effect of the treatment. None of the first 5 PCs correspond to genotype.
My coldata (design
) has columns sample_name
, mouse
, genotype
, treatment
.
I have created the model matrix as follows:
design$mouse <- factor(design$mouse)
design$mouse.n <- factor(c(rep(c(1,2,3), times=2), rep(c(1,2,3,4,5), times = 2)))
design$genotype <- relevel(design$genotype, "WT")
design$treatment <- relevel(design$treatment, "minus")
mm <- model.matrix( ~ genotype + genotype:mouse.n + genotype:treatment, data = design)
# Deal with the fact that there is no mouse 4 or 5 in the transgenic condition.
all_zero <- apply(mm, 2, function(x) all(x==0))
mm <- mm[, !all_zero]
I then executed the DESeq analysis so:
dds <- DESeqDataSetFromTximport(txi, colData=design, design=~genotype + treatment)
design(dds) <- ~ genotype + genotype:mouse + genotype:treatment
dds_processed <- DESeq(dds, full = mm)
Extracting the effects of genotype is easy(?)
genotype <- results(dds_processed, name = "genotypeTg")
And the interaction term:
diff_in_diffs <- results(dds_processed, contrast = list("genotypeTg:treatmentplus", "genotypeWT:treatmentplus"))
But what about the effect of treatment, accounting for per mouse and per genotype variance?
I think that worked out that it should be:
treatment_wt <- results(dds_processed, name = "genotypeTg:treatmentplus")
treatment_tg <- results(dds_processed, name = "genotypeWT:treatmentplus")
treatment_all <- results(dds_processed, contrast = list(c("genotypeTg:treatmentplus", "genotypeWT:treatmentplus")), listValues = c(0.5, -1))
Is this right? Or the best way to do it? Since I'm looking for effects that don't (necessarily) differ between genotypes, would I be better re-analysing without the genotype in the design?
dds_treatment <- DESeqDataSetFromTximport(txi, colData=design, design=~mouse+ treatment)
dds_treatment_processed(dds_treatment)
treatment_all <- results(dds_treatment_processed, name = "treatment_plus_vs_minus")
Are these equivalent?
Thanks for that - I think I follow what you are saying about extracting the genotype main effect. I'm less sure about extracting the average treatment effect - wouldn't extracting this from
~genotype + treatment
mean you weren't controlling for mouse specific effects (which I think are pretty large)? Isn't the average treatment effect, effectively, like a paired t-test, hence why I wondered about extracting it from~ mouse + treatment
. Is there any particular reason that I shouldn't be using an entirely different design for each question? I can't think what it would be, but it feels wrong.Yes, you're right, you are better off with
~mouse + treatment
for the average treatment effect, I'll edit my post.