I have 3 donors, and each donor has 2 samples (KO and WT). The matrix is:
Donor_ID Condition
1 WT
1 KO
2 WT
2 KO
3 WT
3 KO
I am interested in looking at differentially expressed genes (DEG) between KO and WT while correcting for donor variability.
I have doubts about my model, especially regarding the intercept. Should it be:
dds <- DESeqDataSetFromMatrix(data, colData = meta, design = ~ Donor_ID + Condition)
res <- results(dds, name = "Condition_KO_vs_WT")
or
dds <- DESeqDataSetFromMatrix(data, colData = meta, design = ~ 0 + Donor_ID + Condition)
res <- results(dds, name = "ConditionKO")
because I get different results after using lfcShrink.
1) I looked at ExploreModelMatrix, but it didn't help me much. For instance, in the first model with an intercept, I don't see any mention of the coefficient Condition_KO_vs_WT (which I can also get from resultsNames(dds)). Instead, I see the coefficient ConditionKO, which is a bit confusing. Why is that?
2) If the first model represents the KO vs WT comparison, how can we interpret the second model without an intercept?
I have to restrict my time on the support site for software related questions.
For questions about statistical designs and analysis choices, I'd recommend consulting with a local statistician, or anyone familiar with linear models in R. DESeq2 uses the same linear modeling framework as basic linear models implemented in R, e.g. lm and model.matrix.
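To make the lm / model.matrix connection concrete, here is a minimal base R sketch of the two designs from the question. The data frame `meta` is rebuilt here from the sample table above as an assumption (Donor_ID as a factor, Condition with WT as the reference level); it is not the poster's actual object, and no DESeq2 call is involved.

```r
# Hypothetical sample table mirroring the question: 3 donors, each with WT and KO.
meta <- data.frame(
  Donor_ID  = factor(rep(1:3, each = 2)),
  Condition = factor(rep(c("WT", "KO"), times = 3), levels = c("WT", "KO"))
)

# Design with an intercept: the intercept corresponds to donor 1 in WT,
# Donor_ID2 / Donor_ID3 are donor offsets, ConditionKO is the KO vs WT effect.
mm_int <- model.matrix(~ Donor_ID + Condition, data = meta)
colnames(mm_int)
# "(Intercept)" "Donor_ID2" "Donor_ID3" "ConditionKO"

# Design without an intercept: one baseline (WT) column per donor,
# plus the same ConditionKO column.
mm_noint <- model.matrix(~ 0 + Donor_ID + Condition, data = meta)
colnames(mm_noint)
# "Donor_ID1" "Donor_ID2" "Donor_ID3" "ConditionKO"

# The KO indicator column is the same in both parameterizations.
same_ko_column <- identical(mm_int[, "ConditionKO"], mm_noint[, "ConditionKO"])

# Both matrices span the same column space (combined rank stays at 4),
# so they describe the same fitted model with different baseline terms.
combined_rank <- qr(cbind(mm_int, mm_noint))$rank
```

In both matrices the column that base R names ConditionKO encodes the KO vs WT difference within donor, which is why ExploreModelMatrix (which builds on model.matrix) shows that name. Judging from the resultsNames output quoted in the question, Condition_KO_vs_WT appears to be DESeq2's own label for the same column when the design has an intercept.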