Hello together, hello Michael,
I have an understanding problem using results(dds, ...)
I have RNAseq data and a grouping of two conditions with interaction of time with the defined reference levels, see code below. And run the code as below.
coldata$Cond <- factor(coldata$Cond, levels = c("A","B"))
coldata$Time <- factor(coldata$Time, levels = c("2","5","7"))
formula = ~ Cond + Cond:Time - 1
dds <- DESeqDataSetFromMatrix(countData = round(cts),
colData = coldata,
design = formula)
dds <- DESeq(dds)
> resultsNames(dds)
[1] "CondA" "CondB" "CondA.Time5" "CondA.Time7" "CondB.Time5" "CondB.Time7"
"CondA" contains implicitly Time2, as far as I understood. So from my expectation, calling "result(dds, name="CondA")" is calling actually the comparison of reference vs reference, which I expect to cause only 0 logFC. But I get the following head output with 0 << logFC (in this head section; p-values > 0.1 in other sections ).
log2 fold change (MLE): CondA
Wald test p-value: CondA
DataFrame with 12659 rows and 6 columns
baseMean log2FoldChange lfcSE stat pvalue padj
<numeric> <numeric> <numeric> <numeric> <numeric> <numeric>
gene1 1535.6225 11.16962 0.1156640 96.5696 0 0
gene2 7642.6626 12.91082 0.0487283 264.9556 0 0
gene3 30.0613 4.90618 0.0976447 50.2453 0 0
gene4 36.8394 5.45187 0.1199860 45.4375 0 0
gene5 1533.5904 10.99511 0.0618889 177.6589 0 0
Did I miss something? Is the reference level something, else, so there is a reasonable explanation to that?
At the end I want to compare for each Condition each time-point against each other, and for each time-point the two Conditions against each other
Thank you very much.
Hello James,
Thank you for answering
I haved used -1 in the design formula, so my understanding so far was that the comparison is always against reference (some in the groups), at least with contrast. Or is that different with "name"?
I am currently testing this Cond difference at Time5 with
But this gives me different results. How does it come?
No, putting -1 in the design matrix is a cell means model, which is the opposite. If you used
~ Cond + Cond:Time
, you would specify a factor effects model in which case there would be a baseline.The call to
results
that I suggested gives you the correct interaction. I don't know what your call toresults
does. I don't see anything in the help page or vignette that says you should use a list of two vectors that are each length two. But I don't really useDESeq2
primarily because I find the contrasts specification mystifying, so maybe it's perfectly fine? Hopefully Mike Love will be along in a bit to explain.Yes, I agree, the correct contrast setting is quite challenging. The vignette says
And ?results shows the combination of main effect and interaction term in one character vector. EDIT: Using the formula without -1, would give me an intercept (again reflects CondA at time2) and results(dds, name="Intercept") provides me the same results like for CondA, where reference is unclear.
When I apply the model.matrix approach for getting the contrasts, the code
gives me the same result as calling
So I conclude this is the correct call? Still, what is
results(dds, contrast = list("CondA.Time5","CondB.Time5"))
then providing me?