Hi everyone,
I've recently started using DESeq2 version 1.20. Since there were several changes, I decided to start my pipeline from scratch.
In my design, I have two variables (treatment + infection). To make things easier, I combined them into a single grouping variable:
levels(dds$group)
[1] "treated_Inf" "treated_NonInf" "untreated_Inf" "untreated_NonInf"
From what I understand of the pipeline, you can either:
- relevel the variable before running DESeq(), so that comparisons are made against the right reference level
dds$group <- relevel(dds$group, ref = "untreated_NonInf")
levels(dds$group)
[1] "untreated_NonInf" "treated_Inf" "treated_NonInf" "untreated_Inf"
- use the contrast argument of results() to compare levels that are not the reference.
res<-results(dm,contrast = c("group","untreated_Inf","untreated_NonInf"))
As far as I understand, these two approaches should produce the same (or at least very similar) results. This is what happens in most cases; however, some specific DE genes become non-DE. From what I understand, this shouldn't happen.
Shouldn't I get the same results? Or am I doing something wrong?
Thanks!
Thanks for the answer!
From what I understand, I need to change the reference level if I want to do pairwise comparisons among the groups and then proceed to lfcShrink(). However, with the recommended "apeglm" algorithm I can't use contrast, so I'll have to change the reference level to get the right coefs and then run DESeq() again, right?
Ah yes, you do have to change the reference level for apeglm; that’s a complication. You don’t have to change the reference level for ashr though, which is also very good at providing shrunken LFCs according to our testing.
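For reference, a minimal sketch of the ashr route with a contrast. The data are simulated with makeExampleDESeqDataSet() and the group labels are assigned arbitrarily, so they are only stand-ins for the design described in this thread. It also assumes the ashr package is installed.

```r
library(DESeq2)

# Simulated counts as a stand-in; the real dds from this thread is not shown.
set.seed(1)
dds <- makeExampleDESeqDataSet(n = 500, m = 12)

# Hypothetical four-level grouping mirroring the design in the question.
dds$group <- factor(rep(c("untreated_NonInf", "untreated_Inf",
                          "treated_NonInf", "treated_Inf"), each = 3))
design(dds) <- ~ group
dds <- DESeq(dds)

# A comparison in which neither level is the reference level.
ctr <- c("group", "treated_Inf", "untreated_Inf")
res <- results(dds, contrast = ctr)

# ashr works from the results table, so no releveling is required.
res_ashr <- lfcShrink(dds, contrast = ctr, res = res, type = "ashr")
head(res_ashr)
```

The shrunken table keeps the p-values from res; only the LFC estimates and their standard errors are replaced.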
So, do you think the approach I'm currently following, lfcShrink() with ashr, would yield similar results and is as correct as doing the LRT + changing the reference level + lfcShrink() with apeglm?
Yes, each of these approaches is fine. The LFC shrinkage is actually separate from the Wald vs LRT choice for p-values. The LRT is probably preferable if you want p-values here. Either LFC shrinkage method can be used; just for apeglm you’d have to change the reference level before running it if you want to compare against multiple different reference groups. After releveling you only need to re-run nbinomWaldTest(); you don’t need to re-estimate the dispersions.
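A rough sketch of the releveling route for apeglm, again on simulated stand-in data with hypothetical group labels, and assuming the apeglm package is installed. The coefficient name used below is what I would expect from the level names; check resultsNames(dds) on your own object before relying on it.

```r
library(DESeq2)

# Simulated counts as a stand-in; group labels are assigned arbitrarily.
set.seed(1)
dds <- makeExampleDESeqDataSet(n = 500, m = 12)
dds$group <- factor(rep(c("untreated_NonInf", "untreated_Inf",
                          "treated_NonInf", "treated_Inf"), each = 3))
design(dds) <- ~ group
dds <- DESeq(dds)

# Switch the reference level, then refit only the coefficients.
# Size factors and dispersions from the DESeq() call are reused.
dds$group <- relevel(dds$group, ref = "untreated_Inf")
dds <- nbinomWaldTest(dds)

# List the coefficients that are now available for apeglm.
resultsNames(dds)

# Assumed coefficient name; confirm it appears in resultsNames(dds).
res_apeglm <- lfcShrink(dds, coef = "group_treated_Inf_vs_untreated_Inf",
                        type = "apeglm")
head(res_apeglm)
```

Repeat the relevel() and nbinomWaldTest() steps for each additional reference group you want to compare against.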