Error when providing contrast coefficients to lfcShrink() of DESeq2 package
1
0
Entering edit mode
Nik Tuzov ▴ 90
@nik-tuzov-8783
Last seen 11 months ago
United States

Hello:

There seems to be a bug in lfcShrink() when using a single factor with more than two levels (three in this case). DESeq() creates a model matrix with 3 columns, while lfcShrink() quietly changes that to "expanded" format with four columns. When I try to provide contrast coefficients to lfcShrink(), it wants to have it both ways. First, it checks if there are exactly three coefficients, then it checks if there are exactly four, so there is no way it can work.

Is it necessary to use the "expanded" any longer? I understand that in older versions shrinkage was applied to all the regression coefficients individually, and the issue was that the reference level did not undergo shrinkage. Nowadays it seems that the shrinkage is applied to the linear combination of regression coefficients, so it should be no longer important what the reference level is (please correct me if I am wrong).

Regards, Nik

# Experiment with estimating an arbitrary contrast by providing L vector for L'beta.
countOfFeatures = 100
countOfSamples = 12
meanResponse = 100
# variance is mu + mu^2/size 
setSize = 2
set.seed(314)
cnts <- matrix(rnbinom(n = countOfFeatures * countOfSamples, mu = meanResponse, size = setSize), 
               ncol = countOfSamples)

# 1 factor, 3 levels
condition <- factor(rep(c("A", "B", "C"), each = 4))
# Design via formula
dds <- DESeqDataSetFromMatrix(cnts, DataFrame(condition), design = ~condition)
dds <- DESeq(dds)
resultsNames(dds)
# [1] "Intercept"        "condition_B_vs_A" "condition_C_vs_A"
# 
# So, the (0, 1, -1) contrast should correspond to "B vs C"
res.bc <- results(dds, contrast = c(0, 1, -1))

# Shrinkage:
res.shrunk <- lfcShrink(dds, contrast = c(0, 1, -1))
# This generates an error:

# resultsNames from the expanded model matrix to be referenced by 'contrast':
#   'Intercept', 'conditionA', 'conditionB', 'conditionC'
# Error in checkContrast(contrast, resNames) : 
#   numeric contrast vector should have one element for every element of 'resultsNames(object)'
# 
# Ok, let's specify 4 coefficients:
res.shrunk <- lfcShrink(dds, contrast = c(0, 0, 1, -1))
# Another error:
# Error in checkContrast(contrast, resNames) : 
# numeric contrast vector should have one element for every element of 'resultsNames(object)'
deseq2 • 1.0k views
ADD COMMENT
2
Entering edit mode
@mikelove
Last seen 1 day ago
United States

Thanks for the report, I can reproduce on my end and I've pushed a fix to the devel branch:

> res.shrunk <- lfcShrink(dds, contrast = c(0, 1, -1))
Error in lfcShrink(dds, contrast = c(0, 1, -1)) : 
  for type='normal' and numeric contrast, user must provide 'res' object

This then prompts the following:

> res.shrunk <- lfcShrink(dds, res=res.bc, contrast = c(0, 0, 1, -1))
using 'normal' for LFC shrinkage, the Normal prior from Love et al (2014).

Note that type='apeglm' and type='ashr' have shown to have less bias than type='normal'.
See ?lfcShrink for more details on shrinkage type, and the DESeq2 vignette.
Reference: https://doi.org/10.1093/bioinformatics/bty895
resultsNames from the expanded model matrix to be referenced by 'contrast':
'Intercept', 'conditionA', 'conditionB', 'conditionC'

Note that for the past two releases, lfcShrink also prints message above, about the other methods being a better choice. I haven't yet changed the default value of type from "normal" to "apeglm", but apeglm and ashr both outperform the 2014 estimator as shown in that paper. I will likely change the default value for type this cycle or next, as it's been out there and tested by the community now (and the documentation and workflows all use apeglm for the past year now).

ADD COMMENT
0
Entering edit mode

Thanks for replying. Was I right when I suggested that the "expanded matrix" is now redundant?

ADD REPLY
1
Entering edit mode

No, expanded model matrices are used with “normal” with the contrast argument. This is in the Details of the man page. The EMM had the advantage of symmetry (unlike, for example ridge or lasso regression), but at the cost of interpretability and as I appreciated over time, a potential for FP on interaction terms (which is why I changed so 2014-style shrinkage wouldn’t run on designs with interaction terms). The other approaches are much easier for interpretation and work with all designs.

ADD REPLY

Login before adding your answer.

Traffic: 523 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6