Hello:
There seems to be a bug in lfcShrink() when using a single factor with more than two levels (three in this case). DESeq() creates a model matrix with 3 columns, while lfcShrink() quietly changes that to "expanded" format with four columns. When I try to provide contrast coefficients to lfcShrink(), it wants to have it both ways. First, it checks if there are exactly three coefficients, then it checks if there are exactly four, so there is no way it can work.
Is it necessary to use the "expanded" any longer? I understand that in older versions shrinkage was applied to all the regression coefficients individually, and the issue was that the reference level did not undergo shrinkage. Nowadays it seems that the shrinkage is applied to the linear combination of regression coefficients, so it should be no longer important what the reference level is (please correct me if I am wrong).
Regards, Nik
# Experiment with estimating an arbitrary contrast by providing L vector for L'beta.
countOfFeatures = 100
countOfSamples = 12
meanResponse = 100
# variance is mu + mu^2/size
setSize = 2
set.seed(314)
cnts <- matrix(rnbinom(n = countOfFeatures * countOfSamples, mu = meanResponse, size = setSize),
ncol = countOfSamples)
# 1 factor, 3 levels
condition <- factor(rep(c("A", "B", "C"), each = 4))
# Design via formula
dds <- DESeqDataSetFromMatrix(cnts, DataFrame(condition), design = ~condition)
dds <- DESeq(dds)
resultsNames(dds)
# [1] "Intercept" "condition_B_vs_A" "condition_C_vs_A"
#
# So, the (0, 1, -1) contrast should correspond to "B vs C"
res.bc <- results(dds, contrast = c(0, 1, -1))
# Shrinkage:
res.shrunk <- lfcShrink(dds, contrast = c(0, 1, -1))
# This generates an error:
# resultsNames from the expanded model matrix to be referenced by 'contrast':
# 'Intercept', 'conditionA', 'conditionB', 'conditionC'
# Error in checkContrast(contrast, resNames) :
# numeric contrast vector should have one element for every element of 'resultsNames(object)'
#
# Ok, let's specify 4 coefficients:
res.shrunk <- lfcShrink(dds, contrast = c(0, 0, 1, -1))
# Another error:
# Error in checkContrast(contrast, resNames) :
# numeric contrast vector should have one element for every element of 'resultsNames(object)'
Thanks for replying. Was I right when I suggested that the "expanded matrix" is now redundant?
No, expanded model matrices are used with “normal” with the contrast argument. This is in the Details of the man page. The EMM had the advantage of symmetry (unlike, for example ridge or lasso regression), but at the cost of interpretability and as I appreciated over time, a potential for FP on interaction terms (which is why I changed so 2014-style shrinkage wouldn’t run on designs with interaction terms). The other approaches are much easier for interpretation and work with all designs.