I would like to calculate shrunken log fold changes from RNA-seq data, and I recently ran in to the same case that was described in this old post by Mike Love: DESeq2 and shrinkage of log2 fold changes
One case where I would not use it, is if it is expected that nearly all genes will have no change and there is little to no variation across replicates (so near technical replication), and then say < 10 genes with very large fold changes. This scenario could occur in non-biological samples, for example technical replicates plus DE spike ins. The reason this would cause a problem is that the prior is formed according to a high percentile of the large fold changes, but it could miss if there were singular DE genes, and form a prior which is not wide enough to accommodate very large fold changes. It is trivial to turn off the prior in this case (betaPrior=FALSE).
In my case I actually do have biology where for some contrasts only 1 or 2 genes change significanctly, and the built-in shrinkage methods indeed sometimes overshrink these to something close to zero.
I have been using the very hacky shrunkLFC = LFC * (1-pval)
which at least takes things in approximately the right direction (and maybe with sufficiently vigorous hand waving can be justified with a probabilistic interpretation?). But if anyone has more rigorous suggestions, I would love to hear about them.
Thanks!
This seems to work pretty well, thanks. I found that
prior.scale=0.5
orprior_scale=0.25
gave reasonable-looking results, at least with the datasets I've tried so far.I'm dealing with a similar situation. I assume the coefficients related to batch effect would need to be included in idx also, is this correct? Thanks for all your work.
Edit: Question answered by myself. I looked at the DESeq2 code and found that with default parameters only the coef of interest is shrunk, not batch effects. This might've been obvious to others..
For an approach that still calculates prior based on mle, one can pass through multiplier, which multiplies the resulting prior.scale. Obviously this cannot be recommended without having a very good reason to deviate from the published protocol.