Question

New function lfcShrink() in DESeq2

18

Entering edit mode

Michael Love 43k

@mikelove

Last seen 4 hours ago

United States

I've received a couple questions from users about why I moved the shrinkage of log2 fold changes from the DESeq() to the lfcShrink() function. Here's my answer I give in the vignette:

"In version 1.16, the log2 fold change shrinkage is no longer default for the DESeq and nbinomWaldTest functions, by setting the defaults of these to betaPrior=FALSE, and by introducing a separate function lfcShrink, which performs log2 fold change shrinkage for visualization and ranking of genes. While for the majority of bulk RNA-seq experiments, the LFC shrinkage did not affect statistical testing, DESeq2 has become used as an inference engine by a wider community, and certain sequencing datasets show better performance with the testing separated from the use of the LFC prior. Also, the separation of LFC shrinkage to a separate function lfcShrink allows for easier methods development of alternative effect size estimators."

As shown in the new vignette, the following code produces moderated log2 fold changes:

dds <- DESeq(dds)
resultsNames(dds)
res <- lfcShrink(dds, coef=2)
plotMA(res)

Note: If you have version 1.16 (no longer release), you need to include res=res

res <- results(dds)
res <- lfcShrink(dds, coef=2, res=res)

The last reason given above was a big motivation, we have some nice new estimators with better performance than the one we proposed in the 2014 paper, and this new function will allow us to have finer control of these without cluttering the DESeq() function arguments.

The 2014 shrinkage estimator for fold change worked well for most bulk RNA-seq datasets, but not for all of the kinds of datasets in which DESeq2 is being used. In particular, one kind of dataset where combining shrinkage with testing was not good were those in which there were singular very large fold changes (e.g. abs(LFC) > 6), despite all of the other log2 fold changes being very close to 0. Here the shrinkage was too great. This case looks much better with our new estimators. I think the 2014 shrinkage estimator applied to zero inflated count data, without any adjustment of the estimator to account for the zero component, would likely produce too much shrinkage as well.

deseq2 • 27k views

ADD COMMENT • link updated 5.3 years ago by Micheal Stark • 0 • written 7.6 years ago by Michael Love 43k

0

Entering edit mode

Thanks for the clarification re: new suggested workflow. I'm unsure as to why there's a difference between calling lfcshrink() using coef=2 or contrast for an A vs B comparison. I would have thought that both res_A and res_B below would return exactly the same log2FoldChange. The difference is very small but I'm wondering why is this?

set.seed(101)
dds <- makeExampleDESeqDataSet(betaSD=1)
dds <- DESeq(dds, betaPrior=FALSE)
res <- results(dds)
res_A <- lfcShrink(dds=dds, coef=2, res=res)
res_B <- lfcShrink(dds=dds, contrast=c("condition","B","A"), res=res)

ADD REPLY • link 7.2 years ago jjvalletta • 0

0

Entering edit mode

Here's a post where I explain the difference. I wanted to provide 'contrast' to allow users to recreate previous LFCs exactly, although moving forward, we will have more features for the 'coef' argument:

A: DESeq2 lfcShrink() usage of coef vs. usage of contrast

ADD REPLY • link 7.2 years ago Michael Love 43k

score 1 · Answer 1 · 2017-05-23

1

Entering edit mode

CodeAway ▴ 70

@codeaway-12991

Last seen 4.1 years ago

Thank you very much, Michael, for the clarification, as I was looking for this answer and was getting confused about the removal of the shrinkage feature by default from DESeq2.

So, would you say that for regular bulk RNAseq, we should use this shrinkage feature by default as usual like in the older DESeq2?

If yes, then can I simply get that old DESeq2 effect by just explicitly setting "betaPrior=TRUE" in the DESeq2 command?

It will be great if you could please clarify.

Many thanks.

ADD COMMENT • link 7.5 years ago CodeAway ▴ 70

1

Entering edit mode

You can get the shrunken LFC either with lfcShrink like above or with betaPrior=TRUE. It will be the same shrunken LFC and the same as previously calculated in DESeq2. The difference is that betaPrior=TRUE will give you a p-value for the shrunken LFC, while lfcShrink (at the moment) is only giving you the LFC, and is keeping the p-value for the test of the MLE LFC. (Note that usually these are very similar p-values.) Eventually lfcShrink may also provide information in addition to the shrunken log2 fold change. Let me know if that makes it clear.

ADD REPLY • link 7.5 years ago Michael Love 43k

1

Entering edit mode

Thanks, Michael. Yes, that makes it clear. So, in short, for regular bulk RNAseq data, would you recommend using the shrunken LFCs vs not using it? As far as I understand, the main reason you decided to change the default in DESeq2, is for types of data other than bulk RNAseq, correct?

Thanks a lot for your very quick response!

ADD REPLY • link 7.5 years ago CodeAway ▴ 70

1

Entering edit mode

Yes, for bulk RNA-seq I'd recommend using the shrunken LFC. And using the code above will make your workflow more compatible with future developments, as you'll just be able to swap in new and better estimators as we introduce them.

ADD REPLY • link 7.5 years ago Michael Love 43k

score 0 · Answer 2 · 2017-05-08

0

Entering edit mode

Federico Marini ▴ 180

@federico-marini-6465

Last seen 3 months ago

Germany

As one of the question-askers, thanks for the clarification!
I guess the new shrinkage estimators will be added in the devel branch over time?

ADD COMMENT • link 7.6 years ago Federico Marini ▴ 180

0

Entering edit mode

Yes, hopefully this devel cycle.

ADD REPLY • link 7.6 years ago Michael Love 43k

Michael Love · Answer 3 · 2017-05-23

0

Entering edit mode

qimaera • 0

@qimaera-13100

Last seen 7.5 years ago

caveat: I am new to Bioconductor and have been going through the workshop content and using the workflows to get familiar with it. So, apologies if the answers to my questions are actually staring me in the face and I cannot see them.

While running the RNA-seq Gene workflows R code (last build - April 27, 2017), I get the following error message:

" Error in lfcShrink(dds, contrast = c("dex", "trt", "untrt"), res = res) :
could not find function "lfcShrink"

when I run the

res <- lfcShrink(dds, contrast=c("dex","trt","untrt"), res=res)

code fragment.

I checked on the github site for the function and found one in helper.R, but when I use that function, I get the following error message:

Error in plotMA(res, ylim = c(-5, 5)) :
'x' must be a data frame with columns named 'baseMean', 'log2FoldChange'.

Any advice on dealing with this?

Many thanks.

ADD COMMENT • link updated 7.5 years ago by Michael Love 43k • written 7.5 years ago by qimaera • 0

0

Entering edit mode

I'd guess that you are using an older version of DESeq2. This is for versions >= 1.16. Check your sessionInfo()

ADD REPLY • link 7.5 years ago Michael Love 43k

0

Entering edit mode

Hi Michael,

I have the same problem.

resLFC <- lfcShrink(dds, coef=2, res=res)

from the documentation page gives the same error 'could not find function "lfcShrink".'

I have DESeq2_1.10.1. Any advice?

@qimaera Did you solve this issue?

Thanks!

EDIT: Sorry, I read the version incorrect. Of course it does not work because I have 1.10.

ADD REPLY • link 7.4 years ago abhinrl • 0