The final step of DESeq2 dispersion estimation takes a very long time to run on a dataset with 27 groups. Is there a good strategy for speeding it up?
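One generic way to cut the wall-clock time, independent of any modeling change, is to spread the DESeq2 fit across cores with BiocParallel. A minimal sketch, assuming `counts` is a gene-by-sample matrix and `coldata$group` holds the 27 group labels (both hypothetical names), and adjusting `workers` to the machine:

```r
library(DESeq2)
library(BiocParallel)

# Hypothetical inputs: `counts` (genes x samples), `coldata$group` (27 levels).
dds <- DESeqDataSetFromMatrix(countData = counts,
                              colData   = coldata,
                              design    = ~ group)

# Run the whole pipeline (size factors, dispersions, GLM fit) in parallel.
dds <- DESeq(dds, parallel = TRUE, BPPARAM = MulticoreParam(workers = 4))
```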
Thanks for the suggestions.
The issue is that I have 27 groups with only 3 replicates each. Because the number of replicates is low, I would prefer a count-based method for hypothesis testing. The data are also quite noisy, so I would like to be able to run the analysis repeatedly with different normalization strategies, which is why speed is important.
Is there a way to use a less complex model for dispersion estimation (treating some samples as replicates) while still obtaining the coefficients for the full model? The current workflow doesn't seem to allow this.
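For what it's worth, a pattern along these lines can be sketched with DESeq2's accessor functions: estimate dispersions under a coarser design, copy them into the full-design object, and then fit only the GLM coefficients. Here `dds_coarse` and `dds_full` are hypothetical objects holding the same counts with the simplified and full designs respectively; whether reusing dispersions this way is statistically appropriate for a given dataset is a separate question:

```r
library(DESeq2)

# Hypothetical objects: `dds_full` has design ~ group (27 levels);
# `dds_coarse` is the same data with a coarser grouping in its design.
dds_coarse <- estimateSizeFactors(dds_coarse)
dds_coarse <- estimateDispersions(dds_coarse)

dds_full <- estimateSizeFactors(dds_full)
dispersions(dds_full) <- dispersions(dds_coarse)  # reuse the cheaper estimates

# Fit the full-design coefficients and Wald tests with those dispersions.
dds_full <- nbinomWaldTest(dds_full)
```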
That's 81 samples total, which is plenty for voom to work quite well. Even though each group has only 3 replicates, keep in mind that DESeq2, edgeR, and limma-voom all estimate a single dispersion or variance parameter for each gene shared across all groups, so no matter which one you use, you are estimating the dispersion/variance from all 81 samples, giving you a quite robust estimate.
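A minimal limma-voom pipeline for this design could look like the following sketch, where `counts` (genes x 81 samples) and `group` (a 27-level factor) stand in for the poster's data:

```r
library(limma)
library(edgeR)

# Hypothetical inputs: `counts` (genes x 81 samples), `group` (factor, 27 levels).
dge <- DGEList(counts = counts, group = group)
dge <- calcNormFactors(dge)          # TMM normalization

design <- model.matrix(~ 0 + group)  # one coefficient per group
v   <- voom(dge, design)             # precision weights from mean-variance trend
fit <- lmFit(v, design)
fit <- eBayes(fit)
```

Contrasts between any pair of the 27 groups can then be tested with `contrasts.fit` and `topTable`.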
Thanks, voom does work well and is fast. I was wondering how I might combine it with normalization for technical covariates such as GC bias and length bias. I have previously used the `glm.offset` matrix generated by cqn for this.
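One hedged option, following the cqn vignette rather than voom itself, is to take cqn's GC- and length-corrected log2 expression values (`y + offset`) straight into limma with the trend option, since voom expects raw counts rather than offset-corrected values. A sketch, where `counts`, per-gene `gc` and `length_bp` vectors, and `group` are assumed inputs:

```r
library(cqn)
library(limma)

# Hypothetical inputs: `counts`, per-gene `gc` content and `length_bp`, `group`.
fit.cqn <- cqn(counts, x = gc, lengths = length_bp)
logexpr <- fit.cqn$y + fit.cqn$offset   # GC/length-corrected log2 values

design <- model.matrix(~ 0 + group)
fit <- lmFit(logexpr, design)
fit <- eBayes(fit, trend = TRUE)        # limma-trend in place of voom weights
```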
If you use voom and have further questions, I'd recommend starting a new post with new tags. Posts are emailed to authors based on their tags, so if you tag the post with limma, you will get responses from the package authors.