Hi, is it possible to optimise (speed up) the dispersion estimation step in a DEXSeq (1.34.0) analysis when working with a large number of samples? I noticed that roughly 80-90% of the analysis time is spent on dispersion estimation, which in my last run with 64 samples (32 per group) took ~8h even though I parallelised the analysis across 18 workers/CPUs. Can I do something about that by changing some of the available estimateDispersions() parameters (fitType, maxit, niter), or in some other way, ideally without a large effect on the quality of the results?
I tried to install DEXSeqAlt but got the message: package ‘DEXSeqAlt’ is not available (for R version 4.0.0)
Thanks, Aleksandar
Constantin Ahlmann-Eltze has greatly sped up dispersion estimation (again), this time in a separate package, and we are working this into DESeq2 as a switch. DEXSeq uses DESeq2 for dispersion estimation, so this will be available within this devel cycle.
However, one way to speed up analysis is to filter out unexpressed genes. Can you post your code so the DEXSeq developers can review?
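For what it's worth, a rough sketch of such a filter might look like the code below. The helper name `keep_expressed_bins` and the cutoff of 10 reads per gene are made up for illustration and are not DEXSeq defaults. The helper itself only needs base R; the commented usage assumes you already have a DEXSeqDataSet called `dxd`.

```r
# Hypothetical helper (not part of DEXSeq): flag exon bins that belong to
# genes whose total read count across all bins and samples reaches min_total.
# The default of 10 is an arbitrary illustration, not a recommended value.
keep_expressed_bins <- function(counts, gene_ids, min_total = 10) {
  # Sum the bin-level counts within each gene (result: genes x samples)
  gene_totals <- rowsum(counts, group = gene_ids)
  # Names of genes that pass the threshold
  expressed <- rownames(gene_totals)[rowSums(gene_totals) >= min_total]
  # One logical per bin, in the original row order
  gene_ids %in% expressed
}

# Toy data: gene g1 has 15 reads in total, gene g2 has only 1
toy_counts <- matrix(c(5, 3, 0, 0,
                       5, 2, 1, 0), ncol = 2)
toy_genes <- c("g1", "g1", "g2", "g2")
keep <- keep_expressed_bins(toy_counts, toy_genes)

# Possible use on a DEXSeqDataSet (assumes `dxd` exists; left commented out):
# library(DEXSeq)
# keep <- keep_expressed_bins(featureCounts(dxd), geneIDs(dxd))
# dxd <- dxd[keep, ]
# dxd <- estimateSizeFactors(dxd)
# dxd <- estimateDispersions(dxd, BPPARAM = BiocParallel::MulticoreParam(18))
```

Dropping the bins before estimateDispersions() means fewer features to fit, which is where the time saving would come from. How strict the cutoff should be depends on your data.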
Thanks Michael, so the problem will be solved (at least partly) with the new release(s), that's great news. I will post my code and ask for some starting directions on DEXSeq filtering in a separate question with the dexseq tag.