Hello,
"y = estimateDisp(y, design, robust = TRUE)" seems to run forever for me, even though the estimateGLM trio runs just fine (and I am able to finish the DE analysis with it, using glmFit and glmLRT).
Therefore, is it safe to replace "y = estimateDisp(y, design, robust = TRUE)" with the estimateGLM trio when using glmQLFit and glmQLFTest?
Cheers,
Daniel
> That's strange, estimateDisp should be faster than the trio.
In my case it is the other way around: estimateDisp runs for ~30 minutes, while the estimateGLM trio runs in less than 2 minutes.
> it's odd that one would run forever and the others would be faster
I agree.
> Are you using the latest version of R/Bioconductor/edgeR?
I use edgeR_3.12.0
> Make sure that there are no libraries with zero library sizes/non-finite offsets.
I checked and this is not the case.
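A check of this kind can be sketched in base R as below. This is only a minimal illustration on a made-up count matrix: the names `counts`, `lib.size`, `offset`, `zero.libs` and `bad.offsets` are hypothetical, and the offsets are assumed to be plain log library sizes.

```r
# Toy count matrix standing in for the real data (hypothetical values);
# the second library is deliberately empty.
counts <- matrix(c(10, 0, 5,
                    3, 0, 8,
                    7, 0, 2),
                 nrow = 3, byrow = TRUE,
                 dimnames = list(paste0("gene", 1:3), paste0("lib", 1:3)))

lib.size <- colSums(counts)   # total counts per library
offset <- log(lib.size)       # assumed offsets: log library sizes

# Libraries with zero library size, and libraries with non-finite offsets
zero.libs <- names(lib.size)[lib.size == 0]
bad.offsets <- names(offset)[!is.finite(offset)]

print(zero.libs)
print(bad.offsets)
```

Both vectors should be empty on real data that passes the check; here the empty library is flagged in both.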
> Have you filtered out low abundance genes?
Of course.
> If you can, call debug(estimateDisp) and step through the function until you get to the part that stalls; this would be helpful for us to figure out what's going on.
I need to look into this.
As I stated in my previous post, if I use the "old" approach (i.e. the estimateGLM trio, then glmFit and glmLRT) everything goes fine, quickly, and smoothly. On the same data (with the same contrasts, designs, and filtering), if I switch to the QL F-test approach (i.e. estimateDisp, glmQLFit and glmQLFTest) it also goes fine, except that "y = estimateDisp(y, design, robust = TRUE)" takes ~30 minutes. I do have a very complex design (several time points, several batches, several treatments, several controls, etc.), so the low-count filtering step cannot be as effective as it is with a simple one-treatment-versus-one-control design. As a result, some genes may still have low counts in only some groups of samples or time points.
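To make the comparison concrete, here is a minimal sketch of how the two dispersion-estimation routes can be timed side by side. It assumes edgeR is installed and uses small simulated counts with a simple two-group design, not my real data or design; the object names (`counts`, `group`, `y1`, `y2`, `t_disp`, `t_trio`) are made up for illustration, and the timings it gives say nothing about a complex design.

```r
library(edgeR)

set.seed(1)
# Simulated negative binomial counts: 2000 genes x 6 libraries (toy data)
counts <- matrix(rnbinom(2000 * 6, mu = 50, size = 5), nrow = 2000)
group <- factor(rep(c("ctrl", "trt"), each = 3))
design <- model.matrix(~ group)

y <- DGEList(counts = counts)
y <- calcNormFactors(y)

# Route 1: single call, as used before glmQLFit
t_disp <- system.time(
  y1 <- estimateDisp(y, design, robust = TRUE)
)

# Route 2: the estimateGLM trio, as used before glmFit/glmLRT
t_trio <- system.time({
  y2 <- estimateGLMCommonDisp(y, design)
  y2 <- estimateGLMTrendedDisp(y2, design)
  y2 <- estimateGLMTagwiseDisp(y2, design)
})

print(t_disp)
print(t_trio)
```

Running the same two blocks on the real `y` and `design` is how I would reproduce the ~30 minutes versus under 2 minutes difference.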