Within the function momentsDisEstimate()
of DESeq2
a rough method-of-moments estimate of the mean counts is derived by this equation (bv - xim*bm)/bm^2
. (I am aware that this is just an initial set to maximize Cox-Reid adjusted likelihood of the gene-wise dispersion estimat.)
So I was curious and tried to calculate the MoME myself - but I always came to something looking like the inverse of it.
Unfortunately I am not aware how to write mathematical formulas on the support bioconductor page. Thats why I formulated a question on:
https://stats.stackexchange.com/questions/399685/deseq2-method-of-moments-negative-binomial
momentsDispEstimate <- function(object) {
xim <- if (!is.null(normalizationFactors(object))) {
mean(1/colMeans(normalizationFactors(object)))}
else { mean(1/sizeFactors(object)) }
bv <- mcols(object)$baseVar
bm <- mcols(object)$baseMean
(bv - xim*bm)/bm^2 }
Thank you for your time and answer.
Thank you for your quick answers.
so it comes from the relationship sigma^2 = \mu + \alpha \mu^2 ?
It still remains a question to me why it is:
(bv - xim*bm)/bm^2 }
and not
(bv - xim*bm)/(xim*bm)^2 }
or
(bv - bm)/bm^2
I don't remember the details, but I believe I profiled both of these options and went with the code that is there. Remember, this value is only used as part of a process to pick an initial point for the line search, so it's ok to be very rough. This particular estimate is used as an upper bound for the much better estimate of dispersion from
roughDispEstimate()
. I think the latter was sometimes giving implausibly high values and so this estimate we are talking about is to stabilize the initialization process.