Question

DESeq2 - Method-of-moments - negative binomial

0

Entering edit mode

Fischer-philipp ▴ 30

@fischer-philipp-18490

Last seen 2.9 years ago

Austria

Within the function momentsDisEstimate() of DESeq2 a rough method-of-moments estimate of the mean counts is derived by this equation (bv - xim*bm)/bm^2. (I am aware that this is just an initial set to maximize Cox-Reid adjusted likelihood of the gene-wise dispersion estimat.)

So I was curious and tried to calculate the MoME myself - but I always came to something looking like the inverse of it.

Unfortunately I am not aware how to write mathematical formulas on the support bioconductor page. Thats why I formulated a question on:

https://stats.stackexchange.com/questions/399685/deseq2-method-of-moments-negative-binomial

momentsDispEstimate <- function(object) {
xim <- if (!is.null(normalizationFactors(object))) {
mean(1/colMeans(normalizationFactors(object)))}
else { mean(1/sizeFactors(object)) }
bv <- mcols(object)$baseVar
bm <- mcols(object)$baseMean
(bv - xim*bm)/bm^2 }

Thank you for your time and answer.

deseq2 method-of-moments negative binomial • 998 views

ADD COMMENT • link updated 6.1 years ago by Michael Love 43k • written 6.1 years ago by Fischer-philipp ▴ 30

score 1 · Answer 1 · 2019-03-27

1

Entering edit mode

Michael Love 43k

@mikelove

Last seen 3 days ago

United States

There are two definitions of “dispersion” parameter for the negative binomial, one is the inverse of the other.

ADD COMMENT • link 6.1 years ago Michael Love 43k

0

Entering edit mode

Thank you for your quick answers.

so it comes from the relationship sigma^2 = \mu + \alpha \mu^2 ?

It still remains a question to me why it is: (bv - xim*bm)/bm^2 }

and not

(bv - xim*bm)/(xim*bm)^2 }

or

(bv - bm)/bm^2

ADD REPLY • link 6.1 years ago Fischer-philipp ▴ 30

0

Entering edit mode

I don't remember the details, but I believe I profiled both of these options and went with the code that is there. Remember, this value is only used as part of a process to pick an initial point for the line search, so it's ok to be very rough. This particular estimate is used as an upper bound for the much better estimate of dispersion from roughDispEstimate(). I think the latter was sometimes giving implausibly high values and so this estimate we are talking about is to stabilize the initialization process.

ADD REPLY • link 6.1 years ago Michael Love 43k