Hi,
I am trying to understand and compare the DESeq2 model and the BNB-R (https://github.com/siamakz/BNBR) model. The corresponding references are:
- DESeq2: Love, M. I., Huber, W., & Anders, S. (2014). Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biology, 15(12), 550. https://doi.org/10.1186/s13059-014-0550-8
- BNB-R: Dadaneh, S. Z., Zhou, M., & Qian, X. (2018). Bayesian negative binomial regression for differential expression with confounding factors. Bioinformatics, 34(19), 3349–3356. https://doi.org/10.1093/bioinformatics/bty330
My understanding of the BNB-R model is that it regards the sample-specific size factor r_j of the negative binomial distribution as a parameter that has to be estimated through Bayesian inference (i.e. sampling from its posterior). In DESeq2, there is a pre-estimated sample-specific size factor s_j included in the mean, but there is also the dispersion parameter alpha_i. Therefore, am I right that DESeq2 imposes additional overdispersion (having the pre-estimated size factors s_j as well as alpha_i)?
Thanks for your quick reply. And sorry, maybe I have to clarify my thoughts a bit. In DESeq2, the variance of the negative binomial distribution of a count K_ij (with i indexing the gene and j the sample) is
Var(K_ij) = mu_ij + alpha_i * mu_ij^2 = s_j * exp(x_j^T * beta_i) + alpha_i * (s_j * exp(x_j^T * beta_i))^2.
And as far as I understand the BNB-R model, there we have just
Var(K_ij) = r_j * exp(x_j^T * beta_i) + 1/r_j * (r_j * exp(x_j^T * beta_i))^2.
Am I correct? So if r_j from the BNB-R model corresponds to s_j in the DESeq2 model, shouldn't it then be alpha_i = 1/r_j? You're right, "additional overdispersion" is probably not the correct term. Perhaps I should have said "alternative dispersion parameterization".
I haven’t read that paper yet, so I don’t know how the models map to each other.