I have read this paper (https://academic.oup.com/nar/article/40/10/4288/2411520?login=false) and became interested in understanding more about BCV estimates and how they relate to a 'true' population-level BCV. As I understand it, there are three different estimates: a common BCV for all genes, a BCV for each abundance quantile bin, and a BCV for each gene that acts as a middle ground for experiments with small sample sizes. The idea is to identify mean differences regardless of differences in extra-Poisson, mean-dependent variances.
Is there a particular scenario and/or option in the package for estimating the BCV of each gene directly, e.g., if my sample size is large enough that this quantity might be close to the true BCV?
Furthermore, is there a particular reason why the BCV is defined through a relationship between the variance and the squared mean? (Do most genes follow this relationship? Perhaps some genes would be better modelled by a variance that scales with the cube of the mean?)
Cheers, James
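For readers following the thread, the quadratic relationship asked about above comes directly from the negative binomial model used in the linked paper. A short restatement, where $y$ is the count for one gene in one sample, $\mu$ its expected value and $\phi$ the gene's NB dispersion (the symbols here are chosen for illustration):

$$
\operatorname{var}(y) = \mu + \phi\,\mu^{2}
\quad\Longrightarrow\quad
\mathrm{CV}^{2}(y) = \frac{\operatorname{var}(y)}{\mu^{2}} = \frac{1}{\mu} + \phi,
\qquad
\mathrm{BCV} = \sqrt{\phi}.
$$

The $1/\mu$ term is the Poisson (technical) part of the squared coefficient of variation and shrinks as counts grow, while $\phi$ is the part that remains however deeply the samples are sequenced. A variance term of a different order in $\mu$ would no longer give a coefficient of variation that is constant in $\mu$.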
I believe the published paper answers your questions. The recent edgeR papers by Yunshun Chen and Pedro Baldoni may also be of interest to you: https://gksmyth.github.io/pubs/index.html
Thank you Gordon and Joseph. I presume question (1) would be answered by using a prior G0 of 0?
For question (2), I was expecting a biological reason (e.g., a noisier gene regulatory network) more than a technical one (e.g., that the parameter can be modelled explicitly in NB models as the non-Poisson component). I've seen some papers that attempt to estimate biological variability within a single condition and arrive at a 'biological variance' using degrees higher than 3 (e.g., https://bmcbiol.biomedcentral.com/articles/10.1186/s12915-020-00842-z and https://github.com/ljljolinq1010/expression-noise-across-fly-embryogenesis/blob/master/scripts/noiseFunction.R).
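To make question (1) concrete, here is a minimal sketch of what a direct, unshrunk per-gene BCV estimate could look like. It uses a plain method-of-moments estimator and is not how edgeR estimates dispersions; the function name `moment_bcv` is hypothetical, and the sketch assumes all samples come from one condition with equal library sizes.

```python
import math
import statistics


def moment_bcv(counts):
    """Method-of-moments BCV for one gene (illustrative, not edgeR's estimator).

    Assumes `counts` are replicate counts from a single condition with
    equal library sizes, and that var = mu + phi * mu**2 holds.
    """
    mean = statistics.mean(counts)
    if mean == 0:
        # No reads at all: the dispersion cannot be estimated.
        return math.nan
    var = statistics.variance(counts)  # sample variance (n - 1 denominator)
    # Solve var = mean + phi * mean**2 for phi; truncate at zero when the
    # data look no more variable than Poisson.
    phi = max((var - mean) / mean**2, 0.0)
    return math.sqrt(phi)


# Hypothetical replicate counts for two genes.
print(moment_bcv([10, 20, 30]))  # overdispersed relative to Poisson
print(moment_bcv([9, 10, 11]))   # underdispersed, truncated to 0.0
```

With few replicates this estimate is very noisy, which is the motivation for squeezing gene-wise values toward a shared value; with many replicates the raw estimate becomes more trustworthy and less shrinkage is needed.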