EdgeR Biological CVs
James
@ca4d237a
Germany

I have read this paper (https://academic.oup.com/nar/article/40/10/4288/2411520?login=false) and became interested in understanding a bit more about the BCV estimates and how they might relate to some 'true' population-level BCV. As I understand it, there are three different estimates: a common BCV for all genes, a BCV for each abundance quantile bin, and a BCV for each gene that takes some middle ground for low-sample-size experiments. The idea is to detect differences in means while accounting for extra-Poisson, mean-dependent variance.

Is there a particular scenario and/or option in the package to estimate BCV for each gene directly, e.g., if I have a sample size large enough where this quantity might be closer to the true BCV?

Furthermore, is there a particular reason why the BCV is defined through a relationship between the variance and the squared mean? (Do most genes follow this relationship? Perhaps some genes could be better modelled by a relationship between the variance and the cubed mean?)

Cheers, James

@gordon-smyth
WEHI, Melbourne, Australia

I think your questions are answered by the published paper. You might also like to consult recent papers on edgeR by Yunshun Chen and Pedro Baldoni (listed at https://gksmyth.github.io/pubs/index.html).

Added later

I notice that you added a followup comment to a spam bot post, but your comment was removed when the spam posting was removed.

I don't know what you mean by estimating the BCV "directly". The BCV is the CV of an unobserved quantity, so no "direct" estimate is possible. edgeR provides the theoretically best estimate of the BCV regardless of sample size. If the sample size is very large, then edgeR's empirical Bayes moderation will be correspondingly less important, but you can never make the estimate closer to the true BCV (in expectation) by turning the empirical Bayes moderation off.
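The small-sample point can be illustrated with a toy simulation. This is only a sketch under stated assumptions and is not edgeR's algorithm: edgeR uses the weighted likelihood empirical Bayes method described in the paper linked in the question, whereas the code below uses a simple moment estimator and an arbitrary fixed shrinkage weight. The gamma-Poisson model, the lognormal spread of true dispersions, and all names (`rpois`, `weight`, `mse`) are hypothetical choices made for illustration.

```python
import math
import random
import statistics

def rpois(lam, rng):
    """One Poisson(lam) draw by Knuth's multiplication method (fine for moderate lam)."""
    limit = math.exp(-lam)
    k, p = 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k

rng = random.Random(7)
mu, n_genes, n_reps = 50.0, 2000, 4   # assumed mean count, gene count, replicates

true_disp, raw = [], []
for _ in range(n_genes):
    # Hypothetical gene-specific dispersion (BCV squared), scattered around 0.04.
    phi = 0.04 * math.exp(rng.gauss(0.0, 0.3))
    # True expression varies with CV = sqrt(phi); counts are Poisson given it.
    y = [rpois(rng.gammavariate(1.0 / phi, mu * phi), rng) for _ in range(n_reps)]
    m, s2 = statistics.mean(y), statistics.variance(y)
    # Moment estimate obtained by solving var = mu + phi * mu^2, truncated at zero.
    raw.append(max((s2 - m) / m**2, 0.0))
    true_disp.append(phi)

common = statistics.mean(raw)   # crude "common dispersion" across genes
weight = 0.5                    # arbitrary shrinkage weight, illustration only
shrunk = [weight * common + (1 - weight) * r for r in raw]

def mse(est):
    """Mean squared error of the estimates against the simulated true dispersions."""
    return statistics.mean((e - t) ** 2 for e, t in zip(est, true_disp))

print("gene-by-gene MSE:", mse(raw))
print("shrunk MSE:      ", mse(shrunk))
```

With only four replicates per gene, the gene-by-gene estimates are noisy enough that pulling them toward the common value reduces the average squared error in this simulation. The fixed weight is a stand-in; how much to shrink would in practice depend on the sample size and on how much the true dispersions vary between genes.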

The published papers above give an argument for why the mean-variance relationship should be approximately quadratic. This argument does not depend on the negative binomial distribution and it is not a matter of mathematical convenience. It also has nothing to do with whether the regulatory network is noisy or not. The quadratic mean-variance relationship is derived before the negative binomial distribution is introduced.
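In outline, and in notation assumed here rather than taken from this thread: let $\lambda$ be the expected count for a gene in one sample given that sample's true expression level, with $\mathrm{E}[\lambda] = \mu$ and coefficient of variation $\sqrt{\phi}$ across biological replicates (so $\sqrt{\phi}$ is the BCV). If the technical variation given $\lambda$ has variance equal to its mean, as for Poisson-like sequencing noise, the law of total variance gives

```latex
\operatorname{var}(y)
  = \mathrm{E}\!\left[\operatorname{var}(y \mid \lambda)\right]
  + \operatorname{var}\!\left(\mathrm{E}[y \mid \lambda]\right)
  = \mu + \phi\,\mu^{2},
\qquad
\mathrm{CV}^{2}(y) = \frac{1}{\mu} + \phi .
```

Only the first two moments of $\lambda$ are used, so no distributional form is needed at this stage. The squared CV of the counts splits into a technical part, $1/\mu$, that vanishes as counts grow, and a biological part, $\phi$, that does not.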

You might read the section called "Technical and biological variation produce a quadratic mean-variance relationship" in the edgeR v4 paper: https://doi.org/10.1093/nar/gkaf018
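A quick numerical check of the relationship var = mu + BCV^2 * mu^2 can be sketched as follows. The gamma distribution for the true expression level is an assumption made only so that there is something to simulate from; the variance identity itself needs just the mean and the CV. The helper names `rpois` and `simulate_counts` are hypothetical.

```python
import math
import random
import statistics

def rpois(lam, rng):
    """One Poisson(lam) draw by Knuth's multiplication method (fine for moderate lam)."""
    limit = math.exp(-lam)
    k, p = 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k

def simulate_counts(mu, bcv, n, rng):
    """Counts whose true expression has mean mu and CV = bcv, with Poisson noise on top."""
    shape = 1.0 / bcv**2    # gamma shape that gives CV = bcv
    scale = mu * bcv**2     # gamma scale so that the mean is mu
    return [rpois(rng.gammavariate(shape, scale), rng) for _ in range(n)]

rng = random.Random(1)
bcv = 0.3
results = {}
for mu in (5, 20, 80):
    y = simulate_counts(mu, bcv, 20000, rng)
    empirical = statistics.variance(y)
    quadratic = mu + bcv**2 * mu**2
    results[mu] = (empirical, quadratic)
    print(f"mu={mu:3d}  empirical var={empirical:8.1f}  mu + BCV^2*mu^2={quadratic:8.1f}")
```

At small means the Poisson term dominates and the variance is close to the mean; at larger means the quadratic term takes over, which is why the BCV sets the floor on the CV of highly expressed genes.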
