Down/Up-regulated genes asymmetry in edgeR differential expression analysis
Pauly Lin ▴ 160
@pauly-lin-7537
University of New South Wales, Australia

Dear all,

I have performed an edgeR differential expression analysis on RNA-Seq data with six samples (3 vs 3). edgeR finds around 30 up-regulated genes but more than 100 down-regulated genes. Should I be concerned about this big difference? I have also used limma to perform a differential expression analysis on microarray data from the same individuals, and there is no such asymmetry between the numbers of down- and up-regulated genes. I have been told that edgeR assumes that the number of up-regulated genes is similar to the number of down-regulated genes. Is that true?

Thanks!

Paul

edgeR rnaseq • 3.4k views
Aaron Lun ★ 28k
@alun
The city by the bay

In the bigger scheme of things, this asymmetry isn't particularly dramatic. If you had, say, 30 up-regulated genes and 3000 down-regulated genes, that would be a bit more interesting. As it is now, I wouldn't worry about it, as the numbers involved are too low to be of concern.

That said, it's worth pointing out that asymmetry isn't a problem in most cases. The only part of the analysis it affects is TMM normalization, in the calcNormFactors function. By default, TMM normalization trims away the 30% most extreme M-values (log-fold changes between samples) on either side (i.e., up- or down-regulated), and normalization is performed with the M-values of the remaining (presumably non-DE) genes. As long as the DE proportion on either side does not exceed 30%, normalization will be okay.

So, what you've been told (or at least, how you're saying it) is mostly wrong. There are still slivers of truth, though. Firstly, at the maximum number of DE genes that TMM normalization can tolerate (60% of total), they must be split evenly between up- and down-regulation in order to avoid exceeding the 30% threshold on either side. Secondly, if you have pronounced asymmetry, normalization will become less accurate as trimming will start eating into non-DE genes on the side without any DE genes. This will distort the M-value distribution of non-DE genes, leading to a biased estimate. However, this asymmetry needs to be fairly extreme to have an effect.
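To make the trimming argument concrete, here is a minimal sketch in Python. It is not edgeR's implementation: real TMM also trims on average abundance, weights genes by precision, and works relative to a reference sample. The function names, the simulated effect size of 2 and the noise level of 0.1 are all assumptions chosen for illustration. The sketch only shows how a 30% trim on each side behaves when the true normalization offset is zero, so any non-zero result is bias.

```python
import random


def trimmed_mean(values, trim=0.3):
    """Mean of `values` after dropping the `trim` fraction from each tail."""
    ordered = sorted(values)
    k = round(len(ordered) * trim)
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)


def simulate_m_values(n_genes, frac_up, frac_down, effect=2.0, noise=0.1, seed=0):
    """Hypothetical M-values: non-DE genes scatter around 0, DE genes are shifted."""
    rng = random.Random(seed)
    n_up = round(n_genes * frac_up)
    n_down = round(n_genes * frac_down)
    m = [rng.gauss(0.0, noise) for _ in range(n_genes)]
    for i in range(n_up):                   # up-regulated genes
        m[i] += effect
    for i in range(n_up, n_up + n_down):    # down-regulated genes
        m[i] -= effect
    return m


# 30% up and 30% down: the trim removes exactly the DE genes on each side.
even_split = trimmed_mean(simulate_m_values(1000, 0.30, 0.30))

# 20% up, none down: within tolerance, but the lower trim now removes
# only non-DE genes, which shifts the estimate slightly upwards.
one_sided = trimmed_mean(simulate_m_values(1000, 0.20, 0.00))

# 40% up, none down: more than 30% on one side, so DE genes survive the trim.
too_many = trimmed_mean(simulate_m_values(1000, 0.40, 0.00))

print(f"30% up / 30% down: {even_split:+.3f}")
print(f"20% up /  0% down: {one_sided:+.3f}")
print(f"40% up /  0% down: {too_many:+.3f}")
```

Under these assumptions, the first two estimates stay close to zero, with the one-sided case showing a small bias, while the third is pulled strongly towards the DE genes. The 30 vs 100 split in the question falls far short of either problem case.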

Pauly Lin ▴ 160
@pauly-lin-7537
University of New South Wales, Australia
Thanks, Aaron! Paul