CQN and EdgeR Library Size for Normalization
shankasal • 0
@shankasal-15611

I'm normalizing some ATAC-seq samples with CQN (conditional quantile normalization) and then analyzing them with edgeR, and I'm trying to determine the following:

When setting the library size, should I use the sum of read counts that fall within peaks from the full peak set (computed for each sample), or should I use the total aligned reads per sample?

Thanks

CQN atac-seq normalization edgeR • 1.5k views
Aaron Lun ★ 28k
@alun

I have tended to use the total aligned reads per sample for edgeR's lib.size when performing differential binding analyses, because it is easier to interpret as sequencing depth. The sum of reads in peaks is harder to interpret: any global increase or decrease in binding (or in this case, accessibility) between conditions would alter the proportion of reads in peaks, so a library size based on reads in peaks conflates technical differences in sequencing depth with actual biological differences in chromatin structure.
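To make that conflation concrete, here is a toy calculation in plain Python. It is not edgeR code, and every number in it is invented for illustration: two hypothetical samples are sequenced to the same depth, but each peak is twice as accessible in sample B. The sketch compares per-peak fold changes in counts-per-million under the two library size choices.

```python
# Toy numbers, invented for illustration: same sequencing depth,
# but every peak is twice as accessible in sample B.
total_aligned = {"A": 10_000_000, "B": 10_000_000}
peak_counts = {
    "A": [100, 200, 700],
    "B": [200, 400, 1400],
}

def cpm(counts, lib_size):
    """Counts per million for one sample, given a library size."""
    return [c / lib_size * 1e6 for c in counts]

# Library size = total aligned reads.
cpm_total = {s: cpm(peak_counts[s], total_aligned[s]) for s in peak_counts}

# Library size = sum of reads falling in peaks.
reads_in_peaks = {s: sum(c) for s, c in peak_counts.items()}
cpm_peaks = {s: cpm(peak_counts[s], reads_in_peaks[s]) for s in peak_counts}

# Per-peak fold change of B over A under each choice.
fold_total = [b / a for a, b in zip(cpm_total["A"], cpm_total["B"])]
fold_peaks = [b / a for a, b in zip(cpm_peaks["A"], cpm_peaks["B"])]

print("fold change, total aligned reads:", fold_total)
print("fold change, reads in peaks:     ", fold_peaks)
```

With total aligned reads as the library size, the global doubling shows up as a fold change at every peak. With reads in peaks as the library size, the same doubling is absorbed into the library size and the per-peak fold changes collapse towards one, as if B had simply been sequenced more deeply.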

For the actual differential analysis, though, it barely matters. The CQN offsets will override any library size specification. More generally, if you computed TMM normalization factors, they would also compensate for any differences in the library size specification. A different set of library sizes will alter the calculation of the average log-CPMs and predicted log-fold changes, but this should be a very modest effect.
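The compensation by normalization factors can be sketched with a small Python example. This is only an illustration under stated assumptions: the factor below is a plain median of per-peak ratios against a reference sample, a simplified stand-in for TMM (which also trims and weights the ratios), and the function name `effective_lib_sizes` and all counts are hypothetical. The effective library size is taken as the library size multiplied by the normalization factor.

```python
from statistics import median

def effective_lib_sizes(counts, lib_sizes, ref):
    """Effective library size = library size * normalization factor.

    The factor is the median, over peaks, of the ratio of
    library-size-scaled counts to those of the reference sample.
    Simplified stand-in for TMM: no trimming, no weighting.
    """
    eff = {}
    for s in counts:
        ratios = [
            (y / lib_sizes[s]) / (r / lib_sizes[ref])
            for y, r in zip(counts[s], counts[ref])
            if y > 0 and r > 0
        ]
        eff[s] = lib_sizes[s] * median(ratios)
    return eff

# Hypothetical counts for four peaks in two samples.
counts = {"A": [100, 200, 700, 50], "B": [150, 310, 1000, 80]}
total_aligned = {"A": 10_000_000, "B": 14_000_000}
reads_in_peaks = {s: sum(c) for s, c in counts.items()}

eff_total = effective_lib_sizes(counts, total_aligned, ref="A")
eff_peaks = effective_lib_sizes(counts, reads_in_peaks, ref="A")

# Only the ratio of effective library sizes between samples
# enters a between-sample log-fold change.
ratio_total = eff_total["B"] / eff_total["A"]
ratio_peaks = eff_peaks["B"] / eff_peaks["A"]
print(ratio_total, ratio_peaks)
```

In this sketch the ratio of effective library sizes between the two samples comes out the same (up to floating-point error) for both library size specifications, because the factor rescales whatever library size it is given. The absolute effective library sizes still differ, which is why quantities such as average log-CPM shift with the choice.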

shankasal • 0
@shankasal-15611

Thanks Aaron, that's a satisfying answer. I had been using the total aligned reads and will continue to do so.
