MEDIPS - RPKM
malbert1 • 0
Australia

I was hoping someone could please help explain how RPKM values are calculated using the MEDIPS package?

From the source code (v1.16) it says RPKM values are calculated the following way:

 (genome_count(Set)*10^9)/(window_size*number_regions(Set))

I am assuming that genome_count(Set) refers to the "number of reads detected in each bin" and that window_size is the bin size. However, I am a bit unsure what number_regions(Set) refers to. Is that the number of windows generated for the reference genome, or something else? And how does this differ from calculating the RPKM values by dividing genome_count*10^9 by the total read count (sum(genome_count(Set)))?

Also, does anyone know whether the strategy for calculating RPKM values has changed between MEDIPS versions (e.g. between v1.6 and v1.16)?

Cheers,

Maria

Lukas Chavez ▴ 570
USA/La Jolla/UCSD
Dear Maria,

your interpretations of genome_count(Set) and window_size are correct, but number_regions(Set) is the total number of reads considered for calculating coverage at the genome-wide windows (the 'total read count', as you say). It differs from the sum over the counts of all windows because MEDIPS counts each (extended) read in every window it overlaps (this is done because it is unknown at which position of the enriched DNA fragment the mark is located).

MEDIPS version 1.10.0 was a major update in which the way of calculating RPKM values was also changed (see the section 'Updates' of the MEDIPS vignette). For example, the window size was not considered previously.

All the best,

Lukas
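
To illustrate the difference between the number of reads and the sum of window counts, here is a minimal sketch in plain Python. It is not MEDIPS code: the helper names (count_overlaps, rpkm), the toy read coordinates, and the half-open interval convention are all assumptions made for the example. Only the formula itself is taken from the question above.

```python
def count_overlaps(reads, window_size, n_windows):
    """Count each read once in every window it overlaps.

    reads: list of (start, end) tuples, half-open [start, end), 0-based.
    This only mimics the counting behaviour described above; it is not
    the MEDIPS implementation.
    """
    counts = [0] * n_windows
    for start, end in reads:
        first = start // window_size
        last = min((end - 1) // window_size, n_windows - 1)
        for w in range(first, last + 1):
            counts[w] += 1
    return counts


def rpkm(window_counts, window_size, n_reads):
    """Formula quoted in the question: (count * 10^9) / (window_size * n_reads)."""
    return [c * 1e9 / (window_size * n_reads) for c in window_counts]


# Toy data (hypothetical): 4 extended reads on a 400 bp region, 100 bp windows.
window_size = 100
reads = [(10, 60), (80, 230), (150, 250), (290, 340)]

counts = count_overlaps(reads, window_size, n_windows=4)
n_reads = len(reads)          # plays the role of number_regions(Set)
sum_counts = sum(counts)      # larger, because reads span several windows

rpkm_by_reads = rpkm(counts, window_size, n_reads)
rpkm_by_sum = rpkm(counts, window_size, sum_counts)

print("window counts:", counts)
print("number of reads:", n_reads, "| sum of window counts:", sum_counts)
print("RPKM (number of reads):", rpkm_by_reads)
print("RPKM (sum of counts):  ", rpkm_by_sum)
```

In this toy case the four reads produce a summed window count of eight, so normalising by the sum of counts would halve every RPKM value compared with normalising by the number of reads.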
malbert1 • 0
Australia

Thanks for your reply Lukas and sorry for the confusion! 

Cheers,

Maria