Should I use "aligned reads" or total reads (aligned + unassigned) to calculate the RPKM value?
@gustavoborin01-6892
Last seen 10.2 years ago

Dear all,

I'm recalculating RPKM values for RNA-seq data using the featureCounts function in Rsubread, and I'd like to know whether the RPKM formula should use only the "assigned" reads or the total reads, including the unassigned categories (ambiguity, multimapping, and so on; see below). Searching the forums and Mortazavi et al. (2008), all I have found is that "N is the total number of mappable reads in the experiment". Could anybody please help in this regard?

RPKM = N/(L*T) 

where: 

N: number of reads assigned to a gene

L: length of the gene (kb)

T: total mapped reads (Millions)
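
As a quick illustration of the formula above, here is a minimal Python sketch. The function name `rpkm` and the example numbers are hypothetical, chosen only to show the unit conversions (base pairs to kilobases, reads to millions); which read total to pass as T is the question being asked here.

```python
def rpkm(gene_reads, gene_length_bp, total_reads):
    """RPKM = N / (L * T), with L in kilobases and T in millions of reads."""
    length_kb = gene_length_bp / 1_000          # L: gene length in kb
    total_millions = total_reads / 1_000_000    # T: library size in millions
    return gene_reads / (length_kb * total_millions)


# Hypothetical gene: 500 reads on a 2,000 bp gene in a 10-million-read library.
example = rpkm(gene_reads=500, gene_length_bp=2_000, total_reads=10_000_000)
print(example)  # 25.0
```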

 

T_reesei_F24.1_GGCTAC_L008_R1_001.cleanreads.fastq.gz_tophat2.F24h.1_accepted_hits.bam  
Assigned 32270962
Unassigned_Ambiguity 6896
Unassigned_MultiMapping 116803
Unassigned_NoFeatures 10751746
Unassigned_Unmapped 0
Unassigned_MappingQuality 0
Unassigned_FragementLength 0
Unassigned_Chimera 0

Thanks in advance! 

rnaseq R rsubread rpkm
@james-w-macdonald-5106
Last seen 1 day ago
United States

You should use the assigned reads only. For your purposes, the library size consists of all the reads that you will be using to infer transcript abundance, not the total number of reads that you generated.
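
To make the difference concrete, here is a rough Python sketch using the featureCounts summary from the question. The helper names (`parse_summary`, `rpkm`) and the example gene (1,000 reads, 1,500 bp) are made up for illustration; only the summary counts come from the post.

```python
# Summary counts copied from the question (status, read count).
SUMMARY = """\
Assigned 32270962
Unassigned_Ambiguity 6896
Unassigned_MultiMapping 116803
Unassigned_NoFeatures 10751746
Unassigned_Unmapped 0
Unassigned_MappingQuality 0
Unassigned_FragementLength 0
Unassigned_Chimera 0
"""


def parse_summary(text):
    """Turn 'Status count' lines into a {status: count} dictionary."""
    stats = {}
    for line in text.splitlines():
        status, count = line.split()
        stats[status] = int(count)
    return stats


def rpkm(gene_reads, gene_length_bp, library_size):
    """Reads per kilobase of gene per million reads in the library."""
    return gene_reads / ((gene_length_bp / 1_000) * (library_size / 1_000_000))


stats = parse_summary(SUMMARY)
assigned = stats["Assigned"]      # library size recommended in this answer
all_reads = sum(stats.values())   # assigned + every unassigned category

# Hypothetical gene, for illustration only.
gene_reads, gene_length_bp = 1_000, 1_500
rpkm_assigned = rpkm(gene_reads, gene_length_bp, assigned)
rpkm_all = rpkm(gene_reads, gene_length_bp, all_reads)

print(assigned, all_reads)        # 32270962 43146407
print(rpkm_assigned, rpkm_all)    # the assigned-only value is the larger one
```

With these counts, roughly a quarter of the reads fall outside the assigned set (mostly Unassigned_NoFeatures), so using all reads as the denominator would lower every RPKM value by the same factor. If you compute RPKM in R with edgeR instead, the default library size of a DGEList is the column sums of the count matrix, which for featureCounts output should correspond to the assigned reads.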
 
