Question

Conversion from counts to FPKM

chiara.facciotto • 0

@chiarafacciotto-18662

Hi,

I am trying to use DESeq2 to convert raw counts to FPKM, so that I can compare abundance across genes and not only across samples. I have a couple of questions on how to do so:

1. Should I first normalize the counts and transform them with vst and then use the fpkm() function, or should I simply input the raw counts and let the fpkm() function take care of normalization as well?
2. How do I make sure that the genes in my GRanges object containing the gene lengths match the genes in the dds object?

Thank you!

counts fpkm fpkm() normalization deseq2 • 16k views

Updated 18 months ago by Hamza • 0 • written 6.4 years ago by chiara.facciotto • 0

score 2 · Answer 1 · 2018-12-13

Michael Love 43k

@mikelove

United States

You should input the raw counts and then use `fpkm()` to generate FPKM values. By default, `fpkm()` uses a robust estimate of library size (median ratio normalization) in place of the total count, which is a sub-optimal estimator. Note that FPKM values are not variance stabilized.
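To illustrate why the total count can be a poor library-size estimator, here is a minimal base R sketch of the median-of-ratios idea on made-up numbers. The `counts` matrix is hypothetical, and DESeq2's own implementation handles zeros and other cases that this sketch ignores.

```r
# Toy raw counts: 4 genes x 2 samples (made-up numbers for illustration).
# geneD is strongly up-regulated in sample2; the other genes are unchanged.
counts <- matrix(
  c(100,  100,
    200,  200,
    300,  300,
    400, 4000),
  ncol = 2, byrow = TRUE,
  dimnames = list(paste0("gene", LETTERS[1:4]), c("sample1", "sample2"))
)

# Total-count library size: inflated in sample2 by the single changed gene.
total_lib <- colSums(counts)

# Median-of-ratios size factors: the per-gene geometric mean across samples
# serves as a reference, and each sample's factor is the median ratio of its
# counts to that reference (computed on the log scale).
log_geo_means <- rowMeans(log(counts))
size_factors <- apply(counts, 2, function(k) exp(median(log(k) - log_geo_means)))

print(total_lib)
print(size_factors)
```

In this toy case the total counts differ 4.6-fold between the samples, while both size factors come out at 1, because three of the four genes are unchanged.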

It is up to you to provide exonic basepair lengths; we don't have any code for that. An easier approach is to use a pipeline, such as Salmon followed by tximport, which keeps all the information together for you (and provides a much more accurate estimate of the length of the gene, using the average transcript length, as opposed to the sum of the exonic basepairs).
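A minimal sketch of the raw-counts route, assuming a toy count matrix and made-up gene lengths (`counts`, `coldata`, and `gene_lengths` are hypothetical). The lengths are aligned to the rows of the `dds` object by gene name and stored in `mcols(dds)$basepairs`, which `fpkm()` uses as the feature length when no `avgTxLength` assay is present. With a GRanges or GRangesList object, the same matching could be done on `names(gr)`.

```r
library(DESeq2)

# Toy raw counts: 4 genes x 3 samples (made-up numbers for illustration).
counts <- matrix(
  c( 100L,  200L,  150L,
      50L,   80L,   60L,
    1000L, 1800L, 1500L,
      10L,   25L,   15L),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("gene", 1:4), paste0("sample", 1:3))
)
coldata <- data.frame(row.names = colnames(counts))

# Raw counts go in directly; no prior normalization or vst.
dds <- DESeqDataSetFromMatrix(countData = counts, colData = coldata, design = ~ 1)

# Hypothetical exonic lengths (bp), deliberately in a different order and
# with an extra gene, to show why matching by name matters.
gene_lengths <- c(gene3 = 3000, gene1 = 1500, gene5 = 900, gene4 = 500, gene2 = 800)

# Align lengths to the rows of dds and fail loudly if any gene is missing.
idx <- match(rownames(dds), names(gene_lengths))
stopifnot(!anyNA(idx))
mcols(dds)$basepairs <- unname(gene_lengths[idx])

fpkm_robust <- fpkm(dds)                  # default: robust library size
fpkm_total  <- fpkm(dds, robust = FALSE)  # library size = total count

head(fpkm_robust)
```

Failing on an unmatched gene is a design choice for this sketch; dropping genes without a length before building `dds` would be a reasonable alternative.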

6.4 years ago • Michael Love 43k

Chiara, as per Michael, FPKM units are not variance stabilised, and neither are they comparable across samples. There is no cross-sample normalisation employed when deriving FPKM expression units.

Reply • 6.1 years ago • Kevin Blighe ★ 4.0k

Hi Michael, I have a dataset which seems to be heavily influenced by batch effects. Is it possible to remove batch effects using limma and then use the `fpkm()` function (or another function) to calculate FPKM values? For a certain package I need to use fragment-size-normalized values.

Reply • 18 months ago • Hamza • 0

score 0 · Answer 2 · 2019-03-23

Ahmed Alhendi • 0

@ahmed-alhendi-20292

University of Leicester, UK

Try the countToFPKM package. It provides an easy-to-use function to convert a read count matrix into an FPKM matrix. It implements the following equation:

[Image: FPKM equation]

The package's fpkm() function requires three inputs and returns FPKM as a numeric matrix normalized by library size and feature length:

- `counts`: A numeric matrix of raw feature counts.
- `featureLength`: A numeric vector of feature lengths, which can be obtained using biomaRt.
- `meanFragmentLength`: A numeric vector of mean fragment lengths, which can be calculated with Picard using CollectInsertSizeMetrics.
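Since the equation is given only as an image, the base R sketch below assumes a common effective-length form of FPKM: counts × 10^9 / (effective length × total counts), with effective length = feature length − mean fragment length + 1. The function name `fpkm_effective` and the toy inputs are hypothetical, and the package's own function may differ in its details.

```r
# Sketch of an effective-length FPKM calculation (assumed formula, see above).
# counts:               genes x samples matrix of raw counts
# feature_length:       one length (bp) per gene
# mean_fragment_length: one mean fragment length (bp) per sample
fpkm_effective <- function(counts, feature_length, mean_fragment_length) {
  stopifnot(nrow(counts) == length(feature_length),
            ncol(counts) == length(mean_fragment_length))

  # Effective length for every gene (rows) and sample (columns).
  eff_len <- outer(feature_length, mean_fragment_length, "-") + 1
  stopifnot(all(eff_len > 0))

  # Library size per sample, taken here as the total count.
  lib_size <- colSums(counts)

  # counts * 1e9 / (effective length * library size)
  sweep(counts * 1e9 / eff_len, 2, lib_size, "/")
}

# Toy inputs (made-up numbers for illustration).
counts <- matrix(c(10, 30, 20, 40), nrow = 2,
                 dimnames = list(c("gene1", "gene2"), c("sample1", "sample2")))
result <- fpkm_effective(counts,
                         feature_length = c(1000, 2000),
                         mean_fragment_length = c(200, 300))
print(result)
```

The sketch stops if any effective length is not positive; how such short features should be handled is left open here.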

Also see https://github.com/AAlhendi1707/countToFPKM

6.1 years ago • updated 6.0 years ago • Ahmed Alhendi • 0

Hi! I have small RNA data as raw counts and I want to correlate it with long RNA data given as FPKM values. Is the countToFPKM package suitable for small RNA data?

Reply • 6.0 years ago • arpanamv83 • 0