Filtering RNA-seq counts
Aurora ▴ 20
@aurora-15104

Hi,

Filtering RNA-seq counts before performing differential expression analysis is generally recommended. I wonder why it is recommended. What makes the analysis better than if no filtering were performed?

 

Thank you for your answers.

 

@gordon-smyth
WEHI, Melbourne, Australia

To quote from the edgeR Workflow:

"Genes that have very low counts across all the libraries should be removed prior to downstream analysis. This is justified on both biological and statistical grounds. From biological point of view, a gene must be expressed at some minimal level before it is likely to be translated into a protein or to be considered biologically important. From a statistical point of view, genes with consistently low counts are very unlikely be assessed as significantly DE because low counts do not provide enough statistical evidence for a reliable judgement to be made. Such genes can therefore be removed from the analysis without any loss of information."
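As a rough illustration of the kind of low-count filter the quote describes, here is a minimal sketch in plain Python. It is not edgeR's own filtering code: the function names `cpm` and `filter_low_counts`, the gene names, the counts, and the thresholds (`min_cpm=1.0`, `min_samples=2`) are all hypothetical and chosen for illustration only. A gene is kept if its counts per million (CPM) exceed the cutoff in at least as many samples as the smallest group contains.

```python
def cpm(counts):
    """Convert a {gene: [count per sample]} table to counts per million."""
    n_samples = len(next(iter(counts.values())))
    # Library size = total count in each sample (column sums).
    lib_sizes = [sum(row[j] for row in counts.values()) for j in range(n_samples)]
    return {
        gene: [1e6 * c / lib_sizes[j] for j, c in enumerate(row)]
        for gene, row in counts.items()
    }


def filter_low_counts(counts, min_cpm=1.0, min_samples=2):
    """Keep genes whose CPM exceeds min_cpm in at least min_samples samples."""
    cpms = cpm(counts)
    return {
        gene: counts[gene]
        for gene, values in cpms.items()
        if sum(v > min_cpm for v in values) >= min_samples
    }


# Toy data (assumed): 4 samples, 2 per group; each library sums to 1,000,000.
counts = {
    "geneA": [600_000, 580_000, 610_000, 595_000],  # highly expressed everywhere
    "geneB": [399_999, 420_000, 389_958, 404_965],  # highly expressed everywhere
    "geneC": [0, 0, 40, 35],                        # expressed in one group only
    "geneD": [1, 0, 2, 0],                          # consistently near zero
    "geneE": [0, 0, 0, 0],                          # never observed
}

kept = filter_low_counts(counts, min_cpm=1.0, min_samples=2)
print(sorted(kept))
```

Requiring expression in only `min_samples` samples, rather than in all of them, retains genes such as `geneC` that are switched on in a single group, which are often the most interesting candidates for differential expression.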

Filtering improves dispersion estimation (because one doesn't try to estimate dispersions for genes with no information), improves statistical power (because fewer tests are performed, so the multiple-testing adjustment is less severe) and reduces computation time. Most important of all, filtering allows good empirical Bayes estimation across genes because it makes the remaining genes more homogeneous.
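The point about statistical power can be illustrated with a small sketch of the Benjamini-Hochberg adjustment. The p-values below are invented for illustration and the helper `bh_adjust` is a hypothetical name, not part of edgeR. The sketch shows only the arithmetic of multiple-testing correction: the same three small p-values receive smaller adjusted values when uninformative low-count genes are not tested alongside them.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    # Walk from the largest p-value to the smallest, enforcing monotonicity.
    order = sorted(range(m), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for steps_from_top, i in enumerate(order):
        rank = m - steps_from_top  # 1-based rank in ascending order
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Assumed toy p-values: 3 well-expressed genes and 7 low-count genes
# whose tests carry essentially no evidence.
informative = [0.001, 0.004, 0.01]
low_count = [0.6, 0.7, 0.8, 0.9, 1.0, 1.0, 1.0]

unfiltered = bh_adjust(informative + low_count)[:3]  # all 10 genes tested
filtered = bh_adjust(informative)                    # low-count genes removed first

for p, u, f in zip(informative, unfiltered, filtered):
    print(f"p={p}: adjusted without filtering={u:.4f}, with filtering={f:.4f}")
```

This relies on the filter being chosen without looking at the test results, as is the case for a filter based on overall count size.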


Thanks a lot!
