How to deal with up to 75% DE genes in RNA-seq (TMM, edgeR)?
aec ▴ 90
@aec-9409
Last seen 4.4 years ago

Dear all,

The drug effects are so drastic that more than 75% of the genes in the cell lines are differentially expressed (3 control vs 3 treatment). I used TMM normalization and edgeR for the analysis, which assume that at most 60% of the genes are differentially expressed. How should I proceed in such a case? Should I trust the results if the assumptions are violated?

Thanks,

 

differential expression TMM normalization rnaseq edger • 2.8k views
Aaron Lun ★ 28k
@alun
Last seen 8 hours ago
The city by the bay

Short answer: if the normalization assumptions are violated, you cannot trust the DE analysis results.

Generally, I would not consider the number of DE genes to be indicative of whether TMM normalization is okay or not. If you have a very high-powered experiment, you will detect many DE genes at a given significance threshold, even if most of them have near-zero log-fold changes and do not cause any meaningful violation of TMM's assumption. Conversely, failure to detect many DE genes does not mean that TMM normalization is suitable, given that the DE analysis already assumes that normalization was correct.

Your case is interesting in that you have detected many DE genes despite not having a particularly high-powered experimental design. This suggests that the log-fold changes of the ~15% of DE genes that are not trimmed away are large (at least 0.5, perhaps?), which would affect the accuracy of the computed normalization factors. If you are concerned about the effect of DE on normalization, you could try increasing the total trimming proportion from the default 60% to 80% (set logratioTrim to 0.4, i.e., trim 40% from each tail of the log-ratios). However, there is also the possibility that the effect of drug treatment is simply too drastic, and that the entire transcriptome is changing en masse, e.g., due to apoptosis. You would probably need spike-ins to have any chance of normalizing in this situation.
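To make the effect of the trimming proportion concrete, here is a minimal Python sketch of a trimmed mean of log-ratios. It is a simplification for illustration, not edgeR's implementation: it ignores the trimming on average abundance and the precision weights that TMM also uses, and the function name `trimmed_mean_m` and the M-values are hypothetical.

```python
import math


def trimmed_mean_m(m_values, logratio_trim):
    """Mean of the M-values left after dropping a fraction from each tail.

    Simplified stand-in for TMM: no abundance (A-value) trimming and
    no precision weights.
    """
    ordered = sorted(m_values)
    n = len(ordered)
    k = math.floor(n * logratio_trim)  # number of genes dropped from EACH tail
    kept = ordered[k:n - k]
    return sum(kept) / len(kept)


# Hypothetical M-values (log2 treated/control) for 100 genes.
moderate = [0.0] * 60 + [2.0] * 40  # 40% of genes up 4-fold, rest unchanged
drastic = [0.0] * 30 + [2.0] * 70   # 70% of genes up 4-fold, rest unchanged

# An unbiased estimate would be 0, the log-ratio of the unchanged genes.
bias_default = trimmed_mean_m(moderate, 0.3)
bias_wider = trimmed_mean_m(moderate, 0.4)
bias_drastic = trimmed_mean_m(drastic, 0.4)
print(bias_default, bias_wider, bias_drastic)
```

In this toy case the default trim leaves some DE genes in the retained set and the estimate is pulled away from zero, while the wider trim removes them. When most genes shift in the same direction, the retained genes are all DE and no trimming proportion recovers the unchanged genes as the reference.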

P.S. If this is an edgeR question, put an edgeR tag on your post.


Aaron, is logratioTrim a parameter of edgeR that I can change?

Adding spike-ins is not doable right now. Would it be a reasonable approximation to trust only DE genes with extreme fold changes, i.e., FC > 2 or FC > 4?


1) You can set logratioTrim as an argument in calcNormFactors.

2) The problem is that you don't know how wrong your normalization is. Consider an example where most genes decrease in abundance by 10-fold in your treated cells. After normalization, the majority of genes would appear to be non-DE, and you would instead observe 10-fold "upregulation" for genes that did not change in abundance. So it's hard to say whether an extreme log-fold change is likely to be correct when the normalization cannot be trusted.
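The 10-fold example can be worked through numerically. The sketch below is in Python and uses a plain median of the log-fold changes as a stand-in for the normalization step. That is an assumption made for illustration, not what calcNormFactors computes, and the gene counts and variable names are hypothetical.

```python
from statistics import median

# Hypothetical true log10 fold changes (treated vs control) for 10 genes:
# 8 genes drop 10-fold, 2 genes do not change.
true_log10_fc = [-1.0] * 8 + [0.0] * 2

# Stand-in for normalization: assume the typical gene is unchanged,
# so the median log-fold change is treated as technical bias and removed.
offset = median(true_log10_fc)
reported_log10_fc = [fc - offset for fc in true_log10_fc]

# Convert back to fold changes as they would appear in the DE results.
reported_fold = [10 ** fc for fc in reported_log10_fc]
print(reported_fold)
```

The eight genes that truly dropped are reported with a fold change of 1, and the two unchanged genes are reported as 10-fold up. A filter on extreme fold changes would therefore keep exactly the genes that did not change.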
