Question

Highly uneven mapping rates and counts between samples

chris86 ▴ 420

@chris86-8408

UCL, United Kingdom

Hi

I have Illumina mRNA-seq samples, a number of which have low RINs (2-4) compared to the others. Apparently as a result, I am getting widely varying mapping rates (15%-70%) and therefore widely varying counts per sample (e.g. 8,000,000 mapped reads vs 40,000,000). On top of that, I can't really use RIN or mapping rate as a covariate, because it is heavily confounded with a group of interest.

Is there a preferred way of analyzing this type of data? If I do the usual VST through DESeq2, I get a cluster of samples with irregularly high expression of many genes. These are also the samples with low overall counts, so presumably this is caused by what I describe above. Would quantile normalisation help, or are there any other ideas?

I quantified the data with Salmon, using the --gcBias and --validateMappings flags. Reads are 150 bp.

Thanks,

Chris

limma normalization deseq2

updated 5.8 years ago by Aaron Lun ★ 28k • written 5.8 years ago by chris86 ▴ 420

score 0 · Answer 1 · 2019-02-06

I'll answer from the limma side. There's a variety of possibilities:

Using voomWithQualityWeights() will help if the reduction in sample quality manifests as increased variation without introducing bias. This will downweight the contribution of the affected samples to the variance estimate.
If bias is introduced in a systematic manner, normalization may help. For example, normalizeCyclicLoess() will remove trended biases with respect to abundance. Packages like EDASeq use additional covariates to correct for other effects such as GC content and gene length, which may be helpful if the library preparations vary this much across samples. Quantile normalization is too aggressive for my tastes; I would not expect the same distribution from two different conditions.
If bias is introduced in a gene-specific manner, and quality is confounded with your group of interest... you're stuffed.
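To make the first two options concrete, here is a rough sketch of what they might look like in code. It assumes the edgeR and limma packages are installed. The simulated count matrix, the two-group design and all object names (counts, group, v_w, v_cl, etc.) are made up for illustration and are not from the original post. In the toy data, library size is deliberately confounded with group to mimic the situation described in the question.

```r
library(edgeR)   # DGEList(), filterByExpr(), calcNormFactors()
library(limma)   # voom(), voomWithQualityWeights(), lmFit(), eBayes(), topTable()

set.seed(1)

# Hypothetical toy data: 1000 genes x 8 samples, two groups of four.
# Samples 5-8 have roughly 5x fewer reads, mimicking the low-mapping-rate libraries.
n_genes <- 1000
group   <- factor(rep(c("A", "B"), each = 4))
depth   <- c(rep(40, 4), rep(8, 4))              # relative library depth (assumed)
mu      <- rexp(n_genes, rate = 1 / 20)          # baseline mean per gene
counts  <- sapply(depth, function(d) rnbinom(n_genes, mu = mu * d / 10, size = 5))
rownames(counts) <- paste0("gene", seq_len(n_genes))
colnames(counts) <- paste0("s", seq_len(ncol(counts)))

design <- model.matrix(~ group)

# Standard edgeR preprocessing: drop low-count genes, then TMM scaling factors.
y    <- DGEList(counts = counts, group = group)
keep <- filterByExpr(y, design)
y    <- y[keep, , keep.lib.sizes = FALSE]
y    <- calcNormFactors(y)

# Option 1: estimate sample-specific quality weights alongside the voom
# precision weights, so that noisier samples count for less in the fit.
v_w   <- voomWithQualityWeights(y, design, plot = FALSE)
fit_w <- eBayes(lmFit(v_w, design))

# Option 2: cyclic loess normalization of the log-CPM values inside voom,
# which removes abundance-dependent trends between samples.
v_cl   <- voom(y, design, normalize.method = "cyclicloess", plot = FALSE)
fit_cl <- eBayes(lmFit(v_cl, design))

# Top genes for the group effect (coefficient 2 is groupB vs groupA).
topTable(fit_w, coef = 2, number = 5)
```

Neither option rescues the third case: if the bias is gene-specific and confounded with the group, the group coefficient above will absorb it regardless of which weighting or normalization is used.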