Hello,
My understanding is that salmon has a flag for GC content correction when quantifying reads, but that this is really meant for paired-end data. What options are there for GC content correction on single-end RNA-seq data, and to what extent is such a correction recommended prior to differential gene expression analysis?
To this end, I was wondering whether it makes sense (or would be recommended) to perform GC content correction using the EDASeq R package prior to normalization and differential gene expression analysis with the DESeq2 workflow.
My thinking was that, since each gene is compared to itself across samples in DGE analysis, GC content differences across genes may not play an important role, but I am not sure whether this is correct.
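To make that intuition concrete, here is a minimal toy sketch in Python. It assumes a purely multiplicative model that I made up for illustration (the function names, the `sample_gc_slope` parameter, and all numbers are hypothetical, not anything salmon, EDASeq, or DESeq2 actually fits). It shows that a gene-specific bias cancels when a gene is compared to itself across samples, while a GC effect that differs between samples does not.

```python
import math

def observed_count(true_expr, gene_bias, sample_gc_slope, gc):
    """Toy model (assumption): observed = true expression
    * gene-specific bias * sample-specific GC effect."""
    return true_expr * gene_bias * math.exp(sample_gc_slope * (gc - 0.5))

def log2_fold_change(count_a, count_b):
    """log2 ratio of sample B over sample A for one gene."""
    return math.log2(count_b / count_a)

# Hypothetical gene: 70% GC, under-represented by a fixed gene-specific factor.
gc = 0.70
gene_bias = 0.6
true_a, true_b = 100.0, 200.0  # true log2 fold change is 1

# Case 1: both samples share the same GC dependence, so the bias cancels.
lfc_shared = log2_fold_change(
    observed_count(true_a, gene_bias, 2.0, gc),
    observed_count(true_b, gene_bias, 2.0, gc),
)

# Case 2: the GC dependence differs between samples, so the estimate is distorted.
lfc_differing = log2_fold_change(
    observed_count(true_a, gene_bias, 2.0, gc),
    observed_count(true_b, gene_bias, -2.0, gc),
)

print(f"shared GC effect:    {lfc_shared:.3f}")
print(f"differing GC effect: {lfc_differing:.3f}")
```

Under this toy model, the first case recovers the true fold change and the second does not, so what would matter is whether the GC dependence varies from sample to sample rather than from gene to gene.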
Many thanks!
Maya
Great, thanks! I do have the MultiQC output and will review it for the GC content curves. Right now I'm doing DGE, but I would also like to look at differential transcript usage using something like the DRIMSeq package.
If the GC curves look similar across samples and are roughly representative of the transcriptome (e.g. there are some 30% GC reads and some 70% GC reads), you can skip GC content correction. I haven't extensively tested the single-end version of the correction.
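As a rough illustration of what "similar across samples" could mean numerically, here is a small Python sketch. It assumes you have already exported a per-sample GC histogram (percent GC mapped to read count) into plain dictionaries. The sample names, counts, and the choice of total variation distance are all hypothetical choices for this example, not something MultiQC or salmon computes for you.

```python
from itertools import combinations

def normalize(hist):
    """Convert read counts per GC bin into proportions that sum to 1."""
    total = sum(hist.values())
    return {gc: n / total for gc, n in hist.items()}

def total_variation(p, q):
    """Total variation distance between two GC distributions (0 = identical)."""
    bins = set(p) | set(q)
    return 0.5 * sum(abs(p.get(b, 0.0) - q.get(b, 0.0)) for b in bins)

def max_pairwise_distance(hists):
    """Largest distance between any two samples' normalized GC curves."""
    norm = {name: normalize(h) for name, h in hists.items()}
    return max(
        total_variation(norm[a], norm[b]) for a, b in combinations(norm, 2)
    )

# Made-up histograms: percent GC -> number of reads.
hists = {
    "sample_1": {30: 100, 50: 800, 70: 100},
    "sample_2": {30: 120, 50: 760, 70: 120},
    "sample_3": {30: 20, 50: 580, 70: 400},  # shifted toward high GC
}

worst = max_pairwise_distance(hists)
print(f"largest pairwise GC curve distance: {worst:.2f}")
```

A value near zero suggests the curves overlap closely. What counts as "too different" is a judgment call, so inspecting the plots remains the main check.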