I'm interested in comparing counts for different features within a gene within one condition (say 3' UTR counts vs CDS counts) in RNA-seq data. Initially I was hoping to treat these essentially as different samples, i.e. just plug the counts into limma or DESeq2 and treat region 1 as sample 1 and region 2 as sample 2. However, these features have different lengths, so I need to normalize counts for length, but these programs expect raw counts as input. Is there any way that I can do differential expression on TPM, or somehow put length information into these programs?
Thanks
I believe that DEXSeq still compares two different samples, so that gene/exon length cancels out. I didn't see anywhere that I could enter the length of different features. Is there a way to do that?
Directly comparing coverage between different genomic regions tends to be a rather thankless task. Length of the regions aside, you've also got considerations like GC content, mapping/sequencing biases, etc. that affect the coverage in each region. If you want to use limma, consider whether it's possible to rephrase your biological question in terms of "differential DE", i.e., test for differences between regions in the DE log-fold change between conditions. This is a safer alternative as the aforementioned biases cancel out within each region, such that you can compare log-fold changes between regions.
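To make the "biases cancel out" point concrete, here is a minimal sketch of the arithmetic in plain Python. It is not limma's or DESeq2's API: the function names, the example counts, and the single `region_bias` factor (standing in for length, GC content, mappability, etc.) are all hypothetical, and it ignores library-size normalization, replicates, and zero counts.

```python
import math


def log2_fold_change(count_control, count_treated):
    """Log2 fold change of one region between two conditions.

    Assumes both counts are positive; a real analysis would add a
    prior count or use a proper statistical model instead.
    """
    return math.log2(count_treated / count_control)


def differential_lfc(utr_control, utr_treated, cds_control, cds_treated):
    """Difference between the UTR and CDS log2 fold changes ("differential DE")."""
    utr_lfc = log2_fold_change(utr_control, utr_treated)
    cds_lfc = log2_fold_change(cds_control, cds_treated)
    return utr_lfc - cds_lfc


# Hypothetical counts for one gene in a control and a treated sample.
utr_control, utr_treated = 100, 400
cds_control, cds_treated = 1000, 2000

unbiased = differential_lfc(utr_control, utr_treated, cds_control, cds_treated)

# Multiply the UTR counts by an arbitrary region-specific factor in BOTH
# conditions. The factor drops out of the ratio, so the result is unchanged.
region_bias = 3.7
biased = differential_lfc(
    utr_control * region_bias,
    utr_treated * region_bias,
    cds_control,
    cds_treated,
)

print(unbiased, biased)
```

Any factor that multiplies a region's counts equally in both conditions drops out of that region's log-fold change, which is why the between-region comparison of log-fold changes does not need the feature lengths. The same factor would not cancel if you compared UTR and CDS counts directly within a single condition.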
It sounds like you want to compare coverage depth in different regions within a single sample. I don't know of any count-based tools that would work for that purpose.