Differential expression across gene features
Jake ▴ 90
@jake-7236
Last seen 2.4 years ago
United States

I'm interested in comparing counts for different features within a gene, within one condition (say 3' UTR counts vs CDS counts), in RNA-seq data. Initially I was hoping to treat the features essentially as different samples: plug the counts into limma or DESeq2 and treat region 1 as sample 1 and region 2 as sample 2. However, these features have different lengths, so I need to normalize the counts for length, and these programs expect raw counts as input. Is there any way to do differential expression on TPM, or to supply length information to these programs?

Thanks

limma deseq2 differential gene expression • 1.6k views
@gordon-smyth
Last seen 18 hours ago
WEHI, Melbourne, Australia

You cannot test for a difference in expression level between gene features in limma or edgeR. This is definitely not something that these packages are designed to do. (I imagine the same applies to DESeq2.) As Aaron has outlined, there would be a host of problems with trying to do that.

You can however test whether the relative abundance of 3'UTR reads to CDS reads changes between treatment conditions. In limma, this is done by diffSplice(). In edgeR, it is diffSpliceDGE().

If you really just want to compare expression of 3'UTR to CDS directly, I would examine this by making coverage plots for your favourite genes (for example using IGV or IGB).

@ryan-c-thompson-5618
Last seen 12 weeks ago
Icahn School of Medicine at Mount Sinai…

You are probably looking for something like DEXSeq, which is designed for finding differential usage of gene features.


I believe that DEXSeq still compares different samples, so that gene/exon length cancels out. I didn't see anywhere that I could enter the lengths of the different features. Is there such an option?


Directly comparing coverage between different genomic regions tends to be a rather thankless task. Length of the regions aside, you've also got considerations like GC content, mapping/sequencing biases, etc. that affect the coverage in each region. If you want to use limma, consider whether it's possible to rephrase your biological question in terms of "differential DE", i.e., test for differences between regions in the DE log-fold change between conditions. This is a safer alternative as the aforementioned biases cancel out within each region, such that you can compare log-fold changes between regions.
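To make the cancellation argument concrete, here is a minimal sketch in plain Python (not limma code). It assumes a toy model in which the expected count of a region is the product of its expression, its length, a region-specific bias factor and the library depth. All names and numbers below are hypothetical and chosen only for illustration; a real analysis would also need replicates and a proper statistical model.

```python
import math

def expected_count(expression, length, bias, depth):
    """Toy model (an assumption): reads scale with expression,
    region length, a region-specific bias and library depth."""
    return expression * length * bias * depth

# Hypothetical regions of one gene: (length in bp, bias factor).
regions = {"CDS": (1500, 1.0), "UTR3": (600, 0.7)}

# Hypothetical true expression per region and condition.
expression = {
    "CDS": {"control": 10.0, "treated": 20.0},
    "UTR3": {"control": 10.0, "treated": 10.0},
}
depth = {"control": 1.0, "treated": 1.0}

def count(region, condition):
    length, bias = regions[region]
    return expected_count(expression[region][condition], length, bias,
                          depth[condition])

def log2_fold_change(region):
    """Treated vs control within one region: the region's length and
    bias appear in both numerator and denominator, so they cancel."""
    return math.log2(count(region, "treated") / count(region, "control"))

# Direct comparison within one condition is confounded: both regions
# have the same true expression in control, yet the log-ratio is not 0.
naive_log_ratio = math.log2(count("UTR3", "control") / count("CDS", "control"))

# "Differential DE": compare the log-fold changes between regions.
lfc_cds = log2_fold_change("CDS")
lfc_utr = log2_fold_change("UTR3")
differential_de = lfc_utr - lfc_cds

print(f"naive UTR3 vs CDS log2 ratio (control): {naive_log_ratio:.3f}")
print(f"log2FC CDS: {lfc_cds:.3f}, log2FC UTR3: {lfc_utr:.3f}")
print(f"differential DE (UTR3 - CDS): {differential_de:.3f}")
```

In this toy setup the within-condition ratio is non-zero purely because of length and bias, whereas the difference in log-fold changes reflects only the change in true expression.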


It sounds like you want to compare coverage depth in different regions within a single sample. I don't know of any count-based tools that would work for that purpose.
