Difference between gene-level test and transcript-level test with RATs (Relative Abundance of Transcripts)
@useranonyme-20431

Hi

I'm trying to perform a Differential Transcript Usage (DTU) analysis to find genes that are alternatively spliced differently between my two conditions.

The RATs package seems to support this, and I saw that it uses two methods to find DTU genes:

At the gene level, RATs compares the set of each gene’s isoform abundances between the two conditions to identify if the abundance ratios have changed. At the transcript level, RATs compares the abundance of each individual transcript against the pooled abundance of its sibling isoforms to identify changes in the proportion of the gene’s expression attributable to that specific transcript.

I don't understand the difference between the two tests. I made a diagram to illustrate what I understood of the transcript-level test:

https://i.ibb.co/pPx609Y/Capture-d-cran-2019-07-09-20-35-39.png

Is it right?

If so, what is the difference from the gene-level test?

Thanks in advance

RATs RNA-Seq Isoforms DTU expression • 2.3k views

I have deleted the post on biostar

fruce_ki ▴ 20
@fruce_ki-13220
Austria/Vienna/Research Institute for M…

Hi,

Thank you for taking an interest in RATs.

Say you have two conditions A and B and a gene with isoforms 1, 2, 3, etc...

The gene level test compares the full set of abundances (A1, A2, A3, ...) against the full set of abundances (B1, B2, B3, ...). This can tell you in which genes the ratios change, but it can't tell you which isoforms are responsible for the change.

For the transcript level test all sets have only two components, the abundance of the isoform in question and the abundance of all the other isoforms together. So it compares (A1, A2+A3+...) vs (B1, B2+B3+...) for transcript 1, (A2, A1+A3+...) vs (B2, B1+B3+...) for transcript 2, (A3, A1+A2+...) vs (B3, B1+B2+...) for transcript 3, etc. This can tell you when the abundance of that specific isoform changes.
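The two groupings described above can be sketched in a few lines. This is only an illustration in plain Python, not code from RATs: the function names `gene_level_sets` and `transcript_level_sets` are hypothetical, and the abundances are made-up values for a gene with three isoforms.

```python
def gene_level_sets(a, b):
    """Gene level: the full abundance vectors of the two conditions."""
    return list(a), list(b)


def transcript_level_sets(a, b):
    """Transcript level: for each isoform i, pair its abundance with the
    pooled abundance of all sibling isoforms, in each condition."""
    total_a, total_b = sum(a), sum(b)
    return [((ai, total_a - ai), (bi, total_b - bi)) for ai, bi in zip(a, b)]


# Illustrative abundances for isoforms 1, 2, 3 in conditions A and B.
cond_a = [100, 200, 300]
cond_b = [5000, 200, 300]

print("gene level:", gene_level_sets(cond_a, cond_b))
for i, (set_a, set_b) in enumerate(transcript_level_sets(cond_a, cond_b), start=1):
    print(f"isoform {i}: {set_a} vs {set_b}")
```

Each transcript-level comparison has exactly two components per condition, whereas the gene-level comparison has as many components as the gene has isoforms.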

Does that make sense?

PS. I am a little surprised to find this question on bioconductor, as I have not yet submitted the package to bioconductor.

EDIT: I should probably note that when I say abundances, I mean the scaled TPM values, not the proportions. The size of the counts is important in determining significance; proportions lose that information.
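To see why the magnitude matters, here is a minimal sketch of a textbook G-test statistic for a 2 x k table, written in plain Python. It is an assumption that this matches the general form of the test used here; the actual RATs implementation may differ in details such as corrections or filtering, and `g_statistic` is a hypothetical helper name. The two inputs have identical proportions but differ 100-fold in scale.

```python
import math


def g_statistic(obs_x, obs_y):
    """Textbook G statistic of independence for two sets of abundances."""
    rows = [list(obs_x), list(obs_y)]
    row_totals = [sum(row) for row in rows]
    col_totals = [x + y for x, y in zip(*rows)]
    grand_total = sum(row_totals)
    g = 0.0
    for r, row in enumerate(rows):
        for c, observed in enumerate(row):
            if observed > 0:  # a zero cell contributes nothing to the sum
                expected = row_totals[r] * col_totals[c] / grand_total
                g += observed * math.log(observed / expected)
    return 2.0 * g


# Same isoform proportions in both cases, but 100x more abundance in the second.
small_x, small_y = [10, 20, 30], [30, 20, 10]
large_x, large_y = [1000, 2000, 3000], [3000, 2000, 1000]

g_small = g_statistic(small_x, small_y)
g_large = g_statistic(large_x, large_y)
print(g_small, g_large)
```

The statistic grows in proportion to the total abundance, so the same shift in ratios is stronger evidence when it is backed by larger values. Feeding in proportions instead would discard exactly that information.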


Thanks for this explanation, it is clearer now!

I have one last question. For example, say that in a gene-level test we have these abundances for the 3 transcripts in the 2 conditions:

g.test.2(obsx= c(100, 200, 300), obsy= c(5000, 200, 300))

Will g.test.2 be significant only because the abundance of the first transcript greatly increased, even though the other transcripts have not changed?

So will this gene be called DTU only because it is also a DEG?

I am probably missing something, and I apologize if my question seems silly, haha.

EDIT: so if I understand correctly, using the scaledTPM values solves the problem.


You can have any combination of DGE and DTE and DTU going on at the same time, and they are all valid.

In your example, you have one DTE isoform (from 100 to 5000) and the whole gene being DGE (from 600 to 5500) and the gene also being DTU (from 0.17/0.33/0.5 to 0.91/0.04/0.05). That isoform will probably be flagged as DTU (0.17 to 0.91), but the other two may also be flagged as DTU (0.33 to 0.04, 0.50 to 0.05), even though they are not DTE. You also have a primary isoform switch going on.
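The numbers quoted above can be checked with a short sketch in plain Python. It only reproduces the arithmetic (gene totals, isoform proportions, most abundant isoform); it performs no significance test, and the helper names `proportions` and `primary_isoform` are hypothetical.

```python
def proportions(abundances):
    """Share of the gene's total abundance attributable to each isoform."""
    total = sum(abundances)
    if total == 0:
        raise ValueError("gene has zero total abundance")
    return [x / total for x in abundances]


def primary_isoform(abundances):
    """1-based index of the most abundant isoform."""
    return max(range(len(abundances)), key=lambda i: abundances[i]) + 1


cond_a = [100, 200, 300]
cond_b = [5000, 200, 300]

print(sum(cond_a), sum(cond_b))                       # 600 5500
print([round(p, 2) for p in proportions(cond_a)])     # [0.17, 0.33, 0.5]
print([round(p, 2) for p in proportions(cond_b)])     # [0.91, 0.04, 0.05]
print(primary_isoform(cond_a), primary_isoform(cond_b))  # 3 1
```

Isoforms 2 and 3 keep the same abundance in both conditions, yet their share of the gene drops sharply because the total grows, which is why they can be flagged as DTU without being DTE.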

I would not say the gene is DTU because it is DGE. There are many regulatory mechanisms acting at various levels, so unless you actually know what regulatory change occurred, you can't say that one observation is causing the other. They are all just different views of the same complex event.


OK, thanks! :-)

I compared my list of DEGs with my list of DTU genes, and the overlap is small, so the example I used (obsx = c(100, 200, 300), obsy = c(5000, 200, 300)) is perhaps rare biologically speaking.

When I compare the results with rMATS, the overlap with RATs is much better (not surprising, as both approaches study splicing).


DGE and DTU are different things, regulated by different mechanisms. You can have one or the other or both. I would not normally expect a high overlap between DGE and DTU genes.


Also, I'm not sure why you are using the g.test.2 function directly. Do you only have one gene to test?

You should use TPMs scaled to the size of the sample's library, but I'm not sure what "problem" you think this will fix. You didn't mention a problem, it was just a question about test design.


No, it was just to see an example with a single gene. I have the scaled TPMs for all of the genes.
