If transcript-to-gene conversion is needed in kallisto > tximport > DESeq2 pipeline
Yunlu Zhu • 0
@yunlu-zhu-15240
Last seen 6.8 years ago
USA

Hi guys, 

I'm new to bioinformatics, so this may be a naive question. My workflow:

1.  Use the tximport function to create the txi list from kallisto .h5 files
2.  Use the DESeqDataSetFromTximport function to generate the DESeq2 data set

Question:

Do I need to generate gene-level unnormalized counts during the import*, or can I just use transcript-level counts**?

---

I found the article "Importing transcript abundance datasets with tximport" very helpful, but I am still confused by the following paragraph:

Note: there are two suggested ways of importing estimates for use with differential gene expression (DGE) methods. The first method, which we show below for edgeR and for DESeq2, is to use the gene-level estimated counts from the quantification tools, and additionally to use the transcript-level abundance estimates to calculate a gene-level offset that corrects for changes to the average transcript length across samples. The code examples below accomplish these steps for you, keeping track of appropriate matrices and calculating these offsets. For edgeR you need to assign a matrix to y$offset, but the function DESeqDataSetFromTximport takes care of creation of the offset for you. Let’s call this method “original counts and offset”. 

Thanks in advance,
Yunlu

*

txi.kallisto.gl <- tximport(files, type = "kallisto", tx2gene = tx2gene)

**

txi.kallisto <- tximport(files, type = "kallisto", txOut = TRUE)
deseq2 rnaseq kallisto tximport • 2.8k views
@mikelove
Last seen 16 hours ago
United States

Here’s my comment on this question from another thread:

C: Incredibly high/low foldChange

To repeat from that thread: it works and will find DE at the transcript level, but you should consider applying a stricter padj threshold. In the meantime, we are working on improvements.
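
For readers following along, here is a minimal sketch of the transcript-level route described above. It is an illustration under stated assumptions, not code from the linked thread: the helper name `run_tx_level_de`, the `condition` column in `coldata`, and the 0.01 cutoff are all hypothetical choices, and `files` is assumed to be the same named vector of kallisto paths used in the question.

```r
# Hypothetical helper: transcript-level DE with a stricter padj cutoff.
# The tximport and DESeq2 packages must be installed when it is called.
run_tx_level_de <- function(files, coldata, padj_cutoff = 0.01) {
  # txOut = TRUE keeps one row per transcript (no tx2gene aggregation)
  txi <- tximport::tximport(files, type = "kallisto", txOut = TRUE)

  # Assumes coldata has a column named `condition` (an assumption)
  dds <- DESeq2::DESeqDataSetFromTximport(txi, colData = coldata,
                                          design = ~ condition)
  dds <- DESeq2::DESeq(dds)

  # alpha is the FDR level that independent filtering is tuned for
  res <- DESeq2::results(dds, alpha = padj_cutoff)

  # Keep transcripts passing the stricter threshold; which() drops NA padj
  res[which(res$padj < padj_cutoff), ]
}
```

How strict the cutoff should be is a judgment call; 0.01 is only a placeholder default.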


Yes, the DE analysis seems to work well. But for gene ontology and pathway-level analysis, is it necessary to generate a gene-level dds? Thanks!


Yes, the gene set methods need gene-level results. You could use stageR or the aggregation methods from DEXSeq to combine transcript-level results to the gene level.


I'll just use tximport (txOut = FALSE with tx2gene) to estimate abundance at the gene level, then proceed to DESeq2 and the downstream analysis. Thank you very much, Mike! This is very helpful!
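
To make the gene-level import concrete, the toy example below shows roughly what passing tx2gene does to the estimated counts: transcripts are mapped to genes and their counts are summed within each gene. It uses base R only, and the matrix values and IDs (`tx1`, `geneX`, and so on) are made up for illustration.

```r
# Toy transcript-level estimated counts: 4 transcripts x 2 samples
# (made-up numbers; the matrix is filled column by column)
tx_counts <- matrix(c(10, 5, 0, 20,
                      12, 3, 4, 18),
                    nrow = 4,
                    dimnames = list(c("tx1", "tx2", "tx3", "tx4"),
                                    c("sampleA", "sampleB")))

# tx2gene-style table: first column transcript ID, second column gene ID
tx2gene <- data.frame(TXNAME = c("tx1", "tx2", "tx3", "tx4"),
                      GENEID = c("geneX", "geneX", "geneY", "geneY"))

# Look up the gene for each transcript row, then sum counts within each gene
gene_ids <- tx2gene$GENEID[match(rownames(tx_counts), tx2gene$TXNAME)]
gene_counts <- rowsum(tx_counts, group = gene_ids)

print(gene_counts)
```

The real tximport call also tracks average transcript lengths per gene, which DESeqDataSetFromTximport uses to build the offset mentioned in the vignette paragraph quoted in the question; this sketch leaves that part out.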
