tximport, kallisto and DESeq2 (quick answer)
Mozart (@mozart-20625)

Hello there, I am trying to analyse a dataset using DESeq2 with kallisto output imported via tximport. I am using the following code:

txi <- tximport(files, type = "kallisto", tx2gene = tx2gene, ignoreTxVersion = TRUE)

So in this case countsFromAbundance is "no", I guess (I am not specifying it, so it should take the default setting). These counts are not normalised for length bias, right? But if I understood correctly, that is handled by the DESeqDataSetFromTximport function, as this note describes:

Note: there are two suggested ways of importing estimates for use with differential gene expression (DGE) methods. The first method, which we show below for edgeR and for DESeq2, is to use the gene-level estimated counts from the quantification tools, and additionally to use the transcript-level abundance estimates to calculate a gene-level offset that corrects for changes to the average transcript length across samples. The code examples below accomplish these steps for you, keeping track of appropriate matrices and calculating these offsets. For edgeR you need to assign a matrix to y$offset, but the function DESeqDataSetFromTximport takes care of creation of the offset for you. Let’s call this method “original counts and offset”.

So I avoid the length bias because the DESeqDataSetFromTximport function automatically corrects the counts coming from the tsv files for length bias when I call it like this, right?

dds <- DESeqDataSetFromTximport(txi, sampleTable, ~condition)

If someone could confirm this, that would be great.

@mikelove

The helper function DESeqDataSetFromTximport takes care of everything for you. By default it uses the counts plus a length offset, but if it detects that you set countsFromAbundance to "scaledTPM" or "lengthScaledTPM" in tximport, it does not add an offset. It does the right thing either way.
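
To see which of the two cases applies to a given import, one option is to look at the countsFromAbundance element of the list that tximport returns. The sketch below is only an illustration: describe_import is a hypothetical helper, not part of tximport or DESeq2, and toy_txi is a stand-in list rather than a real import.

```r
# Hypothetical helper (not part of tximport or DESeq2): report which import
# mode a tximport result corresponds to, based on its countsFromAbundance element.
describe_import <- function(txi) {
  mode <- txi$countsFromAbundance
  if (is.null(mode)) {
    stop("no countsFromAbundance element; is this a tximport result?")
  }
  if (identical(mode, "no")) {
    "original counts: DESeqDataSetFromTximport adds a length offset"
  } else if (mode %in% c("scaledTPM", "lengthScaledTPM")) {
    "scaled counts: no offset is added"
  } else {
    paste("other countsFromAbundance setting:", mode)
  }
}

# Toy stand-in for a real tximport result, for illustration only
toy_txi <- list(countsFromAbundance = "no")
describe_import(toy_txi)
```

With a real import, describe_import(txi) would be called on the object returned by tximport instead of the toy list.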


Thanks a lot for your quick reply. So, just for clarity's sake, this would be one possible solution:

library(tximport)
library(DESeq2)

txi.kallisto.tsv <- tximport(files, type = "kallisto", tx2gene = tx2gene, ignoreTxVersion = TRUE)
sampleTable <- data.frame(condition = factor(c("a", "a", "a", "b", "b", "b")))
rownames(sampleTable) <- colnames(txi.kallisto.tsv$counts)
dds <- DESeqDataSetFromTximport(txi.kallisto.tsv, sampleTable, ~condition)

# set the reference level before fitting, so DESeq only needs to run once
dds$condition <- relevel(dds$condition, ref = "b")
dds <- DESeq(dds)
res <- results(dds)

right?


Yes, this is the recommended pipeline. Using scaled TPM would also work; the function takes care of everything.
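
For reference, the scaled TPM route might look like the sketch below. It assumes files, tx2gene and sampleTable are built as in the code above; build_dds_scaled is a hypothetical wrapper name used only for illustration, while countsFromAbundance = "lengthScaledTPM" is one of the two scaled settings mentioned in the answer.

```r
# Hypothetical wrapper (illustration only): import kallisto output with
# length-scaled TPM counts and build the DESeqDataSet from it.
# Requires the tximport and DESeq2 packages when it is called.
build_dds_scaled <- function(files, tx2gene, sampleTable) {
  txi <- tximport::tximport(files, type = "kallisto", tx2gene = tx2gene,
                            ignoreTxVersion = TRUE,
                            countsFromAbundance = "lengthScaledTPM")
  # The counts are already scaled from abundance here, so per the answer
  # above DESeqDataSetFromTximport does not add a length offset.
  DESeq2::DESeqDataSetFromTximport(txi, sampleTable, ~condition)
}
```

The returned object can then go through relevel, DESeq and results exactly as in the pipeline above.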


That sounds great. Thanks a lot, both for this support and for giving me the opportunity to do good science with your packages.
