Hello there, I am trying to analyse a dataset using DESeq2, with kallisto quantifications imported via tximport. I am using the following code:
txi <- tximport(files, type = "kallisto", tx2gene = tx2gene, ignoreTxVersion = TRUE)
So, in this case countsFromAbundance is "no", I guess (I am not specifying it, so it should take the default setting). These counts are not normalised for length bias, right? But if I understood the following note from the vignette correctly, the DESeqDataSetFromTximport function handles that:
Note: there are two suggested ways of importing estimates for use with differential gene expression (DGE) methods. The first method, which we show below for edgeR and for DESeq2, is to use the gene-level estimated counts from the quantification tools, and additionally to use the transcript-level abundance estimates to calculate a gene-level offset that corrects for changes to the average transcript length across samples. The code examples below accomplish these steps for you, keeping track of appropriate matrices and calculating these offsets. For edgeR you need to assign a matrix to y$offset, but the function DESeqDataSetFromTximport takes care of creation of the offset for you. Let’s call this method “original counts and offset”.
So I circumvent the length bias because the DESeqDataSetFromTximport function automatically corrects the counts coming from the tsv files for length bias, right? This is the call I use:
dds <- DESeqDataSetFromTximport(txi, sampleTable, ~condition)
If someone could confirm this, that would be great.
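For readers following the thread, the "original counts and offset" idea from the quoted note can be illustrated with a toy calculation in base R. This is only a sketch with invented numbers, not the actual tximport or DESeq2 code: the matrices `tx_len` and `tx_tpm`, the gene label `geneA`, and all values are assumptions made up for illustration. It shows how an abundance-weighted average transcript length per gene changes between samples when isoform usage changes, which is the quantity the offset is meant to account for.

```r
# Toy transcript-level estimates for ONE gene with two isoforms (invented numbers).
# Rows are transcripts, columns are samples.
tx_len <- matrix(c(1000, 1000,
                   3000, 3000),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(c("tx_short", "tx_long"), c("s1", "s2")))
tx_tpm <- matrix(c(90, 10,
                   10, 90),
                 nrow = 2, byrow = TRUE,
                 dimnames = dimnames(tx_len))
gene_id <- c("geneA", "geneA")  # both transcripts belong to the same gene

# Gene-level abundance: sum the transcript abundances within each gene.
gene_tpm <- rowsum(tx_tpm, gene_id)

# Abundance-weighted average transcript length per gene and sample.
avg_len <- rowsum(tx_tpm * tx_len, gene_id) / gene_tpm

# Centre each gene's lengths on their geometric mean across samples, so that
# only the relative change in length between samples remains.
len_factor <- avg_len / exp(rowMeans(log(avg_len)))

print(avg_len)
print(len_factor)
```

In this toy case the gene switches from mostly the short isoform in s1 to mostly the long isoform in s2, so its average length goes up and, for the same number of molecules, more reads would be expected in s2. A per-gene, per-sample factor of this kind is what lets the model avoid reading that as a change in expression.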
Thanks a lot for your quick reply. So, just for clarity's sake, this would be one possible solution:
right?
Yes, this is the recommended pipeline. Using the scaled TPM would also work. The function takes care of everything.
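As a rough illustration of the "scaled TPM" alternative mentioned here: in tximport it is selected with `countsFromAbundance = "scaledTPM"` (or `"lengthScaledTPM"`), and as far as I understand DESeqDataSetFromTximport then uses the counts alone, without adding the average-length offset. The base R sketch below shows the scaling idea on invented numbers. The helper `scale_to_libsize` and the matrices `counts` and `tpm` are hypothetical names, not part of either package.

```r
# Invented gene-level estimates: two genes (rows), two samples (columns).
counts <- matrix(c(100, 300,
                   900, 700),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(c("geneA", "geneB"), c("s1", "s2")))
tpm <- matrix(c(20, 40,
                80, 60),
              nrow = 2, byrow = TRUE,
              dimnames = dimnames(counts))

# Hypothetical helper: rescale the abundances column by column so that each
# sample sums to its total estimated counts (the library size).
scale_to_libsize <- function(abund, counts) {
  t(t(abund) * (colSums(counts) / colSums(abund)))
}

scaled <- scale_to_libsize(tpm, counts)
print(scaled)
print(colSums(scaled))  # matches colSums(counts)
```

Because TPM values are already length-normalised, counts built this way do not carry the length effect, which is why no separate offset is needed for them.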
That sounds great. Thanks a lot, both for this support and for giving me the opportunity to do good science thanks to your packages.