Confused about tximport-DESeq2 setup
Dunois • 0

The "Downstream DGE in Bioconductor" section of the tximport vignette contains only two notes, and I find the explanation there confusing.

Which of the two code snippets below is the correct way to import expression levels quantified with Salmon (and then pass them on to DESeq2), given that the transcript-to-gene mapping is a two-column data.frame named tx2gene?

(1):

txi <- tximport::tximport(files = flist, type = "salmon", tx2gene = tx2gene, countsFromAbundance="lengthScaledTPM")
dds <- DESeqDataSetFromTximport(txi, sampleTable, ~cond)

(2):

txi <- tximport::tximport(files = flist, type = "salmon", tx2gene = tx2gene)
dds <- DESeqDataSetFromTximport(txi, sampleTable, ~cond)
DESeq2 tximport salmon • 1.0k views
ATpoint ★ 4.6k

From the user's perspective they do almost the same thing: both make sure that differences in average transcript length per gene and sample do not bias the counts. The first one modifies the counts themselves to correct for average transcript length, so you get a single count matrix ready for downstream analysis. The second one leaves the counts unchanged and produces a matrix of average transcript lengths per gene and sample, which DESeq2 then uses as an offset in its model. Both are valid. The first one is more generic, since some tools/approaches (like limma-voom) do not support a length offset matrix. I prefer the generic one, but the choice is yours.
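To make the first option more concrete, here is a minimal base-R sketch of what I understand countsFromAbundance="lengthScaledTPM" to do: multiply the abundances by the gene's average transcript length across samples, then rescale each sample back to its original library size. The helper name length_scaled_counts and all numbers are made up for illustration; this is not tximport's actual code, and in a real analysis you would simply let tximport do it.

```r
# Toy data: 2 genes x 2 samples. All values are invented for illustration.
dn <- list(c("geneA", "geneB"), c("s1", "s2"))
counts    <- matrix(c(100, 200, 100, 200), nrow = 2, dimnames = dn)      # estimated counts
abundance <- matrix(c(10, 20, 5, 20), nrow = 2, dimnames = dn)           # TPM-like abundances
len       <- matrix(c(1000, 1000, 2000, 1000), nrow = 2, dimnames = dn)  # average transcript length

# Hypothetical helper sketching the "lengthScaledTPM" idea (an assumption
# about the approach, not the package's implementation).
length_scaled_counts <- function(counts, abundance, len) {
  # Scale abundance by each gene's average length across all samples ...
  new_counts <- abundance * rowMeans(len)
  # ... then rescale every column so it sums to the original library size.
  sweep(new_counts, 2, colSums(counts) / colSums(new_counts), "*")
}

scaled <- length_scaled_counts(counts, abundance, len)
print(round(scaled, 1))
```

In this toy case geneA has the same estimated count in both samples, but its average transcript length doubles in s2, so its abundance is lower there. The length-scaled counts reflect that drop, whereas the unmodified counts from option (2) would only do so once the length matrix is used as an offset.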

See also the vignette: https://bioconductor.org/packages/release/bioc/vignettes/tximport/inst/doc/tximport.html#Downstream_DGE_in_Bioconductor
