Question

Confused about tximport counts and abundance with Salmon input

1

Al90 ▴ 10

@431fabe7

Last seen 2.7 years ago

Sweden

Hi, I am using tximport to summarize transcript-level expression data from Salmon to gene-level expression data. I have read through the documentation, but I am still unsure how to interpret the "counts" and "abundance" matrices. As far as I understand:

Counts = best estimate of the original counts
Abundance = TPMs (at least when using Salmon input data)

I have gathered the data in two different ways:

# With countsFromAbundance at its default setting ("no")
txi.salmon <- tximport(files, type = "salmon", txIn = TRUE, tx2gene = tx2gene, ignoreTxVersion = TRUE)
# With countsFromAbundance = "scaledTPM"
txi.scaled_tpm <- tximport(files, type = "salmon", txIn = TRUE, tx2gene = tx2gene, ignoreTxVersion = TRUE, countsFromAbundance = "scaledTPM")

# Comparing the counts matrices: number of entries that differ
sum(txi.scaled_tpm$counts != txi.salmon$counts)
# [1] 1079547

# Comparing the abundance matrices: number of entries that differ
sum(txi.scaled_tpm$abundance != txi.salmon$abundance)
# [1] 0

Why do the counts matrices differ?
Are the counts an estimate of the original counts and the abundance the TPMs?

RNASeq tximport • 3.3k views

ADD COMMENT • link 2.7 years ago Al90 ▴ 10

score 3 · Answer 1 · 2022-04-08

3

Michael Love 43k

@mikelove

Last seen 6 days ago

United States

See the tximport paper for discussion of the different options for using transcript-level counts in gene-level DE analysis. There are two strategies: original counts with an offset to account for differences in feature length across samples (default) or generating counts from abundance (and discarding the estimated counts). I prefer the default method, but they tend to give similar results.

counts are estimates of the number of reads assigned to each feature, and abundance is the TPM, which divides counts by the feature length and then rescales each sample to sum to one million. I prefer counts for statistical testing and this is the default in the vignette.
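To make the relationship between the two matrices concrete, here is a minimal base R sketch of how TPM can be derived from estimated counts and effective lengths. The numbers and the names `counts`, `eff_len`, `rate` and `tpm` are made up for illustration; this is not code from tximport or Salmon.

```r
# Toy data (made up): 3 features x 2 samples.
counts <- matrix(c(100, 200, 300,
                   50, 400, 150),
                 nrow = 3,
                 dimnames = list(c("g1", "g2", "g3"), c("s1", "s2")))

# Effective feature lengths, which may differ from sample to sample.
eff_len <- matrix(c(1000, 2000, 500,
                    1000, 2500, 500),
                  nrow = 3,
                  dimnames = dimnames(counts))

# Reads per base of effective length.
rate <- counts / eff_len

# TPM: rescale each sample's rates so that they sum to one million.
tpm <- sweep(rate, 2, colSums(rate), "/") * 1e6

# Each column of tpm sums to 1e6, regardless of the library size.
colSums(tpm)
```

Because of the per-sample rescaling, TPM values carry no information about sequencing depth, which is one reason counts are the usual input for statistical testing.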

ADD COMMENT • link 2.7 years ago Michael Love 43k

0

Thanks for the fast answer. There is still something I don't understand: why does the counts matrix differ when I use a different option for countsFromAbundance?

ADD REPLY • link 2.7 years ago Al90 ▴ 10

2

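The non-default countsFromAbundance options, such as "scaledTPM", are methods that directly modify the counts: they generate new counts from the abundances. The default is to use an offset and not modify the counts, which is why the abundance matrices are identical but the counts matrices are not.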
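The effect can be sketched in base R with toy data. The idea behind "scaledTPM", as I understand it, is to take the abundances and scale each sample up to its original library size. The numbers and variable names below are hypothetical, and this is an illustration of the idea rather than tximport's internal code.

```r
# Toy data (made up): 3 genes x 2 samples.
counts <- matrix(c(100, 200, 300,
                   50, 400, 150),
                 nrow = 3,
                 dimnames = list(c("g1", "g2", "g3"), c("s1", "s2")))
eff_len <- matrix(c(1000, 2000, 500,
                    1000, 2500, 500),
                  nrow = 3,
                  dimnames = dimnames(counts))

# Abundance (TPM) derived from counts and effective lengths.
rate <- counts / eff_len
tpm <- sweep(rate, 2, colSums(rate), "/") * 1e6

# "scaledTPM"-style counts: rescale each column of the abundance matrix
# so that it sums to the original library size of that sample.
scaled_tpm_counts <- sweep(tpm, 2, colSums(counts) / colSums(tpm), "*")

# Library sizes are preserved ...
all.equal(colSums(scaled_tpm_counts), colSums(counts))

# ... but individual entries generally differ from the original counts,
# because the length information is now folded into the counts themselves.
sum(abs(scaled_tpm_counts - counts) > 1e-8)
```

The abundance matrix is not touched by this step, which matches the comparison in the question: the abundance matrices agree while the counts matrices do not.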