Expected counts from RSEM in DESeq2
@chudarchudar-9587

Hi All,

I am new to DESeq2 and am following the Trinity pipeline for differential expression analysis. In that pipeline, RSEM quantifies transcript abundance and produces expected counts, which are rounded and then passed to DESeq2.

I would like to know whether these RSEM expected counts can be used in DESeq2 in place of raw counts to identify differentially expressed genes.

Regards

Chudar

RSEM deseq2 expected_counts raw_counts • 18k views
@mikelove

Yes, RSEM expected counts can be used with DESeq2.

The recommended pipeline would be to use tximport(), then DESeqDataSetFromTximport().

There is an example of importing RSEM gene-level estimated counts in the tximport vignette.

In addition to reading in the counts table, the tximport pipeline incorporates the average transcript length per gene as a normalization factor for gene-level DE analysis. See the citation listed on the tximport landing page for more details:

https://bioconductor.org/packages/release/bioc/html/tximport.html
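
As a rough illustration of the import step described above, the sketch below reads RSEM gene-level results with tximport(). It assumes the example files shipped with the tximportData package (the `samples.txt` table and the `rsem/<run>/<run>.genes.results.gz` layout); with your own data, `files` would instead be a named vector of paths to your RSEM `*.genes.results` files. The `have_pkgs` guard is only there so the snippet does nothing when the packages are not installed.

```r
# Sketch only: uses the example RSEM files from tximportData (an assumption;
# substitute your own *.genes.results paths in practice).
have_pkgs <- requireNamespace("tximport", quietly = TRUE) &&
  requireNamespace("tximportData", quietly = TRUE)

if (have_pkgs) {
  dir <- system.file("extdata", package = "tximportData")
  samples <- read.table(file.path(dir, "samples.txt"), header = TRUE)

  # One RSEM gene-level results file per sample
  files <- file.path(dir, "rsem", samples$run,
                     paste0(samples$run, ".genes.results.gz"))
  names(files) <- paste0("sample", seq_along(files))

  # txIn = FALSE / txOut = FALSE: the input is already summarized to genes
  txi.rsem <- tximport::tximport(files, type = "rsem",
                                 txIn = FALSE, txOut = FALSE)

  # The result is a list holding abundance, counts and length matrices
  print(names(txi.rsem))
  print(head(txi.rsem$counts))
}
```

The resulting list is what DESeqDataSetFromTximport() takes as its first argument, together with a sample table and a design formula.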


Hi! I need to import RSEM data into DESeq2 for downstream analyses. My RSEM output has the following columns:

gene_id transcript_id(s) length effective_length expected_count TPM FPKM

I tried to follow the tximport tutorial from

https://bioconductor.org/packages/release/bioc/vignettes/tximport/inst/doc/tximport.html

but I got the following error:

Error in DESeqDataSetFromTximport(txi, sampleTable, ~condition) :
  all(lengths > 0) is not TRUE
Calls: DESeqDataSetFromTximport -> stopifnot
Warning message:
In DESeqDataSet(se, design = design, ignoreRank) :
  45 duplicate rownames were renamed by adding numbers

My code is as follows:

library(tximportData)
library(tximport)
library(readr)
library(DESeq2)


# files is a vector with the list of 10 RSEM output files
names(files) <- paste0("sample", 1:10)

#import files
txi <- tximport(files, type = "rsem", txIn = FALSE, txOut = FALSE)
names(txi)

# cond is a vector with conditions to be used for differential analysis
sampleTable <- data.frame(condition = factor(cond))
rownames(sampleTable) <- colnames(txi$counts)

dds <- DESeqDataSetFromTximport(txi, sampleTable, ~condition)

Do you have any suggestions?

Thanks!

> sessionInfo()
R version 4.0.2 (2020-06-22)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: macOS Catalina 10.15.7
Matrix products: default
BLAS/LAPACK: /Users/miniconda3/envs/bioinfo/lib/libopenblasp-r0.3.7.dylib

locale:
[1] C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

loaded via a namespace (and not attached):
[1] compiler_4.0.2

Thanks a lot for the helpful reply; adding

txi$length[txi$length <= 0] <- 1

before

dds <- DESeqDataSetFromTximport(txi, sampleTable, ~condition)

has apparently solved the problem.
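
To make the effect of that line concrete, here is a minimal sketch on a toy stand-in for a tximport result. All gene names, sample names and values are invented for illustration; only the structure (a list with `abundance`, `counts`, `length` and `countsFromAbundance`) mirrors what tximport() returns. The DESeq2 call is guarded so the snippet still runs when DESeq2 is not installed.

```r
# Toy stand-in for a tximport result; every value here is made up.
genes   <- paste0("gene", 1:4)
samples <- paste0("sample", 1:4)
dims    <- list(genes, samples)

txi <- list(
  abundance = matrix(c(10, 0, 5, 8), nrow = 4, ncol = 4, dimnames = dims),
  counts    = matrix(c(100.4, 0, 52.7, 80.1), nrow = 4, ncol = 4, dimnames = dims),
  # gene2 has length 0 in every sample, which triggers
  # "all(lengths > 0) is not TRUE" in DESeqDataSetFromTximport()
  length    = matrix(c(1500, 0, 900, 1200), nrow = 4, ncol = 4, dimnames = dims),
  countsFromAbundance = "no"
)

# Count the offending entries, then apply the workaround from the thread
n_zero <- sum(txi$length <= 0)
txi$length[txi$length <= 0] <- 1

# This is the condition that previously failed
stopifnot(all(txi$length > 0))

if (requireNamespace("DESeq2", quietly = TRUE)) {
  sampleTable <- data.frame(condition = factor(c("A", "A", "B", "B")),
                            row.names = samples)
  dds <- DESeq2::DESeqDataSetFromTximport(txi, sampleTable, ~condition)
}
```

Replacing zero lengths with 1 only changes entries that would otherwise fail the check; whether that is appropriate for a given dataset, as opposed to for example dropping those genes, is a judgment call this sketch does not settle.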

BTW, is it possible to import RSEM data into DESeq2 and combine it with data from featureCounts in order to build a merged matrix of normalized counts? Thanks!


I would say that is not trivial: you would need to correct for, or model, the differences in quantification method somehow. I haven't attempted this.
