How to use specific size factors and average transcript lengths from tximport in DESeq2?
grothmn • 0
@9903acb3
Last seen 7 weeks ago
Germany

Hi,

I would like to use Salmon count data and DESeq2 to identify differentially expressed genes, with pre-defined size factors for the different samples (as total transcripts are biased between samples):

sizeFactors(dds) <- size_factors
dds <- DESeq(dds)
#using pre-existing size factors
#estimating dispersions
#gene-wise dispersion estimates
#mean-dispersion relationship
#final dispersion estimates
#fitting model and testing

Do I understand correctly that supplying pre-existing size factors means DESeq2 ignores the normalisation factors it would otherwise derive from 'avgTxLength' in assays(dds), combined with the library size correction (i.e. what happens when no pre-existing size factors are defined)? If so, how can I combine pre-existing size factors with average transcript length normalisation to derive the normalizationFactors?

Thanks!

DESeq2 tximport • 652 views
@mikelove
Last seen 1 day ago
United States

You can just use your own normalization factors.

This is the code that does internal size factor estimation with avgTxLength from Salmon or other transcript abundance tools:

https://github.com/thelovelab/DESeq2/blob/devel/R/methods.R#L384-L390

https://github.com/thelovelab/DESeq2/blob/devel/R/core.R#L2185-L2189

Instead of estimating sf there, you want to apply your own predefined vector.

Then you have:

sf # predefined, should have geometric mean of ~1
nm <- assays(dds)[["avgTxLength"]]
nm <- nm / exp(rowMeans(log(nm))) # divide out the row-wise geometric mean
nf <- t( t(nm) * sf ) # multiply each column (sample) by its size factor
normalizationFactors(dds) <- nf
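
To check the arithmetic without DESeq2 installed, the same steps can be sketched in base R on a toy matrix. The matrix values, the size factors, and the names avg_len, nf_sweep and row_gm are made up for illustration. In a real analysis the matrix would come from assays(dds)[["avgTxLength"]] and the result would be assigned with normalizationFactors(dds) <- nf as above.

```r
# Toy stand-in for assays(dds)[["avgTxLength"]]: 3 genes x 3 samples (made-up values)
avg_len <- matrix(
  c(1000, 1200,  900,
    2000, 2000, 2000,
     500,  450,  600),
  nrow = 3, byrow = TRUE,
  dimnames = list(paste0("gene", 1:3), paste0("sample", 1:3))
)

# Hypothetical predefined size factors, rescaled so their geometric mean is 1
sf <- c(0.8, 1.0, 1.3)
sf <- sf / exp(mean(log(sf)))

# Divide each row by its geometric mean, so only relative length differences remain
nm <- avg_len / exp(rowMeans(log(avg_len)))

# Multiply each column (sample) by its size factor
nf <- t(t(nm) * sf)

# sweep() is an equivalent, more explicit way to write the column scaling
nf_sweep <- sweep(nm, 2, sf, `*`)

# Row-wise geometric means of the resulting normalization factors
row_gm <- exp(rowMeans(log(nf)))
print(round(nf, 3))
print(row_gm)
```

For a gene whose average length is identical across samples (gene2 here), the row of nf reduces to sf itself, which is a quick sanity check that the size factors were applied to the samples and not to the genes.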

t( t(nm) * sf )

That's exactly what I was looking for!

Thanks a lot, Michael
