Question

rlog transformation using DESeq2

anpham ▴ 60

@anpham-7402

I have some questions about rlog transformation using DESeq2:

1) This transformation normalizes for library size, which is sequencing depth or total number of mapped reads, correct?

2) This transformation does not normalize for gene or transcript length, correct? If so, is there a program you would recommend to normalize for transcript length using the output of the rlog transformation?

Thanks!

rlog transformation deseq2 • 1.9k views

Written 8.5 years ago by anpham ▴ 60 • updated 8.5 years ago by Michael Love 43k

score 0 · Answer 1 · 2016-10-18

Michael Love 43k

@mikelove

By default, the transformation corrects for library size using a robust estimate of library size (the size factor) rather than the total number of reads. See the original DESeq paper, which is listed as a citation in ?estimateSizeFactors.
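To make the difference between total reads and a robust size estimate concrete, here is a minimal sketch of a median-of-ratios size factor calculation in plain Python. It is an illustration under stated assumptions, not DESeq2's actual code: the function name `size_factors` and the toy `counts` matrix are made up for this example, and genes with any zero count are simply dropped from the reference.

```python
import math
from statistics import median


def size_factors(counts):
    """Median-of-ratios size factors (illustrative sketch, not DESeq2 itself).

    counts: list of rows (genes), each a list of raw counts, one per sample.
    """
    n_samples = len(counts[0])
    # Assumption: use only genes with no zero counts, because the log
    # geometric mean is undefined when any count is zero.
    usable = [row for row in counts if all(c > 0 for c in row)]
    if not usable:
        raise ValueError("every gene has at least one zero count")
    # Log geometric mean of each gene across samples (a pseudo-reference sample).
    log_ref = [sum(math.log(c) for c in row) / n_samples for row in usable]
    factors = []
    for j in range(n_samples):
        # Log ratio of sample j to the pseudo-reference, gene by gene.
        log_ratios = [math.log(row[j]) - ref for row, ref in zip(usable, log_ref)]
        # The median makes the estimate robust to a few strongly changed genes.
        factors.append(math.exp(median(log_ratios)))
    return factors


# Toy data: sample 2 is sequenced twice as deeply as sample 1,
# except for one gene that is strongly up-regulated.
counts = [
    [10, 20],
    [100, 200],
    [5, 10],
    [50, 1000],  # strongly changed gene
    [0, 3],      # dropped from the reference because of the zero
]
sf = size_factors(counts)
total_ratio = sum(row[1] for row in counts) / sum(row[0] for row in counts)
print("size factor ratio:", sf[1] / sf[0])
print("total count ratio:", total_ratio)
```

In this toy example the ratio of size factors recovers the two-fold depth difference, while the ratio of total counts is inflated by the single strongly changed gene.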

If you use the tximport pipeline, the transformations will also normalize for differences in average transcript length across samples. Note that this corrects for differences in transcript length across samples; it does not normalize with respect to transcript length across rows (genes). You could do that yourself afterwards by dividing each row by the average transcript length.
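The per-row division mentioned above could look roughly like the following plain-Python sketch. The function names, the toy values, and the choice to express lengths in kilobases are assumptions made for illustration. Because rlog output is on a log2 scale, the second helper shows the same scaling as a subtraction of log2(length), which is only the arithmetic equivalent of a division on the count scale.

```python
import math


def per_kilobase(norm_counts, gene_lengths_bp):
    """Divide each gene's row by its average transcript length in kilobases."""
    return [
        [value / (length / 1000.0) for value in row]
        for row, length in zip(norm_counts, gene_lengths_bp)
    ]


def log2_per_kilobase(log2_values, gene_lengths_bp):
    """Same scaling for log2-scale values: subtract log2(length in kb)."""
    return [
        [value - math.log2(length / 1000.0) for value in row]
        for row, length in zip(log2_values, gene_lengths_bp)
    ]


# Hypothetical depth-normalized values: 2 genes (rows) x 2 samples (columns).
norm_counts = [[200.0, 400.0], [200.0, 100.0]]
# Hypothetical average transcript lengths in base pairs, one per gene.
gene_lengths_bp = [2000, 500]

scaled = per_kilobase(norm_counts, gene_lengths_bp)

# The log2-scale version agrees with taking log2 of the scaled counts.
log2_values = [[math.log2(v) for v in row] for row in norm_counts]
log2_scaled = log2_per_kilobase(log2_values, gene_lengths_bp)
print(scaled)
```

This only makes values comparable across genes within a sample; it is not needed for comparing one gene across samples.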

Answered 8.5 years ago by Michael Love 43k

Thank you for the helpful response. I have a follow-up question.

Using the DESeq2 workflow, I performed rlog transformation on gene-level raw counts to obtain Transformed Gene-level Counts (normalized for sequencing depth). Now, can I use the tximport pipeline on these Transformed Gene-level Counts to obtain Transformed Normalized Gene-level Counts, i.e. normalized for both sequencing depth and differences in transcript length across samples?

My goal is to use these normalized gene-level counts for differential expression analysis of candidate genes (not genome-wide). Thanks.

Reply 8.5 years ago by anpham ▴ 60

If you want to do differential expression with DESeq2, you should just run DESeq() on the dds object containing the raw (un-normalized) counts. Differences in library size are handled within the model, so you should not pre-normalize.

If you think that there is differential isoform usage across samples, I would recommend the tximport pipeline before DESeq2 (I recommend this in general for a number of reasons discussed in the workflow). This entails running software like Salmon, Sailfish or kallisto on your reads first, then reading these files into R using tximport as described in the tximport vignette.

Reply 8.5 years ago by Michael Love 43k