Question

Which rlog should I use for DESeq2 analysis?

3

rafaelsolersanblas ▴ 50

@rafaelsolersanblas-22935

Alicante

I have a question regarding the rlog normalization.

I have many samples to compare on a single factor: A vs Treat, B vs Treat, C vs Treat, and so on. Should I put everything in the same DESeqDataSet object, even though the variability between groups is very large (I ran the differential expression analysis with the comparisons separated), and then calculate the rlog of all samples? Or should I put each comparison in its own DESeqDataSet object, extract the rlog for each comparison, and later join the rlogs by Ensembl ID?

Thank you!

samples rlog DESeq2 • 2.1k views

ADD COMMENT • link updated 3.4 years ago by Michael Love 43k • written 3.4 years ago by rafaelsolersanblas ▴ 50

1

swbarnes2 ★ 1.4k

@swbarnes2-14086

San Diego

The rlog- or VST-transformed values are not used at all in assessing DE genes. They are provided for applications such as PCA plots or heatmaps.

In general, it is preferable to keep all your samples in a single object, and use contrasts to specify what subgroups you want to compare.
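A minimal sketch of the single-object approach, assuming simulated counts and hypothetical group labels (`Treat`, `A`, `B`, `C`) in a column named `condition`. None of these names come from the original poster's data; they only illustrate how one fitted object can serve several contrasts.

```r
library(DESeq2)

set.seed(1)
# Simulated counts for illustration only (not the poster's data)
sim <- makeExampleDESeqDataSet(n = 1000, m = 12)
cts <- counts(sim)

# Hypothetical sample table: one factor, four groups, three replicates each
coldata <- data.frame(
  condition = factor(rep(c("Treat", "A", "B", "C"), each = 3),
                     levels = c("Treat", "A", "B", "C")),
  row.names = colnames(cts)
)

# One object holding all samples, fitted once
dds <- DESeqDataSetFromMatrix(countData = cts, colData = coldata,
                              design = ~ condition)
dds <- DESeq(dds)

# Each comparison is pulled out with its own contrast:
# c(factor name, numerator level, denominator level)
res_A <- results(dds, contrast = c("condition", "A", "Treat"))
res_B <- results(dds, contrast = c("condition", "B", "Treat"))
res_C <- results(dds, contrast = c("condition", "C", "Treat"))
```

Each `results()` call reuses the same dispersion estimates, so the model does not need to be refitted per comparison.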

ADD COMMENT • link 3.4 years ago swbarnes2 ★ 1.4k

score 2 · Accepted Answer · 2021-11-16

2

Michael Love 43k

@mikelove

United States

I’d prefer VST. Sometimes the rlog can overshrink differences between groups. You can use the vst() function.
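As a rough illustration of the call, here is a sketch on simulated data. The dataset size and the `condition` column come from `makeExampleDESeqDataSet()`, not from the original question, and the PCA step is just one example of a downstream use.

```r
library(DESeq2)

set.seed(1)
# Simulated dataset for illustration; vst() fits on a subset of genes
# (nsub = 1000 by default), so the object needs enough expressed genes
dds <- makeExampleDESeqDataSet(n = 3000, m = 12)

# Variance stabilizing transformation (blind = TRUE is the function default;
# the choice of blind is discussed further down the thread)
vsd <- vst(dds)

# Transformed values: a genes x samples matrix on a log2-like scale
vst_mat <- assay(vsd)

# PCA coordinates for plotting, returned as a data frame instead of a plot
pca_df <- plotPCA(vsd, intgroup = "condition", returnData = TRUE)
```

The matrix in `vst_mat` is what would feed a heatmap or clustering step; it is not used by `DESeq()` or `results()`.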

ADD COMMENT • link 3.4 years ago Michael Love 43k

0

Yes, I read that in the vignette, thank you Professor Love!

However, I am still unsure whether to group everything in the same DESeqDataSet or to split it into separate ones. I suppose that for data visualization it is better to put everything together, and for differential analysis to run each comparison separately, right?

Also, I am considering transforming with blind = FALSE, since I expect large genetic variability between groups (but not within the groups themselves). But if I want to run an unsupervised hierarchical clustering on z-scores of very different tumor samples, to cluster them transcriptomically, would you use blind = TRUE?

Thank you so much!

ADD REPLY • link 3.4 years ago rafaelsolersanblas ▴ 50

0

The question about "whether to group everything in the same DESeqDataSet or to separate it into different ones" is a FAQ in the vignette.

I recommend blind=FALSE generally. The design is not used in applying the transformation itself, which is the same for all samples. It is only used to estimate the global amount of within-group variability. The result is still unsupervised with blind=FALSE.
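One way the z-score clustering from the question might look with blind=FALSE, sketched on simulated data. The choice of the 500 most variable genes, Euclidean distance, and average linkage are assumptions made for illustration, not recommendations from this thread.

```r
library(DESeq2)

set.seed(1)
# Simulated dataset with some true group differences (betaSD > 0);
# illustration only
dds <- makeExampleDESeqDataSet(n = 3000, m = 12, betaSD = 1)

# The design informs only the dispersion trend; the resulting
# transformation is applied identically to every sample
vsd <- vst(dds, blind = FALSE)
mat <- assay(vsd)

# Keep the most variable genes (500 is an arbitrary, hypothetical cutoff)
gene_var <- apply(mat, 1, var)
top <- head(order(gene_var, decreasing = TRUE), 500)

# Row-wise z-score: center and scale each gene across samples
z <- t(scale(t(mat[top, ])))

# Hierarchical clustering of samples; group labels are never passed in
hc <- hclust(dist(t(z)), method = "average")
plot(hc)
```

The clustering step sees only the transformed matrix, so the sample groups play no role in how the tree is built.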

ADD REPLY • link 3.4 years ago Michael Love 43k