Question

DESeq2:varianceStabilizingTransformation suggested. Results suspicious


jovel_juan ▴ 30

@jovel_juan-7129


Canada

I ran some analyses with DESeq2 and got the warning shown below.

NOTES:
- Data were quantified with kallisto.
- I did not aggregate the data to the gene level; I ran the analysis at the transcript level, using a cut-off of qval < 0.01 instead of 0.05.

In this data, for 27.2% of genes with a sum of normalized counts above 100, it was the case that a single sample's normalized count made up more than 90% of the sum over all samples. the threshold for this warning is 10% of genes. See plotSparsity(dds) for a visualization of this. We recommend instead using the varianceStabilizingTransformation or shifted log (see vignette).

What does it mean? Should I run the differential expression test using varianceStabilizingTransformation instead of the default DESeq2 transformation? If so, how do I do that?

I did not change anything and ran the test with the wrapper as always: ddsMat <- DESeq(ddsMat)

I got thousands of deregulated transcripts in each comparison. Because of the warning described above, I also ran the same comparisons with sleuth and got only a couple of dozen in each case.

What should I do?

deseq2 • 1.3k views

jovel_juan, 6.1 years ago


Thanks for your answer Mike!

Those are cultured heart cells with a specific phenotype subjected to two different drugs or to the combination of both drugs. Each treatment was compared to non-treated cells.

Please see the attached PCA plot for one of the comparisons that triggers this warning. It is clear that sample bs13 is different from the rest.

Should I remove that sample and repeat the analyses?

[Image: PCA plot]

jovel_juan, 6.1 years ago


I moved these to comments, rather than "answers".

Can you tell me the sample sizes? Is this bulk RNA-seq?

Michael Love, 6.1 years ago


Ok, the images came through in my email, but not here.

I think you can ignore the warning. I'm guessing you have 1 out of 2 samples with much higher counts overall, and this is why 90% of the row sum count is coming from a single sample for a large fraction of genes.

Michael Love, 6.1 years ago
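
On the "how to do that" part of the question: as far as I know, the variance stabilizing transformation is meant for exploratory plots such as the PCA, while DESeq() itself always works on the untransformed counts, so there is no transformation to swap out for the test. A minimal sketch of that separation, assuming simulated data from makeExampleDESeqDataSet() in place of the real kallisto counts (the object names and the alpha value are illustrative only):

```r
library(DESeq2)

# Simulated stand-in for the real data (assumption): 1000 features, 6 samples,
# with a two-level 'condition' factor created by the helper.
set.seed(1)
dds <- makeExampleDESeqDataSet(n = 1000, m = 6)

# Differential expression: DESeq() models the raw counts directly;
# no transformed values are passed to it.
dds <- DESeq(dds)
res <- results(dds, alpha = 0.01)  # alpha mirrors the 0.01 cut-off in the question
summary(res)

# Transformation: used here only for visualization (PCA).
vsd <- varianceStabilizingTransformation(dds, blind = FALSE)
plotPCA(vsd, intgroup = "condition")

# Visual check of the sparsity that the warning refers to.
plotSparsity(dds)
```

With the real data, ddsMat would take the place of dds and the design variable would replace "condition".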


Sorry, the figure did not load in the previous message. Here is a new attempt.

[Image: PCA plot]

jovel_juan, 6.1 years ago


This is a typical example, but all other comparisons show something similar. What should I do?

jovel_juan, 6.1 years ago

score 0 · Answer 1 · 2019-02-14


Michael Love 43k

@mikelove


United States

What kind of samples do you have? How many? That warning is flagged for extremely sparse data. I wrote it to be self-explanatory: you have a lot of genes where most of the row count comes from a single sample.

Michael Love, 6.1 years ago
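
To make the quantity in the warning concrete, here is a rough base-R sketch of the check as the message describes it: among rows whose normalized counts sum to more than 100, the fraction where one sample contributes more than 90% of the row sum, compared against a 10% threshold. The function name sparsity_fraction and the toy matrix are made up for illustration; this is not DESeq2's own code.

```r
# Hypothetical helper (not part of DESeq2): fraction of high-count rows
# that are dominated by a single sample.
sparsity_fraction <- function(norm_counts, min_sum = 100, dominance = 0.9) {
  row_sum <- rowSums(norm_counts)
  row_max <- apply(norm_counts, 1, max)
  keep <- row_sum > min_sum              # only rows with a sum above 100
  if (!any(keep)) return(NA_real_)
  mean((row_max / row_sum)[keep] > dominance)
}

# Toy matrix: 4 transcripts x 4 samples (made-up numbers).
toy <- matrix(
  c(1000,   2,   1,   0,    # dominated by sample s1
     250, 260, 240, 255,    # evenly expressed
       0,   0, 500,   3,    # dominated by sample s3
       5,   4,   6,   3),   # row sum below 100, ignored
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("tx", 1:4), paste0("s", 1:4))
)

frac <- sparsity_fraction(toy)
frac          # 2 of the 3 retained rows are dominated by one sample
frac > 0.10   # above the 10% threshold quoted in the warning
```

On a real dataset the input would be something like counts(dds, normalized = TRUE). One sample with much higher counts than the others, as suggested earlier in the thread, would push this fraction up for many rows at once.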