all pairwise comparisons, some genes can have zero counts everywhere in a given comparison
anton.kratz ▴ 60
@antonkratz-8836
Last seen 7 months ago
Japan, Tokyo, The Systems Biology Insti…

I have 12 different biological states, each in triplicate, and I run all pairwise differential expression tests between these states with DESeq2.

For some pairs of states, specific clusters (genes) have exactly zero counts in all three replicates of both states in that comparison.

For example, in raw counts it could look like this for a particular cluster:

stateA_rep1  stateA_rep2  stateA_rep3  stateB_rep1  stateB_rep2  stateB_rep3  stateC_rep1  stateC_rep2  stateC_rep3
0            0            0            46           34           67           0            0            0

The clusters for which this is true in a given comparison have differing baseMean values and can even have nonzero (though very small) fold changes.

When the result is rendered as an MA-plot, these clusters show up as horizontal "streaks" in the region of very low baseMean. In the example, when comparing stateA vs stateC I might get a log2FoldChange of something like 0.3.

I think I understand what is going on here: DESeq2 adds very low pseudocounts because the same clusters might have (and indeed, do) much higher counts in other samples (such as stateB) and thus can be properly compared (stateA vs stateB, stateC vs stateB), avoiding infinite fold changes.

However, I am wondering how best to deal with this in those specific comparisons where all counts are exactly zero.

The result seems counterintuitive to me and difficult to defend: some clusters have zero expression in every sample of a comparison, yet I get different baseMean values and even nonzero fold changes.

I think I cannot simply remove those clusters from the original input because, depending on the comparison in question, the clusters can have substantial counts in other samples. Also, I am running DESeq2 on the entire table, as I understand the recommendation from the vignette (https://bioconductor.org/packages/release/bioc/vignettes/DESeq2/inst/doc/DESeq2.html#if-i-have-multiple-groups-should-i-run-all-together-or-split-into-pairs-of-groups).

deseq2 • 1.1k views
@mikelove
Last seen 1 day ago
United States
The base mean is computed across all samples, so it makes sense that it is not zero. Are you using lfcShrink? It will give you an LFC of ~0 for these clusters.
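
The point about the base mean can be checked by hand on the example counts from the question. The following is only a minimal sketch in plain Python, not DESeq2 code: it assumes every sample has a size factor of 1.0 (so normalized counts equal raw counts), and the names `counts`, `size_factor`, and `base_mean` are made up for illustration.

```python
from statistics import mean

# Raw counts for one cluster, copied from the example in the question.
counts = {
    "stateA": [0, 0, 0],
    "stateB": [46, 34, 67],
    "stateC": [0, 0, 0],
}

# Assumption for illustration only: all size factors are 1.0,
# so normalized counts are identical to raw counts.
size_factor = 1.0


def base_mean(groups):
    """Mean of normalized counts over every sample in the given groups."""
    values = [c / size_factor for g in groups for c in counts[g]]
    return mean(values)


# Mean over all nine samples, which is how the base mean is defined.
base_mean_all = base_mean(counts.keys())

# Mean over only the two states being compared (stateA vs stateC).
base_mean_a_vs_c = base_mean(["stateA", "stateC"])

print(f"mean over all samples: {base_mean_all:.2f}")
print(f"mean over stateA and stateC only: {base_mean_a_vs_c:.2f}")
```

Under these assumptions the mean over all nine samples is 147 / 9, about 16.33, while the mean over the six stateA and stateC samples is 0. The nonzero baseMean reported for a stateA vs stateC comparison therefore comes from the stateB samples, not from the two states being compared.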

Thank you. I was not using lfcShrink, but now I am, and this fixed my issue. I am on DESeq2 v1.16.1 and was simply not aware that lfcShrink is no longer called implicitly; I now call it explicitly.
