Hello - I have an issue with genes (or OTUs in my case) which have zero expression across a condition.
Seems to be the same issue as
A: DESeq2; FoldChange values with only 0's
Works as expected (well as I'd expect) in DESeq2 1.10.1
Below are a few of the details
model = ~ Block + Treatment
contrast = c("Treatment","D","Control")
Treatment is a factor with 5 levels (Block has multiple levels, but I suspect that's irrelevant anyway)
Example of dodgy "gene"
aggregate(t(counts(dds["XXX"],normalize=T)),list(dds$Treatment),sum)
Group.1 XXX
1 Control 0.0000
2 A 101.9302
3 B 130.5273
4 C 240.4031
5 D 0.0000
DESeq 1.16.1 (R 3.4)
res["XXX",]
log2 fold change (MAP): Treatment D vs Control
Wald test p-value: Treatment D vs Control
DataFrame with 1 row and 6 columns
baseMean log2FoldChange lfcSE stat pvalue padj
<numeric> <numeric> <numeric> <numeric> <numeric> <numeric>
XXX 3.152404 -20.7867 4.592323 -4.526401 5.999657e-06 0.0003431804
I get the same results if I set betaPrior to True
-------------------------------------------------------------------------
DESeq2 1.10.1 (R 3.2)
res["XXX",]
log2 fold change (MAP): Treatment D vs Control
Wald test p-value: Treatment ND vs Control
DataFrame with 1 row and 6 columns
baseMean log2FoldChange lfcSE stat pvalue padj
<numeric> <numeric> <numeric> <numeric> <numeric> <numeric>
XXX 3.152404 0 1.3954 0 1 1
Ah, o.k. thanks Mike.
I guess it's pretty much an edge case, especially in RNA-seq, and not too difficult for me to correct for.
I'll have a look at using LRT. As you guess my data (mostly soil microbiom) might violate a few of the expectations of the DESeq model - it's a bit of an open question how to deal with data consisting mostly of zeros. Having said that, we've had pretty good results using the DESeq Wald test previously, but I can't remember having more than a two factor condition before.
Thanks for your help.