Hello,
I am following the guide found here (http://www-huber.embl.de/users/klaus/Teaching/DESeq2Predoc2014.html) to estimate differential gene expression in my two-condition RNAseq dataset. When I look at my histogram of observed p-value statistics I end up with a plot like this:
dds <- nbinomWaldTest(dds)
ddsRes <- results(dds, pAdjustMethod = "BH")
ddsRes <- ddsRes[ !is.na(ddsRes$padj), ]
ddsRes <- ddsRes[ !is.na(ddsRes$pvalue), ]
table(ddsRes[,"padj"] < 0.1) #240 TRUE 19923 FALSE
hist(ddsRes$pvalue, col = "lavender", main = "TRT vs NoTRT", xlab = "p-values")
When I run fdrtool on this dataset the estimated empirical sigma is 1.176:
FDR.ddsRes <- fdrtool(ddsRes$stat, statistic= "normal", plot = T)
ddsRes[,"padj"] <- p.adjust(FDR.ddsRes$pval, method = "BH")
table(ddsRes[,"padj"] < 0.1) #27 TRUE 20136 FALSE
When I look at the number of rejected hypotheses after running the fdrtool procedure the number is much smaller (27 rejections compared to 240 using the BH FDR adjustment directly).
I'm wondering:
1) is it appropriate to use the fdrtool procedure when sigma is estimated to be >1? All of the examples I've seen thus far (including in the DESeq tutorials) focus on the case where sigma <1.
2) if the answer to 1) is yes, what are some of the assumptions about the experiment and results that would make a sigma >1 need to be controlled with the more conservative (in this case) fdrtool correction?
Any help would be greatly appreciated!