Hello everyone, and thanks in advance for taking the time to read my question.
I've been puzzling over this problem for a while and decided to post here since I haven't found an answer. I'm using DESeq2 with an adjusted p-value (padj) cutoff of 0.05, set through the "alpha" argument of the "results()" function, as Michael Love recommended here (in my script: res <- results(dds, alpha=.05)).
The issue is: despite the cutoff, when I look at the table I still see many large padj values, as shown below:
res <- results(dds, alpha=0.05)
resOrdered <- res[order(res$padj, decreasing = TRUE),]
log2 fold change (MLE): condition stage2 vs stage1
Wald test p-value: condition stage2 vs stage1
DataFrame with 27686 rows and 6 columns
       baseMean log2FoldChange     lfcSE         stat    pvalue      padj
      <numeric>      <numeric> <numeric>    <numeric> <numeric> <numeric>
GeneA 311.30063   -0.000045184  0.370183 -0.000122059  0.999903  0.999903
GeneB  87.84123    0.000101288  0.513417  0.000197282  0.999843  0.999903
GeneC   6.11153   -0.001102367  1.544488 -0.000713743  0.999431  0.999608
GeneD  57.20622   -0.000380087  0.582574 -0.000652426  0.999479  0.999608
GeneE   7.42678    0.001275319  1.390976  0.000916852  0.999268  0.999570
table(res$padj < .05)
FALSE TRUE
14009 1589
As you can see, many genes have very high adjusted p-values, which I find strange given that even the default padj cutoff is 0.1. Should I still filter the table "manually" after calling results() for some reason? Thank you again.
Thank you for your answer. So the proper way to conduct the analysis is to set the thresholds in the results() function and then filter the data frame according to the chosen thresholds?
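
For anyone landing here with the same question, here is a minimal sketch of that second, "manual" filtering step. As far as I understand it, alpha in results() only tunes independent filtering (and what summary() reports); it does not remove rows, which is why large padj values are still in the table. The data frame below is a made-up stand-in for a results table (hypothetical gene names and values, not my real data), so that the snippet runs in base R without DESeq2:

```r
# Toy stand-in for a DESeq2 results table; all values are invented for illustration.
res_df <- data.frame(
  gene = c("g1", "g2", "g3", "g4", "g5"),
  log2FoldChange = c(2.1, -1.4, 0.0001, 0.3, -0.8),
  padj = c(0.001, 0.03, 0.9999, NA, 0.2),
  stringsAsFactors = FALSE
)

alpha <- 0.05  # the same threshold that was passed to results()

# which() drops rows whose padj is NA (e.g. genes removed by independent
# filtering); a bare logical index would return all-NA rows for them instead.
sig <- res_df[which(res_df$padj < alpha), ]

# Sort the significant genes from smallest to largest padj.
sig <- sig[order(sig$padj), ]
print(sig)
```

On a real results object the same idea should carry over as res[which(res$padj < 0.05), ] or subset(res, padj < 0.05), keeping the threshold equal to the alpha given to results().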