Question

DESeq2 adjusted p-value cutoff seemingly not working


Guilherme • 0

@da5a80e7

Last seen 8 hours ago

Brazil

Hello everyone, and thanks in advance for taking the time to read my question.

I've been puzzling over this problem for a while and decided to post here since I couldn't find an answer. I'm using DESeq2 with an adjusted p-value (padj) cutoff of 0.05, set through the "alpha" argument of the "results()" function, as Michael Love recommended here (in my script: res <- results(dds, alpha=.05)).

The issue is: despite the cutoff, when I look at the table I still see many large padj values, as shown below:

res <- results(dds, alpha=0.05)
resOrdered <- res[order(res$padj, decreasing = TRUE),]

log2 fold change (MLE): condition stage2 vs stage1
Wald test p-value: condition stage2 vs stage1
DataFrame with 27686 rows and 6 columns
       baseMean log2FoldChange     lfcSE         stat    pvalue      padj
      <numeric>      <numeric> <numeric>    <numeric> <numeric> <numeric>
GeneA 311.30063   -0.000045184  0.370183 -0.000122059  0.999903  0.999903
GeneB  87.84123    0.000101288  0.513417  0.000197282  0.999843  0.999903
GeneC   6.11153   -0.001102367  1.544488 -0.000713743  0.999431  0.999608
GeneD  57.20622   -0.000380087  0.582574 -0.000652426  0.999479  0.999608
GeneE   7.42678    0.001275319  1.390976  0.000916852  0.999268  0.999570

table(res$padj < .05)

FALSE  TRUE
14009  1589

As you can see, many genes still have very high adjusted p-values, which I find strange, especially since even the default padj cutoff is 0.1. Should I still filter the table "manually" after calling results() for some reason? Thank you again.

DESeq2 • 287 views


score 0 · Answer 1 · 2025-02-06

From ?results:

On p-values:

     By default, independent filtering is performed to select a set of
     genes for multiple test correction which maximizes the number of
     adjusted p-values less than a given critical value 'alpha' (by
     default 0.1). See the reference in this man page for details on
     independent filtering. The filter used for maximizing the number
     of rejections is the mean of normalized counts for all samples in
     the dataset. Several arguments from the 'filtered_p' function of
     the genefilter package (used within the 'results' function) are
     provided here to control the independent filtering behavior. (Note
     'filtered_p' R code is now copied into DESeq2 package to avoid
     gfortran requirements.) In DESeq2 version >= 1.10, the threshold
     that is chosen is the lowest quantile of the filter for which the
     number of rejections is close to the peak of a curve fit to the
     number of rejections over the filter quantiles. 'Close to' is
     defined as within 1 residual standard deviation. The adjusted
     p-values for the genes which do not pass the filter threshold are
     set to 'NA'.
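
To make the quoted passage more concrete, here is a rough base-R sketch of the idea behind independent filtering. It is not DESeq2's actual code: the data are simulated, all variable names are made up for illustration, and it simply picks the filter quantile with the most rejections instead of fitting a curve as DESeq2 (>= 1.10) does.

```r
# Toy illustration of independent filtering (NOT DESeq2's implementation).
# All data below are simulated; names are hypothetical.
set.seed(1)
n <- 2000
baseMean <- rexp(n, rate = 1 / 100)            # stand-in for mean normalized counts
pvalue <- runif(n)                             # most genes behave like null genes
signal <- baseMean > 100 & runif(n) < 0.3      # some well-expressed genes carry signal
pvalue[signal] <- rbeta(sum(signal), 0.1, 5)   # give those genes small p-values

alpha <- 0.05
theta <- seq(0, 0.95, by = 0.05)               # candidate filter quantiles
cutoffs <- quantile(baseMean, theta)

# For each candidate cutoff, adjust only the genes that pass the filter
# and count how many adjusted p-values fall below alpha.
num_rej <- sapply(cutoffs, function(cut) {
  keep <- baseMean >= cut
  sum(p.adjust(pvalue[keep], method = "BH") < alpha)
})

# Simplification: take the cutoff with the most rejections.
best <- which.max(num_rej)
keep <- baseMean >= cutoffs[best]

# Genes below the chosen cutoff get padj = NA; the rest keep their BH-adjusted
# value, whether or not it is below alpha.
padj <- rep(NA_real_, n)
padj[keep] <- p.adjust(pvalue[keep], method = "BH")

table(padj < alpha, useNA = "ifany")
```

In this sketch alpha only influences which cutoff is chosen and therefore which genes get NA; no row is removed because its padj is above alpha.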

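The value of alpha is used for the independent filtering of the genes, not for filtering the DataFrame you get from results(). So yes, if you only want the significant genes, you still need to subset the table yourself.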
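
As a minimal sketch of that manual step, the snippet below uses a small made-up data frame as a stand-in for the results table (the values are invented, not taken from the question). With a real DESeq2 results object, the same indexing with which(), or subset(res, padj < 0.05), should behave the same way.

```r
# Toy stand-in for a DESeq2 results table; values are made up for illustration.
res <- data.frame(
  row.names = c("GeneA", "GeneB", "GeneC", "GeneD"),
  baseMean  = c(311.3, 87.8, 6.1, 0.4),
  padj      = c(0.0004, 0.9999, 0.03, NA)
)

alpha <- 0.05

# which() drops comparisons that evaluate to NA, so genes whose padj was
# set to NA by independent filtering are excluded instead of producing NA rows.
sig <- res[which(res$padj < alpha), ]

# Sort with the smallest adjusted p-values first.
sig <- sig[order(sig$padj), ]
sig
```

Using which() matters here because a plain logical index containing NA would return extra rows filled with NA.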