DESeq2 produces adjusted p-values = 1
Assa Yeroslaviz ★ 1.5k
@assa-yeroslaviz-1597
Germany

Hi,

 

I'm working on a mouse data set, where the mice were exposed to different treatments (HP, DR). In the experiment we have 9 conditions with 3 (or 4) samples per condition. I have one "zero" control and two controls after pre-conditioning of the mice. I also have two time points where RNASeq data was prepared (4h and 24h). We did both RNASeq and miRNA-Seq.

We are running multiple comparisons and all looks well, except for one comparison in the miRNASeq data set. In this comparison I get an adjusted p-value of 1 for all the genes. The comparison is between the zero control and one of the controls after pre-conditioning. In the RNASeq data we get good results, with quite a few significant genes below the adjusted p-value threshold.

I know the two data sets can't be compared directly, but I want to point out that the same workflow gives sensible results for the RNASeq data.

I would like to know what could explain an adjusted p-value of 1 for all genes when the p-values and log2FC look quite normal.

My DESeq2 version is 1.8.1.

Thanks,

Assa

 

 


Hi Assa. Are you trying to compare samples without replicates? That would produce this result. P-values without replicates don't make much sense anyway.


No I have for each condition either 3 or 4 replicates

@steve-lianoglou-2771
United States

It's a property of the Benjamini & Hochberg correction, and doesn't have anything to do with DESeq2 per se, as this is just working on the pvalues it generates. When you don't have some type of enrichment of small/significant pvalues -- which is to say when it looks like there really isn't much of anything that's significant -- the adjusted pvalues get hammered hard (and also exhibit this discretized behavior).
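To make the "hammered hard" and "discretized" behavior concrete, here is a minimal sketch of the BH adjustment written out by hand. The helper name `bh_by_hand` is made up for illustration and is not part of any package; the logic is meant to mirror what `p.adjust(..., method = "BH")` does. Each sorted p-value is scaled by n / rank, then a running minimum is taken from the largest p-value downwards, and it is that running minimum that makes many adjusted values identical.

```r
# Hand-rolled BH adjustment, written out to show where the ties come from.
# `bh_by_hand` is a hypothetical helper name, not part of any package.
bh_by_hand <- function(p) {
  n  <- length(p)
  o  <- order(p, decreasing = TRUE)   # indices sorted from largest to smallest p-value
  ro <- order(o)                      # permutation that restores the input order
  scaled <- p[o] * n / (n:1)          # p_(i) * n / i, where i is the ascending rank
  pmin(1, cummin(scaled))[ro]         # running minimum from the top, capped at 1
}

set.seed(1)
p    <- runif(500)                    # p-values as they would look under the null
padj <- bh_by_hand(p)

all.equal(padj, p.adjust(p, method = "BH"))  # compare against the built-in
length(unique(padj))                         # typically far fewer than 500 distinct values
```

Whenever a scaled value is larger than one further up the sorted list, the running minimum replaces it, so long runs of p-values end up sharing one adjusted value.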

You said in your post that "... the p-value and log2FC looks quite normal," but what does that mean? Take a look at a histogram of the pvalues you have in this comparison, how does it look? I'm guessing there's no small "bump" towards the low pvalue side? Maybe even a relative paucity of pvalues on the left/significant side?
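A quick way to draw that histogram is sketched below. The p-values here are simulated stand-ins so the snippet runs on its own; with a real DESeq2 results object (assumed here to be called `res`) you would plot `res$pvalue` after dropping the `NA` entries.

```r
# Stand-in p-values; with a real DESeq2 results object `res` (assumed name),
# the vector to plot would be res$pvalue with NA entries removed.
set.seed(42)
pvalue <- runif(500)

# 20 bins of width 0.05: the leftmost bar counts the p-values in [0, 0.05]
h <- hist(pvalue, breaks = seq(0, 1, by = 0.05),
          main = "Raw p-values", xlab = "p-value")

# Under the null each bin should hold roughly n / 20 values
h$counts[1]            # observed count in the leftmost bin
length(pvalue) / 20    # count expected if nothing is differentially expressed
```

If the leftmost bar is no taller than the others (or is shorter), there is no enrichment of small p-values for the correction to work with.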

We can also show that you get something that approximates this behavior when we generate a vector of pvalues you would get under the null (ie. no significant results) -- in this case, the p-values would be uniformly distributed on [0, 1]. Let's say there are 500 miRNAs you are testing here and none of them are really differentially expressed:

set.seed(2/3)
## 500 pvalues drawn under the null, ie. uniform on [0, 1]
pval <- runif(500)
## Benjamini & Hochberg adjustment
padj <- p.adjust(pval, 'BH')
summary(padj)

   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
 0.4832  0.9131  0.9131  0.9408  0.9855  0.9961

## Generate a slightly hosed pval distribution where the pvalues are drawn
## from [0.05, 1], and you see this discretization of adjusted pvalues even more
pval2 <- runif(500, min = 0.05)
padj2 <- p.adjust(pval2, 'BH')
summary(padj2)

   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
 0.9988  0.9988  0.9988  0.9989  0.9988  0.9999
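For contrast, here is a sketch of what happens when there is an enrichment of small pvalues. The mix of 450 null pvalues and 50 very small ones is an assumption chosen purely for illustration, not something taken from the data in the question:

```r
set.seed(7)
## 450 null pvalues plus 50 very small ones standing in for real signal
pval3 <- c(runif(450), runif(50, min = 0, max = 1e-4))
padj3 <- p.adjust(pval3, method = "BH")

sum(padj3 < 0.05)   # at least the 50 spiked-in values pass
summary(padj3)
```

With that bump on the left side of the histogram, the small pvalues survive the correction instead of being pushed up towards 1.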

Hope that helps


Yes, Steve is correct.

Here is a comment I posted a while ago which uses a plot to show how you get repeated values:

C: adj.P metod = "BH" many exact the same value

In your case those values are 1, meaning no genes survived multiple test correction.
