Question

Anomalous p-value distribution and q-value calculation

1


oliver.selmoni ▴ 10

@oliverselmoni-16095

Last seen 6.8 years ago

Hi,

I am using the qvalue R package to calculate q-values from this p-value distribution:

[Figure: histogram of the p-value distribution]

The lowest q-values are around 0.22.

Looking at the p-value histogram, there are actually more low p-values than expected by chance. Yet with a q-value cut-off of 0.15 only one marker is significant, whereas raising the threshold to 0.25 increases the number of significant tests to 10,000.

I think this happens because the frequency of low p-values decreases toward 0, which makes it impossible to compute the corresponding q-values.

How do you interpret this? Is there a way to obtain a q-value in this situation?

thank you

Os

qvalue pvalue histogram multiple testing correction • 1.6k views

updated 6.8 years ago by Nikos Ignatiadis ▴ 180 • written 6.8 years ago by oliver.selmoni ▴ 10

2


Nikos Ignatiadis ▴ 180

@nikos-ignatiadis-8823

Last seen 5.8 years ago

Heidelberg

Without any further information about your data (e.g. maybe discrete tests), the results you obtain from qvalue appear plausible to me. What you are discovering here is that in many cases, from a statistical perspective, the problem of detection is much easier than the problem of localization. In other words, detecting whether there is at least some signal in your data (i.e. testing whether at least one null hypothesis is false) can sometimes be much easier than pin-pointing the precise location of the signal (i.e. which null hypotheses you can reject).

In your case, it is likely that a large proportion of your hypotheses are not null, and this manifests itself as a global shift of the p-value histogram. However, your individual tests have too little power to detect anything, at least at an FDR threshold below, say, 0.2.
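To make the jump in the number of discoveries concrete, here is a minimal sketch in plain Python. It is not the qvalue package's implementation: it fixes the null proportion `pi0` at 1 (which gives Benjamini-Hochberg adjusted p-values), whereas qvalue estimates `pi0` from the data. The toy p-values are made up for illustration: very few below 0.01, a dense bump between 0.01 and 0.1, and a flat remainder.

```python
def q_values(pvals, pi0=1.0):
    """Step-up q-values: q_i = min over ranks r >= rank(i) of pi0 * m * p_(r) / r.

    pi0 = 1.0 is an assumption of this sketch (Benjamini-Hochberg);
    the qvalue package would estimate pi0 from the p-values instead.
    """
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, carrying the smallest ratio seen so far.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pi0 * m * pvals[i] / rank)
        q[i] = running_min
    return q


def toy_pvalues():
    """Hypothetical data: sparse near 0, dense in (0.01, 0.1], flat above 0.1."""
    sparse = [0.002 * k for k in range(1, 6)]              # 5 values up to 0.01
    bump = [0.01 + 0.09 * k / 300 for k in range(1, 301)]  # 300 values up to 0.1
    rest = [0.1 + 0.9 * k / 695 for k in range(1, 696)]    # 695 values up to 1.0
    return sparse + bump + rest


pvals = toy_pvalues()
q = q_values(pvals)
n_at_030 = sum(x <= 0.30 for x in q)  # discoveries at a q-value cut-off of 0.30
n_at_033 = sum(x <= 0.33 for x in q)  # discoveries at a q-value cut-off of 0.33
print(min(q), n_at_030, n_at_033)
```

In this toy data the ratio of expected nulls to observed p-values is smallest at p = 0.1, so the 305 smallest p-values all share the same q-value of about 0.328. Nothing is significant at a cut-off of 0.30, and all 305 become significant together at 0.33. Every test gets a q-value; the smallest ones are simply not small.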

written 6.8 years ago by Nikos Ignatiadis ▴ 180

score 3 · Accepted Answer · 2018-06-12

1. It looks curious that the three very lowest histogram bins are shorter than some of the higher ones. Maybe this just happened by chance, but it's atypical for the density of p-values under the alternative to be non-monotonic, and I'd check the source of these p-values.

2. The qvalue result looks fine. For an intuition, see also e.g. Slide 5 of http://www.huber.embl.de/users/whuber/pub/170731-jsm-huber-03.pdf

3. You could perhaps try http://bioconductor.org/packages/release/bioc/html/IHW.html for a different multiple testing treatment that can often increase power.
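The check suggested in point 1 can be roughed out as follows. This is a sketch in plain Python with hypothetical helper names and made-up p-values: it bins the p-values and lists the bins that are shorter than their right-hand neighbour. Random fluctuation alone can produce such rises, so a flagged bin is a prompt to inspect the tests, not evidence of a problem by itself.

```python
def histogram_counts(pvals, n_bins=20):
    """Count p-values in n_bins equal-width bins on [0, 1]."""
    counts = [0] * n_bins
    for p in pvals:
        # p == 1.0 would index one past the end, so clamp to the last bin.
        idx = min(int(p * n_bins), n_bins - 1)
        counts[idx] += 1
    return counts


def rising_bins(counts):
    """Indices of bins that hold fewer p-values than the next bin to the right."""
    return [b for b in range(len(counts) - 1) if counts[b] < counts[b + 1]]


# Made-up p-values with a dip in the lowest of four bins.
toy = [0.05] * 2 + [0.30] * 5 + [0.60] * 3 + [0.90] * 3
counts = histogram_counts(toy, n_bins=4)
rising = rising_bins(counts)
print(counts, rising)
```

With real data, bins flagged near 0 would correspond to the dip described in point 1.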