I am trying to make a volcano plot using the outputs provided by RankProd. Many of my genes have a p-value of 0 according to RankProd, which makes -log10(p-value) difficult to plot.
Are p-values = 0 the product of an error, or an underflow issue? What's the best way to address such underflow issues?
I don't think it's an underflow issue. Instead, the p-values appear to be computed as
(# permutations more extreme than observed) / (# permutations)
And if you don't have any permuted statistics that are as or more extreme than your observed statistic, this ends up being
0/# permutations
There are arguments for doing something like
(# permutations more extreme + 1) / (# permutations + 1)
This is what the limma package does: the observed result is always counted as one of the permuted observations. With 1000 permutations, the difference is between 0 and roughly 0.001, which is probably the basis of the argument. Neither value is likely to be exactly correct (the observed statistic is only expected to be part of the null distribution if there truly are no differences), but 0.001 is more interpretable than p = 0, and it stays finite on a -log10 scale.
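To make the difference concrete, here is a minimal Python sketch of the two estimators. It is not RankProd's or limma's actual code: the function name, the simulated null statistics, and the convention that larger values are "more extreme" are all assumptions made for illustration (for rank products, smaller values are typically the extreme ones, so the comparison would flip).

```python
import math
import random


def permutation_p_values(observed, null_stats):
    """Return (naive, corrected) permutation p-values.

    Hypothetical helper for illustration only.
    Assumes larger statistics are more extreme.
    """
    m = len(null_stats)
    # Number of permuted statistics at least as extreme as the observed one.
    b = sum(1 for s in null_stats if s >= observed)
    naive = b / m                   # can be exactly 0
    corrected = (b + 1) / (m + 1)   # counts the observed statistic as one permutation
    return naive, corrected


# Simulated null distribution standing in for 1000 permuted statistics.
rng = random.Random(0)
null_stats = [rng.gauss(0.0, 1.0) for _ in range(1000)]

# An observed statistic far beyond anything in the simulated null.
observed = 10.0

naive, corrected = permutation_p_values(observed, null_stats)

# The naive estimate is 0, so -log10 is undefined; the corrected one is finite.
neg_log10 = -math.log10(corrected)
print(naive, corrected, neg_log10)
```

With the corrected estimate, the smallest attainable p-value is 1 / (# permutations + 1), so every gene gets a finite y-coordinate on the volcano plot. The genes that would otherwise have p = 0 all land on the same ceiling, which reflects the resolution limit of the permutation scheme rather than a real tie.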
Also, see a related paper by B. Phipson and G. Smyth (Stat Appl Genet Mol Biol. 2010): "Permutation P-values should never be zero: calculating exact P-values when permutations are randomly drawn".