Question

The distribution function for zero-altered negative binomial density produces a probability greater 1

0

Entering edit mode

mrk.carty ▴ 30

@mrkcarty-7442

Last seen 9.5 years ago

United States

Hi all,

I'm using the pzanegbin distribution function from the VGAM package to compute p-value for events coming from hurdle negative binomial regression null distribution. Here are details of the computation:

> library(VGAM)
Loading required package: stats4
Loading required package: splines
> options(digits=22)
> 1 - pzanegbin(q = 480,size = 3.212259754597279748367,munb = 30.57921554406697595141 ,pobs0 = 0.04685295678266754304531 )
[1] -2.220446049250313080847e-16
> sessionInfo()
R version 3.2.1 (2015-06-18)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X 10.9.5 (Mavericks)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] splines stats4 stats graphics grDevices utils datasets methods base

other attached packages:
[1] VGAM_0.9-8
>

I'm not sure why I'm getting this result...could this be a bug in the code?

Best,

Mark

VGAM pzanegbin • 1.0k views

ADD COMMENT • link updated 9.5 years ago by Ryan C. Thompson ★ 7.9k • written 9.5 years ago by mrk.carty ▴ 30

score 1 · Answer 1 · 2015-10-30

My guess is that there is no closed-form solution for the CDF of this non-standard distribution, so you are getting a numerical approximation, which isn't guaranteed to stay within the expected domain of [0,1]. You could use pmin(..., 1) to clamp the maximum value of the function to 1, but see below:

Edit: I just noticed that the value you get is exactly equal to -.Machine$double.eps:

> .Machine$double.eps
[1] 2.220446049250313080847e-16
> 1 - pzanegbin(q = 445,size = 3.212259754597279748367,munb = 30.57921554406697595141  ,pobs0 = 0.04685295678266754304531 )
[1] -2.220446049250313080847e-16
> 1 - pzanegbin(q = 445,size = 3.212259754597279748367,munb = 30.57921554406697595141  ,pobs0 = 0.04685295678266754304531 ) == -.Machine$double.eps
[1] TRUE
> 1 - pzanegbin(q = Inf,size = 3.212259754597279748367,munb = 30.57921554406697595141  ,pobs0 = 0.04685295678266754304531 ) == -.Machine$double.eps
[1] TRUE

According to the help text for .Machine, this value is "the smallest positive floating-point number 'x' such that '1 + x != 1'". This means that for the values you are passing to the function, it is operating near the limits of floating-point accuracy, and any calculations are likely to have very large errors as a result.