Coverage Probability of Random Intervals
In this paper, we develop a general theory on the coverage probability of random intervals defined in terms of discrete random variables with continuous parameter spaces. The theory shows that the minimum coverage probabilities of random intervals with respect to the corresponding parameters are attained on finite discrete sets, and that the coverage probabilities are continuous and unimodal as the parameters vary between interval endpoints. The theory applies to common and important discrete random variables, including binomial, Poisson, negative binomial, and hypergeometric random variables. The theory can be used to make the relevant statistical inference more rigorous and less conservative.
💡 Research Summary
The paper develops a comprehensive theory for the coverage probability of random intervals that are defined by discrete random variables whose distributions depend on a continuous parameter. A random interval is specified by two functions, a lower bound L(θ) and an upper bound U(θ), each of which maps the continuous parameter θ to a value that determines whether a particular outcome of the discrete variable X falls inside the interval. The central object of study is the coverage probability C(θ)=Pθ{L(θ)≤X≤U(θ)} as a function of θ.
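As a minimal sketch of this definition, the snippet below evaluates C(θ) for a binomial variable with θ = p. The boundary functions L(p) = np − z·√(np(1−p)) and U(p) = np + z·√(np(1−p)) are assumptions chosen only for illustration (they are not taken from the paper), and the names `binom_pmf` and `coverage` are hypothetical.

```python
import math


def binom_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for X ~ Binomial(n, p)."""
    return math.comb(n, k) * p**k * (1.0 - p) ** (n - k)


def coverage(p: float, n: int, z: float = 1.96) -> float:
    """C(p) = P_p{L(p) <= X <= U(p)} for the illustrative boundaries
    L(p) = n p - z sqrt(n p (1 - p)) and U(p) = n p + z sqrt(n p (1 - p)).
    These boundaries are an assumption of this sketch."""
    half_width = z * math.sqrt(n * p * (1.0 - p))
    # Smallest and largest integers inside [L(p), U(p)], clipped to the support.
    k_min = max(0, math.ceil(n * p - half_width))
    k_max = min(n, math.floor(n * p + half_width))
    return sum(binom_pmf(k, n, p) for k in range(k_min, k_max + 1))


if __name__ == "__main__":
    for p in (0.1, 0.3, 0.5):
        print(f"p = {p:.1f}  C(p) = {coverage(p, n=10):.4f}")
```

For n = 10 and p = 0.5 the set of covered outcomes is {2, ..., 8}, so C(0.5) = 1002/1024 ≈ 0.9785. Because the covered set changes only when L(p) or U(p) crosses an integer, C(p) is piecewise smooth with jumps at those crossings.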
The first major result shows that the minimum of C(θ) over the entire admissible parameter space is always attained at a finite set Θ* of parameter values. Θ* consists of those θ for which either L(θ) or U(θ) coincides with a possible value of X, or lies immediately adjacent to such a value. In other words, the extremal points are determined by the points where the interval boundaries “touch” the discrete support of the distribution. The proof relies on expressing the probability mass function f_X(k; θ) in a difference form and exploiting the monotonicity of the boundary functions. Because the set of outcomes counted in C(θ) changes only when the parameter passes a point that aligns a boundary with a support point, the coverage function can only achieve a new minimum at those alignment points, of which there are finitely many.
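For the binomial case, one standard identity of this kind writes the derivative of each mass as a difference of Binomial(n−1, p) masses, so that a sum over consecutive outcomes telescopes. It is offered here as an illustration and is not necessarily the authors' exact expression. The notation b(k; n, p) for the binomial mass and the fixed integer limits k_1 ≤ k_2 are assumptions of this sketch, with the convention b(−1; n−1, p) = b(n; n−1, p) = 0:

```latex
\begin{align*}
\frac{\partial}{\partial p}\, b(k;\, n,\, p)
  &= n \bigl[\, b(k-1;\, n-1,\, p) - b(k;\, n-1,\, p) \,\bigr], \\
\frac{\partial}{\partial p} \sum_{k=k_1}^{k_2} b(k;\, n,\, p)
  &= n \bigl[\, b(k_1-1;\, n-1,\, p) - b(k_2;\, n-1,\, p) \,\bigr].
\end{align*}
```

While the integer limits stay fixed, the sum varies smoothly in p. The ratio b(k_1−1; n−1, p) / b(k_2; n−1, p) is proportional to ((1−p)/p)^(k_2−k_1+1), which is decreasing in p, so the derivative changes sign at most once, from positive to negative. Abrupt changes can therefore occur only when a limit moves to a new integer.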
The second major result concerns the shape of C(θ) between two consecutive points of Θ*. The authors prove that C(θ) is continuous and unimodal on each open interval (θ_i, θ_{i+1}). The argument uses differentiability of the boundary functions and of the probability mass function with respect to θ, leading to an explicit expression for the derivative C′(θ). This derivative changes sign exactly once within each sub‑interval, guaranteeing a single peak (or trough) and thus a unimodal profile. Consequently, once the finite set Θ* is identified, the global minimum can be found simply by evaluating C(θ) at those points; no exhaustive search over the continuous space is required.
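The finite search can be sketched as follows for a binomial variable. Everything specific here is an assumption made for illustration: the boundaries are taken as L(p) = np − z·√(np(1−p)) and U(p) = np + z·√(np(1−p)), so the alignment points L(p) = k or U(p) = k are the two roots of (k − np)² = z²·np(1−p), which coincide with the Wilson score endpoints for outcome k. Coverage is evaluated with strict inequalities, so an outcome sitting exactly on a boundary is excluded and the infimum is actually attained. The function names are hypothetical.

```python
import math


def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return math.comb(n, k) * p**k * (1.0 - p) ** (n - k)


def wilson_endpoints(k, n, z=1.96):
    """Both roots in p of (k - n p)^2 = z^2 n p (1 - p)."""
    denom = n + z * z
    center = (k + z * z / 2.0) / denom
    half = (z / denom) * math.sqrt(k * (n - k) / n + z * z / 4.0)
    return center - half, center + half


def coverage(p, intervals, n):
    """P_p{lo(X) < p < hi(X)}; an outcome whose endpoint equals p is excluded."""
    return sum(
        binom_pmf(k, n, p)
        for k, (lo, hi) in enumerate(intervals)
        if lo < p < hi
    )


def min_coverage(n, a, b, z=1.96):
    """Minimum coverage over [a, b], found by checking only a, b and the
    alignment points (interval endpoints) that fall strictly inside (a, b)."""
    intervals = [wilson_endpoints(k, n, z) for k in range(n + 1)]
    candidates = {a, b}
    for lo, hi in intervals:
        candidates.update(e for e in (lo, hi) if a < e < b)
    # Tuples compare by coverage first, so min() returns (coverage, argmin).
    return min((coverage(p, intervals, n), p) for p in candidates)


if __name__ == "__main__":
    c_min, p_min = min_coverage(n=20, a=0.05, b=0.95)
    print(f"minimum coverage {c_min:.4f} at p = {p_min:.4f}")
```

At most 2(n + 1) + 2 evaluations are needed, regardless of how finely the parameter range would otherwise have to be gridded. A dense grid search over [a, b] serves as a sanity check: under the assumptions above, it should never find a value below the one returned by `min_coverage`.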
To illustrate the practical impact, the theory is applied to four widely used discrete distributions:
- Binomial (n, p) – The parameter is the success probability p∈(0,1). The minimum coverage occurs when p equals k/n or (k+1)/n for some integer k, i.e., when the interval boundaries align with the possible numbers of successes.
- Poisson (λ) – The parameter is the mean λ>0. The minimum is attained when λ is an integer or lies exactly one unit away from an integer, because the Poisson mass function is concentrated on the non‑negative integers.
- Negative Binomial (r, p) – Here r (number of required successes) is fixed and p is the failure probability. The extremal points correspond to values of p that make r(1−p)/p an integer, which again aligns the interval limits with the support.
- Hypergeometric (N, K, n) – The parameters are the population size N, the number of “successes” K, and the sample size n. The minimum coverage is achieved when the expected number of successes K·n/N is an integer, i.e., when the interval boundaries coincide with a possible count of successes in the sample.
In each case, the authors compare the traditional “worst‑case” confidence intervals, which are constructed by assuming the most unfavorable parameter value over the whole space, with intervals derived from the new theory. The latter are markedly tighter while still guaranteeing the nominal coverage level. Simulation studies confirm that the new intervals maintain the prescribed coverage probability and reduce average length by a substantial margin.
Beyond these specific examples, the framework is highly general. Any statistical procedure that can be expressed as a random interval based on a discrete statistic with a continuous parameter falls under the theory. This includes Bayesian credible intervals (where the posterior is discrete), exact non‑parametric tests, and sequential sampling plans. The authors suggest several avenues for future work: extending the results to multivariate discrete vectors, handling parameters that lie in non‑connected spaces, and integrating the theory with optimal experimental design to further reduce conservatism.
Overall, the paper provides a rigorous yet practically implementable method for locating the exact points where coverage probability is minimized, thereby enabling statisticians to construct confidence or credible intervals that are both valid and less conservative.