Either a Confidence Interval Covers, or It Doesn't (Or Does It?): A Model-Based View of Ex-Post Coverage Probability
In Neyman’s original formulation, a 1 − α confidence interval procedure is justified by its long-run coverage properties, and a single realized interval is to be described only by the slogan that it either covers the parameter or it does not. On this view, post-data probability statements about the coverage of an individual interval are taken to be conceptually out of bounds. In this paper, I present two kinds of arguments against treating that “either-or” reading as the only legitimate interpretation of confidence. The first is informal, via a set of thought experiments in which the same joint probability model is used to compute both forward-looking and backward-looking probabilities for occurred-but-unobserved events. The second is more formal, recasting the standard confidence-interval construction in terms of infinite sequences of trials and their associated 0/1 coverage indicators. In that representation, the design-level coverage probability 1 − α and the degenerate conditional probabilities given the full data appear simply as different conditioning levels of the same model. I argue that a strict behavioristic reading that privileges only the latter is in tension with the very mathematical machinery used to define long-run error rates. I then sketch an alternative view of confidence as a predictive probability (or forecast) about the coverage indicator, together with a simple normative rule for when intermediate probabilities for single coverage events should be allowed.

Keywords: confidence intervals; coverage probability; frequentist inference; single-case probability; predictive probability; Neyman.

Disclaimer: The findings and conclusions in this report are those of the author and do not necessarily represent the official position of the Centers for Disease Control and Prevention.
💡 Research Summary
The paper revisits the classical Neyman (frequentist) interpretation of a 1‑α confidence interval, under which the procedure is justified solely by its long‑run coverage probability. On that view a realized interval is described only by the binary slogan “it either covers the parameter or it does not,” and any post‑data probability statement about the coverage of that single interval is deemed conceptually out of bounds. The author challenges this exclusive “either‑or” reading with two complementary arguments.
First, a series of informal thought experiments demonstrates that the same joint probability model can be used to compute both forward‑looking (predictive) and backward‑looking (post‑event) probabilities for events that have already occurred but remain unobserved (here, whether the realized interval actually covers the true parameter). In the forward‑looking perspective the model yields the design‑level coverage probability 1‑α, the proportion of intervals that would contain the parameter in an infinite sequence of repetitions. In the backward‑looking perspective one conditions on the observed data; since the 0/1 coverage indicator is a deterministic function of the fixed data and the fixed parameter, its conditional probability collapses to 0 or 1. However, when the data‑generating mechanism is retained as part of the probability space, the coverage indicator remains a random variable, and a predictive probability about it can be meaningfully defined.
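To make the two perspectives concrete, here is a minimal simulation sketch (not from the paper; the normal model, parameter values, and z‑interval are illustrative assumptions). Run forward, the coverage indicator behaves as a Bernoulli(1‑α) random variable; for any single realized sample it is simply 0 or 1.

```python
import numpy as np

rng = np.random.default_rng(0)

theta = 2.0          # the true parameter: fixed, but unknown in practice
sigma, n = 1.0, 25   # known noise scale and sample size (assumed for illustration)
z = 1.959964         # standard normal 0.975 quantile, for a 95% interval

def covers(sample):
    """Return the 0/1 coverage indicator for the usual z-interval."""
    xbar = sample.mean()
    half = z * sigma / np.sqrt(n)
    return int(xbar - half <= theta <= xbar + half)

# Forward-looking: before data are seen, the coverage indicator is a
# Bernoulli(1 - alpha) random variable under the joint model.
indicators = [covers(rng.normal(theta, sigma, n)) for _ in range(100_000)]
print(np.mean(indicators))   # ~0.95, the design-level coverage

# Backward-looking: for one realized sample the indicator is just 0 or 1 ...
one_sample = rng.normal(theta, sigma, n)
print(covers(one_sample))    # 0 or 1, nothing in between

# ... yet it is the same random variable, now conditioned on the full data;
# across repetitions these degenerate values still average to 1 - alpha.
```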
Second, the author formalises this intuition by recasting the usual confidence‑interval construction in terms of an infinite sequence (θ, X₁, X₂, …) and associated coverage indicators Cₙ∈{0,1}. The design‑level statement P(Cₙ=1)=1‑α is simply the marginal probability of coverage under the full model. The conditional statement P(Cₙ=1 | X₁,…,Xₙ) is the probability of coverage given the realized sample; after the sample is observed this conditional probability is degenerate (0 or 1), but it is still a conditional probability derived from the same underlying model. Thus the long‑run error rate and the post‑data “certainty” are just two different conditioning levels of a single probability model.
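Written out (a reconstruction from the description above; the interval symbol I_n is chosen here for illustration), the two statements are different conditioning levels of one model, linked by the tower property:

```latex
% Coverage indicator for the interval I_n built from X_1, ..., X_n
C_n = \mathbf{1}\{\theta \in I_n(X_1,\dots,X_n)\} \in \{0,1\}

% Design-level (marginal) statement under the full model
P(C_n = 1) = E[C_n] = 1 - \alpha

% Post-data (conditional) statement: with theta a fixed constant,
% C_n is a function of the data alone, so the conditional law is degenerate
P(C_n = 1 \mid X_1,\dots,X_n) \in \{0,1\}

% Tower property: the degenerate post-data values average back to 1 - alpha
E\bigl[\,P(C_n = 1 \mid X_1,\dots,X_n)\,\bigr] = P(C_n = 1) = 1 - \alpha
```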
The paper argues that a strict behaviorist stance—one that privileges only the degenerate conditional probability and rejects any intermediate probability for a single interval—is at odds with the mathematical machinery that defines long‑run error rates. While frequentists traditionally deny the existence of a meaningful post‑data probability, the author shows that the model itself supplies a coherent predictive probability about the coverage indicator, without invoking Bayesian priors.
To make the proposal operational, the author sketches a normative rule for when intermediate probabilities should be reported: (i) the data‑generating process must be explicitly modelled; (ii) the observed data set must be sufficiently informative to quantify uncertainty about the parameter; and (iii) the decision context must benefit from a quantified forecast of coverage. In such settings, reporting a post‑data coverage probability (e.g., “there is a 92 % chance that this interval contains the true effect”) is both mathematically defensible and practically useful.
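The paper itself does not prescribe a particular computation, but one hypothetical setting in which such a forecast is well defined is when the parameter is itself drawn from a known population distribution, so that the full data‑generating process is explicitly modelled (condition (i)). The sketch below makes that assumption (all distributions and numbers are illustrative, not the author’s): P(C = 1 | data) is then an ordinary conditional probability, typically strictly between 0 and 1.

```python
import math
import numpy as np

rng = np.random.default_rng(1)

# Assumed known population of effects: theta ~ N(0, tau^2).  Modelling the
# parameter draw keeps the coverage indicator a genuine random variable
# even after the data arrive.
tau = 1.0
sigma, n = 1.0, 25
half = 1.959964 * sigma / math.sqrt(n)   # half-width of the usual 95% z-interval

def post_data_coverage(xbar):
    """P(theta in [xbar - half, xbar + half] | X-bar = xbar), from the
    conditional law of theta given the data under the joint normal model."""
    s2 = sigma**2 / n
    m = tau**2 * xbar / (tau**2 + s2)    # conditional mean of theta
    v = tau**2 * s2 / (tau**2 + s2)      # conditional variance of theta
    cdf = lambda t: 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))
    return cdf((xbar + half - m) / math.sqrt(v)) - cdf((xbar - half - m) / math.sqrt(v))

# One simulated study: draw an effect, then data, then forecast coverage.
theta = rng.normal(0.0, tau)
xbar = rng.normal(theta, sigma / math.sqrt(n))
print(f"post-data coverage forecast: {post_data_coverage(xbar):.3f}")
print("interval actually covers:", int(abs(xbar - theta) <= half))
```

Under this joint model such forecasts are calibrated by construction: averaged over repeated studies, their mean equals the long‑run coverage frequency.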
In conclusion, the paper does not deny the importance of the design‑level 1‑α guarantee; rather, it expands the interpretation of confidence intervals by showing that the same probability model can simultaneously support long‑run coverage guarantees and predictive statements about the coverage of an individual interval. This model‑based view bridges the gap between the traditional frequentist “either‑or” narrative and a more nuanced, probabilistic forecasting perspective, offering a richer language for statistical communication in fields such as policy analysis, clinical trials, and public health.