Binomial and ratio-of-Poisson-means frequentist confidence intervals applied to the error evaluation of cut efficiencies
The evaluation of the error to be attributed to cut efficiencies is a common question in the practice of experimental particle physics. Specifically, the need to evaluate the efficiency of cuts for background removal, when they are tested in a signal-free, background-only energy window, gives rise to a statistical problem that finds its natural framework in the ample family of solutions to two classical and closely related questions: the determination of confidence intervals for the parameter of a binomial proportion and for the ratio of Poisson means. In this paper the problem is first addressed from the traditional perspective and then naturally evolved towards the introduction of non-standard confidence intervals for both the binomial and Poisson cases; in particular, special emphasis is given to the intervals obtained by applying the likelihood-ratio ordering to the traditional Neyman prescription for the determination of confidence limits. Thanks to their reduced length and good coverage properties, the new intervals are well suited as an attractive alternative to the standard Clopper–Pearson and PDG intervals.
💡 Research Summary
The paper addresses a common statistical problem in experimental particle physics: how to assign an uncertainty to cut efficiencies when these efficiencies are measured in a signal‑free, background‑only energy window. In such a situation the data can be modeled either as a binomial process (the number of events that survive a cut out of a total number of background events) or as the ratio of two independent Poisson means (the counts before and after the cut). Both cases reduce to the classic problem of constructing confidence intervals for a binomial proportion p or for the ratio r = λ₁/λ₂ of two Poisson means.
Historically the community has relied on the Clopper–Pearson (CP) intervals for the binomial case and on the Particle Data Group (PDG) approximations (Gaussian or conditional‑binomial) for the Poisson‑ratio case. While CP intervals are “exact” in the sense that they guarantee coverage ≥ the nominal confidence level, they are notoriously conservative: for small sample sizes or extreme outcomes (k = 0 or k = n) the intervals become excessively wide, inflating the quoted uncertainty. The PDG approximations suffer from similar problems when event counts are low or the ratio is far from unity, because they depend on symmetric Gaussian approximations that break down in the tails.
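For reference, the "exact" Clopper–Pearson interval discussed above can be computed with nothing more than the binomial tail sums and a bisection; the sketch below is an illustration in plain Python (function names and tolerances are our own choices, not the paper's code):

```python
import math

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def solve(f, lo=0.0, hi=1.0, tol=1e-10):
    """Bisection for a decreasing f with f(lo) > 0 > f(hi)."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def clopper_pearson(k, n, cl=0.95):
    """Exact (conservative) CP interval for a binomial proportion."""
    alpha = 1 - cl
    # Lower limit: p such that P(X >= k | p) = alpha/2 (0 when k = 0).
    lower = 0.0 if k == 0 else solve(lambda p: alpha / 2 - (1 - binom_cdf(k - 1, n, p)))
    # Upper limit: p such that P(X <= k | p) = alpha/2 (1 when k = n).
    upper = 1.0 if k == n else solve(lambda p: binom_cdf(k, n, p) - alpha / 2)
    return lower, upper
```

The k = 0 and k = n edge cases illustrate the conservatism the text mentions: for k = 0 out of n = 10 at 95 % CL the upper limit is 1 − (α/2)^(1/10) ≈ 0.31, a rather wide bound for a "zero failures" observation.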
The authors propose an alternative construction based on the likelihood‑ratio ordering principle applied within the Neyman confidence‑belt framework. The method proceeds as follows:
- For each candidate value of the parameter θ (p for the binomial, r for the Poisson ratio), compute the probability of every possible experimental outcome.
- For each outcome, determine the maximum-likelihood estimate of the parameter (p̂ = k/n for the binomial; r̂ = N₁/N₂ for the Poisson case) and the corresponding maximum of the likelihood, L_max.
- Form the likelihood ratio R = L(data | θ)/L_max for each outcome (written R here to avoid confusion with the Poisson means λ₁, λ₂).
- At fixed θ, rank the outcomes by decreasing R and add them to the acceptance region until their cumulative probability reaches the desired confidence level (e.g., 68 % or 95 %); the confidence interval then consists of all θ whose acceptance region contains the observed outcome.
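The steps above can be sketched for the binomial case with a brute-force scan over a parameter grid; this is a simplified illustration of a likelihood-ratio-ordered Neyman construction (grid resolution and function names are our assumptions, not the authors' implementation):

```python
import math

def pmf(k, n, p):
    """Binomial probability mass function."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def lr_interval(k_obs, n, cl=0.68, grid=2000):
    """Likelihood-ratio-ordered Neyman interval for Binomial(n, p).

    For each trial p, outcomes k are ranked by R(k) = P(k|p) / P(k|p_hat)
    with p_hat = k/n, and accepted in decreasing R until their summed
    probability reaches cl; the interval is the set of p whose acceptance
    region contains the observed k_obs."""
    accepted = []
    for i in range(grid + 1):
        p = i / grid
        # Rank all outcomes by decreasing likelihood ratio.
        ranks = sorted(range(n + 1),
                       key=lambda k: pmf(k, n, p) / pmf(k, n, k / n),
                       reverse=True)
        total, region = 0.0, set()
        for k in ranks:
            region.add(k)
            total += pmf(k, n, p)
            if total >= cl:
                break
        if k_obs in region:
            accepted.append(p)
    return min(accepted), max(accepted)
```

Note that for k_obs = 0 the interval starts exactly at p = 0, with no artificial lower gap; this unified treatment of the extreme outcomes is one of the attractions of the ordering.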
Because the ordering is driven by the likelihood ratio rather than by the raw probability of the data, the resulting intervals automatically adapt to the shape of the likelihood function. In the binomial case the ordering favours the outcomes closest to the ML estimate in the likelihood-ratio sense, which yields intervals that are substantially shorter than CP intervals while preserving the nominal coverage. In the Poisson-ratio case, conditioning on the total count N₁ + N₂ reduces the joint likelihood of (N₁, N₂) to a binomial likelihood in p = r/(1 + r), and the same ordering produces intervals that are symmetric in the likelihood-ratio sense even when the underlying distribution is highly asymmetric.
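The Poisson-ratio case can be sketched through this conditioning: given N = N₁ + N₂, the count N₁ follows Binomial(N, p) with p = r/(1 + r), a standard result for independent Poisson variables, so any binomial interval on p maps to an interval on r via r = p/(1 − p). A minimal sketch, here using Clopper–Pearson bounds for the binomial step (function names are illustrative; the paper's preferred choice would be the likelihood-ratio-ordered interval):

```python
import math

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def cp_bounds(k, n, cl):
    """Clopper-Pearson bounds on a binomial proportion, via bisection."""
    a = 1 - cl
    def root(f):
        lo, hi = 0.0, 1.0
        for _ in range(60):
            mid = 0.5 * (lo + hi)
            lo, hi = (mid, hi) if f(mid) > 0 else (lo, mid)
        return 0.5 * (lo + hi)
    lower = 0.0 if k == 0 else root(lambda p: a / 2 - (1 - binom_cdf(k - 1, n, p)))
    upper = 1.0 if k == n else root(lambda p: binom_cdf(k, n, p) - a / 2)
    return lower, upper

def poisson_ratio_interval(n1, n2, cl=0.95):
    """Conditional interval for r = lam1/lam2.

    Given N = n1 + n2, N1 ~ Binomial(N, p) with p = r/(1 + r),
    so the binomial bounds on p map to bounds on r = p/(1 - p)."""
    p_lo, p_hi = cp_bounds(n1, n1 + n2, cl)
    r_lo = p_lo / (1 - p_lo)
    r_hi = math.inf if p_hi == 1.0 else p_hi / (1 - p_hi)
    return r_lo, r_hi
```

The monotone map p ↦ p/(1 − p) preserves coverage exactly, which is why the conditional-binomial route inherits whatever coverage properties the chosen binomial interval has.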
The paper provides extensive Monte‑Carlo studies for a variety of (n, k) and (N₁, N₂) configurations. The key findings are:
- The likelihood‑ratio (LR) intervals are on average 15 %–30 % narrower than CP intervals for the binomial proportion, with the greatest gain when k is near 0 or n.
- Coverage of the LR intervals is very close to the nominal level across the whole parameter space; under‑coverage is essentially absent, and over‑coverage is dramatically reduced compared with CP.
- For the ratio of Poisson means, the LR intervals outperform the PDG Gaussian and conditional‑binomial approximations, especially for low counts or extreme ratios, delivering accurate coverage with minimal length.
Implementation details are discussed. The authors supply algorithms that can be coded in common analysis frameworks (ROOT, Python, C++). The parameter space is discretized on a fine grid, the likelihood ratio is evaluated point‑by‑point, and the cumulative probability is obtained either by direct summation or by a fast Monte‑Carlo sampling of the underlying distribution. Public code and lookup tables are provided to facilitate adoption.
In the concluding discussion the authors argue that modern high‑precision experiments (e.g., LHC analyses, neutrino oscillation measurements, dark‑matter direct‑detection searches) require uncertainties that are neither overly conservative nor under‑estimated. The LR‑based intervals strike a balance: they retain the rigorous frequentist guarantee of coverage while delivering substantially tighter error bars than the traditional CP or PDG methods. Consequently, they constitute a practical and statistically sound alternative for the routine evaluation of cut‑efficiency uncertainties in particle‑physics data analysis.