Negatively Biased Relevant Subsets Induced by the Most-Powerful One-Sided Upper Confidence Limits for a Bounded Physical Parameter

Notice: This research summary and analysis were automatically generated using AI technology. For absolute accuracy, please refer to the [Original Paper Viewer] below or the Original ArXiv Source.

Suppose an observable x is the measured value (negative or non-negative) of a true mean mu (physically non-negative) in an experiment with a Gaussian resolution function with known fixed rms deviation s. The most powerful one-sided upper confidence limit at 95% C.L. is UL = x+1.64s, which I refer to as the “original diagonal line”. Perceived problems in HEP with small or non-physical upper limits for x<0 historically led, for example, to substitution of max(0,x) for x, and eventually to abandonment in the Particle Data Group’s Review of Particle Physics of this diagonal line relationship between UL and x. Recently Cowan, Cranmer, Gross, and Vitells (CCGV) have advocated a concept of “power constraint” that when applied to this problem yields variants of diagonal line, including UL = max(-1,x)+1.64s. Thus it is timely to consider again what is problematic about the original diagonal line, and whether or not modifications cure these defects. In a 2002 Comment, statistician Leon Jay Gleser pointed to the literature on recognizable and relevant subsets. For upper limits given by the original diagonal line, the sample space for x has recognizable relevant subsets in which the quoted 95% C.L. is known to be negatively biased (anti-conservative) by a finite amount for all values of mu. This issue is at the heart of a dispute between Jerzy Neyman and Sir Ronald Fisher over fifty years ago, the crux of which is the relevance of pre-data coverage probabilities when making post-data inferences. The literature describes illuminating connections to Bayesian statistics as well. Methods such as that advocated by CCGV have 100% unconditional coverage for certain values of mu and hence formally evade the traditional criteria for negatively biased relevant subsets; I argue that concerns remain. Comparison with frequentist intervals advocated by Feldman and Cousins also sheds light on the issues.


💡 Research Summary

The paper revisits a classic problem in high‑energy physics: estimating an upper limit on a non‑negative physical parameter µ when the measured quantity x follows a Gaussian distribution with known standard deviation σ. The historically standard 95 % one‑sided upper limit, µ_UL = x + 1.64σ, is referred to as the “original diagonal line”. While this construction guarantees that, before data are taken, 95 % of repeated experiments will produce intervals covering the true µ (unconditional coverage), it exhibits serious shortcomings once the data are observed, especially when x is negative.

The author draws on Leon Gleser’s work on “recognizable relevant subsets” to show that the sample space for x contains subsets—most notably the region x < –1.64σ—in which the conditional coverage of the original diagonal line is strictly less than the nominal 95 %. In these subsets the procedure is “negatively biased” (anti‑conservative). A betting game illustrates the point: if a statistician (Peter) declares after each experiment that µ ≤ x + 1.64σ with 95 % confidence, an opponent (Paula) can safely bet against him whenever x < –1.64σ and win in the long run, despite using no more information than Peter. This demonstrates that the original method does not make the most relevant inference for the observed data, a tension that echoes the historic Neyman–Fisher debate over the role of pre‑data coverage versus post‑data statements.
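The size of this negative bias can be checked directly from Gaussian tail probabilities. The sketch below is a minimal illustration, not the paper's own calculation: the function name `conditional_coverage` and the choice of x < 0 as a second example subset are assumptions made here. It evaluates P(µ ≤ x + 1.64σ | x < cut) in closed form.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()
Z95 = STD_NORMAL.inv_cdf(0.95)  # about 1.645


def conditional_coverage(mu, cut, sigma=1.0, cl=0.95):
    """P(mu <= x + z*sigma | x < cut) for x ~ Normal(mu, sigma).

    Hypothetical helper: the limit covers mu exactly when x >= mu - z*sigma,
    so the unconditional miss probability is 1 - cl.
    """
    p_subset = STD_NORMAL.cdf((cut - mu) / sigma)   # P(x < cut)
    if p_subset == 0.0:
        return float("nan")                         # subset never occurs numerically
    p_miss = 1.0 - cl                               # P(x < mu - z*sigma)
    # Probability of being in the subset AND covering mu.
    p_cover_in_subset = max(0.0, p_subset - p_miss)
    return p_cover_in_subset / p_subset


if __name__ == "__main__":
    print("mu/sigma  cov|x<-1.645  cov|x<0")
    for mu in (0.0, 0.5, 1.0, 2.0):
        print(f"{mu:8.1f}  {conditional_coverage(mu, -Z95):12.3f}"
              f"  {conditional_coverage(mu, 0.0):7.3f}")
```

Under these assumptions the conditional coverage in the subset x < –1.64σ is zero for every physical µ, and in the subset x < 0 it never exceeds 90 %, which is why Paula's bets pay off in the long run.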

The paper surveys several ad‑hoc modifications that have been employed in particle‑physics analyses to avoid “unphysical” limits. The first replaces x by max(0, x), yielding µ_UL = max(0, x) + 1.64σ. This guarantees a limit of at least 1.64σ and gives 100 % coverage for µ ≤ 1.64σ, so it is overly conservative for small µ and returns to exactly 95 % coverage only for larger µ. The second, advocated by Cowan, Cranmer, Gross, and Vitells (CCGV), introduces a “power‑constrained limit” (PCL): µ_UL = max(–σ, x) + 1.64σ, corresponding to a 16 % power constraint. The PCL has exact 95 % unconditional coverage for µ > 0.64σ and 100 % coverage for µ ≤ 0.64σ, thereby formally evading the criteria for negatively biased relevant subsets. However, the author argues that the same logical problem persists: in the region where the limit is forced to a constant, the conditional coverage still deviates from the nominal level, and the method remains a patch rather than a principled solution.
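The unconditional coverage of these variants is easy to check by simulation. The following is a minimal Monte Carlo sketch with σ = 1 by default; the helper names `upper_limit` and `coverage` and the `floor` parameter are hypothetical, introduced only for illustration.

```python
import random
from statistics import NormalDist

Z95 = NormalDist().inv_cdf(0.95)  # about 1.645


def upper_limit(x, floor=None, sigma=1.0):
    """Diagonal-line limit; `floor` (same units as x) clips x from below.

    floor=None   -> original diagonal line, x + 1.645*sigma
    floor=0.0    -> max(0, x) + 1.645*sigma
    floor=-sigma -> power-constrained variant, max(-sigma, x) + 1.645*sigma
    """
    x_eff = x if floor is None else max(floor, x)
    return x_eff + Z95 * sigma


def coverage(mu, floor=None, sigma=1.0, n=100_000, seed=1):
    """Fraction of simulated experiments whose upper limit is >= mu."""
    rng = random.Random(seed)
    hits = sum(mu <= upper_limit(rng.gauss(mu, sigma), floor, sigma)
               for _ in range(n))
    return hits / n


if __name__ == "__main__":
    print("mu/sigma  original  max(0,x)  max(-1,x)")
    for mu in (0.0, 0.5, 1.0, 2.0):
        print(f"{mu:8.1f}  {coverage(mu):8.3f}  {coverage(mu, 0.0):8.3f}"
              f"  {coverage(mu, -1.0):9.3f}")
```

In this sketch the clipped rules cover with certainty whenever µ is no larger than the smallest limit they can return (1.64σ for floor = 0, 0.64σ for floor = –σ) and agree with the original diagonal line at about 95 % above that threshold.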

The paper also discusses Bayesian upper limits computed with a uniform prior for µ ≥ 0 and the CLs technique; in this simple Gaussian case the two produce identical limits, which are conservative from the frequentist point of view. Bayesian intervals depend on the chosen prior, but because the posterior is confined to µ ≥ 0 they are never empty, even when x is very negative.
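For this Gaussian model, the credible upper limit from a prior uniform in µ ≥ 0 and the CLs upper limit both reduce to one closed form, µ_UL = x – σ Φ⁻¹(α Φ(x/σ)) with α = 0.05, where Φ is the standard normal CDF. The sketch below evaluates it; the function name is hypothetical, and the formula assumes exactly the model described above (known σ, flat prior truncated at zero).

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()


def bayes_cls_upper_limit(x, sigma=1.0, cl=0.95):
    """Upper limit solving CLs = Phi((x - UL)/sigma) / Phi(x/sigma) = 1 - cl.

    The same equation follows from integrating the truncated-Gaussian
    posterior for a prior uniform in mu >= 0. Written with Phi^-1 of a small
    tail probability so it stays numerically stable for negative x.
    Note: for extremely negative x/sigma, Phi(x/sigma) underflows to 0 and
    inv_cdf raises StatisticsError.
    """
    alpha = 1.0 - cl
    tail = alpha * STD_NORMAL.cdf(x / sigma)
    return x - sigma * STD_NORMAL.inv_cdf(tail)


if __name__ == "__main__":
    for x in (-3.0, -1.0, 0.0, 1.0, 3.0):
        print(f"x = {x:+.1f} sigma -> UL = {bayes_cls_upper_limit(x):.3f} sigma")
```

The limit stays positive for every x in this range and approaches x + 1.64σ as x grows large and positive.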

The most satisfactory solution presented is the Feldman–Cousins (FC) construction. By ordering outcomes according to the likelihood‑ratio statistic, the FC method yields confidence belts that automatically transition from one‑sided upper limits to two‑sided intervals as x increases, never producing an empty set and preserving exact 95 % coverage for every µ. Consequently, no recognizable relevant subsets with negative bias exist under the FC prescription.
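The upper end of the FC interval can be sketched numerically. The code below assumes the standard likelihood‑ratio ordering for a Gaussian with the boundary µ ≥ 0, i.e. R(x) = L(x | µ) / L(x | µ_best) with µ_best = max(0, x), and works in units of σ. The function names are hypothetical, and only the upper endpoint is computed; the lower endpoint is omitted for brevity.

```python
from math import sqrt
from statistics import NormalDist

PHI = NormalDist().cdf


def acceptance_prob(mu, x0):
    """Probability content of the region {x : R(x) >= R(x0)} for x0 <= mu.

    -2 ln R(x0) = (x0 - mu)^2 - min(x0, 0)^2. The region is the interval
    [x0, mu + d] with d = sqrt(-2 ln R(x0)), because R falls off like a
    Gaussian in (x - mu) for x above mu.
    """
    m2lnr = max(0.0, (x0 - mu) ** 2 - min(x0, 0.0) ** 2)
    d = sqrt(m2lnr)
    return PHI(d) - PHI(x0 - mu)


def fc_upper_limit(x0, cl=0.95, tol=1e-10):
    """Largest mu whose acceptance region still contains x0 (sigma = 1).

    mu is accepted while acceptance_prob(mu, x0) <= cl; that probability
    rises monotonically with mu, so plain bisection is enough.
    """
    lo = max(x0, 0.0)   # acceptance_prob is below cl here
    hi = lo + 10.0      # and effectively 1 here
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if acceptance_prob(mid, x0) < cl:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    for x0 in (-3.0, -2.0, -1.0, 0.0, 1.0):
        print(f"x = {x0:+.1f} sigma -> FC upper limit = {fc_upper_limit(x0):.2f} sigma")
```

In this sketch the upper limit equals x + 1.96σ for x ≥ 0 and decreases smoothly toward zero, without ever reaching it, as x becomes more negative.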

In conclusion, the author emphasizes that the original diagonal‑line upper limit, while optimal in the Neyman–Pearson sense of maximal power, can be misleading for individual experiments because it ignores ancillary information (such as the sign of x) that is relevant to the post‑data inference. Modifications that simply truncate or shift the line either become overly conservative or retain hidden conditional bias. The Feldman–Cousins approach, by integrating the ordering principle, resolves the conflict between pre‑data coverage guarantees and post‑data relevance, and should be preferred for reporting upper limits on bounded physical parameters.

