Asymptotic formulae for likelihood-based tests of new physics

We describe likelihood-based statistical tests for use in high energy physics for the discovery of new phenomena and for construction of confidence intervals on model parameters. We focus on the properties of the test procedures that allow one to account for systematic uncertainties. Explicit formulae for the asymptotic distributions of test statistics are derived using results of Wilks and Wald. We motivate and justify the use of a representative data set, called the “Asimov data set”, which provides a simple method to obtain the median experimental sensitivity of a search or measurement as well as fluctuations about this expectation.


💡 Research Summary

The paper presents a comprehensive framework for likelihood‑based statistical tests that are now standard in high‑energy physics (HEP) for both discovery searches and parameter estimation. It begins by outlining the challenges faced in HEP analyses: large data samples, many nuisance parameters representing systematic uncertainties, and the need for fast yet accurate significance calculations. The authors adopt a frequentist perspective and focus on the profile likelihood ratio as the central test statistic.

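The basic model is defined by an observable data vector x whose probability density f(x|μ, θ) depends on a signal strength parameter μ (the parameter of interest) and a set of nuisance parameters θ that encode backgrounds, detector effects, and other systematic effects. The full likelihood L(μ, θ) = ∏i f(xi|μ, θ) is maximized twice: globally, which gives the unconditional estimators (μ̂, θ̂), and for a fixed value of μ, which gives the conditional estimator θ̂μ of the nuisance parameters. The profile likelihood ratio λ(μ) = L(μ, θ̂μ)/L(μ̂, θ̂) lies between 0 and 1, with values near 1 indicating good agreement between the data and the hypothesized μ. It leads to the test statistics qμ = –2 ln λ(μ) for testing a specific signal strength and q0 = –2 ln λ(0) for testing the background-only hypothesis; larger values of either statistic correspond to greater incompatibility with the tested hypothesis.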
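As an illustration of the profile likelihood ratio, the following is a minimal sketch for a counting experiment with one nuisance parameter. It assumes n ~ Poisson(μs + b) in the signal region and an auxiliary control measurement m ~ Poisson(τb), with s and τ known. The function names and the numerical inputs are hypothetical, and the closed-form estimators hold only for this simple model with n > 0, m > 0, and μ ≥ 0.

```python
import math

def log_likelihood(mu, b, n, m, s, tau):
    """Poisson log-likelihood for (n, m), dropping constant factorial terms."""
    return (n * math.log(mu * s + b) - (mu * s + b)
            + m * math.log(tau * b) - tau * b)

def profiled_background(mu, n, m, s, tau):
    """Conditional ML estimate of b for fixed mu.

    Setting d(ln L)/db = 0 gives a quadratic in b; this is its positive root.
    """
    k = n + m - (1.0 + tau) * mu * s
    disc = k * k + 4.0 * (1.0 + tau) * m * mu * s
    return (k + math.sqrt(disc)) / (2.0 * (1.0 + tau))

def minus_two_log_lambda(mu, n, m, s, tau):
    """-2 ln lambda(mu) with the nuisance parameter b profiled out."""
    mu_hat = (n - m / tau) / s   # unconditional ML estimate of mu
    b_hat = m / tau              # unconditional ML estimate of b
    b_cond = profiled_background(mu, n, m, s, tau)
    return -2.0 * (log_likelihood(mu, b_cond, n, m, s, tau)
                   - log_likelihood(mu_hat, b_hat, n, m, s, tau))

# Hypothetical inputs: 25 events observed, 20 in the control region.
q0 = minus_two_log_lambda(0.0, n=25, m=20, s=10.0, tau=2.0)
print(f"-2 ln lambda(0) = {q0:.3f}")
```

Evaluating the function at μ = μ̂ returns zero up to rounding, which is a quick sanity check that the conditional and unconditional fits agree at the global maximum.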

Using Wilks’ theorem, the authors show that in the asymptotic (large-sample) limit qμ follows a χ² distribution with one degree of freedom when the tested value of μ is true and μ̂ is unrestricted. When the physical constraint μ ≥ 0 is imposed, the distribution of q0 under the background-only hypothesis becomes an equal mixture of a point mass at zero and a χ² distribution with one degree of freedom, reflecting the one-sided nature of discovery tests. Wald’s approximation further simplifies qμ to (μ – μ̂)²/σ², up to terms that vanish as the sample size grows, where μ̂ is Gaussian distributed about the true signal strength with standard deviation σ obtained from the curvature of the likelihood. Because this form also describes the distribution of qμ when the true signal strength differs from the tested one (a noncentral χ² distribution), it removes the need for costly Monte Carlo pseudo-experiments and provides analytic expressions for p-values and confidence intervals.
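The asymptotic distributions translate into closed-form p-values. The sketch below assumes the half-χ² form for the discovery statistic, so that p0 = 1 – Φ(√q0) and Z = √q0, and the plain χ² form with one degree of freedom for an unrestricted two-sided test. The function names are hypothetical and only the Python standard library is used.

```python
import math

def p_value_discovery(q0):
    """One-sided p-value 1 - Phi(sqrt(q0)) for the background-only test."""
    # 1 - Phi(x) equals 0.5 * erfc(x / sqrt(2)) for a standard Gaussian.
    return 0.5 * math.erfc(math.sqrt(q0) / math.sqrt(2.0))

def significance_discovery(q0):
    """Equivalent Gaussian significance Z = sqrt(q0)."""
    return math.sqrt(q0)

def p_value_two_sided(t_mu):
    """p-value when -2 ln lambda(mu) follows chi-square with one dof."""
    return math.erfc(math.sqrt(t_mu / 2.0))

print(p_value_discovery(25.0))   # q0 = 25 corresponds to Z = 5
print(p_value_two_sided(3.841))  # close to the 95% point of chi-square(1)
```

These formulae are only as good as the large-sample approximation; for small event counts a check against pseudo-experiments is still advisable.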

Systematic uncertainties are incorporated by profiling over the nuisance parameters. The authors discuss how the covariance matrix of the nuisance parameters, derived from the Fisher information, enters the Wald approximation and how correlations among systematics are naturally accounted for. This profiling approach yields a test statistic that already reflects the impact of all systematic variations without the need for ad‑hoc “pull” terms.
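In symbols, and assuming the convention that the parameter of interest is listed first (θ0 ≡ μ, followed by the nuisance parameters), the relation between the Fisher information and the variance used in the Wald approximation can be written as:

```latex
V^{-1}_{ij} = -E\left[ \frac{\partial^2 \ln L}{\partial \theta_i \, \partial \theta_j} \right],
\qquad
\sigma^2 = V_{00} \quad (\theta_0 \equiv \mu)
```

Because σ² is an element of the inverted full matrix rather than the inverse of a single diagonal term, correlations between μ and the nuisance parameters enlarge σ automatically.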

A major conceptual contribution is the introduction of the “Asimov data set,” a fictitious data set in which the observed counts equal their expected values under a given model. When the profile likelihood is evaluated on the Asimov data, the maximum-likelihood estimators return the parameter values assumed in constructing it, and the resulting statistic qμ,A gives the variance of μ̂ through σ² = (μ – μ′)²/qμ,A, where μ′ is the assumed signal strength. For a discovery test the median significance follows as Zmed = √q0,A, with q0,A evaluated on the Asimov data set built under the signal hypothesis. This provides a fast, deterministic way to estimate the expected sensitivity of an experiment, to compare different analysis strategies, and to propagate systematic variations into the expected significance. Combined with the Wald approximation, the Asimov value of σ also yields the expected distribution of the test statistic under both the null and alternative hypotheses, allowing the construction of expected exclusion limits and discovery potentials, including the bands describing fluctuations about the median, without repeated pseudo-experiment generation.
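For the simplest case, a single-bin counting experiment n ~ Poisson(s + b) with b known exactly, the Asimov recipe can be carried out by hand: setting n = s + b in q0 gives Zmed = √(2[(s + b) ln(1 + s/b) – s]). The sketch below evaluates this expression and compares it with the familiar s/√b estimate; the function names and inputs are hypothetical, and the background is assumed to carry no systematic uncertainty.

```python
import math

def asimov_discovery_significance(s, b):
    """Median discovery significance sqrt(q0_A) for known background b."""
    if s <= 0 or b <= 0:
        raise ValueError("s and b must be positive")
    # q0 evaluated on the Asimov data set n = s + b (so mu_hat = 1).
    q0_asimov = 2.0 * ((s + b) * math.log(1.0 + s / b) - s)
    return math.sqrt(q0_asimov)

def naive_significance(s, b):
    """Leading-order estimate, valid only when s is much smaller than b."""
    return s / math.sqrt(b)

for s, b in [(10.0, 100.0), (10.0, 1.0)]:
    z_a = asimov_discovery_significance(s, b)
    z_n = naive_significance(s, b)
    print(f"s={s}, b={b}: Asimov Z={z_a:.3f}, s/sqrt(b)={z_n:.3f}")
```

The two estimates agree when s ≪ b and diverge when the background is small, where s/√b overstates the sensitivity.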

The theoretical results are validated with realistic examples from the Large Hadron Collider (LHC). The authors compare the asymptotic formulas to full Monte‑Carlo ensembles for Higgs boson searches and for precision measurements of electroweak parameters. They demonstrate that, even with dozens of nuisance parameters, the asymptotic p‑values agree with the Monte‑Carlo results to within a few percent, while the computational cost is reduced by orders of magnitude. The Asimov‑based median sensitivity matches the median of the full ensemble, confirming its practical utility.

In conclusion, the paper establishes that likelihood-ratio tests, when combined with Wilks and Wald asymptotics and the Asimov data concept, provide a powerful, analytically tractable toolkit for HEP analyses. This framework enables rapid evaluation of discovery significance, confidence intervals, and exclusion limits while fully accounting for systematic uncertainties. The authors suggest future extensions to regimes where the asymptotic approximations break down (small data samples, boundary-parameter effects) and the integration of modern machine-learning models into the likelihood formalism.

