Limits, discovery and cut optimization for a Poisson process with uncertainty in background and signal efficiency: TRolke 2.0

A C++ class was written for the calculation of frequentist confidence intervals using the profile likelihood method. Seven combinations of Poissonian, Gaussian and Binomial uncertainties are implemented. The package provides routines for the calculation of upper and lower limits, sensitivity and related properties. It also supports hypothesis tests which take uncertainties into account. It can be used in compiled C++ code, in Python or interactively via the ROOT analysis framework.


💡 Research Summary

The paper presents TRolke 2.0, a C++ class that implements frequentist confidence‑interval construction for Poisson‑distributed signal searches while explicitly accounting for uncertainties in both background expectations and signal‑efficiency measurements. The authors adopt the profile‑likelihood method, which treats the nuisance parameters (background b and efficiency ε) as additional dimensions of the likelihood function and then “profiles” them out by maximizing the likelihood with respect to b and ε for each hypothesized signal strength μ. This approach yields confidence intervals whose coverage stays close to nominal even when the nuisance parameters are not known precisely.
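To make the profiling step concrete, here is a minimal stand-alone sketch (illustrative, not the TRolke code itself) for the simplest configuration: a Poisson signal region with mean μ + b, a background sideband τ times larger with mean τb, and known efficiency. Setting ∂ ln L/∂b = 0 for fixed μ gives a quadratic whose non-negative root is the conditional estimate b̂(μ):

```python
import math

def profiled_b(mu, n, m, tau):
    """Conditional MLE b-hat(mu) for fixed signal strength mu.

    Model: n ~ Pois(mu + b) in the signal region, m ~ Pois(tau * b)
    in a sideband tau times the size of the signal region.  Setting
    d(lnL)/db = 0 yields a quadratic in b with one non-negative root.
    """
    q = n + m - (1 + tau) * mu
    return (q + math.sqrt(q * q + 4 * (1 + tau) * m * mu)) / (2 * (1 + tau))

def log_lik(mu, b, n, m, tau):
    """Poisson log-likelihood with constant factorial terms dropped."""
    return n * math.log(mu + b) - (mu + b) + m * math.log(tau * b) - tau * b

def neg2_ln_lambda(mu, n, m, tau):
    """-2 ln lambda(mu): the profile-likelihood-ratio test statistic."""
    # Unconditional MLEs for this sketch (physical region mu >= 0)
    mu_hat = max(0.0, n - m / tau)
    b_hat = m / tau if mu_hat > 0 else (n + m) / (1 + tau)
    num = log_lik(mu, profiled_b(mu, n, m, tau), n, m, tau)
    den = log_lik(mu_hat, b_hat, n, m, tau)
    return 2.0 * (den - num)
```

For example, with n = 10 observed events, m = 5 sideband events and τ = 2, the statistic vanishes at μ̂ = n − m/τ = 7.5 and grows as μ moves away from that value.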

Seven distinct combinations of statistical models are supported, reflecting the most common ways experimental uncertainties arise: (i) Poisson background (from a sideband), Binomial efficiency; (ii) Gaussian background, Binomial efficiency; (iii) Gaussian background, Gaussian efficiency; (iv) Poisson background, Gaussian efficiency; (v) Gaussian background, known efficiency; (vi) known background, Binomial efficiency; and (vii) known background, Gaussian efficiency. For each case the full likelihood L(μ,b,ε) is written explicitly (e.g., L = Pois(n|με+b)·Pois(m|τb)·Bin(k|N,ε) for case i, where m is the count in a sideband τ times the size of the signal region), and the profile likelihood ratio λ(μ) = L(μ, b̂(μ), ε̂(μ))/L(μ̂, b̂, ε̂) is used as the test statistic. In the asymptotic regime, –2 ln λ(μ) follows a χ² distribution with one degree of freedom, allowing the construction of upper and lower limits by solving –2 ln λ(μ) = χ²₁,₁–α. The authors verify this approximation with extensive Monte‑Carlo studies, showing that coverage remains close to nominal even for modest event counts.

Beyond interval calculation, TRolke 2.0 provides tools for sensitivity estimation (the median expected upper limit under the background‑only hypothesis) and hypothesis testing (computing p‑values and power for a given signal strength). A notable feature is the built‑in cut‑optimization routine: users can define a selection variable, scan possible cut values, and automatically identify the cut that minimizes the expected upper limit, thereby balancing signal efficiency against background rejection in a statistically rigorous way.
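The optimization logic can be illustrated with a toy figure of merit. Here the expected upper limit is replaced by the simple proxy √b(c)/ε(c) (a common approximation in the background-dominated regime, not the paper's full profile-likelihood calculation), and the efficiency and background curves ε(c) = 1 − c/6 and b(c) = 100·e^(−c) are hypothetical stand-ins:

```python
import math

def efficiency(c):
    """Hypothetical signal efficiency as a function of the cut value c."""
    return 1.0 - c / 6.0

def background(c):
    """Hypothetical expected background as a function of the cut value c."""
    return 100.0 * math.exp(-c)

def optimize_cut(cuts):
    """Return the cut that minimizes the expected-limit proxy sqrt(b)/eff."""
    return min(cuts, key=lambda c: math.sqrt(background(c)) / efficiency(c))

cuts = [i * 0.01 for i in range(501)]  # scan c over [0, 5] in steps of 0.01
best = optimize_cut(cuts)              # analytic optimum for these curves: c = 4
```

TRolke 2.0 performs the same kind of scan but evaluates the full median expected upper limit at each cut value rather than a closed-form proxy.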

Implementation-wise, the class inherits from ROOT’s TObject and exposes a compact API: SetCL() to choose the confidence level, one setter per uncertainty model (e.g., SetPoissonBkgBinomEff() or SetGaussBkgGaussEff()), and GetLimits(), GetSensitivity() and GetCriticalNumber() for intervals, sensitivity and hypothesis tests. Because ROOT classes are automatically wrapped by PyROOT, the same methods are available from Python, enabling interactive analysis in Jupyter notebooks or integration into larger Python‑based workflows.

The paper concludes with worked examples representative of searches in high‑energy and astroparticle physics. Comparisons with the earlier Rolke‑López method demonstrate that TRolke 2.0’s explicit handling of mixed uncertainty models yields reliable limits without excessive conservatism. Overall, TRolke 2.0 offers a flexible, well‑validated solution for experiments where background and efficiency uncertainties cannot be ignored, facilitating robust frequentist inference in the presence of complex systematic effects.