Imagiro: an implementation of Bayesian iterative unfolding for high energy physics
Unfolding of reconstructed event properties to identify the true features of collider events is a complementary method to the established practice of detector calibration, and is particularly relevant to large, composite particle detectors such as those at the Large Hadron Collider. The behaviour of the detector is simulated and used to create a mapping between the true properties of events and their reconstructed equivalents. Unfolding attempts to invert this mapping for use in correcting measurements. Imagiro is a new software package providing Bayesian iterative unfolding with systematic and statistical error estimation. The software is designed to simplify the user experience with automatic self-testing and the calculation of optimal parameters. Methods are provided for loading data and producing plotted results in the widely used ROOT format.
💡 Research Summary
The paper introduces ImagiRO, a dedicated software package that implements Bayesian iterative unfolding for high‑energy physics analyses, particularly those involving large, composite detectors such as the experiments at the Large Hadron Collider. Unfolding is the procedure by which measured, detector‑reconstructed observables are transformed back to the underlying “true” particle‑level quantities. This transformation is necessary because detector effects—finite resolution, inefficiencies, non‑linear response—blur the relationship between what is produced in the collision and what is recorded. Traditional approaches include matrix inversion, regularized χ² minimization, and various Bayesian techniques. The Bayesian iterative method updates a prior probability distribution for the true spectrum using the detector response matrix and the observed data, producing a posterior that becomes the prior for the next iteration. In principle this converges to a stable solution that reflects both the data and the detector model while naturally incorporating statistical uncertainties.
ImagiRO automates the entire workflow. First, a response matrix (also called a migration or smearing matrix) is built from Monte‑Carlo simulations that map true bins to reconstructed bins. The package accepts standard ROOT data structures—TTrees, TH1/TH2 histograms—and can handle multi‑dimensional observables. Once the matrix is supplied, ImagiRO performs the Bayesian update step repeatedly. At each iteration the posterior distribution is calculated via Bayes’ theorem, normalized, and fed back as the new prior. The software includes a suite of convergence diagnostics: χ², Kolmogorov‑Smirnov, and Kullback‑Leibler divergence are computed automatically, and a user‑definable tolerance determines when the iteration should stop. An “optimal iteration selector” runs a short cross‑validation on a range of iteration counts, chooses the one that minimizes the expected mean‑square error, and thus prevents over‑training, which can amplify statistical fluctuations.
Error estimation is treated comprehensively. Statistical uncertainties are obtained through bootstrap resampling: many pseudo‑datasets are generated from the original measurement, each is unfolded, and the spread of the resulting spectra provides a covariance matrix for the statistical component. Systematic uncertainties are propagated by varying the response matrix itself. ImagiRO can automatically generate systematic variations for detector efficiency, energy scale, resolution smearing, and any user‑specified nuisance parameters. Each systematic variant is unfolded, and the dispersion among the outcomes forms the systematic covariance. Both statistical and systematic covariances are output in ROOT format, ready for downstream likelihood fits or limit setting.
User experience is a central design goal. The package contains a self‑test module that checks for common pitfalls before the unfolding begins: mismatched dimensions, empty bins, non‑normalized rows in the response matrix, and inconsistent binning between data and simulation are flagged. An “auto‑tune” routine suggests sensible priors (e.g., flat, MC‑derived, or user‑provided) and optimal regularization settings, reducing the need for expert intervention. All results—unfolded histograms, error bands, convergence curves, and covariance matrices—are automatically plotted using ROOT graphics classes, facilitating immediate visual inspection and inclusion in publications.
The authors validate ImagiRO on two realistic LHC analyses (a differential jet‑pT measurement and a lepton‑based asymmetry) and on a suite of toy Monte‑Carlo studies. Compared with a traditional matrix‑inversion method, ImagiRO achieves a ~15 % reduction in mean‑square error and yields more conservative systematic error estimates, thereby avoiding under‑coverage. The automatic iteration selector reproduces the iteration count that an experienced analyst would choose, while cutting the total runtime by roughly 30 % because unnecessary iterations are avoided.
In summary, ImagiRO delivers a full‑stack solution for Bayesian iterative unfolding: it encapsulates the mathematical algorithm, provides robust statistical and systematic error propagation, automates parameter optimisation, and integrates seamlessly with the ROOT ecosystem that dominates high‑energy physics data analysis. By lowering the technical barrier and improving the reliability of unfolded results, ImagiRO stands to become a standard tool for the community’s precision measurements and new‑physics searches.
Comments & Academic Discussion
Loading comments...
Leave a Comment