TUnfold: an algorithm for correcting migration effects in high energy physics

TUnfold is a tool for correcting migration and background effects in high energy physics for multi-dimensional distributions. It is based on a least square fit with Tikhonov regularisation and an optional area constraint. For determining the strength of the regularisation parameter, the L-curve method and scans of global correlation coefficients are implemented. The algorithm supports background subtraction and error propagation of statistical and systematic uncertainties, in particular those originating from limited knowledge of the response matrix. The program is interfaced to the ROOT analysis framework.


Research Summary

The paper presents TUnfold, a ROOT‑based software package designed to correct for migration effects and background contamination in multi‑dimensional distributions typical of high‑energy physics (HEP) analyses. At its core, TUnfold formulates the measurement problem as a linear system y = A x + b, where y is the observed histogram, x the true underlying distribution, A the detector response matrix, and b the background contribution. Solving this system by direct inversion of A is ill‑posed: small statistical fluctuations in y are amplified into large, oscillating fluctuations in x. TUnfold therefore employs a least‑squares fit augmented with Tikhonov regularisation and an optional area constraint. The regularisation term, τ²‖L(x − f_b x₀)‖² in the paper's convention (reducing to τ²‖L x‖² when no bias vector f_b x₀ is used), penalises solutions that are large or not smooth; L is typically the identity or a discrete first‑ or second‑derivative operator. The parameter τ controls the trade‑off between fidelity to the data (the residual term) and the regularisation condition.
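
As a sketch, the fit described above can be written as the minimisation of a χ² function. The form below uses the notation of this summary plus three symbols that are assumptions here: V_yy for the covariance matrix of y, f_b x_0 for an optional bias vector (zero if unused), and the regularisation strength written as τ². The optional area constraint is omitted for brevity.

```latex
\chi^2(x) = (y - b - A x)^{T} \, V_{yy}^{-1} \, (y - b - A x)
          + \tau^2 \, \lVert L \, (x - f_b x_0) \rVert^2

\hat{x} = \left( A^{T} V_{yy}^{-1} A + \tau^2 L^{T} L \right)^{-1}
          \left( A^{T} V_{yy}^{-1} (y - b) + \tau^2 L^{T} L \, f_b x_0 \right)
```

The second line follows from setting the gradient of χ² with respect to x to zero. For τ = 0 and a square, invertible A it reduces to the plain inversion x̂ = A⁻¹(y − b), which is the unstable case the regularisation is meant to avoid.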

Two complementary strategies are implemented for choosing the optimal τ. The first is the L‑curve method: a log‑log plot of the residual norm versus the regularisation norm is generated, and the point of maximum curvature is taken as the best compromise. The second is a scan of the global correlation coefficients: for each τ, the global correlation coefficient of every unfolded bin is computed from the covariance matrix of x, and the τ that minimises their average (or maximum) is selected. Users may adopt either approach or compare both to ensure robustness.
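
The correlation scan can be illustrated with a small, self-contained toy. This is a sketch under stated assumptions, not TUnfold's implementation: the response matrix, the counts, the choice L = identity, the τ grid and all helper names are hypothetical, and the global correlation coefficient is taken as ρ_i = sqrt(1 − 1/(V_ii (V⁻¹)_ii)).

```python
import math

def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def inverse(m):
    """Gauss-Jordan inverse with partial pivoting (adequate for tiny toy matrices)."""
    n = len(m)
    aug = [list(map(float, row)) + [float(i == j) for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

# Toy inputs (all hypothetical): 3 truth bins, 3 measured bins.
A = [[0.7, 0.2, 0.0],
     [0.2, 0.6, 0.2],
     [0.0, 0.2, 0.7]]        # A[i][j] = P(measured bin i | truth bin j)
y = [120.0, 150.0, 90.0]     # observed counts; Poisson covariance V_yy = diag(y)
n = len(A[0])

# M = A^T V_yy^{-1} A
At = [list(col) for col in zip(*A)]
M = matmul(At, [[A[i][j] / y[i] for j in range(n)] for i in range(len(y))])

def covariance(tau):
    """Covariance of the unfolded x for strength tau, assuming L = identity."""
    E = inverse([[M[i][j] + (tau ** 2 if i == j else 0.0) for j in range(n)]
                 for i in range(n)])
    return matmul(matmul(E, M), E)   # linear propagation of V_yy through the fit

def rho_avg(tau):
    """Average global correlation coefficient of the unfolded bins."""
    V = covariance(tau)
    Vinv = inverse(V)
    rhos = [math.sqrt(max(0.0, 1.0 - 1.0 / (V[i][i] * Vinv[i][i]))) for i in range(n)]
    return sum(rhos) / n

# Logarithmic scan of tau from 1e-3 to 1; keep the value with the smallest rho_avg.
taus = [10 ** (-3 + 3 * k / 30) for k in range(31)]
best_tau = min(taus, key=rho_avg)
```

In this toy the selected τ lies inside the scanned range: for very small τ the result approaches the strongly anti-correlated unregularised solution, while for very large τ the regularisation itself introduces positive correlations between neighbouring bins.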

Background handling is fully integrated. If the background vector b is known precisely, it is simply subtracted from y before unfolding. When b carries uncertainties, TUnfold treats them as systematic errors and propagates them to the covariance matrix of the result. Error propagation is performed in two stages. Statistical uncertainties of the data are propagated by linear error propagation from the covariance of y, yielding the corresponding covariance of the unfolded result x. Systematic uncertainties related to the response matrix, in particular those arising from limited Monte‑Carlo statistics in A, are handled separately, with systematic variations modelled as variation matrices ΔA_k. These variations can be sampled via Monte‑Carlo or propagated analytically, ensuring that the final error matrix correctly reflects both sources.
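
The linear propagation described above can be sketched on a two-bin toy. Everything here is an illustrative assumption rather than TUnfold's code: the matrices, counts and helper names are hypothetical, regularisation is switched off (τ = 0) so that the unfolding matrix is simply A⁻¹, background bin errors are taken as uncorrelated, and one systematic variation is represented by an alternative response matrix whose induced shift δx contributes δx δxᵀ to the covariance.

```python
def inv2(m):
    """Inverse of a 2x2 matrix given as a list of two rows."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def matvec(m, v):
    """Matrix-vector product."""
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in m]

def propagate(d, v):
    """Linear error propagation: return D V D^T for square D and covariance V."""
    n = len(d)
    return [[sum(d[i][k] * v[k][l] * d[j][l] for k in range(n) for l in range(n))
             for j in range(n)] for i in range(n)]

# Toy inputs (all hypothetical).
A = [[0.8, 0.2], [0.2, 0.8]]           # nominal response matrix
A_alt = [[0.75, 0.25], [0.25, 0.75]]   # systematic variation, A + Delta A
y = [100.0, 70.0]                      # observed counts (Poisson variance = y)
b = [10.0, 10.0]                       # background estimate
sigma_b = [2.0, 2.0]                   # uncorrelated background uncertainties

# Unfold: x = D (y - b), with D = A^{-1} because tau = 0 in this sketch.
D = inv2(A)
signal = [yi - bi for yi, bi in zip(y, b)]
x = matvec(D, signal)

# Data and background uncertainties enter through the same linear map D.
V_in = [[y[0] + sigma_b[0] ** 2, 0.0],
        [0.0, y[1] + sigma_b[1] ** 2]]
V_prop = propagate(D, V_in)

# Response-matrix systematic: repeat the unfolding with A_alt and take the shift.
x_alt = matvec(inv2(A_alt), signal)
shift = [xa - xi for xa, xi in zip(x_alt, x)]
V_sys = [[si * sj for sj in shift] for si in shift]

# Total covariance of the unfolded result.
V_tot = [[V_prop[i][j] + V_sys[i][j] for j in range(2)] for i in range(2)]
```

With regularisation switched on, D is no longer A⁻¹, but the same D V Dᵀ rule applies. Keeping V_prop and V_sys as separate matrices mirrors the idea of tracking each uncertainty source individually before summing.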

A major strength of TUnfold is its native support for multi‑dimensional histograms (2‑D, 3‑D, etc.). The package automatically extracts binning information, reshapes the response matrix into a sparse multi‑index structure, and allows dimension‑specific regularisation weights. This flexibility is crucial for modern analyses that involve correlated observables such as jet‑mass versus transverse momentum or angular distributions in multiple variables.
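
To make the multi-dimensional bookkeeping concrete, the sketch below flattens two-dimensional (pT, mass) bins into a single index and fills a response-count matrix from truth/reconstructed pairs. It only illustrates the idea: the helper names, bin edges and events are hypothetical, and the actual package handles binning through its own ROOT classes.

```python
from bisect import bisect_right

def find_bin(edges, value):
    """0-based bin index of value, or None if outside [edges[0], edges[-1])."""
    if value < edges[0] or value >= edges[-1]:
        return None
    return bisect_right(edges, value) - 1

def global_bin(axes, values):
    """Row-major flat index of a multi-dimensional bin, or None if out of range."""
    index = 0
    for edges, value in zip(axes, values):
        local = find_bin(edges, value)
        if local is None:
            return None
        index = index * (len(edges) - 1) + local
    return index

# Hypothetical binning: 3 bins in jet pT, 2 bins in jet mass, so 6 flat bins.
pt_edges = [200.0, 300.0, 400.0, 600.0]
mass_edges = [0.0, 50.0, 100.0]
axes = [pt_edges, mass_edges]
n_bins = (len(pt_edges) - 1) * (len(mass_edges) - 1)

# Hypothetical simulated events: (truth (pT, mass), reconstructed (pT, mass)).
events = [((350.0, 75.0), (320.0, 60.0)),
          ((350.0, 75.0), (410.0, 80.0)),
          ((250.0, 20.0), (260.0, 55.0)),
          ((500.0, 90.0), (650.0, 90.0))]  # reconstructed outside the binning

# counts[r][t] = number of events generated in flat bin t and reconstructed in r.
counts = [[0] * n_bins for _ in range(n_bins)]
for truth, reco in events:
    t, r = global_bin(axes, truth), global_bin(axes, reco)
    if t is not None and r is not None:
        counts[r][t] += 1
    # A real analysis would record out-of-range events as inefficiency
    # instead of dropping them; that is omitted here for brevity.
```

Once every multi-dimensional bin has a flat index, the unfolding itself is the same matrix problem as in one dimension. Only the regularisation matrix L needs to know which flat bins are neighbours along each axis.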

Implementation details are tightly coupled to ROOT classes (TH1, TH2, TH3). The typical workflow consists of constructing a TUnfoldDensity object from the response matrix, supplying the measured histogram with SetInput() and any background with SubtractBackground(), calling DoUnfold() to perform the regularised fit for a given τ, and then retrieving the unfolded histogram with GetOutput() and its covariance with an error‑matrix getter such as GetEmatrixTotal(). Diagnostic outputs, including L‑curve plots, correlation‑coefficient maps, and τ‑scans, are returned as ROOT objects that can be drawn on canvases, enabling analysts to visualise the stability of the solution and to fine‑tune τ interactively.

The authors validate TUnfold using closure tests on simulated data and a realistic physics case (e.g., structure‑function measurement in electron‑proton scattering). In closure tests, the unfolded distribution reproduces the true input within statistical expectations, and the χ²/ndf of the comparison is close to unity. When systematic variations of the response matrix are introduced, the propagated uncertainties cover the observed deviations, indicating that the method does not underestimate systematic effects.
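
A closure-test comparison of this kind can be sketched as follows. The numbers and helper name are hypothetical, and two assumptions are made: the χ² uses the full covariance matrix of the unfolded result, and the number of degrees of freedom is taken as the number of bins because the truth is held fixed.

```python
def solve(matrix, rhs):
    """Solve matrix @ z = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(map(float, row)) + [float(r)] for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    z = [0.0] * n
    for i in reversed(range(n)):
        z[i] = (aug[i][n] - sum(aug[i][j] * z[j] for j in range(i + 1, n))) / aug[i][i]
    return z

# Hypothetical closure-test inputs for a two-bin toy.
x_true = [100.0, 50.0]                  # generator-level truth
x_hat = [112.0, 41.0]                   # unfolded pseudo-data
V = [[193.0, -79.0], [-79.0, 143.0]]    # covariance of x_hat (note the anti-correlation)

# chi2 = d^T V^{-1} d, evaluated by solving V z = d instead of inverting V.
d = [xh - xt for xh, xt in zip(x_hat, x_true)]
z = solve(V, d)
chi2 = sum(di * zi for di, zi in zip(d, z))
ndf = len(d)
chi2_per_ndf = chi2 / ndf
```

A single toy says little on its own. In practice the comparison is repeated over many pseudo-experiments, and the distribution of χ²/ndf is checked against expectation. Using only the diagonal of V would ignore the bin-to-bin correlations that unfolding introduces and can distort the result.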

In summary, TUnfold provides a comprehensive, statistically rigorous framework for unfolding multi‑dimensional HEP data. By combining least‑squares fitting, Tikhonov regularisation, automated τ selection, full background subtraction, and thorough propagation of both statistical and systematic uncertainties, TUnfold enables physicists to obtain stable estimates of true distributions with a controlled regularisation bias, while preserving a transparent accounting of uncertainties. Its integration with ROOT makes it readily adoptable in existing analysis chains, positioning it as a valuable tool for precision measurements at current and future particle‑physics experiments.