Reducing multi-dimensional information into a 1-d histogram
We present two methods for reducing multidimensional information to one dimension for ease of understanding or analysis while maintaining statistical power. While not new, dimensional reduction is not widely used in high-energy physics, and it has applications whenever there is a distinctive feature (for instance, a mass peak) in one variable but the signal purity depends on others; in practice, this covers most areas of physics analysis. Both methods presented here assume knowledge of the background, but only one of them uses a model for the signal, accepting this model dependence in exchange for some increase in statistical power.
💡 Research Summary
The paper addresses a common problem in high‑energy physics analyses: a distinctive feature such as a mass peak appears in one observable, while the signal‑to‑background ratio depends strongly on several other variables. To retain the intuitive simplicity of a one‑dimensional histogram without sacrificing statistical power, the authors propose two projection techniques that compress an n‑dimensional event space into a single variable. Both methods assume that the background distribution is known a priori, but they differ in their treatment of the signal model.
The first technique, referred to as the “background‑only weight”, constructs a weight for each event solely from the background probability density function \(B(\vec{x})\) evaluated at the event’s full set of coordinates \(\vec{x}\). Typical choices are \(w_i = 1/B(\vec{x}_i)\) or \(w_i = \sqrt{B(\vec{x}_i)}\). After weighting, the events are projected onto the chosen analysis variable (e.g., invariant mass) and binned. Because the weight normalises the background contribution, the resulting histogram has a flat background expectation, and any residual structure is attributed to signal. This approach is model‑independent with respect to the signal, making it robust against signal‑model uncertainties, but it does not exploit the discriminating power of the signal shape and therefore can be slightly less efficient when signal and background overlap significantly.
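The background-only weighting can be illustrated with a small toy study. The sketch below is not taken from the paper: the two-variable background density `background_pdf`, the accept-reject sampler, and the helper `weighted_projection` are all hypothetical choices made for illustration, and only the \(w_i = 1/B(\vec{x}_i)\) option is shown. It projects background-only events onto the mass-like variable `m`, once unweighted and once weighted.

```python
import numpy as np

def background_pdf(m, y):
    """Hypothetical known background density on the unit square (integrates to 1)."""
    return (1.5 - m) * (1.5 - y)

def sample_background(n, rng, pdf_max=2.25):
    """Draw n events from background_pdf by accept-reject sampling."""
    m_out, y_out = np.empty(0), np.empty(0)
    while m_out.size < n:
        m = rng.uniform(size=n)
        y = rng.uniform(size=n)
        # Keep a candidate with probability background_pdf / pdf_max.
        keep = rng.uniform(size=n) * pdf_max < background_pdf(m, y)
        m_out = np.concatenate([m_out, m[keep]])
        y_out = np.concatenate([y_out, y[keep]])
    return m_out[:n], y_out[:n]

def weighted_projection(m, weights, bins=10):
    """Project weighted events onto m; return bin contents, errors and edges."""
    edges = np.linspace(0.0, 1.0, bins + 1)
    content, _ = np.histogram(m, bins=edges, weights=weights)
    # The variance of a weighted bin content is the sum of squared weights.
    sumw2, _ = np.histogram(m, bins=edges, weights=weights ** 2)
    return content, np.sqrt(sumw2), edges

rng = np.random.default_rng(0)
m, y = sample_background(20_000, rng)

# Unweighted projection: follows the falling background shape in m.
raw, raw_err, edges = weighted_projection(m, np.ones_like(m))

# Background-only weight w_i = 1 / B(x_i): the projection becomes flat.
flat, flat_err, _ = weighted_projection(m, 1.0 / background_pdf(m, y))

print(np.round(raw))
print(np.round(flat))
```

In this toy the weighted bin contents scatter around the same value in every bin, whereas the unweighted ones fall with `m`. The flatness relies on the toy's rectangular variable ranges; the per-bin errors come from the sum of squared weights rather than from the raw event counts.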
The second technique incorporates an explicit signal model \(S(\vec{x})\) together with the background model. The weight is defined as the Neyman–Pearson likelihood ratio of the signal and background densities.
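A toy example may help to show what a likelihood-ratio weight does in practice. The exact expression used in the paper is not reproduced here, so the sketch assumes the simplest form of the ratio, \(w_i = S(\vec{x}_i)/B(\vec{x}_i)\). The densities `signal_pdf` and `background_pdf`, the event yields, and the `figure_of_merit` (sum of signal weights divided by the square root of the summed squared background weights) are hypothetical choices made for illustration only.

```python
import numpy as np

M_PEAK, M_WIDTH = 0.5, 0.05  # hypothetical signal peak position and width

def background_pdf(m, y):
    """Hypothetical known background density on the unit square."""
    return (1.5 - m) * (1.5 - y)

def signal_pdf(m, y):
    """Hypothetical signal model: Gaussian peak in m, purer at large y."""
    gauss = np.exp(-0.5 * ((m - M_PEAK) / M_WIDTH) ** 2)
    gauss /= M_WIDTH * np.sqrt(2.0 * np.pi)
    return gauss * 2.0 * y

def likelihood_ratio_weight(m, y):
    """Per-event weight w = S(x) / B(x); the form of the ratio is assumed."""
    return signal_pdf(m, y) / background_pdf(m, y)

def sample_background(n, rng, pdf_max=2.25):
    """Draw n events from background_pdf by accept-reject sampling."""
    m_out, y_out = np.empty(0), np.empty(0)
    while m_out.size < n:
        m, y = rng.uniform(size=n), rng.uniform(size=n)
        keep = rng.uniform(size=n) * pdf_max < background_pdf(m, y)
        m_out = np.concatenate([m_out, m[keep]])
        y_out = np.concatenate([y_out, y[keep]])
    return m_out[:n], y_out[:n]

def sample_signal(n, rng):
    """Draw n events from signal_pdf."""
    m = rng.normal(M_PEAK, M_WIDTH, size=n)
    y = np.sqrt(rng.uniform(size=n))  # inverse-CDF sampling of the density 2y
    return m, y

def figure_of_merit(w_sig, w_bkg):
    """Summed signal weight over the background fluctuation sqrt(sum w^2)."""
    return w_sig.sum() / np.sqrt((w_bkg ** 2).sum())

rng = np.random.default_rng(1)
m_b, y_b = sample_background(20_000, rng)
m_s, y_s = sample_signal(200, rng)

# Unit weights reproduce the familiar counting estimate n_sig / sqrt(n_bkg).
z_plain = figure_of_merit(np.ones_like(m_s), np.ones_like(m_b))
z_weighted = figure_of_merit(likelihood_ratio_weight(m_s, y_s),
                             likelihood_ratio_weight(m_b, y_b))

# One-dimensional weighted histogram of all events in the mass-like variable.
m_all = np.concatenate([m_s, m_b])
w_all = likelihood_ratio_weight(m_all, np.concatenate([y_s, y_b]))
hist, edges = np.histogram(m_all, bins=25, range=(0.0, 1.0), weights=w_all)

print(z_plain, z_weighted)
```

With this weight the background events are also pulled toward the peak region, so the background expectation in the weighted histogram has to be evaluated with the same weights instead of being assumed flat.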