sFit: a method for background subtraction in maximum likelihood fit


This paper presents a statistical method for subtracting background in a maximum likelihood fit without relying on a separate sideband or simulation for background modeling. The method, called sFit, is an extension of the sPlot technique, which was originally developed to reconstruct the true distribution of each data component. The sWeights defined for the sPlot technique make it possible to construct a modified likelihood function using only the signal probability density function and the events in the signal region. The contribution of background events in the signal region to the likelihood function cancels out on a statistical basis. Maximizing this likelihood function leads to unbiased estimates of the fit parameters in the signal probability density function.


💡 Research Summary

The paper introduces a novel statistical technique called sFit for performing background subtraction directly within a maximum‑likelihood fit, eliminating the need for separate sideband regions or dedicated Monte‑Carlo background models. The method builds on the earlier sPlot formalism, which assigns each event a weight (the sWeight) based on the probability that the event belongs to the signal component. In sFit, these sWeights are used to construct a modified likelihood that depends only on the signal probability density function (PDF) and the observed data in the signal region. Because the sWeights statistically cancel the contribution of background events, maximizing this modified likelihood yields unbiased estimates of the signal‑only parameters.
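A weighted log-likelihood of the following kind is consistent with this description. It is a sketch in assumed notation, not a formula quoted from the summary: $x_i$ stands for the fit variables of event $i$, $y_i$ for its discriminating variables, $w_s(y_i)$ for its signal sWeight, $f_s$ for the signal PDF with parameters $\theta$, and $N$ for the number of events in the signal region.

```latex
\ln \mathcal{L}(\theta) \;=\; \sum_{i=1}^{N} w_s(y_i)\,\ln f_s(x_i;\theta)
```

Background events enter the sum with weights that are positive near the signal peak and negative away from it, so their net contribution averages to zero, while no background PDF in $x$ appears anywhere in the expression.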

Methodology

  1. Joint PDF definition – The full data set is described by a two‑dimensional PDF, f(x,y) = f_s(x) g_s(y) + f_b(x) g_b(y), where x denotes the variables of interest (e.g., invariant mass, decay time) and y denotes the discriminating variables (e.g., particle‑identification scores). The signal and background components each have their own PDFs in both spaces.
  2. First‑stage fit – A conventional unbinned maximum‑likelihood fit of the joint PDF is performed. This yields the per‑event posterior probabilities P_s(i) and P_b(i) of being signal or background.
  3. sWeight construction – An sWeight is then defined for each event from the results of the first‑stage fit.
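The steps above can be tried out on a toy sample. The following is a minimal sketch under stated assumptions, not the paper's implementation: it uses the standard sPlot weight with explicit yields N_s and N_b (which the joint PDF in step 1 does not show), fits the yields in y only with known shapes, and picks an exponential signal PDF in x so that the weighted fit has a closed form. All function names, shapes, and numbers are hypothetical.

```python
import numpy as np


def gauss_pdf(y, mean, sigma):
    """Normalised Gaussian density."""
    return np.exp(-0.5 * ((y - mean) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))


def make_toy(rng, n_sig=2000, n_bkg=4000, tau_sig=1.5, tau_bkg=0.3):
    """Toy data: y is the discriminating variable, x the variable of interest.

    Signal: Gaussian peak in y, exponential with mean tau_sig in x.
    Background: flat in y on [0, 1], exponential with mean tau_bkg in x.
    """
    y = np.concatenate([rng.normal(0.5, 0.05, n_sig), rng.uniform(0.0, 1.0, n_bkg)])
    x = np.concatenate([rng.exponential(tau_sig, n_sig), rng.exponential(tau_bkg, n_bkg)])
    return x, y


def fit_yields(g_s, g_b, n_iter=500):
    """Extended ML fit of the two yields with fixed shapes.

    Uses the fixed-point (EM) iteration N_k <- sum_i N_k g_k(y_i) / D_i,
    whose fixed point satisfies the extended-likelihood stationarity conditions.
    """
    n_s = n_b = 0.5 * len(g_s)
    for _ in range(n_iter):
        denom = n_s * g_s + n_b * g_b
        n_s, n_b = np.sum(n_s * g_s / denom), np.sum(n_b * g_b / denom)
    return n_s, n_b


def signal_sweights(g_s, g_b, n_s, n_b):
    """Standard sPlot signal weight (assumed here):

    w_s(y) = (V_ss g_s(y) + V_sb g_b(y)) / (N_s g_s(y) + N_b g_b(y)),
    with V the inverse of the matrix sum_i g_j(y_i) g_k(y_i) / D_i^2.
    """
    denom = n_s * g_s + n_b * g_b
    scaled = np.vstack([g_s, g_b]) / denom      # shape (2, n_events)
    v = np.linalg.inv(scaled @ scaled.T)        # 2x2 yield covariance matrix
    return (v[0, 0] * g_s + v[0, 1] * g_b) / denom


def sfit_lifetime(x, w):
    """Maximise sum_i w_i ln[(1/tau) exp(-x_i/tau)]; the maximum is the weighted mean."""
    return np.sum(w * x) / np.sum(w)


rng = np.random.default_rng(1)
x, y = make_toy(rng)
g_s, g_b = gauss_pdf(y, 0.5, 0.05), np.ones_like(y)   # known shapes in y
n_s, n_b = fit_yields(g_s, g_b)
w = signal_sweights(g_s, g_b, n_s, n_b)
tau_hat = sfit_lifetime(x, w)
print(f"signal yield: {n_s:.1f}  sFit lifetime: {tau_hat:.3f}  unweighted mean of x: {x.mean():.3f}")
```

The background lifetime is never used in the fit: only the signal PDF in x and the sWeights appear in `sfit_lifetime`. The closed form is specific to the exponential choice; a general signal PDF would need a numerical optimiser, and parameter uncertainties from a weighted likelihood require separate treatment that this sketch does not attempt.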
