CHIWEI: A code of goodness of fit tests for weighted and unweighted histograms
A self-contained Fortran-77 program for goodness-of-fit tests for histograms with weighted as well as unweighted entries is presented. The code calculates test statistics both for histograms with normalized event weights and for histograms with unnormalized event weights.
💡 Research Summary
The paper presents CHIWEI, a self‑contained Fortran‑77 program that implements goodness‑of‑fit tests for both weighted and unweighted histograms. Traditional Pearson chi‑square tests compare observed bin counts n_i with expected counts n p_i0, producing a statistic X² = Σ (n_i – n p_i0)² / (n p_i0) that follows a χ² distribution with m – 1 degrees of freedom when the null hypothesis H₀ (the model probabilities p_i0 are correct) holds.
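As a point of reference, the unweighted Pearson statistic described above can be sketched in a few lines of Python. This is an illustrative sketch rather than part of the Fortran program; the function name `pearson_chi2` and the example counts are assumptions made here.

```python
def pearson_chi2(counts, probs):
    """Pearson statistic X^2 = sum_i (n_i - n*p_i0)^2 / (n*p_i0).

    counts: observed bin contents n_i
    probs:  hypothesised bin probabilities p_i0 (assumed to sum to 1)
    Returns (statistic, degrees of freedom m - 1).
    """
    if len(counts) != len(probs):
        raise ValueError("counts and probs must have the same length")
    n = sum(counts)  # total number of events
    stat = 0.0
    for n_i, p_i in zip(counts, probs):
        expected = n * p_i
        stat += (n_i - expected) ** 2 / expected
    return stat, len(counts) - 1


# Four equiprobable bins, 100 events in total (made-up numbers).
stat, ndf = pearson_chi2([18, 22, 30, 30], [0.25, 0.25, 0.25, 0.25])
print(stat, ndf)
```

With these numbers the expected content of every bin is 25, so the statistic is (49 + 9 + 25 + 25) / 25 = 4.32 with 3 degrees of freedom.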
When events carry weights w(x) = p(x)/g(x), where p(x) is the target probability density and g(x) is the sampling density, each bin i accumulates two sums over the events k that fall into it: the total weight W_i = Σ_k w_i(k) and the sum of squared weights W2_i = Σ_k w_i(k)² (which is not the square of W_i). For normalized weights (Σ_i W_i = n) the estimator of the bin probability is \hat p_i = W_i / n, which remains unbiased. The authors derive a generalized chi‑square statistic for this case: X²_norm = Σ_i (W_i – n p_i0)² / (n p_i0). Under H₀ this statistic is approximately χ² distributed with m – 1 degrees of freedom. If the weights are unnormalized, that is, multiplied by an unknown overall constant, the number of degrees of freedom drops by one, giving a χ² distribution with m – 2 degrees of freedom. The reduction reflects the additional constraint imposed by the overall scale of the weights.
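The per-bin sums and the normalized-weight statistic, exactly as written in the paragraph above, might be computed as follows. This is a hedged Python sketch: the helper names `fill_weighted_histogram` and `x2_norm`, the half-open binning convention, and the sample values are all assumptions. The formula as stated here does not use the sum of squared weights, although the CHIWEI interface accepts it, so the actual program may compute a more refined statistic.

```python
import bisect


def fill_weighted_histogram(xs, weights, edges):
    """Accumulate per-bin sums of weights and of squared weights.

    edges: increasing bin edges; bin i covers [edges[i], edges[i+1]).
    Returns (w1, w2) where w1[i] = sum of w and w2[i] = sum of w**2 in bin i.
    """
    m = len(edges) - 1
    w1 = [0.0] * m
    w2 = [0.0] * m
    for x, w in zip(xs, weights):
        i = bisect.bisect_right(edges, x) - 1
        if 0 <= i < m:  # events outside the histogram range are skipped
            w1[i] += w
            w2[i] += w * w
    return w1, w2


def x2_norm(w1, probs, n):
    """Statistic sum_i (W_i - n*p_i0)^2 / (n*p_i0) as stated in the text."""
    return sum((w - n * p) ** 2 / (n * p) for w, p in zip(w1, probs))


# Four weighted events in two bins (illustrative values only).
w1, w2 = fill_weighted_histogram(
    xs=[0.1, 0.4, 0.6, 0.9],
    weights=[0.5, 1.0, 1.5, 1.0],
    edges=[0.0, 0.5, 1.0],
)
print(w1, w2, x2_norm(w1, [0.5, 0.5], n=4))
```

For unit weights `w1` holds ordinary bin counts and `x2_norm` reduces to the Pearson statistic.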
The CHIWEI subroutine is called as CALL CHIWEI(P,W1,W2,N,NCHA,MODE,STAT,NDF,IFAIL). The inputs are:
- P: an array of expected probabilities p_i0,
- W1: an array of total weights per bin (W_i),
- W2: an array of the sums of squared weights per bin (Σ_k w_i(k)², not the square of W_i),
- N: total number of events (for normalized weights this equals the sum of counts),
- NCHA: number of bins m,
- MODE: 1 for normalized weights, 2 for unnormalized weights.
The outputs are STAT (the computed chi‑square value), NDF (the degrees of freedom, equal to m – MODE, i.e. m – 1 for normalized and m – 2 for unnormalized weights), and IFAIL (non‑zero if the calculation fails). The program uses the CERNLIB routine PROB (library entry G100) to obtain the p‑value, the upper‑tail probability of the chi‑square distribution, from STAT and NDF.
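For readers without CERNLIB, the step from STAT and NDF to a p-value can be reproduced with the standard library alone. The sketch below is not the PROB routine itself; the function name `chi2_upper_tail` is an assumption. It uses the exact finite-sum expressions for the chi-square upper-tail probability that hold for integer degrees of freedom.

```python
import math


def chi2_upper_tail(stat, ndf):
    """Upper-tail probability P(X >= stat) for a chi-square variable
    with an integer number of degrees of freedom ndf >= 1."""
    if ndf < 1 or stat < 0:
        raise ValueError("need ndf >= 1 and stat >= 0")
    z = stat / 2.0
    if ndf % 2 == 0:
        # Even ndf: exp(-z) * sum_{k=0}^{ndf/2 - 1} z^k / k!
        term = math.exp(-z)
        total = term
        for k in range(1, ndf // 2):
            term *= z / k
            total += term
    else:
        # Odd ndf: erfc(sqrt(z)) + exp(-z) * sum_k z^(k+1/2) / Gamma(k+3/2)
        total = math.erfc(math.sqrt(z))
        term = math.exp(-z) * math.sqrt(z) / math.gamma(1.5)
        for k in range((ndf - 1) // 2):
            total += term
            term *= z / (k + 1.5)
    return min(total, 1.0)  # guard against rounding slightly above 1


# p-value for a statistic of 4.32 with m - MODE = 4 - 1 = 3 degrees of freedom
# (illustrative numbers only).
print(chi2_upper_tail(4.32, 3))
```

A small p-value indicates that the observed histogram is unlikely under H₀. Where SciPy is available, `scipy.stats.chi2.sf(stat, ndf)` returns the same quantity.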
To validate the implementation, the authors consider a double‑Breit‑Wigner distribution p(x) on the interval