Chi-squared Test for Binned, Gaussian Samples

Nicholas R. Hutzler
Division of Physics, Mathematics, and Astronomy
California Institute of Technology
Pasadena, CA 91125
E-mail: hutzler@caltech.edu

June 2019

Abstract. We examine the $\chi^2$ test for binned, Gaussian samples, including effects due to the fact that the experimentally available sample standard deviation and the unavailable true standard deviation have different statistical properties. For data formed by binning Gaussian samples with bin size $n$, we find that the expected value and standard deviation of the reduced $\chi^2$ statistic are

$$\frac{n-1}{n-3} \pm \frac{n-1}{n-3}\sqrt{\frac{n-2}{n-5}}\sqrt{\frac{2}{N-1}}, \tag{1}$$

where $N$ is the total number of binned values. This is strictly larger in both mean and standard deviation than the value of $1 \pm (2/(N-1))^{1/2}$ reported in standard treatments, which ignore the distinction between true and sample standard deviation.

1. Introduction

Precision measurements of physical quantities typically require a very large number of individual measurements of the same quantity, often taken under varying conditions, such as drifting signal-to-noise or many experimental configurations with different signal sizes. For this reason, as well as for simplification of data analysis and reduction of computational requirements, the data are typically binned together such that measurements in the same bin were taken within a time during which the conditions were similar. In order to check whether the binning is susceptible to the varying conditions, as well as to search for unknown sources of noise, a $\chi^2$ test [1, 2, 3] is commonly used. Regardless of whether or not it is an ideal choice of statistic for this case, it is fairly intuitive as a measure of whether the assigned error bars are correctly capturing the statistics of the data.
However, some of the simplifying assumptions used to construct the standard $\chi^2$ can give results with a significant bias for large data sets. We discuss why the standard treatment underestimates both the mean and variance of the $\chi^2$ statistic, and then determine the appropriate correction factors.

2. Chi-squared test for binned, Gaussian samples

Consider a quantity $N_x \gg 1$ of measurements $x_i$ without any assigned uncertainties. Say that the measurements are normally distributed with constant, true mean $\mu$ that is not known to the experimenter. We shall not assume that the data has a constant variance. Let us gather these data sequentially into groups $G_j$ with $n$ consecutive points each. Now compute the usual sample mean, standard deviation, and standard error of each group of points:

$$y_j = \frac{1}{n}\sum_{x_i \in G_j} x_i, \qquad s_j = \sqrt{\frac{1}{n-1}\sum_{x_i \in G_j}\left(x_i - y_j\right)^2}, \qquad s_{y_j} = \frac{1}{\sqrt{n}}\, s_j. \tag{2}$$

We have now binned our data into a smaller set of $N = N_x/n \gg 1$ mean values $y_j$ and uncertainties $s_{y_j}$. As a check to see whether the assigned uncertainties are correctly capturing the statistical fluctuations of the data we can perform a $\chi^2$ test as outlined in many standard texts [1, 2, 3]. We will test the hypothesis that the $y_j$ are normally distributed about a constant $\bar{y}$ (though this approach is easily extended to models with more degrees of freedom), and that the uncertainties correctly describe the statistical fluctuations of the data about the mean. The reduced-$\chi^2$ value of the data set is

$$\chi^2_{\mathrm{red}} = \frac{1}{N-1}\sum_{j=1}^{N}\left(\frac{y_j - \bar{y}}{\sigma_{y_j}}\right)^2 \equiv \frac{1}{N-1}\sum_{j=1}^{N}\chi_j^2, \tag{3}$$

where $\bar{y} = \left(\sum_j y_j/s_{y_j}^2\right)/\left(\sum_j 1/s_{y_j}^2\right)$ is the weighted mean of the $y$ data, and $\sigma_{y_j}$ is the true (unknown) standard deviation of the points $\{x_i \in G_j\}$, which need not be constant over different values of $j$.
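The binning step of equation (2) can be illustrated on a tiny, deterministic data set. The following is a minimal Python sketch, not the author's code; the function name `bin_statistics` and the example data are assumptions made purely for illustration.

```python
import math

def bin_statistics(x, n):
    """Split x into consecutive groups of n points and return, per group,
    the sample mean y_j, sample standard deviation s_j (n-1 denominator),
    and standard error s_yj = s_j / sqrt(n), following equation (2)."""
    if n < 2 or len(x) % n != 0:
        raise ValueError("need n >= 2 and len(x) divisible by n")
    means, stds, errs = [], [], []
    for start in range(0, len(x), n):
        group = x[start:start + n]
        y = sum(group) / n
        s = math.sqrt(sum((xi - y) ** 2 for xi in group) / (n - 1))
        means.append(y)
        stds.append(s)
        errs.append(s / math.sqrt(n))
    return means, stds, errs

# Hypothetical data: six points binned into N = 2 groups of n = 3.
y, s, s_y = bin_statistics([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], n=3)
print(y, s, s_y)
```

Each bin here has mean 2 and 5 respectively, sample standard deviation 1, and standard error $1/\sqrt{3}$; these $s_{y_j}$ are what the experimenter actually has available in place of the true $\sigma_{y_j}$.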
If the fluctuations in the data are Gaussian in nature, and correctly accounted for by the uncertainties, then we have the usual result

$$\mathrm{E}[\chi^2_{\mathrm{red}}] = 1, \qquad \mathrm{Std}[\chi^2_{\mathrm{red}}] = \sqrt{\frac{2}{N-1}}. \tag{4}$$

However, the experimenter does not know the true standard deviation, and therefore actually computes the statistic

$$\tilde{\chi}^2_{\mathrm{red}} = \frac{1}{N-1}\sum_{j=1}^{N}\left(\frac{y_j - \bar{y}}{s_{y_j}}\right)^2 \equiv \frac{1}{N-1}\sum_{j=1}^{N}\tilde{\chi}_j^2, \tag{5}$$

using $s_{y_j}$ as an estimator for $\sigma_{y_j}$. We wish to find the statistical properties of this quantity, which we shall find differ from $\chi^2_{\mathrm{red}}$. Intuitively, the sample standard deviation is computed from a finite number of measurements and therefore has some uncertainty associated with it, and that uncertainty should be propagated through when examining the $\tilde{\chi}^2_{\mathrm{red}}$ statistic. This is a well-known effect when estimating parameters from finite data sets and has been previously explored in a number of contexts, for example Poisson distributions, counting experiments, weighted means, and histogram fitting [4, 5, 6, 7, 8, 9, 10].

More specifically, while $\chi_j \sim \mathcal{N}(0, 1)$ is normally distributed, $\tilde{\chi}_j$ is not:

$$\tilde{\chi}_j \equiv \left(\frac{y_j - \bar{y}}{s_{y_j}}\right) \approx \left(\frac{y_j - \mu}{s_{y_j}}\right) \sim t(n-1), \tag{6}$$

the $t$-distribution with $n-1$ degrees of freedom, which has larger tails for finite $n$ than a normal distribution. Notice that we are treating $\bar{y} = \mu$ as a constant, which is valid in the limit $N \gg 1$, though for smaller $N$ the statistical properties of the weighted mean cannot be ignored [9, 11, 12, 13, 14, 15]. In particular, the weighted mean also has correction factors due to the difference between true and sample standard deviation, and has a non-trivial variance, both of which will impact the $\tilde{\chi}^2_{\mathrm{red}}$ statistic. A good discussion of these complexities can be found in reference [15].
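The effect of the heavier tails in equation (6) can be checked numerically. The sketch below is a hedged Python illustration, not the author's code. It assumes the true mean $\mu$ is known exactly (the $N \gg 1$ limit discussed above) and uses the standard result that a $t$-distribution with $\nu$ degrees of freedom has second moment $\nu/(\nu-2)$. The function name, seed, and sample sizes are arbitrary choices.

```python
import math
import random

def mean_chi_tilde_squared(n, num_bins, mu=0.0, sigma=1.0, seed=0):
    """Average of ((y_j - mu) / s_yj)^2 over num_bins simulated bins of
    n Gaussian points each, with s_yj the sample standard error."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    total = 0.0
    for _ in range(num_bins):
        x = [rng.gauss(mu, sigma) for _ in range(n)]
        y = sum(x) / n
        s2 = sum((xi - y) ** 2 for xi in x) / (n - 1)  # sample variance
        s_y = math.sqrt(s2 / n)                        # standard error
        total += ((y - mu) / s_y) ** 2
    return total / num_bins

n = 10
observed = mean_chi_tilde_squared(n, num_bins=20000)
nu = n - 1                      # degrees of freedom of the t-distribution
expected = nu / (nu - 2)        # second moment of t(nu); equals (n-1)/(n-3)
print(f"observed mean of chi-tilde^2: {observed:.3f}")
print(f"t-distribution prediction:    {expected:.3f}")
print("normal-distribution value:    1.000")
```

With $n = 10$ the prediction is $9/7 \approx 1.286$, and the simulated average should scatter around that value rather than around 1.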
The square of $\tilde{\chi}_j$ is therefore distributed as $\tilde{\chi}_j^2 \sim F(1, n-1)$, the $F$-distribution with $(1, n-1)$ degrees of freedom, which has

$$\mathrm{E}[F(1, n-1)] = \frac{n-1}{n-3}, \qquad \mathrm{Var}[F(1, n-1)] = 2\left(\frac{n-1}{n-3}\right)^2\frac{n-2}{n-5}. \tag{7}$$

This is as opposed to the $\chi_j^2$ statistic, which has (appropriately) a $\chi^2$ distribution. $\tilde{\chi}^2_{\mathrm{red}}$ is therefore distributed as a sum of $F$-distributions, which is complicated [16]. However, the expectation value and variance are straightforward to calculate,

$$\mathrm{E}[\tilde{\chi}^2_{\mathrm{red}}] = \frac{N}{N-1}\,\mathrm{E}\!\left[\tilde{\chi}_j^2\right] = \frac{n-1}{n-3} + O\!\left(N^{-1}\right), \tag{8}$$

$$\mathrm{Var}[\tilde{\chi}^2_{\mathrm{red}}] = \frac{N}{(N-1)^2}\,\mathrm{Var}\!\left[\tilde{\chi}_j^2\right] = \frac{2}{N-1}\left(\frac{n-1}{n-3}\right)^2\frac{n-2}{n-5} + O\!\left(N^{-2}\right). \tag{9}$$

This implies that the mean and standard deviation of the $\tilde{\chi}^2_{\mathrm{red}}$ statistic are larger than those of the $\chi^2_{\mathrm{red}}$ statistic by

$$\frac{\mathrm{E}[\tilde{\chi}^2_{\mathrm{red}}]}{\mathrm{E}[\chi^2_{\mathrm{red}}]} = \frac{n-1}{n-3}, \qquad \frac{\mathrm{Std}[\tilde{\chi}^2_{\mathrm{red}}]}{\mathrm{Std}[\chi^2_{\mathrm{red}}]} = \frac{n-1}{n-3}\sqrt{\frac{n-2}{n-5}}, \tag{10}$$

up to further corrections of order $O(N^{-1})$. A plot of these correction factors is shown in Figure 1. In the limit $n \to \infty$ we recover the usual result, but for finite $n$ we will always expect larger values for both mean and standard deviation. We can also see that choosing $n \leq 5$ is not advisable, since the statistic will have a non-convergent variance.

Figure 1. Correction factors to the mean and standard deviation of $\tilde{\chi}^2_{\mathrm{red}}$. (The plot shows $\mathrm{E}[\tilde{\chi}^2_{\mathrm{red}}]/\mathrm{E}[\chi^2_{\mathrm{red}}]$ and $\mathrm{Std}[\tilde{\chi}^2_{\mathrm{red}}]/\mathrm{Std}[\chi^2_{\mathrm{red}}]$ versus bin size $n$.)

3. Conclusion

In summary, we find that the standard $\chi^2$ statistic computed from binning finite data sets underestimates the mean and variance for binned Gaussian samples, and derive simple, closed expressions for the biases. For very large data sets with finite bin sizes, such as those commonly found in precision physics measurements, these corrections can be significant and should not be neglected.

Acknowledgments.
I would like to acknowledge helpful discussions with David Watson, and many helpful discussions with the ACME Collaboration, in particular David DeMille, John M. Doyle, and Brendon O'Leary.

Appendix: A simple example

We can see how the "usual" chi-squared statistic gives an incorrect result by performing a simple numerical test on some simulated data. Generate 1,000,000 points $x_i \sim \mathcal{N}(0, 1)$, bin into groups of $n = 10$, and then compute means $y_j$, standard errors $s_{y_j}$, and the reduced chi-squared statistic $\tilde{\chi}^2_{\mathrm{red}}$ (as described in the main text) for the resulting 100,000 binned points.

```matlab
Nx = 1000000;                        % Number of x values
nbin = 10;                           % Number of points to bin
N = Nx/nbin;                         % Number of bins
y = zeros(1,N);                      % Preallocate bin means
sigmayi = zeros(1,N);                % Preallocate bin standard errors
for j = 1:N                          % Step over bins
    x = randn(1,nbin);               % Generate nbin normally distributed points
    y(j) = mean(x);                  % Means
    sigmayi(j) = std(x)/sqrt(nbin);  % Standard errors (sample std, n-1 denominator)
end
ybar = sum(y./sigmayi.^2)/sum(1./sigmayi.^2);  % Weighted mean
chi = (y-ybar)./sigmayi;             % chi
chi2 = sum(chi.^2);                  % chi^2
dof = length(y)-1;                   % Degrees of freedom
redchi2 = chi2/dof                   % Reduced chi^2
redchi2sigma = sqrt(2/dof)           % "Usual" uncertainty of chi^2
```

If we run this piece of code, we will find redchi2 = 1.2868 and redchi2sigma = 0.0045 (though of course the former will be different each time due to the random nature of the calculation). This value differs considerably from the naïve expectation of $1 \pm 0.0045$ based on the usual treatment that ignores the difference between sample and true standard deviations, but is quite close to the expected value of $1.2857 \pm 0.0073$ from equations (8) and (9).
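The expected value quoted above follows directly from the leading-order terms of equations (8) and (9). As a minimal sketch (in Python rather than the language of the listing above, with function names that are illustrative assumptions), the prediction can be evaluated for any bin size $n$ and number of bins $N$:

```python
import math

def corrected_reduced_chi2(n, N):
    """Leading-order mean and standard deviation of the reduced chi-squared
    statistic computed with sample standard errors (equations (8) and (9)),
    for bin size n and N binned values."""
    if n <= 5:
        raise ValueError("the variance does not converge for n <= 5")
    mean = (n - 1) / (n - 3)
    std = mean * math.sqrt((n - 2) / (n - 5)) * math.sqrt(2 / (N - 1))
    return mean, std

def usual_reduced_chi2(N):
    """Standard-treatment mean and standard deviation, equation (4)."""
    return 1.0, math.sqrt(2 / (N - 1))

mean, std = corrected_reduced_chi2(n=10, N=100_000)
usual_mean, usual_std = usual_reduced_chi2(N=100_000)
print(f"corrected: {mean:.4f} +/- {std:.4f}")              # corrected: 1.2857 +/- 0.0073
print(f"usual:     {usual_mean:.4f} +/- {usual_std:.4f}")  # usual:     1.0000 +/- 0.0045
```

The guard on `n <= 5` reflects the non-convergent variance noted in the main text; the $O(N^{-1})$ corrections are not included.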
[1] Press W H, Teukolsky S A, Vetterling W T and Flannery B P 2007 Numerical Recipes 3rd ed (Cambridge University Press) ISBN 978-0521880688
[2] Bevington P R and Robinson D K 2003 Data Reduction and Error Analysis for the Physical Sciences 3rd ed (Boston: McGraw-Hill)
[3] Taylor J R 1996 An Introduction to Error Analysis 2nd ed (University Science Books) ISBN 978-0935702750
[4] Baker S and Cousins R D 1984 Nucl. Instruments Methods Phys. Res. 221 437–442 ISSN 01675087
[5] Jading Y and Riisager K 1996 Nucl. Instruments Methods Phys. Res. Sect. A Accel. Spectrometers, Detect. Assoc. Equip. 372 289–292 ISSN 01689002
[6] Hammersley A and Antoniadis A 1997 Nucl. Instruments Methods Phys. Res. Sect. A Accel. Spectrometers, Detect. Assoc. Equip. 394 219–224 ISSN 01689002
[7] Mighell K J 1999 Astrophys. J. 518 380–393 ISSN 0004-637X
[8] Hauschild T and Jentschel M 2001 Nucl. Instruments Methods Phys. Res. Sect. A Accel. Spectrometers, Detect. Assoc. Equip. 457 384–401 ISSN 01689002
[9] Zhang N F 2006 Metrologia 43 195–204 ISSN 00261394
[10] Gagunashvili N 2010 Nucl. Instruments Methods Phys. Res. Sect. A Accel. Spectrometers, Detect. Assoc. Equip. 614 287–296 ISSN 01689002
[11] Hutzler N R 2014 A New Limit on the Electron Electric Dipole Moment Ph.D. thesis Harvard University
[12] Cochran W G 1937 Suppl. to J. R. Stat. Soc. 4 102–118
[13] Graybill F A and Deal R B 1959 Biometrics 15 543–550
[14] Böckenhoff A and Hartung J 1998 Biometrical J. 40 937–947 ISSN 0323-3847
[15] Hartung J, Knapp G and Sinha B K 2008 Statistical Meta-Analysis with Applications (Wiley-Interscience)
[16] Morrison D F 1971 J. Am. Stat. Assoc. 66 383 ISSN 01621459
