This article proposes a way to improve the presentation of histograms where data are compared to expectation. Sometimes, it is difficult to judge by eye whether the difference between the bin content and the theoretical expectation (provided by either a fitting function or another histogram) is just due to statistical fluctuations. More importantly, there could be statistically significant deviations which are completely invisible in the plot. We propose to add a small inset at the bottom of the plot, in which the statistical significance of the deviation observed in each bin is shown. Although the numerical routines we developed serve only illustrative purposes, they are based on formulae which could be used to perform proper statistical inference. An implementation of our computation is available at https://github.com/dcasadei/psde .
Most analyses compare the observed data to the expectation resulting from a theoretical model, like the Standard Model (SM), or some other hypothesis, like a best-fit function or a Monte Carlo simulated distribution. It is common to make histograms on a logarithmic scale, because their contents span orders of magnitude. Since differences are difficult to see on a logarithmic scale, an inset plot is often made at the bottom of the histogram. In this article, we focus on this inset and propose a way to make it intuitive and accurate. By "intuitive" we mean that it should make it obvious which bins contain an excess of data and which contain a deficit, while significant deviations should look more striking than insignificant ones. By "accurate" we mean that it should represent the actual significance of the deviation in each bin, rather than some approximation.
Below we define statistical significance in two probability models, Poisson and binomial. In Section 2 we show a sequence of presentation options that could be used (and most of them have been used), explaining their strengths and limitations. This will motivate, through a series of incremental improvements, our final proposal, which is given in Section 2.4. Finally, if the expectation is only known within some uncertainty, this uncertainty can be taken into account in the comparison to the data, as explained in Section 3. In this paper, we use the ROOT framework [1] to produce the plots and provide suggestions for implementing our formulae with this widely used software for data analysis and visualization, although our recommendations can be easily implemented within other frameworks too.
The key notion, which is needed to define the statistical significance, is the probability of finding a deviation at least as big as the one observed in the data, under the assumption that the chosen theoretical model describes our system. This probability is commonly called the “p-value” and it usually spans several orders of magnitude. For this reason, it is convenient to translate the p-value into a “z-value”, which is the deviation at the right of the mean of a Gaussian distribution, expressed in units of standard deviations, which would correspond to the same p-value. The equation which maps each p-value to a z-value, and vice versa, is

  p = ∫_z^∞ (2π)^{−1/2} e^{−x²/2} dx = ½ erfc(z/√2),   i.e.   z = √2 erf⁻¹(1 − 2p),

which is shown in Fig. 1. In ROOT, the z-value can be computed in one line of code, using the inverse error function.
A z-value ≥ 0 corresponds to a p-value ≤ 0.5, and negative z-values correspond to p-value > 0.5. Significant deviations are characterized by quite small p-values, corresponding to z-values ≥ 3. For example, it is common to refer to a p-value = 2.87 × 10⁻⁷ as a “5σ effect”, meaning that the corresponding z-value is 5. Such deviations are usually considered very significant and a z-value of 5 or more is conventionally required in High Energy Physics to claim a discovery [2]. On the other hand, a z-value which is less than 1–2 units represents a common statistical fluctuation, which is uninteresting. The fluctuations are even more probable (i.e. less interesting) when the z-value is negative.
The p-value depends on the statistical distribution that the data are supposed to follow, the so-called “probability model”. The cases of Poisson and binomial distributed data are most common in experimental physics, and are addressed in this note.
This is the most common case, where event counts are plotted in each bin and the number of entries in each bin follows a Poisson distribution (e.g. Ref. [3]). If B ∈ R events are expected in a bin, the probability of observing D ∈ N events is

  P(D | B) = e^{−B} B^D / D!

(B is also the variance of D). The Poisson p-value is

  p = Σ_{n=D}^{∞} e^{−B} B^n / n!   if D ≥ B (excess),
  p = Σ_{n=0}^{D} e^{−B} B^n / n!   if D < B (deficit).

The above sums are simplified thanks to the identity:

  Σ_{n=0}^{N} e^{−B} B^n / n! = Γ(N + 1, B) / N! ,

where

  Γ(s, x) = ∫_x^∞ t^{s−1} e^{−t} dt

is the upper incomplete Gamma function. The ratio between the upper incomplete Gamma function Γ(s, x) and the Gamma function Γ(s) in the previous equation is known as the upper regularized Gamma function

  Q(s, x) = Γ(s, x) / Γ(s) = 1 − P(s, x),

where P(s, x) is the cumulative distribution function for Gamma random variables with shape parameter s and scale parameter 1 [4], so that the excess p-value is 1 − Q(D, B) = P(D, B) and the deficit p-value is Q(D + 1, B). In ROOT this function is available among the incomplete Gamma routines (e.g. TMath::Gamma(s, x), which returns P(s, x)).
One should notice that it is not always true that the bin population follows the Poisson distribution. This is not the case when the total number of entries in a histogram is not a random variable: then the events in each bin are distributed according to the multinomial distribution, which means that the bins cannot be considered statistically independent, as we assumed in this section. In the extreme case of a fixed total number of entries and only two bins, the content of each bin follows the binomial distribution: if there is an excess of counts in one bin, with respect to the expectation, then the other bin must have a deficit in order to preserve the total number of events.
If the plotted quantity is a Bernoulli success rate, then it follows a binomial distribution. A common example is a trigger efficiency “turn-on curve”, like Fig. 1 in Ref. [6].