How Analytic Choices Can Affect the Extraction of Electromagnetic Form Factors from Elastic Electron Scattering Cross Section Data
Scientists often try to incorporate prior knowledge into their regression algorithms, such as a particular analytic behavior or a known value at a kinematic endpoint. Unfortunately, there is often no unique way to make use of this prior knowledge, and thus different analytic choices can lead to very different regression results from the same set of data. To illustrate this point in the context of the proton electromagnetic form factors, we use the Mainz elastic data with its 1422 cross section points and 31 normalization parameters. Starting with a complex unbound non-linear regression, we will show how the addition of a single theory-motivated constraint removes an oscillation from the magnetic form factor and shifts the extracted proton charge radius. We then repeat both regressions using the same algorithm, but with a rebinned version of the Mainz dataset. These examples illustrate how analytic choices, such as the function that is being used or even the binning of the data, can dramatically affect the results of a complex regression. These results also demonstrate why it is critical when using regression algorithms to have either a physical model in mind or a firm mathematical basis.
Research Summary
The paper investigates how analytic choices in regression analysis affect the extraction of the proton’s electromagnetic form factors (G_E and G_M) and the derived charge radius (r_p) from elastic electron‑proton scattering data. Using the extensive Mainz dataset (1422 cross‑section points with 31 normalization parameters), the authors perform two families of fits: an “unbound” regression where high‑order (11th) polynomials for G_E and G_M are freely fitted, and a “bound” regression where the polynomial coefficients are constrained to alternate in sign, mimicking a completely monotone function as suggested by nuclear‑theory expectations.
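The difference between the two fit families can be sketched in a few lines. The sketch below assumes the common parameterization G(Q²) = G(0)·(1 + a_1 Q² + a_2 Q⁴ + ...), which the summary does not spell out, and the function names `form_factor` and `alternates_in_sign` are hypothetical. A completely monotone function has derivatives that alternate in sign, which for this polynomial means a_1 < 0, a_2 > 0, a_3 < 0, and so on.

```python
from typing import Sequence


def form_factor(q2: float, coeffs: Sequence[float], g0: float = 1.0) -> float:
    """Evaluate G(Q^2) = g0 * (1 + a_1*Q^2 + a_2*Q^4 + ...) with Horner's rule.

    Assumed parameterization (not confirmed by the summary): coeffs = [a_1, a_2, ...]
    and g0 is the endpoint value, i.e. 1 for G_E and mu_p for G_M.
    """
    total = 0.0
    for a in reversed(coeffs):
        total = (total + a) * q2
    return g0 * (1.0 + total)


def alternates_in_sign(coeffs: Sequence[float]) -> bool:
    """Return True if a_1 < 0, a_2 > 0, a_3 < 0, ... (the 'bound' constraint)."""
    return all((a < 0) if i % 2 == 0 else (a > 0) for i, a in enumerate(coeffs))


# Illustrative coefficients only; the higher-order values are made up.
bound_like = [-3.124, 5.0, -2.0]
unbound_like = [-3.331, -1.0, 4.0]
print(alternates_in_sign(bound_like), alternates_in_sign(unbound_like))
```

In an actual fit, the constraint would be imposed through bounds on each coefficient in the minimizer rather than checked after the fact; the check here only makes the condition explicit.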
In the unbound fit, the first‑order electric coefficient a_E1 = –3.331 yields r_p = 0.882 fm. Imposing the sign‑alternation constraint changes a_E1 to –3.124, a smaller magnitude, and shifts the extracted radius to r_p = 0.854 fm, a change of about 0.03 fm caused by a single theoretical prior. The authors repeat the exercise on a rebinned version of the Mainz data (658 points) that keeps the same 31 normalization parameters but averages over energy‑angle bins. The unbound fit now gives r_p = 0.863 fm, while the bound fit gives r_p = 0.845 fm, again a systematic shift of ~0.02 fm.
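The link between the first‑order coefficient and the radius follows from the standard definition r_p² = –6 dG_E/dQ² at Q² = 0, which for a polynomial with G_E(0) = 1 gives r_p² = –6 a_E1. The sketch below assumes a_E1 is expressed in GeV⁻² (the summary does not state the units) and converts to fm with ħc; the helper name `charge_radius_fm` is hypothetical. Under that assumption the quoted coefficients reproduce the quoted radii.

```python
import math

HBARC_GEV_FM = 0.1973269804  # hbar*c in GeV*fm


def charge_radius_fm(a_e1: float) -> float:
    """Charge radius from the slope of G_E at Q^2 = 0.

    Uses r_p^2 = -6 * a_E1, with a_E1 assumed to be in GeV^-2,
    and converts GeV^-1 to fm by multiplying with hbar*c.
    """
    if a_e1 >= 0:
        raise ValueError("a_E1 must be negative for a real radius")
    return HBARC_GEV_FM * math.sqrt(-6.0 * a_e1)


for label, a_e1 in [("unbound", -3.331), ("bound", -3.124)]:
    print(f"{label}: a_E1 = {a_e1:+.3f} -> r_p = {charge_radius_fm(a_e1):.3f} fm")
```

Because r_p scales with the square root of |a_E1|, a change of roughly 6% in the coefficient moves the radius by roughly 3%, which is the ~0.03 fm shift quoted above.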
Statistical model‑selection criteria (χ² per degree of freedom, Akaike Information Criterion, Bayesian Information Criterion) are applied to both data sets. For the original 1422‑point set, the bound 7th‑order polynomial minimizes AIC/BIC (χ²/df ≈ 1.21), whereas the unbound 10th‑order polynomial yields the lowest χ²/df (≈ 1.14). For the rebinned 658‑point set, the bound 7th‑order (χ²/df ≈ 0.865) and unbound 9th‑order (χ²/df ≈ 0.830) are optimal. This demonstrates that the “best” model depends on both the functional constraints and the data granularity.
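The three criteria can rank the same pair of fits differently, because each penalizes extra parameters in its own way. The sketch below uses the usual Gaussian‑error forms AIC = χ² + 2k and BIC = χ² + k ln n. The χ² values are invented for illustration and are not the paper's results, and the parameter count k = 2 × (polynomial order) + 31 is an assumption about how the fit parameters are tallied.

```python
import math


def chi2_per_dof(chi2: float, n_points: int, n_params: int) -> float:
    """Reduced chi-square: chi2 divided by the degrees of freedom."""
    return chi2 / (n_points - n_params)


def aic(chi2: float, n_params: int) -> float:
    """Akaike Information Criterion for Gaussian errors (up to a constant)."""
    return chi2 + 2.0 * n_params


def bic(chi2: float, n_points: int, n_params: int) -> float:
    """Bayesian Information Criterion for Gaussian errors (up to a constant)."""
    return chi2 + n_params * math.log(n_points)


N_POINTS = 1422  # size of the original Mainz data set
N_NORM = 31      # normalization parameters

# Hypothetical fits: (label, polynomial order, invented chi2 value).
fits = [("7th order", 7, 1650.0), ("10th order", 10, 1630.0)]

for label, order, chi2 in fits:
    k = 2 * order + N_NORM  # assumed count: G_E and G_M coefficients plus normalizations
    print(label,
          f"chi2/dof={chi2_per_dof(chi2, N_POINTS, k):.3f}",
          f"AIC={aic(chi2, k):.1f}",
          f"BIC={bic(chi2, N_POINTS, k):.1f}")
```

With these invented numbers the higher‑order fit has the lower χ²/df and AIC while the lower‑order fit has the lower BIC, which is why the selection criterion should be fixed before the fits are compared.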
A detailed examination of the 31 normalization parameters shows that they shift by only a few tenths of a percent between fits, yet these small adjustments are necessary to reconcile the imposed endpoint constraints (G_E(0)=1, G_M(0)=μ_p) with the measured cross sections. The paper emphasizes that even minute changes in normalization can propagate into the extracted radius because the regression is highly sensitive to the low‑Q² behavior.
The authors conclude that (1) the choice of regression function (order, sign constraints), (2) the binning or reweighting of the data, and (3) the pre‑definition of model‑selection criteria all have a decisive impact on the extracted physical quantities. They argue that the “proton radius puzzle” – the discrepancy between muonic hydrogen measurements (≈ 0.841 fm) and electron‑scattering results (≈ 0.875 fm) – may be partially rooted in such methodological differences. Consequently, any future extraction of form factors or radii must transparently report the analytic choices, justify any theoretical constraints, and employ robust statistical criteria to avoid hidden biases. The study serves as a cautionary example of how meta‑uncertainties in data analysis can dominate the error budget in precision nuclear physics.