Parton distributions: determining probabilities in a space of functions
We discuss the statistical properties of parton distributions within the framework of the NNPDF methodology. We present various tests of statistical consistency, in particular that the distribution of results does not depend on the underlying parametrization and that it behaves according to Bayes’ theorem upon the addition of new data. We then study the dependence of results on consistent or inconsistent datasets and present tools to assess the consistency of new data. Finally we estimate the relative size of the PDF uncertainty due to data uncertainties, and that due to the need to infer a functional form from a finite set of data.
💡 Research Summary
The paper presents a comprehensive statistical study of parton distribution functions (PDFs) obtained with the NNPDF methodology, emphasizing that PDFs should be regarded as probability distributions over a space of functions rather than as fixed parametrizations. The authors first outline the four pillars of the NNPDF approach: (i) Monte‑Carlo replica generation, in which 1000 pseudo‑data replicas of the original experimental data set are generated by adding statistical fluctuations; (ii) a highly redundant neural‑network parametrization, which assigns a separate feed‑forward network (37 free parameters) to each of the seven parton flavours; (iii) a genetic‑algorithm optimisation that explores the high‑dimensional parameter space without becoming trapped in local minima; and (iv) cross‑validation, which splits the data of each replica into training and validation subsets and stops the fit when the validation χ² begins to rise, thereby avoiding over‑fitting.
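Pillars (i) and (iv) can be illustrated with a minimal Python sketch. It is not the NNPDF code: it assumes uncorrelated Gaussian uncertainties (a real fit would use the full experimental covariance), and the function names, the `train_fraction` argument and the `patience` look-back window are hypothetical choices made only for this illustration.

```python
import random


def make_replicas(central, sigma, n_rep, seed=0):
    """Generate pseudo-data replicas by fluctuating each data point.

    Assumption: uncertainties are uncorrelated and Gaussian.
    """
    rng = random.Random(seed)
    return [
        [c + rng.gauss(0.0, s) for c, s in zip(central, sigma)]
        for _ in range(n_rep)
    ]


def split_train_validation(n_points, train_fraction, rng):
    """Randomly partition point indices into training and validation sets."""
    idx = list(range(n_points))
    rng.shuffle(idx)
    n_train = int(round(train_fraction * n_points))
    return sorted(idx[:n_train]), sorted(idx[n_train:])


def stopping_iteration(validation_chi2, patience=3):
    """Return the iteration with the lowest validation chi2 seen so far,
    scanning until no improvement has occurred for `patience` iterations.

    `patience` is a hypothetical smoothing choice, not a value from the paper.
    """
    best_i = 0
    for i, value in enumerate(validation_chi2):
        if value < validation_chi2[best_i]:
            best_i = i
        elif i - best_i >= patience:
            break
    return best_i
```

In such a sketch each replica would get its own random training/validation split, so that no data point is systematically excluded from the ensemble as a whole.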
Statistical consistency is examined through several diagnostics. The authors compare the total χ² of the averaged replica set (χ²_tot) with the average replica‑by‑replica χ²(k) and with the χ² obtained when each replica is compared to its own pseudo‑data (χ²(E_i)). The finding that χ²(k)≈1 while χ²(E_i)≈2 indicates that the replicas have “learned” the underlying law: each fitted replica lies closer to the original experimental data than to the noisy pseudo‑data it was fitted to. A distance metric d
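The χ² comparison described above can be reproduced in a toy setting. The following Python sketch is an illustration under stated assumptions, not the paper's analysis: a straight‑line least‑squares fit stands in for the neural network, the underlying law, noise level and sample sizes are invented, and all χ² values are quoted per data point.

```python
import random


def fit_line(xs, ys):
    """Ordinary least-squares straight line; returns (slope, intercept)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def chi2_per_point(pred, data, sigma):
    """Uncorrelated chi2 divided by the number of data points."""
    return sum(((p - d) / sigma) ** 2 for p, d in zip(pred, data)) / len(data)


def run_toy(n_points=400, n_rep=100, sigma=0.1, seed=0):
    rng = random.Random(seed)
    xs = [i / (n_points - 1) for i in range(n_points)]
    truth = [1.0 + 2.0 * x for x in xs]  # hypothetical underlying law
    # "Experimental" data: truth plus one layer of noise.
    data = [t + rng.gauss(0.0, sigma) for t in truth]

    preds, chi2_k, err_k = [], [], []
    for _ in range(n_rep):
        # Pseudo-data replica: experimental data plus a second layer of noise.
        pseudo = [d + rng.gauss(0.0, sigma) for d in data]
        slope, intercept = fit_line(xs, pseudo)
        pred = [slope * x + intercept for x in xs]
        preds.append(pred)
        chi2_k.append(chi2_per_point(pred, data, sigma))   # vs. original data
        err_k.append(chi2_per_point(pred, pseudo, sigma))  # vs. own pseudo-data

    # Central prediction: average of the replica fits at each point.
    mean_pred = [sum(col) / n_rep for col in zip(*preds)]
    return {
        "chi2_tot": chi2_per_point(mean_pred, data, sigma),
        "mean_chi2_k": sum(chi2_k) / n_rep,
        "mean_E": sum(err_k) / n_rep,
    }


if __name__ == "__main__":
    print(run_toy())
```

When the fit recovers the underlying law, its residuals against the original data carry one layer of noise and its residuals against its own pseudo‑data carry two, so `mean_chi2_k` should come out close to 1 and `mean_E` close to 2, mirroring the pattern quoted in the summary.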