We demonstrate and explicate Bayesian methods for fitting the parameters that encode the impact of short-distance physics on observables in effective field theories (EFTs). We use Bayes' theorem together with the principle of maximum entropy to account for the prior information that these parameters should be natural, i.e. O(1) in appropriate units. Marginalization can then be employed to integrate the resulting probability density function (pdf) over the EFT parameters that are not of specific interest in the fit. We also explore marginalization over the order of the EFT calculation, M, and over the variable, R, that encodes the inherent ambiguity in the notion that these parameters are O(1). This results in a very general formula for the pdf of the EFT parameters of interest given a data set, D. We use this formula and the simpler "augmented chi-squared" in a toy problem for which we generate pseudo-data. These Bayesian methods, when used in combination with the "naturalness prior", facilitate reliable extractions of EFT parameters in cases where chi-squared methods are ambiguous at best. We also examine the problem of extracting the nucleon mass in the chiral limit, M_0, and the nucleon sigma term from pseudo-data on the nucleon mass as a function of the pion mass. We find that Bayesian techniques can provide reliable information on M_0, even if some of the data points used for the extraction lie outside the region of applicability of the EFT.
Bayesian Methods for Parameter Estimation in Effective Field Theories
Effective field theory (EFT) methods allow the treatment of problems in which there is a separation of scales. In these theories dynamics at the low-energy scale, m, say, is incorporated explicitly in the theory, while the degrees of freedom that enter the problem at the high-energy scale, Λ, are integrated out. (See Refs. [1,2,3,4] for pedagogical introductions to EFT.) The impact of modes with p ∼ Λ on dynamics for p ∼ m is then accounted for via a sequence of contact operators of increasing dimension. If there is no predetermination as to which operators appear in this sequence then the theory is free of model assumptions about the high-energy dynamics. Therefore, in general, all contact operators consistent with the symmetries that are applicable at the scale p ∼ m should be included in the EFT expansion. The coefficients of these admissible contact operators encode the impact of high-energy physics on low-energy observables in a systematic and model-independent way. Observables corresponding to momenta p ∼ m can be computed as an expansion in powers of m/Λ, and the resultant formulae are model-independent predictions, depending only on the existence of the scale separation and the symmetries of the low-energy theory.
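The structure of such an expansion can be illustrated with a minimal sketch. The observable, the coefficient values, and the value of x = p/Λ below are invented for illustration; the only features taken from the text are that the expansion is in powers of m/Λ (here x) and that the coefficients are natural, i.e. O(1):

```python
import numpy as np

# Hypothetical truncated EFT-style expansion g(x) = sum_n a_n * x**n.
# The coefficients are invented "natural" values, all O(1); they stand in
# for the contact-operator coefficients discussed in the text.
a = np.array([1.0, -0.5, 0.25])


def g_truncated(x, coeffs):
    """Evaluate the truncated expansion sum_n coeffs[n] * x**n."""
    return sum(c * x**n for n, c in enumerate(coeffs))


x = 0.1  # p/Lambda well inside the EFT's assumed domain of validity
print(g_truncated(x, a))  # ≈ 0.9525; higher orders are suppressed by x**n
```

Because x is small, each successive term is suppressed by a further power of x, which is what makes a fixed-order truncation a controlled approximation.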
One popular application of EFT is to low-energy QCD. In this case the scale separation is between the mass of the pion, the pseudo-Goldstone boson of QCD's spontaneously broken (approximate) chiral symmetry, and the masses of other hadronic degrees of freedom. The EFT which incorporates chiral symmetry and encodes this scale separation is known as chiral perturbation theory (χPT) [5,6,7,8,9,10,11]. The χPT expansion for a hadronic observable is then an expansion in powers of m/Λ, with loop diagrams introducing non-analytic dependence on this expansion parameter.¹ The dynamics at scale Λ impacts this expansion through certain coefficients which are not determined a priori. The low-energy symmetries of QCD mandate that, once determined in one process, these parameters, the "low-energy constants" (LECs) of χPT, will appear in other processes too, thereby giving χPT predictive power once the LECs at a given order are known.² There are some instances in which an LEC can be rigorously computed from the underlying theory, but lattice calculations which do this for low-energy QCD exist in only a very few cases. In this situation the only model-independent way to find the LECs is to fit them to experimental data. Such parameter estimation is thus a crucial component of χPT, and indeed of all EFT programs.
The standard method of determining LECs from data is to perform a fit using the EFT expansion of a physical quantity at a fixed order, employing techniques such as least squares or maximum likelihood. But here we face several dilemmas as regards the "best" way to obtain the LECs, including:
Which data should be used to determine the LEC? More data become available as the maximum energy of the data set is increased, but the reliability of a fixed-order EFT calculation decreases as the energy is increased.
What order of EFT calculation should be used to extract the LEC? The first one at which that LEC appears, or the highest one to which the expansion has been computed?
How should prior constraints on LECs (e.g. from the requirement of “naturalness” with respect to the scale Λ, or from other processes) be incorporated into the fit?
In an ideal situation none of these dilemmas matter, and all fitting paths lead to the same LEC (within errors). But if only somewhat imprecise experimental data is available in the region of validity of the EFT then the extracted LEC can be significantly sensitive to the manner in which the fit is done.
In this paper we argue that Bayesian methods (see e.g. Refs. [12,13]) are ideal for parameter estimation of LECs in EFTs, and that they resolve all the above dilemmas. In the Bayesian approach the central object is the posterior probability density function (pdf) for the LECs of interest, say a_0 and a_1, and we want their joint, conditional distribution given a data set D: pr(a_0, a_1|D). Bayes' theorem gives us the following relation between this and the more-usually computed pr(D|a_0, a_1):

pr(a_0, a_1|D) = pr(D|a_0, a_1) pr(a_0, a_1) / pr(D).
Here the first factor on the right-hand side is the "likelihood" that is maximized in a χ² or least-squares approach. It is through the second factor that prior information can be incorporated in the fit. (The factor in the denominator may be determined by the requirement that pr(a_0, a_1|D) be a normalized pdf.)
The fact that EFTs intrinsically depend on scale separation means that in an EFT fit there is information available on the size of LECs prior to the analysis of the data. In a standard, perturbative EFT with one high-energy scale the LECs should be "natural" with respect to the scale Λ, i.e. O(1) when measured in units of Λ.³ Consequently we begin by encoding the fact that the LECs a_0, …
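One concrete way such a naturalness prior enters a fit is through the "augmented chi-squared" mentioned in the abstract: a Gaussian prior of width R on each LEC adds a penalty term a_n²/R² to the usual chi-squared. The sketch below is a hedged illustration of that idea, assuming a polynomial model; the data and the choice R = 1 are invented, and this is not the paper's toy problem:

```python
import numpy as np

def chi2_aug(a, x, y, sigma, R=1.0):
    """chi^2 + sum_n a_n^2 / R^2 for a polynomial model y = sum_n a_n x^n.

    The second term is the naturalness penalty: LECs much larger than R
    (in appropriate units) are disfavored by the prior.
    """
    model = np.polynomial.polynomial.polyval(x, a)
    chi2 = np.sum(((y - model) / sigma) ** 2)
    return chi2 + np.sum(np.asarray(a) ** 2) / R**2

# Illustrative call: noiseless pseudo-data exactly matched by a_0 = 1, so
# only the prior penalty a_0^2 / R^2 = 1.0 remains.
x = np.linspace(0.0, 0.3, 5)
y_exact = np.ones_like(x)
print(chi2_aug([1.0], x, y_exact, sigma=0.1, R=1.0))  # prints 1.0
```

Minimizing chi2_aug instead of the bare chi-squared stabilizes the fit when the data alone constrain the higher-order LECs poorly, since unnaturally large coefficients are penalized.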
…(Full text truncated)…