Precision annealing Monte Carlo methods for statistical data assimilation and machine learning
📝 Abstract
In statistical data assimilation (SDA) and supervised machine learning (ML), we wish to transfer information from observations to a model of the processes underlying those observations. For SDA, the model consists of a set of differential equations that describe the dynamics of a physical system. For ML, the model is usually constructed using other strategies. In this paper, we develop a systematic formulation based on Monte Carlo sampling to achieve such information transfer. Following the derivation of an appropriate target distribution, we present the formulation based on the standard Metropolis-Hastings (MH) procedure and the Hamiltonian Monte Carlo (HMC) method for performing the high-dimensional integrals that appear. To the extensive literature on MH and HMC, we add (1) an annealing method using a hyperparameter that governs the precision of the model to identify and explore the highest probability regions of phase space dominating those integrals, and (2) a strategy for initializing the state-space search. The efficacy of the proposed formulation is demonstrated using a nonlinear dynamical model with chaotic solutions widely used in geophysics.
📄 Content
PHYSICAL REVIEW RESEARCH 2, 013050 (2020)

Precision annealing Monte Carlo methods for statistical data assimilation and machine learning

Zheng Fang,1,*,† Adrian S. Wong,1,† Kangbo Hao,1,† Alexander J. A. Ty,1 and Henry D. I. Abarbanel1,2

1Department of Physics, University of California, San Diego, La Jolla, California 92093, USA
2Marine Physical Laboratory (Scripps Institution of Oceanography), University of California, San Diego, La Jolla, California 92093, USA

(Received 6 July 2019; published 15 January 2020)

DOI: 10.1103/PhysRevResearch.2.013050
*Corresponding author: zfang@physics.ucsd.edu
†These authors contributed equally to this work.

Published by the American Physical Society under the terms of the Creative Commons Attribution 4.0 International license. Further distribution of this work must maintain attribution to the author(s) and the published article's title, journal citation, and DOI.

I. INTRODUCTION

Two seemingly distinct challenges for systematically transferring information from a well-curated (but noisy) data set to a model of the processes producing the data, namely statistical data assimilation (SDA) [1–6] and machine learning [7–14], have been shown to be equivalent in their formal structure [7]. In artificial neural networks, the rules that direct the activities from layer to layer are equivalent to the rules for the temporal development of dynamical models in statistical data assimilation. In SDA, the number of measurements at each observation time plays the same role as the number of independent input-output pairs in machine learning.

In this paper we formulate the information-transfer problem as a Monte Carlo evaluation of a high-dimensional expected-value integral. This formulation can be applied both to physical dynamical systems and to machine-learning problems [7,15]. We also explore the evaluation of such integrals and add to the Monte Carlo procedures a strategy to identify the dominant contribution to the expected values. We use a problem description close to SDA in the physical sciences [1]; nevertheless, the statements and lessons identified here are directly usable in machine learning [7].

We first establish the problem in Sec. II and then briefly discuss the Hamiltonian Monte Carlo (HMC) method [16–18] in Sec. III. HMC overcomes the difficulties in traditional Monte Carlo by complementing the original state space with a set of canonical variables moving in "time." Here, in particular, we study HMC with two additional features that are specified in Sec. V.
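To make the state-space augmentation just described concrete, here is a minimal sketch of a single HMC update for a generic target distribution proportional to exp(−A(X)). The function names, leapfrog step size, and trajectory length are illustrative assumptions for the sketch, not the paper's code:

```python
import numpy as np

def hmc_step(x, action, grad_action, eps=0.05, n_leapfrog=20, rng=None):
    """One HMC update for a target proportional to exp(-action(x)).

    The state x is augmented with Gaussian canonical momenta p; the
    pair (x, p) is evolved by leapfrog integration of the Hamiltonian
    H = action(x) + p.p/2 and the endpoint is accepted or rejected by
    a Metropolis test on the change in H.
    """
    rng = np.random.default_rng() if rng is None else rng
    p = rng.standard_normal(x.shape)              # fresh canonical momenta
    h0 = action(x) + 0.5 * np.dot(p, p)           # initial "energy"

    x_new, p_new = x.copy(), p.copy()
    p_new -= 0.5 * eps * grad_action(x_new)       # half step in momentum
    for _ in range(n_leapfrog - 1):
        x_new += eps * p_new                      # full step in position
        p_new -= eps * grad_action(x_new)         # full step in momentum
    x_new += eps * p_new                          # final position step
    p_new -= 0.5 * eps * grad_action(x_new)       # final momentum half step

    h1 = action(x_new) + 0.5 * np.dot(p_new, p_new)
    if rng.random() < np.exp(min(0.0, h0 - h1)):  # Metropolis accept test
        return x_new, True
    return x, False
```

Because the leapfrog integrator nearly conserves H, acceptance rates stay high even for distant proposals, which is what lets HMC sidestep the slow diffusive exploration of random-walk samplers.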
(1) We use a precision annealing (PA) method derived from the formulation of the "action" A(X) = −log[π(X | Y)]. The target distribution π(X | Y), conditioned on all observations, governs the expected values of the model state variables and parameters.

(2) We use a different way to initialize the Monte Carlo search.

We call the combination of these strategies precision annealing Hamiltonian Monte Carlo (PAHMC).

We also report some results from standard Metropolis-Hastings (MH) Monte Carlo calculations [19,20] in which the proposals for movements in the state space are based on random perturbations of the present location. We label these methods as random proposal (RP) searches. We employ our PA and initialization strategies in both the RP and the HMC analyses.

To show the effectiveness of our formulation, both approaches are demonstrated on a chaotic dynamical model widely used in geophysics [21]. We present in Sec. VI results on estimating the observed and unobserved state variables of the model, estimating the unknown model parameters, as well as predicting forward in time.

A. Example physics problems addressed by the methods discussed

Here we very briefly discuss two physics problems where we have applied earlier methods for data assimilation: (i) determining the elec
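The annealing and random-proposal strategies described above can be illustrated schematically: sweep a precision hyperparameter upward, run an RP search at each level, and reuse each level's estimate to initialize the next. The interface `action_at(x, beta)`, the beta schedule, and all names below are assumptions made for this sketch, not the authors' implementation:

```python
import numpy as np

def rp_metropolis(x, action, n_steps, step_size, rng):
    """Random-proposal (RP) Metropolis-Hastings: perturb the current
    state with Gaussian noise and accept with probability exp(-dA)."""
    a = action(x)
    for _ in range(n_steps):
        prop = x + step_size * rng.standard_normal(x.shape)
        a_prop = action(prop)
        if rng.random() < np.exp(min(0.0, a - a_prop)):
            x, a = prop, a_prop
    return x

def precision_anneal(x0, action_at, betas, n_steps=200, step_size=0.1, seed=0):
    """Precision annealing with an RP inner search.

    betas indexes increasing model precision; as beta grows the target
    distribution sharpens around its dominant peaks. Carrying each
    level's estimate forward as the next level's starting point lets
    the search track those deepening peaks instead of restarting blind.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    path = []
    for beta in betas:
        x = rp_metropolis(x, lambda z: action_at(z, beta),
                          n_steps, step_size, rng)
        path.append(x.copy())
    return path
```

The same outer loop applies unchanged if the RP inner sampler is replaced by HMC updates, which is the schematic shape of the PAHMC combination described above.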