Systematic Physics Constrained Parameter Estimation of Stochastic Differential Equations


Authors: Daniel Peavoy, Christian L. E. Franzke, Gareth O. Roberts

Daniel Peavoy^{a,1}, Christian L. E. Franzke^{b,∗}, Gareth O. Roberts^{c}

^{a} Complexity Science Centre, University of Warwick, Coventry, UK
^{b} Meteorological Institute and Center for Earth System Research and Sustainability (CEN), University of Hamburg, Hamburg, Germany
^{c} Department of Statistics, University of Warwick, Coventry, UK

Abstract

A systematic Bayesian framework is developed for physics constrained parameter inference of stochastic differential equations (SDE) from partial observations. Physical constraints are derived for stochastic climate models but are applicable to many fluid systems. A condition for global stability of stochastic climate models is derived from energy conservation: stochastic climate models are globally stable when a quadratic form, which is related to the cubic nonlinear operator, is negative definite. A new algorithm is developed for the efficient sampling of such negative definite matrices, and also for imputing unobserved data, which improves the accuracy of the parameter estimates. The performance of this framework is evaluated on two conceptual climate models.

Keywords: Global Stability, Parameter Inference, Imputing Data, Stochastic Differential Equations, Physical Constraints, Stochastic Climate Models
2010 MSC: 86-08, 65C60, 86A10

1. Introduction

In many areas of science the inference of reduced order hybrid dynamic-stochastic models, which take the form of stochastic differential equations (SDE), from data is very important. For many applications running full resolution dynamical models is computationally prohibitive, and in many situations one is mainly interested in large-scale features and not in the exact evolution of the fast, small-scale features, which typically determine the time step size.
Thus, reduced order stochastic models are an attractive alternative. Examples are molecular dynamics (21), engineering turbulence (20) and climate science (14, 15, 22). The inference of such models has been done using non-parametric methods (40, 17, 8) from partial observations. These non-parametric methods need very long time series for reliable parameter estimates and can be used only for very low-dimensional models because of the curse of dimensionality. More importantly, they do not necessarily obey conservation laws or stability properties of the full dimensional dynamical system.

In many areas of science one can derive reduced order models from first principles (26, 31) such that certain fundamental properties of the full dynamics are still valid. These methods provide us with parametric forms for the model fitting. Physical constraints then not only constrain the parameters one has to estimate, but they can also ensure global stability. Thus, there is a need for systematic physics constrained model and parameter estimation procedures (32, 28).

For instance, the climate system is governed by conservation laws like energy conservation. Based on this energy conservation property the normal form of stochastic climate models has been derived by (26) using the stochastic mode reduction procedure (23, 24, 25, 14, 15). This procedure allows the systematic derivation of reduced order models from first principles. This normal form provides a parametric form for parameter estimation from partial observations, which we will use in this study.

∗ Corresponding author: Center for Earth System Research and Sustainability (CEN), University of Hamburg, Grindelberg 7, 20144 Hamburg, Germany, christian.franzke@gmail.com
1 Present affiliation: Manufacturing Technology Centre, Coventry, UK
Preprint submitted to Computational Statistics & Data Analysis, June 25, 2021
The fundamental form of climate models is given by

    dz/dt = F + L z + B(z, z),    (1)

where z ∈ R^N denotes the N-dimensional state vector, F the external forcing, L a linear and B a quadratic nonlinear operator. The nonlinear operator B conserves energy: z · B(z, z) = 0. For current climate models N is of the order of 10^6 – 10^8. This shows that running complex climate models is computationally expensive. But for many applications like extended-range (periods of more than 2 weeks), seasonal and decadal climate predictions one is only interested in the large-scale circulation of the climate system, and not in whether there will be a cyclone over London on a particular day next year. The large-scale circulation can successfully be predicted using reduced order models (37, 1, 14, 15, 22).

The stochastic mode reduction procedure (23, 24, 25) provides a systematic framework for deriving reduced order climate models with a closure which takes account of the impact of the unresolved modes on the resolved modes. In order to derive reduced order models one splits the state vector z = (x, y)^T into resolved modes x and unresolved modes y. The stochastic mode reduction procedure then enables us to systematically derive a reduced order climate model which only depends on x:

    dx = ( F̃ + L̃ x + B̃(x, x) + M(x, x, x) ) dt + a(x, σ) dW,    (2)

where M denotes a cubic nonlinear term, W is the Wiener process and σ the diffusion parameters.

In this study we will develop a systematic Bayesian framework for the efficient estimation of the model parameters from partial observations using Markov Chain Monte Carlo (MCMC) methods. We are dealing with partial observations because we only have knowledge of the few resolved modes x and are ignorant about the many unresolved modes y. Stochastic climate modeling is a complex problem, and empirically estimating the parameters poses several problems.
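The energy conservation property of the quadratic operator in Eq. (1) can be illustrated numerically with a hypothetical three-mode triad system whose interaction coefficients sum to zero; the coefficients below are illustrative and not from any particular climate model:

```python
import numpy as np

# Hypothetical triad interaction coefficients with b1 + b2 + b3 = 0; this is
# the standard way a quadratic operator B(z, z) conserves energy in a triad.
b = np.array([1.0, -0.6, -0.4])

def B(z):
    """Quadratic nonlinearity of a three-mode triad system."""
    return np.array([b[0] * z[1] * z[2],
                     b[1] * z[0] * z[2],
                     b[2] * z[0] * z[1]])

rng = np.random.default_rng(0)
z = rng.standard_normal(3)

# z . B(z, z) = (b1 + b2 + b3) z1 z2 z3 = 0: the nonlinearity only moves
# energy between modes without changing the total energy.
energy_tendency = z @ B(z)
```

Any state z gives a vanishing energy tendency from B alone; forcing, linear terms and noise are what inject or remove energy.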
First, the nonlinearity of climate models requires an approximation of the likelihood function. While it can be shown that this approximation converges to the true likelihood, this is not necessarily the case for real world applications. Here we develop, for the first time, an MCMC algorithm for stochastic climate models and demonstrate that this algorithm performs well. Second, the nonlinearity of the problem causes the space of parameters leading to stable and physically meaningful solutions to become small as the dimension of the problem increases. We show that a lot of the posterior mass is on parameter values which lead to solutions exploding to infinity in finite time. To solve this problem we derive global stability conditions. These conditions take the form of a negative definite matrix. Hence, we devise a novel sampling strategy based on sampling negative definite matrices. We show that this sampling strategy is computationally efficient and leads to stable solutions.

In section 2 we introduce stochastic climate models and derive conditions for global stability. Previous studies have shown that reduced climate models with quadratic nonlinearity experience unphysical finite time blow up and long time instabilities (19, 27, 28, 41). Here we use the normal form of stochastic climate models (26) and derive sufficient conditions for global stability of this normal form. These normal form stochastic climate models have cubic nonlinearities. The stability condition derived here is more general than the one in Majda et al. (26). In section 3 we develop a Bayesian framework for the systematic estimation of the model parameters using physical constraints. Here we develop an efficient way of sampling negative definite matrices.
Without this constraint the MCMC algorithm would produce about 40% unphysical solutions, which is clearly very inefficient. Here we also demonstrate that for these kinds of SDEs imputing data improves the parameter estimates considerably. In section 4 we demonstrate the accuracy of our framework on conceptual climate models. We summarize our results in section 5.

2. Stochastic Climate Models and Global Stability

Here we study the following D dimensional normal form of stochastic climate models (which has the same structural form as Eq. (2), see (26)):

    dx_i = ( α_i + Σ_{j=1}^{D} β_{i,j} x_j + Σ_{j=1}^{D} Σ_{k=1}^{j} γ_{i,j,k} x_j x_k + Σ_{j=1}^{D} Σ_{k=1}^{j} Σ_{l=1}^{k} λ_{i,j,k,l} x_j x_k x_l ) dt    (3a)
         + Σ_{j=1}^{D} a_{i,j} dW_j + Σ_{j=1}^{D} Σ_{k=1}^{j} b_{i,j,k} x_j dW_k,    (3b)

which we write for convenience in the more compact form

    dx = μ(x, A) dt + a(x, σ) dW.    (4)

The parameters α, β, γ and λ are written as one matrix A ∈ R^{D×P}. We allow for the inclusion of all possible linear, quadratic and cubic terms, with forcing terms entering first, followed by the linear, quadratic and then the cubic terms. We include them in the matrix A as A_{i,1} = α_i, A_{i,j+1} = β_{i,j}, A_{i, j(j−1)/2 + k + D + 1} = γ_{i,j,k} and A_{i, f(j,k,l)} = λ_{i,j,k,l}. The index function for the cubic terms is given by

    f(j, k, l) = 1 + D + D(D+1)/2 + j(j−1)(j+1)/6 + k(k−1)/2 + l.    (5)

Global stability implies that the cubic terms act as a nonlinear damping. For climate models, energy conservation of the nonlinear operator implies global stability. However, in general, global stability does not necessarily imply energy conservation. The energy equation based only on the cubic terms can be written as

    (1/2) dE/dt = Σ_{i=1}^{D} x_i dx_i/dt = Σ_{i=1}^{D} Σ_{j=1}^{D} Σ_{k=1}^{j} Σ_{l=1}^{k} A_{i, f(j,k,l)} x_i x_j x_k x_l.    (6)
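The packing of the drift parameters into the matrix A can be made concrete with a short sketch of the index function of Eq. (5); the helper `n_columns` (a name introduced here for illustration) counts the columns of A:

```python
def f(j, k, l, D):
    """1-based column index of the cubic coefficient lambda_{i,j,k,l} in the
    parameter matrix A, for 1 <= l <= k <= j <= D, following Eq. (5)."""
    return (1 + D + D * (D + 1) // 2
            + j * (j - 1) * (j + 1) // 6
            + k * (k - 1) // 2 + l)

def n_columns(D):
    """Total columns P of A: forcing (1), linear (D), quadratic (D(D+1)/2)
    and cubic (D(D+1)(D+2)/6) terms."""
    return 1 + D + D * (D + 1) // 2 + D * (D + 1) * (D + 2) // 6
```

For D = 2 this gives P = 10, consistent with the two dimensional design matrix used later, and the four cubic terms occupy columns 7 through 10.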
Here we only consider the cubic terms because they will ultimately determine global stability. Majda et al. (26) have shown that the normal form of stochastic climate models allows for linearly unstable modes. These linearly unstable modes are associated with important weather systems and waves, and are an intrinsic and important part of climate models. Once these linearly unstable modes have reached a certain amplitude, the nonlinear cubic terms will govern their evolution and, thus, ensure global stability.

We now consider the vector v with D(D+1)/2 components of the form v_{(i−1)i/2 + j} = x_i x_j with 1 ≤ j ≤ i ≤ D, and we seek a negative definite matrix M such that (26)

    v^T M v = (1/2) dE/dt ≤ 0.    (7)

A sufficient solution is as follows: let the matrix M ∈ R^{(D+1)D/2 × (D+1)D/2} be

    M_{(i−1)i/2 + j, (k−1)k/2 + l} =
        A_{i, f(i,j,l)},                      if k > j and l ≤ j,
        0,                                    if k > j and l > j,
        A_{i, f(i,j,l)} + A_{i, f(i,j,k)},    if k ≤ j and l < k,
        A_{i, f(i,j,l)},                      if k ≤ j and l = k,    (8)

where 1 ≤ j ≤ i ≤ D and 1 ≤ l ≤ k ≤ D. While this solution is not necessarily unique, the imposition of this constraint is still necessary in order to reduce the number of parameter values leading to unstable models and, thus, to reduce the computational expense of the parameter inference. As we will show below, not imposing this constraint leads to unstable and unphysical solutions for about 40% of parameter values in our MCMC scheme.

In summary, the stochastic climate model in Eq. (4) is globally stable if the matrix M is negative definite, and M determines the components of the cubic operator λ. This is an important result for the constrained parameter estimation of stochastic climate models (26, 32).
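The role of the stability condition in Eq. (7) can be sketched numerically: for any negative definite M, the quadratic form over the vector v of monomials x_i x_j is non-positive, so the cubic terms drain energy at large amplitude. The matrix M below is a randomly generated stand-in, not one assembled from Eq. (8):

```python
import numpy as np

rng = np.random.default_rng(1)
D = 3
n = D * (D + 1) // 2

# A stand-in negative definite matrix: -G G^T - 0.1 I is negative definite
# by construction for any G.
G = rng.standard_normal((n, n))
M = -G @ G.T - 0.1 * np.eye(n)

x = rng.standard_normal(D)
# v_{(i-1)i/2 + j} = x_i x_j for 1 <= j <= i <= D (0-based loops below).
v = np.array([x[i] * x[j] for i in range(D) for j in range(i + 1)])

# Contribution of the cubic terms to (1/2) dE/dt, cf. Eq. (7): non-positive
# whenever M is negative definite, whatever the state x.
half_dEdt = v @ M @ v
```

Because the bound holds for every state, no trajectory can gain energy from the cubic terms, which is the mechanism behind the global stability result.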
3. Physics Constrained Parameter Sampling

We use a Markov Chain Monte Carlo algorithm for the parameter inference which was proposed by (7) and (18). We use this approach because of its flexibility. For instance, the exact algorithms by (5) and (6) cannot easily be applied to multidimensional diffusions. Furthermore, the exact algorithms require that the drift function be the gradient of a potential; stochastic climate models, for instance, cannot be written in such a form. Our MCMC algorithm first updates the diffusion parameters (see Algorithm 1), then updates the imputed data (Algorithm 2) and finally updates the drift parameters (section 3.2). To efficiently propose imputed data we use the Modified Linear Bridge sampler (section 3.1). To physically constrain the drift parameters we develop a scheme to sample negative definite matrices (section 3.3 and Algorithms 3 and 4).

The novel aspect of our MCMC algorithm is the physics constrained sampling which ensures the stability of the reduced stochastic model. Moreover, this algorithm overcomes the dependency between the diffusion parameters and the missing data by changing variables to the underlying Brownian motion W ∈ R^d and conditioning on this when performing the parameter update. This ensures consistency between the parameters and the path (7, 18). In order to improve the accuracy of the Euler approximation we introduce latent data points between all pairs of observations. While this is not trivial for nonlinear models, it can be accomplished by introducing a suitable diffusion bridge (10, 18). For this purpose we define a process Z, which conditions on the endpoint x_T, by

    dX_t = a(X_t, σ) dZ_t + (x_T − X_t)/(T − t) dt,    X_0 = x_0,    (9)

where T is the next observation time.
In discrete time the transformation is

    X_{i+1} = X_i + a(X_i, σ)(Z_{i+1} − Z_i) + (x_T − X_i)/(m − i),    (10)

where m − 1 is the number of imputed points between two observations. Defining the process Z ensures that the dominating measure is parameter free and, hence, improves the performance of the MH sampler. See Dargatz (9) for more details.

We sample σ according to Algorithm 1. We use zero-based numbering and N − 1 observation intervals indexed 0, ..., N − 2. We assume that the inter-observation times ∆ are all equal and that there are m − 1 imputed points per interval, giving a time interval of δ = ∆/m. We use the notation X_i = X_{t_i} and μ_i = μ(X_{t_i}, A). We assume that we have perfect observations for ease of notation, but our method can be extended to the case of measurement error (e.g. (18)). The extension to variable inter-observation times is straightforward. For simplicity, we write the algorithm for perfect observation of the system, so that X_{im}, i = 0, ..., N − 1, are fixed. In Algorithm 1, φ denotes a Gaussian density, q the Gaussian proposal density (defined below in Eq. (19)) and Σ a covariance matrix (given below in Eq. (16)).

Algorithm 1: Sample parameters entering the diffusion matrix.
    Draw σ* ∼ q(σ* | σ)
    Initialize α = log q(σ | σ*) − log q(σ* | σ) + log p(σ*) − log p(σ)
    for i = 0 to N − 2 do
        for j = 0 to m − 2 do
            Z_{im+j+1} = Z_{im+j} + a^{−1}(X_{im+j}, σ) ( X_{im+j+1} − X_{im+j} − (X_{im+m} − X_{im+j})/(m − j) )
            X*_{im+j+1} = X*_{im+j} + (X_{im+m} − X*_{im+j})/(m − j) + a(X*_{im+j}, σ*)(Z_{im+j+1} − Z_{im+j})
            α = α + log φ(X*_{im+j+1}; X*_{im+j} + μ*_{im+j} δ, δ Σ*_{im+j}) + log |a(X*_{im+j}, σ*)|
                  − log φ(X_{im+j+1}; X_{im+j} + μ_{im+j} δ, δ Σ_{im+j}) − log |a(X_{im+j}, σ)|
        end for
    end for
    Set {σ, X} = {σ*, X*} with probability min(1, exp(α)); otherwise retain {σ, X}

To update missing data between observations we use an independence sampler as in (36) with the proposal process

    dX* = ξ(X*, X_T) dt + a(X*, σ) dW*,    (11)

where X_T is the next observation, X* the proposed data, and a(X*, σ) is the same diffusion function as in Eq. (4). ξ denotes the modified linear bridge (see section 3.1 below). The proposal process of Eq. (11) has a measure that is absolutely continuous with respect to the target process in Eq. (4) because of their common diffusion function. To update all of the missing data, we propose a block at a time from Eq. (11) and then accept the proposed block according to the MH ratio. If the inter-observation interval is large, the acceptance rate may become very low, and so one may sub-sample smaller blocks.
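The change of variables between the path X and the underlying Brownian motion Z (Eq. (10)), which Algorithm 1 relies on, can be sketched for a scalar diffusion; the constant diffusion function used below is an illustrative assumption:

```python
import numpy as np

def z_to_bridge(z, x0, xT, m, a, sigma):
    """Map Brownian-motion values Z_0, ..., Z_{m-1} to imputed path points
    X_1, ..., X_{m-1} between observations x0 and xT via Eq. (10); the
    endpoint X_m = xT is an observation and stays fixed."""
    x = np.empty(m + 1)
    x[0], x[m] = x0, xT
    for i in range(m - 1):
        x[i + 1] = (x[i] + a(x[i], sigma) * (z[i + 1] - z[i])
                    + (xT - x[i]) / (m - i))
    return x

# With zero noise increments the recursion pulls the path linearly to xT.
m = 10
straight = z_to_bridge(np.zeros(m), 0.0, 1.0, m, lambda x, s: s, 0.5)
```

Because the endpoint is pinned, proposing a new σ* in Algorithm 1 amounts to holding Z fixed and regenerating the imputed path X* with the new diffusion function, which is exactly the inner loop above.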
For some interval i we set X*_0 = X_{im} and X*_m = X_{(i+1)m}; then we propose X*_1, ..., X*_{m−1} and accept or reject the block using the MH acceptance probability

    α = [ p_δ(X*_m | X*_{m−1}, A) ∏_{j=0}^{m−2} p_δ(X*_{j+1} | X*_j, A) q_δ(X_{im+j+1} | X_{im+j}, ξ, σ) ]
      / [ p_δ(X_{(i+1)m} | X_{im+m−1}, A) ∏_{j=0}^{m−2} p_δ(X_{im+j+1} | X_{im+j}, A) q_δ(X*_{j+1} | X*_j, ξ, σ) ],    (12)

where p_δ is the transition density of the target

    dX_t = μ(X_t, A) dt + a(X_t, σ) dW_t,    X_0 = x_0,  t ∈ [0, T],    (13)

under the Euler approximation over the time interval δ, and where q_δ denotes the transition density of the proposal. We choose proposal processes so that, given X*_j, X*_{j+1} is approximately Gaussian distributed. However, Eq. (11) is not a true Gaussian process because of the state dependent noise term. Details for updating the missing data are given in Algorithm 2.

Algorithm 2: Sample missing data between observations.

    for i = 0 to N − 2 do
        Set X*_0 = X_{im}
        Set α = 0
        for j = 0 to m − 2 do
            X*_{j+1} ∼ q_δ(X*_{j+1} | ξ(X*_j, X_{im+m}), σ)
            α = α + log φ(X*_{j+1}; X*_j + δ μ*_j, δ Σ*_j) + log q_δ(X_{im+j+1} | ξ(X_{im+j}, X_{im+m}), σ)
                  − log φ(X_{im+j+1}; X_{im+j} + δ μ_{im+j}, δ Σ_{im+j}) − log q_δ(X*_{j+1} | X*_j, ξ(X*_j, X_{im+m}), σ)
        end for
        α = α + log φ(X_{im+m}; X*_{m−1} + δ μ*_{m−1}, δ Σ*_{m−1}) − log φ(X_{im+m}; X_{im+m−1} + δ μ_{im+m−1}, δ Σ_{im+m−1})
        if exp(α) > U(0, 1) then
            for j = 0 to m − 2 do
                X_{im+j+1} = X*_{j+1}
            end for
        end if
    end for

Algorithms 1 and 2 are combined with standard MH updates for the parameters A entering the drift function. First we update the diffusion parameters using Algorithm 1, then we update all imputed data. After that the drift parameters are updated (see section 3.2).
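Both acceptance ratios are sums of one-step Euler transition log-densities; a minimal scalar sketch, with a hypothetical drift and diffusion as stand-ins, is:

```python
import numpy as np

def euler_logpdf(x_next, x, mu, a, delta):
    """Log transition density of x_next given x under the Euler approximation
    of dX = mu(X) dt + a(X) dW over a step delta:
    x_next ~ N(x + mu(x) * delta, a(x)^2 * delta)."""
    mean = x + mu(x) * delta
    var = a(x) ** 2 * delta
    return -0.5 * (np.log(2.0 * np.pi * var) + (x_next - mean) ** 2 / var)

# Hypothetical drift and diffusion, for illustration only.
mu = lambda x: x - x ** 3
a = lambda x: 1.0

# The log-density of a block of imputed points is a sum of one-step terms,
# as in the numerator and denominator of Eq. (12).
path = np.array([0.0, 0.1, 0.15])
delta = 0.1
block_loglik = sum(euler_logpdf(path[j + 1], path[j], mu, a, delta)
                   for j in range(len(path) - 1))
```

The MH log-ratio in Eq. (12) is then the difference of two such block sums (target densities of proposed minus current block) plus the corresponding proposal-density terms with the roles of the blocks swapped.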
In both algorithms X* denotes the proposal, which is generated depending on the availability of observations. One could use random-walk proposals, but in our case of polynomial models it is more efficient to implement another Gibbs sampling step. Repeatedly alternating between these three steps produces MCMC samples that can be used to estimate the parameters. In practice we increase the amount of missing data m until we see convergence in the marginal distributions of the parameters.

3.1. Sampling of Diffusion Paths

Because we want to impute missing data we need efficient methods for simulating diffusion paths from Eq. (3) that are conditioned upon given start X_0 = x_0 and end X_m = x_m points. We consider the total time interval τ_m − τ_0 = ∆ divided into m equidistant sub-intervals, so that τ_{k+1} − τ_k = ∆/m = δ. Having N observations, X_{im} is an observation for each i = 0, 1, 2, ..., N − 1. Between every pair of observations the diffusion bridge needs to be simulated. We use an independence sampler with proposal density of the form q(X* | X) = q(X*). Here we consider proposal processes of the form of Eq. (11). We use a Modified Linear Bridge proposal for sampling of parameters of the drift of equations of the form of Eq. (4). For this purpose we apply Ito's formula to the drift function of Eq. (4). This gives the approximating process

    dZ_t = ( Q(X) Z_t + r(X, t) ) dt + Σ(X) dB_t,    (14)

with

    Q_{ij} = ∂μ_i(X_s)/∂x_j,
    r_i(t) = μ_i(X_s) − Σ_j (∂μ_i/∂x_j) X_j(s) + (1/2) Σ_{j,k,l} a_{jl}(X_s) a_{kl}(X_s) (∂²μ_i/∂x_j ∂x_k)(X_s) (t − s),
    Σ = a(X_s).

This is a local linearization of the nonlinear diffusion over a small time window (30, 39). First we construct bridge distributions for general multivariate linear diffusions (4).
If at time s we have X_s = d and at time T, X_T = e, then the distribution of X_t for 0 ≤ s < t ≤ T can be shown to be Gaussian with mean

    ν_{d,e}(s, t) = Γ(t, T) Γ(s, T)^{−1} m⁺_d(s, t) + Γ(s, t)^T (Γ(s, T)^T)^{−1} m⁻_e(t, T),    (15)

where

    Γ(s, t) = ∫_s^t e^{(s−u)Q} Σ Σ^T e^{(t−u)Q^T} du,
    m⁺_x(s, t) = x + ∫_s^t e^{(s−u)Q} r(u) du    and    m⁻_x(s, t) = x − ∫_s^t e^{(t−u)Q} r(u) du.

The covariance matrix is given by

    Σ(s, t) = Γ(t, T) Γ(s, T)^{−1} Γ(s, t).    (16)

In general this matrix can be computed as follows: if we diagonalize Q so that Q = U Λ U^{−1}, then compute the matrix V with components

    V_{ij} = (U^{−1} Σ Σ^T U^{−T})_{ij} / (Λ_{ii} + Λ_{jj}) · ( e^{(t−s)Λ_{jj}} − e^{(s−t)Λ_{ii}} ),    (17)

so that

    Γ(s, t) = U V U^T.    (18)

The proposal distribution is given by

    q(X_{k+1} | X_k, X_m, V) = φ( X_{k+1}; ν_{x_k, x_m}(kδ, (k+1)δ), Σ_{x_k}(kδ, (k+1)δ) ),    (19)

where ν_{x_j, x_m}(jδ, (j+1)δ) and Σ_{x_0}(jδ, (j+1)δ) are given in Eqs. (15) and (16), respectively. In contrast to a linear bridge sampler, here we update at each imputed point. This means recomputing the matrices Γ(s, t) at each point, although Q and U are only calculated once.

3.2. Inference for Drift Parameters

We now give details of the computational implementation of the sampling of parameters in the drift function. Since the drift parameters enter linearly, we can construct a Gibbs sampler where their conditional posterior is Gaussian. This greatly improves the mixing of the Markov chain.

3.2.1. Gibbs Sampler

Consider N observations with time interval δ. We set Y_t = X_{t+1} − X_t and let U ∈ R^{(N−1)×P} be the design matrix of the data, scaled by δ. The columns of U are indexed in the same way as the columns of the parameter matrix A.
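Evaluating the bridge proposal requires the matrices Γ(s, t); a sketch of the diagonalization route of Eqs. (17) and (18), assuming Q is diagonalizable and Λ_{ii} + Λ_{jj} ≠ 0 for all i, j, is:

```python
import numpy as np

def gamma_st(Q, Sigma, s, t):
    """Gamma(s, t) = int_s^t exp((s-u) Q) Sigma Sigma^T exp((t-u) Q^T) du,
    computed via the diagonalization Q = U Lambda U^{-1} as in Eqs. (17)-(18).
    Assumes Q is diagonalizable and lam_i + lam_j != 0; for real Q, complex
    eigenvalue pairs cancel in the final product, hence the .real at the end."""
    lam, U = np.linalg.eig(Q)
    Uinv = np.linalg.inv(U)
    C = Uinv @ Sigma @ Sigma.T @ Uinv.T        # (U^{-1} Sigma Sigma^T U^{-T})
    li, lj = lam[:, None], lam[None, :]
    V = C / (li + lj) * (np.exp((t - s) * lj) - np.exp((s - t) * li))
    return (U @ V @ U.T).real

# Scalar sanity check: for Q = -1, Sigma = 1, Gamma(0, 1) = sinh(1).
g_scalar = gamma_st(np.array([[-1.0]]), np.array([[1.0]]), 0.0, 1.0)[0, 0]
```

The eigendecomposition of Q is done once per update; only the elementwise exponentials in V change with (s, t), which is what makes recomputing Γ at every imputed point affordable.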
For example, a two dimensional system would have P = 10 and the following design matrix:

    U = δ [ 1   X_{1,1}     X_{1,2}     X²_{1,1}     X_{1,1}X_{1,2}     X²_{1,2}     X³_{1,1}     X²_{1,1}X_{1,2}     X_{1,1}X²_{1,2}     X³_{1,2}
            ⋮       ⋮           ⋮           ⋮               ⋮               ⋮            ⋮                ⋮                   ⋮                ⋮
            1   X_{N−1,1}   X_{N−1,2}   X²_{N−1,1}   X_{N−1,1}X_{N−1,2}   X²_{N−1,2}   X³_{N−1,1}   X²_{N−1,1}X_{N−1,2}   X_{N−1,1}X²_{N−1,2}   X³_{N−1,2} ].

The log likelihood can be written

    L(A; X) = −(1/2) Σ_{t=1}^{N−1} log |Σ_{Drift,t}| − (1/2) Σ_{t=1}^{N−1} Σ_{i,j=1}^{D} ( Y_{ti} − Σ_{k=1}^{P} U_{tk} A_{ik} ) (Σ^{−1}_{Drift,t})_{ij} ( Y_{tj} − Σ_{k=1}^{P} U_{tk} A_{jk} ),    (20)

where the instantaneous covariance matrix Σ_{Drift,t} is computed from (Σ^{1/2}_{Drift,t})_{j,k} = ( d_{j,k} + Σ_{l=1}^{D} e_{l,j,k} X_{t,l} ) ∆^{1/2}. We have DP parameters to infer in the matrix A. We use a zero mean Gaussian prior with covariance matrix Γ_{Drift} ∈ R^{DP×DP}. Let Λ ∈ R^{DP×DP} be a matrix with components

    Λ_{(i−1)P+j, (k−1)P+l} = Σ_{t=1}^{N−1} U_{tj} (Σ^{−1}_{Drift,t})_{ik} U_{tl} + (Γ^{−1}_{Drift})_{(i−1)P+j, (k−1)P+l},    (21)

where i, k = 1, ..., D and j, l = 1, ..., P. Let e ∈ R^{DP} have components

    e_{(i−1)P+j} = Σ_{t,k} U_{t,j} (Σ^{−1}_{Drift,t})_{ik} Y_{tk}.    (22)

The posterior mean μ_{(i−1)P+j} of A_{i,j} is given by the solution of Λμ = e, and the posterior covariance is Cov(A_{i,j}, A_{k,l}) = (Λ^{−1})_{(i−1)P+j, (k−1)P+l}.
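For the two dimensional example (D = 2, P = 10) the design matrix can be sketched as follows, with the column ordering of the parameter matrix A (forcing, linear, quadratic, cubic):

```python
import numpy as np

def design_matrix(X, delta):
    """Design matrix U, scaled by delta, for a two dimensional cubic model.
    X has shape (N-1, 2); columns follow the ordering of A:
    [1, x1, x2, x1^2, x1 x2, x2^2, x1^3, x1^2 x2, x1 x2^2, x2^3]."""
    x1, x2 = X[:, 0], X[:, 1]
    cols = [np.ones_like(x1), x1, x2,
            x1 ** 2, x1 * x2, x2 ** 2,
            x1 ** 3, x1 ** 2 * x2, x1 * x2 ** 2, x2 ** 3]
    return delta * np.stack(cols, axis=1)

U = design_matrix(np.array([[2.0, 3.0]]), delta=0.1)
```

With U in hand, the Gaussian conditional posterior of the drift parameters follows by assembling the linear system for the posterior mean described above and solving it.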
[Figure 1: Output of a Gibbs sampler for 20 drift parameters (A00–A19) of a two dimensional model from Eq. (3). The observation interval is δ = 10^{−3} and T = 10,000. The true values are shown in red.]

We applied the above Gibbs sampler to a large data set from a two dimensional model of the form of Eq. (3) with random values for the diffusion parameters. We chose a fine observation interval of δ = 10^{−3} and a long observation period T = 10,000. Fig. 1 displays the trace plots for all 20 parameters (note that the indices run from 0 rather than from 1 as in the text). Using this large data set the algorithm is able to reproduce the true values shown in red.

We performed a further simulation study to test the dependence of the posterior estimates upon the data set used. We inferred all of the drift parameters for a simple two dimensional model of the form of Eq. (3) using data sets of length T = {10, 100, 1000} and with observation interval ∆ = {0.1, 0.01, 0.001}. Note that the diffusion function is arbitrary in this model. The results are shown in Table 1. For each parameter we estimated the posterior mean and the posterior 10th–90th percentiles. A cell is colored blue where the true value falls within this range. Note that if we were using the true likelihood, rather than an approximation, we would expect around 80% blue boxes.
The error of the estimates for each data set can be quantified using the quadratic Posterior Expected Loss (PEL) function

    f(π̂, X_obs) = ∫_Θ (θ* − θ)² π̂(θ | X_obs) dθ,    (23)

where θ represents all of the parameters, π̂ is the estimated posterior distribution and θ* is the true value of the parameter.

We performed a test with both the Gibbs sampler and data imputation. In Tab. 2 the data is observed at interval ∆ = 0.1. The smaller intervals ∆ = {0.01, 0.001} are obtained by imputing data with m = {10, 100}, respectively. The table shows that imputing data approximately doubles the Posterior Expected Loss. As expected, the confidence intervals are broader, but with more imputed data the algorithm can recover the true values. This shows that our data imputing strategy successfully improves the parameter estimates.

Our aim is to infer models that can be used for prediction. This can be problematic when dealing with non-linear models, as some (generally unknown) regions of the parameter space will give solutions that explode to infinity with probability 1. This is a particular problem when, as exemplified by Table 1, large amounts of data are needed to regain the true values. To demonstrate this problem we performed an inference on a two dimensional cubic model using N = 1,000 observations at ∆ = 0.1. For each inferred parameter value we then simulated the solution for T = 100. After this time we recorded whether the solution retained finite values or had exploded. The marginal posterior distributions of the cubic parameters are plotted in Fig. 2. Each plot shows two histograms: the one in blue records the distribution of stable parameter values and in red are those that exploded. Notice that, when looking at the marginal distributions, the stable and unstable regions largely overlap; it is difficult to separate the two regions.
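From MCMC output, the integral in Eq. (23) is estimated by a Monte Carlo average over posterior samples; a minimal sketch (the function name is ours, not from the paper):

```python
import numpy as np

def posterior_expected_loss(samples, theta_true):
    """Monte Carlo estimate of the quadratic Posterior Expected Loss of
    Eq. (23): the average squared distance between the true parameter value
    and the posterior samples, summed over parameters."""
    samples = np.atleast_2d(np.asarray(samples, dtype=float))
    dev = samples - np.asarray(theta_true, dtype=float)
    return float(np.mean(dev ** 2, axis=0).sum())

# Two samples {1, 3} around a true value of 2 give a PEL of 1.
pel = posterior_expected_loss([[1.0], [3.0]], [2.0])
```

Note that for each parameter the PEL decomposes into the posterior variance plus the squared bias of the posterior mean, so it penalizes both scatter and systematic offset.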
In this case 40% of values were unstable. Tests (not shown) indicate that this is an even bigger problem in higher dimensions. Therefore, it is essential to use constraints on the parameter space so that only physically meaningful solutions are possible. The necessary conditions have been derived in section 2. Thus, as shown above, when updating the drift parameters we ensure that M is negative definite. In practice it is sufficient to check only whether the symmetric part (M + M^T)/2 is negative definite. In the next section we develop a systematic way of sampling negative definite matrices.

3.3. Sampling Negative Definite Matrices

To sample negative definite matrices we use the Component Wise algorithm (32). Here we sample the density of an n × n matrix M, with normally distributed components, subject to the constraint that it is negative definite. This algorithm updates M component wise and is based on the following property: an n × n matrix is negative definite if and only if all k ≤ n leading principal minors obey |M^(k)| (−1)^k > 0. The k-th principal minor is the determinant of the upper left k × k sub-matrix.

Consider the parameters along the main diagonal. As they only enter M once, each will have an associated upper bound. Algorithm 3 works by calculating the upper bound associated with the constraints from each principal minor. It does this to find the least upper bound and thereby the truncation point of the normal distribution. Here, N⁻(μ, u, σ²) is the right truncated normal distribution with mean μ, standard deviation σ and upper bound u. The off-diagonal parameters enter twice, so there will be a quadratic function determining their limits for each leading principal minor.
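The leading-principal-minor criterion that the component-wise sampler builds on can be sketched as a simple check, applied to the symmetric part as in the text:

```python
import numpy as np

def is_negative_definite(M):
    """A symmetric n x n matrix is negative definite iff its k-th leading
    principal minor has sign (-1)^k for all k = 1, ..., n; we apply the
    criterion to the symmetric part (M + M^T)/2."""
    S = (M + np.asarray(M).T) / 2.0
    n = S.shape[0]
    return all((-1) ** k * np.linalg.det(S[:k, :k]) > 0
               for k in range(1, n + 1))
```

The sampler itself goes one step further: for each diagonal entry it solves these minor inequalities for the entry's least upper bound and then draws from the correspondingly truncated normal, so every proposed M passes this check by construction.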
For parameters in element M^(k)_{ij} there will be an associated quadratic relation a^(k)_{ij} M²_{ij} + b^(k)_{ij} M_{ij} + c^(k)_{ij} = 0, where the coefficients

              |                 T = 10                                          |                 T = 100                                        |                 T = 1000
              | ∆ = 0.1             | ∆ = 0.01            | ∆ = 0.001           | ∆ = 0.1             | ∆ = 0.01            | ∆ = 0.001           | ∆ = 0.1             | ∆ = 0.01            | ∆ = 0.001
    A00 = 0   | 2.18 (-2.5, 6.89)   | -0.44 (-4.79, 3.74) | -1.95 (-6.07, 2.2)  | -0.13 (-1.07, 0.77) | 0.65 (-0.29, 1.6)   | 0.64 (-0.28, 1.58)  | 0.02 (-0.13, 0.17)  | 0.01 (-0.13, 0.17)  | 0.02 (-0.13, 0.18)
    A01 = 5   | -2.04 (-7.23, 3.04) | 4.86 (0.84, 9.04)   | 6.21 (2.31, 10.12)  | 2.54 (1.78, 3.28)   | 4.73 (3.93, 5.51)   | 5.22 (4.43, 6.03)   | 2.84 (2.7, 2.97)    | 4.59 (4.45, 4.73)   | 4.86 (4.73, 5)
    A02 = 0   | 2.63 (-0.16, 5.39)  | 2.15 (-0.17, 4.49)  | 1.63 (-0.66, 4)     | 0.45 (-0.52, 1.41)  | 0.58 (-0.42, 1.6)   | 0.13 (-0.89, 1.13)  | -0.08 (-0.22, 0.05) | -0.14 (-0.27, 0)    | -0.19 (-0.32, -0.04)
    A03 = 0   | 1.54 (-2.75, 5.84)  | -0.17 (-3.12, 2.76) | 0.26 (-2.62, 3.2)   | 0.27 (-0.38, 0.93)  | 0.01 (-0.72, 0.73)  | -0.17 (-0.88, 0.54) | -0.01 (-0.08, 0.06) | 0 (-0.07, 0.06)     | 0 (-0.07, 0.06)
    A04 = 0   | -3.51 (-8.17, 1.13) | 0.61 (-3, 4.3)      | 1.3 (-2.32, 5.07)   | -0.96 (-2.09, 0.2)  | -0.58 (-1.8, 0.68)  | -0.07 (-1.3, 1.16)  | -0.01 (-0.04, 0.01) | -0.02 (-0.04, 0.01) | -0.02 (-0.05, 0.01)
    A05 = 0   | 0.32 (-2.19, 2.81)  | 0 (-2.31, 2.31)     | 0.19 (-2.11, 2.49)  | 0.66 (-0.11, 1.43)  | 0.02 (-0.79, 0.84)  | -0.27 (-1.05, 0.55) | 0 (-0.06, 0.07)     | 0.01 (-0.06, 0.07)  | 0 (-0.07, 0.07)
    A06 = -3  | -0.88 (-2.96, 1.24) | -2.58 (-4.3, -0.87) | -3.17 (-4.83, -1.49)| -1.8 (-2, -1.6)     | -2.92 (-3.12, -2.71)| -3.05 (-3.25, -2.85)| -1.71 (-1.77, -1.65)| -2.75 (-2.81, -2.69)| -2.91 (-2.97, -2.85)
    A07 = 0   | 0.58 (-1.87, 3.06)  | -1.32 (-3.32, 0.7)  | -1.48 (-3.54, 0.52) | 0.05 (-0.49, 0.58)  | -0.07 (-0.66, 0.51) | -0.22 (-0.79, 0.37) | -0.02 (-0.07, 0.04) | -0.01 (-0.06, 0.05) | 0 (-0.05, 0.06)
    A08 = 0   | -0.83 (-3, 1.31)    | -0.28 (-2.31, 1.76) | -0.18 (-2.21, 1.86) | -0.48 (-1.03, 0.07) | -0.37 (-0.96, 0.23) | -0.16 (-0.75, 0.42) | -0.02 (-0.08, 0.03) | -0.01 (-0.06, 0.04) | -0.01 (-0.07, 0.04)
    A09 = 0   | 0.71 (-0.11, 1.53)  | -0.4 (-1.16, 0.38)  | -0.49 (-1.26, 0.27) | 0.33 (-0.02, 0.67)  | 0.04 (-0.3, 0.39)   | 0.09 (-0.25, 0.44)  | 0.08 (0.02, 0.13)   | 0.1 (0.04, 0.16)    | 0.12 (0.05, 0.17)
    A10 = 0   | -0.38 (-5.19, 4.24) | -2.21 (-6.44, 2.03) | -1.17 (-5.21, 2.9)  | -0.31 (-1.22, 0.61) | -0.18 (-1.1, 0.75)  | 0.06 (-0.89, 0.99)  | 0.02 (-0.13, 0.17)  | 0.05 (-0.1, 0.2)    | 0.05 (-0.1, 0.2)
    A11 = 0   | -2.93 (-8.11, 2.15) | 2.42 (-1.64, 6.48)  | 1.67 (-2.29, 5.64)  | 0.88 (0.1, 1.64)    | 0.39 (-0.4, 1.19)   | 0.33 (-0.48, 1.12)  | -0.06 (-0.2, 0.08)  | 0.11 (-0.03, 0.25)  | 0.11 (-0.04, 0.24)
    A12 = 5   | 5.96 (3.22, 8.68)   | 4.2 (1.89, 6.63)    | 4.72 (2.33, 7.09)   | 1.68 (0.72, 2.65)   | 4.56 (3.52, 5.57)   | 5.09 (4.08, 6.12)   | 2.88 (2.74, 3.02)   | 4.7 (4.56, 4.84)    | 4.98 (4.84, 5.11)
    A13 = 0   | 2.68 (-1.67, 6.93)  | -1.41 (-4.36, 1.57) | -1.88 (-4.74, 1.02) | -1.05 (-1.72, -0.38)| -0.61 (-1.33, 0.1)  | -0.74 (-1.46, -0.02)| -0.01 (-0.08, 0.05) | -0.03 (-0.1, 0.03)  | -0.03 (-0.09, 0.04)
    A14 = 0   | -1.13 (-5.84, 3.6)  | 4.67 (1.01, 8.45)   | 4.23 (0.47, 7.92)   | 1.39 (0.24, 2.55)   | 0.78 (-0.44, 2.03)  | 0.91 (-0.33, 2.14)  | 0.02 (-0.01, 0.05)  | 0.03 (0, 0.05)      | 0.03 (0, 0.05)
    A15 = 0   | 0.42 (-2.07, 2.94)  | -0.6 (-2.88, 1.77)  | -0.84 (-3.13, 1.45) | -0.22 (-1.01, 0.56) | -0.11 (-0.9, 0.7)   | -0.24 (-1.06, 0.55) | 0 (-0.07, 0.06)     | -0.01 (-0.08, 0.05) | -0.02 (-0.09, 0.04)
    A16 = 0   | -0.82 (-2.94, 1.29) | 0.12 (-1.61, 1.83)  | 0.39 (-1.26, 2)     | -0.08 (-0.28, 0.12) | -0.04 (-0.24, 0.17) | -0.02 (-0.22, 0.19) | -0.03 (-0.09, 0.03) | -0.08 (-0.14, -0.02)| -0.09 (-0.15, -0.03)
    A17 = 0   | -0.46 (-2.95, 2.04) | -3.21 (-5.28, -1.22)| -2.97 (-4.99, -0.93)| -0.71 (-1.25, -0.17)| -0.66 (-1.24, -0.08)| -0.78 (-1.35, -0.19)| 0.01 (-0.05, 0.06)  | -0.01 (-0.06, 0.04) | 0 (-0.05, 0.05)
    A18 = 0   | 0.66 (-1.5, 2.86)   | 1.33 (-0.71, 3.35)  | 1.63 (-0.41, 3.6)   | 0.65 (0.11, 1.22)   | 0.45 (-0.13, 1.04)  | 0.58 (-0.01, 1.18)  | 0.06 (0.01, 0.12)   | 0.02 (-0.03, 0.08)  | 0.03 (-0.02, 0.09)
    A19 = -3  | -2.61 (-3.44, -1.78)| -3.23 (-4.02, -2.45)| -3.47 (-4.24, -2.69)| -1.46 (-1.8, -1.12) | -2.73 (-3.07, -2.38)| -3.04 (-3.39, -2.7) | -1.75 (-1.81, -1.69)| -2.81 (-2.87, -2.75)| -2.98 (-3.04, -2.92)
    PEL       | 8.48                | 5.25                | 4.96                | 1.19                | 0.37                | 0.36                | 0.43                | 0.02                | 0.01

Table 1: Drift parameter estimates for a two dimensional cubic model with arbitrary diffusion function. On the left is the true value of the parameter. The length of the data set used for the inference is labeled as T and the observation interval is ∆ = {0.1, 0.01, 0.001}.
In each cell the paramet er is estimated from the posterior mean and in brack ets is shown the 10th-90t h percentile s of the posterior . The blue coloring is where the true val ue falls in this range. The bottom of the table shows the Poster ior Expected L oss in eac h case. 11 T = 10 T = 10 0 T = 10 00 0 . 1 0 . 01 0 . 001 0 . 1 0 . 01 0 . 001 0 . 1 0 . 01 0 . 001 A 00 = 0 2 . 18 ( − 2 . 5 , 6 . 89) − 2 . 46 ( − 8 . 46 , 3 . 52) − 2 . 59 ( − 8 . 35 , 3 . 24) − 0 . 13 ( − 1 . 07 , 0 . 77) − 0 . 68 ( − 1 . 84 , 0 . 5) − 0 . 69 ( − 1 . 85 , 0 . 49) 0 . 02 ( − 0 . 13 , 0 . 17) − 0 . 08 ( − 0 . 27 , 0 . 09) − 0 . 08 ( − 0 . 26 , 0 . 09) A 01 = 5 − 2 . 04 ( − 7 . 23 , 3 . 04) 4 . 99 ( − 0 . 7 , 10 . 61) 4 . 84 ( − 0 . 49 , 10 . 22) 2 . 54 (1 . 78 , 3 . 28) 4 . 89 (3 . 83 , 5 . 97) 4 . 76 (3 . 74 , 5 . 79) 2 . 84 (2 . 7 , 2 . 97) 5 . 04 (4 . 87 , 5 . 21) 4 . 88 (4 . 71 , 5 . 03) A 02 = 0 2 . 63 ( − 0 . 16 , 5 . 39) 0 . 61 ( − 2 . 94 , 4 . 05) 0 . 39 ( − 2 . 95 , 3 . 75) 0 . 45 ( − 0 . 52 , 1 . 41) − 0 . 06 ( − 1 . 42 , 1 . 3) − 0 . 2 ( − 1 . 49 , 1 . 13) − 0 . 08 ( − 0 . 22 , 0 . 05) − 0 . 26 ( − 0 . 44 , − 0 . 08) − 0 . 28 ( − 0 . 45 , − 0 . 09) A 03 = 0 1 . 54 ( − 2 . 75 , 5 . 84) 3 . 25 ( − 1 . 23 , 8 . 07) 3 . 26 ( − 1 . 07 , 7 . 77) 0 . 27 ( − 0 . 38 , 0 . 93) 0 . 67 ( − 0 . 34 , 1 . 67) 0 . 59 ( − 0 . 41 , 1 . 59) − 0 . 01 ( − 0 . 08 , 0 . 06) 0 . 02 ( − 0 . 06 , 0 . 1) 0 . 02 ( − 0 . 06 , 0 . 09) A 04 = 0 − 3 . 51 ( − 8 . 17 , 1 . 13) − 2 . 53 ( − 7 . 76 , 2 . 61) − 2 . 53 ( − 7 . 61 , 2 . 41) − 0 . 96 ( − 2 . 09 , 0 . 2) − 1 . 6 ( − 3 . 35 , 0 . 18) − 1 . 41 ( − 3 . 14 , 0 . 27) − 0 . 01 ( − 0 . 04 , 0 . 01) − 0 . 03 ( − 0 . 05 , 0) − 0 . 02 ( − 0 . 05 , 0) A 05 = 0 0 . 32 ( − 2 . 19 , 2 . 81) 1 . 85 ( − 1 . 6 , 5 . 2) 1 . 85 ( − 1 . 43 , 5 . 11) 0 . 66 ( − 0 . 11 , 1 . 43) 1 . 18 (0 . 03 , 2 . 32) 1 . 12 ( − 0 . 02 , 2 . 25) 0 ( − 0 . 06 , 0 . 07) 0 . 05 ( − 0 . 04 , 0 . 13) 0 . 05 ( − 0 . 04 , 0 . 13) A 06 = − 3 − 0 . 88 ( − 2 . 96 , 1 . 24) − 4 . 06 ( − 6 . 
63 , − 1 . 76) − 4 . 02 ( − 6 . 5 , − 1 . 79) − 1 . 8 ( − 2 , − 1 . 6) − 3 . 19 ( − 3 . 44 , − 2 . 94) − 3 . 06 ( − 3 . 3 , − 2 . 83) − 1 . 71 ( − 1 . 77 , − 1 . 65) − 3 . 01 ( − 3 . 08 , − 2 . 94) − 2 . 92 ( − 2 . 98 , − 2 . 85) A 07 = 0 0 . 58 ( − 1 . 87 , 3 . 06) 0 . 44 ( − 2 . 27 , 3 . 17) 0 . 53 ( − 2 . 05 , 3 . 23) 0 . 05 ( − 0 . 49 , 0 . 58) 0 . 43 ( − 0 . 37 , 1 . 27) 0 . 39 ( − 0 . 42 , 1 . 19) − 0 . 02 ( − 0 . 07 , 0 . 04) 0 . 03 ( − 0 . 04 , 0 . 09) 0 . 03 ( − 0 . 03 , 0 . 09) A 08 = 0 − 0 . 83 ( − 3 , 1 . 31) − 2 . 22 ( − 4 . 98 , 0 . 62) − 2 . 12 ( − 4 . 84 , 0 . 65) − 0 . 48 ( − 1 . 03 , 0 . 07) − 1 ( − 1 . 86 , − 0 . 14) − 0 . 92 ( − 1 . 76 , − 0 . 09) − 0 . 02 ( − 0 . 08 , 0 . 03) − 0 . 02 ( − 0 . 09 , 0 . 05) − 0 . 02 ( − 0 . 09 , 0 . 05) A 09 = 0 0 . 71 ( − 0 . 11 , 1 . 53) 1 . 29 (0 . 06 , 2 . 61) 1 . 33 (0 . 1 , 2 . 6) 0 . 33 ( − 0 . 02 , 0 . 67) 0 . 7 (0 . 17 , 1 . 2) 0 . 71 (0 . 19 , 1 . 2) 0 . 08 (0 . 02 , 0 . 13) 0 . 14 (0 . 05 , 0 . 23) 0 . 15 (0 . 06 , 0 . 23) A 10 = 0 − 0 . 38 ( − 5 . 19 , 4 . 24) − 4 . 63 ( − 10 . 46 , 0 . 84) − 4 . 46 ( − 10 . 07 , 0 . 86) − 0 . 31 ( − 1 . 22 , 0 . 61) 0 . 66 ( − 0 . 55 , 1 . 87) 0 . 6 ( − 0 . 51 , 1 . 75) 0 . 02 ( − 0 . 13 , 0 . 17) − 0 . 02 ( − 0 . 2 , 0 . 16) − 0 . 01 ( − 0 . 2 , 0 . 16) A 11 = 0 − 2 . 93 ( − 8 . 11 , 2 . 15) 2 . 18 ( − 3 . 8 , 8 . 3) 2 . 24 ( − 3 . 53 , 8 . 27) 0 . 88 (0 . 1 , 1 . 64) 0 . 1 ( − 0 . 89 , 1 . 09) 0 . 15 ( − 0 . 82 , 1 . 14) − 0 . 06 ( − 0 . 2 , 0 . 08) 0 . 08 ( − 0 . 09 , 0 . 26) 0 . 11 ( − 0 . 06 , 0 . 29) A 12 = 5 5 . 96 (3 . 22 , 8 . 68) 8 . 28 (4 . 62 , 11 . 9) 7 . 82 (4 . 39 , 11 . 36) 1 . 68 (0 . 72 , 2 . 65) 5 . 8 (4 . 52 , 7 . 07) 5 . 57 (4 . 36 , 6 . 81) 2 . 88 (2 . 74 , 3 . 02) 5 . 05 (4 . 88 , 5 . 22) 4 . 89 (4 . 72 , 5 . 05) A 13 = 0 2 . 68 ( − 1 . 67 , 6 . 93) 4 . 03 ( − 0 . 84 , 9 . 1) 4 . 08 ( − 0 . 63 , 8 . 96) − 1 . 05 ( − 1 . 72 , − 0 . 38) − 0 . 83 ( − 1 . 76 , 0 . 11) − 0 . 81 ( − 1 . 72 , 0 . 1) − 0 . 01 ( − 0 . 08 , 0 . 05) − 0 . 01 ( − 0 . 
09 , 0 . 08) − 0 . 01 ( − 0 . 09 , 0 . 08) A 14 = 0 − 1 . 13 ( − 5 . 84 , 3 . 6) 0 . 2 ( − 5 . 79 , 6 . 51) 0 . 02 ( − 5 . 85 , 6 . 12) 1 . 39 (0 . 24 , 2 . 55) 0 . 12 ( − 1 . 44 , 1 . 66) 0 . 07 ( − 1 . 43 , 1 . 61) 0 . 02 ( − 0 . 01 , 0 . 05) 0 . 03 (0 , 0 . 06) 0 . 03 (0 , 0 . 06) A 15 = 0 0 . 42 ( − 2 . 07 , 2 . 94) 1 . 49 ( − 1 . 84 , 5) 1 . 59 ( − 1 . 64 , 4 . 85) − 0 . 22 ( − 1 . 01 , 0 . 56) 0 . 15 ( − 0 . 89 , 1 . 18) 0 . 2 ( − 0 . 8 , 1 . 19) 0 ( − 0 . 07 , 0 . 06) 0 . 01 ( − 0 . 07 , 0 . 08) 0 ( − 0 . 07 , 0 . 08) A 16 = 0 − 0 . 82 ( − 2 . 94 , 1 . 29) − 3 . 07 ( − 5 . 88 , − 0 . 39) − 3 . 14 ( − 5 . 86 , − 0 . 5) − 0 . 08 ( − 0 . 28 , 0 . 12) − 0 . 15 ( − 0 . 45 , 0 . 15) − 0 . 15 ( − 0 . 45 , 0 . 15) − 0 . 03 ( − 0 . 09 , 0 . 03) − 0 . 07 ( − 0 . 15 , 0 . 02) − 0 . 08 ( − 0 . 17 , 0) A 17 = 0 − 0 . 46 ( − 2 . 95 , 2 . 04) − 1 . 37 ( − 4 . 65 , 1 . 87) − 1 . 17 ( − 4 . 45 , 1 . 91) − 0 . 71 ( − 1 . 25 , − 0 . 17) − 0 . 59 ( − 1 . 36 , 0 . 19) − 0 . 57 ( − 1 . 31 , 0 . 17) 0 . 01 ( − 0 . 05 , 0 . 06) 0 . 05 ( − 0 . 02 , 0 . 12) 0 . 05 ( − 0 . 03 , 0 . 12) A 18 = 0 0 . 66 ( − 1 . 5 , 2 . 86) 0 . 3 ( − 2 . 65 , 3 . 15) − 0 . 01 ( − 2 . 73 , 2 . 67) 0 . 65 (0 . 11 , 1 . 22) 0 . 24 ( − 0 . 51 , 0 . 99) 0 . 18 ( − 0 . 54 , 0 . 92) 0 . 06 (0 . 01 , 0 . 12) 0 . 02 ( − 0 . 04 , 0 . 08) 0 . 02 ( − 0 . 04 , 0 . 08) A 19 = − 3 − 2 . 61 ( − 3 . 44 , − 1 . 78) − 4 . 44 ( − 5 . 58 , − 3 . 38) − 4 . 1 ( − 5 . 16 , − 3 . 13) − 1 . 46 ( − 1 . 8 , − 1 . 12) − 3 . 09 ( − 3 . 53 , − 2 . 66) − 2 . 94 ( − 3 . 36 , − 2 . 55) − 1 . 75 ( − 1 . 81 , − 1 . 69) − 3 . 06 ( − 3 . 14 , − 2 . 99) − 2 . 97 ( − 3 . 04 , − 2 . 9) 8.48 11.06 10.32 1.19 0.76 0.68 0. 43 0.01 0.01 T able 2: Drift paramete r estimate s for a t wo dimen sional cubic model with arbitra ry di ff usion func tion. On the l eft is the true v alue of the para meter . The data used is the same as that of T able 1 sampled at the ∆ = 0 . 1 interv al. In this case data is impute d to obtain the interv als ∆ = { 0 . 
01 , 0 . 001 } . The bottom of the table shows the Posterio r Expected Loss in each case. 12 Figure 2: Margi nal distributi ons of the cubic paramete rs inferred from a data set wit h N = 1 , 000 observ ations at interv al ∆ = 0 . 1. T he blue histogram shows the parameters that gav e stable soluti ons to the SDE, while the red is for those that gav e unstable solutions. The true v alues are gi ven by the red lines. 13 Algorithm 3 Sample parameter s along diagonal for i = 1 to n do U i = 0 for j = i to n do x = − P j k , i k = 1 ( − 1) i + k M ik | M ( j ) {− i } , {− k } | ! / | M ( j ) {− i } , {− i } | end for if x < U i then U i = x end if M ii ∼ N − ( µ i , U i , σ 2 i ) end for are function s of the o ther param eters. Th ese coe ffi cients are found to be a ( k ) i j = −| M ( k ) / { i , j } ,/ { i , j } | (24a) b ( k ) i j = ( − 1) i + j j − 1 X k , i k = 1 M jk ( − 1) j − 1 + k | M ( k ) / { i , j } ,/ { j , k } | (24b) + ( − 1) i + j N X k , i k = j + 1 M jk ( − 1) j + k | M ( k ) / { i , j } ,/ { j , k } | (24c) + ( − 1) i + j i − 1 X k , j k = 1 M ik ( − 1) i − 1 + k | M ( k ) / { i , j } ,/ { i , k } | (24d) + ( − 1) i + j N X k , j k = i + 1 M ik ( − 1) i + k | M ( k ) / { i , j } ,/ { i , k } | (24e) c ( k ) i j = N X k , j k = 1 M ik ( − 1) i + k              k − 1 X l , i l = 1 M jl ( − 1) j − 1 + l | M ( k ) / { i , j } ,/ { l , k } | + (24f ) N X l , i l = k + 1 M jl ( − 1) j + l | M ( k ) / { i , j } ,/ { l , k } |              , (24g) where | M ( k ) / { i , j } ,/ { l , k } | r epresents the k th p rincipal minor with rows i and j and c olumns l an d k removed. For each compon ent M i j this qua dratic form can be solved to g iv e upp er and lower bound s on the parameter . The matrix M can b e cycled throug h updating each parameter in turn. Algorithm 4 describ es the sampling of o ff -diago nal elements using the coe ffi cients in Eq. (24g). 
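The admissible interval for a diagonal entry can also be found numerically, which makes the idea of Algorithm 3 easy to prototype without the closed-form minor expressions. The sketch below is a numerical stand-in, not the paper's implementation: all function names are illustrative, the bound U_i is located by bisection on the eigenvalues (using the fact that the eigenvalues of a symmetric matrix are nondecreasing in M[i,i]), and the one-sided truncated normal draw uses plain rejection.

```python
import numpy as np

def diag_upper_bound(M, i, lo=-50.0, hi=50.0, tol=1e-8):
    """Largest admissible value of M[i,i] keeping M negative definite.

    Numerical stand-in for the closed-form bound U_i of Algorithm 3:
    eigenvalues are monotone in M[i,i], so the admissible set is an
    interval (-inf, U_i) found here by bisection."""
    def neg_def(x):
        A = M.copy()
        A[i, i] = x
        return bool(np.all(np.linalg.eigvalsh(A) < 0.0))
    assert neg_def(lo), "lower bracket must already be negative definite"
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if neg_def(mid):
            lo = mid
        else:
            hi = mid
    return lo

def sample_diag(M, i, mu, sigma, rng):
    """Draw M[i,i] ~ N(mu, sigma^2) truncated to (-inf, U_i) by simple
    rejection, so the updated M stays negative definite."""
    U = diag_upper_bound(M, i)
    while True:
        x = rng.normal(mu, sigma)
        if x < U:
            M[i, i] = x
            return x

rng = np.random.default_rng(0)
M = -np.eye(3)  # start from a negative definite stability matrix
x = sample_diag(M, 0, mu=-1.0, sigma=0.5, rng=rng)
print(np.all(np.linalg.eigvalsh(M) < 0))  # True: M stays negative definite
```

The closed-form minors of Eq. (24) serve the same role as the bisection here, at much lower cost per update, which is what makes the component-wise sampler practical for larger matrices.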
Here, the notation N₊₋(μ, u⁻, u⁺, σ²) refers to the doubly truncated normal distribution with mean μ, left truncation u⁻, right truncation u⁺ and standard deviation σ.

Algorithm 4: Sample parameters off the diagonal

    for i = 1 to n do
        for j = i + 1 to n do
            u⁺ = ∞, u⁻ = −∞
            for k = j to n do
                Calculate a^{(k)}_{ij}, b^{(k)}_{ij} and c^{(k)}_{ij} and solve a^{(k)}_{ij} x² + b^{(k)}_{ij} x + c^{(k)}_{ij} = 0.
                Set mn = min(x₁, x₂) and mx = max(x₁, x₂)
                if mx < u⁺ then u⁺ = mx
                if mn > u⁻ then u⁻ = mn
            end for
            M_ij ∼ N₊₋(μ_ij, u⁻, u⁺, σ²_ij)
        end for
    end for

To simulate from truncated normal distributions one can use the inverse cumulative distribution function (CDF) method: one simply calculates the CDF values at the lower and upper boundaries, draws a uniform random variable between these numbers, and inverts the CDF to obtain a random variable from the normal distribution restricted to this region. For our problem we instead use the rejection sampler proposed by (34), which draws uncorrelated samples directly from the target density. Rejection sampling from a distribution h(x) is based on a proposal distribution g(x) such that h(x) ≤ C g(x) holds for some constant C over all of the support of h(x). For a one-sided truncated normal the exponential distribution is a good proposal: it is first translated to coincide with the truncation point, and the rate parameter is then optimized to closely match the tail of the normal distribution,

$$g(z; \alpha, \mu^-) = \alpha \exp(-\alpha (z - \mu^-))\, \mathbb{1}_{z \geq \mu^-}. \tag{25}$$

The optimal value of α is obtained by maximizing the expected acceptance probability and is shown to be

$$\alpha^*(\mu^-) = \frac{\mu^- + \sqrt{(\mu^-)^2 + 4}}{2}. \tag{26}$$

More details are given in (34). We performed a numerical study to compare the standard normal and exponential proposals.
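The one-sided sampler of (34) is short enough to state in full. The sketch below is an illustrative implementation for a standard normal truncated to [μ⁻, ∞) with μ⁻ ≥ 0 (function names are our own): the proposal is the translated exponential of Eq. (25) with the optimal rate of Eq. (26), and a proposed z is accepted with probability exp(−(z − α)²/2).

```python
import math
import random

def alpha_star(mu):
    """Optimal exponential rate, Eq. (26): maximizes the expected
    acceptance probability of the translated-exponential proposal."""
    return 0.5 * (mu + math.sqrt(mu * mu + 4.0))

def rtruncnorm(mu, rng=random):
    """Standard normal truncated to [mu, inf), mu >= 0, via the
    rejection sampler of Robert (1995), Eqs. (25)-(26)."""
    a = alpha_star(mu)
    while True:
        z = mu + rng.expovariate(a)  # draw from g(z; a, mu)
        # accept with probability exp(-(z - a)^2 / 2)
        if rng.random() <= math.exp(-0.5 * (z - a) ** 2):
            return z

random.seed(1)
draws = [rtruncnorm(2.0) for _ in range(20000)]
print(min(draws) >= 2.0)  # True: every sample respects the truncation
```

For a general N(μ, σ²) truncated on one side, one would standardize first, sample, and transform back; the doubly truncated case additionally needs the uniform proposal discussed next.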
The efficiency of proposing x from the standard normal and then accepting if x > μ⁻ falls to approximately 0.023, while for the optimized exponential proposal it is approximately 0.5. For the doubly truncated normal one uses either an exponential or a uniform distribution as a proposal, depending upon the size of the truncated region. If

$$u^+ > u^- + \frac{2\sqrt{e}}{u^- + \sqrt{(u^-)^2 + 4}} \exp\!\left(\frac{(u^-)^2 - u^-\sqrt{(u^-)^2 + 4}}{4}\right),$$

then it can be shown that the exponential is more efficient, otherwise the uniform is better (34). Fig. 3 shows both the uniform and exponential approximations for both cases.

Figure 3: Doubly truncated normal distribution. The top figure has u⁻ = 2 and u⁺ = 3 and is better approximated with the exponential distribution. The bottom figure has u⁻ = 2 and u⁺ = 2.5 and the uniform is more efficient.

We use Algorithms 3 and 4, along with the methods of sampling truncated normal variables, to sample the stability matrix M. We tested this algorithm on a three dimensional model with T = 100. In this case the dimension of M is n = D(D + 1)/2 = 6. Note that this MCMC algorithm still mixes well. Fig. 4 compares the posterior distributions estimated from the Component Wise Algorithm and the standard Gibbs sampler. Notice the large differences between the distributions in each case. Similar results (not shown) are obtained for the off-diagonal parameters using Algorithm 4.

4. Results

4.1. Deterministic Double Well Potential Model

The first conceptual climate model we consider is a cubic model coupled to the chaotic Lorenz system (29).
It is fully deterministic and consists of a slow variable, which can be thought of as representing a climate process, and three fast variables, which can be thought of as representing chaotic weather fluctuations. The slow variable moves inside a double well potential and is perturbed by the chaotic Lorenz system, which acts effectively as noise when ǫ → 0. The equations are as follows:

$$\frac{dx}{dt} = x - x^3 + \frac{4}{90\epsilon} y_2, \tag{27a}$$
$$\frac{dy_1}{dt} = \frac{10}{\epsilon^2}(y_2 - y_1), \tag{27b}$$
$$\frac{dy_2}{dt} = \frac{1}{\epsilon^2}(28 y_1 - y_2 - y_1 y_3), \tag{27c}$$
$$\frac{dy_3}{dt} = \frac{1}{\epsilon^2}\left(y_1 y_2 - \frac{8}{3} y_3\right). \tag{27d}$$

Sample paths are displayed in Fig. 5. We now fit a one dimensional cubic SDE to the data. We consider the general cubic form (26)

$$dX_t = (a_1 + a_2 X_t + a_3 X_t^2 + a_4 X_t^3)\,dt + \sigma\,dW_t \tag{28}$$

and estimate all of the parameters {a₁, a₂, a₃, a₄, σ} from sparse observations of the system, again using ∆ = 10.0 and N = 1000. To update the drift parameters we use the Gibbs sampler of Section 3.2.1. The estimated posterior distributions are shown in Fig. 6. A lot of imputed data is needed before the estimates start to converge towards the values predicted by homogenization, but the inference demonstrates that there is enough information in the sparse data set if the likelihood is well approximated.

Figure 7 shows the predictive skill of the one dimensional reduced model for σ estimated for various m. We use the empirical mean estimate for the parameter values. Figures 7a and 7b show that the reduced model can reproduce the double well distribution of the full model, although the separation of the wells is underestimated for m = 2 and m = 4 due to the larger noise. For m ≥ 8 the model reproduces the full model's marginal distribution for x well. It is not clear whether there is much difference between m = 8 and m = 64.
However, observing Figures 7c and 7d we see that the autocorrelation function of the full model is much better approximated when m = 64. This shows that the ability of our framework to impute data is a powerful way of deriving accurate reduced order models.

Figure 4: Estimated posterior distributions for parameters from a two dimensional model of the form Eq. 3 with N = 100 and ∆ = 0.1. The parameters, which are randomly generated, are written in the matrix notation introduced in Section 3.2.1. The histograms are the posterior distributions with uninformative prior; in red are the posterior distributions for parameters with stable SDEs and in black are the posterior distributions which include the stability matrix prior information derived in this chapter.

Figure 5: Example path of x of the chaotic Lorenz system, Eq. (27).

4.2. Model Reduction for Triad Systems

Now we apply our model fitting procedure to a triad model: a high dimensional deterministic system with two slow climate variables coupled to fast chaotic dynamics. The reduction strategy has two challenges: to successfully approximate the deterministic variables by a stochastic process and to be insensitive to a lack of time scale separation. The full system is given by

$$\frac{dx_1}{dt} = \frac{b_1}{\epsilon} x_2 y_1, \tag{29a}$$
$$\frac{dx_2}{dt} = \frac{b_2}{\epsilon} x_1 y_1, \tag{29b}$$
$$\frac{dy_k}{dt} = \frac{b_3}{\epsilon} x_1 x_2 \delta_{1,k} - \mathrm{Re}\,\frac{ik}{2\epsilon^2} \sum_{p+q+k=0} \hat{u}^*_p \hat{u}^*_q, \tag{29c}$$
$$\frac{dz_k}{dt} = -\mathrm{Im}\,\frac{ik}{2\epsilon^2} \sum_{p+q+k=0} \hat{u}^*_p \hat{u}^*_q, \tag{29d}$$

where u_k = y_k + iz_k. This system is stable provided that the energy is conserved: b₁ + b₂ + b₃ = 0. We use the values b = {0.9, −0.5, −0.4} (our results are insensitive over a wide range of parameter values) and we choose a cut off of Λ = 50. Sample paths are displayed in Fig. 8. We are interested in eliminating y, leaving equations for just x₁ and x₂. The small parameter ǫ represents the time scales within the system.
The variables y have the fastest time scale, of order O(1/ǫ²), compared to O(1/ǫ) for x₁ and x₂. As ǫ → 0 we can use the method of homogenization for SDEs to eliminate the fast variables, which gives

$$dx_1(t) = \frac{b_1}{\gamma}\left(b_3 x_2^2(t) + \frac{\sigma^2}{2\gamma} b_2\right) x_1(t)\,dt + \frac{\sigma}{\gamma} b_1 x_2(t)\,dW_t, \tag{30a}$$
$$dx_2(t) = \frac{b_2}{\gamma}\left(b_3 x_1^2(t) + \frac{\sigma^2}{2\gamma} b_1\right) x_2(t)\,dt + \frac{\sigma}{\gamma} b_2 x_1(t)\,dW_t, \tag{30b}$$

where unknown parameters σ and γ have been introduced. Here we estimate them using Algorithms 1 and 2 from observations of the climate variables alone. For convenience we consider the inference problem where the model Eq. (30) is driven by two independent Brownian motions.

Figure 6: Posterior distribution estimates from MCMC output applied to a sparse data set (∆ = 10). Different distributions correspond to increasing amounts of missing data. The distribution in brown, for m = 64, agrees with the theoretical values predicted by the homogenization procedure. In the model simulation ǫ = 0.01 has been used. Black line: m = 1; red: m = 2; green: m = 4; blue: m = 8; light blue: m = 16; magenta: m = 32; brown: m = 64. Panels: (a) a₁ (theoretical value 0), (b) a₂ (theoretical value 1), (c) a₃ (theoretical value 0), (d) a₄ (theoretical value −1), (e) σ² (theoretical value 0.11).

Figure 7: Predictive statistics for the reduced double well model coupled to the chaotic Lorenz system, Eq. (27), for two values of ǫ. In each plot the lines correspond to the inferred one dimensional model for different m. (a) Invariant distribution with ǫ = 0.1; the histogram is for the full system. (b) Invariant distribution with ǫ = 0.01; the histogram is for the full system. (c) Autocorrelation function for ǫ = 0.1; bars are for the full system. (d) Autocorrelation function for ǫ = 0.01; bars are for the full system.

Figure 8: Example path of x₁ (black line) and x₂ (red line) from the triad model, Eq. (29).

Posterior estimates for the case ǫ = 0.8 are shown in Fig. 9. This value corresponds to a moderately small, though realistic (14), amount of time scale separation. As Fig. 9 demonstrates, increasing the number of imputed data leads to a convergence and, thus, to an improvement of the posterior estimates.

We apply the inference to a data set with total time T = 500 and observation interval ∆ = 0.1. We simulate the system for ǫ = {0.1, 0.25, 0.5, 0.8, 1.0}. Fig. 10 shows the predictive probability densities and autocorrelation functions for Eq. 30. In each case the data is simulated from the full model Eq. 29, then the parameters are estimated using the reduced model Eq. 30, and this reduced model is simulated to calculate the predictive statistics. The posterior mean estimates were computed for m = 16 missing data values (it was verified that the posteriors for m = 16 and m < 16 gave consistent estimates). The reduced model Eq. 30 with empirical parameter estimates is also plotted; this model is referred to as the reduced model in Figure 10. The reduced model is able to reproduce the non-trivial shape of the PDF very well. This suggests that reduced order models fitted from observed data can be used for extreme value studies (16).
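Simulating a fitted reduced model to obtain predictive statistics reduces, in the one dimensional case of Eq. (28), to a straightforward Euler-Maruyama integration. The sketch below uses the theoretical homogenization values quoted in Fig. 6 (a₁ = 0, a₂ = 1, a₃ = 0, a₄ = −1, σ² = 0.11); the function name, step size and trajectory length are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def euler_maruyama_cubic(a, sigma, x0=0.0, dt=1e-3, n_steps=500_000, seed=0):
    """Simulate dX_t = (a1 + a2 X + a3 X^2 + a4 X^3) dt + sigma dW_t
    (the reduced cubic model, Eq. 28) with the Euler-Maruyama scheme."""
    rng = np.random.default_rng(seed)
    x = np.empty(n_steps)
    x[0] = x0
    dw = rng.normal(0.0, np.sqrt(dt), n_steps - 1)
    for k in range(n_steps - 1):
        drift = a[0] + a[1] * x[k] + a[2] * x[k] ** 2 + a[3] * x[k] ** 3
        x[k + 1] = x[k] + drift * dt + sigma * dw[k]
    return x

# Theoretical homogenization values from Fig. 6: a = (0, 1, 0, -1), sigma^2 = 0.11
path = euler_maruyama_cubic(a=(0.0, 1.0, 0.0, -1.0), sigma=np.sqrt(0.11))
# The path settles into the double-well region around x = +/-1, from which
# stationary densities and autocorrelation functions can be estimated.
```

Stationary densities then follow from a histogram of the path and autocorrelation functions from the sample autocovariance, which is how curves like those in Figs. 7 and 10 are typically produced.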
The autocorrelation functions have been collapsed onto the reduced model by rescaling the output interval of the prediction by the corresponding value of ǫ; this has been done for convenience of displaying the results. The data collapse is very good for all model simulations, with the models with ǫ < 0.5 being closest to the reduced model. This implies that the parameter estimates for each case are partially compensating for the changing time scale separation. This provides evidence for the potential of using reduced order modelling strategies even in systems with only moderate time scale separation.

5. Summary

Here we developed a systematic Bayesian framework for the inference of the parameters of SDEs constrained by the physics of the underlying system. The physical constraints not only constrain the parameter space but also enforce global stability of the reduced order models. For climate models we derive a constraint based on energy conservation which ensures global stability of the effective SDE. This constraint takes the form of a negative definite matrix. We then develop a new algorithm for the sampling of negative definite matrices. We also develop a new algorithm for imputing data and show that imputing data improves the accuracy of the parameter estimation considerably. We demonstrated its power successfully on two conceptual climate models.

While we focused on climate models in this study, our method is general enough that it can also be applied to other areas of fluid dynamics. Furthermore, many other physical systems obey conservation laws and, thus, stability conditions can be derived which will be useful for parameter estimation procedures.
Figure 9: Posterior estimates of drift parameters for the two dimensional cubic model fitted to the triad-Burgers equation for ǫ = 0.8. Panels (a)-(t) show the posteriors for A₁,₁ through A₂,₁₀; in each panel the curves are labelled 1, 2, 4, 8, corresponding to different m.

Figure 10: Output statistics comparing the reduced model Eq. 30, with empirical models for various ǫ values, with the full model. (a) Stationary distribution for x₁; (b) autocorrelation function for x₁; (c) stationary distribution for x₂; (d) autocorrelation function for x₂. Curves: reduced model and ǫ = 0.1, 0.25, 0.5, 0.8, 1.0.

Acknowledgments

We thank three anonymous reviewers whose comments improved earlier versions of this manuscript. This study received funding from the Natural Environment Research Council, the Engineering and Physical Sciences Research Council and the Deutsche Forschungsgemeinschaft (DFG) through the cluster of excellence CliSAP.

References

[1] Achatz, U. and G. Branstator, A two-layer model with empirical linear corrections and reduced order for studies of internal climate variability. J. Atmos. Sci., 56, (1999) 3140-3160.
[2] Ait-Sahalia, Y., Maximum likelihood estimation of discretely sampled diffusions: A closed-form approximation approach. Econometrica, 70, (2002) 223-262.
[3] Ait-Sahalia, Y., Closed-form likelihood expansions for multivariate diffusions. Annals Stat., 36, (2008) 906-937.
[4] Barczy, M. and P. Kern, Representations of multidimensional linear process bridges. Rand. Oper. Stoch. Eq., 21, (2013) 159-189.
[5] Beskos, A. and G. O. Roberts, Exact simulation of diffusions. Ann. Appl. Prob., 15, (2005) 2422-2444.
[6] Beskos, A., O. Papaspiliopoulos and G. O. Roberts, A factorisation of diffusion measure and finite sample path constructions. Meth. Comp. Appl. Prob., 10, (2008) 85-104.
[7] Chib, S., M. Pitt and N. Shephard, Likelihood based inference for diffusion driven state space models. Working Paper, Nuffield College, Oxford University, (2004).
[8] Crommelin, D. and E. Vanden-Eijnden, Reconstruction of diffusions using spectral data from timeseries. Comm. Math. Sci., 4, (2006) 651-668.
[9] Dargatz, C., Bayesian Inference for Diffusion Processes with Applications in Life Sciences. PhD thesis, Fakultät für Mathematik, Informatik und Statistik, Ludwig-Maximilians-Universität München, (2010).
[10] Durham, G. B. and A. R. Gallant, Numerical techniques for maximum likelihood estimation of continuous-time diffusion processes. J. Bus. Econ. Statist., 20, (2002) 297-316.
[11] Eaton, M. L., Multivariate Statistics: A Vector Space Approach. Lecture Notes-Monograph Series, Volume 53.
[12] Eraker, B., MCMC analysis of diffusion models with application to finance. J. Bus. Econ. Statist., 19, (2001) 177-191.
[13] Elerian, O. S., Simulation estimation of continuous-time models with applications to finance. PhD thesis, Nuffield College, Oxford, (1999).
[14] Franzke, C., A. J. Majda and E. Vanden-Eijnden, Low-Order Stochastic Mode Reduction for a Realistic Barotropic Model Climate. J. Atmos. Sci., 62, (2005) 1722-1745.
[15] Franzke, C. and A. J. Majda, Low-Order Stochastic Mode Reduction for a Prototype Atmospheric GCM. J. Atmos. Sci., 63, (2006) 457-479.
[16] Franzke, C., Predictability of Extreme Events in a Nonlinear Stochastic-Dynamical Model. Phys. Rev. E, 85, (2012) DOI: 10.1103/PhysRevE.85.031134.
[17] Friedrich, R., S. Siegert, J. Peinke, St. Lück, M. Siefert, M. Lindemann, J. Raethjen, G. Deuschl and G. Pfister, Extracting model equations from experimental data. Phys. Lett. A, 271, (2000) 217-222.
[18] Golightly, A. and D. J. Wilkinson, Bayesian inference for nonlinear multivariate diffusion models observed with error. Comp. Stat. Data Anal., 52, (2008) 1674-1693.
[19] Harlim, J., A. Mahdi and A. J. Majda, An ensemble Kalman filter for statistical estimation of physics constrained nonlinear regression models. J. Comp. Phys., 257, (2014) 782-812.
[20] Heinz, S., Statistical mechanics of turbulent flows. Springer Verlag Berlin, (2014) 240pp.
[21] Horenko, I., E. Dittmer and C. Schütte, Reduced stochastic models for complex molecular systems. SIAM Comp. Vis. Sci., 9, (2005) 89-102.
[22] Kondrashov, D., S. Kravtsov and M. Ghil, Empirical mode reduction in a model of extratropical low-frequency variability. J. Atmos. Sci., 63, (2006) 1859-1877.
[23] Majda, A. J., I. Timofeyev and E. Vanden-Eijnden, Models for stochastic climate prediction. Proc. Nat. Acad. Sci. USA, 96, (1999) 14687-14691.
[24] Majda, A. J., I. Timofeyev and E. Vanden-Eijnden, A mathematical framework for stochastic climate models. Commun. Pure Appl. Math., 54, (2001) 891-974.
[25] Majda, A. J., C. Franzke and B. Khouider, An applied mathematics perspective on stochastic modelling for climate. Phil. Trans. R. Soc. A, 366, (2008) 2429-2455.
[26] Majda, A. J., C. Franzke and D. Crommelin, Normal forms for reduced stochastic climate models. Proc. Nat. Acad. Sci. USA, 106, (2009) 3649-3653. doi: 10.1073/pnas.0900173106.
[27] Majda, A. J. and Y. Yuan, Fundamental limitations of ad hoc linear and quadratic multi-level regression models for physical systems. Disc. Con. Dyn. Sys., 17, (2012) 1333-1363.
[28] Majda, A. J. and J. Harlim, Physics constrained nonlinear regression models for time series. Nonlinearity, 26, (2013) 201-217.
[29] Mitchell, L. and G. A. Gottwald, Data Assimilation in Slow-Fast Systems Using Homogenized Climate Models. J. Atmos. Sci., 69, (2012) 1359-1377.
[30] Ozaki, T., A bridge between nonlinear time series models and nonlinear stochastic dynamic systems: A local linearization approach. Stat. Sinica, 2, (1992) 113-135.
[31] Pavliotis, G. A. and A. M. Stuart, Multiscale methods. Springer Verlag, (2008) 310pp.
[32] Peavoy, D., Methods of likelihood based inference for constructing stochastic climate models. PhD thesis, University of Warwick, (2013) 232pp.
[33] Pedersen, A. R., A new approach to maximum likelihood estimation for stochastic differential equations based on discrete observations. Scand. J. Statist., 22, (1995) 55-71.
[34] Robert, C., Simulation of truncated normal variables. Stat. Comp., 5, (1995) 121-125.
[35] Roberts, G. O., A. Gelman and W. R. Gilks, Weak convergence and optimal scaling of random walk Metropolis algorithms. Ann. App. Prob., 7, (1997) 110-120.
[36] Roberts, G. O. and O. Stramer, On inference for partially observed nonlinear diffusion models using the Metropolis-Hastings algorithm. Biometrika, 88, (2001) 603-621.
[37] Selten, F. M., An efficient description of the dynamics of barotropic flow. J. Atmos. Sci., 52, (1995) 915-936.
[38] Shephard, N. and M. K. Pitt, Likelihood analysis of non-Gaussian measurement time series. Biometrika, 84, (1997) 653-667.
[39] Shoji, I. and T. Ozaki, Estimation for nonlinear stochastic differential equations by a local linearization method. Stoch. Anal. Appl., 16, (1998) 733-752.
[40] Siegert, S., R. Friedrich and J. Peinke, Analysis of data sets of stochastic systems. Phys. Lett. A, 243, (1998) 275-280.
[41] Yuan, Y. and A. J. Majda, Invariant measures and asymptotic Gaussian bounds for normal forms of stochastic climate models. Chin. Ann. Math., 32, (2011) 343-368.
