Importance Sampling for Multiscale Diffusions

Imp ortance Sampling for Multis cale Diﬀusions P aul Dupuis ∗ Konstan tinos Spiliop oulos † Hui W ang ‡ No v em b er 27, 2024 Abstract W e construct impor t ance sampling sc hemes for sto c hastic diﬀerential eq ua tions with small noise a nd fast oscilla tin g co eﬃcien ts. Standard Monte Carlo metho ds p erform po orly f or these p ro blems in the small noise limit. With mult isca le pr ocesses there are a dditio nal complications, and indeed the stra igh tforward adaptation of metho ds for standard small noise diﬀusions will not produce eﬃcient schemes. Using the subsolution approach w e construct schemes and identif y c o nditions under which the schemes will be asymptotically optimal. Examples and s im ulation res ults are pr o vided. Keyw ords : imp ortance sampling, Mon te Carlo, h o mogenization, m ultiscale, rough en- ergy landscap e AMS : 60F05 , 60F10, 60G60 1 In tro duction In this pap er we study eﬃcient imp ortance sampling sc hemes for simulating rare eve nts asso ci ated w i th the d -dimensional sto c hastic diﬀeren tial equation (SDE) dX ǫ ( t ) =  ǫ δ b  X ǫ ( t ) , X ǫ ( t ) δ  + c  X ǫ ( t ) , X ǫ ( t ) δ  dt + √ ǫσ  X ǫ ( t ) , X ǫ ( t ) δ  dW ( t ) , X ǫ (0) = x, (1) where δ = δ ( ǫ ) ↓ 0 and ǫ δ → ∞ as ǫ ↓ 0 , (2) and W ( t ) is a standard d -dimensional Wiener p rocess. The functions b ( x, y ) , c ( x, y ) and σ ( x, y ) are assumed to b e smo oth in eac h v a riable and p erio dic with p eriod λ in ev ery ∗ Division of A p plied Mathematics , Bro wn Universit y , Providence, RI 02912 (du puis@dam.bro wn.edu). Researc h sup ported in part by the National S cie nce F oundation (D M S-1008331), the Departmen t of Energy (DE-SCOO02413), and th e Army Researc h Oﬃce (W911NF-09-1-015 5). † Division of Applied Mathematics, Brown Un iversit y , Providence, RI 029 12 (kspiliop@dam.bro wn.edu). Researc h supp orted in part by the Department of Energy (DE-SCOO02413). ‡ Division of A pplied Mathematics, Bro wn Universit y , Providence, RI 02912 (huiw ang@dam.brown.edu). Researc h sup ported in part by the National S cie nce F oundation (D M S-1008331), the Departmen t of Energy (DE-SCOO02413). 1 P . Dupu i s, K. Spiliop oulos, H. W ang Imp ortance Sampling for Multiscale Diﬀusions direction with resp ect to the second v ariable. The extension to the ﬁrst order Langevin equation mo del with non-p erio dic random en vironmen t will also b e d i scussed . The need to simulat e rare ev ents o ccurs in man y application areas, including telecomm u- nication, ﬁnance, insu rance, and computational physics and c hemistry . Ho w ev er, virtually an y simulat ion pr ob lem inv olving r a re even ts will h a v e a num b er of mathematical and com- putational c hallenges. As it is w ell kno w n , standard Monte Carlo sampling tec h niques p erform very p o orly in that the relativ e errors u nder a ﬁxed computational eﬀort gro w rapidly as the even t b ecomes more and more rare. Esti mating rare ev en t probabilities in the con text of d iﬀusion p rocesses with fast oscillating co eﬃcien ts p resen ts extra diﬃculties due to the additional small parameter δ and its in teraction with the in tensit y of the noise ǫ . A p oten tial application of the metho ds presen ted in this pap er is to c hemical physics and b io logy , such as problems inv olving the folding and binding kinetics of pr o teins. These mo dels u sually inv olv e rugged p oten tial surf a ces of a complex hierarc hical structure with p oten tial m inima within p o ten tial minima, separated by barriers of v arying heigh ts d ue to the presence of multiple energy scales. In some cases, one can app ro ximate the dynamics b y a diﬀusion in a r o ugh p oten tial where a smo oth p oten tial function is sup erimp osed b y a rough function (see Figures 1 and 2). A representa tiv e, but by no means complete, list of references is [2 , 8, 26, 32, 34, 44, 47]. It turns out that these mo dels often can b e appro ximated by h o mogenized sys tems w here the eﬀect of th e multisca le n a ture is p a rtially captured by the eﬀectiv e diﬀusivit y of the system. T he f ormulas for the eﬀe ctiv e diﬀusivity in the aforemen tioned c hemistry and biology literature coincide with those pro duced by the appro ximation via homogenizat ion, which justiﬁes our assumption that ǫ and δ are related according to (2). Note that the condition (2) corresp onds to Regime 1 in [13], wher e the sample path large deviation p roperties of multi scale diﬀusions are studied un d er v arious regimes. The aim of this pap er is to presen t a more eﬃcien t approac h to the s a mpling pr o blem for m u lt iscale diﬀusions. Using the large deviation and weak conv ergence r esults from [1 3], w e sho w how to construct asymptotically optimal imp ortance sampling schemes with r ig- orous b ounds on p erformance. The construction is based on subsolutions for an asso ci ated partial diﬀerenti al equatio n as in [15]. How ever, it becomes applicable only after signiﬁcan t mo diﬁcatio ns that take in to consideration the multisca le asp ect of the mo del. More pr e - cisely , changes of measure that are purely b ased on the homogenized system and directly suggested by its asso c iated partial diﬀerentia l equation do n o t lead to eﬃcien t imp ortance sampling schemes. Instead, appr o priate m odiﬁcations in vo lving the solution to a so-calle d auxiliary “cell problem” hav e to b e made in order to ac hiev e asymptotic optimalit y . Th is is consisten t with th e large devia tions results o btained in [1 3], where a change of measure (or equiv alen tly a con trol) in partial feedbac k form has to b e used to p ro ve a large deviation lo w er b ound. By “partial feedbac k” w e mean th a t the change of measure is a fu nctio n of the fast v ariable X ǫ /δ . In the presen t paper a co ntrol in full feedbac k form, i.e., a function of b oth the slow v ariable X ǫ and th e fast v ariable X ǫ /δ , w il l b e used to construct dynamic imp ortance sampling sc hemes w i th precise asymptotic p erformance b ounds. T o the b est of our knowle dge, this is the ﬁ rst w ork to address the design of asymptot- ically optimal imp ortance samplin g sc hemes for m ultiscale diﬀus ions . Related imp ortance sampling problems for regular small noise diﬀusions without fast oscillatio ns ha v e b een re- 2 P . Dupu i s, K. Spiliop oulos, H. W ang Imp ortance Sampling for Multiscale Diﬀusions cen tly consid e red in [46], wh e re the sc h emes are based on the solution to the corresp onding Hamilton-Jaco bi-Bellman (HJB) equ a tion, and in [24]. The presen t w ork is also related to the theory of homogenizati on of HJB equations [1, 9, 19, 25, 31, 3 3]. The pap er is organized as follo w s. In Section 2 w e r eview the concept of im p ortance sampling and the role of sub sol utions to certain HJB equation for small noise diﬀusions without multisca le features. In Section 3, we introd uce assum ptio ns, notation and review the large deviati ons results that w e use for (1). F urthermore, w e explain wh y the stand a rd construction of imp ortance sampling schemes based on the h omogenized system fails in the multi scale setting. Th e main theorem and its pro of are presen ted in S ec tion 4, wher e the correct c hange of measure is iden tiﬁed. In Section 5 w e apply the general results to ﬁrst order Langevin equations and deriv e some useful explicit f o rmulas. Extensions to an equation with ran d om en vironment are discussed in Section 6. W e rep ort simulatio n resu lt s in Section 7 for b oth the p eriodic and random cases in one dimension. Th e computational c hallenges that one faces when sim ulating tra jectories of multisca le diﬀusions are discussed in App endix A. 2 Imp ortance Sampling and Subsolutions In th i s section we review some kno w n results on imp ortance sampling for s m a ll noise diﬀu- sions without the m ultiscale feature. In particular, we d isc uss ho w sub solutio ns to a related HJB equation can b e used to design and analyze imp ortance sampling schemes for su c h systems. The purp ose of these discussions is not only to in tro duce some basic concepts in imp ortance sampling and subsolutions, b u t also to set the stage for discussions on wh y the standard pro cedure is n o t directly applicable to m ultiscale diﬀusion mod e ls. 2.1 Preliminaries on Imp ortance Sampling Let { X ǫ , ǫ > 0 } b e a d -dimensional small noise diﬀusion for whic h a sample p a th large deviation p rinciple holds, and denote th e rate function ov er the in terv al [ t, T ] by S tT ( φ ). Consider a b ounded con tinuous fu nctio n h : R d 7→ R and s u pp o se that one is in terested in estimating θ ( ǫ ) . = E[ e − 1 ǫ h ( X ǫ ( T )) | X ǫ ( t ) = x ] b y Mon te Carlo. Deﬁne G ( t, x ) . = inf φ ∈C ([ t,T ]; R d ) ,φ ( t )= x [ S tT ( φ ) + h ( φ ( T ))] , (3) where C ([ t, T ]; R d ) denotes th e space of con tin uous functions from [ t, T ] to R d . Th en by the con traction pr inciple lim ǫ → 0 − ǫ log θ ( ε ) = G ( t, x ) . (4) Let Γ ǫ ( t, x ) b e an y un biased estimator of θ ( ǫ ) that is deﬁn ed on some probability space w i th probabilit y measure ¯ P. In other w ords, Γ ǫ ( t, x ) is a rand o m v ariable such that ¯ EΓ ǫ ( t, x ) = θ ( ǫ ) , 3 P . Dupu i s, K. Spiliop oulos, H. W ang Imp ortance Sampling for Multiscale Diﬀusions where ¯ E is the exp ectation op erator associated with ¯ P. In this pap er we will co nsid e r only unbiase d estimators. In Mont e Carlo sim ulation, one generates a num b er of indep endent copies of Γ ǫ ( t, x ) and the estimate is the sample mean. The sp eciﬁc n umb e r of samples requ i red dep ends on the desired accuracy , whic h is m e asured by the v ariance of the sample mean. Ho w eve r, since the samples are indep endent it suﬃces to consider the v ariance of a single sample. Because of u n b iasedn e ss, minimizing the v ariance is equiv alen t to minimizing the second m o ment. By Jensen’s inequalit y ¯ E(Γ ǫ ( t, x )) 2 ≥ ( ¯ EΓ ǫ ( t, x )) 2 = θ ( ǫ ) 2 . It then follo w s from (4) th at lim s up ǫ → 0 − ǫ log ¯ E(Γ ǫ ( t, x )) 2 ≤ 2 G ( t, x ) , and th us 2 G ( t, x ) is the b est p ossible rate of deca y of the second momen t. If lim in f ǫ → 0 − ǫ log ¯ E(Γ ǫ ( t, x )) 2 ≥ 2 G ( t, x ) , then Γ ǫ ( t, x ) achiev es this b est deca y rate, and is said to b e asym ptotic al ly optimal . W e note that ev en thou gh m uch of this pap er fo c uses on asymptotically optimal sc h emes, asymptotic optimalit y is not the only p rac tical concern. If optimal or nearly optimal sc hemes are too complicated and diﬃcult to im p le ment then one may pr e fer to construct non-optimal but simpler sc hemes. T h is is also p ossible using the sub s o lution approac h that is discussed later in the pap er, and T h eo rem 4.1 iden tiﬁes a lo wer b ound on the impro v ement ov er ordinary Mon te Carlo that will b e obtained. In the end , it is an issue of balance b et w een complexit y and feasibilit y . 2.2 Large Deviations for Small Noise Diﬀusions Consider a small noise d -dimensional diﬀusion pro cess X ǫ . = { X ǫ ( s ) , t ≤ s ≤ T } satisfying dX ǫ ( s ) = r ( X ǫ ( s )) ds + √ ǫ Φ ( X ǫ ( s )) dW ( s ) , X ǫ ( t ) = x. (5) Throughout this pap er we wo rk with the canonical ﬁltered prob ab ility space (Ω , F , P) equipp ed w i th a ﬁltration { F t } that satisﬁes th e usu al conditions. Th us { F t } is right- con tin uous and F 0 con tains all P-negligible sets. Since the purp ose of this section is ex- p ository , w e assume for simplicit y that the co e ﬃcien ts r ( x ) and Φ( x ) are smo oth, that the diﬀusion matrix q ( x ) . = Φ ( x )Φ( x ) T is uniformly nondegenerate, and that all these functions are uniformly b o un ded. W e next present a represent ation theorem prov ed in [6], whic h will b e used here and also later on to analyze imp ortance sampling sc hemes in the multisca le setting. Let A denote the set of all F t -progressiv ely measur a ble d -dimen s i onal pr ocesses v = { v ( s ) , 0 ≤ s ≤ T } that satisfy E Z T 0 k v ( t ) k 2 dt < ∞ . 4 P . Dupu i s, K. Spiliop oulos, H. W ang Imp ortance Sampling for Multiscale Diﬀusions Theorem 2.1 Given ǫ > 0 , let X ǫ b e the unique str ong solution to (5). Then for any b ounde d Bor el-me asur able function g mapping C ([ t, T ]; R d ) i nt o R , − ǫ log E  exp  − g ( X ǫ ) ǫ  = inf v ∈A E  1 2 Z T t k v ( s ) k 2 ds + g ( X ǫ,v )  , wher e X ǫ,v is the unique str ong solution to the sto chastic diﬀer ential e quation dX ǫ,v ( s ) = r ( X ǫ,v ( s )) ds + Φ ( X ǫ,v ( s ))  √ ǫdW ( s ) + v ( s ) ds  , t ≤ s ≤ T , (6) with initial c ondition X ǫ,v ( t ) = x . It is w ell kno wn that under these conditions the sample p a th large deviation principle holds for { X ǫ , ǫ > 0 } with rate function S tT ( φ ) =    1 2 Z T t    ˙ φ ( s ) − r ( φ ( s ))    2 q − 1 ( φ ( s )) ds if φ ∈ AC ([ t, T ]; R d ) , φ ( t ) = x + ∞ otherwise , where A C ([ t, T ]; R d ) denotes the colle ction of R d -v alued absolutely contin u o us fun c tions on in terv al [ t, T ] and k v k B . = √ v T B v for any v ∈ R d and symmetric p ositiv e d e ﬁn i te m atrix B . When B is the identit y matrix, k v k B is just the standard Euclidean norm k v k . 2.3 Imp ortance Sampling in the Absence of Multiscale F eatures W e ﬁr st recall the notion of a subsolution to an HJB e quation of the t yp e U t ( t, x ) + ¯ H ( x, ∇ x U ( t, x )) = 0 , U ( T , x ) = h ( x ) . (7) In this pap er w e consid er mostly classical sense subsolutions. In some circums t ances other t yp es, suc h as w eak sense subsolutions, ma y b e useful [1 1, 15]. Deﬁnition 2.2 A fu nctio n ¯ U ( t, x ) : [0 , T ] × R d 7→ R is a classic al subsolution to the HJB e quation (7) if 1. ¯ U is c ontinuously diﬀe r entiable, 2. ¯ U t ( t, x ) + ¯ H ( x, ∇ x ¯ U ( t, x )) ≥ 0 for every ( t, x ) ∈ (0 , T ) × R d , 3. ¯ U ( T , x ) ≤ h ( x ) for x ∈ R d . When using subs olutions for imp ortance sampling it is often n e cessary to imp ose stronger regularit y conditions somewh at b ey ond those of Deﬁnition 2.2. T o ease exp o sition, we will assume the follo wing condition throughout the p a p er. It is b y no means most economical. In particular, the uniform b ound on the ﬁ rst and second deriv ativ es is n o t n e cessary , and can b e replaced by milder cond itions with further eﬀort. Ho we ve r, it is conv enient for the purp ose of illustration since it guaran tees the feedbac k con trol used in imp ortance sampling is uniformly b ounded and thus circum v en ts a n um b er of tec hnicalities. 5 P . Dupu i s, K. Spiliop oulos, H. W ang Imp ortance Sampling for Multiscale Diﬀusions Condition 2.1 ¯ U has c ontinuous derivatives up to or der 1 in t and or der 2 in x , and the ﬁrst and se c ond derivatives in x ar e uniformly b ounde d. Next w e review the connection b et w een su bsolutio ns and the p erformance of related imp ortance samplin g sc hemes. Typically one designs a subsolution for a s p eciﬁc starting time and initial state ( t, x ). With an abuse of notation ( t, x ) will also b e used at times to denote a generic p oin t in [0 , T ] × R d (the in tended use will b e clear f rom the con text). T he form of the Hamiltonian is naturally su g gested by th e calculus of v ariation problem (3) and the explicit form ula of the rate function S tT ( φ ) in Section 2.2: ¯ H ( x, p ) = inf u ∈ R d  h p, r ( x ) + Φ( x ) u i + 1 2 k u k 2  = h r ( x ) , p i − 1 2 h p, q ( x ) p i . (8) In fact, under mild conditions G is the unique viscosit y solution to (7). Let ¯ U ( t, x ) b e a classical su b solutio n to (7) and ¯ u the feedbac k con trol deﬁned b y the minimizer in (8) with p replaced b y ∇ x ¯ U ( t, x ), i.e., ¯ u ( t, x ) = − Φ( x ) T ∇ x ¯ U ( t, x ) . (9) Note that u nder the giv en conditions ¯ u ( t, x ) is Lipsc hitz con tin uous in x , con tin uous in ( t, x ), and u niformly boun ded. Consider the family of probabilit y measures ¯ P ǫ deﬁned by the c hange of measure d ¯ P ǫ d P = exp  − 1 2 ǫ Z T t k ¯ u ( s, X ǫ ( s )) k 2 ds + 1 √ ǫ Z T t h ¯ u ( s, X ǫ ( s )) , dW ( s ) i  . By Girsano v’s Theorem ¯ W ( s ) = W ( s ) − 1 √ ǫ Z s t ¯ u ( ρ, X ǫ ( ρ )) dρ, t ≤ s ≤ T is a Bro wn ian motion on [ t, T ] under the pr o bability measure ¯ P ǫ , and X ǫ satisﬁes X ǫ ( t ) = x and dX ǫ ( s ) = r ( X ǫ ( s )) ds + Φ ( X ǫ ( s ))  √ ǫd ¯ W ( s ) + ¯ u ( s, X ǫ ( s )) ds  . Letting Γ ǫ ( t, x ) = exp  − 1 ǫ h ( X ǫ ( T ))  d P d ¯ P ǫ ( X ǫ ) , it follo w s easily that under ¯ P ǫ , Γ ǫ ( t, x ) is an unbiased estimator f o r θ ( ǫ ). The p erforman ce of this estimator is c h a racterized b y the deca y rate of its second momen t Q ǫ ( t, x ; ¯ u ) . = ¯ E ǫ " exp  − 2 ǫ h ( X ǫ ( T ))   d P d ¯ P ǫ ( X ǫ )  2 # . (10) F ollo w ing [15], a v eriﬁcation argument can b e used to analyze Q ǫ ( t, x ; ¯ u ) as ǫ → 0. T o th i s end, we need an alternativ e exp ressio n of Q ǫ ( t, x ; ¯ u ) that allo ws u s to in vo ke the represent ation in Theorem 2.1. More precisely , sin ce ¯ u ( s, x ) is b ounded and con tin uous, w e can deﬁne X ǫ, − ¯ u to b e th e uniqu e strong solution to the equation dX ǫ, − ¯ u ( s ) = r  X ǫ, − ¯ u ( s )  ds + Φ  X ǫ, − ¯ u ( s )   √ ǫdW ( s ) − ¯ u ( s, X ǫ, − ¯ u ( s )) ds  6 P . Dupu i s, K. Spiliop oulos, H. W ang Imp ortance Sampling for Multiscale Diﬀusions on [ t, T ] with initial condition X ǫ, − ¯ u ( t ) = x . Then by Lemma 4.3 (stated late r on in generalit y suﬃcient for th e multiscale case), Q ǫ ( t, x ; ¯ u ) = E exp  − 2 ǫ h ( X ǫ, − ¯ u ( T )) + 1 ǫ Z T t   ¯ u ( s, X ǫ, − ¯ u ( s ))   2 ds  . Note that s in c e ¯ u and h are b ounded th e exp onen t in the last disp la y is uniformly b ounded. Hence b y Theorem 2.1 − ǫ log Q ǫ ( t, x ; ¯ u ) (11) = inf v ∈A E  1 2 Z T t k v ( s ) k 2 ds + 2 h ( X ǫ, − ¯ u,v ( T )) − Z T t   ¯ u ( s, X ǫ, − ¯ u,v ( s ))   2 ds  , where X ǫ, − ¯ u,v is the unique strong solution to the equation dX ǫ, − ¯ u,v ( s ) = r  X ǫ, − ¯ u,v ( s )  ds + Φ  X ǫ, − ¯ u,v ( s )   √ ǫdW ( s ) − [ ¯ u ( s, X ǫ, − ¯ u,v ( s )) − v ( s )] ds  on [ t, T ] with initial condition X ǫ, − ¯ u,v ( t ) = x . Fix an arbitrary v ∈ A and let ˆ X = X ǫ, − ¯ u,v . S i nce ¯ U ( t, x ) is a classical subsolution and ¯ u ( t, x ) is the minimizer in (8), it foll o ws that ¯ U t ( t, x ) +  ∇ x ¯ U ( t, x ) , r ( x ) − Φ( x ) ¯ u ( t, x )  ≥ 3 2 k ¯ u ( t, x ) k 2 for ev ery ( t, x ) ∈ (0 , T ) × R d . Hence Itˆ o’s form ula and (9) giv e d ¯ U ( s, ˆ X ( s )) ≥ 3 2    ¯ u ( s, ˆ X ( s ))    2 ds + D ∇ x ¯ U ( s, ˆ X ( s )) , Φ( ˆ X ( s )) v ( s ) E ds + √ ǫ D ∇ x ¯ U ( s, ˆ X ( s )) , Φ( ˆ X ( s )) dW ( s ) E + ǫ 2 tr h q ( ˆ X ( s )) ∇ xx ¯ U ( s, ˆ X ( s )) i ds. In tegrating the last tw o terms o v er [ t, T ] giv es a random v ariable R ( ǫ, v ) that con v erges in L 2 to zero as ǫ → 0, u n iformly in v ∈ A . Observin g that the second term on the right-hand-side is −h ¯ u ( s, ˆ X ( s )) , v ( s ) i and using ¯ U ( T , x ) ≤ h ( x ), one obtains h ( ˆ X ( T )) − ¯ U ( t, x ) ≥ Z T t  3 2    ¯ u ( s, ˆ X ( s ))    2 − D ¯ u ( s, ˆ X ( s )) , v ( s ) E  ds + R ( ǫ, v ) . (12) No w w e use the last displa y to b ound one of the t wo h ( X ǫ, − ¯ u,v ( T )) terms on the righ t-hand- side of (11), yielding the lo wer b ound 1 2 Z T t    v ( s ) − ¯ u ( s, ˆ X ( s ))    2 ds + h ( ˆ X ( T )) + ¯ U ( t, x ) + R ( ǫ, v ) . Setting ¯ v ( s ) = v ( s ) − ¯ u ( s, ˆ X ( s )), it follo w s that ˆ X = X ǫ, ¯ v with X ǫ, ¯ v deﬁned as in (6). Since ¯ v ∈ A , b y Theorem 2.1 E  1 2 Z T t k ¯ v ( s ) k 2 ds + h ( ˆ X ( T ))  ≥ − ǫ log E exp  − 1 ǫ h ( X ǫ ( T ))  , 7 P . Dupu i s, K. Spiliop oulos, H. W ang Imp ortance Sampling for Multiscale Diﬀusions and therefore lim in f ǫ → 0 − ǫ log Q ǫ ( t, x ; ¯ u ) ≥ lim inf ǫ → 0 inf v ∈A E  1 2 Z T t k ¯ v ( s ) k 2 ds + h ( ˆ X ( T )) + R ( ǫ, v )  + ¯ U ( t, x ) ≥ lim inf ǫ → 0 − ǫ log E exp  − 1 ǫ h ( X ǫ ( T ))  + ¯ U ( t, x ) = G ( t, x ) + ¯ U ( t, x ) . Giv en that ¯ U is a subsolution, it is automatic that ¯ U ( t, x ) ≤ G ( t, x ). Thus for the scheme to b e asym p to tically optimal we need ¯ U ( t, x ) = G ( t, x ) at the starting p oint ( t, x ). The subsolution ¯ U ( t, x ) = 0 corresp onds to standard Mon te Carlo (i.e., n o change of measure), and we reco v er the exp ected deca y rate for that case, whic h is G ( t, x ). Note that if on e can obtain a b ound on E [2 R ( ǫ, v )] th at is un if orm in v ∈ A , then non-asymptotic b ound s on the v ariance can also b e obtai ned. 3 Large Deviation Prop erties of Multiscale Diﬀusions In this section w e introdu ce assumptions and notation, and brieﬂy review the large devi- ations results for multisca le diﬀusions [13]. W e also revisit the su b solutio n appr o ac h to imp ortance samp li ng as discussed in the last section, and iden tify where the standard co n- struction breaks do wn if the multiscale feature of the problem is not in c orp orated. Through- out this section w e assume a p erio dic en vironment, that is, the fun c tions b ( x, y ), c ( x, y ), and σ ( x, y ) are p erio dic w it h p erio d λ in ev ery direction with r e sp ect to the second v ariable y . The extension to general random environmen ts but with sp ecialized dynamics, namely ﬁrst order Langevin equations, is discussed in Sectio n 6. 3.1 The Large Deviation Principle W e recall that the SDE of in terest is dX ǫ ( s ) =  ǫ δ b  X ǫ ( s ) , X ǫ ( s ) δ  + c  X ǫ ( s ) , X ǫ ( s ) δ  dt + √ ǫσ  X ǫ ( s ) , X ǫ ( s ) δ  dW ( s ) , X ǫ ( t ) = x. (13) The follo w i ng condition on (13) will b e used when e ve r the p erio dic case is discussed. Condition 3.1 1. The functions b ( x, y ) , σ ( x, y ) ar e c ontinuous and glob al ly b ounde d, as ar e their p artial derivatives up to or der 2 in x and or der 1 in y . The function c ( x, y ) is b ounde d and Lipschitz c ontinuous. 2. The diﬀusion matrix σ ( x, y ) σ ( x, y ) T is uniformly nonde gener ate. The follo wing condition will also b e assumed. In the condition, T d = [0 , λ ] d denotes th e d − dimensional torus. 8 P . Dupu i s, K. Spiliop oulos, H. W ang Imp ortance Sampling for Multiscale Diﬀusions Condition 3.2 Consider the op er ator deﬁne d for smo oth f : T d → R by L x f ( y ) = h b ( x, y ) , ∇ f ( y ) i + 1 2 tr  σ ( x, y ) σ ( x, y ) T ∇∇ f ( y )  , to gether with p erio dic b oundary c onditions in y . F or any ﬁxe d x , let µ ( dy | x ) b e the unique invariant pr ob ability me asur e c orr esp onding to L x . Then the drift b satisﬁes the c entering c ondition (cf. [5]) Z T d b ( x, y ) µ ( dy | x ) = 0 . Under Condition 3.2, for eac h ℓ ∈ { 1 , . . . , d } and x there exists a un iqu e f unctio n χ ℓ ( x, y ) that is t w ic e d iﬀe rentia ble and λ − p erio d ic in ev ery direction in y , and whic h solv es L x χ ℓ ( x, y ) = b ℓ ( x, y ) , Z T d χ ℓ ( x, y ) µ ( dy | x ) = 0 . (14) F or a pro of see [5 , Theorem 3.3.4 ]. The equation (14) is kno wn as a c el l pr oblem . Let χ = ( χ 1 , . . . , χ d ) . As we shall see b elo w, χ pla ys a crucial role in the design of a symp t otically eﬃcien t imp or- tance sampling sc hemes for m ultiscale diﬀus ions . W e sta te h e re the samp le path large deviations pr inciple for th e solution o f (13) deriv ed in [13]. Large deviations principles for sp ecia l cases of (13) can also b e found in [21, 3]. Theorem 3.1 Assume Conditions 3.1 and 3.2, and let { X ǫ , ǫ > 0 } b e the uniq ue str ong solution to (13). L et r ( x ) = Z T d  I + ∂ χ ∂ y  ( x, y ) c ( x, y ) µ ( dy | x ) , q ( x ) = Z T d  I + ∂ χ ∂ y  ( x, y ) σ ( x, y ) σ ( x, y ) T  I + ∂ χ ∂ y  ( x, y ) T µ ( dy | x ) , wher e I denotes the identity matrix. Then { X ǫ , ǫ > 0 } satisﬁes a lar ge deviations principle with r ate function S tT ( φ ) =    1 2 Z T t    ˙ φ ( s ) − r ( φ ( s ))    2 q − 1 ( φ ( s )) ds if φ ∈ A C ([ t, T ]; R d ) , φ ( t ) = x + ∞ otherwise . Comparing with the rate function for small noise diﬀusions in Section 2.2, it is ob vious wh y r ( x ) and q ( x ) are referred to as the “eﬀect ive d r ift” and “eﬀectiv e diﬀusivit y” in the literature. 9 P . Dupu i s, K. Spiliop oulos, H. W ang Imp ortance Sampling for Multiscale Diﬀusions 3.2 A N a iv e Use of Subsolutions for Multiscale Diﬀusions In this section we illustrate the f a ilure of the stand ard constru ct ion o f imp ortance sampling sc hemes by subsolutions as was outlined in Sect ion 2. 3 . Even if one u se s a subsolution with the maximum p ossible v alue at the starting p oin t, the sc heme can b e far from optimal if the m ultiscale feature is not incorp orated. The large deviation rate function in Th e orem 3.1 is identica l to that of a small n o ise diﬀusion (5) with disp ersion matrix Φ ( x ) as long as Φ( x )Φ( x ) T = q ( x ) . (15) Note that for a giv en q ( x ), the choice of Φ( x ) is n o t unique. Ho w ev er, the distribution of the solution to (5) remains the s ame no m a tter wh i c h Φ( x ) is used, and so we ﬁ x a Lipsc hitz con tin uous diﬀu sio n matrix Φ ( x ) for wh ic h (15) h ol ds. D ue to the form of the calculus of v ariation problem in the r ate function, the HJB equation r elated to the m ultiscale diﬀusion mo del and the Hamilto nian ¯ H are exactly the s a me as in (7) and (8), resp ec tiv ely . Therefore, giv en a sub s o lution ¯ U ( t, x ), (9) suggests the con trol ¯ u ( t, x ) = − Φ( x ) T ∇ x ¯ U ( t, x ) whic h we n o w blindly apply to the m ultiscale diﬀu sio n pro cess m odel. Supp ose that one mimics th e steps used in Section 2.3 for the new pr ocess mo del. T o simplify notation, as b efore w e temp orarily denote X ǫ, − ¯ u,v b y ˆ X . T hen in place of (12) one obtains h ( ˆ X ( T )) − ¯ U ( t, x ) ≥ Z T t " 3 2    ¯ u ( s, ˆ X ( s ))    2 + * ∇ x ¯ U ( s, ˆ X ( s )) , σ ˆ X ( s ) , ˆ X ( s ) δ ! v ( s ) +# ds + Z T t * ∇ x ¯ U ( s, ˆ X ( s )) , h ǫ δ b + c i ˆ X ( s ) , ˆ X ( s ) δ ! − r ( ˆ X ( s )) + ds (16) + Z T t * ∇ x ¯ U ( s, ˆ X ( s )) , " Φ( ˆ X ( s )) − σ ˆ X ( s ) , ˆ X ( s ) δ !# ¯ u ( s, ˆ X ( s )) + ds + R ( ǫ, v ) , where R ( ǫ, v ) → 0 in L 2 , uniform ly in v ∈ A . The second in tegral term in the r ig ht han d side of (16) in v olv es ǫ δ b ( ˆ X ( s ) , ˆ X ( s ) /δ ). T o d ea l with th e fact that ǫ/δ ↑ ∞ as ǫ ↓ 0, we recall the cell problem (14) and deﬁne the function ψ = ( ψ 1 , · · · , ψ d ), w here for eac h ℓ ∈ { 1 , . . . , d } ψ ℓ ( t, x, y ) = χ ℓ ( x, y ) ∂ ¯ U ( t, x ) ∂ x ℓ . 10 P . Dupu i s, K. Spiliop oulos, H. W ang Imp ortance Sampling for Multiscale Diﬀusions Applying Itˆ o’s form ula to ψ and substituting in to (16), we obtai n h ( ˆ X ( T )) − ¯ U ( t, x ) ≥ Z T t " 3 2    ¯ u ( s, ˆ X ( s ))    2 + * ∇ x ¯ U ( s, ˆ X ( s )) ,  I + ∂ χ ∂ y  σ ˆ X ( s ) , ˆ X ( s ) δ ! v ( s ) +# ds + Z T t * ∇ x ¯ U ( s, ˆ X ( s )) ,  I + ∂ χ ∂ y  c ˆ X ( s ) , ˆ X ( s ) δ ! − r ( ˆ X ( s )) + ds + Z T t * ∇ x ¯ U ( s, ˆ X ( s )) , " Φ( ˆ X ( s )) −  I + ∂ χ ∂ y  σ ˆ X ( s ) , ˆ X ( s ) δ !# ¯ u ( s, ˆ X ( s )) + ds + R ( ǫ, v ) , where again R ( ǫ, v ) → 0 in L 2 , uniformly in v ∈ A . If this inequalit y is inserted into the represent ation (11), th e n it b ecomes clear that th e desired lo we r b ound will n o t follo w. Under mild conditions, homogenization can b e applied as in [13, Theorem 2.7] implying that the second integ ral term v anishes in the limit. Ho w ev er, regarding the third term, one will need the inequalit y Z T t * ∇ x ¯ U ( s, ˆ X ( s )) , "  I + ∂ χ ∂ y  σ ˆ X ( s ) , ˆ X ( s ) δ ! − Φ( ˆ X ( s )) # v ( s ) + ds ≥ 0 to hold at least appro ximately for ǫ small. While the corresp onding b ound held trivially in the case without m ultiscale if Φ is chosen a s σ , it will not hold ev en app ro ximately here regardless of the c h oi ce of Φ. Indeed, h o mogenization theory [13, Th eo rem 2.7] implies that E Z T t * ∇ x ¯ U ( s, ˆ X ( s )) , ˜ σ ˆ X ( s ) , ˆ X ( s ) δ ! − Z T d ˜ σ  ˆ X ( s ) , y  µ ( dy | ˆ X ( s )) ! v ( s ) + 2 ds → 0 . where ˜ σ = ( I + ∂ χ/∂ y ) σ . Therefore in ord er for the desired inequalit y to hold one should c ho ose Φ( x ) = Z T d  I + ∂ χ ∂ y  σ ( x, y ) µ ( dy | x ) . This is not p ossible s i nce it violates (15) in general. F rom the preceding discussion it is not diﬃcu l t to see the fundamenta l diﬃculty that the a v eraged or eﬀectiv e diﬀusivity is to o crude an app ro ximation for the corresp onding con trol in imp ortance sampling to b e eﬃcien t. The form of q in Theorem 3.1 in fact suggests the correct con trol, wh ich is ¯ u ( s, x, y ) = − σ T ( x, y )  I + ∂ χ ∂ y ( x, y )  T ∇ x ¯ U ( s, x ) , (17) where y will b e replaced b y the fast motion X ǫ ( s ) /δ in implemen tation. A p roof of this assertion will b e giv en in the next section. Not surp risingly , this form of con trol is consisten t with the con trol used in the pro of of the large deviation lo we r b ound in [13]. 11 P . Dupu i s, K. Spiliop oulos, H. W ang Imp ortance Sampling for Multiscale Diﬀusions 4 Statemen t and P r o of of the Main Result Before stating and p ro ving the main result, we recapitulate th e fr a mewo rk and n ot ation. W e are in terested in imp ortance sampling estimator for a functional of the form θ ( ǫ ) = E[ e − 1 ǫ h ( X ǫ ( T )) | X ǫ ( t ) = x ] where X ǫ satisﬁes the SDE (13). According to Th e orem 3.1 the relev ant HJB is (7) with the Hamiltonian ¯ H of the form (8). F u rthermore, if h is b ounded and con tin uous then G ( t, x ) . = in f [ S tT ( φ ) + h ( φ ( T ))] = lim ǫ → 0 − ǫ log θ ( ǫ ) , where the inﬁm um is tak en o v er all φ ∈ C ([ t, T ]; R d ) suc h that φ ( t ) = x . Let ¯ U b e a s u bsolution to the HJB equation with the termin a l cond it ion h , and deﬁn e the con trol ¯ u by (1 7). L etting u ( s ) = ¯ u ( s, X ǫ ( s ) , X ǫ ( s ) /δ ) , it follo w s from Girsano v’s Theorem that dX ǫ ( s ) = h ǫ δ b + c i  X ǫ ( s ) , X ǫ ( s ) δ  ds + σ  X ǫ ( s ) , X ǫ ( s ) δ   √ ǫd ¯ W ( s ) + u ( s ) ds  , where ¯ W ( s ) is a standard Bro w n ia n motion und e r the probabilit y measure ¯ P ǫ deﬁned b y d ¯ P ǫ d P = exp  − 1 2 ǫ Z T t k u ( s ) k 2 ds + 1 √ ǫ Z T t h u ( s ) , dW ( s ) i  . (18) The p e rform a nce m e asure is th e n giv en b y the deca y rate of the second momen t Q ǫ ( t, x ; ¯ u ) as deﬁned in (10). Theorem 4.1 L et { X ǫ , ǫ > 0 } b e the solution to (13). Consider a b ounde d and c ontinuous function h : R d 7→ R and assume Conditions 2.1, 3.1 and 3.2. L et ¯ U ( s, x ) b e a sub so lution of (8 ) and deﬁne the c ontr ol ¯ u ( s , x, y ) by (17). Then lim in f ǫ → 0 − ǫ ln Q ǫ ( t, x ; ¯ u ) ≥ G ( t, x ) + ¯ U ( t, x ) . (19) Theorem 4.1 do es not co v er the imp ortan t case of estimating probabilities suc h as P[ X ǫ ( T ) ∈ A | X ǫ ( t ) = x ], sin c e in this case the corresp onding fu nctio n h is neither b ounded nor contin u o us. Recall that a set A ⊂ R d is called r e gu lar [with resp ect to S tT and th e initial condition ( t, x )] if the inﬁm um of S tT o v er the closure ¯ A is the same as the inﬁmum o v er the in terior A o . The f o llo wing result analogous to T heo rem 4.1 holds. Its p roof uses an argumen t v ery similar to [15] and is th us omitted. Prop o sition 4.2 L et { X ǫ , ǫ > 0 } b e the solution to (13). Assume Conditions 2.1 , 3.1 and 3.2. Consider a r e gu lar set A ⊂ R d and let h ( x ) = ( 0 if x ∈ A + ∞ if x / ∈ A. L et ¯ U ( s, x ) b e a subsolution of (7 ) with the terminal c ondition h and deﬁne the c ontr ol ¯ u ( s, x, y ) by (17). Then (19) holds. 12 P . Dupu i s, K. Spiliop oulos, H. W ang Imp ortance Sampling for Multiscale Diﬀusions The rest of th i s section is d evote d to the p roof of Theorem 4.1. W e ﬁrst establish an alternativ e representati on f o r the p erformance measure Q ǫ in terms of b ounded fu ncti ons, whic h will allo w us to in vo ke Theorem 2.1. Lemma 4.3 L et ¯ X ǫ . = { X ǫ, − ¯ u , t ≤ s ≤ T } solve ¯ X ǫ ( t ) = x and d ¯ X ǫ ( s ) = h ǫ δ b + c i  ¯ X ǫ ( s ) , ¯ X ǫ ( s ) δ  ds + σ  ¯ X ǫ ( s ) , ¯ X ǫ ( s ) δ   √ ǫdW ( s ) − ¯ u  s, ¯ X ǫ ( s ) , ¯ X ǫ ( s ) δ  ds  . Then Q ǫ ( t, x ; ¯ u ) . = ¯ E ǫ " exp  − 2 ǫ h ( X ǫ ( T ))   d P d ¯ P ǫ ( X ǫ )  2 # = E exp  − 2 ǫ h ( ¯ X ǫ ( T )) + 1 ǫ Z T t   ¯ u  s, ¯ X ǫ ( s ) , ¯ X ǫ ( s ) /δ    2 ds  . Pro o f. S ince ¯ u ( s, z , z /δ ) is uniformly b ounded, it follo ws from Girsano v’s theorem that d Q d P . = exp  − 1 √ ǫ Z T t h u ( s ) , dW ( s ) i − 1 2 ǫ Z T t k u ( s ) k 2 ds  deﬁnes a new probabilit y measure Q under whic h ˆ W ( s ) = W ( s ) + 1 √ ǫ Z s t u ( ρ ) dρ is a Bro wnian motion. Therefore X ǫ under Q has the same distrib utio n as ¯ X ǫ under P. This implies E exp  − 2 ǫ h ( ¯ X ǫ ( T )) + 1 ǫ Z T t   ¯ u ( s, ¯ X ǫ ( s ) , ¯ X ǫ ( s ) /δ )   2 ds  = E Q exp  − 2 ǫ h ( X ǫ ( T )) + 1 ǫ Z T t k u ( s ) k 2 ds  = E exp  − 2 ǫ h ( X ǫ ( T )) + 1 2 ǫ Z T t k u ( s ) k 2 ds − 1 √ ǫ Z T t h u ( s ) , dW ( s ) i  . Using (18), w e can con tin ue the last disp la y as ¯ E ǫ exp  − 2 ǫ h ( X ǫ ( T )) + 1 ǫ Z T t k u ( s ) k 2 ds − 2 √ ǫ Z T t h u ( s ) , dW ( s ) i  = ¯ E ǫ " exp  − 2 ǫ h ( X ǫ ( T ))   d P d ¯ P ǫ ( X ǫ )  2 # . This completes the pro of. 13 P . Dupu i s, K. Spiliop oulos, H. W ang Imp ortance Sampling for Multiscale Diﬀusions Pro o f of Theorem 4.1. Since h and ¯ u are b ounded, it follo ws f rom Lemma 4.3 and Theorem 2.1 that − ǫ log Q ǫ ( t, x ; ¯ u ) (20) = inf v ∈A E  1 2 Z T t k v ( s ) k 2 ds − Z T t k ¯ u ( s, ˆ X ( s ) , ˆ X ( s ) /δ ) k 2 ds + 2 h ( ˆ X ( T ))  , where ˆ X = ¯ X ǫ,v solv es ˆ X ( t ) = x and d ˆ X ( s ) = " ǫ δ b ˆ X ( s ) , ˆ X ( s ) δ ! + ¯ c s, ˆ X ( s ) , ˆ X ( s ) δ !# ds + σ ˆ X ( s ) , ˆ X ( s ) δ !  √ ǫdW ( s ) + v ( s ) ds  , with ¯ c ( s, x, y ) = c ( x, y ) − σ ( x, y ) ¯ u ( s, x, y ) . The asym p to tic analysis of v ariational problems analog ous to (20) has already app eared in [13], w here large deviation pr o p erties of m ultiscale diﬀusions suc h as T heorem 3.1 h a v e b een esta blished through a wea k con v ergence approac h. T o b e more precise, the condition ǫ/δ → ∞ corresp onds to what is called R e gime 1 in [13], and o ccupation measure tec hniques are used to characte rize th e limit of v ariational problems. Tigh tness and charac terization in terms of “relaxed con trols” are prov ed in Prop osition 3.1 and Theorem 2.8 of [13], and then the relaxed con trol f o rmulatio n is rewritten in terms of an ordin ary cont rol in Theorem 5.2. The diﬀerence b et w een the current v ariational pr o blem and those considered in [13] is that in [13] ¯ c w as indep endent of time and the midd le term in the right- hand - side of (20 ) w as absent . Nonetheless, these diﬀeren ces are only sup erﬁcial and the analysis of quan tities similar to the middle term of (2 0 ), e.g ., Z T t f s, ˆ X ( s ) , ˆ X ( s ) δ ! ds, can b e carried out using the same arguments as in [13, Prop osition 3.1 and Theorem 2.8]. F or this reason, we will d irect ly state a b ound f o r (20) without giving the details of the analysis: lim in f ǫ → 0 − ǫ log Q ǫ ( t, x ; ¯ u ) ≥ inf φ ∈AC ([ t,T ]; R d ) ,φ ( t )= x  1 2 Z T t    ˙ φ ( s ) − ¯ r ( s, φ ( s ))    2 q − 1 ( φ ( s )) ds (21) − Z T t Z T d k ¯ u ( s, φ ( s ) , y ) k 2 µ ( dy | φ ( s )) ds + 2 h ( φ ( T ))  , where ¯ r ( s, x ) . = r ( x ) − Z T d  I + ∂ χ ∂ y ( x, y )  σ ( x, y ) ¯ u ( s, x, y ) µ ( dy | x ) . 14 P . Dupu i s, K. Spiliop oulos, H. W ang Imp ortance Sampling for Multiscale Diﬀusions Recalling the deﬁnition of ¯ u , ¯ r ( s, x ) = r ( x ) + q ( x ) ∇ x ¯ U ( s, x ) and Z T t Z T d k ¯ u ( s, φ ( s ) , y ) k 2 µ ( dy | φ ( s )) ds = Z T t h∇ x ¯ U ( s, φ ( s )) , q ( x ) ∇ x ¯ U ( s, φ ( s )) i ds. Th us the quanti t y to b e minimized in (21) can b e rewritten as 1 2 Z T t    ˙ φ ( s ) − r ( φ ( s ))    2 q − 1 ( φ ( s )) ds − Z T t h ˙ φ ( s ) − r ( φ ( s )) , ∇ x ¯ U ( s, φ ( s )) i ds (22) − 1 2 Z T t h∇ x ¯ U ( s, φ ( s )) , q ( x ) ∇ x ¯ U ( s, φ ( s )) i ds + 2 h ( φ ( T )) . Giv en an arbitrary φ ∈ AC ([ t, T ]; R d ) with φ ( t ) = x , the subsolution prop ert y implies that d ds ¯ U ( s, φ ( s )) = ¯ U t ( s, φ ( s )) + h∇ x ¯ U ( s, φ ( s )) , ˙ φ ( s ) i ≥ h∇ x ¯ U ( s, φ ( s )) , ˙ φ ( s ) − r ( φ ( s )) i + 1 2 h∇ x ¯ U ( s, φ ( s )) , q ( φ ( s )) ∇ x ¯ U ( s, φ ( s )) i . In tegrating b oth sides on [ t, T ] and u sing the termin al condition ¯ U ( T , x ) ≤ h ( x ), it follo ws that (22) is b ounded f rom b elo w b y 1 2 Z T t    ˙ φ ( s ) − r ( φ ( s ))    2 q − 1 ( φ ( s )) ds + h ( φ ( T )) + ¯ U ( t, x ) . Note that the ﬁr s t s ummand is S tT ( φ ) and by deﬁnition G ( t, x ) is the inﬁ mum of the sum of the ﬁrst t w o terms o v er φ ∈ AC ([ t, T ]; R d ) with φ ( t ) = x , and th us lim in f ǫ → 0 − ǫ log Q ǫ ( t, x ; ¯ u ) ≥ G ( t, x ) + ¯ U ( t, x ) . This concludes the pro of. 5 First Order Langevin Equation with P erio dic En vironmen t In this section, we apply the general results to a sp ecial b ut imp ortan t class of diﬀusion mo dels, n a mely , the ﬁrst order Langevin equatio n dX ǫ ( t ) = −∇ V ǫ  X ǫ ( t ) , X ǫ ( t ) δ  dt + √ ǫ √ 2 D dW ( t ) , X ǫ (0) = x, where V ǫ is some p oten tial function and 2 D the diﬀusion constan t. W e are particularly in terested in the case wh e re the p oten tial function V ǫ is comp osed of a large-scale smo ot h part and a fast osc illating part of smaller magnitude: V ǫ ( x, x/δ ) = ǫQ ( x/δ ) + V ( x ) . 15 P . Dupu i s, K. Spiliop oulos, H. W ang Imp ortance Sampling for Multiscale Diﬀusions −2 −1 0 1 2 0.0 0.5 1.0 1.5 2.0 2.5 3.0 x V(x) Figure 1: V ǫ ( x, x/δ ) = ǫ (cos( x/δ ) + sin( x/δ )) + x 2 / 2 with ǫ = 0 . 1 , δ = 0 . 0 1. Th us the equation of in terest is dX ǫ ( t ) =  − ǫ δ ∇ Q  X ǫ ( t ) δ  − ∇ V ( X ǫ ( t ))  dt + √ ǫ √ 2 D dW ( t ) , X ǫ (0) = x. (23 ) In the notatio n of previous sections, this corresp onds t o b ( x, y ) = − ∇ Q ( y ) , c ( x, y ) = −∇ V ( x ) , σ ( x, y ) = √ 2 D . An example of su c h a p oten tial is giv en in Figure 1. As b efore, w e examine in s o me detail the mo del (23) with p eriod ic environmen t, and it is assu med in this section that Q ( y ) is p erio dic with p erio d λ . This p erio dicit y assumption may seem to o artiﬁcial in many practical applications. Ho we ve r, it motiv ates by analogy the design of imp ortance sampling sc hemes for ﬁ rst order Lan gevin equations with general r a ndom en vironment. See Section 6 for a discussion on these extensions. An imp ortan t observ ation for an equ a tion of the form (23) is that th e cell prob lem (14) dep ends only on y and not on x . He nce, in order to compute ¯ u from (17) we need only solv e the cell problem (14) onc e . T o b e more s peciﬁc, the in v ariant distribution µ ( dy | x ) to the cell problem is indep enden t of x , is of Gibb s t yp e µ ( dy | x ) = µ ( dy ) = 1 L e − Q ( y ) D dy , L = Z T d e − Q ( y ) D dy , and satisﬁes Condition 3.2. Explicit form ulas for the large deviation rate fun c tions and related quant ities are readily a v ailable for the one dimensional case d = 1. F or multi- dimens i onal cases, they are also a v ailable u n der extra assump ti ons on the p oten tial f unction V [13]. Since our numerical sim ulation will b e p erformed on one-dimensional mod els, w e only state the relev an t results 16 P . Dupu i s, K. Spiliop oulos, H. W ang Imp ortance Sampling for Multiscale Diﬀusions in Corollary 5.1 and refer the readers to [13] for more general form ulas. The pro of is omitted as it is a straigh tforward calculatio n from Theorems 3.1 and 4.1. Corollary 5.1 Consider the one dimensional c ase d = 1 a nd let { X ǫ } b e the uniq u e str ong solution to (23 ). U nd er Condition 3.1, { X ǫ } satisﬁes a lar ge deviations principle with r ate function S 0 T ( φ ) =    1 2 Z T 0 1 q [ ˙ φ ( s ) − r ( φ ( s ))] 2 ds if φ ∈ AC ([0 , T ]; R ) and φ (0) = x + ∞ otherwise , (24) wher e r ( x ) = − λ 2 V ′ ( x ) L ˆ L , q = 2 D λ 2 L ˆ L and L = Z T e − Q ( y ) D dy , ˆ L = Z T e Q ( y ) D dy . Given a classic al subsolution ¯ U , the imp ortanc e sampling c ontr ol that app e ars in The or e m 4.1 takes the form ¯ u ( t, x, y ) = − √ 2 D λ ˆ L e Q ( y ) D ¯ U x ( t, x ) . (25) An interesting observ ation fr om Corollary 5.1 is th at the eﬀectiv e d i ﬀusivity ǫq is alw a ys smaller than the diﬀusivity ǫ 2 D of the un h omo genized equatio n, since b y H¨ older’s inequalit y L ˆ L >  Z T dy  2 = λ 2 , as long as Q is not a constant. Th e in tuition is th at the p oten tial su r fac e has many small lo c al m i nima, whic h manifest themselve s in the h omogenized dynamics by a red u ct ion in the diﬀu sio n co eﬃc ient since a particle tra v eling on a r ou gh p oten tial su rface may suﬀer from the “trapping” eﬀect of these local minima. 6 Extension to Random En vironmen t Up until now all th e multiscale diﬀusion mo dels w e h a v e considered are of p erio dic envi- ronment , i.e., the drift vecto r and the disp ersion matrix are b oth p erio dic with r e sp ect to the fast v ariable. Th is section discu sses an extension to a ran d om environmen t. T o illus- trate the main idea, we sp ecialize again to diﬀusions go verned by the ﬁrst ord e r Langevin equation of t yp e dX ǫ,δ ( t ) =  − ǫ δ ∇ Q  X ǫ ( t ) δ  − ∇ V ( X ǫ ( t ))  dt + √ ǫ √ 2 D dW ( t ) , X ǫ,δ (0) = x. (26) In this section, w e ﬁnd it con v enien t to k eep trac k of b oth ǫ and δ and wr ite X ǫ,δ ( t ) for the solution to (26) as opp osed to X ǫ as in Section 5. The f ollo wing condition is assumed throughout this section as a substitute for Condition 3.1 in the per io dic case. 17 P . Dupu i s, K. Spiliop oulos, H. W ang Imp ortance Sampling for Multiscale Diﬀusions −2 −1 0 1 2 0.0 0.5 1.0 1.5 2.0 2.5 3.0 x V(x) Figure 2: V ǫ ( x, x/δ ) = ǫQ ( x/δ ) + x 2 / 2 with ǫ = 0 . 1 and δ = 0 . 01 . Condition 6.1 1. The c o eﬃcient { Q ( y ) , y ∈ R d } is a stationary, er go dic r andom ﬁeld deﬁne d on some pr ob ability sp ac e (Ψ , G , ν ) . F or ev ery ω ∈ Ψ , Q ( y , ω ) is C 2 ( R d ) in y with b ounde d and Lipschitz c ontinuous derivatives up to or der 2. 2. The c o eﬃcient V ( x ) is deterministic and V ( x ) ∈ C 2 ( R d ) with b ounde d and Lipschitz c ontinuous derivatives up to or der 2. The Wiener pro cess in (26) is deﬁned on another probabilit y space, and we work with the pro duct s pac e and pro duct measure and so W is in depend en t of Q . Note that for the sak e of n ot ational simplicit y , we h a ve sup p ressed the dep endence of X ǫ,δ and Q on ω for ω ∈ Ψ. Under Condition 6.1, for ev ery ω ∈ Ψ, there exists a unique strong solution to the SDE (26). In con trast to the p erio dic case, when equation (26) is used to mo del the dynamics of a particle in a rough p oten tial, the roughn e ss is du e to the “small” ran d omness generated b y the random ﬁeld ǫQ ( x/δ ). Figure 2 depicts a realizatio n of su c h a random ﬁeld sup erimp osed on the smo oth p ote ntia l f unction V ( x ) = x 2 / 2. In the ﬁ g ur e, Q ( x ) is a zero mean Gaussian random ﬁeld with Gaussian t y p e correlation, i.e., E ν [ Q ( x ) Q ( y )] = exp[ −| x − y | 2 ]. Compared with multiscale diﬀusions w it h p erio dic environmen t, th e analysis for general random med i a is relativ ely new. T o the b est of our kn o wledge, the ﬁrst attempt to generalize the results obtained for the lo cally p eriodic setting to the lo cally stationary setting p robably app eared in the w ork of [37], whic h studied random w alks on Z with a lo call y stationary en vironment. Extensions h a v e b een considered in [42, 43] to diﬀusions wh o se generato rs a re self-adjoin t and tak e a certain form. 6.1 Homogenization in One Dimension As in Section 5 , we state the homogenizatio n theorem for the one-dimensional case where ev erything can b e explicitly qu an tiﬁed. F or higher dimen s i ons, analogous resu lts exist bu t 18 P . Dupu i s, K. Spiliop oulos, H. W ang Imp ortance Sampling for Multiscale Diﬀusions explicit calculat ion is muc h more diﬃcult. In the follo wing r e sult we assume that ǫ is ﬁxed and let δ tend to zero in ord e r to clearly identify the eﬀect of homogenization. T o ease exp ositio n, we temp o rarily denote X ǫ,δ b y X δ . Theorem 6.1 Consider the one dimension c ase and assume Condition 6.1. Then the law of X δ on C ([0 , T ]; R ) c onver ges we akly to the law of X , in pr ob ability with r esp e ct to ν , wher e X is the solution to the SDE dX ( t ) = − 1 K ˆ K V ′ ( X ( t )) dt + r 2 ǫD K ˆ K dW ( t ) , X (0) = x, and with K = E ν h e − Q ( y ) /D i , ˆ K = E ν h e Q ( y ) /D i . (27) The rest of this subsection is devot ed to th e pro of of this theorem. Since it is very similar to th at of [42, Theorem 3.1], w e will only giv e an outline. Without loss of generalit y , w e assume ǫ = 1 in the pro of. W e start with the follo wing lemma. Lemma 6.2 L et h : R → R b e a b ounde d and me asur able function and let { ψ ( y ) , y ∈ R } b e a stationary r andom ﬁeld such that E ν h | ψ ( y ) | e − Q ( y ) /D i < ∞ . Then as δ → 0 E ω sup 0 ≤ t ≤ T     Z t 0 ψ  X δ ( s ) δ  h ( X δ ( s )) ds − B Z t 0 h ( X δ ( s )) ds     → 0 in L 1 ( ν ) , wher e E ω denotes the exp e cte d v alue with r esp e ct to the indep endent Wiener pr o c e ss W bu t with ω ∈ Ψ given, and B . = 1 K E ν h ψ ( y ) e − Q ( y ) /D i . Pro o f. The p roof follo ws from Theorem 6.1 in [42]. The only o bserv ation we need to make is that in our case the fast mot ion is go verned b y the generator − Q ′ ( y ) d dy + D d 2 dy 2 . Then by the discuss i on in Section 2.2, in p a rticular Theorem 2.1, of [36], the a v eraging should b e tak en under the corresp onding ergo dic statio nary pr o babilit y measur e on (Ψ , G ) sa y π whic h s a tisﬁes E π [ ψ ( y )] = 1 K E ν [ ψ ( y ) e − Q ( y ) /D ] for an y stationary random ﬁeld ψ . Pro o f of Theorem 6.1. Let { χ ( y ) , y ∈ R } b e the random ﬁeld deﬁned by χ ( y ) = 1 ˆ K Z y 0 e Q ( z ) /D dz − y 19 P . Dupu i s, K. Spiliop oulos, H. W ang Imp ortance Sampling for Multiscale Diﬀusions It is straigh tforw ard to ve rify th a t E ν [ χ ( y )] = E ν [ χ ′ ( y )] = 0 and 1 + χ ′ ( y ) = 1 ˆ K e Q ( y ) /D , (28) − Q ′ ( y ) χ ′ ( y ) + D χ ′′ ( y ) = Q ′ ( y ) . Deﬁne the pro cess Y δ ( t ) = X δ ( t ) + Z t 0  1 + χ ′  X δ ( s ) δ  V ′ ( X δ ( s )) ds + δ · χ  X δ ( t ) δ  Then Itˆ o’s form ula implies that Y δ ( t ) = x + √ 2 D Z t 0  1 + χ ′  X δ ( s ) δ  dW ( s ) . The latter and (28) imply that { Y δ } is ν -a.s. a martingale with quadratic v ariation h Y δ i ( t ) = 2 D ˆ K 2 Z t 0 e 2 Q ( X δ ( s ) /δ ) D ds. It follo ws no w from Lemma 6.2 [taking h ( x ) = 1] that as δ → 0 E ω sup 0 ≤ t ≤ T    h Y δ i ( t ) − B t    → 0 in L 1 ( ν ) , where B . = 2 D ˆ K 2 · 1 K E ν h e Q ( y ) /D i = 2 D ˆ K 2 · ˆ K K = 2 D K ˆ K . Similarly , by Lemma 6.2 [taking h ( x ) = V ′ ( x )] again w e ha v e that E ω sup 0 ≤ t ≤ T     Z t 0  1 + χ ′  X δ ( s ) δ  V ′ ( X δ ( s )) ds − 1 K ˆ K Z t 0 V ′ ( X δ ( s )) ds     → 0 in L 1 ( ν ) as δ → 0. Finally , note that by Birkhoﬀ ’s er go dic theorem χ ( y ) y → 0 ν -almost surely , as | y | → ∞ . Thus as δ → 0 sup 0 ≤ t ≤ T     δ · χ  X δ ( t ) δ      → 0 w ith pr o bability one ν -almost sur e ly . Then the desired con vergence follo ws immediately since Y δ ( · ) ⇒ x + r 2 D K ˆ K W ( · ) in probabilit y with resp ect to ν . This completes the pro of. 20 P . Dupu i s, K. Spiliop oulos, H. W ang Imp ortance Sampling for Multiscale Diﬀusions 6.2 Imp ortance Sampling Sche mes It is easy to see that the h o mogenization result in the random case is analogous to that in the p erio dic case C orollary 5.1 with λ = 1 and L, ˆ L replaced b y K , ˆ K , resp ectiv ely . By analogy , it suggests that in the one dimensional case { X ǫ,δ , ǫ , δ > 0 } satisﬁes the large deviations principle with rate function giv en by (24), wh er e r ( x ) = − V ′ ( x ) K ˆ K , q = 2 D K ˆ K and K and ˆ K are deﬁned b y (27). T herefore it is natural to conjecture that giv en a classical subsolution ¯ U , the corresp onding control tak es a similar form ¯ u ( t, x, y ) = − √ 2 D 1 ˆ K e Q ( y ) D ¯ U x ( t, x ) . Note that in con trast to the p eriod ic case, the cont rol ¯ u is random in that it implicitly dep ends on ω ∈ Ψ since the random environmen t Q ( y ) dep ends on it. Even though the asso ci ated imp ortance sampling sc hemes are sho wn to b e eﬃcient in our empir i cal study , dev elopmen t of the underlying large deviation theory and a rigorous p erformance analysis remains to b e d on e. 7 Sim u l ation Results In this section we test the p erformance of v arious Mon te Carlo estimators for multiscal e diﬀusions with p erio dic or random environmen t. Throughout this section, we assume that the diﬀusion p rocess X ǫ is one-dimensional and satisﬁes th e ﬁrst order Langevin equation (23). 7.1 Sim ulation Results for Periodic Case Supp ose that w e are inte rested in the Mon te Carlo estimation of θ ( ǫ ) . = E[ e − 1 ǫ h ( X ǫ ( T )) | X ǫ ( t ) = x ] for a conti nuous function h and in a p erio d ic en vironment. W e compare thr ee unbiased estimators. 1. T he stand ard Monte Carlo esti mator ˆ θ 0 ( ǫ ), 2. T he imp ortance sampling estimator ˆ θ 1 ( ǫ ) b a sed on the c hange of measure su gg ested b y (25), i.e., u 1 ( t, x, y ) . = − √ 2 D λ ˆ L e Q ( y ) /D ¯ U x ( t, x ) , where ¯ U is a subsolution to the homogenized HJB equation (7 ) . Note that in imple- men tation, x will b e replaced b y the curren t state of X ǫ and y by X ǫ /δ . 21 P . Dupu i s, K. Spiliop oulos, H. W ang Imp ortance Sampling for Multiscale Diﬀusions 3. T he imp ortance sampling estimator ˆ θ 2 ( ǫ ) based on the c hange of measure corresp ond- ing to the con trol u 2 ( t, x ) . = − √ q ¯ U x ( t, x ) = − √ 2 D λ p L ˆ L ¯ U x ( t, x ) . This is the c hange of m easur e based on the control suggested b y the homogenized HJB equation without taking into consideration of the m ultiscale nature of the d ynamics. It is indep enden t of the fast v ariable and diﬀers from the control u 1 b y a factor p 2 D / q [1 + χ ′ ( y )]. Based on the theory d ev elop ed previously , the estimator ˆ θ 1 ( ǫ ) s hould outp erform the other t w o estimators as ǫ → 0. F or numerical exp erimen tation, we consider the p oten tial fu nctio n th a t is dra wn in Figure 1, that is, V ( x ) = x 2 / 2 and Q ( y ) = cos y + sin y . Thus the p erio d is λ = 2 π . Let h ( x ) = ( ( x − 1) 2 x ≥ 0 ( x + 1) 2 x < 0 . It follo ws from Corollary 5.1 that the eﬀectiv e drift r ( x ) and diﬀusivit y q are r ( x ) = − κx , q = 2 D κ, w here κ = λ 2 L ˆ L = 4 π 2 L ˆ L . The limiting HJB equation (7) b ecome s U t ( t, x ) − κxU x ( t, x ) − κD | U x ( t, x ) | 2 = 0 , U ( T , x ) = h ( x ) . (29) Under m ild cond i tions that are satisﬁed here, the uniqu e v iscosity s olution to this HJB equation equals G ( t, x ) . = in f [ S tT ( φ ) + h ( φ ( T ))] = lim ǫ → 0 − ǫ log θ ( ǫ ) , where S tT ( · ) is given by Corollary 5.1 an d the inﬁ m u m is tak en o v er all φ ∈ AC ([ t, T ]; R d ) suc h that φ ( t ) = x . On e can solv e this v ariational p roblem explicitly and obtain G ( t, x ) = ( e κT − | x | e κt ) 2 (1 + 2 D ) e 2 κT − 2 D e 2 κt . Since G is not smo oth at x = 0, it is not a classical sen se solution. In general one should mollify it in ord e r to pro duce a smo ot h subsolution, but it is kn o wn (see [46] for an analogous situation) that the b ound on p erformance is still v alid if the subsolution is the minimum of t wo classica l s en se solutions with a single d isc onti nuous in terface. Therefore w e can just deﬁne the subsolution ¯ U as th e solution G and the corresp onding con trols are u 1 ( t, x, y ) = √ 2 D 2 π ˆ L e (cos y +sin y ) /D 2 e κt ( e κT − | x | e κt ) (1 + 2 D ) e 2 κT − 2 D e 2 κt · sign( x ) , u 2 ( t, x ) = √ 2 D 2 π p L ˆ L 2 e κt ( e κT − | x | e κt ) (1 + 2 D ) e 2 κT − 2 D e 2 κt · sign( x ) , 22 P . Dupu i s, K. Spiliop oulos, H. W ang Imp ortance Sampling for Multiscale Diﬀusions where sign( x ) . = 1 if x ≥ 0 and sign( x ) . = − 1 if x < 0. In the n umerical sim ulation, w e set D = 1, the in it ial condition ( t, x ) = (0 , 0 . 05), and the terminal time T = 1. One can calculate ˆ L = 9 . 8399 9 and κ = 0 . 407728. W e us ed a predictor- corrector Euler scheme to sim ulate the tra jectories of (23) and the asso cia ted con trolled SDE. F or reasons that will b e discussed in App endix A, a d irec t numerical appro ximation sc heme wa s adopted instead of other te c hn iques suc h as multisca le integrat or or pro jectiv e in tegrator metho ds (se e [45, 17, 23]). By Theorem 7.1 w e kno w that the error in the Eu ler appro ximation is b ounded by a term of ord er ∆ ǫ/δ 2 , wh ere ∆ is the time discr etization step. F or eac h c hoice of ǫ and δ we c hose ∆ s o th a t the aforement ioned error b ound is of the order 0 . 001. In other w ords, we set ∆ = 0 . 001 · δ 2 /ǫ . Hence, as ǫ gets sm a ller the discretization step ∆ b ecomes smaller as well . Ev en though by extensiv e exp erimenta tion w e found that in general the c hoice of ∆ w as crucial for obtaining accurate results, for the p eriod ic example stud i ed h e re the requir e ment can b e relaxed an d a coarser discretization can still lead to accurate and stable results. Sim ulations were done u sing parallel computing in the C programming language. W e used Mersenne T wiste r [35] f o r the r a nd om num b er generator, with a samp le size of N = 10 7 . The measure for comparing diﬀeren t sc hemes is the relativ e error of th e estimato rs, which is deﬁned as relativ e error . = standard deviation of the estimato r exp ecte d v alue of th e estimator The smaller the relativ e error the more eﬃcient the estimator. Sin ce in practice b oth the standard deviation and the exp ected v alue of an estimato r are t ypically unknown, empirical relativ e error is often used for measurement . In other w ords, the exp ected v alue of the estimator will b e replaced by the emp i rical samp l e mean, a nd the stand ard d e viation of th e estimator will b e replaced by the emp irica l sample stand a rd error. In order to distinguish among the d iﬀe rent Mont e C a rlo pro cedures, w e denote by ˆ ρ i ( ǫ ) the empirical r el ativ e error of ˆ θ i ( ǫ ) for i = 0 , 1 , 2. W e w ould lik e to p oin t out that the exp ect ed v alue in the denominator [whic h is alw a ys θ ( ǫ ) due to unbiasedness] is replaced by ˆ θ 1 ( ǫ ), regardless of i . The r e ason is that ˆ θ 1 ( ǫ ) is the most accurate estimate of θ ( ǫ ). The numerica l r e sults are summ arized in T able 1. As susp ected, the estimator ˆ θ 1 ( ǫ ) signiﬁcan tly outperf o rms b oth the stand a rd Mo nte Carlo estimator ˆ θ 0 ( ǫ ) and the estimat or ˆ θ 2 ( ǫ ) whic h corresp onds to the c hange of measure purely b a sed on the h omo genized HJB equation. In particular, the esti mator ˆ θ 1 ( ǫ ) seems to b e of boun ded relativ e error, whic h is a stronger notion of eﬃciency than asymptotic optimalit y . No. ǫ δ ǫ/δ ˆ θ 0 ( ǫ ) ˆ θ 1 ( ǫ ) ˆ θ 2 ( ǫ ) ˆ ρ 0 ( ǫ ) ˆ ρ 1 ( ǫ ) ˆ ρ 2 ( ǫ ) 1 0 . 25 0 . 1 2 . 5 2 . 26 e − 1 2 . 25 e − 1 2 . 26 e − 1 3 . 36 e − 4 1 . 76 e − 3 6 . 3 4 e − 4 2 0 . 125 0 . 04 3 . 125 3 . 66 e − 2 3 . 6 5 e − 2 3 . 66 e − 2 8 . 40 e − 4 1 . 80 e − 3 1 . 43 e − 3 3 0 . 063 0 . 0 16 3 . 94 9 . 34 e − 4 9 . 33 e − 4 9 . 36 e − 4 3 . 29 e − 3 1 . 85 e − 3 4 . 7 1 e − 3 4 0 . 03 1 25 0 . 007 4 . 46 6 . 93 e − 7 6 . 87 e − 7 6 . 9 9 e − 7 4 . 47 e − 2 8 . 00 e − 4 3 . 3 2 e − 2 5 0 . 025 0 . 0 04 6 . 25 1 . 48 e − 8 1 . 61 e − 8 1 . 51 e − 8 6 . 85 e − 2 7 . 55 e − 4 3 . 0 6 e − 2 6 0 . 02 0 . 002 10 3 . 08 e − 10 1 . 99 e − 1 0 1 . 51 e − 10 4 . 09 e − 1 3 . 82 e − 4 4 . 9 7 e − 2 7 0 . 015 0 . 0013 11 . 54 7 . 60 e − 1 4 1 . 37 e − 13 1 . 07 e − 13 2 . 5 3 e − 1 3 . 01 e − 4 1 . 8 6 e − 1 T able 1: Comparison table for p erio d ic case 23 P . Dupu i s, K. Spiliop oulos, H. W ang Imp ortance Sampling for Multiscale Diﬀusions 7.2 Sim ulation Results for Random Case In this section w e test the p erforman ce of the prop osed estimator in the case of a random en vironment by estimating an exit p robabilit y . I n particular, we again consider the ﬁrs t order Langevin equation (23) in one dimension and wish to estima te th e exit p r o bability θ ( ǫ ) . = P  X ǫ ( τ ǫ ) = x + | X ǫ (0) = x  , where the exit time τ ǫ is deﬁned b y τ ǫ . = in f  t ≥ 0 : X ǫ ( t ) / ∈ ( x − , x + )  with x ∈ ( x − , x + ) . As in the p eriod ic case, we compare the estimator prop osed in Section 6.2 [again denoted b y ˆ θ 1 ( ǫ )] with the standard Mon te-Carlo estimator ˆ θ 0 ( ǫ ) and with the estimat or ˆ θ 2 ( ǫ ) that corresp onds to th e change of measure based just o n the homogenized HJB equation. W e consider V ( x ) = x an d Q ( y ) to b e a zero mean Gauss ian random ﬁeld with co v ariance function E ν [ Q ( x ) Q ( y )] = exp[ − | x − y | 2 ] . It follo ws from Theorem 6.1 that the eﬀectiv e drift r ( x ) and diﬀusivit y q are resp ectiv ely r = − κ, q = 2 Dκ, w here κ = 1 K ˆ K . In this case the limiting HJB equation is the time ind e p endent v ersion of (29) deﬁned on the in terv al ( x − , x + ) κU ′ ( x ) + κD   U ′ ( x )   2 = 0 , with the b oundary condition U ( x ) = 0 if x = x + and U ( x ) = ∞ if x = x − . W e consid er the case D = 1, initial p oin t ( t, x ) = (0 , 0) and x ± = ± 0 . 5. T he maximal viscosit y solution (whic h is also the maximal classical sense subsolution) to this HJB equation is U ( x ) = 1 D [ x + − x ] , x ∈ ( x − , x + ) . While this examp le is not o ver a ﬁ nite time inte rv al, the p roof of Th eorem 4.1 can b e adapted as in [15] to yield the analog ous resu lt s. The imp ortance sampling estimator ˆ θ 1 ( ǫ ) corresp onds to th e control u 1 ( y ) = √ 2 D ˆ K D e Q ( y ) /D , while the estimator ˆ θ 2 ( ǫ ) corresp onds to th e constan t con trol u 2 = r 2 D K ˆ K 1 D . F ollo w ing the n o tation of the p erio dic case, we su mmarize the simulat ion results in T able 2. W e used the randomizatio n metho d to s imulate the Gaussian rand o m en vironment . S e e [28] for an exp o sition on the simula tion of Gaussian random ﬁelds. As in the p eriodic case, th e estimator ˆ θ 1 ( ǫ ) outp erforms b o th ˆ θ 0 ( ǫ ) and ˆ θ 2 ( ǫ ). C o mparin g T ables 1 and 2, the reader ma y w onder why w e did not try com b inat ions of ( ǫ, δ ) with larger 24 P . Dupu i s, K. Spiliop oulos, H. W ang Imp ortance Sampling for Multiscale Diﬀusions No. ǫ δ ǫ/δ ˆ θ 0 ( ǫ ) ˆ θ 1 ( ǫ ) ˆ θ 2 ( ǫ ) ˆ ρ 0 ( ǫ ) ˆ ρ 1 ( ǫ ) ˆ ρ 2 ( ǫ ) 1 0 . 25 0 . 1 2 . 5 1 . 38 e − 1 1 . 38 e − 1 1 . 38 e − 1 7 . 88 e − 4 1 . 59 e − 4 8 . 0 9 e − 4 2 0 . 125 0 . 04 3 . 125 1 . 28 e − 2 1 . 31 e − 2 1 . 28 e − 2 2 . 27 e − 3 4 . 91 e − 3 2 . 6 1 e − 3 3 0 . 06 2 5 0 . 018 3 . 4 7 2 6 . 02 e − 4 6 . 13 e − 4 5 . 89 e − 4 1 . 12 e − 2 5 . 93 e − 3 1 . 3 2 e − 2 4 0 . 05 0 . 01 5 2 . 38 e − 5 2 . 30 e − 5 2 . 22 e − 5 6 . 70 e − 2 8 . 89 e − 3 9 . 9 7 e − 2 5 0 . 04 0 . 007 5 . 72 5 . 5 e − 6 5 . 93 e − 6 4 . 86 e − 6 1 . 25 e − 1 5 . 53 e − 2 1 . 0 5 e − 1 6 0 . 025 0 . 004 6 . 25 − 7 . 82 e − 10 1 . 26 e − 0 9 − 3 . 86 e − 2 5 . 87 e − 1 T able 2: Comparison table for rand o m case with negativ e drift ratio ǫ/δ . Extensiv e empirical s t ud i es sh o wed th a t the direct numerical sc heme that we c hose to sim ulate from the S DE was muc h more sensitiv e in honoring the rule for choosing the discretization step ∆ = 0 . 001 · δ 2 /ǫ in the random case than it wa s for the p erio dic case. So, du e to practical limitation on the computational bu d ge t, w e had to limit to the v alues rep orted in T able 2 in order to obtain meaningfu l results. Note that this s i gniﬁcant computational burden r equ ired to pro duce samples is indep e nd e nt of the sc heme, and th us pro vides further imp etus for the dev elopmen t of the theoretica lly b est alg orithms. W e also ran sim ulations for the mo del dX ǫ ( t ) =  − ǫ δ Q ′  X ǫ ( t ) δ  − x  dt + √ ǫ √ 2 D dW ( t ) , X ǫ (0) = x. (30) where Q is the same r an d om ﬁeld as b efore and the goal is again to estimate the exit probabilit y θ ( ǫ ) . = P  X ǫ ( τ ǫ ) = x + | X ǫ (0) = x  , with no w x − = 0, x + = 0 . 8 and x = 0 . 1. Notice that this corr esp onds to the (random) p ote ntia l function that is dra w n in Figure 2. T he diﬀerence b et ween this and the previous mo del is that h e re there exists a stable p oin t, in particular at x = 0, in the domain of attraction (see Figure 2 ) . The r e sults w ere qualitativ ely the same as in T able 2 and are presented in T able 3. No. ǫ δ ǫ/δ ˆ θ 0 ( ǫ ) ˆ θ 1 ( ǫ ) ˆ θ 2 ( ǫ ) ˆ ρ 0 ( ǫ ) ˆ ρ 1 ( ǫ ) ˆ ρ 2 ( ǫ ) 1 0 . 25 0 . 1 2 . 5 1 . 56 e − 1 1 . 56 e − 1 1 . 56 e − 1 7 . 37 e − 4 4 . 93 e − 4 5 . 77 e − 4 2 0 . 125 0 . 04 3 . 125 2 . 35 e − 2 2 . 3 9 e − 2 2 . 35 e − 2 2 . 00 e − 3 6 . 53 e − 3 1 . 4 3 e − 3 3 0 . 0625 0 . 01 8 3 . 472 2 . 25 e − 3 2 . 32 e − 3 2 . 25 e − 3 6 . 4 5 e − 3 3 . 66 e − 2 3 . 1 4 e − 2 4 0 . 03 1 25 0 . 008 3 . 91 5 . 03 e − 5 2 . 78 e − 5 4 . 3 6 e − 5 8 . 08 e − 2 8 . 91 e − 2 4 . 54 e − 1 5 0 . 025 0 . 0 06 4 . 17 1 . 38 e − 5 5 . 2 3 e − 6 8 . 91 e − 6 2 . 25 e − 1 7 . 79 e − 2 1 . 8 9 e − 1 6 0 . 02 0 . 00 45 4 . 44 2 . 0 e − 7 3 . 0 7 e − 7 3 . 11 e − 7 4 . 61 e − 1 1 . 16 e − 2 2 . 0 1 e − 1 T able 3: Comparison table for random case with a rest p o int App endix A. Numerical Sc hemes for Multiscale Problems Assume for simplicit y the p erio dic setup , and denote b y Y ǫ n the numerical approximat ion to X ǫ t pro vided b y a direct sc h e me with w eak order o f con v ergence p . F or suc h a numerical 25 P . Dupu i s, K. Spiliop oulos, H. W ang Imp ortance Sampling for Multiscale Diﬀusions sc heme one has the follo wing error b ound, wh ose pro of follo ws from standard argum e nts, e.g. [17], and will not b e r e p eated here. Theorem 7.1 If ǫ, δ and the discr etization step ∆ ar e such that ∆ ǫ/δ 2 ≪ 1 , then for every T > 0 and e very smo oth function f with c omp act supp ort, ther e exist c onstants C 0 < ∞ a nd h 0 > 0 that ar e indep endent of ǫ, δ and ∆ such that for al l ∆ , ǫ, δ satisfying ∆ ǫ/δ 2 < h 0 , sup n ≤ T / ∆ | E x 0 f ( X ǫ t n ) − E x 0 f ( Y ǫ n ) | ≤ C 0  ∆( ǫ/δ 2 )  p . (31) The b o und (31) illustrates the computational diﬃcult y in appro ximating multiscale diﬀusions. Fix an error tolerance lev el ζ . The err or b ound (31) indicates that a time step of order ∆ = O  δ 2 ǫ ζ 1 /p  is needed. The corresp onding computational cost p er unit of time is 1 / ∆, wh ic h b ecomes more exp ensiv e as δ , ǫ and δ /ǫ b ecome s ma ller. Of course, the increasing computational cost highlights the imp ortance of imp ortan t sampling or other fast sim ulation tec hniques for treating these kinds of problems. Since our in terest in this work is to study imp orta nce sampling and not the n umerical metho ds, w e d o not elab orat e here muc h on the approximat ion asp ect o f the problem. Numerical metho ds suc h as multisca le inte grator metho ds [45, 17], and pro jective integrato r metho ds [23, 38] ha v e b een prop osed to eﬃcient ly sim ulate systems with widely separated time scales. These metho ds tur n out to b e less co stly th an direct appro ximation metho ds, esp eciall y when δ is small. F or example, in the case of equation (1) and for δ = ǫ 3 / 2 , p erturbation anal ysis [17] sho ws that these metho ds are less costly th an direct approximat ion w h en ǫ ≪ ζ . Observe that in the examples of Section 7, the v alues of ǫ and δ were not smaller than the o v erall tolerance error ζ . Hence, we c hose to us e d irect approxima tion schemes and to r e ly on parallel computing to carry out the sim ulation. Ac kno wledgmen ts W e wo uld like to thank the Cente r for Comp utat ion and Visualization (CCV) at Bro wn Univ ersit y for making a v ailable to us their high p erformance c ompu tin g center. References [1] O. Alv arez and M. Bardi, Viscosit y solutions metho ds for singular p erturbations in d e - terministic and sto c h a stic con trol, SIAM Journal on Contr ol and Optimization, 40(4), (2001 ), pp. 1159-11 88. [2] A. Ansari, Mean ﬁrst passage time solution of th e Smoluc ho wski equation: Application of r el axation dynamics in m y oglobin, Journal of Chemic al Physics , 112(5), (2 000), pp. 2516- 2522. [3] P . Baldi, Large deviations for d iﬀusions p rocesses w it h h omo genization and applica- tions, Anna ls of Pr ob ability , 19(2), (199 1), pp. 509–52 4. 26 P . Dupu i s, K. Spiliop oulos, H. W ang Imp ortance Sampling for Multiscale Diﬀusions [4] M. Bardi and I. Capuzzo Dolcet ta, Optimal Contr ol and Visc osity Solutions of Hamil- ton Jac obi Bel lman Equations , Birk¨ auser, Boston, 1997. [5] A. Bensoussan, J.L. Lions and G. P apanicolaou, Asymptot ic Analysis f o r Perio dic Structur es , V ol 5, Stu dies in Mathematics and its App lic ations, North-Holland Pub- lishing Co., Amsterdam, 1978. [6] M. Bou´ e an d P . Dupuis, A v ariational r epresen tation for certain functionals of Bro wn ia n motion, Ann als of Pr ob ability , 26(4), (1998) , pp. 1641- 1659. [7] M. Bou ´ e, P . Dupuis and R. S. Ellis, Large d e viations for small n o ise diﬀusions with discon tin uous statistics, Pr ob ability The ory and R elate d Fields , 116(1), (2000), pp. 125- 149. [8] J. D. Bryngelson, J. N. Onuc h ic , N. D. So cci and P . G. W olynes, F un nels, pathw ays and th e energy landscap e of protein folding: A synt hesis, Pr oteins , 21(3) , (1995), pp. 167-1 95. [9] R. Buc kdahn and N. I chihara, Limit theorem for con trolled backw ard SDEs and ho- mogenizatio n of Hamilton-Jacobi-Bellman equations, Applie d Mathematics and O p ti- mization , 51 (200 5), pp. 1-33. [10] M. G. Crand all, H. Ishii and P .-L. Lions, User’s guide to viscosit y solutions of second order partial d iﬀe rentia l equations, Bul l. Amer. M ath. So c., (N.S.), 27(1) , (1992), pp . 1-67. [11] T. Dean and P . Dupuis, S plitti ng for r a re eve nt sim ulation: a la rge d evi ation app roa c h to design and analysis, Sto c hastic Pr o c esses and their Applic ations , 119 (2009), pp . 562-5 87. [12] P . Dupuis a nd R.S. Ellis, A We ak Conver genc e Appr o ach to the The ory of L ar ge Devi- ations , John Wiley & Sons, New Y ork, 199 7. [13] P . Dupu is and K. Spiliop oulos, Large deviatio ns for m ultiscale problems via w eak con v ergence m e tho ds, Sto chastic Pr o c esses and their Applic ations , (2011), to app ear. [14] P . Dupuis and H. W ang, Imp ortance sampling, large deviations and diﬀeren tial games, Sto chastics and Sto chastics R ep orts, 76, (2004), pp. 481-50 8. [15] P . Dupuis and H. W ang, Su bsolutions of an Isaacs equation and eﬃcient s c hemes for imp ortance sampling, Mathematics of Op er ations R ese ar ch , 32 (3), (2007), p p. 723-757. [16] W.E. and B. Engquist, Th e heterogeneous m ulti-scale metho d s, Comm. M a th. Sc. , 1(1), (2003 ), pp. 87-133. [17] W. E, D. Liu and E. V anden-Eijn den, Analysis of m ultiscale metho ds for sto c hastic diﬀeren tial equations. Comm. Pur e Appl. Math. , 58(11), (2005) , 1544-15 85. [18] S.N. Eithier and T.G. Kurtz, M a rkov Pr o c esses: Char acterization and Conver genc e , John Wiley & Sons, New Y ork, 1986. 27 P . Dupu i s, K. Spiliop oulos, H. W ang Imp ortance Sampling for Multiscale Diﬀusions [19] L. Ev ans, P erio dic h o mogenization of certain fully nonlinear partial diﬀerentia l equa- tions, Pr o c. R oy. So c. Edinbur gh Se ction A , 120 (19 92), p p. 245-265. [20] W.H. Fleming and H.M. Son er , Contr ol le d Markov Pr o c esses and Visc osity Solutions , Springer, 2nd Ed., 2006 . [21] M. F reidlin a nd R. So w ers, A comparison of homogeniza tion and large deviations, with applications to wa v efront propagatio n, Sto chastic Pr o c ess and their Applic ations , 82(1), (1999 ), pp. 23–52. [22] C.W. Gear and I.G. Kevrekidis, Pro jectiv e metho ds for stiﬀ d iﬀe rentia l equations: problems with gaps in their eigen v alue sp ectrum, SIAM Journal of Scientiﬁc Comput- ing , 24(4 ), (2003), pp. 109-110 . [23] D. Giv on, I. G. Kevrekidis and R. Kup f e rman, Strong con v ergence of pro jectiv e in te- gration schemes for singularly p erturb ed sto c h ast ic diﬀeren tial systems, Comm. Math. Sci. , 4(4), (200 6), pp. 707-72 9. [24] P . Guasoni and S. Rob ertson, Optimal imp ortance sampling with explicit formulas in con tin uous time, Financ e and Sto chastics , 12(1), (2008), pp. 1-19. [25] K. Horie and H. I s hii, Sim u lt aneous eﬀects of homogenization and v anish in g v iscosit y in fully nonlinear elliptic equations, F unkc i a laj Ekvaciaj , 46(1), (2003) , pp. 63-88. [26] C. Hy eon and D. Thiru m a lai, Can energy land sca p es roughn ess of proteins and RNA b e m easur e d b y using mechanica l unfolding exp erimen ts?, P r o c . Natl. A c ad. Sci., USA, 100(1 8), (2003) , pp. 10249-10 253. [27] I. G. Kevrekidis, C. W. Gear, J . M. Hyman, P . G. Kevr ekid is, O. Runb o rg and C. Theo dorop oulo s, Equation-free, coarse-grained m ultiscale compu t ation: Enab lin g mi- croscopic sim ulators to per f o rm system-lev el analysis, Comm. Math. Sci. , 1(4) , (200 3), pp. 715 -762. [28] P .R. Kramer, O. Ku rbanm urado v and K. Sab elfeld, Comparativ e analysis of m ulti- scale Gaussian random ﬁeld sim ulation alg orithms, Journal of Computa tional Physics , 226(1 ), (2007), pp. 897-924 . [29] S. Kozlo v, The a v eraging of random op erators, Math. USSR Sb , 109 (1979), p p . 188- 202. [30] S. Kozlo v, Geometric asp ects of a v eraging, Russian Math. Surveys , 44 (1989), pp . 91-14 4 [31] S.M. Kozlo v and A.L. Piatniskii, Degeneration of eﬀectiv e diﬀusion in the presence of p eriod ic p o tent ial, Ann. Inst. H. Poinc ar e P r ob ab. Statist. , 32( 5), (1996), pp. 57 1-587. [32] S. Lifson and J . L. J a c kson, On the self-diﬀusion of ions in a p oly electrolyte solution, Journal of Chemic al P h ysics , 36, (1962), pp . 2410-2 414. 28 P . Dupu i s, K. Spiliop oulos, H. W ang Imp ortance Sampling for Multiscale Diﬀusions [33] P .-L. Lions and P .E. Souganidis, Homogenizat ion of degenerate second-order PDE in p eriod ic and almost p erio dic environmen ts and applications, Ann. Inst. H . Poinc ar ´ e Ana l. Non Lin ´ eair e , 22(5), (2005), pp. 667-677. [34] D. Mondal, P .K. Ghosh and D.S. R ay , Noise-induced transp ort in a rough rac k et p o- ten tial, Journal of Chemic al Physics , 130, (2009), pp. 0747 03.1-0747 03.7. [35] M. Matsumoto and T. Nishim ura, Mersenne t wister: a 62 3-dimensionally equid is- tributed uniform pseudo-random n umber generator, ACM T r ans. Mo del. Comput. Simul. , 8(1), (1998 ), pp. 3-30. [36] S. O ll a, Homogenizatio n of Diﬀusion Pro cesses in Random Fields, 1994, a v ailable at www.ceremade.dauphin e .fr/ ∼ olla/l ho.ps. [37] S. Olla and P . S ir i, Homogenization of a b ond diﬀusion in a lo cal ly ergo dic random en virnoment, Sto chastic Pr o c esses and their Applic ations , 109 (2004), pp. 317-3 26. [38] A. Papa v asiliou and I. G. Kevrekidis, V ariance redu c tion for the equation-free sim u- lation of multiscale sto c hastic systems, Mu l tisc ale M o deling Simulation , 6(1), (2007), pp. 70-8 9. [39] G.A. P a vliotis and A.M. Stuart, Multisc ale Metho ds: Aver aging and Homo genization , Springer, 2007 . [40] G. Papanicola ou and S.R.S. V aradhan, Bound a ry v alue problems with r a pidly oscillat- ing random co eﬃcien ts, Col lo quia Mathematic a So cietatis Janos Bolyai 27, R andom Fields , Esztergom (Hungary) 1979 , North Holland, (1982 ), pp . 835-87 3. [41] G. P apanicolaou, Diﬀusion in Random Media, 1994, av ailable at math.stanford.edu/ ∼ papanico/pub f t p/make .p df [42] R. Rho des, Diﬀusion in a lo ca lly stationary random environmen t, Pr ob ability The ory and R e l ate d Fields , 143(3- 4), (2009), pp. 545-568. [43] R. Rho des, Homogenizati on of lo cally stationary diﬀusions with p ossibly degenrate diﬀusion matrix, A nnales de l’Institut H enri Poinc ar´ e , 45(4), (2009), pp. 981- 1001. [44] J.G. Sa ve n, J. W ang and P .G.W olynes, Kinetics of protein folding: The dyn a mics of globally connected r o ugh energy landscap es with biases, Journal of Chemic al Physics , 101(1 2), (1994) , pp. 11037-11 043. [45] E. V anden-Eijn den, Numerical tec hniqu e s for m ulti-scale dyn amical systems with sto c hastic eﬀects, Comm. M a th. Sci. , 1(2), (2003 ), pp. 385-39 1. [46] E. V anden-Eijnd en and J . W eare, Rare ev ent sim ulation with v anishing error for small noise diﬀusions, (200 9), sub mitt ed. [47] R. Zwanzig, Diﬀusion in a rough p oten tial, Pr o c. Natl. A c ad. Sci. USA , 85, (19 88), pp. 2029- 2030. 29

Importance Sampling for Multiscale Diffusions

Original Paper

Comments & Academic Discussion

Leave a Comment

Original Paper

Related Papers

Comments & Academic Discussion

Leave a Comment