Efficient Bayesian experimentation using an expected information gain lower bound

SIAM/ASA J. U NCERT AINTY Q UANTIFICA TION c  xxxx So ciet y for Industrial and Applied Mathematics V ol . xx, pp. x x–x Eﬃcient Ba y esian exp e rimentation using an exp ected info rmation gain lo w er b ound P anagiotis Tsiliﬁs ∗ , Roger G. Gh a nem † , and Pa ris Hajali ‡ Abstract. Exp erimental design is crucial for inference where limitations in the data collection pro ced ure are present due to cost or other restrictions. Op timal experimental d esigns determine p arameters t hat in some ap p ropriate sense mak e the data the most informative possible. In a Bay esian setting this is translated to up dating to the b est possible posterior. Information theoretic arguments have led to the formation of the exp ected information gain as a design criterion. This can b e ev aluated mainly b y Monte Carlo sa mp ling and maximized by u sing sto chastic appro ximation methods, both known for b eing compu tationally exp ensive tasks. W e prop ose a framework where a lo wer bound of the exp ected information gain is used as an alternativ e d esign criterion. In addition to alleviating the computational b urden, this also add resses issues concernin g estimation bias. The problem of p ermeabilit y inference in a large contaminated area is used to demonstrate the va lidity of our ap- proac h where we emplo y the massiv ely parallel version of th e m ultiphase multicomponent sim ulator TOUGH2 to sim u late contaminan t tran sp ort and a P olyn omial Chaos ap p ro x imation of the forward mod el th at further accelerates the ob jectiv e funct ion ev aluations. The prop osed metho d ology is demonstrated to a setting where ﬁeld measurements are a v ailable. Key wo rds. Ba yesian exp erimental design, Exp ected information gain, Sto chastic optimization, Polynomial Chaos, Two-phase transport AMS subject classiﬁcations. 62F15, 62K05, 86A22, 86A32 , 94A17 1. Intro duction. Remediation of a p ol lu ted subsu r face pr esen ts an increasingly common need in most urb an areas und ergoing accelerated expansions. Risks asso ciated with these p ollutan ts range from health consequences to ﬁn ancial costs of b oth monitoring, remediation and insur ance. T he fact that the subsu rface can n ev er b e completely charac terized is a key con tributor to these r isks, and credible pr o cedures f or improving this c haracterization hav e far reac hin g consequences across the sp ectrum of constituency . What is ultimately needed is a suﬃcient assessmen t of the su bsurface which dep end s on the p h ysics of subsu rface ﬂ o w and a k n o wledge of the initial conditions, and an assessment of the ﬂow functionals that are relev ant to risk assessmen t. As a ﬁr st step, in the presen t pap er, w e address th e issue of optimal subsur f ace c haracterization under conditions of limited resources. A v ast amoun t of researc h works h a v e b een dedicated to this c hallenge and h a v e oﬀered application-sp eciﬁc answers throughout the y ears with a Ba yesian decision analysis f ramew ork often b eing present. Among the earliest ones, the works b y [ 7 , 17 ], discussed the wo r th of correlated h ydraulic co n ductivit y measurement s that w as ev aluated dep ending on the rev en ue due to the resulting u ncertain t y reduction and the cost of obtaining them. In the excellen t w ork b y [ 18 ], a s equen tial samplin g metho dology wa s d ev elop ed and the issues of where to obtain the n ext sample and w hen to cease sampling w ere addressed using an analogous worth-of- ∗ Department of Math ema t ics, University of Southern Calif ornia, Los Angeles, CA 90089 ( tsiliﬁs@usc.edu ). † Department of Civil Engineering, Universi ty of Southern Califo rnia, Los Angeles, CA 90089 ( ghanem@usc.edu ) ‡ MUREX Environmental Inc., 15375 Ba rranca Pa rkwa y , Suite K - 101, Irvine, CA 9261 8 ( paris ha jali@murexenv.co m ) 1 2 P ANAGIOTIS TSILIFIS, R OGER G. GHANEM AND P ARI S HAJALI information fr amework. More recen tly [ 11 ], a Kullbac k-Liebler formalism that maximizes the w orth of inf ormation of an experiment wa s devel op ed, and the w orth of additional observ ations c haracterized. While that w ork ignored the physics of ﬂ o w in p orou s media by d ev eloping a kriging mo d el for the measur ed concent r ations in the su bsurface, it do es lay the f ou n dations for the presen t w ork. A common c h aracteristic in all the abov e w orks and man y th at follo wed, is that the d ata-w orth an alyses, Bay esian in nature, are divided in three main ph ases: the prior, the prep osterior and the p osterior analyses w ith the ﬁrst and the last b eing the w ell kno wn steps of an y Ba y esian metho d and the second b eing the phase where the d ecisions ab out the design parameters are to b e made. This t ypically inv olv es the suggestion of a utilit y function, exp r essed often in m onetary terms that qu an tiﬁes the worth of colle cting some sp ec iﬁc d ata and then ev aluating the a v erage wo r th by taking the exp ectation of this function. The ab ov e mentio n ed works and numerous others [ 24 , 26 , 3 ] v eriﬁed a general conclusion: that the lo cation or any other d esign parameters c haracterizing the b est samples ma y dep en d on the mo del uncertaint y but also on the d ecision to b e mad e with th e latter factor usually add ing an economic ﬂ a v or in the d ata w orth approac h . Although in most real-w orld problems concerning the geosciences comm un it y , this might b e an eﬃcient w ay to address su c h issues, it is of critical imp orta n ce, fr om a mathematical p ersp ectiv e, to b e able to distinguish th e r ole of mo del un certain t y in optimal designs and n ot muc h atten tion has b een receiv ed on this dir ection. It is the p urp ose of this pap er to p resen t an app roac h to th e optimal design p roblems arising in conta min an t transp ort applications that m eets eac h p roblem’s ob jectiv es from an information theoretic p ersp ec tive, that is to exp lore the use of other design criteria as can- didates in the prep oste rior p h ase of an y data-w orth framework, that provide maximum in- formation ab out m o del u ncertain t y without b eing concerned ab out economic costs, but still pro vidin g lo w r isk resu lts for decision making. An extensiv e statistical literature is a v ailable on exp erimental design metho dologies for linear regression mo d els summ arized in [ 2 , 6 ] with criteria sta ted as functionals of the Fisher information mat r ix. Ba yesia n count erp arts of these criteria as w ell as additional design criteria for nonlinear mo d els hav e also b een suggeste d (see [ 4 ] for a review). T yp ically these are taken as the exp ectation of some appr opriately c hosen utilit y fun ctions or appro ximations of them. This is the d irection follo wed in this w ork. W e c ho ose a design criterio n for data collect ion pro cedure according to an ob jectiv e fun ction that maximizes inf orm ation gain ab out the uncertain parameters. As a natural consequen ce of dealing with realistic, complex and often high-dimensional non-linear mo d els, computationally intensiv e tec hniqu es such as the u se of mo del su rrogates and Monte Carlo m etho ds are inv olv ed in order to ev aluate and optimize the design criterion. The Ba ye sian parameter inferen ce results that are obtained f rom suc h optimal designs can then pro vide a r ational basis for decision m aking. All the ab o v e men tioned steps can b e su mmarized as shown in the ﬂo wc hart in Fig. 2 . Th e metho dology dev elop ed in this pap er is demonstrated on an actual site sh o wn in Fig. 1 . S oil-t yp e d ata w as collected at this site at v arious depth at b oreholes located along the solid lines that criss-cross the site, as sho wn in the soil b oring map in Fig. 1 . Concent r ation v alues w ere collected at the b lac k dots throughout the site. With reference to th is sp eciﬁc case study , the sequence of action that we in vesti gate consists of the follo wing s teps: 1. In itial s p ills are assumed to ha ve occur red at the lo cation of the storage tanks sho wn EFFICIENT BA YESIAN EXPERI MENT A TION 3 Figure 1: Aerial ph oto sho wing the lo cation of storage tanks (left) and soil borin g m ap (righ t). in Figure 1 . 2. Mean and v ariances of p ermeabilit y v alues are asso ciated to soil t yp es th roughout the site, resulting in lognormal initial probabilistic mo dels of p ermeabilities. 3. Physics of ﬂo w through p orous media yield a corresp ond in g distribution for the con- cen tration of p ollutants thr oughout the site. 4. F or a giv en monitoring design consisting of a set o f sp eciﬁc ﬁv e spatial lo cations where concen trations are to b e observ ed, a distrib u tion of these observ ables can b e obtai n ed from th e previous step. 5. F rom this distrib u tion of observ ables, th e lik eliho o d for eac h p ossible v alue of ob- serv able is calculated, and the Ba y esian p ost erior distribu tion for that observ able is ev aluat ed. Not e that at this step, no concen tration data is assumed to hav e b een collect ed yet . 6. Th e Kullbac k-Leibler (KL) d istance b et w een th e prior and the p osterior distributions is ev aluated as a measure of information gain. 7. Th e KL d istance is integrat ed o v er all p ossible v alues attained b y the observ ables at the ﬁxed design locations. 8. Th e optimal monitoring locations are determined b y maximizing this a v eraged KL distance. 9. Measurements of p ollutan ts are tak en f r om the actual ﬁ eld data at the optimal lo ca- tions. 10. The Ba ye sian p osterior distributions for p ermeabilitie s are calculated based on these observ ations. Th ese p osteriors pr ovide an u p dated description of the p erm eabilities throughout th e s ite and can b e used to imp ro ve the prediction of concen tration maps o v er time. While a stopping criterion for this Ba yesia n up date has b een int r o duced elsewhere [ 11 ], it is outside the scop e of the presen t in vest igation. The pap er addresses computational and algorithmic issues required for the implement ation of the ab ov e s teps and p ro vides in s igh t in to the computed optimal monitoring s tr ategie s. and is organized as follo ws. In Section 2 w e in tro du ce the d esign criterion used f or e xp erimen tal design that pro vides optimal Ba yesia n 4 P ANAGIOTIS TSILIFIS, R OGER G. GHANEM AND P ARI S HAJALI Figure 2: A ﬂ o w chart outlining the steps in prior, p rep osterior and p osterior stages of the optimal design metho d ology . inference results and a lo wer b ound that is used as an alternativ e in the optimization p ro cedure. The sto c hastic approxima tion framework r equired f or maximizing the ob jectiv e fu nction, is describ ed in Section 3 . Section 4 pro vides a to y example in order to j ustify the use of the lo w er b ound estimate instead of the actual exp ected in formation gain as the design criterion. Then our metho dology is applied on a real site in Section 5 . Precisely , the mo del is describ ed in subsections 5.1 and 5.2 , then a surrogate is constructed in su bsection 5.3 in order to accelerate sim ulations of computationally in tensive pr o cedures. The exp erimen tal d esign analysis is p erformed in sub section 5.4 and the results are v alidated in subsection 5.5 by estimating the Bay esian u p date of the uncertain parameters based on data that is generated b oth for a hyp othetical scenario and from ﬁeld measur emen ts. Our conclusions are summ arized in Section 6 . 2. Ba y esian exp erimental design. 2.1. Design criterion. Our main interest in the present work is to deﬁne and ev aluate a particular exp eriment whic h, constrained by ﬁxed resources, will r ed uce th e prediction EFFICIENT BA YESIAN EXPERI MENT A TION 5 uncertain ty in some wel l-deﬁ n ed optimal sense. Let d denote wh at we refer to as the set of design parameters. F or eac h c h oice of these paramete r s, the exp erimen t design is ﬁxed. Thus d could, for instance, consist of smo othing k ernels that parameterize measurement instrumen ts. Alternativ ely , as we do in th e p r esen t work, we construe d ∈ R 2 m as a v ector that deﬁnes the spatial co ord inates in the horizon tal plane of m p oin ts at wh ic h observ ations will b e obtained. Clearly , d could also consist of nonlinear functionals. Th e n u m erical v alues of the observ ations will b e denoted b y y ∈ R m . Pr ior to conducting the exp erimen ts, y is a random vec tor. F ollo wing the exp eriment, the n umerical v alues attained by this ve ctor will b e u sed to condition the infer en ce. W e mak e the restrictiv e assumption that optimal reduction of uncertain ty in mo d el-based pred iction is attained through an optimal red uction of un certain t y in mo del parameters wh ic h w e denote b y θ . Relaxing this assump tion, w h ile n um erically tedious, do es not pr esen t conceptual diﬃculties. 2.1.1. Expected info rm ation gain. W e are thus in terested in inferring the u nkno w n pa- rameters θ that go v ern th e b eha vior of a physical p ro cess. W e mo del these parameters as a v ector of random v ariables. In a Ba yesia n setting, th is in ference is carried out by up dating the p rior d istr ibution p ( θ ) with a p osterior one, namely p ( θ | y , d ) , giv en that some sp eciﬁc data y w as o b serv ed for particular d esign parameters d . The p osterio r distribution is up dated according to the rule (2.1) p ( θ | y , d ) = p ( y | θ , d ) p ( θ ) p ( y | d ) where the m ulti-dimensional in tegral, p ( y ) = R p ( y | θ , d ) p ( θ ) d θ , is referred to as the evidenc e . Using a Shannon information approac h we can deﬁne the information gain after up dating the distribution of θ to b e th e Kullback- Leibler (KL) divergence [ 20 ] of the prior and the p osterior distributions, that is (2.2) D K L [ p ( ·| y , d ) | | p ( · )] = Z Θ p ( θ | y , d ) log  p ( θ | y , d ) p ( θ )  d θ . W e are in terested in quantifying the information gain b efore the d ata is actually collected. In the same spirit w e deﬁne the exp ected information gain, ﬁrs t prop osed by [ 22 ] as (2.3) U ( d ) = Z Y D K L [ p ( ·| y , d ) || p ( · )] p ( y | d ) d y = Z Y Z Θ p ( θ | y , d ) log  p ( θ | y , d ) p ( θ )  d θ p ( y | d ) d y This can b e inte r preted as f ollo ws: The con tribution of an y p ossible outpu t that is u sed as a set of data y to up date to the p osterior d istribution is giv en as th e KL dive rgence of the t wo d istributions. Before the data is collected, U ( d ) provides u s with a measur e of the a v erage information to b e gained. This is a fu n ction only of the d esign parameters d . It is therefore natural to assum e that the c hoice of d ∗ that oﬀers on aver age the most informative observ ations and th u s , is the op timal exp erimen tal design, is the one that maximizes U ( d ) and s o it s atisﬁes (2.4) d ∗ = arg max d ∈D U ( d ) . 6 P ANAGIOTIS TSILIFIS, R OGER G. GHANEM AND P ARI S HAJALI Ev aluatio n of the ab o v e ob jectiv e function and therefore its optimization is n ot a triv- ial task. At ﬁrst, one can see that the p osterior, b eing unk n o wn b eforehand, needs to b e ev aluat ed or approximat ed. In [ 23 ], the p osterior was replaced b y its L ap lace appro ximation and U ( d ) w as estimated with a sparse quadrature r ule. In [ 31 ] an alternativ e form of th e ob jectiv e fun ction w as d eriv ed by usin g Ba yes’ rule to express the p osterior in terms of the evidence, the lik eliho o d and th e prior distr ibutions and estimates we r e prop osed v ia Mont e Carlo metho ds. In b oth app roac hes, the maxima were ident iﬁed by using an exhaustive grid searc h o ve r the whole d esign space and limitations d ue to computational exp ense were re- p orted. Ev aluation of the d esign criterion on all p oints of the design space can easily b ecome infeasible in app licatio n s where either higher d imensional d esign parameters are inv olv ed or an exp ensiv e forward solv er is incorp orated, h ence the n eed for iterativ e searc h strategies is necessary to detect th e optimal v alue. In [ 15 , 16 ] it wa s demonstr ated that sto c hastic ap- pro ximation metho ds [ 35 ] are w ell adapted to the p resen t situation, when the Mon te Carlo estimate of the ob jectiv e is used and this is the approac h we follo w in this pap er . Ho wev er, instead of simply adapting the metho dology dev elop ed in these works, we f urther d eriv e a lo w er b ound of the exp ected in formation gain and its corresp ondin g Mont e C arlo estimate to b e maximized. The reason f or doing so is to o ve r come computational obstacles that arise from our application: The direct Mon te Carlo estimato r entai ls a discretization of the double in tegral app earing in equation ( 2.3 ) and has b een sho wn to ha ve a bias th at is inv ersely pro- p ortional to the num b er of samp les in the inner su m of the estimate, requiring a v ery large n umb er of in ner lo op samples for the b ias to b e negligible [ 31 ] (see also App endix B in [ 15 ] for a n umer ical stud y). F ur thermore, the v ariance of that estimator is con trolled only b y the n umb er of the outer loop samples. Expr essions for b ot h the bias and v ariance are give n in A . F or applications suc h as the one included in this p ap er, using tens or hundreds of thousand s samples can easily b ecome prohib itive. On the con trary , th e low er b ound deriv ed b elo w can b e easily seen to b e unbiased and its v ariance is control led by th e p ro duct of the num b er of samples used in b oth lo ops, something that allo ws us to achiev e th e same lev el of accuracy in our estimate with a m uch low er n u m b er of samples. 2.1.2. Derivation of a low er b ound. By s u bstituting p ( θ | y , d ) from eq. ?? , w e write U ( d ) = Z Y Z Θ log  p ( θ | y , d ) p ( θ )  p ( θ , y | d ) d θ d y = Z Y Z Θ log [ p ( y | θ , d )] p ( y | θ | d ) p ( θ ) d θ d y − Z Y log [ p ( y )] p ( y | d ) d y and by denoting with H [ q ( ω )] = − R log [ q ( ω )] q ( ω ) d ω the en tropy of a distr ib ution q ( ω ) we tak e U ( d ) = − Z Θ H [ p ( y | θ , d )] p ( θ ) d θ + H [ p ( y | d )] . In w hat follo ws we take th e likeli h o o d as a Gauss ian d istribution with densit y p ( y | θ , d ) = (2 π ) − m/ 2 | Σ | − 1 / 2 exp  − 1 2 [ y − G ( θ , d )] T Σ − 1 [ y − G ( θ , d )]  . EFFICIENT BA YESIAN EXPERI MENT A TION 7 where th e mean is the mo del output G ( θ , d ) and th e cov ariance m atrix is Σ . This is a common c h oice for mo dels wh ere the observ ables y are d eﬁned as the mo d el pred iction plu s some m easuremen t errors y = G ( θ , d ) + ǫ with the latter being normally distrib u ted with densit y N (0 , Σ ). In general, the measurement error can b e r elated to the mo del output through a pr op ortionalit y f actor or even a more complex relation which is t ypically in corp orated in the co v ariance m atrix Σ . T his implies that Σ can dep end implicitly on b oth th e uncertain and the design parameters. F or our purp oses w e mak e the rather simple assu mption th at no su ch dep en dence is inv olv ed and Σ h as ﬁ xed entries. F urther details on the exact v alues of the co v ariance matrix in clud ing indep end ence among the measuremen t err ors is p ro vided later in our app lications. F or the ab o v e c hoice of the likeli h o o d function the en trop y can b e calculated a n d is simply H [ p ( y | θ , d )] = 1 2 { m + lo g [(2 π ) m | Σ | ] } and we ﬁnally get (2.5) U ( d ) = − 1 2 { m + lo g [(2 π ) m | Σ | ] } + H [ p ( y | d )] . A t last, using Jensen’s inequalit y , w e derive a low er b ound for U ( d ). Namely we h a v e H [ p ( y | d )] = Z Y − log [ p ( y | d )] p ( y | d ) d y ≥ − log  Z Y p 2 ( y | d ) d y  and we deﬁne the low er b ound of U ( d ) as (2.6) U L ( d ) = − 1 2 { m + log [(2 π ) m | Σ | ] } − log  Z Y p 2 ( y | d ) d y  . Note th at since the common ﬁrst term of U ( d ) and U L ( d ), as they app ear in equations ( 2.5 ) and ( 2.6 ), are indep enden t of d , one ev entually n eeds only to m aximize the second term, namely the en tropy of the marginal d istribution of the data or its lo wer b ound . This idea is not new and has b ee n p r eviously used in linear regression mo dels [ 32 ] and in geoph ysical applications [ 37 ]. 2.1.3. Estimation of the low er b ound. As men tioned ab o ve , the optimization p roblem is equiv alent to minimizing U ∗ L ( d ) = Z Y p 2 ( y | d ) d y . After expanding the evidence function and writing U ∗ L ( d ) = Z Y Z Θ Z Θ p ( y | θ 1 , d ) p ( y | θ 2 , d ) p ( θ 1 ) p ( θ 2 ) d θ 1 d θ 2 d y w e can see that a Mon te C arlo estimate of the ab o ve is (2.7) ˆ U ∗ L ( d ) = 1 N M N ,M X i,j =1 p ( y i | θ j , d ) 8 P ANAGIOTIS TSILIFIS, R OGER G. GHANEM AND P ARI S HAJALI where { θ i } , { θ j } , i = 1 , ..., N , j = 1 , ..., M are i.i.d. samples from p ( θ ) and { y i } are i.i.d. samples from p ( y | θ i , d ). 3. Sto chastic optimization. 3.1. Simultaneous p erturbation sto chastic app roxi ma tion. Maximizatio n of the ob jec- tiv e fu nction derived in the ab o ve section can in general b e a diﬃcult task. In cases where the design parameters are of high dimension or an exp ensive forw ard solv er is inv olv ed in th e ev aluat ion of U ∗ L ( d ), a direct grid searc h can easily b ec ome proh ib itiv e. S ince only a noisy estimate of the actual ob jectiv e fun ction to b e maximized is av ailable, we turn our atten tion to sto chastic appr o ximation metho ds. These are algorithms that approxima te ro ots of noisy functions and can b e used for optimization p roblems to ﬁn d the ro ots of the grad ient of the ob jectiv e fun ction to b e optimized. This is done with an iterativ e pr o ce d ure of the general form (3.1) d k +1 = d k − a k g ( d k ) , k ≥ 0 where { a k } k ≥ 0 is a sequence of p ositiv e pre-sp eci ﬁ ed deterministic constan ts, kn o wn also as the le arning r ate of the algorithm and g ( d ) = ∇ d U ( d ) is th e gradient of the ob jectiv e f u nction with resp ect to d . Algorithms that use an explicit expression for the gradient are called R obbins-Monr o algorithms [ 30 ] and those wh ere a ﬁnite-diﬀerence s cheme is emplo yed, are called Kiefer-W olfowitz algorithm s [ 19 ]. F or our purp oses w e use an algorithm from the Kiefe r- Wolfowitz family , namely the Simultane ous P erturb ation Sto chastic Appr oximat ion metho d (SPSA), p rop osed by Spall [ 35 , 36 ]. What mak es this metho ds attrac tive v ersu s others is the fact that only t wo forw ard ev aluations are required for th e gradien t estimat ion. The up dating step of the m etho d is giv en by (3.2) d k +1 = d k − a k ˆ g k ( d k ) , where (3.3) ˆ g k ( d k ) = ˆ U ∗ L ( d k + c k ∆ k ) − ˆ U ∗ L ( d k − c k ∆ k ) 2 c k    ∆ − 1 k , 1 . . . ∆ − 1 k ,n d    , (3.4) a k = a ( A + k + 1) α , c k = c ( k + 1) γ and a k , c k , α , γ are p ositive parameters whose v alues aﬀect the con v ergence rate and the gradien t estimate. General instructions on h o w to c h o ose th e appr opriate v alues f or eac h problem are giv en in [ 36 ]. The v ectors ∆ k are random vecto rs w ith coeﬃcients ∆ k ,i dra wn indep end en tly from an y ze r o-mean probability distribution suc h that the exp ectatio n of | ∆ − 1 k ,i | exists [ 35 ]. A common c hoice, whic h we used in our analysis, c orr esp onds to ∆ k ,i = 2( Z − 1 / 2) where Z ∼ Bernoulli( p ) with success probabilit y p = 1 2 . 4. Example: Nonlinear m o del. EFFICIENT BA YESIAN EXPERI MENT A TION 9 4.1. The model and experimental scena rios. Th e main goal in this exa mp le is to explore the accuracy of the low er b ound estimate as a su b stitute for the direct maximization of the exp ected information gain [ 15 ] and provi d e a numerical comparison of the t wo appr oac h es. This is ac h iev ed b y ev aluating the t wo ob jectiv e fu nctions for a s im p le algebraic m o del whic h is inexp ensiv e to ev aluate and is nonlin ear with resp ect to b oth the un certain and d esign parameters. Consider th us the mo del wh ere the observ able qu an tit y y is dep endent on κ and d as y ( κ, d ) = G ( κ, d ) + ǫ (4.1) = κ 3 d 2 + κe −| 0 . 2 − d | + ǫ (4.2) where G ( κ, d ) is the mo del outpu t and the n oise is taken to b e ǫ ∼ N (0 , 10 − 4 ). F or pr ior w e c ho ose initially κ ∼ U (0 , 1) and la ter on w e discuss a few more ca ses. Supp ose we h a v e con trol o v er d ∈ D where D = [0 , 1] is the design space and w e are in terested in inferrin g κ . W e explore t wo cases, ﬁrst the case wh ere inference is carried out using a s ingle observ atio n of y and second the case wher e t wo observ ations of y can b e obtained, corresp onding to d iﬀerent v alues of d . One can think of the design p arameter d as the lo cation w h ere y is observe d . Before observin g y , we w ould lik e to know th e v alue of d that w ould make our observ ations the m ost inf orm ativ e ones. Bot h cases of this example ha ve b een studied in [ 15 ] u sing the direct Mon te Carlo estimate of U ( d ). Th e ﬁrst case w as also stud ied in [ 23 ] using the Laplace appro ximation of th e p osterior distrib u tion of κ and then p erforming the in tegration with sparse quadratures. 4.2. Results. W e are using the exp ect ed inform ation gain lo we r b oun d estimate as give n in ( 2.7 ) as our criterion to determine the optimal design d f or inferrin g κ . F or comparison, w e also repro d uce the results of [ 15 ] using the exp ected information gain estimate whic h from no w on w e call double lo op M onte Carlo (dlMC) estimate. The exact expression of the dlMC estimate is giv en in A . O u r computations are p erform ed using an ensemble of samples with N = M = 10 4 for all diﬀeren t priors that w e consider b elo w, while keeping the error v ariance ﬁxed. W e demonstrate that the t wo estimate s h are the same maxima and their graphs ha ve go o d quantitat ive agreemen t w ith the lo w er b ound p ro viding sligh tly less noisy v alues. 4.2.1. Single observation. Fig. 3 shows the v alues of estimates of U ∗ L ( d ) for the d esign of one exp eriment after using a 101-p oin t unif orm p artition on the design space D = [0 , 1] for N = M = 10 3 and for N = M = 10 4 as we ll as the estimates of dlMC. All cases sho w the existence of t wo lo cal maxima at d = 0 . 2 and d = 1. As exp ect ed, our estimate alwa ys tak es v alues sligh tly sm aller than dlMC, as the former is a lo w er b ound of the latter. Due to the fact that the ﬁrst term in our estimate (the Gaussian en tropy) is a constan t, while the ﬁrst term of the d lMC estimate is a Mon te-Carlo appro ximation of the very same constan t, w e observ e that the low er b ound is a smo other curv e for the same v alues of N , M . F ollo wing the slop e analysis of [ 15 ] we also pr esen t the results for tw o diﬀeren t choic es of prior, namely when κ ∼ U (0 , κ e ) and when κ ∼ U ( κ e , 1) w here κ e =  (1 − e − 0 . 8 ) / 2 . 88  1 / 2 whic h is the p oint where the slop e of G ( κ, 1) b ecomes greater than the slop e of G ( κ, 0 . 2). F or the former case w e obtain a global maxim u m at d = 0 . 2 whereas for the latter, the maxim u m is at d = 1. A t some p articular p oin ts w e can observe that although our estimate is only a lo wer boun d of the 10 P ANAGIOTIS TSILIFIS, R OGER G. GHANEM AND P ARI S HAJALI 0 . 0 0 . 2 0 . 4 0 . 6 0 . 8 1 . 0 d 3 . 0 3 . 1 3 .2 3 . 3 3 . 4 ˆ U L ( d ) ( a ) κ ∼ U ( 0 , 1) D o u b l e - l o o p M C N = 1 0 3 , M = 1 0 3 N = 1 0 4 , M = 1 0 4 0 . 0 0 .2 0 . 4 0 . 6 0 .8 1 . 0 d 1 .8 1 . 9 2 .0 2 . 1 2 . 2 2 . 3 2 .4 2 . 5 ˆ U L ( d ) ( b ) κ ∼ U ( 0 , κ e ) D o u b l e - l o o p M C N = 1 0 3 , M = 1 0 3 N = 1 0 4 , M = 1 0 4 0 . 0 0 . 2 0 . 4 0 .6 0 .8 1 . 0 d 2 . 4 2 . 6 2 . 8 3 . 0 3 . 2 3 . 4 ˆ U L ( d ) ( c ) κ ∼ U ( κ e , 1) D o u b l e - l o o p M C N = 1 0 3 , M = 1 0 3 N = 1 0 4 , M = 1 0 4 Figure 3: Exp ected inform ation gain lo wer b ound s and dlMC with N = 10 4 , M = 10 4 estimated o v er D = [0 , 1] f or thr ee diﬀeren t priors. dlMC estimate, the v alue of the dlMC might fall b elo w the v alue of U ∗ L ( d ). This, in addition to the fact th at we compare noisy estimates of a deterministic q u an tit y , can also emerge as the result of the bias of dlMC whic h is not negligible for our c h oice of M (n u m b er of inner lo op samples). F or a more detailed d iscussion on the bias of d lMC for this example see [ 15 ] and f or the analytic form of th e bias see [ 31 ]. 4.2.2. Tw o observations. W e calculate th e v alues of U ∗ L ( d 1 , d 2 ) f or N = M = 10 4 o v er D × D = [0 , 1] 2 and we c omp are them with those of dlMC in Fig. 4 . S imilar qualit ative r esults as in case 1 can b e observed. F or κ ∼ U (0 , 1), th e p oin ts wh ere maxim um is attained are ( d 1 , d 2 ) = (0 . 2 , 1) and ( d 1 , d 2 ) = (1 , 0 . 2) whic h are com binations of th e lo cal m axima in th e 1-dimensional case. That means that, if tw o observ ations can b e aﬀorded, th ey shou ld b e tak en at the p oints where lo cal m axima exist for th e one observ ation scenario. The ord er is insigniﬁcan t sin ce, as w e see, the U ∗ L ( d 1 , d 2 ) surface is symmetric w ith resp ect to th e d 1 = d 2 line. S imilar conclusions are drawn for κ ∼ U (0 , κ e ) and κ ∼ U ( κ e , 1) w here the m axima are at ( d 1 , d 2 ) = (0 . 2 , 0 . 2) and ( d 1 , d 2 ) = (1 , 1) r esp ectiv ely , as already exp ected from the resu lts of case 1. 5. Example: Case study . This second example demonstrates the app lication of the for- malism to a s u bsurf ace p ollution c haracterization pr oblem for an actual site where ﬁeld data for p ermeabilit y and concentrat ions are a v ailable. EFFICIENT BA YESIAN EXPERI MENT A TION 11 0 . 0 0 . 2 0 . 4 0 . 6 0 . 8 1 . 0 d 1 0 . 0 0 . 2 0 . 4 0 . 6 0 . 8 1 . 0 d 2 κ ∼ U ( 0 , 1) 3 .1 8 0 3 .2 2 5 3 .2 7 0 3 .3 1 5 3 .3 6 0 3 .4 0 5 3 .4 5 0 3 .4 9 5 0 . 0 0 . 2 0 . 4 0 . 6 0 . 8 1 . 0 d 1 0 . 0 0 . 2 0 . 4 0 . 6 0 . 8 1 . 0 d 2 κ ∼ U ( 0 , κ e ) 2 . 0 5 0 2 . 1 2 5 2 . 2 0 0 2 . 2 7 5 2 . 3 5 0 2 . 4 2 5 2 . 5 0 0 2 . 5 7 5 0 . 0 0 .2 0 .4 0 .6 0 .8 1 .0 d 1 0 . 0 0 . 2 0 . 4 0 . 6 0 . 8 1 . 0 d 2 κ ∼ U ( κ e , 1) 2 .6 0 2 .7 2 2 .8 4 2 .9 6 3 .0 8 3 .2 0 3 .3 2 3 .4 4 (a) Low er bound estima te ˆ U L ( d ) 0 . 0 0 . 2 0 . 4 0 . 6 0 . 8 1 . 0 d 1 0 . 0 0 . 2 0 . 4 0 . 6 0 . 8 1 . 0 d 2 κ ∼ U ( 0 , 1) 3 .3 4 3 .4 0 3 .4 6 3 .5 2 3 .5 8 3 .6 4 3 .7 0 3 .7 6 0 . 0 0 . 2 0 . 4 0 . 6 0 . 8 1 . 0 d 1 0 . 0 0 . 2 0 . 4 0 . 6 0 . 8 1 . 0 d 2 κ ∼ U ( 0 , κ e ) 2 . 2 6 2 . 3 2 2 . 3 8 2 . 4 4 2 . 5 0 2 . 5 6 2 . 6 2 2 . 6 8 2 . 7 4 0 . 0 0 .2 0 .4 0 .6 0 .8 1 .0 d 1 0 . 0 0 . 2 0 . 4 0 . 6 0 . 8 1 . 0 d 2 κ ∼ U ( κ e , 1) 2 .7 6 2 .8 8 3 .0 0 3 .1 2 3 .2 4 3 .3 6 3 .4 8 3 .6 0 (b) dlMC estimate ˆ I ( d ) Figure 4: Exp ected information gain lo w er b oun d estimate and dlMC estimate with N = 10 4 , M = 10 4 for the 2-dimens ional design, estimated o ve r D = [0 , 1] 2 for diﬀeren t pr iors. 5.1. Description. W e are interested in p erforming Bay esian infer en ce on the uncertain parameters that aﬀect the transp ort of p ollutan ts in a con taminated area. Suc h a pro cedur e will decrease the uncertain ty r egarding the exten t an d lo cation of the plume and will further aﬀect the cost and other imp ortant asp ec ts of soil remediation. When limited resources are a v ailable for exp erimen tation and data collecti on, it is of great imp ortance to dev elop exp er- imen tal design strategies that will enhance the qualit y of d ata. Our cur r en t study concerns a contaminat ed site lo cated in Santa F e Springs, C alifornia. Previous data consisting of soil t yp es is a v ailable from b oreholes lo cated along four cross s ections across the site. Eac h b ore- hole reac hes a d ep th of 20m. Accordingly , soil t yp e is deﬁned as a categorical v ariable with six p ossible states. T h e ﬁeld and the lo cations of th e b oreholes are s ho wn in Figs. 1 and 5 (left). The a v ailable soil data is used to constru ct a domain that can b e used in our forwa r d mo d el to sim ulate th e transp ort of p ollutan ts. Th e construction of a d omain that can b e regarded as a go o d approximati on of the real situation, is ac hieved b y the stratigraphic mo d eling capabilities of the Groundw ater Modeling Sys tem (GMS) softw are [ 1 ]. More sp eciﬁcally , us in g a stand ard in verse-dista n ce we ighting based in terp ola tion sc heme, we assign soil t yp es for all locations in the a r ea o f our in terest. The strati graphy of the resulting domain as w ell as the exten t of ea ch soil are sh o wn in Figs. 5 (right) and 6 . Next, the domain wa s d iscretized to a 40 × 60 × 10 grid wh ere eac h cell has dimensions 15 . 4m. × 6 . 7m. × 2m. an d carries the inform ation of the soil typ e present in that lo cation. Th e grid is u sed as our ﬁ nite vo lum e d iscretizatio n in the 12 P ANAGIOTIS TSILIFIS, R OGER G. GHANEM AND P ARI S HAJALI subsequent TO UGH2 sim ulations and eac h of the six d iﬀeren t soil typ es presen t in our domain are assigned diﬀerent p ermeabilit y v alues whic h , in our study , are considered to b e rand om and are th e only source of u ncertain t y . Figure 5: Boreholes (left) and ﬁnal domain stratigraphy (righ t) with the z -axis b eing stretc hed for the s ake of illus tr ation. 5.2. Simulating ﬂo w and transp o rt using TOUGH2. 5.2.1. EOS7r mo dule. W e emplo yed the multiphase sim u lator TO UGH2 [ 28 ] and its massiv ely p arallel ve rs ion TOUGH2-MP [ 40 ] with th e E O S7r mo du le to sim ulate groun dwa ter ﬂo w and con taminant transp ort. TOUGH2 provides us with a ﬁnite vo lu me solve r wh ic h discretizes the mass and energy equations o ve r space and time and up d ates a set of pr imary v ariables that consist the s olution of the go verning equations by estimating the incremen ts at eac h time step with a Newton-Raphson metho d . The common mathematical form of the equations for multiphase ﬂu id ﬂ o w include sev eral thermophysical parameters su c h as densit y , viscosit y , enthalp y etc. whic h are determin ed by the v arious ”EOS” (equatio n -of- state) mod ules. Th e E O S7r mod ule [ 27 ] used here, is mainly intended to pr o vide radion uclide transp ort capabilit y but can b e as w ell used for transp ort of v olatile and wat er soluble orga n ic c hemicals (V OCs). Change in concen trations is caused in general for three reasons: transp ort, deca y and adsorption. V olatilization of the V OCs in all phases is mo deled by Henry’s law and o ccurs by adv ection and diﬀusion. Deca y is mo deled by a ﬁrs t-order deca y la w. In case of radion u clide transp ort, this is interpreted as r adioactiv e deca y bu t in the case of organic con taminan ts it can b e explained as bio degradation. Adsorption is dep end en t on th e distribution co eﬃcien t that c haracterizes eac h ro ckt yp e. EOS7r in total can simulate tr ansp ort of ﬁve comp onen ts in a t w o-phase ﬂow, namely water, air, brine, a paren t con taminan t and a daugh ter con taminant in aqueous and gaseous phases. 5.2.2. Santa Fe site. F or our p roblem we are inte r ested in sim ulating the spreading of organic conta min an ts in a ﬁeld lo cated in San ta F e Spr ings, CA whic h is appro ximately 600 m. × 40 0 m. and only 20 m. deep. Th is is discretized in TO UGH2 to a grid consisting of 40 × 60 × 10 = 24000 activ e cells w here eac h cell has dimension 15 . 4 m. × 6 . 7 m. × 2 m. , as explained in the previous paragraph and is assigned a ro ckt yp e according to its soil t yp e. W e EFFICIENT BA YESIAN EXPERI MENT A TION 13 (a) Clay (b) Silt (c) Clay ey sand (d) Silt y sand (e) Sand with 10% silt (f ) Poor ly g raded sand Figure 6: Soil morphology of the domain. The z -axis has b ee n stretched for th e sak e of illustration. fo cus on inv estigating the propagation of u ncertain t y emerging fr om the unkn o wn p ermeabil- ities κ = ( κ 1 , ..., κ 6 ) of the six diﬀeren t soil types the are present on our site, thr ou gh th e transp ort pro cess of the V OC s . The p orosit y is take n to b e uniform φ = 0 . 35 all o ver the domain. Although w e are intereste d in the transp ort mainly of p etroleum h ydro carb ons whose w eathering pro cesses are in general kn o wn to includ e ad s orption and bio d egradation eﬀects, for the purp oses of this study w e will consider these eﬀects to b e of negligible imp ortance b y assignin g the distribution co eﬃcien ts for adsorption to b e zero and th e half-life p arameter ∼ 10 50 so th at w e ha ve no deca y eﬀects. Th us our mo del fo cuses only on the v olatilizati on prop erties of the VOCs. W e c h o ose the v alues for the molecular w eight and in verse Henr y’s constan t parameters to corresp ond to those used for describing transp ort of p etroleum hydro- carb ons and sp eciﬁcally those that are mostly d etecte d using gas c hromatograph y tec hniqu es, named Gasoline Range Organics (GR O). Typica lly GR O are th e s ubsets of h yd r o carb ons in the C5-C12 range and in clude hydrocarb ons suc h as Benzene, T oluene, Eth ylb enzene, m-, o- and p -Xylenes, Naph thalene and Acenaphthene. T heir molecular weig hts are in the 78 − 154 grams range. In our case w e arbitrarily set th e m olecular weig ht to b e 112 . 4 grams whic h is 14 P ANAGIOTIS TSILIFIS, R OGER G. GHANEM AND P ARI S HAJALI Figure 7: Lo cations w here initial conta m in an t injections are placed. a v alue close to the a verage GR O hydrocarb on. T able 1: Material and initial parameters used in our sim ulations. P arameter Sym b ol V alue P orosit y φ 0 . 35 T ortu osity factor τ 0 0 . 1 Relativ e p ermeabilit y parameters: λ 0 . 457 - S lr 0 . 15 - S ls 1 - S g r 0 . 1 Capillary pressur e parameters: λ 0 . 457 - S lr 0 - 1 /P 0 = α/ρ w g 5 . 105 e − 4 - P max 10 7 - S ls 1 Diﬀusivities (all k ): gas p hase d k g 10 − 6 aqueous phase d k l 10 − 10 Molecular w eight - 112 . 4 In verse Herny’s constant - 2 . 1 · 10 − 8 Half-life parameter - 10 50 Initial pressure P (0) 1 . 013 · 10 5 Initial gas saturation S g 0 . 75 T emp erature (constant ) T 20 ◦ F or our forw ard ev aluations of the mo del with TOUGH2 we consider that only 3 com- p onents are present by assigning the b rine and daught er radionuclide mass fr actions to b e zero. W e assum e isothermal conditions with the temp erature b eing constan t at 20 ◦ C and no-ﬂux b oundary conditions. The mathematical form ulation of the ﬂo w and transp ort mo del is describ ed in deta il in App endix B and the v alues of the model parameters that are assumed kno wn are giv en in T able 1 . Since w e wan t the pressure and gas saturation to b e close to EFFICIENT BA YESIAN EXPERI MENT A TION 15 steady-state conditions by the time the con taminan ts are released to the ground, we ﬁrst run the mo d el for a time p eriod until T = 13 . 3 yea r s, where the initial mass fractions for the con taminan ts is zero. The outp uts of pr essure and gas s atur ation are s u bsequentl y used as initial conditions for the transp ort simulat ion. W e rerun the mo d el after assigning initial aqueous phase s olub ilities for the VOC. W e denote these initial conditions with y 0 . Th eir lo cations at time t = 0 are tak en to b e approximat ely in th e areas w ere the storage tanks are lo cated as can b e seen in Fig. 7 . F or our pur p oses w e add 164 inactiv e (zero vol u me) cells to th e sur f ace la y er of our grid and initial injections are assigned. The exact v alues w ere c hosen to b e y 0 = 0 . 1 + U pp b (parts p er b illion) where U are n u m b ers rand omly generated from a Un iform distribution with supp ort on (5 · 10 − 5 , 10 − 2 ) and are considered kno wn. The transp ort sim ulation r uns for the same time p eriod as the ﬂo w simulation. 5.3. Deve loping a surrogate mo del . Due to the complexity of our mo d el, only a single sim ulation of the tran s p ort ﬂo w w ith TOUGH2 requires sev eral min u tes to ﬁnish. Th u s, implemen ting our experimental design framew ork, includin g the minimization of the exp ected information gain low er b ou n d with SPSA which requires thousands of ob jectiv e fun ction ev aluat ions, b ecomes impractical. It is therefore necessary to create a sur r ogate mo del that w ould pro vide us with forward ev aluations of the mo d el that are signiﬁcan tly c heap er to obtain than runnin g TOUGH2. 5.3.1. The prio r and input transf o rmation. Th e un kno wn p h ysical parameters of our problem are the p erm eabilities κ = ( κ 1 , ..., κ 6 ) of the six materials making-up the s ubsurf ace at the site of in terest. W e c h o ose all κ i to b e indep end en t with a log n ormal prior distribution, that is (5.1) p (log κ i ) = N ( − 23 . 5 , 4) The choice of the mean µ = − 23 . 5 and v ariance σ 2 = 4 are made s u c h that our pr ior co vers a r ange for p ermeabilit y v alues in the order of magnitud e 10 − 8 cm 2 to 10 − 12 cm 2 whic h corre- sp onds to semi-p ervious mate r ials. W e ﬁnd this a rather general prior that corresp onds s olely to our kno wledge that th e materials present in our domain are silt, cla y , silt y sand , cla yey sand, s an d w ith 10% silt and p o orly graded sand. No w if we let F ( x ) = P ( κ i ≤ x ) to b e the cumulati ve d istribution function of κ i , then we ha ve that ξ i := F ( κ i ) ∼ U (0 , 1) an d f or ξ = ( ξ 1 , ..., ξ 6 ) w e deﬁne ˆ G ( ξ ) = G ( F − 1 ( ξ ) , d ) where F ( ξ ) = Q 6 i =1 F ( ξ i ) to b e our mo del output where the in p ut is a set of 6 indep enden t standard uniform random v ariables. 5.3.2. P olynomial chaos expansion. W e mak e use of th e p rop ert y that our rand om out- put ˆ G ( ξ ) is a physical pro cess with ﬁnite v ariance, therefore it is a square-integ rable r andom ﬁeld ˆ G ( ξ ) ∈ L 2 ( R m ), deﬁned on the probabilit y sp ace ([0 , 1] 6 , F , P ) and admits a p olynomia l c haos representati on of the form [ 8 , 34 , 38 ] (5.2) ˆ G ( ξ ) = X α , | α | < ∞ p α Ψ α ( ξ ) where α = ( α 1 , ..., α d ) and α i ∈ N m for i = 1 , ..., n is a multi-index with mod ulus | α | = α 1 + ... + α n , eac h p α is a v ector in R m , ξ is a second-order r andom v ariable deﬁned on 16 P ANAGIOTIS TSILIFIS, R OGER G. GHANEM AND P ARI S HAJALI ([0 , 1] 6 , F , P ) with v alues in R m and the f unctions Ψ α form a complete set of orthonormal functions that satisfy (5.3) E [Ψ α ( ξ )Ψ β ( ξ )] = δ αβ = δ α 1 β 1 × · · · × δ α d β d T ypically the random v ariable ξ has ind ep endent comp onent s that follo w a Gaussian, Un iform, Gamma or Beta distribu tion. The basis function then is c hosen to consist of m ultidimensional p olynomials Ψ α ( ξ ) = ψ α 1 ( ξ 1 ) × · · · × ψ α d ( α n ), where ψ α i is resp ecti vely Hermite, Legendre, Laguerre or Jacobi p olynomial s of order α i . According to the inpu t transformation in the previous paragraph, w e tak e the comp onents of ξ to b e U (0 , 1) distribu ted and therefore th e p olynomials in the expansion will b e Legendre. F or computational pu rp oses we w ork w ith a truncated v ersion of the expression ( 5.2 ) b y writing (5.4) ˆ G r ( ξ ) = X α , | α |≤ r p α Ψ α ( ξ ) . where th e n umb er of term s in the ab o v e expansions is (5.5) N p = |{ α ∈ N n , 0 ≤ | α | ≤ r }| = r X j =0 ( j + n − 1)! j !( n − 1)! . Eq. ( 5.4 ) p ro vides an accurate appro ximation of the true mo d el output y as long as the co eﬃcien ts p α are estimated in a fashion that th ey also incorp orate a transformation of ξ to the uncertain parameters of the problem of in terest. 5.3.3. Co eﬃcient estimation. In order for the expression ( 5.4 ) to b e of use, w e need to estimate the co eﬃcien ts p α . This in general can b e done with v arious metho ds, mainly catego rized as in tru s iv e [ 39 ] and nonintrusiv e metho ds. W e u se n on-in trus iv e metho ds wh ich are easier to implement and more con ve n ien t w hen the forw ard simula tion is seen as a b lack b o x. P opular n onin tru s iv e metho d s include appr oximati n g the co eﬃcient s by orthogonal pro jection of the output on the basis functions [ 9 ], whic h inv olv es calculating m ultidimens ional in tegrals. Other m etho ds calculate th e co eﬃcien ts b y solving a linear sy s tem of equations constructed after ev aluating the mo del on a set of samples and then either in terp ol ating these p oi nts (by c ho osing collocation p oints of the p olynomial ro ots [ 21 ]) or minimizing the least s quares error [ 41 ]. The last metho d is the one adapted here, namely we estimate the co eﬃcien ts by linear regression taking adv an tage of the linear dep end ence of ˆ G r ( ξ ) on p α . This requires the ev aluati on of ˆ G ( ξ ) at N G p oint s { ξ j } N G j =1 and th en for eac h comp onen t y c , c = 1 , ..., 24000 of the output v ector ˆ G = ( y 1 , ..., y 24000 ) T , w e solv e the system (5.6) y c = Ψ p c , where y c = [ y 1 c , ..., y N G c ] T , Ψ is the N G × N p matrix formed b y ev aluating the p olynomial basis functions at the N G selected p oin ts and p c are th e ve ctor with the unkn o wn co eﬃcien ts. The least squares solution of ( 5.6 ) is the one th at min imizes || Ψ p c − y c || 2 and is giv en by (5.7) ˆ p c = (Ψ T Ψ) − 1 Ψ T y c EFFICIENT BA YESIAN EXPERI MENT A TION 17 pro vided that (Ψ T Ψ) − 1 exists. T h is requ ires that N G ≥ N p so that our system w ill b e o v erd etermined. Again, due to th e computationally demand in g TOUGH2 f orw ard simulato r it is impractic al to obtain tens or hundreds of thousand samp les so w e w ork with an ensem b le of N G = 1000 samples which, ev en on the massiv ely parallel v ersion of TOUGH2 (TOUGH2- MP), with a mo derate n umber of pro cessors used, tak es aroun d one w eek to b e generated. This c hoice of N G allo ws u s to calculate the co eﬃcient s of an expansion of order up to r = 5. The p oint s { ξ j } N G j =1 w ere rand omly selecte d using Latin Hyp ercub e sampling. In f act, the 5-order p olynomial chao s exp ansion includ ed 462 co eﬃcien ts. Th is is close to N G ≈ 2 N p whic h has b een suggested as a goo d c hoice [ 14 ]. New results concerning the stabilit y and L 2 -con v ergence of Pol yn omial Chaos approximat ions obtained via least-squares solutions h a v e b een recen tly rep orted b y [ 5 , 12 ], ho wev er suc h an analysis falls b eyond the scop e of this wo rk. 5.3.4. Go o dness of ﬁt and truncation error. The next step after obtaining the expansion co eﬃcien ts is to test ho w w ell ˆ G r p erforms as a surrogate. F or v alidation purp oses, we estimate the co eﬃcien ts for r = 1 , 2 , 3 , 4 and 5 using all 1000 samp les. W e wan t to test h o w wel l the least squares solution ﬁts the samples but also h o w close ˆ G r is to ˆ G in the L 2 sense. F or the ﬁrst test we employ the common R 2 statistic kno wn also as c o e ﬃc i ent of deter- mination [ 33 ] and provides a go o dness-of-ﬁt measure for our linear m o del. The statistic is deﬁned as (5.8) R 2 c = 1 − RS S c P N G j =1 ( y i c − ¯ y c ) 2 where th e residuals sum of squares is RS S c = P N G j =1 ( y j c − y r,j c ) 2 , c = 1 , ..., 24000. F or the second test, we deﬁn e the exp ect ed relativ e truncation error as (5.9) e c = E [ |G c ( ξ ) − y c ( ξ ) | 2 ] E [ |G c ( ξ ) | 2 ] = R [0 , 1] n |G c ( ξ ) − y c ( ξ ) | 2 p ( ξ ) d ξ R [0 , 1] n |G c ( ξ ) | 2 p ( ξ ) d ξ , c = 1 , ..., 24000 where p ( ξ ) = 1 is the join t distribution of in d ep endent U (0 , 1)’s. The ab ov e can b e estimated as (5.10) e c ≈ P 1000 j =1 |G c ( ξ j ) − y c ( ξ j ) | 2 P 1000 j =1 |G c ( ξ j ) | 2 , c = 1 , ..., 24000 . Fig. 8 shows the b o xplots of the R 2 statistic, obtained from all comp onents G , of the expansions of order r = 1 , ... 5 (left) and the b o xplots of the r elativ e err ors for eac h order r = 1 , ..., 5 and for all the comp onents of G (righ t). Regarding the R 2 statistic and particularly for r = 5 the m edian is 0 . 986 and the lo w er qu artile is ab o v e 0 . 96 giving us a go o d ﬁt for the 75% of the exp ansions along the d omain. A thorough lo ok sho wed that the remaining outputs including outliers for w hic h the ﬁt is not stable corresp ond to p oin ts of the d omain that are in the b otto m la ye r and along the east b ound ary . Regarding the exp ec ted truncation err or and sp eciﬁcally for the expansions of order 5 the median is 3 . 1 · 10 − 4 and the upp er quant ile is 10 − 2 giving u s go o d appr o ximations in th e L 2 sense. Again the trun cation error in creases for th e p oint s in the b ottom la yers. T o ensure that our exp erimen tal design metho dology 18 P ANAGIOTIS TSILIFIS, R OGER G. GHANEM AND P ARI S HAJALI implemen ted in the n ext paragraph is un aﬀected b y p ossible instabilities of the Pol yn omial Chaos expansions, the mo del outp uts pro d uced for the b ottom lay er of our domain we r e not in vol ved in our study . 1 2 3 4 5 P o l y n o m i a l o r d e r r 0 . 0 0 .2 0 .4 0 .6 0 .8 1 . 0 R 2 R 2 st a t i sti c 1 2 3 4 5 P o l y n o m i a l o r d e r r 0 .0 0 .2 0 .4 0 .6 0 .8 1 .0 e c R e l a t i v e e r r o r Figure 8: Bo xplots of the R 2 statistic (left) and relativ e errors (right ) of the p olynomial c haoses for all comp onents of the output G ( ξ ), of order r = 1 , ..., 5. 5.4. Experimental design. 5.4.1. Design settings. As men tioned ab o v e, our main goa l is to p erf orm Ba yesian infer- ence on the p ermeabilit y parameters that will pro vide us a tight er p osterior that will decrease the un certain t y ab out the tr ue v alues of the soil p erm eabilit y . This will further enh ance our current kno wledge ab out the plum e lo cation an d extent that can p otent ially result in devel- oping cost-eﬃcien t remediation str ategies. In order to achiev e such a goal, we are seeking the b est lo cations from where data sh ould b e collected by employing the exp ecte d information gain lo w er b ound as our design criterion. Unlik e th e analysis of the nonlinear algebraic mo d el it is clearly un dersto o d that in th is case, design of a single exp erimen t, that is design for the colle ction of a sin gle datum w ould not ha ve imp orta nt eﬀe cts in the inference procedu r e and it is never p erf orm ed in practice. In our setting w e consider that for any lo cation in the ( x, y )-plane we can observe the concen trations y from the ﬁrst 5 lay ers of our domain, that is up to 10m. depth, so 5 d ata p oint s are a v ailable from eac h location. Without any lo ss of generalit y in our metho d w e c ho ose to ﬁnd the design of exp eriment s consisted of the 5 b est lo cations in the ( x, y )-plane from where d ata will b e collect ed sim u ltaneously to b e used for inference. This is a mo derate c hoice whic h app ea r s to b e satisfact ory in ord er to v alidate our m etho d. F urther ev aluation of the optimal n umb er of data p oint s is b ey ond the scop e of the present study . The design parameters therefore are d = ( d 1 ,x , d 1 ,y , d 2 ,x , d 2 ,y , ..., d 5 ,x , d 5 ,y ) where ( d i,x , d i,y ) ∈ D 0 = [0 , 600] × [0 , 400], i = 1 , ..., 5 and th e design space for our problem is D = D 5 0 . The observ ations y are sub ject to additional measurement noise as indicated fr om our Gaussian lik eliho o d f u nction N ( y | ˆ G ( ξ ) , Σ ) inv olv ed in th e exp ected information gain lo wer b ound deriv ation. Here we hav e substituted ou r f orward solv er G ( κ , d ) with the p olynomial c haos surrogate ˆ G ( ξ ). The co v ariance mat r ix has b een c hosen to b e Σ = σ 2 I 25 with σ 2 = 10 − 2 . This is a rather simp le choice that guaran tees that measurement errors are indep en d en t. A EFFICIENT BA YESIAN EXPERI MENT A TION 19 0 1 0 0 2 0 0 3 0 0 4 0 0 5 0 0 6 0 0 E a st i n g ( m ) 0 5 0 1 0 0 1 5 0 2 0 0 2 5 0 3 0 0 3 5 0 4 0 0 No r th i n g ( m ) 0 2 0 4 0 6 0 8 0 1 0 0 1 2 0 ( a ) N = M = 10 0 2 0 4 0 6 0 8 0 1 0 0 0 1 0 0 2 0 0 3 0 0 4 0 0 5 0 0 6 0 0 E a sti n g ( m ) 0 5 0 1 0 0 1 5 0 2 0 0 2 5 0 3 0 0 3 5 0 4 0 0 No r th i n g ( m ) 0 2 0 4 0 6 0 8 0 1 0 0 1 2 0 ( b ) N = M = 100 0 2 0 4 0 6 0 8 0 1 0 0 0 1 0 0 2 0 0 3 0 0 4 0 0 5 0 0 6 0 0 E a sti n g ( m ) 0 5 0 1 0 0 1 5 0 2 0 0 2 5 0 3 0 0 3 5 0 4 0 0 No r t h i n g ( m ) 0 2 0 4 0 6 0 8 0 1 0 0 1 2 0 ( c ) N = M = 1000 0 2 0 4 0 6 0 8 0 1 0 0 Figure 9: Scatter plots of the optimal p oin ts d i for all optimization solutions d ∗ of the SP SA algorithm. T h e red transparent area s denote the lo cations of the sources, plotted for compar- ison of the p atterns. more soph isticated c h oice wo u ld b e to tak e the standard deviation to b e prop ortional to the observ able qu an tit y σ = a ˆ G c ( ξ , d ) where a is some prop ortionalit y factor. W e a void this c hoice as it imp lies the d ep endence of the error on the design p arameters w hic h con tradicts our assu mptions for the lo w er b ound deriv ation. 20 P ANAGIOTIS TSILIFIS, R OGER G. GHANEM AND P ARI S HAJALI 5.4.2. Results. W e run the SPSA algo r ith m in order to ﬁnd d that minimizes ˆ U ∗ L , there- fore maximizes the low er b ound estimat e. T o assess the p erformance of the algorithm, w e r u n sev eral cases w here the choice s of N and M in the Monte Carlo lo op v ary . It is easy to see that the v ariance of our estimator is (5.11) v ar h ˆ U ∗ L ( d ) i = 1 N M v ar [ p ( Y | κ , d )] whic h sh o ws th at N and M ha ve the same con trib ution in con trolling the v ariance, therefore we tak e them equ al. As s tated previously , our estimator is unbiased, u nlik e the d lMC estimator, so an extremely large v alue for M is not n ecessary as it is for dlMC to main tain a lo w bias. W e consid ered three diﬀerent cases where N = M = 10 , 100 and 1000. Since the resu lts of SPSA are noisy , in ord er to b etter ev aluate the p erformance of the algorithm w e obtain the results of 100 ind ep endent run s for eac h case. F or the ﬁ r st t wo cases w e r un the algorithm for a total of 10 4 ev aluat ions of the ob jectiv e fun ction and for the last case we run for 5 · 10 3 ev aluat ions. It app ears that these c hoices are rather h igh since the iterate s are stabilized muc h earlier main taining a lo w iteration error. The resulting 5-tuples of p oin ts in D 0 are sho wn all together in a tota l of 500 p oin ts in Fig. 9 , for all thr ee cases. This giv es us a qualitativ e idea ab out w h ere the exp ect ed information gain is maximized and so data sh ould b e coll ected fr om. Th e formation of a certain p attern as N , M are increased is obvio u s and the optimal p oin ts ap p ear to b e very close to where the con taminan t sources w ere placed. A t the same time v arious p oints that app ear to b e outliers are also pr esent, a c h aracteristic b ehavio r of the SPS A algorithm. The case N = M = 10 p articularly sh o ws all p oin ts widely sp read all ov er D 0 and one can only distinguish the tw o areas aroun d ( x, y ) = (150 , 250) and around ( x, y ) = (450 , 100) where more p oin ts are accumulate d . The cases N = M = 100 and N = M = 1000 p ro vide v ery similar conclusions with the ma jorit y of the runs ha ving conv erged at v ery sp eciﬁc lo cations, close to the actual con taminan t sources, forming a clear pattern with the main diﬀerence that in the th ird case the p oint s are eve n more concen trated and less outliers are p resen t. F or a more qu an titativ e argument w e also p r esen t the exact v alues of ˆ U ∗ L ( d ) as we ll as those of (5.12) ˆ U L ( d ) = − 1 2 { m + lo g [(2 π ) m | Σ | ] } − log   1 N M N ,M X i,j =1 p ( y i | κ ij , d )   calculate d at all outcomes d ∗ of our r u ns. These v alues quantify the information gain that eac h d esign is expected to oﬀer and at the end, if we had to c ho ose only one design, this should b e the design th at pr o vides the largest v alue. Th is time, for diagnostic reasons an d in ord er to provide a measur e of comparison, we wa nt to ha ve a ﬁxed accuracy for the estimates and w e use the same num b er of samples N = M = 1000 for the ev aluatio n of the tw o ob jectiv e functions on all p oin ts. Their v alues are all shown in the histograms p r esen ted in Fig. 10 . Again there is a signiﬁcan t diﬀerence b et w een the qu alit y of the r esults of the ﬁrst case with that of th e last tw o cases for b oth fu nctions. F or ˆ U ∗ L ( d ), the ﬁrst case giv es us designs that whose v alues v ary in a range from 0 . 1 · 10 8 to 6 · 10 10 (observ e that the x -axis of the histogram is an order of magnitude larger than the rest) w ith a large num b er of them b eing a w a y from EFFICIENT BA YESIAN EXPERI MENT A TION 21 0 1 2 3 4 5 6 F i n a l ˆ U ∗ L ( d ) v a l u e 1 e 1 0 0 2 4 6 8 1 0 1 2 C o u n t ( a ) N = M = 10 − 3 −2 − 1 0 1 2 3 F i n a l ˆ U L ( d ) v a l u e 0 2 4 6 8 1 0 1 2 C o u n t ( b ) N = M = 10 0 . 0 0 .2 0 .4 0 .6 0 .8 1 .0 1 .2 1 .4 1 .6 1 . 8 F i n a l ˆ U ∗ L ( d ) v a l u e 1 e 9 0 2 4 6 8 1 0 1 2 C o u n t ( c ) N = M = 100 − 3 −2 − 1 0 1 2 3 F i n a l ˆ U L ( d ) v a l u e 0 2 4 6 8 1 0 1 2 C o u n t ( d ) N = M = 100 0 . 0 0 .5 1 . 0 1 .5 2 .0 F i n a l ˆ U ∗ L ( d ) v a l u e 1 e 9 0 2 4 6 8 1 0 1 2 C o u n t ( e ) N = M = 1000 − 3 −2 − 1 0 1 2 3 F i n a l ˆ U L ( d ) v a l u e 0 2 4 6 8 1 0 1 2 C o u n t ( f) N = M = 1000 Figure 10: Histograms of the ˆ U ∗ L ( d ) (left) and ˆ U L ( d ) (righ t) v alues for all optimal p oints d ∗ . the minimum v alue. The imp ression one might get for this case is that c hances are few that the optimization will yield an optimal design solution and n ot just an outlier. The other t wo cases pro vide similar results w ith the third b eing, as exp ected, b etter than the second, in the sense that the op timal p oints are even more concen trated around the m inim um v alue. F or completeness we present also ˆ U L ( d ) whic h is tec hn ically nothing else but the n egativ e logarithm of ˆ U ∗ L ( d ), s h ifted by a constant . In the ﬁr st case, the v alues co v er a r an ge f r om − 3 to 2 . 5 with a mean around 0. The last tw o cases giv e u s again similar r esults with b ot h histograms co v ering v alues from around 1 to 3 . 5, a muc h narro wer in terv al than th e ﬁrst case. Again we can see that the third case is sligh tly b etter than the second w here w e see that the unimo dal stru cture of the histogram is more concentrat ed on higher v alues close to 3 wh ic h implies that a few m ore runs even tually conv erged to the global maxim um. 22 P ANAGIOTIS TSILIFIS, R OGER G. GHANEM AND P ARI S HAJALI T able 2: The ( x, y )-co ordinates of the 5-p oint designs d ◦ and d ⋄ and their ˆ U L ( d ) v alues. d 1 d 2 d 3 d 4 d 5 ˆ U L ( d ) d ◦ (127 . 9 6 , 259 . 65) (442 . 5 9 , 97 . 05) (143 . 22 , 304 . 41 ) (337 . 1 1 , 279 . 02 ) (412 . 26 , 135 . 8 6) 3 . 698 d ⋄ (22 . 95 , 171 . 37) (75 . 36 , 300 . 00) (318 . 58 , 169 . 2 7) (425 . 06 , 109 . 96) (4 41 . 00 , 253 . 70) 0 . 286 5.5. Ba y esian up date of log ( κ ) . I n ord er to v alidate our metho dology and demonstrate the signiﬁcance of the exp erimental design analysis deve lop ed ab o v e, w e p er f orm Ba ye sian inference based on t wo diﬀerent designs that are selected according to the resu lts of th e previous s ection. More precisely , data f rom tw o diﬀeren t s ets of 5 p oints, d en oted with d ◦ and d ⋄ , is co llected and used to up date the probabilit y d istribution of κ . S p eciﬁcally , d ◦ is selected among the 100 optimal designs that w ere appro ximated by S PSA in the previous section, for N = M = 1000 and is the one that ac h ieved the maximum ˆ U L ( d ) v alue, so based on the previous analysis it is the design that is exp ect ed to yield the b est p osterior. Th e second design d ⋄ , c hosen for comparison only , was arbitrarily selected among the optimal designs that w ere returned b y SPSA using the p o or estimate with N = M = 10 and has a m uc h low er ˆ U L ( d ) v alue, therefore it is expected to up date to a wider p osterior. T he exa ct lo cation of the p oint s for eac h set and their ˆ U L ( d ) v alues are shown in T able 2 . Since the p osterior distributions cannot b e calculated explicitly , we u s e Markov Chain Monte Carlo (MCM C) metho d s to d ra w samples from them. MCMC metho ds r ely on gener- ating a sequence of random v ariables based on a Marko v Chain that co nv erges to a sta tionary distribution π ( y ), called the target distrib ution and therefore th e sequence generated after a burn -in p erio d, can b e though t of as follo wing the stationary distribution π . T o av oid strong auto correlation among the samp les, appropriate th inning an d tun ing of the algorithm is re- quired. A p ow erful MCMC metho d is the Metrop olis-Hastings (MH) algorithm that w as ﬁ rst prop osed by Metrop ol is [ 25 ] and later generalized b y Hastings [ 13 ]. The MH algorithm at an arbitrary step generates a new s amp le y f rom a pr op osal distribu tion q ( x, y ) giv en that x is the last accepted sample and accepts it with probabilit y (5.13) α ( x, y ) = m in  1 , π ( y ) q ( x, y ) π ( x ) q ( y , x )  . In our case we use the ad ap tive version of the MH algorithm as dev elop ed in [ 10 ] with a Gauss ian as the prop osal distrib ution where its co v ariance m atrix is up dated at eac h step taking into accoun t all the previous samp les that ha ve b een drawn. Th is implies that the c hain is non-Mark o vian, h o w ever it has b een pro v ed that it has the correct ergodic prop erties. The adaptive MH (aMH) algo r ithm provides faster co nv ergence to the target distribu tion and ac hiev es lo we r auto correlation among the samp les w ithout v ery large thinn ing. Belo w we explore t wo cases where in the ﬁrst d ata is observ ed from a reference ﬁeld that is asso ciated with one sp ec iﬁ c realizatio n of the m o del outpu t w hile in the second, data is observ ed f rom a reference ﬁ eld that is constructed usin g Gaussian pro cess (GP) regression on measuremen ts tak en from the real site. 5.5.1. Case 1: Reference data generated from prio r. Here we assu me a h yp othetic al situation where the real p ermeabilit y v alues are those d ispla y ed on T able 3 . Ou r mo del’s EFFICIENT BA YESIAN EXPERI MENT A TION 23 0 1 0 0 2 0 0 3 0 0 4 0 0 5 0 0 6 0 0 E a sti n g (m) 0 5 0 1 0 0 1 5 0 2 0 0 2 5 0 3 0 0 3 5 0 4 0 0 North i n g ( m) (a ) De p th : 0 m - 2 m 0.00 0.75 1.50 2.25 3.00 3.75 4.50 5.25 6.00 0 1 0 0 2 0 0 3 0 0 4 0 0 5 0 0 6 0 0 Ea sti n g ( m) 0 5 0 1 0 0 1 5 0 2 0 0 2 5 0 3 0 0 3 5 0 4 0 0 North i n g ( m) (b ) De p th : 2 m - 4 m 0.0 0 0.7 5 1.5 0 2.2 5 3.0 0 3.7 5 4.5 0 5.2 5 6.0 0 6.7 5 0 1 0 0 2 0 0 3 0 0 4 0 0 5 0 0 6 0 0 E a sti n g (m) 0 5 0 1 0 0 1 5 0 2 0 0 2 5 0 3 0 0 3 5 0 4 0 0 North i n g ( m) ( c ) De p th : 4 m - 6 m 0 1 2 3 4 5 6 7 0 1 0 0 2 0 0 3 0 0 4 0 0 5 0 0 6 0 0 Ea sti n g ( m) 0 5 0 1 0 0 1 5 0 2 0 0 2 5 0 3 0 0 3 5 0 4 0 0 North i n g ( m) (d ) De p th : 6 m - 8 m 0.0 0.9 1.8 2.7 3.6 4.5 5.4 6.3 7.2 0 1 0 0 2 0 0 3 0 0 4 0 0 5 0 0 6 0 0 Ea sti n g ( m) 0 5 0 1 0 0 1 5 0 2 0 0 2 5 0 3 0 0 3 5 0 4 0 0 No r th i n g ( m) (e ) De p th : 8 m - 1 0 m 0 .0 0 .9 1 .8 2 .7 3 .6 4 .5 5 .4 6 .3 7 .2 8 .1 Figure 11: Concentrat ion map for the 5 top lay ers of the d omain used as the reference ﬁeld in example 1. The ’ ◦ ’ and ’ ⋄ ’ signs indicate th e lo cations of d ◦ and d ⋄ designs r esp ectiv ely , from w here measuremen ts were taken. T able 3: T ru e v alues of κ and log ( κ ) u sed for generating the data in case 1. i log κ i κ i (cm 2 ) 1 − 22 . 195 2 . 295 · 10 − 10 2 − 20 . 791 9 . 346 · 10 − 10 3 − 21 . 567 4 . 300 · 10 − 10 4 − 25 . 193 1 . 145 · 10 − 11 5 − 25 . 272 1 . 058 · 10 − 11 6 − 24 . 610 2 . 050 · 10 − 10 output for those v alues ca n b e directly ev aluated and the 5 top la y ers are disp lay ed in Fig. 11 . With this reference ﬁ eld assum ed to b e our ”realit y”, w e collect data from the lo cations indicated at designs d ◦ and d ⋄ and use them for ou r compu tations. W e generate 50000 samp les with a 10000 -samp le bu rn-in p erio d and w e retain a samp le ev ery 5 steps. After 8000 samples are obtained, histograms of th eir v alues are sh own in 24 P ANAGIOTIS TSILIFIS, R OGER G. GHANEM AND P ARI S HAJALI Fig. 12 . Note that the histograms displa y the marginal distribution of eac h log κ i , i = 1 , ..., 6 whic h are no longer indep enden t themselv es, ho wev er comparison of the histograms f or eac h case toge th er with their priors and the true v alue giv es us an idea ab out the diﬀeren t r esu lts obtained for eac h d esign. As exp ected, w e observe that d ◦ pro vides narro wer p osteriors that d ⋄ . T his can b e observed particularly on log κ 2 and log κ 5 and in a smaller scale on log κ 1 , log κ 3 and log κ 4 . The p osterio r s of log κ 6 for b oth designs do not disp la y any signiﬁcan t discrepancy fr om the prior but ev en in this case the on e corresp onding to d ◦ app ear to b e sligh tly decen tralized tow ards the true v alue. In addition, for d ⋄ three p osteriors, namely log κ 2 , log κ 5 and log κ 6 retain their gaussian b ell- sh ap ed form and it seems that almost no new information has b een gained ab out their v alues. This is actually a consequ ence of the fact that the corresp onding soils (silt y sand, sand with 10% silt and cla y) are not presen t in the lo cations where the d ata w as co llected from, for this design. Note that cl ay is p resen t in the lo we r la y ers of our domain and n ot muc h information is gained ab out it fr om d ◦ either. An other general c haracteristic is that, ev en for the soils for w h ic h b oth designs pro vide similar p oste r iors, th e maxim um a p osteriori v alues of d ◦ sho w excellen t agreement with the true v alues while th ose of d ⋄ sho w a sligh t d iscrepancy wh ic h makes them inappropriate for estimation. Ove r all, w e conclude that the design d ◦ pro vides signiﬁcan tly b ette r inference results that those pro vided b y d ⋄ and this conclusion can b e generalized for the comparison of d ◦ with any other d esign that ac hiev es a lo we r ˆ U L ( d ) v alue. − 3 0 −2 8 − 2 6 −2 4 −2 2 − 2 0 −1 8 − 1 6 V a l u e 0 . 0 0 . 1 0 . 2 0 . 3 0 . 4 0 .5 0 . 6 p ( l og( κ 1 ) | y ) l og( κ 1 ) − 3 0 −2 8 − 2 6 −2 4 −2 2 −2 0 − 1 8 −1 6 V a l u e 0 . 0 0 . 1 0 .2 0 . 3 0 . 4 0 . 5 p ( l og( κ 2 ) | y ) l og( κ 2 ) −3 0 −2 8 −2 6 − 2 4 − 2 2 −2 0 −1 8 −1 6 V a l u e 0 . 0 0 . 2 0 .4 0 . 6 0 . 8 1 . 0 1 . 2 1 . 4 1 . 6 1 . 8 p ( l og( κ 3 ) | y ) l og( κ 3 ) − 3 0 −2 8 − 2 6 −2 4 −2 2 − 2 0 −1 8 − 1 6 V a l u e 0 . 0 0 . 1 0 . 2 0 . 3 0 . 4 0 .5 0 . 6 0 . 7 p ( l og( κ 4 ) | y ) l og( κ 4 ) − 3 0 −2 8 − 2 6 −2 4 −2 2 −2 0 − 1 8 −1 6 V a l u e 0 . 0 0 . 1 0 .2 0 . 3 0 . 4 0 . 5 0 . 6 p ( l og( κ 5 ) | y ) l og( κ 5 ) −3 2 − 3 0 −2 8 −2 6 − 2 4 −2 2 − 2 0 −1 8 −1 6 V a l u e 0 . 0 0 0 .0 5 0 . 1 0 0 .1 5 0 . 2 0 0 .2 5 0 . 3 0 p ( l og( κ 6 ) | y ) l og( κ 6 ) Figure 12: Comparison of th e histograms of th e marginal p osteriors of all log ( κ i ), i = 1 , ..., 6 for the designs d ◦ (green) and d ⋄ (red). Th e red curve indicates their common prior distr ib ution and th e dashed blac k line ind icates th e tru e v alue used to generate the d ata. 5.5.2. Case 2: Reference data fr om ﬁeld measurements. In this case, the data used in the lik eliho o d fun ction whic h is incorp orated in the accepta n ce probabilit y through Ba y es’ rule, is tak en f r om a reference concen tration ﬁ eld, that w as generated by GP regression based EFFICIENT BA YESIAN EXPERI MENT A TION 25 0 1 0 0 2 0 0 3 0 0 4 0 0 5 0 0 6 0 0 E a sti n g (m) 0 5 0 1 0 0 1 5 0 2 0 0 2 5 0 3 0 0 3 5 0 4 0 0 North i n g ( m) (a ) De p th : 0 m - 2 m 9.60 9.84 10.08 10.32 10.56 10.80 11.04 11.28 0 1 0 0 2 0 0 3 0 0 4 0 0 5 0 0 6 0 0 Ea sti n g ( m) 0 5 0 1 0 0 1 5 0 2 0 0 2 5 0 3 0 0 3 5 0 4 0 0 North i n g ( m) (b ) De p th : 2 m - 4 m 9.5 2 9.7 6 10 .00 10 .24 10 .48 10 .72 10 .96 11 .20 11 .44 0 1 0 0 2 0 0 3 0 0 4 0 0 5 0 0 6 0 0 E a sti n g (m) 0 5 0 1 0 0 1 5 0 2 0 0 2 5 0 3 0 0 3 5 0 4 0 0 North i n g ( m) ( c ) De p th : 4 m - 6 m 9.44 9.68 9.92 10.16 10.40 10.64 10.88 11.12 11.36 11.60 0 1 0 0 2 0 0 3 0 0 4 0 0 5 0 0 6 0 0 Ea sti n g ( m) 0 5 0 1 0 0 1 5 0 2 0 0 2 5 0 3 0 0 3 5 0 4 0 0 North i n g ( m) (d ) De p th : 6 m - 8 m 9.3 6 9.6 0 9.8 4 10 .08 10 .32 10 .56 10 .80 11 .04 11 .28 11 .52 0 1 0 0 2 0 0 3 0 0 4 0 0 5 0 0 6 0 0 Ea sti n g ( m) 0 5 0 1 0 0 1 5 0 2 0 0 2 5 0 3 0 0 3 5 0 4 0 0 No r th i n g ( m) (e ) De p th : 8 m - 1 0 m 9 .36 9 .60 9 .84 1 0.08 1 0.32 1 0.56 1 0.80 1 1.04 1 1.28 1 1.52 Figure 13: C oncen tration map for the 5 top la y ers of the d omain obtained b y GP regressio n on real data, used as our reference ﬁeld in example 2. Th e ’ ◦ ’ and ’ ⋄ ’ signs indicate the lo cations of d ◦ and d ⋄ designs resp ectiv ely , from wh ere m easur emen ts w ere taken. on measurements taken from the ﬁeld d ata at th e sp eciﬁed lo cations as sh o wn in the soil b oring map in Fig. 1 . T hese measurement s, d enoted with f , consist of 1240 p oi nts and their lo cations form a 1240 × 3 matrix X . Then, if f ∗ denotes the co ncentratio n s o v er the discretize d domain, that is f ∗ ∈ R 24000 , with corresp ondin g lo cations X ∗ , then (5.14) f ∗ | X ∗ , X , f ∼ N  K ( X ∗ , X ) K ( X, X ) − 1 f , K ( X ∗ , X ∗ ) − K ( X ∗ , X ) K ( X, X ) − 1 K ( X , X ∗ )  , where K ( X ∗ , X ) is the co v ariance matrix with ( K ( X ∗ , X )) i,j = k ( x i , x j ), x i ∈ X ∗ , x j ∈ X and k ( · , · ) denotes some appropr iately c hosen k ernel (for more details on the deriv ation of this conditional distribution one can see [ 29 ]). I n our case we chose a squ ared exp onentia l kernel, giv en as (5.15) k ( x , x ′ ) = σ 2 exp " − 1 2 3 X n =1 ( x n − x ′ n ) 2 ℓ 2 n # whic h w as ﬁt to the data with v ariance σ 2 = 5 and correlatio n lengths ( ℓ 1 , ℓ 2 , ℓ 3 ) = (30 , 30 , 10). Our r eferen ce ﬁeld, consists of the mean ¯ f = K ( X ∗ , X ) K ( X, X ) − 1 f and the v alues of the 5 26 P ANAGIOTIS TSILIFIS, R OGER G. GHANEM AND P ARI S HAJALI − 3 0 −2 8 −2 6 −2 4 −2 2 − 2 0 −1 8 −1 6 −1 4 V a l u e 0 . 0 0 . 2 0 . 4 0 . 6 0 . 8 1 .0 1 .2 p ( l og( κ 1 ) | y ) l og( κ 1 ) − 3 0 −2 8 − 2 6 −2 4 −2 2 −2 0 − 1 8 −1 6 V a l u e 0 . 0 0 .2 0 . 4 0 . 6 0 . 8 1 . 0 1 . 2 1 .4 p ( l og( κ 2 ) | y ) l og( κ 2 ) −3 0 −2 8 −2 6 − 2 4 − 2 2 −2 0 −1 8 −1 6 V a l u e 0 1 2 3 4 5 6 7 p ( l og( κ 3 ) | y ) l og( κ 3 ) − 3 4 − 3 2 − 3 0 −2 8 − 2 6 −2 4 − 2 2 −2 0 −1 8 − 1 6 V a l u e 0 . 0 0 .5 1 .0 1 . 5 2 . 0 2 . 5 3 . 0 p ( l og( κ 4 ) | y ) l og( κ 4 ) − 3 4 − 3 2 − 3 0 − 2 8 −2 6 − 2 4 − 2 2 − 2 0 −1 8 −1 6 V a l u e 0 . 0 0 .2 0 . 4 0 . 6 0 . 8 1 . 0 1 . 2 1 .4 1 . 6 p ( l og( κ 5 ) | y ) l og( κ 5 ) −3 2 −3 0 −2 8 −2 6 −2 4 −2 2 −2 0 −1 8 −1 6 − 1 4 V a l u e 0 . 0 0 . 1 0 . 2 0 . 3 0 .4 0 . 5 p ( l og( κ 6 ) | y ) l og( κ 6 ) Figure 14: Comparison of th e histograms of th e marginal p osteriors of all log ( κ i ), i = 1 , ..., 6 for the designs d ◦ (green) and d ⋄ (red). The red curv e indicates their co mm on prior distribu tion. top la y ers are sho wn in Fig. 13 . Again, w e s et th e p ost eriors with p ( κ | y , d ◦ ) and p ( κ | y , d ⋄ ) as our target distributions and this time we generate 100000 samples w ith a 10000-sample bur n-in p erio d and we re- tain a sample ev ery 5 steps. After 18000 samples are obtained, histograms of the m arginal distributions of eac h log κ i , i = 1 , ..., 6, are sho wn in Fig. 14 . Again, w e observ e that d ◦ pro vides narro we r p osteriors than d ⋄ for most of the cases. This can b e observ ed particularly on log κ 1 , log κ 4 and log κ 5 . The p osterior of log κ 6 for the optimal design is almost identica l to its prior w hic h implies that nothing n ew has b een learned ab out its true v alue. As in the previous example, this is du e to the fact that th e corresp ond ing soil (cla y) is not present in the lo cations where the data wa s collec ted, for th is design (recall that cla y is present in the lo w er la yers of our d omain). In this case, s in ce our data is obtained from a pro cedure other than ev aluating our forwa rd mo d el, it cannot b e seen as a d irect output of it and generally w e do not exp ect to obs er ve uniqueness of the input parameters that can p otenti ally giv e rise to our measurements. Th is is r eﬂected by the f act that diﬀeren t designs giv e p osteriors that seem to concen trate around diﬀeren t v alues. Bay esian inference how ev er, still provides us with informativ e resu lts as w e obser ve the p osterior distribu tions to b e m u ch further a wa y from the priors, thus lea r ning something ab out ho w close our m o del and our prior assumptions are to the actual realit y . 6. Conclusions . W e p resen ted an exp erimen tal d esign framew ork that fo cuses on p r o vid- ing optimal solutions to the design problem in terms of maximizing the information on mo del parameters from Ba y esian inference. This w as ac h iev ed b y emplo ying an inform ation theoretic criterion, namely the exp ected r elativ e entrop y b et w een th e prior and the p osterior distrib u- EFFICIENT BA YESIAN EXPERI MENT A TION 27 tions of the unkno w n parameters. An additional reform ulation of the design criterion through the deriv ati on of a lo wer b oun d, w as of great importance in order to address biasedness issues as well as allevia te the compu tational b urden asso ciated with optimization. The framewo r k w as applied to a real-w orld problem: The pr ob lem of p ermeabilit y id en tiﬁ- cation in con taminated soils in the presence of exp erim ental ﬁeld data. Th e forw ard mo d el in this case consisted of a system of diﬀeren tial equat ions that describ e con taminan t transp ort in p orous media, namely a tw o-component, t w o-phase ﬂo w-and-trans p ort mo del that pro vides as its output the concen trations of th e p ollutan ts at the time wh en data is collec ted. The design parameters consist of multiple lo cations w here the concen trations will b e measured. In the present setting, with the analogous adopted co de (TOUGH2) used as a blac k b o x sim ulator, no deriv ativ e in formation w as av ailable to the optimization pro ce ss . Th e SPSA algorithm, in add ition to add r essing this iss u e also seemed to accelerate conv ergence to the optimal so- lutions. T o fu rther redu ce the compu tational cost, constru ction of a mo del su r rogate was of crucial imp ortance since th e imp lemen tation of suc h a pro cedu re would b e impractical oth- erwise. Finally , our metho dology was v alidate d after th e inferen ce r esults sho wed that the data collected at the optimal design lo cations we r e muc h more informativ e than the d ata ob- tained from other arbitrarily c h osen p oin ts, in the sense th at they r esulted in muc h narro wer p osteriors, thus gaining more knowle d ge ab out the true s itu ation in th e sub s urface. W e ob- serv e from our analysis that, und er limited resources, th e p er f ormance of a Bay esian up date dep ends signiﬁcan tly on the location of data acquisition. This highligh ts the need for optimal monitoring la y outs in order to manage en vironmental risks under economic constraints. App endix A. Comparison of the exp ected information gain estimates. A.1. The double-lo op Monte-Carlo estimate. The exp ected information gain can b e rearranged in the form I ( d ) = Z Y Z K { log [ p ( y | κ , d )] − log [ p ( y | d )] } p ( y | κ , d ) p ( κ ) d κ d y and can b e ev aluate d by u s ing Mon te Carlo metho ds with ˆ I ( d ) = 1 N N X i =1    log  p ( y i | κ i , d )  − log   1 M M X j =1 p ( y i | κ i,j , d )      , where { κ i } , { κ ij } , i = 1 , ..., N , j = 1 , ..., M are i.i.d. samples f r om p ( κ ) and { y i } are i.i.d. samples from p ( y | κ i , d ). A.2. Prop erties of t he estimates. Th e p rop erties of ˆ I ( d ) are explored in [ 31 ]. Precisely it is sho wn that the v ariance is prop ortional to v ar h ˆ I ( d ) i ∝ A ( d ) N + B ( d ) N M and th e bias is prop ortional to B ias h ˆ I ( d ) i ∝ C ( d ) M , 28 P ANAGIOTIS TSILIFIS, R OGER G. GHANEM AND P ARI S HAJALI where A ( d ), B ( d ) and C ( d ) are terms th at dep end on the lik eliho o d and evidence fun ction. Clearly , as mentioned ab o v e, N con trols the v ariance and M con trols the bias. F or ˆ U ∗ L ( d ) w e hav e v ar h ˆ U ∗ L ( d ) i = 1 N M v ar [ p ( y | θ , d )] and th e bias is trivially zero, whereas for ˆ U L ( d ) w e hav e v ar h ˆ U L ( d ) i = v ar   log 1 N M N ,M X i,j =1 p ( y i | θ i,j , d )   ≈ v ar " log U ∗ L ( d ) + 1 N M P N ,M i,j =1 p ( y i | θ i,j , d ) − U ∗ L ( d ) U ∗ L ( d ) # = v ar " 1 N M P N ,M i,j =1 p ( y i | θ i,j , d ) − U ∗ L ( d ) U ∗ L ( d ) # = 1 N M v ar [ p ( y | θ , d )] U ∗ L ( d ) 2 , where th e second line follo ws after 1st-order T a ylor expansion ab out U ∗ L ( d ) and B ias h ˆ U L ( d ) i = E h ˆ U L ( d ) − U L ( d ) i = E   − log 1 N M N ,M X i,j =1 p ( y i | θ i,j , d ) + log U ∗ L ( d )   ≈ 1 N M v ar [ p ( y | θ , d )] U ∗ L ( d ) 2 , where the third line follo ws after a 2nd-order T a ylor exp ansion. Here the v ariance and the bias of the low er b ound are con trolled by b oth N and M . App endix B. Governing equations for ﬂo w and tra nsp o rt in a p orous medium. B.1. Balance equations. The general mass- and energy balance e qu ations in a multico m- p onent (NK comp onen ts) multiphase problem are given as dM k dt = ∇ · F k + q k , k = 1 , ..., N K, where k is lab eling the mass comp onent up to a to tal of N K , M represents mass or energy p er v olume, F r epresen ts mass or heat ﬂu x and q d en otes sinks or sources. In some formulations, suc h as in TOUGH2, an int egral expression is used in the f orm d dt Z V n M k dV n = Z Γ n F k · n d Γ n + Z V n q k dV n , k = 1 , ..., N K where V n is an arb itrary sub domain of the ﬂo w system and Γ n its closed b oundary surface, n is a vec tor normal on the su rface elemen t d Γ n p oint in g in w ard into V n . If TOUGH2 is EFFICIENT BA YESIAN EXPERI MENT A TION 29 used for the numerical solution of the gov erning equations, mo du le EO S 7r pr o vides consisten t c haracterization of th e constitutiv e equations, allo wing a v alue of NK as high as 5 (water, air, brine, p aren t radionucli d e, daughter radionuclide). In out problem, a zero sou r ce term is assum ed r esu lting in q = 0 and the conta min ant sources in the mo del are tak en to b e lo calized on th e sur face, at pr e-assigned spatial lo cations shown in Fig. 7 . In a TOUGH2 formalism, these sources are mo deled as injections in attac hed zero-v olume blo c ks connected to the grid blo c ks wh ere the sour ce is assu med to b e. The m ass accum ulation cond ition f or the k th comp onent is M k = φ X β S β ρ β X k β where the summation is tak en o ver the ﬂuid phases (liquid, gas, non-aqueous ph ase liquids). Only liquid and gas phase are considered in our case. Th e p orosit y is den oted b y φ , S β is the saturation of ph ase β (take s v alues from 0 to 1), ρ β is the density of p hase β and X k β is the mass fraction of comp onen t k pr esent in phase β . The advecti ve mass ﬂ ux is giv en as F k adv = X β X k β F β and in dividual phase ﬂuxed are giv en by a multiphase version of Darcy’s la w: F β = ρ β u β = − κ κ r β ρ β µ β ( ∇ P β − ρ β g ) . Here u β is the Darcy v elo cit y (v olume ﬂ ux) in phase β , κ is the absolute p erm eabilit y , κ r β is relativ e p ermeabilit y to p hase β , µ β is viscosit y and P β = P + P cβ is the ﬂ uid pressure in ph ase β , w hic h is the sum of the pressur e P of a reference ph ase (usually tak en to be th e gas phase) and the capillary p r essure P cβ ( ≤ 0), g is th e v ector o f gra vitational accele r ation. V ap or pressur e low ering d ue to capillary and ph ase adsorption eﬀects is mo deled b y Kelvin’s equation P ν ( T , S l ) = f V P L ( T , S l ) P sat ( T ) where f V P L = exp  M w P cl ( S l ) ρ l R ( T + 273 . 15)  is the v ap or p ressure lo wering factor. Ph ase adsorption is n eglecte d . The saturated v ap or pressure of bulk aqueous phase is denoted by P sat , P cl is the d iﬀerence b et wee n aqueous and gas phase pressures, M w is the molecular w eight of w ater and R is the un iv ersal gas constant. T emp erature T is assum ed constant. The absolute p ermeabilit y of the gas phase in creases at low pressur es according to the relation giv en by Klin ken b erg κ = κ ∞  1 + b P  30 P ANAGIOTIS TSILIFIS, R OGER G. GHANEM AND P ARI S HAJALI where κ ∞ is the p ermeabilit y at ”inﬁnite” p ressure and b is the Klink enb erg parameter. In addition to Darcy’s ﬂo w, mass ﬂux o ccurs by molecular diﬀusion according to f k β = − φτ 0 τ β ρ β d k β ∇ X k β . Here d k β is the molecular diﬀusion co eﬃcien t for comp onen t k in phase β , τ 0 τ β is the to r tuosit y whic h includes a p orous medium d ep enden t factor τ 0 and a co eﬃcient that dep ends on ph ase saturation S β , τ β = τ β ( S β ). Finally , the total mass ﬂux is ﬁnally giv en by F k = F abd + X β f k β . B.2. Relative p ermeability mo del. Th e relativ e p ermeabilit y is assu med to follo w th e v an Genuc h ten-Mualem mo del w hic h for liquid is κ r l = ( √ S ∗ h 1 −  1 − S ∗ 1 /λ  λ i 2 , S l < S ls 1 , S l ≥ S ls and f or gas is κ r g = ( 1 − κ r l , if S g r = 0  1 − ˆ S  2  1 − ˆ S 2  , if S g r > 0 sub ject to restrictio n 0 ≤ κ r l , κ r g ≤ 1. Here S ∗ = ( S l − S lr ) / ( S ls − S lr ) and ˆ S = ( S l − S lr )(1 − S lr − S g r ). B.3. Capilla ry p ressure mo del. F or the capillary pressu r e w e use th e v an Genuc h ten function giv en as P cap = − P 0  S ∗− 1 /λ − 1  1 − λ sub ject to restrictions − P max ≤ P cap ≤ 0. Again S ∗ is deﬁned as for the relativ e p ermeabilit y . B.4. P arameters. T able 1 disp la ys the n ominal v alues assigned to all parameters requir ed according to the ab o ve go v erning equ ations. It is imp ortan t also to note the f ollo wing for our sim ulation: 1. The mobilities are upstream we ighted. 2. The p er m eabilities are harmonic weigh ted. 3. Mo dule EOS7r currently neglects brine and daughte r radionuclide, resulting in a 3- comp onen t ﬂow. 4. Adsorption eﬀects are neglected. 5. Bio degradation eﬀects are neglected. REFERENCES [1] Aquav eo, L.L.C., 2007. Gr oundwater Mo deli ng System V ersion 6.5.6, build date, May 27, 2009, UT, USA . EFFICIENT BA YESIAN EXPERI MENT A TION 31 [2] A.C. Atkinson and A .N. Donev. Optimum exp erimental design with SAS . Oxford Statistical Science Series, Oxford Universit y Pres s, 2007. [3] D.A. Ba´ u and A.S. May er. O ptimal d esign for pump-and -treat under u ncertain hydraulic condu ctivity and plume distribution. Journal of Contaminant Hydr olo gy , 100: 30–46, 2008. [4] K. Chaloner and I. V erdinelli. Bay esian exp erimen tal design: a review. Statistic al Scienc e , 10:273–304, 1995. [5] A. Cohen, M.A. Dave n p ort, and D. Leviatan. On the stabilit y and accuracy of leats squares approxima - tions. F oundations of Computational Mathematics , 13:81 9–834, 2013. [6] V.V. F edoro v . The ory of optimal exp eriments . Academic Press, New Y ork, 197 2. [7] R.A. F reeze, B. James, J. Massmann, T. Sp erling, and L. Smith. H ydrogeological decision an alysis, 4: The concept of data worth and its use in the developmen t of site inv estigation strategies. Gr ound Water , 30 :574–588, 1992. [8] R.G. Ghanem and P .D. S panos. Sto chastic ﬁnite elements: A sp e ctr al appr o ach, r evise d e dition . Dov er Publications Inc, 2012. [9] D.M. Ghiocel and R.G. Ghanem. Sto c h astic ﬁnite element analysis of seismic soil structure interaction. Journal of Engine ering Me chanics , 12 8:66–77, 2002. [10] H. Haario, E. S ak sman, and J. T amminen. A n adaptive metrop olis algorithm. Bernoul li , 7:223–242, 2001. [11] H. Haddad-Zadegan, R. Ghanem, and P . Ha jali. Data w orth analysis in spatial prediction and soil remediation. In G. Deod atis and B. Ellingwood , editors, ICOSSAR’13: International Conf er enc e on Structur al Safety and R eliabili ty , Columbia Universit y , June 16 -20 2013. [12] J. Hampton and A . Do ostan. Coherence motiv ated sampling and conv ergence analysis of least-squares p olynomial c haos regression. arXiv pr eprint, arXiv:1410.1931 , 201 4. [13] W.K. Hastings. Monte carlo sampling meth o ds using marko v chains and their applications. Biometrika , 57:97–1 09, 1970. [14] S. Hosder, R .W. W alters, and M Bal ch. Eﬃcient sampling for non- intrusiv e p olynomial chaos applications with multiple un certain input v ariables. In 9th AIAA Non-Deterministic Appr o aches Confer enc e , 2007. [15] X. Huan and Y. Marzo uk . Sim ulation-based opt imal ba yesian ex p erimen tal design for nonlinear systems. Journal of Computational Physics , pages 288–317, 2013 . [16] X. Hu an and Y . Marzouk. Gradien t-b ased sto chas t ic optimization metho ds in bayes ian exp erimental design. I nternational Journal for Unc ertainty Quantiﬁc ation , 4:47 9–510, 2014. [17] B.R. James and R.A. F reeze. The wo rth of d ata in predicting aquitard contin u it y in hydrogeologi cal design. W ater R esour c es R ese ar ch , 29:20 49–2065, 1993. [18] B.R. James and S .M. Gorelic k. Wh en enough is enough: The worth of monitoring data in aquifer remediation design. Water R esour c es R ese ar ch , 30:3 499–3513, 1994. [19] J. Kiefer and J. W olfo witz. Stochastic estimation of a reg ression function. T he Annals of Mathematic al Statistics , 23 :462–466, 1952. [20] S. Kullback and R.A. Leibler. On information and suﬃciency . The Ann al s of Mathematic al Statistics , 22:79–8 6, 1951. [21] H. Li and D. Zhang. Probabilistic collocation method for ﬂow in p orous media: Comparisons with other stochastic methods. Water R esour c es R ese ar ch , 43:4 4–56, 2007. [22] D. V. Lindley . On a measure of the info rmation p rovided b y an exp eriment. T he Annals of Mathematic al Statistics , 27 :986–1005, 1956. [23] Q. Long, M. Scavino, R. T empone, and S. W ang. F ast estimation of ex p ected information gains for bay esian exp erimental desig n s based on laplace approximations. Computational Metho ds in Appli e d Me chanics and Engine ering , 259, 2013. [24] J. Ma ssmann and R.A. F reeze. Groundwater contamination fro m waste management sites: The in terac- tion betw een risk-based engineering design and regulatory p olicy:1. metho dology . Water R esour c es R ese ar ch , 23 :351–367, 1987. [25] N. Metrop olis, A.W. Rosenbluth, M.N. R osen blu t h, A.H. T eller, and E. T eller. Equ ations of sta te calcu- lations b y fa st computing mac hines. Journal of Chemic al Physics , 21:10 87–1091, 1953. [26] T. Norb erg and L. Ros´ en. Calculating th e optimal num b er of contaminan t samples by means of data w orth analysis. Envir onmetrics , 17:705–719, 2006. [27] C. Oldenburg and K. Pruess. EOS7R: R adionuclide tr ansp ort for TOUGH2 . Berk eley , California, No vem- 32 P ANAGIOTIS TSILIFIS, R OGER G. GHANEM AND P ARI S HAJALI b er 1995. R ep ort LBL-3486 8. [28] K. Pruess, C. Olden b urg, and G. Mo ridis. TOUGH2 User’s guide, V ersion 2 . Berkeley , California, 1999 . Rep ort LBNL-4313 4. [29] C.E. R asm ussen and C.K.I. Williams. Gaussian pr o c esses for machine le arning . The M I T Press, 2006. [30] H. Robbins and S. Monro. A stochastic approximati on metho d. The Annals of Mathematic al Statistics , 22:400– 407, 1951. [31] K.J. Ryan. Estimating exp ected information gains for exp erimental designs with app lication to the random fa tigue-limit mo del. Journal of Computational and Gr aphic al Statistics , 12:5 85–603, 2003. [32] P . Sebastiani and H.P . Wy nn. Maximum entropy sampling and optimal bay esian exp erimental design. Journal of the R oyal Statistic al So ciety. Series B: Statistic al Metho dolo gy , 62:14 5–157, 2000 . [33] G.A.F. Seber and A.J. Lee. Line ar r e gr ession analysis, se c ond e dition . John Wiley and Sons, 2003. [34] C. Soize and R. Ghanem. Ph ysical systems with random uncertainties: Chaos representations with arbitrary probabilit y measure. SIAM Journal of Scientiﬁc Computing , 26(2): 395–410, 2004. [35] J.C. Spall. Multiv ariate sto chastic approximatio n using a simultaneous p erturb ation gradien t approxi- mation. I EEE T r ansactions on Automatic Contr ol , 37, 1992. [36] J.C. Sp all. Implementation of the sim u ltaneous p erturb ation algorithm for stochastic optimization. I EEE T r ansactions on A er osp ac e and Ele ctr onic Systems , 34 , 1998. [37] J. v an den Berg, A. Curtis, and J. T ramp ert. Optimal nonlinear b a yesian exp erimental design: an application to amplitude v ersus oﬀset experiments. Ge ophysic al Journal International , 155:411–42 1, 2003. [38] D. Xiu and G.E. Karniadakis. The wiener-ask ey p olynomial c haos for stochastic diﬀeren tial equations. SIAM Journal of Scientiﬁc Computing , 24:619–644, 2002. [39] D. X iu and G.E. Karniadakis. Modeling uncertaint y in ﬂ o w sim ulations via generalized polynomial chaos. Journal of Computational Physics , 187 :137–167, 2003. [40] K. Zhang, Y.S. W u, and K . Pruess. User’s guide for TO U GH2-MP - A massively p ar al lel version of the TOUGH2 c o de . Berkel ey , California, 2008. Rep ort LBNL-315E. [41] Y. Zhang and N .V. Sahinidis. Uncertain ty q u antiﬁcati on in co 2 sequestration using surrogate mo dels from polynomial c haos expansion. Industrial & Engine ering Chemistry R ese ar ch , 52:3121– 3132, 2012.

Efficient Bayesian experimentation using an expected information gain lower bound

Original Paper

Comments & Academic Discussion

Leave a Comment

Original Paper

Related Papers

Comments & Academic Discussion

Leave a Comment