Experimental Design via Generalized Mean Objective Cost of Uncertainty
Authors: Shahin Boluki, Xiaoning Qian, Edward R. Dougherty
Abstract—The mean objective cost of uncertainty (MOCU) quantifies the performance cost of using an operator that is optimal across an uncertainty class of systems as opposed to using an operator that is optimal for a particular system. MOCU-based experimental design selects an experiment to maximally reduce MOCU, thereby gaining the greatest reduction of uncertainty impacting the operational objective. The original formulation applied to finding optimal system operators, where optimality is with respect to a cost function, such as mean-square error, and the prior distribution governing the uncertainty class relates directly to the underlying physical system. Here we provide a generalized MOCU and the corresponding experimental design. We then demonstrate how this new formulation includes as special cases MOCU-based experimental design methods developed for materials science and genomic networks when there is experimental error. Most importantly, we show that the classical Knowledge Gradient and Efficient Global Optimization experimental design procedures are actually implementations of MOCU-based experimental design under their modeling assumptions.

I. INTRODUCTION

The mean objective cost of uncertainty (MOCU) quantifies the performance cost of using an operator that is optimal across an uncertainty class of systems as opposed to an operator that is optimal for a particular system within the class [1]. MOCU-based experimental design selects an experiment that maximally reduces MOCU, thereby optimally reducing uncertainty with respect to the operational objective [2].
For instance, if one wishes to design a Wiener filter when the relevant power spectra are not fully known but belong to an uncertainty class of power spectra, then the problem is to design a linear filter that is optimal relative to both mean-square error (MSE) and the probability mass over the uncertainty class. An optimal experiment maximally reduces MOCU relative to uncertainty in the relevant power spectra [3].

This letter provides a generalized formulation of MOCU not necessarily dependent on the particularities of the underlying system model or involving a design problem focused on operators. We show that the corresponding generalized experimental design encompasses existing formulations in signal processing, genomics, and materials discovery, and that it fits within Lindley's paradigm for Bayesian experimental design [4]. Within this generalized framework we examine the connections and differences of MOCU-based formulations with other Bayesian experimental design methods. In particular, we show that the generalized MOCU generates the same policies as Knowledge Gradient (KG) [5], [6] and Efficient Global Optimization (EGO) [7] under their modeling assumptions, that is, for optimal experimental design under Gaussian beliefs and observation noise for an offline ranking and selection problem. Not only does the generalized MOCU framework unify disparate problems, it opens up Bayesian experimental design for reduction of objective-related uncertainty, as demonstrated by materials discovery using Ginzburg-Landau theory.

S. Boluki, X. Qian and E. R. Dougherty are with the Department of Electrical and Computer Engineering, Texas A&M University, College Station, TX 77843 USA. Email: (s.boluki@tamu.edu, xqian@ece.tamu.edu, and edward@ece.tamu.edu).

II.
GENERALIZED MOCU

We first formulate experimental design in terms of generalized MOCU and then give the standard method by simply defining the terms in the generalized model appropriately.

In this letter, lower-case Greek letters denote random variables or distribution functions and capital Greek letters denote the corresponding domain spaces. We assume a probability space Θ with probability measure π, a set Ψ, and a function C : Θ × Ψ → [0, ∞), where Θ, π, Ψ, and C are called the uncertainty class, prior distribution, action space, and cost function, respectively. Elements of Θ and Ψ are called uncertainty parameters and actions, respectively. For any θ ∈ Θ, an optimal action is an element ψ_θ ∈ Ψ such that C(θ, ψ_θ) ≤ C(θ, ψ) for any ψ ∈ Ψ. An intrinsically Bayesian robust (IBR) action is an element ψ_IBR^Θ ∈ Ψ such that E_θ[C(θ, ψ_IBR^Θ)] ≤ E_θ[C(θ, ψ)] for any ψ ∈ Ψ. Whereas ψ_IBR^Θ is optimal over Θ, for θ ∈ Θ, ψ_θ is optimal relative to θ. The objective cost of uncertainty is defined by the performance loss of applying ψ_IBR^Θ instead of ψ_θ on θ:

U_Ψ(Θ) = C(θ, ψ_IBR^Θ) − C(θ, ψ_θ). (1)

Averaging this cost over Θ gives the mean objective cost of uncertainty (MOCU):

M_Ψ(Θ) = E_θ[C(θ, ψ_IBR^Θ) − C(θ, ψ_θ)]. (2)

The action space is arbitrary so long as the cost function is defined on Θ × Ψ. It can be a set of filters defined on a random process with C being mean-square error, or a set of drug interventions with C quantifying patient condition. As noted in [1], MOCU can be viewed as the minimum expected value of a Bayesian loss function that maps an operator to its differential cost (for using the given operator instead of an optimal operator). The minimum expectation is attained by an optimal robust operator that minimizes the average differential cost.
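As a concrete illustration of (1)-(2), the following minimal Python sketch computes the IBR action and the MOCU for a discrete uncertainty class. The prior and cost table are hypothetical numbers invented for the example, not taken from this letter.

```python
# Hypothetical toy instance of (1)-(2): three models in the uncertainty
# class, two actions, and an arbitrary illustrative cost table.
prior = [0.5, 0.3, 0.2]            # pi(theta) over Theta = {0, 1, 2}
cost = [[1.0, 3.0],                # cost[theta][psi] = C(theta, psi)
        [2.0, 0.5],
        [4.0, 1.0]]
actions = range(len(cost[0]))

# IBR action: minimizes the prior-expected cost E_theta[C(theta, psi)].
expected_cost = [sum(p * row[a] for p, row in zip(prior, cost))
                 for a in actions]
psi_ibr = min(actions, key=lambda a: expected_cost[a])

# MOCU (2): prior-averaged differential cost of acting robustly,
# i.e., E_theta[C(theta, psi_IBR) - C(theta, psi_theta)].
mocu = sum(p * (row[psi_ibr] - min(row)) for p, row in zip(prior, cost))
```

Note that MOCU is always nonnegative: the IBR action can never beat the model-specific optimal action on any θ.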
In decision theory, this differential cost is called the regret, which is defined as the difference between the maximum payoff (for making an optimal decision) and the actual payoff (for the decision that has been made). From this perspective, MOCU can be viewed as the minimum expected regret for using a robust operator.

Suppose there is a set Ξ, called the experiment space, whose elements ξ, called experiments, are jointly distributed with the uncertainty parameters θ. Given ξ ∈ Ξ, the conditional distribution π(θ|ξ) is the posterior distribution relative to ξ, and Θ|ξ denotes the corresponding probability space, called the conditional uncertainty class. Relative to Θ|ξ, we define IBR actions ψ_IBR^{Θ|ξ} and the conditional (remaining) MOCU,

M_Ψ(Θ|ξ) = E_{θ|ξ}[C(θ, ψ_IBR^{Θ|ξ}) − C(θ, ψ_θ)], (3)

where the expectation is with respect to π(θ|ξ). Taking the expectation over ξ gives the expected remaining MOCU,

D_Ψ(Θ, ξ) = E_ξ[M_Ψ(Θ|ξ)] = E_ξ[E_{θ|ξ}[C(θ, ψ_IBR^{Θ|ξ}) − C(θ, ψ_θ)]], (4)

which is called the experimental design value. An optimal experiment ξ* ∈ Ξ minimizes D_Ψ(Θ, ξ), i.e.,

ξ* = argmin_{ξ∈Ξ} D_Ψ(Θ, ξ). (5)

ξ* also minimizes the difference between the expected remaining MOCU and the current MOCU:

ξ* = argmin_{ξ∈Ξ} D_Ψ(Θ, ξ) − M_Ψ(Θ)
   = argmin_{ξ∈Ξ} E_ξ[E_{θ|ξ}[C(θ, ψ_IBR^{Θ|ξ}) − C(θ, ψ_θ)]] − E_θ[C(θ, ψ_IBR^Θ) − C(θ, ψ_θ)]
   = argmin_{ξ∈Ξ} E_ξ[E_{θ|ξ}[C(θ, ψ_IBR^{Θ|ξ})]] − E_θ[C(θ, ψ_IBR^Θ)]. (6)

With sequential experiments, the action space and experiment space can be time-dependent, i.e., they can be different for each time step. Hereafter, in sequential experiment setups, the action space and experiment space at time step t, and the optimal experiment selected at t to be performed at the next time step, are denoted by Ψ_t, Ξ_t, and ξ*,t, respectively.
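The selection rule (4)-(5) can be sketched for a toy discrete setup. The Bernoulli prior, cost function, and experiment error rates below are hypothetical, chosen only to make the computation concrete: each candidate experiment gives a noisy binary reading of one component of θ.

```python
from itertools import product

# Hypothetical sketch of (4)-(5): theta = (theta1, theta2), each
# Bernoulli(0.5); two candidate experiments, each flipping its reading
# of theta_i with probability delta[i]. All numbers are illustrative.
thetas = list(product([0, 1], repeat=2))
prior = {th: 0.25 for th in thetas}
actions = [0, 1]
delta = [0.1, 0.4]                 # error rates of experiments xi_1, xi_2

def cost(th, psi):                 # arbitrary cost C(theta, psi)
    return [2.0 * th[0] + 0.5 * th[1], 1.0][psi]

def mocu(dist):
    """MOCU of a distribution over Theta, per (2)/(3)."""
    exp_cost = [sum(p * cost(th, a) for th, p in dist.items())
                for a in actions]
    ibr = min(actions, key=lambda a: exp_cost[a])
    return sum(p * (cost(th, ibr) - min(cost(th, a) for a in actions))
               for th, p in dist.items())

def design_value(i):
    """Expected remaining MOCU D(Theta, xi_i), per (4)."""
    total = 0.0
    for y in (0, 1):               # possible experiment outcomes
        # joint p(theta, y): prior times likelihood of reading y
        joint = {th: p * ((1 - delta[i]) if th[i] == y else delta[i])
                 for th, p in prior.items()}
        p_y = sum(joint.values())
        posterior = {th: q / p_y for th, q in joint.items()}
        total += p_y * mocu(posterior)
    return total

best = min(range(2), key=design_value)
```

In this example the less noisy experiment on the decision-relevant parameter θ_1 wins: measuring θ_2 leaves the IBR action unchanged for every outcome, so its design value equals the current MOCU.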
Let π(θ|ξ_{:t}) be the posterior distribution after observing the selected experiments' outcomes from the first time step through t, and let Θ|ξ_{:t} denote the corresponding conditional uncertainty class. When experiments are selected sequentially and there is no fixed limited budget of experiments, but instead the experimenter wants to stop the iterative procedure when only negligible knowledge regarding the objective can be gained from additional experiments, the form in (6) is useful because it incorporates the difference between the expected remaining MOCU and the current MOCU. The iterative procedure may be stopped if this difference falls below a threshold. While this procedure is optimal at each step, it is not optimal given a fixed number of experiments to be performed. This latter kind of finite-horizon optimal design using MOCU is treated in [8] using dynamic programming.

In the standard formulation, MOCU depends on a class of operators applied to a parameterized physical model in which θ is a random vector whose distribution depends on a physical characterization of the uncertainty. For instance, in a gene regulatory network, uncertainty arises regarding regulations, and experimental design decides which unknown regulations should be determined via experiments so as to minimize the cost of uncertainty relative to the objective of minimizing the long-run likelihood of the cell being in a cancerous state [1], [2], [9]. Θ is an uncertainty class of system models parameterized by a vector θ governed by a probability distribution π(θ), and Ψ is a class of operators on the models whose performances are measured by C. For each operator ψ, C(θ, ψ) is the cost of applying ψ on model θ ∈ Θ.
Initially proposed for optimal intervention in Markovian regulatory networks [1] and optimal robust classification [10], IBR operators have been designed for linear and morphological filters [11] and Kalman filters [12]. As originally formulated [2], experimental design involves k experiments T_1, ..., T_k, where experiment T_i exactly determines the uncertain parameter θ_i in θ = (θ_1, θ_2, ..., θ_k) ∈ Θ. The conditional uncertainty vector θ|θ_i is composed of all uncertain parameters other than θ_i, with θ_i now determined by T_i. Θ|θ_i is the reduced uncertainty class given θ_i. The IBR operator for Θ|θ_i, the remaining MOCU given θ_i, and the experimental design value take the forms ψ_IBR^{Θ|θ_i}, M_Ψ(Θ|θ_i), and D(θ_i) = E_{θ_i}[M_Ψ(Θ|θ_i)], respectively. The optimal experiment T_{i*} is specified by i* = argmin_{i=1,...,k} D(θ_i).

Returning to the generalized MOCU formulation, there is wide flexibility in experimental design, depending on the assumptions regarding the uncertainty class, action space, and experiment space, leading to many existing Bayesian experimental design formulations. Bayesian experimental design has a long history, in particular utilizing the expected gain in Shannon information [13], [14], [15], [16]. In 1972, Lindley proposed a general decision-theoretic approach incorporating a two-part decision involving the selection of an experiment followed by a terminal decision [4]. Supposing λ is a design selected from a family Λ and X is a data vector, and leaving out the terminal decision, an optimal experiment is given by

λ* = argmax_{λ∈Λ} E_X[E_Θ[U(θ, X, λ)|X, λ]|λ], (7)

where U is a utility function (see [17] for the full decision-theoretic optimization).
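A minimal sketch of the original exact-determination formulation, under a hypothetical two-parameter binary example (the priors and cost table below are invented for illustration): each candidate experiment T_i reveals one component of θ exactly, and experiments are ranked by D(θ_i).

```python
from itertools import product

# Illustrative instance of the original formulation [2]: theta has two
# independent Bernoulli components and T_i reveals theta_i exactly.
p = [0.7, 0.4]                       # hypothetical P(theta_i = 1)
costs = {(0, 0): [0.0, 1.0], (0, 1): [0.2, 1.0],
         (1, 0): [3.0, 1.0], (1, 1): [3.5, 1.0]}   # C(theta, psi)

def prior(th):
    return ((p[0] if th[0] else 1 - p[0]) *
            (p[1] if th[1] else 1 - p[1]))

def mocu(weights):
    """MOCU of a subclass of Theta weighted by `weights` (unnormalized)."""
    z = sum(weights.values())
    exp_cost = [sum(w * costs[th][a] for th, w in weights.items()) / z
                for a in (0, 1)]
    ibr = min((0, 1), key=lambda a: exp_cost[a])
    return sum(w * (costs[th][ibr] - min(costs[th]))
               for th, w in weights.items()) / z

def design_value(i):
    """D(theta_i) = E_{theta_i}[ M(Theta | theta_i) ] for exact revelation."""
    total = 0.0
    for v in (0, 1):
        cond = {th: prior(th)
                for th in product((0, 1), repeat=2) if th[i] == v}
        total += sum(cond.values()) * mocu(cond)
    return total

i_star = min((0, 1), key=design_value)
```

Here determining θ_1 drives the remaining MOCU to zero for either revealed value, so T_1 is selected.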
With generalized MOCU, each experiment ξ corresponds to a data vector X|ξ and the expected remaining MOCU is

E_ξ[M_Ψ(Θ|X, ξ)] = E_{X|ξ}[E_Θ[C_{θ|(X|ξ)}(ψ_IBR^{Θ|(X|ξ)}) − C_{θ|(X|ξ)}(ψ_{θ|(X|ξ)})]]
                 = E_{X|ξ}[E_Θ[U_Ψ(θ, X, ξ; Θ)]]. (8)

From (8), the optimization of (5) can be expressed in the same form as (7), with ξ in place of λ and utility function −U_Ψ(θ, X, ξ; Θ). Hence, in descending order of generality, we have Lindley's procedure, generalized MOCU, and MOCU. The salient point regarding the latter is that the uncertainty is on the underlying random process, meaning the science, and its aim is to design a better operator on the underlying process. As stated in [18], there is a scientific gap in constructing functional models and making prior assumptions on model parameters when the actual uncertainty applies to the underlying random processes. We next show how generalized MOCU includes other existing objective-based experimental-design formulations.

III. GUIDING SIMULATIONS IN MATERIALS DISCOVERY

In [19], optimal experimental design based on MOCU is applied to a computational problem for shape memory alloy (SMA) design with desired stress-strain profiles for a particular dopant at a given concentration utilizing time-dependent Ginzburg-Landau (TDGL) theory. The TDGL model simulates the free energy for a specific dopant with a specified concentration, given the dopant's parameters. The assumption is that there is a set D = {d_1, ..., d_N} of N potential dopants and each dopant d_i can be characterized by two parameters, its strength h_i and its range of stress disturbance r_i. The concentration of the dopants can be selected from a set O = {o_1, ..., o_P} of P pre-specified values.
The true values of these dopant parameters are unknown; however, there exists a prior distribution over the dopant parameters. In summary, we have Θ = H × R and θ = [h, r], where h = [h_1, ..., h_N] and r = [r_1, ..., r_N], and H and R represent the sample spaces of h and r, respectively. Thus, θ_i = [h_i, r_i] fully characterizes dopant d_i. Since the computational complexity of the TDGL model is enormous, the goal is to find an optimal dopant and concentration to minimize the simulated energy dissipation with the least number of runs of the TDGL model (least number of experiments). Following [19], for this purpose a surrogate model g(h, r, o) is trained by fitting some initial data generated from the TDGL model. The surrogate model can approximately predict the dissipation energy for a specified dopant and concentration, and it is used as the cost function throughout the experimental design iterations. The TDGL model acts as the true underlying system, or Nature, and the surrogate model is the model of the true system. The action space is Ψ = {ψ_{d_i,o_j}}_{d_i∈D, o_j∈O}, where each action ψ_{d_i,o_j} is using the i-th dopant with the j-th possible concentration. The cost function is C(θ, ψ_{d_i,o_j}) = g(h_i, r_i, o_j). The experiment space is Ξ = {ξ_{d_i,o_j}}_{d_i∈D, o_j∈O}, where ξ_{d_i,o_j} corresponds to obtaining a noisy measurement of the dissipation energy when using the i-th dopant with the j-th concentration: ξ_{d_i,o_j} ~ f(ξ_{d_i,o_j}|θ_i), where f is a probability distribution. In this framework, the IBR action at time step t is

ψ_IBR^{Θ|ξ_{:t}} = argmin_{ψ∈Ψ} E_{θ|ξ_{:t}}[C(θ, ψ)] = argmin_{ψ_{d_i,o_j}∈Ψ} E_{θ|ξ_{:t}}[g(h_i, r_i, o_j)]. (9)
From (4) and (5), the optimal experiment at time step t is

ξ*,t = argmin_{ξ∈Ξ} E_ξ[E_{θ|ξ}[C(θ, ψ_IBR^{Θ|ξ,ξ_{:t}}) − C(θ, ψ_θ)]]
     = argmin_{ξ_{d_i,o_j}∈Ξ} E_{ξ_{d_i,o_j}}[E_{θ|ξ_{d_i,o_j},ξ_{:t}}[C(θ, ψ_IBR^{Θ|ξ_{:t+1}})]], (10)

where the second equality is due to the independence of C(θ, ψ_θ) from ξ_{d_i,o_j}. The last line of (10) is exactly the policy proposed in [19] for this materials science problem.

IV. DYNAMICAL GENETIC NETWORKS

In [9], optimal objective-based experimental design is derived for networks with multiple dynamic trajectories; the modeling in [9] is based on [20]. Briefly, the network's nodes and their corresponding values represent entities, proteins/chemicals or genes, and their corresponding concentration levels or expression levels, respectively. The values are assumed to be nonnegative integers. Each edge represents an interaction with its input, regulation, and output nodes. Each interaction can dynamically happen if all of its input and activator nodes are nonzero and its inhibitor nodes are zero. All interactions are known. When the network is in state x, it can have one or more possible interactions based on the node values, where if any takes place, the network transitions to a next state. When multiple interactions exist, if knowledge of the relative priorities of these competing interactions exists, we can completely determine the state trajectory of the network from an initial state x_0. The assumption is that these relative priorities are not known but can be measured one at a time with experimental error. If the network has R of these competing interactions, i.e., interactions that can dynamically happen at the same time, then the uncertainty class consists of a set of R Boolean random variables, Θ = {0, 1}^R, and θ = (θ_1, ..., θ_R), where θ_i ∈ {0, 1}, i = 1, ..., R.
The i-th experiment can determine the value of θ_i with an experimental error having probability δ_i. Specifically, if θ_i is selected to be measured, with probability 1 − δ_i the outcome of the experiment is θ_i, and with probability δ_i it is 1 − θ_i. Here, Ξ = {ξ_1, ..., ξ_R}, each experiment ξ_i corresponds to measuring θ_i, and

ξ_i|θ_i = { θ_i with probability 1 − δ_i; 1 − θ_i with probability δ_i. (11)

An action blocks an interaction from happening, so the action space is Ψ = {ψ_1, ..., ψ_A}, where A is the number of interactions that can be blocked. Each action changes the dynamic trajectory of the network. If the set of possible state trajectories is denoted by S_{ψ_i}^Θ when the i-th action (ψ_i) is taken, then the probability of each trajectory s ∈ S_{ψ_i}^Θ is

P_{S_{ψ_i}^Θ}(s) = E_{x_0}[E_θ[1_{s_{x_0,θ}(ψ_i)=s}]], (12)

where 1_w is the indicator function (1_w = 1 if w is true and 0 otherwise), and s_{x_0,θ}(ψ_i) is the deterministic trajectory for a fixed initial state x_0 and θ when action ψ_i is taken. Here, S_{ψ_i}^Θ = ∪_{x_0∈X_0} ∪_{θ∈Θ} s_{x_0,θ}(ψ_i), where X_0 denotes the set of all possible initial states. For each trajectory s, the dynamic performance cost ε(s) is defined as the distance (in terms of any appropriate norm) of the steady-state vector corresponding to that trajectory (x_f^s) from a desired distribution v, i.e., ε(s) = ||x_f^s − v||. Thus, the cost function for a fixed θ and action ψ is the expected cost over the possible trajectories, C(θ, ψ) = E_{S_ψ^Θ}[ε(s)]. The IBR action for this problem is

ψ_IBR^Θ = argmin_{ψ∈{ψ_1,...,ψ_A}} E_θ[C(θ, ψ)]. (13)
According to (4) and (5), the optimal experiment can be derived as

ξ* = argmin_{ξ_i∈Ξ} E_{ξ_i}[E_{θ|ξ_i}[C(θ, ψ_IBR^{Θ|ξ_i}) − C(θ, ψ_θ)]]
   = argmin_{ξ_i∈Ξ} E_{ξ_i}[E_{θ_i|ξ_i}[E_{θ\θ_i}[C(θ, ψ_IBR^{Θ|ξ_i}) − C(θ, ψ_θ)]]]
   = argmin_{ξ_i∈Ξ} E_{θ_i}[E_{ξ_i|θ_i}[E_{θ\θ_i}[C(θ, ψ_IBR^{Θ|ξ_i}) − C(θ, ψ_θ)]]]
   = argmin_{ξ_i∈Ξ} E_{θ_i}[E_{ξ_i|θ_i}[E_{θ\θ_i}[C(θ, ψ_IBR^{Θ|ξ_i})]]], (14)

where "\" denotes set subtraction in the subscripts. The second line holds because only the posterior distribution of θ_i depends on experiment ξ_i, and the last equality follows from the independence of C(θ, ψ_θ) from ξ_i. The last line is exactly the policy derived in [9], but there the policy derivation was based on adding the objective-based cost of experimental error to the previous notion of objective cost of uncertainty, whereas here we directly apply the generalized formulation of MOCU as formulated in Section II.

V. CONNECTION OF MOCU-BASED EXPERIMENTAL DESIGN WITH KG AND EGO

Knowledge Gradient (KG) [5], [6], which is used in different fields, from drug discovery to materials design [21], [22], was originally introduced as a solution to an offline ranking and selection problem, where the assumption is that there are A ≥ 2 actions (alternatives) that can be selected, i.e., Ψ = {ψ_1, ..., ψ_A}. Each action has an unknown true reward (sign-flipped cost), and at each time step an experiment provides a noisy observation of the reward of a selected action. There is a limited budget B on the number of measurements we can make before the time arrives to decide which action is the best, that being the one having the lowest expected cost (or the highest expected reward).
The assumption is that we have Gaussian prior beliefs over the unknown rewards: either independent Gaussian beliefs over the rewards when the rewards of different actions are uncorrelated, or a joint Gaussian belief when the rewards are correlated. In the independent case, for each action-reward pair (ψ_i, θ_{ψ_i}), θ_{ψ_i} ~ N(m_{ψ_i}, β_{ψ_i}). In the correlated case, the vector of rewards, [θ_{ψ_1}, ..., θ_{ψ_A}], has a multivariate Gaussian distribution N(m, Σ) with mean vector m = [m_{ψ_1}, ..., m_{ψ_A}] and covariance matrix Σ, with diagonal entries [β_{ψ_1}, ..., β_{ψ_A}]. If the selected action to be applied at t is ψ_t, then the observed noisy reward of ψ_t at that iteration is ξ_t = θ_{ψ_t} + ε_t, where θ_{ψ_t} is unknown and ε_t ~ N(0, λ_{ψ_t}) is independent of the reward of ψ_t. Here, the underlying system to learn is the unknown reward function, and each possible model is fully described by a reward vector θ = [θ_{ψ_1}, θ_{ψ_2}, ..., θ_{ψ_A}] in the uncertainty class Θ. For the independent case, π(θ) = ∏_{i=1}^A N(m_{ψ_i}, β_{ψ_i}). For the correlated case, π(θ) = N(m, Σ). The experiment space is Ξ = {ξ_1, ..., ξ_A}, where experiment ξ_i corresponds to applying ψ_i and getting a noisy observation of its reward θ_{ψ_i}, that is, measuring θ_{ψ_i} with observation noise, where ξ_i|θ_{ψ_i} ~ N(θ_{ψ_i}, λ_{ψ_i}). In the independent case the state of knowledge at each time point t is captured by the posterior means and variances of the rewards after incorporating observations ξ_{:t}, as S_t = [(m_ψ^t, β_ψ^t)]_{ψ∈Ψ}, and in the correlated case by the posterior vector of means and the covariance matrix after observing ξ_{:t}, as S_t = (m^t, Σ^t), where m^t = [m_{ψ_1}^t, ..., m_{ψ_A}^t] and the diagonal of Σ^t is the vector [β_{ψ_1}^t, ..., β_{ψ_A}^t].
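For reference, the independent-case belief update implied by this setup is ordinary Gaussian conjugacy; a minimal sketch with illustrative numbers:

```python
# Sketch of the independent-case update of (m_psi, beta_psi) after one
# noisy observation with known noise variance lam: precisions add and
# the posterior mean is the precision-weighted average.
def update(m, beta, obs, lam):
    """Posterior mean/variance of a reward after one noisy observation."""
    beta_new = 1.0 / (1.0 / beta + 1.0 / lam)
    m_new = beta_new * (m / beta + obs / lam)
    return m_new, beta_new

m1, b1 = update(m=0.0, beta=1.0, obs=2.0, lam=1.0)  # -> (1.0, 0.5)
```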
The probability space Θ|ξ_{:t} is equal to Θ|S_t and the cost function is C(θ, ψ) = −θ_ψ. For this problem, the IBR action at time step t is

ψ_IBR^{Θ|ξ_{:t}} = argmin_{ψ∈Ψ} E_{Θ|ξ_{:t}}[C(θ, ψ)] = argmin_{ψ∈Ψ} E_{Θ|ξ_{:t}}[−θ_ψ] = argmax_{ψ∈Ψ} E_{Θ|ξ_{:t}}[θ_ψ] = argmax_{ψ∈Ψ} m_ψ^t. (15)

Again, by (4) and (5), the optimal experiment at time step t can be derived:

ξ*,t = argmin_{ξ_i∈Ξ} E_{ξ_i|ξ_{:t}}[E_{θ|ξ_i,ξ_{:t}}[C(θ, ψ_IBR^{Θ|ξ_{:t},ξ_i})]] − E_{θ|ξ_{:t}}[C(θ, ψ_IBR^{Θ|ξ_{:t}})]
     = argmin_{ξ_i∈Ξ} E_{ξ_i|ξ_{:t}}[E_{θ|ξ_{:t+1}}[−θ_{ψ_IBR^{Θ|ξ_{:t+1}}}]] − E_{θ|ξ_{:t}}[−θ_{ψ_IBR^{Θ|ξ_{:t}}}]
     = argmax_{ξ_i∈Ξ} E_{ξ_i|ξ_{:t}}[E_{θ|ξ_{:t+1}}[θ_{ψ_IBR^{Θ|ξ_{:t+1}}}]] − E_{θ|ξ_{:t}}[θ_{ψ_IBR^{Θ|ξ_{:t}}}]
     = argmax_{ξ_i∈Ξ} E_{ξ_i|ξ_{:t}}[max_{ψ'∈Ψ} m_{ψ'}^{t+1}] − max_{ψ'∈Ψ} m_{ψ'}^t. (16)

The policy (16), derived by direct application of the generalized MOCU, is exactly the same as the original KG policy in [5], [6], and [23]. As KG is shown to be optimal when the horizon is a single measurement and asymptotically optimal (as the number of measurements goes to infinity), the same holds for the MOCU-based policy for this problem.

Efficient Global Optimization (EGO) [7], which is based on expected improvement (EI), is widely used for black-box optimization and experimental design. As shown in [22], KG reduces to EGO when there is no observation noise and choosing the best action at each time step is limited to selecting from the set of actions whose rewards have been previously observed; that is, at each time step, if we want to make a final decision as to the best action to be applied, it must be an action whose performance has been previously observed from the first time step up to that time. Thus, MOCU-based learning can also be reduced to EGO under its model assumptions. We will show this directly.
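For the independent-normal case, the expectation in (16) admits the closed form reported in [5]; a sketch, using that formula with illustrative numerical inputs:

```python
import math

def phi(z):   # standard normal pdf
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def Phi(z):   # standard normal cdf
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def kg_factors(m, beta, lam):
    """Per-experiment value of (16), independent-normal case, via the
    closed form of [5]: sigma_tilde * f(z), f(z) = z*Phi(z) + phi(z),
    z = -|m_i - max_{j!=i} m_j| / sigma_tilde, where sigma_tilde is the
    standard deviation of the change in the posterior mean."""
    out = []
    for i in range(len(m)):
        s = math.sqrt(beta[i] ** 2 / (beta[i] + lam[i]))
        best_other = max(m[j] for j in range(len(m)) if j != i)
        z = -abs(m[i] - best_other) / s
        out.append(s * (z * Phi(z) + phi(z)))
    return out

# Equal means, very different posterior variances: the uncertain action
# is the more valuable measurement.
kg = kg_factors(m=[0.0, 0.0], beta=[1.0, 0.01], lam=[1.0, 1.0])
```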
Consider the ranking and selection problem with no noise in the observations, so that ε_t = 0 for all t. Each experiment ξ_i corresponds to applying ψ_i and observing the true value of θ_{ψ_i}. Moreover, the choice of the best action at each time step is confined to the set of actions whose rewards have been previously observed. Let Ψ_t denote this set: Ψ_t = {ψ_{t'}}_{t'=1,...,t}. The IBR action at time t is

ψ_IBR^{Θ|ξ_{:t}} = argmin_{ψ∈Ψ_t} E_{Θ|ξ_{:t}}[−θ_ψ] = argmax_{ψ∈Ψ_t} θ_ψ, (17)

where the last equality is due to the fact that the reward of an action whose performance is already observed is known, since there is no observation noise. Let Z_t = {ξ_{t'}}_{t'=1,...,t} denote the set of experiments performed up to the current time t, where experiment ξ_{t'} corresponds to ψ_{t'} being applied at t' and its reward being observed, in other words, measurement of θ_{ψ_{t'}} at t'. Since there is no point in measuring an action's reward more than once, the next experiment is selected from the set of remaining experiments, so that the experiment space at time step t is Ξ_t = Ξ \ Z_t. From (4), (5), and (17), the optimal experiment selected at t is

ξ*,t = argmin_{ξ_i∈Ξ_t} E_{ξ_i|ξ_{:t}}[E_{θ|ξ_{:t+1}}[−θ_{ψ_IBR^{Θ|ξ_{:t+1}}}]] − E_{θ|ξ_{:t}}[−θ_{ψ_IBR^{Θ|ξ_{:t}}}]
     = argmax_{ξ_i∈Ξ\Z_t} E_{θ_{ψ_i}|ξ_{:t}}[max{θ_{ψ_i}, max_{ψ'∈Ψ_t} θ_{ψ'}}] − max_{ψ'∈Ψ_t} θ_{ψ'}
     = argmax_{ξ_i∈Ξ\Z_t} E_{θ_{ψ_i}|ξ_{:t}}[max{θ_{ψ_i} − max_{ψ'∈Ψ_t} θ_{ψ'}, 0}], (18)

which is exactly the EGO policy in [7].
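The inner expectation in the last line of (18) has a closed form when the posterior over an unmeasured reward is Gaussian, which is the usual expected-improvement expression employed by EGO [7]; a sketch:

```python
import math

def expected_improvement(m, sd, best):
    """E[max(theta - best, 0)] for theta ~ N(m, sd^2): the inner
    quantity of (18) under a Gaussian posterior on an unmeasured reward."""
    if sd == 0.0:
        return max(m - best, 0.0)
    z = (m - best) / sd
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return (m - best) * cdf + sd * pdf
```

The EGO policy then simply evaluates this quantity for every remaining experiment and measures the maximizer.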
There are fundamental differences between the general MOCU formulation and KG (or EGO): (1) with MOCU the experiment space and action space can be different, enabling more flexible experimental design compared to the assumption of the same experiment and action space in KG (or EGO); (2) MOCU considers the uncertainty directly on the underlying physical model, which allows direct incorporation of prior knowledge regarding the underlying system, whereas in KG (or EGO) the uncertainty is considered on the reward function and there is no direct connection between prior assumptions and the underlying physical model.

VI. A SIMULATION STUDY TO COMPARE MOCU-BASED EXPERIMENTAL DESIGN AND KG

In this section, we perform a simulation study to illustrate the flexibility of MOCU-based experimental design compared to KG, especially the importance of the flexibility of dissecting the uncertainty class assumptions to better incorporate prior knowledge regarding the underlying model. Here we compare the experimental design performances of MOCU and KG on a simulated quadratic function example with one input variable as the underlying reward function that we want to maximize: f(θ, ψ) = θ_1 ψ² + θ_2 ψ + θ_3, i.e., C(θ, ψ) = −f(θ, ψ). The observation noise is additive Gaussian with the distribution N(0, θ_4²). In this simulation model, θ_1, θ_2, θ_3, and θ_4 are unknown parameters. We take Ψ = {ψ_1, ..., ψ_20} = {0.5, 1, 1.5, ..., 10} as the set of actions (possible input values ψ). The corresponding experiment for each action is to apply ψ_i so that we can observe the outcome ξ_i (the reward):

ξ_i|θ ~ N(θ_1 ψ_i² + θ_2 ψ_i + θ_3, θ_4²). (19)

Note that, as shown in Section V, under the model assumptions of KG, MOCU-based experimental design results in the same policy as KG.
But here, as opposed to KG, which directly models the rewards (and corresponding costs) of actions with Gaussian distributions with fixed (prior) parameter values (either known or estimated), MOCU-based experimental design computes the generalized MOCU by modeling the uncertainty of the reward function through the uncertainty over the underlying parameters, to guide the experimental design procedure.

For both MOCU-based experimental design and KG, we assume that there is no prior knowledge on the model parameters θ = [θ_1, θ_2, θ_3, θ_4]. For MOCU, the non-informative prior π(θ) ∝ θ_4^{−2} is used, which updates to a Gaussian-inverse-gamma distribution (π*(θ)) when measurements become available as experiments are carried out in sequence. For KG, to model the rewards of actions directly with correlated Gaussian distributions, approximate beliefs are constructed at each experiment, since the noise variance is unknown and no joint Gaussian prior distribution exists over the reward values of the actions. For this approximation, following [24] and [22], a Gaussian process regression (GPR) model [25] with a quadratic basis (mean) function and a squared exponential covariance matrix with additive Gaussian observation noise is trained using the measurements performed (experiment outcomes observed) up to that time step (by maximizing the marginal log-likelihood of the observations).

In our simulation, θ_1 is drawn from U(−5, 2) (U(a, b) denotes the uniform distribution over the interval (a, b)); θ_2 is set to −2θ_1 r, where r is drawn from U(−2.5, 13); θ_3 is sampled from U(−5, 5); and θ_4 is set to σ(f) × w, where w ~ U(0.075, 0.7) and σ(f) denotes the true standard deviation of the reward values of the actions based on the given model parameters.
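The data-generating model of this simulation can be sketched directly from the description above; the random seed and the particular sampled draws are of course arbitrary:

```python
import random
import statistics

# Sketch of the simulation's data-generating model: draw the true
# parameters as described in the text, then simulate noisy rewards per (19).
rng = random.Random(0)
actions = [0.5 * k for k in range(1, 21)]      # Psi = {0.5, 1.0, ..., 10}

theta1 = rng.uniform(-5.0, 2.0)
theta2 = -2.0 * theta1 * rng.uniform(-2.5, 13.0)
theta3 = rng.uniform(-5.0, 5.0)
true_rewards = [theta1 * a ** 2 + theta2 * a + theta3 for a in actions]
# theta4 = sigma(f) * w: noise sd scaled to the spread of the true rewards
theta4 = statistics.pstdev(true_rewards) * rng.uniform(0.075, 0.7)

def observe(a):
    """One noisy reward observation xi_i | theta, per (19)."""
    return theta1 * a ** 2 + theta2 * a + theta3 + rng.gauss(0.0, theta4)
```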
Each simulation starts with four randomly selected actions, for which noisy observations of their rewards are simulated as initial training data for both MOCU-based experimental design and KG. The sequential experimental design procedures based on MOCU and KG are both continued for five iterations. For KG, at each time step t, the (posterior) vector of means (m^t), the covariance matrix (Σ^t), and the noise variance are estimated by training a GPR model on the available measurements, and the next experiment is selected by (16). For MOCU-based experimental design, at each time step t, the (posterior) Gaussian-inverse-gamma distribution after incorporating the available measurements is used in (6) to optimally select the next experiment.

To compare the performances, we check the average opportunity cost metric, defined as the difference between the true maximum of the reward among all the actions and the true reward of the action selected as the best one by each of the two experimental design strategies. Note that this best action might be different from the next experiment suggested by each policy. The best action at each time step is the one that would be selected to be applied if the iterative experiments were stopped at that time. In other words, each experimental design policy suggests the next experiment and, after observing the outcome and based on its updated beliefs, selects the best action (that would be applied if the iterative experiments were to stop) and the next experiment to be performed (if the experimental budget is not exhausted).
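The opportunity-cost metric itself is a one-line computation; a sketch with invented true rewards:

```python
# Opportunity cost: gap between the best achievable true reward and the
# true reward of the action a policy currently declares best.
# The reward values here are arbitrary illustrative numbers.
true_rewards = [3.0, 7.5, 6.0, 1.0]

def opportunity_cost(chosen):
    return max(true_rewards) - true_rewards[chosen]
```

Averaging this quantity over simulation runs, per iteration, gives the curves compared in Figure 1.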
When following the MOCU-based policy, the next suggested experiment is the minimizer of the expected remaining MOCU, but the best action at each time step is the IBR action that maximizes (minimizes) the expectation of the reward (cost) with respect to the (posterior) Gaussian-inverse-gamma distribution of the uncertain parameters based on the latest belief at that time step. When following the KG policy, the best action at each time step is the one that maximizes the (posterior) GPR mean value at that time step, which might be different from the next experiment suggested by KG.

Figure 1 illustrates the average opportunity cost for MOCU-based experimental design and KG over 1,000 simulation runs. As can be seen from the figure, as soon as the experimental design iterations begin, the MOCU-based policy consistently has the lower average opportunity cost compared to KG. This confirms that directly incorporating the model uncertainty (the uncertainty of the model parameters in this simulation study, as we assume that we have the model functional form) in the generalized MOCU framework results in a better experimental design policy. Note that at iteration 0 no experiment selection by either method is performed, and only four randomly selected experiment outcomes are available. Since the flat (non-informative) prior is assumed for the parameters in the MOCU-based framework, the IBR action selection as the best action can be very conservative before beginning the experimental design procedure. The maximizer of the direct approximation of the reward function by GPR at iteration 0 is better than the IBR action for this simple simulation model. But as soon as the first experiment is selected by the policies, the MOCU-based policy reduces the uncertainty pertaining to the objective very sharply with the observed measurements and performs consistently better than KG.
[Figure 1: line plot of average opportunity cost (y-axis, 0 to 30) versus experiment design iteration (x-axis, 0 to 5) for the MOCU-based policy and KG.]

Fig. 1. Average opportunity cost of the MOCU-based policy compared with the KG policy.

VII. CONCLUSIONS

In this letter, we present a generalized MOCU framework, leading to MOCU-based experimental design that maximally reduces the uncertainty of the differential cost with respect to the actual operational objectives. The proposed framework fits into Lindley's utility paradigm [4] in classical Bayesian experimental design and is more flexible for developing corresponding experimental design strategies for different real-world applications than the existing KG and EGO methods with their corresponding model assumptions. As we have shown in the simulation study (Section VI) and in the recent applications to life and materials science (Sections III and IV), our generalized MOCU framework, with its flexible dissection of the uncertainty class, action (operator) space, experiment space, and utility function depending on operational objectives, can lead to better objective-based uncertainty quantification and thereby better experimental design that converges to desired objectives with smaller operational cost.

ACKNOWLEDGMENTS

The authors acknowledge the support of NSF through the projects CAREER: Knowledge-driven Analytics, Model Uncertainty, and Experiment Design, NSF-CCF-1553281; and DMREF: Accelerating the Development of Phase-Transforming Heterogeneous Materials: Application to High Temperature Shape Memory Alloys, NSF-CMMI-1534534.

REFERENCES

[1] Byung-Jun Yoon, Xiaoning Qian, and Edward R. Dougherty, "Quantifying the objective cost of uncertainty in complex dynamical systems," IEEE Transactions on Signal Processing, vol. 61, no. 9, pp. 2256–2266, May 2013.
[2] Roozbeh Dehghannasiri, Byung-Jun Yoon, and Edward R. Dougherty, "Optimal experimental design for gene regulatory networks in the presence of uncertainty," IEEE/ACM Trans. Comput. Biol. Bioinformatics, vol. 12, no. 4, pp. 938–950, July 2015.
[3] Roozbeh Dehghannasiri, Xiaoning Qian, and Edward R. Dougherty, "Optimal experimental design in the context of canonical expansions," IET Signal Processing, vol. 11, pp. 942–951, October 2017.
[4] Dennis V. Lindley, Bayesian Statistics, A Review, SIAM, Philadelphia, 1972.
[5] Peter I. Frazier, Warren B. Powell, and Savas Dayanik, "A knowledge-gradient policy for sequential information collection," SIAM Journal on Control and Optimization, vol. 47, no. 5, pp. 2410–2439, 2008.
[6] Peter I. Frazier, Warren B. Powell, and Savas Dayanik, "The knowledge-gradient policy for correlated normal beliefs," INFORMS Journal on Computing, vol. 21, no. 4, pp. 599–613, 2009.
[7] Donald R. Jones, Matthias Schonlau, and William J. Welch, "Efficient global optimization of expensive black-box functions," Journal of Global Optimization, vol. 13, no. 4, pp. 455–492, 1998.
[8] Mahdi Imani, Roozbeh Dehghannasiri, Ulisses M. Braga-Neto, and Edward R. Dougherty, "MOCU significantly outperforms entropy for experimental design," Submitted, 2017.
[9] Daniel N. Mohsenizadeh, Roozbeh Dehghannasiri, and Edward R. Dougherty, "Optimal objective-based experimental design for uncertain dynamical gene networks with experimental error," IEEE/ACM Transactions on Computational Biology and Bioinformatics, 2017, doi: 10.1109/TCBB.2016.2602873.
[10] Lori A. Dalton and Edward R. Dougherty, "Optimal classifiers with minimum expected error within a Bayesian framework–part I: Discrete and Gaussian models," Pattern Recognition, vol. 46, no. 5, pp. 1301–1314, 2013.
[11] Lori A. Dalton and Edward R.
Dougherty, "Intrinsically optimal Bayesian robust filtering," IEEE Transactions on Signal Processing, vol. 62, no. 3, pp. 657–670, Feb 2014.
[12] Roozbeh Dehghannasiri, Mohammad S. Esfahani, and Edward R. Dougherty, "Intrinsically Bayesian robust Kalman filter: An innovation process approach," IEEE Transactions on Signal Processing, vol. 65, no. 10, pp. 2531–2546, May 2017.
[13] M. Stone, "Application of a measure of information to the design and comparison of regression experiments," Annals of Mathematical Statistics, vol. 30, pp. 55–70, 1959.
[14] Morris H. DeGroot, "Uncertainty, information and sequential experiments," Annals of Mathematical Statistics, vol. 33, no. 2, pp. 404–419, 1962.
[15] Morris H. DeGroot, Concepts of Information Based on Utility, pp. 265–275, Springer Netherlands, Dordrecht, 1986.
[16] José M. Bernardo, "Expected information as expected utility," Annals of Statistics, vol. 7, no. 3, pp. 686–690, 1979.
[17] Kathryn Chaloner and Isabella Verdinelli, "Bayesian experimental design: A review," Statistical Science, vol. 10, no. 3, pp. 273–304, 1995.
[18] Xiaoning Qian and Edward R. Dougherty, "Bayesian regression with network prior: Optimal Bayesian filtering perspective," IEEE Transactions on Signal Processing, vol. 64, no. 23, pp. 6243–6253, 2016.
[19] Roozbeh Dehghannasiri, Dezhen Xue, Prasanna V. Balachandran, Mohammadmahdi R. Yousefi, Lori A. Dalton, Turab Lookman, and Edward R. Dougherty, "Optimal experimental design for materials discovery," Computational Materials Science, vol. 129, pp. 311–322, 2017.
[20] Daniel N. Mohsenizadeh, Jianping Hua, Michael Bittner, and Edward R. Dougherty, "Dynamical modeling of uncertain interaction-based genomic networks," BMC Bioinformatics, vol. 16, no. 13, p. S3, Dec 2015.
[21] Si Chen, Kristofer-Roy G. Reyes, Maneesh K. Gupta, Michael C. McAlpine, and Warren B.
Powell, "Optimal learning in experimental design using the knowledge gradient policy with application to characterizing nanoemulsion stability," SIAM/ASA Journal on Uncertainty Quantification, vol. 3, no. 1, pp. 320–345, 2015.
[22] Peter I. Frazier and Jialei Wang, "Bayesian optimization for materials design," Information Science for Materials Discovery and Design, vol. 225, pp. 45–57, 2016.
[23] Yingfei Wang, Kristofer G. Reyes, Keith A. Brown, Chad A. Mirkin, and Warren B. Powell, "Nested-batch-mode learning and stochastic optimization with an application to sequential multistage testing in materials science," SIAM J. Scientific Computing, vol. 37, 2015.
[24] Warren Scott, Peter Frazier, and Warren Powell, "The correlated knowledge gradient for simulation optimization of continuous parameters using Gaussian process regression," SIAM Journal on Optimization, vol. 21, no. 3, pp. 996–1026, 2011.
[25] Carl E. Rasmussen and Christopher K.I. Williams, Gaussian Processes for Machine Learning, Adaptive Computation and Machine Learning series, University Press Group Limited, 2006.