Algorithms for Probabilistically-Constrained Models of Risk-Averse Stochastic Optimization with Black-Box Distributions



Chaitanya Swamy∗

Abstract

We consider various stochastic models that incorporate the notion of risk-averseness into the standard 2-stage recourse model, and develop novel techniques for solving the algorithmic problems arising in these models. A key notable feature of our work that distinguishes it from work in some other related models, such as the (standard) budget model and the (demand-)robust model, is that we obtain results in the black-box setting, that is, where one is given only sampling access to the underlying distribution. Our first model, which we call the risk-averse budget model, incorporates the notion of risk-averseness via a probabilistic constraint that restricts the probability (according to the underlying distribution) with which the second-stage cost may exceed a given budget B to at most a given input threshold ρ. We also consider a closely related model that we call the risk-averse robust model, where we seek to minimize the first-stage cost and the (1 − ρ)-quantile (according to the distribution) of the second-stage cost. We obtain approximation algorithms for a variety of combinatorial optimization problems including the set cover, vertex cover, multicut on trees, min cut, and facility location problems, in the risk-averse budget and robust models with black-box distributions. We first devise a fully polynomial approximation scheme for solving the LP-relaxations of a wide variety of risk-averse budgeted problems. Complementing this, we give a rounding procedure that lets us use existing LP-based approximation algorithms for the 2-stage stochastic and/or deterministic counterpart of the problem to round the fractional solution.
Thus, we obtain near-optimal solutions to risk-averse problems that preserve the budget approximately and incur a small blow-up of the probability threshold (both of which are unavoidable). To the best of our knowledge, these are the first approximation results for problems involving probabilistic constraints and black-box distributions. Our results extend to the setting with non-uniform scenario-budgets, and to a generalization of the risk-averse robust model, where the goal is to minimize the sum of the first-stage cost and a weighted combination of the expectation and the (1 − ρ)-quantile of the second-stage cost.

∗ cswamy@math.uwaterloo.ca. Dept. of Combinatorics and Optimization, Univ. Waterloo, Waterloo, ON N2L 3G1. Supported in part by NSERC grant 32760-06.

1 Introduction

Stochastic optimization models provide a means to model uncertainty in the input data, where the uncertainty is modeled by a probability distribution over the possible realizations of the actual data, called scenarios. Starting with the work of Dantzig [10] and Beale [2] in the 1950s, these models have found increasing application in a wide variety of areas; see, e.g., [4, 35] and the references therein. An important and widely-used model in stochastic programming is the 2-stage recourse model: first, given the underlying distribution over scenarios, one may take some first-stage actions to construct an anticipatory part of the solution, x, incurring an associated cost c(x). Then, a scenario A is realized according to the distribution, and one may take additional second-stage recourse actions y_A incurring a certain cost f_A(x, y_A). The goal in the standard 2-stage model is to minimize the total expected cost, c(x) + E_A[f_A(x, y_A)]. Many applications come under this setting. An oft-cited motivating example is the 2-stage stochastic facility location problem. A company has to decide where to set up its facilities to serve client demands. The demand-pattern is not known precisely at the outset, but one does have some statistical information about the demands. The first-stage decisions consist of deciding which facilities to open initially, given the distributional information about the demands; once the client demands are realized according to this distribution, we can extend the solution by opening more facilities, incurring their recourse costs. The recourse costs are usually higher than the original ones (e.g., because opening a facility later involves deploying resources with a small lead time), could be different for the different facilities, and could even depend on the realized scenario.

A common criticism of the standard 2-stage model is that the expectation measure fails to adequately measure the "risk" associated with the first-stage decisions: two solutions with the same expected cost are valued equally. But in realistic settings, one also considers the risk involved in the decision. For example, in the stochastic facility location problem, given two solutions with the same expected cost, one which incurs a moderate second-stage cost in all scenarios, and one where there is a non-negligible probability that a "disaster scenario" with a huge associated cost occurs, a company would naturally prefer the former solution.

Our models and results. We consider various stochastic models that incorporate the notion of risk-averseness into the standard 2-stage model and develop novel techniques for solving the algorithmic problems arising in these models.
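To make the contrast between expectation and risk concrete, here is a small simulation (the two "solutions" and their cost numbers are our own illustration, not a construction from this paper) of two first-stage decisions with identical expected second-stage cost but very different tails:

```python
import random

random.seed(0)

# Two hypothetical second-stage cost distributions with the SAME expected
# cost: solution 1 always costs 100, while solution 2 costs 50 normally but
# hits a "disaster scenario" of cost 5050 with probability 1%.
def cost_solution1():
    return 100.0

def cost_solution2():
    return 5050.0 if random.random() < 0.01 else 50.0

samples1 = [cost_solution1() for _ in range(100000)]
samples2 = [cost_solution2() for _ in range(100000)]

mean1 = sum(samples1) / len(samples1)
mean2 = sum(samples2) / len(samples2)
print(f"expected costs: {mean1:.1f} vs {mean2:.1f}")   # both are ~100

# But the tails differ drastically: with budget B = 200,
# Pr[cost > B] is 0 for solution 1 and ~0.01 for solution 2.
B = 200.0
viol1 = sum(c > B for c in samples1) / len(samples1)
viol2 = sum(c > B for c in samples2) / len(samples2)
print(f"Pr[cost > {B}]: {viol1:.3f} vs {viol2:.3f}")
```

A risk-neutral expectation objective cannot distinguish the two, whereas a probabilistic constraint Pr[cost > B] ≤ ρ with, say, ρ = 0.005 rules out the second solution.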
A key notable feature of our work that distinguishes it from work in some other related models [19, 11], is that we obtain results in the black-box setting, that is, where one is given only sampling access to the underlying distribution. To better motivate our models, we first give an overview of some related models considered in the approximation-algorithms literature that also embody the idea of risk-protection, and point out why these models are ill-suited to the design of algorithms in the black-box setting. One simple and natural way of providing some assurance against the risk due to scenario-uncertainty is to provide bounds on the second-stage cost incurred in each scenario. Two closely related models in this vein are the budget model, considered by Gupta, Ravi and Sinha [19], and the (demand-)robust model, considered by Dhamdhere, Goyal, Ravi and Singh [11]. In the budget model, one seeks to minimize the expected total cost subject to the constraint that the second-stage cost f_A(x, y_A) incurred in every scenario A be at most some input budget B. (In general, one could have a different budget B_A for each scenario A, but for simplicity we focus on the uniform-budget model.) Gupta et al. considered the budget model in the polynomial-scenario setting, where one is given explicitly a list of all scenarios (with non-zero probability) and their probabilities, thereby restricting their attention to distributions with a polynomial-size support. In the robust model considered by Dhamdhere et al. [11], which is more in the spirit of robust optimization, the goal is to minimize c(x) + max_A f_A(x, y_A).
It is easy to see how the two models are related: if one "guesses" the maximum second-stage cost B incurred by the optimum, then the robust problem essentially reduces to the budget problem with budget B, except that the second-stage cost term in the objective function is replaced by B (which is a constant). Notice that it is not clear how to even specify problems with exponentially many scenarios in the robust model. Feige et al. [14] expanded the model of [11] by considering exponentially many scenarios, where the scenarios are implicitly specified by a cardinality constraint. However, considering scenario-collections that are determined only by a cardinality constraint seems rather specialized and somewhat artificial, especially in the context of stochastic optimization; e.g., in facility location, it is rather stylized (and overly conservative) to assume that every set of k clients (for some k) may show up in the second stage. We will consider a more general way of specifying (exponentially many) scenarios in robust problems, where the input specifies a black-box distribution and the collection of scenarios is then given by the support of this distribution. We shall call this model the distribution-based robust model. Both the budget model and the (distribution-based) robust model suffer from certain common drawbacks. A serious algorithmic limitation of both these models (see Section 5) is that for almost any (non-trivial) stochastic problem (such as fractional stochastic set cover with at most 3 elements, 3 sets, 3 scenarios), one cannot obtain any approximation guarantees in the black-box setting using any bounded number of samples (even allowing for a bounded violation of the budget in the budget model).
Intuitively, the reason for this is that there could be scenarios that occur with vanishingly small probability that one will almost never encounter in our samples, but which essentially force one to take certain first-stage actions in order to satisfy the budget constraints in the budget model, or to obtain a low-cost solution in the robust model. Notice also that both the budget and robust models adopt the conservative view that one needs to bound the second-stage cost in every scenario, regardless of how likely it is for the scenario to occur. (By the same token, they also provide the greatest amount of risk-aversion.) In contrast, many of the risk-models considered in the finance and stochastic-optimization literature, such as the mean-risk model [27], value-at-risk (VaR) constraints [30, 23, 32], and conditional VaR [34], do factor in the probabilities of different scenarios. Our models for risk-averse stochastic optimization address the above issues, and significantly refine and extend the budget and robust models. Our goal is to come up with a model that is sufficiently rich in modeling power to allow for black-box distributions, and in which one can obtain strong algorithmic results. Our models are motivated by the observation (see Appendix A) that it is possible to obtain approximation guarantees in the budget model with black-box distributions, if one allows the second-stage cost to exceed the budget with some "small" probability ρ (according to the underlying distribution). We can turn this solution concept around and incorporate it into the model to arrive at the following. We are now given a probability threshold ρ ∈ [0, 1].
In our new budget model, which we call the risk-averse budget model, given a budget B, we seek (x, {y_A}) so as to minimize c(x) + E_A[f_A(x, y_A)] subject to the probabilistic constraint Pr_A[f_A(x, y_A) > B] ≤ ρ. The corresponding risk-averse (distribution-based) robust model seeks to minimize c(x) + Q_ρ[f_A(x, y_A)], where Q_ρ[f_A(x, y_A)] is the (1 − ρ)-quantile of {f_A(x, y_A)}_{A∈A}, which is the smallest number B such that Pr_A[f_A(x, y_A) > B] ≤ ρ. Notice that the parameter ρ allows us to control the risk-aversion level and trade off risk-averseness against conservatism (in the spirit of [3, 41]). Taking ρ = 1 in the risk-averse budget model gives the standard 2-stage recourse model, whereas taking ρ = 0 in the risk-averse budget- or robust-models recovers the standard budget- and robust models respectively. In the sequel, we treat ρ as a constant that is not part of the input. We obtain approximation algorithms for a variety of combinatorial optimization problems (Section 4) including the set cover, vertex cover, multicut on trees, min cut, and facility location problems, in the risk-averse budget and robust models with black-box distributions. We obtain near-optimal solutions that preserve the budget approximately and incur a small blow-up of the probability threshold. (One should expect to violate the budget here; otherwise, by setting very high first-stage costs, one would be able to solve the decision version of an NP-hard problem!) To the best of our knowledge, these are the first approximation results for problems with probabilistic constraints and black-box distributions.
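In the black-box setting, Q_ρ can only be estimated from samples. The following helper (our own sketch; the function name and toy data are illustrative, not from the paper) computes the empirical version of the definition above: the smallest sampled value B such that the empirical probability of exceeding B is at most ρ.

```python
def empirical_quantile(costs, rho):
    """Empirical (1 - rho)-quantile: the smallest sampled value B such that
    the fraction of samples strictly exceeding B is at most rho."""
    n = len(costs)
    for b in sorted(costs):
        if sum(1 for c in costs if c > b) / n <= rho:
            return b

# Ten sampled second-stage costs; with rho = 0.2 the answer is 80, since
# exactly 2 of the 10 samples (90 and 100) exceed it.
samples = [10, 30, 20, 50, 40, 60, 80, 70, 100, 90]
print(empirical_quantile(samples, 0.2))   # 80
print(empirical_quantile(samples, 0.0))   # 100
```

With enough samples this empirical estimate concentrates around the true quantile; concentration bounds such as Lemma 2.3 below quantify this.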
Our results extend to the setting with non-uniform scenario-budgets, and to a generalization of the risk-averse robust model, where the goal is to minimize c(x) plus a weighted combination of E_A[f_A(x, y_A)] and Q_ρ[f_A(x, y_A)]. In the sequel, we focus primarily on the risk-averse budget model since results obtained in this model essentially translate to the risk-averse robust model (the budget-violation can be absorbed into the approximation ratio). Our results are built on two components. First, and this is the technically more difficult component and our main contribution, we devise a fully polynomial approximation scheme for solving the LP-relaxations of a wide variety of risk-averse problems (Theorem 3.3). We show that in the black-box setting, for a wide variety of 2-stage problems, for any ε, κ > 0, in time poly(λ/(εκρ)), one can compute (with high probability) a solution to the LP-relaxation of the risk-averse budgeted problem, of cost at most (1 + ε) times the optimum, where the probability that the second-stage cost exceeds the budget B is at most ρ(1 + κ). Here λ is the maximum ratio between the costs of the same action in stage II and stage I (e.g., opening a facility or choosing a set). We show in Section 5 that the dependence on 1/(κρ), and hence the violation of the probability-threshold, is unavoidable in the black-box setting. We believe that this is a general tool of independent interest that will find application in the design of approximation algorithms for other discrete risk-averse stochastic optimization problems, and that our techniques will find use in solving other probabilistic programs. The second component is a simple rounding procedure (Theorem 3.2) that complements (and motivates) the above approximation scheme.
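As a rough back-of-the-envelope sketch (ours; the paper's actual sample-complexity analysis is more delicate than this), the number of black-box samples needed to estimate a violation probability to within additive error ε, with confidence 1 − δ, follows from requiring 2·exp(−2ε²N) ≤ δ in the two-sided Hoeffding bound (Lemma 2.3 below):

```python
import math
import random

def hoeffding_samples(eps, delta):
    """Samples sufficing so that the empirical frequency of an event is
    within eps of its true probability, except with probability delta
    (solve 2*exp(-2*eps^2*N) <= delta for N)."""
    return math.ceil(math.log(2.0 / delta) / (2.0 * eps * eps))

N = hoeffding_samples(eps=0.01, delta=1e-3)
print(N)  # 38005

# Empirical check: estimate Pr[budget exceeded] for a toy distribution
# whose true violation probability is 0.05.
random.seed(1)
true_p = 0.05
hits = sum(random.random() < true_p for _ in range(N))
print(abs(hits / N - true_p) < 0.01)  # True, except with probability <= delta
```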
As we mention below, our LP-relaxation is a relaxation of even the fractional risk-averse problem (i.e., where one is allowed to take fractional decisions). We give a general rounding procedure to convert a solution to our LP-relaxation to a solution to the fractional risk-averse problem losing a certain factor in the solution cost, budget, and the probability of budget-violation. This allows us to then use an LP-based "local" approximation algorithm for the corresponding 2-stage problem to obtain an integer solution, where a local algorithm is one that approximately preserves the LP-cost of each scenario. In particular, for various covering problems, one can use the local 2c-approximation algorithm in [38], which is obtained using an LP-based c-approximation algorithm for the deterministic problem. We need to overcome various obstacles to devise our approximation scheme. The first difficulty faced in solving a probabilistic program such as ours, is that the feasible region of even the fractional problem is a non-convex set. Thus, even in the polynomial-scenario setting, it is not clear how to solve (even) the fractional risk-averse problem. (In contrast, in the standard 2-stage recourse model, the fractional problem can be easily formulated and solved as a linear program (LP) in the polynomial-scenario setting.) We formulate an LP-relaxation (which is also a relaxation of the fractional problem), where we introduce a variable r_A for every scenario A that is supposed to indicate whether the budget is exceeded in scenario A. Correspondingly, we have two sets of decision variables to denote the decisions taken in scenario A in the two cases respectively where the budget is exceeded and where it is not exceeded.
The constraints that enforce this semantics will of course be problem-specific, but a common constraint that figures in all these formulations is Σ_A p_A r_A ≤ ρ, which captures our probabilistic constraint. This constraint, which couples the different scenarios, creates significant challenges in solving the LP-relaxation. (Again, notice the contrast with the standard 2-stage recourse model.) We get around the difficulty posed by this coupling constraint by taking the Lagrangian dual with respect to this constraint, introducing a dual variable ∆ ≥ 0. The resulting maximization problem (over ∆) has a 2-stage minimization LP embedded inside it. Although this 2-stage LP does not belong to the class of problems defined in [38, 45, 7], we prove that for any fixed ∆, this 2-stage LP can be solved to "near-optimality" using the sample average approximation (SAA) method. The crucial insight here is to realize that for the purpose of obtaining a near-optimal solution to the risk-averse LP, it suffices to obtain a rather weak guarantee for the 2-stage LP, where we allow for an additive error proportional to ∆. This guarantee is specifically tailored so that it is weak enough that one can prove it by showing "closeness-in-subgradients" and using the analysis in [45], and yet can be leveraged to obtain a near-optimal solution to (the relaxation of) our risk-averse problem. Given this guarantee, we show that one can efficiently find a suitable value for ∆ such that the solution obtained for this ∆ (via the SAA method) satisfies the desired guarantees.

Related work. Stochastic optimization is a field with a vast amount of literature; we direct the reader to [4, 30, 35] for more information on the subject. We survey the work that is most relevant to ours.
Stochastic optimization problems have only recently been studied from an approximation-algorithms perspective. A variety of approximation results have been obtained in the 2-stage recourse model, but more general models, such as risk-optimization or probabilistic-programming models, have received little or no attention. The (standard) budget model was first considered by Gupta et al. [19], who designed approximation algorithms for stochastic network design problems in this model. Dhamdhere et al. [11] introduced the demand-robust model (which we call the robust model), and obtained algorithms for the robust versions of various combinatorial optimization problems; some of their guarantees were later improved by Golovin et al. [16]. All these works focus on the polynomial-scenario setting. Feige, Jain, Mahdian, and Mirrokni [14] considered the robust model with exponentially many scenarios that are specified implicitly via a cardinality constraint, and derived approximation algorithms for various covering problems in this more general model. There is a large body of work in the finance and stochastic-optimization literature, dating back to Markowitz [27], that deals with risk-modeling and optimization; see e.g., [34, 1, 36] and the references therein. Our risk-averse models are related to some models in finance. In fact, the probabilistic constraint that we use is called a value-at-risk (VaR) constraint in the finance literature, and its use in risk-optimization is quite popular in finance models; it has even been written into some industry regulations [23, 32]. Problems involving probabilistic constraints are called probabilistic or chance-constrained programs [8, 29] in the stochastic-optimization literature, and have been extensively studied (see, e.g., Prékopa [31]).
Recent work in this area [6, 28, 13] has focused on replacing the original probabilistic constraint by more tractable constraints so that any solution satisfying the new constraints also satisfies the original probabilistic constraint with high probability. Notice that this type of "relaxation" is opposite to what one aims for in the design of approximation algorithms, where we want that every solution to the original problem remains a solution to the relaxation (but most likely, not vice versa). Although some approximation results in the opposite direction are obtained in [6, 28, 13], they are obtained for very structured constraints of the type Pr_ξ[G(x, ξ) ∉ C] ≤ ρ, where C is a convex set, ξ is a continuous random variable whose distribution satisfies a certain concentration-of-measure property, and G(.) is a bi-affine or convex mapping; also, the bounds obtained involve a relatively large violation of the probability threshold (compared to our (1 + κ)-factor). To the best of our knowledge, there is no prior work in the stochastic-optimization or finance literature on the design of efficient algorithms with provable worst-case guarantees for discrete risk-optimization or probabilistic-programming problems. In the Computer Science literature, [24] and [15] consider the stochastic bin packing and knapsack problems with probabilistic constraints that limit the overflow probability of a bin or the knapsack, and obtained novel approximation algorithms for these problems. Their results are however obtained for specialized distributions where the item sizes are independent random variables following Bernoulli, exponential, or Poisson distributions specified in the input. In the context of stochastic optimization, this constitutes a rather stylized setting that is far from the black-box setting.
The work closest in spirit to ours is that of So, Zhang, and Ye [41]. They consider the problem of minimizing the first-stage cost plus a risk-measure called the conditional VaR (CVaR) [34]. Their model interpolates between the 2-stage recourse model and the (standard) robust model (as opposed to the budget model in our case). They give an approximation scheme for solving the LP-relaxations of a broad class of problems in the black-box setting, using which they obtain approximation algorithms for certain discrete optimization problems. Our methods are however quite different from theirs. In their model, the fractional problem yields a convex program and moreover, they are able to use a nice representation theorem in [34] for the CVaR measure to convert their problem into a 2-stage problem and then adapt the methods in [7]. In our case, the non-convexity inherent in the probabilistic constraint creates various difficulties (first the non-convexity, then the coupling constraint) and we consequently need to work harder to obtain our result. We remark that our techniques can be used to solve a generalization of their model, where we have the same objective function but also include a probabilistic budget constraint as in our risk-averse budget model. We now briefly survey the approximation results in recourse models. The first such approximation result appears to be due to Dye, Stougie, and Tomasgard [12]. The recent interest and flurry of algorithmic activity in this area can be traced to the work of Ravi and Sinha [33] and Immorlica, Karger, Minkoff and Mirrokni [22], which gave approximation algorithms for the 2-stage variants of various discrete optimization problems in the polynomial-scenario [33, 22] and independent-activation [22] settings.
Approximation algorithms for 2-stage problems with black-box distributions were first obtained by Gupta, Pál, Ravi and Sinha [17], and subsequently by Shmoys and Swamy [38] (see also the preliminary version [39]). Various other approximation results for 2-stage problems have since been obtained; see, e.g., the survey [44]. Multi-stage recourse problems in the black-box model were considered by [18, 45]; both obtain approximation algorithms with guarantees that deteriorate with the number of stages, either exponentially [18] (except for multistage Steiner tree, which was also considered in [20]), or linearly [45]; improved guarantees for set cover and vertex cover have been subsequently obtained [42]. Our approximation scheme makes use of the SAA method, which is a simple and appealing method for solving stochastic problems that is quite often used in practice. In the SAA method one samples a certain number of scenarios to estimate the scenario probabilities by their frequency of occurrence, and then solves the 2-stage problem determined by this approximate distribution. The effectiveness of this method depends on the sample size (ideally, polynomial) required to guarantee that an optimal solution to the SAA-problem is a provably near-optimal solution to the original problem. Kleywegt et al. [25] (see also [37]) prove a bound that depends on the variance of a certain quantity that need not be polynomially bounded. Subsequently, Swamy and Shmoys [45], and Charikar et al. [7] obtained improved (polynomial) sample-bounds for a large class of structured 2-stage problems. The proof in [45], which also applies to multistage programs, is based on leveraging approximate subgradients, and our proof makes use of portions of their analysis. The proof of Charikar et al. [7] is quite different; it applies to 2-stage programs but proves the stronger theorem that even approximate solutions to the SAA problem translate to approximate solutions to the original problem.

2 Preliminaries

Let R_+ denote R_{≥0}. Let ||u|| denote the ℓ_2 norm of u. The Lipschitz constant of a function f : R^m → R is the smallest K such that |f(x) − f(y)| ≤ K||x − y|| for all x, y. We consider convex minimization problems min_{x∈P} f(x), where P ⊆ R^m_+ with P ⊆ B(0, R) = {x : ||x|| ≤ R} for a suitable R, and f is convex.

Definition 2.1 Let f : R^m → R be a function. We say that d ∈ R^m is a subgradient of f at the point u if the inequality f(v) − f(u) ≥ d · (v − u) holds for every v ∈ R^m. We say that d̂ is an (ω, ξ)-subgradient of f at the point u ∈ P if for every v ∈ P, we have f(v) − f(u) ≥ d̂ · (v − u) − ωf(v) − ωf(u) − ξ.

The above definition of an (ω, ξ)-subgradient is slightly weaker than the notion of an ω-subgradient as defined in [38], where one requires that f(v) − f(u) ≥ d · (v − u) − ωf(u). But this difference is superficial; one could also implement the algorithm in [38] using the weaker notion of an (ω, ξ)-subgradient. It is well known (see [5]) that a convex function has a subgradient at every point. One can infer from Definition 2.1 that, letting d_x denote a subgradient of f at x, the Lipschitz constant of f is at most max_x ||d_x||.

Let K be a positive number, and let τ, ε be two parameters with τ < 1. Let N = log(2KR/τ). Let G'_τ ⊆ P be a discrete set such that for any x ∈ P, there exists x' ∈ G'_τ with ||x − x'|| ≤ τ/(KN). Define G_τ = G'_τ ∪ {x + t(y − x), y + t(x − y) : x, y ∈ G'_τ, t = 2^{−i}, i = 1, ..., N}. We call G'_τ and G_τ a τ/(KN)-net and an extended τ/(KN)-net of P, respectively.
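As a quick numerical sanity check of Definition 2.1 (a toy example of ours, not from the paper): for the convex piecewise-linear function f(x) = max(x, −2x) on R, the subgradients at u = 0 are exactly the slopes d ∈ [−2, 1], which we can verify against sample points.

```python
def f(x):
    # convex piecewise-linear function max(x, -2x); kink at x = 0
    return max(x, -2 * x)

def is_subgradient(d, u, test_points):
    # Definition 2.1: d is a subgradient of f at u if
    # f(v) - f(u) >= d * (v - u) for every v
    return all(f(v) - f(u) >= d * (v - u) - 1e-12 for v in test_points)

pts = [i / 10.0 for i in range(-50, 51)]
print(is_subgradient(0.5, 0.0, pts))   # True:  0.5 lies in [-2, 1]
print(is_subgradient(-2.0, 0.0, pts))  # True:  boundary slope
print(is_subgradient(1.5, 0.0, pts))   # False: too steep on the right side
```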
As shown in [45], if P contains a ball of radius V (where V ≤ 1 without loss of generality), then one can construct G'_τ so that |G_τ| = poly(log(KR/(Vτ))). As mentioned earlier, our algorithms make use of the sample average approximation (SAA) method. The following result from Swamy and Shmoys [45], which we have adapted to our setting, will be our main tool for analyzing the SAA method.

Lemma 2.2 ([45]) Let f̂ and f be two nonnegative convex functions with Lipschitz constant at most K such that at every point x ∈ G_τ, there exists a vector d̂_x ∈ R^m that is a subgradient of f̂(.) and an (ε/(8N), ξ)-subgradient of f(.) at x. Let x̂ = argmin_{x∈P} f̂(x). Then, f(x̂) ≤ (1 + ε)·min_{x∈P} f(x) + 6τ + 2Nξ.

Lemma 2.3 (Chernoff–Hoeffding bound [21]) Let X_1, ..., X_N be iid random variables with each X_i ∈ [0, 1] and μ = E[X_i]. Then, Pr[|(1/N)·Σ_i X_i − μ| > ε] ≤ 2e^{−2ε²N}.

3 The risk-averse budgeted set cover problem: an illustrative example

Our techniques can be used to efficiently solve the risk-averse versions of a variety of 2-stage stochastic optimization problems, both in the risk-averse budget and robust models. In this section, we illustrate the main underlying ideas by focusing on the risk-averse budgeted set cover problem. In the risk-averse budgeted set cover problem (RASC), we are given a universe U of n elements and a collection S of m subsets of U. The set of elements to be covered is uncertain: we are given a probability distribution {p_A}_{A∈A} over scenarios, where each scenario A specifies a subset of U to be covered. The cost of picking a set S ∈ S in the first stage is w^I_S, and is w^A_S in scenario A.
The goal is to determine which sets to pick in stage I and which ones to pick in each scenario so as to minimize the expected cost of picking sets, subject to Pr_A[cost of scenario A > B] ≤ ρ, where ρ is a constant that is not part of the input. Notice that the costs w^A_S are only revealed when we sample scenario A; thus, the "input size", denoted by I, is O(m + n + Σ_S log w_S + log B). For a given (fractional) point x ∈ R^m with 0 ≤ x_S ≤ 1 for all S, define f_A(x) to be the minimum value of w^A · y_A subject to Σ_{S: e∈S} y_{A,S} ≥ 1 − Σ_{S: e∈S} x_S for all e ∈ A, and y_{A,S} ≥ 0 for all S. Let P = [0, 1]^m. As mentioned in the Introduction, the set of feasible solutions to even the fractional risk-averse problem (where one can buy sets fractionally) is not in general a convex set. We consider the following LP-relaxation of the problem, which is a relaxation of even the fractional risk-averse problem (Claim 3.1). Throughout we use A to index the scenarios in A, and S to index the sets in S.

min   Σ_S w^I_S x_S + Σ_{A,S} p_A (w^A_S y_{A,S} + w^A_S z_{A,S})                (RASC-P)

s.t.  Σ_A p_A r_A ≤ ρ                                                            (1)
      Σ_{S: e∈S} (x_S + y_{A,S}) + r_A ≥ 1            for all A, e ∈ A           (2)
      Σ_{S: e∈S} (x_S + y_{A,S} + z_{A,S}) ≥ 1        for all A, e ∈ A           (3)
      Σ_S w^A_S y_{A,S} ≤ B                           for all A                  (4)
      x_S, y_{A,S}, z_{A,S}, r_A ≥ 0                  for all A, S.              (5)

Here x denotes the first-stage decisions. The variable r_A denotes whether one exceeds the budget of B for scenario A, and the variables y_{A,S} and z_{A,S} denote respectively the sets picked in scenario A in the situations where one does not exceed the budget (so r_A = 0) and where one does exceed the budget (so r_A = 1). Consequently, constraint (4) ensures that the cost of the y_A decisions does not exceed the budget B, and (1) ensures that the total probability mass of scenarios where one does exceed the budget is at most ρ. Let OPT denote the optimum value of (RASC-P).
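To make the semantics of (RASC-P) concrete, the following sketch (the instance data is entirely our own toy example) encodes constraints (1)–(5) and the objective for a two-element, two-set, two-scenario instance, and checks a hand-built fractional solution that stays within budget in the likely scenario and declares the budget exceeded (r_A = 1) in the rare one:

```python
# Toy (RASC-P) instance: universe {e1, e2}, sets S1 = {e1}, S2 = {e2};
# scenarios A1 = {e1} (prob 0.9) and A2 = {e1, e2} (prob 0.1).
sets = {"S1": {"e1"}, "S2": {"e2"}}
scenarios = {"A1": {"e1"}, "A2": {"e1", "e2"}}
p = {"A1": 0.9, "A2": 0.1}
wI = {"S1": 1.0, "S2": 1.0}      # first-stage costs  w^I_S
wII = {"S1": 2.0, "S2": 10.0}    # second-stage costs w^A_S (same in both scenarios here)
B, rho = 3.0, 0.2                # budget and probability threshold

def feasible(x, y, z, r, tol=1e-9):
    if sum(p[A] * r[A] for A in scenarios) > rho + tol:                    # (1)
        return False
    for A, elems in scenarios.items():
        for e in elems:
            cov = sum(x[S] + y[A][S] for S in sets if e in sets[S])
            if cov + r[A] < 1 - tol:                                       # (2)
                return False
            if cov + sum(z[A][S] for S in sets if e in sets[S]) < 1 - tol: # (3)
                return False
        if sum(wII[S] * y[A][S] for S in sets) > B + tol:                  # (4)
            return False
    return True                                  # (5): nonnegativity holds by construction

def objective(x, y, z):
    return (sum(wI[S] * x[S] for S in sets)
            + sum(p[A] * sum(wII[S] * (y[A][S] + z[A][S]) for S in sets)
                  for A in scenarios))

# Candidate: buy nothing in stage I; cover e1 in stage II (cost 2 <= B).
# In scenario A2, also covering e2 would cost 10 and blow the budget, so we
# declare the budget exceeded (r = 1) and cover e2 with the z-variables.
x = {"S1": 0.0, "S2": 0.0}
y = {"A1": {"S1": 1.0, "S2": 0.0}, "A2": {"S1": 1.0, "S2": 0.0}}
z = {"A1": {"S1": 0.0, "S2": 0.0}, "A2": {"S1": 0.0, "S2": 1.0}}
r = {"A1": 0.0, "A2": 1.0}
print(feasible(x, y, z, r), round(objective(x, y, z), 6))  # True 3.0
```

The rare scenario (probability 0.1 ≤ ρ = 0.2) absorbs the expensive set via the z-variables, which is exactly the flexibility constraint (1) permits.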
A significant difficulty faced in solving (RASC-P) is that the scenarios are no longer separable given a first-stage solution, since constraint (1) couples the different scenarios. As a consequence, in order to specify a solution to (RASC-P) one needs to compute a first-stage solution and give an explicit procedure that computes $(y_A, z_A, r_A)$ in any given scenario $A$. In our algorithms however, we can avoid this complication because, as we show below, given only the first-stage component of a solution to (RASC-P), one can round it to a first-stage solution to the fractional risk-averse problem (and then to an integer solution) losing a small factor in the solution cost and the probability threshold. But observe that if we have a first-stage solution $x$ to the fractional risk-averse problem with probability threshold $P$ such that there exist second-stage solutions yielding a total expected cost of $C$, then one can also easily compute second-stage solutions that yield no greater total cost (and where $\Pr[\text{second-stage cost} > B] \leq P$), by simply solving the LP $f_A(x)$ in each scenario $A$. This implies that our algorithm for solving (RASC-P) only needs to return a first-stage solution to (RASC-P) that can be extended to a near-optimum solution (without specifying an explicit procedure to compute the second-stage solutions). We show (Theorem 3.3) that for any $\epsilon, \kappa > 0$, one can efficiently compute a first-stage solution $x$ for which there exist solutions $(y_A, z_A, r_A)$ in every scenario $A$ satisfying (2)-(5) such that $w^I \cdot x + \sum_A p_A w^A \cdot (y_A + z_A) \leq (1+2\epsilon)OPT$, and $\sum_A p_A r_A \leq \rho(1+\kappa)$.
Complementing this, we give a simple rounding procedure based on the rounding theorem in [38] to convert a fractional solution to (RASC-P) to an integer solution using an LP-based $c$-approximation algorithm for the deterministic set cover (DSC) problem, that is, an algorithm that returns a set cover of cost at most $c$ times the optimum of the standard LP-relaxation for DSC. We prove this rounding theorem first, in order to better motivate our goal of solving (RASC-P).

Claim 3.1 $OPT$ is a lower bound on the optimum of the fractional risk-averse problem.

Proof: We show that any solution $\hat{x}$ to the fractional risk-averse problem can be mapped to a solution to (RASC-P) of no greater cost. Let $\hat{y}_A$ be such that $f_A(\hat{x}) = w^A \cdot \hat{y}_A$, so $\Pr[f_A(\hat{x}) > B] \leq \rho$. We set $x = \hat{x}$. For scenario $A$, if $f_A(\hat{x}) \leq B$, we set $r_A = 0$, $y_A = \hat{y}_A$, $z_A = 0$. Otherwise, we set $r_A = 1$, $y_A = 0$, $z_A = \hat{y}_A$. It is easy to see that this yields a feasible solution to (RASC-P) of cost $w^I \cdot \hat{x} + \sum_A p_A f_A(\hat{x})$.

Theorem 3.2 (Rounding theorem) Let $\bigl(x, \{(y_A, z_A, r_A)\}\bigr)$ be a solution satisfying (2)-(5) of objective value $C = w^I \cdot x + \sum_A p_A w^A \cdot (y_A + z_A)$, and let $P = \sum_A p_A r_A$. Given any $\varepsilon > 0$, one can obtain (i) a solution $\hat{x}$ such that $w^I \cdot \hat{x} + \sum_A p_A f_A(\hat{x}) \leq (1 + \frac{1}{\varepsilon})C$ and $\Pr_A\bigl[f_A(\hat{x}) > (1 + \frac{1}{\varepsilon})B\bigr] \leq (1+\varepsilon)P$; (ii) an integer solution $(\tilde{x}, \{\tilde{y}_A\})$ of cost at most $2c(1 + \frac{1}{\varepsilon})C$ such that $\Pr_A\bigl[w^A \cdot \tilde{y}_A > 2cB(1 + \frac{1}{\varepsilon})\bigr] \leq (1+\varepsilon)P$, using an LP-based $c$-approximation algorithm for the deterministic set cover problem. Moreover, one only needs to know the first-stage solution $x$ to obtain $\hat{x}$ and $\tilde{x}$.

Proof: Set $\hat{x} = (1 + \frac{1}{\varepsilon})x$. Consider any scenario $A$. Observe that $(y_A + z_A)$ yields a feasible solution to the second-stage problem for scenario $A$.
Also, if $r_A < \frac{1}{1+\varepsilon}$, then $(1 + \frac{1}{\varepsilon})y_A$ also yields a feasible solution. Thus, we have $f_A(\hat{x}) \leq w^A \cdot (y_A + z_A)$, and if $r_A < \frac{1}{1+\varepsilon}$ then we also have $f_A(\hat{x}) \leq (1 + \frac{1}{\varepsilon})B$. So $w^I \cdot \hat{x} + \sum_A p_A f_A(\hat{x}) \leq (1 + \frac{1}{\varepsilon})C$ and $\Pr\bigl[f_A(\hat{x}) > (1 + \frac{1}{\varepsilon})B\bigr] \leq \sum_{A: r_A \geq \frac{1}{1+\varepsilon}} p_A \leq (1+\varepsilon)P$. We can now round $\hat{x}$ to an integer solution $(\tilde{x}, \{\tilde{y}_A\})$ using the Shmoys-Swamy [38] rounding procedure (which only needs $\hat{x}$), losing a factor of $2c$ in the first- and second-stage costs. This proves part (ii).

3.1 Solving the risk-averse problem (RASC-P)

We now describe and analyze the procedure used to solve (RASC-P). First, we get around the difficulty posed by the coupling constraint (1) in formulation (RASC-P) by using the technique of Lagrangian relaxation. We take the Lagrangian dual of (1), introducing a dual variable $\Delta$, to obtain the following formulation:

$\max_{\Delta \geq 0}\ -\Delta\rho + \Bigl(\min_{x \in \mathcal{P}} h(\Delta; x) = w^I \cdot x + \sum_A p_A g_A(\Delta; x)\Bigr)$  (LD1)

where $g_A(\Delta; x) = \min \sum_S w^A_S (y_{A,S} + z_{A,S}) + \Delta r_A$ subject to (2)-(4), $y_{A,S}, z_{A,S}, r_A \geq 0$ for all $S$.  (P)

It is straightforward to show via duality theory that (RASC-P) and (LD1) have the same optimal value, and moreover that if $(x^*, \{y^*_A\}, \{z^*_A\}, \{r^*_A\})$ is an optimal solution to (RASC-P) and $\Delta^*$ is the optimal value for the dual variable corresponding to (1), then $\bigl(\Delta^*; x^*, \{(y^*_A, z^*_A, r^*_A)\}\bigr)$ is an optimal solution to (LD1). Recall that $\mathcal{P} = [0,1]^m$. Let $OPT(\Delta) = \min_{x \in \mathcal{P}} h(\Delta; x)$. So $OPT = \max_{\Delta \geq 0}\bigl(OPT(\Delta) - \Delta\rho\bigr)$. Let $\lambda = \max\bigl\{1, \max_{A,S}(w^A_S / w^I_S)\bigr\}$, which we assume is known. The main result of this section is as follows. Throughout, when we say "with high probability", we mean that a failure probability of $\delta$ can be ensured using $\mathrm{poly}\bigl(\ln(\frac{1}{\delta})\bigr)$-dependence on the sample size (or running time).

Theorem 3.3 For any $\epsilon, \gamma, \kappa > 0$, RiskAlg (see Fig.
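The Markov-type counting step in the proof above can be sketched numerically. The snippet below (our own illustrative data; not the paper's code) scales a first-stage vector by $(1+\frac{1}{\varepsilon})$ and verifies that the probability mass of scenarios with $r_A \geq \frac{1}{1+\varepsilon}$ is at most $(1+\varepsilon)P$.

```python
# Sketch of the scaling step in Theorem 3.2(i): blow up x by (1+1/eps) and
# charge budget violations to scenarios with r_A >= 1/(1+eps) (a Markov-type
# argument). All data below is illustrative.
def round_first_stage(x, eps):
    return {S: min(1.0, (1 + 1 / eps) * v) for S, v in x.items()}

def violation_mass(p, r, eps):
    # scenarios whose r_A is too large to be "rescued" by the scaling
    return sum(p[A] for A in p if r[A] >= 1 / (1 + eps))

p = {"A1": 0.5, "A2": 0.3, "A3": 0.2}
r = {"A1": 0.0, "A2": 0.6, "A3": 1.0}
eps = 0.5
P = sum(p[A] * r[A] for A in p)
xhat = round_first_stage({"S1": 0.4}, eps)
# Markov: mass of scenarios with r_A >= 1/(1+eps) is at most (1+eps)*P
assert violation_mass(p, r, eps) <= (1 + eps) * P
```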
1) runs in time $\mathrm{poly}\bigl(\mathcal{I}, \frac{\lambda}{\epsilon\kappa\rho}, \log(\frac{1}{\gamma})\bigr)$, and returns with high probability a first-stage solution $x$ and solutions $(y_A, z_A, r_A)$ for each scenario $A$ that satisfy (2)-(5) and such that (i) $w^I \cdot x + \sum_A p_A w^A \cdot (y_A + z_A) \leq (1+\epsilon)OPT + \gamma$; and (ii) $\sum_A p_A r_A \leq \rho(1+\kappa)$. Under the very mild assumption (*) that $w^I \cdot x + f_A(x) \geq 1$ for every $A \neq \emptyset$, $x \in \mathcal{P}$,¹ we can convert this guarantee into a $(1+2\epsilon)$-multiplicative guarantee in the cost in time $\mathrm{poly}\bigl(\mathcal{I}, \frac{\lambda}{\epsilon\kappa\rho}\bigr)$.

¹ A similar assumption is made in [38] to obtain a multiplicative guarantee.

RiskAlg($\epsilon, \gamma, \kappa$) [$\epsilon \leq \kappa < 1$; the quantities $p^{(i)}$, $cost^{(i)}$, $(y_A, z_A, r_A)$ are used only in the analysis.]

C1. Fix $\varepsilon = \epsilon/6$, $\zeta = \gamma/4$, $\eta = \rho\kappa/16$. Also, set $\sigma = \epsilon/6$, $\gamma' = \gamma/4$, $\beta = \kappa/8$, and $\rho' = \rho(1 + 3\kappa/4)$. Consider the $\Delta$ values $\Delta_0, \Delta_1, \ldots, \Delta_k$, where $\Delta_0 = \gamma'$, $\Delta_{i+1} = \Delta_i(1+\sigma)$, and $k$ is the smallest value such that $\Delta_0(1+\sigma)^k \geq UB$. Note that $k = O\bigl(\log(\frac{UB}{\gamma'})/\sigma\bigr)$.

C2. For each $\Delta_i$, let $\bigl(x^{(i)}, \{(y^{(i)}_A, z^{(i)}_A, r^{(i)}_A)\}\bigr) \leftarrow$ SA-Alg($\Delta_i; \varepsilon, \eta, \zeta$) (here $(y^{(i)}_A, z^{(i)}_A, r^{(i)}_A)$ is an optimal solution to $g_A(\Delta_i; x^{(i)})$ and is implicitly given). Let $p^{(i)} = \sum_A p_A r^{(i)}_A$ and $cost^{(i)} = h(\Delta_i; x^{(i)}) = w^I \cdot x^{(i)} + \sum_A p_A\bigl(w^A \cdot (y^{(i)}_A + z^{(i)}_A) + \Delta_i r^{(i)}_A\bigr)$.

C3. By sampling $n = \frac{1}{2\beta^2\rho^2}\ln\bigl(\frac{4k}{\delta}\bigr)$ scenarios, for each $i = 0, \ldots, k$, compute an estimate $p'^{(i)} = \sum_A \hat{q}_A r^{(i)}_A$ of $p^{(i)}$, where $\hat{q}_A$ is the frequency of scenario $A$ in the sampled set.

C4. If $p'^{(0)} \leq \rho'$ then return $x^{(0)}$ as the first-stage solution. [In scenario $A$, return $(y_A, z_A, r_A) = (y^{(0)}_A, z^{(0)}_A, r^{(0)}_A)$.]

C5. Otherwise (i.e., $p'^{(0)} > \rho'$), find an index $i$ such that $p'^{(i)} \geq \rho'$ and $p'^{(i+1)} \leq \rho'$ (we argue that such an $i$ must exist). Let $a$ be such that $a \cdot p'^{(i)} + (1-a)p'^{(i+1)} = \rho'$.
Return the first-stage solution $x = a \cdot x^{(i)} + (1-a) \cdot x^{(i+1)}$. [In scenario $A$, return the solution $(y_A, z_A, r_A) = a(y^{(i)}_A, z^{(i)}_A, r^{(i)}_A) + (1-a)(y^{(i+1)}_A, z^{(i+1)}_A, r^{(i+1)}_A)$.]

SA-Alg($\Delta; \varepsilon, \eta, \zeta$) [$K$ is (a bound on) the Lipschitz constant of $h(\Delta; \cdot)$; $\mathcal{P} \subseteq B(\mathbf{0}, R)$ and $\mathcal{P}$ contains a ball of radius $V \leq 1$.]

B1. Set $\tau = \zeta/6$, $N = \log\bigl(\frac{2KR}{V\tau}\bigr)$. Let $G_\tau \subseteq \mathcal{P}$ be an extended $\frac{\tau}{KN}$-net of $\mathcal{P}$ as defined in Section 2, so that $|G_\tau| = \mathrm{poly}\bigl(\log(\frac{KR}{V\tau})\bigr)$. Draw $\mathcal{N} = 8N^2\bigl(\frac{4\lambda}{\varepsilon} + \frac{m}{\eta}\bigr)^2 \ln\bigl(\frac{2|G_\tau|m}{\delta}\bigr)$ samples and for each scenario $A$, set $\hat{p}_A = \mathcal{N}_A/\mathcal{N}$, where $\mathcal{N}_A$ is the number of times scenario $A$ is sampled.

B2. Solve the SAA problem $\min_{x \in \mathcal{P}} \hat{h}(\Delta; x)$, where $\hat{h}(\Delta; x) = w^I \cdot x + \sum_A \hat{p}_A g_A(\Delta; x)$, to obtain a solution $\hat{x}$. Return $\hat{x}$, and in scenario $A$, return the optimal solution to $g_A(\Delta; \hat{x})$.

Figure 1: The procedures RiskAlg and SA-Alg.

Procedure RiskAlg is described in Figure 1. In the procedure, we also specify the second-stage solutions for each scenario that can be used to extend the computed first-stage solution to a near-optimal solution to (RASC-P). We use these solutions only in the analysis. We show in Section 5 that the dependence on $\frac{1}{\kappa\rho}$ is unavoidable in the black-box model. The "greedy algorithm" for deterministic set cover [9] is an LP-based $\ln n$-approximation algorithm, so Theorem 3.3 combined with Theorem 3.2 shows that for any $\epsilon, \kappa, \varepsilon > 0$ one can efficiently compute an integer solution $\bigl(\tilde{x}, \{\tilde{y}_A\}\bigr)$ of cost at most $2\ln n\bigl(1 + \epsilon + \frac{1}{\varepsilon}\bigr) \cdot OPT$ such that $\Pr_A\bigl[w^A \cdot \tilde{y}_A > 2B\ln n(1 + \epsilon + \frac{1}{\varepsilon})\bigr] \leq \rho(1 + \kappa + \varepsilon)$.

Algorithm RiskAlg is essentially a search procedure for the "right" value of the Lagrangian multiplier $\Delta$, wrapped around the SAA method, which is used in procedure SA-Alg to compute a near-optimal solution to the minimization problem $\min_{x \in \mathcal{P}} h(\Delta; x)$ for any given $\Delta \geq 0$.
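The outer search of RiskAlg (steps C1 and C5) can be sketched in a few lines of Python; the grid construction and the crossing-index interpolation below are our own illustrative rendering, with made-up estimates $p'^{(i)}$ standing in for the sampled quantities.

```python
# A sketch of RiskAlg's outer search: a geometric grid of Lagrange
# multipliers Delta_0 * (1+sigma)^i, and the step-C5 interpolation that
# picks a convex combination hitting the target threshold rho'.
def delta_grid(gamma_prime, sigma, UB):
    deltas = [gamma_prime]
    while deltas[-1] < UB:
        deltas.append(deltas[-1] * (1 + sigma))
    return deltas

def interpolate(p_vals, rho_prime):
    # find i with p'(i) >= rho' >= p'(i+1) and the mixing weight a
    for i in range(len(p_vals) - 1):
        if p_vals[i] >= rho_prime >= p_vals[i + 1]:
            a = (rho_prime - p_vals[i + 1]) / (p_vals[i] - p_vals[i + 1])
            return i, a
    raise ValueError("no crossing index")

deltas = delta_grid(0.25, 0.1, 100.0)
assert deltas[-1] >= 100.0
# hypothetical estimates p'(i), decreasing as Delta grows
i, a = interpolate([0.9, 0.5, 0.2, 0.05], 0.3)
# a * p'(i) + (1-a) * p'(i+1) equals rho'
assert abs(a * 0.5 + (1 - a) * 0.2 - 0.3) < 1e-9
```

In the actual algorithm the $p'^{(i)}$ are random estimates and need not be monotone; the analysis only uses that $p'^{(0)} > \rho'$ and $p'^{(k)} < \rho'$, which guarantees a crossing index exists.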
Theorem 3.4 states the precise approximation guarantee satisfied by the solution returned by SA-Alg. Given this, we argue that by considering polynomially many $\Delta$ values that increase geometrically up to some upper bound $UB$, one can efficiently find some $\Delta$ where the solution $\bigl(x, \{(y_A, z_A, r_A)\}\bigr)$ returned by SA-Alg for $\Delta$ is such that $\sum_A p_A r_A$ is "close" to $\rho$. This will also imply that this solution is a near-optimal solution. We set $UB = 16(\sum_S w^I_S)/\rho$, so $\log UB$ is polynomially bounded. However, the search for the "right" value of $\Delta$ and our analysis are complicated by the fact that we have two sources of error whose magnitudes we need to control: first, we only have an approximate solution $\bigl(x, \{(y_A, z_A, r_A)\}\bigr)$ for $\Delta$, which also means that one cannot use any optimality conditions; second, for any $\Delta$, we have only implicit access to the second-stage solutions $\{(y_A, z_A, r_A)\}$ computed by Theorem 3.4, so we cannot actually compute or use $\sum_A p_A r_A$ in our search, but will need to estimate it via sampling.

Theorem 3.4 For any $\Delta \geq 0$, and any $\varepsilon, \zeta, \eta > 0$, SA-Alg runs in time $\mathrm{poly}\bigl(\mathcal{I}, \frac{\lambda}{\varepsilon\eta}, \log(\frac{\Delta}{\zeta})\bigr)$ and returns, with high probability, a first-stage solution $x$ such that $h(\Delta; x) \leq (1+\varepsilon)OPT(\Delta) + \eta\Delta + \zeta$.

Analysis. For the rest of this section, $\epsilon, \gamma, \kappa$ are fixed values given by Theorem 3.3. We may assume without loss of generality that $\epsilon \leq \kappa < 1$. We prove Theorem 3.4 in Section 3.1.1. Here, we show how this leads to the proof of Theorem 3.3. Given Theorem 3.4 and Lemma 2.3, we assume that the high-probability event "$\forall i$, $cost^{(i)} \leq (1+\varepsilon)OPT(\Delta_i) + \eta\Delta_i + \zeta$ and $|p'^{(i)} - p^{(i)}| \leq \beta\rho$" happens.

Claim 3.5 We have $p^{(k)} < \rho/2$ and $p'^{(k)} < \rho/2$.

Proof: If $p^{(k)} > \frac{\rho(1+\varepsilon)}{4}$, then $cost^{(k)} - \eta\Delta_k > 2(1+\varepsilon)(\sum_S w^I_S) > (1+\varepsilon)OPT(\Delta_k) + \zeta$, which is a contradiction.
The last inequality follows since $OPT(\Delta) \leq \sum_S w^I_S$ for any $\Delta$. Therefore, $p^{(k)} < \rho/2$, and $p'^{(k)} \leq p^{(k)} + \beta\rho < \rho/2$.

Proof of Theorem 3.3: Let $x$ be the first-stage solution returned by RiskAlg, and $(y_A, z_A, r_A)$ be the solution returned for scenario $A$. It is clear that (2)-(5) are satisfied. Suppose first that $p'^{(0)} \leq \rho'$ (so $x = x^{(0)}$). Part (ii) of the theorem follows since $p^{(0)} \leq p'^{(0)} + \beta\rho \leq \rho(1+\kappa)$. Part (i) follows since

$w^I \cdot x^{(0)} + \sum_A p_A w^A \cdot (y^{(0)}_A + z^{(0)}_A) \leq h(\gamma'; x^{(0)}) \leq (1+\varepsilon)OPT(\gamma') + \eta\gamma' + \zeta \leq (1+\varepsilon)OPT + \gamma'(1+\varepsilon+\eta) + \zeta.$

The penultimate inequality follows because for any $\Delta$, we have $OPT(\Delta) \leq OPT(0) + \Delta \leq OPT + \Delta$.

Now suppose that $p'^{(0)} > \rho'$. In this case, there must exist an $i$ such that $p'^{(i)} \geq \rho'$ and $p'^{(i+1)} \leq \rho'$, because $p'^{(0)} > \rho'$ and $p'^{(k)} < \rho'$ (by Claim 3.5), so step C5 is well defined. We again prove part (ii) first. We have $\sum_A p_A r_A = a \cdot p^{(i)} + (1-a)p^{(i+1)} \leq \rho' + \beta\rho \leq \rho(1+\kappa)$. To prove part (i), observe that $w^I \cdot x + \sum_A p_A w^A \cdot (y_A + z_A) \leq a \cdot cost^{(i)} + (1-a) \cdot cost^{(i+1)} - \Delta_i\bigl(a \cdot p^{(i)} + (1-a) \cdot p^{(i+1)}\bigr)$, which is at most $(1+\varepsilon)\bigl(a \cdot OPT(\Delta_i) + (1-a)OPT(\Delta_{i+1})\bigr) + \eta\bigl(a\Delta_i + (1-a)\Delta_{i+1}\bigr) + \zeta - \Delta_i(\rho' - \beta\rho)$. Now noting that $\Delta_{i+1} = (1+\sigma)\Delta_i$, it is easy to see that $OPT(\Delta_{i+1}) \leq (1+\sigma)OPT(\Delta_i)$. Also, $\rho' - \beta\rho - \eta(1+\sigma) \geq (1+\varepsilon+2\sigma)\rho$. So the above quantity is at most $(1+\varepsilon+2\sigma)\bigl(OPT(\Delta_i) - \Delta_i\rho\bigr) + \zeta \leq (1+\epsilon)OPT + \gamma$.

The running time is the time taken to obtain the solutions for all the $\Delta_i$ values plus the time taken to compute $p'^{(i)}$ for each $i$. This is at most $(k+1) \cdot \mathrm{poly}\bigl(\mathcal{I}, \frac{\lambda}{\varepsilon\eta}, \log(\frac{\Delta_k}{\zeta})\bigr) + O\bigl(\frac{\ln k}{\beta^2\rho^2}\bigr)$, using Theorem 3.4. Note that $\log(\Delta_k)$ is polynomially bounded. Plugging in $\varepsilon, \eta, \zeta, \beta$, and $k$, we obtain the $\mathrm{poly}\bigl(\mathcal{I}, \frac{\lambda}{\epsilon\kappa\rho}, \log(\frac{1}{\gamma})\bigr)$ bound.
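The parameter inequality $\rho' - \beta\rho - \eta(1+\sigma) \geq (1+\varepsilon+2\sigma)\rho$ used in this proof is easy to sanity-check numerically. The snippet below does so for a few illustrative values of $\epsilon, \kappa$ (our own choices), using the constants fixed in step C1; it is a numerical check of our reading of those constants, not a proof.

```python
# Numerical sanity check of the step-C1 parameter choices and the
# inequality rho' - beta*rho - eta*(1+sigma) >= (1 + eps + 2*sigma)*rho
# used in the proof of Theorem 3.3. Test values are illustrative.
def params(eps0, kappa, rho, gamma):
    eps, zeta, eta = eps0 / 6, gamma / 4, rho * kappa / 16
    sigma, gamma_p, beta = eps0 / 6, gamma / 4, kappa / 8
    rho_p = rho * (1 + 3 * kappa / 4)
    return eps, zeta, eta, sigma, gamma_p, beta, rho_p

for eps0 in (0.1, 0.3):
    for kappa in (0.3, 0.9):
        if eps0 > kappa:
            continue  # the analysis assumes eps0 <= kappa < 1
        rho, gamma = 0.05, 0.01
        eps, zeta, eta, sigma, gamma_p, beta, rho_p = params(eps0, kappa, rho, gamma)
        assert rho_p - beta * rho - eta * (1 + sigma) >= (1 + eps + 2 * sigma) * rho
```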
Proof of multiplicative guarantee. To obtain the multiplicative guarantee, we show that by initially sampling roughly $\max\{1/\rho, \lambda\}$ times, with high probability, one can either determine that $x = \mathbf{0}$ is an optimal first-stage solution, or obtain a lower bound on $OPT$ and then set $\gamma$ appropriately in RiskAlg to obtain the multiplicative bound. Recall that $f_A(x)$ is the minimum value of $w^A \cdot y_A$ over all $y_A \geq 0$ such that $\sum_{S: e \in S} y_{A,S} \geq 1 - \sum_{S: e \in S} x_S$ for all $e \in A$. Call $A = \emptyset$ a null scenario. Let $q = \sum_{A: A \neq \emptyset} p_A$ and $\alpha = \min\{\rho, 1/\lambda\}$. Note that $OPT \geq q$. Let $\hat{z}_A$ be an optimal solution to $f_A(\mathbf{0})$. Define a solution $(\bar{y}_A, \bar{z}_A, \bar{r}_A)$ for scenario $A$ as follows. Set $(\bar{y}_A, \bar{z}_A, \bar{r}_A) = (\mathbf{0}, \mathbf{0}, 0)$ if $A = \emptyset$, and $(\mathbf{0}, \hat{z}_A, 1)$ if $A \neq \emptyset$.

We first argue that if $q \leq \alpha$, then $\bigl(\mathbf{0}, \{(\bar{y}_A, \bar{z}_A, \bar{r}_A)\}\bigr)$ is an optimal solution to (RASC-P). It is clear that the solution is feasible since $\sum_A p_A \bar{r}_A = q \leq \rho$. To prove optimality, suppose $\bigl(x^*, \{(y^*_A, z^*_A, r^*_A)\}\bigr)$ is an optimal solution. Consider the solution where $x = \mathbf{0}$ and the solution for scenario $A$ is $(\mathbf{0}, \mathbf{0}, 0)$ if $A = \emptyset$, and $(\mathbf{0}, z^*_A + y^*_A + x^*, 1)$ otherwise. This certainly gives a feasible solution. The difference between the cost of this solution and that of the optimal solution is at most $\sum_{A: A \neq \emptyset} p_A w^A \cdot x^* - w^I \cdot x^*$, which is nonpositive since $w^A \leq \lambda w^I$ and $q \leq 1/\lambda$. Setting $z_A = \hat{z}_A$ for a non-null scenario can only decrease the cost, and hence, also yields an optimal solution.

Let $\delta$ be the desired failure probability, which we may assume to be less than $\frac{1}{2}$ without loss of generality. We determine with high probability if $q \geq \alpha$. We draw $M = \frac{\ln(1/\delta)}{\alpha}$ samples and compute $X$ = the number of times a non-null scenario is sampled.
We claim that with high probability, if $X > 0$ then $OPT \geq LB = \frac{\delta}{\ln(1/\delta)} \cdot \alpha$; in this case, we return the solution RiskAlg($\epsilon, \epsilon LB, \kappa$) to obtain the desired guarantee. Otherwise, if $X = 0$, we return $\bigl(\mathbf{0}, \{(\bar{y}_A, \bar{z}_A, \bar{r}_A)\}\bigr)$ as the solution. Let $r = \Pr[X = 0] = (1-q)^M$. So $1 - qM \leq r \leq e^{-qM}$. If $q \geq \ln\bigl(\frac{1}{\delta}\bigr)/M$, then $\Pr[X = 0] \leq \delta$, so with probability at least $1 - \delta$ we say that $OPT \geq LB$, which is true since $OPT \geq q \geq \alpha$. If $q \leq \delta/M$, then $\Pr[X = 0] \geq 1 - \delta$ and we return $\bigl(\mathbf{0}, \{(\bar{y}_A, \bar{z}_A, \bar{r}_A)\}\bigr)$ as the solution, which is an optimal solution since $q \leq \alpha$. If $\delta/M < q < \ln\bigl(\frac{1}{\delta}\bigr)/M$, then we always return a correct answer, since it is both true that $OPT \geq q > LB$, and that $\bigl(\mathbf{0}, \{(\bar{y}_A, \bar{z}_A, \bar{r}_A)\}\bigr)$ is an optimal solution.

3.1.1 Proof of Theorem 3.4

Throughout this section, $\varepsilon, \eta, \zeta$ are fixed at the values given in the statement of Theorem 3.4. Let (BSC-P) denote the problem $\min_{x \in \mathcal{P}} h(\Delta; x)$. The proof proceeds by analyzing the subgradients of $h(\Delta; \cdot)$ and $\hat{h}(\Delta; \cdot)$ and showing that Lemma 2.2 can be applied here. We first note that the arguments given in [38, 45, 7] for 2-stage programs do not directly apply to (BSC-P), since it does not fall into the class of problems considered therein. Shmoys and Swamy [38] show (essentially) that if one can compute an $(\omega, \xi)$-subgradient of the objective function $h(\Delta; \cdot)$ at any given point $x$ for sufficiently small $\omega, \xi$, then one can use the ellipsoid method to obtain a near-optimal solution to (BSC-P). They argue that for a large class of 2-stage LPs, one can efficiently compute an $(\omega, \xi)$-subgradient using $\mathrm{poly}\bigl(\frac{\lambda}{\omega}\bigr)$ samples. Subsequently [45], they leveraged the proof of the ellipsoid-based algorithm to argue that the SAA method also yields an efficient approximation scheme for the same class of 2-stage LPs.
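The preliminary sampling test above is simple enough to sketch directly; the snippet below (with an invented Bernoulli scenario oracle standing in for black-box sampling access, and illustrative parameter values) draws $M = \lceil\ln(1/\delta)/\alpha\rceil$ samples and counts non-null scenarios.

```python
# Sketch of the preliminary sampling test from the multiplicative-guarantee
# argument: draw M = ln(1/delta)/alpha samples and count non-null draws.
# The scenario oracle below is a stand-in for black-box sampling access.
import math
import random

def sampling_test(sample_nonnull, rho, lam, delta):
    alpha = min(rho, 1.0 / lam)
    M = math.ceil(math.log(1.0 / delta) / alpha)
    X = sum(1 for _ in range(M) if sample_nonnull())  # non-null scenarios seen
    return X, M

random.seed(1)
q = 0.4  # true mass of non-null scenarios (unknown to the algorithm)
X, M = sampling_test(lambda: random.random() < q, rho=0.1, lam=2.0, delta=0.01)
# with q well above alpha = 0.1, X > 0 with overwhelming probability,
# so we would conclude OPT >= LB and invoke RiskAlg(eps, eps*LB, kappa)
assert X > 0
```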
These proofs rely on the fact that for their class of 2-stage programs, each component of the subgradient lies in a range bounded multiplicatively by a factor of $\lambda$, and can be approximated additively using $\mathrm{poly}(\lambda)$ samples. However, in the case of (BSC-P), for a subgradient $d = (d_S)$ of $h(\Delta; \cdot)$, we can only say that $d_S \in [-w^A_S - \Delta,\ w^I_S]$ (see Lemma 3.6), which makes it difficult to obtain an $(\omega, \xi)$-subgradient using sampling for suitably small $\omega, \xi$. Charikar, Chekuri and Pál [7] considered a similar class of 2-stage problems, and gave an alternate proof of efficiency of the SAA method, showing that even approximate solutions to the SAA problem translate to approximate solutions to the original problem. Their proof shows that if $\Lambda$ is such that $g_A(\Delta; x) - g_A(\Delta; \mathbf{0}) \leq \Lambda w^I \cdot x$ for every $A$ and $x \in \mathcal{P}$, then $\mathrm{poly}\bigl(\mathcal{I}, \frac{\Lambda}{\epsilon}\bigr)$ samples suffice to construct an SAA problem whose optimal solutions correspond to $(1+\epsilon)$-optimal solutions to the original problem. But for our problem, we can only obtain the bound $g_A(\Delta; x) - g_A(\Delta; \mathbf{0}) \leq w^A \cdot x + \Delta\bigl(\sum_S x_S\bigr) \leq \lambda w^I \cdot x + \Delta\sum_S x_S$, and $\Delta$ might be large compared to $w^I \cdot x$.

The key insight that allows us to circumvent these difficulties is that in order to establish our (weak) guarantee, where we allow for an additive error measured relative to $\Delta$, it suffices to be able to approximate each component $d_S$ of the subgradient of $h(\Delta; \cdot)$ within an additive error proportional to $(w^I_S + \Delta)$, and this can be done by drawing $\mathrm{poly}(\lambda)$ samples. This enables one to argue that the functions $\hat{h}(\Delta; \cdot)$ and $h(\Delta; \cdot)$ satisfy the "closeness-in-subgradients" property stated in Lemma 2.2.

The subgradients of $h(\Delta; \cdot)$ and $\hat{h}(\Delta; \cdot)$ at $x$ are obtained from the optimal dual solutions to $g_A(\Delta; x)$ for every $A$. The dual of $g_A(\Delta; x)$ is given by

$\max \sum_e (\alpha_{A,e} + \beta_{A,e})\Bigl(1 - \sum_{S: e \in S} x_S\Bigr) - B\theta_A$  (D)

subject to:
$\sum_{e \in S \cap A} (\alpha_{A,e} + \beta_{A,e}) \leq w^A_S(1 + \theta_A)$  for all $S$
$\sum_{e \in S \cap A} \beta_{A,e} \leq w^A_S$  for all $S$
$\sum_{e \in A} \alpha_{A,e} \leq \Delta$
$\alpha_{A,e},\ \beta_{A,e} \geq 0$  for all $e \in A$.

Here $\alpha_{A,e}$ and $\beta_{A,e}$ are respectively the dual variables corresponding to (2) and (3), and $\theta_A$ is the dual variable corresponding to (4). As in [38], we then have the following description of the subgradient of $h$.

Lemma 3.6 Let $(\alpha^*_A, \beta^*_A, \theta^*_A)$ be an optimal dual solution to $g_A(\Delta; x)$. Then the vector $d_x$ with components $d_{x,S} = w^I_S - \sum_A p_A \sum_{e \in S}(\alpha^*_{A,e} + \beta^*_{A,e})$ is a subgradient of $h(\Delta; \cdot)$ at $x$.

Since $\hat{h}(\Delta; \cdot)$ is of the same form as $h(\Delta; \cdot)$, we similarly have that $\hat{d}_x = (\hat{d}_{x,S})$, where $\hat{d}_{x,S} = w^I_S - \sum_A \hat{p}_A \sum_{e \in S}(\alpha^*_{A,e} + \beta^*_{A,e})$, is a subgradient of $\hat{h}(\Delta; \cdot)$ at $x$. Since $\hat{d}_x$ and $d_x$ both have $\ell_2$ norm at most $\lambda\|w^I\| + |\Delta|$, $\hat{h}(\Delta; \cdot)$ and $h(\Delta; \cdot)$ have Lipschitz constant at most $K = \lambda\|w^I\| + |\Delta|$.

Lemma 3.7 Let $d$ be a subgradient of $h(\Delta; \cdot)$ at the point $x \in \mathcal{P}$, and suppose that $\hat{d}$ is a vector such that $\hat{d}_S \in [d_S - \omega w^I_S - \xi/2m,\ d_S + \omega w^I_S + \xi/2m]$ for all $S$. Then $\hat{d}$ is an $(\omega, \xi)$-subgradient of $h(\Delta; \cdot)$ at $x$.

Proof: Let $y$ be any point in $\mathcal{P}$. We have $h(\Delta; y) - h(\Delta; x) \geq \hat{d} \cdot (y - x) + (d - \hat{d}) \cdot (y - x)$. The second term is at least

$\sum_{S: d_S \leq \hat{d}_S} (d_S - \hat{d}_S)y_S + \sum_{S: \hat{d}_S < d_S} (\hat{d}_S - d_S)x_S \geq \sum_S\bigl(-\omega w^I_S y_S - \omega w^I_S x_S\bigr) - \xi \geq -\omega h(\Delta; y) - \omega h(\Delta; x) - \xi.$

In the sequel, we set $\omega = \varepsilon/8N$ and $\xi = \eta\Delta/2N$. Let $(\alpha^*_A, \beta^*_A, \theta^*_A)$ be the optimal dual solution to $g_A(\Delta; x)$ used to define $\hat{d}_x$ and $d_x$. Notice that $\hat{d}_{x,S}$ is simply $w^I_S - \sum_{e \in S}(\alpha^*_{A,e} + \beta^*_{A,e})$ averaged over the scenarios sampled independently to construct the SAA problem $\hat{h}(\Delta; \cdot)$, and $\mathrm{E}\bigl[\hat{d}_{x,S}\bigr] = d_{x,S}$.
The sample size $\mathcal{N}$ in SA-Alg is specifically chosen so that the Chernoff bound (Lemma 2.3) implies that $|\hat{d}_{x,S} - d_{x,S}| \leq \omega w^I_S + \xi/2m$ for all $S$ with probability at least $1 - \frac{\delta}{|G_\tau|}$ for every $x \in G_\tau$; hence, $\hat{d}_x$ is an $(\omega, \xi)$-subgradient of $h(\Delta; \cdot)$ at $x$ (by Lemma 3.7). So taking the union bound shows that with probability at least $1 - \delta$, $\hat{h}(\Delta; \cdot)$ and $h(\Delta; \cdot)$ satisfy the conditions of Lemma 2.2 with $K = \lambda\|w^I\| + |\Delta|$, $\epsilon = \varepsilon$, and $\xi$ (as above), which yields the desired approximation guarantee. We can take $R = \sqrt{m}$ and $V = \frac{1}{2}$ here, so the number of samples $\mathcal{N}$ is $\mathrm{poly}\bigl(\mathcal{I}, \frac{\lambda}{\varepsilon\eta}, \log(\frac{\Delta}{\zeta})\bigr)$.

Remark 3.8 Notice that nowhere do we use the fact that the scenario budgets are uniform; thus, our results (Theorem 3.4 and hence Theorem 3.3) extend to the setting where we have different budgets for the different scenarios. The scenario budgets $\{B_A\}$ are now not specified explicitly; we get to know $B_A$ when we sample scenario $A$. (Notice that we may assume that $B_A \leq \lambda\sum_S w^I_S$ for all $A$.)

3.2 Risk-averse robust set cover

In the risk-averse robust set cover problem, the goal is to choose some sets $x$ in stage I and some sets $y_A$ in each scenario $A$ so that their union covers $A$, so as to minimize $w^I \cdot x + Q_\rho[w^A \cdot y_A]$. Recall that $Q_\rho[w^A \cdot y_A]$ is the $(1-\rho)$-quantile of $\{w^A \cdot y_A\}_{A \in \mathcal{A}}$, that is, the smallest $B$ such that $\Pr_A[w^A \cdot y_A > B] \leq \rho$. As mentioned in the Introduction, risk-averse robust problems can be essentially reduced to risk-averse budget problems. We briefly sketch this reduction here for the set cover problem. The same ideas can be used to obtain approximation algorithms for the risk-averse robust versions of all the applications considered in Section 4. We use the common method of "guessing" $B = Q_\rho[w^A \cdot y_A]$ for an optimal solution.
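The quantile objective $Q_\rho[\cdot]$ is easy to compute for an explicit (small) distribution; the following sketch, on a toy distribution of our own making, returns the smallest cost level $B$ whose exceedance probability is at most $\rho$.

```python
# Q_rho over an explicit toy distribution: the smallest B such that
# Pr[second-stage cost > B] <= rho. Costs and probabilities are illustrative.
def Q(rho, costs, probs, tol=1e-12):
    for B in sorted(set(costs)):
        tail = sum(p for c, p in zip(costs, probs) if c > B)
        if tail <= rho + tol:
            return B
    return max(costs)

costs = [1.0, 3.0, 10.0]
probs = [0.7, 0.25, 0.05]
assert Q(0.05, costs, probs) == 3.0   # Pr[cost > 3] = 0.05 <= 0.05
assert Q(0.3, costs, probs) == 1.0    # Pr[cost > 1] = 0.30 <= 0.3
```

In the black-box model this quantity cannot be computed exactly, which is why the reduction guesses $B$ and enforces $\Pr_A[w^A \cdot y_A > B] \leq \rho$ as a constraint instead.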
Given this guess, we need to find integral $\bigl(x, \{y_A\}\bigr)$ so as to minimize $w^I \cdot x + B$ (and hence, $w^I \cdot x$) subject to the constraint that $x + y_A$ forms a set cover for $A$ and $\Pr_A[w^A \cdot y_A > B] \leq \rho$. This looks very similar to the risk-averse budgeted set cover problem; the only difference is that the expected second-stage cost does not appear in the objective function. Thus, one can write an LP-relaxation for the (fractional) risk-averse robust problem that looks similar to (RASC-P), except that the objective function is now $w^I \cdot x$, and constraint (3) and the variables $z_{A,S}$ can be dropped. After Lagrangifying (1) using the dual variable $\Delta$, we obtain the following problem:

$\max_{\Delta \geq 0}\ -\Delta\rho + \Bigl(\min_{x \in \mathcal{P}} h'(\Delta; x) = w^I \cdot x + \sum_A p_A g'_A(\Delta; x)\Bigr)$  (LD2)

where $g'_A(\Delta; x) = \min\bigl\{\Delta r_A : (2), (4),\ y_A \geq 0,\ r_A \geq 0\bigr\}$.

Let $OPT_{Rob}$ denote the optimum value of the fractional risk-averse robust problem $\min_{x \in \mathcal{P}}\bigl(w^I \cdot x + Q_\rho[f_A(x)]\bigr)$, and $OPT_{Rob}(B)$ denote the optimum value of (LD2) for a given $B \geq 0$. Note that $OPT_{Rob}(B)$ decreases with $B$. We prove that for any $B \geq 0$ and $\Delta \geq 0$, SA-Alg returns a solution to the inner minimization problem in (LD2) that satisfies the approximation guarantee stated in Theorem 3.4. Arguing as in the proof of Theorem 3.3, this implies that RiskAlg can be used to obtain a near-optimal solution to (LD2) while violating the probability threshold by a small factor.

The claimed approximation guarantee for SA-Alg follows because $h'(\Delta; \cdot)$ and its sample-average approximation $\hat{h}'(\Delta; \cdot)$ constructed in SA-Alg satisfy the closeness-in-subgradients property of Lemma 2.2. Let $\alpha^*_{A,e}$ be the value of the dual variable corresponding to (2) in an optimal dual solution to $g'_A(\Delta; x)$. Note that $\sum_e \alpha^*_{A,e} \leq \Delta$ for all $A$.
Similar to Lemma 3.6, we now have that the vectors $d_x = (d_{x,S})$ with $d_{x,S} = w^I_S - \sum_A p_A\bigl(\sum_{e \in S} \alpha^*_{A,e}\bigr)$ and $\hat{d}_x = (\hat{d}_{x,S})$ with $\hat{d}_{x,S} = w^I_S - \sum_A \hat{p}_A\bigl(\sum_{e \in S} \alpha^*_{A,e}\bigr)$ are respectively subgradients of $h'(\Delta; \cdot)$ and $\hat{h}'(\Delta; \cdot)$ at $x$. Let $N$, $\mathcal{N}$, $\tau$, $G_\tau$ be as defined in SA-Alg, with $R = \sqrt{m}$, $V = \frac{1}{2}$, and $K = \|w^I\| + |\Delta|$. Using $\mathcal{N}$ samples, for any $x \in G_\tau$, with very high probability we have that $|\hat{d}_{x,S} - d_{x,S}| \leq \frac{\eta\Delta}{4mN}$; thus, as in Lemma 3.7, $\hat{d}_x$ is a $\bigl(0, \frac{\eta}{2N}\bigr)$-subgradient of $h'(\Delta; \cdot)$ at $x$. So Lemma 2.2 shows that SA-Alg returns a solution $\hat{x}$ such that $h'(\Delta; \hat{x}) \leq \min_{x \in \mathcal{P}} h'(\Delta; x) + \eta\Delta + \zeta$ with high probability. Notice that in fact, the approximation guarantee obtained via SA-Alg is purely additive. Also, one can avoid the dependence of the sample size on $\lambda$ (and $\varepsilon$) here, since the modified form of the subgradient means that we can ensure that $|\hat{d}_{x,S} - d_{x,S}| \leq \frac{\eta\Delta}{4mN}$ for every $x \in G_\tau$ and component $S$ using a number of samples that is independent of $\lambda$. This implies that for any $\epsilon, \gamma, \kappa > 0$, RiskAlg computes (nonnegative) $\bigl(x, \{y_A, r_A\}\bigr)$ satisfying (2), (4) such that $w^I \cdot x \leq (1+\epsilon)OPT_{Rob}(B) + \gamma$ and $\sum_A p_A r_A \leq \rho(1+\kappa)$.

To complete the reduction, we describe how to guess $B$. Let $W = \sum_S w^I_S$, which is an upper bound on the optimum (with $\log W$ polynomially bounded). We use the standard method of enumerating values of $B$ increasing geometrically by $(1+\epsilon)$; we start at $\gamma$ and end at the smallest value that is at least $W$. So if $B^*$ is the "correct" guess, then we are guaranteed to enumerate some $B' \in [B^*, (1+\epsilon)B^* + \gamma]$. We use RiskAlg to compute the solution for each $B$, and return the $\bigl(x, \{y_A, r_A\}\bigr)$ that minimizes $w^I \cdot x + B$. Let $\bigl(x', \{y'_A, r'_A\}\bigr)$ be the solution computed for $B'$. Then we have $w^I \cdot x + B \leq w^I \cdot x' + B' \leq (1+\epsilon)OPT_{Rob}(B') + (1+\epsilon)B^* + 2\gamma \leq (1+\epsilon)OPT_{Rob} + 2\gamma$.
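The geometric enumeration of budget guesses can be sketched as follows; the values of $\gamma, \epsilon, W$ and the "correct" guess $B^*$ are illustrative assumptions of ours.

```python
# Sketch of the budget-guessing grid: values increasing geometrically by
# (1+eps) from gamma up to (at least) W, so any B* in [gamma, W] is bracketed
# by some guess B' in [B*, (1+eps)*B* + gamma].
def budget_guesses(gamma, eps, W):
    gs, b = [], gamma
    while b < W:
        gs.append(b)
        b *= 1 + eps
    gs.append(b)  # smallest enumerated value that is >= W
    return gs

gamma, eps, W = 0.01, 0.25, 50.0
gs = budget_guesses(gamma, eps, W)
B_star = 7.3  # hypothetical "correct" guess
B_prime = min(g for g in gs if g >= B_star)
# the bracketing property used in the reduction
assert B_star <= B_prime <= (1 + eps) * B_star + gamma
```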
We remark that the same techniques yield a similar guarantee for the LP-relaxation of a generalization of the problem, where we wish to minimize $w^I \cdot x$ plus a weighted combination of $\mathrm{E}_A\bigl[w^A \cdot y_A\bigr]$ and $Q_\rho[w^A \cdot y_A]$.

We can convert the above guarantee into a purely multiplicative one under the same assumption (*) stated in Theorem 3.3. Let $q = \sum_{A \neq \emptyset} p_A$. Notice that if $q \leq \rho$, then $OPT_{Rob} = 0$ and $x = \mathbf{0}$ is an optimal solution, and otherwise $OPT_{Rob} \geq 1$. Let $\delta$ be such that $(1+\kappa)\delta\ln(1/\delta) \leq 1$. Using $\frac{\ln(1/\delta)}{\rho'}$ samples we can determine with high probability if $q \leq \rho'$ or if $q > \rho$. In the former case, we return $x = \mathbf{0}$ and $y_A$ in scenario $A$, where $y_A = \mathbf{0}$ if $A = \emptyset$ and is any feasible solution if $A \neq \emptyset$. Note that $w^I \cdot x + Q_{\rho'}[w^A \cdot y_A] = 0$. In the latter case, we set $\gamma = \epsilon$ and execute the procedure detailed above to obtain a $(1+3\epsilon)$-multiplicative guarantee.

Finally, one can use Theorem 3.2 to round the fractional solution to an integer solution, or to a solution to the fractional risk-averse robust problem. (The violation of the budget $B$ can now be absorbed into the approximation ratio.) For any $\epsilon, \kappa, \varepsilon > 0$, we obtain a fractional solution $\hat{x}$ such that $w^I \cdot \hat{x} + Q_{\rho(1+\kappa+\varepsilon)}[f_A(\hat{x})] \leq \bigl(1 + \epsilon + \frac{1}{\varepsilon}\bigr)OPT_{Rob}$, and an integer solution $(\tilde{x}, \{\tilde{y}_A\})$ such that $w^I \cdot \tilde{x} + Q_{\rho(1+\kappa+\varepsilon)}[w^A \cdot \tilde{y}_A] \leq 2c\bigl(1 + \epsilon + \frac{1}{\varepsilon}\bigr)OPT_{Rob}$, using an LP-based $c$-approximation algorithm for deterministic set cover.

Setting $B = 0$ above yields a problem that is interesting in its own right. When $B = 0$, we seek a minimum-cost collection of sets $x$ that are picked only in stage I such that $\Pr_A[x \text{ is not a set cover for } A] \leq \rho$. That is, we obtain a chance-constrained problem without recourse.
As shown above (although $B = 0$ is not one of our "guesses"), we can solve this chance-constrained set cover problem to obtain a solution $x$ such that $w^I \cdot x \leq (1+\epsilon)OPT_{Rob}(0) + \gamma$, where $\Pr_A[x \text{ does not cover } A] \leq \rho(1+\kappa)$.

4 Applications to combinatorial optimization problems

We now show that the techniques developed in Section 3 for the risk-averse budgeted set cover problem can be used to obtain approximation algorithms for the risk-averse versions of various combinatorial optimization problems: covering problems (set cover, vertex cover, multicut on trees, min $s$-$t$ cut) and facility location problems. This includes many of the problems considered in [17, 38, 11] in the standard 2-stage and demand-robust models. In all the applications, the first step is to argue that procedure RiskAlg can be used to obtain a near-optimal solution to a suitable LP-relaxation of the problem while violating the probability threshold by a small factor. Theorem 3.3 proves this for covering problems; for multicommodity flow and facility location, we need to modify the arguments slightly. The second step, which is more problem-specific, is to round the LP-solution to an integer solution. Analogous to part (i) of Theorem 3.2, we first round the LP-solution to a solution to the fractional risk-averse problem. Given this, our task is now reduced to rounding a fractional solution to a standard 2-stage problem into an integral one. For this latter step, one can use any "local" LP-based approximation algorithm for the 2-stage problem, where a local algorithm is one that approximately preserves the cost of each scenario. (For set cover, vertex cover and multicut on trees, we may use part (ii) of Theorem 3.2 directly, which utilizes the local LP-rounding algorithm in [38], which in turn is obtained using an LP-based approximation algorithm for the deterministic covering problem.)
As in the case of risk-averse robust set cover, our results extend to the setting of non-uniform budgets. We say that an algorithm is a $(c_1, c_2, c_3)$-approximation algorithm for the risk-averse problem with budget $B$ and threshold $\rho$ if it returns a solution of cost at most $c_1$ times the optimum, where the probability that the second-stage cost exceeds $c_2 \cdot B$ is at most $c_3 \cdot \rho$. Our approximation results for the budgeted problem also translate to the risk-averse robust version of the problem. Specifically, a $(c_1, c_2, c_3)$-approximation algorithm for the budgeted problem implies that one can obtain an integer solution $(x, \{y_A\})$ to the robust problem such that $c(x) + Q_{\rho(1+c_3)}[f_A(x, y_A)] \leq \max\{c_1, c_2\} \cdot OPT_{Rob}$. As mentioned in Section 3.2, the robust problem with a guess of $Q_\rho[f_A(x, y_A)] = 0$ gives rise to a problem where one can take actions only in stage I and one seeks to "take care" of "most" second-stage scenarios; we can solve this chance-constrained problem approximately. We also achieve bicriteria approximation guarantees for the problem of minimizing $c(x)$ plus a weighted combination of $\mathrm{E}_A\bigl[f_A(x, y_A)\bigr]$ and $Q_\rho[f_A(x, y_A)]$.

4.1 Covering problems

Vertex cover and multicut on trees. In the risk-averse budgeted vertex cover problem, we are given a graph whose edges need to be covered by vertices. The edge-set is random and determined by a distribution (on sets of edges). A vertex $v$ may be picked in stage I or in a scenario $A$, incurring a cost of $w^I_v$ or $w^A_v$ respectively. We are also given a budget $B$ and a probability threshold $\rho$, and require that the probability that the second-stage cost of picking vertices exceeds $B$ be at most $\rho$. In the risk-averse version of multicut on trees, we are given a tree, a (black-box) distribution over sets of $s_i$-$t_i$ pairs, a budget $B$, and a threshold $\rho$.
The goal is to choose edges in stage I and in each scenario such that the union of the edges picked in stage I and in scenario A forms a multicut for the s_i-t_i pairs that are revealed in scenario A. Moreover, the second-stage cost of picking edges may exceed B with probability at most ρ. The goal is to minimize the total expected cost. Both these problems are structured cases of risk-averse budgeted set cover. So one can formulate an LP-relaxation of the risk-averse problem exactly as in (RASC-P) and, by Theorem 3.3, obtain a near-optimal solution to the relaxation. We may then apply Theorem 3.2 directly to these problems to round the fractional solution. Since there is an LP-based 2-approximation algorithm for the deterministic versions of both problems, we obtain the following theorem.

Theorem 4.1 For any ǫ, κ, ε > 0, there is a (4(1 + ǫ + 1/ε), 4(1 + ǫ + 1/ε), 1 + κ + ε)-approximation algorithm for the risk-averse budgeted versions of vertex cover and multicut on trees.

Min s-t cut. In the stochastic min s-t cut problem, we are given an undirected graph G = (V, E) and a source s ∈ V. The location of the sink t is random and given by a distribution. We may pick an edge e in stage I or in a scenario A, incurring costs w_e and w^A_e respectively. The constraints are that in any scenario A with sink t_A, the edges picked in stage I and in that scenario induce an s-t_A cut, and the goal is to minimize the expected cost of choosing edges. In the risk-averse budgeted problem there is the additional constraint that the second-stage cost may exceed a given budget B with probability at most (a given value) ρ. The LP-relaxation of the risk-averse problem based on a path-covering formulation is a special case of (RASC-P).
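The path-covering LP has one covering constraint per s-t path, so it cannot be written down explicitly; it is nevertheless solvable because a shortest-path computation under edge lengths given by the fractional solution serves as a separation oracle: a shortest s-t path of length less than 1 is exactly a violated constraint. A minimal sketch of this standard oracle (the graph representation and names are illustrative):

```python
import heapq

def violated_path(adj, s, t, x):
    """Separation oracle for the path-covering LP of s-t cut: the constraint
    sum_{e in P} x_e >= 1 for every s-t path P is violated iff the shortest
    s-t path under edge lengths x has length < 1. Returns such a path (as a
    list of edge names), or None if all path constraints are satisfied."""
    dist, prev = {s: 0.0}, {}
    heap = [(0.0, s)]
    while heap:                              # Dijkstra with a binary heap
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue                         # stale heap entry
        for v, e in adj.get(u, []):
            nd = d + x[e]
            if nd < dist.get(v, float("inf")):
                dist[v], prev[v] = nd, (u, e)
                heapq.heappush(heap, (nd, v))
    if dist.get(t, float("inf")) >= 1.0:
        return None                          # no violated path constraint
    path, v = [], t
    while v != s:                            # reconstruct the short path
        u, e = prev[v]
        path.append(e)
        v = u
    return path[::-1]

# Toy instance: two parallel s-t routes; the lower one is under-covered.
adj = {"s": [("a", "e1"), ("b", "e3")], "a": [("t", "e2")], "b": [("t", "e4")]}
x = {"e1": 0.5, "e2": 0.5, "e3": 0.2, "e4": 0.3}
p = violated_path(adj, "s", "t", x)          # path s-b-t has length 0.5 < 1
```

Plugged into the ellipsoid method, such an oracle lets one optimize over the exponentially many path constraints in polynomial time.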
The only additional observation needed to see that Theorem 3.3 can be applied here is that the covering problem (P) for a scenario A (and its dual) can be solved efficiently although there are an exponential number of constraints. Thus, procedures RiskAlg and SA-Alg can be implemented efficiently and we may obtain a near-optimal solution to the relaxation. We use Theorem 3.2, part (i) to convert the solution to a near-optimal solution x̂ to the fractional risk-averse problem. We now use the algorithm in [11], which is a local LP-based O(log |V|)-approximation algorithm, to round this solution to an integral one. Their algorithm requires that there exist multipliers λ_A in each scenario A such that w^A_e = λ_A w_e for every e; consequently we also need this for our result. A detail worth noting is that their algorithm also requires access to the second-stage fractional solutions (but not the scenario-probabilities). But this is not a problem, since there are only polynomially many scenarios here, corresponding to the different locations of the sink. So given the first-stage solution x̂, one can simply compute the optimal fractional second-stage solution for each scenario for use in their algorithm.

Theorem 4.2 For any ǫ, κ, ε > 0, there is an (O(log |V|)(1 + ǫ + 1/ε), O(log |V|)(1 + ǫ + 1/ε), 1 + κ + ε)-approximation algorithm for risk-averse budgeted min s-t cut.

4.2 Facility location problems

In the risk-averse budgeted facility location problem (RAUFL), we have a set of m facilities F, a client-set D, and a distribution over client-demands. We may open facilities in stage I or in a given scenario, and in each scenario A, for every client j with non-zero demand d^A_j, we must assign its demand to a facility opened in stage I or in that scenario.
The costs of opening a facility i ∈ F in stage I and in a scenario A are f^I_i and f^A_i respectively; the cost of assigning a client j's demand in scenario A to a facility i is d^A_j c_{ij}, where the c_{ij}'s form a metric. The first-stage cost is the cost of opening facilities in stage I, and the cost of scenario A is the sum of all the facility-opening and client-assignment costs incurred in that scenario. The goal is to minimize the total expected cost subject to the usual condition that the probability that the second-stage cost exceeds B is at most some threshold ρ. For notational simplicity, we consider the case of {0, 1}-demands, so a scenario A ⊆ D simply specifies the clients that need to be assigned in that scenario. We formulate the following LP-relaxation of the problem. Throughout, i indexes the facilities in F and j the clients in D.

\[
\begin{array}{lll}
\min & \displaystyle \sum_i f^I_i y_i + \sum_{A \subseteq \mathcal{D}} p_A \Big[ \sum_i f^A_i \big( y_{A,i} + v_{A,i} \big) + \sum_{j \in A,\, i} c_{ij} \big( x_{A,ij} + u_{A,ij} \big) \Big] & \text{(RAFL-P)} \\[2mm]
\text{s.t.} & \displaystyle \sum_A p_A r_A \le \rho & (6) \\[2mm]
& \displaystyle \sum_i x_{A,ij} + r_A \ge 1 \qquad \text{for all } j \in A & (7) \\[2mm]
& \displaystyle \sum_i \big( x_{A,ij} + u_{A,ij} \big) \ge 1 \qquad \text{for all } j \in A & (8) \\[2mm]
& x_{A,ij} \le y_i + y_{A,i} \qquad \text{for all } j \in A,\ i & (9) \\[2mm]
& x_{A,ij} + u_{A,ij} \le y_i + y_{A,i} + v_{A,i} \qquad \text{for all } j \in A,\ i & (10) \\[2mm]
& \displaystyle \sum_i f^A_i y_{A,i} + \sum_{j \in A,\, i} c_{ij} x_{A,ij} \le B \qquad \text{for all } A & (11) \\[2mm]
& y_i,\ y_{A,i},\ v_{A,i},\ x_{A,ij},\ u_{A,ij},\ r_A \ge 0 \qquad \text{for all } A, i, j. & (12)
\end{array}
\]

Here y_i denotes the first-stage decisions. The variable r_A denotes if one exceeds the budget B in scenario A; (6) limits the probability mass of such scenarios to at most ρ. The decisions (x_{A,ij}, y_{A,i}) and (u_{A,ij}, v_{A,i}) are intended to denote the decisions taken in scenario A in the two cases when one does not exceed the budget and when one does exceed the budget, respectively. Correspondingly, (7) and (8) enforce that every client is assigned to a facility in these two cases, and (9) and (10) ensure that a client is only assigned to a facility opened in stage I or in that scenario in these two cases.
Finally, (11) is the budget constraint for a scenario. Let OPT be the optimal value of (RAFL-P). Given first-stage decisions y ∈ [0, 1]^m, let ℓ_A(y) denote the minimum cost of fractionally opening facilities and fractionally assigning clients in scenario A to open facilities (i.e., facilities opened to a combined extent of 1 in stage I and scenario A). Let P = [0, 1]^m. As in Section 3, we Lagrangify (6) using a dual variable ∆ ≥ 0 to obtain the problem

\[
\max_{\Delta \ge 0} \big( -\Delta\rho + OPT(\Delta) \big), \quad \text{where } OPT(\Delta) = \min_{y \in \mathcal{P}} h(\Delta; y), \qquad h(\Delta; y) = f^I \cdot y + \sum_A p_A\, g_A(\Delta; y),
\]

and g_A(∆; y) is the minimum value of Σ_i f^A_i (y_{A,i} + v_{A,i}) + Σ_{j∈A,i} c_{ij}(x_{A,ij} + u_{A,ij}) + ∆ r_A subject to (7)–(12) (where the y_i's are now fixed). As in Claim 3.1, it is easy to show that OPT is a lower bound on the optimal value of even the fractional risk-averse problem.

Theorem 4.3 For any ǫ, γ, κ > 0, in time poly(I, λ/(ǫκρ), log(1/γ)), one can use RiskAlg to compute (with high probability) (y, {(x_A, y_A, u_A, v_A, r_A)}) that satisfies (7)–(12) with objective value C ≤ (1 + ǫ)OPT + γ such that Σ_A p_A r_A ≤ ρ(1 + κ). This can be converted into a (1 + 2ǫ)-guarantee in the cost provided f^I · y + ℓ_A(y) ≥ 1 for every y ∈ [0, 1]^m, A ≠ ∅.

Proof: Examining procedure RiskAlg, we see that arguing that RiskAlg can be used to approximately solve (RAFL-P) involves two things: (a) coming up with a bound UB such that log UB is polynomially bounded, so that one can restrict the search for the right value of ∆ in RiskAlg; and (b) showing that an optimal solution to the SAA-version of the inner-minimization problem for any ∆ ≥ 0 constructed in SA-Alg yields a solution to the true minimization problem that satisfies the approximation guarantee in Theorem 3.4. There are two notable aspects in which risk-averse facility location differs from risk-averse set cover.
First, unlike in set cover, one cannot ensure that the cost incurred in a scenario is always 0 by choosing the first-stage decisions appropriately. Thus, the problem (RAFL-P) may in fact be infeasible. This creates some complications in coming up with an upper bound UB for use in RiskAlg. We show that one can detect by an initial sampling step that either the problem is infeasible, or come up with a suitable value for UB. Second, due to the non-covering nature of the problem, one needs to delve deeper into the structure of the dual LP for a scenario (after Lagrangifying (6)) to prove the closeness-in-subgradients property for the SAA objective function constructed in SA-Alg and the true objective function. Define C_A = Σ_{j∈A} (min_i c_{ij}). This is the minimum possible assignment cost that one can incur in scenario A. We may determine with high probability, using O(1/(ρκ)) samples, if Pr_A[C_A > B] > ρ or Pr_A[C_A > B] ≤ ρ(1 + 5κ/28). In the former case, we can conclude that the problem is infeasible. In the latter case, we set ρ̂ = ρ(1 + 5κ/28), κ̂ such that ρ̂(1 + κ̂) = ρ(1 + κ), and UB = 32(1+ε)(Σ_i f^I_i + B)/(3ρκ), and call procedure RiskAlg with these values of ρ̂, κ̂ and UB (and the given ǫ, γ). We prove in Claim 4.4 below that with this upper bound, p(k), p′(k) < ρ′ = ρ̂(1 + 3κ̂/4); this is the only condition required for the search for ∆ in RiskAlg. Task (b) boils down to showing that the objective function ĥ(∆; ·) of the SAA-problem in SA-Alg and the true problem h(∆; ·) satisfy the conditions of Lemma 2.2. Due to the non-covering nature of the formulation, we need to derive additional insights about optimal dual solutions to g_A(∆; y) to prove this. Lemma 4.5 proves that this holds with high probability, with K = λ‖f^I‖ + |∆|, ω = ε and ξ = η∆/(2N).
So by Lemma 2.2, the solution ŷ = argmin_{y∈P} ĥ(∆; y) returned by SA-Alg satisfies the requirements of Theorem 3.4. As in the set cover problem, we may take R = √m, V = 1/2, which ensures that the sample size is polynomially bounded. The proof of the conversion to a multiplicative guarantee is as in Theorem 3.3. Recall that ∆_k ≥ UB and p(k) = Σ_A p_A r^{(k)}_A, where (y, {(x_A, y_A, u_A, v_A, r_A)}) is the solution returned by SA-Alg for ∆_k, of cost h(∆_k; y) ≤ (1 + ε)OPT(∆_k) + η∆_k + ζ, with ε, η, ζ set as in RiskAlg.

Claim 4.4 We have p(k) < ρ′ and p′(k) < ρ′, where ρ′ = ρ̂(1 + 3κ̂/4).

Proof: Let F = Σ_i f^I_i and q = Pr_A[C_A > B] ≤ ρ(1 + 5κ/28). Consider the solution y with y_i = 1 for all i. For any ∆ ≥ 0, we have OPT(∆) ≤ h(∆; y) ≤ F + Σ_A p_A C_A + q∆ ≤ F + B + q∆. Suppose p(k) ≥ ρ′ − βρ̂. Then cost(k) − η∆_k ≥ ∆_k ρ̂(1 + 9κ̂/16) ≥ ∆_k ρ(1 + 9κ/16), where the last inequality follows since ρ̂(1 + κ̂) = ρ(1 + κ) and ρ̂ ≥ ρ. Also (1 + ε)OPT(∆_k) + ζ ≤ 2(1 + ε)(F + B) + (1 + ε)q∆_k < 2(1 + ε)(F + B) + ∆_k ρ(1 + 3κ/8), since ε = ǫ/6 ≤ κ̂/6 ≤ κ/6. But then cost(k) − η∆_k > (1 + ε)OPT(∆_k) + ζ, which gives a contradiction. So p(k) < ρ′ − βρ̂, which implies that p′(k) < ρ′.

Lemma 4.5 With probability at least 1 − δ, ĥ(∆; ·) and h(∆; ·) satisfy the conditions of Lemma 2.2 with K = λ‖f^I‖ + |∆|, ω = ε and ξ = η∆/(2N).

Proof: Consider a point y ∈ P. Consider an optimal dual solution to g_A(∆; y), where α*_{A,j}, ψ*_{A,j}, β*_{A,ij}, Γ*_{A,ij}, θ*_A are the optimal values of the dual variables corresponding to (7)–(11) respectively. Note that g_A(∆; y) equals the objective value of this dual solution, which is given by

\[
\sum_{j \in A} \big( \alpha^*_{A,j} + \psi^*_{A,j} \big) \;-\; \sum_i y_i \Big( \sum_{j \in A} \big( \beta^*_{A,ij} + \Gamma^*_{A,ij} \big) \Big) \;-\; B \cdot \theta^*_A.
\]
We choose an optimal dual solution that minimizes Σ_{i,j} β*_{A,ij}. As in Lemma 3.6, it is easy to show that the vectors d̂_y = (d̂_{y,i}) and d_y = (d_{y,i}) given by d̂_{y,i} = f^I_i − Σ_A p̂_A Σ_{j∈A} (β*_{A,ij} + Γ*_{A,ij}) and d_{y,i} = f^I_i − Σ_A p_A Σ_{j∈A} (β*_{A,ij} + Γ*_{A,ij}) are respectively subgradients of ĥ(∆; ·) and h(∆; ·) at y. Now we claim that for every i, Σ_j β*_{A,ij} ≤ ∆ and Σ_j Γ*_{A,ij} ≤ f^A_i. Given this, ‖d̂_y‖, ‖d_y‖ ≤ K, where K = λ‖f^I‖ + ∆, for any y ∈ P, so K is an upper bound on the Lipschitz constant of h(∆; ·) and ĥ(∆; ·). The second inequality is a constraint of the dual, corresponding to variable v_{A,i}. Suppose β*_{A,ij} > 0 for some j. The dual enforces the constraint α*_{A,j} + ψ*_{A,j} ≤ c_{ij}(1 + θ*_A) + β*_{A,ij} + Γ*_{A,ij}, corresponding to variable x_{A,ij}. We claim that this must hold at equality. By complementary slackness, we have x*_{A,ij} = y_i + y*_{A,i}, where (x*_A, y*_A, u*_A, v*_A) is an optimal primal solution to g_A(∆; y). So if y_i > 0 then x*_{A,ij} > 0 and complementary slackness gives the desired equality. If y_i = 0 and the above inequality is strict, then we may decrease β*_{A,ij} while maintaining dual feasibility and optimality, which gives a contradiction to the choice of the dual solution. Thus, since the dual also imposes that ψ*_{A,j} ≤ c_{ij} + Γ*_{A,ij} (corresponding to u_{A,ij}), we have that β*_{A,ij} ≤ α*_{A,j}, so Σ_j β*_{A,ij} ≤ Σ_j α*_{A,j} ≤ ∆ (the last inequality follows from the dual constraint for r_A). As in Lemma 3.7, if d is a subgradient of h(∆; ·) at y and d̂ is a vector such that |d̂_i − d_i| ≤ ω f^I_i + ξ/(2m), then d̂ is an (ω, ξ)-subgradient of h(∆; ·) at y.
Since E[d̂_{y,i}] = d_{y,i} for every y and i, plugging in the sample size N used in SA-Alg and using the Chernoff bound (Lemma 2.3), we obtain that with probability at least 1 − δ, |d̂_{y,i} − d_{y,i}| ≤ (ε/(8N)) f^I_i + η∆/(4mN) for all i, for every point y in the extended (τK/N)-net G_τ of P. Thus, with probability at least 1 − δ, d̂_y is an (ε/(8N), η∆/(2N))-subgradient of h(∆; ·) at y for every y ∈ G_τ.

We now discuss the rounding procedure. Analogous to part (i) of Theorem 3.2, it is not hard to see that if (y, {(x_A, y_A, u_A, v_A, r_A)}) is a solution satisfying (7)–(12) of objective value C with P = Σ_A p_A r_A, then for any ε > 0, taking ŷ = y(1 + 1/ε) gives Σ_i f_i ŷ_i + Σ_A p_A ℓ_A(ŷ) ≤ (1 + 1/ε)C and Pr[ℓ_A(ŷ) > (1 + 1/ε)B] ≤ (1 + ε)P. So now one can use a local approximation algorithm for 2-stage stochastic facility location (SUFL) to round ŷ. Shmoys and Swamy [38] show that any LP-based c-approximation algorithm for the deterministic facility location problem (DUFL) that satisfies a certain "demand-obliviousness" property can be used to obtain a min{2c, c + 1.52}-approximation algorithm for SUFL, by using it in conjunction with the 1.52-approximation algorithm for DUFL in [26]. "Demand-obliviousness" means that the algorithm should round a fractional solution without having any knowledge of the client-demands, and is imposed to handle the fact that one does not have the second-stage solutions explicitly. There are some difficulties in applying this to our problem. First, the resulting algorithm for SUFL need not be local. Secondly, and more significantly, even if we do obtain a local approximation algorithm for SUFL by the conversion process in [38], the resulting algorithm may be randomized, if the c-approximation algorithm for DUFL is randomized.
This is indeed the case in [38]; they obtain a randomized local 3.378-approximation algorithm using the demand-oblivious, randomized 1.858-approximation algorithm of Swamy [43]. (This was improved to a randomized local 3.25-approximation algorithm by Srinivasan [42], again using the algorithm in [43].) Using such a randomized local c′-approximation algorithm for SUFL would yield a random integer solution such that there is at least a 1 − ρ(1 + κ + ε) probability mass of scenarios for which the expected cost incurred, where the expectation is over the random choices of the algorithm, is at most c′B(1 + 1/ε). But we would like to make the stronger claim that, with high probability over the random choices of the algorithm, we return a solution where the probability-mass of scenarios with cost at most c′B(1 + 1/ε) is at least 1 − ρ(1 + κ + ε). We can take care of both issues by imposing the following (sufficient) condition on the demand-oblivious algorithm for DUFL that is used to obtain an approximation algorithm for SUFL (via the conversion process in [38]): we require that, with probability 1, the algorithm return an integer solution where each client's assignment cost is within some factor of its cost in the fractional solution. One can use the randomized approximation algorithm of Swamy [43] or the deterministic Shmoys–Tardos–Aardal (STA) algorithm [40], both of which satisfy this condition. Given a fractional solution (x, y) to DUFL with facility cost F, for a parameter γ ∈ (0, 1), the STA-algorithm returns an integer solution (x̃, ỹ) with facility cost at most F/γ, where for every j, Σ_i c_{ij} x̃_{ij} ≤ (3/(1 − γ)) · Σ_i c_{ij} x_{ij} (so for any demands, the total assignment cost is at most 3/(1 − γ) times the fractional assignment cost). Taking γ = 1/4 and applying the rounding procedure of [38] yields the following theorem.
Theorem 4.6 For any ǫ, κ, ε > 0, there is a (5.52(1 + ǫ + 1/ε), 5.52(1 + ǫ + 1/ε), 1 + κ + ε)-approximation algorithm for risk-averse budgeted facility location.

Remark 4.7 The local approximation algorithm for SUFL developed by [33] is unsuitable for our purposes, since this algorithm needs to know explicitly the second-stage fractional solution for each scenario, which is an exponential amount of information.

Budget constraints on individual components of the second-stage cost. Our techniques can be used to devise approximation algorithms for various fairly general risk-averse versions of facility location. Since the second-stage cost consists of two distinct components, the facility-opening cost and the client-assignment cost, one can consider risk-averse budgeted versions of the problem where we impose a joint probabilistic budget constraint on the total second-stage cost and each component of the second-stage cost. That is, consider (RAFL-P) with the following additional constraints for each scenario A: Σ_i f^A_i y_{A,i} ≤ B_F and Σ_{j,i} c_{ij} x_{A,ij} ≤ B_C. Here B_F and B_C are respectively budgets on the per-scenario facility-opening and client-assignment costs. To put it in words, (RAFL-P) augmented with the above constraints imposes the following joint probabilistic budget constraint:

\[
\Pr_A\big[\ \text{total cost of scenario } A > B \ \text{ OR } \ \text{facility-cost of scenario } A > B_F \ \text{ OR } \ \text{assignment-cost of scenario } A > B_C \ \big] \le \rho.
\]

Note that by setting the appropriate budget to ∞ we can model the absence of a particular budget constraint. One can model various interesting situations by setting B, B_F, B_C suitably. For example, suppose we set B_F = 0 and B = ∞ (or equivalently B = B_C).
Then we seek a minimum-cost solution where we want to choose facilities to open in stage I such that, with probability at least 1 − ρ, we can assign the clients in a scenario A to (a subset of) the stage-I facilities while incurring assignment cost at most B_C. One can also consider risk-averse robust versions of the problem where we seek to minimize the first-stage cost plus the (1 − ρ)-quantile of a certain component of the second-stage cost (i.e., the second-stage facility-opening, or assignment, or total cost). Employing the usual "guessing" trick, this gives rise to a budget problem where we have a budget constraint for a single component of the second-stage cost (that is, two of B, B_F and B_C are set to ∞). As before, the guarantees obtained for the budget problem (see below) translate to this risk-averse robust problem. Our techniques can be used to solve this more general LP. Specifically, Theorem 4.3 continues to hold. But here we face the complication that even if we have a first-stage solution x to the fractional risk-averse problem for which we know that there exist second-stage feasible solutions that yield a solution of total expected cost C, it is not clear how to compute such feasible second-stage solutions. However, notice that RiskAlg not only returns a first-stage solution (with the above existence property) but also shows how to compute a suitable second-stage solution in each scenario A, which thus allows us to specify completely a near-optimal solution to the LP-relaxation (where the RHS of (6) is ρ(1 + κ)). Whereas earlier we used these solutions only in the analysis, now they are part of the algorithm. In the rounding procedure, the first step, where we convert the solution to the LP-relaxation to a fractional solution to the risk-averse problem, is unchanged.
But we of course now need a stronger notion of "locality" from our approximation algorithm for SUFL. We need an algorithm that approximately preserves (with probability 1) both the facility-opening and client-assignment components of the second-stage cost of each scenario. (Clearly, if the budget constraint is imposed on only one of the components then we only need the cost-preservation of that component.) Many LP-rounding algorithms for SUFL (such as the ones in [38, 42]) do in fact come with this stronger local guarantee. Thus, one can use these to obtain an approximation algorithm for the above risk-averse problem with multiple budget constraints. Finally, we obtain the same approximation guarantees with non-uniform scenario budgets {(B^A, B^A_F, B^A_C)}. The only small detail here is that in order to obtain the upper bound UB for use in RiskAlg, we now determine if Pr[C_A > min{B^A_C, B^A}] is greater than ρ or at most ρ(1 + 5κ/28). In the former case, we conclude infeasibility, and in the latter, we set ρ̂ = ρ(1 + 5κ/28), κ̂ such that ρ̂(1 + κ̂) = ρ(1 + κ), and UB = 32(1+ε)(Σ_i f^I_i + Σ_j max_i c_{ij})/(3ρκ), and run RiskAlg with these values. (Note that we may assume that B^A_C ≤ Σ_j max_i c_{ij} for all A.)

5 Sampling lower bounds

We now prove various lower bounds on the sample size required to obtain a bounded approximation guarantee for the risk-averse budgeted problem in the black-box model. We show that the dependence of the sample size on 1/κ for an additive violation of κ in the probability threshold is unavoidable in the black-box model, even for the fractional risk-averse problem and even if we allow a bounded violation of the budget. The crux of our lower bounds is the following observation. Consider the following problem.
We are given as input a threshold ε ∈ (0, 1/4) and a biased coin with probability q of landing heads, where the coin is given as a black-box; that is, we do not know q but may toss the coin as many times as necessary to "learn" q. The goal is to determine if q ≤ ε or q > 2ε; if q ∈ (ε, 2ε] then the algorithm may answer anything. We prove that for any δ < 1/2, any algorithm that ensures error probability at most δ on every input must use at least N(δ; ε) = ln(1/δ − 1)/(4ε) coin tosses for each threshold ε.

Lemma 5.1 Let δ < 1/2 and A be an algorithm that has failure probability at most δ and uses at most N_A(δ; ε) coin tosses for threshold ε. Then, N_A(δ; ε) ≥ N(δ; ε) for every ε ∈ (0, 1/4).

Proof: Suppose N_A(δ; ε) < N(δ; ε) for some ε ∈ (0, 1/4). Let X be a random variable that denotes the number of times the coin lands heads. If X = 0 then the algorithm must say "q ≤ ε" with probability at least 1 − δ, otherwise the algorithm errs with probability more than δ on q = 0. But then for some q_0 < 1/4 slightly greater than 2ε, we have Pr[X = 0] > (1 − 2ε)^{N(δ;ε)} ≥ δ/(1 − δ). So A will say "q ≤ ε" (and hence err) for q = q_0, with probability more than δ.

As a corollary we obtain that for any δ < 1/2, it is impossible to determine if q = 0 or if q > 0 with error probability at most δ using a bounded number of samples. Now consider risk-averse budgeted set cover. We say that a solution is an (ǫ, γ)-optimal solution if its cost is at most (1 + ǫ)OPT + γ. Suppose there is an algorithm A for risk-averse budgeted set cover that on any input (with a black-box distribution) draws a bounded number of samples and returns an (ǫ, γ)-optimal solution with probability at least 1 − δ, δ < 1/2, where the probability-threshold is violated by at most κ. Consider the following risk-averse budgeted set-cover instance.
There are three elements e_1, e_2, e_3, and three sets S_i = {e_i}, i = 1, 2, 3. The budget is B ≥ 6γ and the probability threshold is ρ ≤ 1/(8(1 + ǫ)). The costs are w^I_{S_i} = B for all i, and w^A_{S_1} = 0, w^A_{S_2} = w^A_{S_3} = 2B/3 for every scenario A. Let κ < 1/4. There are 3 scenarios: A_0 = ∅, A_1 = {e_1, e_2, e_3}, A_2 = {e_2, e_3}, with p_{A_1} = ρ − κ and p_{A_0} = 1 − p_{A_1} − p_{A_2}. Observe that if p_{A_2} ≤ κ, then OPT ≤ ρ · 4B/3, and every (ǫ, γ)-optimal solution must have x_{S_1} + x_{S_2} + x_{S_3} ≤ 1/3. But if p_{A_2} > 2κ (which is possible since ρ < 1) then any solution where the probability of exceeding the budget is at most ρ + κ must have x_{S_2} + x_{S_3} ≥ 1/2, otherwise the cost in both scenarios A_1 and A_2 will exceed B. Thus, algorithm A can be used to determine if p_{A_2} ≤ κ or p_{A_2} > 2κ. This is true even if we allow the budget to be violated by a factor c < 10/9, since we must still have x_{S_2} + x_{S_3} > 1/3 if p_{A_2} > 2κ; choosing B ≫ 1, ρ ≪ 1, we can allow an arbitrarily large budget-violation. So since A has failure probability at most δ, by Lemma 5.1, it must draw Ω(1/κ) samples. Taking κ = 0 shows that obtaining guarantees without violating the probability threshold is impossible with a bounded sample size, whereas taking κ = κρ shows that a multiplicative (1 + κ)-factor violation of the probability threshold requires Ω(1/(κρ)) samples. Moreover, taking ρ = 0 shows that one cannot hope to achieve any approximation guarantees in the (standard) budget model with black-box distributions.
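The counting in Lemma 5.1 can be sanity-checked numerically: for any toss budget below N(δ; ε) = ln(1/δ − 1)/(4ε), the probability of seeing zero heads at a bias q_0 slightly above 2ε already exceeds δ/(1 − δ), because ln(1 − 2ε) ≥ −4ε for ε < 1/4. A small sketch (the specific parameter values are illustrative):

```python
import math

def coin_lower_bound(delta, eps):
    """N(delta; eps) = ln(1/delta - 1) / (4*eps): with fewer tosses than this,
    the cases q <= eps and q slightly above 2*eps cannot be told apart with
    error probability at most delta on every input."""
    return math.log(1.0 / delta - 1.0) / (4.0 * eps)

delta, eps = 0.1, 0.01
N = coin_lower_bound(delta, eps)     # about 55 tosses are necessary
n = int(N) - 1                       # any smaller toss budget
q0 = 2 * eps + 1e-6                  # bias just above the 2*eps threshold
p_zero_heads = (1.0 - q0) ** n       # Pr[X = 0] when the bias is q0
# Below N tosses, Pr[X = 0] still exceeds delta/(1 - delta), so an algorithm
# that answers "q <= eps" correctly on q = 0 errs on q0 with probability > delta.
```

Here p_zero_heads ≈ 0.34 while δ/(1 − δ) ≈ 0.11, illustrating that 54 tosses are too few for δ = 0.1 and ε = 0.01.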
Theorem 5.2 For any ǫ, γ > 0 and δ < 1/2, every algorithm for risk-averse budgeted set cover that returns an (ǫ, γ)-optimal solution with failure probability at most δ using a bounded number of samples
• must violate the probability threshold on some input;
• requires Ω(1/κ) samples if the probability-threshold is violated by at most an additive κ;
• requires Ω(1/(κρ)) samples if the probability-threshold is violated by at most a multiplicative (1 + κ)-factor.

The proof of impossibility of approximation in the standard robust model with a bounded sample size is even simpler. Consider the following set cover instance. We have a single element e that gets "activated" with some probability p; the cost of the set S = {e} is 1 in stage I and some large number M in stage II. If p = 0 then OPT = 0, otherwise OPT = 1. Thus, it is easy to see that an algorithm returning an (ǫ, γ)-optimal solution can be used to distinguish between these two cases (it should set x_S ≤ γ in the former case, and x_S sufficiently large in the latter).

References

[1] C. Acerbi and D. Tasche. On the coherence of expected shortfall. Journal of Banking and Finance, 26:1487–1503, 2002.
[2] E. M. L. Beale. On minimizing a convex function subject to linear inequalities. Journal of the Royal Statistical Society, Series B, 17:173–184; discussion 194–203, 1955.
[3] D. Bertsimas and M. Sim. The price of robustness. Operations Research, 52(1):35–53, 2004.
[4] J. R. Birge and F. V. Louveaux. Introduction to Stochastic Programming. Springer-Verlag, NY, 1997.
[5] J. Borwein and A. S. Lewis. Convex Analysis and Nonlinear Optimization. Springer-Verlag, NY, 2000.
[6] G. Calafiore and M. Campi. The scenario approach to robust control design. IEEE Transactions on Automatic Control, 51(5):742–753, 2006.
[7] M. Charikar, C. Chekuri, and M. Pál.
Sampling bounds for stochastic optimization. Proceedings, 9th RANDOM, pages 257–269, 2005.
[8] A. Charnes and W. Cooper. Uncertain convex programs: randomized solutions and confidence levels. Management Science, 6:73–79, 1959.
[9] V. Chvátal. A greedy heuristic for the set-covering problem. Mathematics of Operations Research, 4:233–235, 1979.
[10] G. B. Dantzig. Linear programming under uncertainty. Management Science, 1:197–206, 1955.
[11] K. Dhamdhere, V. Goyal, R. Ravi, and M. Singh. How to pay, come what may: approximation algorithms for demand-robust covering problems. Proceedings, 46th Annual IEEE Symposium on Foundations of Computer Science, pages 367–378, 2005.
[12] S. Dye, L. Stougie, and A. Tomasgard. The stochastic single resource service-provision problem. Naval Research Logistics, 50(8):869–887, 2003. Also appeared as "The stochastic single node service provision problem", COSOR-Memorandum 99-13, Dept. of Mathematics and Computer Science, Eindhoven Technical University, Eindhoven, 1999.
[13] E. Erdoğan and G. Iyengar. On two-stage convex chance constrained problems. Mathematical Methods of Operations Research, 65(1):115–140, 2007.
[14] U. Feige, K. Jain, M. Mahdian, and V. Mirrokni. Robust combinatorial optimization with exponential scenarios. Proceedings, 13th IPCO, pages 439–453, 2007.
[15] A. Goel and P. Indyk. Stochastic load balancing. Proceedings, 40th Annual IEEE Symposium on Foundations of Computer Science, pages 579–586, 1999.
[16] D. Golovin, V. Goyal, and R. Ravi. Pay today for a rainy day: improved approximation algorithms for demand-robust min-cut and shortest path problems. Proceedings, 23rd STACS, pages 206–217, 2006.
[17] A. Gupta, M. Pál, R. Ravi, and A. Sinha. Boosted sampling: approximation algorithms for stochastic optimization.
Proceedings, 36th Annual ACM Symposium on Theory of Computing, pages 417–426, 2004.
[18] A. Gupta, M. Pál, R. Ravi, and A. Sinha. What about Wednesday? Approximation algorithms for multistage stochastic optimization. Proceedings, 8th APPROX, pages 86–98, 2005.
[19] A. Gupta, R. Ravi, and A. Sinha. An edge in time saves nine: LP rounding approximation algorithms for stochastic network design. Proceedings, 45th Annual IEEE Symposium on Foundations of Computer Science, pages 218–227, 2004.
[20] A. Hayrapetyan, C. Swamy, and É. Tardos. Network design for information networks. Proceedings, 16th SODA, pages 933–942, 2005.
[21] W. Hoeffding. Probability inequalities for sums of bounded random variables. Journal of the American Statistical Association, 58:13–30, 1963.
[22] N. Immorlica, D. Karger, M. Minkoff, and V. Mirrokni. On the costs and benefits of procrastination: approximation algorithms for stochastic combinatorial optimization problems. Proceedings, 15th Annual ACM-SIAM Symposium on Discrete Algorithms, pages 684–693, 2004.
[23] P. Jorion. Value at Risk: A New Benchmark for Measuring Derivatives Risk. Irwin Professional Publishers, New York, 1996.
[24] J. Kleinberg, Y. Rabani, and É. Tardos. Allocating bandwidth for bursty connections. SIAM Journal on Computing, 30(1):191–217, 2000.
[25] A. J. Kleywegt, A. Shapiro, and T. Homem-De-Mello. The sample average approximation method for stochastic discrete optimization. SIAM Journal on Optimization, 12:479–502, 2001.
[26] M. Mahdian, Y. Ye, and J. Zhang. Approximation algorithms for metric facility location problems. SIAM Journal on Computing, 36:411–432, 2006.
[27] H. M. Markowitz. Portfolio selection. Journal of Finance, 7:77–91, 1952.
[28] A. Nemirovski and A. Shapiro. Scenario approximations of chance constraints. In G. Calafiore and F. Dabbene, editors,
Probabilistic and Randomized Methods for Design under Uncertainty, Springer-Verlag, 2005.
[29] A. Prékopa. Contributions to the theory of stochastic programming. Mathematical Programming, 4:202–221, 1973.
[30] A. Prékopa. Stochastic Programming. Kluwer Academic Publishers, Dordrecht, 1995.
[31] A. Prékopa. Probabilistic programming. In A. Ruszczynski and A. Shapiro, editors, Stochastic Programming, volume 10 of Handbooks in Operations Research and Mgmt. Sc., North-Holland, Amsterdam, 2003.
[32] M. Pritsker. Evaluating value at risk methodologies. Journal of Financial Services Research, 12(2/3):201–242, 1997.
[33] R. Ravi and A. Sinha. Hedging uncertainty: approximation algorithms for stochastic optimization problems. Mathematical Programming, Series A, 108:97–114, 2006.
[34] R. Rockafellar and S. Uryasev. Conditional value-at-risk for general loss distributions. Journal of Banking and Finance, 26:1443–1471, 2002.
[35] A. Ruszczynski and A. Shapiro, editors. Stochastic Programming, volume 10 of Handbooks in Operations Research and Mgmt. Sc., North-Holland, Amsterdam, 2003.
[36] A. Ruszczynski and A. Shapiro. Optimization of risk measures. In G. Calafiore and F. Dabbene, editors, Probabilistic and Randomized Methods for Design under Uncertainty, Springer-Verlag, 2005.
[37] A. Shapiro. Monte Carlo sampling methods. In A. Ruszczynski and A. Shapiro, editors, Stochastic Programming, volume 10 of Handbooks in Operations Research and Mgmt. Sc., North-Holland, Amsterdam, 2003.
[38] D. B. Shmoys and C. Swamy. An approximation scheme for stochastic linear programming and its application to stochastic integer programs. Journal of the ACM, 53(6):978–1012, 2006.
[39] D. B. Shmoys and C. Swamy. Stochastic optimization is (almost) as easy as deterministic optimization. Proceedings, 45th Annual IEEE FOCS, pages 228–237, 2004.
[40] D. B. Shmoys, É. Tardos, and K. I. Aardal. Approximation algorithms for facility location problems. Proceedings, 29th Annual ACM Symposium on Theory of Computing, pages 265–274, 1997.
[41] A. M-C. So, J. Zhang, and Y. Ye. Stochastic combinatorial optimization with controllable risk aversion level. Proceedings, 9th APPROX, pages 224–235, 2006.
[42] A. Srinivasan. Approximation algorithms for stochastic and risk-averse optimization. Proceedings, 18th SODA, pages 1305–1313, 2007.
[43] C. Swamy. Approximation Algorithms for Clustering Problems. Ph.D. thesis, Cornell University, Ithaca, NY, 2004. http://www.math.uwaterloo.ca/~cswamy/theses/master.pdf.
[44] C. Swamy and D. B. Shmoys. Approximation algorithms for 2-stage stochastic optimization problems. ACM SIGACT News, 37(1):33–46, March 2006. Also appeared in Proceedings, 26th FSTTCS, pages 5–19, 2006.
[45] C. Swamy and D. B. Shmoys. Sampling-based approximation algorithms for multi-stage stochastic optimization. http://www.math.uwaterloo.ca/~cswamy/papers/multistage-journ.pdf. Preliminary version in Proceedings, 46th Annual IEEE Symposium on Foundations of Computer Science, pages 357–366, 2005.

A  A bicriteria approximation for the Shmoys-Swamy class of 2-stage stochastic LPs in the standard budget model (ρ = 0)

Here we sketch how one can obtain a bicriteria approximation algorithm for the class of 2-stage LPs introduced in [38] in the standard budget model (that is, where we have a deterministic budget constraint). We show that for any ρ > 0, in time inversely proportional to ρ, one can obtain a near-optimal solution in which the total probability mass of scenarios where the budget is violated is at most ρ. We consider the following class of 2-stage stochastic LPs [38]² in the standard budget model.
$$\min\ h(x) = w^{\mathrm{I}} \cdot x + \sum_{A \in \mathcal{A}} p_A f_A(x) \quad \text{subject to} \quad x \in \mathcal{P} \subseteq \mathbb{R}^m_{+}, \quad f_A(x) \le B \ \text{for all } A \qquad \text{(Stoc-P)}$$

where

$$f_A(x) = \min\ w^A \cdot r_A + q^A \cdot s_A \quad \text{s.t.} \quad D^A s_A + T^A r_A \ge j^A - T^A x, \quad r_A, s_A \ge 0, \ r_A \in \mathbb{R}^m, \ s_A \in \mathbb{R}^{\ell}.$$

Here (a) $T^A \ge 0$ for every scenario $A$, and (b) for every $x \in \mathcal{P}$, $\sum_{A \in \mathcal{A}} p_A f_A(x) \ge 0$ and the primal and dual problems corresponding to $f_A(x)$ are feasible for every scenario $A$. It is assumed that $\mathcal{P} \subseteq B(0, R)$, and that $\mathcal{P}$ contains a ball of radius $V$ ($V \le 1$) where $\ln\bigl(\frac{R}{V}\bigr)$ is polynomially bounded. Define $\lambda = \max\bigl(1, \max_{A \in \mathcal{A}, S} \frac{w^A_S}{w^{\mathrm{I}}_S}\bigr)$; we assume that $\lambda$ is known. Let OPT be the optimum value and $I$ denote the input size.

It is possible to adapt the proofs in [38, 7, 44] to obtain the bicriteria guarantee, and one can also prove an SAA theorem in the style of [45, 7]. But perhaps the simplest proof, which we now describe, is obtained using the ellipsoid-based algorithm in [38]. Let $\mathcal{P}' = \{x \in \mathcal{P} : f_A(x) \le B \text{ for all } A\}$. Note that, unlike in the case where we have a probabilistic budget constraint, $\mathcal{P}'$ is a convex set. Consider running the ellipsoid-based algorithm in [38] with the following modification. Suppose we wish to return a solution of value at most $(1+\epsilon)\cdot\mathrm{OPT} + \gamma$. Let $N = \mathrm{poly}\bigl(m, \ln\bigl(\frac{KR}{V\gamma}\bigr)\bigr)$ be a suitably large value equal to the number of iterations of the ellipsoid method, and let $\rho' = \rho/N$. Suppose the center of the current ellipsoid is $x \in \mathcal{P}$. Using $O\bigl(\frac{1}{\rho'}\bigr)$ samples one can determine, with high probability, whether $\Pr_A[f_A(x) > B] > \rho'/2$ or $\Pr_A[f_A(x) > B] \le \rho'$. In the former case, by sampling again $O\bigl(\frac{1}{\rho'}\bigr)$ times, with very high probability we can obtain a scenario $A$ such that $f_A(x) > B$. We then compute a subgradient $d_{A,x}$ of $f_A(\cdot)$ at $x$ (obtained from an optimal dual solution to $f_A(x)$), and use the inequality $d_{A,x} \cdot (y - x) \le 0$ to cut the current ellipsoid.
Notice that this is a valid inequality, since for any $y \in \mathcal{P}'$, by the definition of a subgradient we have $0 > f_A(y) - f_A(x) \ge d_{A,x} \cdot (y - x)$. In the latter case, where we detect that $\Pr_A[f_A(x) > B] \le \rho'$, we continue as in the algorithm in [38]: we mark the current point $x$ and use an approximate subgradient of $h(\cdot)$ at $x$ to cut the current ellipsoid. Proceeding this way we obtain a collection of marked points $x_1, \ldots, x_k$, where $k \le N$, such that with high probability $\Pr_A[f_A(x_i) > B] \le \rho'$ for each $x_i$, and by the analysis in [38] we have that $\min_i h(x_i)$ is "close" to OPT. The next step in the algorithm in [38] is to find a point in the convex hull of $x_1, \ldots, x_k$ whose value is close to $\min_i h(x_i)$ (procedure FindMin). Notice that for any point $y$ in the convex hull of $x_1, \ldots, x_k$, we have $\Pr_A[f_A(y) > B] \le k\rho' \le \rho$: for any scenario $A$ with $f_A(x_i) \le B$ for all $i$, the convexity of $f_A(\cdot)$ implies that $f_A(y) \le B$. Thus, although the set $\{x \in \mathcal{P} : \Pr_A[f_A(x) > B] \le \rho\}$ is not convex, this does not present a problem for us. So one can use procedure FindMin in [38] to return a point $y$ such that $h(y) \le (1+\epsilon)\mathrm{OPT} + \gamma$ and $\Pr_A[f_A(y) > B] \le \rho$.

²This was stated in [39] with extra constraints $B^A s_A \ge h^A$, but this is equivalent to $\binom{B^A}{D^A} s_A + \binom{0}{T^A} r_A \ge \binom{h^A}{j^A} - \binom{0}{T^A} x$.
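The convexity argument behind the $k\rho'$ union bound can be checked on a toy example. The cost function and points below are hypothetical, chosen only to illustrate the step: if a convex $f_A$ is within budget at every marked point, it is within budget at every point of their convex hull.

```python
def f_A(x):
    # A convex piecewise-linear cost: the max of affine functions, as arises
    # for the LP value function f_A viewed as a function of x.
    return max(2 * x[0] - x[1], x[0] + 0.5 * x[1] - 1.0)

B = 3.0
marked = [(1.0, 0.5), (0.5, 1.5), (1.2, 1.0)]   # hypothetical marked points
assert all(f_A(x) <= B for x in marked)

# Any convex combination y of the marked points also satisfies f_A(y) <= B,
# since f_A(y) <= max_i f_A(x_i) by convexity.
weights = [0.2, 0.5, 0.3]
y = tuple(sum(w * x[i] for w, x in zip(weights, marked)) for i in range(2))
assert f_A(y) <= B
```

The union bound then only has to account for the scenarios that violate the budget at *some* marked point, each contributing at most $\rho'$ probability mass.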
