Solving Chance Constrained Programs via a Penalty Based Difference of Convex Approach

Zhiping Li^1, Nan Jiang^2, Rujun Jiang^{3,†}

March 13, 2026

Abstract

We develop two penalty based difference of convex (DC) algorithms for solving chance constrained programs. First, leveraging a rank-based DC decomposition of the chance constraint, we propose a proximal penalty based DC algorithm in the primal space that does not require a feasible initialization. Second, to improve numerical stability in general nonlinear settings, we derive an equivalent lifted formulation with complementarity constraints and show that, after minimizing over the primal variables, the penalized lifted problem admits a tractable DC structure in the dual space over a simple polyhedron. We then develop a penalty based DC algorithm in the lifted space with a finite termination guarantee. We establish exact penalty and stationarity guarantees under mild constraint qualifications and identify the relationship between the local minimizers of the two formulations. Numerical experiments demonstrate the efficiency and effectiveness of our proposed methods compared with state-of-the-art benchmarks.

1 Introduction

As a powerful tool for addressing uncertainty in decision-making, the chance constrained program (CCP) has been extensively studied in the literature. It has been widely applied across fields such as finance, energy systems, and supply chain management [2, 32]. In general, a chance constrained program can be written as follows:

    \min_{x ∈ \mathcal{X}} \{ f(x) : \Pr\{ g(x, \tilde{ξ}) ≤ 0 \} ≥ 1 − α \},    (1)

where \tilde{ξ} is a random vector supported on Ξ ⊆ \mathbb{R}^{d_ξ} and α ∈ (0, 1) is a prescribed violation risk level. The feasible region \mathcal{X} is a deterministic set contained in an open set \mathcal{U} ⊆ \mathbb{R}^d, and f : \mathcal{U} → \mathbb{R} and g : \mathcal{U} × Ξ → \mathbb{R} are real-valued functions. We use \tilde{ξ} to denote the random vector and ξ its realization.
The goal of Problem (1) is to minimize the objective function f(x) subject to the requirement that the constraint g(x, \tilde{ξ}) ≤ 0 be satisfied with probability at least 1 − α.

Problem (1) is generally difficult to solve, especially when the distributional information of the random vector \tilde{ξ} is unknown [1]. First, verifying whether a candidate solution satisfies the chance constraint can be computationally challenging. Moreover, the feasible region of a CCP is generally nonconvex even when \mathcal{X} is convex, implying that finding an optimal solution with a provable guarantee can be elusive.

^1 IEDA, HKUST and School of Economics, Fudan University. Email: lizhiping13624@gmail.com
^2 IEDA, HKUST. Email: nanjiang@ust.hk
^3 School of Data Science, Fudan University. Email: rjjiang@fudan.edu.cn
^† Corresponding author.

Given the challenges mentioned above, there are two major strategies to address the CCP. The first strategy uses tractable conservative approximations, in which the CCP is formulated as a convex optimization problem that can be efficiently solved and yields a feasible solution to the original CCP [43, 3, 15]. The second approach is the sample average approximation (SAA), also sometimes known as the scenario approach, which has been extensively studied in [2, 37, 40]. It is a popular method for approximately solving Problem (1), especially when the distribution of \tilde{ξ} is unknown and we can only access a sample of S i.i.d. realizations \{ξ_s\}_{s=1}^{S}. In this paper, we consider the SAA approximation of Problem (1) as follows:

    v^* = \min_{x ∈ \mathcal{X}} \Big\{ f(x) : \frac{1}{S} \sum_{s=1}^{S} \mathbb{I}\{ g(x, ξ_s) ≤ 0 \} ≥ 1 − α \Big\},    (2)

where the indicator function \mathbb{I}\{ g(x, ξ_s) ≤ 0 \} equals 1 if g(x, ξ_s) ≤ 0 and 0 otherwise. Moreover, we focus on the convex setting and make the following assumptions.

Assumption 1. The feasible region \mathcal{X} ⊆ \mathbb{R}^d is compact, convex, and has a nonempty relative interior.
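The SAA chance constraint in (2) is equivalent to requiring that at most m = ⌊αS⌋ of the S scenarios are violated. The following minimal sketch (my own illustration, not from the paper; the function name and data are hypothetical) checks this condition for a given vector of scenario values g(x, ξ_s):

```python
import numpy as np

def saa_chance_feasible(g_vals, alpha):
    """Check the SAA chance constraint (1/S) * sum_s I{g_s <= 0} >= 1 - alpha,
    i.e. at most m = floor(alpha * S) of the S scenarios may be violated."""
    g_vals = np.asarray(g_vals, dtype=float)
    m = int(np.floor(alpha * g_vals.size))   # allowed number of violations
    return int(np.sum(g_vals > 0)) <= m      # violated scenarios have g_s > 0

# 10 scenarios with 2 violations: feasible at alpha = 0.2 (m = 2),
# infeasible at alpha = 0.1 (m = 1).
vals = [-1.0, -0.5, 0.3, -2.0, -0.1, 0.7, -0.4, -0.9, -1.1, -0.2]
print(saa_chance_feasible(vals, 0.2), saa_chance_feasible(vals, 0.1))  # True False
```

The counting form is exact because the number of violated scenarios is an integer: (1/S)·#{satisfied} ≥ 1 − α holds iff #{violated} ≤ ⌊αS⌋.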
Assumption 2. The function f : \mathcal{U} → \mathbb{R} is continuously differentiable and convex. The function g : \mathcal{U} × Ξ → \mathbb{R} takes the form g(x, ξ) = \max_{i ∈ [I]} h_i(x, ξ), where h_i : \mathbb{R}^d × Ξ → \mathbb{R} is continuously differentiable and convex in x for every ξ ∈ Ξ and i ∈ [I]. Moreover, \mathcal{U} ⊆ \mathbb{R}^d is an open set that contains \mathcal{X}.

Assumptions 1 and 2 are common in the existing literature [50, 30, 53, 4] and cover several standard settings, e.g., the objective function f(x) is linear and the feasible region \mathcal{X} is a polyhedron [38]. Note that Assumptions 1 and 2 imply that ∇f(x) is bounded on \mathcal{X} and thus f is Lipschitz continuous on \mathcal{X}; that is, there exists L_f > 0 such that

    |f(x) − f(x′)| ≤ L_f ∥x − x′∥,  ∀ x, x′ ∈ \mathcal{X}.

Problem (2) is known as a single chance constrained program if I = 1 and a joint chance constrained program otherwise. Other assumptions will be introduced as needed. Given a sample of S realizations \{ξ_s\}_{s=1}^{S}, we write g_s(x) for g(x, ξ_s) and h_{i,s}(x) for h_i(x, ξ_s) for all s ∈ [S] and i ∈ [I] in the rest of the paper.

1.1 Relevant Literature

Chance Constrained Programs. Chance constrained programming has a long history as a framework for incorporating uncertainty directly into decision models. Since its early development [12, 11], it has become a central tool in applications where decisions must remain feasible with high probability. In energy systems, for instance, chance constrained programs provide a mechanism for coping with operational randomness such as renewable generation variability, fluctuating demand profiles, and unexpected network contingencies [16, 42]. Such models enable system operators to enforce reliability targets while acknowledging the stochastic nature of real-world power systems.
Similar ideas appear in financial decision-making, where probabilistic constraints are used to balance portfolio performance against the risk generated by volatile market conditions [19]. By controlling the likelihood of unacceptable losses, these models support investment strategies that remain robust in uncertain environments. In logistics and supply chain management [20, 22], chance constraints play a comparable role: they allow planners to ensure that service, routing, or inventory requirements are satisfied with prescribed confidence levels despite uncertain demand or transportation disruptions. We refer interested readers to [2, 32] for a comprehensive review.

A central difficulty in working with chance constrained programs (1) is that the feasible region is typically nonconvex, which limits the applicability of standard convex optimization tools. To address this issue, a commonly used strategy is the sample average approximation (SAA), in which the probabilistic constraint is replaced with a finite set of scenarios. This yields a deterministic mixed-integer formulation that can be solved by modern mixed-integer optimization solvers. While this provides a viable computational framework, its practical applicability is generally limited to instances of moderate scale [37, 44]. Another line of research focuses on constructing convex inner approximations of the original chance constraint [40]. These methods replace the probabilistic requirement with a tractable approximation that can be handled using standard convex optimization techniques. A commonly used example is the conditional value-at-risk (CVaR) reformulation, which provides a convex upper bound on the probability of violation. While this approximation preserves feasibility with respect to the chance constraint, it can be conservative and generally does not recover the optimal objective value of the original problem.
More recent developments aim to mitigate the conservatism of single-shot convex approximations by employing iterative refinement procedures [30, 31]. The ALSO-X method is a representative example of this direction. When the deterministic feasible region \mathcal{X} is convex, it has been shown that ALSO-X can yield solutions that improve upon those obtained from the CVaR approximation, thanks to its iterative updates. Building on ALSO-X, an enhanced variant, ALSO-X+, has been proposed to further improve solution quality. In numerical experiments, ALSO-X+ has been observed to achieve better objective values than ALSO-X and the CVaR approximation; see, for example, [54, 51, 46]. At the same time, these works also document that the computational time required by ALSO-X+ is substantially higher than that of ALSO-X and the CVaR approximation, especially on larger instances. A further limitation is that the theoretical properties of the solutions produced by ALSO-X+ are not fully understood. In particular, current analyses in [30] do not establish convergence to a stationary point of the underlying chance constrained problem. Hence, the quality of the limiting solutions lacks a rigorous optimality guarantee.

Difference of Convex (DC) Programs. DC programs are optimization problems that minimize a DC function subject to DC constraints. They play an important role in nonconvex optimization and have been extensively studied for decades [28, 47]. One of the most important methods for solving DC programs is the DC algorithm (DCA) [27, 36], which solves a sequence of convex subproblems obtained by replacing the concave part with its first-order approximation at the current iterate. Recently, [50] proposed a DCA to solve the SAA of chance constrained programs, with subproblems that are easy to implement and amenable to off-the-shelf solvers.
Another popular method for solving DC programs is exact penalization [35, 23, 29], which penalizes the DC constraint in the objective function and solves a sequence of DC programs by DCA with increasing penalty parameters. Compared with the DCA, the exact penalty method does not require an initial feasible solution: it allows constraints to be violated in the early updates, which can sometimes lead to a lower objective value [35].

1.2 Summary of Contributions

In this paper, we study penalty based DC algorithms for solving chance constrained programs. Our contributions can be summarized as follows:

(i) Building on the rank-based DC decomposition of the SAA chance constraint proposed in [50], we first propose a penalty based DC method in the primal space. The method does not require an initial feasible solution and can alleviate the over-conservatism caused by infeasible subproblems. This algorithm improves both solution quality and running time in the polyhedral setting, where both \mathcal{X} and g are polyhedral, but exhibits pronounced oscillations in the nonlinear setting.

(ii) To mitigate the aforementioned oscillations that may arise in the primal space under nonlinear settings, we derive an equivalent lifted formulation with complementarity constraints, inspired by the Toland duality framework for DC programs [48]. This leads to a tractable DC structure over a simple polyhedron and yields a proximal DC algorithm in the lifted space. The subproblem admits finite termination, is amenable to effective warm starts, and exhibits substantially improved numerical stability.
(iii) We establish exact penalty results for both the primal and lifted formulations under mild constraint qualification assumptions, including global exactness, local exactness for the primal formulation, and the equivalence between strongly stationary points of the lifted formulation and stationary points of its penalized counterpart. We further analyze the relationship between lifted and primal local minimizers, identify the possibility of spurious lifted local minimizers, and provide a sufficient condition under which such spurious solutions are ruled out.

(iv) Numerical experiments demonstrate that the proposed methods substantially reduce over-conservatism. Moreover, to the best of our knowledge, the algorithm in the lifted space remains computationally efficient and outperforms state-of-the-art baselines in both solution quality and runtime.

Organization. The remainder of the paper is organized as follows. In Section 2, we develop the algorithmic framework: we first review the DC decomposition of the chance constraint and the exact penalty framework in the primal space, and then derive an equivalent lifted formulation with complementarity constraints and the algorithm in the lifted space. Section 3 establishes the exact penalty and stationarity guarantees for both formulations, including global and local results. Section 4 investigates the relationship between local minimizers of the lifted and primal formulations, characterizes when spurious lifted local minimizers may arise, and provides conditions ensuring that lifted local optimality implies primal local optimality. Section 5 reports numerical experiments on real and synthetic instances and compares against state-of-the-art baselines. In Section 6, we conclude the paper and discuss some future directions.

1.3 Notation and Preliminaries

We first introduce the notation used throughout the paper.
For a positive integer n, we write [n] := \{1, 2, \ldots, n\}. We use \mathbb{R}^n_+ to denote the nonnegative orthant. For a vector x ∈ \mathbb{R}^d, ∥x∥ denotes the Euclidean norm and ⟨·, ·⟩ denotes the standard inner product. We use ∥x∥_1 to denote the ℓ_1-norm and ∥x∥_0 the ℓ_0 pseudo-norm, i.e., the number of nonzero elements. For a scalar t ∈ \mathbb{R}, define [t]_+ := \max\{t, 0\}. For \bar{x} ∈ \mathbb{R}^d and ε > 0, B(\bar{x}, ε) := \{x ∈ \mathbb{R}^d : ∥x − \bar{x}∥ < ε\} denotes the open neighborhood of radius ε. Given a set \mathcal{S} ⊆ \mathbb{R}^d, the indicator function is

    δ_{\mathcal{S}}(x) := 0 if x ∈ \mathcal{S}, and +∞ if x ∉ \mathcal{S},

and the Euclidean projection operator is Π_{\mathcal{S}}(x) ∈ \arg\min_{y ∈ \mathcal{S}} ∥x − y∥. Note that the projection is unique if \mathcal{S} is convex. The distance from a point to a set is dist(x, \mathcal{S}) := \inf\{∥x − y∥ : y ∈ \mathcal{S}\}, and diam(\mathcal{S}) := \sup\{∥x − y∥ : x, y ∈ \mathcal{S}\}. For a closed convex set \mathcal{X}, T_{\mathcal{X}}(\bar{x}) and N_{\mathcal{X}}(\bar{x}) denote the tangent cone and normal cone at \bar{x} ∈ \mathcal{X}. For an extended-real-valued function f : \mathbb{R}^d → \mathbb{R} ∪ \{+∞\}, the directional derivative at \bar{x} along d is

    f′(\bar{x}; d) := \lim_{t ↓ 0} \frac{f(\bar{x} + t d) − f(\bar{x})}{t}.

If f is locally Lipschitz, its Clarke directional derivative is

    f^∘(\bar{x}; d) := \limsup_{x → \bar{x},\, t ↓ 0} \frac{f(x + t d) − f(x)}{t},

and its Clarke subdifferential is

    ∂f(\bar{x}) := \{ v ∈ \mathbb{R}^d : ⟨v, d⟩ ≤ f^∘(\bar{x}; d) \ ∀ d ∈ \mathbb{R}^d \}.

When f is convex, the Clarke subdifferential coincides with the usual convex subdifferential, and f^∘(\bar{x}; d) = f′(\bar{x}; d). For a closed proper convex function f, its Fenchel conjugate is

    f^*(p) := \sup_{x ∈ \mathbb{R}^d} ⟨p, x⟩ − f(x).

For a DC program of the form \min_x \{φ(x) − ψ(x)\} with φ, ψ closed proper convex, its Toland dual [48] is \min_p \{ψ^*(p) − φ^*(p)\}, which is also a DC program. A DC program is called a polyhedral DC program if either φ or ψ is polyhedral, i.e., its epigraph is a polyhedron.
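Toland duality can be verified numerically on a one-dimensional toy DC program (my own example, not from the paper): with φ(x) = x² and ψ(x) = |x|, the conjugates are ψ*(p) = δ_{[−1,1]}(p) and φ*(p) = p²/4, and both the primal min φ − ψ and the dual min ψ* − φ* equal −1/4.

```python
import numpy as np

# Primal DC program: min_x x^2 - |x|, minimized at x = +/- 1/2 with value -1/4.
xs = np.linspace(-2.0, 2.0, 400001)
primal = np.min(xs**2 - np.abs(xs))

# Toland dual: min_p psi*(p) - phi*(p); psi* is finite only on [-1, 1],
# where the objective is -p^2/4, minimized at p = +/- 1 with value -1/4.
ps = np.linspace(-1.0, 1.0, 200001)
dual = np.min(-ps**2 / 4.0)

print(primal, dual)  # both approximately -0.25
```

The grid search is crude but suffices to confirm that the two optimal values coincide, as the duality predicts.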
The polyhedral DC program has some interesting properties regarding local optimality and finite convergence of the DC algorithm; the reader may refer to [47] for a detailed review.

For a vector y ∈ \mathbb{R}^S, let y_{(1)} ≤ ··· ≤ y_{(S)} denote the nondecreasing rearrangement of (y_1, …, y_S) and |y|_{(1)} ≤ ··· ≤ |y|_{(S)} the nondecreasing rearrangement of (|y_1|, …, |y_S|). The sum of the r largest components is defined as T_r(y) = \sum_{i=S−r+1}^{S} y_{(i)}. The Ky Fan norm is defined as ∥y∥_{(r)} := \sum_{i=S−r+1}^{S} |y|_{(i)}.

Next, we list some classical results on exact penalization of constrained optimization problems. To make the paper self-contained, we provide the proofs in Appendix A. The reader may also refer to [10] for a detailed review. Consider the single-constraint problem

    \min_{x ∈ \mathbb{R}^d} \{ f(x) : g(x) ≤ 0, \ x ∈ \mathcal{X} \},    (3)

where f, g : \mathbb{R}^d → \mathbb{R} are locally Lipschitz and \mathcal{X} ⊆ \mathbb{R}^d is convex and compact. Denote the feasible set by \mathcal{F} := \{x ∈ \mathcal{X} : g(x) ≤ 0\}. Let g^∘(\bar{x}; d) be the Clarke directional derivative of g at \bar{x} along d. We say that the generalized Mangasarian–Fromovitz constraint qualification (GMFCQ) holds at a boundary feasible point \bar{x} ∈ \mathcal{F} with g(\bar{x}) = 0 if there exists a direction d ∈ T_{\mathcal{X}}(\bar{x}) such that

    g^∘(\bar{x}; d) < 0.    (4)

Let G(x) := g(x) + \mathbb{R}_+ = \{t + g(x) : t ≥ 0\} for x ∈ \mathcal{X} and G^{−1}(y) := \{x ∈ \mathcal{X} : y ∈ G(x)\} for y ∈ \mathbb{R}. Under the GMFCQ, the feasible region \mathcal{F} can be shown to be metrically regular [10, Definition 2.2] around \bar{x}: there exist κ, ε > 0 such that

    dist(x, G^{−1}(y)) ≤ κ \, dist(y, G(x)),  ∀ x ∈ \mathcal{X} ∩ B(\bar{x}, ε), \ y ∈ B(0, ε).

We state this in the following proposition.

Proposition 1.1. Suppose GMFCQ (4) holds at \bar{x}. Define G(x) := g(x) + \mathbb{R}_+ for x ∈ \mathcal{X}. Then \mathcal{F} is metrically regular at \bar{x}.
Consequently, there exist κ > 0 and ε > 0 such that the local linear error bound

    dist(x, \mathcal{F}) ≤ κ [g(x)]_+,  ∀ x ∈ \mathcal{X} ∩ B(\bar{x}, ε)    (5)

holds. The error bound (5) together with the Lipschitz continuity of f implies local exact penalization. Similar proofs can be found in [33, 13, 39].

Theorem 1.2 (Local exact penalty). Suppose f is Lipschitz continuous on the compact and convex set \mathcal{X} with constant L_f, and that the local error bound (5) holds at some feasible point \bar{x} ∈ \mathcal{F}. For σ > 0, consider the ℓ_1-penalty problem

    \min_{x ∈ \mathcal{X}} f(x) + σ [g(x)]_+.    (6)

Then the following statements hold:

(i) For any σ > 0, if \bar{x} is a local minimizer of the penalty problem (6) and satisfies g(\bar{x}) ≤ 0, then \bar{x} is a local minimizer of the constrained problem (3).

(ii) If \bar{x} is a local minimizer of the constrained problem (3), then there exists \bar{σ} > 0 such that, for every σ > \bar{σ}, \bar{x} is a local minimizer of the penalty problem (6).

When the constraint system admits a global linear error bound, i.e.,

    dist(x, \mathcal{F}) ≤ κ [g(x)]_+,  ∀ x ∈ \mathcal{X},    (7)

the same argument as in Theorem 1.2 yields a global exact penalization result. Since the proof is analogous, we only state the theorem and omit its proof.

Theorem 1.3 (Global exact penalty). Suppose that f is Lipschitz continuous on the compact and convex set \mathcal{X} with constant L_f, and that the global error bound (7) holds. Then the following statements hold:

(i) For any σ > 0, if \bar{x} is a global minimizer of the penalty problem (6) and satisfies g(\bar{x}) ≤ 0, then \bar{x} is a global minimizer of the constrained problem (3).

(ii) If \bar{x} is a global minimizer of the constrained problem (3), then there exists \bar{σ} > 0 such that, for every σ > \bar{σ}, the point \bar{x} is a global minimizer of the penalty problem (6).
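The exact penalty phenomenon can be seen on a one-dimensional instance (my own toy, not from the paper): take f(x) = x, g(x) = 1 − x, and \mathcal{X} = [0, 2], so the constrained minimizer is x* = 1. Here L_f = 1 and the error bound holds with κ = 1, so the penalty becomes exact once σ > 1.

```python
import numpy as np

# min_{x in [0,2]} x + sigma * [1 - x]_+ : the l1 penalization of
# min { x : 1 - x <= 0, x in [0, 2] }, whose minimizer is x* = 1.
xs = np.linspace(0.0, 2.0, 200001)

def penalty_argmin(sigma):
    vals = xs + sigma * np.maximum(1.0 - xs, 0.0)
    return xs[np.argmin(vals)]

print(penalty_argmin(0.5))  # ~0.0: penalty too small, minimizer is infeasible
print(penalty_argmin(2.0))  # ~1.0: sigma above the threshold recovers x* = 1
```

For σ < 1 the penalized objective still decreases as x moves into the infeasible region, while for σ > 1 the kink at x = 1 becomes the global minimizer, matching Theorem 1.3(ii).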
2 Algorithm Frameworks

In this section, we present the overall algorithmic foundations of our penalty based approach for solving the chance constrained program (2). We first review the DC reformulation of the SAA chance constraint in the primal space and discuss why exact penalization in the primal space can be fragile in practice, especially in nonlinear settings. Motivated by these limitations and inspired by the general Toland duality of DC programs [48], we propose an alternative exact penalty approach: we derive an equivalent lifted formulation, which is a mathematical program with complementarity constraints (MPCC), and show that the resulting value-function representation yields a tractable DC structure. This leads to a polyhedral DC program with a finite convergence guarantee that is amenable to effective warm starts.

2.1 Algorithm in the Primal Space

To obtain a tractable reformulation of Problem (2), [50] recently proposed a difference of convex reformulation of Problem (2) along with a difference of convex algorithm (DCA). The algorithm is easy to implement, with subproblems that can be solved directly by state-of-the-art solvers, and is especially efficient in the polyhedral setting. We begin this section by briefly reviewing its key elements. Let m := ⌊αS⌋ and let g_{(1)}(x) ≤ ··· ≤ g_{(S)}(x) denote the nondecreasing rearrangement of the vector (g_1(x), …, g_S(x)). Define

    G_1(x) := \sum_{s=S−m}^{S} g_{(s)}(x),  \qquad  G_2(x) := \sum_{s=S−m+1}^{S} g_{(s)}(x).    (8)

The following equivalence is standard, and we include it for completeness.

Proposition 2.1 ([50, Lemma 1]). For any x ∈ \mathcal{X}, the sample based chance constraint

    \sum_{s=1}^{S} \mathbb{I}\{ g_s(x) ≤ 0 \} ≥ S − m    (9)

holds if and only if G_1(x) − G_2(x) ≤ 0.
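The rank-based equivalence can be checked numerically on random data (a stand-in for the scenario values of an actual CCP instance, not the paper's experiments): since G_1 sums the m + 1 largest values and G_2 the m largest, their difference is exactly the (S − m)-th smallest value g_{(S−m)}(x), which is nonpositive precisely when at least S − m scenarios are satisfied.

```python
import numpy as np

# Sanity check of Proposition 2.1 on random scenario values.
rng = np.random.default_rng(0)
S, m = 20, 3
ok = True
for _ in range(1000):
    g_vals = rng.normal(size=S)        # stand-in for (g_1(x), ..., g_S(x))
    srt = np.sort(g_vals)
    G1 = srt[-(m + 1):].sum()          # sum of the m + 1 largest values
    G2 = srt[-m:].sum()                # sum of the m largest values
    ok &= np.isclose(G1 - G2, srt[S - m - 1])             # = g_(S-m), 0-indexed
    ok &= (G1 - G2 <= 0) == (np.sum(g_vals <= 0) >= S - m)
print(ok)
```

This also explains remark (i) below Proposition 2.2: the constraint G_1 − G_2 ≤ 0 is nothing but the empirical value-at-risk constraint g_{(S−m)}(x) ≤ 0.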
By Proposition 2.1, Problem (2) is equivalent to

    v^* = \min_x \{ f(x) : x ∈ \hat{\mathcal{X}} \},    (10)

where the feasible region \hat{\mathcal{X}} of (10) is defined as

    \hat{\mathcal{X}} := \{ x ∈ \mathcal{X} : φ(x) := G_1(x) − G_2(x) ≤ 0 \}.

We next summarize several properties of the functions G_1 and G_2 established in [50].

Proposition 2.2 ([50, Lemmas 1 and 2]). Under Assumptions 1 and 2, both G_1(x) and G_2(x) defined in (8) are convex and continuous on \mathcal{X}. Moreover, let M_{G_2}(x) denote the active index set of G_2 at x,

    M_{G_2}(x) := \{ (s_1, …, s_m) ⊆ [S] : \sum_{t=1}^{m} g_{s_t}(x) = G_2(x) \}.

Then for any x ∈ \mathcal{X}, the subdifferential of G_2 admits the explicit representation

    ∂G_2(x) = \mathrm{conv} \bigcup_{(s_1,…,s_m) ∈ M_{G_2}(x)} \sum_{t=1}^{m} ∂g_{s_t}(x),

where for all s ∈ [S],

    ∂g_s(x) = \mathrm{conv}\{ ∇h_{i,s}(x) : i ∈ M_s(x) \},

and M_s(x) denotes the active index set of g_s at x,

    M_s(x) := \{ i ∈ [I] : h_{i,s}(x) = g_s(x) \}.

We make the following remarks on Proposition 2.2.

(i) The sample based chance constraint (9) is equivalent to an empirical value-at-risk (VaR) constraint: the requirement that at least S − m scenarios satisfy g_s(x) ≤ 0 holds if and only if g_{(S−m)}(x) ≤ 0, i.e., the empirical (1 − α)-quantile of \{g_s(x)\}_{s=1}^{S} is nonpositive.

(ii) Similarly, both G_1 and G_2 can be viewed as the empirical conditional value-at-risk (CVaR) [43] up to a scaling factor, and each admits an equivalent minimization formulation based on strong duality of linear programs, which will be useful in the following section:

    G_1(x) = \min_{α ∈ \mathbb{R}} \Big\{ (m+1)α + \sum_{s=1}^{S} [g_s(x) − α]_+ \Big\},  \qquad  G_2(x) = \min_{β ∈ \mathbb{R}} \Big\{ mβ + \sum_{s=1}^{S} [g_s(x) − β]_+ \Big\}.

(iii) Under Assumptions 1–2, G_1 and G_2 are convex and continuous on the compact set \mathcal{X}.
Therefore, G_1, G_2 and their difference φ(x) are Lipschitz continuous on \mathcal{X}, and their Clarke subdifferentials are well-defined [17, Section 2.1].

The authors in [50] proposed a difference of convex algorithm (DCA) for solving Problem (10). At the k-th iteration, let x^k ∈ \hat{\mathcal{X}}. We pick a subgradient n^k ∈ ∂G_2(x^k), linearize the concave part in the constraint, and solve the following subproblem as the update:

    x^{k+1} = \arg\min_{x ∈ \mathcal{X}} \{ f(x) : G_1(x) − G_2(x^k) − ⟨n^k, x − x^k⟩ ≤ 0 \}.    (11)

We note that the above DCA scheme has several intrinsic limitations despite its strong numerical performance:

(i) DCA can be viewed as a conservative approximation of Problem (2). The affine majorization of the concave part can sometimes be overly conservative, so the subproblem (11) may become infeasible. A similar issue also arises in other conservative approximation techniques, e.g., the CVaR approximation [43].

(ii) The DCA requires an initial feasible point (i.e., x^0 ∈ \hat{\mathcal{X}}) to linearize the constraint, and finding such a point can be computationally demanding.

(iii) Problem (10) is a polyhedral DC program when the h_{i,s} are affine for all i ∈ [I] and s ∈ [S]. While DCA can be computationally efficient in the polyhedral setting due to its finite convergence guarantee, it often performs poorly with nonlinear constraints due to pronounced oscillations and slow progress (see [50] and the numerical experiments in Section 5 for details).

To address the first two limitations, we propose an exact penalty based approach for the DC program (10), which has been extensively studied in the literature [35, 33] and does not require an initial feasible solution x^0 ∈ \hat{\mathcal{X}}. Let σ > 0 be the penalty parameter, and consider the following penalized problem:

    \min_{x ∈ \mathcal{X}} f(x) + σ [φ(x)]_+.    (12)

It is well-known ([28, Proposition 2.1]) that the ℓ_1 exact penalty of a DC function still admits an explicit DC decomposition. Therefore, Problem (12) can be written in DC form as

    \min_{x ∈ \mathcal{X}} f(x) + σ \max\{G_1(x), G_2(x)\} − σ G_2(x).    (13)

Throughout, let H(x) := \max\{G_1(x), G_2(x)\}. For a fixed σ > 0, we may adopt a proximal DC algorithm to solve Problem (13). As in the DCA, at the k-th iteration we pick a subgradient n^k ∈ ∂G_2(x^k) via Proposition 2.2 and solve the subproblem

    \min_{x ∈ \mathcal{X}} \{ f(x) + σ \max\{G_1(x), G_2(x)\} − σ ⟨n^k, x⟩ \}.    (14)

We remark that the subproblem (14) is feasible under the mild assumption that \mathcal{X} is nonempty. To stabilize the iterates, we can also add a proximal term with ρ > 0. By introducing an auxiliary epigraph variable t ∈ \mathbb{R} and leveraging the dual representation of CVaR, we obtain the equivalent formulation

    \min_{x ∈ \mathcal{X},\, t, η_1, η_2 ∈ \mathbb{R},\, u ∈ \mathbb{R}^S_+,\, v ∈ \mathbb{R}^S_+}  f(x) + σ t − σ ⟨n^k, x⟩ + \frac{ρ}{2} ∥x − x^k∥^2
    s.t.  t ≥ (m+1) η_1 + \sum_{s=1}^{S} u_s,  \quad  u_s ≥ g_s(x) − η_1,  ∀ s ∈ [S],
          t ≥ m η_2 + \sum_{s=1}^{S} v_s,  \quad  v_s ≥ g_s(x) − η_2,  ∀ s ∈ [S].    (15)

We now present the full penalty framework for Problem (13) in Algorithm 1, which is a double-loop algorithm. The inner loop solves the DC program (13) via the update (14) to a critical point, and the outer loop enlarges the penalty parameter σ by a constant factor β > 1. Following [41, Chapter 17], we initialize the algorithm with a small penalty σ_0 and then increase σ progressively, so that the violation of the chance constraint (equivalently, [φ(x)]_+) is gradually driven to zero. Similar to the standard analysis of DC programs [28], the sequence generated by Algorithm 1 subsequentially converges to a critical point of Problem (13).
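Two quick numerical checks of the ingredients used above, on random data rather than an actual instance (my own sketch, not from the paper): (a) the DC decomposition behind (13), [G_1 − G_2]_+ = max\{G_1, G_2\} − G_2, and (b) the minimization form of the top-k sum that underlies the epigraph reformulation (15), with the minimum attained at η equal to the k-th largest value.

```python
import numpy as np

rng = np.random.default_rng(0)
ok = True
for _ in range(200):
    g_vals = rng.normal(size=30)       # stand-in scenario values
    # (b) sum of the k largest g_s  ==  min_eta { k*eta + sum_s [g_s - eta]_+ }.
    for k in (3, 7):
        top_k = np.sort(g_vals)[-k:].sum()
        # the optimum is attained at one of the sample values, so a scan suffices
        cand = [k * eta + np.maximum(g_vals - eta, 0.0).sum() for eta in g_vals]
        ok &= np.isclose(top_k, min(cand))
    # (a) [G1 - G2]_+ == max{G1, G2} - G2, the decomposition used in (13).
    m = 7
    G1 = np.sort(g_vals)[-(m + 1):].sum()
    G2 = np.sort(g_vals)[-m:].sum()
    ok &= np.isclose(max(G1 - G2, 0.0), max(G1, G2) - G2)
print(ok)
```

Identity (a) holds for any reals, which is exactly why the ℓ_1 penalty of a DC constraint stays DC; identity (b) is the LP strong duality fact quoted in remark (ii) of Proposition 2.2.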
We say x^* ∈ \mathcal{X} is a critical point of Problem (13) if there exist n^* ∈ ∂G_2(x^*) and h^* ∈ ∂H(x^*) such that

    0 ∈ ∇f(x^*) + σ h^* − σ n^* + N_{\mathcal{X}}(x^*).

Algorithm 1: Penalty Based Proximal DC Algorithm in the Primal Space
Require: Initial σ_0 > 0, β > 1.
1: for t = 0, 1, 2, … do
2:     for k = 0, 1, 2, … do
3:         Find n^{k,t} ∈ ∂G_2(x^{k,t}) via Proposition 2.2
4:         Solve the subproblem (15) with σ = σ_t
5:     end for
6:     Update σ_{t+1} ← β σ_t
7: end for

We conclude this subsection with some remarks on Algorithm 1.

(i) Algorithm 1 admits effective warm starts. Specifically, the outer penalty loop can reuse the output solution at the previous penalty level as the initialization for the next penalty level, which often accelerates convergence in practice. In addition, when σ is small and constraint enforcement is mild, it is usually unnecessary to solve the fixed-σ proximal DC subproblem to high precision. Instead, one may adopt a loose inner stopping tolerance at early penalty stages and gradually tighten it as σ increases, thereby reducing the overall runtime.

(ii) In our experiments, we set ρ to a small constant (10^{−4} to 10^{−3}) to improve numerical stability and mitigate the oscillations observed in the numerical experiments when ρ = 0.

(iii) The primal-space framework extends directly to the case where the objective itself admits a DC decomposition, namely f(x) = f_1(x) − f_2(x) with f_1, f_2 convex. In this case, the penalized objective can be written as

    f_1(x) + σ H(x) − (f_2(x) + σ G_2(x)),

which is still a DC function on \mathcal{X}. A similar DCA can be applied by solving a sequence of subproblems in which the concave part (f_2(x) + σ G_2(x)) is linearized using its first-order approximation at the current iterate.
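The double-loop structure of Algorithm 1 can be illustrated on a one-dimensional toy instance (my own illustration, not the paper's experiments; all data are synthetic): f(x) = x, g_s(x) = ξ_s − x, and \mathcal{X} = [0, 10], so the SAA-feasible points are those covering at least S − m scenarios and the optimum is the (S − m)-th smallest ξ_s. The convex subproblem (14) is solved here by grid search instead of an LP solver.

```python
import numpy as np

rng = np.random.default_rng(1)
S, alpha = 50, 0.1
m = int(np.floor(alpha * S))
xi = np.sort(rng.uniform(1.0, 5.0, size=S))
x_star = xi[S - m - 1]                      # optimal value of this SAA instance

grid = np.linspace(0.0, 10.0, 4001)         # candidate points in X = [0, 10]
gv_sorted = np.sort(xi[None, :] - grid[:, None], axis=1)   # sorted g_s(x)
G1 = gv_sorted[:, -(m + 1):].sum(axis=1)    # sum of the m + 1 largest g_s(x)
G2 = gv_sorted[:, -m:].sum(axis=1)          # sum of the m largest g_s(x)

x, sigma, beta, rho = 0.0, 0.1, 4.0, 1e-3
for _ in range(20):                         # outer loop: grow the penalty
    for _ in range(10):                     # inner loop: proximal DCA steps
        n_k = -float(m)                     # subgradient of G2: each active g_s' = -1
        obj = (grid + sigma * np.maximum(G1, G2) - sigma * n_k * grid
               + 0.5 * rho * (grid - x) ** 2)
        x = grid[np.argmin(obj)]            # solve subproblem (14) on the grid
    phi = np.interp(x, grid, G1 - G2)       # phi(x) = G1(x) - G2(x)
    if max(phi, 0.0) <= 1e-6:               # [phi(x)]_+ ~ 0: stop growing sigma
        break
    sigma *= beta

print(x, x_star)   # the iterate lands near the (1 - alpha)-quantile x_star
```

Early outer iterations with small σ park the iterate at the infeasible point x = 0; once σ exceeds the exact penalty threshold, the subproblem minimizer jumps to the kink at the empirical (1 − α)-quantile, mirroring the intended behavior of the penalty schedule.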
Similar to the DCA algorithm in [50], Algorithm 1 is particularly efficient when applied to a polyhedral DC program, which enjoys finite convergence under mild assumptions [47, Theorem 6]. A sufficient condition for this property is that \mathcal{X} is a polyhedron and f, h_{i,s} are affine for all i, s. In the numerical experiments (see Section 5), we observe that Algorithm 1 yields a better solution, with a lower objective value, than the DCA algorithm in the polyhedral setting. The running time of the two algorithms is comparable when the objective function is linear, while Algorithm 1 exhibits a substantial speedup over the DCA algorithm when the objective function is quadratic.

In contrast, when the constraints are nonlinear, Algorithm 1 may exhibit pronounced oscillations, similar to the DCA algorithm, and may require substantially more iterations and runtime to solve a single subproblem, degrading its practical performance. Moreover, Algorithm 1 fails to return a feasible solution within a reasonable runtime even when we use a larger initial penalty σ_0 and a more aggressive growth factor β. This phenomenon motivates us to propose an alternative exact penalty approach with better numerical stability in the general setting.

2.2 Equivalent Formulation in the Lifted Space

Motivated by the limitations discussed above, we next consider equivalent formulations of Problem (2), which will serve as the basis for our subsequent algorithmic development. Observe that the chance constraint (9) admits an equivalent reformulation [55] as follows:

    \sum_{s=1}^{S} \mathbb{I}\{ g_s(x) ≤ 0 \} ≥ S − m  \iff  \sum_{s=1}^{S} \mathbb{I}\{ [g_s(x)]_+ > 0 \} ≤ m.
Introducing an auxiliary variable y ∈ \mathbb{R}^S_+, we obtain an equivalent reformulation of Problem (2) with a cardinality constraint:

    v^* = \min_{x ∈ \mathcal{X},\, y ≥ 0} \{ f(x) : ∥y∥_0 ≤ m, \ g_s(x) ≤ y_s, ∀ s ∈ [S] \},

where the cardinality constraint ∥y∥_0 ≤ m enforces that the number of violated scenarios cannot exceed m. It is well-known in the literature that the ℓ_0-norm admits a DC decomposition [18, 23], that is,

    ∥y∥_0 ≤ m  \iff  ∥y∥_1 − ∥y∥_{(m)} = 0,

where ∥y∥_{(m)} denotes the Ky Fan norm of y. In [23], the authors considered a penalty approach based on the DC decomposition of the cardinality constraint. In our setting, since y ≥ 0, the penalized problem can be simplified as follows:

    v^* = \min_{x ∈ \mathcal{X},\, y ≥ 0} \{ f(x) + σ (\mathbf{1}^⊤ y − T_m(y)) : g_s(x) ≤ y_s, ∀ s ∈ [S] \},    (16)

where T_m(y) = \max_u \{ ⟨u, y⟩ : 0 ≤ u ≤ \mathbf{1}, \ \mathbf{1}^⊤ u ≤ m \} denotes the sum of the m largest elements of y. To obtain a simplified reformulation, define the convex polyhedron

    \mathcal{C} := \{ z ∈ \mathbb{R}^S : 0 ≤ z_s ≤ 1 \ ∀ s ∈ [S], \ \mathbf{1}^⊤ z ≥ S − m \}.

We can now obtain an equivalent reformulation of Problem (16) using the support function π_{\mathcal{C}} of the polyhedron \mathcal{C}:

    v^* = \min_{x ∈ \mathcal{X},\, y ≥ 0} \{ f(x) − π_{\mathcal{C}}(−σ y) : g_s(x) ≤ y_s, ∀ s ∈ [S] \},    (17)

where π_{\mathcal{C}}(y) = \max_z \{ z^⊤ y : z ∈ \mathcal{C} \}. A key observation is that Problem (17) is a polyhedral DC program, since the nonsmooth concave part −π_{\mathcal{C}}(−σ y) of the objective is polyhedral. Moreover, its Fenchel conjugate takes a simple form: the indicator function of the same polyhedron composed with a linear mapping. Therefore, a natural idea is to lift this problem into the dual space using the Toland duality [48, 28, 5] for DC programs, which can transfer the nonsmoothness in the objective function to simple polyhedral constraints.
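Both reformulation steps above can be checked numerically on random nonnegative vectors (my own sketch, not from the paper): (a) for y ≥ 0, ∥y∥_1 − T_m(y) = 0 iff y has at most m nonzeros; (b) −π_{\mathcal{C}}(−σy) = σ(\mathbf{1}^⊤ y − T_m(y)), since maximizing −σ z^⊤ y over \mathcal{C} puts z = 0 on the m largest entries of y and z = 1 elsewhere.

```python
import numpy as np

rng = np.random.default_rng(0)
S, m, sigma = 12, 4, 2.5
ok = True
for _ in range(500):
    y = np.maximum(rng.normal(size=S), 0.0)          # nonnegative vector
    T_m = np.sort(y)[-m:].sum()                      # sum of the m largest entries
    gap = y.sum() - T_m                              # ||y||_1 - ||y||_(m) for y >= 0
    # (a) the l0 DC decomposition: gap vanishes iff at most m nonzeros
    ok &= np.isclose(gap, 0.0) == (np.count_nonzero(y) <= m)
    # (b) support-function identity: build the maximizer of -sigma*z'y over C
    z = np.ones(S)
    z[np.argsort(y)[-m:]] = 0.0                      # drop the m largest entries
    pi_val = -sigma * float(z @ y)                   # pi_C(-sigma*y) at this z
    ok &= np.isclose(-pi_val, sigma * gap)
print(ok)
```

Identity (b) is exactly why (16) and (17) have the same objective: the penalty σ(\mathbf{1}^⊤ y − T_m(y)) is the negative of the polyhedral support function term.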
Compared with the primal problem (17), the dual problem takes the form of minimizing a concave function over a polyhedron, for which the standard DC algorithm guarantees finite convergence. This simple structure can potentially improve numerical stability regardless of the nonlinearity of the constraints and objective. We formally state this in the following proposition.

Proposition 2.3. Suppose that Assumptions 1–2 hold. Consider Problem (17) in the unconstrained form

min_{(x,y)∈ℝ^d×ℝ^S} { ζ(x, y) − η(x, y) },

where

ζ(x, y) := f(x) + δ_X(x) + δ_{ℝ^S_+}(y) + δ_{{(x,y) : g_s(x) ≤ y_s, ∀s∈[S]}}(x, y), η(x, y) := π_C(−σy).

Let (p, q) ∈ ℝ^d × ℝ^S denote the dual variables associated with (x, y). Then the Fenchel conjugates of η and ζ satisfy:

(i) η∗(p, q) = δ_{{0}}(p) + inf_{z∈ℝ^S} { δ_C(z) : −σz = q }.

(ii) ζ∗(p, q) = sup_{x∈X, y≥0, g_s(x)≤y_s ∀s∈[S]} { ⟨p, x⟩ + ⟨q, y⟩ − f(x) }.

Consequently, by parameterizing the dual variable of the support function directly by z ∈ C, the Toland dual of Problem (17) can be written as

min_{z∈C} −ζ∗(0, −σz) = min_{z∈C} inf_{x∈X, y≥0, g_s(x)≤y_s ∀s∈[S]} { f(x) + σz⊤y }. (18)

Proof. We derive the two Fenchel conjugates separately. The Toland dual problem (18) can then be derived via straightforward calculation.

(i): Computing η∗. Since η does not depend on x, its conjugate separates as

η∗(p, q) = sup_{x∈ℝ^d} ⟨p, x⟩ + sup_{y∈ℝ^S} { ⟨q, y⟩ − π_C(−σy) }.

The first term equals δ_{{0}}(p). For the second term, write h(u) = π_C(u), so that η(y) = (h ∘ A)(y) with A := −σI. Since h is the support function of C, we have h∗(z) = δ_C(z). By the conjugacy rule for composition with a linear mapping,

(h ∘ A)∗(q) = inf_{z∈ℝ^S} { h∗(z) : A⊤z = q }.
Because A⊤ = −σI, it follows that

sup_{y∈ℝ^S} { ⟨q, y⟩ − π_C(−σy) } = inf_{z∈ℝ^S} { δ_C(z) : −σz = q }.

Therefore,

η∗(p, q) = δ_{{0}}(p) + inf_{z∈ℝ^S} { δ_C(z) : −σz = q },

which proves part (i).

(ii): Computing ζ∗. By definition,

ζ∗(p, q) = sup_{(x,y)∈ℝ^d×ℝ^S} { ⟨p, x⟩ + ⟨q, y⟩ − ζ(x, y) }.

Substituting the definition of ζ yields

ζ∗(p, q) = sup_{x∈X, y≥0, g_s(x)≤y_s ∀s∈[S]} { ⟨p, x⟩ + ⟨q, y⟩ − f(x) },

which proves part (ii).

We provide an alternative approach to deriving the same formulation (18) based on integer programming techniques, as discussed in the literature. Starting from (2), we introduce a binary vector z ∈ {0,1}^S to represent the indicator functions. To obtain a lifted (higher-dimensional) reformulation, we can introduce another nonnegative vector y ∈ ℝ^S_+ to indicate constraint violation instead of using the big-M formulation. A similar approach has been proposed in [30]. To make the paper self-contained, we state the result formally in the following proposition without proof.

Proposition 2.4. [30, proposition 1] Problem (2) can be viewed as the following equivalent formulation:

v∗ = min_{x∈X, y∈ℝ^S_+, z∈[0,1]^S} { f(x) : ∑_{s=1}^{S} z_s ≥ S − m, V(y, z) = 0, g_s(x) ≤ y_s, ∀s ∈ [S] }, (19)

where V(y, z) = ∑_{s=1}^{S} y_s z_s.

We remark that the variable z in formulation (19) can be either continuous or discrete. From this perspective, Problem (19) can be viewed as a special case of a mathematical program with complementarity constraints (MPCC) [34, 45]. In [30], the authors proposed an algorithm named ALSO-X+ to solve Problem (19) using alternating minimization with an iterative procedure. The algorithm can outperform the CVaR approximation [43] under Assumptions 1 and 2, and exhibits applicability in various settings.
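The lifted feasibility in Proposition 2.4 can be illustrated with a small sketch: given the violations ψ_s = [g_s(x)]_+ of a point x with at most m violated scenarios, one can construct (y, z) so that (x, y, z) satisfies all constraints of (19). The helper below and its inputs are hypothetical, used only to illustrate this construction.

```python
def lift(psi, m):
    """Hypothetical helper: given psi_s = [g_s(x)]_+ for a point x with at
    most m violated scenarios, build (y, z) feasible for (19): z_s = 1 on
    the S - m scenarios with the smallest violation (which must be zero),
    y_s covers the remaining violations, and V(y, z) = sum_s y_s z_s = 0."""
    S = len(psi)
    order = sorted(range(S), key=lambda s: psi[s])
    z = [0.0] * S
    for s in order[: S - m]:
        z[s] = 1.0
    y = [0.0 if z[s] == 1.0 else psi[s] for s in range(S)]
    return y, z

# Hypothetical violations: only scenario 2 is violated, so m = 1 suffices.
y, z = lift([0.0, 0.0, 0.3, 0.0], 1)
```

The complementarity V(y, z) = 0 holds by construction because z is supported on scenarios with zero violation.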
However, this algorithm lacks a stationarity guarantee and is inefficient due to the iterative procedure. Motivated by these limitations, we instead study the following penalized formulation of Problem (19), which is equivalent to the Toland dual problem (18):

v∗_σ = min_{x∈X, y∈ℝ^S_+, z∈[0,1]^S} { f(x) + σV(y, z) : ∑_{s=1}^{S} z_s ≥ S − m, g_s(x) ≤ y_s, ∀s ∈ [S] }, (20)

where σ > 0 is the penalty parameter. Note that Problem (20) is the ℓ_1 exact penalty formulation [24] of (19). In particular, the term V(y, z) can be used directly as an exact penalty function without any positive-part truncation, since y_s ≥ 0 and z_s ≥ 0 imply y_s z_s ≥ 0 for all s ∈ [S]. With this equivalence in hand, we take (20) as the starting point of our analysis since the penalty term appears only once in the objective function. In particular, we propose an algorithmic framework that simultaneously addresses the limitations of both Algorithm 1 and ALSO-X+.

2.3 Algorithm in the Lifted Space

In this subsection, we develop an alternative exact penalty framework for solving the lifted MPCC formulation (19) through its penalized counterpart (20). For any fixed penalty parameter σ > 0, the penalized model (20) is still nonconvex because of the bilinear term ∑_{s=1}^{S} y_s z_s appearing in the objective. Nevertheless, from Proposition 2.3, we observe that once the remaining variables (x, y) are minimized out, this nonconvexity induces a concave minimization structure in the dual variable z.

Before proceeding, we introduce several pieces of notation used throughout the remainder of this paper. Using the scenario-wise representation g_s(x) = max_{i∈[I]} h_{s,i}(x), we write the feasible region of Problem (20) as

Ω_0 := { (x, y, z) ∈ X × ℝ^S_+ × C : h_{s,i}(x) ≤ y_s, ∀s ∈ [S], ∀i ∈ [I] }.
The objective function of Problem (20) is denoted by F_σ(x, y, z) = f(x) + σV(y, z). Similarly, the feasible region of Problem (19) is

Ω := { (x, y, z) ∈ Ω_0 : y_s z_s = 0, ∀s ∈ [S] }.

We next isolate the dependence on z by introducing the following value function.

Proposition 2.5. Let Ψ : ℝ^S × ℝ_{++} → ℝ be defined by

Ψ(z, σ) := max_{x∈X, y∈ℝ^S_+} { −f(x) − σ ∑_{s=1}^{S} y_s z_s : g_s(x) ≤ y_s, ∀s ∈ [S] }.

Under Assumptions 1 and 2, the function Ψ(z, σ) is convex in z for every fixed σ > 0.

Proof. For each fixed feasible pair (x, y), −F_σ(x, y, z) is affine in z. By definition, Ψ(·, σ) is the pointwise supremum of −F_σ(x, y, z) over Ω_0. Thus Ψ(·, σ) is convex.

We remark here that the convexity of Ψ in z does not depend on the convexity of the function f(x) or of X, as long as the maximum is attained. Therefore, the following analysis can be naturally extended to the discrete setting, e.g., when X = {0,1}^d.

With this notation, we can rewrite the penalized problem (20) as an optimization problem in z only. Indeed, for any fixed z, the inner minimization over (x, y) in (20) equals −Ψ(z, σ), and hence Problem (20) is equivalent to the following DC program:

min_{z∈C} −Ψ(z, σ). (21)

To implement a proximal DC method for (21), we need a computable subgradient of Ψ(·, σ) to linearize the concave part. This can be obtained by Danskin's theorem, where we can restrict the variable y to a compact set under Assumptions 1 and 2 without loss of generality.

Proposition 2.6. [7, proposition B.25] For any z ∈ C and σ > 0, let (x∗(z, σ), y∗(z, σ)) be an optimal solution to the following subproblem:

(x∗(z, σ), y∗(z, σ)) ∈ arg min_{x∈X, y∈ℝ^S_+} { f(x) + σ ∑_{s=1}^{S} y_s z_s : g_s(x) ≤ y_s, ∀s ∈ [S] }. (22)

Define n(z, σ) := −σ (y∗_1(z, σ), …, y∗_S(z, σ))⊤.
Then n(z, σ) ∈ ∂_z Ψ(z, σ).

We now apply a DC algorithm to solve the DC program (21) with fixed σ > 0. Given the current iterate z^k ∈ C and σ > 0, we first choose a subgradient n^k ∈ ∂_z Ψ(z^k, σ), which can be computed via Proposition 2.6. Linearizing the concave term −Ψ(·, σ) at z^k and adding a proximal regularizer yields the convex subproblem

z^{k+1} = arg min_{z∈C} { −⟨n^k, z⟩ + (ρ/2)∥z − z^k∥²_2 } = Π_C(z^k − (1/ρ)n^k),

where ρ > 0 is a proximal parameter that enhances numerical stability and avoids potential oscillation. Hence, each proximal DC update in the z-space reduces to a projection onto the simple polyhedron C. This projection can be computed efficiently via sorting and scales well for large S. The complete implementation of the penalty method is summarized in Algorithm 2. Similar to Algorithm 1, we start with a relatively small penalty parameter σ_0 and increase it geometrically. The penalty update continues until the violation of the chance constraint becomes sufficiently small and further increases of σ have little effect.

Moreover, since the objective function in (20) is differentiable, the sequence generated by Algorithm 2 converges subsequentially to a stationary point of Problem (20), which can be defined using the standard first-order stationarity condition. Let w := (x, y, z). We say ¯w ∈ Ω_0 is a stationary point of (20) if the following inclusion holds:

0 ∈ ∇F_σ(¯w) + N_{Ω_0}(¯w),

where N_{Ω_0}(¯w) is the normal cone of Ω_0 at ¯w.

We make the following remarks on Algorithm 2:

(i) For fixed σ, the inner loop of Algorithm 2 can be viewed as a special case of the proximal point method for DC programs proposed in [5], obtained by letting the primal proximal step-size be 0.

Algorithm 2 Penalty Based Proximal DC Algorithm in the Lifted Space
Require: Initial z^{0,0} ∈ C, initial penalty σ_0 > 0, growth factor β > 1.
1: for t = 0, 1, 2, … do
2:   for k = 0, 1, 2, … do
3:     Find n^{k,t} ∈ ∂_z Ψ(z^{k,t}, σ_t) via solving subproblem (22)
4:     Update z^{k+1,t} = Π_C(z^{k,t} − (1/ρ)n^{k,t})
5:   end for
6:   Update σ_{t+1} = βσ_t.
7: end for

When ρ = 0, the inner loop for solving (21) converges in finitely many iterations. This is because the inner update reduces to the classical DCA step z^{k+1} ∈ arg min_{z∈C} {−⟨n^k, z⟩}, which is a linear program over the polyhedron C. Hence, an optimal solution can always be chosen at an extreme point of C. The algorithm terminates in a finite number of iterations once the objective function is no longer strictly decreasing, since the number of extreme points of C is finite. This property is structural and does not rely on the specific form of Ψ beyond Assumptions 1 and 2.

(ii) Similar to Algorithm 1, we set ρ to be a small constant (10^{−4}–10^{−3}) to improve numerical stability and mitigate oscillations. A larger ρ reduces the effective step-size 1/ρ, leading to more conservative updates and noticeably longer runtime.

(iii) In [30], the authors proposed a bisection-based approximation of Problem (2) named ALSO-X+ via the lifted formulation (19). Moreover, they found that, when solving the subproblems, the alternating minimization (AM) approach yields better solutions than a different DC approach (see, e.g., [29]). Specifically, their DC method is built on the following quadratic DC decomposition:

y⊤z = (1/4)(∥y + z∥² − ∥y − z∥²).

We remark that Algorithm 2 can also be interpreted as an AM approach, but crucially without relying on the inefficient bisection procedure. In particular, for a fixed penalty parameter σ, Algorithm 2 proceeds by alternately performing two steps: (a) solving a convex subproblem in (x, y) with z fixed; and (b) updating z through a proximal projection step.
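The two z-updates described above admit simple implementations; the sketch below is our illustration under the stated structure of C, not the authors' code. The projection onto C clips to the box when the sum constraint is slack and otherwise bisects on the KKT multiplier of the sum constraint; the ρ = 0 step is the linear program over C, which sorting solves because n^k = −σy∗ makes the objective equivalent to minimizing ⟨y∗, z⟩ with y∗ ≥ 0.

```python
def project_C(v, m, iters=60):
    """Euclidean projection onto C = {z : 0 <= z <= 1, sum(z) >= S - m}.
    If box-clipping already satisfies the sum constraint it is optimal;
    otherwise that constraint is active and we bisect on its multiplier."""
    S = len(v)
    clip = lambda t: min(1.0, max(0.0, t))
    z = [clip(t) for t in v]
    if sum(z) >= S - m:
        return z
    lo, hi = 0.0, 1.0 - min(v)  # at lam = hi every entry clips to 1
    for _ in range(iters):
        lam = 0.5 * (lo + hi)
        if sum(clip(t + lam) for t in v) < S - m:
            lo = lam
        else:
            hi = lam
    return [clip(t + hi) for t in v]

def dca_step(y_star, m):
    """rho = 0 update: argmin_{z in C} sigma * <y_star, z>. Since y_star >= 0,
    an optimal extreme point sets z_s = 1 on the S - m smallest y_star_s."""
    S = len(y_star)
    order = sorted(range(S), key=lambda s: y_star[s])
    z = [0.0] * S
    for s in order[: S - m]:
        z[s] = 1.0
    return z
```

For example, projecting v = (−5, −5, 2) onto C with S = 3, m = 1 activates the sum constraint and returns (1/2, 1/2, 1).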
(iv) The proposed framework admits effective warm starts in both the outer and inner loops.

(a) Warm start for z across outer penalty iterations. In Algorithm 2, we reuse the final iterate z^{∗,t} obtained at penalty level σ_t as the initializer z^{0,t+1} for the next penalty level σ_{t+1}.

(b) Warm start for (x, y) across inner iterations. Within Algorithm 2, the subproblem in Step 2 is solved repeatedly with changing z^k, while the feasible set remains unchanged. Therefore, one can warm-start the solver for the (x, y)-subproblem using the previous solution (x^{k−1,t}, y^{k−1,t}) as an initial point when computing (x^{k,t}, y^{k,t}). This is particularly effective because only the objective coefficients associated with y vary through z, whereas all constraints stay fixed. For instance, we can use the simplex method to solve the subproblems when all the constraints and the objective function are linear.

(c) Early termination for small penalties. For early penalty levels, it is often unnecessary to solve the fixed-σ subproblem to high accuracy, since constraint enforcement is weak when σ is small. In practice, one may terminate Algorithm 2 early for small σ, and gradually tighten the stopping criterion as σ increases.

(v) In Algorithm 2, every (x, y)-subproblem (22) is convex and can be solved efficiently by modern solvers such as Gurobi. Importantly, across iterations the constraints are unchanged and only the linear objective coefficients in y vary through z^{k,t}. This structure makes warm-starting particularly effective in practice. When the constraints admit special forms (e.g., polyhedral constraints or linearly-constrained quadratic programs), the solver can amortize most of the presolve and factorization costs: the expensive preprocessing is essentially performed once, and subsequent solves are accelerated significantly.
3 Exact Penalty Frameworks

In this section, we aim to develop an exact penalty framework for both algorithms proposed in Section 2. To this end, we provide two complementary theoretical guarantees. First, under a constraint qualification, we show that the proposed penalty for both the primal and lifted formulations is globally exact, in the sense that all global minimizers of the original problem can be recovered by solving the penalized problem with a sufficiently large penalty parameter. Second, we focus on local exactness. For the primal formulation, we prove the local exactness of the penalty and provide sufficient conditions to verify local optimality. For the lifted problem, instead of pursuing local exactness in terms of local minimizers, which typically requires strong regularity conditions (e.g., strict complementary slackness), we establish an equivalence between strongly stationary points of the original problem and stationary points of the penalized formulation under an MPCC-tailored constraint qualification. This equivalence implies local optimality in the lifted space and serves as a theoretical guarantee for the developed algorithm.

3.1 Global Exact Penalty

In this section, we establish a global exact penalty relationship for the primal problem (10) and its lifted counterpart (19). The main challenge is that Problem (19) is an MPCC, for which standard constraint qualifications typically fail. To circumvent this difficulty, we adopt a recursive argument that exploits the fact that Problem (19) and the DC formulation (10) are equivalent at the level of global minimizers. Specifically, we first prove a global exact penalty result between Problem (10) and its penalized form (12) under mild and verifiable assumptions. We then transfer this result back to Problems (19) and (20) by leveraging their equivalence.
Since Problem (10) minimizes a convex function over a convex set with an additional DC constraint under Assumptions 1 and 2, we now introduce the Mangasarian–Fromovitz constraint qualification (MFCQ) based on the Clarke subdifferential.

Assumption 3. [10, theorem 2.4], [8, theorem 3.2] For every ¯x ∈ X with ϕ(¯x) = 0, there exists a direction d ∈ T_X(¯x) such that

ϕ°(¯x; d) = sup_{v∈∂ϕ(¯x)} ⟨v, d⟩ < 0.

We remark that Assumption 3 is weaker than [50, Assumption 2], which imposes a generalized MFCQ condition for a DC constraint of the form G_1(x) − G_2(x) ≤ 0. By [17, propositions 2.3.1 and 2.3.3], we have

∂(G_1 − G_2)(x) ⊆ ∂G_1(x) + ∂(−G_2)(x) = ∂G_1(x) − ∂G_2(x).

Therefore, for any d ∈ T_X(x), we have

ϕ°(x; d) = sup_{v∈∂ϕ(x)} ⟨v, d⟩ ≤ sup_{s_{G_1}∈∂G_1(x), s_{G_2}∈∂G_2(x)} ⟨s_{G_1} − s_{G_2}, d⟩ = sup_{s_{G_1}∈∂G_1(x)} ⟨s_{G_1}, d⟩ − inf_{s_{G_2}∈∂G_2(x)} ⟨s_{G_2}, d⟩.

The first equality follows from [17, proposition 2.1.2(b)]. A sufficient condition for the equivalence between the two assumptions is that G_2 is differentiable at x. The reader may refer to [50, remark 2] for additional verifiable sufficient conditions under which both constraint qualifications hold.

We are now able to present the theorem establishing the global exact penalty relationship between the primal Problems (10) and (12).

Theorem 3.1. Suppose that Assumptions 1, 2 and 3 hold. Let ¯x ∈ X be given. Then the following statements are equivalent:

(a) ¯x is a global minimizer of Problem (10);

(b) ϕ(¯x) = 0 and there exists ¯σ > 0 such that ¯x is a global minimizer of Problem (12) for any σ > ¯σ.

Proof. By Theorem 1.3, it suffices to establish the global error bound

dist(x, ˆX) ≤ κ[ϕ(x)]_+, ∀x ∈ X, (23)

for some constant κ > 0, where ˆX := {x ∈ X : ϕ(x) ≤ 0}.
By Proposition 1.1 and Assumption 3, for every boundary point ¯x ∈ X with ϕ(¯x) = 0, there exist constants κ_{¯x} > 0 and ρ_{¯x} > 0 such that

dist(x, ˆX) ≤ κ_{¯x}[ϕ(x)]_+, ∀x ∈ X ∩ B(¯x, ρ_{¯x}). (24)

Let B := X ∩ {x : ϕ(x) = 0}. Since X is compact and ϕ is continuous, B is compact. The open cover {B(¯x, ρ_{¯x})}_{¯x∈B} admits a finite subcover {B(¯x_j, ρ_j)}_{j=1}^{J}. Define

κ_0 := max_{j∈[J]} κ_{¯x_j}, U := X ∩ ⋃_{j=1}^{J} B(¯x_j, ρ_j).

Then (24) implies that the error bound dist(x, ˆX) ≤ κ_0[ϕ(x)]_+ holds for all x ∈ U. Let V := X \ U. Then V is compact and, since B ⊆ U, we have V ∩ {x : ϕ(x) = 0} = ∅. Thus π := min_{x∈V} |ϕ(x)| > 0. For any x ∈ V, either ϕ(x) ≤ 0 with dist(x, ˆX) = 0, or ϕ(x) > 0 with [ϕ(x)]_+ ≥ π. In both cases, using dist(x, ˆX) ≤ diam(X) for x ∈ X, we obtain

dist(x, ˆX) ≤ diam(X) ≤ (diam(X)/π)[ϕ(x)]_+, ∀x ∈ V.

Combining this estimate on V with the error bound on U proves the global bound (23).

Before proceeding, we note that the global exact penalty equivalence can also be established in a widely encountered structured piecewise-linear setting. This structure allows us to derive the same exact penalty result by exploiting polyhedral geometry and the finite combinatorial nature of active pieces, without invoking a generalized MFCQ condition. Such a setting arises in many practical applications, including our numerical experiments. We summarize this result in the following proposition.

Proposition 3.2. Suppose that Assumption 2 holds, X is a polyhedron, and each h_{s,i}(x) is affine in x. Then the following statements are equivalent:

(a) ¯x is a global minimizer of Problem (10);

(b) ϕ(¯x) = 0 and there exists ¯σ > 0 such that ¯x is a global minimizer of Problem (12) for any σ > ¯σ.

Proof.
As in Theorem 1.3, it suffices to show that there exists κ > 0 such that

dist(x, ˆX) ≤ κ[ϕ(x)]_+, ∀x ∈ X. (25)

We divide the proof into two steps.

Step 1. A linear error bound for each system g_s(x) ≤ 0, s ∈ T, on X. For any T ⊆ [S], define J_T := {x ∈ X : g_s(x) ≤ 0 ∀s ∈ T}. Since X is a polyhedron and each h_{s,i}(x) is affine, g_s(x) ≤ 0 is equivalent to the linear system h_{s,i}(x) ≤ 0 for all i ∈ [I]. Fix any linear inequality description A_T x ≤ c_T of J_T. By Hoffman's bound (see, e.g., [25]), there exists γ_T > 0 such that

dist(x, J_T) ≤ γ_T ∥(A_T x − c_T)_+∥, ∀x ∈ ℝ^d.

Since each inequality in A_T x ≤ c_T either comes from X or is of the form h_{s,i}(x) ≤ 0, there exists C_T > 0 such that ∥(A_T x − c_T)_+∥ ≤ C_T max_{s∈T}[g_s(x)]_+ holds for all x ∈ X. Hence, with κ_T := γ_T C_T,

dist(x, J_T) ≤ κ_T max_{s∈T}[g_s(x)]_+, ∀x ∈ X.

Step 2. A global linear error bound for ϕ(x) on X. The chance constraint in (2) is equivalent to ˆX = ⋃_{|T|=S−m} J_T. Given x ∈ X, let T(x) ⊆ [S] be an index set of size S − m attaining the S − m smallest values among {g_s(x)}_{s=1}^{S}. Since dist(x, ⋃_T J_T) = min_T dist(x, J_T) for a finite union of closed sets, we have dist(x, ˆX) ≤ dist(x, J_{T(x)}). By Remark (i) of Proposition 2.2, we have

max_{s∈T(x)}[g_s(x)]_+ = [g_{(S−m)}(x)]_+ = [ϕ(x)]_+.

Therefore,

dist(x, ˆX) ≤ dist(x, J_{T(x)}) ≤ κ_{T(x)}[ϕ(x)]_+.

Let κ := max_{|T|=S−m} κ_T, which is finite since there are finitely many such T. This yields (25).

The exact penalty property established for the DC model (10) naturally suggests a parallel result for the lifted bilinear formulation (19). We finally show that the penalized lifted problem (20) is globally exact: for sufficiently large σ, minimizing (20) recovers the global minimizers of (19).

Theorem 3.3.
Suppose that Assumptions 1, 2 and 3 hold. Let (¯x, ¯y, ¯z) be feasible for Problem (20). Then the following two statements are equivalent:

(a) (¯x, ¯y, ¯z) is a global minimizer of Problem (19);

(b) V(¯y, ¯z) = 0 and there exists ¯σ > 0 such that (¯x, ¯y, ¯z) is a global minimizer of Problem (20) for all σ > ¯σ.

Proof. We divide the proof into three steps.

Step 1: A lower bound linking V and [ϕ]_+. Let (x, y, z) be feasible for Problem (20). Since y_s ≥ 0 and g_s(x) ≤ y_s, we have y_s ≥ [g_s(x)]_+ for all s. Let ψ_s(x) := [g_s(x)]_+ and let ψ_{(1)}(x) ≤ ⋯ ≤ ψ_{(S)}(x) denote the sorted values. Then

V(y, z) = ∑_{s=1}^{S} y_s z_s ≥ ∑_{s=1}^{S} ψ_s(x) z_s.

Let C = {z ∈ ℝ^S : ∑_{s=1}^{S} z_s ≥ S − m, 0 ≤ z_s ≤ 1, ∀s}. The minimum of ∑_{s=1}^{S} ψ_s(x) z_s over z ∈ C is attained by assigning weight 1 to the S − m smallest components. Hence, for every z ∈ C,

∑_{s=1}^{S} ψ_s(x) z_s ≥ ∑_{s=1}^{S−m} ψ_{(s)}(x) ≥ ψ_{(S−m)}(x).

By Remark (i) of Proposition 2.2, ϕ(x) = G_1(x) − G_2(x) = g_{(S−m)}(x), so [ϕ(x)]_+ = [g_{(S−m)}(x)]_+ = ψ_{(S−m)}(x). Therefore,

V(y, z) ≥ [ϕ(x)]_+, ∀(x, y, z) ∈ Ω_0. (26)

Step 2: (a) ⇒ (b). Assume (¯x, ¯y, ¯z) is a global minimizer of (19). Then V(¯y, ¯z) = 0 and ¯x is a global minimizer of (10), since the two problems are equivalent and share the same objective function. By Theorem 3.1, there exists ¯σ > 0 such that, for every σ > ¯σ,

f(¯x) ≤ f(x) + σ[ϕ(x)]_+, ∀x ∈ X. (27)

Now fix any σ > ¯σ and any (x, y, z) feasible for (20). Using (27) and (26), we obtain

F_σ(x, y, z) = f(x) + σV(y, z) ≥ f(x) + σ[ϕ(x)]_+ ≥ f(¯x) = F_σ(¯x, ¯y, ¯z),

which shows that (¯x, ¯y, ¯z) is a global minimizer of (20) for all σ > ¯σ.

Step 3: (b) ⇒ (a).
Assume V(¯y, ¯z) = 0 and (¯x, ¯y, ¯z) is a global minimizer of (20) for some σ > 0. Then (¯x, ¯y, ¯z) ∈ Ω and is feasible for (19). For any (x, y, z) feasible for (19), we have V(y, z) = 0 and thus F_σ(x, y, z) = f(x). Global optimality of (¯x, ¯y, ¯z) for (20) yields

f(¯x) = F_σ(¯x, ¯y, ¯z) ≤ F_σ(x, y, z) = f(x), ∀(x, y, z) ∈ Ω,

so (¯x, ¯y, ¯z) is a global minimizer of (19).

3.2 Local Exact Penalty of the Primal Formulation

In this section, we aim to establish the local exact penalty relationship for the primal formulation (10) and its penalized counterpart (12). The result is identical to Theorem 1.2 under Assumption 3, and thus we omit the proof.

Theorem 3.4. Suppose that Assumptions 1, 2 and 3 hold. Let ¯x ∈ X be given. Then the following statements are equivalent:

(a) ¯x is a local minimizer of Problem (10), i.e., there exists r_0 > 0 such that f(¯x) ≤ f(x) for all x ∈ ˆX ∩ B(¯x, r_0).

(b) ϕ(¯x) ≤ 0 and there exists ¯σ > 0 such that, for every σ > ¯σ, ¯x is a local minimizer of the penalized problem (12), i.e., there exists r_σ > 0 such that

f(¯x) + σ[ϕ(¯x)]_+ ≤ f(x) + σ[ϕ(x)]_+, ∀x ∈ X ∩ B(¯x, r_σ).

We point out that Assumption 3 is not necessary in the structured piecewise-linear setting. Specifically, if X is a polyhedron and each h_{s,i}(x) is affine in x, then one can derive a global linear error bound of the form dist(x, ˆX) ≤ κ[ϕ(x)]_+ for all x ∈ X by the argument in Proposition 3.2. This global error bound immediately implies the above local exact penalty property. Moreover, since Algorithm 1 can only guarantee convergence to a critical point, in the above piecewise-linear setting we can provide an equivalent characterization of local minimizers together with a simple verifiable sufficient condition.

Proposition 3.5.
[18, proposition 6.1.1] Suppose Assumptions 1 and 2 hold. In addition, assume that each h_{s,i}(x) is affine in x. Then ¯x ∈ X is a d-stationary point of Problem (12) if and only if it is a local minimizer of Problem (12).

A sufficient condition for a critical point of a DC program to be d-stationary is that the concave part is differentiable at this point [18, proposition 6.1.10]. In the nonsmooth setting, this is equivalent to its Clarke subdifferential being a singleton. In our structured piecewise-linear setting, this differentiability requirement can be checked directly using the definition of G_2 in (8). G_2 is differentiable at ¯x ∈ X provided that the following two uniqueness conditions hold:

(i) (Unique top-m scenarios) The m largest components of (g_1(¯x), …, g_S(¯x)) are uniquely identified, i.e., g_{(S−m)}(¯x) < g_{(S−m+1)}(¯x), so that the index set of the top-m scenarios S_m(¯x) := {s ∈ [S] : g_s(¯x) ≥ g_{(S−m+1)}(¯x)} satisfies |S_m(¯x)| = m;

(ii) (Unique active piece within each selected scenario) For every s ∈ S_m(¯x), the maximizer in g_s(¯x) = max_{i∈[I]} h_{s,i}(¯x) is unique, i.e., |arg max_{i∈[I]} h_{s,i}(¯x)| = 1.

3.3 Equivalence between (Strongly) Stationary Points of the Lifted Formulation

In this section, we investigate the local connection between the bilinear model, Problem (19), and its smooth penalized counterpart, Problem (20). A well-known subtlety in MPCC is that, even when the penalty parameter is sufficiently large, the set of local minimizers of an MPCC generally does not admit a one-to-one correspondence with that of its penalty counterpart [45, 21, 29]. Nevertheless, any accumulation point of the sequence generated by Algorithm 2 is a stationary point of Problem (20).
Therefore, instead of seeking a direct equivalence between local minimizers, we adopt a more practical perspective by relating stationary points of the smooth penalized problem (20) to a suitable notion of generalized stationarity for Problem (19).

We now introduce the definition of strong stationarity for Problem (19). Let ¯w = (¯x, ¯y, ¯z) ∈ Ω be given. Define the index sets

I_y(¯w) := {s ∈ [S] : ¯y_s = 0, ¯z_s > 0}, I_z(¯w) := {s ∈ [S] : ¯z_s = 0, ¯y_s > 0}, I_0(¯w) := {s ∈ [S] : ¯y_s = ¯z_s = 0}.

Using these index sets, we define the relaxed program at ¯w by fixing the active-side variables to maintain the same complementarity pattern:

min_{(x,y,z)∈Ω_0} { f(x) : y_s = 0, ∀s ∈ I_y(¯w), z_s = 0, ∀s ∈ I_z(¯w) }. (28)

Definition 3.6 (Strong stationarity [29]). Under Assumptions 1 and 2, a point ¯w ∈ Ω is called a strongly stationary point of Problem (19) if it is an optimal solution to the convex relaxed program (28).

Denote the feasible region of Problem (28) by

Ω_re(¯w) := {(x, y, z) ∈ Ω_0 : y_s = 0 ∀s ∈ I_y(¯w), z_s = 0 ∀s ∈ I_z(¯w)}.

Since (28) is a convex program, ¯w solves it if and only if

0 ∈ ∇f(¯x) × {0} × {0} + N_{Ω_re(¯w)}(¯w). (29)

The next theorem shows that every strongly stationary point is a local minimizer of Problem (19), which is known in the MPCC literature.

Theorem 3.7. [29, proposition 1] Let ¯w = (¯x, ¯y, ¯z) ∈ Ω be a strongly stationary point of Problem (19). Then ¯w is a local minimizer of Problem (19).

We next establish the connection between stationary points of the penalized problem (20) and strongly stationary points of (19). For this purpose, we adopt a standard constraint qualification tailored to MPCC models.

Definition 3.8 (MPCC-MFCQ [26, 29]). Let ¯w ∈ Ω be given.
We say that MPCC-MFCQ holds at ¯w if the classical MFCQ holds at ¯w for the convex program (28).

Finally, with this constraint qualification at hand, we are able to establish the equivalence between the stationary points of the two problems.

Theorem 3.9. [29, proposition 6] Let (¯x, ¯y, ¯z) ∈ Ω be given. Suppose that Assumptions 1 and 2 hold.

(a) If (¯x, ¯y, ¯z) ∈ Ω is a stationary point of Problem (20) for some σ > 0 and is feasible for Problem (19), then it is a strongly stationary point of Problem (19).

(b) If (¯x, ¯y, ¯z) ∈ Ω is a strongly stationary point of Problem (19) and the MPCC-MFCQ holds, then it is a stationary point of Problem (20) for all σ sufficiently large.

4 Identifying Spurious Local Minimizers

In the previous section, we established the local exact penalty for the primal formulation and the fact that strong stationarity for the lifted formulation (19) implies local optimality in the lifted space. A natural question is whether such a local minimizer of the lifted formulation (19) also yields a meaningful local optimality guarantee for the primal formulation (10) in the x-space. More precisely, given a local minimizer (¯x, ¯y, ¯z) of (19), it remains unclear whether the projection ¯x is a local minimizer of the DC formulation (10). Our aim in this section is to answer this question. We show that the implication from the x-space to the lifted space always holds: every local minimizer ¯x of (10) can be lifted to a local minimizer of (19). However, the reverse does not hold in general: (19) may admit spurious local minimizers whose x-components are not locally optimal for (10). We then identify a mild regularity condition under which the reverse statement holds. Define ψ_s(x) := [g_s(x)]_+ for all s ∈ [S] and the set-valued mapping

Z(x) = arg min { ∑_{s=1}^{S} ψ_s(x) z_s : z ∈ C }.
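Since the coefficients ψ_s(x) are nonnegative, Z(x) always contains the vertex of C that puts weight 1 on the S − m smallest violations, and its optimal value is the sum of those violations. The sketch below (hypothetical helper names and data) checks this against a brute-force minimum over the 0/1 points of C, which suffices because a linear function with nonnegative coefficients attains its minimum over C at such a vertex.

```python
from itertools import product

def vertex_in_Z(psi, m):
    """A minimizing vertex of sum_s psi_s * z_s over C for psi >= 0:
    weight 1 on the S - m smallest entries of psi, 0 elsewhere."""
    S = len(psi)
    order = sorted(range(S), key=lambda s: psi[s])
    z = [0.0] * S
    for s in order[: S - m]:
        z[s] = 1.0
    return z

def min_over_C(psi, m):
    """Brute-force minimum of sum_s psi_s * z_s over the 0/1 points of C;
    for psi >= 0 this equals the LP minimum over all of C."""
    S = len(psi)
    best = float("inf")
    for z in product([0, 1], repeat=S):
        if sum(z) >= S - m:
            best = min(best, sum(p * zs for p, zs in zip(psi, z)))
    return best

psi = [0.4, 0.0, 1.3, 0.2]  # hypothetical violations psi_s(x), S = 4, m = 2
z = vertex_in_Z(psi, 2)
```

The optimal value here is ψ_(1) + ψ_(2) = 0.2, which also bounds ψ_(S−m) from above as used in Step 1 of the proof of Theorem 3.3.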
Since C is nonempty, compact, and convex and (ψ_1(x), …, ψ_S(x)) is continuous in x, Berge's maximum theorem [6] implies that Z(x) is upper semicontinuous. We first show that every local minimum of the DC formulation (10) yields a local minimum of the lifted formulation (19).

Theorem 4.1. Let ¯x ∈ ˆX be a local minimizer of Problem (10). Then there exist ¯y ∈ ℝ^S_+ and ¯z ∈ C such that (¯x, ¯y, ¯z) ∈ Ω is a local minimizer of Problem (19).

Proof. Since ¯x ∈ ˆX, Proposition 2.1 implies that there exists an index set ¯J ⊆ [S] with |¯J| = S − m such that g_s(¯x) ≤ 0 for all s ∈ ¯J. Define ¯z ∈ C by ¯z_s := 1 if s ∈ ¯J and ¯z_s := 0 otherwise. Define ¯y ∈ ℝ^S_+ by ¯y_s := 0 if s ∈ ¯J and ¯y_s := [g_s(¯x)]_+ otherwise. Then ¯y ≥ 0, g_s(¯x) ≤ ¯y_s for all s, and ∑_{s=1}^{S} ¯y_s ¯z_s = 0; hence (¯x, ¯y, ¯z) ∈ Ω. Let ε > 0 be such that f(x) ≥ f(¯x) for all x ∈ ˆX ∩ B(¯x, ε). Take any (x, y, z) ∈ Ω with ∥(x, y, z) − (¯x, ¯y, ¯z)∥ < ε. Since (x, y, z) ∈ Ω, the constraints 0 ≤ z ≤ 1, ∑_{s=1}^{S} z_s ≥ S − m, and y_s z_s = 0 imply that at least S − m indices satisfy z_s > 0, and for each such index g_s(x) ≤ y_s = 0; thus x ∈ ˆX. Since ∥x − ¯x∥ < ε, we conclude f(x) ≥ f(¯x). As Problems (10) and (19) share the same objective function, (¯x, ¯y, ¯z) is a local minimizer of Problem (19).

However, the converse of Theorem 4.1 does not hold without additional assumptions. This can be demonstrated using the following one-dimensional example.

Example 4.2. Consider the one-dimensional setting with S = 2, m = 1, X = [−1, +∞). Let f(x) = x and the sample-based constraints g_1(x) = [x]_+ and g_2(x) = [−x]_+. Since g_s(x) ≥ 0, we have

G_1(x) − G_2(x) = min{[x]_+, [−x]_+} = 0, ∀x ∈ X.

Therefore, the DC formulation (10) reduces to min{x : x ≥ −1}, whose unique global minimizer is x = −1.
We now show that $x = 0$ can nevertheless appear as the $x$-component of a local minimizer of the lifted formulation, which can be written as
\[
\min_{x,\, y \ge 0,\, z \in C} \{ x : g_1(x) \le y_1,\ g_2(x) \le y_2,\ y_1 z_1 + y_2 z_2 = 0 \}.
\]
Take $\bar x = 0$, $\bar y = (0, 0)$, and $\bar z = (1, 1)$. Then $(\bar x, \bar y, \bar z) \in \Omega$ since $g_1(0) = g_2(0) = 0$, $\sum_{s=1}^2 \bar z_s = 2 \ge S - m = 1$, and $\sum_{s=1}^2 \bar y_s \bar z_s = 0$. Moreover, there exists $r \in (0, 1/4)$ such that for any feasible $(x, y, z) \in \Omega$ with $\|(x, y, z) - (\bar x, \bar y, \bar z)\| < r$, we have $z_1 > 1/2$ and $z_2 > 1/2$, which together with $y \ge 0$ and $\sum_{s=1}^2 y_s z_s = 0$ implies $y_1 = y_2 = 0$. Hence feasibility forces $g_1(x) \le 0$ and $g_2(x) \le 0$, i.e., $[x]_+ = [-x]_+ = 0$, so necessarily $x = 0$. Consequently, every feasible point in a sufficiently small neighborhood of $(\bar x, \bar y, \bar z)$ has the same objective value $f(x) = 0$, showing that $(\bar x, \bar y, \bar z)$ is a local minimizer of (19) while $\bar x = 0$ is not a local minimizer of (10).

In the above example, $C = \{(z_1, z_2) \in [0,1]^2 : z_1 + z_2 \ge S - m = 1\}$ and one can verify that
\[
Z(x) = \begin{cases} \{(1, 0)\}, & x < 0, \\ C, & x = 0, \\ \{(0, 1)\}, & x > 0. \end{cases}
\]
The pathology in Example 4.2 arises because, at $\bar x = 0$, the set-valued mapping $Z(x)$ fails to be lower semicontinuous. In particular, for any sequence $x_n \downarrow 0$ we have $Z(x_n) = \{(0, 1)\}$; similarly, for any sequence $x_n \uparrow 0$ we have $Z(x_n) = \{(1, 0)\}$, and thus no choice $z_n \in Z(x_n)$ can converge to $(1, 1)$. This is precisely what allows $(x, y, z) = (0, (0, 0), (1, 1))$ to be a local minimizer of (19) even though $0$ is not a local minimizer of the DC formulation (10). We next introduce an assumption that eliminates such spurious local minimizers.

Theorem 4.3. Let $(\bar x, \bar y, \bar z) \in \Omega$ be a local minimizer of Problem (19). Suppose that $Z(x)$ is lower semicontinuous at $\bar x$.
Then $\bar x$ is a local minimizer of Problem (10).

Proof. Define $\tilde y \in \mathbb{R}^S$ by $\tilde y_s := \psi_s(\bar x)$ for all $s \in [S]$. We first show that $(\bar x, \tilde y, \bar z) \in \Omega$ and that it is also a local minimizer of Problem (19). Since $(\bar x, \bar y, \bar z) \in \Omega$, we have $g_s(\bar x) \le \bar y_s$, $\bar y_s \ge 0$, and $\bar y_s \bar z_s = 0$ for all $s \in [S]$. Hence $\tilde y_s = [g_s(\bar x)]_+ \le \bar y_s$, so $g_s(\bar x) \le \tilde y_s$ and $\tilde y_s \ge 0$ for all $s$. Since $0 \le \tilde y \le \bar y$, $\bar z \ge 0$, and $\bar y^\top \bar z = 0$, we get $\tilde y^\top \bar z = 0$. Thus $(\bar x, \tilde y, \bar z) \in \Omega$.

Suppose, to the contrary, that $(\bar x, \tilde y, \bar z)$ is not a local minimizer of Problem (19). Then there exists a sequence $\{(x^k, y^k, z^k)\} \subset \Omega$ such that $(x^k, y^k, z^k) \to (\bar x, \tilde y, \bar z)$ and $f(x^k) < f(\bar x)$ for all $k$. Partition $[S]$ into
\[
I_+ := \{s : \bar z_s > 0\}, \quad I_{0,+}^{>} := \{s : \bar z_s = 0,\ \bar y_s > \psi_s(\bar x)\}, \quad I_{0,+}^{=} := \{s : \bar z_s = 0,\ \bar y_s = \psi_s(\bar x) > 0\}, \quad I_{0,0} := \{s : \bar z_s = 0,\ \bar y_s = 0\}.
\]
For each $k$, define $\hat y^k \in \mathbb{R}^S$ by
\[
\hat y^k_s := \begin{cases} 0, & s \in I_+, \\ \bar y_s, & s \in I_{0,+}^{>}, \\ [g_s(x^k)]_+, & s \in I_{0,+}^{=} \cup I_{0,0}. \end{cases}
\]
Then $(x^k, \hat y^k, \bar z) \in \Omega$ for all sufficiently large $k$. Indeed, if $s \in I_+$, then $\bar z_s > 0$, so $z^k_s > 0$ for all sufficiently large $k$; since $(x^k, y^k, z^k) \in \Omega$, $y^k_s \ge 0$, $z^k_s \ge 0$, and $\sum_{t=1}^S y^k_t z^k_t = 0$, we have $y^k_s z^k_s = 0$ and therefore $y^k_s = 0$. Thus $g_s(x^k) \le y^k_s = 0 = \hat y^k_s$. If $s \in I_{0,+}^{>}$, then $\bar y_s > \psi_s(\bar x) = [g_s(\bar x)]_+ \ge g_s(\bar x)$, so by continuity of $g_s$, $g_s(x^k) < \bar y_s = \hat y^k_s$ for all sufficiently large $k$. If $s \in I_{0,+}^{=} \cup I_{0,0}$, then by definition $\hat y^k_s = [g_s(x^k)]_+ \ge g_s(x^k)$ and $\hat y^k_s \ge 0$. In all cases, since $\bar z_s = 0$ for $s \notin I_+$, we have $\hat y^k_s \bar z_s = 0$ for every $s$. Moreover, $\hat y^k \to \bar y$: for $s \in I_+$, both are $0$ (note that $\bar z_s > 0$ and $\bar y_s \bar z_s = 0$ imply $\bar y_s = 0$); for $s \in I_{0,+}^{>}$, $\hat y^k_s = \bar y_s$; and for $s \in I_{0,+}^{=} \cup I_{0,0}$, $\hat y^k_s = [g_s(x^k)]_+ \to [g_s(\bar x)]_+ = \psi_s(\bar x) = \bar y_s$ (for $s \in I_{0,0}$, $\psi_s(\bar x) \le \bar y_s = 0$, so $\psi_s(\bar x) = \bar y_s$ as well).
Thus $(x^k, \hat y^k, \bar z) \to (\bar x, \bar y, \bar z)$. Since the objective of Problem (19) depends only on $x$, we have $f(x^k) = f(x^k, \hat y^k, \bar z) < f(\bar x)$ for all $k$, contradicting the local minimality of $(\bar x, \bar y, \bar z)$. Therefore, $(\bar x, \tilde y, \bar z)$ is a local minimizer of Problem (19).

Now let $r > 0$ be such that
\[
f(x) \ge f(\bar x) \quad \forall (x, y, z) \in \Omega \text{ with } \|(x, y, z) - (\bar x, \tilde y, \bar z)\| < r.
\]
Since $(\bar x, \tilde y, \bar z) \in \Omega$, we have $\bar z \in C$ and $\sum_{s=1}^S \tilde y_s \bar z_s = 0$. As $\tilde y_s = \psi_s(\bar x) \ge 0$ for all $s$, it follows that $\sum_{s=1}^S \psi_s(\bar x) \bar z_s = 0$, so $\bar z$ attains the minimum of
\[
\min\Big\{ \sum_{s=1}^S \psi_s(\bar x) z_s : z \in C \Big\}.
\]
Hence $\bar z \in Z(\bar x)$. Fix $\varepsilon := r/3$. Since $Z$ is lower semicontinuous at $\bar x$ and $\bar z \in Z(\bar x)$, there exists $\rho_1 > 0$ such that for every $x \in \mathcal X$ with $\|x - \bar x\| < \rho_1$, one can choose $z(x) \in Z(x)$ satisfying $\|z(x) - \bar z\| < \varepsilon$. Since $\psi$ is continuous, there exists $\rho_2 > 0$ such that $\|\psi(x) - \psi(\bar x)\| < \varepsilon$ whenever $\|x - \bar x\| < \rho_2$. Set $\rho := \min\{\rho_1, \rho_2, \varepsilon\}$ and take any $x \in \hat{\mathcal X} \cap B(\bar x, \rho)$. Let $y(x) := \psi(x)$ and choose $z(x) \in Z(x)$ as above. Since $x \in \hat{\mathcal X}$, the optimal value of $\min\{\sum_{s=1}^S \psi_s(x) z_s : z \in C\}$ is $0$, so $\sum_{s=1}^S y_s(x) z_s(x) = 0$. Also, $g_s(x) \le [g_s(x)]_+ = y_s(x)$ and $y_s(x) \ge 0$ for all $s$, hence $(x, y(x), z(x)) \in \Omega$. Furthermore,
\[
\|(x, y(x), z(x)) - (\bar x, \tilde y, \bar z)\| \le \|x - \bar x\| + \|y(x) - \tilde y\| + \|z(x) - \bar z\| < \varepsilon + \varepsilon + \varepsilon = r.
\]
Therefore, by the local minimality of $(\bar x, \tilde y, \bar z)$,
\[
f(x) = f(x, y(x), z(x)) \ge f(\bar x, \tilde y, \bar z) = f(\bar x).
\]
Since $x \in \hat{\mathcal X} \cap B(\bar x, \rho)$ was arbitrary, $\bar x$ is a local minimizer of Problem (10).

A simple sufficient condition for the lower semicontinuity of $Z(x)$ at $\bar x$ is that $Z(\bar x)$ is a singleton, which has been proved in [9].
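The failure of lower semicontinuity in Example 4.2 can be observed directly: the linear program defining $Z(x)$ attains its minimum on the vertex set $\{(1,0), (0,1), (1,1)\}$ of $C$, so comparing objective values at these three vertices exposes the jump at $x = 0$. A small sketch (the probe points $\pm 0.01$ are arbitrary):

```python
VERTICES = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]  # vertices of C for S = 2, m = 1

def argmin_vertices(x):
    """Vertices of C attaining the minimum of psi_1(x) z_1 + psi_2(x) z_2,
    with psi_1(x) = [x]_+ and psi_2(x) = [-x]_+ as in Example 4.2."""
    psi = (max(x, 0.0), max(-x, 0.0))
    vals = [psi[0] * z[0] + psi[1] * z[1] for z in VERTICES]
    best = min(vals)
    return [z for z, v in zip(VERTICES, vals) if v <= best + 1e-12]

print(argmin_vertices(-0.01))  # [(1.0, 0.0)]: Z(x) = {(1, 0)} for x < 0
print(argmin_vertices(0.01))   # [(0.0, 1.0)]: Z(x) = {(0, 1)} for x > 0
print(argmin_vertices(0.0))    # all three vertices: Z(0) = C
```

No selection $z(x_n) \in Z(x_n)$ along $x_n \to 0$ can approach $(1, 1)$, which is exactly the lower semicontinuity failure that Theorem 4.3 rules out.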
In our setting, a standard sufficient condition ensuring that $Z(\bar x)$ is a singleton is that the threshold exhibits a strict gap, namely
\[
\psi_{(S-m)}(\bar x) = 0 < \psi_{(S-m+1)}(\bar x),
\]
where $\psi_{(1)}(\bar x) \le \cdots \le \psi_{(S)}(\bar x)$ denote the components of $\{\psi_s(\bar x)\}_{s=1}^S$ sorted in nondecreasing order. This implies that the set of the $S - m$ smallest components of $\{\psi_s(\bar x)\}_{s=1}^S$ is uniquely determined, and thus $Z(\bar x)$ is a singleton. Finally, we emphasize that the lower semicontinuity of $Z$ is only a sufficient condition: spurious local minima may occur at isolated lifted points, but not every isolated lifted local minimizer is spurious in the sense of having an $x$-component that fails to be locally optimal for the DC formulation.

Example 4.4. Let us revisit Example 4.2 but replace the objective by $f(x) = x^2$. Since $\phi(x) \equiv 0$ on $\mathcal X$, the DC formulation (10) becomes $\min\{x^2 : x \ge -1\}$, so $x = 0$ is a global, hence local, minimizer. On the other hand, the lifted point $(\bar x, \bar y, \bar z) = (0, (0, 0), (1, 1))$ remains feasible for (19) and is again an isolated local minimizer in the lifted space by the same neighborhood argument as in Example 4.2. In this case, however, the lifted local minimizer is not spurious: its $x$-component $\bar x = 0$ is also a local minimizer of the primal formulation, even though $Z(\cdot)$ is still not lower semicontinuous at $0$.

5 Numerical Experiments

In this section, we conduct experiments to test the performance of the algorithms proposed in Section 2 on both real and synthetic datasets. For ease of reference, we denote our proposed Algorithm 1 by PenDC-P(rimal) and Algorithm 2 by PenDC-L(ifted). We compare our algorithms with state-of-the-art methods, including the CVaR approximation [43, 40], the mixed-integer formulation (30) (MIP) proposed in [2], a DC-based algorithm (DCA) from [50], and the bisection-based approximation algorithm ALSO-X+ proposed in [30].
For the MIP formulation, we directly solve the following mixed-integer program:
\[
v^* = \min_{x \in \mathcal X,\, z \in \{0,1\}^S} \Big\{ f(x) : \sum_{s=1}^S z_s \ge S - m,\ g_s(x) \le (1 - z_s) M_s,\ \forall s \in [S] \Big\}. \tag{30}
\]
In particular, we use Gurobi (v12.0.3) to solve all linear, quadratic, and mixed-integer (sub)problems. We set a time limit of 600 seconds for all MIP runs with the default optimality gap tolerance of 0.01%. For instances that cannot be solved to optimality within the time limit, we use "gap" to denote the averaged optimality gap, $\mathrm{gap}(\%) = |UB - LB| / |LB| \times 100$, over the instances. We use "fval" to denote the averaged returned objective values of the instances, "time" to denote the averaged CPU time (in seconds), and "prob" to denote the average satisfaction probability of the chance constraint. We use "/" to indicate that the algorithm fails to provide a valid feasible solution for at least one of the instances. For each setting, we generate five random instances and report the average performance over these instances. The row "solved" reports the number of instances, among those five, for which a feasible solution is returned.

For the outer loop of the proposed methods, we terminate the algorithm when the sample-based chance constraint is satisfied. For the inner loop of the proposed methods and all tested methods except ALSO-X+, we terminate the algorithm when $|f^k - f^{k+1}| / \max\{1, |f^{k+1}|\} \le 10^{-6}$ for $k = 0, 1, 2, \dots$. For PenDC-P and PenDC-L, the initial points are chosen randomly. For the first two outer iterations of the two penalty methods, we terminate the inner loop after 1 and 2 iterations, respectively, as a warm start. For ALSO-X+, we terminate the algorithm when the difference between the upper and lower bounds of the objective value satisfies $|t_U - t_L| \le 10^{-6}$.
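On toy instances, the semantics of formulation (30) can be sanity-checked by enumerating $z \in \{0,1\}^S$ directly: for every $z$ with $\sum_s z_s \ge S - m$, minimize $f$ subject to the constraints kept active by $z$, and take the best value. The one-dimensional instance below ($\min x$ subject to $x \ge \xi_s$ for at least $S - m$ samples) is a hypothetical illustration, not one of our test problems:

```python
from itertools import product

# Toy CCP: min x  s.t.  x >= xi_s for at least S - m of the samples.
# Here g_s(x) = xi_s - x, so keeping sample s active (z_s = 1) forces x >= xi_s.
xi = [1.0, 2.0, 3.0, 4.0, 5.0]
S, m = len(xi), 1  # at most one sample may be violated

best = float("inf")
for z in product([0, 1], repeat=S):
    if sum(z) < S - m:
        continue  # violates the cardinality constraint sum_s z_s >= S - m
    # With z fixed, the optimal x is the largest active sample value.
    x = max(v for v, zs in zip(xi, z) if zs == 1)
    best = min(best, x)

print(best)  # 4.0: the largest sample may be discarded
```

A MIP solver applied to (30) with big-M constants $M_s$ large enough (here any $M_s \ge \xi_s$ with $x \ge 0$) reproduces this value; in our experiments Gurobi plays that role.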
All numerical experiments in this section are implemented in Python 3.9.21 and executed on a Linux server equipped with 256 GB RAM and a 96-core AMD EPYC 7402 CPU running at 2.8 GHz.

5.1 A VaR-constrained Portfolio Optimization Problem

In this section, we consider a Value-at-Risk (VaR) constrained mean-variance portfolio selection model studied in [4, 50]. Let $\mu \in \mathbb{R}^n$ and $\Sigma \in \mathbb{R}^{n \times n}$ denote the estimated mean vector and covariance matrix of returns for $n$ risky assets, and let $\gamma > 0$ be a risk-aversion parameter. Let $x \in \mathbb{R}^n_+$ denote the portfolio weight vector. The problem can be formulated as follows:
\[
\min_{x \in \mathbb{R}^n} \Big\{ \gamma x^\top \Sigma x - \mu^\top x : \Pr\{\tilde\xi^\top x \ge R\} \ge 1 - \alpha,\ \sum_{i=1}^n x_i = 1,\ 0 \le x_i \le u_i,\ i \in [n] \Big\}.
\]
The problem can be interpreted as follows: the objective minimizes a quadratic variance penalty minus the expected return, subject to a chance constraint requiring that the realized portfolio return $\xi^\top x$ exceed a prespecified target $R$ with probability at least $1 - \alpha$. In addition, we impose the budget constraint $\sum_{i=1}^n x_i = 1$ and box constraints $0 \le x_i \le u_i$ for each $i \in [n]$ to avoid overconcentration.

To construct test instances from real data, we use a dataset of 2523 daily returns of 435 stocks in the S&P 500 index over March 2006 to March 2016, which can be downloaded from https://github.com/INFORMSJoC/2024.0648. Specifically, we consider four problem sizes $n \in \{100, 200, 300, 400\}$ and set the sample size to $S = 3n$. For each pair $(n, S)$, we run the algorithms on five independent instances and report the average performance metrics over these instances. For the PenDC-L method, we set $\sigma_0 = 5 \cdot 10^{-3}$, $\beta = 4.0$, and $\rho = 10^{-4}$. For the PenDC-P method, we set $\sigma_0 = 3 \cdot 10^{-3}$, $\beta = 1.5$, and $\rho = 0$.

Table 1: Comparisons of the portfolio optimization problem.

| $(\alpha, S)$ | metric | MIP | CVaR | PenDC-L | PenDC-P | DCA | ALSO-X+ |
|---|---|---|---|---|---|---|---|
| (0.05, 300) | fval | -0.013550 | -0.011785 | -0.013398 | -0.013053 | -0.012125 | -0.013097 |
| | time(gap) | 207.1164 (0.0%) | 0.1093 | 0.1109 | 0.1936 | 0.4343 | 3.3849 |
| | prob | 0.9500 | 1.0000 | 0.9500 | 0.9507 | 0.9653 | 0.9500 |
| | solved | 5/5 | 5/5 | 5/5 | 5/5 | 5/5 | 5/5 |
| (0.05, 600) | fval | -0.013508 | -0.011790 | -0.013379 | -0.012992 | -0.012527 | -0.013208 |
| | time(gap) | 600.0085 (4.8%) | 0.6746 | 0.3961 | 0.9973 | 5.4026 | 18.6961 |
| | prob | 0.9500 | 1.0000 | 0.9507 | 0.9500 | 0.9620 | 0.9507 |
| | solved | 5/5 | 5/5 | 5/5 | 5/5 | 5/5 | 5/5 |
| (0.05, 900) | fval | -0.013428 | -0.011697 | -0.013382 | -0.012817 | -0.012850 | -0.013144 |
| | time(gap) | 600.0144 (8.7%) | 1.9169 | 0.9956 | 2.5666 | 33.2572 | 58.1248 |
| | prob | 0.9500 | 1.0000 | 0.9504 | 0.9500 | 0.9182 | 0.9504 |
| | solved | 5/5 | 5/5 | 5/5 | 5/5 | 5/5 | 5/5 |
| (0.05, 1200) | fval | -0.013557 | -0.011825 | -0.013597 | -0.013247 | -0.013090 | -0.013273 |
| | time(gap) | 600.0488 (10.6%) | 4.7453 | 2.2373 | 7.1035 | 83.0278 | 155.2660 |
| | prob | 0.9504 | 1.0000 | 0.9507 | 0.9505 | 0.9248 | 0.9503 |
| | solved | 5/5 | 5/5 | 5/5 | 5/5 | 5/5 | 5/5 |
| (0.10, 300) | fval | -0.014429 | -0.011785 | -0.014281 | -0.014138 | -0.013522 | -0.014108 |
| | time(gap) | 36.5684 (0.0%) | 0.1085 | 0.1002 | 0.1781 | 1.1241 | 3.3367 |
| | prob | 0.9000 | 1.0000 | 0.9060 | 0.9007 | 0.9020 | 0.9007 |
| | solved | 5/5 | 5/5 | 5/5 | 5/5 | 5/5 | 5/5 |
| (0.10, 600) | fval | -0.014213 | -0.011790 | -0.014151 | -0.013986 | -0.013718 | -0.014073 |
| | time(gap) | 600.0077 (1.4%) | 0.6771 | 0.3166 | 0.8218 | 11.4020 | 17.5378 |
| | prob | 0.9003 | 1.0000 | 0.9057 | 0.9000 | 0.8953 | 0.9000 |
| | solved | 5/5 | 5/5 | 5/5 | 5/5 | 5/5 | 5/5 |
| (0.10, 900) | fval | -0.014347 | -0.011697 | -0.014272 | -0.013964 | -0.013831 | -0.014196 |
| | time(gap) | 600.0097 (3.1%) | 1.9063 | 0.8954 | 2.3002 | 43.9097 | 54.7558 |
| | prob | 0.8998 | 1.0000 | 0.9058 | 0.9000 | 0.9042 | 0.9007 |
| | solved | 5/5 | 5/5 | 5/5 | 5/5 | 5/5 | 5/5 |
| (0.10, 1200) | fval | -0.014558 | -0.011825 | -0.014455 | -0.014448 | -0.014253 | -0.014453 |
| | time(gap) | 600.0161 (3.6%) | 4.7528 | 1.9307 | 6.4855 | 154.4464 | 141.1504 |
| | prob | 0.9015 | 1.0000 | 0.9107 | 0.9000 | 0.8973 | 0.9013 |
| | solved | 5/5 | 5/5 | 5/5 | 5/5 | 5/5 | 5/5 |

The results in Table 1 demonstrate that PenDC-L performs competitively in terms of solution quality and computational efficiency on the test problems.
Across all tested combinations of $(\alpha, S)$, PenDC-L produces objective values that are nearly identical to the MIP benchmark whenever the MIP is solved to optimality, while requiring orders of magnitude less runtime and remaining stable as $n$ increases. In contrast, the CVaR approximation is fast but systematically more conservative, as reflected by its empirical satisfaction probabilities being close to 1 in all instances; this conservatism translates directly into a noticeable objective gap. Compared with the DCA, PenDC-P improves on its performance by producing feasible solutions with lower objective values in shorter running times. The ALSO-X+ method returns a feasible solution with the lowest objective value among the remaining methods; however, this comes at the cost of a substantially longer runtime.

5.2 A Probabilistic Resource Planning Problem

We next consider a chance constrained linear resource planning problem, which has been studied in [38, 4]. The model concerns shipping goods from $n$ suppliers to $m$ customers at minimum total transportation cost, where customer demands are uncertain. Specifically, the demand of customer $j \in [m]$ is modeled by a random variable $\xi_j$, while each supplier $i \in [n]$ is subject to a deterministic capacity limit $\theta_i > 0$. Let $c_{ij} \ge 0$ denote the unit shipping cost from supplier $i$ to customer $j$, and let $x_{ij} \ge 0$ be the amount shipped along arc $(i, j)$. The shipment plan is decided in advance of the demand realization, and feasibility is enforced in a probabilistic sense by requiring that all customer demands be met simultaneously with probability at least $1 - \alpha$. The resulting formulation is
\[
\min_{x \in \mathbb{R}^{n \times m}} \Big\{ \sum_{i=1}^n \sum_{j=1}^m c_{ij} x_{ij} : \Pr\Big\{ \sum_{i=1}^n x_{ij} \ge \xi_j,\ \forall j \in [m] \Big\} \ge 1 - \alpha,\ \sum_{j=1}^m x_{ij} \le \theta_i,\ \forall i \in [n],\ x_{ij} \ge 0,\ \forall i \in [n],\ j \in [m] \Big\}.
\]
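In the SAA form handled by our methods, each demand sample $\xi^s$ induces a single scenario constraint $g_s(x) = \max_{j \in [m]} (\xi_j^s - \sum_i x_{ij})$, and the empirical chance constraint requires $g_s(x) \le 0$ for at least $(1 - \alpha)S$ scenarios. A NumPy sketch of this feasibility check (the shipment plan and demand samples below are random placeholders, not the benchmark data):

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, S, alpha = 3, 4, 200, 0.1

x = rng.uniform(0.0, 2.0, size=(n, m))   # placeholder shipment plan x_ij
xi = rng.uniform(0.0, 5.0, size=(S, m))  # demand samples xi^s

supply = x.sum(axis=0)                   # sum_i x_ij for each customer j
g = (xi - supply).max(axis=1)            # g_s(x) = max_j (xi_j^s - sum_i x_ij)
emp_prob = np.mean(g <= 0.0)             # empirical probability all demands are met

print(emp_prob >= 1.0 - alpha)           # does x satisfy the sample-based constraint?
```

This scalar $g_s$ per scenario is what the rank-based DC decomposition and the lifted variables $(y_s, z_s)$ act on.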
We use the public dataset provided at http://homepages.cae.wisc.edu/~luedtkej/. Throughout, we let $(n, m) \in \{(40, 100), (40, 200)\}$ and consider three sample sizes $S \in \{1000, 2000, 3000\}$. For both the PenDC-L and PenDC-P methods, we set $\sigma_0 = 5$, $\beta = 4.5$, and $\rho = 10^{-3}$.

Table 2: Comparisons of the probabilistic resource planning problem.

| $(\alpha, m, S)$ | metric | MIP | CVaR | PenDC-L | PenDC-P | DCA | ALSO-X+ |
|---|---|---|---|---|---|---|---|
| (0.05, 100, 1000) | fval* | 4.1309 | / | 4.1422 | 4.2554 | 4.2560 | 4.1754 |
| | time(gap) | 143.8471 (0.0%) | / | 19.2711 | 25.2524 | 22.7656 | 67.8204 |
| | prob | 0.9500 | / | 0.9502 | 0.9500 | 0.9500 | 0.9500 |
| | solved | 5/5 | 4/5 | 5/5 | 5/5 | 5/5 | 5/5 |
| (0.05, 100, 2000) | fval* | 4.4168 | 4.6644 | 4.4264 | 4.6088 | 4.6083 | 4.4578 |
| | time(gap) | 429.6553 (0.0%) | 1.8925 | 54.8647 | 59.5165 | 46.3589 | 189.1209 |
| | prob | 0.9500 | 1.0000 | 0.9500 | 0.9500 | 0.9500 | 0.9503 |
| | solved | 5/5 | 5/5 | 5/5 | 5/5 | 5/5 | 5/5 |
| (0.05, 200, 2000) | fval* | 8.5613 | / | 8.3748 | 8.6736 | / | 8.4288 |
| | time(gap) | 600.2509 (6.4%) | / | 204.4876 | 457.3059 | / | 645.4984 |
| | prob | 0.9638 | / | 0.9500 | 0.9278 | / | 0.9501 |
| | solved | 5/5 | 4/5 | 5/5 | 5/5 | 4/5 | 5/5 |
| (0.05, 200, 3000) | fval* | 8.8224 | / | 8.5996 | 8.9440 | 8.9440 | 8.6534 |
| | time(gap) | 600.3817 (12.6%) | / | 334.5961 | 220.8843 | 165.6295 | 1302.5201 |
| | prob | 0.9654 | / | 0.9500 | 0.9500 | 0.9500 | 0.9500 |
| | solved | 5/5 | 4/5 | 5/5 | 5/5 | 5/5 | 5/5 |
| (0.10, 100, 1000) | fval* | 4.0540 | / | 4.0663 | 4.2306 | 4.2306 | 4.0847 |
| | time(gap) | 406.5311 (0.0%) | / | 19.3168 | 25.8939 | 25.8532 | 81.7118 |
| | prob | 0.9000 | / | 0.9000 | 0.9000 | 0.9000 | 0.9000 |
| | solved | 5/5 | 4/5 | 5/5 | 5/5 | 5/5 | 5/5 |
| (0.10, 100, 2000) | fval* | 4.3221 | 4.6644 | 4.3318 | 4.5787 | 4.5800 | 4.3581 |
| | time(gap) | 600.3751 (0.6%) | 1.8589 | 52.9770 | 62.6593 | 50.3822 | 235.6824 |
| | prob | 0.9004 | 1.0000 | 0.9002 | 0.9000 | 0.9000 | 0.9000 |
| | solved | 5/5 | 5/5 | 5/5 | 5/5 | 5/5 | 5/5 |
| (0.10, 200, 2000) | fval* | 8.4548 | / | 8.2194 | / | / | 8.2649 |
| | time(gap) | 600.2532 (12.4%) | / | 220.6448 | / | / | 844.3599 |
| | prob | 0.9079 | / | 0.9002 | / | / | 0.9001 |
| | solved | 5/5 | 4/5 | 5/5 | 4/5 | 4/5 | 5/5 |
| (0.10, 200, 3000) | fval* | 8.7236 | / | 8.4273 | 8.8973 | 8.8974 | 8.4733 |
| | time(gap) | 602.9494 (18.0%) | / | 352.0631 | 235.6632 | 191.6438 | 1388.0507 |
| | prob | 0.9171 | / | 0.9001 | 0.9000 | 0.9000 | 0.9001 |
| | solved | 5/5 | 4/5 | 5/5 | 5/5 | 5/5 | 5/5 |

*The magnitude of fval is $10^7$.

Table 2 highlights several distinctive computational features of the compared methods on the resource planning problem. Among the iterative approaches, DCA is often the fastest whenever it succeeds, because DCA has a finite termination guarantee in the polyhedral setting and requires fewer iterations than the penalty based methods; in this numerical setting, the DCA method usually converges within only 3-5 iterations. However, Table 2 also shows that both DCA and the CVaR approximation fail to provide a valid solution on some instances, which is consistent with the fact that both reformulations may become overly conservative and therefore infeasible. In comparison, the PenDC-P method can sometimes alleviate the infeasibility caused by overly conservative subproblems, and often returns a lower objective value than the DCA method at a slightly longer running time, although its objective value is still larger than that of ALSO-X+. Finally, PenDC-L outperforms all the other methods in terms of objective value, returning the lowest objective even compared with ALSO-X+; moreover, its running time remains comparable to that of the DCA thanks to an efficient warm-start strategy.

5.3 Linear Objective with Joint Quadratic Chance Constraint

We further test our algorithms on a problem with a linear objective and a joint nonlinear (convex) chance constraint, which has been used as a benchmark in [27, 50]. The decision variable is $x \in \mathbb{R}^d_+$ and the objective is to minimize the negative sum of its components.
The feasibility requirement is imposed through a joint quadratic chance constraint:
\[
\min_{x \in \mathbb{R}^d_+} \Big\{ -\sum_{i=1}^d x_i : \Pr\Big\{ \sum_{i=1}^d \xi_{ij}^2 x_i^2 \le \theta,\ \forall j \in [m] \Big\} \ge 1 - \alpha \Big\},
\]
where the random coefficients $\{\xi_{ij}\}_{i \in [d], j \in [m]}$ follow a dependent Gaussian structure. In particular, for each $j$, the vector $(\xi_{1j}, \dots, \xi_{dj})$ is multivariate normal with $\mathbb{E}[\xi_{ij}] = j/d$ and $\mathrm{Var}(\xi_{ij}) = 1$, and the within-$j$ correlations satisfy $\mathrm{Cov}(\xi_{ij}, \xi_{i'j}) = 0.5$ for $i \ne i'$. Across different indices $j \ne j'$, the corresponding random variables are independent, i.e., $\mathrm{Cov}(\xi_{ij}, \xi_{i'j'}) = 0$.

In the numerical study, we fix $d = 20$, $m = 20$, and $\theta = 100$, and we vary the sample size $S \in \{500, 1000, 2000\}$. For the PenDC-P method, we let $\sigma_0 = 4 \cdot 10^{-3}$, $\beta = 15$, and $\rho = 0$. For the PenDC-L method, we let $\sigma_0 = 8 \cdot 10^{-5}$, $\beta = 10$, and $\rho = 10^{-3}$. For the first two outer iterations of the two penalty methods, we terminate the inner loop after 1 and 2 iterations, respectively, as a warm start. We also set a time limit of 600 seconds for the DCA method.

From Table 3, we can see that both the DCA and PenDC-P methods exhibit long running times due to severe oscillation in the updates. Moreover, the PenDC-P method fails to provide a feasible solution on most of the higher-dimensional instances, and the DCA sometimes violates the empirical chance constraint, which reflects the limitations of both methods in such settings. In contrast, PenDC-L maintains stable running times while achieving the lowest objective values in most settings among the iterative methods. Compared with ALSO-X+, the PenDC-L method returns a lower objective value with considerably shorter running time, highlighting its scalability.

Table 3: Comparisons of the norm optimization problem.

| $(\alpha, S)$ | metric | MIP | CVaR | PenDC-L | PenDC-P | DCA | ALSO-X+ |
|---|---|---|---|---|---|---|---|
| (0.05, 500) | fval | -16.5403 | -14.9226 | -16.5263 | -15.4856 | / | -16.5106 |
| | time(gap) | 545.4053 (2.2%) | 8.6787 | 43.0250 | 607.0177 | / | 220.0617 |
| | prob | 0.9500 | 0.9792 | 0.9500 | 0.9500 | / | 0.9500 |
| | solved | 5/5 | 5/5 | 5/5 | 5/5 | 4/5 | 5/5 |
| (0.05, 1000) | fval | -16.3881 | -15.0186 | -16.3725 | / | / | -16.3646 |
| | time(gap) | 600.1338 (24.6%) | 20.4617 | 93.8265 | / | / | 559.3773 |
| | prob | 0.9500 | 0.9820 | 0.9500 | / | / | 0.9500 |
| | solved | 5/5 | 5/5 | 5/5 | 0/5 | 4/5 | 5/5 |
| (0.05, 2000) | fval | -16.0655 | -14.8409 | -16.1215 | / | -16.0585 | -16.1146 |
| | time(gap) | 600.7272 (50.1%) | 48.9682 | 252.3422 | / | 626.9383 | 1289.8408 |
| | prob | 0.9506 | 0.9808 | 0.9502 | / | 0.9500 | 0.9501 |
| | solved | 5/5 | 5/5 | 5/5 | 0/5 | 5/5 | 5/5 |
| (0.10, 500) | fval | -17.4619 | -15.7353 | -17.4138 | -15.6822 | -17.3876 | -17.4291 |
| | time(gap) | 600.0095 (22.9%) | 7.6620 | 42.2942 | 621.9502 | 590.9415 | 236.4814 |
| | prob | 0.9000 | 0.9620 | 0.9000 | 0.8992 | 0.9000 | 0.9000 |
| | solved | 5/5 | 5/5 | 5/5 | 5/5 | 5/5 | 5/5 |
| (0.10, 1000) | fval | -17.3869 | -15.7922 | -17.3578 | / | -17.2949 | -17.3490 |
| | time(gap) | 601.3015 (55.3%) | 18.5244 | 94.8007 | / | 614.2368 | 611.8454 |
| | prob | 0.9000 | 0.9638 | 0.9006 | / | 0.9000 | 0.9004 |
| | solved | 5/5 | 5/5 | 5/5 | 0/5 | 5/5 | 5/5 |
| (0.10, 2000) | fval | -17.1944 | -15.6308 | -17.1653 | / | -17.1227 | -17.1643 |
| | time(gap) | 608.1971 (63.3%) | 50.8092 | 258.9725 | / | 634.5459 | 1453.3697 |
| | prob | 0.9001 | 0.9636 | 0.9003 | / | 0.9000 | 0.9002 |
| | solved | 5/5 | 5/5 | 5/5 | 0/5 | 5/5 | 5/5 |

6 Conclusion

In this paper, we developed penalty based DC algorithms for the SAA approximation of chance constrained programs in a convex setting. Using a rank-based DC representation of the empirical chance constraint, we proposed a primal-space method that avoids feasible initialization and a numerically more stable lifted formulation with a finite termination guarantee. We established exact penalty and stationarity guarantees for both formulations under mild constraint qualification assumptions. Numerical studies demonstrated the efficiency of the proposed methods.
Future work includes extending the methods to distributionally robust chance constrained programs (DRCCPs) under Wasserstein ambiguity sets [52, 14] and to chance constrained programs with linear conic inequality constraints [49].

References

[1] Shabbir Ahmed and Dimitri J. Papageorgiou. Probabilistic set covering with correlations. Operations Research, 61(2):438-452, 2013.
[2] Shabbir Ahmed and Alexander Shapiro. Solving chance-constrained stochastic programs via sampling and integer programming. In State-of-the-Art Decision-Making Tools in the Information-Intensive Age, pages 261-269. INFORMS, 2008.
[3] Shabbir Ahmed and Weijun Xie. Relaxations and approximations of chance constraints under finite distributions. Mathematical Programming, 170(1):43-65, 2018.
[4] Xiaodi Bai, Jie Sun, and Xiaojin Zheng. An augmented Lagrangian decomposition method for chance-constrained optimization problems. INFORMS Journal on Computing, 33(3):1056-1069, 2021.
[5] Sebastian Banert and Radu Ioan Boţ. A general double-proximal gradient algorithm for DC programming. Mathematical Programming, 178(1):301-326, 2019.
[6] Claude Berge. Topological Spaces: Including a Treatment of Multi-Valued Functions, Vector Spaces and Convexity. Oliver & Boyd, Edinburgh, 1963. Translated by E. M. Patterson. See Ch. 6, §3 (Maximum Theorem), p. 116.
[7] Dimitri P. Bertsekas. Nonlinear programming. Journal of the Operational Research Society, 48(3):334-334, 1997.
[8] Jonathan M. Borwein. Stability and regular points of inequality systems. Journal of Optimization Theory and Applications, 48(1):9-52, 1986.
[9] Oleg P. Burdakov, Christian Kanzow, and Alexandra Schwartz. Mathematical programs with cardinality constraints: reformulation by complementarity-type conditions and a regularization method. SIAM Journal on Optimization, 26(1):397-425, 2016.
[10] James V. Burke.
An exact penalization viewpoint of constrained optimization. SIAM Journal on Control and Optimization, 29(4):968-998, 1991.
[11] Abraham Charnes and William W. Cooper. Deterministic equivalents for optimizing and satisficing under chance constraints. Operations Research, 11(1):18-39, 1963.
[12] Abraham Charnes, William W. Cooper, and Gifford H. Symonds. Cost horizons and certainty equivalents: an approach to stochastic programming of heating oil. Management Science, 4(3):235-263, 1958.
[13] Pengyu Chen, Xu Shi, Rujun Jiang, and Jiulin Wang. Penalty-based methods for simple bilevel optimization under Hölderian error bounds. Advances in Neural Information Processing Systems, 37:140731-140765, 2024.
[14] Zhi Chen, Daniel Kuhn, and Wolfram Wiesemann. Data-driven chance constrained programs over Wasserstein balls. Operations Research, 72(1):410-424, 2024.
[15] Sin-Shuen Cheung, Anthony Man-Cho So, and Kuncheng Wang. Linear matrix inequalities with stochastically dependent perturbations and applications to chance-constrained semidefinite optimization. SIAM Journal on Optimization, 22(4):1394-1430, 2012.
[16] Jehum Cho and Anthony Papavasiliou. Exact mixed-integer programming approach for chance-constrained multi-area reserve sizing. IEEE Transactions on Power Systems, 39(2):3310-3323, 2023.
[17] Frank H. Clarke. Optimization and Nonsmooth Analysis. SIAM, 1990.
[18] Ying Cui and Jong-Shi Pang. Modern Nonconvex Nondifferentiable Optimization. SIAM, 2021.
[19] Yan Deng, Huiwen Jia, Shabbir Ahmed, Jon Lee, and Siqian Shen. Scenario grouping and decomposition algorithms for chance-constrained programs. INFORMS Journal on Computing, 33(2):757-773, 2021.
[20] Thai Dinh, Ricardo Fukasawa, and James Luedtke. Exact algorithms for the chance-constrained vehicle routing problem. Mathematical Programming, 172(1):105-138, 2018.
[21] Haw-ren Fang, Sven Leyffer, and Todd Munson.
A pivoting algorithm for linear programming with linear complementarity constraints. Optimization Methods and Software, 27(1):89-114, 2012.
[22] Shubhechyya Ghosal and Wolfram Wiesemann. The distributionally robust chance-constrained vehicle routing problem. Operations Research, 68(3):716-732, 2020.
[23] Jun-ya Gotoh, Akiko Takeda, and Katsuya Tono. DC formulations and algorithms for sparse optimization problems. Mathematical Programming, 169(1):141-176, 2018.
[24] S.-P. Han and Olvi L. Mangasarian. Exact penalty functions in nonlinear programming. Mathematical Programming, 17(1):251-269, 1979.
[25] Alan J. Hoffman. On approximate solutions of systems of linear inequalities. In Selected Papers of Alan J. Hoffman: With Commentary, pages 174-176. World Scientific, 2003.
[26] Tim Hoheisel, Christian Kanzow, and Alexandra Schwartz. Theoretical and numerical comparison of relaxation methods for mathematical programs with complementarity constraints. Mathematical Programming, 137(1):257-288, 2013.
[27] L. Jeff Hong, Yi Yang, and Liwei Zhang. Sequential convex approximations to joint chance constrained programs: A Monte Carlo approach. Operations Research, 59(3):617-630, 2011.
[28] Reiner Horst and Nguyen V. Thoai. DC programming: overview. Journal of Optimization Theory and Applications, 103(1):1-43, 1999.
[29] Francisco Jara-Moroni, Jong-Shi Pang, and Andreas Wächter. A study of the difference-of-convex approach for solving linear programs with complementarity constraints. Mathematical Programming, 169(1):221-254, 2018.
[30] Nan Jiang and Weijun Xie. ALSO-X and ALSO-X+: Better convex approximations for chance constrained programs. Operations Research, 70(6):3581-3600, 2022.
[31] Nan Jiang and Weijun Xie. ALSO-X#: Better convex approximations for distributionally robust chance constrained programs. Mathematical Programming, 213:575-638, 2025.
[32] Simge Küçükyavuz and Ruiwei Jiang. Chance-constrained optimization under limited distributional information: A review of reformulations based on sampling and distributional robustness. EURO Journal on Computational Optimization, 10:100030, 2022.
[33] Hoai An Le Thi, Tao Pham Dinh, and Huynh Van Ngai. Exact penalty and error bounds in DC programming. Journal of Global Optimization, 52(3):509-535, 2012.
[34] Sven Leyffer. Mathematical programs with complementarity constraints. SIAG/OPT Views-and-News, 14(1):15-18, 2003.
[35] Thomas Lipp and Stephen Boyd. Variations and extension of the convex-concave procedure. Optimization and Engineering, 17(2):263-287, 2016.
[36] Zhaosong Lu. Sequential convex programming methods for a class of structured nonlinear programming. arXiv preprint arXiv:1210.3039, 2012.
[37] James Luedtke and Shabbir Ahmed. A sample approximation approach for optimization with probabilistic constraints. SIAM Journal on Optimization, 19(2):674-699, 2008.
[38] James Luedtke, Shabbir Ahmed, and George L. Nemhauser. An integer programming approach for linear programs with probabilistic constraints. Mathematical Programming, 122(2):247-272, 2010.
[39] Zhi-Quan Luo, Jong-Shi Pang, Daniel Ralph, and Shi-Quan Wu. Exact penalization and stationarity conditions of mathematical programs with equilibrium constraints. Mathematical Programming, 75(1):19-76, 1996.
[40] Arkadi Nemirovski and Alexander Shapiro. Convex approximations of chance constrained programs. SIAM Journal on Optimization, 17(4):969-996, 2007.
[41] Jorge Nocedal and Stephen J. Wright. Numerical Optimization. Springer, 2006.
[42] Álvaro Porras, Line Roald, Juan Miguel Morales, and Salvador Pineda. Unifying chance-constrained and robust optimal power flow for resilient network operations. IEEE Transactions on Control of Network Systems, 12(1):1052-1061, 2025.
[43] R. Tyrrell Rockafellar and Stanislav Uryasev.
Conditional value-at-risk for general loss distributions. Journal of Banking & Finance, 26(7):1443-1471, 2002.
[44] Andrzej Ruszczyński. Probabilistic programming with discrete distributions and precedence constrained knapsack polyhedra. Mathematical Programming, 93(2):195-215, 2002.
[45] Holger Scheel and Stefan Scholtes. Mathematical programs with complementarity constraints: Stationarity, optimality, and sensitivity. Mathematics of Operations Research, 25(1):1-22, 2000.
[46] Qianhao Sun, Yao Zhang, Hanting Zhao, Wei Huo, Jian Liao, and Jianxue Wang. Network-side carbon emission reduction via dispatching power electronic devices in AC-DC hybrid distribution systems. IEEE Transactions on Automation Science and Engineering, 2025.
[47] Pham Dinh Tao and L. T. Hoai An. Convex analysis approach to DC programming: theory, algorithms and applications. Acta Mathematica Vietnamica, 22(1):289-355, 1997.
[48] John F. Toland. Duality in nonconvex optimization. Journal of Mathematical Analysis and Applications, 66(2):399-415, 1978.
[49] Wim van Ackooij, Pedro Pérez-Aros, Claudia Soto, and Emilio Vilches. Inner Moreau envelope of nonsmooth conic chance-constrained optimization problems. Mathematics of Operations Research, 49(3):1419-1451, 2024.
[50] Peng Wang, Rujun Jiang, Qingyuan Kong, and Laura Balzano. A proximal difference-of-convex algorithm for sample average approximation of chance constrained programming. INFORMS Journal on Computing, 38(1):315-339, 2026.
[51] Yilin Wen, Yi Guo, Zechun Hu, and Gabriela Hug. Stochastic modeling for the aggregated flexibility of distributed energy resources. Electric Power Systems Research, 234:110628, 2024.
[52] Weijun Xie. On distributionally robust chance constrained programs with Wasserstein distance. Mathematical Programming, 186(1):115-155, 2021.
[53] Weijun Xie and Shabbir Ahmed.
Bicriteria approximation of chance-constrained co vering problems. Op er ations R ese ar ch , 68(2):516–533, 2020. [54] Liang Xu, Chao Zhang, Zhou Xu, and Daniel Zh uoyu Long. A nonparametric robust opti- mization approac h for chance-constrained knapsac k problem. SIAM Journal on Optimization , 35(2):739–766, 2025. [55] Shenglong Zhou, Lili P an, Naih ua Xiu, and Geoffrey Y e Li. A 0/1 constrained optimization solving sample a verage approximation for chance constrained programming. Mathematics of Op er ations R ese ar ch , 50(4):2688–2716, 2025. A Omitted Pro ofs in Section 1.3 Pro of of Prop osition 1.1 Pr o of. W e start by recalling the standard representation of the Clark e directional deriv ativ e (see, e.g., [ 17 , Prop osition 2.1.2]): g ◦ ( ¯ x ; d ) = max v ∈ ∂ g ( ¯ x ) ⟨ v , d ⟩ . (31) Step 1: GMFCQ = ⇒ 0 / ∈ ∂ g ( ¯ x ) + N X ( ¯ x ) . Assume GMF CQ holds at ¯ x , i.e., there exists d ∈ T X ( ¯ x ) such that g ◦ ( ¯ x ; d ) < 0. By ( 31 ), this implies ⟨ v , d ⟩ < 0 for all v ∈ ∂ g ( ¯ x ). If 0 ∈ ∂ g ( ¯ x ) + N X ( ¯ x ), then there exist v ∈ ∂ g ( ¯ x ) and z ∈ N X ( ¯ x ) with v + z = 0. Since d ∈ T X ( ¯ x ) and z ∈ N X ( ¯ x ), w e ha ve ⟨ z , d ⟩ ≤ 0 and hence ⟨ v , d ⟩ = −⟨ z , d ⟩ ≥ 0, a con tradiction. Therefore 0 / ∈ ∂ g ( ¯ x ) + N X ( ¯ x ). Step 2: 0 / ∈ ∂ g ( ¯ x ) + N X ( ¯ x ) = ⇒ metric regularit y . Define k er [ ∂ g ( ¯ x ) , I d ] := ( λ, z ) ∈ R × R d : 0 ∈ λ ∂ g ( ¯ x ) + z . Since g ( ¯ x ) = 0, we hav e N R − ( g ( ¯ x )) = N R − (0) = R + . A direct verification shows that 0 / ∈ ∂ g ( ¯ x ) + N X ( ¯ x ) ⇐ ⇒ k er [ ∂ g ( ¯ x ) , I d ] ∩ N R − ( g ( ¯ x )) × N X ( ¯ x ) = { (0 , 0 ) } , whic h is the regularit y condition [ 10 , Eq. (2.8)] (see also [ 8 , Eq. (33)]). Applying [ 10 , Thm. 2.4] (see also [ 8 , Thm. 3.2]) to the set-v alued mapping G ( x ) = g ( x ) + R + , we conclude that F = G − 1 (0) is metrically regular at ¯ x . 
Finally, metric regularity at $(\bar{x}, 0)$ yields constants $\kappa > 0$ and $\varepsilon > 0$ such that
$$\operatorname{dist}\big(x, G^{-1}(0)\big) \le \kappa \operatorname{dist}\big(0, G(x)\big) = \kappa\, [g(x)]_+, \quad \forall x \in X \cap B(\bar{x}, \varepsilon),$$
which is exactly (5). □

Proof of Theorem 1.2

Proof. (i) Let $\sigma > 0$ and assume that $\bar{x}$ is a local minimizer of (6) with $g(\bar{x}) \le 0$. Then $[g(\bar{x})]_+ = 0$. Take $r > 0$ such that
$$f(\bar{x}) + \sigma [g(\bar{x})]_+ \le f(x) + \sigma [g(x)]_+, \quad \forall x \in X \cap B(\bar{x}, r).$$
For any $x \in F \cap B(\bar{x}, r)$, we have $[g(x)]_+ = 0$ and thus $f(\bar{x}) \le f(x)$. Hence $\bar{x}$ is a local minimizer of (3).

(ii) Assume that $\bar{x}$ is a local minimizer of (3). Let $\kappa, \varepsilon$ be the constants in the error bound (5). Fix any $\sigma > L_f \kappa$. We show that $\bar{x}$ locally minimizes (6). Take any $x \in X \cap B(\bar{x}, \varepsilon)$ and choose $\tilde{x} \in F$ such that $\|x - \tilde{x}\| = \operatorname{dist}(x, F)$. By Lipschitzness of $f$ and the error bound,
$$f(\tilde{x}) \le f(x) + L_f \|\tilde{x} - x\| = f(x) + L_f \operatorname{dist}(x, F) \le f(x) + L_f \kappa\, [g(x)]_+.$$
Moreover, since $\bar{x}$ is a local minimizer of (3) and $\|\tilde{x} - \bar{x}\| \le \|\tilde{x} - x\| + \|x - \bar{x}\| \le 2\|x - \bar{x}\|$, we have $f(\bar{x}) \le f(\tilde{x})$ for all $x$ sufficiently close to $\bar{x}$. Therefore, for such $x$,
$$f(\bar{x}) \le f(x) + L_f \kappa\, [g(x)]_+ \le f(x) + \sigma [g(x)]_+.$$
Since $[g(\bar{x})]_+ = 0$, this shows that $\bar{x}$ is a local minimizer of (6). Taking $\bar{\sigma} := L_f \kappa$ completes the proof. □
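Remark. The exact-penalty behavior in Theorem 1.2 can be checked numerically on a toy instance. The sketch below is not from the paper: it uses a hypothetical one-dimensional problem with $f(x) = x$, $X = [-1, 1]$, and $g(x) = -x$, so that $F = [0, 1]$, the constrained minimizer is $\bar{x} = 0$, $L_f = 1$, and the error bound (5) holds with $\kappa = 1$ since $\operatorname{dist}(x, F) = [g(x)]_+$ on $X$. The theorem then predicts that any penalty parameter $\sigma > L_f \kappa = 1$ recovers the constrained minimizer, while a weaker penalty may not.

```python
import numpy as np

# Hypothetical 1-D instance (illustration only, not from the paper):
# f(x) = x, X = [-1, 1], g(x) = -x, so F = {x in X : g(x) <= 0} = [0, 1].
# Constrained minimum: f(0) = 0.  Here L_f = 1 and kappa = 1, because
# dist(x, F) = max(-x, 0) = [g(x)]_+ for every x in X.
f = lambda x: x
g = lambda x: -x
penalty = lambda x, sigma: f(x) + sigma * np.maximum(g(x), 0.0)

X = np.linspace(-1.0, 1.0, 200001)  # fine grid over X = [-1, 1]

# sigma = 2 > L_f * kappa = 1: penalized and constrained minimizers coincide.
x_pen = X[np.argmin(penalty(X, sigma=2.0))]
print(x_pen)   # approximately 0 (the constrained minimizer, feasible)

# sigma = 0.5 < 1: the penalty is too weak and the minimizer leaves F.
x_weak = X[np.argmin(penalty(X, sigma=0.5))]
print(x_weak)  # -1.0 (infeasible: g(-1) = 1 > 0)
```

For $x < 0$ the penalized objective is $x + \sigma(-x) = (1 - \sigma)x$, which is minimized at the feasible point $0$ exactly when $\sigma > 1$, matching the threshold $\bar{\sigma} = L_f \kappa$ from part (ii) of the theorem.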