Local Second-Order Limit Dynamics of the Alternating Direction Method of Multipliers for Semidefinite Programming
The alternating direction method of multipliers (ADMM) is widely used for solving large-scale semidefinite programs (SDPs), yet on instances with multiple primal-dual optimal solution pairs, it often enters prolonged slow-convergence regions where th…
Authors: Shucheng Kang, Heng Yang
Shucheng Kang∗, Heng Yang†

February 24, 2026

Abstract

The alternating direction method of multipliers (ADMM) is widely used for solving large-scale semidefinite programs (SDPs), yet on instances with multiple primal–dual optimal solution pairs, it often enters prolonged slow-convergence regions where the Karush–Kuhn–Tucker (KKT) residuals nearly stall. To explain and predict the fine-grained dynamical behavior inside these regions, we develop a local second-order limit dynamics framework for ADMM near an arbitrary KKT point, not necessarily the eventual limit point of the iterates. Assuming the existence of a strictly complementary primal–dual solution pair, we derive a second-order local expansion of the ADMM dynamics by leveraging a refined and simplified variational characterization of the (parabolic) second-order directional derivative of the PSD projection operator. This expansion reveals a closed convex cone of directions along which the local first-order update vanishes, and it induces a second-order limit map that governs the persistent drift after transient effects are filtered out. We characterize fundamental properties of this mapping, including its kernel, range, and continuity. A primal–dual decoupling further yields a clean scaling law for the effect of the penalty parameter in ADMM. We connect these properties to second-order dynamical features of ADMM, including fixed points, almost-invariant sets, and microscopic phases.
Three empirical phenomena in slow-convergence regions are then explained or predicted: (i) angles between consecutive iterate differences are small yet nonzero, except for sparse spikes; (ii) primal and dual infeasibilities are insensitive to penalty-parameter updates; and (iii) iterates can be transiently trapped in a low-dimensional subspace for an extended period. Extensive numerical experiments on the Mittelmann dataset corroborate our theoretical predictions.

∗ School of Engineering and Applied Sciences, Harvard University. Email: skang1@g.harvard.edu
† School of Engineering and Applied Sciences, Harvard University. Email: hankyang@seas.harvard.edu

Contents

1 Introduction
1.1 ADMM for SDP: Empirical Slow-Convergence Patterns
1.2 Contributions
1.3 Limitations
1.4 Notation
1.5 Outline
2 Related Work
3 Refined Second-Order Directional Derivative of $\Pi_{\mathbb{S}^n_+}(\cdot)$
4 Local Second-Order Limit Dynamics
4.1 Assumptions
4.2 Second-Order Local Expansion
4.3 $\mathcal{C}(\bar{Z})$: the Cone where First-Order Updates Vanish
4.4 Second-Order Limit Map $\phi(\bar{Z}; \cdot)$
5 Polar Description and Primal–Dual Decoupling
5.1 Simplification of $\mathcal{K}^\circ(\bar{Z}; \bar{H})$
5.2 Second-Order Limits of $\Delta X^{(k)}$ and $\Delta S^{(k)}$
5.3 Primal–Dual Decoupling of $\phi(\bar{Z}; \bar{H})$
6 Kernel of $\phi(\bar{Z}; \cdot)$
6.1 Proof of "$\bar{H} \in \mathcal{C}(\bar{Z}) \setminus \mathcal{T}_{\mathcal{Z}^\star}(\bar{Z}) \implies \phi(\bar{Z}; \bar{H}) \neq 0$"
6.2 Proof of "$\bar{H} \in \mathcal{T}_{\mathcal{Z}^\star}(\bar{Z}) \implies \phi(\bar{Z}; \bar{H}) = 0$"
6.3 Discussion: Small yet Non-Zero $\angle(\Delta Z^{(k)}, \Delta Z^{(k+1)})$
7 Range of $\phi(\bar{Z}; \cdot)$
7.1 General Case: $\operatorname{ran}(\phi(\bar{Z}; \cdot)) \nsubseteq \operatorname{aff}(\mathcal{C}(\bar{Z}))$
7.2 Under One-Sided Uniqueness: $\operatorname{ran}(\phi(\bar{Z}; \cdot)) \subseteq \operatorname{aff}(\mathcal{C}(\bar{Z}))$
7.3 Discussion: Connections to Almost Invariant Sets
8 Continuity of $\phi(\bar{Z}; \cdot)$
8.1 Existence of Discontinuity
8.2 Almost-Sure Continuity
8.3 Discussion: "Spikes" in $\angle(\Delta Z^{(k)}, \Delta Z^{(k+1)})$
9 $\sigma$'s Effect on $\phi(\bar{Z}; \cdot)$
9.1 First-Order Effect
9.2 Second-Order Effect
9.3 Discussion: $\sigma$'s Updating Rules
10 Examples
10.1 Example I
10.2 Example II
10.3 Example III
11 Numerical Experiments
12 Future Directions
13 Conclusion
A Proof of Theorem 2

1 Introduction

Consider the following pair of primal–dual semidefinite programs (SDPs) in standard form:

\[
\begin{array}{ll@{\qquad}ll}
\text{Primal:} & \text{minimize } \langle C, X \rangle & \text{Dual:} & \text{maximize } b^{\mathsf{T}} y \\
& \text{subject to } \mathcal{A} X = b, & & \text{subject to } \mathcal{A}^* y + S = C, \\
& \phantom{\text{subject to }} X \in \mathbb{S}^n_+, & & \phantom{\text{subject to }} S \in \mathbb{S}^n_+,
\end{array}
\tag{1}
\]

with primal variable $X \in \mathbb{S}^n$ and dual variables $S \in \mathbb{S}^n$, $y \in \mathbb{R}^m$. Here $\mathbb{S}^n$ is the set of real symmetric $n \times n$ matrices and $\mathbb{S}^n_+$ is the set of positive semidefinite (PSD) matrices in $\mathbb{S}^n$. The linear operator $\mathcal{A} : \mathbb{S}^n \to \mathbb{R}^m$ is defined as $\mathcal{A} X := (\langle A_1, X \rangle, \dots, \langle A_m, X \rangle)$, and $\mathcal{A}^* y := \sum_{i=1}^m y_i A_i$ is its adjoint operator. The coefficients $C, A_1, \dots, A_m$ are symmetric $n \times n$ matrices, and $b \in \mathbb{R}^m$. It is assumed that $\{A_i\}_{i=1}^m$ are linearly independent so that $\mathcal{A}\mathcal{A}^*$ is an invertible operator.

As the need to solve large-scale SDPs continues to grow (e.g., those stemming from the moment and sums-of-squares (SOS) relaxations in polynomial optimization [25–27, 38, 53, 56]), first-order methods (FOMs) have attracted increasing interest due to their low per-iteration cost and their ability to exploit problem structures such as sparsity. Among these methods, the Alternating Direction Method of Multipliers (ADMM) has become a particularly popular choice, supported by a wide range of implementations, applications, and algorithmic variants [18, 42, 54, 57, 60].

ADMM for SDP.
Starting from $(X^{(0)}, y^{(0)}, S^{(0)})$, the classical three-step ADMM iteration for the SDP (1) reads [54]:

\[ y^{(k+1)} = (\mathcal{A}\mathcal{A}^*)^{-1} \big( \sigma^{-1} b - \mathcal{A}( \sigma^{-1} X^{(k)} + S^{(k)} - C ) \big), \tag{2a} \]
\[ S^{(k+1)} = \Pi_{\mathbb{S}^n_+} \big( C - \mathcal{A}^* y^{(k+1)} - \sigma^{-1} X^{(k)} \big), \tag{2b} \]
\[ X^{(k+1)} = X^{(k)} + \sigma \big( S^{(k+1)} + \mathcal{A}^* y^{(k+1)} - C \big), \tag{2c} \]

where $\Pi_{\mathbb{S}^n_+}(\cdot)$ denotes the orthogonal projection onto the PSD cone $\mathbb{S}^n_+$ and $\sigma > 0$ is the penalty parameter. Under mild conditions, $(X^{(k)}, y^{(k)}, S^{(k)})$ converges to $(\bar{X}, \bar{y}, \bar{S})$, one of the optimal solution pairs satisfying the Karush–Kuhn–Tucker (KKT) conditions [54, Theorem 2]:

\[ \mathcal{A} \bar{X} = b, \quad \mathcal{A}^* \bar{y} + \bar{S} = C, \quad \langle \bar{X}, \bar{S} \rangle = 0, \quad \bar{X} \in \mathbb{S}^n_+, \quad \bar{S} \in \mathbb{S}^n_+. \tag{3} \]

The ADMM iteration applied to the dual SDP is equivalent to the Douglas–Rachford splitting (DRS) method applied to the primal SDP [31]:

\[ Z^{(k+1)} = Z^{(k)} - \mathcal{P} \big( \Pi_{\mathbb{S}^n_+}(Z^{(k)}) - \tilde{X} \big) + \mathcal{P}^\perp \big( \Pi_{\mathbb{S}^n_+}(-Z^{(k)}) - \sigma C \big), \tag{4} \]

where $\mathcal{P} := \mathcal{A}^* (\mathcal{A}\mathcal{A}^*)^{-1} \mathcal{A}$ denotes the orthogonal projection onto the range space of $\mathcal{A}^*$, $\mathcal{P}^\perp := \mathrm{Id} - \mathcal{P}$ ($\mathrm{Id}$ denotes the identity mapping), and $\tilde{X}$ is any constant matrix satisfying $\mathcal{A} \tilde{X} = b$. We can recover the primal and dual variables from (4) as $X^{(k)} := \Pi_{\mathbb{S}^n_+}(Z^{(k)})$ and $S^{(k)} := \frac{1}{\sigma} \Pi_{\mathbb{S}^n_+}(-Z^{(k)})$. Thus, each primal–dual optimal solution pair $(\bar{X}, \bar{S})$ corresponds to one optimal auxiliary variable $\bar{Z} := \bar{X} - \sigma \bar{S}$. We shall also call (4) the one-step ADMM for solving the SDP. Define the primal optimal set $\mathcal{X}^\star$ (resp. dual optimal set $\mathcal{S}^\star$) as the collection of $\bar{X}$ (resp. $\bar{S}$) satisfying the KKT conditions in (3). We further define the difference optimal set $\mathcal{Z}^\star$ as $\mathcal{Z}^\star := \mathcal{X}^\star - \sigma \mathcal{S}^\star$.

One-dimensional criteria. Both the three-step and the one-step ADMM for solving the SDP are high-dimensional dynamical systems. In practice, however, we often observe them through one-dimensional quantities.
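As a concrete companion to the three-step iteration (2a)–(2c), the following is a minimal dense NumPy sketch, not the authors' implementation. The helper names (`proj_psd`, `admm_sdp`), the representation of $\mathcal{A}$ as a Python list of matrices $A_i$, and the toy instance (minimize $\langle C, X \rangle$ subject to $\operatorname{tr}(X) = 1$, $X \succeq 0$) are all assumptions made for illustration; the sketch fixes $\sigma$ and ignores sparsity and stopping criteria.

```python
import numpy as np

def proj_psd(M):
    """Orthogonal projection of a symmetric matrix onto the PSD cone."""
    w, Q = np.linalg.eigh((M + M.T) / 2)
    return (Q * np.maximum(w, 0.0)) @ Q.T

def admm_sdp(C, A_list, b, sigma=1.0, iters=300):
    """Three-step ADMM (2a)-(2c) for a small dense standard-form SDP (illustrative only)."""
    n = C.shape[0]
    A_op = lambda X: np.array([np.sum(Ai * X) for Ai in A_list])   # A X = (<A_i, X>)_i
    A_adj = lambda y: sum(yi * Ai for yi, Ai in zip(y, A_list))    # A^* y = sum_i y_i A_i
    G = np.array([[np.sum(Ai * Aj) for Aj in A_list] for Ai in A_list])  # matrix of A A^*
    X, S, y = np.zeros((n, n)), np.zeros((n, n)), np.zeros(len(A_list))
    for _ in range(iters):
        y = np.linalg.solve(G, b / sigma - A_op(X / sigma + S - C))  # (2a)
        S = proj_psd(C - A_adj(y) - X / sigma)                        # (2b)
        X = X + sigma * (S + A_adj(y) - C)                            # (2c)
    return X, y, S

# Hypothetical toy instance (not from the paper): minimize <C, X> s.t. tr(X) = 1, X PSD.
sigma = 1.0
C = np.diag([1.0, 2.0, 3.0])
X, y, S = admm_sdp(C, [np.eye(3)], np.array([1.0]), sigma=sigma)
Z = X - sigma * S  # auxiliary variable of the one-step form, Z = X - sigma * S
print(np.trace(X), np.sum(C * X), y)
```

On this toy problem the optimal value is the smallest eigenvalue of $C$ and the solution is unique and strictly complementary, so the sketch sits in the fast regime rather than in the slow-convergence regions studied in the paper.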
For the three-step ADMM (2) in particular, the primal infeasibility, dual infeasibility, and relative gap (collectively, the KKT residuals) are defined as

\[ r_p^{(k)} := \frac{\|\mathcal{A} X^{(k)} - b\|_2}{1 + \|b\|_2}, \quad r_d^{(k)} := \frac{\|\mathcal{A}^* y^{(k)} + S^{(k)} - C\|_F}{1 + \|C\|_F}, \quad r_g^{(k)} := \frac{\big| \langle C, X^{(k)} \rangle - b^{\mathsf{T}} y^{(k)} \big|}{1 + \big| \langle C, X^{(k)} \rangle \big| + \big| b^{\mathsf{T}} y^{(k)} \big|}, \tag{5} \]

with $r_{\max}^{(k)} := \max\{ r_p^{(k)}, r_d^{(k)}, r_g^{(k)} \}$ the maximum KKT residual. Since $X^{(k)} \succeq 0$ and $S^{(k)} \succeq 0$ at all iterations, we omit the PSD-violation terms from (5). For the one-step ADMM (4), we denote $\Delta Z^{(k)} := Z^{(k+1)} - Z^{(k)}$, which is tightly related to $r_{\max}^{(k)}$ [24]. We write $\|\Delta Z^{(k)}\|_F$ for its Frobenius norm. Similarly, we define $\Delta X^{(k)}$ and $\Delta S^{(k)}$ with their Frobenius norms $\|\Delta X^{(k)}\|_F$ and $\|\Delta S^{(k)}\|_F$. The angle between two consecutive iterate differences is denoted by $\angle(\Delta Z^{(k)}, \Delta Z^{(k+1)})$, defined as

\[ \angle(\Delta Z^{(k+1)}, \Delta Z^{(k)}) := \arccos \left( \frac{\langle \Delta Z^{(k)}, \Delta Z^{(k+1)} \rangle}{\|\Delta Z^{(k)}\|_F \cdot \|\Delta Z^{(k+1)}\|_F} \right). \]

We will frequently use these one-dimensional criteria in the subsequent analysis.

1.1 ADMM for SDP: Empirical Slow-Convergence Patterns

Despite its growing popularity and wide adoption, ADMM often suffers from slow convergence when solving SDPs [20, 26, 60]: after several thousand iterations, progress often slows down dramatically and may nearly stall. This empirical observation is largely consistent with existing theory. In general, ADMM for SDPs is widely understood to have sublinear convergence. Under additional regularity at the limiting KKT point, such as two-sided constraint nondegeneracy [10, 19] and strict complementarity [24], one can establish local linear convergence.
Although these two regularity conditions hold generically [1], they may both fail in SDPs involving multiple KKT points, such as the important SDP instances frequently arising from Moment-SOS relaxation with finite convergence [27]. For these SDPs, metric subregularity of the KKT operator at the limiting point, which is required for local linear convergence of primal–dual splitting methods, may easily fail to hold [12, Example 1]. Consequently, slow-convergence regions are generally unavoidable for ADMM on SDPs with multiple KKT points, and characterizing these regions is of both practical and theoretical importance.

Empirical patterns in slow-convergence regions. A major motivation for this paper comes from the empirical observation that these slow-convergence regions exhibit remarkably consistent patterns. While a comprehensive numerical study is provided in § 11, here we focus on four representative SDPs from the Mittelmann dataset¹: cnhil10, foot, neu1g, and texture. These small- to medium-scale instances are among those for which ADMM struggles to reach high accuracy (e.g., $r_{\max} \le 10^{-10}$) within $10^6$ iterations [24].

Experiment I. We first run three-step ADMM for about $10^6$ iterations. The initial guesses $(X^{(0)}, y^{(0)}, S^{(0)})$ are all zero and the initial $\sigma$ is set to 1. In the first 20000 iterations, $\sigma$ is updated using the classical heuristic that balances the primal and dual infeasibilities [54]; afterward, we fix $\sigma$. Figure 1 reports the trajectories of $\angle(\Delta Z^{(k+1)}, \Delta Z^{(k)})$, $\|\Delta Z^{(k)}\|_F$, and $r_{\max}^{(k)}$. We observe the first noticeable pattern:

During the prolonged period where $\|\Delta Z^{(k)}\|_F$ and $r_{\max}^{(k)}$ nearly stall, $\angle(\Delta Z^{(k+1)}, \Delta Z^{(k)})$ tends to be small yet nonzero (typically around $10^{-3}$ to $10^{-5}$), except for a few "sparse spikes".

This is unusual because even the smallest decision-variable dimension among these four SDPs exceeds 5000.
In such high dimensions, two randomly generated vectors are typically nearly orthogonal, not nearly parallel.

¹ https://plato.asu.edu/ftp/sparse_sdp.html

[Figure 1: Trajectories of $r_{\max}^{(k)}$, $\|\Delta Z^{(k)}\|_F$, and $\angle(\Delta Z^{(k)}, \Delta Z^{(k+1)})$ in Experiment I. Panels: cnhil10, foot, neu1g, texture.]

Experiment II. We next perform a more delicate experiment. Taking $(X^{(40000)}, y^{(40000)}, S^{(40000)})$ as a new initialization, we gradually increase $\sigma$ by a factor of 10 over the next 5000 iterations, mimicking the effect of $\sigma$ updating in practice. Figure 2 shows the trajectories of $\|\Delta X^{(k)}\|_F$, $\|\Delta S^{(k)}\|_F$, $r_p^{(k)}$, and $r_d^{(k)}$ as functions of $\sigma$. We observe the second noticeable pattern:

As $\sigma$ changes, $r_p^{(k)}$ and $r_d^{(k)}$ remain almost unchanged. Meanwhile, $\log_{10}(\|\Delta X^{(k)}\|_F)$ (resp. $\log_{10}(\|\Delta S^{(k)}\|_F)$) increases (resp. decreases) approximately linearly with $\log_{10}(\sigma)$, with slope close to $+1$ (resp. $-1$).

This observation conflicts with the common wisdom behind updating $\sigma$ in practice, which aims to balance the primal and dual infeasibilities [54]. The apparent insensitivity of $r_p^{(k)}$ and $r_d^{(k)}$ to $\sigma$ therefore poses a significant challenge for designing effective $\sigma$-update rules.

[Figure 2: Trajectories of $\|\Delta X^{(k)}\|_F$, $\|\Delta S^{(k)}\|_F$, $r_p^{(k)}$, and $r_d^{(k)}$ with respect to $\sigma$ in Experiment II. Panels: cnhil10, foot, neu1g, texture.]

In this paper, we aim to understand the mechanisms underlying ADMM's slow-convergence regions, with two goals: (i) to explain the two empirical patterns above; (ii) to predict additional qualitative behaviors in the slow-convergence regions and to shed light on algorithmic design for ADMM on SDPs with multiple KKT points.

1.2 Contributions

Assuming the existence of a strictly complementary primal–dual solution pair, we view ADMM for SDPs as a structured nonlinear dynamical system and study its limiting behavior in a neighborhood of an arbitrary $\bar{Z} \in \mathcal{Z}^\star$.
Focusing on the cone of directions along which the local first-order update vanishes, we construct a local second-order limit map $\phi(\bar{Z}; \cdot)$ as a vector field. The induced local second-order limit dynamics is

\[ Z^{(k+1)} = Z^{(k)} + \tfrac{1}{2} \phi(\bar{Z}; Z^{(k)} - \bar{Z}) + o(\|Z^{(k)} - \bar{Z}\|_F^2). \tag{6} \]

[Figure 3: Illustration of the local second-order limit dynamics of ADMM for SDPs (panels: First-Order Dynamics, Second-Order Dynamics, Second-Order Limit Dynamics). The spectrahedron represents the optimal solution set $\mathcal{Z}^\star$. The blue cone depicts $\mathcal{C}(\bar{Z})$, the cone of directions along which ADMM's local first-order update vanishes. The purple cone depicts $\mathcal{T}_{\mathcal{Z}^\star}(\bar{Z})$, the tangent cone to $\mathcal{Z}^\star$ attached at $\bar{Z}$. In the left panel, the green points and flow indicate the transient local first-order dynamics, which vanishes as $k \to \infty$ and converges to $\mathcal{C}(\bar{Z})$. The red points and wavy trajectories illustrate the transient local second-order dynamics. For each point of the form $\bar{Z} + t \bar{H}$ with a stalled first-order direction $\bar{H} \in \mathcal{C}(\bar{Z})$, the second-order iterate difference converges to $\frac{t^2}{2} \phi(\bar{Z}; \bar{H})$ (red arrows in the right panel), capturing ADMM's limiting behavior up to second order.]

We focus on this surrogate model for two reasons: (i) it captures ADMM's local limiting behavior while filtering out transient effects; (ii) it concentrates the complexity of the ADMM dynamics into the limit map $\phi(\bar{Z}; \cdot)$, which we show admits clean and useful structure. We then analyze the fundamental properties of $\phi(\bar{Z}; \cdot)$ (e.g., kernel, range, continuity, and primal–dual decoupling) and connect them to qualitative features of the limit dynamics (e.g., fixed points, almost-invariant sets, phase transitions, and the role of $\sigma$). Two notable aspects of our framework are:

1. Rather than proposing new sufficient conditions to guarantee fast local linear convergence of ADMM for SDPs, we focus on modeling and analyzing the mechanisms that drive ADMM's slow-convergence regions, thereby bridging theory and practice. This physics-driven viewpoint complements the existing literature on local linear convergence of ADMM for SDPs [19, 24] and introduces new tools for understanding ADMM's local dynamical behavior when fast convergence does not occur.

2. Our analysis does not require $\bar{Z}$ to be the limiting point to which ADMM eventually converges. This shifts the emphasis from a pointwise, asymptotic paradigm to a region-wise, transient one. Such a perspective is fundamentally different from existing second-order analyses for (nonlinear) SDPs, which are typically developed around a fixed limiting solution [17, 44, 48].

Concretely, our contributions are as follows.

A refined and simplified formula for the second-order directional derivative of $\Pi_{\mathbb{S}^n_+}(\cdot)$. A central technical ingredient in building our second-order analysis is the (parabolic) second-order directional derivative of the PSD projection $\Pi_{\mathbb{S}^n_+}(\cdot)$. Our derivation builds on [59, Theorem 4.1] and [35, Propositions 3.1–3.2], with two key refinements: (i) we correct several minor typos in both references, which yields a cleaner and more streamlined expression; (ii) we expose a self-similar structure between the first- and (parabolic) second-order directional derivatives of $\Pi_{\mathbb{S}^n_+}(\cdot)$. This self-similarity is repeatedly exploited in our second-order analysis. We expect the refined variational characterization to be useful beyond the present setting.

A local second-order limiting model for ADMM near any $\bar{Z} \in \mathcal{Z}^\star$.
Starting from any $\bar{Z} \in \mathcal{Z}^\star$, we expand the one-step ADMM dynamics (4) around it up to second order, using the first- and (parabolic) second-order directional derivatives of $\Pi_{\mathbb{S}^n_+}(\cdot)$. For the operator governing the local first-order dynamics, we prove its firm nonexpansiveness and give a detailed characterization of its nonempty fixed-point set $\mathcal{C}(\bar{Z})$. For any stalled first-order direction $\bar{H} \in \mathcal{C}(\bar{Z})$, we show that the operator associated with the local second-order dynamics is also firmly nonexpansive but, in general, does not admit fixed points. Instead, we prove the existence of the limit of the iterate difference for the second-order dynamics and denote it by $\phi(\bar{Z}; \bar{H})$. By varying $\bar{H}$ over $\mathcal{C}(\bar{Z})$, we obtain the local second-order limit map $\phi(\bar{Z}; \cdot) : \mathcal{C}(\bar{Z}) \mapsto \mathbb{S}^n$, which becomes the central object of the paper, and we define the induced limit dynamics accordingly. See Figure 3 for an illustration.

After uncovering a primal–dual decoupling structure hidden in $\phi(\bar{Z}; \bar{H})$, we study four core properties of the limit map $\phi(\bar{Z}; \cdot)$ and their physical implications:

• Kernel of $\phi(\bar{Z}; \cdot)$:
  – Mathematical proof. We prove that $\ker(\phi(\bar{Z}; \cdot))$ coincides with $\mathcal{T}_{\mathcal{Z}^\star}(\bar{Z})$, the tangent cone to $\mathcal{Z}^\star$ at $\bar{Z}$. This ties ADMM's local dynamics to Sturm's square-root error bound under strict complementarity [47].
  – Physical interpretation. From the limit-dynamics viewpoint, if ADMM is initialized with $Z^{(0)}$ satisfying $Z^{(0)} - \bar{Z} \in \mathcal{C}(\bar{Z}) \setminus \mathcal{T}_{\mathcal{Z}^\star}(\bar{Z})$, then $\Delta Z^{(k)}$ transiently tracks $\frac{1}{2} \phi(\bar{Z}; Z^{(k)} - \bar{Z})$. This mechanism partially explains the "small yet nonzero" behavior of $\angle(\Delta Z^{(k)}, \Delta Z^{(k+1)})$ observed in Experiment I.

• Range of $\phi(\bar{Z}; \cdot)$:
  – Mathematical proof. We clarify the relationship between $\operatorname{ran}(\phi(\bar{Z}; \cdot))$ and $\operatorname{aff}(\mathcal{C}(\bar{Z}))$: (i) in general, $\operatorname{ran}(\phi(\bar{Z}; \cdot)) \nsubseteq \operatorname{aff}(\mathcal{C}(\bar{Z}))$; (ii) if, in addition, either the primal or the dual optimal solution is unique, then $\operatorname{ran}(\phi(\bar{Z}; \cdot)) \subseteq \operatorname{aff}(\mathcal{C}(\bar{Z}))$.
  – Physical interpretation. Interpreted through the limit dynamics, these results illuminate how $\mathcal{C}(\bar{Z})$ can act as a local almost-invariant structure and why second-order updates may remain confined to a low-dimensional subspace for a long time.

• Continuity of $\phi(\bar{Z}; \cdot)$:
  – Mathematical proof. We first construct explicit points of discontinuity of $\phi(\bar{Z}; \cdot)$ on $\mathcal{C}(\bar{Z})$, and then establish an almost-sure type continuity statement for $\phi(\bar{Z}; \cdot)$ with respect to the Lebesgue measure on $\operatorname{aff}(\mathcal{C}(\bar{Z}))$.
  – Physical interpretation. In terms of limit dynamics, the "sparse" discontinuities of $\phi(\bar{Z}; \cdot)$ provide a concrete explanation for the "sparse spikes" in $\angle(\Delta Z^{(k)}, \Delta Z^{(k+1)})$ observed in Experiment I, and enable accurate predictions of these microscopic phase transitions.

• Effect of $\sigma$ on $\phi(\bar{Z}; \cdot)$:
  – Mathematical proof. We show that, under the local second-order limit dynamics model, the second-order limit of $\Delta X^{(k)}$ (resp. $\Delta S^{(k)}$) scales exactly in proportion to $\sigma$ (resp. $\frac{1}{\sigma}$). We further prove that the second-order limits of both $r_p^{(k)}$ and $r_d^{(k)}$ are independent of $\sigma$.
  – Physical interpretation. This result directly explains the response curves in Experiment II. We also discuss the implications for designing $\sigma$-update strategies in second-order-dominant regimes.

Numerical verification. We validate our theory on three (small-scale) SDP examples with multiple KKT points, where first- and second-order quantities can be computed explicitly. We further conduct experiments on the Mittelmann dataset.
Across a substantial subset of hard instances, we observe empirical patterns that are explained by our local second-order limit dynamics, supporting the generality of the proposed framework. All codes, data, and results can be found in https://github.com/ComputationalRobotics/admmsdp-limitdyn.

1.3 Limitations

Since our framework is physics-driven, it prioritizes explanatory power over complete mathematical closure in a few places. The main open issue is that, while the second-order limit map and dynamics (6) exhibit rich and clean structure, it is generally difficult to quantify the approximation error between the limit model (6) and the true ADMM dynamics (4), since three coupled layers of approximation are involved. In addition, our analysis assumes the existence of a strictly complementary solution pair, which may fail for certain problem classes. We discuss these issues in more detail in § 12. Despite these compromises, we hope that our work can initiate a systematic study of the ubiquitous slow-convergence phenomena in first-order splitting methods for SDPs.

Scope and interpretation. Throughout the paper, we carefully separate rigorous mathematical results from empirical/physical interpretation. All formal statements (theorems/propositions/lemmas) are proved rigorously under Assumption 1 and Definitions 1–2. The "Discussion" subsections (§ 6.3, § 7.3, § 8.3, § 9.3) and the "Numerical Experiments" section (§ 11) are explicitly interpretive: they connect the limit-dynamics framework to observed ADMM behavior, and are separated from, and not required for, the theoretical developments.

1.4 Notation

Given a finite-dimensional Hilbert space $(\mathbb{H}, \langle \cdot, \cdot \rangle)$ and a convex set $C \subset \mathbb{H}$, we write $\operatorname{ri}(C)$ for the relative interior of $C$, $\operatorname{aff}(C)$ for its affine hull, and $\operatorname{cl}(C)$ for its closure. If, in addition, $C$ is closed and convex, we denote by $\mathcal{T}_C(x)$ the tangent cone to $C$ at any $x \in C$.
Correspondingly, we denote by $\mathcal{N}_C(x)$ the normal cone to $C$ at $x$. For a convex cone $\mathcal{K} \subset \mathbb{H}$, we write $\mathcal{K}^\circ$ for its polar cone. Given a mapping $\mathcal{T} : U \mapsto \mathbb{H}$ (with $U \subset \mathbb{H}$), we define $\ker(\mathcal{T}) := \{ x \in U \mid \mathcal{T}(x) = 0 \}$, $\operatorname{ran}(\mathcal{T}) := \{ \mathcal{T}(x) \mid x \in U \}$, and $\operatorname{Fix}(\mathcal{T}) := \{ x \in U \mid \mathcal{T}(x) = x \}$. We denote by $\Pi_C(x)$ the orthogonal projection of $x \in \mathbb{H}$ onto $C$. We define $\mathbb{B}_r(x) := \{ y \in \mathbb{H} \mid \|y - x\| \le r \}$, where $\|\cdot\| := \sqrt{\langle \cdot, \cdot \rangle}$. Let $\operatorname{dist}(x, C) := \inf_{y \in C} \|x - y\|$.

For the space $\mathbb{S}^n$, the inner product is $\langle A, B \rangle = \operatorname{tr}(A^{\mathsf{T}} B) = \operatorname{tr}(AB)$ for all $A, B \in \mathbb{S}^n$. For any $H \in \mathbb{S}^n$, we denote by $\mathcal{O}^n(H)$ the set of orthogonal matrices that diagonalize $H$, and we write $\|H\|_F$ for its Frobenius norm. We denote by $\mathbb{S}^n_-$ the set of negative semidefinite (NSD) matrices in $\mathbb{S}^n$. When the matrix size $n$ is clear from the context, we abbreviate $\Pi_{\mathbb{S}^n_+}(H)$ (resp. $\Pi_{\mathbb{S}^n_-}(H)$) as $\Pi_+(H)$ (resp. $\Pi_-(H)$). We further denote by $\lambda_{\max}(H)$ (resp. $\lambda_{\min}(H)$) the maximum (resp. minimum) eigenvalue of $H$. For symmetric matrices, we display only the upper-triangular part for simplicity; the symmetric entries are indicated by "$\sim$". We denote by $I_k$ the $k \times k$ identity matrix. For any $v \in \mathbb{R}^n$, we write $\|v\|_2$ for its Euclidean norm.

1.5 Outline

After a brief overview of related work in § 2, we present in § 3 a simplified formula for the (parabolic) second-order directional derivative of $\Pi_{\mathbb{S}^n_+}(\cdot)$. Building on this result, § 4 develops a detailed second-order analysis around an arbitrary $\bar{Z} \in \mathcal{Z}^\star$, which naturally leads to the definition of the local second-order limit map $\phi(\bar{Z}; \cdot)$ and its induced dynamics, the core concepts of this paper. § 5 analyzes the primal–dual decoupling structures of the limit map immediately afterwards.
We then investigate four fundamental properties of $\phi(\bar{Z}; \cdot)$ in parallel: (i) its kernel in § 6; (ii) its range in § 7; (iii) its continuity in § 8; and (iv) the effect of $\sigma$ in § 9. The connection between each property of $\phi(\bar{Z}; \cdot)$ and ADMM's dynamical behavior is discussed at the end of the corresponding section. In § 10, we present three SDP examples, which serve as sanity checks and illustrations of the theory and also contribute to our proofs. In § 11, we report numerical experiments on the Mittelmann dataset. § 12 lists several future directions and open problems. Finally, § 13 concludes the paper.

2 Related Work

Our second-order analysis targets ADMM for SDPs and draws on tools from convex analysis, matrix analysis, monotone operator theory, and dynamical systems.

First-order proximal methods for SDP. ADMM can be viewed as a representative primal–dual proximal method arising from monotone operator theory [43]. Beyond the classical ADMM approach for SDPs [54], symmetric Gauss–Seidel (sGS)-ADMM [11] has attracted increasing attention as an efficient scheme for solving general SDPs to medium accuracy. Other proximal methods, such as the primal–dual hybrid gradient (PDHG) method [22], have likewise been investigated for SDP-type formulations. More recently, these proximal frameworks have been integrated with low-rank factorization schemes to better exploit problem structure and scalability [20, 52]. On the theoretical side, sufficient conditions guaranteeing fast local linear convergence have been established, including two-sided constraint nondegeneracy [19] and strict complementarity [24] at the limiting KKT point. Numerical evidence supporting fast local convergence under such conditions can be found in [24, 52].

Variational properties of $\Pi_{\mathbb{S}^n_+}(\cdot)$.
The PSD cone projection operator [21] can be viewed as a spectral function generated by the ReLU map $\max\{x, 0\}$. For spectral functions that are twice (continuously) Fréchet differentiable, explicit formulas for the first- and second-order Fréchet derivatives are provided in [28, 30]. The broader class of second-order directional differentiability is systematically treated in [59], and [35] further leverages these results to derive the (parabolic) second-order directional derivative of $\Pi_{\mathbb{S}^n_+}(\cdot)$. Additional variational properties, including strong semismoothness, are studied in [49], with extensions to more general spectral operators discussed in [14]. Recent work also explores approximating the PSD cone projection via composite polynomial filtering motivated by homomorphic encryption considerations [23].

Second-order analysis for (nonlinear) SDP. Second-order variational analysis for (nonlinear) SDPs provides a systematic language for curvature, constraint qualifications, and stability. Foundational developments include first-order optimality and sensitivity frameworks [44] and second-order sufficient conditions together with constraint nondegeneracy-type regularity [48]. More recent studies introduce weaker second-order conditions [17] and extend such analyses to stratum-restricted settings [5]. These second-order conditions also play a role in characterizing critical points arising from reformulations such as the squared-variable approach versus the original nonlinear SDP [16]. In contrast, our work adopts a different viewpoint, emphasizing transient dynamical behavior in ADMM for SDPs rather than asymptotic optimality conditions at a single limiting point.

Optimization algorithms as dynamical systems. Many optimization algorithms, including primal–dual splitting methods for conic programming, can be naturally viewed as highly structured iterative maps [6, 43].
Compared with complexity analyses, however, dynamical features such as phases and (almost-)invariant sets remain relatively under-explored. Within the existing literature, partial smoothness and active-set identification [29] provide a powerful mechanism for explaining pronounced phase transitions from slow convergence to faster local regimes [32, 33]. In the case of SDP, this typically requires that the limiting KKT point satisfy strict complementarity. For first-order methods in linear programming, dedicated geometric tools have also been developed to explain phase-transition phenomena even without partial smoothness assumptions [36, 55]. In this paper, we investigate multiple dynamical features of ADMM for SDPs through the lens of "limit dynamics". This perspective is motivated by dynamical systems theory, where understanding limiting behaviors (e.g., limit cycles [40] and center manifolds [9]) is a standard approach to analyzing complicated trajectories.

3 Refined Second-Order Directional Derivative of $\Pi_{\mathbb{S}^n_+}(\cdot)$

Let $f : \mathbb{R} \mapsto \mathbb{R}$ be a (parabolically) second-order directionally differentiable scalar function. Its first- and (parabolic) second-order directional derivatives are defined by

\[ f'(z; h) := \lim_{t \downarrow 0} \frac{f(z + th) - f(z)}{t}, \tag{7a} \]
\[ f''(z; h, w) := \lim_{t \downarrow 0} \frac{f(z + th + \frac{t^2}{2} w) - f(z) - t f'(z; h)}{\frac{1}{2} t^2}. \tag{7b} \]

Next, let $F : \mathbb{S}^n \mapsto \mathbb{S}^n$ be a (parabolically) second-order directionally differentiable spectral function generated by $f$. Namely, for any $X \in \mathbb{S}^n$ with $Q \in \mathcal{O}^n(X)$ and $\{\lambda_i\}_{i=1}^n$ the eigenvalues of $X$,

\[ F(X) = F\big( Q \operatorname{diag}(\{\lambda_i\}_{i=1}^n) Q^{\mathsf{T}} \big) = Q \operatorname{diag}(\{f(\lambda_i)\}_{i=1}^n) Q^{\mathsf{T}}. \]

The first- and (parabolic) second-order directional derivatives of $F$ are defined as

\[ F'(Z; H) := \lim_{t \downarrow 0} \frac{F(Z + tH) - F(Z)}{t}, \tag{8a} \]
\[ F''(Z; H, W) := \lim_{t \downarrow 0} \frac{F(Z + tH + \frac{1}{2} t^2 W) - F(Z) - t F'(Z; H)}{\frac{1}{2} t^2}. \tag{8b} \]

In particular, we are interested in the case when $F = \Pi_{\mathbb{S}^n_+}(\cdot)$, the PSD cone projection operator generated by $f(x) = \max\{x, 0\}$. We know that $\Pi_{\mathbb{S}^n_+}(\cdot)$ is second-order directionally differentiable; see [59].

Nested eigen-structure description. To present the first- and second-order directional derivatives of $\Pi_{\mathbb{S}^n_+}(\cdot)$ succinctly, we introduce a nested eigen-structure description that recursively captures the eigenvalue structures of the first- and second-order perturbation matrices $H$ and $W$. We adopt the notation of [59], with adjustments tailored to the PSD cone projection.

1. First-level description. Start with a diagonal matrix $Z$. Denote its distinct positive eigenvalues (if any) by $\{\mu_a\}_{a \in \mathcal{I}_+}$ and its distinct negative eigenvalues (if any) by $\{\mu_b\}_{b \in \mathcal{I}_-}$. If $Z$ has a zero eigenvalue, denote it by $\mu_0 = 0$. Define $\mathcal{I} := \mathcal{I}_+ \cup \mathcal{I}_- \cup \{0\}$. For each sub-block $k \in \mathcal{I}$, let the corresponding index set be $\alpha_k$. That is,

\[ Z_{\alpha_k \alpha_l} = \begin{cases} \mu_k \cdot I_{|\alpha_k|}, & k = l \in \mathcal{I}, \\ 0, & k \neq l, \ k, l \in \mathcal{I}. \end{cases} \]

Here, for any matrix $A$, $A_{\alpha_k \alpha_l}$ denotes the sub-block of $A$ with row indices $\alpha_k$ and column indices $\alpha_l$. We also write $A_{\alpha_k}$ for the columns of $A$ indexed by $\alpha_k$. Finally, define $\alpha_+ := \cup_{a \in \mathcal{I}_+} \alpha_a$ and $\alpha_- := \cup_{b \in \mathcal{I}_-} \alpha_b$.

2. Second-level description. Let $Z \in \mathbb{S}^n$ be given by the first-level description, and let $H \in \mathbb{S}^n$ be another symmetric matrix. For any $k \in \mathcal{I}$, extract the corresponding sub-block of $H$, denoted by $H_{\alpha_k \alpha_k}$. Denote its distinct positive eigenvalues (if any) by $\{\eta_{k,i}\}_{i \in \mathcal{I}_{k,+}}$ and its distinct negative eigenvalues (if any) by $\{\eta_{k,j}\}_{j \in \mathcal{I}_{k,-}}$. If it has a zero eigenvalue, denote it by $\eta_{k,0} = 0$. Define $\mathcal{I}_k := \mathcal{I}_{k,+} \cup \mathcal{I}_{k,-} \cup \{0\}$. For $H_{\alpha_k \alpha_k}$, let each eigen-block $i \in \mathcal{I}_k$ be indexed by a set $\beta_{k,i}$.
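Equivalently, there exists $Q^k \in \mathcal{O}^{|\alpha_k|}(H_{\alpha_k \alpha_k})$ such that

\[ (Q^k_{\beta_{k,i}})^T H_{\alpha_k \alpha_k} Q^k_{\beta_{k,j}} = \begin{cases} \eta_{k,i} \cdot I_{|\beta_{k,i}|}, & i = j \in \mathcal{I}_k, \\ 0, & i \neq j,\ i, j \in \mathcal{I}_k. \end{cases} \]

Finally, define $\beta_{k,+} := \cup_{i \in \mathcal{I}_{k,+}} \beta_{k,i}$ and $\beta_{k,-} := \cup_{j \in \mathcal{I}_{k,-}} \beta_{k,j}$.

3. Third-level description. Let $Z \in \mathbb{S}^n$ be given by the first-level description, and let $H \in \mathbb{S}^n$ be given by the second-level description. Let $W \in \mathbb{S}^n$. For any $k \in \mathcal{I}$, define

\[ V_k(H, W) := W_{\alpha_k \alpha_k} + \sum_{l \in \mathcal{I} \setminus \{k\}} \frac{2}{\mu_k - \mu_l} \, H_{\alpha_k \alpha_l} H_{\alpha_l \alpha_k}. \tag{9} \]

We may abbreviate $V_k(H, W)$ as $V_k$ if there is no ambiguity. Abbreviate $(Q^k_{\beta_{k,i}})^T V_k Q^k_{\beta_{k,j}}$ as $\hat{V}_k^{i,j}$. For any $k \in \mathcal{I}$, $i \in \mathcal{I}_k$, we denote the distinct positive eigenvalues of $\hat{V}_k^{i,i}$ (if any) by $\{\zeta_{k,i,i'}\}_{i' \in \mathcal{I}_{k,i,+}}$ and its distinct negative eigenvalues (if any) by $\{\zeta_{k,i,j'}\}_{j' \in \mathcal{I}_{k,i,-}}$. If $\hat{V}_k^{i,i}$ has a zero eigenvalue, denote it by $\zeta_{k,i,0} = 0$. Define $\mathcal{I}_{k,i} := \mathcal{I}_{k,i,+} \cup \mathcal{I}_{k,i,-} \cup \{0\}$. For $\hat{V}_k^{i,i}$, let each eigen-block $i' \in \mathcal{I}_{k,i}$ be indexed by a set $\gamma_{k,i,i'}$. Equivalently, there exists $\hat{Q}^{k,i} \in \mathcal{O}^{|\beta_{k,i}|}(\hat{V}_k^{i,i})$ such that

\[ (\hat{Q}^{k,i}_{\gamma_{k,i,i'}})^T \hat{V}_k^{i,i} \hat{Q}^{k,i}_{\gamma_{k,i,j'}} = \begin{cases} \zeta_{k,i,i'} \cdot I_{|\gamma_{k,i,i'}|}, & i' = j' \in \mathcal{I}_{k,i}, \\ 0, & i' \neq j',\ i', j' \in \mathcal{I}_{k,i}. \end{cases} \]

Finally, define $\gamma_{k,i,+} := \cup_{i' \in \mathcal{I}_{k,i,+}} \gamma_{k,i,i'}$ and $\gamma_{k,i,-} := \cup_{j' \in \mathcal{I}_{k,i,-}} \gamma_{k,i,j'}$.

Now suppose we are given a triplet $(Z, H, W)$ from the above three-level description. For visualization, we partition the $n \times n$ matrix into $3 \times 3$ sub-blocks, based on the positive-zero-negative eigenvalue structure of $Z$. The partition is represented by dashed lines. For instance,

\[ H = \begin{bmatrix} H_{\alpha_+ \alpha_+} & H_{\alpha_+ \alpha_0} & H_{\alpha_+ \alpha_-} \\ \sim & H_{\alpha_0 \alpha_0} & H_{\alpha_0 \alpha_-} \\ \sim & \sim & H_{\alpha_- \alpha_-} \end{bmatrix} = \begin{bmatrix} \{H_{\alpha_a \alpha_b}\}_{a \in \mathcal{I}_+,\, b \in \mathcal{I}_+} & \{H_{\alpha_a \alpha_0}\}_{a \in \mathcal{I}_+} & \{H_{\alpha_a \alpha_b}\}_{a \in \mathcal{I}_+,\, b \in \mathcal{I}_-} \\ \sim & H_{\alpha_0 \alpha_0} & \{H_{\alpha_0 \alpha_b}\}_{b \in \mathcal{I}_-} \\ \sim & \sim & \{H_{\alpha_a \alpha_b}\}_{a \in \mathcal{I}_-,\, b \in \mathcal{I}_-} \end{bmatrix}. \]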
Similarly, for the $\alpha_0 \alpha_0$ block of $W$, we partition it into $3 \times 3$ sub-blocks following the positive-zero-negative eigenvalue structure of $H_{\alpha_0 \alpha_0}$. Since $H_{\alpha_0 \alpha_0}$ is no longer diagonal, a basis change is necessary. For instance,

\[ W_{\alpha_0 \alpha_0} = Q^0 \begin{bmatrix} \widehat{W}_{\beta_{0,+} \beta_{0,+}} & \widehat{W}_{\beta_{0,+} \beta_{0,0}} & \widehat{W}_{\beta_{0,+} \beta_{0,-}} \\ \sim & \widehat{W}_{\beta_{0,0} \beta_{0,0}} & \widehat{W}_{\beta_{0,0} \beta_{0,-}} \\ \sim & \sim & \widehat{W}_{\beta_{0,-} \beta_{0,-}} \end{bmatrix} (Q^0)^T = Q^0 \begin{bmatrix} \{\widehat{W}_{\beta_{0,i} \beta_{0,j}}\}_{i \in \mathcal{I}_{0,+},\, j \in \mathcal{I}_{0,+}} & \{\widehat{W}_{\beta_{0,i} \beta_{0,0}}\}_{i \in \mathcal{I}_{0,+}} & \{\widehat{W}_{\beta_{0,i} \beta_{0,j}}\}_{i \in \mathcal{I}_{0,+},\, j \in \mathcal{I}_{0,-}} \\ \sim & \widehat{W}_{\beta_{0,0} \beta_{0,0}} & \{\widehat{W}_{\beta_{0,0} \beta_{0,j}}\}_{j \in \mathcal{I}_{0,-}} \\ \sim & \sim & \{\widehat{W}_{\beta_{0,i} \beta_{0,j}}\}_{i \in \mathcal{I}_{0,-},\, j \in \mathcal{I}_{0,-}} \end{bmatrix} (Q^0)^T, \]

where $\widehat{W} := (Q^0)^T W_{\alpha_0 \alpha_0} Q^0$.

First-order directional derivative of $\Pi_{\mathbb{S}^n_+}(\cdot)$. We recall the following classical result from [49, Theorem 4.7], which gives the first-order directional derivative of $\Pi_{\mathbb{S}^n_+}(\cdot)$.

Theorem 1 ($\Pi'_{\mathbb{S}^n_+}(Z; H)$). Let $Z \in \mathbb{S}^n$ be given by the first-level description. For any $H \in \mathbb{S}^n$ given by the second-level description,

\[ \Pi'_+(Z; H) = \begin{bmatrix} H_{\alpha_+ \alpha_+} & H_{\alpha_+ \alpha_0} & \left\{ \frac{\mu_a}{\mu_a - \mu_b} H_{\alpha_a \alpha_b} \right\}_{a \in \mathcal{I}_+,\, b \in \mathcal{I}_-} \\ \sim & \Pi_+(H_{\alpha_0 \alpha_0}) & 0 \\ \sim & \sim & 0 \end{bmatrix}. \tag{10} \]

For a non-diagonal $Z \in \mathbb{S}^n$: pick $Q \in \mathcal{O}^n(Z)$ and denote $\tilde{Z} := Q^T Z Q$ (diagonal) and $\tilde{H} := Q^T H Q$. Then

\[ \Pi'_+(Z; H) = Q \, \Pi'_+(\tilde{Z}; \tilde{H}) \, Q^T. \tag{11} \]

(Parabolic) second-order directional derivative of $\Pi_{\mathbb{S}^n_+}(\cdot)$. Our result builds on [59, Theorem 4.1] and [35, Propositions 3.1–3.2], with two key refinements: (i) we correct several minor typos in [59] and [35], which in turn yields a simplified formula; (ii) we reveal a self-similar structure between the $\alpha_0 \alpha_0$ block of $\Pi''_+(Z; H, W)$ and $\Pi'_+(Z; H)$, which serves as a key ingredient in the subsequent second-order analysis.

Theorem 2 ($\Pi''_{\mathbb{S}^n_+}(Z; H, W)$). Let the triplet $(Z, H, W)$ be given by the three-level description.
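Then, $\Pi''_+(Z; H, W)$ is the symmetric matrix

\[ \Pi''_+(Z; H, W) = \begin{bmatrix} B_{++} & B_{+0} & B_{+-} \\ \sim & B_{00} & B_{0-} \\ \sim & \sim & B_{--} \end{bmatrix}, \tag{12} \]

whose blocks are

\[ B_{++} = \Big\{ W_{\alpha_a \alpha_b} + 2 \sum_{c \in \mathcal{I}_-} \frac{-\mu_c}{(\mu_c - \mu_a)(\mu_c - \mu_b)} H_{\alpha_a \alpha_c} H_{\alpha_c \alpha_b} \Big\}_{a \in \mathcal{I}_+,\, b \in \mathcal{I}_+}, \]
\[ B_{+0} = \Big\{ W_{\alpha_a \alpha_0} + 2 \sum_{c \in \mathcal{I}_-} \frac{1}{\mu_a - \mu_c} H_{\alpha_a \alpha_c} H_{\alpha_c \alpha_0} - 2 \frac{1}{\mu_a} H_{\alpha_a \alpha_0} \Pi_+(-H_{\alpha_0 \alpha_0}) \Big\}_{a \in \mathcal{I}_+}, \]
\[ B_{+-} = \Big\{ \frac{\mu_a}{\mu_a - \mu_b} W_{\alpha_a \alpha_b} + 2 \sum_{c \in \mathcal{I}_+} \frac{-\mu_b}{(\mu_b - \mu_a)(\mu_b - \mu_c)} H_{\alpha_a \alpha_c} H_{\alpha_c \alpha_b} + 2 \frac{1}{\mu_a - \mu_b} H_{\alpha_a \alpha_0} H_{\alpha_0 \alpha_b} + 2 \sum_{c \in \mathcal{I}_-} \frac{\mu_a}{(\mu_a - \mu_b)(\mu_a - \mu_c)} H_{\alpha_a \alpha_c} H_{\alpha_c \alpha_b} \Big\}_{a \in \mathcal{I}_+,\, b \in \mathcal{I}_-}, \]
\[ B_{00} = 2 \sum_{c \in \mathcal{I}_+} \frac{1}{\mu_c} H_{\alpha_0 \alpha_c} H_{\alpha_c \alpha_0} + \Pi'_+(H_{\alpha_0 \alpha_0}; V_0(H, W)), \]
\[ B_{0-} = \Big\{ 2 \sum_{c \in \mathcal{I}_+} \frac{1}{\mu_c - \mu_b} H_{\alpha_0 \alpha_c} H_{\alpha_c \alpha_b} + 2 \frac{1}{-\mu_b} \Pi_+(H_{\alpha_0 \alpha_0}) H_{\alpha_0 \alpha_b} \Big\}_{b \in \mathcal{I}_-}, \]
\[ B_{--} = \Big\{ 2 \sum_{c \in \mathcal{I}_+} \frac{\mu_c}{(\mu_c - \mu_a)(\mu_c - \mu_b)} H_{\alpha_a \alpha_c} H_{\alpha_c \alpha_b} \Big\}_{a \in \mathcal{I}_-,\, b \in \mathcal{I}_-}, \]

where $V_0(H, W)$ is defined in (9). The term $\Pi'_+(H_{\alpha_0 \alpha_0}; V_0(H, W))$ in the $\alpha_0 \alpha_0$ block is calculated by (11), since $H_{\alpha_0 \alpha_0}$ is not diagonal. For a non-diagonal $Z \in \mathbb{S}^n$: pick $Q \in \mathcal{O}^n(Z)$ and denote $\tilde{Z} := Q^T Z Q$ (diagonal), $\tilde{H} := Q^T H Q$, and $\widetilde{W} := Q^T W Q$. Then

\[ \Pi''_+(Z; H, W) = Q \, \Pi''_+(\tilde{Z}; \tilde{H}, \widetilde{W}) \, Q^T. \tag{13} \]

For readability, we postpone the proof and discussion of Theorem 2 to Appendix A. One may have already noticed that $\Pi''_{\mathbb{S}^n_+}(Z; H, W)_{\alpha_0 \alpha_0}$ exhibits a strong structural resemblance to $\Pi'_{\mathbb{S}^n_+}(Z; H)$. This is not a coincidence; rather, it stems from the self-similarity between the first- and (parabolic) second-order directional derivatives of $f(x) = \max\{x, 0\}$:

\[ f'(h; w) = f''(0; h, w) = \begin{cases} w, & h > 0, \\ \max\{w, 0\}, & h = 0, \\ 0, & h < 0. \end{cases} \]

First- and (parabolic) second-order directional derivatives of $\Pi_{\mathbb{S}^n_-}(\cdot)$. For convenience and further use, we also derive $\Pi'_{\mathbb{S}^n_-}(Z; H)$ and $\Pi''_{\mathbb{S}^n_-}(Z; H, W)$.

Theorem 3 ($\Pi'_{\mathbb{S}^n_-}(Z; H)$). Let $Z \in \mathbb{S}^n$ be given by the first-level description.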
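For any $H \in \mathbb{S}^n$ given by the second-level description,

\[ \Pi'_-(Z; H) = \begin{bmatrix} 0 & 0 & \left\{ \frac{-\mu_b}{\mu_a - \mu_b} H_{\alpha_a \alpha_b} \right\}_{a \in \mathcal{I}_+,\, b \in \mathcal{I}_-} \\ \sim & \Pi_-(H_{\alpha_0 \alpha_0}) & H_{\alpha_0 \alpha_-} \\ \sim & \sim & H_{\alpha_- \alpha_-} \end{bmatrix}. \tag{14} \]

For a non-diagonal $Z \in \mathbb{S}^n$: pick $Q \in \mathcal{O}^n(Z)$ and denote $\tilde{Z} := Q^T Z Q$ (diagonal) and $\tilde{H} := Q^T H Q$. Then

\[ \Pi'_-(Z; H) = Q \, \Pi'_-(\tilde{Z}; \tilde{H}) \, Q^T. \tag{15} \]

Proof. Since $\Pi_+(Z) + \Pi_-(Z) = Z$, we get $\Pi'_+(Z; H) + \Pi'_-(Z; H) = H$. Then, (14) is derived from (10) by calculating $H - \Pi'_+(Z; H)$.

Theorem 4 ($\Pi''_{\mathbb{S}^n_-}(Z; H, W)$). Let the triplet $(Z, H, W)$ be given by the three-level description. Then, $\Pi''_-(Z; H, W)$ is the symmetric matrix

\[ \Pi''_-(Z; H, W) = \begin{bmatrix} D_{++} & D_{+0} & D_{+-} \\ \sim & D_{00} & D_{0-} \\ \sim & \sim & D_{--} \end{bmatrix}, \tag{16} \]

whose blocks are

\[ D_{++} = \Big\{ 2 \sum_{c \in \mathcal{I}_-} \frac{\mu_c}{(\mu_c - \mu_a)(\mu_c - \mu_b)} H_{\alpha_a \alpha_c} H_{\alpha_c \alpha_b} \Big\}_{a \in \mathcal{I}_+,\, b \in \mathcal{I}_+}, \]
\[ D_{+0} = \Big\{ 2 \sum_{c \in \mathcal{I}_-} \frac{1}{\mu_c - \mu_a} H_{\alpha_a \alpha_c} H_{\alpha_c \alpha_0} + 2 \frac{1}{-\mu_a} H_{\alpha_a \alpha_0} \Pi_-(H_{\alpha_0 \alpha_0}) \Big\}_{a \in \mathcal{I}_+}, \]
\[ D_{+-} = \Big\{ \frac{-\mu_b}{\mu_a - \mu_b} W_{\alpha_a \alpha_b} + 2 \sum_{c \in \mathcal{I}_+} \frac{\mu_b}{(\mu_b - \mu_a)(\mu_b - \mu_c)} H_{\alpha_a \alpha_c} H_{\alpha_c \alpha_b} + 2 \frac{1}{\mu_b - \mu_a} H_{\alpha_a \alpha_0} H_{\alpha_0 \alpha_b} + 2 \sum_{c \in \mathcal{I}_-} \frac{-\mu_a}{(\mu_a - \mu_b)(\mu_a - \mu_c)} H_{\alpha_a \alpha_c} H_{\alpha_c \alpha_b} \Big\}_{a \in \mathcal{I}_+,\, b \in \mathcal{I}_-}, \]
\[ D_{00} = 2 \sum_{c \in \mathcal{I}_-} \frac{1}{\mu_c} H_{\alpha_0 \alpha_c} H_{\alpha_c \alpha_0} + \Pi'_-(H_{\alpha_0 \alpha_0}; V_0(H, W)), \]
\[ D_{0-} = \Big\{ W_{\alpha_0 \alpha_b} + 2 \sum_{c \in \mathcal{I}_+} \frac{1}{\mu_b - \mu_c} H_{\alpha_0 \alpha_c} H_{\alpha_c \alpha_b} - 2 \frac{1}{\mu_b} \Pi_-(-H_{\alpha_0 \alpha_0}) H_{\alpha_0 \alpha_b} \Big\}_{b \in \mathcal{I}_-}, \]
\[ D_{--} = \Big\{ W_{\alpha_a \alpha_b} + 2 \sum_{c \in \mathcal{I}_+} \frac{-\mu_c}{(\mu_c - \mu_a)(\mu_c - \mu_b)} H_{\alpha_a \alpha_c} H_{\alpha_c \alpha_b} \Big\}_{a \in \mathcal{I}_-,\, b \in \mathcal{I}_-}, \]

where $V_0(H, W)$ is defined in (9). The term $\Pi'_-(H_{\alpha_0 \alpha_0}; V_0(H, W))$ in the $\alpha_0 \alpha_0$ block is calculated by (15), since $H_{\alpha_0 \alpha_0}$ is not diagonal. For a non-diagonal $Z \in \mathbb{S}^n$: pick $Q \in \mathcal{O}^n(Z)$ and denote $\tilde{Z} := Q^T Z Q$ (diagonal), $\tilde{H} := Q^T H Q$, and $\widetilde{W} := Q^T W Q$. Then

\[ \Pi''_-(Z; H, W) = Q \, \Pi''_-(\tilde{Z}; \tilde{H}, \widetilde{W}) \, Q^T. \tag{17} \]

Proof. Since $\Pi_+(Z) + \Pi_-(Z) = Z$, we get $\Pi'_+(Z; H) + \Pi'_-(Z; H) = H$ and $\Pi''_+(Z; H, W) + \Pi''_-(Z; H, W) = W$.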
Then, for the $\alpha_0 \alpha_0$ block of $\Pi''_-(Z; H, W)$:

\begin{align*}
\Pi''_-(Z; H, W)_{\alpha_0 \alpha_0}
&= W_{\alpha_0 \alpha_0} - 2 \sum_{c \in \mathcal{I}_+} \frac{1}{\mu_c} H_{\alpha_0 \alpha_c} H_{\alpha_c \alpha_0} - \Pi'_+(H_{\alpha_0 \alpha_0}; V_0(H, W)) \\
&= W_{\alpha_0 \alpha_0} - 2 \sum_{c \in \mathcal{I}_+} \frac{1}{\mu_c} H_{\alpha_0 \alpha_c} H_{\alpha_c \alpha_0} + \Pi'_-(H_{\alpha_0 \alpha_0}; V_0(H, W)) - V_0(H, W) \\
&= W_{\alpha_0 \alpha_0} - 2 \sum_{c \in \mathcal{I}_+} \frac{1}{\mu_c} H_{\alpha_0 \alpha_c} H_{\alpha_c \alpha_0} + \Pi'_-(H_{\alpha_0 \alpha_0}; V_0(H, W)) - W_{\alpha_0 \alpha_0} + 2 \sum_{c \in \mathcal{I}_+} \frac{1}{\mu_c} H_{\alpha_0 \alpha_c} H_{\alpha_c \alpha_0} + 2 \sum_{c \in \mathcal{I}_-} \frac{1}{\mu_c} H_{\alpha_0 \alpha_c} H_{\alpha_c \alpha_0} \\
&= 2 \sum_{c \in \mathcal{I}_-} \frac{1}{\mu_c} H_{\alpha_0 \alpha_c} H_{\alpha_c \alpha_0} + \Pi'_-(H_{\alpha_0 \alpha_0}; V_0(H, W)),
\end{align*}

where we use (9) and the fact that $\Pi'_+(H_{\alpha_0 \alpha_0}; V_0) + \Pi'_-(H_{\alpha_0 \alpha_0}; V_0) = V_0$. The other blocks of $\Pi''_-(Z; H, W)$ follow from (12) by direct calculation.

4 Local Second-Order Limit Dynamics

As shown in [24], ADMM for SDPs converges locally at a linear rate when the iterates converge to a nonsingular KKT point $Z_{sc}$ whose primal–dual components satisfy strict complementarity. In this section, we study ADMM's finer dynamical behavior near an arbitrary, possibly singular KKT point $\bar{Z}$. In contrast to [24], we do not assume that the iterates converge to $\bar{Z}$, which allows us to shift the analysis from a pointwise, asymptotic paradigm to a region-wise, transient one. We will show that, in this regime, the local dynamics can be effectively described by a second-order limit map $\phi(\bar{Z}; \cdot)$.

In § 4.1, we state the standing assumptions used throughout the paper. In § 4.2, we expand the one-step ADMM update (4) up to second order around $\bar{Z}$, leveraging the expression for $\Pi''_+(Z; H, W)$ in Theorem 2. In § 4.3, we examine the geometry of $\mathcal{C}(\bar{Z})$, a closed convex cone along which the first-order updates vanish, and discuss its relationship with $\mathcal{T}_{\mathcal{Z}^\star}(\bar{Z})$, the tangent cone to the set of KKT points at $\bar{Z}$.
In § 4.4, we show that, under the local second-order expansion model, for every first-order direction $H \in \mathcal{C}(\bar{Z})$, the limit of the second-order drift, denoted by $\phi(\bar{Z}; H)$, exists and need not vanish. This nonzero second-order effect motivates the central object of the paper: the second-order limit map $\phi(\bar{Z}; \cdot) : \mathcal{C}(\bar{Z}) \mapsto \mathbb{S}^n$, viewed as a vector field. Specifically, at the points $Z$ with $Z - \bar{Z} \in \mathcal{C}(\bar{Z})$ and $\|Z - \bar{Z}\|_F \to 0$, we associate the second-order displacement $\tfrac{1}{2} \phi(\bar{Z}; Z - \bar{Z})$. The definition of the corresponding second-order limit dynamics then follows immediately. As a local surrogate for the nonlinear dynamics (4) near $\bar{Z}$, it captures the limiting behavior of (4).

4.1 Assumptions

Assumption 1. Two assumptions are made throughout the paper:

1. The linear operator $\mathcal{A} : \mathbb{S}^n \mapsto \mathbb{R}^m$ is surjective.
2. There exists a KKT point satisfying strict complementarity, i.e., there exists $(X_{sc}, y_{sc}, S_{sc})$ satisfying (3) such that $\mathrm{rank}(X_{sc}) + \mathrm{rank}(S_{sc}) = n$. Equivalently, $Z_{sc} = X_{sc} - \sigma S_{sc}$ is nonsingular.

The surjectivity of $\mathcal{A}$ in Assumption 1 guarantees that $\mathcal{A}\mathcal{A}^*$ is invertible, which in turn ensures that the ADMM iterations (2) and (4) are well defined. Note that Assumption 1 requires neither a Slater condition nor constraint nondegeneracy. The requirement that a strictly complementary solution pair exists is mild and standard in the SDP literature, including analyses of interior-point methods (IPMs) [2, 37] and augmented Lagrangian methods (ALMs) [12, 34]. Moreover, many SDPs arising in real-world applications (including instances with multiple KKT points) admit a strictly complementary primal–dual solution pair [24]. On the other hand, the existence of a strictly complementary solution pair rules out pathological SDPs whose optimal set has singularity degree greater than one [45, 47, 51]; such problem classes can be challenging even for IPMs [45].
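To make Assumption 1(2) concrete, the following is a minimal numerical sketch of the strict complementarity test on a diagonal toy pair. The helper names (`numerical_rank`, `is_strictly_complementary`, `is_nonsingular`), the tolerance, the value of `sigma`, and the 3 × 3 matrices are all illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def numerical_rank(M, tol=1e-9):
    """Number of eigenvalues of the symmetric matrix M with magnitude above tol."""
    return int(np.sum(np.abs(np.linalg.eigvalsh(M)) > tol))

def is_strictly_complementary(X, S, tol=1e-9):
    """Check <X, S> = 0 and rank(X) + rank(S) = n for a PSD pair (X, S)."""
    n = X.shape[0]
    complementary = abs(np.trace(X @ S)) <= tol
    full_rank_sum = numerical_rank(X, tol) + numerical_rank(S, tol) == n
    return bool(complementary and full_rank_sum)

def is_nonsingular(Z, tol=1e-9):
    """Check that the symmetric matrix Z has no (numerically) zero eigenvalue."""
    return bool(np.min(np.abs(np.linalg.eigvalsh(Z))) > tol)

sigma = 2.0                        # hypothetical penalty parameter
X_sc = np.diag([3.0, 1.0, 0.0])    # toy primal matrix, rank 2
S_sc = np.diag([0.0, 0.0, 0.5])    # toy dual matrix, rank 1
X_deg = np.diag([3.0, 0.0, 0.0])   # toy primal matrix with a rank deficit

# Strictly complementary pair: Z = X - sigma * S has no zero eigenvalue.
print(is_strictly_complementary(X_sc, S_sc), is_nonsingular(X_sc - sigma * S_sc))
# Complementary but not strictly so: Z picks up a zero eigenvalue.
print(is_strictly_complementary(X_deg, S_sc), is_nonsingular(X_deg - sigma * S_sc))
```

The second pair illustrates the singular KKT points $\bar{Z}$ that the local analysis is designed to handle: complementarity holds, but the ranks do not add up to $n$.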
4.1.1 A 4 × 4 matrix block partition under strict complementarity

We first give a characterization of $\mathcal{X}^\star$ and $\mathcal{S}^\star$.

Proposition 1 ($\mathcal{X}^\star$ and $\mathcal{S}^\star$). Under Assumption 1 (1), the primal and dual optimal solution sets $\mathcal{X}^\star$ and $\mathcal{S}^\star$ can be expressed as

\[ \mathcal{X}^\star := \{ X \mid \mathcal{P}(X - \tilde{X}) = 0,\ X \in \mathbb{S}^n_+,\ \langle X, S_{sc} \rangle = 0 \}, \tag{18a} \]
\[ \mathcal{S}^\star := \{ S \mid \mathcal{P}^\perp(S - C) = 0,\ S \in \mathbb{S}^n_+,\ \langle S, X_{sc} \rangle = 0 \}, \tag{18b} \]

where $\tilde{X}$ can be any constant matrix satisfying $\mathcal{A}\tilde{X} = b$. In fact, the fixed strictly complementary solution pair $(X_{sc}, S_{sc})$ in (18) can be replaced by any other fixed primal–dual optimal solution pair $(\bar{X}, \bar{S}) \in \mathcal{X}^\star \times \mathcal{S}^\star$.

Proof. Since $\mathcal{A}$ is surjective by Assumption 1, fixing any constant matrix $\tilde{X}$ satisfying $\mathcal{A}\tilde{X} = b$, we have

\[ \mathcal{A}X = b \iff \mathcal{A}(X - \tilde{X}) = 0 \iff \mathcal{P}(X - \tilde{X}) = 0. \]

Symmetrically, the existence of $y \in \mathbb{R}^m$ with $\mathcal{A}^* y + S = C$ is equivalent to $\mathcal{P}^\perp(S - C) = 0$. Thus, from (3), the set of primal–dual optimal solution pairs $(X, S)$ can be expressed as

\begin{align*}
\mathcal{C} &:= \{ (X, S) \mid \mathcal{A}X = b,\ X \succeq 0,\ \exists\, y \in \mathbb{R}^m \text{ s.t. } \mathcal{A}^* y + S = C,\ S \succeq 0,\ \langle X, S \rangle = 0 \} \\
&= \{ (X, S) \mid \mathcal{P}(X - \tilde{X}) = 0,\ X \succeq 0,\ \mathcal{P}^\perp(S - C) = 0,\ S \succeq 0,\ \langle X, S \rangle = 0 \}.
\end{align*}

Then, by Assumption 1 (2), there exists a strictly complementary solution pair $(X_{sc}, S_{sc}) \in \mathcal{C}$, i.e.,

\[ \mathcal{P}(X_{sc} - \tilde{X}) = 0,\quad X_{sc} \succeq 0,\quad \mathcal{P}^\perp(S_{sc} - C) = 0,\quad S_{sc} \succeq 0,\quad \langle X_{sc}, S_{sc} \rangle = 0. \]

It remains to prove that $\mathcal{X}^\star = \{ X \mid \exists\, S \in \mathbb{S}^n \text{ s.t. } (X, S) \in \mathcal{C} \}$ and $\mathcal{S}^\star = \{ S \mid \exists\, X \in \mathbb{S}^n \text{ s.t. } (X, S) \in \mathcal{C} \}$. We only prove the primal part $\mathcal{X}^\star = \{ X \mid \exists\, S \in \mathbb{S}^n \text{ s.t. } (X, S) \in \mathcal{C} \}$, since the dual part can be proven symmetrically.

(i) The "$\subseteq$" part. Take any $X \in \mathcal{X}^\star$. Then, by the definition of $S_{sc}$, $(X, S_{sc}) \in \mathcal{C}$.

(ii) The "$\supseteq$" part. Take any $(X, S) \in \mathcal{C}$. Then, by definition,

\[ \mathcal{P}(X - \tilde{X}) = \mathcal{P}(X_{sc} - \tilde{X}) = 0,\quad X, X_{sc} \succeq 0,\quad \mathcal{P}^\perp(S - C) = \mathcal{P}^\perp(S_{sc} - C) = 0,\quad S, S_{sc} \succeq 0. \]

We shall prove $\langle X, S_{sc} \rangle = 0$.
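To see this, on the one hand,

\[ \langle X - X_{sc}, S - S_{sc} \rangle = \langle \mathcal{P}(X - X_{sc}), \mathcal{P}(S - S_{sc}) \rangle + \langle \mathcal{P}^\perp(X - X_{sc}), \mathcal{P}^\perp(S - S_{sc}) \rangle = 0. \]

On the other hand,

\[ \langle X - X_{sc}, S - S_{sc} \rangle = \langle X, S \rangle + \langle X_{sc}, S_{sc} \rangle - \langle X, S_{sc} \rangle - \langle X_{sc}, S \rangle. \]

Combining these two, we get $\langle X, S_{sc} \rangle + \langle X_{sc}, S \rangle = 0$. Since $X, X_{sc}, S, S_{sc} \succeq 0$, both inner products are nonnegative, so the only possibility is $\langle X, S_{sc} \rangle = 0$ and $\langle X_{sc}, S \rangle = 0$. Thus, $X \in \mathcal{X}^\star$. Clearly, the strictly complementary pair $(X_{sc}, S_{sc})$ can be replaced by any fixed pair $(\bar{X}, \bar{S}) \in \mathcal{C}$.

Given Proposition 1, without loss of generality, we can partition $\mathbb{S}^n$ into $2 \times 2$ blocks indexed by $(\alpha_P, \alpha_D) \times (\alpha_P, \alpha_D)$, where $\alpha_P \cup \alpha_D = \{1, \dots, n\}$ and $\alpha_P \cap \alpha_D = \emptyset$. Then, up to a change of basis,

\[ X = \begin{bmatrix} X_{\alpha_P \alpha_P} & 0 \\ \sim & 0 \end{bmatrix}, \quad S = \begin{bmatrix} 0 & 0 \\ \sim & S_{\alpha_D \alpha_D} \end{bmatrix}, \quad \forall (X, S) \in \mathcal{X}^\star \times \mathcal{S}^\star, \tag{19} \]

with $[X_{sc}]_{\alpha_P \alpha_P} \succ 0$ and $[S_{sc}]_{\alpha_D \alpha_D} \succ 0$. Notice that the block partition based on strict complementarity is represented by solid lines, which distinguishes it from the dashed lines partitioning the positive-zero-negative eigenvalue structures.

Now fix an optimal solution pair $(\bar{X}, \bar{S}) \in \mathcal{X}^\star \times \mathcal{S}^\star$. Without loss of generality, we further assume that both $\bar{X}$ and $\bar{S}$ are diagonal. Indeed, if this is not the case, then by the $2 \times 2$ block structure in (19), there exists an orthogonal matrix $Q$ of the form

\[ Q := \begin{bmatrix} Q_{\alpha_P \alpha_P} & 0 \\ 0 & Q_{\alpha_D \alpha_D} \end{bmatrix}, \]

which simultaneously diagonalizes $\bar{X}$ and $\bar{S}$:

\[ Q^T \bar{X} Q = \begin{bmatrix} \begin{bmatrix} \Lambda_P & 0 \\ \sim & 0 \end{bmatrix} & 0 \\ \sim & 0 \end{bmatrix}, \quad Q^T \bar{S} Q = \begin{bmatrix} 0 & 0 \\ \sim & \begin{bmatrix} 0 & 0 \\ \sim & \Lambda_D \end{bmatrix} \end{bmatrix}. \]

Moreover, for any $(X, S) \in \mathcal{X}^\star \times \mathcal{S}^\star$, the same $2 \times 2$ block partition (19) is preserved under this congruence transformation, since

\[ Q^T X Q = \begin{bmatrix} Q_{\alpha_P \alpha_P}^T X_{\alpha_P \alpha_P} Q_{\alpha_P \alpha_P} & 0 \\ \sim & 0 \end{bmatrix}, \quad Q^T S Q = \begin{bmatrix} 0 & 0 \\ \sim & Q_{\alpha_D \alpha_D}^T S_{\alpha_D \alpha_D} Q_{\alpha_D \alpha_D} \end{bmatrix}. \]

Accordingly, apply the following congruence transformation to the SDP data $\mathcal{A}$ and $C$:

\[ \{A_i\}_{i=1}^m \leftarrow \{Q^T A_i Q\}_{i=1}^m, \quad C \leftarrow Q^T C Q. \]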
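The effect of this block-diagonal congruence can be illustrated numerically. The sketch below is only a toy check under stated assumptions: the block sizes `n_p` and `n_d`, the random test matrices, and all variable names are hypothetical, and a single constraint matrix `A_i` stands in for the SDP data.

```python
import numpy as np

rng = np.random.default_rng(0)
n_p, n_d = 2, 2          # hypothetical sizes of the alpha_P and alpha_D blocks
n = n_p + n_d

def random_psd(k):
    """Random k-by-k positive semidefinite matrix (illustrative data only)."""
    B = rng.standard_normal((k, k))
    return B @ B.T

# A pair with the 2x2 block structure of (19): X lives on alpha_P, S on alpha_D.
X = np.zeros((n, n)); X[:n_p, :n_p] = random_psd(n_p)
S = np.zeros((n, n)); S[n_p:, n_p:] = random_psd(n_d)

# Block-diagonal orthogonal Q built from the eigenvectors of each diagonal block.
_, Q_p = np.linalg.eigh(X[:n_p, :n_p])
_, Q_d = np.linalg.eigh(S[n_p:, n_p:])
Q = np.zeros((n, n)); Q[:n_p, :n_p] = Q_p; Q[n_p:, n_p:] = Q_d

# One symmetric constraint matrix, transformed together with X and S.
A_i = rng.standard_normal((n, n)); A_i = (A_i + A_i.T) / 2
X_new, S_new, A_new = Q.T @ X @ Q, Q.T @ S @ Q, Q.T @ A_i @ Q

print(np.allclose(X_new, np.diag(np.diag(X_new))))          # X becomes diagonal
print(np.allclose(S_new, np.diag(np.diag(S_new))))          # S becomes diagonal
print(np.isclose(np.trace(A_i @ X), np.trace(A_new @ X_new)))  # <A_i, X> is unchanged
```

Because `Q` is orthogonal, the constraint value $\langle A_i, X \rangle$ and the eigenvalues of `X` and `S` are unchanged, which is why the transformation can be applied to the data without loss of generality.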
Under the transformed data $(\mathcal{A}, b, C)$, the optimal sets $(\mathcal{X}^\star, \mathcal{S}^\star)$ still satisfy the $2 \times 2$ block partition (19), while the chosen pair $(\bar{X}, \bar{S})$ becomes diagonal. For $\bar{Z} := \bar{X} - \sigma \bar{S}$, we further assume that it satisfies the first-level description in § 3. Under this assumption, the $2 \times 2$ partition in (19) refines the $3 \times 3$ block partition in § 3 into a $4 \times 4$ one as follows. Define $\alpha_0^P := \alpha_P \setminus \alpha_+$ and $\alpha_0^D := \alpha_D \setminus \alpha_-$. Then, for any $H \in \mathbb{S}^n$,

\[ H = \begin{bmatrix} H_{\alpha_+ \alpha_+} & H_{\alpha_+ \alpha_0^P} & H_{\alpha_+ \alpha_0^D} & H_{\alpha_+ \alpha_-} \\ \sim & H_{\alpha_0^P \alpha_0^P} & H_{\alpha_0^P \alpha_0^D} & H_{\alpha_0^P \alpha_-} \\ \sim & \sim & H_{\alpha_0^D \alpha_0^D} & H_{\alpha_0^D \alpha_-} \\ \sim & \sim & \sim & H_{\alpha_- \alpha_-} \end{bmatrix}. \]

If we further assume that $H$ satisfies the second-level description in § 3, then for any $W \in \mathbb{S}^n$ the $\alpha_0 \alpha_0$ block of $W$ can be expressed (after a change of basis, since $H_{\alpha_0 \alpha_0}$ is not diagonal) as

\[ W_{\alpha_0 \alpha_0} = Q^0 \begin{bmatrix} \widehat{W}_{\beta_{0,+} \beta_{0,+}} & \widehat{W}_{\beta_{0,+} \beta_{0,0}^P} & \widehat{W}_{\beta_{0,+} \beta_{0,0}^D} & \widehat{W}_{\beta_{0,+} \beta_{0,-}} \\ \sim & \widehat{W}_{\beta_{0,0}^P \beta_{0,0}^P} & \widehat{W}_{\beta_{0,0}^P \beta_{0,0}^D} & \widehat{W}_{\beta_{0,0}^P \beta_{0,-}} \\ \sim & \sim & \widehat{W}_{\beta_{0,0}^D \beta_{0,0}^D} & \widehat{W}_{\beta_{0,0}^D \beta_{0,-}} \\ \sim & \sim & \sim & \widehat{W}_{\beta_{0,-} \beta_{0,-}} \end{bmatrix} (Q^0)^T, \]

where $(\beta_{0,P}, \beta_{0,D})$ is the primal–dual block partition for $H_{\alpha_0 \alpha_0}$ with $\beta_{0,P} \cup \beta_{0,D} = \alpha_0$, $\widehat{W} := (Q^0)^T W_{\alpha_0 \alpha_0} Q^0$, $\beta_{0,0}^P := \beta_{0,P} \setminus \beta_{0,+}$, and $\beta_{0,0}^D := \beta_{0,D} \setminus \beta_{0,-}$.

Remark 1. One may have already noticed the role played by the existence of a strictly complementary solution pair in enabling the $4 \times 4$ block partition. Without strict complementarity, the block $\alpha_0 \alpha_0$ can no longer be cleanly subdivided into four pieces, which would significantly increase the complexity of the analysis.

4.2 Second-Order Local Expansion

Let $\bar{Z} \in \mathcal{Z}^\star$ be diagonal and satisfy the first-level description in § 3.
Suppose that the ADMM iterate $Z^{(k)}$ admits the following expansion in a neighborhood of $\bar{Z}$:

\[ Z^{(k)} = \bar{Z} + t H^{(k)} + \frac{t^2}{2} W^{(k)} + o(t^2), \tag{20} \]

where $t \downarrow 0$ is a scale parameter. Unlike [24], this local expansion does not require $\bar{Z}$ to be the eventual limit point of $Z^{(k)}$. Consequently, our local framework is more flexible, shifting from a pointwise, asymptotic perspective to a region-wise, transient one. Recall the one-step ADMM update (4) and rewrite it in finite-difference form as

\[ Z^{(k+1)} - Z^{(k)} = \delta(Z^{(k)}), \]

where the residual map $\delta(\cdot) : \mathbb{S}^n \mapsto \mathbb{S}^n$ is defined by

\[ \delta(Z) := -\mathcal{P}(\Pi_+(Z) - \tilde{X}) + \mathcal{P}^\perp(\Pi_+(-Z) - \sigma C) = -\mathcal{P}(\Pi_+(Z) - \tilde{X}) - \mathcal{P}^\perp(\Pi_-(Z) + \sigma C). \tag{21} \]

Clearly, $\delta(\bar{Z}) = 0$. Since both $\Pi_+(\cdot)$ and $\Pi_-(\cdot)$ are (parabolically) second-order directionally differentiable around $\bar{Z}$, the mapping $\delta(\cdot)$ is also (parabolically) second-order directionally differentiable at $\bar{Z}$, with

\[ \delta'(\bar{Z}; H) = -\mathcal{P}\, \Pi'_+(\bar{Z}; H) - \mathcal{P}^\perp \Pi'_-(\bar{Z}; H), \tag{22a} \]
\[ \delta''(\bar{Z}; H, W) = -\mathcal{P}\, \Pi''_+(\bar{Z}; H, W) - \mathcal{P}^\perp \Pi''_-(\bar{Z}; H, W). \tag{22b} \]

Expanding $Z^{(k+1)}$ up to second order then yields

\begin{align*}
Z^{(k+1)} &= Z^{(k)} + \delta(\bar{Z}) + t\, \delta'(\bar{Z}; H^{(k)}) + \frac{t^2}{2} \delta''(\bar{Z}; H^{(k)}, W^{(k)}) + o(t^2) \\
&= \bar{Z} + t \underbrace{\left\{ H^{(k)} - \mathcal{P}\, \Pi'_+(\bar{Z}; H^{(k)}) - \mathcal{P}^\perp \Pi'_-(\bar{Z}; H^{(k)}) \right\}}_{=:\, H^{(k+1)}} + \frac{t^2}{2} \underbrace{\left\{ W^{(k)} - \mathcal{P}\, \Pi''_+(\bar{Z}; H^{(k)}, W^{(k)}) - \mathcal{P}^\perp \Pi''_-(\bar{Z}; H^{(k)}, W^{(k)}) \right\}}_{=:\, W^{(k+1)}} + o(t^2). \tag{23}
\end{align*}

This expansion motivates the following definitions of local first- and second-order dynamics.

Definition 1 (Local first- and second-order dynamics).
Around $\bar{Z}$, define the local first-order dynamics as

\[ H^{(k+1)} = (\mathrm{Id} + \delta'(\bar{Z}; \cdot))(H^{(k)}) = H^{(k)} - \mathcal{P}\, \Pi'_+(\bar{Z}; H^{(k)}) - \mathcal{P}^\perp \Pi'_-(\bar{Z}; H^{(k)}), \tag{24} \]

and the local second-order dynamics as

\[ W^{(k+1)} = (\mathrm{Id} + \delta''(\bar{Z}; H^{(k)}, \cdot))(W^{(k)}) = W^{(k)} - \mathcal{P}\, \Pi''_+(\bar{Z}; H^{(k)}, W^{(k)}) - \mathcal{P}^\perp \Pi''_-(\bar{Z}; H^{(k)}, W^{(k)}). \tag{25} \]

A notable feature of the second-order dynamics is that $W^{(k+1)}$ depends on both $H^{(k)}$ and $W^{(k)}$.

4.3 $\mathcal{C}(\bar{Z})$: the Cone where First-Order Updates Vanish

In this section, we analyze the local first-order dynamics (24). We first show that $H^{(k+1)} = (\mathrm{Id} + \delta'(\bar{Z}; \cdot))(H^{(k)})$ converges to one of the fixed points of $\mathrm{Id} + \delta'(\bar{Z}; \cdot)$. Recall that an operator $\mathcal{T} : \mathbb{S}^n \mapsto \mathbb{S}^n$ is firmly nonexpansive on $(\mathbb{S}^n, \langle \cdot, \cdot \rangle)$ if

\[ \|\mathcal{T}(H) - \mathcal{T}(G)\|_F^2 + \|(\mathrm{Id} - \mathcal{T})(H) - (\mathrm{Id} - \mathcal{T})(G)\|_F^2 \le \|H - G\|_F^2, \quad \forall H, G \in \mathbb{S}^n. \]

Lemma 1 (Convergent first-order dynamics). Under Assumption 1, $\mathrm{Id} + \delta'(\bar{Z}; \cdot)$ is firmly nonexpansive on $(\mathbb{S}^n, \langle \cdot, \cdot \rangle)$. Moreover, for any $H^{(0)}$, $H^{(k+1)} = (\mathrm{Id} + \delta'(\bar{Z}; \cdot))(H^{(k)})$ converges to a fixed point of $\mathrm{Id} + \delta'(\bar{Z}; \cdot)$.

Proof. For ease of notation, for any $H \in \mathbb{S}^n$, we denote the mappings $(\mathrm{Id} + \delta'(\bar{Z}; \cdot))(H)$ as $\mathcal{T}(H)$, $\Pi'_+(\bar{Z}; H)$ as $\Omega(H)$, and $\Pi'_-(\bar{Z}; H)$ as $\Omega^\perp(H)$.

(i) We first prove the firm nonexpansiveness of $\mathcal{T}$. Since $\Omega(H) + \Omega^\perp(H) = H$, we have $\mathcal{T}(H) = \mathcal{P}^\perp \Omega(H) + \mathcal{P}\, \Omega^\perp(H)$. Hence, for all $H, G \in \mathbb{S}^n$,

\begin{align*}
&\|\mathcal{T}(H) - \mathcal{T}(G)\|_F^2 + \|(\mathrm{Id} - \mathcal{T})(H) - (\mathrm{Id} - \mathcal{T})(G)\|_F^2 \\
&= \|\mathcal{P}^\perp[\Omega(H) - \Omega(G)]\|_F^2 + \|\mathcal{P}[\Omega^\perp(H) - \Omega^\perp(G)]\|_F^2 + \|\mathcal{P}[\Omega(H) - \Omega(G)]\|_F^2 + \|\mathcal{P}^\perp[\Omega^\perp(H) - \Omega^\perp(G)]\|_F^2 \\
&= \|\Omega(H) - \Omega(G)\|_F^2 + \|\Omega^\perp(H) - \Omega^\perp(G)\|_F^2 \\
&= \|H - G\|_F^2 - 2 \langle \Omega(H) - \Omega(G),\ \Omega^\perp(H) - \Omega^\perp(G) \rangle.
\end{align*}

All we need to show is $\langle \Omega(H) - \Omega(G),\ \Omega^\perp(H) - \Omega^\perp(G) \rangle \ge 0$. Denote $U := \Omega(H) - \Omega(G)$ and $V := \Omega^\perp(H) - \Omega^\perp(G)$.
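We only consider the upper triangular blocks of the symmetric matrices:

• If (1) $a \in \mathcal{I}_+$, $b \in \mathcal{I}_+$; or (2) $a \in \mathcal{I}_+$, $b = 0$; or (3) $a = 0$, $b \in \mathcal{I}_-$; or (4) $a \in \mathcal{I}_-$, $b \in \mathcal{I}_-$: $\langle U_{\alpha_a \alpha_b}, V_{\alpha_a \alpha_b} \rangle = 0$.

• (5) $a = 0$, $b = 0$:
\begin{align*}
\langle U_{\alpha_0 \alpha_0}, V_{\alpha_0 \alpha_0} \rangle &= \langle \Pi_+(H_{\alpha_0 \alpha_0}) - \Pi_+(G_{\alpha_0 \alpha_0}),\ \Pi_-(H_{\alpha_0 \alpha_0}) - \Pi_-(G_{\alpha_0 \alpha_0}) \rangle \\
&= \langle \Pi_+(H_{\alpha_0 \alpha_0}),\ -\Pi_-(G_{\alpha_0 \alpha_0}) \rangle + \langle \Pi_+(G_{\alpha_0 \alpha_0}),\ -\Pi_-(H_{\alpha_0 \alpha_0}) \rangle \ge 0.
\end{align*}

• (6) $a \in \mathcal{I}_+$, $b \in \mathcal{I}_-$:
\[ \langle U_{\alpha_a \alpha_b}, V_{\alpha_a \alpha_b} \rangle = \Big\langle \frac{\mu_a}{\mu_a - \mu_b} (H - G)_{\alpha_a \alpha_b},\ \frac{-\mu_b}{\mu_a - \mu_b} (H - G)_{\alpha_a \alpha_b} \Big\rangle \ge 0. \]

(ii) We next show $\mathrm{Fix}(\mathcal{T}) \neq \emptyset$. Since $\delta'(\bar{Z}; 0) = 0$, we have $0 \in \mathrm{Fix}(\mathcal{T})$. Therefore, by [6, Example 5.18], $H^{(k+1)} = \mathcal{T}(H^{(k)})$ converges to one of the points in $\mathrm{Fix}(\mathcal{T})$.

Denote $\mathcal{C}(\bar{Z})$ as $\mathrm{Fix}(\mathrm{Id} + \delta'(\bar{Z}; \cdot))$ (or equivalently, $\ker(\delta'(\bar{Z}; \cdot))$):

\[ \mathcal{C}(\bar{Z}) := \{ H \in \mathbb{S}^n \mid (\mathrm{Id} + \delta'(\bar{Z}; \cdot))(H) = H \} = \{ H \in \mathbb{S}^n \mid \delta'(\bar{Z}; H) = 0 \}. \tag{26} \]

We need an important lemma before characterizing the structure of $\mathcal{C}(\bar{Z})$.

Lemma 2. For $G \in \mathbb{S}^n$, under Assumption 1:

1. If $\|\mathcal{P} G\|_F \le \epsilon$ and $\langle G, \bar{S} \rangle = 0$, then $|\langle G, S \rangle| \le \|S - \bar{S}\|_F \cdot \epsilon$ for all $S \in \mathcal{S}^\star$.
2. If $\|\mathcal{P}^\perp G\|_F \le \epsilon$ and $\langle G, \bar{X} \rangle = 0$, then $|\langle G, X \rangle| \le \|X - \bar{X}\|_F \cdot \epsilon$ for all $X \in \mathcal{X}^\star$.

Proof. (1) Since $\bar{S}, S \in \mathcal{S}^\star$, we have $\mathcal{P}^\perp \bar{S} = \mathcal{P}^\perp S = \mathcal{P}^\perp C$ from Proposition 1. Thus,

\begin{align*}
|\langle G, S \rangle| &= |\langle \mathcal{P} G, \mathcal{P} S \rangle + \langle \mathcal{P}^\perp G, \mathcal{P}^\perp S \rangle| = |\langle \mathcal{P} G, \mathcal{P} S \rangle + \langle \mathcal{P}^\perp G, \mathcal{P}^\perp \bar{S} \rangle| \\
&= |\langle \mathcal{P} G, \mathcal{P} \bar{S} \rangle + \langle \mathcal{P}^\perp G, \mathcal{P}^\perp \bar{S} \rangle + \langle \mathcal{P} G, \mathcal{P}(S - \bar{S}) \rangle| \\
&= |\langle G, \bar{S} \rangle + \langle \mathcal{P} G, \mathcal{P}(S - \bar{S}) \rangle| = |\langle \mathcal{P} G, \mathcal{P}(S - \bar{S}) \rangle| \\
&\le \|\mathcal{P} G\|_F \, \|\mathcal{P}(\bar{S} - S)\|_F \le \epsilon \, \|\mathcal{P}(\bar{S} - S)\|_F.
\end{align*}

On the other hand, $\bar{S} - S = \mathcal{P}^\perp(\bar{S} - S) + \mathcal{P}(\bar{S} - S) = \mathcal{P}(\bar{S} - S)$, which gives the claim.

(2) By primal–dual symmetry.

Proposition 2 (Structure of $\mathcal{C}(\bar{Z})$). Under Assumption 1:

1. $\mathcal{C}(\bar{Z})$ is a nonempty, closed and convex cone.
2. $\mathcal{C}(\bar{Z}) = \mathcal{C}_P(\bar{Z}) + \mathcal{C}_D(\bar{Z})$, where

\[ \mathcal{C}_P(\bar{Z}) := \left\{ H = \begin{bmatrix} H_{\alpha_+ \alpha_+} & H_{\alpha_+ \alpha_0^P} & H_{\alpha_+ \alpha_0^D} & 0 \\ \sim & H_{\alpha_0^P \alpha_0^P} & 0 & 0 \\ \sim & \sim & 0 & 0 \\ \sim & \sim & \sim & 0 \end{bmatrix} \;\middle|\; \mathcal{P} H = 0,\ H_{\alpha_0^P \alpha_0^P} \succeq 0 \right\}, \tag{27a} \]
\[ \mathcal{C}_D(\bar{Z}) := \left\{ H = \begin{bmatrix} 0 & 0 & 0 & 0 \\ \sim & 0 & 0 & H_{\alpha_0^P \alpha_-} \\ \sim & \sim & H_{\alpha_0^D \alpha_0^D} & H_{\alpha_0^D \alpha_-} \\ \sim & \sim & \sim & H_{\alpha_- \alpha_-} \end{bmatrix} \;\middle|\; \mathcal{P}^\perp H = 0,\ H_{\alpha_0^D \alpha_0^D} \preceq 0 \right\}. \tag{27b} \]

Proof. (1) follows directly from Lemma 1 and [6, Proposition 4.23].

(2) For ease of notation, denote $\Pi'_+(\bar{Z}; \cdot)$ (resp. $\Pi'_-(\bar{Z}; \cdot)$) as $\Omega(\cdot)$ (resp. $\Omega^\perp(\cdot)$). Then from (26), $H \in \mathcal{C}(\bar{Z})$ if and only if $\mathcal{P}\, \Omega(H) = 0$ and $\mathcal{P}^\perp \Omega^\perp(H) = 0$.

(i) We first prove $H_{\alpha_+ \alpha_-} = 0$ for all $H \in \mathcal{C}(\bar{Z})$. Notice that

\[ \langle \Omega(H), \Omega^\perp(H) \rangle = \langle \mathcal{P}\, \Omega(H), \mathcal{P}\, \Omega^\perp(H) \rangle + \langle \mathcal{P}^\perp \Omega(H), \mathcal{P}^\perp \Omega^\perp(H) \rangle = 0. \]

On the other hand, by (10) and (14),

\[ \langle \Omega(H), \Omega^\perp(H) \rangle = 2 \sum_{a \in \mathcal{I}_+,\, b \in \mathcal{I}_-} \frac{\mu_a \cdot (-\mu_b)}{(\mu_a - \mu_b)^2} \|H_{\alpha_a \alpha_b}\|_F^2. \]

Since every coefficient is strictly positive, $H_{\alpha_a \alpha_b} = 0$ for all $a \in \mathcal{I}_+$, $b \in \mathcal{I}_-$, i.e., $H_{\alpha_+ \alpha_-} = 0$.

(ii) We next prove $H_{\alpha_0^P \alpha_0^D} = 0$ for all $H \in \mathcal{C}(\bar{Z})$. Since $\Omega(H)_{\alpha_- \alpha_-} = 0$ from Theorem 1, we get $\langle \Omega(H), \bar{S} \rangle = 0$. Together with $\mathcal{P}\, \Omega(H) = 0$ and $S_{sc} \in \mathcal{S}^\star$, we have $\langle \Omega(H), S_{sc} \rangle = 0$ from Lemma 2. On the other hand, since

\[ \Omega(H) = \begin{bmatrix} H_{\alpha_+ \alpha_+} & H_{\alpha_+ \alpha_0^P} & H_{\alpha_+ \alpha_0^D} & H_{\alpha_+ \alpha_-} \\ \sim & [\Pi_+(H_{\alpha_0 \alpha_0})]_{\beta_{0,P} \beta_{0,P}} & [\Pi_+(H_{\alpha_0 \alpha_0})]_{\beta_{0,P} \beta_{0,D}} & 0 \\ \sim & \sim & [\Pi_+(H_{\alpha_0 \alpha_0})]_{\beta_{0,D} \beta_{0,D}} & 0 \\ \sim & \sim & \sim & 0 \end{bmatrix}, \quad S_{sc} = \begin{bmatrix} 0 & 0 & 0 & 0 \\ \sim & 0 & 0 & 0 \\ \sim & \sim & [S_{sc}]_{\alpha_0^D \alpha_0^D} & [S_{sc}]_{\alpha_0^D \alpha_-} \\ \sim & \sim & \sim & [S_{sc}]_{\alpha_- \alpha_-} \end{bmatrix}, \]

we have

\[ \langle \Omega(H), S_{sc} \rangle = \big\langle [\Pi_+(H_{\alpha_0 \alpha_0})]_{\beta_{0,D} \beta_{0,D}},\ [S_{sc}]_{\alpha_0^D \alpha_0^D} \big\rangle = 0. \]

Since $[S_{sc}]_{\alpha_0^D \alpha_0^D} \succ 0$ and $[\Pi_+(H_{\alpha_0 \alpha_0})]_{\beta_{0,D} \beta_{0,D}} \succeq 0$, we get $[\Pi_+(H_{\alpha_0 \alpha_0})]_{\beta_{0,D} \beta_{0,D}} = 0$. This further implies $[\Pi_+(H_{\alpha_0 \alpha_0})]_{\beta_{0,P} \beta_{0,D}} = 0$. Symmetrically, $[\Pi_-(H_{\alpha_0 \alpha_0})]_{\beta_{0,P} \beta_{0,D}} = 0$. Thus,

\[ H_{\alpha_0^P \alpha_0^D} = [H_{\alpha_0 \alpha_0}]_{\beta_{0,P} \beta_{0,D}} = [\Pi_+(H_{\alpha_0 \alpha_0})]_{\beta_{0,P} \beta_{0,D}} + [\Pi_-(H_{\alpha_0 \alpha_0})]_{\beta_{0,P} \beta_{0,D}} = 0. \]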
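(iii) From (i) and (ii), every $H \in \mathcal{C}(\bar{Z})$ is of the following form:

\[ H = \underbrace{\begin{bmatrix} H_{\alpha_+ \alpha_+} & H_{\alpha_+ \alpha_0^P} & H_{\alpha_+ \alpha_0^D} & 0 \\ \sim & H_{\alpha_0^P \alpha_0^P} & 0 & 0 \\ \sim & \sim & 0 & 0 \\ \sim & \sim & \sim & 0 \end{bmatrix}}_{=:\, U} + \underbrace{\begin{bmatrix} 0 & 0 & 0 & 0 \\ \sim & 0 & 0 & H_{\alpha_0^P \alpha_-} \\ \sim & \sim & H_{\alpha_0^D \alpha_0^D} & H_{\alpha_0^D \alpha_-} \\ \sim & \sim & \sim & H_{\alpha_- \alpha_-} \end{bmatrix}}_{=:\, V}, \quad \text{with } U_{\alpha_0^P \alpha_0^P} \succeq 0,\ V_{\alpha_0^D \alpha_0^D} \preceq 0. \]

Moreover, $\mathcal{P}\, \Omega(H) = \mathcal{P} U = 0$ and $\mathcal{P}^\perp \Omega^\perp(H) = \mathcal{P}^\perp V = 0$. Thus, $U \in \mathcal{C}_P(\bar{Z})$ and $V \in \mathcal{C}_D(\bar{Z})$. This proves the "$\subseteq$" part. For the "$\supseteq$" part: take any $U \in \mathcal{C}_P(\bar{Z})$, $V \in \mathcal{C}_D(\bar{Z})$. Then,

\[ \mathcal{P}\, \Omega(U + V) = \mathcal{P} U = 0, \quad \mathcal{P}^\perp \Omega^\perp(U + V) = \mathcal{P}^\perp V = 0, \]

which completes the proof.

4.3.1 Relationships between $\mathcal{C}(\bar{Z})$ and $\mathcal{T}_{\mathcal{Z}^\star}(\bar{Z})$

The cone $\mathcal{C}(\bar{Z})$ consists of directions $H$ along which $\delta(\bar{Z} + tH)$ (the backward error [47], i.e., the KKT residual) vanishes to first order, whereas $\mathcal{T}_{\mathcal{Z}^\star}(\bar{Z})$ consists of directions $H$ along which $\mathrm{dist}(\bar{Z} + tH, \mathcal{Z}^\star)$ (the forward error, i.e., the distance to the optimal set) vanishes to first order. As we show below, these two cones are closely related.

Proposition 3 (Structure of $\mathcal{T}_{\mathcal{Z}^\star}(\bar{Z})$). Under Assumption 1,

1. $\mathcal{T}_{\mathcal{Z}^\star}(\bar{Z}) = \mathcal{T}_{\mathcal{X}^\star}(\bar{X}) - \mathcal{T}_{\mathcal{S}^\star}(\bar{S})$, where

\[ \mathcal{T}_{\mathcal{X}^\star}(\bar{X}) = \left\{ H = \begin{bmatrix} \begin{bmatrix} H_{\alpha_+ \alpha_+} & H_{\alpha_+ \alpha_0^P} \\ \sim & H_{\alpha_0^P \alpha_0^P} \end{bmatrix} & 0 \\ \sim & 0 \end{bmatrix} \;\middle|\; \mathcal{P} H = 0,\ H_{\alpha_0^P \alpha_0^P} \succeq 0 \right\}, \tag{28a} \]
\[ \mathcal{T}_{\mathcal{S}^\star}(\bar{S}) = \left\{ H = \begin{bmatrix} 0 & 0 \\ \sim & \begin{bmatrix} H_{\alpha_0^D \alpha_0^D} & H_{\alpha_0^D \alpha_-} \\ \sim & H_{\alpha_- \alpha_-} \end{bmatrix} \end{bmatrix} \;\middle|\; \mathcal{P}^\perp H = 0,\ H_{\alpha_0^D \alpha_0^D} \succeq 0 \right\}. \tag{28b} \]

2. $\mathcal{T}_{\mathcal{Z}^\star}(\bar{Z}) = \mathcal{C}(\bar{Z}) \cap \{ H \in \mathbb{S}^n \mid H_{\alpha_+ \alpha_0^D} = 0,\ H_{\alpha_0^P \alpha_-} = 0 \}$.

Proof. (1) We first calculate $\mathcal{T}_{\mathcal{X}^\star}(\bar{X})$.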
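Regularize $\mathcal{X}^\star$ to its affine hull: picking $(X_{sc}, S_{sc})$ as a maximal-rank primal–dual optimal solution pair,

\begin{align*}
\mathcal{X}^\star &= \{ X \mid X \in \mathbb{S}^n_+ \} \cap \{ X \mid \mathcal{P} X = \mathcal{P} \tilde{X},\ \langle X, S_{sc} \rangle = 0 \} \\
&= \underbrace{\{ X \mid X \in \mathbb{S}^n_+,\ X_{\alpha_P \alpha_D} = 0,\ X_{\alpha_D \alpha_D} = 0 \}}_{=:\, \mathcal{C}_1} \cap \underbrace{\{ X \mid \mathcal{P} X = \mathcal{P} \tilde{X} \}}_{=:\, \mathcal{C}_2}.
\end{align*}

Since $X_{sc} \in \mathrm{ri}(\mathcal{C}_1) \cap \mathrm{ri}(\mathcal{C}_2)$, by [41, Theorem 6.42],

\begin{align*}
\mathcal{T}_{\mathcal{X}^\star}(\bar{X}) &= \mathcal{T}_{\mathcal{C}_1}(\bar{X}) \cap \mathcal{T}_{\mathcal{C}_2}(\bar{X}) \\
&= \{ H \mid H_{\alpha_P \alpha_P} \in \mathcal{T}_{\mathbb{S}_+^{|\alpha_P|}}(\bar{X}_{\alpha_P \alpha_P}),\ \mathcal{P} H = 0,\ H_{\alpha_P \alpha_D} = H_{\alpha_D \alpha_D} = 0 \} \\
&= \{ H \mid H_{\alpha_0^P \alpha_0^P} \succeq 0,\ \mathcal{P} H = 0,\ H_{\alpha_P \alpha_D} = H_{\alpha_D \alpha_D} = 0 \}.
\end{align*}

Symmetrically, $\mathcal{T}_{\mathcal{S}^\star}(\bar{S})$ is of the form in (28). We notice that $\mathcal{T}_{\mathcal{X}^\star}(\bar{X}) \cap \mathcal{T}_{\mathcal{S}^\star}(\bar{S}) = \{0\}$. Thus, via [7, Corollary 4.8 (v)] and [41, Exercise 6.44],

\[ \mathcal{T}_{\mathcal{Z}^\star}(\bar{Z}) = \mathrm{cl}(\mathcal{T}_{\mathcal{X}^\star}(\bar{X}) - \mathcal{T}_{\mathcal{S}^\star}(\bar{S})) = \mathcal{T}_{\mathcal{X}^\star}(\bar{X}) - \mathcal{T}_{\mathcal{S}^\star}(\bar{S}). \]

(2) Denote $\{ H \in \mathbb{S}^n \mid H_{\alpha_+ \alpha_0^D} = 0,\ H_{\alpha_0^P \alpha_-} = 0 \}$ as $\mathcal{M}$. Suppose $H \in \mathcal{C}(\bar{Z}) \cap \mathcal{M}$. Then, from Proposition 2 (2), $H = U + V$ with $\mathcal{P} U = 0$ and $\mathcal{P}^\perp V = 0$, where

\[ U = \begin{bmatrix} U_{\alpha_+ \alpha_+} & U_{\alpha_+ \alpha_0^P} & 0 & 0 \\ \sim & U_{\alpha_0^P \alpha_0^P} \succeq 0 & 0 & 0 \\ \sim & \sim & 0 & 0 \\ \sim & \sim & \sim & 0 \end{bmatrix}, \quad V = \begin{bmatrix} 0 & 0 & 0 & 0 \\ \sim & 0 & 0 & 0 \\ \sim & \sim & V_{\alpha_0^D \alpha_0^D} \preceq 0 & V_{\alpha_0^D \alpha_-} \\ \sim & \sim & \sim & V_{\alpha_- \alpha_-} \end{bmatrix}. \]

Thus, $U \in \mathcal{T}_{\mathcal{X}^\star}(\bar{X})$ and $V \in -\mathcal{T}_{\mathcal{S}^\star}(\bar{S})$. This proves the "$\supseteq$" inclusion $\mathcal{C}(\bar{Z}) \cap \mathcal{M} \subseteq \mathcal{T}_{\mathcal{Z}^\star}(\bar{Z})$. For the reverse inclusion, take any $U \in \mathcal{T}_{\mathcal{X}^\star}(\bar{X})$, $V \in -\mathcal{T}_{\mathcal{S}^\star}(\bar{S})$. Then, $U \in \mathcal{C}_P(\bar{Z}) \cap \mathcal{M}$ and $V \in \mathcal{C}_D(\bar{Z}) \cap \mathcal{M}$. Thus, $U + V \in (\mathcal{C}_P(\bar{Z}) + \mathcal{C}_D(\bar{Z})) \cap \mathcal{M} = \mathcal{C}(\bar{Z}) \cap \mathcal{M}$.

There are several special scenarios in which $\mathcal{C}(\bar{Z}) = \mathcal{T}_{\mathcal{Z}^\star}(\bar{Z})$.

Corollary 1 (Special cases when $\mathcal{C}(\bar{Z}) = \mathcal{T}_{\mathcal{Z}^\star}(\bar{Z})$). Under Assumption 1, $\mathcal{C}(\bar{Z}) = \mathcal{T}_{\mathcal{Z}^\star}(\bar{Z})$ under any of the following conditions:

1. $\bar{X}$ satisfies primal constraint nondegeneracy and $\bar{S}$ satisfies dual constraint nondegeneracy.
2. $(\bar{X}, \bar{S})$ is a strictly complementary solution pair.
3. The linear growth condition holds locally, i.e., there exist $\gamma > 0$ and $r > 0$ such that $\gamma \, \mathrm{dist}(Z, \mathcal{Z}^\star) \le \|\delta(Z)\|_F$ for all $Z \in B_r(\bar{Z})$.

Proof. (1) Under the two-sided nondegeneracy conditions, [24, Theorem 5] has already proven $\mathrm{Fix}(\mathrm{Id} + \delta'(\bar{Z}; \cdot)) = \{0\}$. Thus, $\mathcal{C}(\bar{Z}) = \mathcal{T}_{\mathcal{Z}^\star}(\bar{Z}) = \{0\}$.

(2) When $(\bar{X}, \bar{S})$ is of maximal rank, $H_{\alpha_P \alpha_D} = H_{\alpha_+ \alpha_-} = 0$ for any $H \in \mathcal{C}(\bar{Z})$. Thus, $H_{\alpha_+ \alpha_0^D} = 0$ and $H_{\alpha_0^P \alpha_-} = 0$ hold naturally. By Proposition 3 (2), $\mathcal{C}(\bar{Z}) = \mathcal{T}_{\mathcal{Z}^\star}(\bar{Z})$.

(3) We argue by contradiction. Suppose $\mathcal{T}_{\mathcal{Z}^\star}(\bar{Z}) \subsetneq \mathcal{C}(\bar{Z})$ and pick $H \in \mathcal{C}(\bar{Z}) \setminus \mathcal{T}_{\mathcal{Z}^\star}(\bar{Z})$. Since $\mathcal{Z}^\star$ is closed and convex, there exists a nonzero $V \in \mathcal{N}_{\mathcal{Z}^\star}(\bar{Z})$ such that $\mathcal{Z}^\star \subseteq \{ Z \mid \langle V, Z - \bar{Z} \rangle \le 0 \} =: \mathcal{H}$. Since $H \notin \mathcal{T}_{\mathcal{Z}^\star}(\bar{Z})$, we have $\langle V, H \rangle > 0$ and

\[ \mathrm{dist}(\bar{Z} + tH, \mathcal{Z}^\star) \ge \mathrm{dist}(\bar{Z} + tH, \mathcal{H}) = t \, \frac{\langle V, H \rangle}{\|V\|_F} \quad \text{for all } t > 0. \]

Thus, by the local linear growth condition, for all $0 < t \le \frac{r}{\|H\|_F}$,

\[ \|\delta(\bar{Z} + tH)\|_F \ge \gamma \, \mathrm{dist}(\bar{Z} + tH, \mathcal{Z}^\star) \ge \gamma \cdot t \, \frac{\langle V, H \rangle}{\|V\|_F} \implies \frac{\|\delta(\bar{Z} + tH)\|_F}{t} \ge \gamma \cdot \frac{\langle V, H \rangle}{\|V\|_F} > 0. \]

On the other hand, since $H \in \mathcal{C}(\bar{Z})$, $\lim_{t \downarrow 0} \frac{\|\delta(\bar{Z} + tH)\|_F}{t} = 0$, which leads to a contradiction.

It turns out that all three sufficient conditions in Corollary 1 for $\mathcal{T}_{\mathcal{Z}^\star}(\bar{Z}) = \mathcal{C}(\bar{Z})$ are closely tied to local linear convergence of ADMM for SDPs. Indeed, if we additionally assume that $\bar{Z}$ is the limiting point of the ADMM iterates, then: (1) local linear convergence under two-sided nondegeneracy can be derived from [10, 19]; (2) local linear convergence of ADMM for SDPs under strict complementarity at the limiting KKT point has been established recently in [24]; (3) a local linear growth condition is known to guarantee local linear convergence for a broad class of primal–dual splitting methods in more general nonsmooth convex optimization settings [19, 58].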
In the convex quadratic SDP setting, the local linear growth condition is also closely related to metric subregularity of the KKT operator [12, Theorem 3.2], which is generally difficult to characterize. On the other hand, as we will see in § 6, any direction $H \in \mathcal{C}(\bar Z) \setminus \mathcal{T}_{\mathcal{Z}_\star}(\bar Z)$ can lead to a second-order dominant phenomenon. Such directions are easy to construct even for small-scale SDPs involving multiple KKT points, as illustrated by the examples in § 10. Based on these observations, we conjecture that $\mathcal{C}(\bar Z) = \mathcal{T}_{\mathcal{Z}_\star}(\bar Z)$ is a necessary condition for local linear convergence, given the additional assumption that $\bar Z$ is the limit point of the one-step ADMM iterates.

4.4 Second-Order Limit Map $\phi(\bar Z; \cdot)$

In this section, we develop the core concepts of the paper: the local second-order limit map and its induced local second-order limit dynamics. By Lemma 1, the local first-order dynamics (24) eventually vanishes and drives the iterates toward $\mathcal{C}(\bar Z)$ for any initialization $H^{(0)} \in \mathbb{S}^n$. In contrast, the true one-step ADMM dynamics (4) need not vanish. This motivates us to investigate the local second-order dynamics (25) in the regime where the first-order dynamics has stalled.

Fix an arbitrary $\bar H \in \mathcal{C}(\bar Z)$. Then, by (24) and Lemma 1, we have $H^{(k)} \equiv \bar H$ for all $k \in \mathbb{N}$ if $H^{(0)}$ is set to $\bar H$. Consequently, the local second-order dynamics (25) reduces to
$$W^{(k+1)} = (\mathrm{Id} + \delta''(\bar Z; \bar H, \cdot))(W^{(k)}) = W^{(k)} - \mathcal{P}\,\Pi''_+(\bar Z; \bar H, W^{(k)}) - \mathcal{P}^\perp \Pi''_-(\bar Z; \bar H, W^{(k)}). \tag{29}$$
As we will see later, a fundamental difference between the second-order dynamics (29) and the first-order dynamics (24) is that the second-order sequence $\{W^{(k)}\}$ in (29) need not converge.

4.4.1 Simplification of $\delta''(\bar Z; \bar H, W)$ under $\bar H \in \mathcal{C}(\bar Z)$

Since $\bar H \in \mathcal{C}(\bar Z)$, we have $\bar H_{\alpha_+ \alpha_-} = 0$ from Proposition 2.
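Therefore, $\Pi''_+(\bar Z; \bar H, W)$ in (12) and $\Pi''_-(\bar Z; \bar H, W)$ in (16) can be simplified as:
$$\Pi''_+(\bar Z; \bar H, W) = \begin{bmatrix} W_{\alpha_+\alpha_+} & \Big\{ W_{\alpha_a\alpha_0} - \tfrac{2}{\mu_a} \bar H_{\alpha_a\alpha_0} \Pi_+(-\bar H_{\alpha_0\alpha_0}) \Big\}_{a \in \mathcal{I}_+} & \Big\{ \tfrac{\mu_a}{\mu_a - \mu_b} W_{\alpha_a\alpha_b} + \tfrac{2}{\mu_a - \mu_b} \bar H_{\alpha_a\alpha_0} \bar H_{\alpha_0\alpha_b} \Big\}_{a \in \mathcal{I}_+,\, b \in \mathcal{I}_-} \\ \sim & 2 \sum_{c \in \mathcal{I}_+} \tfrac{1}{\mu_c} \bar H_{\alpha_0\alpha_c} \bar H_{\alpha_c\alpha_0} + \Pi'_+\big(\bar H_{\alpha_0\alpha_0}; V_0(\bar H, W)\big) & \Big\{ \tfrac{2}{-\mu_b} \Pi_+(\bar H_{\alpha_0\alpha_0}) \bar H_{\alpha_0\alpha_b} \Big\}_{b \in \mathcal{I}_-} \\ \sim & \sim & 0 \end{bmatrix}, \tag{30a}$$
$$\Pi''_-(\bar Z; \bar H, W) = \begin{bmatrix} 0 & \Big\{ \tfrac{2}{-\mu_a} \bar H_{\alpha_a\alpha_0} \Pi_-(\bar H_{\alpha_0\alpha_0}) \Big\}_{a \in \mathcal{I}_+} & \Big\{ \tfrac{-\mu_b}{\mu_a - \mu_b} W_{\alpha_a\alpha_b} + \tfrac{2}{\mu_b - \mu_a} \bar H_{\alpha_a\alpha_0} \bar H_{\alpha_0\alpha_b} \Big\}_{a \in \mathcal{I}_+,\, b \in \mathcal{I}_-} \\ \sim & 2 \sum_{c \in \mathcal{I}_-} \tfrac{1}{\mu_c} \bar H_{\alpha_0\alpha_c} \bar H_{\alpha_c\alpha_0} + \Pi'_-\big(\bar H_{\alpha_0\alpha_0}; V_0(\bar H, W)\big) & \Big\{ W_{\alpha_0\alpha_b} - \tfrac{2}{\mu_b} \Pi_-(-\bar H_{\alpha_0\alpha_0}) \bar H_{\alpha_0\alpha_b} \Big\}_{b \in \mathcal{I}_-} \\ \sim & \sim & W_{\alpha_-\alpha_-} \end{bmatrix}. \tag{30b}$$
We notice that $V_0(\bar H, W)$ is affine in $W$. Therefore, define $\widetilde W$ as:
$$\widetilde W_{\alpha_a\alpha_b} := \begin{cases} V_0 = W_{\alpha_0\alpha_0} - 2 \sum_{c \in \mathcal{I}_+ \cup \mathcal{I}_-} \tfrac{1}{\mu_c} \bar H_{\alpha_0\alpha_c} \bar H_{\alpha_c\alpha_0}, & a = 0,\ b = 0, \\ W_{\alpha_a\alpha_b}, & \text{otherwise}. \end{cases} \tag{31}$$
Define
$$\Upsilon(\bar Z; \bar H) := \begin{bmatrix} 0 & 0 & 0 \\ \sim & 2 \sum_{c \in \mathcal{I}_+ \cup \mathcal{I}_-} \tfrac{1}{\mu_c} \bar H_{\alpha_0\alpha_c} \bar H_{\alpha_c\alpha_0} & 0 \\ \sim & \sim & 0 \end{bmatrix}. \tag{32}$$
Then, $W - \widetilde W \equiv \Upsilon(\bar Z; \bar H)$.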
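Furthermore, define
$$\Theta(\bar Z; \bar H, \widetilde W) := \begin{bmatrix} \widetilde W_{\alpha_+\alpha_+} & \widetilde W_{\alpha_+\alpha_0} & \Big\{ \tfrac{\mu_a}{\mu_a - \mu_b} \widetilde W_{\alpha_a\alpha_b} \Big\}_{a \in \mathcal{I}_+,\, b \in \mathcal{I}_-} \\ \sim & \Pi'_+(\bar H_{\alpha_0\alpha_0}; \widetilde W_{\alpha_0\alpha_0}) & 0 \\ \sim & \sim & 0 \end{bmatrix}, \tag{33a}$$
$$\Theta^\perp(\bar Z; \bar H, \widetilde W) := \begin{bmatrix} 0 & 0 & \Big\{ \tfrac{-\mu_b}{\mu_a - \mu_b} \widetilde W_{\alpha_a\alpha_b} \Big\}_{a \in \mathcal{I}_+,\, b \in \mathcal{I}_-} \\ \sim & \Pi'_-(\bar H_{\alpha_0\alpha_0}; \widetilde W_{\alpha_0\alpha_0}) & \widetilde W_{\alpha_0\alpha_-} \\ \sim & \sim & \widetilde W_{\alpha_-\alpha_-} \end{bmatrix}, \tag{33b}$$
and
$$\mathcal{E}(\bar Z; \bar H) := \begin{bmatrix} 0 & \Big\{ -\tfrac{2}{\mu_a} \bar H_{\alpha_a\alpha_0} \Pi_+(-\bar H_{\alpha_0\alpha_0}) \Big\}_{a \in \mathcal{I}_+} & \Big\{ \tfrac{2}{\mu_a - \mu_b} \bar H_{\alpha_a\alpha_0} \bar H_{\alpha_0\alpha_b} \Big\}_{a \in \mathcal{I}_+,\, b \in \mathcal{I}_-} \\ \sim & 2 \sum_{c \in \mathcal{I}_+} \tfrac{1}{\mu_c} \bar H_{\alpha_0\alpha_c} \bar H_{\alpha_c\alpha_0} & \Big\{ \tfrac{2}{-\mu_b} \Pi_+(\bar H_{\alpha_0\alpha_0}) \bar H_{\alpha_0\alpha_b} \Big\}_{b \in \mathcal{I}_-} \\ \sim & \sim & 0 \end{bmatrix}, \tag{34a}$$
$$\mathcal{E}^\perp(\bar Z; \bar H) := \begin{bmatrix} 0 & \Big\{ \tfrac{2}{-\mu_a} \bar H_{\alpha_a\alpha_0} \Pi_-(\bar H_{\alpha_0\alpha_0}) \Big\}_{a \in \mathcal{I}_+} & \Big\{ \tfrac{2}{\mu_b - \mu_a} \bar H_{\alpha_a\alpha_0} \bar H_{\alpha_0\alpha_b} \Big\}_{a \in \mathcal{I}_+,\, b \in \mathcal{I}_-} \\ \sim & 2 \sum_{c \in \mathcal{I}_-} \tfrac{1}{\mu_c} \bar H_{\alpha_0\alpha_c} \bar H_{\alpha_c\alpha_0} & \Big\{ -\tfrac{2}{\mu_b} \Pi_-(-\bar H_{\alpha_0\alpha_0}) \bar H_{\alpha_0\alpha_b} \Big\}_{b \in \mathcal{I}_-} \\ \sim & \sim & 0 \end{bmatrix}. \tag{34b}$$
For all $\bar H \in \mathcal{C}(\bar Z)$ and $\widetilde W \in \mathbb{S}^n$, the following relationships hold:
$$\Pi''_+(\bar Z; \bar H, W) = \Theta(\bar Z; \bar H, \widetilde W) + \mathcal{E}(\bar Z; \bar H), \tag{35a}$$
$$\Pi''_-(\bar Z; \bar H, W) = \Theta^\perp(\bar Z; \bar H, \widetilde W) + \mathcal{E}^\perp(\bar Z; \bar H), \tag{35b}$$
$$\Theta(\bar Z; \bar H, \widetilde W) + \Theta^\perp(\bar Z; \bar H, \widetilde W) = \widetilde W, \tag{35c}$$
$$\mathcal{E}(\bar Z; \bar H) + \mathcal{E}^\perp(\bar Z; \bar H) = \Upsilon(\bar Z; \bar H). \tag{35d}$$
Now we are ready to simplify $\delta''(\bar Z; \bar H, W)$:
$$\delta''(\bar Z; \bar H, W) = -\mathcal{P}\,\Pi''_+(\bar Z; \bar H, W) - \mathcal{P}^\perp \Pi''_-(\bar Z; \bar H, W) = -\mathcal{P}\,\Theta(\bar Z; \bar H, \widetilde W) - \mathcal{P}^\perp \Theta^\perp(\bar Z; \bar H, \widetilde W) - \mathcal{P}\,\mathcal{E}(\bar Z; \bar H) - \mathcal{P}^\perp \mathcal{E}^\perp(\bar Z; \bar H), \tag{36}$$
where $\Theta(\bar Z; \bar H, \widetilde W)$, $\Theta^\perp(\bar Z; \bar H, \widetilde W)$ are defined in (33) and $\mathcal{E}(\bar Z; \bar H)$, $\mathcal{E}^\perp(\bar Z; \bar H)$ are defined in (34).

4.4.2 $W^{(k+1)} - W^{(k)}$ is convergent

From (36) and (31),
$$\widetilde W^{(k+1)} - \widetilde W^{(k)} = W^{(k+1)} - W^{(k)} = -\mathcal{P}\,\Theta(\bar Z; \bar H, \widetilde W^{(k)}) - \mathcal{P}^\perp \Theta^\perp(\bar Z; \bar H, \widetilde W^{(k)}) - \mathcal{P}\,\mathcal{E}(\bar Z; \bar H) - \mathcal{P}^\perp \mathcal{E}^\perp(\bar Z; \bar H).$$
From (35), $\widetilde W^{(k)} = \Theta(\bar Z; \bar H, \widetilde W^{(k)}) + \Theta^\perp(\bar Z; \bar H, \widetilde W^{(k)})$.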
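Thus,
$$\widetilde W^{(k+1)} = \Big\{ \mathcal{P}^\perp \Theta(\bar Z; \bar H, \widetilde W^{(k)}) + \mathcal{P}\,\Theta^\perp(\bar Z; \bar H, \widetilde W^{(k)}) \Big\} + \underbrace{\Big( -\mathcal{P}\,\mathcal{E}(\bar Z; \bar H) - \mathcal{P}^\perp \mathcal{E}^\perp(\bar Z; \bar H) \Big)}_{=:\, \Psi(\bar Z; \bar H)}. \tag{37}$$
From now on, suppose $\bar H$ follows the second-level description in § 3.

Lemma 3 (Firm nonexpansiveness of $\mathcal{P}^\perp \Theta(\bar Z; \bar H, \cdot) + \mathcal{P}\,\Theta^\perp(\bar Z; \bar H, \cdot)$). The operator $\mathcal{P}^\perp \Theta(\bar Z; \bar H, \cdot) + \mathcal{P}\,\Theta^\perp(\bar Z; \bar H, \cdot)$ in (37) is firmly nonexpansive on $(\mathbb{S}^n, \langle \cdot, \cdot \rangle)$.

Proof. The proof is similar to that of Lemma 1. For ease of notation, we abbreviate $\mathcal{P}^\perp \Theta(\bar Z; \bar H, \cdot) + \mathcal{P}\,\Theta^\perp(\bar Z; \bar H, \cdot)$ as $\mathcal{T}(\cdot)$, $\Theta(\bar Z; \bar H, \cdot)$ as $\mathcal{F}(\cdot)$, and $\Theta^\perp(\bar Z; \bar H, \cdot)$ as $\mathcal{F}^\perp(\cdot)$:
$$\begin{aligned} &\|\mathcal{T}(U) - \mathcal{T}(V)\|_F^2 + \|(\mathrm{Id} - \mathcal{T})(U) - (\mathrm{Id} - \mathcal{T})(V)\|_F^2 \\ &= \|\mathcal{P}^\perp[\mathcal{F}(U) - \mathcal{F}(V)]\|_F^2 + \|\mathcal{P}[\mathcal{F}^\perp(U) - \mathcal{F}^\perp(V)]\|_F^2 + \|\mathcal{P}[\mathcal{F}(U) - \mathcal{F}(V)]\|_F^2 + \|\mathcal{P}^\perp[\mathcal{F}^\perp(U) - \mathcal{F}^\perp(V)]\|_F^2 \\ &= \|\mathcal{F}(U) - \mathcal{F}(V)\|_F^2 + \|\mathcal{F}^\perp(U) - \mathcal{F}^\perp(V)\|_F^2 \\ &= \|U - V\|_F^2 - 2 \big\langle \mathcal{F}(U) - \mathcal{F}(V),\ \mathcal{F}^\perp(U) - \mathcal{F}^\perp(V) \big\rangle. \end{aligned}$$
Thus, all we need to show is $\langle \mathcal{F}(U) - \mathcal{F}(V), \mathcal{F}^\perp(U) - \mathcal{F}^\perp(V) \rangle \ge 0$. We have
$$\big\langle \mathcal{F}(U) - \mathcal{F}(V),\ \mathcal{F}^\perp(U) - \mathcal{F}^\perp(V) \big\rangle = \underbrace{2 \sum_{a \in \mathcal{I}_+,\, b \in \mathcal{I}_-} \frac{\mu_a}{\mu_a - \mu_b} \cdot \frac{-\mu_b}{\mu_a - \mu_b} \|U_{\alpha_a\alpha_b} - V_{\alpha_a\alpha_b}\|_F^2}_{\ge 0} + \underbrace{\big\langle \Pi'_+(\bar H_{\alpha_0\alpha_0}; U_{\alpha_0\alpha_0}) - \Pi'_+(\bar H_{\alpha_0\alpha_0}; V_{\alpha_0\alpha_0}),\ \Pi'_-(\bar H_{\alpha_0\alpha_0}; U_{\alpha_0\alpha_0}) - \Pi'_-(\bar H_{\alpha_0\alpha_0}; V_{\alpha_0\alpha_0}) \big\rangle}_{=:\, \mathrm{LHS}},$$
so it boils down to proving $\mathrm{LHS} \ge 0$.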
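Abbreviate $\widehat U$ as $(Q^0)^\top U_{\alpha_0\alpha_0} Q^0$ and $\widehat V$ as $(Q^0)^\top V_{\alpha_0\alpha_0} Q^0$:
$$\Pi'_+(\bar H_{\alpha_0\alpha_0}; U_{\alpha_0\alpha_0}) = Q^0 \begin{bmatrix} \widehat U_{\beta_{0,+}\beta_{0,+}} & \widehat U_{\beta_{0,+}\beta_{0,0}} & \Big\{ \tfrac{\eta_{0,i}}{\eta_{0,i} - \eta_{0,j}} \widehat U_{\beta_{0,i}\beta_{0,j}} \Big\}_{i \in \mathcal{I}_{0,+},\, j \in \mathcal{I}_{0,-}} \\ \sim & \Pi_+(\widehat U_{\beta_{0,0}\beta_{0,0}}) & 0 \\ \sim & \sim & 0 \end{bmatrix} (Q^0)^\top,$$
$$\Pi'_-(\bar H_{\alpha_0\alpha_0}; V_{\alpha_0\alpha_0}) = Q^0 \begin{bmatrix} 0 & 0 & \Big\{ \tfrac{-\eta_{0,j}}{\eta_{0,i} - \eta_{0,j}} \widehat V_{\beta_{0,i}\beta_{0,j}} \Big\}_{i \in \mathcal{I}_{0,+},\, j \in \mathcal{I}_{0,-}} \\ \sim & \Pi_-(\widehat V_{\beta_{0,0}\beta_{0,0}}) & \widehat V_{\beta_{0,0}\beta_{0,-}} \\ \sim & \sim & \widehat V_{\beta_{0,-}\beta_{0,-}} \end{bmatrix} (Q^0)^\top.$$
Thus,
$$\begin{aligned} \mathrm{LHS} &= \underbrace{2 \sum_{i \in \mathcal{I}_{0,+},\, j \in \mathcal{I}_{0,-}} \frac{\eta_{0,i}}{\eta_{0,i} - \eta_{0,j}} \cdot \frac{-\eta_{0,j}}{\eta_{0,i} - \eta_{0,j}} \|\widehat U_{\beta_{0,i}\beta_{0,j}} - \widehat V_{\beta_{0,i}\beta_{0,j}}\|_F^2}_{\ge 0} + \big\langle \Pi_+(\widehat U_{\beta_{0,0}\beta_{0,0}}) - \Pi_+(\widehat V_{\beta_{0,0}\beta_{0,0}}),\ \Pi_-(\widehat U_{\beta_{0,0}\beta_{0,0}}) - \Pi_-(\widehat V_{\beta_{0,0}\beta_{0,0}}) \big\rangle \\ &\ge -\big\langle \Pi_+(\widehat U_{\beta_{0,0}\beta_{0,0}}),\ \Pi_-(\widehat V_{\beta_{0,0}\beta_{0,0}}) \big\rangle - \big\langle \Pi_+(\widehat V_{\beta_{0,0}\beta_{0,0}}),\ \Pi_-(\widehat U_{\beta_{0,0}\beta_{0,0}}) \big\rangle \ge 0, \end{aligned}$$
where the first inequality uses $\langle \Pi_+(A), \Pi_-(A) \rangle = 0$ for any symmetric $A$, and the second uses that the inner product of a positive semidefinite matrix with a negative semidefinite matrix is nonpositive. This concludes the proof.

One may have already noticed that the operator $\mathcal{P}^\perp \Theta(\bar Z; \bar H, \cdot) + \mathcal{P}\,\Theta^\perp(\bar Z; \bar H, \cdot)$ in (37) closely resembles $\mathrm{Id} + \delta'(\bar Z; \cdot)$ in Lemma 1. For instance, $\mathcal{P}^\perp \Theta(\bar Z; \bar H, \cdot) + \mathcal{P}\,\Theta^\perp(\bar Z; \bar H, \cdot)$ is also positively homogeneous, and hence $0 \in \mathrm{Fix}(\mathcal{P}^\perp \Theta(\bar Z; \bar H, \cdot) + \mathcal{P}\,\Theta^\perp(\bar Z; \bar H, \cdot))$. The essential difference between the local first- and second-order dynamics, however, lies in the presence of the "constant term" $\Psi(\bar Z; \bar H)$.

Theorem 5 (Convergent $\widetilde W^{(k+1)} - \widetilde W^{(k)}$). For the dynamical system (37),
$$\widetilde W^{(k+1)} - \widetilde W^{(k)} \to \phi(\bar Z; \bar H) := \Psi(\bar Z; \bar H) - \Pi_{\mathcal{K}(\bar Z; \bar H)}(\Psi(\bar Z; \bar H)) = \Pi_{\mathcal{K}^\circ(\bar Z; \bar H)}(\Psi(\bar Z; \bar H)) \quad \text{as } k \to \infty, \tag{39}$$
where the closed convex cone $\mathcal{K}(\bar Z; \bar H)$ is defined as
$$\mathcal{K}(\bar Z; \bar H) := \mathrm{cl}\Big( \big\{ \mathcal{P}\,\Theta(\bar Z; \bar H, W) + \mathcal{P}^\perp \Theta^\perp(\bar Z; \bar H, W) \,\big|\, W \in \mathbb{S}^n \big\} \Big) \tag{40}$$
and its polar cone is
$$\mathcal{K}^\circ(\bar Z; \bar H) := \big\{ Y \in \mathbb{S}^n \,\big|\, \big\langle \mathcal{P}\,\Theta(\bar Z; \bar H, W) + \mathcal{P}^\perp \Theta^\perp(\bar Z; \bar H, W),\ Y \big\rangle \le 0,\ \forall W \in \mathbb{S}^n \big\}. \tag{41}$$

Proof. This is a standard result from monotone operator theory. For ease of notation, abbreviate the operator $\mathcal{P}^\perp \Theta(\bar Z; \bar H, \cdot) + \mathcal{P}\,\Theta^\perp(\bar Z; \bar H, \cdot)$ as $\mathcal{T}(\cdot)$, $\Psi(\bar Z; \bar H)$ as $\Psi$, and $\mathcal{K}(\bar Z; \bar H)$ as $\mathcal{K}$. Since $\mathcal{T}$ is firmly nonexpansive and positively homogeneous, $\mathcal{K} = \mathrm{cl}(\mathrm{ran}(\mathrm{Id} - \mathcal{T}))$ is a nonempty, closed, and convex cone by [39, Lemma 4]. Denote $\mathcal{S}(\cdot) := \mathcal{T}(\cdot) + \Psi$. Since $\mathcal{T}$ is firmly nonexpansive on $(\mathbb{S}^n, \langle \cdot, \cdot \rangle)$ and $\Psi$ is a constant drift, $\mathcal{S}$ is also firmly nonexpansive on $(\mathbb{S}^n, \langle \cdot, \cdot \rangle)$, yet it may not have any fixed point. On the other hand, from [4, Corollary 2.3], we have the following weaker result for the dynamical system $\widetilde W^{(k+1)} = \mathcal{S}(\widetilde W^{(k)})$ in (37):
$$\widetilde W^{(k+1)} - \widetilde W^{(k)} \to -\Pi_{\mathrm{cl}(\mathrm{ran}(\mathrm{Id} - \mathcal{S}))}(0) \quad \text{as } k \to \infty.$$
Since $\mathrm{cl}(\mathrm{ran}(\mathrm{Id} - \mathcal{S})) = -\Psi + \mathrm{cl}(\mathrm{ran}(\mathrm{Id} - \mathcal{T})) = -\Psi + \mathcal{K}$, we get
$$\Pi_{\mathrm{cl}(\mathrm{ran}(\mathrm{Id} - \mathcal{S}))}(0) = \Pi_{-\Psi + \mathcal{K}}(0) = \Pi_{\mathcal{K}}(\Psi) - \Psi.$$
By the Moreau decomposition with respect to $\mathcal{K}$ and its polar cone, $\Psi = \Pi_{\mathcal{K}}(\Psi) + \Pi_{\mathcal{K}^\circ}(\Psi)$. The closure in $\mathcal{K}$ does not affect $\mathcal{K}^\circ$, since for an arbitrary convex cone $C$, $(\mathrm{cl}(C))^\circ = C^\circ$.

4.4.3 Local second-order limit dynamics

By Theorem 5, the increment $W^{(k+1)} - W^{(k)}$ eventually converges to a constant second-order "drift" $\phi(\bar Z; \bar H)$, and this limit is independent of the initialization $W^{(0)}$. From the viewpoint of time-scale separation, this convergence manifests as a second-order effect (scaled by $\tfrac{t^2}{2}$ with $t \downarrow 0$), whereas the evolution of $H^{(k)}$ is a first-order effect (scaled by $t$ with $t \downarrow 0$).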
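The limit in Theorem 5 can be checked on a small toy problem. The sketch below is not the SDP operator from (37): as an assumption made purely for illustration, it replaces $\mathcal{T}$ by the projection onto the nonnegative orthant of $\mathbb{R}^4$, which is firmly nonexpansive and positively homogeneous. In that setting $\mathcal{K} = \mathrm{cl}(\mathrm{ran}(\mathrm{Id} - \mathcal{T}))$ is the nonpositive orthant and $\mathcal{K}^\circ$ is the nonnegative orthant, so the predicted drift is the componentwise positive part of the constant term. The names `proj_cone`, `last_increment`, and the vector `psi` are hypothetical.

```python
import numpy as np


def proj_cone(w):
    """Toy stand-in for T: projection onto the nonnegative orthant."""
    return np.maximum(w, 0.0)


def last_increment(psi, w0, num_iters=500):
    """Iterate w <- T(w) + psi and return the final increment w_next - w."""
    w = np.asarray(w0, dtype=float)
    inc = np.zeros_like(w)
    for _ in range(num_iters):
        w_next = proj_cone(w) + psi
        inc = w_next - w
        w = w_next
    return inc


# Hypothetical constant drift term (plays the role of Psi).
psi = np.array([0.7, -0.3, 0.0, -1.2])

# Projection of psi onto the polar cone K° (here: the nonnegative orthant).
predicted = np.maximum(psi, 0.0)

rng = np.random.default_rng(0)
for trial in range(3):
    w0 = rng.normal(scale=5.0, size=psi.shape)  # arbitrary initialization
    print(trial, last_increment(psi, w0), predicted)
```

In this toy setting the iterates themselves diverge along the components where `psi` is positive, while the increments settle to the same vector for every initialization, which is the qualitative behavior Theorem 5 describes.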
Because of this separation of time scales, it is reasonable to assume that, by the time $W^{(k+1)} - W^{(k)}$ has converged, $Z^{(k)}$ remains unchanged to first order. Consequently, by (23), the limit dynamics after $W^{(k+1)} - W^{(k)} \to \phi(\bar Z; \bar H)$ is
$$Z^{(k+1)} = Z^{(k)} + \frac{t^2}{2} \phi(\bar Z; \bar H) + o(t^2).$$
The update above produces a ray in $\mathbb{S}^n$ with a constant second-order "drift" as $t \downarrow 0$. Moreover, $Z^{(k)} - \bar Z = t \bar H + o(t)$ for any fixed $k$. However, one must account for the cumulative effect of this drift over many iterations: when $k \sim O(\tfrac{1}{t})$, the accumulated second-order displacement can become non-negligible compared to the first-order term $t \bar H$. In this regime, the original separation of first- and second-order dynamics in Definition 1 and (37) may no longer be accurate, and the effective direction $\bar H$ may need to be re-identified because $t$ is fixed and positive. We address this issue by replacing the constant term $t \bar H$ with the "feedback" term $Z^{(k)} - \bar Z$. To this end, we use the following elementary scaling property of $\phi(\bar Z; \cdot)$.

Proposition 4 (Positive 2-homogeneity of $\phi(\bar Z; \cdot)$). For any $t > 0$ and any $\bar H \in \mathcal{C}(\bar Z)$, it holds that $\phi(\bar Z; t \bar H) = t^2 \phi(\bar Z; \bar H)$.

Proof. By (34), we have $\mathcal{E}(\bar Z; t \bar H) = t^2 \mathcal{E}(\bar Z; \bar H)$ and $\mathcal{E}^\perp(\bar Z; t \bar H) = t^2 \mathcal{E}^\perp(\bar Z; \bar H)$. Next, we show that $\mathcal{K}(\bar Z; \bar H) = \mathcal{K}(\bar Z; t \bar H)$ for all $t > 0$. Indeed, for any $W \in \mathbb{S}^n$ and any $t > 0$,
$$\Pi'_+(t \bar H_{\alpha_0\alpha_0}; W_{\alpha_0\alpha_0}) = Q^0 \begin{bmatrix} \widehat W_{\beta_{0,+}\beta_{0,+}} & \widehat W_{\beta_{0,+}\beta_{0,0}} & \Big\{ \tfrac{t \eta_{0,i}}{t \eta_{0,i} - t \eta_{0,j}} \widehat W_{\beta_{0,i}\beta_{0,j}} \Big\}_{i \in \mathcal{I}_{0,+},\, j \in \mathcal{I}_{0,-}} \\ \sim & \Pi_+(\widehat W_{\beta_{0,0}\beta_{0,0}}) & 0 \\ \sim & \sim & 0 \end{bmatrix} (Q^0)^\top = \Pi'_+(\bar H_{\alpha_0\alpha_0}; W_{\alpha_0\alpha_0}),$$
where $\widehat W := (Q^0)^\top W_{\alpha_0\alpha_0} Q^0$. Hence, by (33),
$$\Theta(\bar Z; \bar H, W) = \begin{bmatrix} W_{\alpha_+\alpha_+} & W_{\alpha_+\alpha_0} & \Big\{ \tfrac{t \mu_a}{t \mu_a - t \mu_b} W_{\alpha_a\alpha_b} \Big\}_{a \in \mathcal{I}_+,\, b \in \mathcal{I}_-} \\ \sim & \Pi'_+(t \bar H_{\alpha_0\alpha_0}; W_{\alpha_0\alpha_0}) & 0 \\ \sim & \sim & 0 \end{bmatrix} = \Theta(\bar Z; t \bar H, W)$$
for all $W \in \mathbb{S}^n$ and $t > 0$.
By symmetry, $\Theta^\perp(\bar Z; \bar H, W) = \Theta^\perp(\bar Z; t \bar H, W)$ for all $W \in \mathbb{S}^n$. It then follows from (40) that $\mathcal{K}(\bar Z; \bar H) = \mathcal{K}(\bar Z; t \bar H)$. Finally, using (39) and the positive homogeneity of the projection onto a cone,
$$\phi(\bar Z; t \bar H) = \Psi(\bar Z; t \bar H) - \Pi_{\mathcal{K}(\bar Z; t \bar H)}(\Psi(\bar Z; t \bar H)) = t^2 \Psi(\bar Z; \bar H) - \Pi_{\mathcal{K}(\bar Z; \bar H)}(t^2 \Psi(\bar Z; \bar H)) = t^2 \phi(\bar Z; \bar H),$$
which concludes the proof.

With Proposition 4, we have $\tfrac{t^2}{2} \phi(\bar Z; \bar H) = \tfrac{1}{2} \phi(\bar Z; t \bar H)$ for all $t > 0$, which motivates the following definition.

Definition 2 (Local second-order limit map and limit dynamics). At a point $\bar Z \in \mathcal{Z}_\star$, the local second-order limit map $\phi(\bar Z; \cdot) : \mathcal{C}(\bar Z) \to \mathbb{S}^n$ is defined by
$$\phi(\bar Z; \cdot) := \Psi(\bar Z; \cdot) - \Pi_{\mathcal{K}(\bar Z; \cdot)}(\Psi(\bar Z; \cdot)) = \Pi_{\mathcal{K}^\circ(\bar Z; \cdot)}(\Psi(\bar Z; \cdot)). \tag{42}$$
Here $\Psi(\bar Z; \cdot)$ is defined in (37), $\mathcal{K}(\bar Z; \cdot)$ in (40), and $\mathcal{K}^\circ(\bar Z; \cdot)$ in (41). The local second-order limit dynamics is defined as
$$Z^{(k+1)} = Z^{(k)} + \frac{1}{2} \phi(\bar Z; Z^{(k)} - \bar Z) + o(\|Z^{(k)} - \bar Z\|_F^2). \tag{43}$$

Remark 2. One may have noticed that three layers of approximation are involved before arriving at the local second-order limit dynamics (43). At the first layer, we perform a local expansion governed by the scale parameter $t \downarrow 0$ and obtain the local second-order dynamics (25). At the second layer, we take the iteration limit of (25) as $k \to \infty$ to obtain the local second-order limit map $\phi(\bar Z; \bar H)$. At the third layer, we replace the constant term $t \bar H$ by the feedback term $Z^{(k)} - \bar Z$ to arrive at the local second-order limit dynamics (43). These coupled approximations allow us to focus on the persistent limiting behavior of ADMM's local dynamics and lead to a model with clean and useful structure. Admittedly, however, they come at the cost of mathematical rigor.
For this reason, rather than claiming (43) as a fully rigorous consequence of (4), we introduce it as part of the definition of a new mathematical object whose properties we then study.

From the perspective of a vector field, the limit dynamics (43) assigns to each point $Z - \bar Z \in \mathcal{C}(\bar Z)$ a displacement vector $\tfrac{1}{2} \phi(\bar Z; Z - \bar Z)$, which satisfies $\tfrac{1}{2} \phi(\bar Z; Z - \bar Z) \sim O(\|Z - \bar Z\|_F^2)$ up to higher-order terms. Therefore, understanding the mapping $\phi(\bar Z; \cdot)$ in (42) becomes key to understanding the associated limit dynamics (43). In the subsequent sections, we will see that fundamental properties of $\phi(\bar Z; \cdot)$ (e.g., kernel, range, continuity, and primal–dual partition) are tightly linked to dynamical features of (43) (e.g., fixed points, almost-invariant sets, phases, and the effect of $\sigma$), which in turn explain and predict the limiting behavior of the one-step ADMM update (4) around $\bar Z$. Before analyzing these properties, we first exploit several structural features of $\phi(\bar Z; \cdot)$.

5 Polar Description and Primal–Dual Decoupling

In this section, we simplify $\phi(\bar Z; \cdot)$ by exposing its primal–dual decoupling structure, which serves as a foundation for the subsequent characterization of $\phi(\bar Z; \cdot)$. In § 5.1, we simplify $\mathcal{K}^\circ(\bar Z; \bar H)$ by revealing the self-similar structure of $\Pi''_+(Z; H, W)$. As a result, $\mathcal{K}^\circ(\bar Z; \bar H)$ can be expressed as a Minkowski sum of two simpler cones, $\mathcal{K}^\circ_P(\bar Z; \bar H)$ and $\mathcal{K}^\circ_D(\bar Z; \bar H)$. In § 5.2, we prove that the second-order limits of both $\Delta X^{(k)}$ and $\Delta S^{(k)}$ exist. Finally, in § 5.3, we link these limiting drifts tightly to $\mathcal{K}^\circ_P(\bar Z; \bar H)$ and $\mathcal{K}^\circ_D(\bar Z; \bar H)$, thereby revealing a clean primal–dual decoupling mechanism in the second-order-dominant regimes.
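The vector-field reading of (43) can be illustrated with a scalar caricature. The sketch below assumes a hypothetical map $\phi(h) = -h|h|$, which is positively 2-homogeneous like the map in Proposition 4 but is otherwise unrelated to the SDP construction; the names `phi` and `limit_dynamics` are illustrative only. It drops the $o(\cdot)$ term and iterates $z^{(k+1)} = z^{(k)} + \tfrac{1}{2}\phi(z^{(k)} - \bar z)$ starting from $z^{(0)} = \bar z + t$ with $\bar z = 0$.

```python
def phi(h: float) -> float:
    """Hypothetical positively 2-homogeneous drift: phi(t*h) = t**2 * phi(h) for t > 0."""
    return -h * abs(h)


def limit_dynamics(z0: float, num_iters: int, z_bar: float = 0.0) -> float:
    """Iterate z <- z + 0.5 * phi(z - z_bar), i.e. (43) without the o(.) term."""
    z = z0
    for _ in range(num_iters):
        z = z + 0.5 * phi(z - z_bar)
    return z


t = 1e-2  # initial offset from z_bar (plays the role of the scale parameter)
for k in (10, round(2 / t), round(20 / t)):
    z_k = limit_dynamics(t, k)
    # Report the remaining offset relative to the initial offset t.
    print(f"k = {k:5d}   (z_k - z_bar) / t = {z_k / t:.4f}")
```

In this toy model each step has size of order $t^2$, so a handful of iterations barely changes the offset relative to $t$, whereas on the order of $1/t$ iterations change it by a constant factor. This mirrors the $k \sim O(\tfrac{1}{t})$ accumulation that motivates the feedback term in Definition 2.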
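5.1 Simplification of $\mathcal{K}^\circ(\bar Z; \bar H)$

We first simplify the structure of $Q^0 \in \mathcal{O}^{|\alpha_0|}(\bar H_{\alpha_0\alpha_0})$ when $\bar H \in \mathcal{C}(\bar Z)$.

Lemma 4 (Block-diagonal structure of $Q^0$). Under Assumption 1, fix any $\bar Z \in \mathcal{Z}_\star$ and $\bar H \in \mathcal{C}(\bar Z)$. Suppose $\bar Z$ follows the first-level description in § 3 and $\bar H$ follows the second-level description. Then, $Q^0_{\beta_{0,P}\beta_{0,D}} \equiv 0$ for all $Q^0 \in \mathcal{O}^{|\alpha_0|}(\bar H_{\alpha_0\alpha_0})$.

Proof. From Proposition 2 (2), we get
$$\bar H = \begin{bmatrix} \bar H_{\alpha_+\alpha_+} & \bar H_{\alpha_+\alpha_0^P} & \bar H_{\alpha_+\alpha_0^D} & 0 \\ \sim & \bar H_{\alpha_0^P\alpha_0^P} & 0 & \bar H_{\alpha_0^P\alpha_-} \\ \sim & \sim & \bar H_{\alpha_0^D\alpha_0^D} & \bar H_{\alpha_0^D\alpha_-} \\ \sim & \sim & \sim & \bar H_{\alpha_-\alpha_-} \end{bmatrix}$$
for any $\bar H \in \mathcal{C}(\bar Z)$. Thus, $[\bar H_{\alpha_0\alpha_0}]_{\beta_{0,P}\beta_{0,D}} = 0$, which closes the proof.

It turns out that $\mathcal{K}^\circ(\bar Z; \bar H)$ has the following nice structure.

Proposition 5 (Structure of $\mathcal{K}^\circ(\bar Z; \bar H)$). Under Assumption 1, fix any $\bar Z \in \mathcal{Z}_\star$ and $\bar H \in \mathcal{C}(\bar Z)$. Suppose $\bar Z$ follows the first-level description in § 3 and $\bar H$ follows the second-level description. Then, $\mathcal{K}^\circ(\bar Z; \bar H) = \mathcal{K}^\circ_P(\bar Z; \bar H) + \mathcal{K}^\circ_D(\bar Z; \bar H)$, where
$$\mathcal{K}^\circ_P(\bar Z; \bar H) := \left\{ W = \begin{bmatrix} W_{\alpha_+\alpha_+} & W_{\alpha_+\alpha_0} & 0 \\ \sim & Q^0 \begin{bmatrix} \widehat W_{\beta_{0,+}\beta_{0,+}} & \widehat W_{\beta_{0,+}\beta_{0,0}^P} & \widehat W_{\beta_{0,+}\beta_{0,0}^D} & 0 \\ \sim & \widehat W_{\beta_{0,0}^P\beta_{0,0}^P} & 0 & 0 \\ \sim & \sim & 0 & 0 \\ \sim & \sim & \sim & 0 \end{bmatrix} (Q^0)^\top & 0 \\ \sim & \sim & 0 \end{bmatrix} \ \middle|\ \mathcal{P}W = 0,\ \widehat W = (Q^0)^\top W_{\alpha_0\alpha_0} Q^0,\ \widehat W_{\beta_{0,0}^P\beta_{0,0}^P} \succeq 0 \right\}, \tag{44a}$$
$$\mathcal{K}^\circ_D(\bar Z; \bar H) := \left\{ W = \begin{bmatrix} 0 & 0 & 0 \\ \sim & Q^0 \begin{bmatrix} 0 & 0 & 0 & 0 \\ \sim & 0 & 0 & \widehat W_{\beta_{0,0}^P\beta_{0,-}} \\ \sim & \sim & \widehat W_{\beta_{0,0}^D\beta_{0,0}^D} & \widehat W_{\beta_{0,0}^D\beta_{0,-}} \\ \sim & \sim & \sim & \widehat W_{\beta_{0,-}\beta_{0,-}} \end{bmatrix} (Q^0)^\top & W_{\alpha_0\alpha_-} \\ \sim & \sim & W_{\alpha_-\alpha_-} \end{bmatrix} \ \middle|\ \mathcal{P}^\perp W = 0,\ \widehat W = (Q^0)^\top W_{\alpha_0\alpha_0} Q^0,\ \widehat W_{\beta_{0,0}^D\beta_{0,0}^D} \preceq 0 \right\}. \tag{44b}$$

Proof.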
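"$\subseteq$": From (41), $Y \in \mathcal{K}^\circ(\bar Z; \bar H)$ if and only if
$$\begin{aligned} &\big\langle Y,\ \mathcal{P}\,\Theta(\bar Z; \bar H, W) + \mathcal{P}^\perp \Theta^\perp(\bar Z; \bar H, W) \big\rangle \le 0,\ \forall W \in \mathbb{S}^n \\ \iff\ &\big\langle \mathcal{P}Y,\ \mathcal{P}\,\Theta(\bar Z; \bar H, W) \big\rangle + \big\langle \mathcal{P}^\perp Y,\ \mathcal{P}^\perp \Theta^\perp(\bar Z; \bar H, W) \big\rangle \le 0,\ \forall W \in \mathbb{S}^n \\ \iff\ &Y = U + V,\ \mathcal{P}U = 0,\ \mathcal{P}^\perp V = 0,\ \big\langle V,\ \mathcal{P}\,\Theta(\bar Z; \bar H, W) \big\rangle + \big\langle U,\ \mathcal{P}^\perp \Theta^\perp(\bar Z; \bar H, W) \big\rangle \le 0,\ \forall W \in \mathbb{S}^n \\ \iff\ &Y = U + V,\ \mathcal{P}U = 0,\ \mathcal{P}^\perp V = 0,\ \big\langle V,\ \Theta(\bar Z; \bar H, W) \big\rangle + \big\langle U,\ \Theta^\perp(\bar Z; \bar H, W) \big\rangle \le 0,\ \forall W \in \mathbb{S}^n. \end{aligned} \tag{45}$$

(i) Set $W_{\alpha_+\alpha_-} = 0$, $W_{\alpha_0\alpha_0} = 0$, $W_{\alpha_0\alpha_-} = 0$, $W_{\alpha_-\alpha_-} = 0$. Then, from (33),
$$\Theta(\bar Z; \bar H, W) = \begin{bmatrix} W_{\alpha_+\alpha_+} & W_{\alpha_+\alpha_0} & 0 \\ \sim & 0 & 0 \\ \sim & \sim & 0 \end{bmatrix}, \qquad \Theta^\perp(\bar Z; \bar H, W) = 0.$$
Since $W_{\alpha_+\alpha_+}$ and $W_{\alpha_+\alpha_0}$ can be chosen arbitrarily, we have $V_{\alpha_+\alpha_+} = 0$ and $V_{\alpha_+\alpha_0} = 0$. Symmetrically, $U_{\alpha_-\alpha_-} = 0$ and $U_{\alpha_0\alpha_-} = 0$.

(ii) Set everything except $W_{\alpha_+\alpha_-}$ to zero:
$$\big\langle V,\ \Theta(\bar Z; \bar H, W) \big\rangle + \big\langle U,\ \Theta^\perp(\bar Z; \bar H, W) \big\rangle = 2 \sum_{a \in \mathcal{I}_+} \sum_{b \in \mathcal{I}_-} \Big\langle \frac{\mu_a}{\mu_a - \mu_b} V_{\alpha_a\alpha_b} + \frac{-\mu_b}{\mu_a - \mu_b} U_{\alpha_a\alpha_b},\ W_{\alpha_a\alpha_b} \Big\rangle.$$
Since $W_{\alpha_+\alpha_-}$ can be chosen arbitrarily, we have
$$\frac{\mu_a}{\mu_a - \mu_b} V_{\alpha_a\alpha_b} + \frac{-\mu_b}{\mu_a - \mu_b} U_{\alpha_a\alpha_b} = 0, \quad \forall a \in \mathcal{I}_+,\ b \in \mathcal{I}_-. \tag{46}$$

(iii) Now we zoom in to the $\alpha_0\alpha_0$ block. Set everything except $W_{\alpha_0\alpha_0}$ to zero. Then from (33),
$$\big\langle V,\ \Theta(\bar Z; \bar H, W) \big\rangle + \big\langle U,\ \Theta^\perp(\bar Z; \bar H, W) \big\rangle = \big\langle V_{\alpha_0\alpha_0},\ \Pi'_+(\bar H_{\alpha_0\alpha_0}; W_{\alpha_0\alpha_0}) \big\rangle + \big\langle U_{\alpha_0\alpha_0},\ \Pi'_-(\bar H_{\alpha_0\alpha_0}; W_{\alpha_0\alpha_0}) \big\rangle.$$
Further denote $\widehat U = (Q^0)^\top U_{\alpha_0\alpha_0} Q^0$, $\widehat V = (Q^0)^\top V_{\alpha_0\alpha_0} Q^0$, and $\widehat W = (Q^0)^\top W_{\alpha_0\alpha_0} Q^0$:
$$\big\langle V,\ \Theta(\bar Z; \bar H, W) \big\rangle + \big\langle U,\ \Theta^\perp(\bar Z; \bar H, W) \big\rangle = \left\langle \widehat V,\ \begin{bmatrix} \widehat W_{\beta_{0,+}\beta_{0,+}} & \widehat W_{\beta_{0,+}\beta_{0,0}} & \Big\{ \tfrac{\eta_{0,i}}{\eta_{0,i} - \eta_{0,j}} \widehat W_{\beta_{0,i}\beta_{0,j}} \Big\}_{i \in \mathcal{I}_{0,+},\, j \in \mathcal{I}_{0,-}} \\ \sim & \Pi_+(\widehat W_{\beta_{0,0}\beta_{0,0}}) & 0 \\ \sim & \sim & 0 \end{bmatrix} \right\rangle + \left\langle \widehat U,\ \begin{bmatrix} 0 & 0 & \Big\{ \tfrac{-\eta_{0,j}}{\eta_{0,i} - \eta_{0,j}} \widehat W_{\beta_{0,i}\beta_{0,j}} \Big\}_{i \in \mathcal{I}_{0,+},\, j \in \mathcal{I}_{0,-}} \\ \sim & \Pi_-(\widehat W_{\beta_{0,0}\beta_{0,0}}) & \widehat W_{\beta_{0,0}\beta_{0,-}} \\ \sim & \sim & \widehat W_{\beta_{0,-}\beta_{0,-}} \end{bmatrix} \right\rangle.$$

(a) Similar to (i), for $\widehat W$, set everything to $0$ except $\widehat W_{\beta_{0,+}\beta_{0,+}}$ and $\widehat W_{\beta_{0,+}\beta_{0,0}}$.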
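We get $\widehat V_{\beta_{0,+}\beta_{0,+}} = 0$ and $\widehat V_{\beta_{0,+}\beta_{0,0}} = 0$. Symmetrically, we get $\widehat U_{\beta_{0,-}\beta_{0,-}} = 0$ and $\widehat U_{\beta_{0,0}\beta_{0,-}} = 0$.

(b) Similar to (ii), for $\widehat W$, set everything to $0$ except $\widehat W_{\beta_{0,+}\beta_{0,-}}$. We get
$$\frac{\eta_{0,i}}{\eta_{0,i} - \eta_{0,j}} \widehat V_{\beta_{0,i}\beta_{0,j}} + \frac{-\eta_{0,j}}{\eta_{0,i} - \eta_{0,j}} \widehat U_{\beta_{0,i}\beta_{0,j}} = 0, \quad \forall i \in \mathcal{I}_{0,+},\ j \in \mathcal{I}_{0,-}. \tag{47}$$

(c) For $\widehat W$, set everything to $0$ except $\widehat W_{\beta_{0,0}\beta_{0,0}}$. We get
$$\big\langle \widehat V_{\beta_{0,0}\beta_{0,0}},\ \Pi_+(\widehat W_{\beta_{0,0}\beta_{0,0}}) \big\rangle + \big\langle \widehat U_{\beta_{0,0}\beta_{0,0}},\ \Pi_-(\widehat W_{\beta_{0,0}\beta_{0,0}}) \big\rangle \le 0, \quad \forall\, \widehat W_{\beta_{0,0}\beta_{0,0}}.$$
Letting $\widehat W_{\beta_{0,0}\beta_{0,0}}$ range over $\mathbb{S}^{|\beta_{0,0}|}_+$, we get $\widehat V_{\beta_{0,0}\beta_{0,0}} \preceq 0$. Symmetrically, we get $\widehat U_{\beta_{0,0}\beta_{0,0}} \succeq 0$.

(iv) Since $\mathcal{P}U = 0$ and $\mathcal{P}^\perp V = 0$, we have $\langle U, V \rangle = 0$. On the other hand, combining (i) - (iii):
$$\begin{aligned} \langle U, V \rangle &= 2 \sum_{a \in \mathcal{I}_+} \sum_{b \in \mathcal{I}_-} \langle U_{\alpha_a\alpha_b}, V_{\alpha_a\alpha_b} \rangle + 2 \sum_{i \in \mathcal{I}_{0,+}} \sum_{j \in \mathcal{I}_{0,-}} \big\langle \widehat U_{\beta_{0,i}\beta_{0,j}}, \widehat V_{\beta_{0,i}\beta_{0,j}} \big\rangle + \big\langle \widehat U_{\beta_{0,0}\beta_{0,0}}, \widehat V_{\beta_{0,0}\beta_{0,0}} \big\rangle \\ &= 2 \sum_{a \in \mathcal{I}_+} \sum_{b \in \mathcal{I}_-} \frac{\mu_b}{\mu_a} \|U_{\alpha_a\alpha_b}\|_F^2 + 2 \sum_{i \in \mathcal{I}_{0,+}} \sum_{j \in \mathcal{I}_{0,-}} \frac{\eta_{0,j}}{\eta_{0,i}} \|\widehat U_{\beta_{0,i}\beta_{0,j}}\|_F^2 + \big\langle \widehat U_{\beta_{0,0}\beta_{0,0}}, \widehat V_{\beta_{0,0}\beta_{0,0}} \big\rangle = 0, \end{aligned}$$
where the second equality comes from (46) and (47). Since $\mu_a > 0$, $\mu_b < 0$, $\eta_{0,i} > 0$, $\eta_{0,j} < 0$, and $\widehat U_{\beta_{0,0}\beta_{0,0}} \succeq 0$, $\widehat V_{\beta_{0,0}\beta_{0,0}} \preceq 0$, every term in the sum is nonpositive, so each must vanish. We get
$$U_{\alpha_+\alpha_-} = 0,\quad V_{\alpha_+\alpha_-} = 0,\quad \widehat U_{\beta_{0,+}\beta_{0,-}} = 0,\quad \widehat V_{\beta_{0,+}\beta_{0,-}} = 0,\quad \big\langle \widehat U_{\beta_{0,0}\beta_{0,0}}, \widehat V_{\beta_{0,0}\beta_{0,0}} \big\rangle = 0.$$

(v) Combining (i) - (iv), we know that for any $Y = U + V \in \mathcal{K}^\circ(\bar Z; \bar H)$, $U$ and $V$ have the following structure:
$$U = \begin{bmatrix} U_{\alpha_+\alpha_+} & U_{\alpha_+\alpha_0} & 0 \\ \sim & Q^0 \begin{bmatrix} \widehat U_{\beta_{0,+}\beta_{0,+}} & \widehat U_{\beta_{0,+}\beta_{0,0}} & 0 \\ \sim & \widehat U_{\beta_{0,0}\beta_{0,0}} \succeq 0 & 0 \\ \sim & \sim & 0 \end{bmatrix} (Q^0)^\top & 0 \\ \sim & \sim & 0 \end{bmatrix}, \qquad V = \begin{bmatrix} 0 & 0 & 0 \\ \sim & Q^0 \begin{bmatrix} 0 & 0 & 0 \\ \sim & \widehat V_{\beta_{0,0}\beta_{0,0}} \preceq 0 & \widehat V_{\beta_{0,0}\beta_{0,-}} \\ \sim & \sim & \widehat V_{\beta_{0,-}\beta_{0,-}} \end{bmatrix} (Q^0)^\top & V_{\alpha_0\alpha_-} \\ \sim & \sim & V_{\alpha_-\alpha_-} \end{bmatrix}.$$
Thus, $\langle U, \bar S \rangle = 0$.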
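Since $\mathcal{P}U = 0$, we get $\langle U, S_{sc} \rangle = 0$ from Lemma 2. Together with Lemma 4, we get
$$\langle U, S_{sc} \rangle = \left\langle Q^0_{\beta_{0,D}\beta_{0,D}} \begin{bmatrix} \widehat U_{\beta_{0,0}^D\beta_{0,0}^D} & 0 \\ \sim & 0 \end{bmatrix} (Q^0_{\beta_{0,D}\beta_{0,D}})^\top,\ [S_{sc}]_{\alpha_0^D\alpha_0^D} \right\rangle = 0.$$
Since $[S_{sc}]_{\alpha_0^D\alpha_0^D} \succ 0$ and $\widehat U_{\beta_{0,0}^D\beta_{0,0}^D} \succeq 0$, we get $\widehat U_{\beta_{0,0}^D\beta_{0,0}^D} = 0$. Symmetrically, we get $\widehat V_{\beta_{0,0}^P\beta_{0,0}^P} = 0$. Since $\widehat U_{\beta_{0,0}\beta_{0,0}} \succeq 0$ and $\widehat V_{\beta_{0,0}\beta_{0,0}} \preceq 0$, their fine-grained structures are
$$\widehat U_{\beta_{0,0}\beta_{0,0}} = \begin{bmatrix} \widehat U_{\beta_{0,0}^P\beta_{0,0}^P} \succeq 0 & 0 \\ \sim & 0 \end{bmatrix}, \qquad \widehat V_{\beta_{0,0}\beta_{0,0}} = \begin{bmatrix} 0 & 0 \\ \sim & \widehat V_{\beta_{0,0}^D\beta_{0,0}^D} \preceq 0 \end{bmatrix}.$$
Notice that under this complementary structure, $\langle \widehat U_{\beta_{0,0}\beta_{0,0}}, \widehat V_{\beta_{0,0}\beta_{0,0}} \rangle = 0$ automatically holds. This finishes the "$\subseteq$" part.

"$\supseteq$": Now we prove that for any $U \in \mathcal{K}^\circ_P(\bar Z; \bar H)$ and $V \in \mathcal{K}^\circ_D(\bar Z; \bar H)$, (45) holds. To see this, for all $W \in \mathbb{S}^n$:
$$\big\langle V,\ \Theta(\bar Z; \bar H, W) \big\rangle + \big\langle U,\ \Theta^\perp(\bar Z; \bar H, W) \big\rangle = \big\langle \widehat V_{\beta_{0,0}\beta_{0,0}},\ \Pi_+(\widehat W_{\beta_{0,0}\beta_{0,0}}) \big\rangle + \big\langle \widehat U_{\beta_{0,0}\beta_{0,0}},\ \Pi_-(\widehat W_{\beta_{0,0}\beta_{0,0}}) \big\rangle \le 0.$$
This finishes the "$\supseteq$" part.

Corollary 2 (Relationship between $\mathcal{C}(\bar Z)$ and $\mathcal{K}^\circ(\bar Z; \bar H)$). Under Assumption 1, for any $\bar Z \in \mathcal{Z}_\star$ and $\bar H \in \mathcal{C}(\bar Z)$,
$$\mathcal{C}_P(\bar Z) \subseteq \mathcal{K}^\circ_P(\bar Z; \bar H), \qquad \mathcal{C}_D(\bar Z) \subseteq \mathcal{K}^\circ_D(\bar Z; \bar H).$$

Proof. Take any $H \in \mathcal{C}_P(\bar Z)$. From Proposition 2 (2) and Lemma 4,
$$H = \begin{bmatrix} H_{\alpha_+\alpha_+} & H_{\alpha_+\alpha_0} & 0 \\ \sim & Q^0 \begin{bmatrix} (Q^0_{\beta_{0,P}\beta_{0,P}})^\top H_{\alpha_0^P\alpha_0^P} Q^0_{\beta_{0,P}\beta_{0,P}} & 0 \\ \sim & 0 \end{bmatrix} (Q^0)^\top & 0 \\ \sim & \sim & 0 \end{bmatrix},$$
with $H_{\alpha_0^P\alpha_0^P} \succeq 0$ and $\mathcal{P}H = 0$. Thus, $H \in \mathcal{K}^\circ_P(\bar Z; \bar H)$ by Proposition 5. The relationship between $\mathcal{C}_D(\bar Z)$ and $\mathcal{K}^\circ_D(\bar Z; \bar H)$ can be proven symmetrically.

5.2 Second-Order Limits of $\Delta X^{(k)}$ and $\Delta S^{(k)}$

Under the local first- and second-order dynamics in Definition 1, if we initialize with $H^{(0)} = \bar H \in \mathcal{C}(\bar Z)$, then $H^{(k)} \equiv \bar H$ and $W^{(k+1)} - W^{(k)} \to \phi(\bar Z; \bar H)$ by Theorem 5.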
Within this local model, it is natural to ask whether the second-order limits of the primal and dual increments also exist. We give an affirmative answer. For the primal variable $X^{(k)} = \Pi_+(Z^{(k)})$, we have
$$X^{(k+1)} - X^{(k)} = \Pi_+(Z^{(k+1)}) - \Pi_+(Z^{(k)}) = t \big( \Pi'_+(\bar Z; H^{(k+1)}) - \Pi'_+(\bar Z; H^{(k)}) \big) + \frac{t^2}{2} \big( \Pi''_+(\bar Z; H^{(k+1)}, W^{(k+1)}) - \Pi''_+(\bar Z; H^{(k)}, W^{(k)}) \big) + o(t^2),$$
where $Z^{(k)}$ is of the form (20). In the present regime, the first-order updates have stalled, i.e., $H^{(k+1)} = H^{(k)} = \bar H$. Hence, it suffices to analyze the limit of $\Pi''_+(\bar Z; \bar H, W^{(k+1)}) - \Pi''_+(\bar Z; \bar H, W^{(k)})$ as $k \to \infty$. Similarly, for the dual variable $S^{(k)} = -\tfrac{1}{\sigma} \Pi_-(Z^{(k)})$, it suffices to study the limit of $-\tfrac{1}{\sigma} \big( \Pi''_-(\bar Z; \bar H, W^{(k+1)}) - \Pi''_-(\bar Z; \bar H, W^{(k)}) \big)$ as $k \to \infty$. To proceed, we first establish the following auxiliary lemma.

Lemma 5 (Convergent difference of $\Pi_+(\cdot)$). For a symmetric matrix sequence $\{X_k\}_{k=0}^\infty$, we have
$$\lim_{k \to \infty} \big( X_{k+1} - X_k \big) = \Delta \implies \lim_{k \to \infty} \big( \Pi_+(X_{k+1}) - \Pi_+(X_k) \big) = \Pi_+(\Delta).$$

Proof. Let $Y_k = \tfrac{X_k}{k}$ for all $k \ge 1$, and denote $\Delta_k := X_{k+1} - X_k$. Then
$$\lim_{k \to \infty} Y_k = \lim_{k \to \infty} \left( \frac{X_0}{k} + \frac{1}{k} \sum_{i=0}^{k-1} \Delta_i \right) = \Delta.$$
Since $\Pi_+(\cdot)$ is positively homogeneous,
$$\Pi_+(X_{k+1}) - \Pi_+(X_k) = (k+1)\, \Pi_+(Y_{k+1}) - k\, \Pi_+(Y_k) = \Pi_+(Y_{k+1}) + k \big( \Pi_+(Y_{k+1}) - \Pi_+(Y_k) \big).$$
The first term satisfies $\Pi_+(Y_{k+1}) \to \Pi_+(\Delta)$ by continuity of $\Pi_+(\cdot)$. For the second term, note that
$$Y_{k+1} - Y_k = \frac{1}{k+1} \big( \Delta_k - Y_k \big), \qquad \Delta_k - Y_k \to 0,$$
hence $Y_{k+1} - Y_k = o(\tfrac{1}{k})$ as $k \to \infty$. Since $\Pi_+(\cdot)$ is $1$-Lipschitz,
$$\|\Pi_+(Y_{k+1}) - \Pi_+(Y_k)\|_F \le \|Y_{k+1} - Y_k\|_F = o\Big(\frac{1}{k}\Big). \tag{48}$$
Therefore, $k \big( \Pi_+(Y_{k+1}) - \Pi_+(Y_k) \big) \to 0$, and the claim follows.

Theorem 6 ($\phi_P(\bar Z; \bar H)$ and $\phi_D(\bar Z; \bar H)$). Under the local first- and second-order dynamics in Definition 1, with initialization $H^{(0)} = \bar H \in \mathcal{C}(\bar Z)$, the local second-order limit of $X^{(k+1)} - X^{(k)}$ is
$$\phi_P(\bar Z; \bar H) := \lim_{k \to \infty} \Big\{ \Pi''_+(\bar Z; \bar H, W^{(k+1)}) - \Pi''_+(\bar Z; \bar H, W^{(k)}) \Big\} = \Theta(\bar Z; \bar H, \phi(\bar Z; \bar H)), \tag{49}$$
and the local second-order limit of $S^{(k+1)} - S^{(k)}$ is
$$\phi_D(\bar Z; \bar H) := -\frac{1}{\sigma} \lim_{k \to \infty} \Big\{ \Pi''_-(\bar Z; \bar H, W^{(k+1)}) - \Pi''_-(\bar Z; \bar H, W^{(k)}) \Big\} = -\frac{1}{\sigma} \Theta^\perp(\bar Z; \bar H, \phi(\bar Z; \bar H)), \tag{50}$$
where $\Theta(\bar Z; \bar H, \cdot)$ and $\Theta^\perp(\bar Z; \bar H, \cdot)$ are defined in (33). Moreover, $\phi(\bar Z; \bar H) = \phi_P(\bar Z; \bar H) - \sigma \phi_D(\bar Z; \bar H)$.

Proof. (i) For the primal part, by (35),
$$\Pi''_+(\bar Z; \bar H, W^{(k+1)}) - \Pi''_+(\bar Z; \bar H, W^{(k)}) = \Theta(\bar Z; \bar H, \widetilde W^{(k+1)}) - \Theta(\bar Z; \bar H, \widetilde W^{(k)}) = \begin{bmatrix} \widetilde W^{(k+1)}_{\alpha_+\alpha_+} - \widetilde W^{(k)}_{\alpha_+\alpha_+} & \widetilde W^{(k+1)}_{\alpha_+\alpha_0} - \widetilde W^{(k)}_{\alpha_+\alpha_0} & \Big\{ \tfrac{\mu_a}{\mu_a - \mu_b} \big( \widetilde W^{(k+1)}_{\alpha_a\alpha_b} - \widetilde W^{(k)}_{\alpha_a\alpha_b} \big) \Big\}_{a \in \mathcal{I}_+,\, b \in \mathcal{I}_-} \\ \sim & \Pi'_+(\bar H_{\alpha_0\alpha_0}; \widetilde W^{(k+1)}_{\alpha_0\alpha_0}) - \Pi'_+(\bar H_{\alpha_0\alpha_0}; \widetilde W^{(k)}_{\alpha_0\alpha_0}) & 0 \\ \sim & \sim & 0 \end{bmatrix}.$$
Since $\widetilde W^{(k+1)} - \widetilde W^{(k)} \to \phi(\bar Z; \bar H)$ as $k \to \infty$ by Theorem 5, we have
$$\widetilde W^{(k+1)}_{\alpha_a\alpha_b} - \widetilde W^{(k)}_{\alpha_a\alpha_b} \to \phi(\bar Z; \bar H)_{\alpha_a\alpha_b}, \quad \forall a \in \mathcal{I},\ \forall b \in \mathcal{I}.$$
Thus, it remains to handle the only nonlinear term $\Pi'_+(\bar H_{\alpha_0\alpha_0}; \widetilde W^{(k+1)}_{\alpha_0\alpha_0}) - \Pi'_+(\bar H_{\alpha_0\alpha_0}; \widetilde W^{(k)}_{\alpha_0\alpha_0})$. For any $W \in \mathbb{S}^n$,
$$\Pi'_+(\bar H_{\alpha_0\alpha_0}; W_{\alpha_0\alpha_0}) = Q^0 \begin{bmatrix} \widehat W_{\beta_{0,+}\beta_{0,+}} & \widehat W_{\beta_{0,+}\beta_{0,0}} & \Big\{ \tfrac{\eta_{0,i}}{\eta_{0,i} - \eta_{0,j}} \widehat W_{\beta_{0,i}\beta_{0,j}} \Big\}_{i \in \mathcal{I}_{0,+},\, j \in \mathcal{I}_{0,-}} \\ \sim & \Pi_+(\widehat W_{\beta_{0,0}\beta_{0,0}}) & 0 \\ \sim & \sim & 0 \end{bmatrix} (Q^0)^\top,$$
where $\widehat W = (Q^0)^\top W_{\alpha_0\alpha_0} Q^0$. Again, the only nonlinear component is the PSD projector located at the $\beta_{0,0}\beta_{0,0}$ block.
By Lemma 5, with $\widehat W^{(k)} := (Q^0)^\top \widetilde W^{(k)}_{\alpha_0\alpha_0} Q^0$,
$$\Pi_+\big( \widehat W^{(k+1)}_{\beta_{0,0}\beta_{0,0}} \big) - \Pi_+\big( \widehat W^{(k)}_{\beta_{0,0}\beta_{0,0}} \big) \to \Pi_+\big( \widehat\phi_{\beta_{0,0}\beta_{0,0}} \big) \quad \text{as } k \to \infty,$$
where $\widehat\phi := (Q^0)^\top \phi(\bar Z; \bar H)_{\alpha_0\alpha_0} Q^0$. Therefore, as $k \to \infty$,
$$\Pi'_+(\bar H_{\alpha_0\alpha_0}; \widetilde W^{(k+1)}_{\alpha_0\alpha_0}) - \Pi'_+(\bar H_{\alpha_0\alpha_0}; \widetilde W^{(k)}_{\alpha_0\alpha_0}) \to Q^0 \begin{bmatrix} \widehat\phi_{\beta_{0,+}\beta_{0,+}} & \widehat\phi_{\beta_{0,+}\beta_{0,0}} & \Big\{ \tfrac{\eta_{0,i}}{\eta_{0,i} - \eta_{0,j}} \widehat\phi_{\beta_{0,i}\beta_{0,j}} \Big\}_{i \in \mathcal{I}_{0,+},\, j \in \mathcal{I}_{0,-}} \\ \sim & \Pi_+(\widehat\phi_{\beta_{0,0}\beta_{0,0}}) & 0 \\ \sim & \sim & 0 \end{bmatrix} (Q^0)^\top = \Pi'_+(\bar H_{\alpha_0\alpha_0}; \phi(\bar Z; \bar H)_{\alpha_0\alpha_0}),$$
which implies that $\Pi''_+(\bar Z; \bar H, W^{(k+1)}) - \Pi''_+(\bar Z; \bar H, W^{(k)}) \to \Theta(\bar Z; \bar H, \phi(\bar Z; \bar H))$.

(ii) The dual part follows by symmetry: one can similarly show that $\Pi''_-(\bar Z; \bar H, W^{(k+1)}) - \Pi''_-(\bar Z; \bar H, W^{(k)}) \to \Theta^\perp(\bar Z; \bar H, \phi(\bar Z; \bar H))$ as $k \to \infty$. The final identity $\phi(\bar Z; \bar H) = \phi_P(\bar Z; \bar H) - \sigma \phi_D(\bar Z; \bar H)$ follows from $\phi(\bar Z; \bar H) = \Theta(\bar Z; \bar H, \phi(\bar Z; \bar H)) + \Theta^\perp(\bar Z; \bar H, \phi(\bar Z; \bar H))$ in (35).

5.3 Primal–Dual Decoupling of $\phi(\bar Z; \bar H)$

Theorem 6 connects $\phi_P(\bar Z; \bar H)$ (resp. $\phi_D(\bar Z; \bar H)$) with the limiting behavior of $X^{(k+1)} - X^{(k)}$ (resp. $S^{(k+1)} - S^{(k)}$). The next theorem further reveals a deeper connection between $\phi_P(\bar Z; \bar H)$ (resp. $\phi_D(\bar Z; \bar H)$) and $\mathcal{K}^\circ_P(\bar Z; \bar H)$ (resp. $\mathcal{K}^\circ_D(\bar Z; \bar H)$).

Theorem 7 (Primal–dual decoupling of $\phi(\bar Z; \bar H)$). Let $\phi_P(\bar Z; \bar H)$ be defined in (49) and $\phi_D(\bar Z; \bar H)$ in (50). Let $\mathcal{K}^\circ_P(\bar Z; \bar H)$ and $\mathcal{K}^\circ_D(\bar Z; \bar H)$ be defined in (44). Then, under Assumption 1,
$$\phi_P(\bar Z; \bar H) = \operatorname*{arg\,min}_{W \in \mathcal{K}^\circ_P(\bar Z; \bar H)} \|W + \mathcal{E}^\perp(\bar Z; \bar H)\|_F^2 = \Pi_{\mathcal{K}^\circ_P(\bar Z; \bar H)}\big( -\mathcal{E}^\perp(\bar Z; \bar H) \big), \tag{51a}$$
$$\phi_D(\bar Z; \bar H) = -\frac{1}{\sigma} \operatorname*{arg\,min}_{W \in \mathcal{K}^\circ_D(\bar Z; \bar H)} \|W + \mathcal{E}(\bar Z; \bar H)\|_F^2 = -\frac{1}{\sigma} \Pi_{\mathcal{K}^\circ_D(\bar Z; \bar H)}\big( -\mathcal{E}(\bar Z; \bar H) \big), \tag{51b}$$
where $\mathcal{E}(\bar Z; \bar H)$ and $\mathcal{E}^\perp(\bar Z; \bar H)$ are defined in (34).

Proof.
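Since $\mathcal{K}^\circ(\bar Z; \bar H)$ is closed and convex, the optimal solution of $\inf_{W \in \mathcal{K}^\circ(\bar Z; \bar H)} \|W - \Psi(\bar Z; \bar H)\|_F^2$ is attained and unique, where $\Psi(\bar Z; \bar H)$ is defined in (37). Additionally, with Proposition 5, we get
$$\begin{aligned} \phi(\bar Z; \bar H) = \Pi_{\mathcal{K}^\circ(\bar Z; \bar H)} \Psi(\bar Z; \bar H) &= \operatorname*{arg\,min}_{W \in \mathcal{K}^\circ(\bar Z; \bar H)} \|W - \Psi(\bar Z; \bar H)\|_F^2 \\ &= \operatorname*{arg\,min}_{W = U + V,\ U \in \mathcal{K}^\circ_P(\bar Z; \bar H),\ V \in \mathcal{K}^\circ_D(\bar Z; \bar H)} \|U + V - \Psi(\bar Z; \bar H)\|_F^2 \\ &= \operatorname*{arg\,min}_{W = U + V,\ U \in \mathcal{K}^\circ_P(\bar Z; \bar H),\ V \in \mathcal{K}^\circ_D(\bar Z; \bar H)} \|U + V + \mathcal{P}\,\mathcal{E}(\bar Z; \bar H) + \mathcal{P}^\perp \mathcal{E}^\perp(\bar Z; \bar H)\|_F^2. \end{aligned}$$
Since $U \in \mathcal{K}^\circ_P(\bar Z; \bar H)$, we get $\mathcal{P}U = 0$. Symmetrically, $\mathcal{P}^\perp V = 0$. Therefore,
$$\begin{aligned} \|U + V + \mathcal{P}\,\mathcal{E}(\bar Z; \bar H) + \mathcal{P}^\perp \mathcal{E}^\perp(\bar Z; \bar H)\|_F^2 &= \|\mathcal{P}^\perp U + \mathcal{P}V + \mathcal{P}\,\mathcal{E}(\bar Z; \bar H) + \mathcal{P}^\perp \mathcal{E}^\perp(\bar Z; \bar H)\|_F^2 \\ &= \|\mathcal{P}^\perp U + \mathcal{P}^\perp \mathcal{E}^\perp(\bar Z; \bar H)\|_F^2 + \|\mathcal{P}V + \mathcal{P}\,\mathcal{E}(\bar Z; \bar H)\|_F^2 \\ &= \|U + \mathcal{E}^\perp(\bar Z; \bar H)\|_F^2 + \|V + \mathcal{E}(\bar Z; \bar H)\|_F^2 - \|\mathcal{P}\,\mathcal{E}^\perp(\bar Z; \bar H)\|_F^2 - \|\mathcal{P}^\perp \mathcal{E}(\bar Z; \bar H)\|_F^2, \end{aligned}$$
where in the last equality, we use the identities
$$\|U + \mathcal{E}^\perp(\bar Z; \bar H)\|_F^2 = \|\mathcal{P}U + \mathcal{P}\,\mathcal{E}^\perp(\bar Z; \bar H)\|_F^2 + \|\mathcal{P}^\perp U + \mathcal{P}^\perp \mathcal{E}^\perp(\bar Z; \bar H)\|_F^2 = \|\mathcal{P}\,\mathcal{E}^\perp(\bar Z; \bar H)\|_F^2 + \|\mathcal{P}^\perp U + \mathcal{P}^\perp \mathcal{E}^\perp(\bar Z; \bar H)\|_F^2,$$
$$\|V + \mathcal{E}(\bar Z; \bar H)\|_F^2 = \|\mathcal{P}V + \mathcal{P}\,\mathcal{E}(\bar Z; \bar H)\|_F^2 + \|\mathcal{P}^\perp V + \mathcal{P}^\perp \mathcal{E}(\bar Z; \bar H)\|_F^2 = \|\mathcal{P}V + \mathcal{P}\,\mathcal{E}(\bar Z; \bar H)\|_F^2 + \|\mathcal{P}^\perp \mathcal{E}(\bar Z; \bar H)\|_F^2.$$
Notice that $-\|\mathcal{P}\,\mathcal{E}^\perp(\bar Z; \bar H)\|_F^2 - \|\mathcal{P}^\perp \mathcal{E}(\bar Z; \bar H)\|_F^2$ is a constant and does not affect the $\arg\min$. Observing that $U \in \mathcal{K}^\circ_P(\bar Z; \bar H)$ and $V \in \mathcal{K}^\circ_D(\bar Z; \bar H)$ are fully decoupled in the objective, we get
$$\phi(\bar Z; \bar H) = \underbrace{\operatorname*{arg\,min}_{U \in \mathcal{K}^\circ_P(\bar Z; \bar H)} \|U + \mathcal{E}^\perp(\bar Z; \bar H)\|_F^2}_{=:\, \bar U} + \underbrace{\operatorname*{arg\,min}_{V \in \mathcal{K}^\circ_D(\bar Z; \bar H)} \|V + \mathcal{E}(\bar Z; \bar H)\|_F^2}_{=:\, \bar V},$$
where $\bar U$ (resp. $\bar V$) is attained and unique since $\mathcal{K}^\circ_P(\bar Z; \bar H)$ (resp. $\mathcal{K}^\circ_D(\bar Z; \bar H)$) is closed and convex.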
Now we prove that $\bar U = \phi_P(\bar Z; \bar H)$ and $\bar V = -\sigma \phi_D(\bar Z; \bar H)$. From Proposition 5,
$$\phi(\bar Z; \bar H) = \bar U + \bar V = \begin{bmatrix} \bar U_{\alpha_+\alpha_+} & \bar U_{\alpha_+\alpha_0} & 0 \\ \sim & Q^0 \begin{bmatrix} \widehat U_{\beta_{0,+}\beta_{0,+}} & \widehat U_{\beta_{0,+}\beta_{0,0}^P} & \widehat U_{\beta_{0,+}\beta_{0,0}^D} & 0 \\ \sim & \widehat U_{\beta_{0,0}^P\beta_{0,0}^P} \succeq 0 & 0 & \widehat V_{\beta_{0,0}^P\beta_{0,-}} \\ \sim & \sim & \widehat V_{\beta_{0,0}^D\beta_{0,0}^D} \preceq 0 & \widehat V_{\beta_{0,0}^D\beta_{0,-}} \\ \sim & \sim & \sim & \widehat V_{\beta_{0,-}\beta_{0,-}} \end{bmatrix} (Q^0)^\top & \bar V_{\alpha_0\alpha_-} \\ \sim & \sim & \bar V_{\alpha_-\alpha_-} \end{bmatrix},$$
where $\widehat U = (Q^0)^\top \bar U_{\alpha_0\alpha_0} Q^0$ and $\widehat V = (Q^0)^\top \bar V_{\alpha_0\alpha_0} Q^0$. Then, from Theorem 1,
$$\Pi'_+(\bar H_{\alpha_0\alpha_0}; \phi(\bar Z; \bar H)_{\alpha_0\alpha_0}) = \Pi'_+(\bar H_{\alpha_0\alpha_0}; \bar U_{\alpha_0\alpha_0} + \bar V_{\alpha_0\alpha_0}) = Q^0\, \Pi'_+(\widehat H; \widehat U + \widehat V)\, (Q^0)^\top = Q^0 \begin{bmatrix} \widehat U_{\beta_{0,+}\beta_{0,+}} & \widehat U_{\beta_{0,+}\beta_{0,0}^P} & \widehat U_{\beta_{0,+}\beta_{0,0}^D} & 0 \\ \sim & \widehat U_{\beta_{0,0}^P\beta_{0,0}^P} & 0 & 0 \\ \sim & \sim & 0 & 0 \\ \sim & \sim & \sim & 0 \end{bmatrix} (Q^0)^\top,$$
where $\widehat H := (Q^0)^\top \bar H_{\alpha_0\alpha_0} Q^0$ is diagonal. Thus, from (33),
$$\Theta(\bar Z; \bar H, \phi(\bar Z; \bar H)) = \begin{bmatrix} \bar U_{\alpha_+\alpha_+} & \bar U_{\alpha_+\alpha_0} & 0 \\ \sim & Q^0 \begin{bmatrix} \widehat U_{\beta_{0,+}\beta_{0,+}} & \widehat U_{\beta_{0,+}\beta_{0,0}^P} & \widehat U_{\beta_{0,+}\beta_{0,0}^D} & 0 \\ \sim & \widehat U_{\beta_{0,0}^P\beta_{0,0}^P} \succeq 0 & 0 & 0 \\ \sim & \sim & 0 & 0 \\ \sim & \sim & \sim & 0 \end{bmatrix} (Q^0)^\top & 0 \\ \sim & \sim & 0 \end{bmatrix} = \bar U,$$
where in the last equality, we use $\bar U \in \mathcal{K}^\circ_P(\bar Z; \bar H)$ and Proposition 5 again. Symmetrically, $\bar V = \Theta^\perp(\bar Z; \bar H, \phi(\bar Z; \bar H))$. We close the proof by recalling Theorem 6: $\Theta(\bar Z; \bar H, \phi(\bar Z; \bar H)) = \phi_P(\bar Z; \bar H)$ and $\Theta^\perp(\bar Z; \bar H, \phi(\bar Z; \bar H)) = -\sigma \phi_D(\bar Z; \bar H)$.

6 Kernel of $\phi(\bar Z; \cdot)$

The first property of $\phi(\bar Z; \cdot)$ that we study is its kernel $\ker(\phi(\bar Z; \cdot))$, i.e., $\{ \bar H \in \mathcal{C}(\bar Z) \mid \phi(\bar Z; \bar H) = 0 \}$. The set $\ker(\phi(\bar Z; \cdot))$ directly characterizes when the local second-order limiting dynamics (43) is effective, in the sense that the higher-order term $o(\|Z^{(k)} - \bar Z\|_F^2)$ in (43) can be neglected.
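When $\bar{H}\in\mathcal{C}(\bar{Z})\setminus\ker(\phi(\bar{Z};\cdot))$, we have $\phi(\bar{Z};\bar{H})\neq 0$, and the second-order term dominates the evolution in (43). Conversely, if $\bar{H}\in\ker(\phi(\bar{Z};\cdot))$ (i.e., the second-order term vanishes) yet the true one-step ADMM iteration (4) does not converge, then higher-order terms must be taken into account. It turns out that $\ker(\phi(\bar{Z};\cdot))$ admits a clean characterization.

Proposition 6 (Kernel of $\phi(\bar{Z};\cdot)$). Under Assumption 1, for any $\bar{Z}\in\mathcal{Z}^\star$, $\ker(\phi(\bar{Z};\cdot))=\mathcal{T}_{\mathcal{Z}^\star}(\bar{Z})$.

The proof of Proposition 6 is divided into two parts. First, we show that $\phi(\bar{Z};\bar{H})\neq 0$ for any $\bar{H}\in\mathcal{C}(\bar{Z})\setminus\mathcal{T}_{\mathcal{Z}^\star}(\bar{Z})$ (Lemma 6 in §6.1). Second, we show that $\phi(\bar{Z};\bar{H})=0$ for any $\bar{H}\in\mathcal{T}_{\mathcal{Z}^\star}(\bar{Z})$ (Lemma 7 in §6.2). In §6.3, we discuss one implication of $\ker(\phi(\bar{Z};\cdot))$. In slow-convergence regions of the one-step ADMM iteration (4), a typical pattern is that the angle $\angle(\Delta Z^{(k)},\Delta Z^{(k+1)})$ tends to be very small yet is generally nonzero. We explain this phenomenon using our local second-order limiting dynamics model (43), with the initialization $Z^{(0)}$ chosen in $\bar{Z}+(\mathcal{C}(\bar{Z})\setminus\mathcal{T}_{\mathcal{Z}^\star}(\bar{Z}))$.

6.1 Proof of "$\bar{H}\in\mathcal{C}(\bar{Z})\setminus\mathcal{T}_{\mathcal{Z}^\star}(\bar{Z})\Longrightarrow\phi(\bar{Z};\bar{H})\neq 0$"

The motivation for this part comes from Sturm's square-root error bound under the existence of a strictly complementary primal–dual pair [47]. Since $\bar{H}\in\mathcal{C}(\bar{Z})\setminus\mathcal{T}_{\mathcal{Z}^\star}(\bar{Z})$, the forward error $\operatorname{dist}(\bar{Z}+t\bar{H},\mathcal{Z}^\star)$ is of order $t$. Consequently, under Assumption 1, the backward error $\delta(\bar{Z}+t\bar{H})$ must exhibit a nonzero response at order $t^2$.

Lemma 6 ($\bar{H}\in\mathcal{C}(\bar{Z})\setminus\mathcal{T}_{\mathcal{Z}^\star}(\bar{Z})\Longrightarrow\phi(\bar{Z};\bar{H})\neq 0$). Under Assumption 1, pick any $\bar{H}\in\mathcal{C}(\bar{Z})$. If $\phi(\bar{Z};\bar{H})=0$, then $\bar{H}\in\mathcal{T}_{\mathcal{Z}^\star}(\bar{Z})$.

Proof. We aim to show that if $\phi(\bar{Z};\bar{H})=0$, then $\bar{H}_{\alpha_+\alpha_0^D}=0$ and $\bar{H}_{\alpha_0^P\alpha_-}=0$.
From Theorem 5, if $\phi(\bar{Z};\bar{H})=\Psi(\bar{Z};\bar{H})-\Pi_{\mathcal{K}(\bar{Z};\bar{H})}(\Psi(\bar{Z};\bar{H}))=0$, we have $\Psi(\bar{Z};\bar{H})\in\mathcal{K}(\bar{Z};\bar{H})$. By (40), there exists a convergent sequence $\{\Psi_i\}_{i=1}^\infty\to\Psi(\bar{Z};\bar{H})$ such that, for each $\Psi_i$, there exists $W^i\in\mathbb{S}^n$ with
\[
\mathcal{P}\Theta(\bar{Z};\bar{H},W^i)+\mathcal{P}^\perp\Theta^\perp(\bar{Z};\bar{H},W^i)=\Psi_i .
\]
By the definition of $\{\Psi_i\}_{i=1}^\infty$, for every $\epsilon>0$ there exists $N_\epsilon\in\mathbb{N}$ such that $\|\Psi(\bar{Z};\bar{H})-\Psi_i\|_F\le\epsilon$ for all $i\ge N_\epsilon$. Substituting the formula of $\Psi(\bar{Z};\bar{H})$ from (37) and using the orthogonality of the ranges of $\mathcal{P}$ and $\mathcal{P}^\perp$:
\[
\|\mathcal{P}\Theta(\bar{Z};\bar{H},W^i)+\mathcal{P}^\perp\Theta^\perp(\bar{Z};\bar{H},W^i)-(-\mathcal{P}\mathcal{E}(\bar{Z};\bar{H})-\mathcal{P}^\perp\mathcal{E}^\perp(\bar{Z};\bar{H}))\|_F\le\epsilon
\ \Longrightarrow\
\begin{cases}
\|\mathcal{P}\{\Theta(\bar{Z};\bar{H},W^i)+\mathcal{E}(\bar{Z};\bar{H})\}\|_F\le\epsilon , \\
\|\mathcal{P}^\perp\{\Theta^\perp(\bar{Z};\bar{H},W^i)+\mathcal{E}^\perp(\bar{Z};\bar{H})\}\|_F\le\epsilon .
\end{cases}
\]
We first focus on the primal part: $\|\mathcal{P}\{\Theta(\bar{Z};\bar{H},W^i)+\mathcal{E}(\bar{Z};\bar{H})\}\|_F\le\epsilon$. Expanding $\Theta(\bar{Z};\bar{H},W^i)+\mathcal{E}(\bar{Z};\bar{H})$ using (33) and (34):
\[
\Theta(\bar{Z};\bar{H},W^i)+\mathcal{E}(\bar{Z};\bar{H})=
\begin{bmatrix}
W^i_{\alpha_+\alpha_+} & \left\{W^i_{\alpha_a\alpha_0}-\frac{2}{\mu_a}\bar{H}_{\alpha_a\alpha_0}\Pi_+(-\bar{H}_{\alpha_0\alpha_0})\right\}_{a\in\mathcal{I}_+} & \left\{\frac{\mu_a}{\mu_a-\mu_b}W^i_{\alpha_a\alpha_b}+\frac{2}{\mu_a-\mu_b}\bar{H}_{\alpha_a\alpha_0}\bar{H}_{\alpha_0\alpha_b}\right\}_{a\in\mathcal{I}_+,\,b\in\mathcal{I}_-} \\
\sim & \Pi'_+(\bar{H}_{\alpha_0\alpha_0};W^i_{\alpha_0\alpha_0})+2\sum_{c\in\mathcal{I}_+}\frac{1}{\mu_c}\bar{H}_{\alpha_0\alpha_c}\bar{H}_{\alpha_c\alpha_0} & \left\{\frac{2}{-\mu_b}\Pi_+(\bar{H}_{\alpha_0\alpha_0})\bar{H}_{\alpha_0\alpha_b}\right\}_{b\in\mathcal{I}_-} \\
\sim & \sim & 0
\end{bmatrix} .
\]
Now our goal is to show
\[
[\Theta(\bar{Z};\bar{H},W^i)+\mathcal{E}(\bar{Z};\bar{H})]_{\alpha_D\alpha_D}=
\begin{bmatrix}
[\Theta(\bar{Z};\bar{H},W^i)+\mathcal{E}(\bar{Z};\bar{H})]_{\alpha_0^D\alpha_0^D} & 0 \\
\sim & 0
\end{bmatrix}
\quad\text{with}\quad
[\Theta(\bar{Z};\bar{H},W^i)+\mathcal{E}(\bar{Z};\bar{H})]_{\alpha_0^D\alpha_0^D}\succeq 0 . \tag{52}
\]
(a) For $\frac{2}{-\mu_b}\Pi_+(\bar{H}_{\alpha_0\alpha_0})\bar{H}_{\alpha_0\alpha_b}$, $\forall b\in\mathcal{I}_-$, we notice from Proposition 2(2) that
\[
\Pi_+(\bar{H}_{\alpha_0\alpha_0})=
\begin{bmatrix}
[\Pi_+(\bar{H}_{\alpha_0\alpha_0})]_{\beta_{0,P}\beta_{0,P}} & 0 \\
\sim & 0
\end{bmatrix} .
\]
Thus,
\[
\Pi_+(\bar{H}_{\alpha_0\alpha_0})\bar{H}_{\alpha_0\alpha_b}=
\begin{bmatrix}
[\Pi_+(\bar{H}_{\alpha_0\alpha_0})]_{\beta_{0,P}\beta_{0,P}} & 0 \\
\sim & 0
\end{bmatrix}
\begin{bmatrix}
\bar{H}_{\alpha_0^P\alpha_b} \\
\bar{H}_{\alpha_0^D\alpha_b}
\end{bmatrix}
=
\begin{bmatrix}
[\Pi_+(\bar{H}_{\alpha_0\alpha_0})]_{\beta_{0,P}\beta_{0,P}}\bar{H}_{\alpha_0^P\alpha_b} \\
0
\end{bmatrix} ,
\]
which directly implies $[\Theta(\bar{Z};\bar{H},W^i)+\mathcal{E}(\bar{Z};\bar{H})]_{\alpha_0^D\alpha_-}=0$.
(b) For $[\Theta(\bar{Z};\bar{H},W^i)+\mathcal{E}(\bar{Z};\bar{H})]_{\alpha_0^D\alpha_0^D}=[\Pi'_+(\bar{H}_{\alpha_0\alpha_0};W^i_{\alpha_0\alpha_0})+2\sum_{c\in\mathcal{I}_+}\frac{1}{\mu_c}\bar{H}_{\alpha_0\alpha_c}\bar{H}_{\alpha_c\alpha_0}]_{\beta_{0,D}\beta_{0,D}}$: since $2\sum_{c\in\mathcal{I}_+}\frac{1}{\mu_c}\bar{H}_{\alpha_0\alpha_c}\bar{H}_{\alpha_c\alpha_0}\succeq 0$, all we need to prove is $[\Pi'_+(\bar{H}_{\alpha_0\alpha_0};W^i_{\alpha_0\alpha_0})]_{\beta_{0,D}\beta_{0,D}}\succeq 0$. From (11), we have
\[
\Pi'_+(\bar{H}_{\alpha_0\alpha_0};W^i_{\alpha_0\alpha_0})
=Q^0\,\Pi'_+\!\left((Q^0)^T\bar{H}_{\alpha_0\alpha_0}Q^0;\ (Q^0)^TW^i_{\alpha_0\alpha_0}Q^0\right)(Q^0)^T
=Q^0
\begin{bmatrix}
\widehat{W}_{\beta_{0,+}\beta_{0,+}} & \widehat{W}_{\beta_{0,+}\beta^P_{0,0}} & \widehat{W}_{\beta_{0,+}\beta^D_{0,0}} & \left\{\frac{\eta_{0,i}}{\eta_{0,i}-\eta_{0,j}}\widehat{W}_{\beta_{0,i}\beta_{0,j}}\right\}_{i\in\mathcal{I}_{0,+},\,j\in\mathcal{I}_{0,-}} \\
\sim & [\Pi_+(\widehat{W}_{\beta_{0,0}\beta_{0,0}})]_{\gamma_{0,0,P}\gamma_{0,0,P}} & [\Pi_+(\widehat{W}_{\beta_{0,0}\beta_{0,0}})]_{\gamma_{0,0,P}\gamma_{0,0,D}} & 0 \\
\sim & \sim & [\Pi_+(\widehat{W}_{\beta_{0,0}\beta_{0,0}})]_{\gamma_{0,0,D}\gamma_{0,0,D}} & 0 \\
\sim & \sim & \sim & 0
\end{bmatrix}
(Q^0)^T ,
\]
where we abbreviate $(Q^0)^TW^i_{\alpha_0\alpha_0}Q^0$ as $\widehat{W}$, and $\gamma_{0,0,P}$ and $\gamma_{0,0,D}$ split the indices of $\Pi_+(\widehat{W}_{\beta_{0,0}\beta_{0,0}})$ into their primal and dual parts. Moreover, from Lemma 4, $Q^0_{\beta_{0,P}\beta_{0,P}}\in\mathcal{O}^{|\beta_{0,P}|}(\bar{H}_{\alpha_0^P\alpha_0^P})$ and $Q^0_{\beta_{0,D}\beta_{0,D}}\in\mathcal{O}^{|\beta_{0,D}|}(\bar{H}_{\alpha_0^D\alpha_0^D})$. Therefore,
\[
[\Pi'_+(\bar{H}_{\alpha_0\alpha_0};W^i_{\alpha_0\alpha_0})]_{\beta_{0,D}\beta_{0,D}}
=Q^0_{\beta_{0,D}\beta_{0,D}}
\begin{bmatrix}
[\Pi_+(\widehat{W}_{\beta_{0,0}\beta_{0,0}})]_{\gamma_{0,0,D}\gamma_{0,0,D}} & 0 \\
\sim & 0
\end{bmatrix}
(Q^0_{\beta_{0,D}\beta_{0,D}})^T\succeq 0 .
\]
Combining (a) and (b), we prove (52). On the other hand, we notice that $\langle\Theta(\bar{Z};\bar{H},W^i)+\mathcal{E}(\bar{Z};\bar{H}),\bar{S}\rangle=0$ regardless of the choice of $W^i$. From Lemma 2,
\[
\left|\langle\Theta(\bar{Z};\bar{H},W^i)+\mathcal{E}(\bar{Z};\bar{H}),S_{sc}\rangle\right|\le\epsilon\,\|S_{sc}-\bar{S}\|_F .
\]
Write $G^i:=[\Theta(\bar{Z};\bar{H},W^i)+\mathcal{E}(\bar{Z};\bar{H})]_{\alpha_0^D\alpha_0^D}$ for brevity. Since $[S_{sc}]_{\alpha_0^D\alpha_0^D}\succ 0$, $G^i\succeq 0$, and $\langle\Theta(\bar{Z};\bar{H},W^i)+\mathcal{E}(\bar{Z};\bar{H}),S_{sc}\rangle=\langle G^i,[S_{sc}]_{\alpha_0^D\alpha_0^D}\rangle$ from (52), we have
\[
\|G^i\|_F\le\operatorname{tr}(G^i)\le\frac{\langle G^i,[S_{sc}]_{\alpha_0^D\alpha_0^D}\rangle}{\lambda_{\min}([S_{sc}]_{\alpha_0^D\alpha_0^D})}\le\frac{\|S_{sc}-\bar{S}\|_F}{\lambda_{\min}([S_{sc}]_{\alpha_0^D\alpha_0^D})}\cdot\epsilon .
\]
The first inequality comes from the property that $\|A\|_F\le\operatorname{tr}(A)$ if $A\succeq 0$. The second inequality holds because, for any $A\succeq 0$ and $B\succ 0$, $\lambda_{\min}(B)\cdot\operatorname{tr}(A)\le\langle A,B\rangle$. Together with $[\Pi'_+(\bar{H}_{\alpha_0\alpha_0};W^i_{\alpha_0\alpha_0})]_{\beta_{0,D}\beta_{0,D}}\succeq 0$, $[2\sum_{c\in\mathcal{I}_+}\frac{1}{\mu_c}\bar{H}_{\alpha_0\alpha_c}\bar{H}_{\alpha_c\alpha_0}]_{\beta_{0,D}\beta_{0,D}}\succeq 0$, and $G^i=[\Pi'_+(\bar{H}_{\alpha_0\alpha_0};W^i_{\alpha_0\alpha_0})+2\sum_{c\in\mathcal{I}_+}\frac{1}{\mu_c}\bar{H}_{\alpha_0\alpha_c}\bar{H}_{\alpha_c\alpha_0}]_{\beta_{0,D}\beta_{0,D}}$, and using that $\|A\|_F\le\|A+B\|_F$ whenever $A,B\succeq 0$, we get
\[
\left\|\Big[2\sum_{c\in\mathcal{I}_+}\frac{1}{\mu_c}\bar{H}_{\alpha_0\alpha_c}\bar{H}_{\alpha_c\alpha_0}\Big]_{\beta_{0,D}\beta_{0,D}}\right\|_F\le\frac{\|S_{sc}-\bar{S}\|_F}{\lambda_{\min}([S_{sc}]_{\alpha_0^D\alpha_0^D})}\cdot\epsilon .
\]
Observing that the above error bound does not contain $W^i$ and that $\epsilon$ can be picked arbitrarily small, we get
\[
\left\|\Big[2\sum_{c\in\mathcal{I}_+}\frac{1}{\mu_c}\bar{H}_{\alpha_0\alpha_c}\bar{H}_{\alpha_c\alpha_0}\Big]_{\beta_{0,D}\beta_{0,D}}\right\|_F=0 .
\]
Since each $\bar{H}_{\alpha_0\alpha_c}\bar{H}_{\alpha_c\alpha_0}$, $c\in\mathcal{I}_+$, is positive semidefinite, we get
\[
\|\bar{H}_{\alpha_0^D\alpha_c}\bar{H}_{\alpha_c\alpha_0^D}\|_F=\|[\bar{H}_{\alpha_0\alpha_c}\bar{H}_{\alpha_c\alpha_0}]_{\beta_{0,D}\beta_{0,D}}\|_F=0,\quad\forall c\in\mathcal{I}_+ .
\]
Therefore, $\bar{H}_{\alpha_+\alpha_0^D}=0$. By primal–dual symmetry, $\bar{H}_{\alpha_-\alpha_0^P}=0$. Finally, by Proposition 3(2), we get $\bar{H}\in\mathcal{T}_{\mathcal{Z}^\star}(\bar{Z})$.

6.2 Proof of "$\bar{H}\in\mathcal{T}_{\mathcal{Z}^\star}(\bar{Z})\Longrightarrow\phi(\bar{Z};\bar{H})=0$"

Intuitively, if $\bar{H}\in\operatorname{ri}(\mathcal{T}_{\mathcal{Z}^\star}(\bar{Z}))$, then $\phi(\bar{Z};\bar{H})=0$. The following stronger result shows that $\phi(\bar{Z};\bar{H})$ vanishes even when $\bar{H}\in\mathcal{T}_{\mathcal{Z}^\star}(\bar{Z})\setminus\operatorname{ri}(\mathcal{T}_{\mathcal{Z}^\star}(\bar{Z}))$.

Lemma 7 ($\bar{H}\in\mathcal{T}_{\mathcal{Z}^\star}(\bar{Z})\Longrightarrow\phi(\bar{Z};\bar{H})=0$).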
Under Assumption 1 , if s H ∈ T Z ⋆ ( s Z ) , then ϕ ( s Z ; s H ) = 0 . Pr o of. Proof b y construction. Since ( X sc , S sc ) is a strictly complementary pair, [ X sc ] α P 0 α P 0 ≻ 0 and [ S sc ] α D 0 α D 0 ≻ 0 . Define t w o constan ts κ P = λ max (2 P c ∈I + 1 µ c s H α P 0 α c s H α c α P 0 ) λ min ([ X sc ] α P 0 α P 0 ) , κ D = λ max ( − 2 P c ∈I − 1 µ c s H α D 0 α c s H α c α D 0 ) λ min ([ σ S sc ] α D 0 α D 0 ) . (53) Construct W as follows: » — — — — — — – κ P · [ X sc − s X ] α + α + κ P · [ X sc ] α + α P 0 0 0 ∼ κ P · [ X sc ] α P 0 α P 0 − 2 P c ∈I + 1 µ c s H α P 0 α c s H α c α P 0 0 0 ∼ ∼ − κ D · [ σ S sc ] α D 0 α D 0 − 2 P c ∈I − 1 µ c s H α D 0 α c s H α c α D 0 − κ D · [ σ S sc ] α D 0 α − ∼ ∼ ∼ − κ D · [ σ S sc − σ s S ] α − α − fi ffi ffi ffi ffi ffi ffi fl . W e shall prov e Ψ( s Z ; s H ) = P Θ( s Z ; s H , W ) + P ⊥ Θ ⊥ ( s Z ; s H , W ) , whic h implies Ψ( s Z ; s H ) ∈ K ( s Z ; s H ) . (i) Primal part. Since s H ∈ T Z ⋆ ( s Z ) , we hav e s H α a α D 0 = 0 , ∀ a ∈ I + and s H α P 0 α b = 0 , ∀ b ∈ I − . In addition with s H α 0 α 0 = « [ s H α 0 α 0 ] β 0 , P β 0 , P 0 ∼ [ s H α 0 α 0 ] β 0 , D β 0 , D ff , where [ s H α 0 α 0 ] β 0 , P β 0 , P ⪰ 0 , [ s H α 0 α 0 ] β 0 , D β 0 , D ⪯ 0 , w e ha v e − 2 1 µ a s H α a α 0 Π + ( − s H α 0 α 0 ) = − 2 1 µ a ” s H α a α P 0 0 ı « 0 0 ∼ − [ s H α 0 α 0 ] β 0 , D β 0 , D ff = 0 , ∀ a ∈ I + , 2 1 − µ b Π + ( s H α 0 α 0 ) s H α 0 α b = 2 1 − µ b « [ s H α 0 α 0 ] β 0 , P β 0 , P 0 ∼ 0 ff « 0 s H α D 0 α b ff = 0 , ∀ b ∈ I − . In additional with s H α a α 0 s H α 0 α b = ” s H α a α P 0 0 ı « 0 s H α D 0 α b ff = 0 , ∀ a ∈ I + , b ∈ I − , w e get E ( s Z ; s H ) = » — – 0 0 0 ∼ 2 P c ∈I + 1 µ c s H α 0 α c s H α c α 0 0 ∼ ∼ 0 fi ffi fl = » — — — – 0 0 0 0 ∼ 2 P c ∈I + 1 µ c s H α P 0 α c s H α c α P 0 0 0 ∼ ∼ 0 0 ∼ ∼ ∼ 0 fi ffi ffi ffi fl from E ( s Z ; s H ) ’s definition in ( 34 ). Now w e calculate Θ( s Z ; s H , W ) . The most complex part is Π ′ + ( s H α 0 α 0 ; W α 0 α 0 ) . 
With κ P set as in ( 53 ), λ min ( κ P · [ X sc ] α P 0 α P 0 ) = λ max (2 P c ∈I + 1 µ c s H α P 0 α c s H α c α P 0 ) . Thus, κ P · [ X sc ] α P 0 α P 0 − 2 P c ∈I + 1 µ c s H α P 0 α c s H α c α P 0 ⪰ 0 . Symmetrically , κ D · [ σ S sc ] α D 0 α D 0 + 2 P c ∈I − 1 µ c s H α D 0 α c s H α c α D 0 ⪰ 0 . Define x W : = ( Q 0 ) T W α 0 α 0 Q 0 =( Q 0 ) T « κ P · [ X sc ] α P 0 α P 0 − 2 P c ∈I + 1 µ c s H α P 0 α c s H α c α P 0 0 ∼ − κ D · [ σ S sc ] α D 0 α D 0 − 2 P c ∈I − 1 µ c s H α D 0 α c s H α c α D 0 ff Q 0 , 35 with the blo c k-diagonal Q 0 defined in Lemma 4 . Thus, x W β 0 , P β 0 , P ⪰ 0 , x W β 0 , D β 0 , D ⪯ 0 , and x W β 0 , P β 0 , D = 0 . Consequen tly , Π + ( x W β 0 , 0 β 0 , 0 ) = « x W β P 0 , 0 β P 0 , 0 0 ∼ 0 ff b ecause x W β P 0 , 0 β P 0 , 0 (resp. x W β D 0 , 0 β D 0 , 0 ) is a principle submatrix of x W β 0 , P β 0 , P (resp. x W β 0 , D β 0 , D ). Now, Π ′ + ( s H α 0 α 0 ; W α 0 α 0 ) = Q 0 » — — — — — – x W β 0 , + β 0 , + x W β 0 , + β P 0 , 0 x W β 0 , + β D 0 , 0 n η 0 ,i η 0 ,i − η 0 ,j x W β 0 ,i β 0 ,j o i ∈I 0 , + j ∈I 0 , − ∼ [Π + ( x W β 0 , 0 β 0 , 0 )] γ 0 , 0 , P γ 0 , 0 , P [Π + ( x W β 0 , 0 β 0 , 0 )] γ 0 , 0 , P γ 0 , 0 , D 0 ∼ ∼ [Π + ( x W β 0 , 0 β 0 , 0 )] γ 0 , 0 , D γ 0 , 0 , D 0 ∼ ∼ ∼ 0 fi ffi ffi ffi ffi ffi fl ( Q 0 ) T = « Q 0 β 0 , P β 0 , P 0 0 Q 0 β 0 , D β 0 , D ff » — — — — – x W β 0 , + β 0 , + x W β 0 , + β P 0 , 0 0 0 ∼ x W β P 0 , 0 β P 0 , 0 0 0 ∼ ∼ 0 0 ∼ ∼ ∼ 0 fi ffi ffi ffi ffi fl « Q 0 β 0 , P β 0 , P 0 0 Q 0 β 0 , D β 0 , D ff T = « W α P 0 α P 0 0 ∼ 0 ff . Therefore, from ( 33 ), Θ( s Z ; s H , W ) = » — — — – W α + α + W α + α P 0 0 0 ∼ W α P 0 α P 0 0 0 ∼ ∼ 0 0 ∼ ∼ ∼ 0 fi ffi ffi ffi fl = « W α P α P 0 ∼ 0 ff . Com bining E ( s Z ; s H ) and Θ( s Z ; s H , W ) : Θ( s Z ; s H , W ) + E ( s Z ; s H ) = » — — — – κ P · [ X sc − s X ] α + α + κ P · [ X sc ] α + α P 0 0 0 ∼ κ P · [ X sc ] α P 0 α P 0 0 0 ∼ ∼ 0 0 ∼ ∼ ∼ 0 fi ffi ffi ffi fl = κ P · ( X sc − s X ) . 
By the definition of $X_{sc}$ and $\bar{X}$, $\mathcal{P}X_{sc}=\mathcal{P}\bar{X}=\mathcal{P}\tilde{X}$. Thus, $\mathcal{P}(\Theta(\bar{Z};\bar{H},W)+\mathcal{E}(\bar{Z};\bar{H}))=\kappa_P\cdot\mathcal{P}(X_{sc}-\bar{X})=0$.
(ii) Dual part. Following the same procedure as for the primal part, we can show that
\[
\mathcal{E}^\perp(\bar{Z};\bar{H})=
\begin{bmatrix}
0 & 0 & 0 & 0 \\
\sim & 0 & 0 & 0 \\
\sim & \sim & 2\sum_{c\in\mathcal{I}_-}\frac{1}{\mu_c}\bar{H}_{\alpha_0^D\alpha_c}\bar{H}_{\alpha_c\alpha_0^D} & 0 \\
\sim & \sim & \sim & 0
\end{bmatrix}
\quad\text{and}\quad
\Theta^\perp(\bar{Z};\bar{H},W)=
\begin{bmatrix}
0 & 0 \\
\sim & W_{\alpha_D\alpha_D}
\end{bmatrix} .
\]
With the fact that $\mathcal{P}^\perp S_{sc}=\mathcal{P}^\perp\bar{S}=\mathcal{P}^\perp C$:
\[
\mathcal{P}^\perp(\Theta^\perp(\bar{Z};\bar{H},W)+\mathcal{E}^\perp(\bar{Z};\bar{H}))=-\kappa_D\cdot\mathcal{P}^\perp(\sigma S_{sc}-\sigma\bar{S})=0 .
\]
(iii) Combining the primal and dual parts:
\[
\Psi(\bar{Z};\bar{H})=-\mathcal{P}\mathcal{E}(\bar{Z};\bar{H})-\mathcal{P}^\perp\mathcal{E}^\perp(\bar{Z};\bar{H})=\mathcal{P}\Theta(\bar{Z};\bar{H},W)+\mathcal{P}^\perp\Theta^\perp(\bar{Z};\bar{H},W) ,
\]
which directly implies $\Psi(\bar{Z};\bar{H})\in\mathcal{K}(\bar{Z};\bar{H})$ and $\phi(\bar{Z};\bar{H})=0$.

6.3 Discussion: Small yet Non-Zero $\angle(\Delta Z^{(k)},\Delta Z^{(k+1)})$

From Proposition 6, as long as $\bar{Z}\in\mathcal{Z}^\star$ and $Z^{(k)}-\bar{Z}\in\mathcal{C}(\bar{Z})\setminus\mathcal{T}_{\mathcal{Z}^\star}(\bar{Z})$, $\phi(\bar{Z};Z^{(k)}-\bar{Z})$ is guaranteed to be nonzero. In this case, the higher-order term in the second-order local limit dynamics (43) can be (transiently) omitted, and
\[
\Delta Z^{(k)}\approx\tfrac{1}{2}\phi(\bar{Z};Z^{(k)}-\bar{Z})\sim o(\|Z^{(k)}-\bar{Z}\|_F) .
\]
The approximation becomes more and more accurate as $Z^{(k)}-\bar{Z}\to 0$. In this case,
\[
\angle(\Delta Z^{(k)},\Delta Z^{(k+1)})\approx\angle\big(\phi(\bar{Z};Z^{(k)}-\bar{Z}),\ \phi(\bar{Z};Z^{(k)}-\bar{Z}+\Delta Z^{(k)})\big) .
\]
Therefore, as long as $\phi(\bar{Z};\cdot)$ exhibits a suitable type of continuity at $Z^{(k)}-\bar{Z}$ (an "almost-sure" type of continuity will be established in §8.2), one can expect $\angle(\Delta Z^{(k)},\Delta Z^{(k+1)})\to 0$ as $Z^{(k)}-\bar{Z}\to 0$. The "small yet non-zero" effect may be due to the presence of higher-order terms.

We empirically verify our analysis with the three SDP examples defined in §10. Across all examples, we fix $\sigma=1$ and set the tolerance for $r_{\max}$ to $10^{-14}$. The maximum three-step ADMM iteration number is set to 1000. The direction $\bar{H}\in\mathcal{C}(\bar{Z})\setminus\mathcal{T}_{\mathcal{Z}^\star}(\bar{Z})$ is chosen as:
1.
For (SDP-I), $a=1$ and $b=1$ in (58). The corresponding $\phi(\bar{Z};\bar{H})$ is defined in (59).
2. For (SDP-II), $\bar{H}_{12}=1$, $\bar{H}_{22}=1$, $\bar{H}_{23}=1$ in (60). The corresponding $\phi(\bar{Z};\bar{H})$ is defined in (61).
3. For (SDP-III), $h=1$ and $\epsilon=0$ in (62). The corresponding $\phi(\bar{Z};\bar{H})$ is defined in (63).

The initial guess $Z^{(0)}$ is set to $\bar{Z}+t\bar{H}$ for different values of $t$. Accordingly, $X^{(0)}=\Pi_+(Z^{(0)})$ and $S^{(0)}=-\frac{1}{\sigma}\Pi_-(Z^{(0)})$. We track the trajectories of four quantities:
\[
\|\Delta Z^{(k)}\|_F,\qquad
\angle(\Delta Z^{(k)},\Delta Z^{(k+1)}),\qquad
\frac{\|0.5\,\phi(\bar{Z};Z^{(0)}-\bar{Z})-\Delta Z^{(k)}\|_F}{\|\Delta Z^{(k)}\|_F},\qquad
\frac{\|0.5\,\phi(\bar{Z};Z^{(k)}-\bar{Z})-\Delta Z^{(k)}\|_F}{\|\Delta Z^{(k)}\|_F} .
\]

Discussion on $\|\Delta Z^{(k)}\|_F$. The results are shown in Figure 4. When $t$ is relatively large (e.g., $\log_{10}(t)>-2$), ADMM still exhibits an observable linear convergence rate. However, as $t\downarrow 0$, this rate approaches 1 and the iterations nearly stall. On the other hand, when $t$ is sufficiently small (e.g., $\log_{10}(t)<-3.5$), the second-order term starts to dominate the dynamics, and $\Delta Z^{(k)}$ transiently converges to $\frac{t^2}{2}\phi(\bar{Z};\bar{H})$. This quadratic relationship is evident in Figure 4: as $\log_{10}(t)$ decreases by 1, the transiently convergent $\log_{10}(\|\Delta Z^{(k)}\|_F)$ decreases by approximately 2 across all three SDP examples.

Figure 4 (panels: (SDP-I), (SDP-II), (SDP-III)): $\log_{10}(\|\Delta Z^{(k)}\|_F)$ in the three SDP examples. In each example, the initialization is chosen as $Z^{(0)}=\bar{Z}+t\bar{H}$, where $\bar{Z}\in\mathcal{Z}^\star$ and $\bar{H}\in\mathcal{C}(\bar{Z})\setminus\mathcal{T}_{\mathcal{Z}^\star}(\bar{Z})$, and we sweep $t$ from $10^{-1}$ to $10^{-5}$. $\sigma$ is fixed to 1.

Discussion on $\angle(\Delta Z^{(k)},\Delta Z^{(k+1)})$. The results are shown in Figure 5. As $t\downarrow 0$, the transiently convergent $\angle(\Delta Z^{(k)},\Delta Z^{(k+1)})$ tends to become smaller.
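For readers who wish to reproduce the angle diagnostic, the following is a minimal sketch of one way to compute the angle between two symmetric matrices under the Frobenius inner product. It is an assumption on our part, not the implementation used for the figures: the function name `frob_angle` is hypothetical, and the formula is the standard two-argument arctangent expression for unit vectors, chosen because the more direct arccosine of the normalized inner product is known to lose accuracy for very small angles.

```python
import numpy as np


def frob_angle(A, B):
    """Angle (radians) between matrices A and B under the Frobenius inner product.

    Both inputs are rescaled to unit Frobenius norm first, so the remaining
    arithmetic acts on O(1) quantities even when the raw entries are tiny.
    """
    na = np.linalg.norm(A)  # Frobenius norm for 2-D arrays
    nb = np.linalg.norm(B)
    if na == 0.0 or nb == 0.0:
        return float("nan")  # the angle is undefined for a zero increment
    a = (A / na).ravel()
    b = (B / nb).ravel()
    # For unit vectors, angle = 2 * atan2(|a - b|, |a + b|).
    return 2.0 * np.arctan2(np.linalg.norm(a - b), np.linalg.norm(a + b))


# Usage: two nearly parallel increments of magnitude ~1e-12.
A = np.array([[1.0, 0.0], [0.0, 0.0]])
B = np.array([[0.0, 1.0], [1.0, 0.0]])
print(frob_angle(1e-12 * A, 1e-12 * (A + 1e-6 * B)))
```

The result is invariant to a common or separate positive rescaling of the two arguments, which is the relevant property when both increments shrink like $t^2$.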
One noticeable phenomenon is that the convergent angle does not appear to decrease monotonically: in all examples, as $t$ decreases from $10^{-4}$ to $10^{-5}$, the transiently convergent angle actually increases. This behavior may be caused by numerical issues when computing angles between two extremely small vectors in double precision.

Figure 5 (panels: (SDP-I), (SDP-II), (SDP-III)): $\log_{10}(\angle(\Delta Z^{(k)},\Delta Z^{(k+1)}))$ in the three SDP examples. In each example, the initialization is chosen as $Z^{(0)}=\bar{Z}+t\bar{H}$, where $\bar{Z}\in\mathcal{Z}^\star$ and $\bar{H}\in\mathcal{C}(\bar{Z})\setminus\mathcal{T}_{\mathcal{Z}^\star}(\bar{Z})$, and we sweep $t$ from $10^{-1}$ to $10^{-5}$. $\sigma$ is fixed to 1.

Discussion on $\|0.5\,\phi(\bar{Z};Z^{(0)}-\bar{Z})-\Delta Z^{(k)}\|_F/\|\Delta Z^{(k)}\|_F$. The results are shown in Figure 6. As $t\downarrow 0$, $\Delta Z^{(k)}$ first transiently converges to $0.5\,\phi(\bar{Z};Z^{(0)}-\bar{Z})=\frac{t^2}{2}\phi(\bar{Z};\bar{H})$, and then gradually deviates from it. This deviation is caused by the change of $Z^{(k)}$ discussed in §4.4.

Figure 6 (panels: (SDP-I), (SDP-II), (SDP-III)): $\log_{10}\big(\|0.5\,\phi(\bar{Z};Z^{(0)}-\bar{Z})-\Delta Z^{(k)}\|_F/\|\Delta Z^{(k)}\|_F\big)$ in the three SDP examples. For visualization, we clamp the values from above at 1. In each example, the initialization is chosen as $Z^{(0)}=\bar{Z}+t\bar{H}$, where $\bar{Z}\in\mathcal{Z}^\star$ and $\bar{H}\in\mathcal{C}(\bar{Z})\setminus\mathcal{T}_{\mathcal{Z}^\star}(\bar{Z})$, and we sweep $t$ from $10^{-1}$ to $10^{-5}$. $\sigma$ is fixed to 1.

Discussion on $\|0.5\,\phi(\bar{Z};Z^{(k)}-\bar{Z})-\Delta Z^{(k)}\|_F/\|\Delta Z^{(k)}\|_F$. The results are shown in Figure 7. Since a complete description of $\mathcal{C}(\bar{Z})$ and the corresponding $\phi(\bar{Z};\cdot)$ is hard to obtain in (SDP-III), we report only the results for (SDP-I) and (SDP-II). Unlike in Figure 6, the second-order limit predictor $0.5\,\phi(\bar{Z};Z^{(k)}-\bar{Z})$ stably tracks $\Delta Z^{(k)}$. Interestingly, as $\log_{10}(t)$ decreases by 1, the logarithm of the relative tracking error also decreases by 1. When $\log_{10}(t)\ge -1.5$, the trajectories tend to be noisy.
This is because, when $t$ is relatively large, $\Delta Z^{(k)}$ converges linearly to 0 quickly, so the division becomes unstable in double precision. We stop the trajectories early as soon as $r^{(k)}_{\max}$ approaches $10^{-14}$.

Figure 7 (panels: (SDP-I), (SDP-II)): $\log_{10}\big(\|0.5\,\phi(\bar{Z};Z^{(k)}-\bar{Z})-\Delta Z^{(k)}\|_F/\|\Delta Z^{(k)}\|_F\big)$ in the first two SDP examples. In each example, the initialization is chosen as $Z^{(0)}=\bar{Z}+t\bar{H}$, where $\bar{Z}\in\mathcal{Z}^\star$ and $\bar{H}\in\mathcal{C}(\bar{Z})\setminus\mathcal{T}_{\mathcal{Z}^\star}(\bar{Z})$, and we sweep $t$ from $10^{-1}$ to $10^{-5}$. $\sigma$ is fixed to 1.

7 Range of $\phi(\bar{Z};\cdot)$

The second property of $\phi(\bar{Z};\cdot)$ that we study is its range, i.e., $\operatorname{ran}(\phi(\bar{Z};\cdot))$. For the one-step ADMM iteration (4), $\phi(\bar{Z};\cdot)$ can be interpreted as a second-order local "steady-state response" of $\Delta Z^{(k)}$ from any initialization $Z^{(0)}=Z$ satisfying $Z-\bar{Z}\in\mathcal{C}(\bar{Z})$ and $Z\to\bar{Z}$, after filtering out all transient directions. It is therefore natural to expect that $\operatorname{ran}(\phi(\bar{Z};\cdot))$ lies in a subset whose dimension is much lower than that of the ambient space $\mathbb{S}^n$. We are particularly interested in how $\operatorname{ran}(\phi(\bar{Z};\cdot))$ relates to $\mathcal{C}(\bar{Z})$ (up to an affine hull). For example, if one could establish that $\operatorname{ran}(\phi(\bar{Z};\cdot))\subseteq\mathcal{C}(\bar{Z})$ under suitable conditions, then it would follow immediately that $\bar{Z}+\mathcal{C}(\bar{Z})$ is an invariant set for the local second-order limit dynamics (43) when the higher-order term $o(\|Z^{(k)}-\bar{Z}\|_F^2)$ is neglected.

In §7.1, we present a negative result: under Assumption 1 alone, one cannot even guarantee $\operatorname{ran}(\phi(\bar{Z};\cdot))\subseteq\operatorname{aff}(\mathcal{C}(\bar{Z}))$. In §7.2, however, we show that the inclusion $\operatorname{ran}(\phi(\bar{Z};\cdot))\subseteq\operatorname{aff}(\mathcal{C}(\bar{Z}))$ does hold once uniqueness of either the primal or the dual optimal solution is imposed. Finally, in §7.3, we discuss almost-invariant sets around $\bar{Z}$, leveraging the structure of $\operatorname{ran}(\phi(\bar{Z};\cdot))$.
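The inclusion question raised above can also be probed numerically on small instances. Since $\mathcal{C}(\bar{Z})$ is a cone, its affine hull coincides with its linear span, so membership of a candidate value of $\phi$ reduces to a least-squares residual against sampled cone elements. The sketch below is illustrative only: the helper `dist_to_span` is a hypothetical name, and the sample vectors are assumed to be vectorized elements of the cone supplied by the user (they must span the hull for the test to be conclusive).

```python
import numpy as np


def dist_to_span(samples, x):
    """Euclidean distance from x to span(samples).

    `samples` is a 2-D array whose rows are the spanning vectors, e.g.
    vectorized matrices H.ravel() drawn from the cone; `x` is the
    vectorized candidate, e.g. phi.ravel().
    """
    basis = samples.T  # columns are the spanning vectors
    coeffs, *_ = np.linalg.lstsq(basis, x, rcond=None)
    return float(np.linalg.norm(basis @ coeffs - x))


# Usage on a toy example: the samples span the xy-plane in R^3.
samples = np.array([[1.0, 0.0, 0.0],
                    [0.0, 1.0, 0.0],
                    [1.0, 1.0, 0.0]])
print(dist_to_span(samples, np.array([1.0, 2.0, 0.0])))  # in the span
print(dist_to_span(samples, np.array([0.0, 0.0, 1.0])))  # orthogonal to it
```

A distance that stays bounded away from zero as more samples are added indicates a component outside $\operatorname{aff}(\mathcal{C}(\bar{Z}))$; a distance near zero is only suggestive, since finitely many samples need not span the full hull.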
7.1 General Case: ran( ϕ ( s Z ; · )) Ę aff ( C ( s Z )) F rom ( SDP-I ) and ( SDP-I I ), one may conjecture that ran( ϕ ( s Z ; · )) ⊆ C ( s Z ) , or at least ran( ϕ ( s Z ; · )) ⊆ aff ( C ( s Z )) . How ev er, this is not true in general. Prop osition 7 ( ran( ϕ ( s Z ; · )) Ę aff ( C ( s Z )) in general) . Ther e exists an SDP data triplet ( A , b, C ) satisfying Assumption 1 , e quipp e d with s Z ∈ Z ⋆ , such that ran( ϕ ( s Z ; · )) Ę aff ( C ( s Z )) . Pr o of. F rom Prop osition 2 (2), H α P 0 α D 0 = 0 for all H ∈ C ( s Z ) . Th us, as long as w e can construct an SDP satisfying Assumption 1 , such that there exists s Z ∈ Z ⋆ and s H ∈ C ( s Z ) with ϕ ( s Z ; s H ) α P 0 α D 0 = 0 , the claim 39 holds. Please see ( SDP-I II ) for a concrete construction. Sp ecifically , consider s H = s H ( h, 0) in ( 62 ) with h > ? 2 . Then, from ( 64 ): ϕ ( s Z ; s H ) = » — — — — — — — — – − 4 9 σ − 2 ? 2 9 σ 0 0 − h 3 0 ∼ » — — — – 2 9 σ 4 9 σ 0 0 ∼ 2 9 σ 0 ? 2 h 12 ∼ ∼ 0 − h 3 ∼ ∼ ∼ − h 2 6 fi ffi ffi ffi fl − 2 3 σ 0 0 0 ∼ ∼ h 2 6 fi ffi ffi ffi ffi ffi ffi ffi ffi fl . Clearly , ϕ ( s Z ; s H ) α P 0 α D 0 = 0 . 7.2 Under One-Sided Uniqueness: ran( ϕ ( s Z ; · )) ⊆ aff ( C ( s Z )) Although in general ran( ϕ ( s Z ; · )) Ę aff ( C ( s Z )) , w e will sho w that this inclusion do es hold once additional conditions are imp osed. Before pro ceeding, w e first state a few elementary lemmas on relative interiors and affine h ulls. Lemma 8. Given a finite-dimensional Hilb ert sp ac e H . Assume M is an affine set and S ⊆ M . If ther e exists ¯ x ∈ M and ϵ > 0 , such that ( ¯ x + ϵ B ) ∩ M ⊆ S , then aff ( S ) = M . Pr o of. Clearly , aff ( S ) ⊆ M . F or the other direction, denote M as ¯ x + L , where L = M − M is the linear subspace parallel to M . Then, by assumption, (( ¯ x + ϵ B ) ∩ M ) − ¯ x = ϵ B ∩ L ⊆ S − ¯ x . But aff ( ϵ B ∩ L ) = L : picking an y v ∈ L and t ∈ R large enough, w e get v /t ∈ ϵ B . Thus, v/t ∈ ϵ B ∩ L and v = t · v /t ∈ span( ϵ B ∩ L ) . 
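Since $0\in\epsilon\mathbb{B}\cap L$, we get $\operatorname{span}(\epsilon\mathbb{B}\cap L)=\operatorname{aff}(\epsilon\mathbb{B}\cap L)$. By the minimality of the affine hull, $L=\operatorname{aff}(\epsilon\mathbb{B}\cap L)\subseteq\operatorname{aff}(S-\bar{x})$. The proof is completed by noting that $\operatorname{aff}(S)=\bar{x}+\operatorname{span}(S-\bar{x})\supseteq\bar{x}+L=M$.

Lemma 9. Let $\mathcal{H}$ be a finite-dimensional Hilbert space and let $C_1,C_2\subset\mathcal{H}$ be two convex sets. If $\operatorname{ri}(C_1)\cap\operatorname{ri}(C_2)\neq\emptyset$, then $\operatorname{aff}(C_1)\cap\operatorname{aff}(C_2)=\operatorname{aff}(C_1\cap C_2)$.

Proof. For ease of notation, set $M_1:=\operatorname{aff}(C_1)$, $M_2:=\operatorname{aff}(C_2)$, and $M:=M_1\cap M_2$. Take any $\bar{x}\in\operatorname{ri}(C_1)\cap\operatorname{ri}(C_2)$. By definition, there exist $\epsilon_1,\epsilon_2>0$ such that $(\bar{x}+\epsilon_1\mathbb{B})\cap M_1\subseteq C_1$ and $(\bar{x}+\epsilon_2\mathbb{B})\cap M_2\subseteq C_2$. Setting $\epsilon=\min\{\epsilon_1,\epsilon_2\}$, we get
\[
(\bar{x}+\epsilon\mathbb{B})\cap M=((\bar{x}+\epsilon\mathbb{B})\cap M_1)\cap((\bar{x}+\epsilon\mathbb{B})\cap M_2)\subseteq C_1\cap C_2 .
\]
Thus, by $(C_1\cap C_2)\subseteq M$ and Lemma 8, we get $\operatorname{aff}(C_1\cap C_2)=M$.

Lemma 10. Let $\mathcal{H}$ be a finite-dimensional Hilbert space and let $C_1,C_2\subset\mathcal{H}$ be two sets. Then $\operatorname{aff}(C_1+C_2)=\operatorname{aff}(C_1)+\operatorname{aff}(C_2)$.

Proof. The "$\subseteq$" part. Since $C_1\subseteq\operatorname{aff}(C_1)$ and $C_2\subseteq\operatorname{aff}(C_2)$, we have $C_1+C_2\subseteq\operatorname{aff}(C_1)+\operatorname{aff}(C_2)$. Thus, by the minimality of the affine hull, $\operatorname{aff}(C_1+C_2)\subseteq\operatorname{aff}(C_1)+\operatorname{aff}(C_2)$. The "$\supseteq$" part. Take any $u\in\operatorname{aff}(C_1)$ and $v\in\operatorname{aff}(C_2)$. By the definition of the affine hull, there exist $x_i\in C_1$ and coefficients with $\sum_i\alpha_i=1$ such that $u=\sum_i\alpha_ix_i$. Similarly, there exist $y_j\in C_2$ and coefficients with $\sum_j\beta_j=1$ such that $v=\sum_j\beta_jy_j$. Thus,
\[
u+v=\sum_i\alpha_ix_i+\sum_j\beta_jy_j=\sum_{i,j}(\alpha_i\beta_j)(x_i+y_j) .
\]
Observing that $\sum_{i,j}\alpha_i\beta_j=1$ and $x_i+y_j\in C_1+C_2$, we get $u+v\in\operatorname{aff}(C_1+C_2)$.

Proposition 8 ($\operatorname{ran}(\phi(\bar{Z};\cdot))\subseteq\operatorname{aff}(\mathcal{C}(\bar{Z}))$ under one-sided uniqueness). Under Assumption 1, if either the primal or the dual optimal solution is unique, then $\operatorname{ran}(\phi_P(\bar{Z};\cdot))\subseteq\operatorname{aff}(\mathcal{C}_P(\bar{Z}))$ and $\operatorname{ran}(\phi_D(\bar{Z};\cdot))\subseteq\operatorname{aff}(\mathcal{C}_D(\bar{Z}))$. Consequently, $\operatorname{ran}(\phi(\bar{Z};\cdot))\subseteq\operatorname{aff}(\mathcal{C}(\bar{Z}))$.

Proof. We only prove the case in which the primal solution is unique.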
The dual solution unique case can be pro ven symmetrically . Since the primal optimal solution is unique, X sc = s X . Th us, pic king an y s H ∈ C ( s Z ) , w e hav e s H α P 0 α P 0 = 0 , s H α P 0 α D 0 = 0 , s H α P 0 α − = 0 . Thus, Q 0 is degraded to Q 0 β 0 , D β 0 , D . The cones in ( 27 ) are degraded to: C P ( s Z ) = H = » — – H α + α + H α + α P 0 0 ∼ 0 0 ∼ ∼ 0 fi ffi fl P H = 0 , C D ( s Z ) = H = » — – 0 0 0 ∼ H α D 0 α D 0 H α D 0 α − ∼ ∼ H α − α − fi ffi fl P ⊥ H = 0 , H α P 0 α P 0 ⪯ 0 = H P ⊥ H = 0 looooooooomoooooooo on = : C 1 ∩ H = » — – 0 0 0 ∼ H α D 0 α D 0 ⪯ 0 H α D 0 α − ∼ ∼ H α − α − fi ffi fl lo oooooooooooooooooooooooo omo oooooooooooooooooooooooo on = : C 2 . Since C P ( s Z ) is already affine, aff ( C P ( s Z )) = C P ( s Z ) . F or C D ( s Z ) = C 1 ∩ C 2 , w e shall pro ve that − S sc + s S ∈ ri( C 1 ) ∩ ri( C 2 ) . T o see this: since P ⊥ S sc = P ⊥ s S = P ⊥ C , then − S sc + s S ∈ C 1 = ri( C 1 ) ; since [ − S sc + s S ] α D 0 α D 0 = − [ S sc ] α D 0 α D 0 ≺ 0 , then − S sc + s S ∈ ri( C 2 ) . Therefore, in v oking Lemma 9 : aff ( C D ( s Z )) = aff ( C 1 ) ∩ aff ( C 2 ) = H P ⊥ H = 0 ∩ H = » — – 0 0 0 ∼ H α D 0 α D 0 H α D 0 α − ∼ ∼ H α − α − fi ffi fl . On the other hand, the cones in ( 44 ) are degraded to: K ◦ P ( s Z ; s H ) = W = » — – W α + α + W α + α P 0 0 ∼ 0 0 ∼ ∼ 0 fi ffi fl P W = 0 , K ◦ D ( s Z ; s H ) = W = » — — — – 0 0 0 ∼ Q 0 β 0 , D β 0 , D « x W β D 0 , 0 β D 0 , 0 x W β D 0 , 0 β 0 , − ∼ x W β 0 , − β 0 , − ff ( Q 0 β 0 , D β 0 , D ) T W α D 0 α − ∼ ∼ W α − α − fi ffi ffi ffi fl P ⊥ W = 0 , x W = ( Q 0 β 0 , D β 0 , D ) T W α D 0 α D 0 Q 0 β 0 , D β 0 , D , x W β D 0 , 0 β D 0 , 0 ⪯ 0 . Th us, K ◦ P ( s Z ; s H ) = aff ( C P ( s Z )) , K ◦ D ( s Z ; s H ) ⊆ aff ( C D ( s Z )) . By Theorem 7 , ϕ P ( s Z ; s H ) ∈ aff ( C P ( s Z )) and − σ ϕ D ( s Z ; s H ) ∈ aff ( C D ( s Z )) . 
Thus, $\phi(\bar{Z};\bar{H})=\phi_P(\bar{Z};\bar{H})-\sigma\phi_D(\bar{Z};\bar{H})\in\operatorname{aff}(\mathcal{C}_P(\bar{Z}))+\operatorname{aff}(\mathcal{C}_D(\bar{Z}))=\operatorname{aff}(\mathcal{C}(\bar{Z}))$, by Lemma 10.

It remains unclear to us under what conditions the stronger inclusion $\operatorname{ran}(\phi(\bar{Z};\cdot))\subseteq\mathcal{C}(\bar{Z})$ holds.

Remark 3. In [15], one-sided uniqueness of the optimal solution together with Assumption 1 is called the simplicity condition.

7.3 Discussion: Connections to Almost Invariant Sets

Proposition 8 indicates that, under the local second-order limit dynamics (43), $\Delta Z^{(k)}$ lies in $\operatorname{aff}(\mathcal{C}(\bar{Z}))$ whenever $Z^{(k)}\in\mathcal{C}(\bar{Z})$ (up to higher-order terms), provided the additional uniqueness condition holds. Indeed, in both (SDP-I) and (SDP-II), the stronger inclusion $\operatorname{ran}(\phi(\bar{Z};\cdot))\subseteq\mathcal{C}(\bar{Z})$ holds. One can readily verify that dual uniqueness holds in both examples.

On the other hand, in the nonlinear dynamics literature there is the notion of an almost invariant set [13]. Informally, an almost invariant set is a region of the state space that trajectories tend to remain in for a long time, with only a small probability (or small "leakage") of leaving over a prescribed time horizon. This raises the question of whether $\mathcal{C}(\bar{Z})\cap\mathbb{B}_r(\bar{Z})$, for some fixed small $r>0$, can serve as a local almost invariant set for the one-step ADMM dynamics (4). This question is difficult to answer in general, because two competing forces must be balanced: (i) the local first-order dynamics (24) tends to drive $Z^{(k)}$ toward $\mathcal{C}(\bar{Z})$ (Lemma 1); (ii) the local second-order dynamics (25) may drive $Z^{(k)}$ outside $\mathcal{C}(\bar{Z})$, as suggested by Proposition 7.

A simple visualization. We illustrate the two-level effects using (SDP-I). The results are shown in Figure 8. Figure 8(a) depicts the vector field induced by $\phi(\bar{Z};\cdot)$. Figures 8(b)–(e) show one-step ADMM trajectories initialized at $Z^{(0)}=\bar{Z}+tH$ for different choices of $t$ and $H$.
Across all experiments, we fix $\sigma=1$ and set the maximum number of iterations to 1000. We make three empirical observations: (i) Starting from any initialization, $Z^{(k)}$ collapses to $\mathcal{C}(\bar{Z})$ in a single ADMM step, regardless of the choice of $t$. (ii) As $t\to 0$, $\|\Delta Z^{(k)}\|_F$ decreases much faster than $\|Z^{(k)}-\bar{Z}\|_F$, which remains of order $O(t)$. (iii) The trajectories of $Z^{(k)}$ closely resemble the theoretical vector field in (a), regardless of the choice of $t$. Taken together, these observations suggest that $\mathcal{C}(\bar{Z})$ in (SDP-I) is very likely to be an almost invariant set.

8 Continuity of $\phi(\bar{Z};\cdot)$

The third property of $\phi(\bar{Z};\cdot)$ that we study is its continuity on $\mathcal{C}(\bar{Z})$. Perhaps surprisingly, although the residual mapping of the one-step ADMM update (21) is continuous on the entire ambient space $\mathbb{S}^n$, the induced second-order limit map can be discontinuous. Indeed, as defined in (44), the cone-valued mapping $\mathcal{K}^\circ(\bar{Z};\cdot)$ may lose continuity at a point $\bar{H}$ satisfying $\det(\bar{H}_{\alpha_0\alpha_0})=0$, which provides a potential source of discontinuity for $\phi(\bar{Z};\cdot)$. We construct an explicit example in Proposition 9 (§8.1). On the positive side, we show that the set of discontinuity points of $\phi(\bar{Z};\cdot)$ has Lebesgue measure zero on $\operatorname{aff}(\mathcal{C}(\bar{Z}))$ (cf. Proposition 10 in §8.2). Moreover, except for the trivial case $\mathcal{C}(\bar{Z})=\mathcal{T}_{\mathcal{Z}^\star}(\bar{Z})$, the set $\mathcal{C}(\bar{Z})\setminus\mathcal{T}_{\mathcal{Z}^\star}(\bar{Z})$, on which $\phi(\bar{Z};\cdot)$ is nonzero (cf. Proposition 6), has infinite Lebesgue measure (cf. Proposition 11 in §8.2). Together, these results establish an "almost-sure" type of continuity of $\phi(\bar{Z};\cdot)$ on $\mathcal{C}(\bar{Z})$.

In §8.3, we discuss a subtle phenomenon in slow-convergence regions. For most iterations, the angle $\angle(\Delta Z^{(k)},\Delta Z^{(k+1)})$ tends to be small and varies smoothly, as described in §6.3.
Occasionally , how ev er, (∆ Z ( k ) , ∆ Z ( k +1) ) can spike to a large v alue (often close to π 2 ) b efore quic kly returning to a small v alue. W e use the almost-sure contin uit y of ϕ ( s Z ; · ) to explain these “sparse spikes” in the slo w-conv ergence regime. F or small-scale SDP instances, our surrogate limiting mo del ( 43 ) can even accurately predict such microscopic phase transitions. 8.1 Existence of Discon tin uit y Prop osition 9 (Discontin uit y in ϕ ( s Z ; · ) ) . Ther e exists an SDP data triplet ( A , b, C ) satisfying Assumption 1 with s Z ∈ Z ⋆ , { H i } ∞ i =1 ∈ k er( δ ′ s Z ) , and s H ∈ ker( δ ′ s Z ) , s.t. lim i →∞ H i = s H , yet lim i →∞ ϕ ( s Z , H i ) = ϕ ( s Z ; s H ) . Pr o of. Please see ( SDP-I I I ) for a constructiv e example. Concretely , under the SDP data provided by ( SDP-I II ), w e c ho ose a real sequence ϵ i ↓ 0 as i → ∞ . F or s H ( h, ϵ ) in ( 62 ), define H i : = s H ( h, ϵ i ) , s H : = s H ( h, 0) . As 42 (a) Theoretical Results (b) t = 10 − 2 (c) t = 10 − 3 (d) t = 10 − 4 (e) t = 10 − 5 Figure 8: (a) The theoretical v ector field ϕ ( s Z ; H ) in ( SDP-I ), where H ∈ C ( s Z ) . (b)–(e) In ( SDP-I ), tra jectories of Z ( k ) from different initializations Z (0) with v arying t and first-order p erturbation H . W e sw eep t from 10 − 2 to 10 − 5 . F or each fixed t , ( H 11 , H 12 , H 22 ) is sampled from {− 2 , − 1 , 1 , 2 } 3 , yielding 64 initial points in total. long as ϵ i ≥ 0 , { H i } ∞ i =1 and s H all belong to C ( s Z ) \T Z ⋆ ( s Z ) . On the other hand, from ( 63 ) and ( 64 ), as long as h > ? 2 , w e ha ve lim i →∞ ϕ ( s Z ; H i ) = lim i →∞ » — — — — — — — — – − 4 9 σ − 2 ? 2 9 σ 0 − ϵ i 3 − h 3 0 ∼ » — — — – 2 9 σ 4 9 σ 0 0 ∼ 2 9 σ 0 ? 2 h 12 ∼ ∼ h 2 − 2 9 − h 3 ∼ ∼ ∼ − 2 h 2 − 1 9 fi ffi ffi ffi fl − 2 3 σ 0 2 ϵ i 3 σ 0 ∼ ∼ h 2 +1 9 fi ffi ffi ffi ffi ffi ffi ffi ffi fl = » — — — — — — — — – − 4 9 σ − 2 ? 2 9 σ 0 0 − h 3 0 ∼ » — — — – 2 9 σ 4 9 σ 0 0 ∼ 2 9 σ 0 ? 
2 h 12 ∼ ∼ h 2 − 2 9 − h 3 ∼ ∼ ∼ − 2 h 2 − 1 9 fi ffi ffi ffi fl − 2 3 σ 0 0 0 ∼ ∼ h 2 +1 9 fi ffi ffi ffi ffi ffi ffi ffi ffi fl , and ϕ ( s Z ; s H ) = » — — — — — — — — – − 4 9 σ − 2 ? 2 9 σ 0 0 − h 3 0 ∼ » — — — – 2 9 σ 4 9 σ 0 0 ∼ 2 9 σ 0 ? 2 h 12 ∼ ∼ 0 − h 3 ∼ ∼ ∼ − h 2 6 fi ffi ffi ffi fl − 2 3 σ 0 0 0 ∼ ∼ h 2 6 fi ffi ffi ffi ffi ffi ffi ffi ffi fl . 43 Clearly , lim i →∞ ϕ ( s Z , H i ) = ϕ ( s Z ; s H ) . 8.2 Almost-Sure Contin uity In ( SDP-II I ), the p oin t s H at which ϕ ( s Z ; · ) loses con tin uity corresponds to s H α 0 α 0 b eing rank deficien t, i.e., det( s H α 0 α 0 ) = 0 . In the next lemma, we show that ϕ ( s Z ; · ) is contin uous at ev ery s H whose s H α 0 α 0 is nonsingular. Lemma 11 (Contin uit y of ϕ ( s Z ; · ) at nonsingular s H α 0 α 0 ) . Under Assumption 1 , supp ose further that s Z ∈ Z ⋆ is singular, i.e., | α 0 | > 0 . Then ϕ ( s Z ; · ) is c ontinuous at every s H ∈ C ( s Z ) such that s H α 0 α 0 is nonsingular. Pr o of. W e pro ve the contin uity of ϕ P ( s Z ; · ) and ϕ D ( s Z ; · ) separately , and begin with the primal part. Since s H α 0 α 0 is nonsingular, we hav e | β 0 , 0 | = 0 . Hence, b y ( 44 ), the cone K ◦ P ( s Z ; s H ) reduces to K ◦ P ( s Z ; s H ) = W = » — — — – W α + α + W α + α 0 0 ∼ Q 0 « x W β 0 , P β 0 , P 0 ∼ 0 ff ( Q 0 ) T 0 ∼ ∼ 0 fi ffi ffi ffi fl P W = 0 , x W = ( Q 0 ) T W α 0 α 0 Q 0 = W = » — — — – W α + α + W α + α P 0 W α + α D 0 0 ∼ W α P 0 α P 0 0 0 ∼ ∼ 0 0 ∼ ∼ ∼ 0 fi ffi ffi ffi fl lo oooooooooooooooooooooooooooomo ooooooooooooooooooooooooooo on = : M 1 ∩ { W | P W = 0 } loooooooomoooooooon = : M 2 , where the last equalit y uses Lemma 4 . By Proposition 2 (2), define C 1 : = H = » — — — – H α + α + H α + α P 0 H α + α D 0 0 ∼ H α P 0 α P 0 0 0 ∼ ∼ 0 0 ∼ ∼ ∼ 0 fi ffi ffi ffi fl H α P 0 α P 0 ⪰ 0 , C 2 = M 2 . Then, C P ( s Z ) = C 1 ∩ C 2 , aff ( C 1 ) = M 1 , and aff ( C 2 ) = M 2 . 
With the observation that $X_{sc} - \bar X \in \operatorname{ri}(\mathcal C_1) \cap \operatorname{ri}(\mathcal C_2)$, we have
$$\mathcal K^\circ_P(\bar Z;\bar H) = \operatorname{aff}(\mathcal C_1) \cap \operatorname{aff}(\mathcal C_2) = \operatorname{aff}(\mathcal C_P(\bar Z))$$
by Lemma 9. In particular, the above description of $\mathcal K^\circ_P(\bar Z;\bar H)$ at nonsingular $\bar H_{\alpha_0\alpha_0}$ is independent of $\bar H$. Next, by Weyl's theorem, for any fixed such $\bar H$, there exists $\epsilon > 0$ such that for all $H \in B_\epsilon(\bar H)$, we have $\det(H_{\alpha_0\alpha_0}) \neq 0$. Therefore, for all $H \in B_\epsilon(\bar H) \cap \mathcal C(\bar Z)$,
$$\phi_P(\bar Z; H) = \arg\min_{W \in \mathcal K^\circ_P(\bar Z; H)} \| W + \mathcal E^\perp(\bar Z; H) \|_F^2 = \arg\min_{W \in \operatorname{aff}(\mathcal C_P(\bar Z))} \| W + \mathcal E^\perp(\bar Z; H) \|_F^2 = \Pi_{\operatorname{aff}(\mathcal C_P(\bar Z))}\big( -\mathcal E^\perp(\bar Z; H) \big),$$
by Theorem 7. Since $\Pi_{\mathbb S^n_+}(\cdot)$ (resp. $\Pi_{\mathbb S^n_-}(\cdot)$) is continuous on $\mathbb S^n$, it follows from (34) that $-\mathcal E^\perp(\bar Z;\cdot)$ is continuous on $B_\epsilon(\bar H) \cap \mathcal C(\bar Z)$. Moreover, the projection mapping $\Pi_{\operatorname{aff}(\mathcal C_P(\bar Z))}(\cdot)$ is continuous. Therefore, $\phi_P(\bar Z;\cdot)$ is continuous on $B_\epsilon(\bar H) \cap \mathcal C(\bar Z)$, and in particular continuous at $\bar H$. The continuity of $\phi_D(\bar Z;\cdot)$ at such an $\bar H$ can be deduced symmetrically. Thus, $\phi(\bar Z;\cdot) = \phi_P(\bar Z;\cdot) - \sigma \phi_D(\bar Z;\cdot)$ is continuous at $\bar H$ with $\det(\bar H_{\alpha_0\alpha_0}) \neq 0$.

From now on, abbreviate $\operatorname{aff}(\mathcal C(\bar Z))$ as $\mathcal L$. Suppose the dimension of $\mathcal L$ is $d$. Let $\rho_d$ be the standard Lebesgue measure on $\mathbb R^d$. Fix $F$ as any linear isomorphism from $\mathbb R^d$ to $\mathcal L$. Then, the $d$-dimensional Lebesgue measure on $\mathcal L$ is defined by
$$\rho_{\mathcal L}(A) = \rho_d(F^{-1}(A)), \quad \forall A \subset \mathcal L \text{ Borel}. \tag{54}$$
Please note that the choice of $F$ only affects $\rho_{\mathcal L}$ by a positive constant. We first show that the set of points at which $\phi(\bar Z;\cdot)$ is discontinuous has measure zero with respect to $\rho_{\mathcal L}$.

Proposition 10 (Measure-zero discontinuity). Under Assumption 1, fix any $\bar Z \in \mathcal Z_\star$. Suppose $\rho_{\mathcal L}$ is defined in (54). Then:
$$\rho_{\mathcal L}\big( \{ \bar H \in \mathcal C(\bar Z) \mid \phi(\bar Z;\cdot) \text{ is discontinuous at } \bar H \} \big) = 0.$$

Proof.
If $\bar Z$ is nonsingular, then from Corollary 1(2), $\mathcal C(\bar Z) = \mathcal T_{\mathcal Z_\star}(\bar Z)$. We get $\phi(\bar Z;\bar H) \equiv 0$ for all $\bar H \in \mathcal C(\bar Z)$ from Proposition 6. Thus, the claim trivially holds. Now let us consider the case when $\bar Z$ is singular. Invoking Lemma 11, discontinuity only occurs when $\bar H_{\alpha_0\alpha_0}$ is singular. Denote the polynomial $p : \mathbb S^n \to \mathbb R$ as $p(H) = \det(H_{\alpha_0\alpha_0})$:
$$\{ \bar H \in \mathcal C(\bar Z) \mid \phi(\bar Z;\cdot) \text{ is discontinuous at } \bar H \} \subseteq \{ \bar H \in \mathcal C(\bar Z) \mid \det(\bar H_{\alpha_0\alpha_0}) = 0 \} \subseteq \{ \bar H \in \mathcal L \mid p(\bar H) = 0 \} =: \mathcal D.$$
All we need to prove is $\rho_{\mathcal L}(\mathcal D) = 0$.

(i) We first prove that there exists $\widetilde H \in \mathcal C(\bar Z)$ such that $p(\widetilde H) \neq 0$. Set $\widetilde H$ as
$$\widetilde H = (X_{sc} - \bar X) - (S_{sc} - \bar S) = \begin{bmatrix}
[X_{sc} - \bar X]_{\alpha_+\alpha_+} & [X_{sc}]_{\alpha_+\alpha_0^P} & 0 & 0 \\
\sim & [X_{sc}]_{\alpha_0^P\alpha_0^P} \succ 0 & 0 & 0 \\
\sim & \sim & -[S_{sc}]_{\alpha_0^D\alpha_0^D} \prec 0 & -[S_{sc}]_{\alpha_0^D\alpha_-} \\
\sim & \sim & \sim & -[S_{sc} - \bar S]_{\alpha_-\alpha_-}
\end{bmatrix}.$$
It is easy to verify that
$$\mathcal P \Pi'_+(\bar Z; \widetilde H) + \mathcal P^\perp \Pi'_-(\bar Z; \widetilde H) = \mathcal P (X_{sc} - \bar X) - \mathcal P^\perp (S_{sc} - \bar S) = 0,$$
and $\det(\widetilde H_{\alpha_0\alpha_0}) \neq 0$.

(ii) Consider the restriction $q = p|_{\mathcal L} : \mathcal L \to \mathbb R$. Under the identification $\mathcal L \simeq \mathbb R^d$ via $F$, $\widetilde q = q \circ F : \mathbb R^d \to \mathbb R$ is a polynomial on $\mathbb R^d$. Since $\widetilde q(F^{-1}(\widetilde H)) = q(\widetilde H) \neq 0$ with $\widetilde H \in \mathcal L$, $\widetilde q$ is a nonzero polynomial on $\mathbb R^d$. From [8], the set $\widetilde{\mathcal D} := F^{-1}(\mathcal D) = \{ x \in \mathbb R^d \mid \widetilde q(x) = 0 \}$ is of Lebesgue measure zero, i.e., $\rho_d(\widetilde{\mathcal D}) = 0$. From (54), $\rho_{\mathcal L}(\mathcal D) = \rho_d(F^{-1}(\mathcal D)) = \rho_d(\widetilde{\mathcal D}) = 0$, proving the desired result.

Proposition 10 shows that the set of discontinuity points has measure zero. However, this does not rule out the possibility that $\mathcal C(\bar Z) \setminus \mathcal T_{\mathcal Z_\star}(\bar Z)$, the set on which $\phi(\bar Z;\bar H)$ does not vanish, also has measure zero. The following proposition dispels this concern.

Proposition 11 (Measure of $\mathcal C(\bar Z) \setminus \mathcal T_{\mathcal Z_\star}(\bar Z)$). Under Assumption 1, let $\rho_{\mathcal L}$ be defined in (54).
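Then either of the two cases holds: (i) $\mathcal T_{\mathcal Z_\star}(\bar Z) = \mathcal C(\bar Z)$; (ii) $\mathcal T_{\mathcal Z_\star}(\bar Z) \subsetneq \mathcal C(\bar Z)$ and $\rho_{\mathcal L}(\mathcal C(\bar Z) \setminus \mathcal T_{\mathcal Z_\star}(\bar Z)) = \infty$.

Proof. Case (i) is the trivial case, where $\phi(\bar Z;\bar H) \equiv 0$ for all $\bar H \in \mathcal C(\bar Z)$. For case (ii), there exists $\bar H \in \mathcal C(\bar Z) \setminus \mathcal T_{\mathcal Z_\star}(\bar Z)$ such that at least one of the following two conditions holds: $\bar H_{\alpha_+\alpha_0^D} \neq 0$ and $\bar H_{\alpha_0^P\alpha_-} \neq 0$. Otherwise, from Proposition 3(2), $\bar H \in \mathcal T_{\mathcal Z_\star}(\bar Z)$. Thus, from Proposition 3(1), $\bar H \notin \operatorname{span}(\mathcal T_{\mathcal Z_\star}(\bar Z))$. This gives us $\operatorname{span}(\mathcal T_{\mathcal Z_\star}(\bar Z)) \subsetneq \operatorname{span}(\mathcal C(\bar Z))$, which implies $\dim(\mathcal T_{\mathcal Z_\star}(\bar Z)) < \dim(\mathcal C(\bar Z))$. Since $\mathcal L = \operatorname{aff}(\mathcal C(\bar Z))$, $\rho_{\mathcal L}(\mathcal T_{\mathcal Z_\star}(\bar Z)) = 0$. Thus, by additivity,
$$\rho_{\mathcal L}(\mathcal C(\bar Z)) = \rho_{\mathcal L}(\mathcal C(\bar Z) \setminus \mathcal T_{\mathcal Z_\star}(\bar Z)) + \rho_{\mathcal L}(\mathcal T_{\mathcal Z_\star}(\bar Z)) = \rho_{\mathcal L}(\mathcal C(\bar Z) \setminus \mathcal T_{\mathcal Z_\star}(\bar Z)).$$
On the other hand, since $\mathcal C(\bar Z)$ is a nonempty closed convex cone of dimension $d \ge 1$, $\operatorname{ri}(\mathcal C(\bar Z))$ at least contains a ray $R := \{ t \bar H \mid t > 0 \}$ with a nonzero $\bar H \in \operatorname{ri}(\mathcal C(\bar Z))$. By definition, there exists $r > 0$ such that $A := B_r(\bar H) \cap \mathcal L \subset \mathcal C(\bar Z)$ and $\rho_{\mathcal L}(A) > 0$. Defining $tA := \{ tH \mid H \in A \}$ for any $t > 0$, we get
$$\rho_{\mathcal L}(tA) = t^d \rho_{\mathcal L}(A) \to \infty, \quad \text{as } t \to \infty.$$
Finally, since $tA \subset \mathcal C(\bar Z)$ for all $t > 0$, we conclude that $\rho_{\mathcal L}(\mathcal C(\bar Z)) = \infty$, and hence $\rho_{\mathcal L}(\mathcal C(\bar Z) \setminus \mathcal T_{\mathcal Z_\star}(\bar Z)) = \infty$.

Proposition 11, together with Proposition 10, describes the "almost-sure" continuity of $\phi(\bar Z;\cdot)$ over $\mathcal C(\bar Z)$.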
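To make the measure-zero nature of the discontinuity concrete, the following sketch solves the two-variable least-squares subproblem that determines the $(4,4)$, $(5,5)$, and $(6,6)$ entries of $-\sigma\phi_D$ in (SDP-III) (see §10.3). There, the sign constraint $a \le 0$ is present only on the slice $\epsilon = 0$, where $\bar H_{\alpha_0\alpha_0}$ is singular. The function name `dual_diag_subproblem` and the `constrained` flag are illustrative choices and do not appear in the paper.

```python
import numpy as np

def dual_diag_subproblem(h, constrained):
    """Minimize (a + 1/3)^2 + (-a + b + h^2/3)^2 + b^2 over (a, b).

    constrained=True imposes a <= 0 (the epsilon = 0 slice);
    constrained=False leaves a free (the epsilon > 0 case).
    """
    # Stationarity conditions: 2a - b = (h^2 - 1)/3 and -a + 2b = -h^2/3.
    M = np.array([[2.0, -1.0], [-1.0, 2.0]])
    rhs = np.array([(h**2 - 1.0) / 3.0, -(h**2) / 3.0])
    a, b = np.linalg.solve(M, rhs)
    if constrained and a > 0.0:
        # The objective is strictly convex, so the constrained minimizer
        # lies on a = 0; minimizing over b alone gives b = -h^2/6.
        a, b = 0.0, -(h**2) / 6.0
    return float(a), float(b)

for h in (1.3, 1.6):
    free = dual_diag_subproblem(h, constrained=False)
    bound = dual_diag_subproblem(h, constrained=True)
    print(f"h={h}: free={free}, constrained={bound}")
```

For $h \le \sqrt 2$ the two solutions coincide, while for $h > \sqrt 2$ they differ, which matches the case split between (63) and (64).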
8.3 Discussion: "Spikes" in $\angle(\Delta Z^{(k)}, \Delta Z^{(k+1)})$

The almost-sure continuity of $\phi(\bar Z;\cdot)$, which fails only on a measure-zero set of discontinuity points, provides a natural explanation for the microscopic phase transitions observed inside ADMM's slow-convergence regions: (i) when the iterates $Z^{(k)}$ pass through regions where $\phi(\bar Z; Z^{(k)} - \bar Z)$ varies continuously, the angle $\angle(\Delta Z^{(k)}, \Delta Z^{(k+1)})$ remains small and evolves smoothly, as discussed in §6.3; (ii) when $Z^{(k)}$ hits a discontinuity point of $\phi(\bar Z;\cdot)$, the approximation $\Delta Z^{(k)} \approx \frac{1}{2}\phi(\bar Z; Z^{(k)} - \bar Z)$ abruptly switches to a different displacement vector and quickly stabilizes again. Since $\Delta Z^{(k)}$ is closely related to the KKT residuals in ADMM [24, Lemma 4], we also expect an observable jump in $r^{(k)}_{\max}$.

We use (SDP-III) to illustrate the validity and accuracy of this explanation. By (63) and (64), if $h > \sqrt 2$, then $\bar H(h, 0)$ defined in (62) is a discontinuity point of $\phi(\bar Z;\cdot)$, whereas if $h \le \sqrt 2$, $\bar H(h, 0)$ is a continuity point. Now consider the initialization $Z^{(0)} := \bar Z + t \bar H(h, \epsilon)$. When $t \to 0$ and $\epsilon$ is set to a small positive value, we expect $\angle(\Delta Z^{(k)}, \Delta Z^{(k+1)})$ to exhibit a spike when $h > \sqrt 2$. Moreover, $\epsilon$ should affect the spike's arrival time: the smaller $\epsilon$ is, the earlier the spike occurs.

The results are shown in Figure 9. When $\epsilon = 10^{-2}$, which is too large, no spike is observed even when $h = 1.6$. When $\epsilon = 10^{-3}$, larger values of $h$ lead to earlier spike times. In addition, when $h \le 1.40$, no spike is observed within the first 1000 iterations. Whenever $\angle(\Delta Z^{(k)}, \Delta Z^{(k+1)})$ spikes, there is also a clearly observable jump in $r^{(k+1)}_{\max} - r^{(k)}_{\max}$. By comparison, the jump in $\|\Delta Z^{(k+1)}\|_F - \|\Delta Z^{(k)}\|_F$ is less pronounced. The behavior for $\epsilon = 10^{-4}$ is similar to that for $\epsilon = 10^{-3}$.
When $\epsilon = 10^{-5}$, the spike occurs so early that it becomes indistinguishable from the initial transient phase before $Z^{(k)}$ has converged to $\mathcal C(\bar Z)$.

9 $\sigma$'s Effect on $\phi(\bar Z;\cdot)$

The fourth property of $\phi(\bar Z;\cdot)$ that we study concerns its dependence on $\sigma$, the tunable penalty parameter in (2). This issue is both theoretically and computationally important, since the choice of $\sigma$ can significantly affect ADMM's convergence behavior. In general, the dependence of the one-step residual $\delta(\cdot)$ in (21) on $\sigma$ is highly intricate. Under our local second-order limit dynamics model, however, this relationship becomes much simpler. Specifically, in §9.1, we show that when $\sigma$ is updated to $\sigma'$ (and $Z = \bar Z + t\bar H + o(t)$ is updated to $Z'$), both the primal and dual iterates remain unchanged to first order as long as $\bar H \in \mathcal C(\bar Z)$. Moreover, the updated point $Z'$ continues to lie in $\bar Z' + \mathcal C(\bar Z')$ up to first order, so a corresponding first-order direction $\bar H' \in \mathcal C(\bar Z')$ is well defined. At the second-order level, we obtain a clean scaling law: $\phi_P(\bar Z;\bar H)$ in (49) is updated to $\phi_P(\bar Z';\bar H') = \frac{\sigma'}{\sigma}\phi_P(\bar Z;\bar H)$, and $\phi_D(\bar Z;\bar H)$ in (50) is updated to $\phi_D(\bar Z';\bar H') = \frac{\sigma}{\sigma'}\phi_D(\bar Z;\bar H)$ (cf. §9.2). An immediate corollary is that, under the second-order limiting model, both the primal and dual infeasibilities are invariant to $\sigma$ tuning. Finally, in §9.3, we discuss practical strategies for updating $\sigma$ in the second-order-dominant regime.

9.1 First-Order Effect

Suppose we change $\sigma$ to $\sigma'$. Then, for $Z = \bar Z + t\bar H + \frac{t^2}{2}W + o(t^2)$ with $X = \Pi_+(Z)$ and $S = -\frac{1}{\sigma}\Pi_-(Z)$, it is updated to:
$$Z' := X' - \sigma' S' = X - \sigma' S = \Pi_+(Z) + \frac{\sigma'}{\sigma}\Pi_-(Z). \tag{55}$$
The KKT point $\bar Z := \bar X - \sigma \bar S$ is updated to $\bar Z' := \bar X - \sigma' \bar S$. Its eigenvalues are updated as follows:
$$\mu'_k = \begin{cases} \frac{\sigma'}{\sigma}\mu_k, & k \in \mathcal I_-, \\ \mu_k, & \text{otherwise}. \end{cases}$$
Figure 9: Trajectories of $\log_{10}(\angle(\Delta Z^{(k)}, \Delta Z^{(k+1)}))$, $\log_{10}(|\|\Delta Z^{(k+1)}\|_F - \|\Delta Z^{(k)}\|_F|)$, and $\log_{10}(|r^{(k+1)}_{\max} - r^{(k)}_{\max}|)$ in (SDP-III) with different $Z^{(0)} := \bar Z + t\bar H(h, \epsilon)$. $t$ is fixed as $10^{-4}$ and $\sigma$ is fixed as 1. The maximum iteration number of ADMM is 1000. We sweep $(h, \epsilon)$ over $\{1.6, 1.5, 1.4, 1.3\} \times \{10^{-2}, 10^{-3}, 10^{-4}, 10^{-5}\}$, leading to 16 points in total.

The corresponding optimal set is changed to $\mathcal Z'_\star := \{ X - \sigma' S \mid X \in \mathcal X_\star, S \in \mathcal S_\star \}$. We aim to expand the new $Z'$ around the new KKT point $\bar Z'$ up to second order, i.e., $Z' = \bar Z' + t\bar H' + \frac{t^2}{2}W' + o(t^2)$. From (55),
$$\bar H' = \Pi'_+(\bar Z;\bar H) + \frac{\sigma'}{\sigma}\Pi'_-(\bar Z;\bar H). \tag{56}$$
For an arbitrary $\bar H \in \mathbb S^n$, it does not hold in general that $\Pi'_+(\bar Z';\bar H') = \Pi'_+(\bar Z;\bar H)$ and $\Pi'_-(\bar Z';\bar H') = \frac{\sigma'}{\sigma}\Pi'_-(\bar Z;\bar H)$. However, as we will see, if $\bar H \in \mathcal C(\bar Z)$, these equalities do hold.

Lemma 12 (New partition for $\bar H'$). Under Assumption 1, if $\bar H \in \mathcal C(\bar Z)$, then
$$\Pi'_+(\bar Z';\bar H') = \Pi'_+(\bar Z;\bar H) \quad \text{and} \quad \Pi'_-(\bar Z';\bar H') = \frac{\sigma'}{\sigma}\Pi'_-(\bar Z;\bar H).$$

Proof. From Proposition 2(2), we have
$$\bar H = \begin{bmatrix}
\bar H_{\alpha_+\alpha_+} & \bar H_{\alpha_+\alpha_0^P} & \bar H_{\alpha_+\alpha_0^D} & 0 \\
\sim & \bar H_{\alpha_0^P\alpha_0^P} & 0 & \bar H_{\alpha_0^P\alpha_-} \\
\sim & \sim & \bar H_{\alpha_0^D\alpha_0^D} & \bar H_{\alpha_0^D\alpha_-} \\
\sim & \sim & \sim & \bar H_{\alpha_-\alpha_-}
\end{bmatrix}, \quad \text{with } \bar H_{\alpha_0^P\alpha_0^P} \succeq 0,\ \bar H_{\alpha_0^D\alpha_0^D} \preceq 0.$$
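Thus,
$$\bar H' = \Pi'_+(\bar Z;\bar H) + \frac{\sigma'}{\sigma}\Pi'_-(\bar Z;\bar H) = \begin{bmatrix}
\bar H_{\alpha_+\alpha_+} & \bar H_{\alpha_+\alpha_0^P} & \bar H_{\alpha_+\alpha_0^D} & 0 \\
\sim & \bar H_{\alpha_0^P\alpha_0^P} & 0 & \frac{\sigma'}{\sigma}\bar H_{\alpha_0^P\alpha_-} \\
\sim & \sim & \frac{\sigma'}{\sigma}\bar H_{\alpha_0^D\alpha_0^D} & \frac{\sigma'}{\sigma}\bar H_{\alpha_0^D\alpha_-} \\
\sim & \sim & \sim & \frac{\sigma'}{\sigma}\bar H_{\alpha_-\alpha_-}
\end{bmatrix}, \tag{57}$$
and
$$\Pi'_+(\bar Z';\bar H') = \begin{bmatrix}
\bar H_{\alpha_+\alpha_+} & \bar H_{\alpha_+\alpha_0^P} & \bar H_{\alpha_+\alpha_0^D} & 0 \\
\sim & \bar H_{\alpha_0^P\alpha_0^P} & 0 & 0 \\
\sim & \sim & 0 & 0 \\
\sim & \sim & \sim & 0
\end{bmatrix} = \Pi'_+(\bar Z;\bar H),$$
$$\Pi'_-(\bar Z';\bar H') = \begin{bmatrix}
0 & 0 & 0 & 0 \\
\sim & 0 & 0 & \frac{\sigma'}{\sigma}\bar H_{\alpha_0^P\alpha_-} \\
\sim & \sim & \frac{\sigma'}{\sigma}\bar H_{\alpha_0^D\alpha_0^D} & \frac{\sigma'}{\sigma}\bar H_{\alpha_0^D\alpha_-} \\
\sim & \sim & \sim & \frac{\sigma'}{\sigma}\bar H_{\alpha_-\alpha_-}
\end{bmatrix} = \frac{\sigma'}{\sigma}\Pi'_-(\bar Z;\bar H),$$
proving the claim.

Since $X' = \Pi_+(Z')$ and $S' = -\frac{1}{\sigma'}\Pi_-(Z')$, from Lemma 12:
$$X' = \Pi_+(\bar Z') + t\,\Pi'_+(\bar Z';\bar H') + o(t) = \bar X + t\,\Pi'_+(\bar Z;\bar H) + o(t) = X + o(t),$$
$$S' = -\frac{1}{\sigma'}\Pi_-(\bar Z') - t \cdot \frac{1}{\sigma'}\Pi'_-(\bar Z';\bar H') + o(t) = \bar S - t \cdot \frac{1}{\sigma}\Pi'_-(\bar Z;\bar H) + o(t) = S + o(t).$$
Therefore, both primal and dual iterates remain unchanged up to first order. The following result shows that if $\bar H \in \mathcal C(\bar Z) \setminus \mathcal T_{\mathcal Z_\star}(\bar Z)$, then updating $\sigma$ alone cannot escape from the second-order-dominant region.

Lemma 13 ($\bar H'$ in $\mathcal C(\bar Z') \setminus \mathcal T_{\mathcal Z'_\star}(\bar Z')$). Under Assumption 1, if $\bar H \in \mathcal C(\bar Z) \setminus \mathcal T_{\mathcal Z_\star}(\bar Z)$, then $\bar H' \in \mathcal C(\bar Z') \setminus \mathcal T_{\mathcal Z'_\star}(\bar Z')$.

Proof. (i) We first show that $\bar H' \in \mathcal C(\bar Z')$. From Lemma 12, $\Pi'_+(\bar Z';\bar H') = \Pi'_+(\bar Z;\bar H)$ and $\Pi'_-(\bar Z';\bar H') = \frac{\sigma'}{\sigma}\Pi'_-(\bar Z;\bar H)$. Thus,
$$\mathcal P\,\Pi'_+(\bar Z';\bar H') = \mathcal P\,\Pi'_+(\bar Z;\bar H) = 0, \qquad \mathcal P^\perp \Pi'_-(\bar Z';\bar H') = \mathcal P^\perp \frac{\sigma'}{\sigma}\Pi'_-(\bar Z;\bar H) = 0.$$
Thus, $\delta'(\bar Z';\bar H') = 0$.

(ii) We next show that $\bar H' \notin \mathcal T_{\mathcal Z'_\star}(\bar Z')$ by contradiction. Suppose $\bar H' \in \mathcal T_{\mathcal Z'_\star}(\bar Z')$. From Proposition 3(1), $\bar H'_{\alpha^P\alpha^D} = 0$. Thus, $\bar H_{\alpha_+\alpha_0^D} = 0$ and $\bar H_{\alpha_0^P\alpha_-} = 0$ from (57). Combining this with Proposition 3(2), $\bar H \in \mathcal T_{\mathcal Z_\star}(\bar Z)$, which is a contradiction.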
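As a numerical consistency check of the reparametrization (55), the following sketch applies $Z \mapsto \Pi_+(Z) + \frac{\sigma'}{\sigma}\Pi_-(Z)$ to a random symmetric matrix and confirms that the recovered pair $(X', S')$ coincides with $(X, S)$, and that only the negative eigenvalues are rescaled. This is a sketch under stated assumptions: the helper names `psd_parts` and `rescale_penalty` are hypothetical, and a random matrix stands in for an ADMM iterate.

```python
import numpy as np

def psd_parts(Z):
    """Return (Pi_+(Z), Pi_-(Z)) for a symmetric matrix Z via eigendecomposition."""
    w, V = np.linalg.eigh(Z)
    plus = (V * np.maximum(w, 0.0)) @ V.T   # keep nonnegative eigenvalues
    minus = (V * np.minimum(w, 0.0)) @ V.T  # keep nonpositive eigenvalues
    return plus, minus

def rescale_penalty(Z, sigma, sigma_new):
    """Map Z = X - sigma*S to Z' = X - sigma_new*S, following (55)."""
    plus, minus = psd_parts(Z)
    return plus + (sigma_new / sigma) * minus

rng = np.random.default_rng(0)
G = rng.standard_normal((5, 5))
Z = (G + G.T) / 2                      # illustrative stand-in for an iterate
sigma, sigma_new = 1.0, 10.0

X, Z_minus = psd_parts(Z)
S = -Z_minus / sigma
Z_new = rescale_penalty(Z, sigma, sigma_new)
X_new, Z_new_minus = psd_parts(Z_new)
S_new = -Z_new_minus / sigma_new

print("X unchanged:", np.allclose(X_new, X))
print("S unchanged:", np.allclose(S_new, S))
```

The lemmas above concern the first-order expansions around $\bar Z'$; the check here only verifies the exact identity behind (55), from which those expansions start.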
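9.2 Second-Order Effect

We show the following scaling law.

Proposition 12 ($\sigma$'s second-order effect). Under Assumption 1, suppose $\bar Z \in \mathcal Z_\star$ and $\bar H \in \mathcal C(\bar Z)$. When $\sigma$ is updated to $\sigma'$:
$$\phi_P(\bar Z';\bar H') = \frac{\sigma'}{\sigma}\phi_P(\bar Z;\bar H), \qquad \phi_D(\bar Z';\bar H') = \frac{\sigma}{\sigma'}\phi_D(\bar Z;\bar H).$$

Proof. (i) We first show that
$$\mathcal K^\circ_P(\bar Z';\bar H') = \mathcal K^\circ_P(\bar Z;\bar H), \qquad \mathcal K^\circ_D(\bar Z';\bar H') = \mathcal K^\circ_D(\bar Z;\bar H).$$
To see this: from Lemma 12,
$$\bar H'_{\alpha_0\alpha_0} = \begin{bmatrix} \bar H_{\alpha_0^P\alpha_0^P} \succeq 0 & 0 \\ \sim & \frac{\sigma'}{\sigma}\bar H_{\alpha_0^D\alpha_0^D} \preceq 0 \end{bmatrix}.$$
Thus, $(Q_0)' = Q_0$ and
$$\eta'_{0,i} = \eta_{0,i},\ i \in \beta_{0,+}; \qquad \eta'_{0,j} = \frac{\sigma'}{\sigma}\eta_{0,j},\ j \in \beta_{0,-}; \qquad \eta'_{0,k} = 0,\ k \in \beta_{0,0}.$$
This directly implies that $(\beta_{0,+}, \beta_{0,0}, \beta_{0,-})$ remains the same after $\sigma$'s update. Since $\mathcal K^\circ_P(\bar Z;\bar H)$ and $\mathcal K^\circ_D(\bar Z;\bar H)$ only depend on $Q_0$ and the partition $(\beta_{0,+}, \beta_{0,0}, \beta_{0,-})$, this step is complete.

(ii) For the primal part: due to the structure of $\bar H'_{\alpha_0\alpha_0}$, $\Pi_+(\bar H'_{\alpha_0\alpha_0}) = \Pi_+(\bar H_{\alpha_0\alpha_0})$ and $\Pi_-(\bar H'_{\alpha_0\alpha_0}) = \frac{\sigma'}{\sigma}\Pi_-(\bar H_{\alpha_0\alpha_0})$. From (34),
$$\mathcal E^\perp(\bar Z';\bar H') = \begin{bmatrix}
0 & \left\{ \frac{2}{-\mu'_a}\bar H'_{\alpha_a\alpha_0}\Pi_-(\bar H'_{\alpha_0\alpha_0}) \right\}_{a \in \mathcal I_+} & \left\{ \frac{2}{\mu'_a - \mu'_b}\bar H'_{\alpha_a\alpha_0}\bar H'_{\alpha_0\alpha_b} \right\}_{a \in \mathcal I_+,\, b \in \mathcal I_-} \\
\sim & 2\sum_{c \in \mathcal I_-}\frac{1}{\mu'_c}\bar H'_{\alpha_0\alpha_c}\bar H'_{\alpha_c\alpha_0} & \left\{ -\frac{2}{\mu'_b}\Pi_-(-\bar H'_{\alpha_0\alpha_0})\bar H'_{\alpha_0\alpha_b} \right\}_{b \in \mathcal I_-} \\
\sim & \sim & 0
\end{bmatrix}$$
$$= \begin{bmatrix}
0 & \left\{ \frac{\sigma'}{\sigma}\cdot\frac{2}{-\mu_a}\bar H_{\alpha_a\alpha_0}\Pi_-(\bar H_{\alpha_0\alpha_0}) \right\}_{a \in \mathcal I_+} & \left\{ \frac{2\frac{\sigma'}{\sigma}}{\mu_a - \frac{\sigma'}{\sigma}\mu_b}\bar H_{\alpha_a\alpha_0}\bar H_{\alpha_0\alpha_b} \right\}_{a \in \mathcal I_+,\, b \in \mathcal I_-} \\
\sim & 2\frac{\sigma'}{\sigma}\sum_{c \in \mathcal I_-}\frac{1}{\mu_c}\bar H_{\alpha_0\alpha_c}\bar H_{\alpha_c\alpha_0} & \left\{ -\frac{2}{\mu_b}\Pi_-(-\bar H_{\alpha_0\alpha_0})\bar H_{\alpha_0\alpha_b} \right\}_{b \in \mathcal I_-} \\
\sim & \sim & 0
\end{bmatrix}.$$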
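Therefore, by Proposition 5:
$$\begin{aligned}
\phi_P(\bar Z';\bar H') &= \arg\min_{W \in \mathcal K^\circ_P(\bar Z';\bar H')} \| W + \mathcal E^\perp(\bar Z';\bar H') \|_F^2 \\
&= \arg\min_{W \in \mathcal K^\circ_P(\bar Z;\bar H)} \| W + \mathcal E^\perp(\bar Z';\bar H') \|_F^2 \\
&= \arg\min_{W \in \mathcal K^\circ_P(\bar Z;\bar H)} 2\sum_{a \in \mathcal I_+} \left\| W_{\alpha_a\alpha_0} + \frac{\sigma'}{\sigma}\cdot\frac{2}{-\mu_a}\bar H_{\alpha_a\alpha_0}\Pi_-(\bar H_{\alpha_0\alpha_0}) \right\|_F^2 + \left\| W_{\alpha_0\alpha_0} + \frac{\sigma'}{\sigma}\cdot 2\sum_{c \in \mathcal I_-}\frac{1}{\mu_c}\bar H_{\alpha_0\alpha_c}\bar H_{\alpha_c\alpha_0} \right\|_F^2 \\
&= \arg\min_{W \in \mathcal K^\circ_P(\bar Z;\bar H)} 2\sum_{a \in \mathcal I_+} \left\| W_{\alpha_a\alpha_0} + \frac{\sigma'}{\sigma}[\mathcal E^\perp(\bar Z;\bar H)]_{\alpha_a\alpha_0} \right\|_F^2 + \left\| W_{\alpha_0\alpha_0} + \frac{\sigma'}{\sigma}[\mathcal E^\perp(\bar Z;\bar H)]_{\alpha_0\alpha_0} \right\|_F^2 \\
&= \arg\min_{W \in \mathcal K^\circ_P(\bar Z;\bar H)} \left\| W + \frac{\sigma'}{\sigma}\mathcal E^\perp(\bar Z;\bar H) \right\|_F^2 \\
&= \Pi_{\mathcal K^\circ_P(\bar Z;\bar H)}\left( -\frac{\sigma'}{\sigma}\mathcal E^\perp(\bar Z;\bar H) \right) = \frac{\sigma'}{\sigma}\phi_P(\bar Z;\bar H).
\end{aligned}$$
The key observation here is $W_{\alpha_+\alpha_-} \equiv 0$ for all $W \in \mathcal K^\circ_P(\bar Z;\bar H)$, so the blocks of $\mathcal E^\perp(\bar Z';\bar H')$ that do not scale by $\frac{\sigma'}{\sigma}$ only contribute constants to the objective. The last equality comes from the fact that for a closed convex cone $\mathcal C \subset \mathbb S^n$, $\Pi_{\mathcal C}(\alpha x) = \alpha \Pi_{\mathcal C}(x)$ for all $\alpha > 0$.

(iii) For the dual part, similar to the primal part:
$$\mathcal E(\bar Z';\bar H') = \begin{bmatrix}
0 & \left\{ -2\frac{\sigma'}{\sigma}\cdot\frac{1}{\mu_a}\bar H_{\alpha_a\alpha_0}\Pi_+(-\bar H_{\alpha_0\alpha_0}) \right\}_{a \in \mathcal I_+} & \left\{ -\frac{2\frac{\sigma'}{\sigma}}{\mu_a - \frac{\sigma'}{\sigma}\mu_b}\bar H_{\alpha_a\alpha_0}\bar H_{\alpha_0\alpha_b} \right\}_{a \in \mathcal I_+,\, b \in \mathcal I_-} \\
\sim & 2\sum_{c \in \mathcal I_+}\frac{1}{\mu_c}\bar H_{\alpha_0\alpha_c}\bar H_{\alpha_c\alpha_0} & \left\{ \frac{2}{-\mu_b}\Pi_+(\bar H_{\alpha_0\alpha_0})\bar H_{\alpha_0\alpha_b} \right\}_{b \in \mathcal I_-} \\
\sim & \sim & 0
\end{bmatrix}.$$
Thus, by Proposition 5:
$$\begin{aligned}
\phi_D(\bar Z';\bar H') &= -\frac{1}{\sigma'}\arg\min_{W \in \mathcal K^\circ_D(\bar Z';\bar H')} \| W + \mathcal E(\bar Z';\bar H') \|_F^2 \\
&= -\frac{1}{\sigma'}\arg\min_{W \in \mathcal K^\circ_D(\bar Z;\bar H)} \| W + \mathcal E(\bar Z';\bar H') \|_F^2 \\
&= -\frac{1}{\sigma'}\arg\min_{W \in \mathcal K^\circ_D(\bar Z;\bar H)} 2\sum_{b \in \mathcal I_-} \left\| W_{\alpha_0\alpha_b} + \frac{2}{-\mu_b}\Pi_+(\bar H_{\alpha_0\alpha_0})\bar H_{\alpha_0\alpha_b} \right\|_F^2 + \left\| W_{\alpha_0\alpha_0} + 2\sum_{c \in \mathcal I_+}\frac{1}{\mu_c}\bar H_{\alpha_0\alpha_c}\bar H_{\alpha_c\alpha_0} \right\|_F^2 \\
&= -\frac{1}{\sigma'}\arg\min_{W \in \mathcal K^\circ_D(\bar Z;\bar H)} \| W + \mathcal E(\bar Z;\bar H) \|_F^2 \\
&= \frac{\sigma}{\sigma'}\cdot\left( -\frac{1}{\sigma}\arg\min_{W \in \mathcal K^\circ_D(\bar Z;\bar H)} \| W + \mathcal E(\bar Z;\bar H) \|_F^2 \right) = \frac{\sigma}{\sigma'}\phi_D(\bar Z;\bar H).
\end{aligned}$$
Again, the key observation is $W_{\alpha_+\alpha_-} \equiv 0$ for all $W \in \mathcal K^\circ_D(\bar Z;\bar H)$.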
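The final step of the primal argument relies on positive homogeneity of the projection onto a closed convex cone, $\Pi_{\mathcal C}(\alpha x) = \alpha\Pi_{\mathcal C}(x)$ for $\alpha > 0$. A minimal numerical illustration follows, assuming the PSD cone $\mathbb S^n_+$ as a stand-in for $\mathcal K^\circ_P(\bar Z;\bar H)$ (which is not constructed here); `proj_psd` and `ratio` are hypothetical names, with `ratio` playing the role of $\sigma'/\sigma$.

```python
import numpy as np

def proj_psd(M):
    """Frobenius-norm projection of a symmetric matrix onto the PSD cone."""
    w, V = np.linalg.eigh(M)
    return (V * np.maximum(w, 0.0)) @ V.T

rng = np.random.default_rng(1)
G = rng.standard_normal((4, 4))
x = (G + G.T) / 2          # random symmetric test point
ratio = 3.7                # stands in for sigma'/sigma > 0

lhs = proj_psd(ratio * x)  # project after scaling
rhs = ratio * proj_psd(x)  # scale after projecting
print("positively homogeneous:", np.allclose(lhs, rhs))
```

The identity generally fails for negative factors, which is why the scaling law is stated for positive penalty parameters only.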
An immediate corollary of Proposition 12 is that the limiting behavior of the primal and dual infeasibilities is independent of $\sigma$ in the second-order-dominant regions.

Corollary 3. Under Assumption 1, let $\bar Z \in \mathcal Z_\star$ and $\bar H \in \mathcal C(\bar Z)$. Under the first- and second-order local dynamics models in Definition 1, when $\sigma$ is updated to $\sigma'$, the limits of both $r^{(k)}_p$ and $r^{(k)}_d$ in (5) remain unchanged up to second order.

Proof. From [54, Corollary 1],
$$\mathcal A X^{(k)} - b = \sigma \mathcal A (S^{(k+1)} - S^{(k)}), \qquad \mathcal A^* y^{(k)} + S^{(k)} - C = \frac{1}{\sigma}(X^{(k+1)} - X^{(k)}).$$
Thus,
$$r^{(k)}_p = \frac{\sigma \| \mathcal A (S^{(k+1)} - S^{(k)}) \|_2}{1 + \|b\|}, \qquad r^{(k)}_d = \frac{\frac{1}{\sigma}\| X^{(k+1)} - X^{(k)} \|_F}{1 + \|C\|_F}.$$
From Theorem 6, the local second-order limit of $X^{(k+1)} - X^{(k)}$ (resp. $S^{(k+1)} - S^{(k)}$) is $\phi_P(\bar Z;\bar H)$ (resp. $\phi_D(\bar Z;\bar H)$). Thus,
$$\lim_{k \to \infty} r^{(k)}_p \propto \sigma \| \mathcal A \phi_D(\bar Z;\bar H) \|_2, \qquad \lim_{k \to \infty} r^{(k)}_d \propto \frac{1}{\sigma}\| \phi_P(\bar Z;\bar H) \|_F.$$
On the other hand, from Proposition 12,
$$\sigma' \phi_D(\bar Z';\bar H') = \sigma' \cdot \frac{\sigma}{\sigma'}\phi_D(\bar Z;\bar H) = \sigma \phi_D(\bar Z;\bar H), \qquad \frac{1}{\sigma'}\phi_P(\bar Z';\bar H') = \frac{1}{\sigma'}\cdot\frac{\sigma'}{\sigma}\phi_P(\bar Z;\bar H) = \frac{1}{\sigma}\phi_P(\bar Z;\bar H),$$
which closes the proof.

9.3 Discussion: $\sigma$'s Updating Rules

Traditional $\sigma$-updating heuristics typically aim to balance the primal and dual infeasibilities, under the implicit assumption that $\Delta X^{(k)} := X^{(k+1)} - X^{(k)}$ and $\Delta S^{(k)} := S^{(k+1)} - S^{(k)}$ are nearly independent of $\sigma$ [26, 54]. However, Proposition 12 and Corollary 3 indicate that such heuristics become ineffective in second-order-dominant regions, since $r^{(k)}_p$ and $r^{(k)}_d$ are (locally) insensitive to $\sigma$.

To empirically verify Proposition 12 and Corollary 3, we fix $\bar Z$ and $\bar H$ as in §6 for all three SDP examples. Starting from $Z^{(0)} := \bar Z + t\bar H$ with different choices of $t$, we uniformly increase $\log_{10}(\sigma)$ from 0 to 1 over 1000 ADMM iterations.
Since the change in $\sigma$ is gradual and the iteration horizon is moderate, we may assume that $\Delta X^{(k)}$ (resp. $\Delta S^{(k)}$) steadily tracks its second-order limit as $t \downarrow 0$. The results for (SDP-I), (SDP-II), and (SDP-III) are shown in Figures 10, 11, and 12, respectively. When $t = 10^{-5}$, the dependence of $(\Delta X^{(k)}, \Delta S^{(k)}, r^{(k)}_p, r^{(k)}_d)$ on $\sigma$ is consistent across all three examples: (i) $\log_{10}(\|\Delta X^{(k)}\|_F)$ (resp. $\log_{10}(\|\Delta S^{(k)}\|_F)$) increases (resp. decreases) linearly with $\log_{10}(\sigma)$, with slope close to $+1$ (resp. $-1$); (ii) $r^{(k)}_p$ and $r^{(k)}_d$ remain essentially unchanged as $\sigma$ varies.

Figure 10 (panels: (a) $t = 10^{-2}$; (b) $t = 10^{-3}$; (c) $t = 10^{-4}$; (d) $t = 10^{-5}$): Trajectories of $\|\Delta X^{(k)}\|_F$, $\|\Delta S^{(k)}\|_F$, $r^{(k)}_p$, and $r^{(k)}_d$ in (SDP-I). Fix $\bar Z \in \mathcal Z_\star$ and $\bar H \in \mathcal C(\bar Z)$. Pick $t \in \{10^{-2}, 10^{-3}, 10^{-4}, 10^{-5}\}$ such that $Z^{(0)} = \bar Z + t\bar H$. $\log_{10}(\sigma)$ is uniformly increased from 0 to 1 in 1000 iterations.

Figure 11 (panels: (a) $t = 10^{-2}$; (b) $t = 10^{-3}$; (c) $t = 10^{-4}$; (d) $t = 10^{-5}$): Trajectories of $\|\Delta X^{(k)}\|_F$, $\|\Delta S^{(k)}\|_F$, $r^{(k)}_p$, and $r^{(k)}_d$ in (SDP-II). Fix $\bar Z \in \mathcal Z_\star$ and $\bar H \in \mathcal C(\bar Z)$. Pick $t \in \{10^{-2}, 10^{-3}, 10^{-4}, 10^{-5}\}$ such that $Z^{(0)} = \bar Z + t\bar H$. $\log_{10}(\sigma)$ is uniformly increased from 0 to 1 in 1000 iterations.

Discussion on the one-sided uniqueness condition. It is difficult to design a "universally" good $\sigma$-updating strategy in the second-order-dominant regime. For instance, when both primal and dual constraint nondegeneracy fail, it is likely that $\bar H_{\alpha_+\alpha_0^D}$ in the $\mathcal C_P(\bar Z)$ part and $\bar H_{\alpha_0^P\alpha_-}$ in the $\mathcal C_D(\bar Z)$ part are simultaneously nonzero (e.g., (62) in (SDP-III)). In this case, enlarging $\sigma$ amplifies $\Delta X^{(k)}$, which may help reduce $\bar H_{\alpha_+\alpha_0^D}$.
On the other hand, it also suppresses $\Delta S^{(k)}$, which may worsen the $\mathcal C_D(\bar Z)$ component. The situation can be more favorable when one-sided uniqueness holds in either the primal or the dual optimal solution set. For example, when the primal solution is unique, Proposition 8's proof implies $\bar H_{\alpha_0^P\alpha_-} = 0$. In this case, we only need to eliminate $\bar H_{\alpha_+\alpha_0^D}$ in the $\mathcal C_P(\bar Z)$ part, and it may be beneficial to choose a large $\sigma$. Symmetrically, when the dual optimal solution is unique, we only need to eliminate $\bar H_{\alpha_0^P\alpha_-}$ in the $\mathcal C_D(\bar Z)$ part, and it may be beneficial to choose a small $\sigma$.

We empirically verify this analysis using three SDP examples. The initial $\bar Z \in \mathcal Z_\star$ and $\bar H \in \mathcal C(\bar Z)$ are the same as in §6. We fix $t = 10^{-4}$ and initialize $\sigma = 1$. After running 1000 ADMM iterations, we update $\sigma$ to a value in $\{10^{-2}, 10^{-1}, 1, 10, 10^{2}\}$. We then record the change in the maximum KKT residual $r^{(k)}_{\max}$. The results are shown in Figure 13. In both (SDP-I) and (SDP-II), we observe a significant acceleration when $\sigma$ is updated from 1 to $10^{-2}$. This is consistent with our analysis, since the dual optimal solution is unique in both examples. For (SDP-III), changing $\sigma$ does not help the iterates escape the slow-convergence region. This is not surprising, since $\bar H(1, 0)$ defined in (62) has nonzero entries in both $\bar H_{\alpha_+\alpha_0^D}$ and $\bar H_{\alpha_0^P\alpha_-}$.

Figure 12 (panels: (a) $t = 10^{-2}$; (b) $t = 10^{-3}$; (c) $t = 10^{-4}$; (d) $t = 10^{-5}$): Trajectories of $\|\Delta X^{(k)}\|_F$, $\|\Delta S^{(k)}\|_F$, $r^{(k)}_p$, and $r^{(k)}_d$ in (SDP-III). Fix $\bar Z \in \mathcal Z_\star$ and $\bar H \in \mathcal C(\bar Z)$. Pick $t \in \{10^{-2}, 10^{-3}, 10^{-4}, 10^{-5}\}$ such that $Z^{(0)} = \bar Z + t\bar H$. $\log_{10}(\sigma)$ is uniformly increased from 0 to 1 in 1000 iterations.

Figure 13 (panels: (SDP-I); (SDP-II); (SDP-III)): Trajectories of $\log_{10}(r^{(k)}_{\max})$ in the three SDP examples.
For each example, we fix $\bar Z \in \mathcal Z_\star$, $\bar H \in \mathcal C(\bar Z)$, and $t = 10^{-4}$, and initialize $Z^{(0)} := \bar Z + t\bar H$. The penalty parameter is initialized at $\sigma = 1$. After 1000 iterations, we update $\sigma$ to a value in $\{10^{-2}, 10^{-1}, 1, 10, 10^{2}\}$ and run an additional 1000 ADMM iterations.

10 Examples

We present three SDP examples in this section. For each instance and the associated rank-deficient $\bar Z \in \mathcal Z_\star$, we compute the relevant first-order objects (e.g., $\mathcal C(\bar Z)$, $\mathcal T_{\mathcal Z_\star}(\bar Z)$) and second-order objects (e.g., $\mathcal K(\bar Z;\bar H)$, $\mathcal K^\circ(\bar Z;\bar H)$, $\Psi(\bar Z;\bar H)$, $\phi(\bar Z;\bar H)$). These calculations serve three purposes:

1. These examples provide a sanity check for the validity of our second-order analysis.
2. More importantly, the examples serve as constructive demonstrations of the key properties of $\phi(\bar Z;\cdot)$. For instance, (SDP-III) simultaneously establishes Proposition 7 and Proposition 9.
3. When discussing the connection between properties of $\phi(\bar Z;\cdot)$ and ADMM's empirical behavior, we use numerical results on these examples for illustration.

10.1 Example I

SDP data. We consider a $2 \times 2$ SDP from [12, Example 1]:
$$C = \begin{bmatrix} 0 & 0 \\ \sim & 1 \end{bmatrix}, \quad A_1 = \begin{bmatrix} 0 & 1 \\ \sim & -1 \end{bmatrix}, \quad b = 0. \tag{SDP-I}$$
Its optimal sets are:
$$\mathcal X_\star = \left\{ \begin{bmatrix} X_{11} & 0 \\ \sim & 0 \end{bmatrix} \;\middle|\; X_{11} \ge 0 \right\}, \quad \mathcal S_\star = \left\{ \begin{bmatrix} 0 & 0 \\ \sim & 1 \end{bmatrix} \right\}, \quad \mathcal Z_\star = \left\{ \begin{bmatrix} Z_{11} & 0 \\ \sim & -\sigma \end{bmatrix} \;\middle|\; Z_{11} \ge 0 \right\}.$$
Clearly, the primal optimal set is unbounded. Moreover, every optimal solution except the one with $Z_{11} = 0$ satisfies strict complementarity. We are typically interested in the rank-deficient optimal solution ($Z_{11} = 0$).

First-order information. Denote $(x)_+$ (resp. $(x)_-$) as an abbreviation for $\Pi_{\mathbb S^1_+}(x)$ (resp. $\Pi_{\mathbb S^1_-}(x)$). Since
$$\mathcal P X = \frac{1}{3}(2X_{12} - X_{22}) \begin{bmatrix} 0 & 1 \\ \sim & -1 \end{bmatrix}, \qquad \mathcal P^\perp X = \begin{bmatrix} X_{11} & \frac{1}{3}(X_{12} + X_{22}) \\ \sim & \frac{2}{3}(X_{12} + X_{22}) \end{bmatrix},$$
and
$$\Pi'_+(\bar Z; H) = \begin{bmatrix} (H_{11})_+ & 0 \\ \sim & 0 \end{bmatrix}, \qquad \Pi'_-(\bar Z; H) = \begin{bmatrix} (H_{11})_- & H_{12} \\ \sim & H_{22} \end{bmatrix},$$
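we have
$$\mathcal P\,\Pi'_+(\bar Z; H) = 0, \qquad \mathcal P^\perp \Pi'_-(\bar Z; H) = \begin{bmatrix} (H_{11})_- & \frac{1}{3}(H_{12} + H_{22}) \\ \sim & \frac{2}{3}(H_{12} + H_{22}) \end{bmatrix}.$$
Thus,
$$\mathcal C(\bar Z) = \left\{ \begin{bmatrix} a & b \\ \sim & -b \end{bmatrix} \;\middle|\; a \ge 0 \right\}, \qquad \mathcal T_{\mathcal Z_\star}(\bar Z) = \mathcal C(\bar Z) \cap \{ H \in \mathbb S^2 \mid H_{12} = 0 \} = \left\{ \begin{bmatrix} a & 0 \\ \sim & 0 \end{bmatrix} \;\middle|\; a \ge 0 \right\}. \tag{58}$$
We choose an arbitrary $\bar H \in \mathcal C(\bar Z) \setminus \mathcal T_{\mathcal Z_\star}(\bar Z)$ with $b \neq 0$.

Second-order information. For this simple $2 \times 2$ SDP example, we use the formula for $\mathcal K(\bar Z;\bar H)$ in (42) to calculate $\phi(\bar Z;\bar H)$. Through careful calculation:
$$\mathcal E(\bar Z;\bar H) = \begin{bmatrix} 0 & \frac{2ab}{\sigma} \\ \sim & 0 \end{bmatrix}, \quad \mathcal E^\perp(\bar Z;\bar H) = \begin{bmatrix} -\frac{2b^2}{\sigma} & -\frac{2ab}{\sigma} \\ \sim & 0 \end{bmatrix}, \quad \Psi(\bar Z;\bar H) = \begin{bmatrix} \frac{2b^2}{\sigma} & -\frac{2ab}{3\sigma} \\ \sim & \frac{8ab}{3\sigma} \end{bmatrix}.$$
For $\mathcal K(\bar Z;\bar H)$, there are two cases:

• Case I: $a > 0$. In this case,
$$\Theta(\bar Z;\bar H, W) = \begin{bmatrix} W_{11} & 0 \\ \sim & 0 \end{bmatrix}, \quad \Theta^\perp(\bar Z;\bar H, W) = \begin{bmatrix} 0 & W_{12} \\ \sim & W_{22} \end{bmatrix}, \quad \mathcal K(\bar Z;\bar H) = \left\{ \begin{bmatrix} 0 & \frac{1}{3}(W_{12} + W_{22}) \\ \sim & \frac{2}{3}(W_{12} + W_{22}) \end{bmatrix} \;\middle|\; W \in \mathbb S^2 \right\}.$$
Thus,
$$\Pi_{\mathcal K(\bar Z;\bar H)}(\Psi(\bar Z;\bar H)) = \begin{bmatrix} 0 & \frac{2ab}{3\sigma} \\ \sim & \frac{4ab}{3\sigma} \end{bmatrix}, \qquad \phi(\bar Z;\bar H) = \begin{bmatrix} \frac{2b^2}{\sigma} & -\frac{4ab}{3\sigma} \\ \sim & \frac{4ab}{3\sigma} \end{bmatrix}. \tag{59}$$

• Case II: $a = 0$. In this case,
$$\Theta(\bar Z;\bar H, W) = \begin{bmatrix} (W_{11})_+ & 0 \\ \sim & 0 \end{bmatrix}, \quad \Theta^\perp(\bar Z;\bar H, W) = \begin{bmatrix} (W_{11})_- & W_{12} \\ \sim & W_{22} \end{bmatrix}, \quad \mathcal K(\bar Z;\bar H) = \left\{ \begin{bmatrix} (W_{11})_- & \frac{1}{3}(W_{12} + W_{22}) \\ \sim & \frac{2}{3}(W_{12} + W_{22}) \end{bmatrix} \;\middle|\; W \in \mathbb S^2 \right\}.$$
Thus,
$$\Pi_{\mathcal K(\bar Z;\bar H)}(\Psi(\bar Z;\bar H)) = \begin{bmatrix} 0 & 0 \\ \sim & 0 \end{bmatrix}, \qquad \phi(\bar Z;\bar H) = \begin{bmatrix} \frac{2b^2}{\sigma} & 0 \\ \sim & 0 \end{bmatrix}.$$

Clearly, the two cases can be combined: setting $a = 0$ in (59) recovers Case II.

$\sigma$ updating. Consider updating $\sigma$ to $\sigma'$. From Lemma 12 and Lemma 13:
$$\bar H = \begin{bmatrix} a & b \\ \sim & -b \end{bmatrix} \implies \bar H' = \begin{bmatrix} a' & b' \\ \sim & -b' \end{bmatrix} = \begin{bmatrix} a & \frac{\sigma'}{\sigma}b \\ \sim & -\frac{\sigma'}{\sigma}b \end{bmatrix}.$$
Since
$$\phi_P(\bar Z;\bar H) = \begin{bmatrix} \frac{2b^2}{\sigma} & 0 \\ \sim & 0 \end{bmatrix}, \qquad \phi_D(\bar Z;\bar H) = \begin{bmatrix} 0 & \frac{4ab}{3\sigma^2} \\ \sim & -\frac{4ab}{3\sigma^2} \end{bmatrix},$$
we have
$$\phi_P(\bar Z';\bar H') = \frac{\sigma'}{\sigma}\begin{bmatrix} \frac{2b^2}{\sigma} & 0 \\ \sim & 0 \end{bmatrix} = \frac{\sigma'}{\sigma}\phi_P(\bar Z;\bar H), \qquad \phi_D(\bar Z';\bar H') = \frac{\sigma}{\sigma'}\begin{bmatrix} 0 & \frac{4ab}{3\sigma^2} \\ \sim & -\frac{4ab}{3\sigma^2} \end{bmatrix} = \frac{\sigma}{\sigma'}\phi_D(\bar Z;\bar H).$$

10.2 Example II

SDP data. Consider the following SDP instance:
$$C = \begin{bmatrix} 0 & 0 & 0 \\ \sim & 0 & 0 \\ \sim & \sim & 1 \end{bmatrix}, \quad A_1 = \begin{bmatrix} 1 & 0 & 0 \\ \sim & 1 & 0 \\ \sim & \sim & 1 \end{bmatrix}, \quad A_2 = \begin{bmatrix} 0 & 0 & 0 \\ \sim & 0 & 1 \\ \sim & \sim & 0 \end{bmatrix}, \quad b = \begin{bmatrix} 1 \\ 0 \end{bmatrix}.$$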
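(SDP-II)

The primal-dual optimal sets are:
$$\mathcal X_\star = \left\{ \begin{bmatrix} a & u & 0 \\ \sim & 1-a & 0 \\ \sim & \sim & 0 \end{bmatrix} \;\middle|\; 0 \le a \le 1,\ u^2 \le a(1-a) \right\}, \quad \mathcal S_\star = \left\{ \begin{bmatrix} 0 & 0 & 0 \\ \sim & 0 & 0 \\ \sim & \sim & 1 \end{bmatrix} \right\}, \quad \mathcal Z_\star = \mathcal X_\star - \sigma \mathcal S_\star.$$
For (SDP-II), we can easily check that it satisfies two-sided Slater conditions. Also, there exists a strictly complementary solution pair. We are typically interested in the rank-deficient solutions, i.e., $\operatorname{rank}(\bar X) = 1$, of the form:
$$\bar X = \bar X(a) = \begin{bmatrix} a & \pm\sqrt{a}\sqrt{1-a} & 0 \\ \sim & 1-a & 0 \\ \sim & \sim & 0 \end{bmatrix}, \quad a \in [0, 1].$$
Our framework requires $\bar X$ to be diagonal. Therefore, a change of basis is needed. Define the orthonormal matrix $Q$ as:
$$Q = \begin{bmatrix} \sqrt{a} & \mp\sqrt{1-a} & 0 \\ \pm\sqrt{1-a} & \sqrt{a} & 0 \\ 0 & 0 & 1 \end{bmatrix}.$$
Under this basis:
$$C \leftarrow Q^T C Q = \begin{bmatrix} 0 & 0 & 0 \\ \sim & 0 & 0 \\ \sim & \sim & 1 \end{bmatrix}, \quad A_1 \leftarrow Q^T A_1 Q = \begin{bmatrix} 1 & 0 & 0 \\ \sim & 1 & 0 \\ \sim & \sim & 1 \end{bmatrix}, \quad A_2 \leftarrow Q^T A_2 Q = \begin{bmatrix} 0 & 0 & \pm\sqrt{1-a} \\ \sim & 0 & \sqrt{a} \\ \sim & \sim & 0 \end{bmatrix},$$
$$\bar X \leftarrow Q^T \bar X Q = \begin{bmatrix} 1 & 0 & 0 \\ \sim & 0 & 0 \\ \sim & \sim & 0 \end{bmatrix}, \quad \bar S \leftarrow Q^T \bar S Q = \begin{bmatrix} 0 & 0 & 0 \\ \sim & 0 & 0 \\ \sim & \sim & 1 \end{bmatrix}.$$

First-order information. For a fixed $a$,
$$\mathcal P X = \frac{1}{3}(X_{11} + X_{22} + X_{33}) \begin{bmatrix} 1 & 0 & 0 \\ \sim & 1 & 0 \\ \sim & \sim & 1 \end{bmatrix} + \left( \pm\sqrt{1-a}\,X_{13} + \sqrt{a}\,X_{23} \right) \begin{bmatrix} 0 & 0 & \pm\sqrt{1-a} \\ \sim & 0 & \sqrt{a} \\ \sim & \sim & 0 \end{bmatrix},$$
$$\Pi'_+(\bar Z; H) = \begin{bmatrix} H_{11} & H_{12} & \frac{1}{1+\sigma}H_{13} \\ \sim & (H_{22})_+ & 0 \\ \sim & \sim & 0 \end{bmatrix}, \qquad \Pi'_-(\bar Z; H) = \begin{bmatrix} 0 & 0 & \frac{\sigma}{1+\sigma}H_{13} \\ \sim & (H_{22})_- & H_{23} \\ \sim & \sim & H_{33} \end{bmatrix}.$$
Via careful calculation:
$$\mathcal C(\bar Z) = \begin{cases}
\left\{ \begin{bmatrix} -H_{22} & H_{12} & 0 \\ \sim & H_{22} & 0 \\ \sim & \sim & 0 \end{bmatrix} \;\middle|\; H_{22} \ge 0 \right\}, & a \in [0, 1), \\[2ex]
\left\{ \begin{bmatrix} -H_{22} & H_{12} & 0 \\ \sim & H_{22} & H_{23} \\ \sim & \sim & 0 \end{bmatrix} \;\middle|\; H_{22} \ge 0 \right\}, & a = 1,
\end{cases}
\qquad
\mathcal T_{\mathcal Z_\star}(\bar Z) = \left\{ \begin{bmatrix} -H_{22} & H_{12} & 0 \\ \sim & H_{22} & 0 \\ \sim & \sim & 0 \end{bmatrix} \;\middle|\; H_{22} \ge 0 \right\}, \ \forall a \in [0, 1]. \tag{60}$$
Therefore, to pick $\bar H \in \mathcal C(\bar Z) \setminus \mathcal T_{\mathcal Z_\star}(\bar Z)$, the only nontrivial case is $a = 1$ and $\bar H_{23} \neq 0$.

Second-order information. We adopt the polar description in Proposition 5 to calculate $\phi(\bar Z;\bar H)$.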
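Via careful calculation:
$$\mathcal E(\bar Z;\bar H) = \begin{bmatrix} 0 & 0 & \frac{2}{1+\sigma}\bar H_{12}\bar H_{23} \\ \sim & 2\bar H_{12}^2 & \frac{2}{\sigma}\bar H_{22}\bar H_{23} \\ \sim & \sim & 0 \end{bmatrix}, \qquad \mathcal E^\perp(\bar Z;\bar H) = \begin{bmatrix} 0 & 0 & -\frac{2}{1+\sigma}\bar H_{12}\bar H_{23} \\ \sim & -\frac{2}{\sigma}\bar H_{23}^2 & -\frac{2}{\sigma}\bar H_{22}\bar H_{23} \\ \sim & \sim & 0 \end{bmatrix},$$
and
$$\mathcal K^\circ_P(\bar Z;\bar H) = \begin{cases}
\left\{ W = \begin{bmatrix} W_{11} & W_{12} & 0 \\ \sim & W_{22} & 0 \\ \sim & \sim & 0 \end{bmatrix} \;\middle|\; W_{11} + W_{22} = 0 \right\}, & \bar H_{22} > 0, \\[2ex]
\left\{ W = \begin{bmatrix} W_{11} & W_{12} & 0 \\ \sim & W_{22} & 0 \\ \sim & \sim & 0 \end{bmatrix} \;\middle|\; W_{11} + W_{22} = 0,\ W_{22} \ge 0 \right\}, & \bar H_{22} = 0,
\end{cases}
\qquad
\mathcal K^\circ_D(\bar Z;\bar H) = \left\{ W = \begin{bmatrix} 0 & 0 & 0 \\ \sim & 0 & W_{23} \\ \sim & \sim & 0 \end{bmatrix} \right\}.$$

(i) For the primal part, we need to consider two cases:

(a) $\bar H_{22} > 0$. In this case, from Theorem 7,
$$\phi_P(\bar Z;\bar H) = \arg\min_{W \in \mathcal K^\circ_P(\bar Z;\bar H)} \| W + \mathcal E^\perp(\bar Z;\bar H) \|_F^2 = \arg\min_{W_{11} + W_{22} = 0} \left( W_{22} - \frac{2}{\sigma}\bar H_{23}^2 \right)^2 + 2W_{12}^2 + W_{11}^2 = \begin{bmatrix} -\frac{1}{\sigma}\bar H_{23}^2 & 0 & 0 \\ \sim & \frac{1}{\sigma}\bar H_{23}^2 & 0 \\ \sim & \sim & 0 \end{bmatrix}.$$

(b) $\bar H_{22} = 0$. Similar to case (a),
$$\phi_P(\bar Z;\bar H) = \arg\min_{W \in \mathcal K^\circ_P(\bar Z;\bar H)} \| W + \mathcal E^\perp(\bar Z;\bar H) \|_F^2 = \arg\min_{W_{11} + W_{22} = 0,\ W_{22} \ge 0} \left( W_{22} - \frac{2}{\sigma}\bar H_{23}^2 \right)^2 + 2W_{12}^2 + W_{11}^2 = \begin{bmatrix} -\frac{1}{\sigma}\bar H_{23}^2 & 0 & 0 \\ \sim & \frac{1}{\sigma}\bar H_{23}^2 & 0 \\ \sim & \sim & 0 \end{bmatrix}.$$

Clearly, cases (a) and (b) can be combined.

(ii) For the dual part, from Theorem 7:
$$-\sigma\phi_D(\bar Z;\bar H) = \arg\min_{W \in \mathcal K^\circ_D(\bar Z;\bar H)} \| W + \mathcal E(\bar Z;\bar H) \|_F^2 = \arg\min_{W_{23} \in \mathbb R} \left( W_{23} + \frac{2}{\sigma}\bar H_{22}\bar H_{23} \right)^2 = \begin{bmatrix} 0 & 0 & 0 \\ \sim & 0 & -\frac{2}{\sigma}\bar H_{22}\bar H_{23} \\ \sim & \sim & 0 \end{bmatrix}.$$

(iii) Combining the primal and dual parts:
$$\phi(\bar Z;\bar H) = \phi_P(\bar Z;\bar H) - \sigma\phi_D(\bar Z;\bar H) = \begin{bmatrix} -\frac{1}{\sigma}\bar H_{23}^2 & 0 & 0 \\ \sim & \frac{1}{\sigma}\bar H_{23}^2 & -\frac{2}{\sigma}\bar H_{22}\bar H_{23} \\ \sim & \sim & 0 \end{bmatrix}. \tag{61}$$

$\sigma$ updating. When $\sigma$ is updated to $\sigma'$, $\bar H$ is updated to
$$\bar H' = \begin{bmatrix} -\bar H_{22} & \bar H_{12} & 0 \\ \sim & \bar H_{22} & \frac{\sigma'}{\sigma}\bar H_{23} \\ \sim & \sim & 0 \end{bmatrix}$$
from (56). Thus,
$$\phi_P(\bar Z';\bar H') = \begin{bmatrix} -\frac{1}{\sigma'}\left( \frac{\sigma'}{\sigma}\bar H_{23} \right)^2 & 0 & 0 \\ \sim & \frac{1}{\sigma'}\left( \frac{\sigma'}{\sigma}\bar H_{23} \right)^2 & 0 \\ \sim & \sim & 0 \end{bmatrix} = \frac{\sigma'}{\sigma}\phi_P(\bar Z;\bar H),$$
$$\phi_D(\bar Z';\bar H') = -\frac{1}{\sigma'}\begin{bmatrix} 0 & 0 & 0 \\ \sim & 0 & -\frac{2}{\sigma'}\bar H_{22}\left( \frac{\sigma'}{\sigma}\bar H_{23} \right) \\ \sim & \sim & 0 \end{bmatrix} = \frac{\sigma}{\sigma'}\phi_D(\bar Z;\bar H).$$

10.3 Example III

SDP data. Consider a $6 \times 6$ SDP example.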
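For ease of notation, define $E_{ij} \in \mathbb R^{6 \times 6}$ ($1 \le i, j \le 6$) as:
$$E_{ij}(m, n) := \begin{cases} 1, & m = i,\ n = j, \\ 0, & \text{otherwise}. \end{cases}$$
Moreover, $0_{m \times n}$ denotes the all-zero matrix of size $m \times n$ and $I_m$ the identity matrix of size $m \times m$. Define an orthonormal matrix $Q$ as
$$Q := \begin{bmatrix} q_1 & q_2 & q_3 \end{bmatrix} = \begin{bmatrix} \frac{1}{\sqrt 3} & \frac{1}{\sqrt 2} & \frac{1}{\sqrt 6} \\ \frac{1}{\sqrt 3} & -\frac{1}{\sqrt 2} & \frac{1}{\sqrt 6} \\ \frac{1}{\sqrt 3} & 0 & -\frac{2}{\sqrt 6} \end{bmatrix}.$$
The SDP data is
$$b = \begin{bmatrix} 6 \\ 0 \\ \vdots \\ 0 \end{bmatrix} \in \mathbb R^{15}, \quad C = \begin{bmatrix} 0_{3 \times 3} & 0_{3 \times 3} \\ \sim & I_3 \end{bmatrix}, \quad A_1 = I_6,$$
$$A_2 = \begin{bmatrix} Q^T \begin{bmatrix} 1 & 0 & 0 \\ \sim & -1 & 0 \\ \sim & \sim & 0 \end{bmatrix} Q & 0_{3 \times 3} \\ \sim & 0_{3 \times 3} \end{bmatrix}, \quad A_3 = \begin{bmatrix} Q^T \begin{bmatrix} 1 & 0 & 0 \\ \sim & 0 & 0 \\ \sim & \sim & -1 \end{bmatrix} Q & 0_{3 \times 3} \\ \sim & 0_{3 \times 3} \end{bmatrix},$$
$$A_4 = E_{44} - E_{55}, \quad A_5 = E_{55} - E_{66}, \quad A_6 = E_{24} + E_{42}, \quad A_7 = E_{25} + E_{52}, \quad A_8 = E_{26} + E_{62}, \quad A_9 = E_{34} + E_{43},$$
$$A_{10} = E_{35} + E_{53}, \quad A_{11} = E_{36} + E_{63}, \quad A_{12} = E_{45} + E_{54}, \quad A_{13} = E_{46} + E_{64}, \quad A_{14} = E_{56} + E_{65}, \quad A_{15} = E_{16} + E_{61}. \tag{SDP-III}$$
One can verify that for (SDP-III), there exist strictly complementary and rank-deficient solution pairs:
$$(X_{sc}, S_{sc}) = \left( \begin{bmatrix} 2I_3 & 0_{3 \times 3} \\ \sim & 0_{3 \times 3} \end{bmatrix}, \begin{bmatrix} 0_{3 \times 3} & 0_{3 \times 3} \\ \sim & I_3 \end{bmatrix} \right), \qquad (\bar X, \bar S) = (6E_{11}, 3E_{66}).$$
Thus, we pick $\bar Z = 6E_{11} - 3\sigma E_{66}$.

First-order information. $H \in \mathcal C(\bar Z)$ if and only if:
$$\mathcal P \begin{bmatrix}
H_{11} & \begin{matrix} H_{12} & H_{13} \end{matrix} & \begin{matrix} H_{14} & H_{15} \end{matrix} & 0 \\
\sim & \begin{bmatrix} H_{22} & H_{23} \\ \sim & H_{33} \end{bmatrix} \succeq 0 & 0_{2 \times 2} & 0_{2 \times 1} \\
\sim & \sim & 0_{2 \times 2} & 0_{2 \times 1} \\
\sim & \sim & \sim & 0
\end{bmatrix} = 0
\quad \text{and} \quad
\mathcal P^\perp \begin{bmatrix}
0 & 0_{1 \times 2} & 0_{1 \times 2} & 0 \\
\sim & 0_{2 \times 2} & 0_{2 \times 2} & \begin{matrix} H_{26} \\ H_{36} \end{matrix} \\
\sim & \sim & \begin{bmatrix} H_{44} & H_{45} \\ \sim & H_{55} \end{bmatrix} \preceq 0 & \begin{matrix} H_{46} \\ H_{56} \end{matrix} \\
\sim & \sim & \sim & H_{66}
\end{bmatrix} = 0.$$
Via calculation, we find a family of $\bar H$'s belonging to $\mathcal C(\bar Z) \setminus \mathcal T_{\mathcal Z_\star}(\bar Z)$:
$$\bar H = \bar H(h, \epsilon) = \begin{bmatrix}
-1 & 0 & -\frac{\sqrt 2}{4} & 1 & h & 0 \\
\sim & 1 & 0 & 0 & 0 & 1 \\
\sim & \sim & 0 & 0 & 0 & 1 \\
\sim & \sim & \sim & -\epsilon & 0 & 1 \\
\sim & \sim & \sim & \sim & -1 & 1 \\
\sim & \sim & \sim & \sim & \sim & 1 + \epsilon
\end{bmatrix} \in \mathcal C(\bar Z) \setminus \mathcal T_{\mathcal Z_\star}(\bar Z), \quad \forall \epsilon \ge 0,\ h \in \mathbb R. \tag{62}$$

Second-order information. We adopt the polar description in Proposition 5 to calculate $\phi(\bar Z;\bar H(h, \epsilon))$.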
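(i) For the primal part, we calculate $\mathcal E^\perp(\bar Z;\bar H)$ from (34) as:
$$\mathcal E^\perp(\bar Z;\bar H) = \begin{bmatrix}
0 & 0 & 0 & \frac{\epsilon}{3} & \frac{h}{3} & -\frac{4h - \sqrt 2 + 4}{6(\sigma + 2)} \\
\sim & -\frac{2}{3\sigma} & -\frac{2}{3\sigma} & -\frac{2}{3\sigma} & -\frac{2}{3\sigma} & -\frac{2}{3\sigma} \\
\sim & \sim & -\frac{2}{3\sigma} & -\frac{2}{3\sigma} & -\frac{2}{3\sigma} & 0 \\
\sim & \sim & \sim & -\frac{2}{3\sigma} & -\frac{2}{3\sigma} & \frac{2\epsilon}{3\sigma} \\
\sim & \sim & \sim & \sim & -\frac{2}{3\sigma} & 0 \\
\sim & \sim & \sim & \sim & \sim & 0
\end{bmatrix}.$$
Calculate $\mathcal K^\circ_P(\bar Z;\bar H)$ from (44):
$$\mathcal K^\circ_P(\bar Z;\bar H) = \left\{ U = \begin{bmatrix}
U_{11} & U_{12} & U_{13} & U_{14} & U_{15} & 0 \\
\sim & U_{22} & U_{23} & U_{24} & 0 & 0 \\
\sim & \sim & U_{33} & 0 & 0 & 0 \\
\sim & \sim & \sim & 0 & 0 & 0 \\
\sim & \sim & \sim & \sim & 0 & 0 \\
\sim & \sim & \sim & \sim & \sim & 0
\end{bmatrix} \;\middle|\; U_{33} \ge 0,\ \mathcal P U = 0 \right\}.$$
Equivalently, $\mathcal K^\circ_P(\bar Z;\bar H)$ consists of all $U$ of the above pattern such that, with
$$\widetilde U = Q \begin{bmatrix} U_{11} & U_{12} & U_{13} \\ \sim & U_{22} & U_{23} \\ \sim & \sim & U_{33} \end{bmatrix} Q^T,$$
we have $\widetilde U_{11} = \widetilde U_{22} = \widetilde U_{33} = 0$, $U_{33} \ge 0$, $U_{24} = 0$, $U_{14} \in \mathbb R$, and $U_{15} \in \mathbb R$.

Thus, from Theorem 7:
$$\phi_P(\bar Z;\bar H) = \arg\min_{U \in \mathcal K^\circ_P(\bar Z;\bar H)} \| U + \mathcal E^\perp(\bar Z;\bar H) \|_F^2$$
$$= \arg\min_{\substack{U_{33} \ge 0,\ U_{24} = 0, \\ \widetilde U_{11} = \widetilde U_{22} = \widetilde U_{33} = 0}} \left\| \widetilde U + Q \begin{bmatrix} 0 & 0 & 0 \\ \sim & -\frac{2}{3\sigma} & -\frac{2}{3\sigma} \\ \sim & \sim & -\frac{2}{3\sigma} \end{bmatrix} Q^T \right\|_F^2 + 2\left( U_{14} + \frac{\epsilon}{3} \right)^2 + 2\left( U_{15} + \frac{h}{3} \right)^2 + 2\left( U_{24} - \frac{2}{3\sigma} \right)^2$$
$$= \begin{bmatrix}
-\frac{4}{9\sigma} & -\frac{2\sqrt 2}{9\sigma} & 0 & -\frac{\epsilon}{3} & -\frac{h}{3} & 0 \\
\sim & \frac{2}{9\sigma} & \frac{4}{9\sigma} & 0 & 0 & 0 \\
\sim & \sim & \frac{2}{9\sigma} & 0 & 0 & 0 \\
\sim & \sim & \sim & 0 & 0 & 0 \\
\sim & \sim & \sim & \sim & 0 & 0 \\
\sim & \sim & \sim & \sim & \sim & 0
\end{bmatrix}.$$

(ii) For the dual part: from (34),
$$\mathcal E(\bar Z;\bar H) = \begin{bmatrix}
0 & 0 & 0 & -\frac{\epsilon}{3} & -\frac{h}{3} & \frac{4h - \sqrt 2 + 4}{6(\sigma + 2)} \\
\sim & 0 & 0 & 0 & 0 & \frac{2}{3\sigma} \\
\sim & \sim & \frac{1}{24} & -\frac{\sqrt 2}{12} & -\frac{\sqrt 2 h}{12} & 0 \\
\sim & \sim & \sim & \frac{1}{3} & \frac{h}{3} & -\frac{2\epsilon}{3\sigma} \\
\sim & \sim & \sim & \sim & \frac{h^2}{3} & 0 \\
\sim & \sim & \sim & \sim & \sim & 0
\end{bmatrix}.$$
However, $\mathcal K^\circ_D(\bar Z;\bar H(h, \epsilon))$ is discontinuous at $\epsilon = 0$.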
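(a) When $\epsilon = 0$: in this case,
\[
\mathcal{K}^{\circ}_D(\bar Z; \bar H)
= \left\{ V = \begin{bmatrix}
0 & \begin{matrix} 0 & 0 & 0 & 0 \end{matrix} & 0 \\
\sim & \begin{bmatrix} 0 & 0 & 0 & 0 \\ \sim & 0 & 0 & V_{35} \\ \sim & \sim & V_{44} \le 0 & V_{45} \\ \sim & \sim & \sim & V_{55} \end{bmatrix} & \begin{matrix} V_{26} \\ V_{36} \\ V_{46} \\ V_{56} \end{matrix} \\
\sim & \sim & V_{66}
\end{bmatrix} \;\middle|\; \mathcal{P}^{\perp} V = 0 \right\}
\]
\[
= \left\{ V = \begin{bmatrix}
0 & \begin{matrix} 0 & 0 & 0 & 0 \end{matrix} & 0 \\
\sim & \begin{bmatrix} 0 & 0 & 0 & 0 \\ \sim & 0 & 0 & V_{35} \\ \sim & \sim & V_{44} & V_{45} \\ \sim & \sim & \sim & V_{55} \end{bmatrix} & \begin{matrix} V_{26} \\ V_{36} \\ V_{46} \\ V_{56} \end{matrix} \\
\sim & \sim & V_{66}
\end{bmatrix} \;\middle|\;
\begin{array}{l}
V_{44} \le 0, \quad \begin{bmatrix} V_{44} \\ V_{55} \\ V_{66} \end{bmatrix} \in \left\{ a \begin{bmatrix} 1 \\ -1 \\ 0 \end{bmatrix} + b \begin{bmatrix} 0 \\ 1 \\ -1 \end{bmatrix} : a, b \in \mathbb{R} \right\}, \\
V_{26}, V_{35}, V_{36}, V_{45}, V_{46}, V_{56} \in \mathbb{R}
\end{array}
\right\}.
\]
Thus, from Theorem 7, we get
\[
\begin{aligned}
-\sigma \phi_D(\bar Z; \bar H)
&= \operatorname*{arg\,min}_{V \in \mathcal{K}^{\circ}_D(\bar Z; \bar H)} \left\| V + E(\bar Z; \bar H) \right\|_F^2 \\
&= \operatorname*{arg\,min}_{\substack{V_{44} \le 0, \\ V_{44} = a,\ V_{55} = -a + b,\ V_{66} = -b}}
\left\| \begin{bmatrix} V_{44} \\ V_{55} \\ V_{66} \end{bmatrix} + \begin{bmatrix} \frac{1}{3} \\ \frac{h^2}{3} \\ 0 \end{bmatrix} \right\|_F^2
+ 2\left(V_{26} + \tfrac{2}{3\sigma}\right)^2 + 2\left(V_{35} - \tfrac{\sqrt{2} h}{12}\right)^2 + 2\left(V_{45} + \tfrac{h}{3}\right)^2 + 2 V_{36}^2 + 2 V_{46}^2 + 2 V_{56}^2 \\
&= \begin{bmatrix}
0 & \begin{matrix} 0 & 0 & 0 & 0 \end{matrix} & 0 \\
\sim & \begin{bmatrix} 0 & 0 & 0 & 0 \\ \sim & 0 & 0 & \frac{\sqrt{2} h}{12} \\ \sim & \sim & a_\star & -\frac{h}{3} \\ \sim & \sim & \sim & -a_\star + b_\star \end{bmatrix} & \begin{matrix} -\frac{2}{3\sigma} \\ 0 \\ 0 \\ 0 \end{matrix} \\
\sim & \sim & -b_\star
\end{bmatrix},
\end{aligned}
\]
where $(a_\star, b_\star)$ is defined as:
\[
(a_\star, b_\star) = \operatorname*{arg\,min}_{a \le 0} \left(a + \tfrac{1}{3}\right)^2 + \left(-a + b + \tfrac{h^2}{3}\right)^2 + b^2
= \begin{cases}
\left( \tfrac{1}{9}(h^2 - 2),\ -\tfrac{1}{9}(h^2 + 1) \right), & |h| \le \sqrt{2}, \\
\left( 0,\ -\tfrac{1}{6} h^2 \right), & |h| > \sqrt{2}.
\end{cases}
\]
(b) When $\epsilon > 0$: in this case,
\[
\mathcal{K}^{\circ}_D(\bar Z; \bar H)
= \left\{ V = \begin{bmatrix}
0 & \begin{matrix} 0 & 0 & 0 & 0 \end{matrix} & 0 \\
\sim & \begin{bmatrix} 0 & 0 & 0 & 0 \\ \sim & 0 & 0 & V_{35} \\ \sim & \sim & V_{44} & V_{45} \\ \sim & \sim & \sim & V_{55} \end{bmatrix} & \begin{matrix} V_{26} \\ V_{36} \\ V_{46} \\ V_{56} \end{matrix} \\
\sim & \sim & V_{66}
\end{bmatrix} \;\middle|\; \mathcal{P}^{\perp} V = 0 \right\}
\]
\[
= \left\{ V = \begin{bmatrix}
0 & \begin{matrix} 0 & 0 & 0 & 0 \end{matrix} & 0 \\
\sim & \begin{bmatrix} 0 & 0 & 0 & 0 \\ \sim & 0 & 0 & V_{35} \\ \sim & \sim & V_{44} & V_{45} \\ \sim & \sim & \sim & V_{55} \end{bmatrix} & \begin{matrix} V_{26} \\ V_{36} \\ V_{46} \\ V_{56} \end{matrix} \\
\sim & \sim & V_{66}
\end{bmatrix} \;\middle|\;
\begin{array}{l}
\begin{bmatrix} V_{44} \\ V_{55} \\ V_{66} \end{bmatrix} \in \left\{ a \begin{bmatrix} 1 \\ -1 \\ 0 \end{bmatrix} + b \begin{bmatrix} 0 \\ 1 \\ -1 \end{bmatrix} : a, b \in \mathbb{R} \right\}, \\
V_{26}, V_{35}, V_{36}, V_{45}, V_{46}, V_{56} \in \mathbb{R}
\end{array}
\right\}.
\]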
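Thus,
\[
\begin{aligned}
-\sigma \phi_D(\bar Z; \bar H)
&= \operatorname*{arg\,min}_{V \in \mathcal{K}^{\circ}_D(\bar Z; \bar H)} \left\| V + E(\bar Z; \bar H) \right\|_F^2 \\
&= \operatorname*{arg\,min}_{V_{44} = a,\ V_{55} = -a + b,\ V_{66} = -b}
\left\| \begin{bmatrix} V_{44} \\ V_{55} \\ V_{66} \end{bmatrix} + \begin{bmatrix} \frac{1}{3} \\ \frac{h^2}{3} \\ 0 \end{bmatrix} \right\|_F^2
+ 2\left(V_{26} + \tfrac{2}{3\sigma}\right)^2 + 2\left(V_{35} - \tfrac{\sqrt{2} h}{12}\right)^2 + 2\left(V_{45} + \tfrac{h}{3}\right)^2 + 2 V_{36}^2 + 2\left(V_{46} - \tfrac{2\epsilon}{3\sigma}\right)^2 + 2 V_{56}^2 \\
&= \begin{bmatrix}
0 & \begin{matrix} 0 & 0 & 0 & 0 \end{matrix} & 0 \\
\sim & \begin{bmatrix} 0 & 0 & 0 & 0 \\ \sim & 0 & 0 & \frac{\sqrt{2} h}{12} \\ \sim & \sim & \frac{1}{9}(h^2 - 2) & -\frac{h}{3} \\ \sim & \sim & \sim & -\frac{1}{9}(2h^2 - 1) \end{bmatrix} & \begin{matrix} -\frac{2}{3\sigma} \\ 0 \\ \frac{2\epsilon}{3\sigma} \\ 0 \end{matrix} \\
\sim & \sim & \frac{1}{9}(h^2 + 1)
\end{bmatrix}.
\end{aligned}
\]
(iii) Combining (i) and (ii), we get:
(a) If case I: $|h| \le \sqrt{2}$; or case II: $|h| > \sqrt{2}$ and $\epsilon > 0$,
\[
\phi(\bar Z; \bar H) = \phi_P(\bar Z; \bar H) - \sigma \phi_D(\bar Z; \bar H) = \begin{bmatrix}
-\frac{4}{9\sigma} & \begin{matrix} -\frac{2\sqrt{2}}{9\sigma} & 0 & -\frac{\epsilon}{3} & -\frac{h}{3} \end{matrix} & 0 \\
\sim & \begin{bmatrix} \frac{2}{9\sigma} & \frac{4}{9\sigma} & 0 & 0 \\ \sim & \frac{2}{9\sigma} & 0 & \frac{\sqrt{2} h}{12} \\ \sim & \sim & \frac{h^2 - 2}{9} & -\frac{h}{3} \\ \sim & \sim & \sim & -\frac{2h^2 - 1}{9} \end{bmatrix} & \begin{matrix} -\frac{2}{3\sigma} \\ 0 \\ \frac{2\epsilon}{3\sigma} \\ 0 \end{matrix} \\
\sim & \sim & \frac{h^2 + 1}{9}
\end{bmatrix}.
\tag{63}
\]
(b) If $|h| > \sqrt{2}$ and $\epsilon = 0$,
\[
\phi(\bar Z; \bar H) = \phi_P(\bar Z; \bar H) - \sigma \phi_D(\bar Z; \bar H) = \begin{bmatrix}
-\frac{4}{9\sigma} & \begin{matrix} -\frac{2\sqrt{2}}{9\sigma} & 0 & 0 & -\frac{h}{3} \end{matrix} & 0 \\
\sim & \begin{bmatrix} \frac{2}{9\sigma} & \frac{4}{9\sigma} & 0 & 0 \\ \sim & \frac{2}{9\sigma} & 0 & \frac{\sqrt{2} h}{12} \\ \sim & \sim & 0 & -\frac{h}{3} \\ \sim & \sim & \sim & -\frac{h^2}{6} \end{bmatrix} & \begin{matrix} -\frac{2}{3\sigma} \\ 0 \\ 0 \\ 0 \end{matrix} \\
\sim & \sim & \frac{h^2}{6}
\end{bmatrix}.
\tag{64}
\]

$\sigma$ updating. Following the exact same procedure as in §10.1 and §10.2, we get $\phi_P(\bar Z'; \bar H') = \frac{\sigma'}{\sigma} \phi_P(\bar Z; \bar H)$ and $\phi_D(\bar Z'; \bar H') = \frac{\sigma}{\sigma'} \phi_D(\bar Z; \bar H)$ as we update $\sigma$ to $\sigma'$, for all $\bar H = \bar H(h, \epsilon)$. We omit the details here.

11 Numerical Experiments

Experiment setup. To further qualitatively evaluate our analysis framework, we run the three-step ADMM (2) on the Mittelmann dataset, a widely used benchmark for SDP solvers [3, 20, 46, 50]. For concreteness, we select the single-block instances with block size no greater than 3000, yielding 25 instances in total. All experiments were conducted on the Harvard University Faculty of Arts and Sciences Research Computing (FASRC) cluster.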
Jobs were submitted to the seas_compute partition, and each run requested 48 CPU cores and 64 GB of memory.$^{2}$

Experiment I. After rescaling the SDP data and applying diagonal preconditioning to the constraint matrix $A$, we start the three-step ADMM (2) from all-zero initial guesses and use a fixed $\sigma$-updating strategy that aims to balance the primal and dual infeasibilities [54] for 20000 iterations. After that, $\sigma$ is kept unchanged. At the 40000th iteration, we record the current penalty parameter as $\sigma_0$ and set $(X^{(40000)}, y^{(40000)}, S^{(40000)})$ as $(X_0, y_0, S_0)$ for subsequent use. For each instance, we set the maximum number of iterations to $10^6$ and the maximum running time to 168 hours. We terminate once the maximum KKT residual satisfies $r_{\max} \le 10^{-10}$. We report the trajectories of $\angle(\Delta Z^{(k)}, \Delta Z^{(k+1)})$, $\|\Delta Z^{(k)}\|_F$, and $r^{(k)}_{\max}$. The results are shown in Figure 14. Based on whether ADMM solves an instance to $r_{\max}$ below $10^{-10}$, we divide the 25 SDPs into two groups:
• "Easy" SDPs: 1et2048, 1zc1024, cphil12, G48mb, G48mc, hamming8, hamming9, theta12, theta102, theta123. (10 instances.)
• "Hard" SDPs: 1dc1024, 1tc2048, cancer100, cnhil10, foot, G40mb, hand, neosfbr25, neosfbr30e8, neu1g, neu2g, neu3g, r12000, swissroll, texture. (15 instances.)
Across the slow-convergence regions of all "hard" instances, we observe that $\angle(\Delta Z^{(k)}, \Delta Z^{(k+1)})$ remains small yet nonzero for most iterations (typically around $10^{-3}$ to $10^{-5}$), except for several sparse spikes. This behavior is consistent with our local second-order limit dynamics model (43), as discussed in §6.3 and §8.3.
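The diagnostic used in Experiment I can be sketched in a few lines. The snippet below is a minimal illustration, not the authors' implementation: it assumes that the reported quantity is the angle between consecutive increments under the Frobenius inner product, that $\Delta Z^{(k)} = Z^{(k+1)} - Z^{(k)}$, and that the iterates are available as a list of NumPy arrays. The function name `increment_angles` is hypothetical.

```python
import numpy as np

def increment_angles(iterates):
    """Angles (in radians) between consecutive increments of a sequence of iterates.

    Assumption: Delta Z^(k) = Z^(k+1) - Z^(k), and the angle is measured in the
    Frobenius inner product. Returns an array of length len(iterates) - 2.
    """
    deltas = [b - a for a, b in zip(iterates[:-1], iterates[1:])]
    angles = []
    for d0, d1 in zip(deltas[:-1], deltas[1:]):
        n0, n1 = np.linalg.norm(d0), np.linalg.norm(d1)
        if n0 == 0.0 or n1 == 0.0:
            angles.append(np.nan)  # angle undefined for a zero increment
            continue
        u, v = d0 / n0, d1 / n1
        # 2 * atan2(|u - v|, |u + v|) is numerically stable for tiny angles,
        # unlike arccos(<u, v>), which loses accuracy near cos = 1.
        angles.append(2.0 * np.arctan2(np.linalg.norm(u - v), np.linalg.norm(u + v)))
    return np.array(angles)

# Tiny usage example with 2x2 symmetric "iterates".
Z0 = np.zeros((2, 2))
Z1 = np.array([[1.0, 0.0], [0.0, 0.0]])
Z2 = np.array([[1.0, 1.0], [1.0, 0.0]])
Z3 = np.array([[1.0, 3.0], [3.0, 0.0]])
print(increment_angles([Z0, Z1, Z2, Z3]))
```

The `arctan2` form is chosen because the angles of interest are reported to be as small as $10^{-5}$, a range in which an `arccos`-based formula can lose several digits.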
$^{2}$ https://docs.rc.fas.harvard.edu/kb/running-jobs/

[Figure 14 panels: 1dc1024, 1et2048, 1tc2048, 1zc1024, cancer100, cnhil10, cphil12, foot, G40mb, G48mb, G48mc, hamming8, hamming9, hand, neosfbr25, neosfbr30e8, neu1g, neu2g, neu3g, r12000, swissroll, theta12, theta102, theta123, texture]
Figure 14: $r^{(k)}_{\max}$, $\|\Delta Z^{(k)}\|_F$, and $\angle(\Delta Z^{(k)}, \Delta Z^{(k+1)})$ for 25 Mittelmann SDP datasets (single-block, block size $\le 3000$). For all instances exhibiting slow convergence, $\angle(\Delta Z^{(k)}, \Delta Z^{(k+1)})$ remains small for an extended period, except for a few sparse spikes. In contrast, for many instances that display an observable sharp linear convergence phase, $\angle(\Delta Z^{(k)}, \Delta Z^{(k+1)})$ is large.

Experiment II. To further probe the slow-convergence regimes of the 15 "hard" instances, we restart from $(X_0, y_0, S_0)$ and $\sigma_0$, and then uniformly increase $\log_{10}(\sigma)$ from $\log_{10}(\sigma_0)$ to $\log_{10}(10\sigma_0)$ over the next 5000 iterations. As in §9.3, we plot the resulting "response curves" of $\|\Delta X^{(k)}\|_F$, $\|\Delta S^{(k)}\|_F$, $r^{(k)}_p$, and $r^{(k)}_d$ as functions of $\sigma$. The results are shown in Figure 15. We further divide the instances into three groups:
• Group I: cnhil10, foot, neu1g, neu3g, texture. (5 instances.)
• Group II: 1dc1024, G40mb, hand, neosfbr25, r12000, swissroll. (6 instances.)
• Group III: 1tc2048, cancer100, neosfbr30e8, neu2g. (4 instances.)
The response curves of Group I are compatible with our limit dynamics model. For these instances, $\log_{10}(\|\Delta X^{(k)}\|_F)$ (resp. $\log_{10}(\|\Delta S^{(k)}\|_F)$) increases (resp. decreases) approximately linearly with $\log_{10}(\sigma)$, with slope close to $+1$ (resp. $-1$). For these 5 SDPs, the slow-convergence regions are therefore likely driven by the second-order limit dynamics (43). For Group II, the response curves partially resemble those in Group I. When $\sigma$ is small, the slope of $\log_{10}(\|\Delta X^{(k)}\|_F)$ (resp. $\log_{10}(\|\Delta S^{(k)}\|_F)$) is close to $+1$ (resp. $-1$).
However, as $\sigma$ increases, the curves distort, either smoothly or abruptly. For these 6 SDPs, we conjecture that updating $\sigma$ helps the iterates escape the current second-order-dominant regions. For Group III, the response curves deviate substantially from those predicted by the local second-order limit dynamics model. The mechanisms underlying the slow-convergence behavior of these 4 instances therefore remain unclear to us.

[Figure 15 panels: 1dc1024, 1tc2048, cancer100, cnhil10, foot, G40mb, hand, neosfbr25, neosfbr30e8, neu1g, neu2g, neu3g, r12000, swissroll, texture]
Figure 15: For the selected 15 "hard" SDP instances in the Mittelmann dataset, we run ADMM iterations with initial guess $(X_0, y_0, S_0)$ and initial $\sigma = \sigma_0$. We uniformly increase $\log_{10}(\sigma)$ from $\log_{10}(\sigma_0)$ to $\log_{10}(10\sigma_0)$ over the next 5000 iterations. We report the trajectories of $\|\Delta X^{(k)}\|_F$, $\|\Delta S^{(k)}\|_F$, $r^{(k)}_p$, and $r^{(k)}_d$.

12 Future Directions

Our work opens several directions for future research.

Approximation error of the limit dynamics. As discussed in §1.3, a main open issue is to quantify the approximation error between the local second-order limit dynamics model (43) and the true ADMM dynamics (4). This appears challenging in general, since the model relies on three coupled approximation layers (cf. Remark 2). Developing a principled error-control theory (for example, identifying regimes where the limit dynamics provides uniform guarantees over time horizons relevant to slow-convergence behavior) would substantially strengthen the framework.

Extension to singularity degree $> 1$. The current framework relies on the existence of a strictly complementary solution pair (i.e., the singularity degree of the optimal set is 1 [47]) to simplify the block structure and the ensuing analysis (cf. Remark 1). This assumption may fail in general. Extending the analysis to singularity degree $d > 1$ is an important direction.
A natural conjecture is that one may need to understand local expansions to order $2d$ to capture the correct limit behavior, although the precise order and the right notion of limit map remain open.

Characterization of the almost-invariant set. Our current discussion of local almost-invariant sets is qualitative (cf. §7.3). A quantitative theory that bounds the leakage from $\mathcal{C}(\bar Z)$ would provide a more complete picture of slow-convergence regions. Achieving this likely requires new technical tools to balance the two competing effects: the first-order dynamics that pulls iterates toward $\mathcal{C}(\bar Z)$ and the second-order drift that may push them away.

Algorithmic acceleration. Several components of our analysis have direct implications for algorithm design, especially the dependence on $\sigma$ (cf. §9). A promising direction is to detect second-order-dominant regimes using the $\sigma$–$(r^{(k)}_p, r^{(k)}_d)$ response curves developed in Experiment II (cf. §11). Once such a regime is detected, one could design region-wise $\sigma$-adaptation rules informed by additional problem structure (e.g., one-sided uniqueness) and by the predicted scaling of the limit drifts.

Extension to other splitting methods and conic programs. It would also be interesting to extend the present approach beyond ADMM to other splitting schemes (e.g., sGS-ADMM and PDHG), and to other conic programs where projection operators admit comparable second-order structure. Such extensions could help clarify whether second-order limit dynamics is a general mechanism underlying slow convergence across a broader class of operator-splitting algorithms.

13 Conclusion

This paper developed a transient, region-wise perspective on the slow-convergence behavior observed in ADMM for SDPs with multiple KKT points.
We refined and streamlined the (parabolic) second-order directional derivative formula for the PSD projection operator, and leveraged it to derive a detailed second-order expansion of the ADMM dynamics around an arbitrary KKT point $\bar Z$. This expansion isolates the cone $\mathcal{C}(\bar Z)$ of first-order stalled directions and leads to the central object of the paper: the local second-order limit map $\phi(\bar Z; \cdot) : \mathcal{C}(\bar Z) \mapsto \mathbb{S}^n$ and its induced limit dynamics. This limit dynamics serves as a local surrogate for the nonlinear ADMM update after transient effects have decayed. We then analyzed four structural properties of $\phi(\bar Z; \cdot)$: its kernel, range, continuity, and dependence on the penalty parameter $\sigma$. These results explain or predict three empirical slow-convergence patterns:
• Using the characterization $\ker(\phi(\bar Z; \cdot)) = \mathcal{T}_{\mathcal{Z}_\star}(\bar Z)$ together with the almost-sure-type continuity of the limit map, we showed that $\angle(\Delta Z^{(k)}, \Delta Z^{(k+1)})$ tends to be small yet nonzero, except for sparse spikes.
• By relating $\operatorname{ran}(\phi(\bar Z; \cdot))$ to $\operatorname{aff}(\mathcal{C}(\bar Z))$, we showed that $Z^{(k)}$ can be transiently trapped in a low-dimensional subspace for an extended period of time.
• Exploiting a primal–dual decoupling of $\phi(\bar Z; \cdot)$, we showed that primal/dual infeasibilities are locally insensitive to $\sigma$ in the second-order-dominant regimes, clarifying why classical balancing heuristics can become ineffective.
Extensive experiments on the Mittelmann dataset corroborate these theoretical predictions. We hope our results motivate a broader and more systematic study of the ubiquitous slow-convergence behavior encountered in first-order splitting methods for SDPs.

Acknowledgments

We sincerely thank Henry Wolkowicz for valuable discussions.
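Appendix A Proof of Theorem 2

Let $F : \mathbb{S}^n \mapsto \mathbb{S}^n$ be a (parabolically) second-order directionally differentiable spectral function generated by a (parabolically) second-order directionally differentiable scalar function $f : \mathbb{R} \mapsto \mathbb{R}$. For $x \ne y \ne z$, define the first- and (parabolic) second-order divided differences of $f$ as
\[
f^{[1]}(x, y) := \frac{f(x) - f(y)}{x - y} = \frac{f(x)}{x - y} + \frac{f(y)}{y - x},
\]
\[
f^{[2]}(x, y, z) := \frac{f^{[1]}(x, y) - f^{[1]}(x, z)}{y - z} = \frac{f(x)}{(x - y)(x - z)} + \frac{f(y)}{(y - x)(y - z)} + \frac{f(z)}{(z - x)(z - y)}.
\]
Now let $(Z, H, W)$ be given by the three-level description in §3. For any $k \in \mathcal{I}$, denote by $\Phi_k : \mathbb{S}^{|\alpha_k|} \mapsto \mathbb{S}^{|\alpha_k|}$ the spectral function generated by the scalar function $f'(\mu_k; \cdot)$. For any $k \in \mathcal{I}$, $i \in \mathcal{I}_k$, denote by $\Psi_{k,i} : \mathbb{S}^{|\beta_{k,i}|} \mapsto \mathbb{S}^{|\beta_{k,i}|}$ the spectral function generated by the scalar function $f''(\mu_k; \eta_{k,i}, \cdot)$. For any $a, b \in \mathcal{I}$, define
\[
\Gamma_1(H, W)_{\alpha_a \alpha_b} := \begin{cases}
f^{[1]}(\mu_a, \mu_b) W_{\alpha_a \alpha_b} + \sum_{c \notin \{a, b\}} 2 f^{[2]}(\mu_a, \mu_b, \mu_c) H_{\alpha_a \alpha_c} H_{\alpha_c \alpha_b} \\
\quad - \dfrac{2 (f(\mu_a) - f(\mu_b))}{(\mu_a - \mu_b)^2} \left( H_{\alpha_a \alpha_a} H_{\alpha_a \alpha_b} - H_{\alpha_a \alpha_b} H_{\alpha_b \alpha_b} \right), & a \ne b, \\[2ex]
- \sum_{c \in \mathcal{I} \setminus \{a\}} \dfrac{2 (f(\mu_a) - f(\mu_c))}{(\mu_a - \mu_c)^2} H_{\alpha_a \alpha_c} H_{\alpha_c \alpha_a}, & a = b,
\end{cases}
\tag{65}
\]
and
\[
\Gamma_2(H, W)_{\alpha_a \alpha_b} = \begin{cases}
Q^a \left[ \Omega^a \circ (Q^a)^T V_a Q^a \right] (Q^a)^T, & a = b, \\[1ex]
\Phi_a(H_{\alpha_a \alpha_a}) \dfrac{2 H_{\alpha_a \alpha_b}}{\mu_a - \mu_b} + \dfrac{2 H_{\alpha_a \alpha_b}}{\mu_b - \mu_a} \Phi_b(H_{\alpha_b \alpha_b}), & a \ne b,
\end{cases}
\tag{66}
\]
and
\[
\Gamma_3(H, W)_{\alpha_a \alpha_b} = \begin{cases}
Q^a \operatorname{diag}\left( \left\{ \Psi_{a,i}(\hat V_a^{i,i}) \right\}_{i \in \mathcal{I}_a} \right) (Q^a)^T, & a = b, \\
0, & a \ne b.
\end{cases}
\tag{67}
\]
To prove Theorem 2, we first need the formula for the (parabolic) second-order directional derivative of a general spectral function $F$ from [59, Theorem 4.1].

Theorem 8 ($F''(Z; H, W)$). Let the triplet $(Z, H, W)$ be given by the three-level description in §3. Then, for any $a, b \in \mathcal{I}$,
\[
F''(Z; H, W)_{\alpha_a \alpha_b} = \Gamma_1(H, W)_{\alpha_a \alpha_b} + \Gamma_2(H, W)_{\alpha_a \alpha_b} + \Gamma_3(H, W)_{\alpha_a \alpha_b},
\]
where $\Gamma_1, \Gamma_2, \Gamma_3$ are defined in (65) to (67).$^{3}$

Now we are ready to prove Theorem 2. For the PSD cone projection operator $\Pi_{\mathbb{S}^n_+}(\cdot)$, $f(\mu_k) = \max\{\mu_k, 0\}$. Thus,
\[
f'(\mu_k; \eta_{k,i}) = \begin{cases}
\eta_{k,i}, & \mu_k > 0, \\
\max\{\eta_{k,i}, 0\}, & \mu_k = 0, \\
0, & \mu_k < 0,
\end{cases}
\]
and
\[
f''(\mu_k; \eta_{k,i}, \zeta_{k,i,i'}) = \begin{cases}
\zeta_{k,i,i'}, & \mu_k > 0, \\[1ex]
\begin{cases}
\zeta_{k,i,i'}, & \eta_{k,i} > 0, \\
\max\{\zeta_{k,i,i'}, 0\}, & \eta_{k,i} = 0, \\
0, & \eta_{k,i} < 0,
\end{cases} & \mu_k = 0, \\[1ex]
0, & \mu_k < 0.
\end{cases}
\]
$^{3}$ In [59, Eq. (4.4)-(4.5)], the authors drop the multiplier 2, and the coefficient should be $-\frac{f(\mu_l) - f(\mu_k)}{(\mu_l - \mu_k)^2}$ instead of $\frac{f(\mu_l) - f(\mu_k)}{(\mu_l - \mu_k)^2}$. In [59, Theorem 4.1], $[F''(Z; H, W)]_{\alpha_a \alpha_a}$ drops the term $C(H, W)_{\alpha_a \alpha_a}$.

Case (1)(i): $a \in \mathcal{I}_+$, $b \in \mathcal{I}_+$ and $a \ne b$. For $\Gamma_1(H, W)_{\alpha_a \alpha_b}$, $f^{[1]}(\mu_a, \mu_b) = 1$, $f(\mu_a) - f(\mu_b) = \mu_a - \mu_b$, and
\[
f^{[2]}(\mu_a, \mu_b, \mu_c) = \begin{cases}
0, & c \in \mathcal{I}_+ \cup \mathcal{I}_0, \\
\dfrac{1}{\mu_a - \mu_b} \left( \dfrac{\mu_a}{\mu_a - \mu_c} - \dfrac{\mu_b}{\mu_b - \mu_c} \right) = \dfrac{-\mu_c}{(\mu_c - \mu_a)(\mu_c - \mu_b)}, & c \in \mathcal{I}_-.
\end{cases}
\]
Thus,$^{4}$
\[
\Gamma_1(H, W)_{\alpha_a \alpha_b} = W_{\alpha_a \alpha_b} + 2 \sum_{c \in \mathcal{I}_-} \frac{-\mu_c}{(\mu_c - \mu_a)(\mu_c - \mu_b)} H_{\alpha_a \alpha_c} H_{\alpha_c \alpha_b} - \frac{2}{\mu_a - \mu_b} \left( H_{\alpha_a \alpha_a} H_{\alpha_a \alpha_b} - H_{\alpha_a \alpha_b} H_{\alpha_b \alpha_b} \right).
\]
For $\Gamma_2(H, W)_{\alpha_a \alpha_b}$, we have $\Phi_a(H_{\alpha_a \alpha_a}) = H_{\alpha_a \alpha_a}$ since $f'(\mu_a; \eta_{a,i}) = \eta_{a,i}$. Symmetrically, $\Phi_b(H_{\alpha_b \alpha_b}) = H_{\alpha_b \alpha_b}$. Thus,
\[
\Gamma_2(H, W)_{\alpha_a \alpha_b} = \frac{2}{\mu_a - \mu_b} H_{\alpha_a \alpha_a} H_{\alpha_a \alpha_b} - \frac{2}{\mu_a - \mu_b} H_{\alpha_a \alpha_b} H_{\alpha_b \alpha_b}.
\]
For $\Gamma_3(H, W)_{\alpha_a \alpha_b}$, it is $0$. Thus,
\[
\Pi''_{\mathbb{S}^n_+}(Z; H, W)_{\alpha_a \alpha_b} = W_{\alpha_a \alpha_b} + 2 \sum_{c \in \mathcal{I}_-} \frac{-\mu_c}{(\mu_c - \mu_a)(\mu_c - \mu_b)} H_{\alpha_a \alpha_c} H_{\alpha_c \alpha_b}.
\]
Case (1)(ii): $a \in \mathcal{I}_+$, $b \in \mathcal{I}_+$ and $a = b$. For $\Gamma_1(H, W)_{\alpha_a \alpha_a}$,
\[
\begin{aligned}
\Gamma_1(H, W)_{\alpha_a \alpha_a}
&= -2 \sum_{c \in \mathcal{I} \setminus \{a\}} \frac{f(\mu_a) - f(\mu_c)}{(\mu_a - \mu_c)^2} H_{\alpha_a \alpha_c} H_{\alpha_c \alpha_a} \\
&= -2 \sum_{c \in \mathcal{I}_+ \setminus \{a\}} \frac{1}{\mu_a - \mu_c} H_{\alpha_a \alpha_c} H_{\alpha_c \alpha_a} - 2 \sum_{c \in \mathcal{I}_0} \frac{1}{\mu_a} H_{\alpha_a \alpha_c} H_{\alpha_c \alpha_a} - 2 \sum_{c \in \mathcal{I}_-} \frac{\mu_a}{(\mu_a - \mu_c)^2} H_{\alpha_a \alpha_c} H_{\alpha_c \alpha_a}.
\end{aligned}
\]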
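For $\Gamma_2(H, W)_{\alpha_a \alpha_a}$, since $[f'(\mu_a; \cdot)]^{[1]}(\eta_{a,i}, \eta_{a,j}) = 1$, we have
\[
\Omega^a_{\beta_{a,i}, \beta_{a,j}} = \begin{cases}
E_{|\beta_{a,i}| \times |\beta_{a,j}|}, & i \ne j, \\
0, & i = j.
\end{cases}
\]
Thus,
\[
\begin{aligned}
\Gamma_2(H, W)_{\alpha_a \alpha_a}
&= Q^a \left[ \Omega^a \circ \left( (Q^a)^T V_a(H, W) Q^a \right) \right] (Q^a)^T \\
&= V_a(H, W) - Q^a \operatorname{diag}\left( \left\{ \hat V_a^{i,i}(H, W) \right\}_{i \in \mathcal{I}_a} \right) (Q^a)^T \\
&= W_{\alpha_a \alpha_a} + \sum_{c \in \mathcal{I} \setminus \{a\}} \frac{2}{\mu_a - \mu_c} H_{\alpha_a \alpha_c} H_{\alpha_c \alpha_a} - Q^a \operatorname{diag}\left( \left\{ \hat V_a^{i,i}(H, W) \right\}_{i \in \mathcal{I}_a} \right) (Q^a)^T.
\end{aligned}
\]
For $\Gamma_3(H, W)_{\alpha_a \alpha_a}$, since $f''(\mu_a; \eta_{a,i}, \zeta_{a,i,i'}) = \zeta_{a,i,i'}$, $\Psi_{a,i}(\hat V_a^{i,i}) = \hat V_a^{i,i}$. Thus,
\[
\Gamma_3(H, W)_{\alpha_a \alpha_a} = Q^a \operatorname{diag}\left( \left\{ \hat V_a^{i,i}(H, W) \right\}_{i \in \mathcal{I}_a} \right) (Q^a)^T.
\]
$^{4}$ In [35, Eq. (10a)], the $\mu_j$ in the numerator should be $-\mu_j$.

Summing up all three terms:
\[
\begin{aligned}
\Pi''_{\mathbb{S}^n_+}(Z; H, W)_{\alpha_a \alpha_a}
&= \Gamma_1(H, W)_{\alpha_a \alpha_a} + \Gamma_2(H, W)_{\alpha_a \alpha_a} + \Gamma_3(H, W)_{\alpha_a \alpha_a} \\
&= W_{\alpha_a \alpha_a} + 2 \sum_{c \in \mathcal{I}_-} \frac{1}{\mu_a - \mu_c} H_{\alpha_a \alpha_c} H_{\alpha_c \alpha_a} - 2 \sum_{c \in \mathcal{I}_-} \frac{\mu_a}{(\mu_a - \mu_c)^2} H_{\alpha_a \alpha_c} H_{\alpha_c \alpha_a} \\
&= W_{\alpha_a \alpha_a} + 2 \sum_{c \in \mathcal{I}_-} \frac{-\mu_c}{(\mu_a - \mu_c)^2} H_{\alpha_a \alpha_c} H_{\alpha_c \alpha_a}.
\end{aligned}
\]
Clearly, the results of Case (1)(i) and Case (1)(ii) can be merged.

Case (2): $a \in \mathcal{I}_+$, $b \in \mathcal{I}_0$. $\Gamma_1(H, W)_{\alpha_a \alpha_b}$ is the same as in Case (1)(i) ($a \in \mathcal{I}_+$, $b \in \mathcal{I}_+$ and $a \ne b$), except that $\mu_b = 0$:
\[
\Gamma_1(H, W)_{\alpha_a \alpha_b} = W_{\alpha_a \alpha_b} + 2 \sum_{c \in \mathcal{I}_-} \frac{1}{\mu_a - \mu_c} H_{\alpha_a \alpha_c} H_{\alpha_c \alpha_b} - \frac{2}{\mu_a} \left( H_{\alpha_a \alpha_a} H_{\alpha_a \alpha_b} - H_{\alpha_a \alpha_b} H_{\alpha_b \alpha_b} \right).
\]
For $\Gamma_2(H, W)_{\alpha_a \alpha_b}$, we have $f'(\mu_a; \eta_{a,i}) = \eta_{a,i}$ and $f'(\mu_b; \eta_{b,i}) = f'(0; \eta_{b,i}) = \max\{\eta_{b,i}, 0\}$. Therefore, $\Phi_a(H_{\alpha_a \alpha_a}) = H_{\alpha_a \alpha_a}$ and $\Phi_b(H_{\alpha_b \alpha_b}) = \Pi_+(H_{\alpha_b \alpha_b})$. Consequently,
\[
\Gamma_2(H, W)_{\alpha_a \alpha_b} = H_{\alpha_a \alpha_a} \frac{2 H_{\alpha_a \alpha_b}}{\mu_a} - \frac{2 H_{\alpha_a \alpha_b}}{\mu_a} \Pi_+(H_{\alpha_b \alpha_b}).
\]
For $\Gamma_3(H, W)_{\alpha_a \alpha_b}$, it is $0$ since $a \ne b$. Thus, using $H_{\alpha_b \alpha_b} - \Pi_+(H_{\alpha_b \alpha_b}) = -\Pi_+(-H_{\alpha_b \alpha_b})$ in the last step,
\[
\begin{aligned}
\Pi''_{\mathbb{S}^n_+}(Z; H, W)_{\alpha_a \alpha_b}
&= W_{\alpha_a \alpha_b} + 2 \sum_{c \in \mathcal{I}_-} \frac{1}{\mu_a - \mu_c} H_{\alpha_a \alpha_c} H_{\alpha_c \alpha_b} + \frac{2}{\mu_a} H_{\alpha_a \alpha_b} H_{\alpha_b \alpha_b} - \frac{2}{\mu_a} H_{\alpha_a \alpha_b} \Pi_+(H_{\alpha_b \alpha_b}) \\
&= W_{\alpha_a \alpha_b} + 2 \sum_{c \in \mathcal{I}_-} \frac{1}{\mu_a - \mu_c} H_{\alpha_a \alpha_c} H_{\alpha_c \alpha_b} - 2 \frac{1}{\mu_a} H_{\alpha_a \alpha_b} \Pi_+(-H_{\alpha_b \alpha_b}).
\end{aligned}
\]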
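As a sanity check, the merged Case (1) formula can be compared against a finite-difference approximation. The sketch below is an illustration under simplifying assumptions that are not part of the proof: $Z$ is diagonal with distinct, nonzero eigenvalues (so $\mathcal{I}_0$ is empty, every index block has size one, and the projection is smooth near $Z$), and the parabolic second-order directional derivative is approximated by a central difference along $Z \pm tH + \frac{1}{2} t^2 W$. The helper names are hypothetical.

```python
import numpy as np

def psd_proj(M):
    """Project a symmetric matrix onto the PSD cone via an eigendecomposition."""
    w, V = np.linalg.eigh(M)
    return (V * np.maximum(w, 0.0)) @ V.T

def case1_formula(mu, H, W):
    """Merged Case (1) block for Z = diag(mu):
    W_ab + 2 * sum_{c in I_-} (-mu_c) / ((mu_c - mu_a)(mu_c - mu_b)) * H_ac * H_cb.

    Assumption: the entries of mu are distinct and nonzero.
    """
    pos = np.flatnonzero(mu > 0)
    neg = np.flatnonzero(mu < 0)
    out = W[np.ix_(pos, pos)].copy()
    for c in neg:
        denom = np.outer(mu[c] - mu[pos], mu[c] - mu[pos])
        out += 2.0 * (-mu[c]) / denom * np.outer(H[pos, c], H[c, pos])
    return out

def parabolic_second_difference(Z, H, W, t=1e-4):
    """Central parabolic difference; it approximates Pi''(Z; H, W)
    when the projection is smooth near Z (no zero eigenvalue)."""
    plus = psd_proj(Z + t * H + 0.5 * t**2 * W)
    minus = psd_proj(Z - t * H + 0.5 * t**2 * W)
    return (plus + minus - 2.0 * psd_proj(Z)) / t**2

rng = np.random.default_rng(0)
mu = np.array([3.0, 1.5, -1.0, -2.5])          # two positive, two negative eigenvalues
A = rng.standard_normal((4, 4)); H = (A + A.T) / 2
B = rng.standard_normal((4, 4)); W = (B + B.T) / 2
pos = np.flatnonzero(mu > 0)
numeric = parabolic_second_difference(np.diag(mu), H, W)[np.ix_(pos, pos)]
gap = np.max(np.abs(numeric - case1_formula(mu, H, W)))
print("max abs deviation on the (+,+) block:", gap)
```

The central difference cancels the first-order term, so no separate formula for $\Pi'_{\mathbb{S}^n_+}(Z; H)$ is needed. The check does not exercise the nonsmooth cases with $\mathcal{I}_0 \ne \emptyset$, where a one-sided limit $t \downarrow 0$ would be required.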
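Case (3): $a \in \mathcal{I}_+$, $b \in \mathcal{I}_-$. In this case, $f^{[1]}(\mu_a, \mu_b) = \frac{\mu_a}{\mu_a - \mu_b}$, and
\[
f^{[2]}(\mu_a, \mu_b, \mu_c) = \begin{cases}
\dfrac{1 - \frac{-\mu_c}{\mu_b - \mu_c}}{\mu_a - \mu_b} = \dfrac{-\mu_b}{(\mu_b - \mu_a)(\mu_b - \mu_c)}, & c \in \mathcal{I}_+ \setminus \{a\}, \\[2ex]
\dfrac{1}{\mu_a - \mu_b}, & c \in \mathcal{I}_0, \\[2ex]
\dfrac{\frac{\mu_a}{\mu_a - \mu_c}}{\mu_a - \mu_b} = \dfrac{\mu_a}{(\mu_a - \mu_b)(\mu_a - \mu_c)}, & c \in \mathcal{I}_- \setminus \{b\}.
\end{cases}
\]
Thus,
\[
\begin{aligned}
\Gamma_1(H, W)_{\alpha_a \alpha_b}
&= \frac{\mu_a}{\mu_a - \mu_b} W_{\alpha_a \alpha_b} + 2 \sum_{c \in \mathcal{I}_+ \setminus \{a\}} \frac{-\mu_b}{(\mu_b - \mu_a)(\mu_b - \mu_c)} H_{\alpha_a \alpha_c} H_{\alpha_c \alpha_b} + 2 \sum_{c \in \mathcal{I}_0} \frac{1}{\mu_a - \mu_b} H_{\alpha_a \alpha_c} H_{\alpha_c \alpha_b} \\
&\quad + 2 \sum_{c \in \mathcal{I}_- \setminus \{b\}} \frac{\mu_a}{(\mu_a - \mu_b)(\mu_a - \mu_c)} H_{\alpha_a \alpha_c} H_{\alpha_c \alpha_b} - \frac{2 \mu_a}{(\mu_a - \mu_b)^2} \left( H_{\alpha_a \alpha_a} H_{\alpha_a \alpha_b} - H_{\alpha_a \alpha_b} H_{\alpha_b \alpha_b} \right).
\end{aligned}
\]
For $\Gamma_2(H, W)_{\alpha_a \alpha_b}$: since $\Phi_b(H_{\alpha_b \alpha_b}) = 0$, we have
\[
\Gamma_2(H, W)_{\alpha_a \alpha_b} = 2 \frac{1}{\mu_a - \mu_b} H_{\alpha_a \alpha_a} H_{\alpha_a \alpha_b}.
\]
$\Gamma_3(H, W)_{\alpha_a \alpha_b} = 0$ since $a \ne b$. Thus,
\[
\begin{aligned}
\Pi''_{\mathbb{S}^n_+}(Z; H, W)_{\alpha_a \alpha_b}
&= \frac{\mu_a}{\mu_a - \mu_b} W_{\alpha_a \alpha_b} + 2 \sum_{c \in \mathcal{I}_+ \setminus \{a\}} \frac{-\mu_b}{(\mu_b - \mu_a)(\mu_b - \mu_c)} H_{\alpha_a \alpha_c} H_{\alpha_c \alpha_b} + 2 \sum_{c \in \mathcal{I}_0} \frac{1}{\mu_a - \mu_b} H_{\alpha_a \alpha_c} H_{\alpha_c \alpha_b} \\
&\quad + 2 \sum_{c \in \mathcal{I}_- \setminus \{b\}} \frac{\mu_a}{(\mu_a - \mu_b)(\mu_a - \mu_c)} H_{\alpha_a \alpha_c} H_{\alpha_c \alpha_b} + 2 \frac{-\mu_b}{(\mu_a - \mu_b)^2} H_{\alpha_a \alpha_a} H_{\alpha_a \alpha_b} + 2 \frac{\mu_a}{(\mu_a - \mu_b)^2} H_{\alpha_a \alpha_b} H_{\alpha_b \alpha_b} \\
&= \frac{\mu_a}{\mu_a - \mu_b} W_{\alpha_a \alpha_b} + 2 \sum_{c \in \mathcal{I}_+} \frac{-\mu_b}{(\mu_b - \mu_a)(\mu_b - \mu_c)} H_{\alpha_a \alpha_c} H_{\alpha_c \alpha_b} + 2 \sum_{c \in \mathcal{I}_0} \frac{1}{\mu_a - \mu_b} H_{\alpha_a \alpha_c} H_{\alpha_c \alpha_b} \\
&\quad + 2 \sum_{c \in \mathcal{I}_-} \frac{\mu_a}{(\mu_a - \mu_b)(\mu_a - \mu_c)} H_{\alpha_a \alpha_c} H_{\alpha_c \alpha_b}.
\end{aligned}
\]
Case (4): $a \in \mathcal{I}_0$, $b \in \mathcal{I}_0$. This case implies $a = b$. For $\Gamma_1(H, W)_{\alpha_a \alpha_a}$:
\[
\Gamma_1(H, W)_{\alpha_a \alpha_a} = 2 \sum_{c \in \mathcal{I} \setminus \{a\}} \frac{f(\mu_c)}{\mu_c^2} H_{\alpha_a \alpha_c} H_{\alpha_c \alpha_a} = 2 \sum_{c \in \mathcal{I}_+} \frac{1}{\mu_c} H_{\alpha_a \alpha_c} H_{\alpha_c \alpha_a}.
\]
For $\Gamma_2(H, W)_{\alpha_a \alpha_a}$,
\[
[f'(\mu_a; \cdot)]^{[1]}(\eta_{a,i}, \eta_{a,j}) = \frac{\max\{\eta_{a,i}, 0\} - \max\{\eta_{a,j}, 0\}}{\eta_{a,i} - \eta_{a,j}}.
\]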
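Thus,
\[
\Omega^a_{\beta_{a,i}, \beta_{a,j}} = \begin{cases}
\dfrac{\max\{\eta_{a,i}, 0\} - \max\{\eta_{a,j}, 0\}}{\eta_{a,i} - \eta_{a,j}} E_{|\beta_{a,i}| \times |\beta_{a,j}|}, & i \ne j, \\
0, & i = j.
\end{cases}
\]
For $\Gamma_3(H, W)_{\alpha_a \alpha_a}$, since
\[
f''(\mu_a; \eta_{a,i}, \zeta_{a,i,i'}) = \begin{cases}
\zeta_{a,i,i'}, & \eta_{a,i} > 0, \\
\max\{\zeta_{a,i,i'}, 0\}, & \eta_{a,i} = 0, \\
0, & \eta_{a,i} < 0,
\end{cases}
\]
we have
\[
\Psi_{a,i}(\hat V_a^{i,i}) = \begin{cases}
\hat V_a^{i,i}, & i \in \mathcal{I}_{a,+}, \\
\Pi_+(\hat V_a^{i,i}), & i \in \mathcal{I}_{a,0}, \\
0, & i \in \mathcal{I}_{a,-}.
\end{cases}
\]
Now we simplify $\Gamma_2(H, W)_{\alpha_a \alpha_a} + \Gamma_3(H, W)_{\alpha_a \alpha_a}$. Notice that from (11), $\Pi'_+(H_{\alpha_a \alpha_a}; V_a) = Q^a \Upsilon^a (Q^a)^T$, where
\[
\Upsilon^a_{\beta_{a,i} \beta_{a,j}} = \begin{cases}
\dfrac{\max\{\eta_{a,i}, 0\} - \max\{\eta_{a,j}, 0\}}{\eta_{a,i} - \eta_{a,j}} \hat V_a^{i,j} = \Omega^a_{\beta_{a,i} \beta_{a,j}} \circ \hat V_a^{i,j}, & i \ne j, \\
\Psi_{a,i}(\hat V_a^{i,i}), & i = j.
\end{cases}
\]
Therefore, $\Gamma_2(H, W)_{\alpha_a \alpha_a} + \Gamma_3(H, W)_{\alpha_a \alpha_a} = \Pi'_+(H_{\alpha_a \alpha_a}; V_a)$. Consequently,
\[
\Pi''_{\mathbb{S}^n_+}(Z; H, W)_{\alpha_a \alpha_a} = 2 \sum_{c \in \mathcal{I}_+} \frac{1}{\mu_c} H_{\alpha_a \alpha_c} H_{\alpha_c \alpha_a} + \Pi'_+\left( H_{\alpha_a \alpha_a}; V_a(H, W) \right),
\]
where
\[
V_a(H, W) = W_{\alpha_a \alpha_a} + 2 \sum_{c \in \mathcal{I} \setminus \{a\}} \frac{1}{\mu_a - \mu_c} H_{\alpha_a \alpha_c} H_{\alpha_c \alpha_a}
= W_{\alpha_a \alpha_a} - 2 \sum_{c \in \mathcal{I}_+} \frac{1}{\mu_c} H_{\alpha_a \alpha_c} H_{\alpha_c \alpha_a} + 2 \sum_{c \in \mathcal{I}_-} \frac{1}{-\mu_c} H_{\alpha_a \alpha_c} H_{\alpha_c \alpha_a}.
\]
Case (5): $a \in \mathcal{I}_0$, $b \in \mathcal{I}_-$. In this case, $\mu_a = 0$, $f^{[1]}(\mu_a, \mu_b) = 0$, and
\[
f^{[2]}(\mu_a, \mu_b, \mu_c) = \begin{cases}
\dfrac{-\mu_b}{(\mu_b - \mu_a)(\mu_b - \mu_c)} = \dfrac{1}{\mu_c - \mu_b}, & c \in \mathcal{I}_+, \\[1.5ex]
\dfrac{\mu_a}{(\mu_a - \mu_b)(\mu_a - \mu_c)} = 0, & c \in \mathcal{I}_- \setminus \{b\}.
\end{cases}
\]
Thus,
\[
\Gamma_1(H, W)_{\alpha_a \alpha_b} = 2 \sum_{c \in \mathcal{I}_+} \frac{1}{\mu_c - \mu_b} H_{\alpha_a \alpha_c} H_{\alpha_c \alpha_b}.
\]
For $\Gamma_2(H, W)_{\alpha_a \alpha_b}$, $\Phi_a(H_{\alpha_a \alpha_a}) = \Pi_+(H_{\alpha_a \alpha_a})$ and $\Phi_b(H_{\alpha_b \alpha_b}) = 0$. Thus,
\[
\Gamma_2(H, W)_{\alpha_a \alpha_b} = \frac{2}{-\mu_b} \Pi_+(H_{\alpha_a \alpha_a}) H_{\alpha_a \alpha_b}.
\]
$\Gamma_3(H, W)_{\alpha_a \alpha_b} = 0$ since $a \ne b$. Thus,
\[
\Pi''_{\mathbb{S}^n_+}(Z; H, W)_{\alpha_a \alpha_b} = 2 \sum_{c \in \mathcal{I}_+} \frac{1}{\mu_c - \mu_b} H_{\alpha_a \alpha_c} H_{\alpha_c \alpha_b} + 2 \frac{1}{-\mu_b} \Pi_+(H_{\alpha_a \alpha_a}) H_{\alpha_a \alpha_b}.
\]
Case (6)(i): $a \in \mathcal{I}_-$, $b \in \mathcal{I}_-$ and $a \ne b$. In this case, $\Gamma_2(H, W)_{\alpha_a \alpha_b} = \Gamma_3(H, W)_{\alpha_a \alpha_b} = 0$, $f(\mu_a) = f(\mu_b) = 0$.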
Thus,
\[
\begin{aligned}
\Pi''_{\mathbb{S}^n_+}(Z; H, W)_{\alpha_a \alpha_b} = \Gamma_1(H, W)_{\alpha_a \alpha_b}
&= 2 \sum_{c \in \mathcal{I}_+} f^{[2]}(\mu_a, \mu_b, \mu_c) H_{\alpha_a \alpha_c} H_{\alpha_c \alpha_b} \\
&= 2 \sum_{c \in \mathcal{I}_+} \frac{\frac{\mu_c}{\mu_c - \mu_a} - \frac{\mu_c}{\mu_c - \mu_b}}{\mu_a - \mu_b} H_{\alpha_a \alpha_c} H_{\alpha_c \alpha_b}
= 2 \sum_{c \in \mathcal{I}_+} \frac{\mu_c}{(\mu_c - \mu_a)(\mu_c - \mu_b)} H_{\alpha_a \alpha_c} H_{\alpha_c \alpha_b}.
\end{aligned}
\]
Case (6)(ii): $a \in \mathcal{I}_-$, $b \in \mathcal{I}_-$ and $a = b$. In this case, $\Gamma_2(H, W)_{\alpha_a \alpha_b} = \Gamma_3(H, W)_{\alpha_a \alpha_b} = 0$, $\Omega^a = 0$, and $\Psi_{a,i}(\hat V_a^{i,i}) = 0$. Thus,
\[
\Pi''_{\mathbb{S}^n_+}(Z; H, W)_{\alpha_a \alpha_a} = \Gamma_1(H, W)_{\alpha_a \alpha_a}
= -2 \sum_{c \in \mathcal{I} \setminus \{a\}} \frac{f(\mu_a) - f(\mu_c)}{(\mu_a - \mu_c)^2} H_{\alpha_a \alpha_c} H_{\alpha_c \alpha_a}
= 2 \sum_{c \in \mathcal{I}_+} \frac{\mu_c}{(\mu_a - \mu_c)^2} H_{\alpha_a \alpha_c} H_{\alpha_c \alpha_a}.
\]
Clearly, the results of Case (6)(i) and Case (6)(ii) can be merged. This closes the proof of Theorem 2.

References

[1] Farid Alizadeh, Jean-Pierre A Haeberly, and Michael L Overton. Complementarity and nondegeneracy in semidefinite programming. Mathematical Programming, 77(1):111–128, 1997.
[2] Farid Alizadeh, Jean-Pierre A Haeberly, and Michael L Overton. Primal-dual interior-point methods for semidefinite programming: convergence rates, stability and numerical results. SIAM Journal on Optimization, 8(3):746–768, 1998.
[3] Mosek ApS. MOSEK optimization toolbox for MATLAB. User's Guide and Reference Manual, Version, 4(1), 2019.
[4] Jean-Bernard Baillon. On the asymptotic behavior of nonexpansive mappings and semigroups in Banach spaces. Houston Journal of Mathematics, 4:1–9, 1978.
[5] Chenglong Bao, Chao Ding, Fuxiaoyue Feng, and Jingyu Li. Stratification for nonlinear semidefinite programming. arXiv preprint arXiv:2601.08362, 2026.
[6] Heinz H. Bauschke and Patrick L. Combettes. Convex Analysis and Monotone Operator Theory in Hilbert Spaces. Springer, 2 edition, 2017.
[7] Heinz H Bauschke, Hui Ouyang, and Xianfu Wang. On angles between convex sets in Hilbert spaces. Journal of Mathematical Analysis and Applications, 502(1):125239, 2021.
[8] Richard Caron and Tim Traynor. The zero set of a polynomial.
WSMR Report, pages 05–02, 2005.
[9] Jack Carr. Applications of centre manifold theory, volume 35. Springer Science & Business Media, 2012.
[10] Zi Xian Chan and Defeng Sun. Constraint nondegeneracy, strong regularity, and nonsingularity in semidefinite programming. SIAM Journal on Optimization, 19(1):370–396, 2008.
[11] Liang Chen, Defeng Sun, and Kim-Chuan Toh. An efficient inexact symmetric Gauss–Seidel based majorized ADMM for high-dimensional convex composite conic programming. Mathematical Programming, 161(1):237–270, 2017.
[12] Ying Cui, Defeng Sun, and Kim-Chuan Toh. On the asymptotic superlinear convergence of the augmented Lagrangian method for semidefinite programming with multiple solutions. arXiv preprint arXiv:1610.00875, 2016.
[13] Michael Dellnitz and Oliver Junge. On the approximation of complicated dynamical behavior. SIAM Journal on Numerical Analysis, 36(2):491–515, 1999.
[14] Chao Ding, Defeng Sun, Jie Sun, and Kim-Chuan Toh. Spectral operators of matrices. Mathematical Programming, 168:509–531, 2018.
[15] Lijun Ding and Madeleine Udell. On the simplicity and conditioning of low rank semidefinite programs. SIAM Journal on Optimization, 31(4):2614–2637, 2021.
[16] Lijun Ding and Stephen J Wright. On squared-variable formulations for nonlinear semidefinite programming. arXiv preprint arXiv:2502.02099, 2025.
[17] Fuxiaoyue Feng, Chao Ding, and Xudong Li. A quadratically convergent semismooth Newton method for nonlinear semidefinite programming without generalized Jacobian regularity. Mathematical Programming, pages 1–41, 2025.
[18] Michael Garstka, Mark Cannon, and Paul Goulart. COSMO: A conic operator splitting method for convex conic problems. Journal of Optimization Theory and Applications, 190(3):779–810, 2021.
[19] Deren Han, Defeng Sun, and Liwei Zhang. Linear rate convergence of the alternating direction method of multipliers for convex composite programming.
Mathematics of Operations Research, 43(2):622–637, 2018.
[20] Qiushi Han, Zhenwei Lin, Hanwen Liu, Caihua Chen, Qi Deng, Dongdong Ge, and Yinyu Ye. Accelerating low-rank factorization-based semidefinite programming algorithms on GPU. arXiv preprint arXiv:2407.15049, 2024.
[21] Nicholas J Higham. Computing a nearest symmetric positive semidefinite matrix. Linear Algebra and its Applications, 103:103–118, 1988.
[22] Xin Jiang and Lieven Vandenberghe. Bregman primal–dual first-order method and application to sparse semidefinite programming. Computational Optimization and Applications, 81(1):127–159, 2022.
[23] Shucheng Kang, Haoyu Han, Antoine Groudiev, and Heng Yang. Factorization-free orthogonal projection onto the positive semidefinite cone with composite polynomial filtering. arXiv preprint arXiv:2507.09165, 2025.
[24] Shucheng Kang, Xin Jiang, and Heng Yang. Local linear convergence of the alternating direction method of multipliers for semidefinite programming under strict complementarity. arXiv preprint arXiv:2503.20142, 2025.
[25] Shucheng Kang, Guorui Liu, and Heng Yang. Global contact-rich planning with sparsity-rich semidefinite relaxations. In Robotics: Science and Systems (RSS), 2025.
[26] Shucheng Kang, Xiaoyang Xu, Jay Sarva, Ling Liang, and Heng Yang. Fast and certifiable trajectory optimization. In International Workshop on the Algorithmic Foundations of Robotics (WAFR), 2024.
[27] Jean B Lasserre. Global optimization with polynomials and the problem of moments. SIAM Journal on Optimization, 11(3):796–817, 2001.
[28] Adrian S Lewis. Derivatives of spectral functions. Mathematics of Operations Research, 21(3):576–588, 1996.
[29] Adrian S Lewis. Active sets, nonsmoothness, and sensitivity. SIAM Journal on Optimization, 13(3):702–725, 2002.
[30] Adrian S Lewis and Hristo S Sendov. Twice differentiable spectral functions.
SIAM Journal on Matrix Analysis and Applications, 23(2):368–386, 2001.
[31] Yongfeng Li, Zaiwen Wen, Chao Yang, and Ya-xiang Yuan. A semismooth Newton method for semidefinite programs and its applications in electronic structure calculations. SIAM Journal on Scientific Computing, 40(6):A4131–A4157, 2018.
[32] Jingwei Liang, Jalal Fadili, and Gabriel Peyré. Local convergence properties of Douglas–Rachford and alternating direction method of multipliers. Journal of Optimization Theory and Applications, 172(3):874–913, 2017.
[33] Jingwei Liang, Jalal M Fadili, and Gabriel Peyré. Local linear convergence of forward–backward under partial smoothness. In Conference on Neural Information Processing Systems (NeurIPS), volume 27, 2014.
[34] Feng-Yi Liao, Lijun Ding, and Yang Zheng. Inexact augmented Lagrangian methods for conic optimization: Quadratic growth and linear convergence. In Advances in Neural Information Processing Systems, volume 37, pages 41013–41050, 2024.
[35] Yulan Liu and Shaohua Pan. Second-order optimality conditions for mathematical program with semidefinite cone complementarity constraints and applications. Set-Valued and Variational Analysis, 30(2):373–395, 2022.
[36] Haihao Lu and Jinwen Yang. On the geometry and refined rate of primal–dual hybrid gradient for linear programming. Mathematical Programming, 212(1):349–387, 2025.
[37] Zhi-Quan Luo, Jos F Sturm, and Shuzhong Zhang. Superlinear convergence of a symmetric primal-dual path following algorithm for SDP. In Advances in Nonlinear Programming: Proceedings of the 96 International Conference on Nonlinear Programming, pages 283–297. Springer, 1998.
[38] Pablo A Parrilo. Semidefinite programming relaxations for semialgebraic problems. Mathematical Programming, 96(2):293–320, 2003.
[39] Amnon Pazy. Asymptotic behavior of contractions in Hilbert space.
Israel Journal of Mathematics, 9(2):235–240, 1971.
[40] Lawrence Perko. Differential equations and dynamical systems, volume 7. Springer Science & Business Media, 2013.
[41] R Tyrrell Rockafellar and Roger JB Wets. Variational analysis. Springer, 1998.
[42] Nikitas Rontsis, Paul Goulart, and Yuji Nakatsukasa. Efficient semidefinite programming with approximate ADMM. Journal of Optimization Theory and Applications, 192(1):292–320, 2022.
[43] Ernest K Ryu and Wotao Yin. Large-scale convex optimization: algorithms & analyses via monotone operators. Cambridge University Press, 2022.
[44] Alexander Shapiro. First and second order analysis of nonlinear semidefinite programs. Mathematical Programming, 77(1):301–320, 1997.
[45] Stefan Sremac, Hugo J Woerdeman, and Henry Wolkowicz. Error bounds and singularity degree in semidefinite programming. SIAM Journal on Optimization, 31(1):812–836, 2021.
[46] Jos F. Sturm. Using SeDuMi 1.02, a MATLAB toolbox for optimization over symmetric cones. Optimization Methods and Software, 11(1-4):625–653, 1999.
[47] Jos F Sturm. Error bounds for linear matrix inequalities. SIAM Journal on Optimization, 10(4):1228–1248, 2000.
[48] Defeng Sun. The strong second-order sufficient condition and constraint nondegeneracy in nonlinear semidefinite programming and their implications. Mathematics of Operations Research, 31(4):761–776, 2006.
[49] Defeng Sun and Jie Sun. Semismooth matrix-valued functions. Mathematics of Operations Research, 27(1):150–169, 2002.
[50] Reha H Tütüncü, Kim-Chuan Toh, and Michael J Todd. Solving semidefinite-quadratic-linear programs using SDPT3. Mathematical Programming, 95:189–217, 2003.
[51] Hayato Waki, Maho Nakata, and Masakazu Muramatsu. Strange behaviors of interior-point methods for solving semidefinite programming problems in polynomial optimization. Computational Optimization and Applications, 53(3):823–844, 2012.
[52] Jie Wang, Liangbing Hu, and Bican Xia. A dual Riemannian ADMM algorithm for low-rank SDPs with unit diagonal. arXiv preprint arXiv:2512.04406, 2025.
[53] Jie Wang, Victor Magron, and Jean-Bernard Lasserre. TSSOS: A moment-SOS hierarchy that exploits term sparsity. SIAM Journal on Optimization, 31(1):30–58, 2021.
[54] Zaiwen Wen, Donald Goldfarb, and Wotao Yin. Alternating direction augmented Lagrangian methods for semidefinite programming. Mathematical Programming Computation, 2(3):203–230, 2010.
[55] Zikai Xiong. Accessible complexity bounds for restarted PDHG on linear programs with a unique optimizer. arXiv preprint arXiv:2410.04043, 2024.
[56] Heng Yang and Luca Carlone. Certifiably optimal outlier-robust geometric perception: Semidefinite relaxations and scalable global optimization. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 45(3):2816–2834, 2022.
[57] Liuqin Yang, Defeng Sun, and Kim-Chuan Toh. SDPNAL+: A majorized semismooth Newton-CG augmented Lagrangian method for semidefinite programming with nonnegative constraints. Mathematical Programming Computation, 7(3):331–366, 2015.
[58] Xiaoming Yuan, Shangzhi Zeng, and Jin Zhang. Discerning the linear convergence of ADMM for structured convex optimization through the lens of variational analysis. Journal of Machine Learning Research, 21(83):1–75, 2020.
[59] Liwei Zhang, Ning Zhang, and Xiantao Xiao. On the second-order directional derivatives of singular values of matrices and symmetric matrix-valued functions. Set-Valued and Variational Analysis, 21(3):557–586, 2013.
[60] Yang Zheng, Giovanni Fantuzzi, Antonis Papachristodoulou, Paul Goulart, and Andrew Wynn. Fast ADMM for homogeneous self-dual embedding of sparse SDPs. IFAC-PapersOnLine, 50(1):8411–8416, 2017.