Faster Approximate Lossy Generalized Flow via Interior Point Algorithms


Authors: Samuel I. Daitch, Daniel A. Spielman

Faster Lossy Generalized Flow via Interior Point Algorithms∗

Samuel I. Daitch
Department of Computer Science
Yale University

Daniel A. Spielman
Program in Applied Mathematics and Department of Computer Science
Yale University

May 30, 2018

Abstract

We present asymptotically faster approximation algorithms for the generalized flow problems in which multipliers on edges are at most 1. For this lossy version of the maximum generalized flow problem, we obtain an additive $\epsilon$ approximation of the maximum flow in time $\tilde{O}(m^{3/2} \log^2 (U/\epsilon))$, where $m$ is the number of edges in the graph, all capacities are integers in the range $\{1, \ldots, U\}$, and all loss multipliers are ratios of integers in this range. For minimum cost lossy generalized flow with costs in the range $\{1, \ldots, U\}$, we obtain a flow that has value within an additive $\epsilon$ of the maximum value and cost at most the optimal cost. In many parameter ranges, these algorithms improve over the previously fastest algorithms for the generalized maximum flow problem by a factor of $m^{1/2}$, and for the minimum cost generalized flow problem by a factor of approximately $m^{1/2}/\epsilon^2$.

The algorithms work by accelerating traditional interior point algorithms by quickly solving the linear equations that arise in each step. The contributions of this paper are twofold. First, we analyze the performance of interior point algorithms with approximate linear system solvers. This analysis alone provides an algorithm for the standard minimum cost flow problem that runs in time $\tilde{O}(m^{3/2} \log^2 U)$, an improvement of approximately $\tilde{O}(n/m^{1/2})$ over previous algorithms. Second, we examine the linear equations that arise when using an interior point algorithm to solve generalized flow problems.
We observe that these belong to the family of symmetric M-matrices, and we then develop $\tilde{O}(m)$-time algorithms for solving linear systems in these matrices. These algorithms reduce the problem of solving a linear system in a symmetric M-matrix to that of solving $O(\log n)$ linear systems in symmetric diagonally-dominant matrices, which we can do in time $\tilde{O}(m)$ using the algorithm of Spielman and Teng. All of our algorithms operate on numbers of bit length at most $O(\log(nU/\epsilon))$.

∗This material is based upon work supported by the National Science Foundation under Grant Nos. CCF-0707522 and CCF-0634957. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.

1 Introduction

Interior-point algorithms are one of the most popular ways of solving linear programs. These algorithms are iterative, and their complexity is dominated by the cost of solving a system of linear equations at each iteration. Typical complexity analyses of interior point algorithms apply worst-case bounds on the running time of linear equation solvers. However, in most applications the linear equations that arise are quite special and may be solved by faster algorithms. Each family of optimization problems leads to a family of linear equations. For example, the maximum flow and minimum cost flow problems require the solution of linear systems whose matrices are symmetric and diagonally-dominant. The generalized versions of these flow problems result in symmetric M-matrices.

The generalized maximum flow problem is specified by a directed graph $(V, E)$, an inward capacity $c(e) > 0$ and a multiplier $\gamma(e) > 0$ for each edge $e$, and source and sink vertices $s$ and $t$. For every unit flowing into edge $e$, $\gamma(e)$ flows out.
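As a small numeric illustration of this rule (toy data; the path and its multipliers are invented for the example): flow pushed along a path is attenuated by the product of the $\gamma$'s of its edges.

```python
# Lossy-flow semantics: for every unit entering an edge e, gamma(e) units
# leave it.  The 3-edge path and its multipliers are made up for the example.
from functools import reduce

def delivered(flow_in, gammas):
    """Flow reaching the end of a path whose edges have multipliers `gammas`."""
    return reduce(lambda f, g: f * g, gammas, flow_in)

gammas = [1.0, 0.5, 0.75]
print(delivered(8.0, gammas))  # -> 3.0

# The loss factor of the path (the product of the gamma^{-1}) says how much
# must enter at the start per unit delivered at the end; about 2.67 here.
L = reduce(lambda acc, g: acc / g, gammas, 1.0)
print(L)
```

This is exactly the quantity $L(v)$ used later for the least-lossy-paths tree.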
In lossy generalized flow problems, each multiplier $\gamma(e)$ is restricted to be at most 1. In the generalized maximum flow problem, one is asked to find the flow $f : E \to \mathbb{R}^+$ that maximizes the flow into $t$ given an unlimited supply at $s$, subject to the capacity constraints on the amount of flow entering each edge. In the generalized minimum cost flow problem, one also has a cost function $q(e) \geq 0$, and is asked to find the maximum flow of minimum cost (see [AMO93]).

In the following chart, we compare the complexity of our algorithms with the fastest algorithms of which we are aware. The running times are given for networks in which all capacities and costs are positive integers less than $U$ and every loss factor is a ratio of two integers less than $U$. For the standard flow problems, our algorithms are exact, but for the generalized flow problems our algorithms find additive $\epsilon$ approximations, while the other approximation algorithms have multiplicative error $(1 + \epsilon)$. However, we note that our algorithms only require arithmetic with numbers of bit-length $O(\log(nU/\epsilon))$, whereas we suspect that the algorithms obtaining multiplicative approximations might require much longer numbers. In the chart, $C$ refers to the value of the flow.

Generalized Maximum Flow
  Exact: $O(m^2 (m + n \log n) \log U)$ [GJO97]; $O(m^{1.5} n^2 \log(nU))$ [Vai89]
  Approximation: $\tilde{O}(m^2/\epsilon^2)$ [FW02]; $\tilde{O}(m (m + n \log\log B) \log \epsilon^{-1})$ [GFNR98][TW98][FW02]
  Our algorithm: $\tilde{O}(m^{1.5} \log^2(U/\epsilon))$

Generalized Minimum Cost Flow
  Exact: $O(m^{1.5} n^2 \log(nU))$ [Vai89]
  Approximation: $\tilde{O}(m^2 \log\log B / \epsilon^2)$ [FW02]
  Our algorithm: $\tilde{O}(m^{1.5} \log^2(U/\epsilon))$

Maximum Flow
  Exact: $O(\min(n^{2/3}, m^{1/2})\, m \log(n^2/m) \log U)$ [GR98]
  Our algorithm: $\tilde{O}(m^{1.5} \log^2 U)$

Minimum Cost Flow
  Exact: $O(nm \log(n^2/m) \log(nC))$ [GT87]; $O(nm (\log\log U) \log(nC))$ [AGOT92]; $O((m \log n)(m + n \log n))$ [Orl88]
  Our algorithm: $\tilde{O}(m^{1.5} \log^2 U)$

1.1 The solution of systems in M-matrices

A symmetric matrix $M$ is diagonally dominant if each diagonal is at least the sum of the absolute values of the other entries in its row. A symmetric matrix $M$ is an M-matrix if there is a positive diagonal matrix $D$ for which $DMD$ is diagonally dominant. Spielman and Teng [ST04, ST06] showed how to solve linear systems in diagonally dominant matrices to $\epsilon$ accuracy in time $\tilde{O}(m \log \epsilon^{-1})$. We show how to solve linear systems in M-matrices by first computing a diagonal matrix $D$ for which $DMD$ is diagonally dominant, and then applying the solver of Spielman and Teng. Our algorithm for finding the matrix $D$ applies the solver of Spielman and Teng an expected $O(\log n)$ times. While iterative algorithms are known that eventually produce such a diagonal matrix $D$, they have no satisfactory complexity analysis [Li02, LLH+98, BCPT05].

1.2 Analysis of interior point methods

In our analysis of interior-point methods, we examine the complexity of the short-step dual path following algorithm of Renegar [Ren88] as analyzed by Ye [Ye97]. The key observations required by our complexity analysis are that none of the slack variables become too small during the course of the algorithm, and that the algorithm still works if one $O(1/\sqrt{m})$-approximately solves each linear system in the matrix norm (defined below). Conveniently, this is the same type of approximation produced by our algorithm and that of Spielman and Teng. This is a very crude level of approximation, and it means that these algorithms can be applied very quickly. While other analyses of the behavior of interior point methods with inexact solvers have appeared [Ren96], we are unaware of any analyses that are sufficiently fine for our purposes.
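These definitions are easy to probe numerically. The sketch below (toy data; a dense direct solve stands in for the paper's nearly-linear-time machinery) exhibits a symmetric M-matrix that is not diagonally dominant as given, and a scaling $D$ that makes $DMD$ diagonally dominant, here taken as $D = \mathrm{diag}(x)$ with $Mx = \mathbf{1}$, which works because $M^{-1}$ is nonnegative and $DMD$ then has positive row sums with non-positive off-diagonals.

```python
# Toy check of the definitions in Section 1.1 (2x2 example; a dense solve
# stands in for the approximate solvers used in the paper).
import numpy as np

def is_diag_dominant(M):
    """Each diagonal at least the sum of |other entries| in its row."""
    d = np.abs(np.diag(M))
    off = np.abs(M).sum(axis=1) - d
    return bool(np.all(np.diag(M) >= off))

# A symmetric M-matrix (positive definite, non-positive off-diagonals)
# that is NOT diagonally dominant as given:
M = np.array([[0.1, -0.3],
              [-0.3, 1.0]])
assert not is_diag_dominant(M)

# x = M^{-1} 1 is positive, and D = diag(x) makes D M D diagonally
# dominant: its row sums equal D M x = x > 0.
x = np.linalg.solve(M, np.ones(2))
D = np.diag(x)
assert (x > 0).all() and is_diag_dominant(D @ M @ D)
print(np.round(D @ M @ D, 6))  # approximately [[1690, -1560], [-1560, 1600]]
```

The paper's actual algorithm finds such a $D$ with an expected $O(\log n)$ calls to an approximate diagonally-dominant solver rather than a direct solve.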
This analysis is given in detail in Appendix C.

1.3 Outline of the paper

In Section 2, we describe the results of our analysis of interior point methods using approximate solvers. In Section 3, we describe the formulation of the generalized flow problems as linear programs, and discuss how to obtain the solutions from the output of an interior-point algorithm. In Section 4, we give our algorithm for solving linear systems in M-matrices.

2 Interior-Point Algorithm using an Approximate Solver

Our algorithm uses numerical methods to solve a linear program formulation of the generalized flow problems. The fastest interior-point methods for linear programs, such as that of Renegar [Ren88], require only $O(\sqrt{n})$ iterations to approach the solution, where each iteration takes a step through the convex polytope by solving a system of linear equations. In this paper, we consider stepping through the linear program using only an approximate solver, i.e., an algorithm $x = \text{Solve}(M, b, \epsilon)$ that returns a solution satisfying
$$\|x - M^{-1} b\|_M \leq \epsilon \, \|M^{-1} b\|_M,$$
where the matrix norm $\|\cdot\|_M$ is given by $\|v\|_M = \sqrt{v^T M v}$. As mentioned above, we have analyzed the Renegar [Ren88] version of the dual path-following algorithm, along the lines of the analysis found in [Ye97], but modified to account for the use of an approximate solver. In particular, using the approximate solver we implement an interior-point algorithm with the following properties:

Theorem 2.1. $x = \text{InteriorPoint}(A, b, c, \lambda_{\min}, T, y^0, \epsilon)$ takes input that satisfies

• $A$ is an $n \times m$ matrix; $b$ is a length $n$ vector; $c$ is a length $m$ vector
• $AA^T$ is positive definite, and $\lambda_{\min} > 0$ is a lower bound on the eigenvalues of $AA^T$
• $T > 0$ is an upper bound on the absolute values of the coordinates in the dual linear program, i.e.
$\|y\|_\infty \leq T$ and $\|s\|_\infty \leq T$ for all $(y, s)$ that satisfy $s = c - A^T y \geq 0$
• the initial point $y^0$ is a length $n$ vector where $A^T y^0 < c$
• the error parameter $\epsilon$ satisfies $0 < \epsilon < 1$

and returns $x > 0$ such that $\|Ax - b\| \leq \epsilon$ and $c^T x < z^* + \epsilon$, where $z^*$ is the optimal value of the linear program. Let us define

• $U$ is the largest absolute value of any entry in $A$, $b$, $c$
• $s^0_{\min}$ is the smallest entry of $s^0 = c - A^T y^0$

Then the algorithm makes $O\left(\sqrt{m} \log \frac{T U m}{\lambda_{\min} s^0_{\min} \epsilon}\right)$ calls to the approximate solver, of the form $\text{Solve}\left(A S^{-2} A^T + v v^T, \cdot, \epsilon'\right)$, where $S$ is a positive diagonal matrix with condition number $O\left(\frac{T^2 U m^2}{\epsilon}\right)$, and $v$, $\epsilon'$ satisfy
$$\log \frac{\|v\|}{\epsilon'} = O\left(\log \frac{T U m}{s^0_{\min} \epsilon}\right).$$

In Appendix C, we present a complete description of this algorithm, with analysis and proof of correctness.

3 Solving Generalized Flow

We consider network flows on a directed graph $(V, E)$ with $V = [n]$, $E = \{e_1, \ldots, e_m\}$, source $s \in V$ and sink $t \in V$. Edge $e_j$ goes from vertex $v_j$ to vertex $w_j$, and has inward capacity $c(e_j)$, flow multiplier $\gamma(e_j) < 1$, and cost $q(e_j)$. We assume without loss of generality that $t$ has a single in-edge, which we denote as $e_t$, and no out-edges.

The generalized max-flow approximation algorithm will produce a flow that sends no worse than $\epsilon$ less than the maximum possible flow to the sink. The generalized min-cost approximation algorithm will produce a flow that, in addition to being within $\epsilon$ of a maximum flow, also has cost no greater than the minimum cost of a maximum flow (see [FW02]).

3.1 Fixing Approximate Flows

The interior-point algorithm described in the previous section produces an output that may not exactly satisfy the linear constraints $Ax = b$. In particular, when we apply the algorithm to a network flow linear program, the output may only be an approximate flow:

Definition 3.1.
An $\epsilon$-approximate flow approximately satisfies all capacity constraints and flow conservation constraints. In particular, every edge may have flow up to $\epsilon$ over capacity, and every vertex besides $s$ and $t$ may have up to $\epsilon$ excess or deficit flow. An exact flow satisfies all capacity constraints and has exact flow conservation at all vertices except $s$ and $t$.

We are going to modify the graph slightly before running the interior-point algorithm, so that it will be easier to obtain an exact flow from the approximate flow given by the interior-point algorithm.

Let us compute the least-lossy-paths tree $T$ rooted at $s$. This is the tree that contains, for each $v \in V - \{s, t\}$, the path $\pi_{s,v}$ from $s$ to $v$ that minimizes $L(v) = \prod_{e \in \pi_{s,v}} \gamma(e)^{-1}$, the factor by which the flow along the path is diminished. We can find this tree in time $\tilde{O}(m)$, using Dijkstra's algorithm to solve the single-source shortest-paths problem with edge weights $-\log \gamma(e)$.

Next, we delete from the graph all vertices $v$ such that $L(v) > 2mnU/\epsilon$. Note that in a maximum flow, it is not possible to have more than $\epsilon/2n$ flowing into such a $v$, since at most $mU$ can flow out of $s$. Thus, deleting each such $v$ cannot decrease the value of the maximum flow by more than $\epsilon/2n$. In total, we may decrease the value of the maximum flow by at most $\epsilon/2$.

We define $\epsilon_{FLOW} = \frac{\epsilon^2}{64 m^2 n^2 U^3}$. In the subsequent sections, we show how to use the interior-point method to obtain an $\epsilon_{FLOW}$-approximate flow that has a value within $\epsilon/4$ of the maximum flow. Assuming that the graph has been preprocessed as above, we may convert the approximate flow into an exact flow:

Lemma 3.2. Suppose all vertices $v \in V - \{s, t\}$ satisfy $L(v) \leq 2mnU/\epsilon$.
In $\tilde{O}(m)$ time, we are able to convert an $\epsilon_{FLOW}$-approximate flow that has a value within $\epsilon/4$ of the maximum flow into an exact flow that has a value within $\epsilon/2$ of the maximum flow. The cost of this exact flow is no greater than the cost of the approximate flow.

Proof. Let us first fix the flows so that no vertex has more flow out than in. We use the least-lossy-paths tree $T$, starting at the leaves of the tree and working towards $s$. To balance the flow at a vertex $v$ we increase the flow on the tree edge into $v$. After completing this process, for each $v$ we will have added a path of flow that delivers at most $\frac{\epsilon^2}{64 m^2 n^2 U^3}$ additional units of flow to $v$. Since $L(v) \leq 2mnU/\epsilon$, no such path requires more than $\frac{\epsilon^2}{64 m^2 n^2 U^3} \cdot \frac{2mnU}{\epsilon} = \frac{\epsilon}{32 m n U^2}$ flow on an edge, and so in total we have added no more than $\frac{\epsilon}{32 m U^2}$ to each edge.

Next, let us fix the flows so that no vertex has more flow in than out. We follow a similar procedure as above, except now we may use any spanning tree rooted at and directed towards $t$. Starting from the leaves, we balance the vertices by increasing flow out the tree edge. Since the network is lossy, the total amount added to each edge is at most $\frac{\epsilon^2}{64 m^2 n^2 U^3} \cdot n \leq \frac{\epsilon^2}{64 m^2 n U^3}$.

Recall that we started with each edge having flow up to $\frac{\epsilon^2}{64 m^2 n^2 U^3}$ over capacity. After balancing the flows at the vertices, each edge may now be over capacity by as much as
$$\frac{\epsilon}{32 m U^2} + \frac{\epsilon^2}{64 m^2 n U^3} + \frac{\epsilon^2}{64 m^2 n^2 U^3} \leq \frac{\epsilon}{16 m U^2}.$$
Since the edge capacities are at least 1, the flow on an edge may be as much as $(1 + \frac{\epsilon}{16 m U^2})$ times the capacity. Furthermore, while balancing the flows we may have added as much as $\frac{\epsilon}{16 m U^2} \cdot mU = \frac{\epsilon}{16 U}$ to the total cost of the flow.
Assuming that the value of the approximate flow was at least $\epsilon/4$, its cost must also have been at least $\epsilon/4$, and so we have increased the cost by a multiplicative factor of at most $(1 + \frac{\epsilon}{4U})$. (If the approximate flow had value less than $\epsilon/4$, then the empty flow trivially solves this flow rounding problem.) By scaling the entire flow down by a multiplicative factor of $(1 + \frac{\epsilon}{4U})^{-1}$, we solve the capacity violations, and also reduce the cost of the exact flow to be no greater than that of the approximate flow. Since the value of a flow can be at most $U$, the flow scaling decreases the value of the flow by no more than $\epsilon/4$, as required. □

The above procedure produces an exact flow that is within $\epsilon/2$ of the maximum flow in the preprocessed graph, and therefore is within $\epsilon$ of the maximum flow in the original graph. Furthermore, the cost of the flow is no greater than the minimum cost of a maximum flow in the original graph. Thus to solve a generalized flow problem, it remains for us to describe how to use the interior-point algorithm to generate an $\epsilon_{FLOW}$-approximate flow that has a value within $\epsilon/4$ of the maximum flow, and, for the min-cost problem, also has cost no greater than the minimum cost of a maximum flow.

3.2 Generalized Max-Flow

We formulate the maximum flow problem as a linear program as follows. Let $A$ be the $(n-2) \times m$ matrix whose nonzero entries are $A_{v_j, j} = -1$ and $A_{w_j, j} = \gamma(e_j)$, but without the rows corresponding to $s$ and $t$. Let $c$ be the length $m$ vector containing the edge capacities. Let $u_t$ be the length $m$ unit vector with a 1 entry for edge $e_t$. Let the vectors $x_1$ and $x_2$ respectively denote the flow into each edge and the unused inward capacity of each edge. The max-flow linear program, in canonical form, is:
$$\min_{x_i} \; -u_t^T x_1 \quad \text{s.t.}$$
$$\begin{bmatrix} A & 0 \\ I & I \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 0 \\ c \end{bmatrix} \quad \text{and} \quad x_i \geq 0.$$
The constraint $A x_1 = 0$ ensures that flow is conserved at every vertex except $s$ and $t$, while the constraint $x_1 + x_2 = c$ ensures that the capacities are obeyed.

Now, the dual of the above linear program is not bounded, which is a problem for our interior-point algorithm. To fix this, we modify the linear program slightly:
$$\min_{x_i} \; \left( -u_t^T x_1 + \frac{4U}{\epsilon_{FLOW}} \left( \mathbf{1}_m^T x_3 + \mathbf{1}_{n-2}^T x_4 + \mathbf{1}_{n-2}^T x_5 \right) \right) \quad \text{s.t.} \quad x_i \geq 0 \quad \text{and}$$
$$\begin{bmatrix} A & 0 & 0 & I & -I \\ I & I & -I & 0 & 0 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \end{bmatrix} = \begin{bmatrix} 0 \\ c \end{bmatrix}.$$
(We use $\mathbf{1}_k$ to denote the all-ones vector of length $k$.)

Lemma 3.3. This modified linear program has the same optimum value as the original linear program.

Proof. Let us examine the new variables in the modified program and note that $x_3$ has the effect of modifying the capacities, while $x_4$ and $x_5$ create excess or deficit of flow at the vertices. Since we have a lossy network, a unit modification of any of these values cannot change the value of the flow by more than 1, and therefore must increase the value of the modified linear program. Thus, at the optimum we have $x_3 = x_4 = x_5 = 0$, and so the solution is the same as that of the original linear program. □

The modified linear program has the following equivalent dual linear program:
$$\max_{y_i} \; c^T y_2 \quad \text{s.t.} \quad s_i \geq 0 \quad \text{and} \quad \begin{bmatrix} A^T & I \\ 0 & I \\ 0 & -I \\ I & 0 \\ -I & 0 \end{bmatrix} \begin{bmatrix} y_1 \\ y_2 \end{bmatrix} + \begin{bmatrix} s_1 \\ s_2 \\ s_3 \\ s_4 \\ s_5 \end{bmatrix} = \begin{bmatrix} -u_t \\ 0 \\ (4U/\epsilon_{FLOW}) \cdot \mathbf{1}_m \\ (4U/\epsilon_{FLOW}) \cdot \mathbf{1}_{n-2} \\ (4U/\epsilon_{FLOW}) \cdot \mathbf{1}_{n-2} \end{bmatrix}.$$

Lemma 3.4. The above dual linear program is bounded. In particular, the coordinates of all feasible dual points have absolute value at most $(nU + 1) \cdot \frac{4U}{\epsilon_{FLOW}} + 1$.

Proof. Of the five constraints in the dual linear program, the last four give $\frac{4U}{\epsilon_{FLOW}}$ as an explicit bound on the absolute value of the $y$ coordinates.
It then follows that $\frac{8U}{\epsilon_{FLOW}}$ is an upper bound on the coordinates of $s_2, s_3, s_4, s_5$, and the coordinates of $s_1 = -u_t - A^T y_1 - y_2$ can be at most $(nU + 1) \cdot \frac{4U}{\epsilon_{FLOW}} + 1$. □

We refer to the $s_i$ variables as the slacks. Recall that we must provide the interior-point algorithm with an initial dual feasible point $y^0$ such that the corresponding slacks $s^0$ are bounded away from zero. We choose the following initial point, and note that the slacks are bounded from below by $\frac{U}{\epsilon_{FLOW}}$:
$$\begin{bmatrix} y_1^0 \\ y_2^0 \end{bmatrix} = \begin{bmatrix} 0 \\ -(2U/\epsilon_{FLOW}) \cdot \mathbf{1}_m \end{bmatrix} \qquad \begin{bmatrix} s_1^0 \\ s_2^0 \\ s_3^0 \\ s_4^0 \\ s_5^0 \end{bmatrix} = \begin{bmatrix} (2U/\epsilon_{FLOW}) \cdot \mathbf{1}_m - u_t \\ (2U/\epsilon_{FLOW}) \cdot \mathbf{1}_m \\ (2U/\epsilon_{FLOW}) \cdot \mathbf{1}_m \\ (4U/\epsilon_{FLOW}) \cdot \mathbf{1}_{n-2} \\ (4U/\epsilon_{FLOW}) \cdot \mathbf{1}_{n-2} \end{bmatrix}$$

We must also provide the interior-point algorithm with a lower bound on the eigenvalues of the matrix
$$\begin{bmatrix} A & 0 & 0 & I & -I \\ I & I & -I & 0 & 0 \end{bmatrix} \begin{bmatrix} A^T & I \\ 0 & I \\ 0 & -I \\ I & 0 \\ -I & 0 \end{bmatrix} = \begin{bmatrix} AA^T + 2I & A \\ A^T & 3I \end{bmatrix}.$$
Note that we may subtract $2I$ from the above matrix and still have a positive semidefinite matrix, so $\lambda_{\min} = 2$ is certainly a lower bound on the eigenvalues.

Using the above values for $y^0$ and $\lambda_{\min}$, and the bound on the dual coordinates given in Lemma 3.4, we now call InteriorPoint on the modified max-flow linear program, using error parameter $\frac{\epsilon_{FLOW}}{2}$. In the solution returned by the interior-point algorithm, the vector $x_1$ assigns a flow value to each edge such that the flow constraints are nearly satisfied:

Lemma 3.5. $x_1$ is an $\epsilon_{FLOW}$-approximate flow with value within $\epsilon_{FLOW}/2$ of the maximum flow.

Proof. Observe that the amount flowing into $t$ is at least $-1$ times the value of the modified linear program.
Since the interior-point algorithm generates a solution to the modified linear program within $\epsilon_{FLOW}/2$ of the optimum value, which is $-1$ times the maximum flow, the amount flowing into $t$ surely must be within $\epsilon_{FLOW}/2$ of the maximum flow.

Now, let us note more precisely that the modified linear program aims to minimize the objective function computed by subtracting the amount flowing into $t$ from $4U/\epsilon_{FLOW}$ times the sum of the entries of $x_3$, $x_4$, and $x_5$. Since the minimum value of this objective function must be negative, and the solution returned by the interior-point algorithm has a value within $\epsilon_{FLOW}/2$ of the minimum, the value of this solution must be less than $\epsilon_{FLOW}/2 < U$. The amount flowing into $t$ is also at most $U$, so no entry of $x_3, x_4, x_5$ can be greater than $2U/(4U/\epsilon_{FLOW}) = \epsilon_{FLOW}/2$.

The interior-point algorithm guarantees that
$$\|A x_1 + x_4 - x_5\| < \frac{\epsilon_{FLOW}}{2} \quad \text{and} \quad \|x_1 + x_2 - x_3 - c\| < \frac{\epsilon_{FLOW}}{2},$$
and so we may conclude that
$$\|A x_1\| < \epsilon_{FLOW} \quad \text{and} \quad x_1 \leq c + \epsilon_{FLOW}.$$
Indeed, this is precisely what it means for $x_1$ to describe an $\epsilon_{FLOW}$-approximate flow. □

3.3 Generalized Min-Cost Flow

As a first step in solving the generalized min-cost flow problem, we solve the generalized max-flow linear program as described above, to find a value $F$ that is within $\epsilon/8$ of the maximum flow. We now formulate a linear program for finding the minimum cost flow that delivers $F$ units of flow to $t$:
$$\min_{x_i} \; q^T x_1 \quad \text{s.t.} \quad x_i \geq 0 \quad \text{and} \quad \begin{bmatrix} A & 0 \\ I & I \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} F \cdot e_t \\ c \end{bmatrix}$$
where $q$ is the length $m$ vector containing the edge costs, and $e_t$ is the length $n-1$ vector that assigns 1 to vertex $t$ and 0 to every other vertex besides $s$ (which has no row). Here $A$ is the same matrix as in the max-flow linear program, except that we include the row corresponding to $t$, which translates to a new constraint that $F$ units must flow into $t$.
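To make the block structure of these constraints concrete, here is a toy instance (a sketch; the two-edge network and all of its numbers are invented for illustration): a path $s \to a \to t$ with multipliers $1/2$, where pushing 4 units into the first edge delivers $F = 1$ unit at $t$.

```python
# Toy instance of the min-cost LP constraints.  Network (invented):
# vertices s, a, t; edge e1 = s->a with gamma = 1/2 and capacity 4,
# edge e2 = a->t with gamma = 1/2 and capacity 2.
import numpy as np

# A has a row per vertex except s (here: a, t) and a column per edge,
# with -1 at the tail and gamma(e) at the head of each edge.
A = np.array([[0.5, -1.0],   # row a: e1 arrives (gamma), e2 departs
              [0.0,  0.5]])  # row t: e2 arrives (gamma)
c = np.array([4.0, 2.0])     # inward capacities
e_t = np.array([0.0, 1.0])   # indicator of the sink's row
F = 1.0                      # flow value delivered at t

Abar = np.block([[A, np.zeros((2, 2))],
                 [np.eye(2), np.eye(2)]])

x1 = np.array([4.0, 2.0])    # flow into each edge
x2 = c - x1                  # unused inward capacity
rhs = np.concatenate([F * e_t, c])
assert np.allclose(Abar @ np.concatenate([x1, x2]), rhs)
print("feasible:", True)
```

The top block enforces conservation (with $F$ units arriving at $t$), the bottom block enforces the capacities, exactly as in the LP above.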
We must again modify the linear program so that the dual will be bounded:
$$\min_{x_i} \; \left( q^T x_1 + \frac{4mU^2}{\epsilon_{FLOW}} \left( \mathbf{1}_m^T x_3 + \mathbf{1}_{n-1}^T x_4 + \mathbf{1}_{n-1}^T x_5 \right) \right) \quad \text{s.t.} \quad x_i \geq 0 \quad \text{and}$$
$$\begin{bmatrix} A & 0 & 0 & I & -I \\ I & I & -I & 0 & 0 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \end{bmatrix} = \begin{bmatrix} F \cdot e_t \\ c \end{bmatrix}.$$

Lemma 3.6. This modified linear program has the same optimum value as the original linear program.

Proof. We examine the new variables and note that $x_3$ modifies the capacities, while $x_4$ and $x_5$ create excess supply (or demand) at the vertices. A unit modification to any of these values can at best create a new path for one unit of flow to arrive at the sink. This new path has cost at least 1, and it can replace a path in the optimum flow of cost at most $nU$, for a net improvement in the cost of the flow of at most $nU - 1$, which is less than $\frac{4mU^2}{\epsilon_{FLOW}}$. Thus the value of the modified linear program can only increase when these new variables are set to non-zero values. □

Now, the dual linear program is:
$$\max_{y_i} \; \left( F \cdot e_t^T y_1 + c^T y_2 \right) \quad \text{s.t.} \quad s_i \geq 0 \quad \text{and} \quad \begin{bmatrix} A^T & I \\ 0 & I \\ 0 & -I \\ I & 0 \\ -I & 0 \end{bmatrix} \begin{bmatrix} y_1 \\ y_2 \end{bmatrix} + \begin{bmatrix} s_1 \\ s_2 \\ s_3 \\ s_4 \\ s_5 \end{bmatrix} = \begin{bmatrix} q \\ 0 \\ (4mU^2/\epsilon_{FLOW}) \cdot \mathbf{1}_m \\ (4mU^2/\epsilon_{FLOW}) \cdot \mathbf{1}_{n-1} \\ (4mU^2/\epsilon_{FLOW}) \cdot \mathbf{1}_{n-1} \end{bmatrix}.$$

Lemma 3.7. The above dual linear program is bounded. In particular, the coordinates of all feasible dual points have absolute value at most $(nU + 1) \cdot \frac{4mU^2}{\epsilon_{FLOW}}$.

Proof. Of the five constraints in the dual linear program, the last four give $\frac{4mU^2}{\epsilon_{FLOW}}$ as an explicit bound on the absolute value of the $y$ coordinates. It then follows that $\frac{8mU^2}{\epsilon_{FLOW}}$ is an upper bound on the coordinates of $s_2, s_3, s_4, s_5$, and the coordinates of $s_1 = q - A^T y_1 - y_2$ can be at most $(nU + 1) \cdot \frac{4mU^2}{\epsilon_{FLOW}}$. □

Let us also note that
$$y^0 = \begin{bmatrix} 0 \\ -(mU^2/\epsilon_{FLOW}) \cdot \mathbf{1}_m \end{bmatrix}$$
is an initial interior dual point with all slacks at least $\frac{mU^2}{\epsilon_{FLOW}}$.
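This initial dual point is easy to check numerically. In the sketch below (sizes, costs, and the matrix $A$ are invented; the slack blocks follow the five dual constraint rows above), taking $y_1 = 0$ and $y_2 = -(mU^2/\epsilon_{FLOW}) \mathbf{1}_m$ leaves every slack at least $mU^2/\epsilon_{FLOW}$.

```python
# Check of the initial dual point for the min-cost LP on a toy instance
# (all data invented for illustration).
import numpy as np

m, n, U, eps = 2, 3, 4, 0.5
eps_flow = eps**2 / (64 * m**2 * n**2 * U**3)
K = m * U**2 / eps_flow          # the claimed slack lower bound
B = 4 * m * U**2 / eps_flow      # the penalty bound in the modified LP

A = np.array([[0.5, -1.0],
              [0.0,  0.5]])      # (n-1) x m constraint matrix (toy)
q = np.array([1.0, 2.0])         # edge costs (toy)

y1 = np.zeros(n - 1)
y2 = -K * np.ones(m)

s = np.concatenate([
    q - (A.T @ y1 + y2),         # s1 = q - A^T y1 - y2
    -y2,                         # s2
    B * np.ones(m) + y2,         # s3
    B * np.ones(n - 1) - y1,     # s4
    B * np.ones(n - 1) + y1,     # s5
])
assert s.min() >= K
print("smallest slack / K =", s.min() / K)
```

The tight block is $s_2 = -y_2$, which sits exactly at the bound $mU^2/\epsilon_{FLOW}$.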
Using the above initial point, the bound on the dual coordinates from Lemma 3.7, and $\lambda_{\min} = 2$ as in the previous section, we run InteriorPoint on the modified min-cost linear program, with error parameter $\frac{\epsilon_{FLOW}}{2}$. In the solution returned by the interior-point algorithm, the vector $x_1$ assigns a flow value to each edge such that the flow constraints are nearly satisfied:

Lemma 3.8. $x_1$ is an $\epsilon_{FLOW}$-approximate flow with value within $\frac{5\epsilon}{32}$ of the maximum flow.

Proof. Note that any flow in total cannot cost more than $mU^2$, even if all edges are filled to maximum capacity. Therefore the value of the solution output by the interior-point algorithm can be at most $mU^2 + \frac{\epsilon_{FLOW}}{2} < 2mU^2$, and so in particular no entry of $x_3, x_4, x_5$ can be greater than $\frac{\epsilon_{FLOW}}{2}$. Now, the interior-point algorithm guarantees that
$$\|A x_1 + x_4 - x_5 - F \cdot e_t\| < \frac{\epsilon_{FLOW}}{2} \quad \text{and} \quad \|x_1 + x_2 - x_3 - c\| < \frac{\epsilon_{FLOW}}{2},$$
and so we may conclude that
$$\|A x_1 - F \cdot e_t\| < \epsilon_{FLOW} \quad \text{and} \quad x_1 \leq c + \epsilon_{FLOW}.$$
These inequalities imply that this is an $\epsilon_{FLOW}$-approximate flow, and additionally that at least $F - \epsilon_{FLOW}$ is flowing into $t$. Since $F$ is within $\epsilon/8$ of the maximum flow, the amount flowing into $t$ must be within $\frac{\epsilon}{8} + \epsilon_{FLOW} < \frac{5\epsilon}{32}$ of the maximum flow. □

By scaling down the $x_1$ flow slightly, we obtain a flow that does not exceed the minimum cost of a maximum flow:

Lemma 3.9. $x_1' = (1 - \frac{\epsilon}{12U}) x_1$ is an $\epsilon_{FLOW}$-approximate flow with value within $\epsilon/4$ of the maximum flow, and with cost at most the minimum cost of a maximum flow.

Proof. We may assume that the value of the flow $x_1$ is at least $\frac{3\epsilon}{32}$, because otherwise the maximum flow would have to be at most $\frac{3\epsilon}{32} + \frac{5\epsilon}{32} = \frac{\epsilon}{4}$, and so the empty flow would trivially be within $\frac{\epsilon}{4}$ of the maximum. Therefore, the minimum cost of a maximum flow must also be at least $\frac{3\epsilon}{32}$.
The interior-point algorithm guarantees that the cost of $x_1$ does not exceed this optimum cost by more than $\frac{\epsilon_{FLOW}}{2}$, and so must also not exceed the optimum cost by a multiplicative factor of more than $(1 + \frac{16 \epsilon_{FLOW}}{3\epsilon}) < (1 + \frac{\epsilon}{12U})$. Thus, $x_1' = (1 - \frac{\epsilon}{12U}) x_1$ must have cost below the optimum. Furthermore, since the value of the flow $x_1$ can be at most $U$, scaling down by $(1 - \frac{\epsilon}{12U})$ cannot decrease the value of the flow by more than $\frac{\epsilon}{12}$. Therefore, the value of $x_1'$ is within $\frac{\epsilon}{12} + \frac{5\epsilon}{32} < \frac{\epsilon}{4}$ of the maximum. □

3.4 Running Time

The constraint matrices in the above linear programs take the form
$$\bar{A} = \begin{bmatrix} A & 0 & 0 & I & -I \\ I & I & -I & 0 & 0 \end{bmatrix},$$
so the running time of the interior-point method depends on our ability to approximately solve systems of the form $\bar{A} S^{-2} \bar{A}^T + v v^T$, where the diagonal matrix $S$ and vector $v$ are as described in Theorem 2.1. As it turns out, this is not much more difficult than solving a linear system in $A S_1^{-2} A^T$, where $S_1$ is the upper-left submatrix of $S$.

The matrix $A S_1^{-2} A^T$ is a symmetric M-matrix. In the next section, we describe how to approximately solve systems in such matrices in expected time $\tilde{O}\left(m \log \frac{\kappa}{\epsilon}\right)$, where $\kappa$ is the condition number of the matrix. We then extend this result to solve the systems $\bar{A} S^{-2} \bar{A}^T + v v^T$ in time $\tilde{O}\left(m \log \frac{\kappa \|v\|}{\epsilon}\right)$, where $\kappa$ is the condition number of $\bar{A} S^{-2} \bar{A}^T$.

Theorem 3.10. Using our interior-point algorithm, we can solve the generalized max-flow and generalized min-cost flow problems in time $\tilde{O}\left(m^{3/2} \log^2(U/\epsilon)\right)$.

Proof. According to Theorem 2.1, the interior-point algorithm requires $O\left(\sqrt{m} \log \frac{T U m}{\lambda_{\min} s^0_{\min} \epsilon}\right)$ calls to the solver. Recall that $T$ is a bound on the coordinates of the dual linear program, and $s^0_{\min}$ is the smallest slack at the initial point.
Above, we gave both of these values to be polynomial in $\frac{mU}{\epsilon}$, for both the max-flow and min-cost linear programs. We also gave $\lambda_{\min} = 2$ as a lower bound on the eigenvalues of $\bar{A}\bar{A}^T$. Thus, the total number of solves is $\tilde{O}\left(\sqrt{m} \log \frac{U}{\epsilon}\right)$.

Again referring to Theorem 2.1, we find that the condition number of $\bar{A} S^{-2} \bar{A}^T$ is polynomial in $\frac{mU}{\epsilon}$, as is the expression $\frac{\|v\|}{\epsilon}$. We conclude that each solve takes time $\tilde{O}\left(m \log \frac{U}{\epsilon}\right)$. The preprocessing only took time $\tilde{O}(m)$, so we obtain a total running time of $\tilde{O}\left(m^{3/2} \log^2(U/\epsilon)\right)$. □

3.5 Standard Min-Cost Flow

In this section we describe how to use interior-point algorithms to give an exact solution to the standard (i.e., no multipliers on edges) min-cost flow problem. We use the following property of the standard flow problem:

Theorem 3.11 (see [Sch03, Theorem 13.20]). Given a flow network with integer capacities, and a positive integer $F$, let $\Omega_{FLOW}$ be the set of flow vectors $x$ that flow $F$ units into $t$ and satisfy all capacity and flow conservation constraints. Then $\Omega_{FLOW}$ is a convex polytope in which all vertices have integer coordinates.

Our goal is to find the flow in $\Omega_{FLOW}$ of minimum cost. Since the cost function is linear, if there is a unique minimum-cost flow of value $F$, it must occur at a vertex of $\Omega_{FLOW}$. By Theorem 3.11 this must be an integer flow, and we could find this flow exactly by running the interior-point algorithm until it is clear to which integer flow we are converging. Unfortunately, the minimum-cost flow may not be unique. However, by applying the Isolation Lemma of Mulmuley, Vazirani, and Vazirani [MVV87], we can modify the cost function slightly so that the minimum-cost flow is unique, and is also a minimum-cost flow under the original cost function.
Let us first state a modified version of the Isolation Lemma:

Lemma 3.12 (see [KS01, Lemma 4]). Given any collection of linear functions on $m$ variables with integer coefficients in the range $\{0, \ldots, U\}$: if each variable is independently set uniformly at random to a value from the set $\{0, \ldots, 2mU\}$, then with probability at least $1/2$ there is a unique function in the collection that takes minimum value.

We now describe how to force the minimum-cost flow to be unique:

Lemma 3.13. Given a flow network with capacities and costs in the set $\{1, 2, \ldots, U\}$, and a positive integer $F$, modify the cost of each edge independently by adding a number uniformly at random from the set $\left\{ \frac{1}{4m^2U^2}, \frac{2}{4m^2U^2}, \ldots, \frac{2mU}{4m^2U^2} \right\}$. Then with probability at least $1/2$, the modified network has a unique minimum-cost flow of value $F$, and this flow is also a minimum-cost flow of value $F$ in the original network.

Proof. The modified cost of a flow at a vertex of $\Omega_{FLOW}$ is a linear function of $m$ independent variables chosen uniformly at random from the set $\left\{ \frac{1}{4m^2U^2}, \frac{2}{4m^2U^2}, \ldots, \frac{2mU}{4m^2U^2} \right\}$, where the coefficients are the coordinates of the flow vector, which by Theorem 3.11 are integers in the range $\{0, \ldots, U\}$. So the Isolation Lemma tells us that with probability at least $1/2$, there is a unique vertex of $\Omega_{FLOW}$ with minimum modified cost.

Now, any vertex that was not originally of minimum cost must have been more expensive than the minimum cost by an integer. Since the sum of the flows on all edges can be at most $mU$, and no edge had its cost increased by more than $\frac{1}{2mU}$, the total cost of any flow cannot have increased by more than $1/2$. Thus, a vertex that was not originally of minimum cost cannot have minimum modified cost. □
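The perturbation of Lemma 3.13 is easy to simulate (a sketch with made-up cost data; the candidate integer vectors below merely stand in for vertices of $\Omega_{FLOW}$): several flows tied under the original costs almost always have a unique minimizer after perturbation.

```python
# Empirical check of the isolation idea: perturb each of m cost
# coefficients by a random multiple of 1/(4 m^2 U^2); ties among
# candidate integer flows are then usually broken.  All data invented.
import random

def unique_minimizer(flows, costs, m, U, rng):
    delta = 1.0 / (4 * m**2 * U**2)
    # each edge cost perturbed independently, as in Lemma 3.13
    perturbed = [q + rng.randint(1, 2 * m * U) * delta for q in costs]
    vals = [sum(f * p for f, p in zip(flow, perturbed)) for flow in flows]
    return vals.count(min(vals)) == 1

rng = random.Random(0)
m, U = 4, 3
costs = [1, 2, 2, 1]
# distinct integer "flows", all with the same original cost (4 here):
flows = [(1, 0, 0, 3), (0, 2, 0, 0), (2, 1, 0, 0), (0, 0, 1, 2)]
trials = 1000
isolated = sum(unique_minimizer(flows, costs, m, U, rng) for _ in range(trials))
print(f"unique minimizer in {isolated}/{trials} trials")
```

The lemma only promises probability at least $1/2$; on small examples like this the empirical rate is typically much higher.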
We may now give an exact algorithm for standard minimum-cost flow. Note that this algorithm works for any integer flow value; in particular, we may easily find the exact max-flow value by running the interior-point max-flow algorithm with an error of $1/2$, since we know the max-flow value is an integer.

Lemma 3.14. To solve the standard minimum-cost flow problem in expected time $\widetilde{O}\big(m^{3/2}\log^2 U\big)$, perturb the edge costs as in Lemma 3.13, then run the min-cost flow interior-point algorithm with an error of $\frac{1}{12m^2U^3}$, and round the flow on each edge to the nearest integer.

Proof. Let us prove correctness assuming that the modified costs do isolate a unique minimum-cost flow. The running time then follows directly from Theorem 3.10, and the fact from Lemma 3.13 that after a constant number of tries we can expect the modified costs to yield a unique minimum-cost flow. We first note that the modified edge costs are integer multiples of $\delta = \frac{1}{4m^2U^2}$. Therefore, by Theorem 3.11 the cost of the minimum-cost flow is at least $\delta$ less than the cost at any other vertex of $\Omega_{FLOW}$. Now, the flow returned by the interior-point algorithm can be expressed as a weighted average of the vertices of $\Omega_{FLOW}$. Since the cost of this flow is within $\frac{1}{12m^2U^3} = \frac{\delta}{3U}$ of the minimum cost, this weighted average must assign a combined weight of at most $\frac{1}{3U}$ to the non-minimum-cost vertices. Therefore, the flow along any edge differs by at most $1/3$ from the minimum-cost flow. So by rounding to the nearest integer flow, we obtain the minimum-cost flow.

4 Solving linear systems in symmetric M-matrices

A symmetric $M$-matrix is a positive definite symmetric matrix with non-positive off-diagonals (see, e.g., [HJ91, Axe96, BP94]).
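The perturbation argument of Lemma 3.13 can be illustrated on a toy instance. The sketch below (not the paper's implementation; the vertex set and costs are hypothetical data) perturbs integer edge costs by multiples of $\delta = \frac{1}{4m^2U^2}$ and checks that the minimizer under the perturbed costs is still a minimum-cost vertex under the original costs, since the total perturbation is below $1/2$:

```python
import random
from fractions import Fraction

def perturb_costs(costs, m, U):
    """Perturbation of Lemma 3.13: add to each edge cost an independent
    uniform multiple of delta = 1/(4 m^2 U^2); the largest addition is
    2mU * delta = 1/(2mU), so total cost changes by less than 1/2."""
    delta = Fraction(1, 4 * m * m * U * U)
    return [c + delta * random.randint(1, 2 * m * U) for c in costs]

# Toy example: vertices of Omega_FLOW given explicitly as integer flow
# vectors for a hypothetical 3-edge network; all three tie in cost.
vertices = [(2, 0, 1), (1, 1, 1), (0, 2, 1)]
costs = [1, 1, 3]            # integer edge costs in {1, ..., U}
m, U = 3, 3

random.seed(0)
new_costs = perturb_costs(costs, m, U)

def total(c, x):
    return sum(ci * xi for ci, xi in zip(c, x))

orig_min = min(total(costs, v) for v in vertices)
best = min(vertices, key=lambda v: total(new_costs, v))

# The perturbed minimizer is still a minimum-cost vertex under the
# original integer costs.
assert total(costs, best) == orig_min
```

With probability at least $1/2$ the perturbed minimizer is also unique; the assertion above holds deterministically because the perturbations sum to less than the integrality gap.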
Every $M$-matrix has a factorization of the form $M = AA^T$ where each column of $A$ has at most 2 nonzero entries [BCPT05]. Given such a factorization of an $M$-matrix, we will show how to solve linear systems in the $M$-matrix in nearly-linear time. Throughout this section, $M$ will be an $n \times n$ symmetric $M$-matrix and $A$ will be an $n \times m$ matrix with at most 2 nonzero entries per column such that $M = AA^T$. Note that $M$ has $O(m)$ non-zero entries.

Our algorithm will make use of the Spielman-Teng $\widetilde{O}(m)$ expected-time approximate solver for linear systems in symmetric diagonally-dominant matrices, where we recall that a symmetric matrix is diagonally-dominant if each diagonal entry is at least the sum of the absolute values of the other entries in its row. It is strictly diagonally-dominant if each diagonal entry exceeds the corresponding sum.

We will use the following standard facts about symmetric $M$-matrices, which can be found, for example, in [HJ91]:

Fact 4.1. If $M = \begin{pmatrix} M_{11} & M_{12} \\ M_{12}^T & M_{22} \end{pmatrix}$ is a symmetric $M$-matrix with $M_{11}$ a principal minor, then:
1. $M$ is invertible and $M^{-1}$ is a nonnegative matrix.
2. $M_{12}$ is a nonpositive matrix.
3. $M_{11}$ is an $M$-matrix.
4. The Schur complement $S = M_{22} - M_{12}^T M_{11}^{-1} M_{12}$ is an $M$-matrix.
5. If all eigenvalues of $M$ fall in the range $[\lambda_{\min}, \lambda_{\max}]$, then so do all diagonal entries of $S$.
6. For any positive diagonal matrix $D$, $DMD$ is an $M$-matrix.
7. There exists a positive diagonal matrix $D$ such that $DMD$ is strictly diagonally-dominant.

Our algorithm works by finding a diagonal matrix $D$ for which $DMD$ is diagonally-dominant, providing us with a system to which we may apply the solver of Spielman and Teng. The algorithm builds $D$ by an iterative process. In each iteration, it decreases the number of rows that are not dominated by their diagonals by an expected constant factor.
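A minimal numerical sketch of the setup above, on toy data of my own choosing: an $M$-matrix built as $M = AA^T$ with at most 2 nonzeros per column of $A$, together with one concrete scaling $D$ witnessing Fact 4.1.7 (here $D = \mathrm{diag}(M^{-1}\mathbf{1})$, which is positive by Fact 4.1.1 and makes all row sums of $DMD$ positive):

```python
import numpy as np

# Toy factor matrix: each column has at most 2 nonzeros, so M = A A^T
# is an M-matrix in the sense of [BCPT05].
A = np.array([
    [ 2.0,  1.0, 0.0],
    [-1.0,  0.0, 1.5],
    [ 0.0, -0.5, 0.0],
])
M = A @ A.T

# M is symmetric positive definite with non-positive off-diagonals:
assert np.allclose(M, M.T)
assert np.all(np.linalg.eigvalsh(M) > 0)
off = M - np.diag(np.diag(M))
assert np.all(off <= 1e-12)

# Fact 4.1.7: a positive diagonal D makes D M D diagonally dominant.
# Take d = M^{-1} 1 (> 0 since M^{-1} is nonnegative and invertible);
# then the row sums of D M D are D (M d) = d * 1 > 0, and since the
# off-diagonals stay non-positive, positive row sums imply dominance.
d = np.linalg.solve(M, np.ones(3))
assert np.all(d > 0)
DMD = np.diag(d) @ M @ np.diag(d)
assert np.all(DMD.sum(axis=1) > 0)
```

The paper's algorithm cannot afford to compute $M^{-1}\mathbf{1}$ directly; the iterative construction described next finds a suitable $D$ using only diagonally-dominant solves.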
The main step of each iteration involves the solution of $O(\log n)$ diagonally-dominant linear systems. For simplicity, we first explain how our algorithm would work if we made use of an algorithm $x = \text{ExactSolve}(M, b)$ that exactly solves the system $Mx = b$ for diagonally-dominant $M$. We then explain how we may substitute an approximate solver.

The key to our analysis is the following lemma, which says that if we multiply an $M$-matrix by a random diagonal matrix, then a constant fraction of the diagonals probably dominate their rows.

Lemma 4.2 (Random Scaling Lemma). Given an $n \times n$ $M$-matrix $M$, and positive real values $\zeta \le 1$ and $r \le \frac14$, let $D$ be a random diagonal $n \times n$ matrix where each diagonal entry $d_i$ is chosen independently and uniformly from the interval $(0,1)$. Let $T \subseteq [n]$ be the set of rows of $MD$ with sums at least $r$ times the pre-scaled diagonal, i.e.,
$$T = \{ i \in [n] : (MD\mathbf{1})_i \ge r\, m_{ii} \}.$$
With probability at least $\frac{1-4r}{4r+7}$, we have
$$|T| \ge \Big(\frac18 - \frac{r}{2}\Big)\Big(1 - \beta - \frac{2}{3\zeta}\Big)\, n,$$
where $\beta$ is the fraction of the diagonal entries of $M$ that are less than $\zeta$ times the average diagonal entry.

Note in particular that for $r = 0$, $T$ is the set of rows dominated by their diagonals. We will use the Random Scaling Lemma to decrease the number of rows that are not dominated by their diagonals. We will do this by preserving the rows that are dominated by their diagonals, and applying the lemma to the rest. Without loss of generality we write
$$M = \begin{pmatrix} M_{11} & M_{12} \\ M_{12}^T & M_{22} \end{pmatrix} = \begin{pmatrix} A_1 A_1^T & A_1 A_2^T \\ A_2 A_1^T & A_2 A_2^T \end{pmatrix},$$
where the rows in the top section of $M$ are the ones that are already diagonally-dominant, so in particular $M_{11}$ is diagonally-dominant. Let $S = M_{22} - M_{12}^T M_{11}^{-1} M_{12}$ be the Schur complement and let $S_D$ be the matrix containing only the diagonal entries of $S$.
We construct a random diagonal matrix $D_R$ of the same size as $M_{22}$ by choosing each diagonal element independently and uniformly from $(0,1)$. We then create the diagonal matrix
$$D = \begin{pmatrix} D_1 & \\ & D_2 \end{pmatrix}, \qquad \text{where } D_2 = S_D^{-1/2} D_R$$
and the diagonal entries of $D_1$ are given by $-M_{11}^{-1} M_{12} D_2 \mathbf{1}$. We know that the diagonal entries of $D_1$ are positive because Fact 4.1 tells us that $M_{11}^{-1}$ is nonnegative and $M_{12}$ is nonpositive.

We now show that the first set of rows of $DMD$ are diagonally-dominant, and a constant fraction of the rest probably become so as well. Since $M$ is an $M$-matrix and $D$ is positive diagonal, $DMD$ has no positive off-diagonals. Therefore, the diagonally-dominant rows of $DMD$ are the rows with nonnegative row sums. The row sums of $DMD$ are:
$$DMD\mathbf{1} = \begin{pmatrix} D_1 M_{11} D_1 \mathbf{1} + D_1 M_{12} D_2 \mathbf{1} \\ D_2 M_{12}^T D_1 \mathbf{1} + D_2 M_{22} D_2 \mathbf{1} \end{pmatrix} = \begin{pmatrix} 0 \\ D_2 S D_2 \mathbf{1} \end{pmatrix} = \begin{pmatrix} 0 \\ D_R S_D^{-1/2} S S_D^{-1/2} D_R \mathbf{1} \end{pmatrix}.$$
Note that the diagonal entries of $S_D^{-1/2} S S_D^{-1/2}$ are all 1. Thus by invoking Lemma 4.2 with $r = 0$ and $\zeta = 1$, we find that there is a $1/7$ probability that at least $1/24$ of the row sums in the bottom section of $DMD$ become nonnegative. Furthermore, we see that the row sums in the top section remain nonnegative.

The only problem with this idea is that in each iteration it could take $\widetilde{O}(mn)$ time to compute the entire matrix $S$. Fortunately, we actually only need to compute the diagonals of $S$ (i.e., the matrix $S_D$). In fact, we only need a diagonal matrix $\Sigma$ that approximates $S_D$. As long as the diagonals of $\Sigma^{-1/2} S \Sigma^{-1/2}$ fall in a relatively narrow range, we can still use the Random Scaling Lemma to get a constant fraction of improvement in each iteration. To compute these approximate diagonal values quickly, we use the random projection technique of Johnson and Lindenstrauss [JL84].
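The block computation above can be checked numerically. The sketch below (toy $3 \times 3$ data of my own; the exact Schur complement is used in place of the approximation $\Sigma$) builds $D_2 = S_D^{-1/2} D_R$ and $D_1$ as described, and verifies that the top-section row sums of $DMD$ vanish while the bottom row sums equal $D_2 S D_2 \mathbf{1}$:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy M-matrix split into blocks: rows 0-1 form the "already dominant"
# section M11; row 2 is the remainder.
M = np.array([
    [ 4.0, -1.0, -1.0],
    [-1.0,  4.0, -1.5],
    [-1.0, -1.5,  3.0],
])
M11, M12, M22 = M[:2, :2], M[:2, 2:], M[2:, 2:]

# Schur complement and its diagonal
S = M22 - M12.T @ np.linalg.solve(M11, M12)
S_D = np.diag(np.diag(S))

# D2 = S_D^{-1/2} D_R with D_R uniform in (0,1); D1 has diagonal
# -M11^{-1} M12 D2 1, which Fact 4.1 guarantees is positive.
D_R = np.diag(rng.uniform(size=1))
D2 = np.linalg.inv(np.sqrt(S_D)) @ D_R
d1 = -np.linalg.solve(M11, M12 @ D2 @ np.ones(1))
assert np.all(d1 > 0)
D = np.diag(np.concatenate([d1, np.diag(D2)]))

DMD = D @ M @ D
row_sums = DMD @ np.ones(3)
# Top-section row sums vanish exactly (up to roundoff)...
assert np.allclose(row_sums[:2], 0.0)
# ...and the bottom row sum equals D2 S D2 1, as derived above.
assert np.allclose(row_sums[2:], D2 @ S @ D2 @ np.ones(1))
```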
In Appendix A, we prove the following variant of their result, which deals with random projections into a space of constant dimension:

Theorem 4.3. For all constants $\alpha, \beta, \gamma, p \in (0,1)$, there is a positive integer $k = k_{JL}(\alpha, \beta, \gamma, p)$ such that the following holds: For any vectors $v_1, \ldots, v_n \in \mathbb{R}^m$, let $R$ be a $k \times m$ matrix with entries chosen independently at random from the standard normal distribution, and let $w_i = \sqrt{\frac1k}\, R v_i$. With probability at least $p$, both of the following hold:
1. $\sum_{i=1}^n \frac{\|v_i\|^2}{\|w_i\|^2} \le (1+\gamma)\, n$
2. $\Big|\Big\{ i : \frac{\|v_i\|^2}{\|w_i\|^2} < 1-\alpha \Big\}\Big| \le \beta n$

Let us note that
$$S = A_2 \big(I - A_1^T M_{11}^{-1} A_1\big) A_2^T = A_2 \big(I - A_1^T M_{11}^{-1} A_1\big)^2 A_2^T,$$
because $(I - A_1^T M_{11}^{-1} A_1)$ is a projection matrix. So if we let $a_i$ denote the $i$th row of $A_2$, we can write the $i$th diagonal of $S$ as
$$s_{ii} = \big\|\big(I - A_1^T M_{11}^{-1} A_1\big) a_i^T\big\|^2.$$
Then if we use Theorem 4.3 to create a random projection matrix $R$, the quantity $\|R(I - A_1^T M_{11}^{-1} A_1) a_i^T\|^2$ gives a good approximation to $s_{ii}$. Moreover, we can use one call to ExactSolve to compute each of the constant number of rows of the matrix $P = R(I - A_1^T M_{11}^{-1} A_1)$. Since $A_2$ has $O(m)$ entries, we can compute $PA_2^T$ in $O(m)$ time, and obtain all the approximations $\|Pa_i^T\|^2$ in $O(m)$ time, yielding the desired approximations of all $s_{ii}$ values.

Our suggested algorithm, still using an exact solver, is given in Figure 1. To make this algorithm fast, we replace the calls to the exact solver with calls to the approximate solver STSolve of Spielman and Teng:

Theorem 4.4 (Spielman-Teng [ST04, ST06]).
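The projection estimate at the heart of this step can be sketched directly: each $s_{ii}$ is a squared vector norm, estimated by the squared norm of its image under $\frac{1}{\sqrt k} R$. In the toy sketch below the $v_i$ are generic random vectors standing in for $(I - A_1^T M_{11}^{-1} A_1) a_i^T$, and $k = 200$ is an illustrative choice, not the constant $k_{JL}(\alpha,\beta,\gamma,p)$ of Theorem 4.3:

```python
import numpy as np

rng = np.random.default_rng(7)

# Johnson-Lindenstrauss estimation of squared norms: w_i = sqrt(1/k) R v_i,
# and ||w_i||^2 approximates ||v_i||^2.
m, n, k = 100, 30, 200
V = rng.standard_normal((n, m))          # rows play the role of the v_i
R = rng.standard_normal((k, m))

W = V @ R.T / np.sqrt(k)                 # w_i = sqrt(1/k) R v_i
s_true = (V * V).sum(axis=1)             # ||v_i||^2
s_est = (W * W).sum(axis=1)              # ||w_i||^2

ratios = s_true / s_est
# With k this large, each ratio concentrates near 1 (k ||w_i||^2 / ||v_i||^2
# is chi-square with k degrees of freedom); the theorem needs far weaker,
# one-sided control with constant k.
assert np.all(ratios > 0.5) and np.all(ratios < 2.0)
```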
The algorithm $x = \text{STSolve}(M, b, \epsilon)$ takes as input a symmetric diagonally-dominant $n \times n$ matrix $M$ with $m$ non-zeros, a column vector $b$, and an error parameter $\epsilon > 0$, and returns in expected time $\widetilde{O}(m \log(1/\epsilon))$ a column vector $x$ satisfying
$$\|x - M^{-1} b\|_M \le \epsilon\, \|M^{-1} b\|_M.$$

We define the algorithm $\text{MMatrixSolve}(A, b, \epsilon, \lambda_{\min}, \lambda_{\max})$ as a modification of the algorithm ExactMMatrixSolve in Figure 1. For this algorithm we need to provide upper and lower bounds $\lambda_{\max}, \lambda_{\min}$ on the eigenvalues of the matrix $M = AA^T$, and the running time will depend on $\kappa = \lambda_{\max}/\lambda_{\min}$. The modifications are that we need to set the parameters
$$\delta = \tfrac{1}{24}\,\lambda_{\min}^{1/2}\,\kappa^{-1/2}\,n^{-1}, \qquad \epsilon_1 = .005\,(1.01\,\kappa m n)^{-1/2}, \qquad \epsilon_2 = \tfrac{1}{72}\,\kappa^{-5/2}\,n^{-2},$$
and substitute the calls to ExactSolve in lines 2c, 2h and 3 respectively with
- $\text{STSolve}(D_1 M_{11} D_1,\ D_1 A_1 r_i^T,\ \epsilon_1)$
- $\text{STSolve}(D_1 M_{11} D_1,\ D_1(-M_{12} D_2' + \delta I)\mathbf{1},\ \epsilon_2)$
- $\text{STSolve}(DMD,\ Db,\ \epsilon)$.

We may note that the final call to STSolve guarantees that
$$\|D^{-1} x - D^{-1} M^{-1} b\|_{DMD} \le \epsilon\, \|D^{-1} M^{-1} b\|_{DMD},$$

x = ExactMMatrixSolve(A, b)
Given: an $n \times m$ matrix $A$, where $M = AA^T$ is an M-matrix and $A$ has at most 2 non-zeros per column.
Returns: $x$ satisfying $Mx = b$.
1. Set $D := I$.
2. Until $DMD$ is diagonally dominant, do:
   a. Permute so that
   $$DMD = \begin{pmatrix} D_1 M_{11} D_1 & D_1 M_{12} D_2 \\ D_2 M_{12}^T D_1 & D_2 M_{22} D_2 \end{pmatrix} = \begin{pmatrix} D_1 A_1 A_1^T D_1 & D_1 A_1 A_2^T D_2 \\ D_2 A_2 A_1^T D_1 & D_2 A_2 A_2^T D_2 \end{pmatrix}$$
   has the diagonally dominant rows in the top section. Let $a_1, \ldots, a_\nu$ be the rows of $A_2$.
   b. Set $k = k_{JL}(\frac{1}{100}, \frac15, \frac{1}{100}, \frac13)$, and let $R$ be a random $k \times m$ matrix with independent standard normal entries. Let $r_i$ be the $i$th row of $R$.
   c. For $i = 1, \ldots, k$, compute $q_i^T = \text{ExactSolve}(D_1 M_{11} D_1,\ D_1 A_1 r_i^T)$.
   d. Set $Q = \begin{pmatrix} q_1^T & \cdots & q_k^T \end{pmatrix}^T$.
   e. Let $\Sigma$ be the $\nu \times \nu$ diagonal matrix with entries $\sigma_i = \|(R - QD_1A_1)\, a_i^T\|^2$.
   f. Let $D_R$ be a uniform random $\nu \times \nu$ diagonal matrix with diagonal entries in $(0,1)$.
   g. Set $D_2' = \Sigma^{-1/2} D_R$.
   h. Set $D_1'$ to be the matrix with diagonal $D_1 \cdot \text{ExactSolve}(D_1 M_{11} D_1,\ -D_1 M_{12} D_2' \mathbf{1})$.
   i. Set $D := \begin{pmatrix} D_1' & \\ & D_2' \end{pmatrix}$.
3. Return $x = D \cdot \text{ExactSolve}(DMD,\ Db)$.

Figure 1: Algorithm for solving a linear system in a symmetric M-matrix. To speed up the algorithm we replace the exact solver with the Spielman-Teng approximate solver.

or equivalently
$$\|x - M^{-1} b\|_M \le \epsilon\, \|M^{-1} b\|_M,$$
so the output fulfills the specification of an approximate solver, provided that the algorithm terminates. We can in fact bound the running time of the algorithm as follows:

Theorem 4.5. The expected running time of the algorithm MMatrixSolve is $\widetilde{O}\big(m \log \frac{\kappa}{\epsilon}\big)$.

Proof. The running time is dominated by the calls to the Spielman-Teng solver. There are $O(1)$ such solves per iteration, each of which takes time $\widetilde{O}(m \log \kappa)$, and at the conclusion of the algorithm there is one final call taking time $\widetilde{O}(m \log \epsilon^{-1})$. So, to prove the running time, it suffices to give an $O(\log m)$ bound on the expected number of iterations. In particular, it suffices to show that in each iteration, the number of non-diagonally-dominant rows in $DMD$ decreases by a constant fraction with constant probability.

In analyzing a single iteration, we let $D = \begin{pmatrix} D_1 & \\ & D_2 \end{pmatrix}$ denote the diagonal scaling at the start of the iteration, and we let $D' = \begin{pmatrix} D_1' & \\ & D_2' \end{pmatrix}$ denote the new diagonal scaling. In Appendix A, we prove:

Lemma 4.6. $D'$ is a positive diagonal matrix.

This implies that $D'MD'$ has no positive off-diagonals, thereby enabling us to check which rows of $D'MD'$ are diagonally-dominant by looking for rows with nonnegative row sums.
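The whole iteration of Figure 1 can be sketched end to end on a toy instance. The version below is a simplification I wrote for illustration, not the paper's implementation: it computes the Schur-complement diagonal exactly instead of via the JL projection, uses numpy's dense solver as "ExactSolve", and tracks the dominant/non-dominant split with index sets rather than an explicit permutation:

```python
import numpy as np

rng = np.random.default_rng(3)

def exact_mmatrix_solve(A, b, max_iter=200):
    """Toy version of ExactMMatrixSolve (Figure 1): iteratively build a
    positive diagonal D so that D M D is diagonally dominant, then solve
    (D M D) y = D b and return x = D y."""
    M = A @ A.T
    n = M.shape[0]
    d = np.ones(n)
    for _ in range(max_iter):
        DMD = M * np.outer(d, d)
        row_sums = DMD @ np.ones(n)
        dom = row_sums >= -1e-12          # dominant rows (nonneg row sums)
        if dom.all():
            break
        top, bot = np.where(dom)[0], np.where(~dom)[0]
        M11 = M[np.ix_(top, top)]
        M12 = M[np.ix_(top, bot)]
        M22 = M[np.ix_(bot, bot)]
        S = M22 - M12.T @ np.linalg.solve(M11, M12) if len(top) else M22
        sD = np.diag(S)                   # exact stand-in for Sigma
        d2 = rng.uniform(size=len(bot)) / np.sqrt(sD)     # step g
        if len(top):
            d[top] = -np.linalg.solve(M11, M12 @ d2)      # step h
        d[bot] = d2
    y = np.linalg.solve(M * np.outer(d, d), d * b)        # step 3
    return d * y

# Toy instance: A with at most 2 nonzeros per column, so M = A A^T is
# an M-matrix; initially not all rows of M are diagonally dominant.
A = np.array([
    [ 1.0,  0.0,  2.0, 0.5],
    [-0.5,  1.0,  0.0, 0.0],
    [ 0.0, -0.8, -1.0, 0.0],
])
b = np.array([1.0, 2.0, 3.0])
x = exact_mmatrix_solve(A, b)
assert np.allclose((A @ A.T) @ x, b)
```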
We again let $S = M_{22} - M_{12}^T M_{11}^{-1} M_{12}$ denote the Schur complement, and let $S_D$ denote the matrix containing the diagonal entries of $S$. Let us also define $\tilde{S} = \Sigma^{-1/2} S \Sigma^{-1/2}$. We know from Facts 4.1.4 and 4.1.6 that $\tilde{S}$ is an $M$-matrix. Let $\tilde{S}_D$ be the matrix containing the diagonal entries of $\tilde{S}$. In Appendix A, we show that the row sums of $MD'$ are related to $\tilde{S}$ as follows:

Lemma 4.7.
$$MD'\mathbf{1} \ge \begin{pmatrix} 0 \\ \Sigma^{1/2}\big(\tilde{S} D_R - \tfrac16 \tilde{S}_D\big)\mathbf{1} \end{pmatrix}$$

The upper part of the above inequality tells us that all the row sums that were nonnegative in $DMD$ remain nonnegative in $D'MD'$. From the lower part of the inequality, and by invoking the Random Scaling Lemma on the matrix $\tilde{S}$ with $r = \frac16$, we find that with probability at least $\frac{1}{23}$, the fraction of the remaining rows of $D'MD'$ that now have positive row sums is at least $\frac{1}{24}\big(1 - \beta - \frac{2}{3\zeta}\big)$, where for some $\zeta < 1$, $\beta$ is the fraction of the diagonal entries of $\tilde{S}$ that are less than $\zeta$ times the average diagonal entry. Indeed, we prove in Appendix A:

Lemma 4.8. With probability at least $\frac19$, at most $\frac15$ of the diagonal entries of $\tilde{S}$ are smaller than $\big(\frac{99}{101}\big)^3$ times the average diagonal entry.

So with probability at least $\frac19 \cdot \frac{1}{23}$, the fraction of rows with negative row sums in $DMD$ that now have positive row sums in $D'MD'$ is at least $\frac{1}{24}\big(1 - \frac15 - \frac23\big(\frac{101}{99}\big)^3\big) > 0$. Thus, we may conclude that MMatrixSolve is expected to terminate after $O(\log n)$ iterations, as claimed.

5 Final Remarks

The reason that our interior-point algorithm currently cannot produce an exact solution to generalized flow problems is the dependence of our M-matrix solver on the condition number of the matrix, even when approximating in the matrix norm. It would be of interest to eliminate this dependence. It would also be nice to extend the result to networks with gains.
The main obstacle is that the resulting linear programs may be ill-conditioned.

References

[AGOT92] R.K. Ahuja, A.V. Goldberg, J.B. Orlin, and R.E. Tarjan. Finding minimum-cost flows by double scaling. Mathematical Programming, 53:243-266, 1992.

[AMO93] Ravindra K. Ahuja, Thomas L. Magnanti, and James B. Orlin. Network Flows. Prentice Hall, Inc., Englewood Cliffs, New Jersey, 1993.

[Axe96] Owe Axelsson. Iterative Solution Methods. Cambridge University Press, 1996.

[BCPT05] Erik G. Boman, Doron Chen, Ojas Parekh, and Sivan Toledo. On factor width and symmetric H-matrices. Linear Algebra and its Applications, 405:239-248, August 2005.

[BP94] Abraham Berman and Robert J. Plemmons. Nonnegative Matrices in the Mathematical Sciences. Society for Industrial and Applied Mathematics, Philadelphia, PA, USA, 1994.

[FW02] L.K. Fleischer and K.D. Wayne. Fast and simple approximation schemes for generalized flow. Mathematical Programming, 91(2):215-238, 2002.

[GFNR98] T.L. Griffith, M.A. Fuller, G.B. Northcraft, and T. Radzik. Faster algorithms for the generalized network flow problem. Mathematics of Operations Research, 23(1):69-199, 1998.

[GJO97] D. Goldfarb, Z. Jin, and J. Orlin. Polynomial-time highest-gain augmenting path algorithms for the generalized circulation problem. Mathematics of Operations Research, 22:793-802, 1997.

[GR98] A.V. Goldberg and S. Rao. Beyond the flow decomposition barrier. Journal of the ACM, 1998.

[GT87] A.V. Goldberg and R.E. Tarjan. Solving minimum cost flow problem by successive approximation. In Proceedings of the 19th ACM Symposium on the Theory of Computing, pages 7-18, 1987.

[HJ91] R.A. Horn and C.R. Johnson. Topics in Matrix Analysis. Cambridge University Press, 1991.

[JL84] W.B. Johnson and J. Lindenstrauss. Extensions of Lipschitz mappings into a Hilbert space. Contemp. Math., 26:189-206, 1984.

[KS01] Adam R. Klivans and Daniel A. Spielman. Randomness efficient identity testing of multivariate polynomials. In Proceedings of the 33rd ACM Symposium on the Theory of Computing, pages 216-223, 2001.

[Li02] Lei Li. On the iterative criterion for generalized diagonally dominant matrices. SIAM Journal on Matrix Analysis, 24(1):17-24, 2002.

[LLH+98] Bishan Li, Lei Li, Masunori Harada, Hiroshi Niki, and Michael J. Tsatsomeros. An iterative criterion for H-matrices. Linear Algebra and its Applications, 271:179-190, 1998.

[MVV87] K. Mulmuley, U.V. Vazirani, and V.V. Vazirani. Matching is as easy as matrix inversion. Combinatorica, 7(1):105-113, 1987.

[Orl88] J.B. Orlin. A faster strongly polynomial minimum cost flow algorithm. In Proceedings of the 20th ACM Symposium on the Theory of Computing, pages 377-387, 1988.

[Ren88] J. Renegar. A polynomial-time algorithm based on Newton's method for linear programming. Mathematical Programming, 40:59-93, 1988.

[Ren96] J. Renegar. Condition numbers, the barrier method, and the conjugate-gradient method. SIAM Journal on Optimization, 6:879-912, 1996.

[Sch03] Alexander Schrijver. Combinatorial Optimization: Polyhedra and Efficiency. Number 24 in Algorithms and Combinatorics. Springer-Verlag, Berlin, 2003.

[ST04] Daniel A. Spielman and Shang-Hua Teng. Nearly-linear time algorithms for graph partitioning, graph sparsification, and solving linear systems. In Proceedings of the thirty-sixth annual ACM Symposium on Theory of Computing (STOC-04), pages 81-90, New York, June 13-15 2004. ACM Press. Full version available at http://arxiv.org/abs/cs.DS/0310051.

[ST06] Daniel A. Spielman and Shang-Hua Teng. Nearly-linear time algorithms for preconditioning and solving symmetric, diagonally dominant linear systems. Available at http://www.arxiv.org/abs/cs.NA/0607105, 2006.

[TW98] E. Tardos and K.D.
Wayne. Simple generalized maximum flow algorithms. In 6th International Integer Programming and Combinatorial Optimization Conference, pages 310-324, 1998.

[Vai89] P.M. Vaidya. Speeding up linear programming using fast matrix multiplication. In 30th Annual IEEE Symposium on Foundations of Computer Science, pages 332-337, 1989.

[Ye97] Y. Ye. Interior Point Algorithms: Theory and Analysis. John Wiley, New York, 1997.

A Proofs for Section 4

Lemma 4.3. Given vectors $v_1, \ldots, v_n \in \mathbb{R}^m$ and constants $\alpha, \beta, \gamma, p \in (0,1)$, for the positive constant integer $k = k_{JL}(\alpha, \beta, \gamma, p)$, let $R$ be a $k \times m$ matrix with entries chosen independently at random from the standard normal distribution, and let $w_i = \sqrt{\frac1k}\, R v_i$. With probability at least $p$, both of the following hold:
(i) $\sum_{i=1}^n \frac{\|v_i\|^2}{\|w_i\|^2} \le (1+\gamma)\, n$
(ii) $\Big|\Big\{ i : \frac{\|v_i\|^2}{\|w_i\|^2} < 1-\alpha \Big\}\Big| \le \beta n$

Proof. Let $Z_i = \frac{\|v_i\|^2}{\|w_i\|^2}$ and $Z = \sum_{i=1}^n Z_i$. Let $r_1, \ldots, r_k$ be the rows of $R$, and let $w_{ij} = k^{-1/2}\langle r_j, v_i\rangle$ be the $j$th entry of $w_i$. Without loss of generality, we assume that all the $v_i$ are unit vectors. Thus for any given $i$, the quantities $k^{1/2}w_{i1}, \ldots, k^{1/2}w_{ik}$ are independent standard normal random variables. So the expression
$$\frac{Z_i}{k} = \frac{1}{k\|w_i\|^2} = \frac{1}{\sum_{j=1}^k (\sqrt{k}\, w_{ij})^2}$$
has an inverse-chi-square distribution, with mean $\frac{1}{k-2}$ and variance $\frac{2}{(k-2)^2(k-4)}$.
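The inverse-chi-square moments invoked in this proof can be checked by simulation. A quick Monte-Carlo sanity check (my own sketch, using numpy; $k = 20$ is an arbitrary illustrative choice):

```python
import numpy as np

rng = np.random.default_rng(2)

# For X ~ chi^2_k, the proof uses E[1/X] = 1/(k-2) and
# Var[1/X] = 2 / ((k-2)^2 (k-4)); both require k large enough to exist.
k, trials = 20, 400_000
X = rng.chisquare(k, size=trials)
inv = 1.0 / X

assert abs(inv.mean() - 1 / (k - 2)) < 1e-3
assert abs(inv.var() - 2 / ((k - 2) ** 2 * (k - 4))) < 1e-4
```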
Therefore, $Z$ has mean $\frac{kn}{k-2}$ and variance at most $\frac{2k^2n^2}{(k-2)^2(k-4)}$, because
$$\mathrm{Var}[Z] = \mathrm{Var}\Big[\sum_{i=1}^n Z_i\Big] = \sum_{i,j=1}^n \mathrm{Cov}[Z_i, Z_j] \le \sum_{i,j=1}^n \sqrt{\mathrm{Var}[Z_i]\,\mathrm{Var}[Z_j]} = n^2\, \mathrm{Var}[Z_i] = k^2 n^2\, \mathrm{Var}\Big[\frac{Z_i}{k}\Big].$$
So using Cantelli's inequality, we may conclude that
$$\Pr[Z > (1+\gamma)n] < \frac{\mathrm{Var}[Z]}{\mathrm{Var}[Z] + \big(1+\gamma-\frac{k}{k-2}\big)^2 n^2} \le \frac{2}{2 + (k-4)\big(1-\frac2k\big)^2\big(\gamma - \frac{2}{k-2}\big)^2} \tag{1}$$
By the same reasoning, $k/Z_i$ has a chi-square distribution, with mean $k$ and variance $2k$. So using Cantelli's inequality, we find that
$$\Pr[Z_i < 1-\alpha] = \Pr\Big[\frac{k}{Z_i} > \frac{k}{1-\alpha}\Big] < \frac{\mathrm{Var}[k/Z_i]}{\mathrm{Var}[k/Z_i] + \big(\frac{k}{1-\alpha} - k\big)^2} = \frac{2}{2 + \big(\frac{\alpha}{1-\alpha}\big)^2 k}.$$
Thus, the set $\{i : Z_i < 1-\alpha\}$ has expected cardinality less than $\frac{2}{2 + (\frac{\alpha}{1-\alpha})^2 k}\, n$. So using Markov's inequality, we conclude that
$$\Pr\big[|\{i : Z_i < 1-\alpha\}| > \beta n\big] < \frac{2}{\beta\big(2 + \big(\frac{\alpha}{1-\alpha}\big)^2 k\big)} \tag{2}$$
Combining inequalities (1) and (2) via the union bound, we find that the probability that (i) and (ii) both occur is at least
$$1 - \frac{2}{2 + (k-4)\big(1-\frac2k\big)^2\big(\gamma-\frac{2}{k-2}\big)^2} - \frac{2}{\beta\big(2+\big(\frac{\alpha}{1-\alpha}\big)^2 k\big)},$$
which is greater than $p$ for sufficiently large $k$.

Lemma 4.2 (Random Scaling Lemma). Given an $n \times n$ $M$-matrix $M$, and positive real values $\zeta \le 1$ and $r \le \frac14$, let $D$ be a random diagonal $n \times n$ matrix where each diagonal entry $d_i$ is chosen independently and uniformly from the interval $(0,1)$. Let $T \subseteq [n]$ be the set of rows of $MD$ with sums at least $r$ times the pre-scaled diagonal, i.e., $T = \{ i \in [n] : (MD\mathbf{1})_i \ge r\, m_{ii} \}$. With probability at least $\frac{1-4r}{4r+7}$, we have $|T| \ge \big(\frac18 - \frac r2\big)\big(1 - \beta - \frac{2}{3\zeta}\big) n$, where $\beta$ is the fraction of the diagonal entries of $M$ that are less than $\zeta$ times the average diagonal entry.

Proof. Let $M_O$ denote the matrix containing only the off-diagonal elements of $M$. Thus, $M_O$ has no positive entries.
Let $B$ be the set of rows of $M$ in which the diagonal entry is less than $\zeta$ times the average diagonal entry; thus $|B| = \beta n$. We define a subset $J$ of rows of $M$ whose sums are not too far from being positive. In particular, we let $J$ be the set of rows in which the sum of the off-diagonal entries is no less than $-\frac32$ times the diagonal entry:
$$J = \Big\{ i \in [n] : (M_O \mathbf{1}_n)_i \ge -\tfrac32\, m_{ii} \Big\}.$$
Let us prove that $J$ cannot be too small. Let $S$ be the sum of the diagonal entries of $M$. We have:
$$\begin{aligned}
S &= \mathbf{1}_n^T M \mathbf{1}_n - \sum_{i \in [n]} (M_O \mathbf{1}_n)_i \\
&\ge -\sum_{i \in [n]} (M_O \mathbf{1}_n)_i && \text{(because $M$ is positive definite)} \\
&\ge -\sum_{i \in [n] - (J \cup B)} (M_O \mathbf{1}_n)_i && \text{(because $M_O$ is non-positive)} \\
&\ge \tfrac32 \sum_{i \in [n] - (J \cup B)} m_{ii} && \text{(by definition of $J$)} \\
&\ge \tfrac32\, \zeta\, \tfrac Sn\, \big|[n] - (J \cup B)\big| && \text{(by definition of $B$)} \\
&\ge \tfrac32\, \zeta\, \tfrac Sn\, (n - |J| - \beta n).
\end{aligned}$$
So we see that
$$|J| \ge \Big(1 - \beta - \frac{2}{3\zeta}\Big) n.$$
Next, let us show that the rows in $J$ have a high probability of being in $T$. Consider the $i$th row sum of $M_O D$:
$$(M_O D \mathbf{1})_i = \sum_{j \ne i} d_j m_{ij} = \frac12\sum_{j \ne i} m_{ij} + \sum_{j \ne i}\Big(d_j - \frac12\Big)m_{ij} = \frac12 (M_O \mathbf{1})_i + \sum_{j \ne i}\Big(d_j - \frac12\Big)m_{ij}.$$
Since each $(d_j - \frac12)$ is symmetrically distributed around zero, we may conclude that $\frac12(M_O\mathbf{1})_i$ is the median value of $(M_O D \mathbf{1})_i$. We may also note that $(MD\mathbf{1})_i = (M_O D\mathbf{1})_i + d_i m_{ii}$, and that the values of $(M_O D\mathbf{1})_i$ and $d_i m_{ii}$ are independent. We thus have, for $i \in J$:
$$\begin{aligned}
\Pr[(MD\mathbf{1})_i \ge r\, m_{ii}] &\ge \Pr\Big[(M_O D\mathbf{1})_i \ge \tfrac12(M_O\mathbf{1})_i\Big]\cdot \Pr\Big[d_i m_{ii} \ge r\, m_{ii} - \tfrac12(M_O\mathbf{1})_i\Big] \\
&= \tfrac12\cdot \Pr\Big[d_i m_{ii} \ge r\, m_{ii} - \tfrac12(M_O\mathbf{1})_i\Big] \\
&\ge \tfrac12\cdot \Pr\Big[d_i m_{ii} \ge r\, m_{ii} + \tfrac12\cdot\tfrac32\, m_{ii}\Big] && \text{(by definition of $J$)} \\
&= \tfrac12\cdot \Pr\Big[d_i \ge r + \tfrac34\Big],
\end{aligned}$$
and $\Pr[d_i \ge r + \frac34] = \frac14 - r$. Thus the expected size of $J - T$ is at most $\big(r + \frac34\big)|J|$.
So we find
$$\Pr\Big[|T| > \big(\tfrac18 - \tfrac r2\big)|J|\Big] \ge \Pr\Big[|J \cap T| > \big(\tfrac18 - \tfrac r2\big)|J|\Big] = \Pr\Big[|J - T| < \big(\tfrac r2 + \tfrac78\big)|J|\Big] \ge 1 - \frac{r + \frac34}{\frac r2 + \frac78} = \frac{1-4r}{4r+7},$$
where the last inequality is Markov's inequality. The lemma then follows from the lower bound on $|J|$ proven above.

Lemma 4.6. $D'$ is a positive diagonal matrix.

Proof. $D_2' = \Sigma^{-1/2} D_R$ is trivially positive diagonal by construction. To check that $D_1'$ is positive, we use Lemma A.1, which implies that
$$D_1'\mathbf{1} > -M_{11}^{-1} M_{12} D_2' \mathbf{1}_{n-\nu} + \delta\Big(M_{11}^{-1}\mathbf{1}_{n-\nu} - \tfrac34\, \lambda_{\max}^{-1}\mathbf{1}_{n-\nu}\Big).$$
To see why the above expression is positive, recall from Fact 4.1 that $M_{11}^{-1}$ and $-M_{12}$ are nonnegative matrices. Furthermore, note that the diagonals of $M_{11}^{-1}$ are at least $\lambda_{\max}^{-1}$.

Lemma A.1.
$$\big\| D_1'\mathbf{1} - M_{11}^{-1}(-M_{12} D_2' + \delta I)\mathbf{1} \big\| < \tfrac34\, \delta\, \lambda_{\max}^{-1}$$

Proof. Recall from the algorithm that
$$D_1^{-1} D_1' \mathbf{1} = \text{STSolve}\big(D_1 M_{11} D_1,\ D_1(-M_{12} D_2' + \delta I)\mathbf{1},\ \epsilon_2\big).$$
Therefore, STSolve guarantees that
$$\big\|D_1^{-1} D_1'\mathbf{1} - D_1^{-1} M_{11}^{-1}(-M_{12} D_2' + \delta I)\mathbf{1}\big\|_{D_1 M_{11} D_1} \le \epsilon_2\, \big\|D_1^{-1} M_{11}^{-1}(-M_{12} D_2' + \delta I)\mathbf{1}\big\|_{D_1 M_{11} D_1},$$
or equivalently
$$\big\|D_1'\mathbf{1} - M_{11}^{-1}(-M_{12} D_2' + \delta I)\mathbf{1}\big\|_{M_{11}} \le \epsilon_2\, \big\|M_{11}^{-1}(-M_{12} D_2' + \delta I)\mathbf{1}\big\|_{M_{11}},$$
which in turn implies that
$$\big\|D_1'\mathbf{1} - M_{11}^{-1}(-M_{12} D_2' + \delta I)\mathbf{1}\big\| \le \epsilon_2\, \kappa^{1/2}\, \big\|M_{11}^{-1}(-M_{12} D_2' + \delta I)\mathbf{1}\big\|.$$
We can then see
$$\begin{aligned}
\big\|D_1'\mathbf{1} - M_{11}^{-1}(-M_{12} D_2' + \delta I)\mathbf{1}\big\| &\le \epsilon_2 \kappa^{1/2} \big\|-M_{11}^{-1} M_{12} D_2'\mathbf{1} + \delta M_{11}^{-1}\mathbf{1}\big\| \\
&\le \epsilon_2 \kappa^{1/2} \big\|M_{11}^{-1} M_{12} D_2'\mathbf{1}\big\| + \delta \epsilon_2 \kappa^{1/2} \big\|M_{11}^{-1}\mathbf{1}\big\| \\
&\le \epsilon_2 \kappa^{1/2} \big\|M_{11}^{-1} M_{12} D_2'\mathbf{1}\big\| + \delta \epsilon_2 \kappa^{1/2} \lambda_{\min}^{-1} n^{1/2} \\
&\le \epsilon_2 \kappa n^{1/2} \big\|D_2'\mathbf{1}\big\| + \delta \epsilon_2 \kappa^{1/2} \lambda_{\min}^{-1} n^{1/2} && \text{(by Lemma A.3)} \\
&\le \epsilon_2 \kappa n^{1/2} \big\|\Sigma^{-1/2}\mathbf{1}\big\| + \delta \epsilon_2 \kappa^{1/2} \lambda_{\min}^{-1} n^{1/2} && \text{($D_2' < \Sigma^{-1/2}$ by construction)} \\
&\le 2 \epsilon_2 \kappa n^{1/2} \big\|S_D^{-1/2}\mathbf{1}\big\| + \delta \epsilon_2 \kappa^{1/2} \lambda_{\min}^{-1} n^{1/2} && \text{($\Sigma^{-1/2} \le 2 S_D^{-1/2}$}
\end{aligned}$$
$$\begin{aligned}
&\phantom{\le}\ && \text{by Lemma 4.8)} \\
&\le 2 \epsilon_2 \kappa \lambda_{\min}^{-1/2} n + \delta \epsilon_2 \kappa^{1/2} \lambda_{\min}^{-1} n^{1/2} && \text{($S_D^{-1/2}\mathbf{1} < \lambda_{\min}^{-1/2}\mathbf{1}$ by Fact 4.1.5)} \\
&= \big(2 \delta^{-1} \epsilon_2 \kappa^2 \lambda_{\min}^{1/2} n + \epsilon_2 \kappa^{1/2} n^{1/2}\big)\, \delta\, \lambda_{\max}^{-1} \\
&= \big(\tfrac23 + \tfrac1{72}\, \kappa^{-2} n^{-3/2}\big)\, \delta\, \lambda_{\max}^{-1} \\
&< \tfrac34\, \delta\, \lambda_{\max}^{-1}.
\end{aligned}$$

Lemma 4.8. With probability at least $\frac19$, at most $\frac15$ of the diagonal entries of $\tilde{S}$ are smaller than $\big(\frac{99}{101}\big)^3$ times the average diagonal entry.

Proof. Recall that the diagonal entries of $\tilde{S}$ are $\tilde{s}_{ii} = \frac{s_{ii}}{\sigma_i}$, where $s_{ii} = \big\|(I - A_1^T M_{11}^{-1} A_1) a_i^T\big\|^2$ and $\sigma_i = \big\|(R - Q D_1 A_1) a_i^T\big\|^2$. Let us define $w_i = \frac1k \big\|R (I - A_1^T M_{11}^{-1} A_1) a_i^T\big\|^2$, where $k = k_{JL}\big(\frac1{100}, \frac15, \frac1{100}, \frac13\big)$. By Lemma 4.3, there is at least $\frac13$ probability that
$$\frac1\nu \sum_{i=1}^\nu \frac{s_{ii}}{w_i} \le 1.01 \qquad \text{and} \qquad \Big|\Big\{ i : \frac{s_{ii}}{w_i} \le .99 \Big\}\Big| \le \frac15\, \nu.$$
So, by Lemma A.2 below, there is at least a $\frac13 - \frac{2}{9km} > \frac19$ probability that the average diagonal entry of $\tilde{S}$ is at most
$$\frac1\nu \sum_{i=1}^\nu \frac{s_{ii}}{\sigma_i} = \frac1\nu \sum_{i=1}^\nu \frac{s_{ii}}{w_i}\cdot\frac{w_i}{\sigma_i} \le \frac{1.01}{k(.99)^2},$$
and similarly we have the following bound on the number of small diagonal entries:
$$\Big|\Big\{ i : \frac{s_{ii}}{\sigma_i} \le \frac{.99}{k(1.01)^2} \Big\}\Big| \le \frac15\, \nu.$$

Lemma A.2. With probability at least $1 - \frac{2}{9km}$, it holds for all $i$ that
$$\frac{1}{k(1.01)^2} \le \frac{w_i}{\sigma_i} \le \frac{1}{k(.99)^2}.$$

Proof.
We have:
$$\begin{aligned}
\big|\sigma_i^{1/2} - k^{1/2} w_i^{1/2}\big| &= \Big|\, \|(R - Q D_1 A_1) a_i^T\| - \|(R - R A_1^T M_{11}^{-1} A_1) a_i^T\| \,\Big| \\
&\le \big\|\big((R - Q D_1 A_1) - (R - R A_1^T M_{11}^{-1} A_1)\big) a_i^T\big\| \\
&= \big\|(R A_1^T M_{11}^{-1} A_1 - Q D_1 A_1) a_i^T\big\| \\
&\le \|a_i\| \sqrt{\sum_{j=1}^k \big\|r_j A_1^T M_{11}^{-1} A_1 - q_j D_1 A_1\big\|^2} \\
&= \|a_i\| \sqrt{\sum_{j=1}^k \big\|r_j A_1^T M_{11}^{-1} D_1^{-1} - q_j\big\|^2_{D_1 M_{11} D_1}} \\
&\le \lambda_{\max}^{1/2} \sqrt{\sum_{j=1}^k \big\|r_j A_1^T M_{11}^{-1} D_1^{-1} - q_j\big\|^2_{D_1 M_{11} D_1}} && \text{($\|a_i\|^2$ is the $i$th diagonal of $M_{22}$, so cannot exceed $M_{22}$'s largest eigenvalue)} \\
&\le \lambda_{\max}^{1/2} \Big(\frac{s_{ii}}{\lambda_{\min}}\Big)^{1/2} \sqrt{\sum_{j=1}^k \big\|r_j A_1^T M_{11}^{-1} D_1^{-1} - q_j\big\|^2_{D_1 M_{11} D_1}} && \text{(using Fact 4.1.5)} \\
&= (\kappa s_{ii})^{1/2} \sqrt{\sum_{j=1}^k \big\|r_j A_1^T M_{11}^{-1} D_1^{-1} - q_j\big\|^2_{D_1 M_{11} D_1}} \\
&\le (\kappa s_{ii})^{1/2}\, \epsilon_1 \sqrt{\sum_{j=1}^k \big\|r_j A_1^T M_{11}^{-1} D_1^{-1}\big\|^2_{D_1 M_{11} D_1}} && \text{(by the guarantee of STSolve)} \\
&= .005\, s_{ii}^{1/2} (1.01\, mn)^{-1/2} \sqrt{\sum_{j=1}^k \big\|r_j A_1^T M_{11}^{-1} A_1\big\|^2} \\
&= .005\, s_{ii}^{1/2} (1.01\, mn)^{-1/2} \sqrt{\sum_{j=1}^k \|r_j\|^2} && \text{(because $A_1^T M_{11}^{-1} A_1$ is a projection matrix)} \\
&\le .01\, s_{ii}^{1/2}\, k^{1/2} (1.01\, n)^{-1/2}.
\end{aligned}$$
The last inequality fails with probability at most $\frac{2}{9km}$, based on the fact that the expression $\sum_{j=1}^k \|r_j\|^2$ has a chi-square distribution with $mk$ degrees of freedom. Continuing,
$$.01\, s_{ii}^{1/2}\, k^{1/2} (1.01\, n)^{-1/2} \le .01\, k^{1/2} w_i^{1/2} \qquad \text{(Lemma 4.3 implies that $s_{ii} \le 1.01\cdot n\, w_i$)}.$$
So we conclude that
$$\bigg| \sqrt{\frac{\sigma_i}{w_i}} - k^{1/2} \bigg| \le .01\, k^{1/2}.$$

Lemma 4.7.
$$MD'\mathbf{1} \ge \begin{pmatrix} 0 \\ \Sigma^{1/2}\big(\tilde{S} D_R - \tfrac16 \tilde{S}_D\big)\mathbf{1} \end{pmatrix}$$

Proof.
M D ′ 1 =  M 11 D ′ 1 1 + M 12 D ′ 2 1 M T 12 D ′ 1 1 + M 22 D ′ 2 1  =  0 S D ′ 2 1  + M  D ′ 1 1 + M − 1 11 M 12 D ′ 2 1 − δ M − 1 11 1 0  +  δ 1 δ M T 12 M − 1 11 1  ≥  0 S D ′ 2 1  − λ max k D ′ 1 1 + M − 1 11 M 12 D ′ 2 1 − δ M − 1 11 1 k 1 +  δ 1 − δ k M T 12 M − 1 11 1 k 1  ≥  0 S D ′ 2 1  − 3 4 δ 1 +  δ 1 − δ κ 1 / 2 n 1  (using Lemmas A.1 and A.3) ≥  0 S D ′ 2 1 − 2 δ κ 1 / 2 n 1  = " 0 S Σ − 1 / 2 D R 1 − 1 12 λ 1 / 2 min 1 # ≥ " 0 S Σ − 1 / 2 D R 1 − 1 12 S 1 / 2 D 1 # (using F act 4.1.5) ≥ " 0 S Σ − 1 / 2 D R 1 − 1 6 S 1 / 2 D ˜ S 1 / 2 D 1 # (using Lemma 4.8) Lemma A.3. F or al l p ositive ve ctors v , k M T 12 M − 1 11 v k ≤ κ 1 / 2 n 1 / 2 k v k Pr o of. Defin e c = λ − 1 min κ − 1 / 2 n − 1 / 2 k v k = λ − 1 max κ 1 / 2 n − 1 / 2 k v k . k M T 12 M − 1 11 v k ≤ k M T 12 M − 1 11 v k 1 = − 1 T M T 12 M − 1 11 v (b y F act 4.1, M − 1 11 and − M 12 are n onnegtiv e) = 1 2 c  v T M − 1 11 v + c 2 1 T M 22 1 −  v T M − 1 11 c 1 T  M  M − 1 11 v c 1  ≤ 1 2 c  v T M − 1 11 v + c 2 1 T M 22 1  ≤ 1 2 k v k 2 cλ min + cλ max n ! = κ 1 / 2 n 1 / 2 k v k 23 B Solving Matrices from the In terior-P oin t Metho d In th e in terior-p oint algorithm, w e n eed to solv e matrices of the form M + v v T =  AD 2 1 A T + D 2 2 AD 2 1 D 2 1 A T D 2 1 + D 2 3  + v v T where A is an n × m matrix with en tries b ounded by U in abs olute v alue, AA T is an M-matrix, and D 1 , D 2 , D 3 are p ositiv e d iagonal matrices. W e sh ow ho w to do this using our M MatrixSo lve algorithm. Consider the Sc hur complemen t of M : M S = ( AD 2 1 A T + D 2 2 ) − AD 2 1 ( D 2 1 + D 2 3 ) − 1 D 2 1 A T = AD 2 1 D 2 3 ( D 2 1 + D 2 3 ) − 1 A T + D 2 2 = A S A T S where A S =  AD 1 D 3 ( D 2 1 + D 2 3 ) − 1 / 2 D 2  . Note that M S is also an M-matrix, and that the eigen- v alues of M S fall in the range [ d 2 min , d 2 max ( U √ nm + 1)] where d min and d max are resp ectiv ely the smallest and largest d iagonal entry in D 1 , D 2 , D 3 . 
We can build a solver for systems in $M$ from a solver for systems in $M_S$, by using the following easily verifiable property of the Schur complement:

Lemma B.1. For $M = \begin{pmatrix} M_{11} & M_{12} \\ M_{12}^T & M_{22} \end{pmatrix}$ and Schur complement $M_S = M_{11} - M_{12} M_{22}^{-1} M_{12}^T$, we have
$$\left\|\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} - M^{-1}\begin{pmatrix} b_1 \\ b_2 \end{pmatrix}\right\|_M^2 = \big\|x_1 - M_S^{-1}(b_1 - M_{12} M_{22}^{-1} b_2)\big\|_{M_S}^2 + \big\|x_2 - M_{22}^{-1}(b_2 - M_{12}^T x_1)\big\|_{M_{22}}^2$$

Then, to solve systems in $M + vv^T$, we can use the Sherman-Morrison formula:
$$(M + vv^T)^{-1} = M^{-1} - \frac{M^{-1} v v^T M^{-1}}{1 + v^T M^{-1} v}.$$
In particular, we give the following algorithm, which runs in time $\widetilde{O}\big(m \log \frac{\kappa \|v\|}{\epsilon}\big)$:

x = Solve($M + vv^T$, $b$, $\epsilon$), where $M = \begin{pmatrix} A D_1^2 A^T + D_2^2 & A D_1^2 \\ D_1^2 A^T & D_1^2 + D_3^2 \end{pmatrix}$, $b = \begin{pmatrix} b_1 \\ b_2 \end{pmatrix}$, and $v = \begin{pmatrix} v_1 \\ v_2 \end{pmatrix}$:
- Define $\epsilon_1 = \frac{\epsilon}{2}(1 + v^T M^{-1} v)^{-1}$ and $\epsilon_2 = \min\big(\frac12,\ \frac{\epsilon}{14}(1 + v^T M^{-1} v)^{-1}\big)$.
- $y' = \text{MMatrixSolve}\big(A_S,\ b_1 - A D_1^2 (D_1^2 + D_3^2)^{-1} b_2,\ \epsilon_1,\ d_{\min}^2,\ d_{\max}^2(U\sqrt{nm} + 1)\big)$
- $y = \begin{pmatrix} y' \\ (D_1^2 + D_3^2)^{-1}(b_2 - D_1^2 A^T y') \end{pmatrix}$
- $z' = \text{MMatrixSolve}\big(A_S,\ v_1 - A D_1^2 (D_1^2 + D_3^2)^{-1} v_2,\ \epsilon_2,\ d_{\min}^2,\ d_{\max}^2(U\sqrt{nm} + 1)\big)$
- $z = \begin{pmatrix} z' \\ (D_1^2 + D_3^2)^{-1}(v_2 - D_1^2 A^T z') \end{pmatrix}$
- Return $x = y - \dfrac{z\, z^T b}{1 + v^T z}$

Lemma B.2. $x = \text{Solve}(M + vv^T, b, \epsilon)$ satisfies
$$\big\|x - (M + vv^T)^{-1} b\big\|_{M + vv^T} < \epsilon\, \big\|(M + vv^T)^{-1} b\big\|_{M + vv^T}.$$

Proof. We first show that $\|y - M^{-1} b\|_M \le \epsilon_1 \|M^{-1} b\|_M$:
$$\begin{aligned}
\big\|y - M^{-1} b\big\|_M &= \big\|y' - M_S^{-1}\big(b_1 - A D_1^2 (D_1^2 + D_3^2)^{-1} b_2\big)\big\|_{M_S} && \text{(by Lemma B.1)} \\
&\le \epsilon_1\, \big\|M_S^{-1}\big(b_1 - A D_1^2 (D_1^2 + D_3^2)^{-1} b_2\big)\big\|_{M_S} && \text{(guaranteed by MMatrixSolve)} \\
&= \epsilon_1\, \big(\|M^{-1} b\|_M^2 - \|M_{22}^{-1} b_2\|_{M_{22}}^2\big)^{1/2} && \text{(by Lemma B.1)} \\
&\le \epsilon_1\, \|M^{-1} b\|_M.
\end{aligned}$$
By the same reasoning, $\|z - M^{-1} v\|_M \le \epsilon_2 \|M^{-1} v\|_M$. Next, let us define the inner product $\langle v_1, v_2 \rangle_M = v_1^T M v_2$.
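The rank-one update step at the end of Solve is just the Sherman-Morrison formula applied with the two solves $y \approx M^{-1}b$ and $z \approx M^{-1}v$. A minimal sketch (my own, with numpy's exact dense solver standing in for the two MMatrixSolve calls, and a generic SPD toy matrix in place of the structured $M$):

```python
import numpy as np

rng = np.random.default_rng(11)

# Sherman-Morrison: (M + v v^T)^{-1} b = y - z (z^T b) / (1 + v^T z),
# where y = M^{-1} b and z = M^{-1} v (note z^T b = v^T M^{-1} b by
# symmetry of M, which is why the algorithm may use z z^T b).
n = 5
B = rng.standard_normal((n, n))
M = B @ B.T + n * np.eye(n)        # any SPD matrix works for the identity
v = rng.standard_normal(n)
b = rng.standard_normal(n)

y = np.linalg.solve(M, b)          # stands in for the first MMatrixSolve
z = np.linalg.solve(M, v)          # stands in for the second
x = y - z * (z @ b) / (1.0 + v @ z)

assert np.allclose((M + np.outer(v, v)) @ x, b)
```

With exact solves the identity is exact; the proof of Lemma B.2 quantifies how the errors of the two approximate solves propagate through this formula.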
We will use repeatedly the inequality $|\langle v_1, v_2 \rangle_M| \le \|v_1\|_M \|v_2\|_M$.

Recall that we return the value $x = y - z\, \frac{z^T b}{1 + v^T z}$. So we begin by analyzing the expressions $z z^T b$ and $v^T z$:
\begin{align*}
\big\| z z^T b - M^{-1} v v^T M^{-1} b \big\|_M
&\le \big\| z z^T b - z\, v^T M^{-1} b \big\|_M + \big\| z\, v^T M^{-1} b - M^{-1} v v^T M^{-1} b \big\|_M \\
&= \big| \langle z - M^{-1}v,\ M^{-1}b \rangle_M \big|\, \|z\|_M
 + \big| \langle M^{-1}v,\ M^{-1}b \rangle_M \big|\, \|z - M^{-1}v\|_M \\
&\le \|z - M^{-1}v\|_M\, \|M^{-1}b\|_M\, \|z\|_M + \|M^{-1}v\|_M\, \|M^{-1}b\|_M\, \|z - M^{-1}v\|_M \\
&= \|z - M^{-1}v\|_M\, \|M^{-1}b\|_M \big( \|z\|_M + \|M^{-1}v\|_M \big) \\
&\le \|z - M^{-1}v\|_M\, \|M^{-1}b\|_M \big( \|z - M^{-1}v\|_M + 2\|M^{-1}v\|_M \big) \\
&\le \epsilon_2(\epsilon_2 + 2)\, \|M^{-1}v\|_M^2\, \|M^{-1}b\|_M \tag{3}
\end{align*}
\begin{align*}
\big| v^T z - v^T M^{-1} v \big|
= \big| \langle M^{-1}v,\ z - M^{-1}v \rangle_M \big|
\le \|M^{-1}v\|_M\, \|z - M^{-1}v\|_M
\le \epsilon_2\, \|M^{-1}v\|_M^2
= \epsilon_2\, (v^T M^{-1} v) \tag{4}
\end{align*}
We thus have
\begin{align*}
&\big\| x - (M + vv^T)^{-1} b \big\|_M \tag{5}\\
&\quad= \left\| \left( y - \frac{z z^T b}{1 + v^T z} \right) - \left( M^{-1}b - \frac{M^{-1} v v^T M^{-1} b}{1 + v^T M^{-1} v} \right) \right\|_M \\
&\quad\le \|y - M^{-1}b\|_M
 + \left\| \frac{z z^T b}{1 + v^T z} - \frac{M^{-1} v v^T M^{-1} b}{1 + v^T z} \right\|_M
 + \left\| \frac{M^{-1} v v^T M^{-1} b}{1 + v^T z} - \frac{M^{-1} v v^T M^{-1} b}{1 + v^T M^{-1} v} \right\|_M \\
&\quad= \|y - M^{-1}b\|_M
 + \frac{1}{1 + v^T z}\left( \big\| z z^T b - M^{-1} v v^T M^{-1} b \big\|_M
 + \frac{| v^T z - v^T M^{-1} v |}{1 + v^T M^{-1} v}\, \big\| M^{-1} v v^T M^{-1} b \big\|_M \right) \\
&\quad\le \|y - M^{-1}b\|_M
 + \frac{1}{v^T z}\left( \big\| z z^T b - M^{-1} v v^T M^{-1} b \big\|_M
 + \frac{| v^T z - v^T M^{-1} v |}{v^T M^{-1} v}\, \big\| M^{-1} v v^T M^{-1} b \big\|_M \right) \\
&\quad\le \|y - M^{-1}b\|_M
 + \frac{1}{(1 - \epsilon_2)\, v^T M^{-1} v}\Big( \epsilon_2(\epsilon_2 + 2)\, \|M^{-1}v\|_M^2\, \|M^{-1}b\|_M
 + \epsilon_2\, \big\| M^{-1} v v^T M^{-1} b \big\|_M \Big)
 \quad \text{(by equations 3 and 4)} \\
&\quad= \|y - M^{-1}b\|_M
 + \frac{ \epsilon_2(\epsilon_2 + 2)\, \|M^{-1}v\|_M^2\, \|M^{-1}b\|_M
 + \epsilon_2\, \big| \langle M^{-1}v,\ M^{-1}b \rangle_M \big|\, \|M^{-1}v\|_M }{ (1 - \epsilon_2)\, \|M^{-1}v\|_M^2 } \\
&\quad\le \|y - M^{-1}b\|_M
 + \frac{ \epsilon_2(\epsilon_2 + 2)\, \|M^{-1}v\|_M^2\, \|M^{-1}b\|_M
 + \epsilon_2\, \|M^{-1}v\|_M^2\, \|M^{-1}b\|_M }{ (1 - \epsilon_2)\, \|M^{-1}v\|_M^2 }
\end{align*}
 M − 1 b   M (1 − ǫ 2 ) k M − 1 v k 2 M =   y − M − 1 b   M + ǫ 2 ( ǫ 2 + 3) 1 − ǫ 2   M − 1 b   M ≤  ǫ 1 + ǫ 2 ( ǫ 2 + 3) 1 − ǫ 2    M − 1 b   M ≤ ǫ (1 + v T M − 1 v ) − 1   M − 1 b   M (6) So we conclude   x − ( M + v v T ) − 1 b   M + v v T ≤ (1 + v T M − 1 v ) 1 / 2   x − ( M + v v T ) − 1 b   M b y Lemma B.3(i) ≤ ǫ (1 + v T M − 1 v ) − 1 / 2   M − 1 b   M b y equation (6) = ǫ (1 + v T M − 1 v ) − 1 / 2 k b k M − 1 ≤ ǫ k b k ( M + v v T ) − 1 b y Lemma B.3(ii) = ǫ   ( M + v v T ) − 1 b   M + v v T Lemma B.3. F or al l ve c tors v , w , and symmetric p ositive definite M : (i) k w k M + v v T ≤ k w k M (1 + v T M − 1 v ) 1 / 2 (ii) k w k ( M + v v T ) − 1 ≥ k w k M − 1 (1 + v T M − 1 v ) − 1 / 2 26 Pr o of of (i). k w k M + v v T = ( w T ( M + v v T ) w ) 1 / 2 = ( w T M w + ( w T v ) 2 ) 1 / 2 = ( k w k 2 M +  w , M − 1 v  2 M ) 1 / 2 ≤ ( k w k 2 M + k w k 2 M   M − 1 v   2 M ) 1 / 2 = k w k M (1 +   M − 1 v   2 M ) 1 / 2 Pr o of of (ii). k w k ( M + v v T ) − 1 = ( w T ( M + v v T ) − 1 w ) 1 / 2 =  w T  M − 1 − M − 1 v v T M − 1 1 + v T M − 1 v  w  1 / 2 =  w T M − 1 w − w T M − 1 v v T M − 1 w 1 + v T M − 1 v  1 / 2 = k w k 2 M − 1 − h w , v i 2 M − 1 1 + k v k 2 M − 1 ! 1 / 2 ≥ k w k 2 M − 1 − k w k 2 M − 1 k v k 2 M − 1 1 + k v k 2 M − 1 ! 1 / 2 = k w k M − 1 1 − k v k 2 M − 1 1 + k v k 2 M − 1 ! 1 / 2 = k w k M − 1  1 + k v k 2 M − 1  − 1 / 2 27 C In terior-P oin t Metho d using an Appro ximate Solv er Throughout this section, we tak e Solve to b e an algorithm suc h that x = Solve ( M , b , ǫ ) satisfies   x − M − 1 b   M ≤ ǫ   M − 1 b   W e use the notational con ven tion that S denotes the diagonal matrix whose diagonal is s . The same app lies for X and x , etc. 1 k denotes th e all-ones v ector of length k . ˚ Ω den otes the in terior of p olytop e Ω. 
We are given a canonical primal linear program
\[
z^* = \min_x \{\, c^T x : A x = b;\ x \ge 0 \,\},
\]
which has the same optimal value as the dual linear program
\[
z^* = \max_{(y,s)} \{\, b^T y : A^T y + s = c;\ s \ge 0 \,\},
\]
where $A$ is an $n \times m$ matrix, $x, s, c$ have length $m$, $y, b$ have length $n$, and $m \ge n$. (Unfortunately, this use of $n$ and $m$ is reversed from the standard linear programming convention. We do this to be consistent with the standard graph-theory convention that we use throughout the paper.)

We let $\Omega_D$ denote the dual polytope
\[
\Omega_D = \{\, (y, s) : A^T y + s = c;\ s \ge 0 \,\},
\]
so we can write the solution to the linear program as $z^* = \max_{(y,s) \in \Omega_D} b^T y$.

In this appendix, we present an InteriorPoint algorithm based on that of Renegar [Ren88], modified to use an approximate solver. Our analysis follows that found in [Ye97].

Theorem 2.1. $x = \text{InteriorPoint}(A, b, c, \lambda_{\min}, T, y_0, \epsilon)$ takes input that satisfies
• $A$ is an $n \times m$ matrix; $b$ is a length-$n$ vector; $c$ is a length-$m$ vector
• $AA^T$ is positive definite, and $\lambda_{\min} > 0$ is a lower bound on the eigenvalues of $AA^T$
• $T > 0$ is an upper bound on the absolute values of the dual coordinates, i.e. $\|y\|_\infty < T$ and $\|s\|_\infty < T$ for all $(y, s)$ that satisfy $s = c - A^T y \ge 0$
• the initial point $y_0$ is a length-$n$ vector where $A^T y_0 < c$
• the error parameter $\epsilon$ satisfies $0 < \epsilon < 1$
and returns $x > 0$ satisfying $\|Ax - b\| \le \epsilon$ and $z^* < c^T x < z^* + \epsilon$.
Let us define:
• $U$ is the largest absolute value of any entry in $A, b, c$
• $s^0_{\min}$ is the smallest entry of $s_0 = c - A^T y_0$
Then the algorithm makes $O\!\left( \sqrt{m}\, \log \frac{T U m}{\lambda_{\min} s^0_{\min} \epsilon} \right)$ calls to the approximate solver, of the form
\[
\text{Solve}\big( A S^{-2} A^T + vv^T,\ \cdot\,,\ \epsilon' \big),
\]
where $S$ is a positive diagonal matrix with condition number $O\!\left( \frac{T^2 U m^2}{\epsilon} \right)$, and $v, \epsilon'$ satisfy $\log \frac{\|v\|}{\epsilon'} = O\!\left( \log \frac{T U m}{s^0_{\min} \epsilon} \right)$.

C.1 The Analytic Center

Standard interior-point methods focus on a particular point in the interior of the dual polytope. This point, called the analytic center, is the point that maximizes the product of the slacks, i.e. the product of the elements of $s$. For the purpose of our analysis, we use the following equivalent definition of the analytic center:

Fact C.1 (see [Ye97, §3.1]). The analytic center of $\Omega_D = \{(y, s) : A^T y + s = c;\ s \ge 0\}$ is the unique point $(y^*, s^*) \in \mathring\Omega_D$ that satisfies $\eta_A(s^*) = 0$, where we define
\begin{align*}
x_A(s) &= S^{-1}\big( I - S^{-1} A^T (A S^{-2} A^T)^{-1} A S^{-1} \big) \mathbf{1}_m \\
\eta_A(s) &= \big\| S\, x_A(s) - \mathbf{1}_m \big\|
 = \big\| S^{-1} A^T (A S^{-2} A^T)^{-1} A S^{-1} \mathbf{1}_m \big\|
 = \big\| A S^{-1} \mathbf{1}_m \big\|_{(A S^{-2} A^T)^{-1}} .
\end{align*}

These definitions of $x_A$ and $\eta_A$ satisfy the following properties:

Lemma C.2. Let $(y^*, s^*)$ be the analytic center of $\Omega_D$. For any point $(y, s) \in \mathring\Omega_D$ we have
(i) $A\, x_A(s) = 0$
(ii) $\eta_A(s) < 1$ implies $x_A(s) > 0$
(iii) $x_A(s^*) = (S^*)^{-1} \mathbf{1}_m$
(iv) for all $x$ satisfying $A x = 0$, it holds that $\| S x - \mathbf{1}_m \| \ge \eta_A(s)$.

The first three properties are straightforward from the definition. We present a proof of the last:

Proof of C.2(iv).
Note that $S x_A(s) - \mathbf{1}_m = -S^{-1} A^T (A S^{-2} A^T)^{-1} A S^{-1} \mathbf{1}_m$ is orthogonal to $S(x - x_A(s))$, because
\begin{align*}
\big\langle S x_A(s) - \mathbf{1}_m,\ S(x - x_A(s)) \big\rangle
&= -\big\langle S^{-1} A^T (A S^{-2} A^T)^{-1} A S^{-1} \mathbf{1}_m,\ S(x - x_A(s)) \big\rangle \\
&= -\big\langle (A S^{-2} A^T)^{-1} A S^{-1} \mathbf{1}_m,\ A(x - x_A(s)) \big\rangle \\
&= -\big\langle (A S^{-2} A^T)^{-1} A S^{-1} \mathbf{1}_m,\ 0 \big\rangle = 0 .
\end{align*}
We thus have
\[
\| S x - \mathbf{1}_m \|
= \big\| S x_A(s) - \mathbf{1}_m + S(x - x_A(s)) \big\|
\ge \big\| S x_A(s) - \mathbf{1}_m \big\| = \eta_A(s) .
\]

It will be useful to note that the slacks of the analytic center cannot be too small. We can bound the slacks of the analytic center away from zero as follows:

Lemma C.3 (compare [Ye97, Thm 2.6]). Let $(y^*, s^*)$ be the analytic center of $\Omega_D$. For every $(y, s) \in \Omega_D$, we have $s^* \ge \frac{1}{m}\, s$.

Proof.
\begin{align*}
\big\| (S^*)^{-1} s \big\|_\infty
\le \mathbf{1}_m^T (S^*)^{-1} s
&= \mathbf{1}_m^T (S^*)^{-1} s^* + \mathbf{1}_m^T (S^*)^{-1} (s - s^*) \\
&= m + \mathbf{1}_m^T (S^*)^{-1} \big( (c - A^T y) - (c - A^T y^*) \big) \\
&= m + \mathbf{1}_m^T (S^*)^{-1} A^T (y^* - y) = m ,
\end{align*}
where we know from Lemmas C.2(i) and C.2(iii) that $A (S^*)^{-1} \mathbf{1}_m = 0$.

$y^+ = \text{NewtonStep}(A, c, y)$:
• Let $s = c - A^T y$
• Let $d_y = \text{Solve}\big( A S^{-2} A^T,\ -A S^{-1} \mathbf{1}_m,\ \epsilon_3 \big)$, where $\epsilon_3 = \frac{1}{20(\sqrt{m} + 1)}$
• Return $y^+ = y + (1 - \epsilon_3)\, d_y$

Figure 2: Procedure for stepping closer to the analytic center

Let us note that a point $(y, s) \in \mathring\Omega_D$ that satisfies $\eta_A(s) < 1$ is close to the analytic center, in the sense that the slacks $s$ are bounded by a constant ratio from the slacks of the analytic center:

Lemma C.4 ([Ye97, Thm 3.2(iv)]). Suppose $(y, s) \in \mathring\Omega_D$ satisfies $\eta_A(s) = \eta < 1$, and let $(y^*, s^*)$ be the analytic center of $\Omega_D$. Then $\big\| S^{-1} s^* - \mathbf{1}_m \big\| \le \frac{\eta}{1 - \eta}$.

If $(y, s) \in \mathring\Omega_D$ is sufficiently close to the analytic center (as measured by $\eta_A$), then with a single call to the approximate solver, we can take a Newton-type step to find a point even closer to the analytic center. This NewtonStep procedure is presented in Figure 2.
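The NewtonStep procedure of Figure 2 can be sketched in a few lines of NumPy. The instance below is made up and chosen so that its analytic center is known by symmetry ($y = 0$, $s = \mathbf{1}$), the starting point has $\eta_A < 1$ so that Lemma C.5 applies, and an exact dense solve stands in for the approximate Solve.

```python
import numpy as np

# A tiny made-up dual polytope; by symmetry its analytic center is
# y = 0, s = (1,1,1,1), where A S^{-1} 1 = 0 and hence eta_A(s) = 0.
A = np.array([[1., -1., 0., 0.],
              [0., 0., 1., -1.]])
c = np.ones(4)
n, m = A.shape
eps3 = 1.0 / (20 * (np.sqrt(m) + 1))

def eta(s):
    # eta_A(s) = || A S^{-1} 1_m ||_{(A S^{-2} A^T)^{-1}}
    g = A @ (1.0 / s)
    return float(np.sqrt(g @ np.linalg.solve(A @ np.diag(s**-2.0) @ A.T, g)))

def newton_step(y):
    # Figure 2, with an exact dense solve standing in for Solve(., ., eps3);
    # the damping factor (1 - eps3) is kept from the approximate-solver version.
    s = c - A.T @ y
    dy = np.linalg.solve(A @ np.diag(s**-2.0) @ A.T, -A @ (1.0 / s))
    return y + (1.0 - eps3) * dy

y = np.array([0.1, -0.2])            # a strictly feasible start with eta < 1
for _ in range(20):
    assert np.all(c - A.T @ y > 0)   # Lemma C.5(i): iterates stay in the interior
    y = newton_step(y)

assert eta(c - A.T @ y) < 1e-6       # contraction toward the analytic center
```

The loop illustrates the guarantee of Lemma C.5 below: as long as $\eta_A(s) < 1$, each step keeps $s > 0$ and contracts $\eta_A$ as $\eta \mapsto \eta^2 + \eta/20$.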
In the first part of the following lemma, we prove that the point returned by NewtonStep is indeed still inside the dual polytope. In the second part, we show how close the new point is to the analytic center:

Lemma C.5 (compare [Ye97, Thm 3.3]). Suppose $(y, s) \in \mathring\Omega_D$ satisfies $\eta_A(s) = \eta < 1$. Let $y^+ = \text{NewtonStep}(A, c, y)$ and $s^+ = c - A^T y^+$. Then (i) $s^+ > 0$, and (ii) $\eta_A(s^+) \le \eta^2 + \frac{1}{20}\,\eta$.

Proof. (i) The solver guarantees that
\[
\big\| d_y + (A S^{-2} A^T)^{-1} A S^{-1} \mathbf{1} \big\|_{A S^{-2} A^T}
\le \epsilon_3\, \big\| (A S^{-2} A^T)^{-1} A S^{-1} \mathbf{1} \big\|_{A S^{-2} A^T}
= \epsilon_3 \cdot \eta,
\]
or equivalently
\[
\big\| S^{-1} A^T d_y + S^{-1} A^T (A S^{-2} A^T)^{-1} A S^{-1} \mathbf{1} \big\|
\le \epsilon_3\, \big\| S^{-1} A^T (A S^{-2} A^T)^{-1} A S^{-1} \mathbf{1} \big\|
= \epsilon_3 \cdot \eta, \tag{7}
\]
and so
\[
\big\| S^{-1} A^T d_y \big\|
\le (1 + \epsilon_3)\, \big\| S^{-1} A^T (A S^{-2} A^T)^{-1} A S^{-1} \mathbf{1} \big\|
= (1 + \epsilon_3)\,\eta < 1 + \epsilon_3 .
\]
We thus have
\[
\big\| S^{-1} s^+ - \mathbf{1} \big\|
= \big\| S^{-1} \big( s - (1 - \epsilon_3) A^T d_y \big) - \mathbf{1} \big\|
= (1 - \epsilon_3)\, \big\| S^{-1} A^T d_y \big\|
\le (1 - \epsilon_3)(1 + \epsilon_3)\,\eta < 1 .
\]
Thus $S^{-1} s^+$ is positive, and so is $s^+$.

(ii) Let $x = x_A(s)$ and $x^+ = x_A(s^+)$.
We have
\begin{align*}
\eta_A(s^+)
&\le \big\| X s^+ - \mathbf{1}_m \big\|
 && \text{(by Lemma C.2(iv), since $A x = 0$)} \\
&= \big\| X \big( s - (1 - \epsilon_3) A^T d_y \big) - \mathbf{1}_m \big\| \\
&= \big\| (1 - \epsilon_3)\, X S \big( X s - \mathbf{1}_m - S^{-1} A^T d_y \big)
 - (1 - \epsilon_3)(X S - I)(X s - \mathbf{1}_m) + \epsilon_3 (X s - \mathbf{1}_m) \big\| \\
&\le (1 - \epsilon_3)\, \big\| X S \big( S^{-1} A^T d_y - S x + \mathbf{1}_m \big) \big\|
 + (1 - \epsilon_3)\, \big\| (X S - I)(X s - \mathbf{1}_m) \big\|
 + \epsilon_3\, \| X s - \mathbf{1}_m \| \\
&\le (1 - \epsilon_3)\, \| S x \|\, \big\| S^{-1} A^T d_y - S x + \mathbf{1}_m \big\|
 + (1 - \epsilon_3)\, \| S x - \mathbf{1}_m \|^2 + \epsilon_3\, \| S x - \mathbf{1}_m \| \\
&\qquad \text{(using the relation $\| V w \| \le \|v\|_\infty \|w\| \le \|v\| \|w\|$)} \\
&\le (1 - \epsilon_3) \big( \| S x - \mathbf{1}_m \| + \| \mathbf{1}_m \| \big)\, \big\| S^{-1} A^T d_y - S x + \mathbf{1}_m \big\|
 + (1 - \epsilon_3)\,\eta^2 + \epsilon_3\,\eta \\
&= (1 - \epsilon_3)(\eta + \sqrt{m})\, \big\| S^{-1} A^T d_y + S^{-1} A^T (A S^{-2} A^T)^{-1} A S^{-1} \mathbf{1}_m \big\|
 + (1 - \epsilon_3)\,\eta^2 + \epsilon_3\,\eta \\
&\le (1 - \epsilon_3)(\eta + \sqrt{m})\, \epsilon_3\, \eta + (1 - \epsilon_3)\,\eta^2 + \epsilon_3\,\eta
 && \text{(by equation 7)} \\
&\le \epsilon_3 (\eta + \sqrt{m})\,\eta + (1 - \epsilon_3)\,\eta^2 + \epsilon_3\,\eta
= \eta^2 + \epsilon_3 (\sqrt{m} + 1)\,\eta
= \eta^2 + \tfrac{1}{20}\,\eta .
\end{align*}

C.2 The Path-Following Algorithm

In a path-following algorithm, we modify the dual polytope $\Omega_D = \{(y, s) : A^T y + s = c;\ s \ge 0\}$ by adding an additional constraint $b^T y \ge z$, where $z \le z^*$. As we let $z$ approach $z^*$, the center of the polytope approaches the solution to the dual linear program. Letting $s_{gap} = b^T y - z$ denote the new slack variable, we define the modified polytope:
\[
\Omega^D_{b,z} = \left\{ (y, s, s_{gap}) :
\begin{pmatrix} A^T y + s \\ -b^T y + s_{gap} \end{pmatrix}
= \begin{pmatrix} c \\ -z \end{pmatrix};\
s, s_{gap} \ge 0 \right\} .
\]
Using a trick of Renegar, when we define the analytic center of $\Omega^D_{b,z}$, we consider there to be $m$ copies of the slack $s_{gap}$, as follows:

Definition C.6.
The analytic center of $\Omega^D_{b,z}$ is the point $(y^*, s^*, s^*_{gap}) \in \mathring\Omega^D_{b,z}$ that satisfies $\tilde\eta(s^*, s^*_{gap}) = 0$, where we define
\[
\tilde\eta(s, s_{gap}) = \eta_{\tilde A}(\tilde s),
\qquad \text{where } \tilde A = \begin{pmatrix} A & -b\, \mathbf{1}_m^T \end{pmatrix}
\text{ and } \tilde s = \begin{pmatrix} s \\ s_{gap}\, \mathbf{1}_m \end{pmatrix} .
\]
The central path is the set of analytic centers of the polytopes $\big\{ \Omega^D_{b,z} \big\}_{z \le z^*}$. A path-following algorithm steps through a sequence of points near the central path, as $z$ increases towards $z^*$.

It is useful to note that, given any point on the central path, we may easily construct a feasible primal solution $x$, as follows:

Lemma C.7. Let $(y^*, s^*, s^*_{gap})$ be the analytic center of $\Omega^D_{b,z}$. Then the vector $x = \frac{s^*_{gap}}{m}\, (S^*)^{-1} \mathbf{1}_m$ satisfies $A x = b$. More generally, for any $(y, s, s_{gap}) \in \mathring\Omega^D_{b,z}$, the vector $x = \frac{s_{gap}}{m}\, S^{-1} \mathbf{1}_m$ satisfies
\[
\| A x - b \|_{(\tilde A \tilde S^{-2} \tilde A^T)^{-1}} = \frac{s_{gap}}{m} \cdot \tilde\eta(s, s_{gap}) .
\]

$(y^+, s^+, s^+_{gap}, z^+) = \text{Shift}(y, s, s_{gap}, z)$:
• Let $z^+ = z + \frac{s_{gap}}{10\sqrt{m}}$
• Let $y^+ = \text{NewtonStep}(\tilde A, \tilde c, y)$, where $\tilde A = \begin{pmatrix} A & -b\,\mathbf{1}_m^T \end{pmatrix}$ and $\tilde c = \begin{pmatrix} c \\ -z^+ \mathbf{1}_m \end{pmatrix}$
• Let $\begin{pmatrix} s^+ \\ s^+_{gap} \end{pmatrix} = \begin{pmatrix} c - A^T y^+ \\ b^T y^+ - z^+ \end{pmatrix}$

Figure 3: Procedure for taking a step along the central path

Proof. We prove the second assertion:
\[
\| A x - b \|_{(\tilde A \tilde S^{-2} \tilde A^T)^{-1}}
= \frac{s_{gap}}{m} \big\| A S^{-1} \mathbf{1}_m - m\, s_{gap}^{-1}\, b \big\|_{(\tilde A \tilde S^{-2} \tilde A^T)^{-1}}
= \frac{s_{gap}}{m} \big\| \tilde A \tilde S^{-1} \mathbf{1}_{2m} \big\|_{(\tilde A \tilde S^{-2} \tilde A^T)^{-1}}
= \frac{s_{gap}}{m} \cdot \tilde\eta(s, s_{gap}) .
\]
The first assertion now follows from the definition of the analytic center, for which $\tilde\eta(s^*, s^*_{gap}) = 0$.

Let us now describe how to take steps along the central path using our approximate solver. In Figure 3, we present the procedure Shift, which takes as input a value $z < z^*$ and a point $(y, s, s_{gap}) \in \mathring\Omega^D_{b,z}$ satisfying $\tilde\eta(s, s_{gap}) \le \frac{1}{10}$. The output is a new value $z^+$ that is closer to $z^*$, and a new point $(y^+, s^+, s^+_{gap}) \in \mathring\Omega^D_{b,z^+}$ satisfying $\tilde\eta(s^+, s^+_{gap}) \le \frac{1}{10}$.
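The second assertion of Lemma C.7 above is an exact identity at any interior point, so it can be checked numerically without computing an analytic center. Here is a minimal NumPy sketch on a made-up instance (all data below is illustrative):

```python
import numpy as np

# Check the identity of Lemma C.7 at an arbitrary interior point of Omega_{b,z}:
# for x = (s_gap / m) S^{-1} 1_m,
#   || A x - b ||_{(At St^-2 At^T)^-1}  =  (s_gap / m) * eta~(s, s_gap).
A = np.array([[1., -1., 0., 0.],
              [0., 0., 1., -1.]])
b = np.array([0.3, -0.2])
c = np.ones(4)
n, m = A.shape
z = -2.0                                  # any z with b^T y - z > 0 below

y = np.array([0.1, -0.2])                 # an interior point: s > 0 and s_gap > 0
s = c - A.T @ y
s_gap = b @ y - z
assert np.all(s > 0) and s_gap > 0

A_t = np.hstack([A, -np.outer(b, np.ones(m))])      # Renegar's m-fold gap slack
s_t = np.concatenate([s, s_gap * np.ones(m)])
N = A_t @ np.diag(s_t**-2.0) @ A_t.T                # At St^-2 At^T
g = A_t @ (1.0 / s_t)
eta_t = np.sqrt(g @ np.linalg.solve(N, g))          # eta~(s, s_gap)

x = (s_gap / m) * (1.0 / s)
r = A @ x - b
lhs = np.sqrt(r @ np.linalg.solve(N, r))            # || A x - b || in the N^{-1} norm
assert np.isclose(lhs, (s_gap / m) * eta_t)
```

The key step is that $A x - b = \frac{s_{gap}}{m}\,\tilde A \tilde S^{-1}\mathbf{1}_{2m}$, which is exactly how the proof of Lemma C.7 proceeds.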
The procedure requires a single call to the solver.

Let us examine this procedure more closely. After defining the incremented value $z^+$, if we let $s'_{gap} = b^T y - z^+ = s_{gap} - (z^+ - z)$, then $(y, s, s'_{gap})$ is a point in the shifted polytope $\Omega^D_{b,z^+}$. However, this point may be slightly farther away from the central path. One call to the NewtonStep procedure suffices to obtain a new point $(y^+, s^+, s^+_{gap}) \in \mathring\Omega^D_{b,z^+}$ that is sufficiently close to the central path, satisfying $\tilde\eta(s^+, s^+_{gap}) \le \frac{1}{10}$. We prove this formally:

Lemma C.8 (compare [Ye97, Lem 4.5]). Given $z < z^*$ and $(y, s, s_{gap}) \in \mathring\Omega^D_{b,z}$ where $\tilde\eta(s, s_{gap}) \le \frac{1}{10}$, let $s'_{gap} = b^T y - z^+$ and $(y^+, s^+, s^+_{gap}, z^+) = \text{Shift}(y, s, s_{gap}, z)$. Then
(i) $z^+ < z^*$
(ii) $s'_{gap} > 0$ and $\tilde\eta(s, s'_{gap}) < \frac{21}{100}$
(iii) $s^+, s^+_{gap} > 0$ and $\tilde\eta(s^+, s^+_{gap}) < \frac{1}{10}$.

Proof. (i) $z^+ = z + \frac{b^T y - z}{10\sqrt{m}} < z + (b^T y - z) = b^T y < z^*$.

(ii) We note that $s'_{gap} = s_{gap} - (z^+ - z) = \big( 1 - \frac{1}{10\sqrt{m}} \big)\, s_{gap} > 0$.

Let us write $\tilde s = \begin{pmatrix} s \\ s_{gap}\mathbf{1}_m \end{pmatrix}$ and $\tilde s' = \begin{pmatrix} s \\ s'_{gap}\mathbf{1}_m \end{pmatrix}$, and note that
\[
\tilde s - \tilde s'
= \begin{pmatrix} 0 \\ (s_{gap} - s'_{gap})\mathbf{1}_m \end{pmatrix}
= \begin{pmatrix} 0 \\ (z^+ - z)\mathbf{1}_m \end{pmatrix}
= \begin{pmatrix} 0 \\ \frac{s_{gap}}{10\sqrt{m}}\mathbf{1}_m \end{pmatrix} . \tag{8}
\]
Let us define $\tilde x = \begin{pmatrix} x \\ x_{gap} \end{pmatrix} = x_{\tilde A}(\tilde s)$.
So we have
\begin{align*}
\tilde\eta(s, s'_{gap}) = \eta_{\tilde A}(\tilde s')
&\le \big\| \tilde S' \tilde x - \mathbf{1}_{2m} \big\|
 && \text{(by Lemma C.2(iv))} \\
&\le \big\| \tilde S \tilde x - \mathbf{1}_{2m} \big\| + \big\| (\tilde S' - \tilde S)\,\tilde x \big\| \\
&= \big\| \tilde S \tilde x - \mathbf{1}_{2m} \big\| + \frac{1}{10\sqrt{m}}\, \| s_{gap}\, x_{gap} \|
 && \text{(by Equation 8)} \\
&\le \big\| \tilde S \tilde x - \mathbf{1}_{2m} \big\| + \frac{1}{10\sqrt{m}} \big( \| s_{gap}\, x_{gap} - \mathbf{1}_m \| + \| \mathbf{1}_m \| \big) \\
&= \big\| \tilde S \tilde x - \mathbf{1}_{2m} \big\| + \frac{1}{10\sqrt{m}}\, \| s_{gap}\, x_{gap} - \mathbf{1}_m \| + \frac{1}{10} \\
&\le \big\| \tilde S \tilde x - \mathbf{1}_{2m} \big\| + \frac{1}{10}\, \| s_{gap}\, x_{gap} - \mathbf{1}_m \| + \frac{1}{10} \\
&\le \big\| \tilde S \tilde x - \mathbf{1}_{2m} \big\| + \frac{1}{10}\, \big\| \tilde S \tilde x - \mathbf{1}_{2m} \big\| + \frac{1}{10}
= \frac{11}{10}\, \tilde\eta(s, s_{gap}) + \frac{1}{10} \tag{9}\\
&\le \frac{11}{10} \cdot \frac{1}{10} + \frac{1}{10} = \frac{21}{100} .
\end{align*}

(iii) By Lemma C.5, we have $s^+, s^+_{gap} > 0$ and
\[
\tilde\eta(s^+, s^+_{gap})
\le \tilde\eta(s, s'_{gap})^2 + \tfrac{1}{20}\, \tilde\eta(s, s'_{gap})
\le \left( \tfrac{21}{100} \right)^2 + \tfrac{1}{20} \cdot \tfrac{21}{100}
< \tfrac{1}{10} .
\]

We now present the complete path-following InteriorPoint algorithm, implemented using an approximate solver, in Figure 4. For now we postpone describing the FindCentralPath subroutine, which gives an initial point near the central path. In particular, it produces a $z^C < z^*$ and $(y^C, s^C, s^C_{gap}) \in \Omega^D_{b,z^C}$ satisfying $\tilde\eta(s^C, s^C_{gap}) \le \frac{1}{10}$. Once we have this initial central path point, Lemma C.8 tells us that after each call to Shift we have a new value $z < z^*$ and a new central path point $(y, s, s_{gap}) \in \mathring\Omega^D_{b,z}$ that satisfies $\tilde\eta(s, s_{gap}) \le \frac{1}{10}$. Later we will analyze the number of calls to Shift before the algorithm terminates. First, let us confirm that the algorithm returns the correct output:

Lemma C.9.
The output of $x = \text{InteriorPoint}(A, b, c, y_0, \epsilon)$ satisfies
(i) $x > 0$
(ii) $\| A x - b \| \le \frac{\epsilon}{12\sqrt{2} \cdot T n^{1/2}}$
(iii) $c^T x < z^* + \epsilon$.

$x = \text{InteriorPoint}(A, b, c, y_0, \epsilon)$:
• Compute $(y^C, z^C) = \text{FindCentralPath}(A, b, c, y_0)$ and $\begin{pmatrix} s^C \\ s^C_{gap} \end{pmatrix} = \begin{pmatrix} c - A^T y^C \\ b^T y^C - z^C \end{pmatrix}$
• Set $(y, s, s_{gap}, z) := (y^C, s^C, s^C_{gap}, z^C)$
• While $s_{gap} > \frac{\epsilon}{3}$: set $(y, s, s_{gap}, z) := \text{Shift}(y, s, s_{gap}, z)$
• Compute $v = \text{Solve}\big( \tilde A \tilde S^{-2} \tilde A^T,\ \tilde A \tilde S^{-1} \mathbf{1}_{2m},\ \epsilon_4 \big)$, where $\tilde A = \begin{pmatrix} A & -b\,\mathbf{1}_m^T \end{pmatrix}$, $\tilde s = \begin{pmatrix} s \\ s_{gap}\mathbf{1}_m \end{pmatrix}$, $s_{\min}$ is the smallest entry of $\tilde s$, and $\epsilon_4 = \min\left( 1,\ \frac{m^{1/2}\, s_{\min}}{n\, T U} \right)$
• Return $x = \frac{x'}{m\, x'_{gap}}$, where $\begin{pmatrix} x' \\ x'_{gap} \end{pmatrix} = \begin{pmatrix} S^{-1}\mathbf{1}_m - S^{-2} A^T v \\ s_{gap}^{-1} + s_{gap}^{-2}\, b^T v \end{pmatrix}$

Figure 4: Dual path-following interior-point algorithm using an approximate solver

Proof. (i) To assist in our proof, let us define $\tilde x' = \begin{pmatrix} x' \\ x'_{gap}\mathbf{1}_m \end{pmatrix}$, and note that $\tilde x' = \tilde S^{-1}\mathbf{1}_{2m} - \tilde S^{-2}\tilde A^T v$. We have
\begin{align*}
\big\| \tilde S \tilde x' - \mathbf{1}_{2m} \big\|
&= \big\| \tilde S^{-1} \tilde A^T v \big\| = \| v \|_{\tilde A \tilde S^{-2} \tilde A^T} \\
&\le \big\| (\tilde A \tilde S^{-2} \tilde A^T)^{-1} \tilde A \tilde S^{-1} \mathbf{1}_{2m} \big\|_{\tilde A \tilde S^{-2} \tilde A^T}
 + \big\| v - (\tilde A \tilde S^{-2} \tilde A^T)^{-1} \tilde A \tilde S^{-1} \mathbf{1}_{2m} \big\|_{\tilde A \tilde S^{-2} \tilde A^T} \\
&\le (1 + \epsilon_4)\, \big\| (\tilde A \tilde S^{-2} \tilde A^T)^{-1} \tilde A \tilde S^{-1} \mathbf{1}_{2m} \big\|_{\tilde A \tilde S^{-2} \tilde A^T}
 && \text{(by the guarantee of Solve)} \\
&= (1 + \epsilon_4) \cdot \tilde\eta(s, s_{gap})
\le 2 \cdot \tilde\eta(s, s_{gap}) \le 2 \cdot \tfrac{1}{10} \le \tfrac{1}{5} . \tag{10}
\end{align*}
Since $\tilde s$ is positive, we conclude that $\tilde x'$ must also be positive, and so must be $x$.

(ii) We have
\[
\| A x - b \|
= \frac{1}{m\, x'_{gap}} \big\| A x' - m\, x'_{gap}\, b \big\|
= \frac{1}{m\, x'_{gap}} \big\| \tilde A \tilde x' \big\|
= \frac{1}{m\, x'_{gap}} \big\| \tilde A \tilde S^{-1}\mathbf{1}_{2m} - \tilde A \tilde S^{-2} \tilde A^T v \big\| .
\]
Observe that the largest eigenvalue of the matrix $\tilde A \tilde A^T = A A^T + m\, b b^T$ is less than its trace, which is at most $2 n m U^2$. Thus, the largest eigenvalue of $\tilde A \tilde S^{-2} \tilde A^T$ is at most $2 n m U^2 s_{\min}^{-2}$.
So we proceed:
\begin{align*}
\| A x - b \|
&\le \frac{(2 n m U^2 s_{\min}^{-2})^{1/2}}{m\, x'_{gap}}\,
 \big\| \tilde A \tilde S^{-1}\mathbf{1}_{2m} - \tilde A \tilde S^{-2} \tilde A^T v \big\|_{(\tilde A \tilde S^{-2} \tilde A^T)^{-1}} \\
&= \frac{(2n)^{1/2}\, U}{m^{1/2}\, x'_{gap}\, s_{\min}}\,
 \big\| (\tilde A \tilde S^{-2} \tilde A^T)^{-1} \tilde A \tilde S^{-1}\mathbf{1}_{2m} - v \big\|_{\tilde A \tilde S^{-2} \tilde A^T} \\
&\le \frac{(2n)^{1/2}\, U}{m^{1/2}\, x'_{gap}\, s_{\min}} \cdot \epsilon_4\,
 \big\| (\tilde A \tilde S^{-2} \tilde A^T)^{-1} \tilde A \tilde S^{-1}\mathbf{1}_{2m} \big\|_{\tilde A \tilde S^{-2} \tilde A^T}
 && \text{(by the guarantee of Solve)} \\
&= \frac{(2n)^{1/2}\, U}{m^{1/2}\, x'_{gap}\, s_{\min}} \cdot \epsilon_4 \cdot \tilde\eta(s, s_{gap})
\le \frac{(2n)^{1/2}\, U}{m^{1/2}\, x'_{gap}\, s_{\min}} \cdot \epsilon_4 \cdot \frac{1}{10} \\
&\le \frac{(2n)^{1/2}\, U}{m^{1/2}\, x'_{gap}\, s_{\min}} \cdot \frac{m^{1/2}\, s_{\min}}{n\, T U} \cdot \frac{1}{10}
= \frac{1}{5\sqrt{2} \cdot T n^{1/2}} \cdot \frac{1}{x'_{gap}} \\
&\le \frac{1}{5\sqrt{2} \cdot T n^{1/2}} \cdot \frac{5}{4} \cdot s_{gap}
 && \text{(from equation 10, we know $s_{gap}\, x'_{gap} \ge \tfrac{4}{5}$)} \\
&\le \frac{1}{5\sqrt{2} \cdot T n^{1/2}} \cdot \frac{5}{4} \cdot \frac{\epsilon}{3}
= \frac{\epsilon}{12\sqrt{2} \cdot T n^{1/2}} . \tag{11}
\end{align*}

(iii) We have
\begin{align*}
s^T x = \frac{s^T x'}{m\, x'_{gap}}
&\ge \frac{5\, s^T x'}{6m} \cdot s_{gap}
 && \text{(from equation 10, we know $s_{gap}\, x'_{gap} \le \tfrac{6}{5}$)} \\
&= \frac{5\, \| S x' \|_1}{6m} \cdot s_{gap}
\ge \frac{5}{6m} \big( \| \mathbf{1}_m \|_1 - \| S x' - \mathbf{1}_m \|_1 \big) \cdot s_{gap}
= \frac{5}{6m} \big( m - \| S x' - \mathbf{1}_m \|_1 \big) \cdot s_{gap} \\
&\ge \frac{5}{6m} \big( m - \sqrt{m}\, \| S x' - \mathbf{1}_m \| \big) \cdot s_{gap}
\ge \frac{5}{6m} \big( m - \tfrac{1}{5}\sqrt{m} \big) \cdot s_{gap}
 && \text{(by equation 10)} \\
&\ge \frac{5}{6m} \cdot \frac{4}{5} \cdot m \cdot s_{gap} = \frac{2}{3} \cdot s_{gap} , \tag{12}
\end{align*}
and similarly
\begin{align*}
s^T x = \frac{s^T x'}{m\, x'_{gap}}
&\le \frac{5\, s^T x'}{4m} \cdot s_{gap}
 && \text{(from equation 10, we know $s_{gap}\, x'_{gap} \ge \tfrac{4}{5}$)} \\
&= \frac{5\, \| S x' \|_1}{4m} \cdot s_{gap}
\le \frac{5}{4m} \big( \| S x' - \mathbf{1}_m \|_1 + \| \mathbf{1}_m \|_1 \big) \cdot s_{gap}
= \frac{5}{4m} \big( \| S x' - \mathbf{1}_m \|_1 + m \big) \cdot s_{gap} \\
&\le \frac{5}{4m} \big( \sqrt{m}\, \| S x' - \mathbf{1}_m \| + m \big) \cdot s_{gap}
\le \frac{5}{4m} \big( \tfrac{1}{5}\sqrt{m} + m \big) \cdot s_{gap}
 && \text{(by equation 10)} \\
&\le \frac{5}{4m} \cdot \frac{6}{5} \cdot m \cdot s_{gap} = \frac{3}{2} \cdot s_{gap} . \tag{13}
\end{align*}
We then have, since $y$ is strictly dual feasible and so $b^T y < z^*$,
\begin{align*}
c^T x - z^* < c^T x - b^T y
= (c^T - y^T A)\, x + y^T(A x - b)
= s^T x + y^T(A x - b)
\le \tfrac{3}{2} \cdot s_{gap} + y^T(A x - b)
 \qquad \text{(by Equation 13)}
\end{align*}
\begin{align*}
&\le \tfrac{2}{3} \cdot \epsilon + y^T(A x - b)
 && \text{(since $s_{gap} \le \tfrac{\epsilon}{3}$ at termination)} \\
&\le \tfrac{2}{3} \cdot \epsilon + \|y\|\, \| A x - b \|
\le \tfrac{2}{3} \cdot \epsilon + T n^{1/2}\, \| A x - b \| \\
&\le \tfrac{2}{3} \cdot \epsilon + \tfrac{1}{12\sqrt{2}} \cdot \epsilon
 && \text{(by Lemma C.9(ii))} \\
&< \epsilon .
\end{align*}
Alternatively, using only that the algorithm maintains $z < z^*$,
\begin{align*}
c^T x - z^* < c^T x - z
&= (c^T - y^T A)\, x + y^T(A x - b) + b^T y - z
= s^T x + y^T(A x - b) + s_{gap} \\
&\le \tfrac{5}{2} \cdot s_{gap} + y^T(A x - b)
 && \text{(by Equation 13)} \\
&\le \tfrac{5}{6} \cdot \epsilon + y^T(A x - b)
\le \tfrac{5}{6} \cdot \epsilon + \|y\|\, \| A x - b \|
\le \tfrac{5}{6} \cdot \epsilon + T n^{1/2}\, \| A x - b \| \\
&\le \tfrac{5}{6} \cdot \epsilon + \tfrac{1}{12\sqrt{2}} \cdot \epsilon
 && \text{(by Lemma C.9(ii))} \\
&< \epsilon .
\end{align*}

Next, we analyze the number of Shift iterations until the algorithm terminates. We can measure the progress of the algorithm with the potential function $B(z)$:
\[
B(z) = \sum_{j=1}^m \log s^*_j + m \log s^*_{gap},
\qquad \text{where $(y^*, s^*, s^*_{gap})$ is the analytic center of $\Omega^D_{b,z}$.}
\]
Soon, we will show how a decrease in $B(z)$ implies that $s_{gap}$ is decreasing, and thus that the algorithm is making progress. Let us first show that the value of $B(z)$ decreases by $\Omega(\sqrt{m})$ after each iteration.

Lemma C.10 (compare [Ye97, Lem 4.6]). Given $(y, s, s_{gap}) \in \mathring\Omega^D_{b,z}$ satisfying $\tilde\eta(s, s_{gap}) < \frac{1}{10}$, let $(y^+, s^+, s^+_{gap}, z^+) = \text{Shift}(y, s, s_{gap}, z)$. Then $B(z^+) \le B(z) - \Theta(\sqrt{m})$.

Proof. Let $(y^*, s^*, s^*_{gap})$ and $(y^{*+}, s^{*+}, s^{*+}_{gap})$ respectively be the analytic centers of $\Omega^D_{b,z}$ and $\Omega^D_{b,z^+}$. Following Lemma C.7, we define $x = \frac{s^*_{gap}}{m}\, (S^*)^{-1}\mathbf{1}_m$, which satisfies $A x = b$.
We have
\begin{align*}
e^{\frac{B(z^+) - B(z)}{2m}}
&= \sqrt{ \left( \prod_{j=1}^m \frac{s^{*+}_j}{s^*_j} \right)^{1/m} \cdot \frac{s^{*+}_{gap}}{s^*_{gap}} }
\le \frac{1}{2m} \sum_{j=1}^m \frac{s^{*+}_j}{s^*_j} + \frac{1}{2} \cdot \frac{s^{*+}_{gap}}{s^*_{gap}} \\
&= 1 + \frac{1}{2m} \sum_{j=1}^m \frac{s^{*+}_j - s^*_j}{s^*_j} + \frac{1}{2} \cdot \frac{s^{*+}_{gap} - s^*_{gap}}{s^*_{gap}} \\
&= 1 + \frac{1}{2 s^*_{gap}} \cdot \frac{s^*_{gap}}{m}\, (s^{*+} - s^*)^T (S^*)^{-1}\mathbf{1}_m
 + \frac{1}{2 s^*_{gap}} \big( s^{*+}_{gap} - s^*_{gap} \big) \\
&= 1 + \frac{1}{2 s^*_{gap}} \Big( (s^{*+} - s^*)^T x + \big( s^{*+}_{gap} - s^*_{gap} \big) \Big) \\
&= 1 + \frac{1}{2 s^*_{gap}} \Big( \big( (c - A^T y^{*+}) - (c - A^T y^*) \big)^T x + \big( s^{*+}_{gap} - s^*_{gap} \big) \Big) \\
&= 1 + \frac{1}{2 s^*_{gap}} \Big( (y^* - y^{*+})^T A x + \big( s^{*+}_{gap} - s^*_{gap} \big) \Big) \\
&= 1 + \frac{1}{2 s^*_{gap}} \Big( (y^* - y^{*+})^T b + \big( s^{*+}_{gap} - s^*_{gap} \big) \Big)
 && \text{(by Lemma C.7)} \\
&= 1 + \frac{1}{2 s^*_{gap}} \Big( \big( b^T y^* - s^*_{gap} \big) - \big( b^T y^{*+} - s^{*+}_{gap} \big) \Big) \\
&= 1 + \frac{1}{2 s^*_{gap}} \big( z - z^+ \big) \tag{14}\\
&= 1 - \frac{s_{gap}}{20\sqrt{m} \cdot s^*_{gap}}
\le 1 - \frac{9}{200\sqrt{m}}
 && \text{($\tfrac{s^*_{gap}}{s_{gap}} \le \tfrac{10}{9}$ by Lemma C.4)} \\
&\le e^{-\frac{9}{200\sqrt{m}}} .
\end{align*}
We conclude that $B(z^+) - B(z) \le -\frac{9}{100}\sqrt{m}$.

Let us now show that a decrease in the potential function $B(z)$ implies a decrease in the value of $s_{gap}$:

Lemma C.11 (compare [Ye97, Prop 4.2]). Suppose we are given $(y, s, s_{gap}) \in \mathring\Omega^D_{b,z}$ and $(y^+, s^+, s^+_{gap}) \in \mathring\Omega^D_{b,z^+}$, where $z^+ > z$, $\tilde\eta(s, s_{gap}) \le \eta$, and $\tilde\eta(s^+, s^+_{gap}) \le \eta$ for some $\eta < 1$. Then
\[
\frac{s_{gap}}{s^+_{gap}} \ge (1 - 2\eta) \cdot \left( e^{\frac{B(z) - B(z^+)}{2m}} - 1 \right) .
\]

Proof. Let $(y^*, s^*, s^*_{gap})$ and $(y^{*+}, s^{*+}, s^{*+}_{gap})$ respectively be the analytic centers of $\Omega^D_{b,z}$ and $\Omega^D_{b,z^+}$. We define $x = \frac{s^*_{gap}}{m}\, (S^*)^{-1}\mathbf{1}_m$ and $x^+ = \frac{s^{*+}_{gap}}{m}\, (S^{*+})^{-1}\mathbf{1}_m$, which by Lemma C.7 satisfy $A x = b = A x^+$.
We have
\begin{align*}
e^{\frac{B(z) - B(z^+)}{m}}
&= \left( \prod_{j=1}^m \frac{s^*_j}{s^{*+}_j} \right)^{1/m} \cdot \frac{s^*_{gap}}{s^{*+}_{gap}}
\le \frac{1}{m} \left( \sum_{j=1}^m \frac{s^*_j}{s^{*+}_j} \right) \cdot \frac{s^*_{gap}}{s^{*+}_{gap}} \\
&= \left( 1 + \frac{1}{m} \sum_{j=1}^m \frac{s^*_j - s^{*+}_j}{s^{*+}_j} \right) \cdot \frac{s^*_{gap}}{s^{*+}_{gap}}
= \left( 1 + \frac{1}{s^{*+}_{gap}}\, (s^* - s^{*+})^T x^+ \right) \cdot \frac{s^*_{gap}}{s^{*+}_{gap}} \\
&= \left( 1 + \frac{1}{s^{*+}_{gap}} \big( (c - A^T y^*) - (c - A^T y^{*+}) \big)^T x^+ \right) \cdot \frac{s^*_{gap}}{s^{*+}_{gap}}
= \left( 1 + \frac{1}{s^{*+}_{gap}}\, (y^{*+} - y^*)^T A x^+ \right) \cdot \frac{s^*_{gap}}{s^{*+}_{gap}} \\
&= \left( 1 + \frac{1}{s^{*+}_{gap}}\, (y^{*+} - y^*)^T b \right) \cdot \frac{s^*_{gap}}{s^{*+}_{gap}}
= \left( 1 + \frac{1}{s^{*+}_{gap}}\, (y^{*+} - y^*)^T A x \right) \cdot \frac{s^*_{gap}}{s^{*+}_{gap}} \\
&= \left( 1 + \frac{1}{s^{*+}_{gap}}\, (s^* - s^{*+})^T x \right) \cdot \frac{s^*_{gap}}{s^{*+}_{gap}}
\le \left( 1 + \frac{1}{s^{*+}_{gap}}\, (s^*)^T x \right) \cdot \frac{s^*_{gap}}{s^{*+}_{gap}} \\
&= \left( 1 + \frac{s^*_{gap}}{s^{*+}_{gap}} \right) \cdot \frac{s^*_{gap}}{s^{*+}_{gap}}
\le \left( 1 + \frac{s^*_{gap}}{s^{*+}_{gap}} \right)^2 ,
\end{align*}
so
\[
\frac{s^*_{gap}}{s^{*+}_{gap}} \ge e^{\frac{B(z) - B(z^+)}{2m}} - 1 .
\]
Using Lemma C.4, we may conclude
\[
\frac{s_{gap}}{s^+_{gap}}
= \frac{s_{gap}}{s^*_{gap}} \cdot \frac{s^*_{gap}}{s^{*+}_{gap}} \cdot \frac{s^{*+}_{gap}}{s^+_{gap}}
\ge (1 - \eta) \cdot \left( e^{\frac{B(z) - B(z^+)}{2m}} - 1 \right) \cdot \frac{1 - 2\eta}{1 - \eta}
= (1 - 2\eta) \left( e^{\frac{B(z) - B(z^+)}{2m}} - 1 \right) .
\]

Corollary C.12. The InteriorPoint algorithm makes $O\!\left( \sqrt{m}\, \log \frac{s^C_{gap}}{\epsilon} \right)$ calls to Shift.

Proof. Recall that the algorithm will terminate only when the value of $s_{gap}$ has decreased from its initial value of $s^C_{gap}$ to below $\frac{\epsilon}{3}$. Thus, Lemma C.11 ensures us that $s_{gap}$ will be smaller than $\frac{\epsilon}{3}$ once $B(z)$ has decreased by $\Omega\!\left( m \log \frac{s^C_{gap}}{\epsilon} \right)$. According to Lemma C.10, this occurs after $O\!\left( \sqrt{m}\, \log \frac{s^C_{gap}}{\epsilon} \right)$ Shift iterations.

C.3 Finding the Central Path

It remains for us to describe how to initialize the path-following algorithm by finding a point near the central path. Essentially, this is accomplished by running the path-following algorithm in reverse.
Instead of stepping towards the optimum given by $b$, we step away from the optimum given by the vector $\bar b = A (S_0)^{-1} \mathbf{1}_m$, which depends on our initial feasible point $(y_0, s_0) \in \mathring\Omega_D$. Our analysis parallels that in the previous section. The following function $\bar\eta$ measures the proximity of a point $(y, s, \bar s_{gap}) \in \mathring\Omega^D_{\bar b, \bar z}$ to the central path given by $\bar b$:
\[
\bar\eta(s, \bar s_{gap}) = \eta_{\tilde{\bar A}}(\tilde{\bar s}),
\qquad \text{where } \tilde{\bar A} = \begin{pmatrix} A & -\bar b\, \mathbf{1}_m^T \end{pmatrix}
\text{ and } \tilde{\bar s} = \begin{pmatrix} s \\ \bar s_{gap}\, \mathbf{1}_m \end{pmatrix} .
\]
To initialize the algorithm, we observe that $(y_0, s_0, m) \in \Omega^D_{\bar b, \bar z_0}$ is on the $\bar b$ central path, where we define $\bar z_0 = \bar b^T y_0 - m$:

Lemma C.13. $\bar\eta(s_0, m) = 0$.

Proof. Defining $\tilde{\bar s}_0 = \begin{pmatrix} s_0 \\ m\, \mathbf{1}_m \end{pmatrix}$, we have
\[
\tilde{\bar A}\, (\tilde{\bar S}_0)^{-1} \mathbf{1}_{2m}
= \begin{pmatrix} A & -\bar b\, \mathbf{1}_m^T \end{pmatrix}
\begin{pmatrix} (S_0)^{-1}\mathbf{1}_m \\ m^{-1}\mathbf{1}_m \end{pmatrix}
= A (S_0)^{-1}\mathbf{1}_m - \bar b \cdot \frac{\mathbf{1}_m^T \mathbf{1}_m}{m}
= \bar b - \bar b = 0 .
\]
Thus, $\bar\eta(s_0, m) = \big\| \tilde{\bar A}\, (\tilde{\bar S}_0)^{-1} \mathbf{1}_{2m} \big\|_{(\tilde{\bar A} (\tilde{\bar S}_0)^{-2} \tilde{\bar A}^T)^{-1}} = 0$.

We present the FindCentralPath algorithm in Figure 5. Starting with $\bar z = \bar z_0$, we take steps along the $\bar b$ central path, decreasing $\bar z$ until it is sufficiently small that the analytic center of $\Omega^D_{\bar b, \bar z}$ is close to the analytic center of $\Omega_D$, and therefore also close to the analytic center of $\Omega^D_{b,z}$ for some sufficiently small $z$.

Let us show that the Unshift procedure indeed takes steps near the $\bar b$ central path:

Lemma C.14 (compare Lemma C.8). Suppose we are given $(y, s, \bar s_{gap}) \in \mathring\Omega^D_{\bar b, \bar z}$ satisfying $\bar\eta(s, \bar s_{gap}) \le \frac{1}{40}$. Let $\bar s'_{gap} = \bar b^T y - \bar z^+$ and $(y^+, s^+, \bar s^+_{gap}, \bar z^+) = \text{Unshift}(y, s, \bar s_{gap}, \bar z)$. Then
(i) $\bar\eta(s, \bar s'_{gap}) \le \frac{51}{400}$
(ii) $\bar\eta(s^+, \bar s^+_{gap}) \le \frac{1}{40}$.

Proof of C.14(i).
Following the proof of Lemma C.8(ii) through equation 9, we have
\[
\bar\eta(s, \bar s'_{gap})
\le \frac{11}{10} \cdot \bar\eta(s, \bar s_{gap}) + \frac{1}{10}
\le \frac{11}{10} \cdot \frac{1}{40} + \frac{1}{10} = \frac{51}{400} .
\]

$(y, s, s_{gap}, z) = \text{FindCentralPath}(A, b, c, y_0)$:
• Define $\begin{pmatrix} s_0 \\ \bar s^0_{gap} \end{pmatrix} = \begin{pmatrix} c - A^T y_0 \\ m \end{pmatrix}$, $\quad \bar b = A (S_0)^{-1}\mathbf{1}_m$, $\quad \bar z_0 = \bar b^T y_0 - \bar s^0_{gap}$
• Set $(y, s, \bar s_{gap}, \bar z) := (y_0, s_0, \bar s^0_{gap}, \bar z_0)$
• While $\bar s_{gap} < 40\, \lambda_{\min}^{-1/2}\, T m \|\bar b\|$: set $(y, s, \bar s_{gap}, \bar z) := \text{Unshift}(y, s, \bar s_{gap}, \bar z)$
• Return $(y, s)$ and $z = b^T y - 40\, \lambda_{\min}^{-1/2}\, T m \|b\|$ and $s_{gap} = b^T y - z$

$(y^+, s^+, \bar s^+_{gap}, \bar z^+) = \text{Unshift}(y, s, \bar s_{gap}, \bar z)$:
• Let $\bar z^+ = \bar z - \frac{\bar s_{gap}}{10\sqrt{m}}$
• Let $y^+ = \text{NewtonStep}(\tilde{\bar A}, \tilde{\bar c}, y)$, where $\tilde{\bar A} = \begin{pmatrix} A & -\bar b\,\mathbf{1}_m^T \end{pmatrix}$ and $\tilde{\bar c} = \begin{pmatrix} c \\ -\bar z^+ \mathbf{1}_m \end{pmatrix}$
• Let $\begin{pmatrix} s^+ \\ \bar s^+_{gap} \end{pmatrix} = \begin{pmatrix} c - A^T y^+ \\ \bar b^T y^+ - \bar z^+ \end{pmatrix}$

Figure 5: Algorithm for finding a point near the central path, given a feasible interior point

Proof of C.14(ii). By Lemma C.5, we have
\[
\bar\eta(s^+, \bar s^+_{gap})
\le \bar\eta(s, \bar s'_{gap})^2 + \tfrac{1}{20}\, \bar\eta(s, \bar s'_{gap})
\le \left( \tfrac{51}{400} \right)^2 + \tfrac{1}{20} \cdot \tfrac{51}{400} < \tfrac{1}{40} .
\]

Next, let us prove that the point returned by FindCentralPath is indeed near the original central path (i.e. the path given by $b$):

Lemma C.15. For $y_0$ satisfying $A^T y_0 < c$, let $(y, s, s_{gap}, z) = \text{FindCentralPath}(A, b, c, y_0)$. Then $(y, s, s_{gap}) \in \mathring\Omega^D_{b,z}$ and $\tilde\eta(s, s_{gap}) \le \frac{1}{10}$.

Proof. Using the values at the end of the algorithm, we write $\tilde s = \begin{pmatrix} s \\ s_{gap}\mathbf{1}_m \end{pmatrix}$ and $\tilde{\bar s} = \begin{pmatrix} s \\ \bar s_{gap}\mathbf{1}_m \end{pmatrix}$. To begin, we note
\[
\bar s_{gap} \ge 40\, \lambda_{\min}^{-1/2}\, T m \|\bar b\|
= 40\, m\, (T^{-2}\lambda_{\min})^{-1/2}\, \|\bar b\|
\ge 40\, m\, \|\bar b\|_{(A S^{-2} A^T)^{-1}}, \tag{15}
\]
where the last inequality follows because the smallest eigenvalue of $A S^{-2} A^T$ is at least $T^{-2}\lambda_{\min}$.
Similarly,
\[
s_{gap} = 40\, \lambda_{\min}^{-1/2}\, T m \|b\| \ge 40\, m\, \|b\|_{(A S^{-2} A^T)^{-1}} . \tag{16}
\]
We have
\begin{align*}
\tilde\eta(s, s_{gap})
&= \big\| \tilde A \tilde S^{-1}\mathbf{1}_{2m} \big\|_{(\tilde A \tilde S^{-2}\tilde A^T)^{-1}} \\
&\le \big\| \tilde A \tilde S^{-1}\mathbf{1}_{2m} \big\|_{(A S^{-2} A^T)^{-1}}
 && \text{(because $\tilde A \tilde S^{-2}\tilde A^T - A S^{-2} A^T = m\, s_{gap}^{-2}\, b b^T$ is positive semidefinite)} \\
&= \big\| \tilde{\bar A} \tilde{\bar S}^{-1}\mathbf{1}_{2m} - m\, s_{gap}^{-1}\, b + m\, \bar s_{gap}^{-1}\, \bar b \big\|_{(A S^{-2} A^T)^{-1}} \\
&\le \big\| \tilde{\bar A} \tilde{\bar S}^{-1}\mathbf{1}_{2m} \big\|_{(A S^{-2} A^T)^{-1}}
 + m\, \bar s_{gap}^{-1}\, \|\bar b\|_{(A S^{-2} A^T)^{-1}}
 + m\, s_{gap}^{-1}\, \|b\|_{(A S^{-2} A^T)^{-1}} \\
&\le \big\| \tilde{\bar A} \tilde{\bar S}^{-1}\mathbf{1}_{2m} \big\|_{(A S^{-2} A^T)^{-1}} + \frac{1}{40} + \frac{1}{40}
 && \text{(by equations 15 and 16)} \\
&\le \Big( 1 + m\, \bar s_{gap}^{-1}\, \|\bar b\|_{(A S^{-2} A^T)^{-1}} \Big)^{1/2}
 \big\| \tilde{\bar A} \tilde{\bar S}^{-1}\mathbf{1}_{2m} \big\|_{(\tilde{\bar A} \tilde{\bar S}^{-2}\tilde{\bar A}^T)^{-1}} + \frac{1}{40} + \frac{1}{40} \\
&\qquad \text{(by Lemma B.3, using the fact that $\tilde{\bar A} \tilde{\bar S}^{-2}\tilde{\bar A}^T - A S^{-2} A^T = m\, \bar s_{gap}^{-2}\, \bar b \bar b^T$)} \\
&\le \left( 1 + \frac{1}{40} \right)^{1/2}
 \big\| \tilde{\bar A} \tilde{\bar S}^{-1}\mathbf{1}_{2m} \big\|_{(\tilde{\bar A} \tilde{\bar S}^{-2}\tilde{\bar A}^T)^{-1}} + \frac{1}{40} + \frac{1}{40}
 && \text{(by equation 15)} \\
&= \left( 1 + \frac{1}{40} \right)^{1/2} \cdot \bar\eta(s, \bar s_{gap}) + \frac{1}{40} + \frac{1}{40} \\
&\le 2 \cdot \bar\eta(s, \bar s_{gap}) + \frac{1}{20}
\le 2 \cdot \frac{1}{40} + \frac{1}{20}
 && \text{(by Lemma C.14(ii))} \\
&= \frac{1}{10} .
\end{align*}

To measure the progress of the FindCentralPath algorithm, we define $\bar B(\bar z)$:
\[
\bar B(\bar z) = \sum_{j=1}^m \log s^*_j + m \log \bar s^*_{gap},
\qquad \text{where $(y^*, s^*, \bar s^*_{gap})$ is the analytic center of $\Omega^D_{\bar b, \bar z}$.}
\]

Lemma C.16 (compare Lemma C.10). Suppose we are given $(y, s, \bar s_{gap}) \in \mathring\Omega^D_{\bar b, \bar z}$ satisfying $\bar\eta(s, \bar s_{gap}) \le \frac{1}{40}$, and let $(y^+, s^+, \bar s^+_{gap}, \bar z^+) = \text{Unshift}(y, s, \bar s_{gap}, \bar z)$. Then $\bar B(\bar z^+) \ge \bar B(\bar z) + \Theta(\sqrt{m})$.

Proof. We will follow the proof of Lemma C.10, with some minor changes.
Before we proceed, let us recall the definition $\bar z^+ = \bar z - \frac{\bar s_{gap}}{10\sqrt{m}}$, and note that
\[
\bar s'_{gap} = \bar s_{gap} + (\bar z - \bar z^+)
= 10\sqrt{m}\, (\bar z - \bar z^+) + (\bar z - \bar z^+)
\le 11\sqrt{m}\, (\bar z - \bar z^+) . \tag{17}
\]
Now, we switch the places of $z$ and $z^+$, and follow the proof of Lemma C.10 up to Equation 14:
\[
e^{\frac{\bar B(\bar z) - \bar B(\bar z^+)}{2m}}
\le 1 + \frac{1}{2\, \bar s^{*+}_{gap}} \big( \bar z^+ - \bar z \big) .
\]
We continue:
\begin{align*}
&\le 1 - \frac{\bar s'_{gap}}{22\sqrt{m} \cdot \bar s^{*+}_{gap}}
 && \text{(by Equation 17)} \\
&\le 1 - \frac{1}{22\sqrt{m}} \cdot \frac{349}{400}
 && \text{(by Lemmas C.14 and C.4)} \\
&\le e^{-\frac{169}{4400\sqrt{m}}} .
\end{align*}

Corollary C.17. The FindCentralPath algorithm makes $O\!\left( \sqrt{m}\, \log \frac{T U m}{\lambda_{\min} s^0_{\min}} \right)$ calls to Unshift, where $s^0_{\min}$ is the smallest entry of $s_0 = c - A^T y_0$.

Proof. Recall that the algorithm will terminate only when the value of $\bar s_{gap}$ has increased from its initial value of $m$ to at least $40\, \lambda_{\min}^{-1/2}\, T m \|\bar b\|$. So, by Lemma C.11, this will have happened once $\bar B(\bar z)$ has increased by $\Omega\!\left( m \log\big( \lambda_{\min}^{-1/2}\, T \|\bar b\| \big) \right)$. According to Lemma C.16, this occurs after $O\!\left( \sqrt{m}\, \log\big( \lambda_{\min}^{-1/2}\, T \|\bar b\| \big) \right)$ iterations. To complete the proof, we note that
\[
\|\bar b\| = \big\| A (S_0)^{-1}\mathbf{1}_m \big\| \le \frac{n^{1/2}\, m\, U}{s^0_{\min}} .
\]

C.4 Calls to the Solver

In each call to Unshift, we solve one system in a matrix of the form
\[
\tilde{\bar A} \tilde{\bar S}^{-2} \tilde{\bar A}^T = A S^{-2} A^T + m\, \bar s_{gap}^{-2}\, \bar b \bar b^T,
\]
and in each call to Shift, we solve one system in a matrix of the form
\[
\tilde A \tilde S^{-2} \tilde A^T = A S^{-2} A^T + m\, s_{gap}^{-2}\, b b^T .
\]
At the end of the interior-point algorithm, we have one final call of the latter form.

In order to say something about the condition number of the above matrices, we must bound the slack vector $s$. We are given an upper bound of $T$ on the elements of $s$, so it remains to prove a lower bound:

Lemma C.18. Throughout the InteriorPoint algorithm, $s \ge \frac{\epsilon}{48\, n m T U}\, s_0$.

Proof.
At all times during the algorithm, we know from Lemma C.4 that the elements of $s$ are bounded by a constant factor from the slacks at the current central path point ${}^{*}s$. In particular, taking into account Lemmas C.8 and C.14, we surely have $s \ge \frac12 {}^{*}s$. So let us bound from below the elements of ${}^{*}s$.

During the FindCentralPath subroutine, as we decrease $\bar z$ and expand the polytope $\Omega^D_{\bar b, \bar z}$, the initial point $s^0$ clearly remains in the interior of $\Omega^D_{\bar b, \bar z}$ throughout. Thus, by Lemma C.3, we have ${}^{*}s \ge \frac{1}{2m}\, s^0$, and so $s \ge \frac12 {}^{*}s \ge \frac{1}{4m}\, s^0$.

Unfortunately, during the main part of the algorithm, as we increase $z$ and shrink the polytope $\Omega^D_{b, z}$, the initial point may not remain inside the polytope. In particular, once we have $z \ge b^T y^0$, the initial point is no longer in $\Omega^D_{b, z}$, but we may define a related point $(y^z, s^z, s^z_{\mathrm{gap}})$ that is in $\Omega^D_{b, z}$. Given our current point $(y, s, s_{\mathrm{gap}}) \in \Omega^D_{b, z}$ for $z \ge b^T y^0$, let us define
\[
r = \frac{b^T y - z}{2(b^T y - b^T y^0)}
\]
and note that $0 < r < \frac12$. We then define $y^z = r\, y^0 + (1-r)\, y$ and
\[
\begin{pmatrix} s^z \\ s^z_{\mathrm{gap}} \end{pmatrix}
= \begin{pmatrix} c - A^T y^z \\ b^T y^z - z \end{pmatrix}
= \begin{pmatrix} r\, s^0 + (1-r)\, s \\ \frac12 (b^T y - z) \end{pmatrix} > 0.
\]
Therefore Lemma C.3 gives
\[
{}^{*}s \;\ge\; \frac{1}{2m}\, s^z \;=\; \frac{r\, s^0 + (1-r)\, s}{2m} \;\ge\; \frac{r}{2m}\, s^0.
\]
We then find
\[
r = \frac{s_{\mathrm{gap}}}{2(b^T y - b^T y^0)} \;\ge\; \frac{s_{\mathrm{gap}}}{4nTU} \;\ge\; \frac{\epsilon}{24\, nTU}.
\]
The last inequality follows because, when $s_{\mathrm{gap}}$ decreased below $\frac{\epsilon}{3}$ on the final step, Lemma C.4 shows that it certainly could not have decreased by more than a factor of $\frac12$. We conclude
\[
s \;\ge\; \frac12 {}^{*}s \;\ge\; \frac{r}{4m}\, s^0 \;\ge\; \frac{\epsilon}{48\, nmTU}\, s^0.
\]

We may now summarize the calls to the solver as follows:

Theorem C.19.
The InteriorPoint$(A, b, c, y^0, \epsilon)$ algorithm makes
\[
O\!\left(\sqrt m \,\log \frac{TUm}{\lambda_{\min}\, s^0_{\min}\, \epsilon}\right)
\]
calls to the approximate solver, of the form
\[
\text{Solve}\!\left(AS^{-2}A^T + vv^T,\; \cdot\;,\; \Theta(m^{-1/2})\right),
\]
and one call of the form
\[
\text{Solve}\!\left(AS^{-2}A^T + vv^T,\; \cdot\;,\; \Omega\!\left(\frac{s^0_{\min}\,\epsilon}{m^{1/2}\, n^2\, T^2\, U^2}\right)\right),
\]
where $S$ is a positive diagonal matrix with condition number $O\!\left(\frac{T^2 U m^2}{\epsilon}\right)$, and $v$ satisfies
\[
\|v\| = O\!\left(\frac{U (mn)^{1/2}}{s^0_{\min}\,\epsilon}\right).
\]

Proof. From Lemmas C.17 and C.12, the total number of solves is
\[
O\!\left(\sqrt m\left(\log \frac{TUm}{\lambda_{\min}\, s^0_{\min}} + \log \frac{s^C_{\mathrm{gap}}}{\epsilon}\right)\right),
\]
where we know from the FindCentralPath algorithm that
\[
s^C_{\mathrm{gap}} = \frac{40\, T m \|b\|}{\lambda_{\min}^{1/2}} = O\!\left(\frac{TUm\, n^{1/2}}{\lambda_{\min}^{1/2}}\right).
\]
As we noted above, all solves are in matrices of the form $AS^{-2}A^T + vv^T$, where
\[
v = m^{1/2}\, s_{\mathrm{gap}}^{-1}\, b \qquad\text{or}\qquad v = m^{1/2}\,\bar s_{\mathrm{gap}}^{-1}\, A (S^0)^{-1}\mathbf{1}_m.
\]
We know that $s_{\mathrm{gap}} = \Omega(\epsilon)$ and $\bar s_{\mathrm{gap}} = \Omega(m)$, so we obtain the respective bounds
\[
\|v\| = O\!\left(\frac{U(mn)^{1/2}}{\epsilon}\right) \qquad\text{and}\qquad \|v\| = O\!\left(\frac{U(mn)^{1/2}}{s^0_{\min}}\right).
\]
The condition number of $S$ comes from Lemma C.18 and the upper bound of $T$ on the slacks. The error parameter for the solver is $\Theta(m^{-1/2})$ from all the NewtonStep calls. In the final solve, the error parameter is
\[
\frac{s_{\min}}{m^{1/2}\, TUn} \;\ge\; \frac{1}{m^{1/2}\, TUn}\cdot\frac{s^0_{\min}\,\epsilon}{48\, nmTU},
\]
again invoking Lemma C.18.
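Every system counted in Theorem C.19 has the form $(AS^{-2}A^T + vv^T)x = d$, a rank-one update of $AS^{-2}A^T$. One standard way to reduce such a solve to solves in $AS^{-2}A^T$ alone is the Sherman–Morrison formula. The sketch below is only an illustration of that reduction, not the paper's solver: it uses a dense `numpy` solve and a random positive definite matrix standing in for $AS^{-2}A^T$, and the function name is hypothetical.

```python
import numpy as np

def solve_rank_one_update(solve_M, v, d):
    """Solve (M + v v^T) x = d, given a solver for M alone.

    Sherman-Morrison:
      (M + v v^T)^{-1} d
        = M^{-1} d - (v^T M^{-1} d) / (1 + v^T M^{-1} v) * M^{-1} v
    so two calls to solve_M suffice.
    """
    Mi_d = solve_M(d)
    Mi_v = solve_M(v)
    return Mi_d - (v @ Mi_d) / (1.0 + v @ Mi_v) * Mi_v

# Sanity check with a random positive definite M standing in for A S^{-2} A^T.
rng = np.random.default_rng(1)
n = 5
B = rng.standard_normal((n, n))
M = B @ B.T + n * np.eye(n)   # symmetric positive definite
v = rng.standard_normal(n)
d = rng.standard_normal(n)

x = solve_rank_one_update(lambda r: np.linalg.solve(M, r), v, d)
assert np.allclose((M + np.outer(v, v)) @ x, d)
```

When the inner solver is only approximate, as in this paper, the errors of the two inner solves propagate through the formula; this is one reason the error parameters in Theorem C.19 must be set conservatively (a plausible reading, not a claim about the paper's exact error analysis).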
