IRLS and Slime Mold: Equivalence and Convergence
Authors: Damian Straszak, Nisheeth K. Vishnoi
Abstract. In this paper we present a connection between two dynamical systems arising in entirely different contexts: one in signal processing and the other in biology. The first is the famous Iteratively Reweighted Least Squares (IRLS) algorithm used in compressed sensing and sparse recovery, while the second is the dynamics of a slime mold (Physarum polycephalum). Both of these dynamics are geared towards finding a minimum $\ell_1$-norm solution in an affine subspace. Despite its simplicity, the convergence of the IRLS method has been shown only for a certain regularization of it and remains an important open problem [Bec15, DDFG10]. Our first result shows that the two dynamics are projections of the same dynamical system in higher dimensions. As a consequence, and building on the recent work on Physarum dynamics, we are able to prove convergence and obtain complexity bounds for a damped version of the IRLS algorithm.

Contents

1. Introduction
2. Related Work
3. Preliminaries
3.1. The IRLS algorithm
3.2. Continuous Physarum dynamics for $\ell_1$-minimization
3.3. Discrete Physarum dynamics
4. IRLS vs Physarum
4.1. Physarum dynamics and hidden variables
4.2. IRLS as alternate minimization
4.3. Comparing IRLS with Physarum
5. Convergence and Complexity of Physarum Dynamics
Appendix A. Example for Non-convergence of IRLS
References

Damian Straszak, École Polytechnique Fédérale de Lausanne (EPFL).
Nisheeth K. Vishnoi, École Polytechnique Fédérale de Lausanne (EPFL).

1. Introduction

Sparse recovery and basis pursuit. A classical task in signal processing is to recover a sparse signal from a small number of linear measurements.
Mathematically, this can be formulated as the problem of finding a solution to a linear system $Ax = b$, where $A \in \mathbb{R}^{m \times n}$, $b \in \mathbb{R}^m$ are given and $A$ has far fewer rows than columns (i.e., $m \ll n$). Among all the solutions, one would like to recover one with the fewest non-zero entries. This problem, known as sparse recovery, is NP-hard and we cannot hope to find an efficient algorithm in general. However, it has been observed experimentally that, when dealing with real-world data, a solution to the following $\ell_1$-minimization problem (also known as basis pursuit):

(1)  $\min \|x\|_1 \quad \text{s.t.} \quad Ax = b$

is typically quite sparse, if not of optimal sparsity. The history of theoretical investigations on how to explain the above phenomenon is particularly rich. It was first shown in [DH01, DE03] that the $\ell_1$-norm objective is in fact equivalent to sparsity for a specific family of matrices and, later, the same was argued for a class of random matrices [CRT06]. Finally, the notion of Restricted Isometry Property (RIP) was formulated in [CT05] and shown to guarantee sparse recovery via (1). Consequently, optimization problems of the form (1) became important building blocks for applications in signal processing and statistics. Thus, fast algorithms for solving such problems are desired. Note that (1) can be cast as a linear program of size linear in $n$ and $m$ and, hence, any linear programming algorithm can be used to solve it. However, because of the special structure of the problem, many algorithms were developed which outperform standard LP solvers in terms of efficiency on real-world instances. To make an algorithm applicable in practice, another property is highly desirable: simplicity. This is not only for the ease of implementation, but also due to the fact that simple solutions are typically more robust and extendable to slightly different settings, such as noise tolerance.

Iteratively Reweighted Least Squares.
One of the simplest algorithms for solving problem (1) is the Iteratively Reweighted Least Squares algorithm (IRLS). IRLS is a very general scheme for solving optimization problems: it produces a sequence of points $y^{(0)}, y^{(1)}, \ldots$ with every $y^{(k+1)}$ obtained as a result of solving a weighted $\ell_2$-minimization problem, where the weights are appropriately chosen based on the previous point $y^{(k)}$. Let us now describe one extremely popular scheme of this kind, which is the main focus of this paper. We pick any starting point $y^{(0)} \in \mathbb{R}^n$. Then, $y^{(k+1)}$ is obtained from $y^{(k)}$ as the solution to the following optimization problem:

(2)  $y^{(k+1)} \stackrel{\text{def}}{=} \operatorname{argmin}_{x \in \mathbb{R}^n} \sum_{i=1}^n \frac{x_i^2}{|y_i^{(k)}|} \quad \text{s.t.} \quad Ax = b.$

For this optimization problem to make sense, we need to assume that $y_i^{(k)} \neq 0$ for every $i = 1, 2, \ldots, n$; however, the above can be made formal without this assumption. Importantly, the weighted $\ell_2$-minimization in (2) can be solved via a formula which involves solving a linear system. The resulting algorithm does not require any preprocessing of the data or any special rules for choosing a starting point. These properties make the algorithm particularly attractive for practical use and, indeed, the IRLS algorithm is quite popular; see for instance [CY08, Gre84]. However, from a theoretical viewpoint, the algorithm is still far from being understood. No global convergence analysis is known. One can construct examples to show that there may be starting points for which the IRLS algorithm does not converge; see Appendix A. The only known rigorous positive results [Osb85] concern the case when the algorithm is initialized very close to the optimal solution. It is an important open problem to establish global convergence of the IRLS algorithm.

The dynamics of a slime mold.
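To make the update (2) concrete, here is a minimal numpy sketch of the IRLS iteration on a hypothetical small instance. The closed form used below (a pseudoinverse formula, discussed formally in Section 3) solves the weighted $\ell_2$-minimization in each step; the `floor` clamp is our numerical safeguard against weights collapsing to zero, not part of the algorithm as stated.

```python
import numpy as np

def irls(A, b, num_iters=30, floor=1e-10):
    """IRLS iteration (2) for basis pursuit (a sketch).

    Each step solves min sum_i x_i^2 / |y_i| s.t. Ax = b via the closed
    form y <- Y A^T (A Y A^T)^+ b with Y = Diag(|y|).  The floor clamp
    keeps the weights strictly positive for numerical stability.
    """
    n = A.shape[1]
    y = np.ones(n)  # starting point y^(0) = (1, ..., 1)
    for _ in range(num_iters):
        Y = np.diag(np.maximum(np.abs(y), floor))
        y = Y @ A.T @ np.linalg.pinv(A @ Y @ A.T) @ b
    return y

# A small underdetermined instance (hypothetical data).
rng = np.random.default_rng(0)
A = rng.standard_normal((3, 8))
b = A @ np.array([0.0, 1.0, 0.0, 0.0, 0.0, -2.0, 0.0, 0.0])

y = irls(A, b)
print("residual:", np.linalg.norm(A @ y - b))  # iterates stay feasible
```

Note that every iterate after the first satisfies $Ay^{(k)} = b$; what is unknown in general is whether the sequence reaches the $\ell_1$-minimizer.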
In a seemingly unrelated story, in 2000 a striking experiment demonstrated that a slime mold (Physarum polycephalum) can solve the shortest path problem in a maze [NYT00]. The need to explain how resulted in a mathematical model [TKN07] which was a dynamical system; we (loosely) refer to this dynamical system as Physarum dynamics. Subsequently, this model was successfully analyzed mathematically and generalized to many different graph problems ([MO07, IJNT11, BMV12, BBD+13, SV16a]). In this work we propose an extension of the Physarum dynamics for solving the basis pursuit problem. Given $A, b$ as before, we let $w^{(0)} \in \mathbb{R}^n_{>0}$ be any point with positive coordinates and pick any step size $h \in (0, 1)$. The discrete Physarum dynamics iterates according to the following formula:

(3)  $w^{(k+1)} \stackrel{\text{def}}{=} (1-h)\, w^{(k)} + h\, |q^{(k)}|.$

In the above, $q^{(k)}$ is the vector that minimizes $\sum_{i=1}^n \frac{x_i^2}{w_i^{(k)}}$ over all $x \in \mathbb{R}^n$ such that $Ax = b$. The absolute value of $q^{(k)}$ should be understood entry-wise. The above is a generalization of the Physarum dynamics for the shortest $s$–$t$ path problem in an undirected graph [TKN07], for which it was shown by [BMV12] that $w^{(k)}$ converges to the characteristic vector of the shortest $s$–$t$ path in $G$. Interestingly, since $w^{(k)}$ remains a positive vector at every step $k$, the vector $w^{(k)}$ may not converge to the optimal solution. In Section 4 we explain how to define an auxiliary sequence $y^{(k)}$ which converges to an optimal solution.

IRLS vs. Physarum. Both algorithms, IRLS and Physarum, can be seen as discrete dynamical systems with updates based on a certain weighted $\ell_2$-minimization; however, no formal relation between them is apparent. Our first result connects these two algorithms.
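The discrete dynamics (3) can be sketched in a few lines of numpy; the instance below is hypothetical, and the iteration count and step size are arbitrary choices for illustration.

```python
import numpy as np

def physarum(A, b, h=0.1, num_iters=100):
    """Discrete Physarum dynamics (3) for basis pursuit (a sketch).

    q^(k) minimizes sum_i x_i^2 / w_i subject to Ax = b; its closed
    form is q = W A^T (A W A^T)^{-1} b with W = Diag(w).
    """
    n = A.shape[1]
    w = np.ones(n)  # any positive starting point w^(0)
    for _ in range(num_iters):
        W = np.diag(w)
        q = W @ A.T @ np.linalg.solve(A @ W @ A.T, b)
        w = (1 - h) * w + h * np.abs(q)  # update (3)
    return w

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 8))
b = A @ rng.standard_normal(8)
w = physarum(A, b)
print("min entry:", w.min())  # stays strictly positive
```

Since $h < 1$ and $w^{(0)} > 0$, every iterate is strictly positive, so the linear system solved in each step is non-singular; this is proved as Lemma 5.2 below.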
Both of these algorithms are naturally viewed as discrete dynamical systems over a $2n$-dimensional domain $\Gamma \stackrel{\text{def}}{=} \{(y, w) : y \in \mathbb{R}^n, \ w \in \mathbb{R}^n_{>0}\}$ with the vector field $F : \Gamma \to \mathbb{R}^n \times \mathbb{R}^n$ defined as:

(4)  $F(y, w) \stackrel{\text{def}}{=} (q - y, \ |q| - w), \qquad q \stackrel{\text{def}}{=} \operatorname{argmin}_{x \in \mathbb{R}^n} \sum_{i=1}^n \frac{x_i^2}{w_i} \quad \text{s.t.} \quad Ax = b.$

More precisely, we prove the following theorem in Section 4.

Theorem 1.1 (Informal). Given a starting point $(y^{(0)}, w^{(0)}) \in \Gamma$ and $h \in (0, 1]$, consider the sequence $\{(y^{(k)}, w^{(k)})\}_{k \in \mathbb{N}}$ generated by taking steps in the direction suggested by $F$:

(5)  $(y^{(k+1)}, w^{(k+1)}) = (y^{(k)}, w^{(k)}) + h\, F(y^{(k)}, w^{(k)}).$

When $h = 1$ the sequence $\{y^{(k)}\}_{k \in \mathbb{N}}$ is identical to that produced by IRLS, while for $h \in (0, 1)$ the sequence $\{w^{(k)}\}_{k \in \mathbb{N}}$ is equivalent to Physarum dynamics.

The above tells us additionally that IRLS and Physarum are complementary in terms of their descriptions, since the variables $y$ appear in the definition of IRLS while $w$ are implicit (and vice versa for Physarum). Our second contribution is a global convergence analysis for Physarum dynamics, which implies the same for the damped version of IRLS. We state it informally below; several details are omitted and only the dependence on $\varepsilon$ is emphasized; the quantities depending on the dimension and the input data are denoted by $C_1$ and $C_2$. For a precise formulation we refer to Theorem 5.1.

Theorem 1.2 (Informal). Suppose we initialize the Physarum dynamics at an appropriate point $w^{(0)}$. Take an arbitrary $\varepsilon > 0$ and choose $h \leq \frac{\varepsilon}{C_1}$. If we generate a sequence $\{w^{(k)}\}_{k \in \mathbb{N}}$ according to the Physarum dynamics (3), then after $k = \frac{C_2}{h \varepsilon^2}$ steps one can compute a vector $y^{(k)} \in \mathbb{R}^n$ such that $Ay^{(k)} = b$ and

$\|y^{(k)}\|_1 \leq \|w^{(k)}\|_1 \leq \|x^\star\|_1 \cdot (1 + \varepsilon),$

where $x^\star$ is any optimal solution to (1).

2. Related Work

IRLS.
Many different algorithms based on IRLS have been proposed for solving a variety of optimization problems. The book [Osb85] presents (among others) the IRLS method for $\ell_1$-minimization and proves a local convergence result (assuming the starting point is sufficiently close to the optimum and no zero entries appear in the iterates). The paper [GR97] discusses a number of different IRLS schemes for finding sparse solutions to underdetermined linear systems. It provides convergence results for a family of such methods, but the algorithm studied in our paper is not covered. In [RKD99], IRLS schemes for minimizing $(\sum_{i=1}^n |x_i|^p)^{1/p}$ are proposed; the scheme given for $p = 1$ matches our setting, however no global convergence results are obtained.

We now discuss another line of work, for which rigorous convergence results are known. To circumvent mathematical difficulties related to zero entries appearing in IRLS iterates, one can choose a small positive constant $\eta > 0$ and define a modified version of the IRLS update:

(6)  $x^{(k+1)} = \operatorname{argmin}\left\{ \sum_{i=1}^n \frac{x_i^2}{\sqrt{(x_i^{(k)})^2 + \eta^2}} : x \in \mathbb{R}^n, \ Ax = b \right\}.$

Note that the above minimization problem makes perfect sense even when $x_i^{(k)} = 0$ for some $i$. Consequently, it has a unique solution for every choice of $x^{(k)}$. It was proved in [Bec15] that the sequence of points produced by scheme (6) converges to the optimal solution of:

(7)  $\min \sum_{i=1}^n \left( x_i^2 + \eta^2 \right)^{1/2} \quad \text{s.t.} \quad Ax = b.$

The number of iterations required to get $\varepsilon$-close to the optimal solution is bounded by $O\!\left(\frac{C}{\varepsilon}\right)$, where $C$ is a quantity depending on $A, b$ and $\eta$. The function $\sum_{i=1}^n (x_i^2 + \eta^2)^{1/2}$ approximates the $\ell_1$-norm in the following sense:

$\forall x \in \mathbb{R}^n \quad \|x\|_1 \leq \sum_{i=1}^n \left( x_i^2 + \eta^2 \right)^{1/2} \leq \|x\|_1 + n \cdot \eta.$
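The regularized update (6) is easy to sketch: the weights $1/\sqrt{(x_i^{(k)})^2 + \eta^2}$ are always bounded, so no division by zero can occur. The instance, iteration count, and value of $\eta$ below are hypothetical choices for illustration.

```python
import numpy as np

def irls_regularized(A, b, eta=1e-3, num_iters=100):
    """Regularized IRLS update (6), a sketch.

    Minimizing sum_i x_i^2 / s_i with s_i = sqrt(x_i^2 + eta^2) has the
    closed form x <- S A^T (A S A^T)^+ b with S = Diag(s); since s >= eta,
    every step is well defined even when iterates have zero entries.
    """
    n = A.shape[1]
    x = np.zeros(n)  # zero entries are harmless here
    for _ in range(num_iters):
        S = np.diag(np.sqrt(x**2 + eta**2))
        x = S @ A.T @ np.linalg.pinv(A @ S @ A.T) @ b
    return x

rng = np.random.default_rng(2)
A = rng.standard_normal((4, 10))
b = A @ rng.standard_normal(10)
x = irls_regularized(A, b)
print("residual:", np.linalg.norm(A @ x - b))
```

As the text notes, the iterates converge to the optimum of the smoothed objective (7), which is within $n\eta$ of the $\ell_1$-norm.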
In the case when the matrix $A$ satisfies a variant of RIP (Restricted Isometry Property), [DDFG10] showed that a scheme similar to (7) (with $\eta_k \to 0$ in place of a constant $\eta$) converges to the $\ell_1$-optimizer. The proof relies on non-constructive arguments (compactness is repeatedly used to obtain certain accumulation points), hence no quantitative bounds on the global convergence rate follow from this analysis.

Physarum dynamics. The discrete Physarum dynamics we propose for the basis pursuit problem can be seen as an analogue of the similarly looking, but technically very different, dynamics for linear programming studied in [JZ12, SV16b]. Our second main result (Theorem 5.1) builds upon, extends and simplifies a recent result [SV16a] of the authors for the case of flows, i.e. when the matrix $A$ corresponds to an incidence matrix of an undirected graph. For more on prior work on Physarum dynamics, the reader is referred to [SV16a].

3. Preliminaries

Notation for sets, vectors and matrices. The set $\{1, 2, \ldots, n\}$ is denoted by $[n]$. All vectors considered are column vectors. By $x \in \mathbb{R}^Q$, for some finite set $Q$, we mean a $|Q|$-dimensional real vector indexed by elements of $Q$; similarly for matrices. If $x \in \mathbb{R}^n$ is a vector then $x_S$, for $S \subseteq [n]$, denotes a vector in $\mathbb{R}^S$ which is the restriction of $x$ to indices in $S$. The basis pursuit problem is to find a minimum $\ell_1$-norm solution to the linear system $Ax = b$, where $A$ is an $m \times n$ matrix. We assume that $A$ has rank $m$.¹ The $i$-th column of $A$ is denoted by $a_i \in \mathbb{R}^m$. If $x \in \mathbb{R}^n$ then by $X$ we mean the $n \times n$ real diagonal matrix with $x$ on the diagonal, i.e. $X = \mathrm{Diag}(x)$. Whenever $x \in \mathbb{R}^n$ is a vector and a scalar operation is applied to it, the result is a vector with this scalar operation applied to every entry. For example, $|x|$ denotes the vector $y \in \mathbb{R}^n$ with $y_i = |x_i|$ for every $i \in [n]$.
When writing inequalities between vectors, like $x \leq y$ (for $x, y \in \mathbb{R}^n$), we mean that $x_i \leq y_i$ for all $i \in [n]$; also $x > 0$ means $x_i > 0$ for every $i \in [n]$. For a symmetric matrix $M \in \mathbb{R}^{d \times d}$ we denote by $M^+$ its Moore–Penrose pseudoinverse. It satisfies $M M^+ x = M^+ M x = x$ for every $x \in \mathbb{R}^d$ from the image of $M$.

Weighted $\ell_2$-minimization. The weighted $\ell_2$-minimization problem is the following: for a given matrix $A \in \mathbb{R}^{m \times n}$, vector $b \in \mathbb{R}^m$ and weights $s \in \mathbb{R}^n_{>0}$, find:

$\operatorname{argmin}_{x \in \mathbb{R}^n} \sum_{i=1}^n s_i x_i^2 \quad \text{s.t.} \quad Ax = b.$

One can show that if the linear system $Ax = b$ has a solution, then the above has a unique solution $q \in \mathbb{R}^n$ which can be computed as:

$q = S^{-1} A^\top (A S^{-1} A^\top)^+ b, \qquad \text{where } S = \mathrm{Diag}(s).$

3.1. The IRLS algorithm. We now present the Iteratively Reweighted Least Squares (IRLS) algorithm.² For readability, some technical details are omitted; however, we leave remarks wherever additional care is required. Consider the basis pursuit problem:

(8)  $\min \|x\|_1 \quad \text{s.t.} \quad Ax = b,$

where $x \in \mathbb{R}^n$, $A \in \mathbb{R}^{m \times n}$ and $b \in \mathbb{R}^m$. The algorithm starts from an arbitrary point $y^{(0)} \in \mathbb{R}^n$, e.g. $y^{(0)} = (1, 1, \ldots, 1)^\top$, and performs the following iterations for $k = 0, 1, 2, \ldots$:

(9)  $y^{(k+1)} = \operatorname{argmin}\left\{ \sum_{i=1}^n \frac{x_i^2}{|y_i^{(k)}|} : x \in \mathbb{R}^n, \ Ax = b \right\}.$

Thus, the new point is a result of $\ell_2$-minimization with weights coming from the previous iteration.

Remark 3.1. Note that the above is well defined only if $y_i^{(k)} \neq 0$ for every $i \in [n]$. Additional care is required to deal with the case where some $y_i^{(k)}$ are zero. Informally, one can imagine that if $y_i^{(k)} = 0$ then the weight on the $i$-th coordinate is $+\infty$; hence, one is forced to choose $x_i = 0$. In fact the formal treatment follows this intuition: whenever $y_i^{(k)} = 0$, one adds a hard constraint $x_i = 0$ and performs the weighted $\ell_2$-norm minimization over the non-zero coordinates.

¹ It is enough here to assume $b \in \mathrm{Im}(A)$ only.
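The closed form for the weighted $\ell_2$-minimizer can be checked against the KKT conditions of this equality-constrained quadratic program. The sketch below uses hypothetical random data and the (standard) KKT system $\begin{pmatrix} 2S & A^\top \\ A & 0 \end{pmatrix}\begin{pmatrix} x \\ \lambda \end{pmatrix} = \begin{pmatrix} 0 \\ b \end{pmatrix}$.

```python
import numpy as np

rng = np.random.default_rng(3)
m, n = 3, 6
A = rng.standard_normal((m, n))
b = rng.standard_normal(m)
s = rng.uniform(0.5, 2.0, size=n)  # positive weights

# Closed form: q = S^{-1} A^T (A S^{-1} A^T)^+ b with S = Diag(s).
S_inv = np.diag(1.0 / s)
q = S_inv @ A.T @ np.linalg.pinv(A @ S_inv @ A.T) @ b

# KKT system of min { sum_i s_i x_i^2 : Ax = b }.
KKT = np.block([[2 * np.diag(s), A.T],
                [A, np.zeros((m, m))]])
sol = np.linalg.solve(KKT, np.concatenate([np.zeros(n), b]))

print(np.allclose(q, sol[:n]))  # the two solutions agree
```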
To simplify notation, we work with the full-rank assumption. One can reduce the general case to full rank by removing some number of rows from $A$; both dynamics remain the same.

² As mentioned before, IRLS is in fact a general algorithmic scheme; however, in the remaining part of the paper, by IRLS we always mean the specific IRLS for basis pursuit.

We remark that the $\ell_2$-minimization problem in the update rule has a closed-form solution involving a projection:

$y^{(k+1)} = Y^{(k)} A^\top (A Y^{(k)} A^\top)^+ b, \qquad \text{where } Y^{(k)} \stackrel{\text{def}}{=} \mathrm{Diag}(|y^{(k)}|).$

It is easy to show that the $\ell_1$-norm of the subsequent iterates $y^{(1)}, y^{(2)}, \ldots$ is non-increasing; however, this does not necessarily imply that IRLS converges to the optimal solution. In fact, no result on global convergence (to an optimal solution to (8)) is known for IRLS. While the convergence is indeed observed in practice, it remains open to prove this. One issue is that there are examples of instances and starting points where the sequence provably does not converge to the optimal solution; see Appendix A. However, we believe that the following conjecture might hold regarding the convergence of IRLS.

Conjecture 3.2. The set of starting points $y^{(0)} \in \mathbb{R}^n$ for which the sequence $\{y^{(k)}\}_{k \in \mathbb{N}}$ generated by IRLS does not converge to an optimal solution to (8) is of measure zero.

One of the main obstacles in proving global convergence for IRLS is its "non-uniform" behavior, depending on the support of the current point. Unfortunately, the issue of $y^{(k)}$ having zero entries cannot be avoided. Note that this problem is not only a mathematical inconvenience. In fact, when dealing with instances where one or more entries of $y^{(k)}$ are close to zero, numerical issues are likely to appear. When solving a minimization problem of the kind (9), tiny values of $y_i^{(k)}$ can be unpleasant to deal with and cause errors.

3.2.
Continuous Physarum dynamics for $\ell_1$-minimization. The Physarum dynamics was originally introduced for an undirected graph $G = (V, E)$ as a continuous-time dynamical system over $\mathbb{R}^E_{>0}$ ([TKN07]). This model was proposed to explain the experimentally observed ability of Physarum to solve the shortest path problem. It was then extended to a more general flow problem: the transshipment problem ([IJNT11, BMV12]). We propose an even more general treatment, in which there is no underlying graph, but just an abstract $\ell_1$-minimization problem over an affine subspace (8). Throughout our discussion we assume that $A$ has rank $m$ (thus in particular (8) is feasible). We start by giving the continuous dynamics and subsequently turn it into a discrete one. The continuous Physarum dynamics³ starts from an arbitrary positive point $w(0) \in \mathbb{R}^n_{>0}$; its instantaneous velocity vector is given by

(10)  $\frac{dw(t)}{dt} = |q(t)| - w(t),$

where $q(t) \in \mathbb{R}^n$ is computed as

(11)  $q(t) = W(t) A^\top (A W(t) A^\top)^{-1} b.$

Here $W(t)$ denotes the diagonal matrix $\mathrm{Diag}(w(t))$. In the case of the shortest path or the transshipment problem, the vector $q(t)$ corresponds to an electrical flow. It can be equivalently described as the minimizer of the weighted $\ell_2$-norm $\left( \sum_{i=1}^n \frac{x_i^2}{w_i(t)} \right)^{1/2}$ over $\{x \in \mathbb{R}^n : Ax = b\}$. Let us now state an important fact regarding (10).

Theorem 3.3. For every initial condition $w(0) \in \mathbb{R}^n_{>0}$ there exists a global solution $w : [0, \infty) \to \mathbb{R}^n_{>0}$ satisfying (10).

³ More generally, we can define a dynamics solving the above problem with the objective replaced by $\sum_{i=1}^n c_i |x_i|$ for any $c \in \mathbb{R}^n_{>0}$. The uniform cost case $c = (1, 1, \ldots, 1)^\top$ is however the most interesting one (as the non-uniform case reduces to it by scaling).

We omit the proof.
Let us only mention that the update rule is defined by a locally Lipschitz continuous function, hence the solution to (10) exists locally. To prove global existence, one needs to show in addition that no solution curve approaches the boundary of $\mathbb{R}^n_{>0}$ in finite time. We refer the reader to [SV16b], where a complete proof of existence for a related dynamics is presented. (Though the case of (10) is much simpler.)

3.3. Discrete Physarum dynamics. We apply Euler's method to discretize the Physarum dynamics from the previous subsection. Pick a small positive step size $h \in (0, 1)$ and observe that:

$w(t + h) - w(t) \approx h\, \dot{w}(t) = h\, (|q(t)| - w(t)).$

Hence, $w(t + h) \approx h |q(t)| + (1 - h) w(t)$. This motivates the following discrete process: pick any $w^{(0)} \in \mathbb{R}^n_{>0}$ and iterate for $k = 0, 1, \ldots$:

(12)  $w^{(k+1)} = h |q^{(k)}| + (1 - h) w^{(k)},$

where, as previously, $q^{(k)}$ is the result of $\ell_2$-minimization performed with respect to the weights $1/w_i^{(k)}$. It is given explicitly by the formula $q^{(k)} = W^{(k)} A^\top (A W^{(k)} A^\top)^{-1} b$.

4. IRLS vs Physarum

In this section we present a proof of Theorem 1.1. When comparing IRLS with the Physarum dynamics, one can already see similarities between these two algorithms: both of them are iterative methods which use weighted $\ell_2$-minimization to perform the update. However, apart from this observation, no formal connection is apparent. Physarum defines a sequence of strictly positive vectors whose $\ell_1$-norm converges to the optimal $\ell_1$-norm; in particular, the iterates are never feasible. The iterates of the IRLS algorithm, on the other hand, starting from $k = 1$, lie in the feasible region. It turns out that the key to understanding how these algorithms are related to each other is to consider them as algorithms working in a larger space: $\mathbb{R}^n \times \mathbb{R}^n_{>0}$.
We show in the subsequent subsections that both algorithms can be seen as maintaining a pair $(y, w) \in \mathbb{R}^n \times \mathbb{R}^n_{>0}$ such that $y$ satisfies $Ay = b$ and $w$ is the vector of weights guiding the $\ell_2$-minimization. Interestingly, in the original presentation of the Physarum dynamics only the $w$ variable is apparent. In contrast, IRLS keeps track of just the $y$ variables. This viewpoint allows us to explain how these two algorithms follow essentially the same update rule.

4.1. Physarum dynamics and hidden variables. Recall that the Physarum dynamics was defined as starting from some point $w^{(0)} \in \mathbb{R}^n_{>0}$ and evolving according to the rule $w^{(k+1)} = (1-h)\, w^{(k)} + h\, |q^{(k)}|$ with $h \in (0, 1)$. Note that $w^{(k)}$ does not quite converge to the optimal solution (it is always positive). The only guarantee we can prove is that $\|w^{(k)}\|_1$ tends to $\|x^\star\|_1$ (with $x^\star$ being any optimal solution to (8)). Can we recover $x^\star$ from this process? Suppose that the starting point $w^{(0)}$ is not arbitrary, but chosen in a specific way. Let $y \in \mathbb{R}^n$ be any solution to $Ay = b$, for instance the least-squares solution. For $w^{(0)}$ we choose any vector $w \in \mathbb{R}^n_{>0}$ which satisfies $|y| \leq w$ entry-wise. Hence, our starting point $w^{(0)}$ belongs to the set:

$K \stackrel{\text{def}}{=} \{w \in \mathbb{R}^n_{>0} : \exists\, y \in \mathbb{R}^n \ \text{s.t.} \ Ay = b \ \text{and} \ |y| \leq w\}.$

We now observe a surprising fact.

Fact 4.1. If $\{w^{(k)}\}_{k \in \mathbb{N}}$ is a sequence of points produced by the Physarum dynamics and $w^{(0)} \in K$, then $w^{(k)} \in K$ for every $k \in \mathbb{N}$.

Proof. The proof goes by induction. For $k = 0$ the claim holds. Let $k \geq 0$ and consider $w^{(k+1)}$. We have $w^{(k+1)} = (1-h)\, w^{(k)} + h\, |q^{(k)}|$. Hence, if $y$ certifies that $w^{(k)} \in K$ (i.e. $Ay = b$ and $|y| \leq w^{(k)}$), then

$|(1-h)\, y_i + h\, q_i^{(k)}| \leq (1-h)\, |y_i| + h\, |q_i^{(k)}| \leq (1-h)\, w_i^{(k)} + h\, |q_i^{(k)}| = w_i^{(k+1)}.$

In other words, $|(1-h)\, y + h\, q^{(k)}| \leq w^{(k+1)}$.
This implies that $w^{(k+1)} \in K$, since indeed $A\left((1-h)\, y + h\, q^{(k)}\right) = b$.

The above proof actually shows more. Let $y^{(0)} \in \mathbb{R}^n$ be any point satisfying $Ay^{(0)} = b$ and let $w^{(0)} \in \mathbb{R}^n_{>0}$ satisfy $|y^{(0)}| \leq w^{(0)}$. If we evolve the pair $(y^{(k)}, w^{(k)})$ according to the rules:

$w^{(k+1)} = (1-h)\, w^{(k)} + h\, |q^{(k)}|, \qquad y^{(k+1)} = (1-h)\, y^{(k)} + h\, q^{(k)},$

then $Ay^{(k)} = b$ and $|y^{(k)}| \leq w^{(k)}$ for every $k \in \mathbb{N}$. This implies in particular that

$\forall k \in \mathbb{N} \quad \|x^\star\|_1 \leq \|y^{(k)}\|_1 \leq \|w^{(k)}\|_1.$

Thus, proving convergence of the Physarum dynamics is equivalent to showing an appropriate upper bound on $\|w^{(k)}\|_1$. The above interpretation of Physarum, as simultaneously evolving two sets of variables, is key to understanding its connection to IRLS.

4.2. IRLS as alternate minimization. We now present IRLS from a (known) alternate minimization viewpoint; see [Bec15, DDFG10]. Consider the following function $J : \mathbb{R}^n \times \mathbb{R}^n_{>0} \to \mathbb{R}$:

$J(y, w) = \sum_{i=1}^n \frac{y_i^2}{w_i} + \sum_{i=1}^n w_i.$

$J$ is not well defined when $w_i = 0$ for some $i$, but for simplicity let us now ignore this issue.⁴ It turns out that IRLS can be seen as an alternate minimization method applied to the function $J$. Let us first remark that $J$ is not a convex function. However, when either $y$ or $w$ is fixed, then $J$ is convex as a function of the remaining variables. Consider the following alternate minimization algorithm for $J(y, w)$:

(1) Start with $w^{(0)} = (1, 1, \ldots, 1)^\top$.
(2) For $k = 0, 1, 2, \ldots$:
• let $y^{(k+1)}$ be the $y$ which minimizes $J(y, w^{(k)})$ over $y \in \mathbb{R}^n$, $Ay = b$;
• let $w^{(k+1)}$ be the $w$ which minimizes $J(y^{(k+1)}, w)$ over $w \in \mathbb{R}^n_{>0}$.

The above method tries to minimize the function $J(y, w)$ by alternating between minimization over $y$ with $w$ fixed and minimization over $w$ with $y$ fixed. In general, such a scheme is not guaranteed to converge to a global optimum (especially when $J$ is non-convex).
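The coupled evolution of the pair $(y^{(k)}, w^{(k)})$ described above can be checked numerically: both invariants ($Ay^{(k)} = b$ and $|y^{(k)}| \leq w^{(k)}$) are preserved at every step. The instance, step size, and iteration count below are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(4)
m, n, h = 3, 7, 0.2
A = rng.standard_normal((m, n))
b = rng.standard_normal(m)

# Start from a feasible y^(0) (least squares) and any w^(0) >= |y^(0)|.
y = A.T @ np.linalg.pinv(A @ A.T) @ b
w = np.abs(y) + 1.0

for _ in range(60):
    W = np.diag(w)
    q = W @ A.T @ np.linalg.solve(A @ W @ A.T, b)
    y = (1 - h) * y + h * q          # hidden-variable update
    w = (1 - h) * w + h * np.abs(q)  # Physarum update (3)
    assert np.allclose(A @ y, b)          # y stays feasible
    assert np.all(np.abs(y) <= w + 1e-9)  # invariant |y| <= w

print("||y||_1 =", np.linalg.norm(y, 1), " ||w||_1 =", np.linalg.norm(w, 1))
```

In particular $\|y^{(k)}\|_1 \leq \|w^{(k)}\|_1$ throughout, matching the sandwich inequality above.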
We now describe what these partial minimization steps correspond to.

Fact 4.2. Suppose that $w \in \mathbb{R}^n_{>0}$ is fixed; then:

$\operatorname{argmin}_y \{J(y, w) : y \in \mathbb{R}^n, \ Ay = b\} = \operatorname{argmin}_y \left\{ \sum_{i=1}^n \frac{y_i^2}{w_i} : y \in \mathbb{R}^n, \ Ay = b \right\}.$

⁴ The correct way to define $J(y, w)$ in the presence of zero entries is the following: whenever $w_i = y_i = 0$ we set $\frac{y_i^2}{w_i} = 0$, and whenever $w_i = 0$ and $y_i \neq 0$ we define $\frac{y_i^2}{w_i} = +\infty$.

The proof is straightforward; the only point worth noting is that the second term in $J(y, w)$ does not depend on $y$ and hence does not need to be taken into account. We now analyze the second step.

Fact 4.3. Suppose that $y \in \mathbb{R}^n$ is fixed and $y_i \neq 0$ for all $i \in [n]$; then:

$\operatorname{argmin} \{J(y, w) : w \in \mathbb{R}^n_{>0}\} = |y|.$

In the above we make the simplifying assumption that no entry of $y$ is zero. This is not crucial, but to drop this assumption a more rigorous treatment is necessary. It can be done, at the cost of making the notation less transparent.

Proof. We would like to minimize

$J(y, w) = \sum_{i=1}^n \frac{y_i^2}{w_i} + \sum_{i=1}^n w_i$

for a fixed $y$. Note that the above function is separable, hence it suffices to minimize $\frac{y_i^2}{w_i} + w_i$ separately for every $i$. By a simple calculation one can find that the above expression is minimized when $w_i = |y_i|$.

Note now that Facts 4.2 and 4.3 together imply that the sequence $y^{(1)}, y^{(2)}, \ldots$ resulting from alternate minimization is the same as that produced by IRLS. As a byproduct, we also obtain that $\|y^{(k)}\|_1$ is non-increasing with $k$, because:

$J\left(y^{(k)}, w^{(k)}\right) = \sum_{i=1}^n \frac{(y_i^{(k)})^2}{w_i^{(k)}} + \sum_{i=1}^n w_i^{(k)} = \sum_{i=1}^n \frac{(y_i^{(k)})^2}{|y_i^{(k)}|} + \sum_{i=1}^n |y_i^{(k)}| = 2\, \|y^{(k)}\|_1.$

Of course $J(y^{(k)}, w^{(k)})$ is non-increasing for $k \geq 1$, hence $\|y^{(k)}\|_1$ is non-increasing as well.

4.3. Comparing IRLS with Physarum. In this subsection we conclude our previous considerations by giving a unifying viewpoint on Physarum and IRLS.
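Fact 4.3 (and the AM–GM inequality $\frac{y_i^2}{w_i} + w_i \geq 2|y_i|$ behind it) is simple to sanity-check numerically; the data below is hypothetical.

```python
import numpy as np

def J(y, w):
    """The alternate-minimization potential J(y, w) from Section 4.2."""
    return np.sum(y**2 / w) + np.sum(w)

rng = np.random.default_rng(5)
y = rng.standard_normal(6)      # fixed y with (almost surely) no zeros
w_star = np.abs(y)              # claimed minimizer (Fact 4.3)

# J(y, |y|) = 2 ||y||_1, and no random positive w should do better.
assert np.isclose(J(y, w_star), 2 * np.linalg.norm(y, 1))
for _ in range(1000):
    w = rng.uniform(0.01, 3.0, size=6)
    assert J(y, w) >= J(y, w_star) - 1e-12
print("Fact 4.3 check passed")
```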
In fact, both of them can be seen as algorithms working in the $2n$-dimensional space $\Gamma = \mathbb{R}^n \times \mathbb{R}^n_{>0}$. Let us state both algorithms in a similar form.

Algorithm 1: IRLS
Data: $A \in \mathbb{R}^{m \times n}$, $b \in \mathbb{R}^m$
$w^{(0)} = (1, 1, \ldots, 1)^\top \in \mathbb{R}^n$;
for $k = 0, 1, 2, \ldots$ do
  $q = \operatorname{argmin} \sum_{i=1}^n \frac{x_i^2}{w_i^{(k)}}$ s.t. $Ax = b$;
  $y^{(k+1)} = q$;
  $w^{(k+1)} = |q|$;
end

Algorithm 2: Physarum
Data: $A \in \mathbb{R}^{m \times n}$, $b \in \mathbb{R}^m$
$w^{(0)} = (1, 1, \ldots, 1)^\top \in \mathbb{R}^n$, $y^{(0)}$ with $Ay^{(0)} = b$, $h \in (0, 1)$;
for $k = 0, 1, 2, \ldots$ do
  $q = \operatorname{argmin} \sum_{i=1}^n \frac{x_i^2}{w_i^{(k)}}$ s.t. $Ax = b$;
  $y^{(k+1)} = (1-h)\, y^{(k)} + h\, q$;
  $w^{(k+1)} = (1-h)\, w^{(k)} + h\, |q|$;
end

The above comparison yields a clear connection between IRLS and Physarum. Let us define a vector field $F : \Gamma \to \mathbb{R}^n \times \mathbb{R}^n$ by the following formula:

(13)  $F(y, w) \stackrel{\text{def}}{=} (q - y, \ |q| - w), \qquad q \stackrel{\text{def}}{=} \operatorname{argmin}_{x \in \mathbb{R}^n} \sum_{i=1}^n \frac{x_i^2}{w_i} \quad \text{s.t.} \quad Ax = b.$

The IRLS algorithm, given a point $(y, w) \in \mathbb{R}^n \times \mathbb{R}^n_{>0}$, simply moves along the vector $F(y, w)$ to the new point $(y, w) + F(y, w)$, while Physarum moves to a point on the interval between $(y, w)$ and $(y, w) + F(y, w)$. For this reason, Physarum can be seen as a damped variant of IRLS. Let us now define two interesting subsets of $\Gamma$ (see Figure 1 for a one-dimensional example):

$P \stackrel{\text{def}}{=} \{(y, w) \in \Gamma : Ay = b, \ |y| \leq w\}, \qquad \bar{P} \stackrel{\text{def}}{=} \{(y, w) \in \Gamma : Ay = b, \ |y| = w\}.$

IRLS can be seen as a discrete dynamical system defined over $\bar{P}$, while Physarum initialized at a point in $P$ stays in $P$, for any choice of $h \in (0, 1)$.⁵

[Figure 1. Illustration of $\Gamma$, $P$ and $\bar{P}$ for $n = 1$: $\bar{P} = \{(y, w) : |y| = w\}$ is the boundary of the convex region $P = \{(y, w) : |y| \leq w\}$.]

Interestingly, $\bar{P}$ is a non-convex set, which is the boundary of $P$ (in contrast, $P$ is convex). In the next section we prove that Physarum never faces the issue of $w_i^{(k)}$ being zero for some $i$; indeed $w^{(k)} > 0$ for every $k$, which follows from the fact that $h < 1$.
In contrast, Physarum with $h = 1$ is equivalent to IRLS, where this happens frequently.

⁵ Physarum initialized at a point outside of $P$ converges to $P$.

5. Convergence and Complexity of Physarum Dynamics

In this section we study convergence of the Physarum dynamics. The analysis is based on ideas developed in [BBD+13, SV16a, SV16b]. Specifically, we prove the following theorem, whose informal version appeared as Theorem 1.2. Let

$\alpha \stackrel{\text{def}}{=} \max \{|\det(A')| : A' \text{ is a square submatrix of } A\}.$

Theorem 5.1. Suppose $w^{(0)}$ was chosen to satisfy $|y^{(0)}| \leq w^{(0)}$ for some $y^{(0)} \in \mathbb{R}^n$ such that $Ay^{(0)} = b$. Furthermore, assume $w_i^{(0)} \geq 1$ for every $i \in [n]$ and $\|w^{(0)}\|_1 \leq M \|x^\star\|_1$ for some $M \in \mathbb{R}$. Let $\varepsilon \in (0, 1/2)$ and $h \leq \frac{\varepsilon}{40 n^2 \alpha^2}$. Then after $k = O\!\left(\frac{\ln M + \ln \|x^\star\|_1}{h \varepsilon^2}\right)$ steps, $\|w^{(k)}\|_1 \leq (1 + \varepsilon) \|x^\star\|_1$ and one can easily recover a vector $y^{(k)}$ such that $Ay^{(k)} = b$ and $\|y^{(k)}\|_1 \leq \|w^{(k)}\|_1$.

A few comments are in order. The assumptions about the starting point $w^{(0)}$ made in the statement are not necessary for convergence. However, they greatly simplify the proofs and make it easy to recover a close-to-optimal feasible solution to (1). The choice of the step size $h$ follows directly from our analysis and is not likely to be optimal. Experiments suggest that the claimed iteration bound should hold even for $h$ being a small constant (not depending on the data).

Assumptions, notation and simple facts. Motivated by the observation about hidden variables made in Section 4, we assume that the starting point $w^{(0)}$ is chosen in such a way that $w^{(0)} > 0$ and $|y^{(0)}| \leq w^{(0)}$ for some $y^{(0)} \in \mathbb{R}^n$ such that $Ay^{(0)} = b$. Recall that in that case, for every $k$, we are guaranteed the existence of a feasible $y^{(k)}$ with $|y^{(k)}| \leq w^{(k)}$. Moreover, these $y^{(k)}$ are easy to find. One particular choice of $y^{(0)}$ and $w^{(0)}$ could be the least-squares solution to $Ax = b$ and $w_i^{(0)} = |y_i^{(0)}| + 1$, respectively.
Let us now verify that $w^{(k)} > 0$ at all steps, and hence that the Physarum dynamics is well defined.

Lemma 5.2. For every $k$, $w^{(k)} \in \mathbb{R}^n_{>0}$.

Proof. The proof goes via simple induction. For $k = 0$ the claim is valid by the assumption that $w^{(0)} > 0$; next, for $k \geq 0$ we have:

$w_i^{(k+1)} = (1-h)\, w_i^{(k)} + h\, |q_i^{(k)}| > 0$

because $h \in (0, 1)$.

The above lemma shows in particular that the weighted $\ell_2$-minimization problem solved in every step indeed has a unique optimal solution. For the convergence proof, let us fix $x^\star \in \mathbb{R}^n$ to be any optimal solution to our $\ell_1$-minimization problem (8). Without loss of generality we may assume that $x^\star \geq 0$ (if not, multiply by $(-1)$ all the columns of $A$ which correspond to negative entries in $x^\star$; this changes neither the problem nor the sequence produced by Physarum). To track the convergence process of Physarum, the two following quantities are useful:

(1) $E(k) = \sum_{i=1}^n \frac{(q_i^{(k)})^2}{w_i^{(k)}}$,
(2) $B(k) = \sum_{i=1}^n x_i^\star \ln w_i^{(k)}$.

A technical lemma. The following technical lemma from [SV16b] is particularly useful in our setting. We state the lemma together with a proof to make the paper self-contained. For a version with quantitative bounds we refer the reader to [SV16b].

Lemma 5.3. Consider a weight vector $w \in \mathbb{R}^n_{>0}$; then the matrix $L = A W A^\top$ is invertible and:

$\forall i, j \in [n] \quad |a_i^\top L^{-1} a_j| \leq \frac{\alpha}{w_i},$

where $\alpha \in \mathbb{R}$ is a constant which depends solely on $A$.

Proof. Take any weight vector $w \in \mathbb{R}^n_{>0}$ and pick $i, j \in [n]$. By symmetry, it is enough to establish the above bound with $w_i$ replaced by $w_j$. We first note that:

$w_j\, a_j a_j^\top \preceq \sum_{k=1}^n w_k\, a_k a_k^\top = L,$

where $\preceq$ is the Löwner ordering. By testing the above on the vector $L^{-1} a_j$, we obtain:

$(L^{-1} a_j)^\top\, w_j\, a_j a_j^\top\, (L^{-1} a_j) \leq (L^{-1} a_j)^\top L\, (L^{-1} a_j).$

By a simple calculation, the above yields:

$a_j^\top L^{-1} a_j \leq \frac{1}{w_j}.$
In the remaining part of the argument we show that $|a_i^\top L^{-1} a_j| \le \alpha \, a_j^\top L^{-1} a_j$, where $\alpha$ will be specified later. If $a_i^\top L^{-1} a_j = 0$ then there is nothing to prove. Otherwise, let $p = L^{-1} a_j$; we may assume without loss of generality that $a_k^\top p \ge 0$ for every $k \in [n]$ (we may reduce our problem to this case by multiplying some of the $a_k$'s by $-1$). Because $Lp = a_j$, we obtain
$$a_j = Lp = \left( \sum_{k=1}^n w_k a_k a_k^\top \right) p = \sum_{k=1}^n w_k (a_k^\top p)\, a_k.$$
Hence we have obtained a representation of $a_j$ as a conic combination of $a_1, a_2, \ldots, a_n$. Moreover, the set
$$S_{ij} = \left\{ s \in \mathbb{R}^n : s \ge 0, \ \sum_{k=1}^n s_k a_k = a_j, \ s_i > 0 \right\}$$
is non-empty. Let us take an element $r \in \mathbb{R}^n$ of $S_{ij}$ which maximizes $r_i$. Note that $S_{ij}$ depends solely on $A$. Hence there is some lower bound on $r_i$; let us call it $\frac{1}{\alpha}$ for some $\alpha > 0$. We obtain
$$a_j^\top L^{-1} a_j = p^\top a_j = p^\top \left( \sum_{k=1}^n r_k a_k \right) = \sum_{k=1}^n r_k (p^\top a_k) \ge r_i \, p^\top a_i \ge \frac{1}{\alpha}\, a_i^\top L^{-1} a_j. \quad \square$$

Remark 5.4. From now on we state all bounds with respect to the $\alpha$ obtained in the above lemma. [SV16b] shows that if $A$ is a matrix with integer entries then $\alpha$ can be chosen to be $\max\{|\det(A_0)| : A_0 \text{ is a square submatrix of } A\}$. In general, one can bound $\alpha$ in terms of the maximum absolute value of the entries of $(A_0)^{-1}$ over all invertible square submatrices $A_0$ of $A$.

The following corollary is used multiple times in the convergence proof. Recall that we work under the assumption that $w^{(0)} \ge |y^{(0)}|$ for some $y^{(0)} \in \mathbb{R}^n$ with $A y^{(0)} = b$.

Corollary 5.5. Suppose that $(w^{(k)})_{k \in \mathbb{N}}$ is the sequence produced by Physarum and $(q^{(k)})_{k \in \mathbb{N}}$ is the corresponding sequence of weighted $\ell_2$-minimizers. Then, for every $k$,
$$\forall i \in [n] \quad \frac{|q^{(k)}_i|}{w^{(k)}_i} \le n\alpha,$$
for $\alpha$ being the same constant as in Lemma 5.3.

Proof. Let $L = A W^{(k)} A^\top$ (note that both $L$ and $L^{-1}$ are symmetric matrices); then
$$q^{(k)} = W^{(k)} A^\top L^{-1} b.$$
Hence
$$\frac{|q^{(k)}_i|}{w^{(k)}_i} = |a_i^\top L^{-1} b|.$$
Recall that $b = A y^{(k)}$ where $|y^{(k)}| \le w^{(k)}$, hence
$$\frac{|q^{(k)}_i|}{w^{(k)}_i} = |a_i^\top L^{-1} A y^{(k)}| \le \sum_{j=1}^n |y^{(k)}_j| \cdot |a_i^\top L^{-1} a_j| = \sum_{j=1}^n |y^{(k)}_j| \cdot |a_j^\top L^{-1} a_i| \stackrel{\text{Lemma 5.3}}{\le} \sum_{j=1}^n |y^{(k)}_j| \cdot \frac{\alpha}{w^{(k)}_j} \le n\alpha. \quad \square$$

Analysis of potentials.

Lemma 5.6. For every $k \in \mathbb{N}$ we have $\|w^{(k+1)}\|_1 \le \|w^{(k)}\|_1$. Furthermore, if for some $\varepsilon \in (0, \frac12)$ we have $\|w^{(k)}\|_1 > (1 + \frac{\varepsilon}{3}) E(k)$, then $\|w^{(k+1)}\|_1 \le (1 - \frac{h\varepsilon}{8}) \|w^{(k)}\|_1$.

Proof. We have
$$\|w^{(k)}\|_1 - \|w^{(k+1)}\|_1 = h \sum_{i=1}^n \left( w^{(k)}_i - |q^{(k)}_i| \right) = h \left( \|w^{(k)}\|_1 - \|q^{(k)}\|_1 \right).$$
Furthermore,
$$\|q^{(k)}\|_1 = \sum_{i=1}^n |q^{(k)}_i| = \sum_{i=1}^n \sqrt{w^{(k)}_i} \cdot \frac{|q^{(k)}_i|}{\sqrt{w^{(k)}_i}},$$
and by applying the Cauchy-Schwarz inequality we obtain
$$\sum_{i=1}^n \sqrt{w^{(k)}_i} \cdot \frac{|q^{(k)}_i|}{\sqrt{w^{(k)}_i}} \le \|w^{(k)}\|_1^{1/2} \cdot E(k)^{1/2}.$$
Thus we finally get
$$h \left( \|w^{(k)}\|_1 - \|q^{(k)}\|_1 \right) \ge h\, \|w^{(k)}\|_1^{1/2} \left( \|w^{(k)}\|_1^{1/2} - E(k)^{1/2} \right).$$
Since $q^{(k)}$ minimizes the weighted $\ell_2$-norm over the affine subspace $Ax = b$, we obtain
$$E(k) = \sum_{i=1}^n \frac{(q^{(k)}_i)^2}{w^{(k)}_i} \le \sum_{i=1}^n \frac{(y^{(k)}_i)^2}{w^{(k)}_i} \le \sum_{i=1}^n \frac{(w^{(k)}_i)^2}{w^{(k)}_i} = \|w^{(k)}\|_1.$$
Hence the first part of the lemma is proved. Assume now $\|w^{(k)}\|_1 > (1 + \frac{\varepsilon}{3}) E(k)$. We get
$$\|w^{(k)}\|_1 - \|w^{(k+1)}\|_1 \ge h\, \|w^{(k)}\|_1^{1/2} \left( \|w^{(k)}\|_1^{1/2} - E(k)^{1/2} \right) \ge h \left( 1 - \left(1 + \tfrac{\varepsilon}{3}\right)^{-1/2} \right) \|w^{(k)}\|_1.$$
It remains to note that $1 - (1 + \frac{\varepsilon}{3})^{-1/2} \ge \frac{\varepsilon}{8}$. $\square$

To analyze the behavior of $B(k)$ we use the following elementary inequality:
$$x - x^2 \le \ln(1+x) \le x, \tag{14}$$
which is valid for all $-\frac12 \le x \le \frac12$. Let us also state the following useful fact.

Fact 5.7. Let $w \in \mathbb{R}^n_{>0}$ and let $q \in \mathbb{R}^n$ be the solution to
$$\min_{x \in \mathbb{R}^n} \left\{ \sum_{i=1}^n \frac{x_i^2}{w_i} : Ax = b \right\}.$$
Then
$$b^\top L^{-1} b = \sum_{i=1}^n \frac{q_i^2}{w_i},$$
where $L = A W A^\top$.

Proof. We use the explicit formula $q = W A^\top L^{-1} b$. Note that $\sum_{i=1}^n \frac{q_i^2}{w_i} = q^\top W^{-1} q$ and hence
$$q^\top W^{-1} q = b^\top L^{-1} A W W^{-1} W A^\top L^{-1} b = b^\top L^{-1} (A W A^\top) L^{-1} b = b^\top L^{-1} b. \quad \square$$
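The explicit formula $q = W A^\top L^{-1} b$, the energy identity of Fact 5.7, and the first inequality in the proof of Lemma 5.3 ($a_j^\top L^{-1} a_j \le 1/w_j$) are all easy to sanity-check numerically. The matrix and weights below are arbitrary illustrative values, not an instance from the paper.

```python
import numpy as np

# Arbitrary full-row-rank A and positive weights (illustrative values only).
A = np.array([[1.0, 1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0, 1.0]])
b = np.array([1.0, 2.0])
w = np.array([0.5, 2.0, 1.0, 3.0])

L = A @ np.diag(w) @ A.T                 # L = A W A^T, invertible since w > 0
q = w * (A.T @ np.linalg.solve(L, b))    # explicit weighted l2-minimizer
assert np.allclose(A @ q, b)             # q is feasible

# Fact 5.7: the minimum weighted energy equals b^T L^{-1} b.
assert np.isclose(np.sum(q**2 / w), b @ np.linalg.solve(L, b))

# First step of the proof of Lemma 5.3: a_j^T L^{-1} a_j <= 1 / w_j.
M = A.T @ np.linalg.solve(L, A)          # M[i, j] = a_i^T L^{-1} a_j
assert np.all(np.diag(M) <= 1.0 / w + 1e-12)
print("checks passed")
```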
We continue with a lemma describing the behavior of $B(k)$.

Lemma 5.8. Suppose that $h \le \frac{\varepsilon}{40 (n\alpha)^2}$. Then for every $k$ it holds that
$$B(k+1) \ge B(k) + h \left( E(k) - \left(1 + \tfrac{\varepsilon}{10}\right) \|x^\star\|_1 \right).$$

Proof. We have
$$B(k+1) - B(k) = \sum_{i=1}^n x^\star_i \ln \frac{w^{(k+1)}_i}{w^{(k)}_i} = \sum_{i=1}^n x^\star_i \ln \left( 1 + h \left( \frac{|q^{(k)}_i|}{w^{(k)}_i} - 1 \right) \right).$$
For simplicity let $z_i \stackrel{\text{def}}{=} \frac{|q^{(k)}_i|}{w^{(k)}_i} - 1$. We apply the left-hand side of (14) to every summand; this is possible by our assumption $x^\star \ge 0$, and because $|z_i| \le n\alpha$ by Corollary 5.5 while $h \le \frac{\varepsilon}{40(n\alpha)^2}$, so that $|h z_i| \le \frac12$. We obtain
$$B(k+1) - B(k) \ge \sum_{i=1}^n x^\star_i (h z_i - h^2 z_i^2) = h \sum_{i=1}^n x^\star_i z_i - h^2 \sum_{i=1}^n x^\star_i z_i^2. \tag{15}$$
We analyze the linear and quadratic terms separately. We have
$$\sum_{i=1}^n x^\star_i z_i = \sum_{i=1}^n x^\star_i \frac{|q^{(k)}_i|}{w^{(k)}_i} - \|x^\star\|_1.$$
We lower-bound the first-order term:
$$\sum_{i=1}^n x^\star_i \frac{|q^{(k)}_i|}{w^{(k)}_i} \ge \sum_{i=1}^n x^\star_i \frac{q^{(k)}_i}{w^{(k)}_i} = (x^\star)^\top (W^{(k)})^{-1} q^{(k)} = (x^\star)^\top (W^{(k)})^{-1} W^{(k)} A^\top L^{-1} b = (x^\star)^\top A^\top L^{-1} b = b^\top L^{-1} b,$$
where $L = A W^{(k)} A^\top$. The above, together with Fact 5.7, gives
$$\sum_{i=1}^n x^\star_i \frac{|q^{(k)}_i|}{w^{(k)}_i} \ge b^\top L^{-1} b = E(k).$$
Thus we have obtained
$$\sum_{i=1}^n x^\star_i z_i \ge E(k) - \|x^\star\|_1. \tag{16}$$
To bound the quadratic term in (15) we just apply Corollary 5.5: since $|z_i| \le n\alpha$,
$$\sum_{i=1}^n x^\star_i z_i^2 \le \sum_{i=1}^n x^\star_i (n\alpha)^2 \le (2n\alpha)^2 \|x^\star\|_1.$$
We combine (15) with our bounds on the first- and second-order terms to obtain
$$B(k+1) - B(k) \ge h \left( E(k) - \|x^\star\|_1 \right) - h^2 (2n\alpha)^2 \|x^\star\|_1 \ge h \left( E(k) - \|x^\star\|_1 \right) - h \cdot \frac{\varepsilon}{10} \cdot \|x^\star\|_1. \quad \square$$

Convergence proof. We are ready to prove the main result.

Proof of Theorem 5.1: We would like to count the number of steps until the first moment when $\|w^{(k)}\|_1 \le (1+\varepsilon)\|x^\star\|_1$. By Lemma 5.6, the $\ell_1$-norm of $w^{(k)}$ is non-increasing in $k$, and whenever $\|w^{(k)}\|_1 > (1 + \frac{\varepsilon}{3}) E(k)$, the norm $\|w^{(k)}\|_1$ decreases by a multiplicative factor of $(1 - \frac{h\varepsilon}{8})$.
This means that there can be at most $\log_{(1 - \frac{h\varepsilon}{8})^{-1}} \frac{M}{1+\varepsilon} = O\!\left(\frac{\ln M}{h\varepsilon}\right)$ such steps. What about the steps for which $\|w^{(k)}\|_1 \le (1 + \frac{\varepsilon}{3}) E(k)$? We obtain
$$(1+\varepsilon)\|x^\star\|_1 \le \|w^{(k)}\|_1 \le \left(1 + \tfrac{\varepsilon}{3}\right) E(k).$$
This in particular implies that
$$E(k) \ge \left(1 + \tfrac{\varepsilon}{2}\right) \|x^\star\|_1.$$
We apply Lemma 5.8 to conclude that in such a case
$$B(k+1) \ge B(k) + \frac{h\varepsilon}{3} \|x^\star\|_1.$$
Let us now analyze how $B(k)$ can change throughout the steps. We start with $B(0) \ge 0$ (since $w^{(0)}_i \ge 1$ for every $i \in [n]$), and $B(k)$ is upper-bounded by $\|x^\star\|_1 \cdot (\ln M + \ln \|x^\star\|_1)$ (this holds because $w^{(k)}_i \le \|w^{(k)}\|_1 \le \|w^{(0)}\|_1 \le M \|x^\star\|_1$). At every step when $\|w^{(k)}\|_1 > (1 + \frac{\varepsilon}{3}) E(k)$, the largest possible drop of $B(k)$ is (by Lemma 5.8) upper-bounded by
$$h \left(1 + \tfrac{\varepsilon}{10}\right) \|x^\star\|_1 \le 2h \|x^\star\|_1.$$
By the reasoning above there are at most $O\!\left(\frac{\ln M}{h\varepsilon}\right)$ such steps. On the other hand, if $\|w^{(k)}\|_1 \le (1 + \frac{\varepsilon}{3}) E(k)$ then $B(k)$ increases by at least $\frac{h\varepsilon}{3} \|x^\star\|_1$. This means that the total drop of $B(k)$ over the whole computation is at most $O\!\left(\frac{\ln M}{\varepsilon}\right) \|x^\star\|_1$. Hence the number of steps in which $\|w^{(k)}\|_1 \le (1 + \frac{\varepsilon}{3}) E(k)$ is at most
$$O\!\left( \frac{ \frac{\ln M}{\varepsilon} \|x^\star\|_1 + \|x^\star\|_1 (\ln M + \ln \|x^\star\|_1) }{ \frac{h\varepsilon}{3} \|x^\star\|_1 } \right) = O\!\left( \frac{\ln M + \ln \|x^\star\|_1}{h \varepsilon^2} \right). \quad \square$$

Appendix A. Example for Non-convergence of IRLS

We present an example instance for which IRLS fails to converge to the optimal solution. More precisely, we prove the following.

Theorem A.1. There exists an instance $(A, b)$ of the basis pursuit problem (1) and a feasible, strictly positive point $y \in \mathbb{R}^n_{>0}$ such that if IRLS is initialized at $y^{(0)} = y$ (and $\{y^{(k)}\}_{k \in \mathbb{N}}$ is the sequence produced by IRLS), then $\|y^{(k)}\|_1$ does not converge to the optimal value.

The proof is based on the simple observation that if IRLS reaches a point $y^{(k)}$ with $y^{(k)}_i = 0$ for some $k \in \mathbb{N}$ and $i \in [n]$, then $y^{(l)}_i = 0$ for all $l > k$.
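This absorbing-zero behavior is visible directly in the closed form of the IRLS step, $y' = Y A^\top (A Y A^\top)^{+} b$: a coordinate with $y_i = 0$ is multiplied by zero and can never become nonzero again, so the iteration gets stuck away from the optimum. Here is a minimal illustration on an assumed toy instance (not the graph instance constructed in this appendix):

```python
import numpy as np

# Assumed toy instance: min ||x||_1 s.t. x1 + x2 = 1, x2 + x3 = 1.
# The optimal solution is (0, 1, 0), with l1 norm 1.
A = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0]])
b = np.array([1.0, 1.0])

def irls_step(A, b, y):
    """One IRLS step: y' = Y A^T (A Y A^T)^+ b, the weighted l2-minimizer
    with weights |y|. If y_i = 0 then y'_i = 0, so zeros are absorbing."""
    L = A @ np.diag(np.abs(y)) @ A.T
    return np.abs(y) * (A.T @ np.linalg.lstsq(L, b, rcond=None)[0])

y = np.array([1.0, 0.0, 1.0])  # feasible, but zero in the optimal coordinate
for _ in range(5):
    y = irls_step(A, b, y)

# The iterate never moves: it stays at (1, 0, 1), with l1 norm 2 > optimum 1.
print(y, np.abs(y).sum())
```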
Let us consider an undirected graph $G = (V, E)$ with $V = \{u_0, u_1, \ldots, u_7\}$, and let $s = u_0$, $t = u_7$. $G$ is depicted in Figure 2.

[Figure 2. The graph $G$ together with a feasible solution $y \in \mathbb{R}^E$: the edges $u_0u_1$, $u_1u_2$, $u_2u_3$, $u_4u_5$, $u_5u_6$, $u_6u_7$ carry value $\frac34$, the edges $u_0u_4$ and $u_3u_7$ carry $\frac14$, and the edge $u_3u_4$ carries $\frac12$.]

We define $A \in \mathbb{R}^{V \times E}$ to be the signed incidence matrix of $G$, with edges directed according to increasing indices, and let $b \stackrel{\text{def}}{=} e_t - e_s = (-1, 0, 0, 0, 0, 0, 0, 1)^\top$. Then the problem
$$\min \|x\|_1 \quad \text{s.t.} \quad Ax = b$$
is equivalent to the shortest $s$-$t$ path problem in $G$. The unique optimal solution is the path $s - u_4 - u_3 - t$. In particular, the edge $(u_3, u_4)$ is in the support of the optimal vector.

Claim A.2. Let $y \in \mathbb{R}^E$ be the feasible point given in Figure 2, i.e., $y_{u_0u_1} = y_{u_1u_2} = y_{u_2u_3} = y_{u_4u_5} = y_{u_5u_6} = y_{u_6u_7} = \frac34$, $y_{u_0u_4} = y_{u_3u_7} = \frac14$ and $y_{u_3u_4} = \frac12$. IRLS initialized at $y$ produces in one step a point $y'$ with $y'_{u_3u_4} = 0$.

The above claim implies that IRLS initialized at $y$ (which has full support) does not converge to the optimal solution, which is nonzero in the coordinate corresponding to $u_3u_4$. Thus to prove Theorem A.1 it suffices to show Claim A.2.

Proof of Claim A.2: IRLS chooses the next point $y' \in \mathbb{R}^E$ according to the rule
$$y' = \operatorname{argmin}_{x \in \mathbb{R}^E} \sum_{e \in E} \frac{x_e^2}{y_e} \quad \text{s.t.} \quad Ax = b,$$
which is the same as the unit electrical $s$-$t$ flow in $G$ corresponding to edge resistances $\frac{1}{y_e}$ (this is due to the fact that electrical flows minimize energy). One can easily see that in this electrical flow the potentials of $u_4$ and $u_3$ are equal (the paths $s - u_4$ and $s - u_1 - u_2 - u_3$ have equal resistance, namely $4$), hence the flow through $(u_3, u_4)$ is zero. $\square$

References

[BBD+13] Luca Becchetti, Vincenzo Bonifaci, Michael Dirnberger, Andreas Karrenbauer, and Kurt Mehlhorn. Physarum can compute shortest paths: Convergence proofs and complexity bounds.
In Automata, Languages, and Programming - 40th International Colloquium, ICALP 2013, Riga, Latvia, July 8-12, 2013, Proceedings, Part II, pages 472–483, 2013.

[Bec15] Amir Beck. On the convergence of alternating minimization for convex programming with applications to iteratively reweighted least squares and decomposition schemes. SIAM Journal on Optimization, 25(1):185–209, 2015.

[BMV12] Vincenzo Bonifaci, Kurt Mehlhorn, and Girish Varma. Physarum can compute shortest paths. In Proceedings of the Twenty-Third Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2012, Kyoto, Japan, January 17-19, 2012, pages 233–240, 2012.

[CRT06] E. J. Candes, J. Romberg, and T. Tao. Robust uncertainty principles: exact signal reconstruction from highly incomplete frequency information. IEEE Transactions on Information Theory, 52(2):489–509, 2006.

[CT05] E. J. Candes and T. Tao. Decoding by linear programming. IEEE Transactions on Information Theory, 51(12):4203–4215, 2005.

[CY08] R. Chartrand and Wotao Yin. Iteratively reweighted algorithms for compressive sensing. In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2008), pages 3869–3872, 2008.

[DDFG10] Ingrid Daubechies, Ronald DeVore, Massimo Fornasier, and C. Sinan Güntürk. Iteratively reweighted least squares minimization for sparse recovery. Communications on Pure and Applied Mathematics, 63(1):1–38, 2010.

[DE03] David L. Donoho and Michael Elad. Optimally sparse representation in general (non-orthogonal) dictionaries via l1 minimization. Proc. Natl. Acad. Sci. USA, 100:2197–2202, 2003.

[DH01] D. L. Donoho and X. Huo. Uncertainty principles and ideal atomic decomposition. IEEE Transactions on Information Theory, 47(7):2845–2862, 2001.

[GR97] I. F. Gorodnitsky and B. D. Rao. Sparse signal reconstruction from limited data using FOCUSS: A re-weighted minimum norm algorithm. IEEE Transactions on Signal Processing
, 45(3):600–616, March 1997.

[Gre84] Peter J. Green. Iteratively reweighted least squares for maximum likelihood estimation, and some robust and resistant alternatives. Journal of the Royal Statistical Society, Series B (Methodological), pages 149–192, 1984.

[IJNT11] K. Ito, A. Johansson, T. Nakagaki, and A. Tero. Convergence properties for the Physarum solver. ArXiv e-prints, January 2011.

[JZ12] Anders Johannson and James Zou. A slime mold solver for linear programming problems. In How the World Computes, volume 7318 of Lecture Notes in Computer Science, pages 344–354. Springer Berlin Heidelberg, 2012.

[MO07] Tomoyuki Miyaji and Isamu Ohnishi. Mathematical analysis to an adaptive network of the Plasmodium system. Hokkaido Math. J., 36(2):445–465, 2007.

[NYT00] Toshiyuki Nakagaki, Hiroyasu Yamada, and Agota Toth. Maze-solving by an amoeboid organism. Nature, 407(6803):470, September 2000.

[Osb85] M. R. Osborne. Finite Algorithms in Optimization and Data Analysis. 1985.

[RKD99] B. D. Rao and K. Kreutz-Delgado. An affine scaling methodology for best basis selection. IEEE Transactions on Signal Processing, 47(1):187–200, January 1999.

[SV16a] Damian Straszak and Nisheeth K. Vishnoi. Natural algorithms for flow problems. In ACM-SIAM Symposium on Discrete Algorithms, 2016.

[SV16b] Damian Straszak and Nisheeth K. Vishnoi. On a natural dynamics for linear programming. In ACM Innovations in Theoretical Computer Science, 2016.

[TKN07] Atsushi Tero, Ryo Kobayashi, and Toshiyuki Nakagaki. A mathematical model for adaptive transport network in path finding by true slime mold. Journal of Theoretical Biology, 244(4):553, 2007.