Distributed Algorithms for Consensus and Coordination in the Presence of Packet-Dropping Communication Links - Part II: Coefficients of Ergodicity Analysis Approach
COORDINATED SCIENCES LABORATORY TECHNICAL REPORT UILU-ENG-11-2208 (CRHC-11-06)

Nitin H. Vaidya, Fellow, IEEE, Christoforos N. Hadjicostis, Senior Member, IEEE, and Alejandro D. Domínguez-García, Member, IEEE

September 28, 2011

Abstract: In this two-part paper, we consider multicomponent systems in which each component can iteratively exchange information with other components in its neighborhood in order to compute, in a distributed fashion, the average of the components' initial values or some other quantity of interest (i.e., some function of these initial values). In particular, we study an iterative algorithm for computing the average of the initial values of the nodes. In this algorithm, each component maintains two sets of variables that are updated via two identical linear iterations. The average of the initial values of the nodes can be asymptotically computed by each node as the ratio of two of the variables it maintains. In the first part of this paper, we show how the update rules for the two sets of variables can be enhanced so that the algorithm becomes tolerant to communication links that may drop packets, independently among them and independently between different transmission times. In this second part, by rewriting the collective dynamics of both iterations, we show that the resulting system is mathematically equivalent to a finite inhomogeneous Markov chain whose transition matrix takes one of finitely many values at each step. Then, by using a coefficients of ergodicity approach, a method commonly used for convergence analysis of Markov chains, we prove convergence of the robustified consensus scheme. The analysis suggests that similar convergence should hold under more general conditions as well.
Note to readers: Section I discusses the relation between Part II (this report) and the companion Part I of the report, and discusses some related work. The readers may skip Section I without a loss of continuity.

University of Illinois at Urbana-Champaign. Coordinated Sciences Laboratory technical report UILU-ENG-11-2208 (CRHC-11-06). N. H. Vaidya and A. D. Domínguez-García are with the Department of Electrical and Computer Engineering at the University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA. E-mail: {nhv, aledan}@ILLINOIS.EDU. C. N. Hadjicostis is with the Department of Electrical and Computer Engineering at the University of Cyprus, Nicosia, Cyprus, and also with the Department of Electrical and Computer Engineering at the University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA. E-mail: chadjic@UCY.AC.CY. The work of A. D. Domínguez-García was supported in part by NSF under Career Award ECCS-CAR-0954420. The work of C. N. Hadjicostis was supported in part by the European Commission (EC) 7th Framework Programme (FP7/2007-2013) under grant agreements INFSO-ICT-223844 and PIRG02-GA-2007-224877. The work of N. H. Vaidya was supported in part by Army Research Office grant W-911-NF-0710287 and NSF Award 1059540. Any opinions, findings, and conclusions or recommendations expressed here are those of the authors and do not necessarily reflect the views of the funding agencies or the U.S. government.

I. INTRODUCTION

The focus of this paper is to analyze the convergence of the robustified double-iteration algorithm (see Footnote 1) for average consensus introduced in Part I, utilizing a different framework that allows us to move away from the probabilistic model describing the availability of communication links of Part I.
More specifically, instead of focusing on the dynamics of the first and second moments of the two iterations to establish convergence as done in Part I, we consider a framework that builds upon the theory of finite inhomogeneous Markov chains. In this regard, by augmenting the communication graph, we will show that the collective dynamics of each of the two iterations can be rewritten in such a way that the resulting system is mathematically equivalent to a finite inhomogeneous Markov chain whose transition matrix takes values from a finite set of possible matrices. Once the problem is recast in this fashion, tools such as coefficients of ergodicity, commonly used in the analysis of inhomogeneous Markov chains (see, e.g., [1]), are used to prove the convergence of the algorithm. Recalling from Part I, when the communication network is perfectly reliable (i.e., in the absence of packet drops), the collective dynamics of the linear iterations can be described by a discrete-time transition system with no inputs in which the transition matrix is column stochastic and primitive. Then, each node runs two identical copies of a linear iteration, with each iteration initialized differently depending on the problem to be solved. This double-iteration algorithm is a particular instance of the algorithm in [2] (which is a generalization of the algorithm proposed in [3]), where the matrices describing each linear iteration are allowed to vary as time evolves, whereas in our setup (for the ideal case when there are no communication link failures) the transition matrix is fixed over time. In general, the algorithm described above is not robust against packet-dropping communication links. It might be possible to robustify it by introducing message delivery acknowledgment mechanisms and retransmission mechanisms, but this has certain overhead and drawbacks as discussed in Section II-C.
Also, in a pure broadcast system, which is the communication model we assume in this work, it is easy to see that the double-iteration algorithm above will not work properly. The mechanism we proposed in Part I to robustify the double-iteration algorithm was for each node i to keep track of three quantities of interest: i) its own internal state (as captured by the state variables maintained in the original double-iteration scheme of [2], [4]); ii) an auxiliary variable that accounts for the total mass broadcasted so far by node i to (all of) its neighbors; and iii) another auxiliary variable that accounts for the total received mass from each node j that sends information to node i. The details of the algorithm are provided in Section III, but the key in analyzing convergence of the algorithm is to show that the collective system dynamics can be rewritten by introducing additional nodes (virtual buffers) that account for the difference between these two auxiliary variables. The resulting enhanced system is equivalent to an inhomogeneous Markov chain whose transition matrix takes values from a finite set. As discussed in Part I, even if relying on the ratio of two linear iterations, our work is different from the work in [2] in terms of both the communication model and also the nature of the protocol itself.

Footnote 1: In this second part, we will also refer to this algorithm as the "ratio consensus" algorithm and will use both denominations interchangeably.
In this regard, a key premise in [2] is that stochasticity of the transition matrix must be maintained over time, which requires sending nodes to know the number of nodes that are listening, suggesting that i) either the communication links are perfectly reliable, or ii) there is some acknowledgment and retransmission mechanism that ensures messages are delivered to the listening nodes at every round of information exchange. In our work, we remove both assumptions, and assume a pure broadcast model without acknowledgements and retransmissions. It is very easy to see that in the presence of lossy communication links, the algorithm in [2] does not solve the average consensus problem, as stochasticity of the transition matrix is not preserved over time. Thus, as mentioned above, the key in the approach we follow to analyze convergence is to augment the communication graph by introducing additional nodes, and to establish the correctness of the algorithm by showing that the collective dynamics of the resulting system is equivalent to a finite inhomogeneous Markov chain with a transition matrix that takes values from a finite set. Once the system is rewritten in this fashion, the robust algorithm for ratio consensus reduces to a similar setting to the one in [2], except for the fact that some of the resulting transition matrices might not have positive diagonals, which is required for the proof in [2]. Thus, in this regard, our approach may also be viewed as a generalization of the main result in [2]. The idea of augmenting the communication graph has been used in consensus problems to study the impact of bounded (fixed and random) communication delays [5], [6], [7].
In our work, the augmented communication graph that results from rewriting the collective system dynamics has some similarities to the augmented communication graph in [7], where the link from node i to node j is replaced by several paths from node i to node j, in order to mimic the effect of communication delays. In particular, in [7], for a maximum delay of B steps, B paths are added in parallel with the single-edge path that captures the non-delayed message transmission. The added path corresponding to delay b (1 ≤ b ≤ B) has b nodes, for a total of B(B+1)/2 additional nodes capturing the effect of message transmission delays from node i to node j. At every time step, a message from node i to node j is randomly routed through one of these paths; the authors assume for simplicity that each of the paths is activated with probability 1/B. For large communication graphs, one of the drawbacks of this model is the explosion in the number of nodes to be added to the communication graph to model the effect of delays. In our work, for analysis purposes, we also use the idea of augmenting the communication graph, but in our case, a single parallel path is sufficient to capture the effect of packet-dropping communication links. As briefly discussed later, it is easy to see that our modeling formalism can also be used to capture random delays, with the advantage over the formalism in [7] that in our model, it is only necessary to add a single parallel path with B nodes (instead of the B(B+1)/2 nodes added above) per link in the original communication path, which reduces the number of states added. Additionally, our modeling framework can handle any delay distribution, as long as the equivalent augmented network satisfies properties (M1)-(M5) discussed in Section IV-A.
In order to make Part II self-contained, we review several ideas already introduced in Part I, including the double-iteration algorithm formulation over perfectly reliable networks and its robustified version. In Part II, we will embrace the common convention utilized in Markov chains of pre-multiplying the transition matrix of the Markov chain by the corresponding probability vector. The remainder of this paper is organized as follows. Section II introduces the communication model, briefly describes the non-robust version of the double-iteration algorithm, and discusses some issues that arise when implementing the double-iteration algorithm in networks with unreliable links. Section III describes the strategy to robustify the double-iteration algorithm against communication link failures. Section IV reformulates each of the two iterations in the robust algorithm as an inhomogeneous Markov chain. We employ coefficients of ergodicity analysis to characterize the algorithm behavior in Section V. Convergence of the robustified double-iteration algorithm is established in Section VI. Concluding remarks and discussions on future work are presented in Section VII.

II. PRELIMINARIES

This section describes the communication model we adopt throughout the work, introduces notation, reviews the double-iteration algorithm that can be used to solve consensus problems when the communication network is perfectly reliable, and discusses issues that arise when implementing the double-iteration algorithm in networks with packet-dropping links.

A. Network Communication Model

The system under consideration consists of a network of m nodes, V = {1, 2, ..., m}, each of which has some initial value v_i, i = 1, 2, ..., m (e.g., a temperature reading).
The nodes need to reach consensus on the average of these initial values in an iterative fashion. In other words, the goal is for each node to obtain the value (1/m) sum_{j=1}^m v_j in a distributed fashion. We assume a synchronous system (see Footnote 2) in which time is divided into time steps of fixed duration. The nodes in the network are connected by a certain directed network. More specifically, a directed link (j, i) is said to "exist" if transmissions from node j can be received by node i infinitely often over an infinite interval. Let E denote the set of all directed links that exist in the network. For notational convenience, we take that (i, i) ∈ E, ∀i, so that a self-loop exists at each node. Then, graph G = (V, E) represents the network connectivity. Let us define I_i = {j | (j, i) ∈ E} and O_i = {j | (i, j) ∈ E}. Thus, I_i consists of all nodes from whom node i has incoming links, and O_i consists of all nodes to whom node i has outgoing links. For a set S, we will denote the cardinality of S by |S|. The outdegree of node i, denoted D_i, is the size of set O_i; thus, D_i = |O_i|. Due to the assumption that all nodes have self-loops, i ∈ I_i and i ∈ O_i, ∀i ∈ V. We assume that graph G = (V, E) is strongly connected. Thus, in G = (V, E), there exists a directed path from any node i to any node j, ∀i, j ∈ V (although it is possible that the links on such a path between a pair of nodes may not all be simultaneously reliable in a given time slot).

Footnote 2: We later discuss how the techniques we develop for reaching consensus using the double-iteration algorithm in the presence of packet-dropping links naturally lead to an asynchronous computation setup.
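The neighborhood bookkeeping just defined (in-neighborhoods I_i, out-neighborhoods O_i, out-degrees D_i, self-loops at every node, and strong connectivity) can be sketched in Python; the helper names and the 3-node ring in the example are illustrative choices, not from the report.

```python
def neighbor_sets(m, edges):
    """Build in-neighborhoods I_i, out-neighborhoods O_i, and out-degrees D_i
    from a directed edge list; a self-loop (i, i) is added at every node."""
    E = set(edges) | {(i, i) for i in range(m)}
    I = {i: {j for (j, k) in E if k == i} for i in range(m)}
    O = {i: {k for (j, k) in E if j == i} for i in range(m)}
    D = {i: len(O[i]) for i in range(m)}
    return I, O, D

def strongly_connected(m, edges):
    """Check that a directed path exists from every node to every other node."""
    E = set(edges) | {(i, i) for i in range(m)}
    def reachable(s):
        seen, stack = {s}, [s]
        while stack:
            u = stack.pop()
            for (a, b) in E:
                if a == u and b not in seen:
                    seen.add(b)
                    stack.append(b)
        return seen
    return all(len(reachable(s)) == m for s in range(m))

# Directed ring 0 -> 1 -> 2 -> 0; with self-loops, node 1 has I_1 = {0, 1}.
I, O, D = neighbor_sets(3, [(0, 1), (1, 2), (2, 0)])
```

Here every node has out-degree 2 (the ring edge plus its self-loop), so D_i = 2 for all i, and the ring is strongly connected.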
The iterative consensus algorithms considered in this paper assume that, at each step of the iteration, each node transmits some information to all the nodes to whom it has a reliable directed link during that iteration (or "time step"). The iterative consensus algorithm summarized in Section II-B assumes the special case wherein all the links are always reliable (that is, all links are reliable in every time step). In Section III, and beyond, we consider a network with potentially unreliable links. Our work on iterative consensus over unreliable links is motivated by the presence of such links in wireless networks. Suppose that the nodes in our network communicate over wireless links, with the node locations being fixed. In such a wireless network, each node should generally be able to communicate with the other nodes in its vicinity. However, such transmissions may not always be reliable, due to channel fading and interference from other sources. To make our subsequent discussion precise, we will assume that a link (i, j) exists (i.e., (i, j) ∈ E) only if each transmission from i is successfully received by node j with probability q_ij (0 < q_ij ≤ 1). We assume that successes of transmissions on different links are independent of each other; also, successes of different transmissions on any given link are independent of each other. As we will see, these independence assumptions can be partially relaxed, but we adopt them at this point for simplicity. We assume that all transmissions from any node i are broadcasts (see Footnote 3), in the sense that every node j such that (i, j) ∈ E may receive i's transmission with probability q_ij, independently between nodes and transmission steps.
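A single broadcast step under this link model amounts to one independent Bernoulli trial per outgoing link; the sketch below (the function name and the q map are illustrative assumptions) returns the set of out-neighbors that actually hear the transmission.

```python
import random

def lossy_broadcast(i, out_neighbors, q, rng):
    """Node i broadcasts once in the current time step; each out-neighbor j
    independently receives the message with probability q[(i, j)]."""
    return {j for j in out_neighbors if rng.random() < q[(i, j)]}

rng = random.Random(0)
# With q_ij = 1 the broadcast behaves like a reliable link: everyone hears it.
receivers = lossy_broadcast(0, {1, 2}, {(0, 1): 1.0, (0, 2): 1.0}, rng)
```

Note that the sender learns nothing from this call about who actually received the message, matching the no-acknowledgement assumption.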
As seen later, this broadcast property can potentially be exploited to make communication more efficient, particularly when a given node i wants to send identical information to all the nodes in O_i. When node i broadcasts a message to its neighbors, the reliabilities of receptions at different nodes in O_i are mutually independent. Each node i is assumed to be aware of the value of D_i (i.e., the number of nodes in O_i), and the identity of each node in set I_i. This information can be learned using neighbor discovery mechanisms used in wireless ad hoc or mesh networks. Note that node i does not necessarily know whether transmissions to nodes in O_i are successful.

Footnote 3: As elaborated later, the results in this paper can also be applied in networks wherein the transmissions are unicast (not broadcast).

B. Ratio Consensus Algorithm in Perfectly Reliable Communication Networks

In this section, we summarize a consensus algorithm for a special case of the above system, wherein all the links in the network are always reliable (that is, reliable in every time step). The "ratio consensus" algorithm presented here performs two iterative computations in parallel, with the solution of the consensus algorithm being asymptotically obtained as the ratio of the outcomes of the two parallel iterations. We will refer to this approach as ratio consensus. In prior literature, similar approaches have also been called weighted consensus [2], [3]. Each node i maintains at iteration k state variables y_k[i] and z_k[i]. At each time step k, each node i updates its state variables as follows:

  y_k[i] = sum_{j ∈ I_i} y_{k-1}[j] / D_j,  k ≥ 1,   (1)
  z_k[i] = sum_{j ∈ I_i} z_{k-1}[j] / D_j,  k ≥ 1,   (2)

where y_0[j] = v_j, ∀j = 1, ..., m, and z_0[j] = 1, ∀j = 1, ..., m.
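For concreteness, iterations (1) and (2) can be simulated directly; the 3-node directed ring with self-loops and the initial values below are illustrative, not taken from the report.

```python
# Sketch of iterations (1)-(2) on a perfectly reliable network.
m = 3
v = [2.0, 4.0, 9.0]                       # initial values v_i; average is 5.0
edges = {(0, 1), (1, 2), (2, 0), (0, 0), (1, 1), (2, 2)}   # ring + self-loops
I = {i: {j for (j, k) in edges if k == i} for i in range(m)}       # in-neighbors
D = {i: len({k for (j, k) in edges if j == i}) for i in range(m)}  # out-degrees

y = v[:]                                  # y_0[j] = v_j
z = [1.0] * m                             # z_0[j] = 1
for _ in range(200):                      # apply (1) and (2) in lockstep
    y = [sum(y[j] / D[j] for j in I[i]) for i in range(m)]
    z = [sum(z[j] / D[j] for j in I[i]) for i in range(m)]

ratios = [y[i] / z[i] for i in range(m)]  # each entry tends to (2+4+9)/3 = 5.0
```

Neither y nor z individually converges to the average (each settles according to the stationary behavior of the underlying matrix), but their ratio does so at every node.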
To facilitate implementation of the above iterations, at time step k, each node i broadcasts a message containing the values y_{k-1}[i]/D_i and z_{k-1}[i]/D_i to each node in O_i, and awaits reception of a similar message from each node in I_i. When node i has received, from each node j ∈ I_i, a value (namely, y_{k-1}[j]/D_j and z_{k-1}[j]/D_j) at step k, node i performs the above update of its state variables (by simply summing the corresponding values). Hereafter, we will use the phrase "message v" to mean "message containing value v". The above two iterations are represented in matrix notation in (3) and (4), where y_k and z_k are row vectors of size m, and M is an m × m primitive matrix (see Footnote 4) such that M[i, j] = 1/D_i if j ∈ O_i and 0 otherwise. Compactly, we show

  y_k = y_{k-1} M,  k ≥ 1,   (3)
  z_k = z_{k-1} M,  k ≥ 1.   (4)

It is assumed that z_0[j] = 1 and y_0[j] = v_j are the initial values at each node j ∈ V. Each node i calculates, at each time step k, the ratio v_k[i] = y_k[i] / z_k[i]. For the transition matrix M, (a) M[i, j] ≥ 0, and (b) for all i, sum_j M[i, j] = 1. Any matrix that satisfies these two conditions is said to be a row stochastic matrix. It has been shown in [4] that v_k[i] asymptotically converges to the average of the elements of y_0, provided that M is primitive and row stochastic. That is, if M is a primitive row stochastic matrix, then

  lim_{k → ∞} v_k[i] = (sum_j y_0[j]) / m,  ∀i ∈ V,   (5)

where m is the number of elements in vector y_0.

C. Implementation Aspects of Ratio Consensus Algorithm in the Presence of Unreliable Links

Let us consider how we might implement iterations (3) and (4) in a wireless network. Since the treatment for the y_k and z_k iterations is similar, let us focus on the y_k iteration for now.
Implementing (3) requires that, at iteration k (to compute y_k), node i should transmit message y_{k-1}[i] M[i, j] to each node j ∈ O_i. Conveniently, for all j ∈ O_i, the values M[i, j] are identical, and equal to 1/D_i. Thus, node i needs to send message y_{k-1}[i]/D_i to each node in O_i. Let us define µ_k[i] ≡ y_{k-1}[i]/D_i, k ≥ 1. In a wireless network, the two approaches described next may be used by node i to transmit message µ_k[i] to all the nodes in O_i.

Footnote 4: A finite square matrix A is said to be primitive if for some positive integer p, A^p > 0, that is, A^p[i, j] > 0, ∀i, j.

Approach 1: In this approach, each node i ensures that its message µ_k[i] is delivered reliably to all the nodes in O_i. One way to achieve this goal is as follows. Node i can broadcast the message µ_k[i] on the wireless channel, and then wait for acknowledgements (acks) from all the nodes in O_i. If such acks are not received from all nodes in O_i within some timeout interval, then i can retransmit the message. This procedure will be repeated until acks are received from all the intended recipients of µ_k[i]. This procedure ensures that the message is received by each node in O_i reliably in each step k of the iteration. However, as an undesirable side-effect, the time required to guarantee the reliable delivery to all the neighboring nodes is not fixed. In fact, this time can be arbitrarily large with a non-zero probability, if each transmission on a link (i, j) ∈ E is reliable with probability q_ij < 1. Different nodes may require different amounts of time to reliably deliver their message to their intended recipients.
Thus, if a fixed finite interval of time is allocated for each step k, then it becomes difficult to guarantee that the iterations will always be performed correctly (because some messages may not be delivered within the fixed time interval).

Approach 2: Alternatively, each node i may just broadcast its message µ_k[i] once in time step k, and hope that all the nodes in O_i receive it reliably. This approach has the advantage that each step of the iteration can be performed in a short (and predictable) time interval. However, it also has the undesirable property that all the nodes in O_i may not receive the message (due to link unreliability), and such nodes will not be able to update their state correctly. It is important to note that, since there are no acknowledgements being sent, a node i cannot immediately know whether a node j ∈ O_i has received i's message or not.

Considering the shortcomings of the above two approaches, it appears that an alternative solution is required. Our solution to the problem (to be introduced in Section III) is to maintain additional state at each node, and utilize this state to mitigate the detrimental impact of link unreliability. To put it differently, the additional state can be used to design an iterative consensus algorithm robust to link unreliability. In particular, the amount of state maintained by each node i is proportional to |I_i|. In a large-scale wireless network (i.e., with large m) with nodes spread over a large space, we would expect that for any node i, |I_i| << m. In such cases, the small increase in the amount of state is a justifiable cost to achieve robustness in the presence of link unreliability. Although M[i, j] is identical (and equal to 1/D_i) for all j ∈ O_i in our example above, this is not necessary.
So long as M is a primitive row stochastic matrix, the above iteration will converge to the correct consensus value (provided that the transmissions are always reliable). Thus, it is possible that in a given iteration, node i may want to send different messages to different nodes in O_i. This goal can be achieved by performing a unicast operation to each node in O_i. In this situation as well, two approaches analogous to Approaches 1 and 2 may be used. The first approach would be to reliably deliver the unicast messages, using as many retransmissions as necessary. The second approach may be to transmit each message just once. In both cases, it is possible that the iterations may not be performed correctly. To simplify the discussion in this paper, we assume that each node i needs to transmit an identical message to the nodes in O_i. However, it is easy to extend the proposed scheme so that it is applicable to the more general scenario as well.

III. ROBUSTIFICATION OF RATIO CONSENSUS ALGORITHM

In this section, we present the proposed ratio consensus algorithm that is robust in the presence of link unreliability. The correctness of the proposed algorithm is established in Section VI. As before, each node maintains state variables y_k[i] and z_k[i]. Additional state maintained at each node will be defined soon. Iterative computation is performed to maintain y_k and z_k. For brevity, we will focus on presenting the iterations for y_k; the iterations for z_k are analogous, with the difference being in the initial state. The initial values of y and z are assumed (see Footnote 5) to satisfy the following conditions: 1) y_0[i] ≥ 0, ∀i; 2) z_0[i] ≥ 0, ∀i; 3) sum_i z_0[i] > 0.
Our goal for the robust iterative consensus algorithm is to allow each node i to compute (asymptotically) the ratio (sum_i y_0[i]) / (sum_i z_0[i]). With a suitable choice of y_0[i] and z_0[i], different functions may be calculated [4]. In particular, if the initial input of node i is denoted as v_i, then by setting y_0[i] = w_i v_i and z_0[i] = w_i, where w_i ≥ 0, ∀i, the nodes can compute the weighted average (sum_i w_i v_i) / (sum_i w_i); with w_i = 1, ∀i ∈ V, the nodes calculate average consensus.

A. Intuition Behind the Robust Algorithm

To aid our presentation, let us introduce the notion of "mass." The initial value y_0[i] at node i is to be viewed as its initial mass. If node i sends a message v to another node j, that can be viewed as a "transfer" of an amount of mass equal to v to node j. With this viewpoint, it helps to think of each step k as being performed over a non-zero interval of time. Then, y_k[i] should be viewed as the mass at node i at the end of time step k (which is the same as the start of step k+1). Thus, during step k, each node i transfers (perhaps unsuccessfully, due to unreliable links) some mass to nodes in O_i, the amount being a function of y_{k-1}[i]. The mass y_k[i] is the accumulation of the mass that i receives in messages from nodes in I_i during step k.

Footnote 5: The assumption that y_0[i] ≥ 0, ∀i, can be relaxed, allowing for arbitrary values of y_0[i].

Now, sum_i y_0[i] is the total mass in the system initially. If we implement iteration (3) in the absence of packet drops, then for all iterations k,

  sum_i y_k[i] = sum_i y_0[i].

That is, the total mass in the system remains constant. This invariant is maintained because M is a row stochastic matrix.
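This invariant is easy to check numerically: with a row-stochastic M and the row-vector convention y_k = y_{k-1} M used in this report, the total mass is preserved at every step. The 3x3 matrix and initial mass below are illustrative choices.

```python
# Mass conservation under a row-stochastic update y_k = y_{k-1} M.
M = [[0.5, 0.5, 0.0],
     [0.0, 0.5, 0.5],
     [0.5, 0.0, 0.5]]                    # each row sums to 1 (row stochastic)
assert all(abs(sum(row) - 1.0) < 1e-12 for row in M)

y = [2.0, 4.0, 9.0]                      # initial mass y_0; total is 15.0
total = sum(y)
for _ in range(50):
    # Pre-multiply the row vector: y_k[i] = sum_j y_{k-1}[j] * M[j][i].
    y = [sum(y[j] * M[j][i] for j in range(3)) for i in range(3)]
    assert abs(sum(y) - total) < 1e-9    # total mass never changes
```

If a link drop silently removed a term from this sum, the invariant would fail, which is exactly the failure mode the robust algorithm is designed to prevent.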
However, if a message v sent by node i is not received by some node j ∈ O_i, then the mass in that message is "lost," resulting in a reduction of the total mass in the system. Our robust algorithm is motivated by the desire to avoid the loss of mass in the system, even in the presence of unreliable links. The proposed algorithm uses Approach 2 for transmission of messages. In particular, in our algorithm (as in the original ratio consensus), at each step k, each node i wants to transfer µ_k[i] = y_{k-1}[i]/D_i amount of mass to each node in O_i. For this purpose, node i broadcasts message µ_k[i] (see Footnote 6). To make the algorithm robust, let us assume that, for each link (i, j) ∈ E, a "virtual buffer" is available to store the mass that is "undelivered" on the link. For each node j ∈ O_i, there are two possibilities:

(P1) Link (i, j) is not reliable in slot k: In this case, message µ_k[i] is not received by node j. Node i believes that it has transferred the mass to j (and thus, i does not include that mass in its own state y_k[i]), and at the same time, that mass is not received at node j, and therefore, not included in y_k[j]. Therefore, let us view this missing mass as being "buffered on" link (i, j) in a virtual buffer. The virtual buffer for each directed link (i, j) will be viewed as a virtual node in the network. Thus, when link (i, j) is unreliable, the mass is transferred from node i to "node" (i, j), instead of being transferred to node j. Note that when link (i, j) is unreliable, node j neither receives mass directly from node i, nor from the virtual buffer (i, j).

(P2) Link (i, j) is reliable in slot k: In this case, message µ_k[i] is received by node j. Thus, µ_k[i] contributes to y_k[j].
In addition, all the mass buffered in the virtual buffer (i, j) will also be received by node j, and this mass will also contribute to y_k[j]. We will say that buffer (i, j) "releases" its mass to node j. We capture the above intuition by building an "augmented" network that contains all the nodes in V, and also contains additional virtual nodes, each virtual node corresponding to the virtual buffer for a link in E. Let us denote the augmented network by G^a = (V^a, E^a), where V^a = V ∪ E and E^a = E ∪ {((i, j), j) | (i, j) ∈ E} ∪ {(i, (i, j)) | (i, j) ∈ E}.

Footnote 6: In the more general case, node i may want to transfer different amounts of mass to different nodes in O_i. In this case, node i may send (unreliable) unicast messages to these neighbors. The treatment in this case will be quite similar to the restricted case assumed in our discussion, except that node i will need to separately track mass transfers to each of its out-neighbors.

In case (P2) above, the mass sent by node i, and the mass released from the virtual buffer (i, j), both contribute to the new state y_k[j] at node j. In particular, it will suffice for node j to only know the sum of the mass being sent by node i at step k and the mass being released (if any) from buffer (i, j) at step k. In reality, of course, there is no virtual buffer to hold the mass that has not been delivered yet. However, an equivalent mechanism can be implemented by introducing additional state at each node in V, which exploits the above observation. This is what we explain in the next section.

B. Robust Ratio Consensus Algorithm

We will mitigate the shortcomings of Approach 2 described in Section II-C by changing our iterations to be tolerant to missing messages.
The modified scheme has the following features:

• Instead of transmitting the message $\mu_k[i] = y_{k-1}[i]/D_i$ at step $k$, each node $i$ broadcasts at step $k$ a message with value $\sum_{j=1}^{k} \mu_j[i]$, denoted $\sigma_k[i]$. Thus, $\sigma_k[i]$ is the total mass that node $i$ wants to transfer to each node in $\mathcal{O}_i$ through the first $k$ steps.

• Each node $i$ maintains, in addition to the state variables $y_k[i]$ and $z_k[i]$, a state variable $\rho_k[j,i]$ for each node $j \in \mathcal{I}_i$; $\rho_k[j,i]$ is the total mass that node $i$ has received, either directly from node $j$ or via the virtual buffer $(j,i)$, through step $k$.

The computation performed at node $i$ at step $k \ge 1$ is as follows, with $\sigma_0[i] = 0$ for all $i \in \mathcal{V}$ and $\rho_0[j,i] = 0$ for all $(j,i) \in \mathcal{E}$:

$$\sigma_k[i] = \sigma_{k-1}[i] + y_{k-1}[i]/D_i, \tag{6}$$

$$\rho_k[j,i] = \begin{cases} \sigma_k[j], & \text{if } (j,i) \in \mathcal{E} \text{ and message } \sigma_k[j] \text{ is received by } i \text{ from } j \text{ at step } k, \\ \rho_{k-1}[j,i], & \text{if } (j,i) \in \mathcal{E} \text{ and no message is received by } i \text{ from } j \text{ at step } k, \end{cases} \tag{7}$$

$$y_k[i] = \sum_{j \in \mathcal{I}_i} \left( \rho_k[j,i] - \rho_{k-1}[j,i] \right). \tag{8}$$

When link $(j,i) \in \mathcal{E}$ is reliable, $\rho_k[j,i]$ becomes equal to $\sigma_k[j]$: this is reasonable, because $i$ receives any new mass sent by $j$ at step $k$, as well as any mass released by buffer $(j,i)$ at step $k$. On the other hand, when link $(j,i)$ is unreliable, $\rho_k[j,i]$ remains unchanged from the previous iteration, since no mass is received from $j$ (either directly or via the virtual buffer $(j,i)$). It follows that the total new mass received by node $i$ at step $k$, either from node $j$ directly or via buffer $(j,i)$, is given by $\rho_k[j,i] - \rho_{k-1}[j,i]$, which explains (8).^7

IV. ROBUST ALGORITHM FORMULATION AS AN INHOMOGENEOUS MARKOV CHAIN

In this section, we reformulate each iteration performed by the robust algorithm as an inhomogeneous Markov chain whose transition matrix takes values from a finite set of matrices.
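Before turning to the matrix reformulation, the per-node bookkeeping in (6)–(8) can be sketched in code. The following is a minimal sketch, not the paper's implementation: the network, reliability pattern, and helper names are illustrative, and self-loop links are assumed always reliable. The assertion checks the motivating invariant of Section III-A, namely that the total mass (held at nodes plus parked in the implicit buffers $\sigma_k[j] - \rho_k[j,i]$) is conserved despite drops.

```python
import random

def robust_step(y, sigma, rho, out_deg, in_nbrs, reliable):
    """One step of updates (6)-(8) at every node.

    y[i]        : current mass at node i
    sigma[i]    : cumulative mass node i has tried to send per out-neighbor
    rho[(j, i)] : cumulative mass i has received from j (directly or via buffer)
    reliable    : set of links (j, i) that deliver their message this step
    """
    # (6): every node accumulates the mass it wants to send this step.
    new_sigma = {i: sigma[i] + y[i] / out_deg[i] for i in y}
    new_rho, new_y = {}, {}
    for i in y:
        total = 0.0
        for j in in_nbrs[i]:
            # (7): on a reliable link, rho jumps to sigma_k[j];
            # otherwise it keeps its previous value.
            r = new_sigma[j] if (j, i) in reliable else rho[(j, i)]
            new_rho[(j, i)] = r
            total += r - rho[(j, i)]   # newly received mass from j
        new_y[i] = total               # (8)
    return new_y, new_sigma, new_rho

# Hypothetical 3-node directed ring with self-loops (D_i = 2 for every node).
in_nbrs = {0: [0, 2], 1: [1, 0], 2: [2, 1]}
out_deg = {0: 2, 1: 2, 2: 2}
links = [(j, i) for i in in_nbrs for j in in_nbrs[i]]
y = {0: 1.0, 1: 2.0, 2: 3.0}
sigma = {i: 0.0 for i in y}
rho = {l: 0.0 for l in links}
random.seed(1)
for k in range(20):
    # Self-loops always deliver; other links drop packets at random.
    reliable = {(j, i) for (j, i) in links if j == i or random.random() < 0.7}
    y, sigma, rho = robust_step(y, sigma, rho, out_deg, in_nbrs, reliable)
# Total mass at nodes plus mass "buffered" on links is conserved.
total = sum(y.values()) + sum(sigma[j] - rho[(j, i)] for (j, i) in links)
assert abs(total - 6.0) < 1e-9
```

Note that each node only ever stores $\sigma$ and $\rho$ values; the buffered quantity $\sigma_k[j] - \rho_k[j,i]$ is recovered here purely for the conservation check.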
We will also discuss some properties of these matrices and analyze the behavior of their products, which helps in establishing the convergence of the robustified ratio consensus algorithm.

^7 As per the algorithm specified above, the values of $\sigma$ and $\rho$ increase monotonically with time. This can be a concern for a large number of steps in practical implementations. However, this concern can be mitigated by "resetting" these values, e.g., via the exchange of additional information between neighbors (for instance, by piggybacking cumulative acknowledgements, which will be delivered whenever the links operate reliably).

A. Matrix Representation of Each Individual Iteration

The matrix representation is obtained by observing an equivalence between the iteration in (6)–(8) and an iterative algorithm (to be introduced shortly) defined on the augmented network described in Section III-A. The state vector of the augmented network consists of $n = m + |\mathcal{E}|$ elements, corresponding to the mass held by each of the $m$ nodes and the mass held by each of the $|\mathcal{E}|$ virtual buffers; these $n$ entities are represented by as many nodes in the augmented network. With a slight abuse of notation, let us denote by $y_k$ the state of the nodes in the augmented network $\mathcal{G}^a$. The vector $y_k$ for $\mathcal{G}^a$ is an augmented version of $y_k$ for $\mathcal{G}$: in addition to $y_k[i]$ for each $i \in \mathcal{V}$, the augmented $y_k$ vector also includes an element $y_k[(i,j)]$ for each $(i,j) \in \mathcal{E}$, with $y_0[(i,j)] = 0$.^8 Due to the manner in which the $y_k[i]$'s are updated, the values $y_k[i]$, $i \in \mathcal{V}$, are identical in the original network and the augmented network; therefore, we do not distinguish between them.
We next translate the iterative algorithm in (6)–(8) into the matrix form

$$y_k = y_{k-1} M_k, \tag{9}$$

for appropriate row-stochastic matrices $M_k$ (to be defined shortly) that might vary as the algorithm progresses (but nevertheless take values from a finite set of possible matrices). Let us define an indicator variable $X_k[j,i]$ for each link $(j,i) \in \mathcal{E}$ at each time step $k$ as follows:

$$X_k[j,i] = \begin{cases} 1, & \text{if link } (j,i) \text{ is reliable at time step } k, \\ 0, & \text{otherwise.} \end{cases} \tag{10}$$

We will now reformulate the iteration (6)–(8) and show that it can indeed be described in the matrix form (9), where the transition matrix $M_k$ is a function of the indicator variables defined in (10). First, using the indicator variables at time step $k$, it follows from (7) that

$$\rho_k[j,i] = X_k[j,i]\,\sigma_k[j] + (1 - X_k[j,i])\,\rho_{k-1}[j,i]. \tag{11}$$

Now, for $k \ge 0$, define $\nu_k[j,i] = \sigma_k[j] - \rho_k[j,i]$ (thus $\nu_0[j,i] = 0$). Then, it follows from (6) and (11) that

$$\nu_k[j,i] = (1 - X_k[j,i]) \left( \frac{y_{k-1}[j]}{D_j} + \nu_{k-1}[j,i] \right), \quad k \ge 1. \tag{12}$$

Also, from (6) and (11), it follows that (8) can be rewritten as

$$y_k[i] = \sum_{j \in \mathcal{I}_i} X_k[j,i] \left( \frac{y_{k-1}[j]}{D_j} + \nu_{k-1}[j,i] \right), \quad k \ge 1. \tag{13}$$

At every instant $k$ at which link $(j,i)$ is not reliable, the variable $\nu_k[j,i]$ increases by an amount equal to the amount that node $j$ wished to send to node $i$ but that was never received due to the link failure. Similarly, at every instant $k$ at which link $(j,i)$ is reliable, the variable $\nu_k[j,i]$ becomes zero, and its value at $k-1$ is received by node $i$, as can be seen in (13).

^8 Similarly, $z_0[(i,j)] = 0$.
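The identity $\nu_k[j,i] = \sigma_k[j] - \rho_k[j,i]$ and the recursion (12) can be checked mechanically on a single link. The sketch below is illustrative only: the $y_{k-1}[j]$ values are arbitrary stand-ins, since we are exercising the bookkeeping of (6), (7), and (12), not the full network dynamics.

```python
import random

# One link (j, i): node j has out-degree D_j; track sigma via (6) and
# rho via (7)/(11), and verify that nu as defined by recursion (12)
# always equals sigma - rho.
random.seed(0)
D_j = 3
sigma, rho, nu = 0.0, 0.0, 0.0
for k in range(1, 50):
    y_prev = random.random()            # stand-in for y_{k-1}[j]
    X = random.randint(0, 1)            # indicator (10): link up or down
    sigma = sigma + y_prev / D_j        # (6)
    rho = sigma if X == 1 else rho      # (7), equivalently (11)
    nu = (1 - X) * (y_prev / D_j + nu)  # (12): resets to 0 on a reliable slot
    assert abs(nu - (sigma - rho)) < 1e-12
```

In particular, on a reliable slot ($X = 1$) both sides collapse to zero, matching the observation that the buffer "releases" its mass.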
Thus, from (12) and (13), we can think of the variable $\nu_k[j,i]$ as the state of a virtual node that buffers the mass that node $i$ does not receive from node $j$ every time the link $(j,i)$ fails. It is important to note that the $\nu_k[j,i]$'s are virtual variables (no node in $\mathcal{V}$ computes $\nu_k$) that simply result from combining, as explained above, variables that the nodes in $\mathcal{V}$ do compute. The reason for doing this is that the resulting model is equivalent to an inhomogeneous Markov chain. This can be easily seen by stacking up (13) for all nodes indexed in $\mathcal{V}$ (i.e., the computing nodes) and (12) for all virtual buffers $(j,i)$, with $(j,i) \in \mathcal{E}$, and rewriting the resulting expressions in matrix form, from which the expression in (9) results.

B. Structure and Properties of the Matrices $M_k$

Next, we discuss the sparsity structure of the $M_k$'s and obtain their entries by inspection of (12) and (13). Additionally, we will explore some properties of the $M_k$'s that will be helpful in the analysis conducted in Section V for characterizing the behavior of each of the individual iterations.

1) Structure of $M_k$: Let us first define the entries in row $i$ of matrix $M_k$, corresponding to $i \in \mathcal{V}$. For $(i,j) \in \mathcal{E}$, there are two possibilities: $X_k[i,j] = 0$ or $X_k[i,j] = 1$. If $X_k[i,j] = 0$, then the mass $\mu_k[i] = y_{k-1}[i]/D_i$ that node $i$ wants to send to node $j$ is added to the virtual buffer $(i,j)$. Otherwise, no new mass from node $i$ is added to buffer $(i,j)$. Therefore,

$$M_k[i, (i,j)] = (1 - X_k[i,j])/D_i. \tag{14}$$

The above value is zero if link $(i,j)$ is reliable at step $k$, and $1/D_i$ otherwise. Similarly, it follows that

$$M_k[i,j] = X_k[i,j]/D_i, \tag{15}$$

which is zero whenever link $(i,j)$ is unreliable at step $k$, and $1/D_i$ otherwise.
Observe that for each $j \in \mathcal{O}_i$,

$$M_k[i,j] + M_k[i,(i,j)] = 1/D_i, \tag{16}$$

with, in fact, one of the two quantities zero and the other equal to $1/D_i$. For $(i,j) \notin \mathcal{E}$, it naturally follows that $M_k[i,j] = 0$. Similarly,

$$M_k[i,(s,r)] = 0, \quad \text{whenever } i \ne s \text{ and } (s,r) \in \mathcal{E}. \tag{17}$$

Since $|\mathcal{O}_i| = D_i$, all the elements in row $i$ of matrix $M_k$ add up to 1.

Now define row $(i,j)$ of matrix $M_k$, which describes how the mass of the virtual buffer $(i,j)$, for $(i,j) \in \mathcal{E}$, gets distributed. When link $(i,j)$ works reliably at time step $k$ (i.e., $X_k[i,j] = 1$), all the mass buffered on link $(i,j)$ is transferred to node $j$; otherwise, no mass is transferred from buffer $(i,j)$ to node $j$, and the buffer retains all its previous mass and increases it by a quantity equal to the mass that node $i$ failed to send to node $j$. These conditions are captured by defining the $M_k$ entries as follows:

$$M_k[(i,j), j] = X_k[i,j], \tag{18}$$
$$M_k[(i,j), (i,j)] = 1 - X_k[i,j]. \tag{19}$$

Also, for obvious reasons,

$$M_k[(i,j), p] = 0, \quad \forall p \ne j,\ p \in \mathcal{V}, \tag{20}$$
$$M_k[(i,j), (s,r)] = 0, \quad \forall (s,r) \ne (i,j),\ (s,r) \in \mathcal{E}. \tag{21}$$

Clearly, all the entries of the row labeled $(i,j)$ add up to 1, which results in $M_k$ being a row-stochastic matrix for all $k \ge 1$.

2) Properties of $M_k$: Let us denote by $\mathcal{M}$ the set of all possible instances (depending on the values of the indicator variables $X_k[i,j]$, $(i,j) \in \mathcal{E}$, $k \ge 1$) of the matrix $M_k$. The matrices in the set $\mathcal{M}$ have the following properties:

(M1) The set $\mathcal{M}$ is finite. Each distinct matrix in $\mathcal{M}$ corresponds to a different instantiation of the indicator variables defined in (10), resulting in exactly $2^{|\mathcal{E}|}$ distinct matrices in $\mathcal{M}$.

(M2) Each matrix in $\mathcal{M}$ is a finite-dimensional square row-stochastic matrix.
The number of rows of each matrix $M_k \in \mathcal{M}$, as defined above, is $n = m + |\mathcal{E}|$, which is finite. Also, from (14)–(21), these matrices are square row-stochastic matrices.

(M3) Each positive element of any matrix in $\mathcal{M}$ is lower bounded by a positive constant. Let us denote this lower bound by $c$. Then, due to the manner in which the matrices in $\mathcal{M}$ are constructed, we can define $c$ to be the positive constant

$$c = \min_{\substack{i,j,\,M \,:\, M \in \mathcal{M} \\ M[i,j] > 0}} M[i,j].$$

(M4) The matrix $M_k$, $k \ge 1$, may be chosen to be any matrix $M \in \mathcal{M}$ with a non-zero probability. The choice of the transition matrix at each time step is independent and identically distributed (i.i.d.) due to the assumption that link failures are independent (between links and across time steps).

Explanation: The probability distribution on $\mathcal{M}$ is a function of the probability distribution on the link reliability. In particular, if a certain $M \in \mathcal{M}$ is obtained when the links in $\mathcal{E}' \subseteq \mathcal{E}$ are reliable and the remaining links are unreliable, then the probability that $M_k = M$ is equal to

$$\prod_{(i,j) \in \mathcal{E}'} q_{ij} \prod_{(i,j) \in \mathcal{E} \setminus \mathcal{E}'} (1 - q_{ij}). \tag{22}$$

(M5) For each $i \in \mathcal{V}$, there exists a finite positive integer $l_i$ such that it is possible to find $l_i$ matrices in $\mathcal{M}$ (possibly with repetition) such that their product (in a chosen order) is a row-stochastic matrix with the column that corresponds to node $i$ containing strictly positive entries. This property states that, for each $i \in \mathcal{V}$, there exists a matrix $T^*_i$, obtained as the product of $l_i$ matrices in $\mathcal{M}$, that has the following properties:

$$T^*_i[j,i] > 0, \quad \forall j \in \mathcal{V}, \tag{23}$$
$$T^*_i[(j_1,j_2), i] > 0, \quad \forall (j_1,j_2) \in \mathcal{E}. \tag{24}$$

This follows from the fact that the underlying graph $\mathcal{G}^a$ is strongly connected (in fact, it can easily be shown that $l_i \le m$).
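The entry-by-entry construction (14)–(21), and the row-stochasticity claimed in (M2), can be verified with a small sketch. The network below is a hypothetical three-node ring chosen for illustration; self-loop links are treated as always reliable, so their buffers simply stay empty.

```python
import numpy as np

def build_M(nodes, edges, reliable):
    """Assemble one instance of M_k on the augmented network via (14)-(21).

    nodes    : computing nodes in V
    edges    : directed links (i, j) in E (self-loops included; the caller
               decides which links are reliable this step)
    reliable : links with X_k[i, j] = 1 at this step
    """
    idx = {v: p for p, v in enumerate(nodes)}                     # rows for V
    idx.update({e: len(nodes) + p for p, e in enumerate(edges)})  # buffer rows
    D = {i: sum(1 for (s, _) in edges if s == i) for i in nodes}  # out-degrees
    n = len(nodes) + len(edges)
    M = np.zeros((n, n))
    for (i, j) in edges:
        x = 1.0 if (i, j) in reliable else 0.0
        M[idx[i], idx[j]] = x / D[i]               # (15): mass delivered to j
        M[idx[i], idx[(i, j)]] = (1 - x) / D[i]    # (14): mass parked in buffer
        M[idx[(i, j)], idx[j]] = x                 # (18): buffer releases
        M[idx[(i, j)], idx[(i, j)]] = 1 - x        # (19): buffer retains
    return M

# Three-node ring with self-loops; link (1, 2) drops its packet this step.
nodes = [0, 1, 2]
edges = [(0, 0), (1, 1), (2, 2), (0, 1), (1, 2), (2, 0)]
reliable = set(edges) - {(1, 2)}
M = build_M(nodes, edges, reliable)
assert M.shape == (9, 9)
assert np.allclose(M.sum(axis=1), 1.0)   # (M2): every row sums to 1
assert M.min() >= 0.0
```

Right-multiplying a row vector by such a matrix preserves its total, which is the matrix-level statement of the mass conservation discussed in Section III.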
To simplify the presentation below, and due to the self-loops, we can take $l_i$ to be equal to a constant $l$ for all $i \in \mathcal{V}$. However, it should be easy to see that the arguments below can be generalized to the case where the $l_i$'s differ. We can also show that, under our assumption on link failures, there exists a single matrix, say $T^*$, which simultaneously satisfies the conditions in (23)–(24) for all $i \in \mathcal{V}$. When all the links in the network operate reliably, the network $\mathcal{G}(\mathcal{V}, \mathcal{E})$ is strongly connected (by assumption). Since $\mathcal{G}$ is strongly connected, there is a directed path between every pair of nodes $i, j \in \mathcal{V}$. In the augmented network $\mathcal{G}^a$, for each $(i,j) \in \mathcal{E}$, there is a link from node $i$ to node $(i,j)$, and a link from node $(i,j)$ to node $j$. Thus, it should be clear that the augmented network $\mathcal{G}^a$ is strongly connected as well. Consider a spanning tree rooted at node 1, such that all the nodes in $\mathcal{V}^a = \mathcal{V} \cup \mathcal{E}$ have a directed path towards node 1, and also a spanning tree in which all the nodes have directed paths from node 1. Choose the matrix, say $M^* \in \mathcal{M}$, that corresponds to all the links on these two spanning trees, as well as the self-loops at all $i \in \mathcal{V}$, being reliable. If the total number of links that are thus reliable is $e$, it should be obvious that $(M^*)^e$ will contain only non-zero entries in the columns corresponding to $i \in \mathcal{V}$. Thus, $l$ defined above may be chosen to be $e$. There are several other ways of constructing $T^*$, some of which may result in a smaller value of $l$.

V. ERGODICITY ANALYSIS OF PRODUCTS OF MATRICES $M_k$

We will next analyze the ergodic behavior of the forward product

$$T_k = M_1 M_2 \cdots M_k = \prod_{j=1}^{k} M_j,$$

where $M_j \in \mathcal{M}$ for all $j = 1, 2, \ldots, k$. Informally defined, weak ergodicity of $T_k$ obtains if the rows of $T_k$ tend to equalize as $k \to \infty$.
In this work, we focus on the weak ergodicity notion and establish probabilistic statements pertaining to the ergodic behavior of $T_k$. The analysis builds upon a large body of literature on products of nonnegative matrices (see, e.g., [1] for a comprehensive account). First, we introduce the basic toolkit adopted from [8], [9], [1], and then use it to analyze the ergodicity of $T_k$.

A. Some Results Pertaining to Coefficients of Ergodicity

Informally speaking, a coefficient of ergodicity of a matrix $A$ characterizes how different two rows of $A$ are. For a row-stochastic matrix $A$, proper^9 coefficients of ergodicity $\delta(A)$ and $\lambda(A)$ are defined as:

$$\delta(A) := \max_j \max_{i_1, i_2} \left| A[i_1, j] - A[i_2, j] \right|, \tag{25}$$

$$\lambda(A) := 1 - \min_{i_1, i_2} \sum_j \min\left( A[i_1, j], A[i_2, j] \right). \tag{26}$$

It is easy to see that $0 \le \delta(A) \le 1$ and $0 \le \lambda(A) \le 1$, and that the rows are identical if and only if $\delta(A) = 0$. Additionally, $\lambda(A) = 0$ if and only if $\delta(A) = 0$.

The next result establishes a relation between the coefficient of ergodicity $\delta(\cdot)$ of a product of row-stochastic matrices and the coefficients of ergodicity $\lambda(\cdot)$ of the individual matrices defining the product. This result will be used in the proof of Lemma 2. It was established in [8] and also follows from the more general statement of Theorem 4.8 in [1].

Proposition 1: For any $p$ square row-stochastic matrices $A_1, A_2, \ldots, A_{p-1}, A_p$,

$$\delta(A_1 A_2 \cdots A_{p-1} A_p) \le \left( \prod_{i=1}^{p-1} \lambda(A_i) \right) \delta(A_p) \le \prod_{i=1}^{p} \lambda(A_i). \tag{27}$$

^9 Any scalar function $\tau(\cdot)$, continuous on the set of $n \times n$ row-stochastic matrices and satisfying $0 \le \tau(A) \le 1$, is said to be a proper coefficient of ergodicity if $\tau(A) = 0$ if and only if $A = e^T v$, where $e$ is the all-ones row vector and $v \ge 0$ is such that $v e^T = 1$ [1].
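The definitions (25)–(26) and the inequality (27) translate directly into a few lines of code. The sketch below checks Proposition 1 on randomly generated row-stochastic matrices (the matrices themselves are arbitrary test data, not instances of $M_k$), and also exhibits the scrambling special case discussed next: a matrix with one strictly positive column has $\lambda < 1$, while the identity has $\lambda = 1$.

```python
import numpy as np
from itertools import combinations

def delta(A):
    """delta(A) from (25): largest column-wise spread between rows."""
    return float(max(A[:, j].max() - A[:, j].min() for j in range(A.shape[1])))

def lam(A):
    """lambda(A) from (26): one minus the worst-case overlap of two rows."""
    return float(1 - min(np.minimum(A[i1], A[i2]).sum()
                         for i1, i2 in combinations(range(A.shape[0]), 2)))

rng = np.random.default_rng(0)
def rand_stochastic(n):
    A = rng.random((n, n))
    return A / A.sum(axis=1, keepdims=True)   # normalize rows to sum to 1

A1, A2, A3 = (rand_stochastic(4) for _ in range(3))
# Proposition 1, inequality (27), for p = 3:
assert delta(A1 @ A2 @ A3) <= lam(A1) * lam(A2) * delta(A3) + 1e-12
assert delta(A3) <= lam(A3) + 1e-12     # delta never exceeds lambda

# A row-stochastic matrix with a strictly positive column is scrambling:
B = np.array([[0.5, 0.5, 0.0],
              [0.2, 0.0, 0.8],
              [0.1, 0.0, 0.9]])
assert lam(B) < 1.0
assert lam(np.eye(4)) == 1.0            # the identity is not scrambling
```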
The result in (27) is particularly useful for inferring ergodicity of a product of matrices from the ergodic properties of the individual matrices in the product. For example, if $\lambda(A_i)$ is less than 1 for all $i$, then $\delta(A_1 A_2 \cdots A_{p-1} A_p)$ will tend to zero as $p \to \infty$. We will next introduce an important class of matrices for which $\lambda(\cdot) < 1$.

Definition 1: A matrix $A$ is said to be a scrambling matrix if $\lambda(A) < 1$ [1].

In a scrambling matrix $A$, since $\lambda(A) < 1$, for each pair of rows $i_1$ and $i_2$ there exists a column $j$ (which may depend on $i_1$ and $i_2$) such that $A[i_1, j] > 0$ and $A[i_2, j] > 0$. As a special case, if any one column of a row-stochastic matrix $A$ contains only non-zero entries, then $A$ must be scrambling.

B. Ergodicity Analysis of Iterations of the Robust Algorithm

We next analyze the ergodic properties of the products of matrices that result from each of the iterations comprising our robust algorithm. Let us focus on just one of the iterations, say $y_k$, as the treatment of the $z_k$ iteration is identical. As described in Section IV, the progress of the $y_k$ iteration can be recast as an inhomogeneous Markov chain

$$y_k = y_{k-1} M_k, \quad k \ge 1, \tag{28}$$

where $M_k \in \mathcal{M}$ for all $k$. As already discussed, the sequence of $M_k$'s that governs the progress of $y_k$ is determined by communication link availability. Defining $T_k = \prod_{j=1}^{k} M_j$, we obtain:

$$y_k = y_0 M_1 M_2 \cdots M_k = y_0 \prod_{j=1}^{k} M_j = y_0 T_k, \quad k \ge 1. \tag{29}$$

By convention, the empty product $\prod_{j=1}^{0} M_j = I$, where $I$ denotes the $n \times n$ identity matrix.

Recalling the constant $l$ defined in (M5), define $W_k$ as follows:

$$W_k = \prod_{j=(k-1)l+1}^{kl} M_j, \quad k \ge 1,\ M_j \in \mathcal{M}, \tag{30}$$

from which it follows that

$$T_{lk} = \prod_{j=1}^{k} W_j, \quad k \ge 1. \tag{31}$$

Observe that the sets of time steps "covered" by $W_i$ and $W_j$, $i \ne j$, are non-overlapping.
It is also important to note, for subsequent analysis, that since the $M_k$'s are row-stochastic matrices and the product of any number of row-stochastic matrices is row stochastic, all the $W_k$'s and $T_k$'s are also row-stochastic matrices.

Lemma 2 will establish that, as the number of iteration steps goes to infinity, the rows of the matrix $T_k$ tend to equalize. For proving Lemma 2, we need the result in Lemma 1 stated below, which establishes that there exists a nonzero probability of choosing matrices in $\mathcal{M}$ such that the $W_k$'s as defined in (30) are scrambling.

Lemma 1: There exist constants $w > 0$ and $d < 1$ such that, with probability equal to $w$, $\lambda(W_k) \le d$ for $k \ge 1$, independently for different $k$.

Proof: Each $W_k$ matrix is a product of $l$ matrices from the set $\mathcal{M}$. The choice of the $M_k$'s that form $W_i$ and $W_j$ is independent for $i \ne j$, since $W_i$ and $W_j$ "cover" non-overlapping intervals of time. Thus, under the i.i.d. assumption for the selection of matrices from $\mathcal{M}$ (property (M4)), together with property (M5), it follows that, with a non-zero probability (independently for $W_k$ and $W_{k'}$ for $k \ne k'$), the matrix $W_k$ is scrambling for each $k$. Let us denote by $w$ the probability that $W_k$ is scrambling. Let us define $\mathcal{W}$ as the set of all possible instances of $W_k$ that are scrambling. The set $\mathcal{W}$ is finite because the set $\mathcal{M}$ is finite, and $\mathcal{W}$ is non-empty (this follows from the discussion of (M5)). Let us define $d$ as the tight upper bound on $\lambda(W)$ for $W \in \mathcal{W}$, i.e.,

$$d \equiv \max_{W \in \mathcal{W}} \lambda(W). \tag{32}$$

Recall that $\lambda(A)$ for any scrambling matrix $A$ is strictly less than 1. Since $\mathcal{W}$ is non-empty and finite, and contains only scrambling matrices, it follows that

$$d < 1. \tag{33}$$

Lemma 2: There exist constants $\alpha$ and $\beta$ ($0 < \alpha < 1$, $0 \le \beta < 1$) such that, with probability greater than $1 - \alpha^k$, $\delta(T_k) \le \beta^k$ for $k \ge 8l/w$.

Proof: Let $k^* = \lfloor k/l \rfloor$ and $\Delta = k - l k^*$. Thus, $0 \le \Delta < l$.
From (29) through (31), observe that

$$T_k = T_{l k^* + \Delta} = T_{l k^*} \prod_{j=1}^{\Delta} M_{l k^* + j},$$

where $T_{lk^*}$ is the product of the $k^*$ matrices $W_j$, $1 \le j \le k^*$. As per Lemma 1, for each $W_j$, the probability that $\lambda(W_j) \le d < 1$ is equal to $w$. Thus, the expected number of scrambling matrices among the $k^*$ matrices is $w k^*$. Denote by $S$ the actual number of scrambling $W_j$ matrices among the $k^*$ matrices. Then the Chernoff lower-tail bound tells us that, for any $\phi > 0$,

$$\Pr\{S < (1 - \phi) E(S)\} < e^{-E(S)\phi^2/2} \tag{34}$$
$$\Rightarrow \Pr\{S < (1 - \phi)(w k^*)\} < e^{-(w k^*)\phi^2/2}. \tag{35}$$

Let us choose $\phi = \frac{1}{2}$. Then,

$$\Pr\{S < (w k^*)/2\} < e^{-w k^*/8} \tag{36}$$
$$\Rightarrow \Pr\{S \ge w k^*/2\} > 1 - e^{-w k^*/8}. \tag{37}$$

Thus, at least $\lfloor w k^*/2 \rfloor$ of the $W$ matrices among the $k^*$ matrices forming $T_{lk^*}$ are scrambling (each with $\lambda$ value $\le d$, by Lemma 1) with probability greater than $1 - e^{-w k^*/8}$. Proposition 1 then implies that

$$\delta(T_k) = \delta(T_{lk^*+\Delta}) = \delta\!\left( \prod_{i=1}^{k^*} W_i \prod_{i=1}^{\Delta} M_{lk^*+i} \right) \le \prod_{i=1}^{k^*} \lambda(W_i) \prod_{i=1}^{\Delta} \lambda(M_{lk^*+i}).$$

Since at least $\lfloor w k^*/2 \rfloor$ of the $W_i$'s have $\lambda(W_i) \le d$ with probability greater than $1 - e^{-w k^*/8}$, and $\lambda(M_j) \le 1$ for all $j$, it follows that

$$\delta(T_k) \le d^{\lfloor w k^*/2 \rfloor} \tag{38}$$

with probability exceeding

$$1 - e^{-w k^*/8}. \tag{39}$$

Let us define $\alpha = e^{-\frac{w}{16l}}$ and $\beta = d^{\frac{w}{8l}}$. Now, if $k \ge 8l/w$, then $k \ge 2l$, and

$$k^* = \lfloor k/l \rfloor \ge \frac{k}{2l} \tag{40}$$
$$\Rightarrow \frac{w k^*}{2} \ge \frac{w k}{4l} \tag{41}$$
$$\Rightarrow \left\lfloor \frac{w k^*}{2} \right\rfloor \ge \frac{w k}{8l} \quad \left(\text{since } \lfloor x \rfloor \ge x - 1 \text{ and } \frac{wk}{8l} \ge 1 \text{ when } k \ge 8l/w\right) \tag{42}$$
$$\Rightarrow d^{\lfloor w k^*/2 \rfloor} \le d^{\frac{wk}{8l}} \quad (\text{because } 0 \le d < 1) \tag{43}$$
$$\Rightarrow d^{\lfloor w k^*/2 \rfloor} \le \beta^k. \tag{44}$$

Similarly, if $k \ge 8l/w$, it follows that

$$k^* = \lfloor k/l \rfloor \ge \frac{k}{2l} \tag{45}$$
$$\Rightarrow e^{-w k^*/8} \le e^{-\frac{wk}{16l}} = \alpha^k \tag{46}$$
$$\Rightarrow 1 - e^{-w k^*/8} \ge 1 - \alpha^k. \tag{47}$$

By substituting (44) and (47) into (38) and (39), respectively, the result follows.
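The equalization of the rows of $T_k$ can be observed numerically. The sketch below is illustrative, not part of the formal argument: it uses a hypothetical two-node network (states ordered as the two nodes followed by the two buffers, with self-loops assumed always reliable), draws the four possible $M_k$ instances i.i.d. as in (M4), and checks that the forward product stays row stochastic while $\delta(T_k)$ collapses toward zero.

```python
import numpy as np

def delta(A):
    """delta(A) from (25): largest column-wise spread between rows."""
    return float(max(A[:, j].max() - A[:, j].min() for j in range(A.shape[1])))

def M_instance(x_ab, x_ba):
    """One of the 2^|E| instances of M_k for a 2-node network.

    State order: [a, b, buffer (a,b), buffer (b,a)]. Each node has a
    self-loop (always reliable) and one outgoing link, so D = 2."""
    M = np.zeros((4, 4))
    M[0, 0] = 0.5;  M[0, 1] = 0.5 * x_ab;  M[0, 2] = 0.5 * (1 - x_ab)
    M[1, 1] = 0.5;  M[1, 0] = 0.5 * x_ba;  M[1, 3] = 0.5 * (1 - x_ba)
    M[2, 1] = x_ab; M[2, 2] = 1 - x_ab     # buffer (a,b) releases or retains
    M[3, 0] = x_ba; M[3, 3] = 1 - x_ba     # buffer (b,a) releases or retains
    return M

rng = np.random.default_rng(1)
q = 0.7                                    # per-link reliability, as in (22)
T = np.eye(4)
for k in range(200):
    T = T @ M_instance(rng.random() < q, rng.random() < q)
assert np.allclose(T.sum(axis=1), 1.0)     # forward products stay row stochastic
assert delta(T) < 1e-6                     # rows equalize: weak ergodicity
```

With these parameters, the decay is in fact much faster than the conservative bound $\beta^k$ of Lemma 2 suggests, which is consistent with the remark below about the looseness of the chosen threshold.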
Note that $\alpha$ and $\beta$ in Lemma 2 are independent of time. The threshold on $k$ for which Lemma 2 holds, namely $k \ge 8l/w$, can be improved by using better bounds in (40) and (45). Knowing a smaller threshold on $k$ for which Lemma 2 holds can be beneficial in a practical implementation. In the above derivation for Lemma 2, we chose a somewhat loose threshold in order to maintain a simpler form for the probability expression (namely, $1 - \alpha^k$) and also a simpler expression for the bound on $\delta(T_k)$ (namely, $\beta^k$).

Lemma 3: $\delta(T_k)$ converges almost surely to 0.

Proof: For $k \ge 8l/w$, from Lemma 2, we have that $\Pr\{\delta(T_k) > \beta^k\} \le \alpha^k$, with $0 < \alpha < 1$ and $0 \le \beta < 1$. It is then easy to see that $\sum_k \Pr\{\delta(T_k) > \beta^k\} \le 8l/w + \sum_k \alpha^k < \infty$. Then, by the first Borel-Cantelli lemma,

$$\Pr\{\text{the event } \delta(T_k) > \beta^k \text{ occurs infinitely often}\} = 0.$$

Therefore, $\delta(T_k)$ converges to 0 almost surely.

VI. CONVERGENCE ANALYSIS OF THE ROBUSTIFIED RATIO CONSENSUS ALGORITHM

The analysis below shows that the ratio algorithm achieves asymptotic consensus correctly in the presence of the virtual nodes, even if the diagonals of the transition matrices (the $M_k$'s) are not always strictly positive. A key consequence is that the value of $z_k[i]$ is not necessarily greater than zero (at least not for all $k$), which creates some difficulty when calculating the ratio $y_k[i]/z_k[i]$. As noted earlier, aside from these differences, our algorithm is similar to that analyzed in [2]. Our proof has some similarities to the proof in [2], with the differences accounting for our relaxed assumptions.
By defining $z_k$ in a manner analogous to the state $y_k$ in Section IV, the robustified version of the ratio consensus algorithm in (3)–(4) can be described in matrix form as

$$y_k = y_{k-1} M_k, \quad k \ge 1, \tag{48}$$
$$z_k = z_{k-1} M_k, \quad k \ge 1, \tag{49}$$

where $M_k \in \mathcal{M}$, $k \ge 1$; $y_0[i] \ge 0$ and $z_0[i] \ge 0$ for all $i$, with $\sum_j z_0[j] > 0$; and $y_0[(i,j)] = z_0[(i,j)] = 0$ for all $(i,j) \in \mathcal{E}$. The same matrix $M_k$ is used at step $k$ of the iterations in (48) and (49); however, $M_k$ may vary over $k$. Recall that $y_k$ and $z_k$ in (48) and (49) have $n$ elements, but only the first $m$ elements correspond to computing nodes in the augmented network $\mathcal{G}^a$; the remaining entries in $y_k$ and $z_k$ correspond to virtual buffers.

The goal of the algorithm is for each computing node to obtain a consensus value defined as

$$\pi^* = \frac{\sum_j y_0[j]}{\sum_j z_0[j]}. \tag{50}$$

To achieve this goal, each node $i \in \mathcal{V}$ calculates

$$\pi_k[i] = \frac{y_k[i]}{z_k[i]}, \tag{51}$$

whenever the denominator is large enough, i.e., whenever

$$z_k[i] \ge \mu, \tag{52}$$

for some constant $\mu > 0$ to be defined later. We will show that, for each $i = 1, 2, \ldots, m$, the sequence $\pi_k[i]$ thus calculated asymptotically converges to the desired consensus value $\pi^*$. To show this, we first establish that (52) occurs infinitely often, so that the computing nodes can calculate the ratio in (51) infinitely often. Then, we will show that, as $k$ goes to infinity, the sequence of ratio computations in (51) converges to the value in (50).

The convergence when $\sum_j y_0[j] = 0$ can be shown trivially. So let us now consider the case $\sum_j y_0[j] > 0$, and define new state variables $\tilde{y}_k$ and $\tilde{z}_k$ for $k \ge 0$ as follows:

$$\tilde{y}_k[i] = \frac{y_k[i]}{\sum_j y_0[j]}, \quad \forall i, \tag{53}$$
$$\tilde{z}_k[i] = \frac{z_k[i]}{\sum_j z_0[j]}, \quad \forall i. \tag{54}$$

Thus, $\tilde{y}_0$ and $\tilde{z}_0$ are obtained by normalizing $y_0$ and $z_0$.
It follows that $\tilde{y}_0$ and $\tilde{z}_0$ are stochastic row vectors. Also, since our transition matrices are row stochastic, it follows that $\tilde{y}_k$ and $\tilde{z}_k$ are stochastic vectors for all $k \ge 0$.

We assume that each node knows a lower bound on $\sum_j z_0[j]$, denoted by $\mu_z$. In typical scenarios, $z_0[i]$ will be positive for all $i \in \mathcal{V}$, and node $i \in \mathcal{V}$ can use $z_0[i]$ as a non-zero lower bound on $\sum_j z_0[j]$ (thus, in general, the lower bounds used by different nodes may not be identical). We also assume an upper bound, say $\mu_y$, on $\sum_j y_0[j]$. Let us define

$$\mu = \frac{\mu_z c^l}{n}. \tag{55}$$

As time progresses, each node $i \in \mathcal{V}$ will calculate a new estimate of the consensus value whenever $z_k[i] \ge \mu$. The next lemma establishes that the nodes can carry out this calculation infinitely often.

Lemma 4: Let $\mathcal{T}_i = \{\tau_i^1, \tau_i^2, \cdots\}$ denote the sequence of time instants at which node $i$ updates its estimate of the consensus using (51), obeying (52), where $\tau_i^j < \tau_i^{j+1}$, $j \ge 1$. The sequence $\mathcal{T}_i$ contains infinitely many elements with probability 1.

Proof: To prove the lemma, it suffices to prove that, with probability 1, $z_k[i] \ge \mu$ for infinitely many values of $k$. Properties (M1)–(M5) imply that each matrix $W_j$, $j \ge 1$ (defined in (30)), contains a strictly positive column corresponding to index $i \in \mathcal{V}$ with a non-zero probability, say $\gamma_i > 0$. Also, the choices of $W_{k_1}$ and $W_{k_2}$ are independent of each other for $k_1 \ne k_2$. Therefore, the second Borel-Cantelli lemma implies that, with probability 1, $W_j$ will have its $i$-th column strictly positive for infinitely many values of $j$. Since the non-zero elements of each matrix in $\mathcal{M}$ are all greater than or equal to $c > 0$ (by property (M3)), and since $W_j$ is a product of $l$ matrices in $\mathcal{M}$, it follows that all the non-zero elements of each $W_j$ must be lower bounded by $c^l$.
Consider only those $j \ge 1$ for which $W_j$ contains a positive $i$-th column. As noted above, there are infinitely many such values of $j$. Now, $\tilde{z}_{jl} = \tilde{z}_{(j-1)l} W_j$. As noted above, $\tilde{z}_k$ is a stochastic vector; thus, for any $k \ge 0$,

$$\sum_i \tilde{z}_k[i] = 1, \tag{56}$$

and at least one of the elements of $\tilde{z}_{(j-1)l}$ must be greater than or equal to $1/n$. Also, all the elements in the column of $W_j$ indexed by $i \in \mathcal{V}$ are lower bounded by $c^l$ (recall that we are now considering only those $j$ for which the $i$-th column of $W_j$ is positive). This implies that

$$\tilde{z}_{jl}[i] \ge c^l / n \tag{57}$$
$$\Rightarrow z_{jl}[i] \ge \left( \sum_p z_0[p] \right) c^l / n \tag{58}$$
$$\Rightarrow z_{jl}[i] \ge \mu_z c^l / n \tag{59}$$
$$\Rightarrow z_{jl}[i] \ge \mu \quad \text{(by (55))}. \tag{60}$$

Since infinitely many $W_j$'s will contain a positive $i$-th column (with probability 1), (60) holds for infinitely many $j$ with probability 1. Therefore, with probability 1, the set $\mathcal{T}_i = \{\tau_i^1, \tau_i^2, \cdots\}$ contains infinitely many elements, for all $i \in \mathcal{V}$.

Finally, the next theorem shows that the ratio consensus algorithm converges to the consensus value defined in (50).

Theorem 1: Let $\pi_i[t]$ denote node $i$'s estimate of the consensus value calculated at time $\tau_i^t$. For each node $i \in \mathcal{V}$, with probability 1, the estimate $\pi_i[t]$ converges to

$$\pi^* = \frac{\sum_j y_0[j]}{\sum_j z_0[j]}.$$

Proof: Note that the transition matrices $M_k$, $k \ge 1$, are randomly drawn from a certain distribution. By an "execution" of the algorithm, we mean a particular instance of the $M_k$ sequence. Thus, the distribution on the $M_k$'s results in a distribution on the executions. Lemma 3 implies that

$$\Pr\left\{ \lim_{k \to \infty} \delta(T_k) = 0 \right\} = 1.$$
Together, Lemmas 3 and 4 imply that, with probability 1, for a chosen execution, (i) for any $\psi > 0$, there exists a finite $k_\psi$ such that, for all $k \ge k_\psi$, $\delta(T_k) < \psi$, and (ii) there exist infinitely many values of $k \ge k_\psi$ such that $z_k[i] \ge \mu$ (i.e., $k \in \mathcal{T}_i$ for the chosen execution).

Consider any $k \ge k_\psi$ such that $z_k[i] \ge \mu$. Since $\delta(T_k) < \psi$, the rows of the matrix $T_k$ are "within $\psi$" of each other. Observe that $\tilde{y}_k$ is obtained as the product of the stochastic row vector $\tilde{y}_0$ and $T_k$; thus, $\tilde{y}_k$ is in the convex hull of the rows of $T_k$. Similarly, $\tilde{z}_k$ is in the convex hull of the rows of $T_k$. Therefore, the $j$-th elements of $\tilde{y}_k$ and $\tilde{z}_k$ are within $\psi$ of each other, for all $j$. Therefore,

$$\left| \tilde{y}_k[i] - \tilde{z}_k[i] \right| \le \psi \tag{61}$$
$$\Rightarrow \left| \frac{\tilde{y}_k[i]}{\tilde{z}_k[i]} - 1 \right| \le \frac{\psi}{\tilde{z}_k[i]} \tag{62}$$
$$\Rightarrow \left| \frac{y_k[i]}{z_k[i]} - \frac{\sum_j y_0[j]}{\sum_j z_0[j]} \right| \le \frac{\psi \sum_j y_0[j]}{z_k[i]} \quad \text{(by (53) and (54))} \tag{63}$$
$$\Rightarrow \left| \frac{y_k[i]}{z_k[i]} - \frac{\sum_j y_0[j]}{\sum_j z_0[j]} \right| \le \frac{\psi \mu_y}{z_k[i]} \quad \text{(because } \textstyle\sum_j y_0[j] \le \mu_y\text{)} \tag{64}$$
$$\Rightarrow \left| \frac{y_k[i]}{z_k[i]} - \frac{\sum_j y_0[j]}{\sum_j z_0[j]} \right| \le \frac{\psi \mu_y}{\mu}. \tag{65}$$

Now, given any $\epsilon > 0$, let us choose $\psi = \epsilon \mu / \mu_y$. Then, (65) implies that

$$\left| \frac{y_k[i]}{z_k[i]} - \frac{\sum_j y_0[j]}{\sum_j z_0[j]} \right| \le \epsilon$$

whenever $k \ge k_\psi$ and $k \in \mathcal{T}_i$. Therefore, in the limit, $y_k[i]/z_k[i]$ for $k \in \mathcal{T}_i$ converges to $\frac{\sum_j y_0[j]}{\sum_j z_0[j]}$. This result holds with probability 1, since conditions (i) and (ii) stated above hold with probability 1.

The result above can be strengthened by proving convergence of the algorithm even if each node $i \in \mathcal{V}$ updates its estimate whenever $z_k[i] > 0$ (not necessarily $\ge \mu$). To prove convergence in this case, the argument is similar to that in Theorem 1, with two modifications:

• Lemma 4 needs to be strengthened by observing that there exist infinitely many time instants at which $z_k[i] \ge \mu$ simultaneously for all $i \in \mathcal{V}$.
This is true due to the existence of a matrix $T^*$ (as seen in the discussion of (M5)) that contains positive columns corresponding to all $i \in \mathcal{V}$.

• Using the above observation and the argument in the proof of Theorem 1, it then follows that, with probability 1, for any $\psi$, there exists a finite $k_\psi$ such that $\delta(T_k) < \psi$ whenever $k \ge k_\psi$. As before, defining $\psi = \epsilon \mu / \mu_y$, it can be shown that, for any $\epsilon$, there exists a $k_\epsilon \ge k_\psi$ such that the following inequality holds for all $i \in \mathcal{V}$ simultaneously:

$$\left| \frac{y_{k_\epsilon}[i]}{z_{k_\epsilon}[i]} - \frac{\sum_j y_0[j]}{\sum_j z_0[j]} \right| \le \epsilon, \quad \forall i \in \mathcal{V} \tag{66}$$
$$\Rightarrow \frac{\sum_j y_0[j]}{\sum_j z_0[j]} - \epsilon \le \frac{y_{k_\epsilon}[i]}{z_{k_\epsilon}[i]} \le \frac{\sum_j y_0[j]}{\sum_j z_0[j]} + \epsilon, \quad \forall i \in \mathcal{V}. \tag{67}$$

Naturally, $z_{k_\epsilon}[i] \ne 0$ for all $i \in \mathcal{V}$. It is now easy to argue that the above inequality will continue to hold for all $k > k_\epsilon$ and each $i \in \mathcal{V}$ whenever $z_k[i] > 0$. To see this, observe that, for $k > k_\epsilon$,

$$y_k = y_{k_\epsilon} \prod_{j=k_\epsilon+1}^{k} M_j.$$

Define $P = \prod_{j=k_\epsilon+1}^{k} M_j$. Then, whenever $z_k[i] > 0$ for $k > k_\epsilon$,

$$\frac{y_k[i]}{z_k[i]} = \frac{\sum_{j=1}^{n} y_{k_\epsilon}[j]\, P[j,i]}{\sum_{j=1}^{n} z_{k_\epsilon}[j]\, P[j,i]} \tag{68}$$

$$= \frac{\sum_{j:\, P[j,i] \ne 0} y_{k_\epsilon}[j]\, P[j,i]}{\sum_{j:\, P[j,i] \ne 0} z_{k_\epsilon}[j]\, P[j,i]} \quad \text{(summation over non-zero } P[j,i] \text{ terms)}$$

$$\Rightarrow \min_{j:\, P[j,i] > 0} \frac{y_{k_\epsilon}[j]}{z_{k_\epsilon}[j]} \le \frac{y_k[i]}{z_k[i]} \le \max_{j:\, P[j,i] > 0} \frac{y_{k_\epsilon}[j]}{z_{k_\epsilon}[j]} \tag{69}$$

$$\Rightarrow \frac{\sum_j y_0[j]}{\sum_j z_0[j]} - \epsilon \le \frac{y_k[i]}{z_k[i]} \le \frac{\sum_j y_0[j]}{\sum_j z_0[j]} + \epsilon \quad \text{(from (67))} \tag{70}$$

$$\Rightarrow \left| \frac{y_k[i]}{z_k[i]} - \frac{\sum_j y_0[j]}{\sum_j z_0[j]} \right| \le \epsilon \quad \text{for all } i \in \mathcal{V} \text{ and } k \ge k_\epsilon. \tag{71}$$

This proves the convergence of the algorithm in the limit. Recall that for this convergence it suffices if each node updates its estimate of the consensus whenever its $z$ value is positive.
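To complement the analysis, the two coupled iterations (48)–(49) can be run end-to-end on a small example. This is a minimal sketch under stated assumptions, not the paper's experimental setup: the network is a hypothetical three-node ring with self-loops (assumed always reliable), link failures are drawn i.i.d., and the same randomly drawn $M_k$ drives both iterations, as the analysis requires.

```python
import numpy as np

# Robustified ratio consensus (48)-(49) on a 3-node ring with self-loops.
nodes = [0, 1, 2]
edges = [(0, 0), (1, 1), (2, 2), (0, 1), (1, 2), (2, 0)]
idx = {v: p for p, v in enumerate(nodes)}
idx.update({e: len(nodes) + p for p, e in enumerate(edges)})
D = {i: sum(1 for (s, _) in edges if s == i) for i in nodes}
n = len(nodes) + len(edges)

def M_instance(reliable):
    """One instance of M_k, built entry-by-entry from (14)-(21)."""
    M = np.zeros((n, n))
    for (i, j) in edges:
        x = 1.0 if (i, j) in reliable else 0.0
        M[idx[i], idx[j]] = x / D[i]
        M[idx[i], idx[(i, j)]] = (1 - x) / D[i]
        M[idx[(i, j)], idx[j]] = x
        M[idx[(i, j)], idx[(i, j)]] = 1 - x
    return M

rng = np.random.default_rng(2)
y = np.zeros(n); y[:3] = [1.0, 4.0, 7.0]   # y_0: initial values, buffers at 0
z = np.zeros(n); z[:3] = 1.0               # z_0 = 1 at every computing node
target = y[:3].sum() / z[:3].sum()          # pi* from (50): the average, 4.0
for k in range(600):
    up = {e for e in edges if e[0] == e[1] or rng.random() < 0.8}
    M = M_instance(up)                      # the same M_k drives both chains
    y = y @ M                               # (48)
    z = z @ M                               # (49)
est = y[:3] / z[:3]                         # ratio (51) at the computing nodes
assert np.allclose(est, target, atol=1e-6)  # every node approaches pi*
```

Note that the self-loops keep $z_k[i] > 0$ at every computing node in this particular setup, so the ratio is always well defined here; the threshold $\mu$ of (52) matters precisely when that cannot be guaranteed.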
Inequality (69) follows from the observation that
$$\frac{\sum_j a[j]\, u[j]}{\sum_j b[j]\, u[j]} = \sum_j \left[ \frac{a[j]}{b[j]} \cdot \frac{b[j]\, u[j]}{\sum_k b[k]\, u[k]} \right]$$
is a weighted average of the ratios $a[j]/b[j]$, and is therefore lower bounded by $\min_j a[j]/b[j]$ and upper bounded by $\max_j a[j]/b[j]$.

VII. CONCLUDING REMARKS AND FUTURE WORK

Although our analysis above is motivated by wireless environments wherein transmissions may not succeed, the analysis is more general. In particular, it applies to other situations in which properties (M1)–(M5) hold. Indeed, property (M4) by itself is not as important as its consequence that a given $W_k$ matrix has non-zero columns indexed by $i \in V$.

A particular application of the above analysis is the case in which messages may be delayed. As discussed previously, mass is transferred by any node to its neighbors by means of messages. Since these messages may be delayed, a message sent on link $(i,j)$ in slot $k$ may be received by node $j$ in a later slot. Let us denote by $V_k[i]$ the set of messages received by node $i$ at step $k$. It is possible for $V_k[i]$ to contain multiple messages from the same node. Note that $V_k[i]$ may contain a message sent by node $i$ to itself as well. Let us define the iteration for $y_k$ as follows:
$$y_k[i] = \sum_{v \in V_k[i]} v. \quad (72)$$
The iteration for $z_k$ can be defined analogously. Our robust consensus algorithm essentially implements the above iteration, allowing for delays in the delivery of mass on any link $(i,j)$ (caused by link failures). However, in effect, the robust algorithm also ensures FIFO (first-in-first-out) delivery, as follows: in slot $k$, if node $i$ receives mass sent by node $j \in \mathcal{I}_i$ in slot $s$, $s \leq k$, then mass sent by node $j$ in slots strictly smaller than $s$ has either been received previously, or will be received in slot $k$.
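Iteration (72) can be sketched numerically as follows. This is a simplified simulation, not the paper's robust algorithm: the graph is complete with self-loops, mass is split evenly over all links, and each link $(i,j)$ has a fixed delay of 1 or 2 slots (the paper's delays instead arise from random packet drops). Because every message carries matched $y$ and $z$ shares over the same link with the same delay, both iterations see identical dynamics, and the ratio at every node still converges to the average:

```python
import numpy as np
from collections import defaultdict

n = 4
steps = 1000
rng = np.random.default_rng(2)

y = rng.uniform(0.0, 5.0, n)      # initial values
z = np.ones(n)                    # companion mass iteration
target = y.sum() / n              # the average every node should learn

# Fixed per-link delays in {1, 2} slots (illustrative choice); link (i, j)
# exists for all i, j, i.e. a complete graph with self-loops.
delay = [[1 + (i + j) % 2 for j in range(n)] for i in range(n)]

in_flight = defaultdict(list)     # arrival slot -> [(dest, y_share, z_share)]

for k in range(steps):
    # send: node i splits its entire current mass evenly over all n links;
    # each message carries matched y and z shares and incurs the link delay
    for i in range(n):
        for j in range(n):
            in_flight[k + delay[i][j]].append((j, y[i] / n, z[i] / n))
    # receive: iteration (72) -- the new mass at node i is the sum over
    # its virtual buffer V_{k+1}[i] of all messages arriving in slot k+1
    y = np.zeros(n)
    z = np.zeros(n)
    for dest, ys, zs in in_flight.pop(k + 1, []):
        y[dest] += ys
        z[dest] += zs

assert (z > 0).all()              # every node can form its estimate
assert np.allclose(y / z, target, atol=1e-6)
```

Note that the per-node masses $y_k[i]$ themselves never converge (mass keeps circulating through the delay buffers); only the ratio settles, which is why the algorithm reads out $y_k[i]/z_k[i]$.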
The virtual buffer mechanism essentially models asynchronous communication, wherein the messages between any pair of nodes in the network may incur arbitrary delays, governed by some distribution. It is not difficult to see that the iterative algorithm (72) should be able to achieve consensus correctly under other distributions on message delays as well, with possible correlation between the delays. In fact, it is also possible to tolerate non-FIFO (out-of-order) message delivery, provided that the delay distribution satisfies some reasonable constraints. A delay of up to $B$ slots on a given link $(i,j) \in E$ can be modeled using a single chain of $B$ virtual nodes, with links from node $i$ to every virtual node, and a link from the last of the $B$ virtual nodes to node $j$; in this setting, depending on the delay incurred by a packet, the appropriate link from node $i$ to one of the virtual nodes on the delay chain (or to $j$, if the delay is 0) is used.

Note that while we made certain assumptions regarding link failures, the analysis relies primarily on two implications of these assumptions, namely (i) the rows of the transition matrix $T_k$ become close to identical as $k$ increases, and (ii) $z_k[i]$ is bounded away from 0 for each $i$ infinitely often. When these implications hold, similar convergence results may hold in other environments.
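The virtual delay-chain construction described in the concluding remarks can be sketched as a small graph-augmentation helper. The function below is a hypothetical illustration (node labels and the edge-list representation are our own choices, not the paper's): for each original link $(i,j)$ it adds $B$ virtual nodes forming a chain into $j$, plus links from $i$ into every point of the chain, so that a packet with delay $d$ enters the chain $d$ hops away from $j$:

```python
def augment_with_delay_chain(n, edges, B):
    """Return (num_nodes, edge_list) for the augmented graph modeling
    delays of 0..B slots on each original link (i, j) via a chain of B
    virtual nodes. Virtual nodes are labeled n, n+1, ...; a packet delayed
    by d >= 1 slots enters the chain d hops from j, and a packet with
    delay 0 uses the direct link (i, j)."""
    aug_edges = []
    next_id = n
    for (i, j) in edges:
        chain = list(range(next_id, next_id + B))  # chain[0] is farthest from j
        next_id += B
        # chain links: chain[0] -> chain[1] -> ... -> chain[B-1] -> j
        for a, b in zip(chain, chain[1:]):
            aug_edges.append((a, b))
        aug_edges.append((chain[-1], j))
        # node i can inject at any point of the chain (delay d enters the
        # virtual node d hops from j), or send directly with delay 0
        for d in range(1, B + 1):
            aug_edges.append((i, chain[B - d]))
        aug_edges.append((i, j))
    return next_id, aug_edges

# One link (0, 1) on a 3-node graph, delays up to B = 2
n_aug, E = augment_with_delay_chain(3, [(0, 1)], 2)
assert n_aug == 5               # two virtual nodes (3 and 4) were added
assert (0, 1) in E              # delay-0 path
assert (0, 4) in E and (0, 3) in E   # injection points for delays 1 and 2
assert (4, 1) in E              # last chain node feeds node j
```

Each slot, mass held at a virtual node advances one hop along its chain, so the augmented system is again a linear iteration of the form analyzed above, with the virtual nodes enlarging the state.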