Distributed Averaging via Lifted Markov Chains
Motivated by applications in distributed linear estimation, distributed control, and distributed optimization, we consider the question of designing linear iterative algorithms for computing the average of numbers in a network. Specifically, our interest is in designing such an algorithm with the fastest rate of convergence given the topological constraints of the network. As the main result of this paper, we design an algorithm with the fastest possible rate of convergence using a non-reversible Markov chain on the given network graph. We construct such a Markov chain by transforming the standard Markov chain, which is obtained using the Metropolis-Hastings method. We call this novel transformation pseudo-lifting. We apply our method to graphs with geometry, or graphs with doubling dimension. Specifically, the convergence time of our algorithm (equivalently, the mixing time of our Markov chain) is proportional to the diameter of the network graph and hence optimal. As a byproduct, our result provides the fastest mixing Markov chain given the network topological constraints, and should naturally find applications in the context of distributed optimization, estimation, and control.
💡 Research Summary
The paper tackles the classic problem of distributed averaging: each node in a network holds an initial scalar value and the goal is to compute the global average using only local communications. Traditional approaches rely on reversible Markov chains (e.g., Metropolis‑Hastings, lazy random walks) whose mixing time is controlled by the spectral gap and typically scales super‑linearly in the diameter D — for example Θ(D²) on path‑ and grid‑like graphs — which is far from optimal on sparse or high‑diameter networks. The authors ask whether one can reach the theoretical lower bound, namely a mixing time proportional to the graph diameter, while respecting the network’s topological constraints.
To answer this, they introduce a novel construction called pseudo‑lifting. Classical lifting expands each vertex into many copies, creating a higher‑dimensional state space that can host non‑reversible transitions, but at the cost of large memory and communication overhead. Pseudo‑lifting instead adds a carefully designed non‑reversible bias to the original Metropolis‑Hastings transition matrix without proliferating copies. Concretely, for an undirected graph G=(V,E) with uniform target distribution π, the new transition matrix Q is defined as
Q_{ij} = p_{ij} + α·(π_j – π_i) for i≠j,
where p_{ij} are the Metropolis‑Hastings probabilities and α∈(0,1) controls the strength of the bias. The diagonal entries are set to preserve stochasticity. This modification preserves the invariant distribution (hence the average is conserved) while breaking reversibility, thereby allowing probability flow to be directed along long paths.
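The average-conservation property is easy to check numerically. The sketch below is an illustration rather than the paper's code: it builds the Metropolis‑Hastings matrix for the uniform target on a small path graph (for uniform π the bias term above vanishes and Q reduces to P), verifies that the resulting matrix is symmetric and hence doubly stochastic, and iterates the averaging update.

```python
import numpy as np

def metropolis_matrix(adj):
    """Metropolis-Hastings transition matrix targeting the uniform
    distribution on an undirected graph (0/1 adjacency matrix adj).
    P[i, j] = min(1/deg(i), 1/deg(j)) on each edge; the diagonal
    absorbs the leftover mass so every row sums to 1."""
    n = adj.shape[0]
    deg = adj.sum(axis=1)
    P = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if adj[i, j]:
                P[i, j] = min(1.0 / deg[i], 1.0 / deg[j])
        P[i, i] = 1.0 - P[i].sum()
    return P

# 5-node path graph: 0 - 1 - 2 - 3 - 4
adj = np.zeros((5, 5), dtype=int)
for i in range(4):
    adj[i, i + 1] = adj[i + 1, i] = 1

P = metropolis_matrix(adj)
assert np.allclose(P, P.T)          # symmetric => doubly stochastic

# Iterating x <- P x conserves the mean while every entry
# is driven toward the average (here 10 / 5 = 2.0).
x = np.array([10.0, 0.0, 0.0, 0.0, 0.0])
for _ in range(500):
    x = P @ x
print(x.mean(), x)
```

The conservation of the average is exactly the doubly stochastic property (column sums equal to 1); any modification of the off-diagonal entries, such as the non-reversible bias above, must preserve it for the algorithm to compute the correct value.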
The authors prove that with an appropriate choice of α, the mixing time τ_mix(Q) satisfies τ_mix = O(D·log Δ), where D is the graph diameter and Δ the maximum degree. For graphs of bounded or slowly growing degree—particularly those with bounded doubling dimension—this reduces to τ_mix = O(D), which matches the lower bound dictated by the diameter. In other words, the algorithm converges in a number of iterations proportional to the longest shortest‑path distance in the network, an optimal rate that cannot be improved without altering the graph itself.
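To see why the diameter is the right yardstick, one can measure how many iterations the plain reversible Metropolis chain needs on paths of growing length. This sketch is illustrative and not from the paper; it exhibits the diffusive Θ(D²) behavior of the reversible baseline, which is exactly the gap the O(D) pseudo‑lifted chain closes.

```python
import numpy as np

def mh_path(n):
    """Metropolis-Hastings chain on an n-node path, uniform target."""
    P = np.zeros((n, n))
    deg = np.array([1] + [2] * (n - 2) + [1])
    for i in range(n - 1):
        p = min(1.0 / deg[i], 1.0 / deg[i + 1])
        P[i, i + 1] = P[i + 1, i] = p
    P += np.diag(1.0 - P.sum(axis=1))   # rows sum to 1
    return P

def iters_to_converge(n, eps=1e-3):
    """Iterations until every entry is within eps of the average."""
    P = mh_path(n)
    x = np.zeros(n)
    x[0] = n          # all mass at one end; the true average is 1.0
    t = 0
    while np.abs(x - 1.0).max() > eps:
        x = P @ x
        t += 1
    return t

t16, t32 = iters_to_converge(16), iters_to_converge(32)
print(t16, t32, t32 / t16)   # doubling the diameter roughly quadruples the time
```

Doubling the path length (and hence the diameter) roughly quadruples the iteration count for the reversible chain, whereas a chain with τ_mix = O(D) would only double it.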
A rigorous spectral analysis underpins these results. By examining the non‑symmetric part of Q, the authors show that the second largest singular value contracts faster than in any reversible chain on the same graph. They also provide a constructive bound on α that guarantees positivity of all transition probabilities, ensuring the method is implementable on any connected graph.
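The contraction factor discussed here can be inspected numerically. The sketch below is an illustration rather than the authors' analysis: it computes the singular values of a small symmetric Metropolis chain on a path, where the largest singular value is 1 and the second‑largest sits close to 1 — the slow reversible behavior that the non‑reversible construction is shown to beat.

```python
import numpy as np

# Metropolis chain on a 20-node path: interior nodes move left/right
# with probability 1/2 each; the endpoints hold with probability 1/2.
n = 20
P = np.zeros((n, n))
for i in range(n - 1):
    P[i, i + 1] = P[i + 1, i] = 0.5
P += np.diag(1.0 - P.sum(axis=1))   # lazy endpoints keep rows stochastic

# For a symmetric chain the singular values are the absolute eigenvalues;
# sigma_2 governs how fast the deviation from the average contracts.
sigma = np.linalg.svd(P, compute_uv=False)   # sorted in decreasing order
sigma2 = sigma[1]
print(sigma2)   # close to 1 => slow reversible mixing
```

A per-step contraction of σ₂ means roughly 1/(1 − σ₂) iterations per factor-of-e error reduction, so pushing σ₂ (or its non-symmetric analogue) away from 1 is what buys the speed-up.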
Empirical validation is performed on two representative families: (1) two‑dimensional grid graphs, where the diameter grows as √n, and (2) random geometric graphs, where the diameter scales as O(log n). The pseudo‑lifting algorithm is compared against standard Metropolis averaging and the Push‑Sum protocol (a known non‑reversible method). Results demonstrate a 2–3× speed‑up in mean‑square error decay, with the advantage becoming more pronounced as the diameter increases. Moreover, because pseudo‑lifting introduces only a constant number of auxiliary states per node, the communication overhead remains comparable to reversible methods, unlike full lifting which can be prohibitive.
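The reversible baseline in such an experiment can be set up as follows. This sketch only reproduces the standard Metropolis averaging step on a k×k grid and tracks the mean‑square error decay; the Push‑Sum and pseudo‑lifted competitors from the paper's comparison are not reproduced here.

```python
import numpy as np

def grid_metropolis(k):
    """Metropolis averaging matrix for a k x k grid (uniform target)."""
    n = k * k
    P = np.zeros((n, n))
    def deg(r, c):
        return sum(1 for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1))
                   if 0 <= r + dr < k and 0 <= c + dc < k)
    for r in range(k):
        for c in range(k):
            i = r * k + c
            for dr, dc in ((1, 0), (0, 1)):   # right and down neighbors
                rr, cc = r + dr, c + dc
                if rr < k and cc < k:
                    j = rr * k + cc
                    p = min(1.0 / deg(r, c), 1.0 / deg(rr, cc))
                    P[i, j] = P[j, i] = p
    P += np.diag(1.0 - P.sum(axis=1))
    return P

rng = np.random.default_rng(0)
k = 10
P = grid_metropolis(k)
x = rng.normal(size=k * k)   # random initial values at the nodes
avg = x.mean()               # preserved exactly by the doubly stochastic P
mse = []
for _ in range(200):
    x = P @ x
    mse.append(np.mean((x - avg) ** 2))
print(mse[0], mse[-1])       # geometric decay of the mean-square error
```

Plotting `mse` on a log scale gives the straight-line decay curves against which a faster non-reversible scheme would be compared.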
Beyond averaging, the paper argues that the fastest‑mixing non‑reversible chain it constructs is of independent interest for distributed optimization (e.g., accelerated consensus‑based gradient methods), distributed estimation (e.g., Kalman filtering over networks), and control (e.g., coordination of multi‑robot systems). The authors outline several future directions: adaptive tuning of α in time‑varying networks, extensions to directed or weighted graphs, and integration with robust averaging schemes that tolerate outliers.
In summary, the work presents a theoretically optimal and practically efficient solution to distributed averaging by marrying non‑reversible Markov chain design with a lightweight pseudo‑lifting transformation. It achieves mixing times that scale linearly with the network diameter, reduces convergence time dramatically compared with existing reversible and non‑reversible methods, and opens a pathway for faster consensus in a broad range of networked systems.