Gossip Algorithms for Convex Consensus Optimization over Networks

Reading time: 6 minutes

📝 Original Info

  • Title: Gossip Algorithms for Convex Consensus Optimization over Networks
  • ArXiv ID: 1002.2283
  • Date: 2010-02-11
  • Authors: Jie Lu, Choon Yik Tang, Paul R. Regier, Travis D. Bow

📝 Abstract

In many applications, nodes in a network desire not only a consensus, but an optimal one. To date, a family of subgradient algorithms have been proposed to solve this problem under general convexity assumptions. This paper shows that, for the scalar case and by assuming a bit more, novel non-gradient-based algorithms with appealing features can be constructed. Specifically, we develop Pairwise Equalizing (PE) and Pairwise Bisectioning (PB), two gossip algorithms that solve unconstrained, separable, convex consensus optimization problems over undirected networks with time-varying topologies, where each local function is strictly convex, continuously differentiable, and has a minimizer. We show that PE and PB are easy to implement, bypass limitations of the subgradient algorithms, and produce switched, nonlinear, networked dynamical systems that admit a common Lyapunov function and asymptotically converge. Moreover, PE generalizes the well-known Pairwise Averaging and Randomized Gossip Algorithm, while PB relaxes a requirement of PE, allowing nodes to never share their local functions.


📄 Full Content

Consider an N-node multi-hop network, where each node i observes a convex function f_i, and all N nodes wish to determine an optimal consensus x*, which minimizes the sum of the f_i's:

    x* = arg min_{x ∈ ℝ} Σ_{i=1}^{N} f_i(x).    (1)

Since each node i knows only its own f_i, the nodes cannot individually compute the optimal consensus x* and, thus, must collaborate to do so. This problem of achieving unconstrained, separable, convex consensus optimization has many applications in multi-agent systems and wired/wireless/social networks, some examples of which can be found in [1, 2].
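To make the problem concrete, here is a minimal sketch, under the assumption (not from the paper) that each local function is a quadratic f_i(x) = a_i/2 · (x - c_i)², so the optimal consensus has a closed form obtained by setting the sum of derivatives to zero:

```python
# Hypothetical illustration: for quadratic local functions
#   f_i(x) = a_i/2 * (x - c_i)^2,
# the optimal consensus x* minimizing sum_i f_i(x) satisfies
#   sum_i a_i * (x* - c_i) = 0,  i.e.  x* = sum_i a_i*c_i / sum_i a_i.

def optimal_consensus(a, c):
    """Minimizer of sum_i a_i/2 * (x - c_i)^2 over scalar x."""
    assert len(a) == len(c) and all(ai > 0 for ai in a)
    return sum(ai * ci for ai, ci in zip(a, c)) / sum(a)

a = [1.0, 2.0, 3.0]   # curvatures (illustrative values)
c = [0.0, 1.0, 4.0]   # each node's local minimizer
x_star = optimal_consensus(a, c)   # (0 + 2 + 12) / 6 = 7/3
```

Of course, this closed form requires global knowledge of all the a_i and c_i; the point of the distributed algorithms below is to reach x* when each node knows only its own f_i.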

The current literature offers a large body of work on distributed consensus (see [3] for a survey), including a line of research that focuses on solving problem (1) for an optimal consensus x* [1, 2, 4-17]. This line of work has resulted in a family of discrete-time subgradient algorithms, including the incremental subgradient algorithms [1, 2, 4-8, 10, 15], whereby an estimate of x* is passed around the network, and the non-incremental ones [9, 11-14, 16, 17], whereby each node maintains an estimate of x* and updates it iteratively by exchanging information with neighbors.

Although the aforementioned subgradient algorithms are capable of solving problem (1) under fairly weak assumptions, they suffer from one or more of the following limitations:

L1. Stepsizes: The algorithms require selection of stepsizes, which may be constant, diminishing, or dynamic. In general, constant stepsizes ensure only convergence to neighborhoods of x*, rather than to x* itself. Moreover, they present an inevitable trade-off: larger stepsizes tend to yield larger convergence neighborhoods, while smaller ones tend to yield slower convergence.

In contrast, diminishing stepsizes typically ensure asymptotic convergence. However, the convergence may be very slow, since the stepsizes may diminish too quickly. Finally, dynamic stepsizes allow shaping of the convergence behavior [4,6]. Unfortunately, their dynamics depend on global information that is often costly to obtain. Hence, selecting appropriate stepsizes is not a trivial task, and inappropriate choices can cause poor performance.
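The stepsize trade-off is easy to observe on a small example. The following sketch (illustrative values, not from the paper) runs the subgradient method on the nonsmooth objective F(x) = Σ_i |x - c_i|, whose minimizer is the median of the c_i, contrasting a constant stepsize with a diminishing one α_k = 1/(k+1):

```python
# Subgradient method on F(x) = sum_i |x - c_i| (minimized at the median).
# A constant stepsize settles into oscillation around x* (convergence only
# to a neighborhood); a diminishing stepsize converges asymptotically,
# though slowly. All parameter values below are illustrative assumptions.

def subgrad(x, c):
    # A subgradient of sum_i |x - c_i| at x: sum of signs of (x - c_i).
    return sum((x > ci) - (x < ci) for ci in c)

def subgradient_method(c, stepsize, iters, x0=10.3):
    x = x0
    for k in range(iters):
        x -= stepsize(k) * subgrad(x, c)
    return x

c = [0.0, 1.0, 5.0]                                       # x* = median = 1.0
x_const = subgradient_method(c, lambda k: 0.5, 2000)      # oscillates near x*
x_dimin = subgradient_method(c, lambda k: 1.0 / (k + 1), 2000)  # approaches x*
```

Running this, the constant-stepsize iterate ends up bouncing between points roughly a half-stepsize away from x* = 1, while the diminishing-stepsize iterate lands much closer, consistent with the discussion above.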

L2. Hamiltonian cycle: Many incremental subgradient algorithms [1, 2, 4-7, 10, 15] require the nodes to construct and maintain a Hamiltonian cycle (i.e., a closed path that visits every node exactly once) or a pseudo one (i.e., that allows multiple visits), which may be very difficult to carry out, especially in a decentralized, leaderless fashion.

L3. Multi-hop transmissions: Some incremental subgradient algorithms [4-6] require the node holding the latest estimate of x* to pass it on to a node chosen uniformly at random from the network. This implies that every node must be aware of all the nodes in the network, and that the algorithms must run alongside a routing protocol that enables such passing, which may not always be available. The fact that the chosen node is typically multiple hops away also implies that these algorithms are communication inefficient, requiring many transmissions (up to the network diameter) just to complete a single iteration.

L4. Lack of asymptotic convergence: A variety of convergence properties have been established for the subgradient algorithms in [1, 2, 4-17], including error bounds, convergence in expectation, convergence in limit inferior, convergence rates, etc. In contrast, relatively few asymptotic convergence results have been reported, except for the subgradient algorithms with diminishing or dynamic stepsizes in [4-6, 10, 15-17].

Limitations L1-L4 of the subgradient algorithms raise the question of whether it is possible to devise algorithms that require neither the notion of a stepsize, nor the construction of a (pseudo-)Hamiltonian cycle, nor the use of a routing protocol for multi-hop transmissions, and yet guarantee asymptotic convergence, thereby bypassing L1-L4. In this paper, we show that, for the one-dimensional case and under a few mild assumptions, such algorithms can be constructed. Specifically, instead of letting the network be directed, we assume that it is undirected, with a possibly time-varying topology unknown to any of the nodes. In addition, instead of letting each f_i in (1) be convex but not necessarily differentiable, we assume that it is strictly convex, continuously differentiable, and has a minimizer. Based on these assumptions, we develop two gossip-style, distributed asynchronous iterative algorithms, referred to as Pairwise Equalizing (PE) and Pairwise Bisectioning (PB), which not only solve problem (1) and circumvent limitations L1-L4, but also are rather easy to implement, although computationally they are more demanding than the subgradient algorithms.
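Since the full text is truncated here, the following is only a plausible sketch of a pairwise-equalizing-style gossip update, based on the properties stated above: PE reduces to Pairwise Averaging for equal quadratics, and a conserved quantity (taken here to be the sum of local derivatives, an assumption) forces any consensus value to satisfy Σ_i f_i'(x) = 0, i.e. x = x*:

```python
import random

# Hedged sketch of a pairwise-equalizing-style gossip iteration (the PE
# details are assumptions, not taken from the truncated text). Each node i
# holds a scalar x_i, initialized at the minimizer c_i of its local
# f_i(x) = a_i/2 * (x - c_i)^2, so f_i'(x_i(0)) = 0. When nodes i and j
# gossip, they move to the common value z conserving f_i'(x_i) + f_j'(x_j):
#   a_i*(z - c_i) + a_j*(z - c_j) = a_i*(x_i - c_i) + a_j*(x_j - c_j),
# which gives z = (a_i*x_i + a_j*x_j) / (a_i + a_j). For equal a_i this is
# exactly Pairwise Averaging; at consensus, sum_i f_i'(x) = 0, so x = x*.

def pe_gossip(a, c, edges, iters, seed=0):
    rng = random.Random(seed)
    x = list(c)                       # x_i(0) = argmin f_i = c_i
    for _ in range(iters):
        i, j = rng.choice(edges)      # one asynchronous gossip exchange
        z = (a[i] * x[i] + a[j] * x[j]) / (a[i] + a[j])
        x[i] = x[j] = z
    return x

a = [1.0, 2.0, 3.0]
c = [0.0, 1.0, 4.0]                   # x* = (0 + 2 + 12) / 6 = 7/3
edges = [(0, 1), (1, 2)]              # a connected path graph
x = pe_gossip(a, c, edges, 5000)      # all x_i approach x*
```

Note how no stepsize, Hamiltonian cycle, or multi-hop routing appears: each iteration is a single exchange across one existing link, matching the gossip setting described above.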

As will be shown in the paper, PE and PB exhibit a number of notable features. First, they produce switched, nonlinear, networked dynamical systems whose state evolves along an invari

…(Full text truncated)…


Reference

This content is AI-processed based on ArXiv data.
