The paper studies decentralized optimization over networks, where agents minimize a sum of {\it locally} smooth (strongly) convex losses plus a nonsmooth convex extended-value term. We propose decentralized methods wherein agents {\it adaptively} adjust their stepsizes via local backtracking procedures coupled with lightweight min-consensus protocols. Our design stems from a three-operator splitting factorization applied to an equivalent reformulation of the problem. The reformulation is endowed with a new BCV preconditioning metric (Bertsekas-O'Connor-Vandenberghe), which enables an efficient decentralized implementation and local stepsize adjustments. We establish robust convergence guarantees: under mere convexity, the proposed methods converge at a sublinear rate; under strong convexity of the sum-function, and assuming the nonsmooth component is partly smooth, we further prove linear convergence. Numerical experiments corroborate the theory and highlight the effectiveness of the proposed adaptive stepsize strategy.
1. Introduction. We study decentralized optimization problems of the form
\[
(\mathrm{P})\qquad \min_{x\in\mathbb{R}^{d}}\ \sum_{i=1}^{m}\big(f_i(x)+r_i(x)\big),
\]
where $f_i:\mathbb{R}^{d}\to\mathbb{R}$ is the loss of agent $i\in[m]:=\{1,\dots,m\}$, assumed to be (strongly) convex and locally smooth, and $r_i:\mathbb{R}^{d}\to\mathbb{R}\cup\{+\infty\}$ is a convex, nonsmooth, proper extended-value function. Both $f_i$ and $r_i$ are private functions, known only to agent $i$. Agents are embedded in a communication network, modeled as a fixed, undirected, connected graph $\mathcal{G}$, with no central server. Problem (P) arises in several applications of interest, including signal processing, machine learning, multi-agent systems, and communications. The literature abounds with decentralized methods for (P) under the standing assumptions that the $f_i$'s are globally smooth and the $r_i$'s are identical, $r_i = r$ for all $i$; we refer to the tutorials [27,37] (and references therein) and the monograph [33] for comprehensive reviews. Global smoothness aside, these methods impose conservative stepsize bounds that depend on parameters such as the global Lipschitz constants of the agents' gradients, the spectral gap of the graph gossip matrix, and other network topological properties. Such information is generally unavailable locally to the agents in real-world deployments. As a result, stepsizes are frequently selected via manual tuning, yielding performance that is unpredictable, problem-dependent, and difficult to reproduce. Moreover, these approaches may fail when the agents' losses are only locally smooth; see Sec. 1.1 for further discussion. This paper addresses these limitations by proposing adaptive-stepsize decentralized algorithms that solve (P) under local smoothness and private nonsmooth functions $r_i$, without requiring global problem/network information.
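The developments above rest on an equivalent reformulation of (P) amenable to decentralized computation. For intuition only, a standard way to lift (P) onto the network is the consensus reformulation below; this is an illustrative sketch of such a lifting, not necessarily the precise reformulation adopted in this paper:
\[
\min_{x_1,\dots,x_m\in\mathbb{R}^{d}}\ \sum_{i=1}^{m}\big(f_i(x_i)+r_i(x_i)\big)
\quad\text{s.t.}\quad x_i=x_j\ \ \forall\,(i,j)\in\mathcal{E},
\]
where each agent $i$ holds a local copy $x_i$ and $\mathcal{E}$ is the edge set of $\mathcal{G}$; since $\mathcal{G}$ is connected, the edge constraints are equivalent to full consensus $x_1=\cdots=x_m$.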
1.1. Related works. 1. Adaptive centralized methods: There has been a growing interest in developing adaptive stepsize methods in centralized optimization. Rep-

and a backtracking test on the local agents' losses that, via a descent inequality for a properly chosen Lyapunov function, certifies global convergence. To coordinate stepsizes, we propose two implementations based on global and local min-consensus protocols, respectively; they are complementary, targeting different networking settings. Remarkably, neither implementation requires knowledge of global optimization constants or global network parameters.
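To convey the flavor of stepsize coordination by min-consensus, the following is a minimal sketch (not the paper's algorithm): each agent starts from a locally backtracked stepsize candidate and repeatedly replaces it with the minimum over its neighborhood; on a connected graph, after at least as many rounds as the graph diameter every agent holds the network-wide minimum. All names (min_consensus, neighbors, gamma_local) are illustrative.
\begin{verbatim}
# Illustrative sketch (not the paper's method): stepsize coordination by min-consensus.
# Each agent i holds a locally backtracked candidate gamma_local[i]; after enough
# gossip rounds on a connected graph, every agent holds min_i gamma_local[i].

def min_consensus(gamma_local, neighbors, num_rounds):
    """gamma_local: dict agent -> candidate stepsize;
    neighbors:   dict agent -> list of adjacent agents (undirected, connected graph);
    num_rounds:  e.g. the graph diameter, or any upper bound on it."""
    gamma = dict(gamma_local)
    for _ in range(num_rounds):
        # One synchronous gossip round: exchange with neighbors, keep the minimum.
        gamma = {i: min([gamma[i]] + [gamma[j] for j in neighbors[i]]) for i in gamma}
    return gamma

# Toy usage on a path graph 0-1-2 (diameter 2):
neighbors = {0: [1], 1: [0, 2], 2: [1]}
print(min_consensus({0: 0.5, 1: 0.1, 2: 0.9}, neighbors, num_rounds=2))
# -> {0: 0.1, 1: 0.1, 2: 0.1}
\end{verbatim}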
Convergence guarantees: We provide a comprehensive convergence analysis for both proposed algorithms. We prove that the agents' iterates reach consensus and converge to a solution of (P). Under mere convexity of the $f_i$'s, we establish a sublinear rate of order $O(1/k)$ for a suitable optimality gap.
Linear convergence rate with strongly convex $f_i$ and partly smooth $r_i$: When each $f_i$ is locally strongly convex and each $r_i$ is partly smooth relative to a $C^2$ embedded manifold (Def. 5.1) around the limit point of the algorithms, we prove linear convergence. Our results are twofold: (i) if the aforementioned active manifold is affine, we establish finite-time manifold identification followed by a global linear rate, together with an explicit iteration-complexity bound; (ii) if the active manifold is a general $C^2$ manifold, we obtain asymptotic linear convergence.
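As a point of reference for case (i), not taken from the original text: the $\ell_1$ norm is the canonical example of a partly smooth function with an affine active manifold. Around a point $x^\star$, it is partly smooth relative to
\[
\mathcal{M}=\{x\in\mathbb{R}^{d}: x_j=0 \ \text{for all } j\notin \operatorname{supp}(x^\star)\},
\]
a linear (hence affine) subspace; finite-time manifold identification then amounts to the iterates attaining the correct sparsity pattern after finitely many iterations.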
An adaptive three-operator splitting: As a by-product of our design, we also obtain an adaptive three-operator splitting algorithm (a backtracking variant of Davis-Yin splitting) for composite optimization of the form (P), with locally smooth $f$, implementable in centralized or federated (master/client) architectures. This new method inherits the same convergence guarantees as its decentralized counterpart discussed above (global convergence under convexity and linear convergence under strong convexity plus partial smoothness), while remaining parameter-free (no knowledge of any optimization parameter is required). We believe this scheme is of independent interest as an adaptive splitting primitive for large-scale composite optimization.
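For concreteness, the following is a minimal sketch of the vanilla Davis-Yin three-operator splitting for $\min_x f(x)+g(x)+h(x)$ ($h$ smooth), augmented with a standard proximal-gradient-style backtracking test on $h$; it is meant only to convey the structure of a backtracking splitting scheme and is not the adaptive rule developed in this paper. The helpers prox_f, prox_g, h, and grad_h are assumed to be supplied by the user.
\begin{verbatim}
import numpy as np

def adaptive_davis_yin(prox_f, prox_g, h, grad_h, z0, gamma=1.0, shrink=0.5,
                       max_iter=500):
    """Sketch of Davis-Yin splitting for min_x f(x)+g(x)+h(x), h smooth,
    with a proximal-gradient-style backtracking test on h.
    Illustrative only; not the adaptive rule proposed in the paper."""
    z = np.asarray(z0, dtype=float)
    for _ in range(max_iter):
        while True:
            x_g = prox_g(z, gamma)                          # backward step on g
            x_f = prox_f(2 * x_g - z - gamma * grad_h(x_g), gamma)
            d = x_f - x_g
            # Accept gamma if the quadratic upper model of h holds locally.
            if h(x_f) <= h(x_g) + grad_h(x_g) @ d + (0.5 / gamma) * (d @ d) + 1e-12:
                break
            gamma *= shrink                                 # backtrack: shrink stepsize
        z = z + d                                           # Davis-Yin update (relaxation 1)
    return x_g

# Toy usage: min_x 0.5*||x - b||^2 (as h) + lam*||x||_1 (as g), with f = 0.
b, lam = np.array([3.0, -0.2, 0.0]), 1.0
sol = adaptive_davis_yin(
    prox_f=lambda v, g: v,                                  # f = 0 -> prox is identity
    prox_g=lambda v, g: np.sign(v) * np.maximum(np.abs(v) - g * lam, 0.0),  # soft-threshold
    h=lambda x: 0.5 * np.sum((x - b) ** 2),
    grad_h=lambda x: x - b,
    z0=np.zeros(3))
print(sol)   # approx [2.0, 0.0, 0.0]
\end{verbatim}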
In addition to the above guarantees, numerical experiments show that the proposed adaptive methods significantly outperform existing decentralized algorithms applicable to (P), which rely on non-adaptive (conservative) stepsize choices.
A preliminary version of this work appeared in [7]. The present paper substantially extends [7] by providing: (i) more principled and less restrictive adaptive stepsize rules based on a new merit function and an inexact descent analysis, leading to faster algorithms; (ii) a more comprehensive convergence theory, including iterate convergence and linear rates under local strong convexity of the f i ’s and partial smoothness of the r i ’s; and (iii) complete proofs and expanded experiments.
A related preprint [38] appeared during the preparation of this manuscript, after [7]. It develops an adaptive decentralized algorithm for a variant of (P) that includes conic constraints; an accelerated sublinear convergence rate is established for convex losses and convex constraints.