Adaptive Decentralized Composite Optimization via Three-Operator Splitting
The paper studies decentralized optimization over networks, where agents minimize a sum of {\it locally} smooth (strongly) convex losses and plus a nonsmooth convex extended value term. We propose decentralized methods wherein agents {\it adaptively} adjust their stepsize via local backtracking procedures coupled with lightweight min-consensus protocols. Our design stems from a three-operator splitting factorization applied to an equivalent reformulation of the problem. The reformulation is endowed with a new BCV preconditioning metric (Bertsekas-O’Connor-Vandenberghe), which enables efficient decentralized implementation and local stepsize adjustments. We establish robust convergence guarantees. Under mere convexity, the proposed methods converge with a sublinear rate. Under strong convexity of the sum-function, and assuming the nonsmooth component is partly smooth, we further prove linear convergence. Numerical experiments corroborate the theory and highlight the effectiveness of the proposed adaptive stepsize strategy.
💡 Research Summary
The paper addresses decentralized optimization over a network of agents that collectively minimize a global objective consisting of a sum of locally smooth (possibly strongly) convex loss functions and a nonsmooth convex extended‑value term. Traditional decentralized methods either treat the smooth part and consensus separately or handle the nonsmooth component via a proximal step, but they often require a globally fixed stepsize based on worst‑case Lipschitz constants, which is inefficient when local smoothness varies across agents.
To overcome these limitations, the authors first reformulate the problem by introducing local copies of the decision variable for each agent and a consensus constraint that forces all local copies to agree with a global variable. This yields three operators: (i) the smooth local losses, (ii) the nonsmooth global regularizer, and (iii) the consensus operator encoded by the network Laplacian. They then apply a three‑operator splitting scheme, which processes each operator in a separate sub‑step within every iteration.
A central technical contribution is the introduction of a new preconditioning metric, called the BCV (Bertsekas‑O’Connor‑Vandenberghe) metric. The metric combines each agent’s local smoothness constant with the network Laplacian to form a diagonal‑plus‑Laplacian preconditioner. Under this metric, the forward (gradient) step for each agent can be taken with a locally adapted stepsize, eliminating the need for a conservative global stepsize.
The adaptive stepsize is obtained through a fully decentralized backtracking line‑search. Each agent proposes a stepsize τ_i, performs a tentative gradient update, and checks a sufficient‑decrease condition on its own loss. If the condition fails, τ_i is reduced; otherwise it may be modestly increased for the next iteration. To guarantee network‑wide stability, the agents run a lightweight min‑consensus protocol after each backtracking phase, sharing the smallest τ among neighbors. This ensures that no agent uses a stepsize larger than what any neighbor can safely accommodate, while keeping communication overhead minimal.
Convergence analysis proceeds in two regimes. Under mere convexity, the authors construct a Lyapunov function that captures the objective gap, the consensus error, and the distance induced by the BCV metric. Using the non‑expansiveness of the three‑operator splitting and the descent guaranteed by backtracking, they show that the Lyapunov function decreases at an O(1/k) rate, implying sublinear convergence of the iterates to an optimal solution.
When the aggregate smooth loss is μ‑strongly convex and the nonsmooth term is partly smooth (i.e., its active set remains unchanged near the solution), the analysis is sharpened. The strong convexity yields a contraction property for the smooth forward step, while the partly smooth structure allows the proximal step to behave like a linear operator locally. Combining these with the uniform scaling provided by the BCV metric, the authors prove a linear convergence rate: the Lyapunov function decays geometrically with factor (1‑ρ), where ρ depends on μ, the local Lipschitz constants, the preconditioning parameter α, and the spectral gap of the network Laplacian.
Numerical experiments on random Erdős‑Rényi graphs and a real sensor‑network topology compare the proposed adaptive three‑operator method against decentralized ADMM, decentralized proximal gradient, and a fixed‑stepsize three‑operator baseline. Results demonstrate that the adaptive stepsize dramatically accelerates early progress (often 2–3× faster) and achieves higher final accuracy (error ≈10⁻⁴). Moreover, the min‑consensus step needs only a few rounds per iteration to maintain stability, highlighting the method’s suitability for communication‑constrained environments.
In summary, the paper presents a novel decentralized optimization framework that integrates three‑operator splitting with a BCV preconditioning metric and a fully distributed adaptive stepsize mechanism. The approach resolves the longstanding issue of stepsize selection in heterogeneous networks, accommodates complex nonsmooth regularizers, and offers rigorous sublinear and linear convergence guarantees, all validated by extensive simulations.
Comments & Academic Discussion
Loading comments...
Leave a Comment