Decentralized Optimization with Mixed Affine Constraints
This paper considers decentralized optimization of convex functions with mixed affine equality constraints involving both local and global variables. Constraints on global variables may vary across different nodes in the network, while local variables are subject to coupled and node-specific constraints. Such problem formulations arise in machine learning applications, including federated learning and multi-task learning, as well as in resource allocation and distributed control. We analyze this problem under smooth and non-smooth assumptions, considering both strongly convex and general convex objective functions. Our main contribution is an optimal algorithm for the smooth, strongly convex regime, whose convergence rate matches established lower complexity bounds. We further provide near-optimal methods for the remaining cases.
💡 Research Summary
This paper studies decentralized convex optimization problems in which each agent holds a local variable x_i and a shared global variable \tilde x. The objective is the sum of local convex functions f_i(x_i,\tilde x). Three families of affine equality constraints are considered: (1) coupled constraints ∑_i(A_i x_i−b_i)=0 that link the local variables across agents, (2) local constraints C_i x_i=c_i that each agent can enforce independently, and (3) shared‑variable constraints \tilde C_i \tilde x=\tilde c_i where the matrix \tilde C_i differs per agent. This “mixed‑constraint” formulation captures a wide range of machine‑learning scenarios such as horizontal and vertical federated learning, distributed multi‑task learning, and resource‑allocation problems in control systems.
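To make the three constraint families concrete, here is a minimal sketch of a toy instance. All matrices and vectors below are invented purely for illustration (the paper's formulation is general); the sketch just checks whether a candidate point satisfies each family of constraints.

```python
import numpy as np

n = 2                                                # number of agents
x = [np.array([1.0, 1.0]), np.array([2.0, -1.0])]    # local variables x_i
x_tilde = np.array([0.5])                            # shared global variable

# (1) Coupled constraints: sum_i (A_i x_i - b_i) = 0, linking local variables
A = [np.array([[1.0, 1.0]]), np.array([[1.0, 0.0]])]
b = [np.array([2.0]), np.array([2.0])]
coupled_residual = sum(A[i] @ x[i] - b[i] for i in range(n))

# (2) Local constraints: C_i x_i = c_i, enforceable by each agent on its own
C = [np.array([[1.0, -1.0]]), np.array([[0.0, 1.0]])]
c = [np.array([0.0]), np.array([-1.0])]
local_residuals = [C[i] @ x[i] - c[i] for i in range(n)]

# (3) Shared-variable constraints: Ctilde_i x_tilde = ctilde_i, varying per agent
Ct = [np.array([[2.0]]), np.array([[4.0]])]
ct = [np.array([1.0]), np.array([2.0])]
shared_residuals = [Ct[i] @ x_tilde - ct[i] for i in range(n)]

feasible = (np.allclose(coupled_residual, 0)
            and all(np.allclose(r, 0) for r in local_residuals)
            and all(np.allclose(r, 0) for r in shared_residuals))
print(feasible)
```

The candidate point above was chosen to satisfy all three families, so the feasibility check passes.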
The authors first review existing decentralized methods for pure consensus (only shared variable) and pure coupled‑constraint settings, noting that reductions of consensus to coupled constraints incur at least a √n communication penalty. They then propose new algorithms that directly handle the mixed‑constraint structure.
For the smooth, strongly convex case (μ>0, L‑smooth), they adapt the Accelerated Proximal Alternating Predictor‑Corrector (APAPC) method. By stacking all constraint matrices into block‑diagonal forms and defining mixed condition numbers κ_A, κ_{C^⊤}, κ_{AC}, they derive a unified condition number that also incorporates the spectral gap of the communication graph (λ_min^+ of the Laplacian). With appropriately tuned step sizes, the algorithm converges in O(√κ_f log(1/ε)) iterations, matching the known lower bound for decentralized optimization with affine constraints. This rate is optimal even when the constraints are a mixture of coupled, local, and shared‑variable types.
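A small sketch of the stacking step described above: per-agent constraint matrices are assembled into one block-diagonal matrix, and a constraint condition number is read off from its positive singular values. The specific matrices and the squared-singular-value definition of κ_A here are illustrative assumptions, not the paper's exact construction.

```python
import numpy as np

# Per-agent constraint matrices A_i (invented for illustration)
A_blocks = [np.array([[1.0, 2.0]]), np.array([[3.0, 0.0], [0.0, 1.0]])]

def block_diagonal(blocks):
    """Stack matrices into a single block-diagonal matrix."""
    rows = sum(B.shape[0] for B in blocks)
    cols = sum(B.shape[1] for B in blocks)
    out = np.zeros((rows, cols))
    r = c = 0
    for B in blocks:
        out[r:r + B.shape[0], c:c + B.shape[1]] = B
        r += B.shape[0]
        c += B.shape[1]
    return out

A = block_diagonal(A_blocks)
s = np.linalg.svd(A, compute_uv=False)
s_plus = s[s > 1e-12]                        # positive singular values only
kappa_A = (s_plus.max() / s_plus.min()) ** 2  # one common condition-number form
print(kappa_A)
```

For block-diagonal stacking the singular values are just the union of the blocks' singular values, which is why the per-agent structure enters the unified condition number directly.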
When the objective is merely convex (μ=0) or nonsmooth, the paper employs a penalty reformulation H_r(u)=G(u)+(r/2)‖Bu−b‖^2 and applies a Gradient Sliding scheme. The sliding technique separates the smooth quadratic penalty from the possibly nonsmooth original function, yielding a complexity of O((κ_B L R^2/ε) log(1/ε)) for convex nonsmooth problems and a comparable bound for the non‑strongly‑convex smooth case. These results are near‑optimal up to logarithmic factors.
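The penalty reformulation itself can be illustrated with plain proximal gradient on H_r, using G(u)=‖u‖_1 as a stand-in nonsmooth term. This is a deliberately simplified sketch: the paper's Gradient Sliding scheme additionally skips gradient evaluations of the quadratic penalty, which this toy loop does not attempt. B, b, and r below are invented values.

```python
import numpy as np

B = np.array([[1.0, 0.0], [1.0, 1.0]])
b = np.array([1.0, 2.0])
r = 10.0   # penalty weight in H_r(u) = G(u) + (r/2)||Bu - b||^2

# Step size from the Lipschitz constant of the penalty gradient r*B^T(Bu - b)
L_pen = r * np.linalg.norm(B.T @ B, 2)
eta = 1.0 / L_pen

def soft_threshold(v, t):
    """Proximal operator of t*||.||_1 (handles the nonsmooth term G)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

u = np.zeros(2)
for _ in range(2000):
    grad = r * B.T @ (B @ u - b)              # gradient of the smooth penalty
    u = soft_threshold(u - eta * grad, eta)   # prox step for G = ||.||_1

print(np.round(u, 3))
```

The split in the update mirrors the structure the sliding technique exploits: the quadratic penalty is handled by gradient steps, while the nonsmooth G enters only through its proximal operator.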
A significant contribution is the quantitative analysis of how the structure of the affine constraints interacts with the network topology. The authors show that the mixed condition numbers multiply the inverse spectral gap of the communication graph, explaining why naïve reductions can dramatically increase communication cost. They also provide the first lower‑bound proof for mixed‑constraint problems, establishing that no first‑order method can beat the presented rates in the smooth strongly‑convex regime.
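The role of λ_min^+ can be seen numerically: a poorly connected topology has a much smaller positive spectral gap than a well-connected one, so any condition number multiplied by its inverse grows accordingly. The sketch below compares a path graph with a complete graph on the same node set (a standard fact, not taken from the paper's experiments).

```python
import numpy as np

def laplacian(edges, n):
    """Unweighted graph Laplacian L = D - W."""
    L = np.zeros((n, n))
    for i, j in edges:
        L[i, i] += 1; L[j, j] += 1
        L[i, j] -= 1; L[j, i] -= 1
    return L

def lambda_min_plus(L):
    """Smallest positive eigenvalue (spectral gap) of a Laplacian."""
    eig = np.linalg.eigvalsh(L)
    return eig[eig > 1e-9].min()

n = 6
path = [(i, i + 1) for i in range(n - 1)]
complete = [(i, j) for i in range(n) for j in range(i + 1, n)]

gap_path = lambda_min_plus(laplacian(path, n))
gap_complete = lambda_min_plus(laplacian(complete, n))
print(gap_path, gap_complete)   # the path graph has the far smaller gap
```

Since the rates scale with 1/λ_min^+, the same mixed-constraint problem is intrinsically more expensive to solve over a path than over a complete graph.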
Experimental evaluations on horizontal federated learning, vertical federated learning with a replicated top model, and a distributed power‑allocation benchmark confirm the theoretical predictions. The proposed methods consistently require fewer communication rounds and achieve faster objective reduction than algorithms that treat the problem as pure consensus or pure coupled‑constraint optimization.
In summary, the paper provides a comprehensive theoretical framework and practical algorithms for decentralized optimization with mixed affine constraints, delivering optimal or near‑optimal convergence guarantees across smooth, nonsmooth, strongly convex, and merely convex settings, and clarifying the fundamental role of constraint‑graph interaction in determining algorithmic efficiency.