Distributed Reconstruction of Nonlinear Networks: An ADMM Approach
In this paper, we present a distributed algorithm for the reconstruction of large-scale nonlinear networks. In particular, we focus on the identification, from time-series data, of the nonlinear functional forms and associated parameters of large-scale nonlinear networks. Recently, a nonlinear network reconstruction problem was formulated as a nonconvex optimisation problem based on the combination of a marginal likelihood maximisation procedure with sparsity-inducing priors. Using a convex-concave procedure (CCCP), an iterative reweighted lasso algorithm was derived to solve the initial nonconvex optimisation problem. By exploiting the structure of the objective function of this reweighted lasso algorithm, a distributed algorithm can be designed. To this end, we apply the alternating direction method of multipliers (ADMM) to decompose the original problem into several subproblems. To illustrate the effectiveness of the proposed methods, we use our approach to identify a network of interconnected Kuramoto oscillators with different network sizes (500 to 100,000 nodes).
💡 Research Summary
This paper introduces a scalable, distributed algorithm for reconstructing large‑scale nonlinear networks from time‑series measurements. The authors begin by modeling a general discrete‑time nonlinear dynamical system as
x(t + 1) = F(x(t), u(t)) + ξ(t),
where F is assumed to be a linear combination of a pre‑selected dictionary of nonlinear basis functions. For each state variable x_i, the dynamics can be written as
x_i(t + 1) = f_i(x(t), u(t))ᵀ w_i + ξ_i(t),
which, after stacking all measurements, yields a linear regression form y = A w + ξ. The matrix A contains evaluations of the candidate basis functions over the observed trajectory; its column dimension can be extremely large (hundreds of thousands) when dealing with networks of 10⁵ nodes.
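For concreteness, here is one way the regression pair (A, y) might be assembled from an observed trajectory. The particular dictionary below (sines, cosines, monomials) and all variable names are illustrative choices, not the paper's exact setup:

```python
import numpy as np

# Hypothetical dictionary of candidate basis functions (illustrative choice).
dictionary = [np.sin, np.cos, lambda x: x, lambda x: x**2]

def build_regression(X, i):
    """Build y = A w + xi for state variable i from a trajectory X of shape (T, n).

    Each column of A is one candidate function evaluated on one state variable
    over the first T-1 samples; y holds the next-step measurements of x_i.
    """
    T, n = X.shape
    cols = [f(X[:-1, j]) for f in dictionary for j in range(n)]
    A = np.column_stack(cols)   # shape (T-1, len(dictionary) * n)
    y = X[1:, i]                # next-step values of x_i
    return A, y

X = np.random.randn(50, 3)      # toy trajectory: 50 samples, 3 state variables
A, y = build_regression(X, 0)
```

With 10⁵ nodes and even a modest dictionary, the column count of A reaches the hundreds of thousands mentioned above, which is what motivates the distributed treatment later on.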
Directly solving for the sparsest w (i.e., the ℓ₀ solution) is NP‑hard, so the usual approach is to relax the problem with an ℓ₁ penalty (Lasso). However, Lasso’s theoretical guarantees rely on restrictive properties of A (e.g., Restricted Isometry Property or incoherence) that rarely hold for real network data, leading to sub‑optimal reconstructions.
To overcome this, the authors adopt a Bayesian perspective and impose sparsity‑inducing priors on w. Each coefficient w_j receives a hierarchical prior of the form
P(w_j) = max_{γ_j>0} N(w_j | 0, γ_j) φ(γ_j),
where γ_j is a hyper‑parameter and φ(γ_j) acts as a potential function. By choosing φ appropriately (e.g., Student‑t prior), the prior becomes heavy‑tailed, encouraging sparsity while allowing for large coefficients. The resulting posterior is Gaussian, enabling analytic expressions for the marginal likelihood (evidence). Maximizing this evidence with respect to γ leads to the cost
L(γ) = log|σ²I + AΓAᵀ| + yᵀ(σ²I + AΓAᵀ)⁻¹y,
which is non‑convex in γ: the log‑determinant term is concave in γ, while the data‑fit term yᵀ(σ²I + AΓAᵀ)⁻¹y is convex in γ.
The paper rewrites the non‑convex cost as a difference of convex functions and minimizes it with the convex‑concave procedure (CCCP). Defining
u(w,γ) = ‖Aw − y‖² + σ²∑_j w_j²/γ_j,
v(γ) = −log|σ²I + AΓAᵀ|,
the objective becomes u − v. At iteration k, the gradient α_k = −∇_γ v(γ_k) is computed, and the following convex sub‑problem is solved:
min_{w,γ≥0} ‖Aw − y‖² + σ²∑_j w_j²/γ_j + α_kᵀγ.
Fixing w yields a closed‑form update for γ (γ_j = σ|w_j|/√α_{k,j}). Substituting this back gives a weighted‑ℓ₁ (re‑weighted Lasso) problem in w with per‑coefficient weights 2σ√α_{k,j}. This alternating scheme is precisely the iterative re‑weighted Lasso algorithm, with an additional thresholding step to prune negligible coefficients.
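A minimal sketch of one such iteration, assuming A, y, the noise level σ, and the positive gradient entries α_k are given (all names illustrative). Minimizing σ²w_j²/γ_j + α_jγ_j over γ_j > 0 gives γ_j = σ|w_j|/√α_j, and substituting back leaves a weighted Lasso in w, solved here by plain proximal gradient (ISTA) rather than by whatever solver the authors use:

```python
import numpy as np

def soft_threshold(v, t):
    # Proximal operator of t * ||.||_1: elementwise shrinkage toward zero.
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def reweighted_lasso_step(A, y, alpha, sigma, n_iter=500):
    """One CCCP iteration: solve min_w ||Aw - y||^2 + sum_j 2*sigma*sqrt(alpha_j)*|w_j|
    by ISTA, then recover gamma_j = sigma*|w_j|/sqrt(alpha_j) in closed form."""
    weights = 2.0 * sigma * np.sqrt(alpha)   # per-coefficient l1 weights
    L = np.linalg.norm(A, 2) ** 2            # grad of ||Aw-y||^2 is 2*L-Lipschitz
    step = 1.0 / (2.0 * L)
    w = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = 2.0 * A.T @ (A @ w - y)
        w = soft_threshold(w - step * grad, step * weights)
    gamma = sigma * np.abs(w) / np.sqrt(alpha)   # closed-form hyper-parameter update
    return w, gamma
```

In the full algorithm, α would be recomputed from the new γ at each outer iteration, and coefficients with negligible γ_j would be pruned.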
The main contribution lies in embedding this re‑weighted Lasso within the Alternating Direction Method of Multipliers (ADMM) to obtain a truly distributed solver. ADMM splits the objective into two parts: a local quadratic term f(w) = ‖Aw − y‖² + σ²∑ w_j²/γ_j handled independently by each node (using only its subset of the dictionary), and a global ℓ₁ regularizer g(z) = λ‖z‖₁ enforced on a consensus variable z. The ADMM iterations consist of:
- Local w‑update – each node solves a small quadratic problem involving its local rows of A and the current estimate of γ. This step is embarrassingly parallel and requires only local memory.
- Global z‑update – a soft‑thresholding operation applied to the averaged w across all nodes, implementing the re‑weighted ℓ₁ penalty.
- Dual variable update – standard ADMM multiplier update to enforce consensus.
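The three updates can be sketched on a single machine as follows; in the distributed version each node would carry out the w‑update on its own block of A. The fixed penalty ρ and all names here are illustrative assumptions, not the paper's settings:

```python
import numpy as np

def admm_lasso(A, y, gamma, lam, sigma=1.0, rho=1.0, n_iter=200):
    """ADMM for min_w ||Aw - y||^2 + sigma^2 * sum_j w_j^2/gamma_j + lam*||z||_1
    subject to the consensus constraint w = z."""
    n = A.shape[1]
    D = np.diag(sigma**2 / gamma)                        # quadratic prior term
    M = np.linalg.inv(2*A.T @ A + 2*D + rho*np.eye(n))   # cached w-update factor
    z = np.zeros(n)
    u = np.zeros(n)
    for _ in range(n_iter):
        # 1) local w-update: minimize f(w) + (rho/2)*||w - z + u||^2 (closed form)
        w = M @ (2*A.T @ y + rho*(z - u))
        # 2) global z-update: soft-thresholding, the prox of lam*||.||_1
        z = np.sign(w + u) * np.maximum(np.abs(w + u) - lam/rho, 0.0)
        # 3) dual update: push w and z toward consensus
        u = u + w - z
    return z
```

In the distributed setting, step 1 is done per node with local rows of A, step 2 averages the local w's before thresholding, and only the low-dimensional consensus variables are communicated.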
Because the heavy lifting (matrix‑vector products) is confined to local sub‑problems, the algorithm dramatically reduces memory footprints and enables parallel execution on clusters or cloud platforms. Convergence follows from standard ADMM theory under mild assumptions; the authors report empirical convergence within a few dozen iterations even for networks with up to 100 000 nodes.
To validate the approach, the authors reconstruct networks of coupled Kuramoto oscillators, a canonical testbed for synchronization phenomena. The dynamics are
θ̇_i = ω_i + (K/N)∑_j sin(θ_j − θ_i).
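A toy generator for such data might be a simple Euler discretization of these dynamics; the step size, parameters, and seeding below are illustrative assumptions, not the paper's experimental settings:

```python
import numpy as np

def simulate_kuramoto(N=10, K=2.0, T=200, dt=0.01, seed=0):
    """Euler-discretized Kuramoto network:
    theta_i(t+dt) = theta_i(t) + dt * (omega_i + (K/N) * sum_j sin(theta_j - theta_i))."""
    rng = np.random.default_rng(seed)
    omega = rng.normal(0.0, 1.0, N)           # natural frequencies
    theta = rng.uniform(0.0, 2*np.pi, N)      # random initial phases
    traj = np.empty((T, N))
    for t in range(T):
        traj[t] = theta
        # entry [i, j] of the broadcast difference is theta_j - theta_i
        coupling = (K/N) * np.sin(theta[None, :] - theta[:, None]).sum(axis=1)
        theta = theta + dt * (omega + coupling)
    return traj
```

The resulting trajectory plays the role of X in the regression construction above, with the phase differences entering through the sine terms of the dictionary.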
Candidate basis functions include sin(·), cos(·), and low‑order polynomials of the phases. Synthetic data are generated for various coupling strengths, network densities, and noise levels. The distributed re‑weighted Lasso is compared against a centralized version of the same algorithm. Results show:
- Scalability – memory usage drops by >90 % when moving from centralized to distributed implementation, enabling reconstruction of 100 k‑node networks on a modest workstation.
- Accuracy – mean‑squared error of the estimated coupling matrix remains within 1–2 % of the centralized benchmark across all tested scenarios.
- Robustness – performance degrades gracefully with increasing measurement noise; the Bayesian prior mitigates over‑fitting even when the number of samples is smaller than the number of candidate functions.
The paper concludes with a discussion of limitations and future work. The current framework assumes a pre‑specified dictionary; automatic selection or adaptive enrichment of basis functions would broaden applicability. Extending the method to handle non‑Gaussian noise (e.g., heavy‑tailed disturbances) and exploring asynchronous or stochastic ADMM variants could further improve communication efficiency in distributed settings. Nonetheless, the presented combination of Bayesian sparsity, CCCP‑derived re‑weighting, and ADMM‑based distribution offers a powerful tool for large‑scale nonlinear network identification.