Learning to Control: The iUzawa-Net for Nonsmooth Optimal Control of Linear PDEs
We propose an optimization-informed deep neural network approach, named iUzawa-Net, aiming for the first solver that enables real-time solutions for a class of nonsmooth optimal control problems of linear partial differential equations (PDEs). The iUzawa-Net unrolls an inexact Uzawa method for saddle point problems, replacing classical preconditioners and PDE solvers with specifically designed learnable neural networks. We prove universal approximation properties and establish the asymptotic $\varepsilon$-optimality for the iUzawa-Net, and validate its promising numerical efficiency through nonsmooth elliptic and parabolic optimal control problems. Our techniques offer a versatile framework for designing and analyzing various optimization-informed deep learning approaches to optimal control and other PDE-constrained optimization problems. The proposed learning-to-control approach synergizes model-based optimization algorithms and data-driven deep learning techniques, inheriting the merits of both methodologies.
💡 Research Summary
The paper introduces iUzawa‑Net, a novel optimization‑informed deep neural network designed to solve nonsmooth optimal control problems constrained by linear partial differential equations (PDEs) in real time. The authors start from the classical inexact Uzawa method, a saddle‑point algorithm for the primal‑dual formulation of the control problem, and “unroll” its iterative scheme into a multilayer neural architecture. Each layer corresponds to one Uzawa iteration and contains learnable surrogate operators that replace the expensive components of the original algorithm: the PDE solution operator S, its adjoint S*, and the preconditioners Q_A and Q_S.
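The unrolling idea can be sketched in a few lines of NumPy. This is a hedged illustration only, not the authors' architecture: the layer-wise operators are stand-in random linear maps where the real network would use trained neural operators, and the `soft_threshold` proximal step, step size `tau`, and sizes are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 16          # discretization size (illustrative)
L = 4           # number of unrolled layers / Uzawa iterations

# Stand-ins for the learned operators of each layer: S (state solve),
# A (adjoint solve), Q (preconditioner). In the actual iUzawa-Net these
# are trained networks; here they are fixed random linear maps.
layers = [
    {"S": rng.standard_normal((n, n)) / n,
     "A": rng.standard_normal((n, n)) / n,
     "Q": np.eye(n)}
    for _ in range(L)
]

def soft_threshold(v, lam):
    """Proximal map of lam*||.||_1 -- an illustrative surrogate for the
    resolvent-type primal step when the nonsmooth term is an L1 penalty."""
    return np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)

def iuzawa_net_forward(y_d, f, tau=0.1, lam=0.05):
    """One forward pass: each layer mimics one inexact Uzawa iteration."""
    u = np.zeros(n)       # primal iterate (control)
    p = np.zeros(n)       # dual iterate
    for layer in layers:
        y = layer["S"] @ u + f                          # surrogate state solve
        grad = layer["A"] @ (y - y_d)                   # surrogate adjoint solve
        u = soft_threshold(u - tau * (grad + p), lam)   # primal prox step
        p = p + tau * (layer["Q"] @ u)                  # preconditioned dual update
    return u

u_L = iuzawa_net_forward(y_d=rng.standard_normal(n), f=rng.standard_normal(n))
print(u_L.shape)  # (16,)
```

The point of the sketch is structural: the loop has a fixed depth `L`, each pass costs only a handful of operator applications, and training would adjust the per-layer operators rather than the update pattern.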
Key design choices include:
- Parameterizing Q_A as N + τ I (τ ≥ 0) and approximating the resolvent (N + τ I + ∂θ)⁻¹ with a trainable network Q_k^A.
- Approximating S and S* with operator‑learning networks (e.g., Fourier Neural Operators or DeepONets) denoted S_k and A_k, which map functions in U to Y and vice versa without relying on a fixed mesh.
- Learning a symmetric positive‑definite surrogate Q_k^S for the term Q⁻¹S that appears in the dual update.
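For intuition on the resolvent that Q_k^A approximates: when the nonsmooth term combines an L¹ penalty with box constraints (the setting of the elliptic experiments) and N + τI is simplified to a multiple of the identity, the resolvent reduces to soft‑thresholding followed by projection onto the box. The following is a minimal sketch under that simplifying assumption, not the operator the network actually learns:

```python
import numpy as np

def prox_l1_box(v, lam, lo, hi):
    """Resolvent of lam*||.||_1 plus the box indicator on [lo, hi],
    assuming a diagonal (identity-like) N + tau*I and 0 in [lo, hi]:
    soft-threshold, then project onto the box."""
    shrunk = np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)  # soft-threshold
    return np.clip(shrunk, lo, hi)                          # box projection

v = np.array([-2.0, -0.3, 0.1, 0.8, 3.0])
print(prox_l1_box(v, lam=0.5, lo=-1.0, hi=1.0))
# → [-1.   0.   0.   0.3  1. ]
```

In the general case N is a nondiagonal operator, the resolvent has no closed form, and this is precisely why a trainable surrogate Q_k^A is attractive.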
The resulting mapping T(y_d, f; θ_T) = u_L, where u_L is the output after L layers, serves as a surrogate for the parameter‑to‑optimal‑control operator. The authors prove two theoretical results: (i) a universal approximation theorem showing that, with sufficiently expressive sub‑networks, iUzawa‑Net can approximate any continuous mapping from (y_d, f) to the optimal control u* arbitrarily well; (ii) an asymptotic ε‑optimality theorem establishing that, after training, the network’s output lies within an ε‑neighbourhood of the true optimal solution, mirroring the convergence guarantees of the original Uzawa method despite the use of learned approximations.
Complexity analysis highlights that traditional FEM/FDM‑based solvers require solving large linear systems at every iteration, leading to O(N³) computational cost in the number N of unknowns, which grows rapidly under mesh refinement. In contrast, iUzawa‑Net performs a single forward pass after offline training, with cost proportional to the depth L and the size of the learned operators, enabling millisecond‑scale inference.
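The gap described above can be made concrete with rough flop counts. The model below assumes dense linear algebra (real PDE solvers exploit sparsity, so the absolute numbers are pessimistic); iteration count and depth are illustrative:

```python
def classical_cost(n_unknowns, iters):
    """Dense direct solve per iteration: ~n^3 flops each (illustrative model)."""
    return iters * n_unknowns**3

def unrolled_cost(n_unknowns, depth):
    """Fixed-depth forward pass: ~depth matrix-vector products, ~n^2 flops each."""
    return depth * n_unknowns**2

n = 256 * 256   # unknowns on a 256x256 mesh
print(classical_cost(n, iters=100) / unrolled_cost(n, depth=10))
# → 655360.0
```

Even under this crude model, the fixed-depth forward pass wins by orders of magnitude, and the advantage widens as the mesh is refined.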
Numerical experiments cover two benchmark families: (1) nonsmooth elliptic control problems featuring L¹ regularization and box constraints, and (2) nonsmooth parabolic control problems with time‑dependent dynamics. Compared against semismooth Newton, ADMM, primal‑dual, and the classical inexact Uzawa method, iUzawa‑Net achieves 10–30× speed‑ups while maintaining relative errors below 1 % across mesh sizes ranging from 64² to 256². Moreover, the network generalizes well to unseen parameter pairs (y_d, f) and to finer discretizations, indicating robustness to changes in the underlying function spaces.
The authors acknowledge limitations: the current framework assumes linear PDEs and a specific quadratic control cost operator N; extending to nonlinear PDEs, more complex boundary conditions, or other regularization terms would require redesigning the surrogate operators and possibly new loss functions. Training data generation remains expensive, suggesting future work on adaptive sampling, adversarial data augmentation, or transfer learning to reduce offline costs.
In conclusion, iUzawa‑Net demonstrates that algorithm unrolling can be successfully lifted from finite‑dimensional optimization to infinite‑dimensional PDE‑constrained control problems. By learning both the PDE solver and the preconditioners, the method retains the interpretability and convergence guarantees of classical optimization while achieving real‑time performance and mesh‑independent generalization. This work opens a promising pathway for integrating model‑based algorithms with deep learning in large‑scale scientific computing and control applications.