The Riemannian Landing Method: From projected gradient flows to SQP
Landing methods have recently emerged in Riemannian matrix optimization as efficient schemes for handling nonlinear equality constraints without resorting to costly retractions. These methods decompose the search direction into tangent and normal components, enabling asymptotic feasibility while maintaining inexpensive updates. In this work, we provide a unifying geometric framework which reveals that, under suitable choices of Riemannian metric, the landing algorithm encompasses several classical optimization methods such as projected and null-space gradient flows, Sequential Quadratic Programming (SQP), and a certain form of the augmented Lagrangian method. In particular, we show that a quadratically convergent landing method essentially reproduces the quadratically convergent SQP method. These connections also allow us to propose a globally convergent landing method using adaptive step sizes. The backtracking line search satisfies an Armijo condition on a merit function, and does not require prior knowledge of Lipschitz constants. Our second key contribution is to analyze landing methods through a geometric parameterization of the metric in terms of fields of oblique projectors and associated metric restrictions. This viewpoint disentangles the roles of orthogonality, tangent and normal metrics, and elucidates how to design the metric to obtain explicit tangent and normal updates. For matrix optimization, this framework not only recovers recent constructions in the literature for problems with orthogonality constraints, but also provides systematic guidelines for designing new metrics that admit closed-form search directions.
💡 Research Summary
The paper presents a comprehensive geometric framework that unifies a family of “landing” algorithms for equality‑constrained optimization under a single Riemannian formulation. The authors start from the observation that a landing step can be written as a sum of a tangent component d_T and a normal component d_N. The tangent part is the Riemannian constrained gradient of the objective f on the level set M_x = {y : c(y) = c(x)}; the normal part is the unconstrained Riemannian gradient of the infeasibility measure ψ(x) = ½‖c(x)‖². Crucially, both components are generated by a single user‑defined Riemannian metric g on the ambient space, whereas prior work typically used a metric only for d_T and a Euclidean gradient for d_N.
The key technical device is the parameterization of g through a field of oblique projectors P_x onto the tangent space and the associated restrictions of g to the tangent and normal subspaces, denoted g_T and g_N. Theorem 4.3 shows that, given P_x, g_T, g_N, the updates can be written explicitly as
d_T = −P_x ∇_g f, d_N = −(I−P_x) ∇_g ψ.
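The tangent/normal split above can be made concrete with a small numerical sketch. The constraint (a sphere, c(x) = ‖x‖² − 1), the Euclidean metric, and the orthogonal projector used here are illustrative assumptions, not the paper's general setting:

```python
import numpy as np

# Illustrative sketch: sphere constraint c(x) = ||x||^2 - 1, Euclidean
# metric, orthogonal projector P_x onto the tangent space of the level set.
def landing_direction(x, grad_f):
    c = x @ x - 1.0                         # constraint value c(x)
    Dc = 2.0 * x                            # gradient of c
    n = Dc / np.linalg.norm(Dc)             # unit normal to the level set M_x
    P = np.eye(len(x)) - np.outer(n, n)     # orthogonal projector P_x
    grad_psi = c * Dc                       # gradient of psi = 0.5 * c^2
    d_T = -P @ grad_f                       # tangent component
    d_N = -(np.eye(len(x)) - P) @ grad_psi  # normal component
    return d_T, d_N

x = np.array([1.2, 0.3, -0.4])
grad_f = np.array([0.5, -1.0, 0.2])
d_T, d_N = landing_direction(x, grad_f)
# d_T lies in the tangent space of M_x; d_N points toward the feasible set
```

Here d_T leaves the constraint value unchanged to first order, while following d_N decreases the infeasibility ψ, matching the decomposition in the display above.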
This representation makes the connections to several classical schemes transparent:
- Projected gradient flows – With the Euclidean metric and an orthogonal projector, d_T reduces to the standard projected gradient (1.4) and d_N coincides with the pseudoinverse correction used by Yamashita (1980).
- Null‑space flows – Choosing P_x as the null‑space projector yields a tangent step that solves a least‑squares linearized constraint and a normal step that directly reduces constraint violation, reproducing the Null‑Space Optimizer of Feppon et al.
- Sequential Quadratic Programming (SQP) – By setting g_T to the Hessian of the Lagrangian with respect to the primal variables and g_N to the Hessian with respect to the multipliers, the tangent step becomes a Riemannian Newton direction while the normal step exactly matches the multiplier update of SQP. Theorem 5.6 proves that, with unit step size, the landing iteration is mathematically identical to a classic SQP iteration and therefore inherits its quadratic convergence.
- Augmented Lagrangian methods – A suitable choice of the metric on the normal subspace makes d_N equivalent to the standard multiplier update of an augmented Lagrangian scheme.
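The null‑space choice in the second bullet is easy to illustrate. The sketch below takes P_x = I − Dc⁺Dc for a vector‑valued constraint, so the tangent step lies in the null space of the constraint Jacobian; the normal step −Dc⁺c is the Gauss–Newton feasibility correction, which is what −(I − P_x)∇_g ψ reduces to under a suitably weighted normal metric (an assumption of this sketch):

```python
import numpy as np

# Hedged sketch of the null-space choice P_x = I - pinv(Dc) @ Dc.
def null_space_step(c, Dc, grad_f):
    Dc_pinv = np.linalg.pinv(Dc)
    P = np.eye(Dc.shape[1]) - Dc_pinv @ Dc  # projector onto null(Dc)
    d_T = -P @ grad_f                       # Dc @ d_T = 0 by construction
    d_N = -Dc_pinv @ c                      # linearized feasibility restoration
    return d_T, d_N

rng = np.random.default_rng(0)
Dc = rng.standard_normal((2, 5))            # full-row-rank Jacobian (2 constraints)
c = rng.standard_normal(2)                  # current constraint residual
grad_f = rng.standard_normal(5)
d_T, d_N = null_space_step(c, Dc, grad_f)
# Dc @ d_T = 0, and Dc @ d_N = -c, i.e. the step cancels the linearized violation
```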
Beyond these equivalences, the authors develop a globalization strategy that does not require any Lipschitz constant. They introduce a merit function Φ(x)=f(x)+ρψ(x) and an Armijo‑type backtracking line search for the step size α_k. The resulting Algorithm 1 guarantees that any generated sequence converges to a point satisfying the first‑order KKT conditions, mirroring the well‑established global convergence theory for SQP (Nocedal & Wright, 2006). This is a significant improvement over earlier landing analyses, which relied on fixed step sizes smaller than an unknown bound.
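The globalization described above is a standard Armijo backtracking loop on the merit function Φ(x) = f(x) + ρψ(x); notably, it needs no Lipschitz constant. A minimal sketch (parameter names `beta`, `sigma` and the safeguard threshold are our assumptions):

```python
import numpy as np

# Hedged sketch of Armijo backtracking on a merit function Phi,
# as in the paper's globalization; no Lipschitz constant is needed.
def armijo_backtracking(phi, grad_phi, x, d, alpha0=1.0, beta=0.5, sigma=1e-4):
    """Shrink alpha until Phi(x + alpha*d) <= Phi(x) + sigma*alpha*<grad Phi(x), d>."""
    phi_x = phi(x)
    slope = grad_phi(x) @ d                  # directional derivative, assumed < 0
    alpha = alpha0
    while phi(x + alpha * d) > phi_x + sigma * alpha * slope:
        alpha *= beta                        # backtrack
        if alpha < 1e-12:                    # safeguard against a stalled search
            break
    return alpha

# Usage on a toy quadratic merit function Phi(x) = 0.5 * ||x||^2:
phi = lambda x: 0.5 * (x @ x)
grad_phi = lambda x: x
x = np.array([3.0, 4.0])
alpha = armijo_backtracking(phi, grad_phi, x, -grad_phi(x))
```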
The final section focuses on matrix optimization with orthogonality constraints (Stiefel manifolds). In the Euclidean metric, computing d_T requires solving a Sylvester equation involving Dc Dcᵀ, which is costly for large matrices. By explicitly constructing the projector P_x (e.g., the orthogonal projector I − XXᵀ or a pseudoinverse‑based projector) and choosing compatible g_T, g_N, the authors obtain closed‑form expressions for both d_T and d_N that involve only matrix multiplications and cheap linear solves. This yields an O(n p²) per‑iteration cost, making the method practical for large‑scale applications such as enforcing orthogonal weight matrices in deep neural networks or low‑rank adaptation in language model fine‑tuning.
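To make the cost claim tangible, here is a retraction‑free landing update on the Stiefel manifold St(n, p) in the style of Ablin–Peyré‑type relative gradients. This is an illustrative construction, not the paper's exact metric; the weight `omega` and the infeasibility measure ψ(X) = ¼‖XᵀX − I‖² are our assumptions. Every operation is a matrix product, giving the quoted O(n p²) cost:

```python
import numpy as np

# Illustrative Stiefel landing step (not the paper's exact metric):
# tangent part from the skew-symmetric relative gradient, normal part
# from the gradient of psi(X) = 0.25 * ||X^T X - I||^2.
def stiefel_landing_step(X, grad_f, omega=1.0):
    G = grad_f @ X.T
    tangent = 0.5 * (G - G.T) @ X                 # skew(grad_f X^T) X
    normal = X @ (X.T @ X - np.eye(X.shape[1]))   # gradient of psi
    return -(tangent + omega * normal)            # matrix products only: O(n p^2)

# Usage at a feasible point (orthonormal columns): the normal term vanishes
# and the step is tangent to the Stiefel manifold to first order.
rng = np.random.default_rng(1)
X, _ = np.linalg.qr(rng.standard_normal((6, 3)))
grad_f = rng.standard_normal((6, 3))
d = stiefel_landing_step(X, grad_f)
```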
In summary, the paper achieves four major contributions: (i) a unifying Riemannian landing framework that subsumes projected gradient, null‑space, SQP, and augmented Lagrangian methods; (ii) a systematic metric‑design methodology based on oblique projectors that clarifies the role of tangent and normal metrics; (iii) a globally convergent adaptive step‑size scheme without Lipschitz knowledge; and (iv) concrete metric constructions for orthogonality‑constrained matrix problems that lead to inexpensive, closed‑form updates. By showing that a quadratically convergent landing method essentially reproduces SQP, the work bridges the gap between modern retraction‑free algorithms and classical constrained optimization theory, opening new avenues for scalable, geometry‑aware optimization in machine learning and scientific computing.