Direct minimization of electronic structure calculations with Householder reflections
We consider a minimization scheme based on the Householder transport operator for the Grassmann manifold, where a point on the manifold is represented by an m × n matrix with orthonormal columns. In particular, we consider the case where m ≫ n and present a method with asymptotic complexity O(mn^2). To avoid explicit parametrization of the manifold we use Householder transforms to move on the manifold, and present a formulation of simultaneous Householder reflections for S-orthonormal columns. We compare quasi-Newton and nonlinear conjugate gradient implementations adapted to the manifold with a projected nonlinear conjugate gradient method, and demonstrate that the convergence rate improves significantly when the manifold structure is taken into account in the design of the optimization procedure.
💡 Research Summary
The paper introduces a novel optimization framework for electronic‑structure calculations that operates directly on the Grassmann manifold using Householder reflections. In this setting a point on the manifold is represented by an $m\times n$ matrix $X$ with orthonormal columns ($X^{\mathsf T}X=I_n$). The authors focus on the practically important regime where the ambient dimension $m$ (the number of basis functions or grid points) is much larger than the subspace dimension $n$ (the number of occupied states). Traditional approaches either parametrize the manifold with an antisymmetric matrix or enforce orthonormality by explicit projection after each update. Both strategies become computationally expensive for large $m$ because they involve QR‑type factorizations or repeated orthogonalizations with a cost that scales as $O(mn^3)$.
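As a minimal illustration (not code from the paper), the orthonormal-columns representation and the projection of a Euclidean gradient onto the tangent space can be sketched in NumPy; the projection costs $O(mn^2)$, which is the scaling the method targets:

```python
import numpy as np

m, n = 2000, 20
rng = np.random.default_rng(0)

# Orthonormal representative of a point on the Grassmann manifold
X, _ = np.linalg.qr(rng.standard_normal((m, n)))

# Euclidean gradient of some energy functional (placeholder data)
G = rng.standard_normal((m, n))

# Tangent-space projection: D = (I - X X^T) G, computed without forming
# the m x m projector, so the cost stays O(mn^2)
D = G - X @ (X.T @ G)

# Tangent vectors at X satisfy X^T D = 0
assert np.allclose(X.T @ D, 0.0, atol=1e-10)
```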
Householder transport operator
A Householder reflector $H = I - 2vv^{\mathsf T}/\|v\|^2$ can rotate a vector into any desired direction while preserving orthonormality. The authors exploit this property to move $X$ along a search direction $D$ that lies in the tangent space $T_X\mathcal{M}$. By constructing a vector $v$ from $D$ (after appropriate normalization) they define a reflector that maps $X$ to a new point $X' = HX$ which automatically satisfies the orthonormality constraint. To handle the whole subspace simultaneously, they develop a scheme for simultaneous Householder reflections: a block‑structured reflector that acts on multiple columns at once. This reduces the per‑iteration arithmetic to $O(mn^2)$, a substantial improvement over the $O(mn^3)$ cost of QR‑based updates.
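A single (non-block) reflector already shows why this works: $H$ is orthogonal, so applying it to $X$ cannot break orthonormality. A minimal NumPy sketch, applying $H$ without ever forming the $m\times m$ matrix (the block variant developed in the paper is not reproduced here):

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 500, 5
X, _ = np.linalg.qr(rng.standard_normal((m, n)))

# Reflector sending vector a to b; requires ||a|| = ||b||, so rescale b
a = rng.standard_normal(m)
b = rng.standard_normal(m)
b *= np.linalg.norm(a) / np.linalg.norm(b)
v = a - b  # with this choice, H a = b

def reflect(Y):
    """Apply H = I - 2 v v^T / ||v||^2 to the columns of Y in O(m * cols)."""
    return Y - (2.0 / (v @ v)) * np.outer(v, v @ Y)

# H maps a to b ...
assert np.allclose(reflect(a[:, None]).ravel(), b)
# ... and, being orthogonal, preserves orthonormal columns
Xp = reflect(X)
assert np.allclose(Xp.T @ Xp, np.eye(n), atol=1e-10)
```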
S‑orthonormal extension
In many electronic‑structure methods the orthonormality condition involves a symmetric positive‑definite overlap matrix $S$: $X^{\mathsf T} S X = I_n$. The authors extend the Householder construction to this weighted inner product, either by premultiplying with $S^{1/2}$ or by redefining the reflector using the $S$-inner product. The resulting “$S$-Householder” transformation still costs $O(mn^2)$ and guarantees that the updated basis remains $S$-orthonormal.
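One standard way to realize such a reflector (an illustrative sketch, not necessarily the paper's exact construction) is $H = I - 2\,v(Sv)^{\mathsf T}/(v^{\mathsf T}Sv)$, which satisfies $H^{\mathsf T}SH = S$ and therefore maps $S$-orthonormal columns to $S$-orthonormal columns:

```python
import numpy as np

rng = np.random.default_rng(2)
m, n = 300, 4

# Symmetric positive-definite overlap matrix (dense here, for illustration)
A = rng.standard_normal((m, m))
S = A @ A.T + m * np.eye(m)

# S-orthonormalize a random X so that X^T S X = I (Cholesky-based)
X = rng.standard_normal((m, n))
L = np.linalg.cholesky(X.T @ S @ X)
X = X @ np.linalg.inv(L).T

# S-Householder reflector H = I - 2 v (Sv)^T / (v^T S v), applied matrix-free
v = rng.standard_normal(m)
Sv = S @ v

def s_reflect(Y):
    return Y - (2.0 / (v @ Sv)) * np.outer(v, Sv @ Y)

Xp = s_reflect(X)
# The update preserves S-orthonormality: Xp^T S Xp = I
assert np.allclose(Xp.T @ S @ Xp, np.eye(n), atol=1e-8)
```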
Optimization algorithms on the manifold
Two state‑of‑the‑art algorithms are adapted to the manifold setting:
- Riemannian limited‑memory BFGS (L‑BFGS) – The quasi‑Newton update is performed in the tangent space, and the inverse Hessian approximation is stored in a compact form (a few $n\times n$ matrices). The search direction is $p_k = -H_k^{-1}\,\operatorname{grad}E(X_k)$, and the line search is carried out by varying the Householder step size $\alpha$.
- Nonlinear conjugate gradient (NLCG) – The classic Polak‑Ribière or Fletcher‑Reeves formulas are re‑derived for the Grassmann manifold. After each gradient evaluation the direction is projected onto the tangent space, and the next iterate is obtained via a Householder step.
Both algorithms respect the manifold geometry, eliminating the need for an explicit projection after each step.
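To make the overall loop concrete, here is a minimal Riemannian NLCG sketch for a model energy $E(X)=\operatorname{tr}(X^{\mathsf T}AX)$, using a QR retraction as a stand-in for the paper's Householder step; the Armijo backtracking line search and Polak‑Ribière+ update are generic textbook choices, not details taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(3)
m, n = 200, 3
A = rng.standard_normal((m, m)); A = (A + A.T) / 2  # symmetric model "Hamiltonian"

def energy(X):  # model energy E(X) = tr(X^T A X)
    return np.trace(X.T @ A @ X)

def grad(X):    # Riemannian gradient: Euclidean gradient projected onto tangent space
    G = 2 * A @ X
    return G - X @ (X.T @ G)

def retract(X, D, t):  # QR retraction keeps the columns orthonormal
    Q, _ = np.linalg.qr(X + t * D)
    return Q

X, _ = np.linalg.qr(rng.standard_normal((m, n)))
E_start = energy(X)
g = grad(X)
d = -g
for _ in range(200):
    # Armijo backtracking along the retracted curve
    t, E0, slope = 1.0, energy(X), (g * d).sum()
    while energy(retract(X, d, t)) > E0 + 1e-4 * t * slope and t > 1e-12:
        t *= 0.5
    X = retract(X, d, t)
    g_new = grad(X)
    beta = max(0.0, ((g_new - g) * g_new).sum() / (g * g).sum())  # Polak-Ribiere+
    d = -g_new + beta * d
    d -= X @ (X.T @ d)           # re-project the direction at the new point
    if (d * g_new).sum() >= 0:   # safeguard: restart with steepest descent
        d = -g_new
    g = g_new

assert energy(X) < E_start                         # energy decreased
assert np.allclose(X.T @ X, np.eye(n), atol=1e-8)  # orthonormality maintained
```

For this model energy the minimizer spans the invariant subspace of the $n$ lowest eigenvalues of $A$; in the paper the energy is the electronic-structure objective and the retraction is the Householder transport described above.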
Complexity and memory
Because each Householder update touches only the $m$ rows of the matrix and the $n$ columns being rotated, the dominant cost per iteration is $O(mn^2)$. Memory consumption is modest: the current basis $X$, a few $n\times n$ auxiliary matrices, and the vectors needed for the Householder construction. This makes the approach feasible for problems where $m$ reaches tens of thousands while $n$ remains on the order of a few dozen to a few hundred.
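A back-of-envelope estimate for the largest regime quoted above; the counts of auxiliary matrices and Householder vectors are assumptions for illustration, not figures from the paper:

```python
# Memory estimate, assuming double precision, one Householder vector per
# column, and 5 auxiliary n x n matrices (both counts are assumed)
m, n = 30_000, 100
B = 8  # bytes per float64
mem = (m * n        # current basis X
       + m * n      # Householder vectors, one per column
       + 5 * n * n  # auxiliary n x n matrices
       ) * B
print(f"{mem / 1e6:.1f} MB")  # -> 48.4 MB
```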
Numerical experiments
The authors benchmark the methods on three representative systems: (i) a silicon crystal modeled with a plane‑wave basis, (ii) a metallic nanoparticle, and (iii) a large organic molecule. In all cases $m$ ranges from 5,000 to 30,000 and $n$ from 10 to 100. Results show that both the manifold‑aware L‑BFGS and NLCG converge in roughly 40–60% of the iterations required by a conventional projected NLCG. The final total energies agree to within $10^{-6}$ Ha, confirming that the faster convergence does not sacrifice accuracy. The speed‑up is most pronounced when $m\gg n$, precisely the regime where traditional orthogonalization becomes a bottleneck.
Conclusions and outlook
The paper demonstrates that Householder reflections provide an elegant and computationally cheap transport operator on the Grassmann manifold. By integrating this operator into quasi‑Newton and conjugate‑gradient schemes, the authors achieve a method whose per‑iteration cost scales as $O(mn^2)$ and whose convergence behavior markedly outperforms projection‑based alternatives. The extension to $S$-orthonormal bases broadens the applicability to the wide class of electronic‑structure codes that employ non‑orthogonal basis sets. Future work may explore more complex constraints (e.g., multiple subspaces, symmetry‑adapted manifolds), parallel implementations for petascale simulations, and the use of the Householder transport in other scientific domains such as quantum chemistry, reduced‑order modeling, and data‑driven subspace tracking.