Λ-admissible subspaces of self-adjoint matrices
Given a self-adjoint matrix $A$ and an index $h$ such that $λ_h(A)$ lies in a cluster of eigenvalues of $A$, we introduce the novel class of $\Lambda$-admissible subspaces of $A$ of dimension $h$. First, we show that the low-rank approximation of the form $P_{\mathcal{T}} A P_{\mathcal{T}}$, for a subspace $\mathcal{T}$ that is close to any $\Lambda$-admissible subspace of $A$, enjoys near-optimal approximation properties. Then, we prove that some well-known iterative algorithms (such as the Subspace Iteration Method, or the Krylov subspace method) produce subspaces that become arbitrarily close to $\Lambda$-admissible subspaces. We obtain upper bounds for the distance between subspaces obtained by the Rayleigh-Ritz method applied to $A$ and the class of $\Lambda$-admissible subspaces. We also find upper bounds for the condition number of the (set-valued) map computing the class of $\Lambda$-admissible subspaces of $A$. Finally, we include numerical examples that show the advantage of considering this new class of subspaces in the clustered eigenvalue setting.
💡 Research Summary
The paper addresses a fundamental difficulty in low‑rank approximation of Hermitian (self‑adjoint) matrices when the target eigenvalue λ_h lies inside a cluster of multiple eigenvalues. In the classical setting, a clear eigen‑gap λ_h − λ_{h+1}>0 guarantees a uniquely defined dominant eigenspace X_h of dimension h, and the compression P_{X_h} A P_{X_h} provides an optimal rank‑h approximation. When λ_h=λ_{h+1}, the dominant eigenspace is no longer unique, and the usual convergence analysis of subspace‑iteration or Krylov methods breaks down because it relies on the inverse gap (λ_h − λ_{h+1})^{-1}.
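The classical gapped setting can be illustrated with a minimal NumPy sketch; the 8×8 matrix, its spectrum, and the index h = 3 below are invented for illustration and are not from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)
n, h = 8, 3

# Hypothetical spectrum with a clear eigen-gap lambda_h - lambda_{h+1} = 3.0 - 1.0 > 0.
eigvals = np.array([5.0, 4.0, 3.0, 1.0, 0.8, 0.5, 0.3, 0.1])
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
A = Q @ np.diag(eigvals) @ Q.T

# Dominant eigenspace X_h: span of eigenvectors of the h largest eigenvalues,
# uniquely defined because of the gap.
w, V = np.linalg.eigh(A)            # eigh returns ascending order
X = V[:, ::-1][:, :h]               # top-h eigenvectors
P = X @ X.T                         # orthogonal projector onto X_h

A_h = P @ A @ P                     # compression, rank <= h
err = np.linalg.norm(A - A_h, 2)    # spectral-norm error
# With a clear gap the error equals lambda_{h+1} (here 1.0), the optimal
# rank-h approximation error for this A.
print(err)
```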
To overcome this, the authors introduce the notion of Λ‑admissible subspaces. They select three indices 0 ≤ j < h < k ≤ rank(A) such that the eigen‑gaps λ_j > λ_{j+1} and λ_k > λ_{k+1} are non‑zero. The corresponding dominant eigenspaces X_j and X_k are therefore uniquely defined. An h‑dimensional subspace S is called Λ‑admissible if it satisfies the inclusion X_j ⊂ S ⊂ X_k. In other words, S contains all eigenvectors associated with the “upper” part of the cluster (up to index j) and none from the “lower” part (beyond index k), while the interior of the cluster (indices j+1,…,k) may be arbitrarily mixed inside S. This definition replaces the single dominant eigenspace by a class of admissible subspaces, which is well‑defined even when the eigenvalue at position h is repeated.
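The inclusion X_j ⊂ S ⊂ X_k can be checked directly with projectors. A hedged sketch, using an invented spectrum where the cluster {3.1, 3.0, 2.9, 2.8} contains λ_h and sits between two clear gaps (the indices j = 2, h = 4, k = 6 are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(1)
n, j, h, k = 8, 2, 4, 6

# Hypothetical spectrum: cluster between the gaps lambda_j > lambda_{j+1}
# (5.0 > 3.1) and lambda_k > lambda_{k+1} (2.8 > 1.0).
eigvals = np.array([6.0, 5.0, 3.1, 3.0, 2.9, 2.8, 1.0, 0.5])
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
A = Q @ np.diag(eigvals) @ Q.T

w, V = np.linalg.eigh(A)
V = V[:, ::-1]                      # columns ordered by decreasing eigenvalue
Xj, Xk = V[:, :j], V[:, :k]         # dominant eigenspaces, uniquely defined

# Build one Lambda-admissible S: all of X_j, plus h - j directions mixed
# arbitrarily inside the cluster part of X_k.
C = V[:, j:k] @ rng.standard_normal((k - j, h - j))
S, _ = np.linalg.qr(np.hstack([Xj, C]))

Pk, Ps = Xk @ Xk.T, S @ S.T
contains_Xj = np.allclose(Ps @ Xj, Xj)   # X_j subset of S
inside_Xk = np.allclose(Pk @ S, S)       # S subset of X_k
print(contains_Xj, inside_Xk)
```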
The first major result shows that any Λ‑admissible subspace yields a near‑optimal low‑rank approximation. For any unitarily invariant norm ‖·‖, the error ‖A − P_S A P_S‖ is within a constant factor of the optimal rank‑h approximation error, and the constant depends only on the cluster spread δ = λ_{j+1} − λ_k. Moreover, the first h Ritz values of the compressed matrix P_S A P_S differ from the true leading eigenvalues λ_1,…,λ_h by at most O(δ). Hence, the tighter the cluster, the more accurate the approximation.
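The Ritz-value claim can be sanity-checked numerically. In this sketch (same invented 8×8 spectrum and hypothetical indices j = 2, h = 4, k = 6 as above, not data from the paper) the cluster spread is δ = λ_{j+1} − λ_k = 0.3:

```python
import numpy as np

rng = np.random.default_rng(2)
n, j, h, k = 8, 2, 4, 6
eigvals = np.array([6.0, 5.0, 3.1, 3.0, 2.9, 2.8, 1.0, 0.5])
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
A = Q @ np.diag(eigvals) @ Q.T
delta = eigvals[j] - eigvals[k - 1]   # lambda_{j+1} - lambda_k = 0.3 (0-based)

w, V = np.linalg.eigh(A)
V = V[:, ::-1]
# One Lambda-admissible subspace: X_j plus random directions from the cluster.
C = V[:, j:k] @ rng.standard_normal((k - j, h - j))
S, _ = np.linalg.qr(np.hstack([V[:, :j], C]))

# Ritz values of the compression P_S A P_S = eigenvalues of S^T A S.
ritz = np.sort(np.linalg.eigvalsh(S.T @ A @ S))[::-1]
gap_to_true = np.max(np.abs(ritz - eigvals[:h]))
print(gap_to_true <= delta)   # prints True: Ritz values within the spread
```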
Next, the authors analyze how standard iterative algorithms approach the Λ‑admissible class. For a polynomial φ and an initial subspace W with dim W = r (h ≤ r < k), they consider the image φ(A)W. Under generic assumptions (φ of sufficiently high degree and W containing enough spectral information), φ(A)W contains an h‑dimensional subspace that can be made arbitrarily close to some Λ‑admissible subspace. This covers both the Subspace Iteration Method (where φ(x) = x^m) and Krylov subspace methods (where φ ranges over polynomials of bounded degree). Consequently, these algorithms, although originally designed to converge to a fixed dominant eigenspace, actually converge to the set Λ‑adm_h(A) when the eigenvalue at index h is clustered.
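A subspace-iteration sketch makes this concrete. Convergence to the class is governed by the two surrounding gaps, not the tiny intra-cluster gaps, so after a moderate number of power steps the iterate essentially contains X_j and lies inside X_k. The setup (spectrum, indices, iteration count, and the two closeness proxies) is my illustration, not the paper's exact experiment:

```python
import numpy as np

rng = np.random.default_rng(3)
n, j, h, k = 8, 2, 4, 6
eigvals = np.array([6.0, 5.0, 3.1, 3.0, 2.9, 2.8, 1.0, 0.5])
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
A = Q @ np.diag(eigvals) @ Q.T

V = np.linalg.eigh(A)[1][:, ::-1]
Xj, Xk = V[:, :j], V[:, :k]

# Subspace iteration with phi(x) = x^m, block size h, QR re-orthonormalization.
T = np.linalg.qr(rng.standard_normal((n, h)))[0]
for _ in range(60):                  # m = 60 power steps
    T = np.linalg.qr(A @ T)[0]

# Proxies for closeness to the Lambda-admissible class: T should
# (approximately) contain X_j and be contained in X_k.
miss_upper = np.linalg.norm(Xj - T @ (T.T @ Xj), 2)   # ||(I - P_T) X_j||
leak_lower = np.linalg.norm(T - Xk @ (Xk.T @ T), 2)   # ||(I - P_{X_k}) T||
print(miss_upper, leak_lower)
```

Both proxies decay at rates set by the outer gaps (roughly (λ_{j+1}/λ_j)^m and (λ_{k+1}/λ_k)^m), while convergence to a single dominant eigenspace would be throttled by the tiny gap λ_h − λ_{h+1} inside the cluster.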
The paper then turns to the Rayleigh‑Ritz framework. Given a trial subspace Q (dim Q = r, h ≤ r < k), the Ritz vectors span a subspace R. The authors derive an upper bound for the distance
d(Λ‑adm_h(A), R) = inf_{S∈Λ‑adm_h(A)} sin θ_max(S,R).
The bound involves the inverses of both eigen‑gaps (λ_j − λ_{j+1})^{-1} and (λ_k − λ_{k+1})^{-1}, the norm of A, and the residual ‖(I − P_Q)A‖. This result improves upon earlier work that could only exploit the upper gap λ_j − λ_{j+1}. It shows that the quality of the Ritz subspace depends simultaneously on how well Q captures the spectrum above the cluster and how sharply the cluster is separated from the lower spectrum.
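Since the infimum over the class is not directly computable, a natural upper bound on the distance comes from an explicit admissible candidate S* = X_j ⊕ (cluster component of R). The following sketch runs Rayleigh-Ritz on a small Krylov trial space; the matrix, indices, and the candidate construction are my own illustrative choices, not the paper's proof technique:

```python
import numpy as np

rng = np.random.default_rng(4)
n, j, h, k, r = 8, 2, 4, 6, 5
eigvals = np.array([6.0, 5.0, 3.1, 3.0, 2.9, 2.8, 1.0, 0.5])
Qm, _ = np.linalg.qr(rng.standard_normal((n, n)))
A = Qm @ np.diag(eigvals) @ Qm.T
V = np.linalg.eigh(A)[1][:, ::-1]
Xj, Vc, Xk = V[:, :j], V[:, j:k], V[:, :k]

# Trial subspace Q: Krylov space K_r(A, b), orthonormalized.
b = rng.standard_normal(n)
K = np.column_stack([np.linalg.matrix_power(A, i) @ b for i in range(r)])
Qb, _ = np.linalg.qr(K)

# Rayleigh-Ritz: eigenpairs of the r x r compression give the Ritz vectors.
theta, U = np.linalg.eigh(Qb.T @ A @ Qb)
R = Qb @ U[:, ::-1][:, :h]          # span of the h leading Ritz vectors

# Admissible candidate S* = X_j + (cluster part of R); sin of the largest
# principal angle between S* and R upper-bounds d(Lambda-adm_h(A), R).
Mc = Vc @ (Vc.T @ R)                # project R into the cluster eigenspace
W = np.linalg.svd(Mc, full_matrices=False)[0][:, :h - j]
Sstar = np.linalg.qr(np.hstack([Xj, W]))[0]
is_admissible = (np.allclose(Xk @ (Xk.T @ Sstar), Sstar)
                 and np.allclose(Sstar @ (Sstar.T @ Xj), Xj))
cosines = np.linalg.svd(Sstar.T @ R, compute_uv=False)
d_upper = np.sqrt(max(0.0, 1.0 - min(cosines) ** 2))
print(is_admissible, d_upper)
```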
A further contribution is a condition‑number analysis for the set‑valued map A ↦ Λ‑adm_h(A). The authors define a suitable metric on the Grassmannian and prove that the condition number is bounded by the product (λ_j − λ_{j+1})^{-1}(λ_k − λ_{k+1})^{-1}. This indicates that the problem is well‑conditioned precisely when both gaps are large, reinforcing the earlier requirement that the cluster be “sandwiched” between two significant gaps.
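The stability claim can be probed empirically: since the class is determined by the uniquely defined eigenspaces X_j and X_k, a small symmetric perturbation should move both by roughly ‖E‖ over the respective gap (a Davis-Kahan-type heuristic; this experiment and its tolerances are my illustration, not the paper's bound):

```python
import numpy as np

rng = np.random.default_rng(5)
n, j, k = 8, 2, 6
eigvals = np.array([6.0, 5.0, 3.1, 3.0, 2.9, 2.8, 1.0, 0.5])
Qm, _ = np.linalg.qr(rng.standard_normal((n, n)))
A = Qm @ np.diag(eigvals) @ Qm.T

# Symmetric perturbation with spectral norm 1e-3, far smaller than the
# outer gaps (1.9 and 1.8) that control the conditioning.
E = rng.standard_normal((n, n))
E = (E + E.T) / 2
E = 1e-3 * E / np.linalg.norm(E, 2)

def dom(M, m):
    """Dominant m-dimensional eigenspace of a symmetric matrix."""
    return np.linalg.eigh(M)[1][:, ::-1][:, :m]

def sin_max(X, Y):
    """Sine of the largest principal angle between equal-dimension subspaces."""
    c = np.linalg.svd(X.T @ Y, compute_uv=False)
    return np.sqrt(max(0.0, 1.0 - min(c) ** 2))

dj = sin_max(dom(A, j), dom(A + E, j))   # motion of X_j, ~ ||E|| / gap_j
dk = sin_max(dom(A, k), dom(A + E, k))   # motion of X_k, ~ ||E|| / gap_k
print(dj, dk)
```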
The theoretical findings are supported by numerical experiments. Test matrices are constructed with a clear eigenvalue cluster containing λ_h. The authors apply SIM, Krylov, and Rayleigh‑Ritz methods, measuring (i) the spectral error of the compressed matrix, (ii) the distance to the Λ‑admissible class, and (iii) the overall Frobenius error ‖A − P_T A P_T‖ for the generated subspace T. In all cases, subspaces that are close to the Λ‑admissible class produce markedly smaller errors than those measured against a single dominant eigenspace (which is ambiguous in the clustered setting). The advantage is most pronounced when the cluster spread δ is tiny and the surrounding gaps are sizable.
In summary, the paper makes three intertwined contributions: (1) it introduces Λ‑admissible subspaces as a robust replacement for dominant eigenspaces when eigenvalues are clustered; (2) it establishes that standard low‑rank approximation and iterative subspace methods naturally converge to this class, providing explicit error bounds that involve both surrounding eigen‑gaps; and (3) it supplies a condition‑number analysis for the set‑valued mapping, confirming the stability of the approach. These results broaden the theoretical foundation of low‑rank approximation and eigenvalue computation, especially for problems where eigenvalue multiplicities or tight clusters are unavoidable. Future work may extend the Λ‑admissible framework to non‑Hermitian matrices, block‑Krylov methods, or large‑scale distributed implementations.