Pseudospectral Shattering, the Sign Function, and Diagonalization in Nearly Matrix Multiplication Time
We exhibit a randomized algorithm which given a matrix $A\in \mathbb{C}^{n\times n}$ with $\|A\|\le 1$ and $\delta>0$, computes with high probability an invertible $V$ and diagonal $D$ such that $\|A-VDV^{-1}\|\le \delta$ using $O(T_{MM}(n)\log^2(n/\delta))$ arithmetic operations, in finite arithmetic with $O(\log^4(n/\delta)\log n)$ bits of precision. Here $T_{MM}(n)$ is the number of arithmetic operations required to multiply two $n\times n$ complex matrices numerically stably, known to satisfy $T_{MM}(n)=O(n^{\omega+\eta})$ for every $\eta>0$ where $\omega$ is the exponent of matrix multiplication (Demmel et al., Numer. Math., 2007). Our result significantly improves the previously best known provable running times of $O(n^{10}/\delta^2)$ arithmetic operations for diagonalization of general matrices (Armentano et al., J. Eur. Math. Soc., 2018), and (with regards to the dependence on $n$) $O(n^3)$ arithmetic operations for Hermitian matrices (Dekker and Traub, Lin. Alg. Appl., 1971). It is the first algorithm to achieve nearly matrix multiplication time for diagonalization in any model of computation (real arithmetic, rational arithmetic, or finite arithmetic), thereby matching the complexity of other dense linear algebra operations such as inversion and $QR$ factorization up to polylogarithmic factors. The proof rests on two new ingredients. (1) We show that adding a small complex Gaussian perturbation to any matrix splits its pseudospectrum into $n$ small well-separated components. In particular, this implies that the eigenvalues of the perturbed matrix have a large minimum gap, a property of independent interest in random matrix theory. (2) We give a rigorous analysis of Roberts’ Newton iteration method (Roberts, Int. J. Control, 1980) for computing the sign function of a matrix in finite arithmetic, itself an open problem in numerical analysis since at least 1986.
💡 Research Summary
The paper presents a randomized algorithm that, given any complex n × n matrix A with spectral norm at most 1 and a target accuracy δ > 0, produces with high probability an invertible matrix V and a diagonal matrix D such that ‖A – V D V⁻¹‖ ≤ δ. The total number of arithmetic operations is O(T_MM(n)·log²(n/δ)), where T_MM(n) denotes the cost of a numerically stable matrix multiplication and satisfies T_MM(n)=O(n^{ω+η}) for any η > 0 (ω≈2.37 is the matrix‑multiplication exponent). The algorithm works in finite‑precision arithmetic using only O(log⁴(n/δ)·log n) bits of precision, and the computed similarity transformation satisfies ‖V‖·‖V⁻¹‖ = O(n^{2.5}/δ).
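The theorem's guarantee can be illustrated numerically with a perturb-then-diagonalize step. Note this is only an illustration of the statement, not the paper's fast algorithm: the matrix size, the choice of γ, and δ below are ad hoc, and NumPy's `eig` stands in for the paper's sign-function-based recursion.

```python
import numpy as np

rng = np.random.default_rng(0)
n, delta = 8, 0.05

# A defective matrix (a single Jordan block): exact diagonalization is
# impossible, but approximate diagonalization is always achievable.
A = np.diag(np.ones(n - 1), k=1)

# Tiny complex Ginibre perturbation; gamma = 0.01 is an ad hoc choice,
# the paper tunes it as a polynomial in delta.
gamma = 0.01
G = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2 * n)
X = A + gamma * G

# Diagonalize the perturbed matrix; (V, D) then approximately diagonalizes A,
# with a backward error dominated by gamma * ||G||.
evals, V = np.linalg.eig(X)
D = np.diag(evals)
backward_error = np.linalg.norm(A - V @ D @ np.linalg.inv(V), 2)
kappa_V = np.linalg.cond(V)
```

Running this, `backward_error` stays below δ while `kappa_V` remains modest, even though the unperturbed Jordan block has no eigenvector basis at all.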
The contribution rests on two novel technical ingredients:

- Pseudospectral Shattering. For any matrix A with ‖A‖ ≤ 1, adding a tiny complex Gaussian perturbation γG (with γ chosen polynomially small in δ) yields a perturbed matrix X = A + γG whose ε‑pseudospectrum, for suitably small ε, splits into n disjoint, tiny components. Consequently the eigenvalues of X are well separated (with a minimum gap polynomial in γ and 1/n) and the eigenvector condition number κ_V(X) is polynomially bounded. This “shattering” phenomenon extends recent results on eigenvalue gaps for i.i.d. random matrices to arbitrary deterministic matrices (resolving Davies’ conjecture on approximate diagonalizability along the way), providing a universal regularization that makes the spectrum well conditioned.
- Robust Newton Iteration for the Matrix Sign Function. The matrix sign function sign(X) is central to the spectral‑bisection method because it separates the spectrum into the two half‑planes. The authors give a rigorous finite‑precision analysis of Roberts’ Newton iteration X_{k+1} = ½(X_k + X_k⁻¹). When applied to a matrix whose pseudospectrum has been shattered, the iteration converges in O(log(n/δ)) steps, and the round‑off error at each step can be bounded by a multiple of the machine precision u. Assuming a black‑box model in which matrix multiplication, inversion, and QR factorization each incur relative error bounded by a polynomial multiple of u, they show that the overall error propagation remains controlled, leading to the stated precision requirement.
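Roberts' iteration itself is only a few lines. The sketch below runs it in plain double-precision NumPy on a hypothetical well-separated test matrix (it is not the paper's carefully analyzed finite-precision variant) and checks the defining properties of sign(X):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 6

# Test matrix with eigenvalues away from the imaginary axis:
# three in the right half-plane, three in the left.
evals = np.array([1.0, 2.0, 3.0, -1.0, -2.0, -0.5])
Q = rng.standard_normal((n, n))
X = Q @ np.diag(evals) @ np.linalg.inv(Q)

# Roberts' Newton iteration X_{k+1} = (X_k + X_k^{-1}) / 2; each
# eigenvalue lambda is driven quadratically to +1 or -1 according
# to the sign of its real part.
S = X.copy()
for _ in range(50):
    S = 0.5 * (S + np.linalg.inv(S))

# sign(X) is an involution (S^2 = I), and tr((I + S)/2) counts the
# eigenvalues in the right half-plane.
involution_defect = np.linalg.norm(S @ S - np.eye(n))
right_count = round(np.trace(0.5 * (np.eye(n) + S)).real)
```

Here `right_count` comes out to 3, matching the three eigenvalues placed in the right half-plane.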
With these tools the authors adapt the classical spectral‑bisection algorithm (Beavers & Denman, 1974). The algorithm proceeds recursively: after shattering, the sign function is used to split the spectrum into two halves, and each sub‑problem is then solved independently. Because the eigenvalues are well separated, the recursion depth is O(log(n/δ)). Each recursive call runs the sign‑function iteration for O(log(n/δ)) steps, each costing a constant number of matrix multiplications, inversions, and QR factorizations, all of which reduce to matrix multiplication. Hence the total arithmetic cost is O(T_MM(n)·log²(n/δ)). The condition number of the final similarity matrix V is also controlled, guaranteeing numerical stability of the output.
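One level of the recursion can be sketched as follows. This is a simplified exact-arithmetic illustration on hypothetical data: the actual algorithm chooses splitting points carefully, uses fast numerically stable QR instead of an SVD, and recurses on both invariant subspaces.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 8

# Diagonalizable test matrix: 5 eigenvalues in the right half-plane,
# 3 in the left (hypothetical example data).
true_evals = np.array([0.6, 0.9, 1.3, 1.7, 2.0, -0.7, -1.1, -1.8])
Q = rng.standard_normal((n, n))
X = Q @ np.diag(true_evals) @ np.linalg.inv(Q)

# Step 1: matrix sign function via Roberts' Newton iteration.
S = X.copy()
for _ in range(50):
    S = 0.5 * (S + np.linalg.inv(S))

# Step 2: spectral projector onto the right-half-plane invariant subspace;
# its rank (= trace) counts the eigenvalues with positive real part.
P = 0.5 * (np.eye(n) + S)
k = round(np.trace(P).real)

# Step 3: orthonormal basis for range(P), then compress X onto it.
# Each compressed subproblem would be diagonalized recursively.
U, _, _ = np.linalg.svd(P)
basis = U[:, :k]
X_right = basis.T @ X @ basis   # eigenvalues: exactly the 5 positive ones
sub_evals = np.sort(np.linalg.eigvals(X_right).real)
```

Since range(P) is an invariant subspace of X, the compression `X_right` carries exactly the right-half-plane eigenvalues, and the left-half-plane block is obtained the same way from I − P.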
The paper also discusses the computational model in detail. It works in the standard floating‑point setting, where each stored number x is represented as fl(x) = (1+θ)x with |θ| ≤ u, and each elementary operation incurs a multiplicative error bounded by u. The required machine precision is quasipolynomially small in δ/n, i.e. log(1/u) = O(log⁴(n/δ)·log n), which is exactly the stated bit‑precision bound. This is dramatically lower than the Ω(n) bits needed in rational‑arithmetic approaches for comparable tasks.
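The standard model is the one IEEE double precision satisfies, which is easy to observe directly. A minimal illustration (unrelated to the paper's specific precision choice):

```python
import numpy as np

# Unit roundoff u for IEEE binary64: half the machine epsilon, 2**-53.
u = np.finfo(np.float64).eps / 2

# fl(0.1), fl(0.2), and their rounded sum each pick up a relative error
# of at most u, so the computed 0.1 + 0.2 differs from the double
# nearest 0.3 by a small multiple of u in relative terms.
computed = 0.1 + 0.2
rel_err = abs(computed - 0.3) / 0.3
```

The standard model bounds `rel_err` by a few multiples of u, and indeed the computed sum is not exactly equal to the literal `0.3`.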
Compared with prior provable algorithms—O(n^{10}/δ²) arithmetic for general matrices (Armentano et al., 2018) and O(n³) for Hermitian matrices (Dekker & Traub, 1971)—the new method achieves a near‑optimal dependence on n, matching the best known complexities for matrix inversion and QR factorization up to polylogarithmic factors. It is the first algorithm that provably diagonalizes arbitrary matrices in nearly matrix‑multiplication time in any realistic computational model.
Beyond diagonalization, the authors suggest that pseudospectral shattering and the robust sign‑function iteration could be applied to other matrix‑function problems such as the singular‑value decomposition, matrix square roots, and more general holomorphic functional calculus. Open questions include tightening the polylogarithmic factors, extending the technique to structured matrices (e.g., sparse or low‑rank), and developing practical implementations that exploit modern fast‑matrix‑multiplication libraries.
In summary, the paper bridges random matrix theory and numerical analysis to overcome long‑standing barriers in eigenvalue computation, delivering a nearly optimal algorithm (up to polylogarithmic factors) for approximate diagonalization that operates essentially as fast as the best known dense linear‑algebra primitives.