BOLT: Block-Orthonormal Lanczos for Trace estimation of matrix functions

Notice: This research summary and analysis were generated automatically using AI. For full accuracy, please refer to the original arXiv source.

Efficient matrix trace estimation is essential for scalable computation of log-determinants, matrix norms, and distributional divergences. In many large-scale applications, the matrices involved are too large to store or access in full, making even a single matrix-vector (mat-vec) product infeasible. Instead, one often has access only to small subblocks of the matrix or localized matrix-vector products on restricted index sets. Hutch++ achieves the optimal convergence rate but relies on randomized SVD and assumes full mat-vec access, making it difficult to apply in these constrained settings. We propose the Block-Orthonormal Stochastic Lanczos Quadrature (BOLT), which matches Hutch++ accuracy with a simpler implementation based on orthonormal block probes and Lanczos iterations. BOLT builds on the Stochastic Lanczos Quadrature (SLQ) framework, which combines random probing with Krylov subspace methods to efficiently approximate traces of matrix functions, and performs better than Hutch++ in near-flat-spectrum regimes. To address memory limitations and partial-access constraints, we introduce Subblock SLQ, a variant of BOLT that operates only on small principal submatrices. As a result, this framework yields a proxy KL divergence estimator and an efficient method for computing the Wasserstein-2 distance between Gaussians, both compatible with low-memory and partial-access regimes. We provide theoretical guarantees and demonstrate strong empirical performance across a range of high-dimensional settings.


💡 Research Summary

The paper introduces BOLT (Block-Orthonormal Stochastic Lanczos Quadrature), a stochastic trace estimator that combines block-wise orthonormal probing with Lanczos quadrature to approximate tr f(A) for large symmetric positive-semidefinite matrices A. Traditional Hutchinson-type estimators use scalar random vectors and require O(1/ε²) matrix-vector products for additive error ε, while Hutch++ reduces the cost to O(1/ε) by sketching the dominant eigenspace via a randomized SVD. Both approaches assume full matrix-vector access, which is often impossible when the matrix is too large to store or when only localized subblocks can be accessed.
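For contrast with BOLT, the Hutchinson baseline mentioned above fits in a few lines. This is a minimal illustrative sketch, not the paper's code; the function name and `matvec` interface are assumptions:

```python
import numpy as np

def hutchinson_trace(matvec, n, num_probes, rng=None):
    # Plain Hutchinson estimator: tr(A) = E[z^T A z] for a Rademacher
    # probe z, averaged over num_probes independent probes. Only a
    # matrix-vector product oracle is needed, not A itself.
    rng = np.random.default_rng(rng)
    total = 0.0
    for _ in range(num_probes):
        z = rng.choice(np.array([-1.0, 1.0]), size=n)  # Rademacher probe
        total += z @ matvec(z)
    return total / num_probes
```

The O(1/ε²) probe count comes from the estimator's variance decaying only as 1/num_probes; Hutch++ improves on this by first deflating the dominant eigenspace.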

BOLT replaces scalar probes with an n × b random matrix Z (Rademacher or Gaussian entries), orthogonalizes it via QR to obtain an orthonormal block V, and runs k steps of the Lanczos process starting from V. The resulting tridiagonal matrix T_i is diagonalized, yielding quadrature nodes μ_{ij} and weights w_{ij}. For each trial i the estimator computes η_i = Σ_j w_{ij} f(μ_{ij}) and aggregates over q independent trials as (n/(bq)) Σ_i η_i. The method is unbiased for tr f(A): the orthonormal block satisfies E[VVᵀ] = (b/n) I, so E[(n/b) tr(Vᵀ f(A) V)] = tr f(A), with the Lanczos quadrature contributing only a small deterministic approximation error.
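The steps above can be sketched as follows. This is an illustrative approximation under stated assumptions, not the paper's implementation: it runs scalar Lanczos on each column of the orthonormal block rather than a true block Lanczos recurrence, and all names and parameters are invented for the example:

```python
import numpy as np

def lanczos(A, v, k):
    # k-step symmetric Lanczos with full reorthogonalization; returns the
    # diagonal (alpha) and off-diagonal (beta) of the tridiagonal matrix T.
    n = v.shape[0]
    V = np.zeros((n, k))
    alpha = np.zeros(k)
    beta = np.zeros(max(k - 1, 0))
    V[:, 0] = v / np.linalg.norm(v)
    for i in range(k):
        w = A @ V[:, i]
        alpha[i] = V[:, i] @ w
        # Full reorthogonalization against all previous Lanczos vectors
        # (subsumes the usual alpha/beta subtractions).
        w = w - V[:, :i + 1] @ (V[:, :i + 1].T @ w)
        if i < k - 1:
            beta[i] = np.linalg.norm(w)
            if beta[i] < 1e-12:            # invariant subspace found: stop early
                return alpha[:i + 1], beta[:i]
            V[:, i + 1] = w / beta[i]
    return alpha, beta

def bolt_trace(A, f, b, k, q, rng=None):
    # Sketch of the BOLT estimator: orthonormal block probes + Lanczos
    # quadrature, aggregated as (n/(b*q)) * sum_i eta_i.
    rng = np.random.default_rng(rng)
    n = A.shape[0]
    total = 0.0
    for _ in range(q):
        Z = rng.standard_normal((n, b))
        V, _ = np.linalg.qr(Z)             # orthonormal block probe
        # Simplification: scalar Lanczos per column instead of block Lanczos.
        for j in range(b):
            alpha, beta = lanczos(A, V[:, j], k)
            T = np.diag(alpha) + np.diag(beta, 1) + np.diag(beta, -1)
            mu, U = np.linalg.eigh(T)      # quadrature nodes mu_ij
            w = U[0, :] ** 2               # quadrature weights w_ij
            total += np.sum(w * f(mu))     # contribution to eta_i
    return (n / (b * q)) * total
```

Note the sanity check implicit in the aggregation: when b = n the QR factor satisfies VVᵀ = I, so a single trial with k = n recovers tr f(A) up to floating-point error.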

