A weighted global GMRES algorithm with deflation for solving large Sylvester matrix equations
The solution of large scale Sylvester matrix equation plays an important role in control and large scientific computations. A popular approach is to use the global GMRES algorithm. In this work, we first consider the global GMRES algorithm with weighting strategy, and propose some new schemes based on residual to update the weighting matrix. Due to the growth of memory requirements and computational cost, it is necessary to restart the algorithm efficiently. The deflation strategy is popular for the solution of large linear systems and large eigenvalue problems, to the best of our knowledge, little work is done on applying deflation to the global GMRES algorithm for large Sylvester matrix equations. We then consider how to combine the weighting strategy with deflated restarting, and propose a weighted global GMRES algorithm with deflation for solving large Sylvester matrix equations. Theoretical analysis is given to show why the new algorithm works effectively. Further, unlike the weighted GMRES-DR presented in [{\sc M. Embree, R. B. Morgan and H. V. Nguyen}, {\em Weighted inner products for GMRES and GMRES-DR}, (2017), arXiv:1607.00255v2], we show that in our new algorithm, there is no need to change the inner product with respect to diagonal matrix to that with non-diagonal matrix, and our scheme is much cheaper. Numerical examples illustrate the numerical behavior of the proposed algorithms.
💡 Research Summary
The paper addresses the computational challenge of solving large‑scale Sylvester matrix equations of the form AX + XB = C, where A∈ℝ^{n×n}, B∈ℝ^{s×s} (with s≪n), and C∈ℝ^{n×s}. Directly vectorizing the equation leads to a linear system of size ns, which is prohibitive for modern applications in control, signal processing, and scientific computing. The authors therefore develop a Krylov subspace method based on the global GMRES algorithm, enhanced in two complementary ways: (1) a weighting strategy that modifies the inner product, and (2) a deflated restarting technique that removes the influence of small eigenvalues.
Weighted Global GMRES (W‑GLGMRES).
A positive diagonal matrix D is introduced to define a D‑inner product ⟨Y,Z⟩D = trace(ZᵀDY) and the associated D‑norm. The weighted Arnoldi process (Algorithm 1) builds a D‑orthonormal basis V₁,…,V_m for the block Krylov subspace K_m(A,V₁) and a (m+1)×m Hessenberg matrix Ĥ_m. The approximate solution is expressed as X_m = X₀ + V_m(y_w⊗I_s), where y_w solves the small least‑squares problem min_y‖βe₁ − Ĥ_m y‖₂. The residual R_m lies in the range of V{m+1} and is D‑orthogonal to the Krylov space, guaranteeing the usual GMRES optimality property under the weighted inner product.
Dynamic updating of the weighting matrix.
Instead of fixing D throughout the iteration, the authors propose three residual‑based schemes that recompute D at each restart: (i) using the column of the residual with the largest Euclidean norm, (ii) using the column with the smallest norm, and (iii) using the average of all columns. In each case D is set to diag(|R(:,t)|/‖R(:,t)‖₂) or its analogue. This dynamic weighting emphasizes directions where the error is largest, thereby accelerating convergence compared with a static weighting.
Deflated restarting (W‑GLGMRES‑DR).
A known drawback of global GMRES is the rapid growth of memory and computational cost when the Arnoldi process is allowed to run for many steps. To mitigate this, the authors incorporate a deflated restarting (also called thick‑restarting) strategy. After an initial cycle of W‑GLGMRES, they compute k weighted harmonic Ritz pairs (θ_i, Y_i) that satisfy a Petrov–Galerkin condition with respect to the D‑inner product. The vectors Y_i have the form g_i⊗I_s, where g_i is obtained from a small generalized eigenvalue problem θ_i (HᵀH)g_i = (HᵀĤ)g_i. The corresponding harmonic residuals e_Ri are shown to lie in the same subspace V_{m+1} as the GMRES residual and to be D‑orthogonal to AV_m. Theorem 3.1 proves that the set {e_Ri} and the GMRES residual are collinear up to a small s×s matrix, which guarantees that the deflation does not destroy the optimality of the underlying GMRES step.
During the next restart, the k harmonic Ritz vectors are retained as part of the new search space, effectively “deflating” the associated eigenvalues from the spectrum of the projected operator. This reduces the effective condition number of the reduced problem and leads to faster convergence without increasing the dimension of the Krylov basis beyond m.
Algorithmic framework.
Algorithm 2 (W‑GLGMRES) outlines the basic weighted GMRES cycle with dynamic D‑updates. Algorithm 3 (W‑GLGMRES‑DR) augments this with the harmonic Ritz computation and the deflated restart. Both algorithms require only the small least‑squares solve (size m) and a generalized eigenvalue problem of size m, so the extra cost per restart is modest (O(m³) at most) compared with the dominant matrix–vector products.
Numerical experiments.
The authors test their methods on several Sylvester problems with varying dimensions (e.g., n = 5000, s = 5) and spectral properties (including clustered and widely spread eigenvalues). They compare four solvers: (a) standard global GMRES, (b) weighted global GMRES with static D, (c) weighted global GMRES with the three dynamic D‑strategies, and (d) the proposed weighted‑deflated GMRES. Results show that the dynamic weighting alone already reduces iteration counts, and the addition of deflation yields the most dramatic savings in both iteration number and CPU time. Moreover, unlike the weighted GMRES‑DR of Embree et al., the present scheme never changes the inner product to a non‑diagonal weighting matrix, which simplifies implementation and avoids extra storage.
Conclusions and future work.
The paper demonstrates that combining a residual‑driven weighting strategy with a harmonic‑Ritz‑based deflated restart produces a robust, memory‑efficient algorithm for large Sylvester equations. The approach retains the simplicity of a diagonal weighting matrix while achieving convergence rates comparable to more expensive non‑diagonal inner‑product schemes. Open questions remain regarding the optimal choice of D (a problem still largely unresolved) and the extension of the technique to other block Krylov methods (e.g., block BiCG, block QMR) or to problems with multiple right‑hand sides beyond the Sylvester setting. The authors suggest that further theoretical analysis of the weighting update rules and adaptive selection of the number k of Ritz vectors could yield even greater performance gains.
Comments & Academic Discussion
Loading comments...
Leave a Comment