Recursive Schur Decomposition

In this article, we present a parallel recursive algorithm based on multi-level domain decomposition that can be used as a preconditioner to a Krylov subspace method to solve sparse linear systems of equations arising from the discretization of partial differential equations (PDEs). We tested the effectiveness of the algorithm on several PDEs using different numbers of sub-domains (ranging from 8 to 32,768) and various problem sizes (ranging from about 2,000 to over a billion degrees of freedom). We report the results from these tests; the results show that the algorithm scales very well with the number of sub-domains.

💡 Research Summary

The paper introduces a novel parallel preconditioner based on a recursive Schur decomposition combined with multi‑level domain decomposition. The authors start by highlighting the challenges of solving extremely large sparse linear systems that arise from discretizing partial differential equations, especially the limitations of traditional preconditioners in terms of scalability, memory consumption, and communication overhead on modern high‑performance computing platforms. To address these issues, they propose a hierarchical algorithm that recursively partitions the computational domain into sub‑domains, constructs a block‑triangular form of the global matrix, and applies a local Schur complement preconditioner on each block. The recursion proceeds through several levels: at each level the sub‑domains are further divided, local matrices are extracted, and local Schur complements are computed. These local preconditioners are then assembled into a global preconditioning operator that is used within a Krylov subspace iterative solver such as GMRES or Conjugate Gradient.
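The recursive elimination idea can be illustrated with a small dense sketch. This is not the authors' parallel preconditioner: it is an exact, serial block elimination in which the matrix is simply split in half at each level, whereas a real domain decomposition would order interior unknowns before interface unknowns and would typically approximate the Schur complement. The function names `recursive_schur_solve` and `poisson_1d`, the `min_size` cutoff, and the 1D Poisson test matrix are all assumptions made for illustration.

```python
import numpy as np

def recursive_schur_solve(A, B, min_size=4):
    """Solve A X = B by recursive 2x2 block elimination (dense, exact).

    A is partitioned as [[A11, A12], [A21, A22]]. The Schur complement
    S = A22 - A21 A11^{-1} A12 is formed, and both the A11 solves and the
    S solve are carried out by recursion. B must be 2-D (n x m).
    """
    n = A.shape[0]
    if n <= min_size:
        # Base case: the block is small enough to solve directly.
        return np.linalg.solve(A, B)

    k = n // 2
    A11, A12 = A[:k, :k], A[:k, k:]
    A21, A22 = A[k:, :k], A[k:, k:]

    # One recursive call gives A11^{-1} A12 and A11^{-1} B1 together.
    Y = recursive_schur_solve(A11, np.hstack([A12, B[:k]]), min_size)
    Z, Y1 = Y[:, :n - k], Y[:, n - k:]

    # Schur complement on the second block of unknowns.
    S = A22 - A21 @ Z

    # Reduced system for the second block, then back-substitution.
    X2 = recursive_schur_solve(S, B[k:] - A21 @ Y1, min_size)
    X1 = Y1 - Z @ X2
    return np.vstack([X1, X2])

def poisson_1d(n):
    """Tridiagonal (2, -1) matrix from a 1D Poisson discretization."""
    return 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)

A = poisson_1d(32)
b = np.ones((32, 1))
x = recursive_schur_solve(A, b)
residual = np.linalg.norm(A @ x - b)
```

The test matrix is symmetric positive definite, so every leading block and every Schur complement encountered in the recursion is nonsingular. In a preconditioning setting, the exact solves above would be replaced by cheaper approximate ones and the result would be applied inside the Krylov iteration rather than used as a direct solver.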

Implementation details focus on minimizing inter‑process communication by overlapping communication with computation and employing asynchronous message passing. Memory usage is reduced by storing only the matrices required at each recursion level, allowing the method to handle problems with billions of unknowns without exceeding node memory limits.

The experimental campaign covers a wide range of PDEs, including diffusion, convection‑diffusion, and wave propagation problems, on both structured and unstructured meshes. The authors vary the number of sub‑domains from 8 to 32,768 and problem sizes from roughly 2,000 to over one billion degrees of freedom. Results demonstrate near‑linear scaling of total solution time with the number of sub‑domains. For a one‑million‑unknown problem, using 1,024 sub‑domains reduces the iteration count dramatically and yields a speed‑up of more than three times compared with conventional ILU‑type preconditioners. Memory consumption remains modest due to the level‑wise storage scheme. Moreover, the method retains robust convergence behavior even for highly heterogeneous coefficients and irregular meshes, indicating that the recursive structure effectively mitigates local ill‑conditioning.
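The sub‑domain counts quoted above are all powers of two, which would be consistent with a hierarchy built by repeated bisection. The source does not state the branching factor, so the bisection is an assumption here. Under that assumption, the short sketch below works out the implied recursion depth and the average number of unknowns per sub‑domain. The helper names `bisection_levels` and `unknowns_per_subdomain` are hypothetical.

```python
def bisection_levels(num_subdomains):
    """Number of halvings needed to reach num_subdomains (assumes bisection)."""
    if num_subdomains < 1 or num_subdomains & (num_subdomains - 1):
        raise ValueError("num_subdomains must be a power of two")
    return num_subdomains.bit_length() - 1

def unknowns_per_subdomain(total_unknowns, num_subdomains):
    """Average number of degrees of freedom owned by one sub-domain."""
    return total_unknowns / num_subdomains

levels_min = bisection_levels(8)          # smallest reported configuration
levels_max = bisection_levels(32768)      # largest reported configuration
local_size = unknowns_per_subdomain(1_000_000, 1024)
```

With these figures, the hierarchy would span 3 to 15 levels, and the one‑million‑unknown case on 1,024 sub‑domains leaves fewer than a thousand unknowns per sub‑domain on average.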

In conclusion, the recursive Schur decomposition preconditioner offers excellent scalability, efficiency, and robustness for large‑scale PDE‑based simulations. The authors suggest future extensions to nonlinear systems, adaptive refinement of the domain decomposition hierarchy, and integration with emerging exascale architectures.