HPRMAT: A high-performance R-matrix solver with GPU acceleration for coupled-channel problems in nuclear physics

Notice: This research summary and analysis were automatically generated using AI technology. For absolute accuracy, please refer to the original arXiv source.

I present HPRMAT, a high-performance solver library for the linear systems arising in R-matrix coupled-channel scattering calculations in nuclear physics. Designed as a drop-in replacement for the linear algebra routines in existing R-matrix codes, HPRMAT employs direct linear equation solving with optimized libraries instead of traditional matrix inversion, achieving significant performance improvements. The package provides four solver backends: (1) double-precision LU factorization, (2) mixed-precision arithmetic with iterative refinement, (3) a Woodbury formula approach exploiting the kinetic-coupling matrix structure, and (4) GPU acceleration. Benchmark calculations demonstrate that the GPU solver achieves up to 9$\times$ speedup over optimized CPU direct solvers, and 18$\times$ over legacy inversion-based codes, for large matrices ($N=25600$). The mixed-precision strategy is particularly effective on consumer GPUs (e.g., NVIDIA RTX 3090/4090), where single-precision throughput exceeds double-precision throughput by a factor of 64; by performing factorization in single precision with iterative refinement, HPRMAT overcomes the poor FP64 performance of consumer hardware while maintaining double-precision accuracy. This makes large-scale CDCC and coupled-channel calculations accessible to researchers using standard desktop workstations, without requiring expensive data-center GPUs. CPU-only solvers provide 5–7$\times$ speedup through optimized libraries and algorithmic improvements. All solvers maintain physics accuracy with relative errors below $10^{-5}$ in cross-section calculations, validated against Descouvemont’s reference code (Comput. Phys. Commun. 200, 199–219 (2016)). HPRMAT provides interfaces for Fortran, C, Python, and Julia.


💡 Research Summary

This paper introduces HPRMAT, a high-performance solver library designed to address the computational bottleneck in R-matrix coupled-channel scattering calculations within nuclear physics. The library targets the large, dense, complex linear systems that arise from discretizing the Schrödinger equation using Lagrange-Legendre basis functions, a common approach for problems like Continuum-Discretized Coupled-Channels (CDCC) calculations involving weakly-bound nuclei.
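To make the size of these systems concrete, the following is a minimal sketch of a Lagrange-Legendre mesh on the interval (0, a), built from Gauss-Legendre nodes shifted from (-1, 1). The function name `lagrange_legendre_mesh`, the channel radius, and the channel count are illustrative assumptions, not values or APIs from the paper. With one mesh per channel, the dense matrix has dimension (number of channels) times (basis functions per channel).

```python
import numpy as np

def lagrange_legendre_mesh(n_basis, channel_radius):
    """Mesh points r_i in (0, a) and weights of a Gauss-Legendre rule
    shifted from (-1, 1) to (0, a). Hypothetical helper for illustration."""
    t, w = np.polynomial.legendre.leggauss(n_basis)  # nodes and weights on (-1, 1)
    r = 0.5 * channel_radius * (t + 1.0)             # map nodes to (0, a)
    weights = 0.5 * channel_radius * w               # rescale weights accordingly
    return r, weights

# Assumed example values: 30 basis functions, channel radius a = 20 (arbitrary units)
r, weights = lagrange_legendre_mesh(n_basis=30, channel_radius=20.0)

# Assumed channel count; the coupled system has one mesh per channel
n_channels = 10
system_size = n_channels * r.size
print("mesh points lie in (0, a):", r.min() > 0.0 and r.max() < 20.0)
print("linear system dimension N =", system_size)
```

The matrix dimension grows linearly with both the number of channels and the basis size, so the cost of a dense factorization grows cubically in their product.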

The core innovation of HPRMAT lies in its modern, multi-faceted approach to accelerating these calculations while serving as a drop-in replacement for the linear algebra routines in widely-used legacy codes, such as Descouvemont’s package. The author identifies that traditional implementations relying on explicit matrix inversion are numerically less stable and inefficient. Instead, HPRMAT employs direct linear equation solving via LU factorization.
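The difference between the two strategies can be sketched with SciPy on a small random complex matrix. This is a CPU stand-in under stated assumptions: the matrix `A` is a hypothetical, well-conditioned placeholder rather than a physical Hamiltonian, and the calls shown are SciPy's, not HPRMAT's interface.

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

rng = np.random.default_rng(0)
n = 200

# Hypothetical stand-in for the dense complex coupled-channel matrix;
# the diagonal shift keeps it well conditioned for this demonstration.
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)) + n * np.eye(n)
b = rng.standard_normal(n) + 1j * rng.standard_normal(n)

# Legacy-style approach: form the explicit inverse, then multiply.
x_inv = np.linalg.inv(A) @ b

# Direct approach: factor once (LU with partial pivoting), then substitute.
lu, piv = lu_factor(A)
x_lu = lu_solve((lu, piv), b)

def rel_residual(x):
    """Relative residual ||A x - b|| / ||b|| of a candidate solution."""
    return np.linalg.norm(A @ x - b) / np.linalg.norm(b)

print("residual via inverse:", rel_residual(x_inv))
print("residual via LU     :", rel_residual(x_lu))
```

For a dense matrix, an LU factorization costs about (2/3)N^3 floating-point operations while forming the full inverse costs about 2N^3, so the direct solve does roughly a third of the arithmetic when only a few right-hand sides are needed. Several right-hand sides can be passed to `lu_solve` as columns of a 2-D array, reusing the same factorization.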

The library provides four distinct solver backends to cater to different hardware and accuracy needs:
1) A double-precision LU solver using optimized OpenBLAS libraries for CPU.
2) A mixed-precision solver that performs LU factorization in single precision on the GPU and recovers double-precision accuracy through iterative refinement. This strategy circumvents the poor double-precision performance of consumer-grade GPUs (e.g., NVIDIA RTX 3090).
3) A solver based on the Woodbury formula that attempts to exploit the specific block structure of the Hamiltonian matrix.
4) A GPU-accelerated solver using NVIDIA’s cuSOLVER library.
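The mixed-precision idea among these backends can be sketched on the CPU with SciPy. This is an illustration of generic iterative refinement, not HPRMAT's GPU implementation: the function name `solve_mixed_precision`, the tolerance, the iteration cap, and the test matrix are all assumptions made for the example.

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

def solve_mixed_precision(A, b, tol=1e-12, max_iter=10):
    """Solve A x = b using a single-precision LU factorization and
    double-precision iterative refinement (hypothetical helper)."""
    lu, piv = lu_factor(A.astype(np.complex64))   # expensive step, done in single precision
    x = lu_solve((lu, piv), b.astype(np.complex64)).astype(np.complex128)
    for _ in range(max_iter):
        r = b - A @ x                             # residual evaluated in double precision
        if np.linalg.norm(r) <= tol * np.linalg.norm(b):
            break
        # Correction reuses the single-precision factors (cheap triangular solves)
        dx = lu_solve((lu, piv), r.astype(np.complex64))
        x = x + dx.astype(np.complex128)
    return x

rng = np.random.default_rng(0)
n = 200
# Hypothetical well-conditioned complex test matrix, not a physical Hamiltonian
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)) + n * np.eye(n)
b = rng.standard_normal(n) + 1j * rng.standard_normal(n)

x = solve_mixed_precision(A, b)
rel_res = np.linalg.norm(A @ x - b) / np.linalg.norm(b)
print("relative residual after refinement:", rel_res)
```

The cubic-cost factorization runs once in the fast precision, and each refinement step costs only a matrix-vector product and two triangular solves. Refinement converges quickly when the matrix is reasonably well conditioned; for ill-conditioned systems it may stall, in which case a full double-precision solve is the usual fallback.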

A significant portion of the paper is dedicated to analyzing the structure of the problem matrix. The matrix appears block-sparse: diagonal blocks are full, and for local potentials the off-diagonal blocks are diagonal. Nevertheless, the author reports tests showing that sparse and iterative approaches (such as block methods or preconditioned GMRES) fail, either because of rapid fill-in during factorization or because of convergence problems. This justifies the focus on optimizing dense direct methods.
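One way to see why fill-in is plausible is to eliminate a single channel block by hand. The sketch below builds a small real matrix with the stated block pattern (full diagonal blocks, diagonal off-diagonal blocks) and forms the Schur complement after eliminating the first channel. The sizes and random values are illustrative assumptions, and the real problem is complex-valued; this is not the author's test procedure.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 6       # basis functions per channel (tiny, for illustration only)
n_ch = 3    # number of channels (assumed)

# Build the block pattern: full diagonal blocks, diagonal coupling blocks.
A = np.zeros((n_ch * n, n_ch * n))
for c in range(n_ch):
    for cp in range(n_ch):
        rows, cols = slice(c * n, (c + 1) * n), slice(cp * n, (cp + 1) * n)
        if c == cp:
            A[rows, cols] = rng.standard_normal((n, n)) + n * np.eye(n)
        else:
            A[rows, cols] = np.diag(rng.standard_normal(n))

coupling_before = A[n:2 * n, 2 * n:3 * n]        # channel 2 to channel 3 block
nnz_before = np.count_nonzero(coupling_before)   # only the diagonal is populated

# Block Gaussian elimination of channel 1: S = A22 - A21 A11^{-1} A12
A11, A12 = A[:n, :n], A[:n, n:]
A21, A22 = A[n:, :n], A[n:, n:]
S = A22 - A21 @ np.linalg.solve(A11, A12)

coupling_after = S[:n, n:2 * n]                  # same coupling block after elimination
nnz_after = np.count_nonzero(np.abs(coupling_after) > 1e-12)

print("nonzeros in coupling block before:", nnz_before, "of", n * n)
print("nonzeros in coupling block after :", nnz_after, "of", n * n)
```

Each updated coupling block contains a term of the form (diagonal) x (inverse of a full block) x (diagonal), and the inverse of a full block is generally full. After one block elimination step the remaining matrix is therefore essentially dense, which is consistent with the reported failure of sparse direct factorizations on this structure.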

Implementation details emphasize practical usability. HPRMAT maintains compatibility with existing Fortran codes, requiring minimal changes for adoption. It also offers interfaces for C, Python, and Julia to integrate with modern scientific workflows. Validation against established reference codes confirms that all solvers maintain physical accuracy with relative errors below $10^{-5}$ in calculated cross-sections.

Comprehensive benchmarks across multiple hardware platforms reveal substantial performance gains. The GPU solver achieves up to a 9x speedup over optimized CPU direct solvers and an 18x speedup over legacy inversion-based codes for large matrices (N=25,600). The mixed-precision strategy is particularly effective on consumer GPUs, delivering roughly an 8x speedup over native double-precision GPU solving. Even the CPU-only solvers, by leveraging optimized OpenBLAS, show 5-7x speedups over reference implementations.

In conclusion, HPRMAT successfully bridges the gap between traditional nuclear physics codes and contemporary high-performance computing. By combining algorithmic improvements (direct solves over inversion), hardware-aware optimizations (mixed-precision for consumer GPUs), and seamless integration capabilities, it makes large-scale coupled-channel calculations feasible on standard desktop workstations, democratizing access to advanced computational nuclear physics. The paper suggests future work may include multi-GPU support and extensions to other architectures.

