Higher-order adaptive finite-element methods for Kohn-Sham density functional theory

We present an efficient computational approach to perform real-space electronic structure calculations using an adaptive higher-order finite-element discretization of Kohn-Sham density-functional theory (DFT). To this end, we develop an a-priori mesh adaption technique to construct a close to optimal finite-element discretization of the problem. We further propose an efficient solution strategy for solving the discrete eigenvalue problem by using spectral finite-elements in conjunction with Gauss-Lobatto quadrature, and a Chebyshev acceleration technique for computing the occupied eigenspace. The proposed approach has been observed to provide a staggering 100-200 fold computational advantage over the solution of a generalized eigenvalue problem. Using the proposed solution procedure, we investigate the computational efficiency afforded by higher-order finite-element discretization of the Kohn-Sham DFT problem. Our studies suggest that staggering computational savings of the order of 1000 fold relative to linear finite-elements can be realized, for both all-electron and local pseudopotential calculations. On all the benchmark systems studied, we observe diminishing returns in computational savings beyond the sixth-order for accuracies commensurate with chemical accuracy. A comparative study of the computational efficiency of the proposed higher-order finite-element discretizations suggests that the performance of finite-element basis is competing with the plane-wave discretization for non-periodic local pseudopotential calculations, and compares to the Gaussian basis for all-electron calculations within an order of magnitude. Further, we demonstrate the capability of the proposed approach to compute the electronic structure of a metallic system containing 1688 atoms using modest computational resources, and good scalability of the present implementation up to 192 processors.

💡 Research Summary

The paper introduces a highly efficient real‑space electronic‑structure framework for solving the Kohn‑Sham equations of density‑functional theory (DFT) using adaptive higher‑order finite‑element (FE) discretizations. The authors address two major computational bottlenecks that traditionally limit FE‑based DFT: (1) the generation of a mesh that can capture the rapid spatial variations of the electron density and effective potential near nuclei while remaining coarse in smoother regions, and (2) the solution of the large generalized eigenvalue problem that arises from the FE discretization.

To tackle the first issue, they develop an a‑priori mesh‑adaptation strategy. By analysing the magnitude of the density gradient and the Laplacian of the potential, they construct an error indicator that predicts where the discretization error will be largest. The mesh is then refined locally according to a prescribed tolerance, yielding a near‑optimal distribution of elements. This approach reduces the number of FE degrees of freedom (DOFs) required for a given target accuracy by 30–70 % compared with uniform meshes, without any a‑posteriori refinement loops.

The second bottleneck is removed by employing spectral FE basis functions together with Gauss‑Lobatto quadrature. The Gauss‑Lobatto points are chosen such that the mass matrix becomes diagonal, converting the generalized eigenvalue problem H ψ = ε M ψ into a standard eigenvalue problem H̃ ψ = ε ψ. This eliminates the need for costly mass‑matrix inversions and enables the use of matrix‑vector products that are highly amenable to parallelisation.

Even with a standard eigenvalue problem, extracting only the occupied subspace (Nₑ/2 eigenpairs) can dominate the runtime for large systems. The authors therefore incorporate Chebyshev polynomial filtering. A high‑order Chebyshev polynomial of the Hamiltonian amplifies eigencomponents within the occupied energy window while suppressing the rest of the spectrum. After filtering, a short orthogonalisation and Rayleigh‑Ritz step yields accurate occupied eigenvectors. Benchmarks show that this Chebyshev‑accelerated subspace iteration is 100–200 times faster than conventional Davidson or Lanczos solvers for the same convergence criteria.

A systematic study of FE polynomial order (p = 1–8) is presented for a variety of test cases, ranging from isolated atoms and small molecules to metallic clusters containing up to 1 688 atoms. The results demonstrate that orders 4–6 provide the best trade‑off between accuracy and computational cost. For chemical‑accuracy targets (energy error ≤ 1 mHa), sixth‑order elements reduce the required DOFs by a factor of 5–15 relative to linear elements, translating into overall runtime savings of 10²–10³. Orders higher than six show diminishing returns because the condition number of the stiffness matrix grows rapidly and the overhead of higher‑order quadrature outweighs the gains.

Performance comparisons reveal that for non‑periodic local‑pseudopotential calculations, the higher‑order FE approach matches plane‑wave (PW) methods in both speed and scalability. For all‑electron calculations, the FE method is within one to two orders of magnitude of Gaussian‑type orbital (GTO) codes, while offering systematic convergence and the ability to handle arbitrary boundary conditions.

The implementation is parallelised with MPI and exhibits strong scaling up to 192 processors. As a showcase, the authors compute the electronic structure of a metallic system with 1 688 atoms on a modest cluster, achieving convergence in under two hours. This demonstrates that the proposed methodology can extend real‑space DFT to system sizes that were previously out of reach for FE‑based solvers.

In summary, the paper contributes (i) a physics‑driven a‑priori mesh‑adaptation scheme, (ii) a diagonal‑mass‑matrix spectral FE discretisation that eliminates the generalized eigenvalue problem, (iii) Chebyshev‑filter subspace iteration that accelerates occupied‑state extraction by two orders of magnitude, and (iv) a thorough quantitative analysis of the benefits of higher‑order FE elements. Together, these innovations provide a practical, scalable route to high‑accuracy, large‑scale real‑space DFT calculations, opening the door to future extensions such as time‑dependent DFT, hybrid functionals, and GPU‑accelerated implementations.