Algorithmic differentiation for plane-wave DFT: materials design, error control and learning model parameters

Notice: This research summary and analysis were automatically generated using AI technology. For absolute accuracy, please refer to the [Original Paper Viewer] below or the Original ArXiv Source.

We present a differentiation framework for plane-wave density-functional theory (DFT) that combines the strengths of forward-mode algorithmic differentiation (AD) and density-functional perturbation theory (DFPT). In the resulting AD-DFPT framework, derivatives of any DFT output quantity with respect to any input parameter (e.g. geometry, density functional or pseudopotential) can be computed accurately without deriving gradient expressions by hand. We implement AD-DFPT in the Density-Functional ToolKit (DFTK) and show its broad applicability. Among others, we consider the inverse design of a semiconductor band gap, the learning of exchange-correlation functional parameters, and the propagation of DFT parameter uncertainties to relaxed structures. These examples demonstrate a number of promising research avenues opened by gradient-driven workflows in first-principles materials modeling.


💡 Research Summary

The paper introduces a novel framework, AD‑DFPT, that merges forward‑mode algorithmic differentiation (AD) with traditional density‑functional perturbation theory (DFPT) to enable automatic, accurate derivatives of any plane‑wave density‑functional theory (DFT) output with respect to arbitrary input parameters. By embedding AD directly into the Density‑Functional ToolKit (DFTK), a compact Julia‑based plane‑wave DFT code, the authors achieve end‑to‑end differentiability across the three stages of a DFT workflow: setup (construction of the Kohn‑Sham Hamiltonian and energy functional from parameters θ), solve (self‑consistent field (SCF) iteration to obtain the ground‑state density matrix P(θ)), and post‑processing (evaluation of physical observables such as energies, forces, band structures).

The key technical insight is that the inherently nonlinear SCF stage can be differentiated by recognizing that the density matrix satisfies a fixed‑point relation P = f(H(θ,P)), where f is the smeared occupation function applied to the Hamiltonian. Applying the chain rule to this relation gives ∂P/∂θ = χ₀ (∂H/∂θ + K ∂P/∂θ), where χ₀ = ∂f/∂H is the independent‑particle susceptibility, K = ∂H/∂P captures the Hartree and exchange‑correlation feedback, and ∂H/∂θ is the explicit derivative of H at fixed P. Collecting the ∂P/∂θ terms yields the Dyson‑type linear‑response equation ∂P/∂θ = (1‑χ₀K)⁻¹χ₀ ∂H/∂θ. This equation is precisely the DFPT response problem; solving it provides the implicit contribution of the SCF to any downstream derivative. Because plane‑wave DFT operates with matrix‑free algorithms and FFTs, the authors implement the response solver in a matrix‑free, iterative fashion, preserving the efficiency of standard DFPT while allowing AD to handle the explicit dependence of H on θ.
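The response equation can be checked on a scalar toy model. The Python sketch below is purely illustrative and is not taken from the paper or from DFTK: it assumes a hypothetical Hamiltonian H(θ,P) = θ + K·P with a constant coupling K, a Fermi-Dirac occupation with the chemical potential at zero, and made-up function names. It solves the fixed point by damped iteration and compares the implicit derivative (1‑χ₀K)⁻¹χ₀ ∂H/∂θ with a central finite difference.

```python
import math

TEMPERATURE = 0.1  # assumed smearing width of the toy model
COUPLING = 0.5     # assumed constant kernel K = dH/dP


def occupation(h):
    """Fermi-Dirac-like occupation f(H) with the chemical potential at zero."""
    return 1.0 / (1.0 + math.exp(h / TEMPERATURE))


def occupation_derivative(h):
    """chi0 = df/dH, the independent-particle response of the toy model."""
    f = occupation(h)
    return -f * (1.0 - f) / TEMPERATURE


def hamiltonian(theta, p):
    """Hypothetical H(theta, P): external parameter plus mean-field term K * P."""
    return theta + COUPLING * p


def scf(theta, tol=1e-14, max_iter=10_000):
    """Damped fixed-point iteration for P = f(H(theta, P))."""
    p = 0.5
    for _ in range(max_iter):
        p_new = occupation(hamiltonian(theta, p))
        if abs(p_new - p) < tol:
            return p_new
        p = 0.5 * p + 0.5 * p_new  # simple linear mixing keeps the iteration stable
    raise RuntimeError("SCF did not converge")


def response(theta):
    """Implicit derivative dP/dtheta = (1 - chi0 K)^-1 chi0 dH/dtheta."""
    p = scf(theta)
    chi0 = occupation_derivative(hamiltonian(theta, p))
    dh_dtheta = 1.0  # explicit derivative of H at fixed P for this toy model
    return chi0 * dh_dtheta / (1.0 - chi0 * COUPLING)


if __name__ == "__main__":
    theta, step = 0.2, 1e-5
    finite_difference = (scf(theta + step) - scf(theta - step)) / (2 * step)
    print("implicit derivative :", response(theta))
    print("finite difference   :", finite_difference)
```

In this scalar setting the inverse (1‑χ₀K)⁻¹ is a plain division; in a plane‑wave code it is an operator equation that has to be solved iteratively, as the summary describes.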

The implementation leverages Julia’s AD ecosystem (e.g., the forward‑mode package ForwardDiff) to automatically propagate derivatives through primitive operations such as floating‑point arithmetic, BLAS calls, FFTs, and eigenvalue solvers. Custom primitives are defined for the SCF solve, linking the AD system to the DFPT linear‑response solver. Consequently, any new parameter (geometric strain, pseudopotential coefficients, exchange‑correlation functional parameters) can be added without hand‑derived gradient formulas.
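The idea of a custom primitive for the SCF solve can be sketched with a tiny dual-number class. This is a hedged Python illustration, not DFTK's Julia code: the `Dual` class, `scf_solve`, `observable`, and the scalar toy model (H = θ + K·P with a Fermi-Dirac occupation) are all assumptions made for the example. Ordinary arithmetic is differentiated by operator overloading, while the SCF loop runs on plain floats and its tangent is attached afterwards from the linear-response formula.

```python
import math
from dataclasses import dataclass

TEMPERATURE = 0.1  # assumed smearing width of the toy model
COUPLING = 0.5     # assumed constant kernel K = dH/dP


@dataclass(frozen=True)
class Dual:
    """Value plus first-order tangent, the basic object of forward-mode AD."""
    val: float
    tan: float = 0.0

    def __add__(self, other):
        other = lift(other)
        return Dual(self.val + other.val, self.tan + other.tan)

    __radd__ = __add__

    def __mul__(self, other):
        other = lift(other)
        return Dual(self.val * other.val,
                    self.tan * other.val + self.val * other.tan)

    __rmul__ = __mul__


def lift(x):
    """Promote a plain number to a Dual with zero tangent."""
    return x if isinstance(x, Dual) else Dual(float(x))


def fermi(h):
    return 1.0 / (1.0 + math.exp(h / TEMPERATURE))


def scf_solve(theta):
    """Custom forward rule: iterate on floats, then add the response tangent."""
    theta = lift(theta)
    p = 0.5
    for _ in range(200):  # damped fixed-point iteration, not differentiated
        p = 0.5 * p + 0.5 * fermi(theta.val + COUPLING * p)
    f = fermi(theta.val + COUPLING * p)
    chi0 = -f * (1.0 - f) / TEMPERATURE
    # Linear response: dP = (1 - chi0 K)^-1 chi0 dH, with dH = dtheta here.
    dp = chi0 * theta.tan / (1.0 - chi0 * COUPLING)
    return Dual(p, dp)


def observable(theta):
    """Hypothetical post-processing step; AD handles its chain rule."""
    p = scf_solve(theta)
    return theta * p + p * p


if __name__ == "__main__":
    result = observable(Dual(0.2, 1.0))  # seed d(theta) = 1
    print("value:", result.val, "derivative:", result.tan)
```

The design point is that the iteration itself is never differentiated step by step: only the converged fixed point and one linear solve enter the derivative, which mirrors the way the summary describes linking AD to the DFPT solver.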

The authors demonstrate the framework with six illustrative applications:

  1. Elastic constants – By differentiating the stress tensor (first derivative of total energy with respect to strain) and then differentiating again, elastic stiffness tensors are obtained automatically. Compared with finite‑difference (FD) approaches, AD‑DFPT shows superior robustness to SCF convergence tolerances and eliminates the need for manual second‑derivative code.

  2. Inverse band‑gap design – A loss function penalizing deviation from a target band gap is defined. AD‑DFPT supplies gradients with respect to both lattice strain and exchange‑correlation parameters, enabling simultaneous optimization of geometry and functional to achieve the desired gap in silicon and GaAs.

  3. Learning exchange‑correlation parameters – Experimental observables are fitted by adjusting XC functional coefficients. The framework provides exact gradients for Bayesian or gradient‑based optimization, and also propagates parameter uncertainties to quantify model error.

  4. Pseudopotential optimization – By treating pseudopotential model coefficients as tunable parameters, a cost function based on forces and total energies is minimized. AD‑DFPT yields the necessary parameter gradients, producing pseudopotentials that outperform standard libraries in accuracy.

  5. Error propagation to relaxed structures – Uncertainties in plane‑wave cutoff, k‑point density, and smearing are treated as input perturbations. Their effect on relaxed atomic positions is obtained via the chain rule, delivering confidence intervals for optimized geometries.

  6. Force error due to cutoff – The sensitivity of forces to the plane‑wave energy cutoff is quantified automatically, providing guidance for convergence testing without repeated costly calculations.
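The error-propagation idea in item 5 can be illustrated with a one-dimensional toy problem. The Python sketch below rests on assumptions that are not from the paper: a hypothetical energy E(x,θ) = (x−1)²/2 + θ·x⁴/4 with a single coordinate x and a single model parameter θ, hand-written derivatives in place of AD, and first-order (linearized) uncertainty propagation. The sensitivity of the relaxed coordinate follows from differentiating the stationarity condition ∂E/∂x = 0.

```python
def energy_gradient(x, theta):
    """dE/dx for the toy energy E = (x - 1)^2 / 2 + theta * x^4 / 4."""
    return (x - 1.0) + theta * x**3


def energy_hessian(x, theta):
    """d^2E/dx^2 for the same toy energy."""
    return 1.0 + 3.0 * theta * x**2


def relax(theta, tol=1e-14, max_iter=100):
    """Newton iteration for the relaxed coordinate x*(theta) with dE/dx = 0."""
    x = 1.0
    for _ in range(max_iter):
        step = energy_gradient(x, theta) / energy_hessian(x, theta)
        x -= step
        if abs(step) < tol:
            return x
    raise RuntimeError("relaxation did not converge")


def relaxed_sensitivity(theta):
    """dx*/dtheta = -(d^2E/dx dtheta) / (d^2E/dx^2) at the relaxed geometry."""
    x = relax(theta)
    mixed_derivative = x**3  # d^2E/(dx dtheta) for the toy energy
    return -mixed_derivative / energy_hessian(x, theta)


def propagate_std(theta, sigma_theta):
    """First-order estimate of the standard deviation of x* given sigma_theta."""
    return abs(relaxed_sensitivity(theta)) * sigma_theta


if __name__ == "__main__":
    theta, sigma_theta = 0.3, 0.01
    print("relaxed coordinate:", relax(theta))
    print("sensitivity       :", relaxed_sensitivity(theta))
    print("propagated std    :", propagate_std(theta, sigma_theta))
```

The linearization is only trustworthy while the parameter uncertainty is small compared with the scale on which x*(θ) curves; larger uncertainties would call for sampling or higher-order terms.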

Across all examples, AD‑DFPT matches or exceeds the precision of traditional DFPT while dramatically reducing human effort. The authors note that the current implementation uses forward‑mode AD, whose cost grows with the number of independent input perturbations because each perturbation direction requires its own derivative propagation. They anticipate that extending the framework to reverse‑mode AD will enable efficient gradient computation for high‑dimensional parameter spaces typical in machine‑learning‑augmented DFT functionals.
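The scaling of forward mode with the number of inputs can be seen in a few lines. The following Python sketch is an illustration under stated assumptions, not the paper's implementation: the minimal `Dual` class, the `loss` function, and `forward_gradient` are hypothetical names, and the loss is an arbitrary polynomial. Each forward pass carries a single tangent direction, so a full gradient with respect to n parameters takes n passes.

```python
class Dual:
    """Minimal dual number supporting only + and *, enough for this example."""

    def __init__(self, val, tan=0.0):
        self.val = val
        self.tan = tan

    def __add__(self, other):
        return Dual(self.val + other.val, self.tan + other.tan)

    def __mul__(self, other):
        return Dual(self.val * other.val,
                    self.tan * other.val + self.val * other.tan)


def loss(params):
    """Hypothetical scalar output: sum of products of neighbouring parameters."""
    total = Dual(0.0)
    for left, right in zip(params, params[1:]):
        total = total + left * right
    return total


def forward_gradient(func, values):
    """Full gradient via forward mode: one seeded pass per input parameter."""
    gradient = []
    passes = 0
    for i in range(len(values)):
        # Seed direction i with tangent 1 and all other inputs with tangent 0.
        seeded = [Dual(v, 1.0 if j == i else 0.0) for j, v in enumerate(values)]
        gradient.append(func(seeded).tan)
        passes += 1
    return gradient, passes


if __name__ == "__main__":
    grad, n_passes = forward_gradient(loss, [1.0, 2.0, 3.0, 4.0])
    print("gradient:", grad)
    print("forward passes needed:", n_passes)
```

Reverse mode would instead obtain all partial derivatives of a scalar output from a single backward sweep, which is why it is attractive when the parameter count is large.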

In summary, this work delivers a general, automated differentiation infrastructure for plane‑wave DFT, turning derivative information into a first‑class asset for materials modeling. It opens new avenues for inverse materials design, systematic uncertainty quantification, and data‑driven functional development, all while preserving the computational efficiency of established DFPT techniques.

