End-to-End Differentiable Learning of a Single Functional for DFT and Linear-Response TDDFT
Density functional theory (DFT) and linear-response time-dependent density functional theory (LR-TDDFT) rely on an exchange-correlation (xc) approximation that provides not only the energy but also the functional derivatives that enter the self-consistent potential and the response kernel. Here we present an end-to-end differentiable workflow to optimize a single deep-learned energy functional using targets from both Kohn-Sham DFT and adiabatic LR-TDDFT within the Tamm-Dancoff approximation. Implemented in a JAX-based two-component quantum chemistry code (IQC), the learned functional yields a consistent potential and LR kernel via automatic differentiation, enabling gradient-based training through the SCF fixed point and the Casida equation. As a proof of concept in a fixed finite basis (cc-pVDZ), we learn an exchange-correlation functional on the helium spectrum while incorporating one-electron self-interaction cancellation and the Lieb-Oxford inequality as penalty terms, and we assess its transferability to molecular test cases.
💡 Research Summary
This paper introduces a fully differentiable workflow that jointly optimizes a single deep‑learning exchange‑correlation (xc) energy functional for both ground‑state Kohn‑Sham density functional theory (DFT) and linear‑response time‑dependent DFT (LR‑TDDFT) within the Tamm‑Dancoff approximation. Implemented in a JAX‑based two‑component quantum chemistry package called IQC, the approach leverages automatic differentiation to obtain the functional’s first derivative (the xc potential) for the self‑consistent field (SCF) iteration and its second derivative (the adiabatic response kernel) for the Casida eigenvalue problem.
A key technical contribution is the treatment of the SCF convergence as a fixed‑point equation. By defining a mapping Sθ that returns a new Fock matrix from a given one and applying implicit differentiation to the fixed‑point condition g(F,θ)=Sθ(F)−F=0, the authors compute gradients with respect to the functional parameters without unrolling the SCF iterations, keeping memory usage constant regardless of the number of SCF steps. For the LR‑TDDFT part, they differentiate through the Casida equation, regularizing near‑degenerate eigenvalue denominators with Δ/(Δ²+ε) to avoid numerical instability.
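The fixed-point trick can be sketched in plain JAX with `jax.custom_vjp`. The scalar map `S` below is a toy stand-in for the actual Fock-matrix update Sθ (all names and the 100-step Picard/Neumann iterations are illustrative, not the authors' IQC code); the backward rule applies the implicit function theorem to g(F,θ)=Sθ(F)−F=0, so only the converged F* is stored, independent of the number of SCF steps:

```python
import jax
import jax.numpy as jnp

# Toy "Fock update" standing in for S_theta(F); the real mapping builds a
# new Fock matrix from the current one. This stand-in is contractive, so
# plain Picard iteration converges.
def S(theta, F):
    return theta + 0.5 * jnp.cos(F)

@jax.custom_vjp
def fixed_point(theta, F0):
    F = F0
    for _ in range(100):              # Picard iteration for illustration
        F = S(theta, F)
    return F

def fixed_point_fwd(theta, F0):
    F_star = fixed_point(theta, F0)
    # Only (theta, F_star) is saved: memory does not grow with SCF steps.
    return F_star, (theta, F_star)

def fixed_point_bwd(res, v):
    theta, F_star = res
    # Implicit function theorem: with g(F, theta) = S_theta(F) - F = 0,
    # the adjoint w solves w = v + (dS/dF)^T w, i.e. w = (I - dS/dF)^{-T} v.
    _, vjp_F = jax.vjp(lambda F: S(theta, F), F_star)
    w = v
    for _ in range(100):              # Neumann-series solve of the adjoint
        w = v + vjp_F(w)[0]
    # Chain through the explicit theta-dependence of S at the fixed point.
    _, vjp_theta = jax.vjp(lambda t: S(t, F_star), theta)
    return vjp_theta(w)[0], jnp.zeros_like(F_star)

fixed_point.defvjp(fixed_point_fwd, fixed_point_bwd)
```

`jax.grad(lambda t: fixed_point(t, 0.0))` then returns the sensitivity of the converged solution to θ without ever differentiating through the iteration history.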
The functional is represented by a neural network that takes the eigenvalue spectrum of the density matrix (computed in a fixed cc‑pVDZ basis) as input. A shared per‑eigenvalue embedding MLP processes each eigenvalue, the resulting embeddings are mean‑pooled, and a small prediction head outputs a scalar energy correction. This design guarantees rotational invariance but does not enforce strict size‑extensivity, a limitation acknowledged by the authors.
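A minimal sketch of this architecture in plain JAX (hypothetical layer sizes and initialization; the authors' actual IQC network may differ):

```python
import jax
import jax.numpy as jnp

def mlp(params, x):
    # Simple tanh MLP; params is a list of (weight, bias) pairs.
    for W, b in params[:-1]:
        x = jnp.tanh(W @ x + b)
    W, b = params[-1]
    return W @ x + b

def xc_energy(params, eigvals):
    emb_params, head_params = params
    # Shared embedding MLP applied to each density-matrix eigenvalue.
    embed = jax.vmap(lambda e: mlp(emb_params, jnp.array([e])))(eigvals)
    # Mean pooling makes the output invariant to eigenvalue ordering,
    # which (with the eigenvalue input) gives rotational invariance.
    pooled = embed.mean(axis=0)
    # Small prediction head outputs a scalar energy correction.
    return mlp(head_params, pooled)[0]

def init_mlp(key, sizes):
    params = []
    for din, dout in zip(sizes[:-1], sizes[1:]):
        key, sub = jax.random.split(key)
        params.append((jax.random.normal(sub, (dout, din)) * 0.1,
                       jnp.zeros(dout)))
    return params

k1, k2 = jax.random.split(jax.random.PRNGKey(0))
params = (init_mlp(k1, [1, 16, 16]), init_mlp(k2, [16, 16, 1]))
```

Because the pooling is a plain mean over eigenvalues rather than a sum scaled with particle number, the output does not grow correctly when combining non-interacting subsystems, which is the size-extensivity limitation noted above.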
Training is performed on a single atom, helium, using a loss that combines three terms: (1) the mean‑squared error between the calculated and target excitation energies (first singlet S₁ and first triplet T₁) obtained from high‑level EOM‑CCSD, (2) a self‑interaction error (SIE) penalty measured on the helium cation (He⁺), and (3) a Lieb‑Oxford bound penalty that discourages the functional from violating the universal lower bound on the exchange‑correlation energy. Hyperparameters (c₁=10, c₂=1, c₃=10⁻⁴) and the Adam optimizer with a fixed learning rate of 10⁻⁴ enable convergence within ten training iterations, with a smooth monotonic decrease of the total loss.
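The three-term loss can be sketched as follows. The weights c₁, c₂, c₃ are those quoted above, but the exact functional forms of the SIE and Lieb-Oxford penalties are assumptions for illustration (the paper's definitions may differ):

```python
import jax.numpy as jnp

C1, C2, C3 = 10.0, 1.0, 1e-4   # c1, c2, c3 from the paper

def total_loss(omega, omega_ref, sie, delta_lo):
    # (1) MSE between computed and EOM-CCSD target excitation energies
    #     (here omega holds the S1 and T1 excitations).
    excitation = jnp.mean((omega - omega_ref) ** 2)
    # (2) One-electron self-interaction error measured on He+ (assumed
    #     squared-error form).
    sie_pen = sie ** 2
    # (3) Lieb-Oxford penalty: delta_lo >= 0 means the bound holds, so
    #     only violations (delta_lo < 0) are penalized (assumed hinge form).
    lo_pen = jnp.maximum(-delta_lo, 0.0) ** 2
    return C1 * excitation + C2 * sie_pen + C3 * lo_pen
```

A hinge-style penalty for the Lieb-Oxford term keeps the constraint inactive once the bound is satisfied, consistent with Δ_LO remaining positive during training.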
Results show that the learned IXC functional reproduces the target excitation energies of He within 0.01 a.u., comparable to or better than traditional functionals (SVWN, PBE, B3LYP) when used in LR‑TDDFT. The SIE and Δ_LO metrics also improve, with Δ_LO remaining positive throughout training, confirming compliance with the Lieb‑Oxford inequality. To assess transferability, the authors apply the trained functional (without further retraining) to three additional systems—H₂, Li⁺, and H₂O—using the same cc‑pVDZ basis. Across these systems, IXC yields lower mean absolute errors for excitation energies and especially for SIE (below 0.01 a.u.) compared to HF, SVWN, PBE, and B3LYP, demonstrating that a functional trained on a single atom can generalize to other atoms, ions, and small molecules.
In summary, the paper delivers a novel framework that makes the entire DFT + LR‑TDDFT pipeline differentiable, allowing gradient‑based joint optimization of an xc energy functional, its potential, and its response kernel. By embedding exact physical constraints (self‑interaction cancellation and Lieb‑Oxford bound) into the loss, the learned functional respects fundamental theoretical limits while achieving high accuracy on both ground‑state and excited‑state properties. The work opens avenues for future extensions such as incorporating full density‑matrix inputs to achieve size‑extensivity, scaling to larger basis sets and more complex molecules, handling non‑collinear spin, and eventually integrating real‑time TDDFT or many‑body perturbation theory within a similarly differentiable architecture.