Better without U: Impact of Selective Hubbard U Correction on Foundational MLIPs

Notice: This research summary and analysis were automatically generated using AI technology. For absolute accuracy, please refer to the original arXiv source.

The training of foundational machine learning interatomic potentials (fMLIPs) relies on diverse databases with energies and forces calculated using ab initio methods. We show that fMLIPs trained on large datasets such as MPtrj, Alexandria, and OMat24 encode inconsistencies from the Materials Project’s selective use of the Hubbard U correction, which is applied to certain transition metals only if O or F atoms are present in the simulation cell. This inconsistent use of +U creates two incompatible potential-energy surfaces (PES): a lower-energy GGA surface and a higher-energy GGA+U one. When trained on both, MLIPs interpolate between them, leading to systematic underbinding, or even spurious repulsion, between U-corrected metals and oxygen- or fluorine-containing species. Models such as MACE-OMAT and -MPA exhibit repulsion between U-corrected metals and their oxides, limiting their value for studying catalysis and oxidation. We link the severity of this pathology to the oxygen number density in U-corrected training configurations. This explains why OMAT-trained models are most affected and suggests the issue might worsen as expanding future datasets increasingly include configurations with low oxygen content, such as those generated through combinatorial exploration of multi-element or defect-containing systems. Our simple per-U-corrected-atom shift aligns PBE+U and PBE energies for identical structures, yielding a smoother PES compared to existing correction schemes, which target phase diagram accuracy. As a result, models trained on datasets with our shift applied exhibit smaller mean absolute errors for the adsorption energies of oxygen on U-corrected elemental slabs. Since datasets omitting +U entirely (e.g. MatPES, MP-ALOE) avoid these pathologies, we recommend excluding +U in future fMLIP datasets. For existing datasets, our post-hoc correction provides a low-cost improvement.

💡 Research Summary

The paper investigates a hidden but critical inconsistency in the training data used for foundational machine‑learning interatomic potentials (fMLIPs). Large public DFT datasets such as MPtrj, Alexandria, and OMat24 follow the Materials Project convention: they use the PBE exchange‑correlation functional but apply a Hubbard U correction to selected transition metals only when oxygen or fluorine atoms are present in the simulation cell. This “selective +U” policy creates two incompatible potential‑energy surfaces (PES): a lower‑energy GGA (PBE) surface for cells without O/F and a higher‑energy GGA+U surface for cells in which a U‑corrected metal coexists with these anions.
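The selective rule can be illustrated with a short Python sketch. The element set is the one listed in this summary; the function names are hypothetical, and the exact rule used by any given database may differ in detail.

```python
# Elements that receive +U, as listed in this summary (assumption: the exact
# set used by a particular database may differ).
U_CORRECTED = {"V", "Cr", "Mn", "Fe", "Co", "Ni", "Mo", "W"}
TRIGGER_ANIONS = {"O", "F"}


def uses_hubbard_u(elements):
    """Return True if the selective rule would switch +U on for this cell."""
    present = set(elements)
    return bool(present & U_CORRECTED) and bool(present & TRIGGER_ANIONS)


def count_u_corrected_atoms(elements):
    """Count atoms that carry a U correction (0 if the rule is not triggered)."""
    if not uses_hubbard_u(elements):
        return 0
    return sum(1 for el in elements if el in U_CORRECTED)


fe_slab = ["Fe"] * 8
fe_slab_with_o = ["Fe"] * 8 + ["O"]
print(uses_hubbard_u(fe_slab))                  # False: plain PBE
print(uses_hubbard_u(fe_slab_with_o))           # True: PBE+U
print(count_u_corrected_atoms(fe_slab_with_o))  # 8
```

In this toy case, adding one oxygen atom moves all eight Fe atoms from the PBE surface to the PBE+U surface, which is the discontinuity the paper describes.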

When fMLIPs are trained on mixed raw energies from both surfaces, the models must interpolate between them. Because the models have a finite cutoff radius and learn a single continuous PES, they inevitably produce spurious positive energy contributions when an O or F atom approaches a metal that was previously sampled without the +U correction. The result is systematic under‑binding or even outright repulsion between U‑corrected metals (V, Cr, Fe, Co, Ni, Mn, Mo, W) and oxygen/fluorine‑containing species.

The authors demonstrate the pathology through three sets of experiments. First, they compute oxygen adsorption energies on 54 elemental slabs using DFT and compare them to predictions from a suite of state‑of‑the‑art MLIPs (MACE, CHGNet, eSEN). All models trained on raw MP‑type data (including MACE‑OMAT and MACE‑MPA) severely under‑bind oxygen on every U‑corrected metal; MACE‑OMAT even predicts no binding at all, while some models show a non‑physical energy barrier. In contrast, models trained on the MatPES dataset, which omits +U entirely, reproduce the DFT reference accurately.
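For readers unfamiliar with the benchmark quantity, the sketch below shows one common way to define an oxygen adsorption energy and the mean absolute error between model and reference values. The half‑O2 reference convention and all numbers are assumptions for illustration; the summary does not state which convention the paper uses.

```python
def adsorption_energy(e_slab_with_o, e_slab, e_o2):
    """Adsorption energy of one O atom, referenced to half an O2 molecule.

    Assumed convention: negative values mean oxygen binds to the slab.
    """
    return e_slab_with_o - e_slab - 0.5 * e_o2


def mean_absolute_error(predicted, reference):
    """Mean absolute error between two equally long sequences of energies."""
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    return sum(abs(p - r) for p, r in zip(predicted, reference)) / len(reference)


# Made-up energies in eV, not taken from the paper.
e_ads = adsorption_energy(e_slab_with_o=-105.0, e_slab=-100.0, e_o2=-8.0)
print(e_ads)  # -1.0

# An under-binding model predicts adsorption energies that are too positive.
print(mean_absolute_error([-1.0, 0.5], [-2.0, -1.5]))  # 1.5
```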

Second, they evaluate metal‑oxide adhesion energies for several metal/oxide interfaces. Models trained on mixed data predict unstable interfaces (positive adhesion energies) or large repulsive forces, causing the metal and oxide slabs to separate during relaxation. Again, the MatPES‑trained model yields physically sensible adhesion energies.

Third, they examine fluorine adsorption on Fe/FeO slabs, confirming that the same under‑binding appears when fluorine approaches a pure Fe surface but not when it interacts with FeO, which is already represented on the GGA+U surface.

The paper also reviews existing correction schemes designed for phase‑diagram accuracy, such as the Wang et al. and Jain et al. approaches. These schemes add constant per‑element or per‑anion shifts to PBE+U energies, fitted to experimental formation enthalpies. While they reduce formation‑energy errors, they do not produce a smooth PES because the shifts are discontinuous and do not eliminate the fundamental energy gap between PBE and PBE+U configurations. Consequently, MLIPs trained on data corrected with these schemes still exhibit noticeable under‑binding, albeit less severe.

To address the problem directly, the authors propose a simple “per‑U‑atom shift.” For every structure that appears both in a PBE‑only dataset (MatPES‑PBE) and in a PBE+U dataset (MP‑PBE+U), they compute the energy difference and fit a constant offset for each U‑corrected element that minimizes the mean‑squared difference across all matched structures. This shift aligns the two PES on a per‑atom basis, effectively removing the discontinuity.
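A minimal sketch of such a fit, assuming the offsets enter linearly through the number of U‑corrected atoms of each element in a structure, can be written as an ordinary least‑squares problem. The function names, the use of total (not per‑atom) energies, and the synthetic numbers are assumptions for illustration, not the authors' code or data.

```python
import numpy as np


def fit_per_u_atom_shifts(counts, e_pbe_u, e_pbe):
    """Least-squares fit of one constant offset per U-corrected element.

    counts[s, k] is the number of atoms of U-corrected element k in matched
    structure s; e_pbe_u and e_pbe are total energies of the same structures.
    Model: e_pbe_u[s] - e_pbe[s] ~= sum_k counts[s, k] * shift[k].
    """
    counts = np.asarray(counts, dtype=float)
    diff = np.asarray(e_pbe_u, dtype=float) - np.asarray(e_pbe, dtype=float)
    shifts, *_ = np.linalg.lstsq(counts, diff, rcond=None)
    return shifts


def apply_shifts(counts, e_pbe_u, shifts):
    """Subtract the fitted offsets so PBE+U energies line up with PBE."""
    counts = np.asarray(counts, dtype=float)
    return np.asarray(e_pbe_u, dtype=float) - counts @ shifts


# Synthetic example with two U-corrected elements (columns: Fe, Ni).
counts = np.array([[2, 0], [0, 4], [1, 1], [3, 2]])
true_shifts = np.array([1.5, 2.0])                # made-up offsets in eV/atom
e_pbe = np.array([-10.0, -20.0, -15.0, -30.0])    # made-up PBE energies in eV
e_pbe_u = e_pbe + counts @ true_shifts

shifts = fit_per_u_atom_shifts(counts, e_pbe_u, e_pbe)
aligned = apply_shifts(counts, e_pbe_u, shifts)
print(np.allclose(shifts, true_shifts))  # True
print(np.allclose(aligned, e_pbe))       # True
```

Because the synthetic differences are exactly linear in the atom counts, the fit recovers the offsets exactly; with real matched structures there would be a residual.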

Training the same MACE architecture on the MP dataset after applying the per‑U‑atom shift yields markedly improved performance: the mean absolute error (MAE) for oxygen adsorption on U‑corrected slabs drops by ~30 % compared with models using the Wang et al. correction, and the predicted adhesion energies become physically reasonable. The authors also link the severity of the pathology to the oxygen number density in the U‑corrected training configurations. This explains why OMat24‑trained models are the most affected, and it suggests the problem could worsen as future datasets add more U‑corrected configurations with low oxygen content, such as those generated by combinatorial exploration of multi‑element or defect‑containing systems.
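The oxygen number density mentioned here is the number of oxygen atoms per unit cell volume. A small sketch, with a hypothetical function name and made‑up cells, shows how it could be evaluated for a training configuration.

```python
def oxygen_number_density(elements, volume):
    """Number of O atoms per unit volume (e.g. per cubic angstrom)."""
    if volume <= 0:
        raise ValueError("volume must be positive")
    n_oxygen = sum(1 for el in elements if el == "O")
    return n_oxygen / volume


# Made-up cells with the same volume (in cubic angstroms).
dilute = ["Fe"] * 8 + ["O"]       # one O atom on an Fe slab
oxide = ["Fe"] * 4 + ["O"] * 6    # Fe2O3-like stoichiometry

print(oxygen_number_density(dilute, 100.0))  # 0.01
print(oxygen_number_density(oxide, 100.0))   # 0.06
```

Both cells would be computed with +U under the selective rule, but the first has a much lower oxygen number density than the second.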

Finally, the authors issue two practical recommendations. (1) Future fMLIP datasets should avoid the Hubbard U correction altogether, as demonstrated by the clean performance of MatPES‑based models. (2) For existing large datasets that already contain mixed PBE/PBE+U data, the per‑U‑atom shift provides a low‑cost post‑hoc fix that restores a smooth PES and improves downstream predictions for catalysis, corrosion, and other applications involving metal‑oxygen or metal‑fluorine interactions.

Overall, the work highlights an under‑appreciated source of error in modern MLIP training pipelines, provides a rigorous quantitative analysis of its impact, and offers a straightforward, physically motivated remedy that can be readily adopted by the community.
