Active Δ-learning with universal potentials for global structure optimization

Notice: This research summary and analysis were automatically generated using AI. For authoritative details, please refer to the original arXiv source.

Universal machine learning interatomic potentials (uMLIPs) have recently been formulated and shown to generalize well. When applied out of sample, however, further data collection may be required to improve the uMLIPs. In this work we demonstrate that, whenever the envisaged use of the uMLIPs is global optimization, data acquisition can follow an active learning scheme in which a gradually updated uMLIP directs the search for new structures, which are subsequently evaluated at the density functional theory (DFT) level. In the scheme, we augment foundation models with a Δ-model trained on this new data, built from local SOAP descriptors, Gaussian kernels, and a sparse Gaussian process regression model. We compare the efficacy of the approach with different global optimization algorithms: Random Structure Search, Basin Hopping, a Bayesian approach with competitive candidates (GOFEE), and a replica exchange formulation (REX). We further compare several foundation models: CHGNet, MACE-MP0, and MACE-MPA. The test systems are silver-sulfur clusters and sulfur-induced surface reconstructions on Ag(111) and Ag(100). Judged by the fidelity of identifying global minima, active learning with GPR-based Δ-models appears to be a robust approach. Judged by the total CPU time spent, the REX approach stands out as the most efficient.


💡 Research Summary

This paper, “Active Δ-learning with universal potentials for global structure optimization,” presents and rigorously evaluates an integrated active learning framework designed to enhance universal machine learning interatomic potentials (uMLIPs) for specific chemical systems while simultaneously using the improved potential to drive global structure optimization searches.

The core challenge addressed is that while uMLIPs (like CHGNet and MACE models) offer broad generalization across materials, their accuracy for precise tasks like identifying the global minimum (GM) on a specific system’s potential energy surface (PES) may be insufficient. Traditional fine-tuning requires substantial data and computational effort and risks catastrophic forgetting.

The proposed solution synergistically combines two key components:

  1. Δ-model Correction: The total energy prediction is corrected by adding a Δ-term to the uMLIP output (E_corrected = E_uMLIP + E_Δ). The E_Δ is a sparse Gaussian Process Regression (GPR) model trained on a small, system-specific dataset of DFT-calculated energies. It uses Smooth Overlap of Atomic Positions (SOAP) descriptors and a Gaussian kernel, making it agnostic to the uMLIP’s internal architecture. This approach allows for fast, lightweight retraining and mitigates interference with the uMLIP’s broad knowledge.
  2. Active Learning Loop: The Δ-corrected uMLIP is employed within a global optimization algorithm to explore new candidate structures. These candidates are then evaluated with DFT, and the resulting (structure, DFT energy) pairs are added to the training pool for the Δ-model. This creates a feedback loop where improved model accuracy leads to more efficient exploration, which in turn generates better data for further model refinement.

The methodology section details the uMLIPs tested (CHGNet, MACE-MP0, MACE-MPA) and the four global optimization algorithms implemented within the AGOX framework: Random Structure Search (RSS), Basin Hopping (BH), a Bayesian optimizer (GOFEE), and a Replica Exchange method (REX). The test systems are silver-sulfur clusters and sulfur-induced surface reconstructions on the Ag(111) and Ag(100) surfaces.
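The active learning loop from step 2 can be sketched in miniature. Here a 1D toy "PES", a random-structure-search step, and a nearest-neighbour residual model stand in for the real DFT/uMLIP pair, the AGOX optimizers, and the sparse GPR; all names (`toy_dft`, `toy_umlip`, the quadratic residual) are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(1)

def toy_umlip(x):  # cheap surrogate energy (stand-in for the uMLIP)
    return np.cos(3 * x)

def toy_dft(x):    # "ground truth" (stand-in for DFT single points)
    return np.cos(3 * x) + 0.5 * (x - 0.4) ** 2

X, y_resid = [], []   # training pool of (structure, DFT - uMLIP residual)

def delta(xq):
    # 1-nearest-neighbour residual model, a lightweight stand-in for the GPR.
    if not X:
        return 0.0
    i = int(np.argmin(np.abs(np.array(X) - xq)))
    return y_resid[i]

best_x, best_e = None, np.inf
for _ in range(30):  # active-learning iterations
    # Search step: random structure search on the Δ-corrected surrogate PES.
    cands = rng.uniform(-1.5, 1.5, size=64)
    scores = np.array([toy_umlip(c) + delta(c) for c in cands])
    x_new = cands[int(np.argmin(scores))]
    # "DFT" evaluation of the proposed candidate, then augment the pool.
    e_new = toy_dft(x_new)
    X.append(x_new)
    y_resid.append(e_new - toy_umlip(x_new))
    if e_new < best_e:
        best_x, best_e = x_new, e_new
```

Each iteration spends exactly one "DFT" evaluation on the candidate the corrected surrogate deems best, mirroring the feedback loop described above: better corrections steer the search, and each search result improves the corrections.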

