Hypothesis Learning in Automated Experiment: Application to Combinatorial Materials Libraries

Machine learning is rapidly becoming an integral part of experimental physical discovery via automated and high-throughput synthesis, and active experiments in scattering and electron/probe microscopy. This, in turn, necessitates the development of active learning methods capable of exploring relevant parameter spaces with the smallest number of steps. Here we introduce an active learning approach based on co-navigation of the hypothesis and experimental spaces. This is realized by combining the structured Gaussian Processes containing probabilistic models of the possible system’s behaviors (hypotheses) with reinforcement learning policy refinement (discovery). This approach closely resembles classical human-driven physical discovery, when several alternative hypotheses realized via models with adjustable parameters are tested during an experiment. We demonstrate this approach for exploring concentration-induced phase transitions in combinatorial libraries of Sm-doped BiFeO3 using Piezoresponse Force Microscopy, but it is straightforward to extend it to higher-dimensional parameter spaces and more complex physical problems once the experimental workflow and hypothesis-generation are available.

💡 Research Summary

This paper addresses the growing need for efficient active‑learning strategies in automated, high‑throughput physical discovery. Traditional approaches such as Bayesian optimization or pure reinforcement‑learning (RL) focus on directly optimizing experimental parameters, but they do not explicitly consider the underlying scientific hypotheses that guide human experimentation. To bridge this gap, the authors propose a “co‑navigation” framework that simultaneously explores hypothesis space and experimental space.

In the hypothesis space, possible system behaviors are encoded as a set of structured Gaussian Process (GP) models. Each GP represents a candidate hypothesis with its own kernel and hyper‑parameters, providing a probabilistic mapping from input variables (e.g., composition, temperature) to observable properties (e.g., polarization, conductivity) together with an uncertainty estimate. In the experimental space, an RL agent selects the next experimental condition. The agent’s reward function combines information‑gain (reducing GP uncertainty) and proximity to a target property, encouraging actions that are both informative and goal‑directed. After each measurement, the GP posterior is updated, the uncertainty landscape changes, and the RL policy is refined accordingly. This feedback loop mimics the human scientific cycle of hypothesis formulation, testing, and revision, but it is fully automated.

The methodology is demonstrated on a combinatorial library of Sm‑doped BiFeO₃ (BFO) thin films. Sm concentration induces a concentration‑driven ferroelectric phase transition, which can be probed locally with Piezoresponse Force Microscopy (PFM). The authors first acquire a sparse set of random PFM measurements (≈20 points) across the composition gradient. The GP ensemble uses these data to predict the full concentration‑phase map and to quantify where the model is most uncertain—typically near the phase boundary. The RL policy then selects new measurement points preferentially in these high‑uncertainty regions. After a total of only 45 measurements (out of an estimated 2,500 possible points), the reconstructed phase diagram matches the ground‑truth within 5 % error, representing a >98 % reduction in experimental effort compared with exhaustive scanning.

Key strengths of the approach include: (1) explicit hypothesis modeling preserves physical interpretability; (2) uncertainty‑driven sampling dramatically reduces the number of required experiments; (3) the RL policy can incorporate practical constraints such as instrument time, sample damage risk, or safety limits. Limitations are also discussed. Gaussian Processes scale poorly with data size (O(N³) complexity), which can become a bottleneck in high‑dimensional spaces; the authors suggest sparse GP, variational inference, or deep‑kernel learning as possible remedies. Moreover, the hypothesis set must be defined a priori, so unexpected phenomena may fall outside the considered models. Future work is outlined to integrate automatic hypothesis generation (e.g., symbolic regression, neural architecture search) and meta‑learning to dynamically expand the hypothesis space.

In conclusion, the co‑navigation framework successfully automates a human‑like discovery cycle, enabling rapid identification of critical material behavior with minimal experimental cost. While demonstrated on a compositional phase‑transition study, the method is readily extensible to other high‑throughput platforms such as electron microscopy, X‑ray diffraction, or optical spectroscopy, and to higher‑dimensional parameter spaces where traditional exhaustive approaches are infeasible.