SAVME: Efficient Safety Validation for Autonomous Systems Using Meta-Learning

Discovering potential failures of an autonomous system is important prior to deployment. Falsification-based methods are often used to assess the safety of such systems, but the cost of running many accurate simulations can be high. Validation can be accelerated by identifying critical failure scenarios for the system under test and by reducing simulation runtime. We propose a Bayesian approach that integrates meta-learning strategies with a multi-armed bandit framework. Our method learns distributions over scenario parameters that are prone to triggering failures in the system under test, as well as a distribution over fidelity settings that enables fast and accurate simulations. In the spirit of meta-learning, we also assess whether the learned fidelity settings distribution facilitates faster learning of the scenario parameter distributions for new scenarios. We showcase our methodology using a cutting-edge 3D driving simulator, incorporating 16 fidelity settings for an autonomous vehicle stack that includes camera and lidar sensors. We evaluate various scenarios based on an autonomous vehicle pre-crash typology. Our approach achieves a significant speedup, up to 18 times faster than traditional methods that rely solely on a high-fidelity simulator.


💡 Research Summary

The paper introduces SAVME (Safety Validation using Meta‑Learning), a novel framework that dramatically reduces the computational burden of falsification‑based safety validation for autonomous systems while preserving the thoroughness of high‑fidelity simulation. Traditional falsification approaches explore a high‑dimensional scenario space by repeatedly running expensive, high‑fidelity simulations (e.g., full‑resolution camera and lidar models, detailed vehicle dynamics). Because each simulation can take several seconds to minutes, exhaustive search quickly becomes infeasible for real‑world autonomous vehicle (AV) stacks.
SAVME tackles this problem on two fronts. First, it casts the search for failure‑inducing scenario parameters as a Bayesian multi‑armed bandit (MAB) problem. Each “arm” corresponds to a region of the scenario parameter space (initial speed, distance to obstacle, weather, traffic density, etc.). Pulling an arm triggers a simulation; the reward is binary, indicating whether the safety specification was violated. A Bayesian posterior over arm success probabilities is updated after each trial, guiding the algorithm toward the most promising regions and away from low‑risk areas. This adaptive sampling dramatically cuts the number of required high‑fidelity runs.
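The scenario-parameter bandit described above can be sketched with Beta-Bernoulli posteriors and Thompson sampling. The class below is an illustrative assumption, not the paper's exact implementation: it discretizes the scenario space into arms and treats the binary "specification violated" signal as the reward.

```python
import random

class BernoulliBandit:
    """Thompson sampling over discrete scenario arms with
    Beta-Bernoulli posteriors (sketch; arm discretization and
    reward wiring are hypothetical, not SAVME's actual code)."""

    def __init__(self, n_arms, prior_alpha=1.0, prior_beta=1.0):
        # Beta(alpha, beta) posterior per arm; (1, 1) is a uniform prior.
        self.alpha = [prior_alpha] * n_arms
        self.beta = [prior_beta] * n_arms

    def select_arm(self):
        # Sample a failure probability from each arm's posterior and
        # pick the arm with the highest sampled value.
        samples = [random.betavariate(a, b)
                   for a, b in zip(self.alpha, self.beta)]
        return max(range(len(samples)), key=samples.__getitem__)

    def update(self, arm, failure_observed):
        # Binary reward: 1 if the safety specification was violated.
        if failure_observed:
            self.alpha[arm] += 1
        else:
            self.beta[arm] += 1
```

Each call to `select_arm` draws one sample per posterior and exploits the arm whose sampled failure probability is highest, which is what shifts simulation budget toward failure-prone regions while still exploring uncertain ones.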
Second, SAVME incorporates meta‑learning to select the simulation fidelity level. The authors define 16 fidelity settings that trade off accuracy (sensor resolution, lidar point density, physics time‑step) against runtime. During an offline meta‑training phase, the system observes how quickly different fidelity‑parameter combinations converge to a reliable estimate of failure probability across a diverse set of training scenarios. The resulting prior distribution over fidelity arms captures which settings are most “informative” for new, unseen scenarios. When a new validation task begins, the algorithm simultaneously runs two coupled bandits: one over scenario parameters and one over fidelity levels. Low‑fidelity simulations are used to quickly prune large swaths of the scenario space, while high‑fidelity runs are reserved for the most suspicious candidates identified by the low‑fidelity bandit.
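One way the meta-learned fidelity prior could warm-start a new validation task is by converting per-fidelity outcomes observed during meta-training into Beta pseudo-counts for the fidelity bandit. The interface below is a hypothetical sketch under that assumption; the paper's actual meta-training procedure may differ.

```python
import random

def meta_train_fidelity_prior(results, strength=10.0):
    """Convert meta-training outcomes into Beta pseudo-counts.

    `results` maps a fidelity-setting name to a list of binary
    outcomes (1 = the cheap run was as informative as high fidelity).
    Hypothetical interface, not the paper's exact procedure."""
    prior = {}
    for fidelity, outcomes in results.items():
        rate = sum(outcomes) / len(outcomes)
        # Scale the empirical rate into pseudo-counts with total
        # extra mass `strength`, on top of a uniform Beta(1, 1).
        prior[fidelity] = (1.0 + strength * rate,
                           1.0 + strength * (1.0 - rate))
    return prior

def select_fidelity(prior):
    # Thompson sampling over fidelity settings: sample from each
    # meta-learned Beta and pick the highest draw.
    sampled = {f: random.betavariate(a, b) for f, (a, b) in prior.items()}
    return max(sampled, key=sampled.get)
```

Starting the fidelity bandit from these pseudo-counts, rather than a uniform prior, is what lets knowledge about informative fidelity settings transfer to unseen scenarios.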
The authors evaluate SAVME on a state‑of‑the‑art 3D driving simulator (CARLA) equipped with a full AV stack that processes camera and lidar data and performs perception, planning, and control. They construct a benchmark suite based on a pre‑crash typology (e.g., lane‑change collisions, sudden braking, pedestrian crossing) and compare four methods: (1) pure high‑fidelity falsification, (2) random sampling, (3) a single‑bandit approach that optimizes only scenario parameters, and (4) SAVME's dual‑bandit meta‑learning approach. Across all scenarios, SAVME achieves an average speed‑up of 12× and a peak speed‑up of 18× relative to the high‑fidelity falsification baseline, while discovering failure cases of comparable severity and diversity. Notably, the meta‑learned fidelity prior enables faster convergence on new scenario families, confirming that the fidelity‑selection knowledge transfers across tasks.
Beyond empirical results, the paper contributes several conceptual advances. It demonstrates that Bayesian MABs can efficiently navigate high‑dimensional, safety‑critical scenario spaces, and that meta‑learning can be harnessed to model the trade‑off between simulation speed and fidelity. The authors propose a new evaluation metric—simulation‑cost‑per‑validated‑failure—that captures both computational efficiency and validation thoroughness. They also outline future directions: theoretical analysis of convergence guarantees for the coupled bandits, extension to online, real‑time validation pipelines, and scaling to multi‑agent or multi‑system environments (e.g., coordinated drone fleets or smart‑city infrastructures).
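The proposed simulation-cost-per-validated-failure metric is straightforward to compute. A minimal sketch, assuming cost is measured as per-run simulation time and that the paper's exact accounting may differ:

```python
def cost_per_validated_failure(run_costs, validated_failures):
    """Total simulation cost divided by the number of confirmed
    failure cases found; lower is better. Returns infinity when no
    failure was validated (illustrative convention, not the paper's)."""
    if validated_failures == 0:
        return float("inf")
    return sum(run_costs) / validated_failures
```

A metric of this shape rewards a validator both for spending less compute and for surfacing more confirmed failures, which is exactly the trade-off SAVME's coupled bandits optimize.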
In summary, SAVME offers a compelling solution to one of the most pressing bottlenecks in autonomous system verification: the need to run thousands of high‑fidelity simulations. By learning where to look (scenario parameters) and how precisely to look (fidelity settings) in a meta‑learning framework, it achieves order‑of‑magnitude reductions in computational cost without sacrificing safety assurance. This work is likely to influence both academic research on safe AI and industry practices for pre‑deployment validation of autonomous vehicles and other complex cyber‑physical systems.