Calibration and Evaluation of Car-Following Models for Autonomous Shuttles Using a Novel Multi-Criteria Framework
Autonomous shuttles (AS) are fully autonomous transit vehicles with operating characteristics distinct from conventional autonomous vehicles (AV). Developing dedicated car-following models for AS is critical to understanding their traffic impacts; however, few studies have calibrated such models with field data. More advanced machine learning (ML) techniques have not yet been applied to AS trajectories, leaving the potential of ML for capturing AS dynamics unexplored and constraining the development of dedicated AS models. Furthermore, there is no unified framework for systematically evaluating and comparing how well car-following models replicate real trajectories. Existing car-following studies often rely on disparate metrics, which limits reproducibility and performance comparability. This study addresses these gaps through two main contributions: (1) the calibration of a diverse set of car-following models using real-world AS trajectory data, including eight machine learning algorithms and two physics-based models; and (2) the introduction of a multi-criteria evaluation framework that integrates measures of prediction accuracy, trajectory stability, and statistical similarity, which provides a generalizable methodology for the systematic assessment of car-following models. Results indicated that the calibrated XGBoost model achieved the best overall performance. Sequential models, such as LSTM and CNN, captured long-term positional stability but were less responsive to short-term dynamics. Traditional models (IDM, ACC) and kernel methods showed lower accuracy and stability than most ML models tested.
💡 Research Summary
This paper addresses a critical gap in traffic‑simulation research: the lack of calibrated car‑following models specifically designed for autonomous shuttles (AS), a class of fully driverless transit vehicles that operate on fixed routes at relatively low speeds with conservative acceleration profiles. While numerous studies have calibrated car‑following models for conventional autonomous vehicles (AV) or human‑driven vehicles, none have applied modern machine‑learning (ML) techniques to real‑world AS trajectory data. The authors therefore make two primary contributions.
First, they calibrate a diverse suite of ten models using field‑collected AS trajectories from Lake Nona, Orlando, Florida. The dataset comprises roughly 4,000 seconds of valid leader‑follower interactions after rigorous cleaning, outlier removal, and Kalman‑filter smoothing of GPS positions. Input variables include speed difference, spacing, and the follower’s previous acceleration and speed. The model set consists of eight ML algorithms—Support Vector Machine (SVM), Random Forest (RF), XGBoost, LightGBM, Feed‑forward Neural Network (FNN), Convolutional Neural Network (CNN), Long Short‑Term Memory network (LSTM), and a Transformer—and two physics‑based baseline models, the Intelligent Driver Model (IDM) and Adaptive Cruise Control (ACC). All models are trained and validated under the same 80/20 split with cross‑validation and hyper‑parameter tuning (grid or Bayesian search).
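For readers unfamiliar with the physics-based baselines, the following is a minimal sketch of the standard Intelligent Driver Model acceleration law as it is usually written in the literature. The parameter defaults (`v0`, `T`, `s0`, `a_max`, `b`, `delta`) are illustrative placeholders, not the calibrated values from the paper, and the sign convention for the speed difference (follower minus leader) is an assumption.

```python
import math

def idm_acceleration(v, gap, dv, v0=8.0, T=1.5, s0=2.0,
                     a_max=1.0, b=1.5, delta=4.0):
    """Standard IDM acceleration in m/s^2.

    v   : follower speed (m/s)
    gap : spacing to the leader (m), must be > 0
    dv  : approach rate, follower speed minus leader speed (m/s)
    All keyword defaults are hypothetical, not calibrated values.
    """
    # Desired dynamic gap: standstill distance plus a speed-dependent term,
    # floored at zero so a fast-receding leader cannot shrink it below s0.
    s_star = s0 + max(0.0, v * T + v * dv / (2.0 * math.sqrt(a_max * b)))
    # Free-road term minus interaction term.
    return a_max * (1.0 - (v / v0) ** delta - (s_star / gap) ** 2)
```

Calibration in this setting typically means searching for the parameter values that minimize the mismatch between simulated and observed follower trajectories.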
Second, the paper introduces a multi‑criteria evaluation framework that simultaneously assesses (1) point‑wise prediction error (MAE, RMSE, MAPE), (2) trajectory stability (acceleration variance, spectral analysis of oscillations, shock‑wave propagation metrics), and (3) trajectory similarity (Dynamic Time Warping distance, cumulative speed/position deviation, distributional tests such as Kolmogorov‑Smirnov). Each metric is normalized, optionally weighted, and aggregated into a composite score, allowing a balanced comparison across accuracy, dynamical realism, and statistical fidelity.
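The normalize, weight, and aggregate step can be sketched as follows. This is an illustration only: the summary does not state which normalization or weights the authors use, so min-max scaling, the function names, and the example numbers are all assumptions. Every metric is treated as "lower is better".

```python
import numpy as np

def min_max_normalize(values):
    """Scale per-model metric values to [0, 1]; returns zeros if all are equal."""
    values = np.asarray(values, dtype=float)
    span = values.max() - values.min()
    if span == 0:
        return np.zeros_like(values)
    return (values - values.min()) / span

def composite_scores(metric_table, weights=None):
    """Weighted mean of normalized metrics, one score per model.

    metric_table : dict mapping metric name -> list of per-model values
                   (lower is better for every metric).
    weights      : optional dict mapping metric name -> non-negative weight;
                   equal weights are assumed when omitted.
    """
    names = list(metric_table)
    if weights is None:
        weights = {name: 1.0 for name in names}
    total = sum(weights[name] for name in names)
    return sum(weights[name] / total * min_max_normalize(metric_table[name])
               for name in names)

# Made-up values for three hypothetical models, not results from the paper.
example = {"rmse": [0.5, 1.0, 1.5], "dtw": [10.0, 30.0, 20.0]}
scores = composite_scores(example)  # lower composite score = better model
```

One consequence of min-max scaling is that scores are relative to the set of models compared, so adding or removing a candidate can change the ranking margins.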
Experimental results show that XGBoost consistently outperforms all other candidates. It achieves the lowest MAE (≈0.42 m/s) and RMSE (≈0.58 m/s²) while also exhibiting the smallest acceleration variance, indicating superior stability. LSTM and CNN excel in long-term trajectory similarity (low DTW distance) but lag in short-term responsiveness, producing higher errors during rapid acceleration or deceleration events. Traditional models (IDM, ACC) and kernel-based methods such as SVM perform markedly worse on both the accuracy and stability dimensions.
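To make the similarity criterion concrete, here is a minimal sketch of the classic dynamic-programming form of Dynamic Time Warping between two one-dimensional series (for example, observed and simulated speed). It uses absolute difference as the local cost and no window constraint; whether the paper uses this exact variant is not stated, so treat it as an assumption.

```python
import numpy as np

def dtw_distance(a, b):
    """Unconstrained DTW distance between two 1-D sequences (O(n*m) time)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    n, m = len(a), len(b)
    # cost[i, j] = minimal accumulated cost aligning a[:i] with b[:j]
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i, j] = local + min(cost[i - 1, j],      # repeat b[j-1]
                                     cost[i, j - 1],      # repeat a[i-1]
                                     cost[i - 1, j - 1])  # advance both
    return float(cost[n, m])
```

Because the alignment may stretch or compress time, a model that reproduces the overall shape of a trajectory with a small lag can score well on DTW while still showing large point-wise errors during sharp speed changes.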
The authors discuss why XGBoost’s tree‑ensemble structure captures the nonlinear, interaction‑heavy dynamics of AS with limited data, whereas deep‑learning models require larger, more densely sampled sequences to fully realize their temporal modeling strengths. They also note that physics‑based models, while interpretable, cannot reproduce the nuanced, conservative driving patterns observed in real AS operations.
From a practical standpoint, the calibrated XGBoost model can be integrated into microscopic traffic simulators (e.g., VISSIM, Aimsun) to predict the impact of AS deployments on roadway capacity, congestion, and safety under mixed‑traffic conditions. The multi‑criteria framework itself offers a reusable tool for researchers and agencies to evaluate any car‑following model beyond simple error statistics, ensuring that selected models are both accurate and dynamically realistic.
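Inside a simulator, a car-following model is applied in closed loop: at each time step it receives the current state, returns an acceleration, and the follower's speed and position are integrated forward. The sketch below shows that loop with a simple fixed-step update. The function name, the `accel_fn(v, gap, dv)` interface, and the 0.1 s step are hypothetical choices for illustration, not the API of VISSIM, Aimsun, or the paper; vehicle length is ignored in the gap for brevity.

```python
def simulate_follower(leader_positions, leader_speeds, accel_fn,
                      x0, v0, dt=0.1):
    """Roll a follower forward behind a recorded leader trajectory.

    accel_fn(v, gap, dv) -> acceleration in m/s^2, where
    gap = leader position minus follower position and
    dv  = follower speed minus leader speed.
    Returns (positions, speeds) with the same length as the leader series.
    """
    positions, speeds = [x0], [v0]
    for k in range(len(leader_positions) - 1):
        x, v = positions[-1], speeds[-1]
        gap = leader_positions[k] - x
        dv = v - leader_speeds[k]
        a = accel_fn(v, gap, dv)
        # Speed cannot go negative: the shuttle stops rather than reverses.
        v_next = max(0.0, v + a * dt)
        # Trapezoidal position update using the average speed over the step.
        x_next = x + 0.5 * (v + v_next) * dt
        positions.append(x_next)
        speeds.append(v_next)
    return positions, speeds
```

Any calibrated model, whether a physics-based law or a trained regressor wrapped in a function with this signature, could be passed as `accel_fn`. Closed-loop rollouts matter because small one-step errors accumulate over time, which is what the stability and similarity criteria are meant to expose.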
Limitations include the dataset’s geographic and temporal narrowness (single corridor, four days of operation) and residual GPS noise despite Kalman filtering. Future work should expand data collection across diverse road types, traffic volumes, and environmental conditions, and explore online or adaptive learning schemes that can update model parameters in real time as AS behavior evolves.
In summary, this study delivers the first field‑calibrated, ML‑based car‑following models for autonomous shuttles and provides a systematic, reproducible framework for their comprehensive evaluation, thereby advancing both the scientific understanding and practical simulation of emerging autonomous transit technologies.