MPPI-Generic: A CUDA Library for Stochastic Trajectory Optimization

Notice: This research summary and analysis were automatically generated using AI. For authoritative details, please refer to the original arXiv paper.

This paper introduces MPPI-Generic, a new C++/CUDA library for GPU-accelerated stochastic optimization. It provides implementations of Model Predictive Path Integral (MPPI) control, Tube-MPPI, and Robust MPPI, and allows these algorithms to be used with many pre-existing dynamics models and cost functions. Furthermore, researchers can create their own dynamics models or cost functions following our API definitions without needing to change the core MPPI code. Finally, we compare computational performance against other popular MPPI implementations across a variety of GPUs to demonstrate the real-time capabilities our library enables. Library code can be found at: https://acdslab.github.io/mppi-generic-website/.


💡 Research Summary

The paper presents MPPI‑Generic, a new open‑source C++/CUDA library that enables real‑time stochastic trajectory optimization on NVIDIA GPUs. The library implements three major sampling‑based Model Predictive Control (MPC) algorithms—Model Predictive Path Integral (MPPI), Tube‑MPPI, and Robust MPPI (RMPPI)—and provides a modular API that allows users to plug in arbitrary dynamics models, cost functions, sampling distributions, and feedback controllers without modifying the core optimization code.

The authors begin by reviewing the landscape of MPC methods, contrasting gradient‑based approaches (iLQR, DDP, SQP) that require smooth dynamics and cost functions with sampling‑based methods (MPPI, Cross‑Entropy Method) that can handle non‑differentiable models at the expense of high computational load. They argue that the massive parallelism of modern GPUs makes it feasible to generate thousands of control samples per control cycle, thereby bringing sampling‑based MPC into the real‑time regime.

A concise information‑theoretic derivation of MPPI is provided, starting from the free‑energy formulation and using importance sampling to relate the optimal control distribution Q* to a parametrized proposal distribution Sθ. The resulting update rule (equations 13‑14) expresses the optimal control as a weighted average of sampled trajectories, where the weights are exponential functions of the trajectory cost and an optional importance‑sampling correction term I(V). The library implements this rule in CUDA kernels that simultaneously propagate each sampled control sequence through the user‑provided dynamics, evaluate the running and terminal costs, and compute the exponential weights. A numerical stabilization trick—subtracting the minimum sampled cost ρ before exponentiation—is employed to avoid overflow.

The software architecture is organized into six high-level class types:

- Dynamics and CostFunction encapsulate the system transition function F, observation function G, running cost ℓ, and terminal cost ϕ.
- Controller implements the Monte-Carlo optimization loop (Algorithm 1) and can be instantiated as MPPI, Tube-MPPI, or RMPPI.
- SamplingDistribution supplies control perturbations; the default is a zero-mean Gaussian with covariance Σ, but users may define custom distributions.
- FeedbackController supplies an auxiliary tracking controller (e.g., iLQR), required by Tube-MPPI and RMPPI to keep the real system close to a nominal trajectory.
- Plant wraps a controller and handles communication with external middleware such as ROS, translating sensor messages into state vectors and publishing control commands.

All classes expose parameter structures, so hyper-parameters (sample count M, horizon T, temperature λ, covariance Σ, number of iterations I, etc.) can be tuned at runtime without recompilation.

Beyond the baseline MPPI, the library includes implementations of Tube‑MPPI and RMPPI. Tube‑MPPI runs a nominal MPPI trajectory in parallel with a real trajectory and uses an iLQR tracking controller to pull the real system toward the nominal one, improving robustness to state disturbances. RMPPI further enhances robustness by solving a constrained optimization problem to select a nominal initial state that stays within a user‑defined cost bound α, and by feeding the tracking controller’s feedback into the sampled trajectories. Both extensions are selectable at compile‑time and can be mixed with custom sampling distributions.

Performance evaluation is a major contribution. The authors benchmark MPPI-Generic on several GPUs (RTX 2080 Ti, RTX 3090, NVIDIA A100) against two widely used open-source MPPI implementations (CUDA-MPPI and a CPU-only reference). Experiments use a planar vehicle model with a 20-step horizon and vary the number of samples from 5,000 to 20,000. On an RTX 3090, MPPI-Generic achieves an average total loop time of 0.9 ms for M = 10,000 and T = 20, comfortably below the 1 ms threshold for 1 kHz real-time control. The same configuration on the reference CUDA implementation takes 4-8 ms, a roughly 4-9× speedup. The authors also report that disabling the importance-sampling correction (β = 0) improves control quality on the AutoRally platform, reducing track-departure events by 12% compared to the weighted version. Additional micro-benchmarks demonstrate that the library scales linearly with sample count and that memory bandwidth, rather than compute, becomes the bottleneck at very high sample counts, suggesting future optimizations via shared-memory tiling.

The paper concludes that MPPI-Generic is the first GPU-accelerated MPPI library offering real-time performance across a range of hardware, extensive built-in dynamics and cost templates, replaceable sampling and controller modules, and a clear path for user-defined extensions. The authors outline future work including multi-robot coordination, integration of non-Gaussian sensor noise models, and hybrid model-based reinforcement learning where MPPI-Generic serves as the planner within a learning loop. The full source code, documentation, and example projects are publicly available at https://acdslab.github.io/mppi-generic-website/.

