Feasibility-aware Learning of Robust Temporal Logic Controllers using BarrierNet


Control Barrier Functions (CBFs) have been used to enforce safety and task specifications expressed in Signal Temporal Logic (STL). However, existing CBF-STL approaches typically rely on fixed hyperparameters and per-step optimization, which can lead to overly conservative behavior, infeasibility near tight input limits, and difficulty satisfying long-horizon STL tasks. To address these limitations, we propose a feasibility-aware learning framework that constructs trainable, time-varying High Order Control Barrier Function (HOCBF) constraints and hyperparameters that guarantee satisfaction of a given STL specification. We introduce a unified robustness measure that jointly captures STL satisfaction, constraint feasibility, and control-bound compliance, and propose a neural network architecture to generate control inputs that maximize this robustness. The resulting controller guarantees STL satisfaction with strictly feasible HOCBF constraints and requires no manual tuning. Simulation results demonstrate that the proposed framework maintains high STL robustness under tight input bounds and significantly outperforms fixed-parameter and non-adaptive baselines in complex environments.


💡 Research Summary

The paper tackles the problem of enforcing Signal Temporal Logic (STL) specifications on nonlinear control-affine systems while respecting tight input bounds. Existing CBF-STL methods fix the hyperparameters of the class-$\mathcal{K}$ functions in High-Order Control Barrier Functions (HOCBFs). Such static choices make the controller overly conservative and, more critically, can render the per-step quadratic program (QP) infeasible when the system approaches the barrier under stringent actuation limits. Moreover, prior learning-based approaches either ignore constraints during policy optimization or require a supervised reference trajectory that already satisfies all constraints, an unrealistic demand for complex STL tasks.

To overcome these limitations, the authors propose a feasibility‑aware learning framework that integrates three neural networks with a differentiable QP layer (BarrierNet).

  1. InitNet predicts the initial set of HOCBF hyper‑parameters and QP cost weights from the sampled initial state.
  2. RefNet updates cost‑related hyper‑parameters online during a rollout, allowing the controller to adapt its objective as the trajectory evolves.
  3. BarrierNet (time-varying extension) embeds HOCBF constraints into a QP whose class-$\mathcal{K}$ scaling factors $p_i(t)$ are generated by the network at each time step, making the barrier constraints adaptive rather than static.
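
To make the time-varying HOCBF constraint concrete, here is a minimal Python sketch for a 1-D double integrator (position `pos`, velocity `vel`, acceleration input `u`) that must stay below a wall at `x_max`. It is an illustration under stated assumptions, not the paper's implementation: the barrier $b(x) = x_{\max} - \mathrm{pos}$, the linear class-$\mathcal{K}$ functions with gains `p1` and `p2`, and all function names are hypothetical, and the gains are treated as constant within a time step so their time derivatives are dropped. In one dimension the QP reduces to a clip, which makes the infeasibility case easy to see.

```python
import numpy as np

def hocbf_upper_bound(pos, vel, x_max, p1, p2):
    """Largest input allowed by a second-order HOCBF for b(x) = x_max - pos.

    psi_1 = db/dt + p1 * b         = -vel + p1 * (x_max - pos)
    psi_2 = dpsi_1/dt + p2 * psi_1 = -u - p1 * vel + p2 * psi_1 >= 0
    Assumption: p1 and p2 are held constant within the step.
    """
    psi1 = -vel + p1 * (x_max - pos)
    return -p1 * vel + p2 * psi1

def safe_control(pos, vel, u_ref, x_max, p1, p2, u_min, u_max):
    """Closed-form solution of  min (u - u_ref)^2  s.t.  u <= u_cbf, u_min <= u <= u_max.

    Returns the input, a feasibility flag, and the feasibility margin u_cbf - u_min.
    When the QP is infeasible the input bound wins and the HOCBF constraint is violated.
    """
    u_cbf = hocbf_upper_bound(pos, vel, x_max, p1, p2)
    feasible = u_cbf >= u_min
    u = float(np.clip(min(u_ref, u_cbf), u_min, u_max))
    return u, feasible, u_cbf - u_min

# Same state and tight bounds [-0.2, 0.2], two different gain choices.
u, ok, _ = safe_control(0.5, 0.4, 1.0, 1.0, p1=1.0, p2=1.0, u_min=-0.2, u_max=0.2)
print(u, ok)   # -0.2 False  (fixed small gains: the QP is infeasible)
u, ok, _ = safe_control(0.5, 0.4, 1.0, 1.0, p1=2.0, p2=2.0, u_min=-0.2, u_max=0.2)
print(u, ok)   # 0.2 True    (larger gains: the QP is feasible)
```

The two calls differ only in the gains, which is the degree of freedom the summary describes the network as adjusting at each time step.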

A novel unified robustness metric is introduced, combining three components: (i) the standard STL robustness $\rho(\phi, x, 0)$, (ii) a feasibility penalty measuring violation of the HOCBF constraints, and (iii) a bound-violation penalty for exceeding the input limits. The overall loss is a weighted sum of these terms, and the weights themselves are learnable. By minimizing this loss, the system simultaneously maximizes STL satisfaction, maintains QP feasibility, and respects actuator bounds.
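
One way such a weighted-sum loss could look is sketched below in Python. The summary does not give the exact formula, so everything specific here is an assumption: the toy specification "always `pos <= x_max` and eventually within `eps` of `goal`", the hinge-style penalties, the fixed weights `w` (the paper describes them as learnable), and all function names.

```python
import numpy as np

def stl_robustness(pos, x_max, goal, eps):
    """Robustness of  G(pos <= x_max) AND F(|pos - goal| <= eps)  on a finite trace.

    'Always' takes the minimum margin over time, 'eventually' the maximum,
    and the conjunction takes the minimum of the two.
    """
    always_safe = np.min(x_max - pos)
    eventually_goal = np.max(eps - np.abs(pos - goal))
    return float(min(always_safe, eventually_goal))

def unified_loss(pos, u, feas_margin, x_max, goal, eps, u_min, u_max,
                 w=(1.0, 1.0, 1.0)):
    """Hypothetical weighted sum: -STL robustness + feasibility and bound penalties."""
    rho = stl_robustness(pos, x_max, goal, eps)
    # Negative feasibility margin means the HOCBF constraint could not be met.
    feas_pen = float(np.sum(np.maximum(0.0, -feas_margin)))
    # Amount by which each input leaves the interval [u_min, u_max].
    bound_pen = float(np.sum(np.maximum(0.0, u_min - u) + np.maximum(0.0, u - u_max)))
    return -w[0] * rho + w[1] * feas_pen + w[2] * bound_pen

pos = np.array([0.0, 0.4, 0.8, 0.9])
u = np.array([0.5, 0.5, -0.2])
feas_margin = np.array([0.3, 0.1, -0.05])
loss = unified_loss(pos, u, feas_margin, x_max=1.0, goal=0.9, eps=0.1,
                    u_min=-1.0, u_max=1.0)
```

In this toy trace the specification is satisfied with robustness of about 0.1, the inputs stay within bounds, and one step has a feasibility shortfall of 0.05, so the loss rewards satisfaction while still penalizing the infeasible step.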

Training proceeds in a model‑based reinforcement‑learning loop. For each episode, an initial condition is drawn, InitNet supplies the initial hyper‑parameters, RefNet updates them along the trajectory, and BarrierNet solves the differentiable QP to produce the control input. The resulting trajectory is evaluated with the unified robustness metric, gradients are back‑propagated through the QP layer, and all network parameters are updated end‑to‑end. Because the QP is differentiable, the entire pipeline is trainable without any external supervision or pre‑computed reference controls.
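
The loop structure (roll out, score, take a gradient step) can be sketched in a few lines of Python. This is a deliberately simplified stand-in, not the authors' pipeline: a single scalar `theta` replaces the three networks (both HOCBF gains are set to `exp(theta)`), a clipped central finite difference replaces backpropagation through the differentiable QP, and the PD reference controller, constants, and function names are all hypothetical.

```python
import math
import numpy as np

DT, STEPS = 0.1, 60                 # step size and horizon (assumed values)
X_MAX, GOAL, EPS = 1.0, 0.9, 0.1    # wall, goal position, goal tolerance
U_MIN, U_MAX = -1.0, 1.0            # input bounds

def rollout_loss(theta):
    """Simulate one episode with HOCBF gains p1 = p2 = exp(theta); return the loss."""
    p = math.exp(theta)
    pos, vel = 0.0, 0.0
    traj, feas_pen = [], 0.0
    for _ in range(STEPS):
        u_ref = 2.0 * (GOAL - pos) - 1.0 * vel         # hypothetical PD reference
        psi1 = -vel + p * (X_MAX - pos)
        u_cbf = -p * vel + p * psi1                     # HOCBF constraint: u <= u_cbf
        feas_pen += max(0.0, U_MIN - u_cbf)             # QP infeasible if u_cbf < U_MIN
        u = min(max(min(u_ref, u_cbf), U_MIN), U_MAX)   # closed-form 1-D QP solution
        pos, vel = pos + DT * vel, vel + DT * u         # forward-Euler step
        traj.append(pos)
    traj = np.array(traj)
    # Robustness of: always pos <= X_MAX, and eventually |pos - GOAL| <= EPS.
    rho = min(np.min(X_MAX - traj), np.max(EPS - np.abs(traj - GOAL)))
    return float(-rho + feas_pen)

def train(theta=0.0, lr=0.1, iters=30, h=1e-3):
    """Gradient descent on theta, keeping the best parameter seen so far."""
    best_theta, best_loss = theta, rollout_loss(theta)
    for _ in range(iters):
        grad = (rollout_loss(theta + h) - rollout_loss(theta - h)) / (2 * h)
        grad = max(-1.0, min(1.0, grad))                # clip to keep steps bounded
        theta -= lr * grad
        loss = rollout_loss(theta)
        if loss < best_loss:
            best_theta, best_loss = theta, loss
    return best_theta, best_loss

best_theta, best_loss = train()
```

Because the loss involves `min` and `max` over time, it is only piecewise smooth, so tracking the best parameter rather than trusting the final iterate is a cautious choice for a sketch like this.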

Experimental validation uses a 2‑D double‑integrator robot tasked with a composite STL specification: stay inside a safe region for at least 5 s, reach a goal within 10 s, and avoid obstacles. Input limits are set to a narrow interval (

