Multi-Agent Temporal Logic Planning via Penalty Functions and Block-Coordinate Optimization

Notice: This research summary and analysis were automatically generated using AI technology. For absolute accuracy, please refer to the [Original Paper Viewer] below or the Original ArXiv Source.

Multi-agent planning under Signal Temporal Logic (STL) is often hindered by collaborative tasks that lead to computational challenges due to the inherent high-dimensionality of the problem, preventing scalable synthesis with satisfaction guarantees. To address this, we formulate STL planning as an optimization program under arbitrary multi-agent constraints and introduce a penalty-based unconstrained relaxation that can be efficiently solved via a Block-Coordinate Gradient Descent (BCGD) method, where each block corresponds to a single agent’s decision variables, thereby mitigating complexity. By utilizing a quadratic penalty function defined via smooth STL semantics, we show that BCGD iterations converge to a stationary point of the penalized problem under standard regularity assumptions. To enforce feasibility, the BCGD solver is embedded within a two-layer optimization scheme: inner BCGD updates are performed for a fixed penalty parameter, which is then increased in an outer loop to progressively improve multi-agent STL robustness. The proposed framework enables scalable computations and is validated through various complex multi-robot planning scenarios.

💡 Research Summary

The paper tackles the challenging problem of synthesizing control sequences for multi‑agent systems (MAS) that must satisfy complex Signal Temporal Logic (STL) specifications. Traditional approaches either resort to mixed‑integer formulations, which do not scale, or restrict themselves to limited STL fragments. The authors propose a unified, scalable optimization framework that combines smooth STL robustness approximations, a quadratic penalty formulation, and a block‑coordinate gradient descent (BCGD) algorithm, embedded within a two‑level penalty method.

Problem formulation
Each agent \(i\) follows discrete‑time dynamics \(x_i(t+1)=f_i(x_i(t),u_i(t))\). The global STL specification \(\phi\) is a conjunction of sub‑formulas \(\phi_\nu\) defined over cliques \(\nu\) of agents; cliques may overlap, reflecting collaborative tasks. The objective is to minimize a separable cost \(L(u)=\sum_i L_i(u_i)\) while ensuring the STL robustness \(\rho_\phi(u)>0\).
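Collecting the pieces of this paragraph, the planning problem can be written as a single program. This is a sketch under assumptions that the summary does not state: the number of agents \(N\), the horizon \(T\), and fixed initial states are introduced here only for illustration.

```latex
\begin{aligned}
\min_{u=(u_1,\dots,u_N)} \quad & L(u)=\sum_{i=1}^{N} L_i(u_i) \\
\text{s.t.} \quad & x_i(t+1)=f_i\bigl(x_i(t),u_i(t)\bigr), \qquad t=0,\dots,T-1,\quad i=1,\dots,N, \\
& \rho_\phi(u)>0, \qquad \phi=\bigwedge_{\nu}\phi_\nu .
\end{aligned}
```

The cost separates across agents, so the only coupling between the blocks \(u_i\) enters through the robustness \(\rho_\phi\), via the sub‑formulas \(\phi_\nu\) that involve more than one agent.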

Smooth STL approximation
The non‑smooth min/max operators in STL robustness are replaced by log‑softmax approximations parameterized by \(\Gamma>0\). This yields a smooth robustness \(\varrho_\phi^\Gamma(u)\) that under‑approximates the true robustness, \(\varrho_\phi^\Gamma(u)\le\rho_\phi(u)\). Crucially, the authors prove that for conjunctive specifications with at least two cliques, \(\varrho_\phi^\Gamma(u)\ge0\) implies \(\rho_\phi(u)>0\). Hence, the original hard constraint can be replaced by the smooth, conservative constraint \(\varrho_\phi^\Gamma(u)\ge0\).
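To make the under‑approximation property concrete, here is a minimal sketch of one common pair of smooth operators: a log‑sum‑exp smooth minimum and a softmax‑weighted smooth maximum. The function names `smooth_min` and `smooth_max` are hypothetical, and these exact formulas are an assumption; the summary does not spell out which variants the paper uses.

```python
import math

def smooth_min(values, gamma):
    """Log-sum-exp approximation of min(values); gamma > 0 controls tightness.

    Never exceeds the true minimum, and lies strictly below it whenever
    there are at least two values.
    """
    m = min(values)
    # Shift by the true minimum so the exponentials cannot overflow.
    s = sum(math.exp(-gamma * (v - m)) for v in values)
    return m - math.log(s) / gamma

def smooth_max(values, gamma):
    """Softmax-weighted average of values; never exceeds max(values)."""
    m = max(values)
    weights = [math.exp(gamma * (v - m)) for v in values]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)

if __name__ == "__main__":
    margins = [0.5, 1.0, 2.0]  # illustrative robustness values of three sub-formulas
    for gamma in (1.0, 10.0, 100.0):
        print(gamma, smooth_min(margins, gamma), smooth_max(margins, gamma))
```

In this sketch the gap between `smooth_min` and the true minimum is at most `log(n) / gamma` for `n` values, so a larger \(\Gamma\) gives a tighter approximation. With two or more terms the smooth minimum is strictly below the true one, which is consistent with the statement that a nonnegative smooth robustness implies strictly positive true robustness for a conjunction over at least two cliques.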

Penalty formulation
The constraint \(u\in\mathcal{M}\), with \(\mathcal{M}=\{u\mid\varrho_\phi^\Gamma(u)\ge0\}\), is removed by adding a quadratic penalty to the cost, which turns the program into an unconstrained problem.
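The two‑layer scheme described in the abstract (inner BCGD sweeps at a fixed penalty parameter, then an outer loop that increases it) can be sketched on a toy problem. Everything specific below is an assumption made for illustration: the penalty form \(\lambda\max(0,-\varrho)^2\), the linear stand‑in `robustness` for \(\varrho_\phi^\Gamma\), the costs \(L_i(u_i)=u_i^2\), and the names `bcgd` and `two_layer_solve`. The paper's own penalty expression is cut off in this summary and is not reproduced here.

```python
def robustness(u):
    # Hypothetical stand-in for the smooth robustness: a joint task that is
    # satisfied when the two agents' scalar decisions sum to at least 1.
    return u[0] + u[1] - 1.0

def penalized_grad_block(u, i, lam):
    # Gradient w.r.t. agent i's block of
    #   sum_j u_j**2 + lam * max(0, -robustness(u))**2
    violation = max(0.0, -robustness(u))
    return 2.0 * u[i] - 2.0 * lam * violation  # d robustness / d u_i = 1 here

def bcgd(u, lam, sweeps=500):
    # Block-coordinate gradient descent: one block per agent, others held fixed.
    step = 1.0 / (2.0 + 2.0 * lam)  # inverse of the per-block Lipschitz constant
    u = list(u)
    for _ in range(sweeps):
        for i in range(len(u)):
            u[i] -= step * penalized_grad_block(u, i, lam)
    return u

def two_layer_solve(u0, lam0=1.0, growth=4.0, outer_iters=4):
    u, lam = list(u0), lam0
    for k in range(outer_iters):
        lam = lam0 * growth ** k  # outer loop: increase the penalty parameter
        u = bcgd(u, lam)          # inner loop: warm-started BCGD at fixed lam
    return u, lam

if __name__ == "__main__":
    u, lam = two_layer_solve([0.0, 0.0])
    print(u, lam, robustness(u))
```

For this toy the penalized minimizer is \(u_i=\lambda/(1+2\lambda)\), so the stand‑in robustness equals \(-1/(1+2\lambda)\) and approaches zero from below as \(\lambda\) grows. The inner sweeps also converge more slowly for large \(\lambda\), which is one reason to warm‑start each inner solve from the previous iterate and to grow the penalty parameter gradually.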
