Multi-agent planning under Signal Temporal Logic (STL) is often hindered by collaborative tasks that lead to computational challenges due to the inherent high-dimensionality of the problem, preventing scalable synthesis with satisfaction guarantees. To address this, we formulate STL planning as an optimization program under arbitrary multi-agent constraints and introduce a penalty-based unconstrained relaxation that can be efficiently solved via a Block-Coordinate Gradient Descent (BCGD) method, where each block corresponds to a single agent's decision variables, thereby mitigating complexity. By utilizing a quadratic penalty function defined via smooth STL semantics, we show that BCGD iterations converge to a stationary point of the penalized problem under standard regularity assumptions. To enforce feasibility, the BCGD solver is embedded within a two-layer optimization scheme: inner BCGD updates are performed for a fixed penalty parameter, which is then increased in an outer loop to progressively improve multi-agent STL robustness. The proposed framework enables scalable computations and is validated through various complex multi-robot planning scenarios.
Multi-agent systems (MAS) research deals with the task of coordinating collections of autonomous systems, e.g., in logistics, exploration, and smart infrastructure. These applications require agents to satisfy complex spatiotemporal and logical constraints governing both individual and interactive behaviors. Signal Temporal Logic (STL) [1] has emerged as a powerful framework suitable for these requirements, offering an expressive language to encode time-bounded properties over continuous-time signals.
Unlike automata-based LTL synthesis [2], STL’s quantitative semantics [3], [4] enable the direct optimization of satisfaction margins over system trajectories. While exact solutions can be obtained via mixed-integer MPC formulations [5], [6], their poor scalability has motivated the development of smooth robustness relaxations [7]-[10] for efficient gradient-based optimization.
However, in MAS settings, collaborative tasks further amplify computational complexity, rendering scalable planning formidable. Important existing multi-agent approaches, including distributed MPC [11], [12], decentralized feedback control [13], and sequential planning [14], address these challenges but are often limited to restricted STL fragments, rely on the assumption of feasible solutions, or employ heuristic coordination schemes that lack formal guarantees. Developing scalable planning methods that can handle complex collaborative specifications while ensuring rigorous satisfaction thus remains an important open problem.
To address this challenge, this paper introduces a scalable, optimization-based framework for multi-agent STL planning under arbitrary collaborative tasks, leveraging smooth STL semantics [9] and the computational efficiency of the Block-Coordinate Gradient Descent (BCGD) method [15]. Specifically, we demonstrate that the original planning problem, featuring an objective function that is separable across agents, yet subject to generally coupled multi-agent STL constraints, can be relaxed into an unconstrained problem via a quadratic penalty defined over smooth robustness metrics. This relaxation is solved efficiently via BCGD, where computations are performed at the block level, with each block corresponding to the decision variables of a single agent. Under standard regularity assumptions, we show that the BCGD iterations converge to a stationary point of the penalized problem, providing a computational architecture that remains invariant to the complexity of the multi-agent specification. To enforce feasible solutions for the original planning problem, the BCGD solver is embedded in a two-layer optimization scheme, forming a penalty method (PM) [16, Chap. 17], where the inner loop optimizes for a fixed penalty parameter, which is then updated in the outer loop to progressively improve multi-agent STL robustness.
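The two-layer scheme described above can be illustrated on a toy problem. The following is a minimal sketch under stated assumptions, not the paper's implementation: the two-agent task, the names `GOALS`, `EPS`, `predicates`, `smooth_robustness`, `penalized`, `block_grad`, `bcgd`, and `penalty_method`, the log-sum-exp smooth minimum, the finite-difference block gradients, the backtracking rule, and the robustness margin `EPS` are all hypothetical choices made for illustration; the smooth semantics of [9] and the BCGD variant of [15] may differ in their details.

```python
import numpy as np

GOALS = np.array([[-2.0, 0.0], [2.0, 0.0]])  # toy data: one goal per agent
EPS = 0.05  # hypothetical margin so that the limit satisfies the task strictly


def cost(u):
    """Separable objective: each agent's squared distance to its own goal."""
    return float(np.sum((u - GOALS) ** 2))


def predicates(u):
    """Toy task: agents meet within unit distance AND agent 0 keeps x <= 0."""
    return np.array([1.0 - np.sum((u[0] - u[1]) ** 2), -u[0, 0]])


def smooth_robustness(u, k=10.0):
    """Log-sum-exp under-approximation of the minimum predicate value."""
    v = predicates(u)
    m = v.min()
    return float(m - np.log(np.sum(np.exp(-k * (v - m)))) / k)


def penalized(u, c):
    """Objective plus a quadratic penalty on the violated part of the task."""
    return cost(u) + c * max(0.0, EPS - smooth_robustness(u)) ** 2


def block_grad(F, u, i, h=1e-6):
    """Central-difference gradient of F with respect to agent i's block only."""
    g = np.zeros(u.shape[1])
    for j in range(u.shape[1]):
        e = np.zeros_like(u)
        e[i, j] = h
        g[j] = (F(u + e) - F(u - e)) / (2.0 * h)
    return g


def bcgd(F, u, sweeps=100, tol=1e-6):
    """Cyclic block-coordinate gradient descent with backtracking per block."""
    u = u.copy()
    for _ in range(sweeps):
        largest = 0.0
        for i in range(u.shape[0]):  # one block per agent
            g = block_grad(F, u, i)
            largest = max(largest, float(np.linalg.norm(g)))
            f0, step = F(u), 1.0
            while step > 1e-12:  # halve the step until sufficient decrease
                trial = u.copy()
                trial[i] = u[i] - step * g
                if F(trial) <= f0 - 0.5 * step * (g @ g):
                    u = trial
                    break
                step *= 0.5
        if largest < tol:
            break
    return u


def penalty_method(u, c=1.0, growth=10.0, outer=6):
    """Outer loop: run BCGD for fixed c, then increase c and warm-start."""
    for _ in range(outer):
        u = bcgd(lambda v: penalized(v, c), u)
        c *= growth
    return u


u_star = penalty_method(GOALS.copy())
print("robustness:", predicates(u_star).min(), "cost:", cost(u_star))
```

In this sketch only one agent's variables change in each inner update, while the coupling between agents enters solely through the penalty term. The margin `EPS` is a design choice of the example: a pure quadratic penalty approaches the constraint boundary from the infeasible side, so penalizing robustness below a small positive margin leaves the computed point strictly satisfying for large penalty parameters.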
This framework is the first to systematically integrate smooth STL semantics, block-coordinate optimization, and penalty functions for efficient multi-agent STL synthesis. We validate BCGD-PM across complex multi-robot scenarios, benchmarking it against an LBFGS-based implementation [16, Chap. 9] within the same modular penalty framework. This comparison highlights BCGD’s advantages in handling high-dimensional STL planning. For readability, the main technical proofs are provided in the Appendix.
Notation: The sets of real numbers and nonnegative integers are $\mathbb{R}$ and $\mathbb{N}$, respectively. Let $N \in \mathbb{N}$ so that $\mathbb{N}_{[0,N]} = \{0, 1, \ldots, N\}$. Let $x_1, \ldots, x_n$ be vectors so that
Given a sequence $\{x^k\}$, $x$ is called an accumulation point if there exists a subsequence $\{x^{k_j}\}$ such that $x^{k_j} \to x$.
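We consider STL formulas in positive normal form (PNF) with syntax

$$\phi := \top \mid \pi \mid \neg\pi \mid \phi_1 \wedge \phi_2 \mid \phi_1 \vee \phi_2 \mid \square_I \phi_1 \mid \phi_1 \, U_I \, \phi_2, \tag{1}$$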
where $\pi := (\mu(x) \geq 0)$ is a predicate with predicate function $\mu : \mathbb{R}^{n_x} \to \mathbb{R}$, $\phi_1$ and $\phi_2$ are STL formulas built recursively using the grammar in (1), $\neg$, $\wedge$, and $\vee$ are the logical operators denoting negation, conjunction, and disjunction, respectively, and $\square_I$ and $U_I$ are the always and until temporal operators, respectively, defined over the discrete interval $I \subset \mathbb{N}$. We omit the eventually operator ($\lozenge_I$) from (1) since $\lozenge_I \phi_1 = \top \, U_I \, \phi_1$.
In the above definition, negation appears only in front of atomic predicates. STL specifications in PNF are equivalent to the full class of STL specifications [1], and any STL formula can be transformed into PNF using standard logical identities [17, Prop. 2].
We denote by $\boldsymbol{x}(t) \models \phi$, $t \in \mathbb{N}$, the satisfaction of $\phi$, verified over the signal $\boldsymbol{x}(t) = (x(t), x(t+1), \ldots)$. The validity of $\phi$ can be determined recursively using the Boolean semantics of STL; due to space limitations, we refer to [1] for details.
STL is endowed with quantitative semantics [4]: A scalar-valued function $\rho^{\phi} : \times_{t=0}^{H^{\phi}} \mathbb{R}^{n} \to \mathbb{R}$ of a signal $\boldsymbol{x}(t)$, termed the robustness function, where $H^{\phi}$ is the horizon of $\phi$ [1], indicates how robustly a signal $\boldsymbol{x}(t)$ satisfies a formula $\phi$, and is defined recursively as
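For orientation, the recursion commonly used in the STL literature for formulas in PNF can be written as follows. This is a sketch in the notation of this section; the closed interval $\mathbb{N}_{[t,\tau]}$ in the until case is an assumption, since conventions vary between references:

$$
\begin{aligned}
\rho^{\top}(\boldsymbol{x}(t)) &= \infty, \\
\rho^{\pi}(\boldsymbol{x}(t)) &= \mu(x(t)), \\
\rho^{\neg\pi}(\boldsymbol{x}(t)) &= -\mu(x(t)), \\
\rho^{\phi_1 \wedge \phi_2}(\boldsymbol{x}(t)) &= \min\big(\rho^{\phi_1}(\boldsymbol{x}(t)),\, \rho^{\phi_2}(\boldsymbol{x}(t))\big), \\
\rho^{\phi_1 \vee \phi_2}(\boldsymbol{x}(t)) &= \max\big(\rho^{\phi_1}(\boldsymbol{x}(t)),\, \rho^{\phi_2}(\boldsymbol{x}(t))\big), \\
\rho^{\square_I \phi_1}(\boldsymbol{x}(t)) &= \min_{\tau \in t \oplus I} \rho^{\phi_1}(\boldsymbol{x}(\tau)), \\
\rho^{\phi_1 U_I \phi_2}(\boldsymbol{x}(t)) &= \max_{\tau \in t \oplus I} \min\Big(\rho^{\phi_2}(\boldsymbol{x}(\tau)),\, \min_{\tau' \in \mathbb{N}_{[t,\tau]}} \rho^{\phi_1}(\boldsymbol{x}(\tau'))\Big),
\end{aligned}
$$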
where ⊕ denotes the Minkowski sum. The satisfa