Formalisms for Robotic Mission Specification and Execution: A Comparative Analysis

Gianluca Filippone, Sara Pettinari, Patrizio Pelliccione

Abstract: Robots are increasingly deployed across diverse domains and designed for multi-purpose operation. As robotic systems grow in complexity and operate in dynamic environments, the need for structured, expressive, and scalable mission-specification approaches becomes critical, with mission specifications often defined in the field by domain experts rather than robotics specialists. However, there is no standard or widely accepted formalism for specifying missions in single- or multi-robot systems. A variety of formalisms, such as Behavior Trees, State Machines, Hierarchical Task Networks, and Business Process Model and Notation, have been adopted in robotics to varying degrees, each providing different levels of abstraction, expressiveness, and support for integration with human workflows and external devices. This paper presents a systematic analysis of these four formalisms with respect to their suitability for robot mission specification. Our study focuses on mission-level descriptions rather than robot software development. We analyze their underlying control structures and mission concepts, evaluate their expressiveness and limitations in modeling real-world missions, and assess the extent of available tool support. By comparing the formalisms and validating our findings with experts, we provide insights into their applicability, strengths, and shortcomings in robotic system modeling. The results aim to support practitioners and researchers in selecting appropriate modeling approaches for designing robust and adaptable robot and multi-robot missions.
Index Terms: Robotic systems, Mission specification, Behavior Trees, State Machines, Hierarchical Task Networks, BPMN

The first two authors contributed equally to this paper. G. Filippone, S. Pettinari, and P. Pelliccione are with Gran Sasso Science Institute (GSSI), L'Aquila, Italy. E-mail: {gianluca.filippone, sara.pettinari, patrizio.pelliccione}@gssi.it

1 INTRODUCTION

Robots are becoming pervasive across a wide range of domains, including industrial automation, logistics, healthcare, hospitality, and agriculture [1], [2]. At the same time, robots are increasingly multi-purpose, capable of performing diverse tasks rather than being tailored to a single function [3]. As a result, missions must often be specified and adapted directly in the field by domain experts, who are responsible for defining robot behavior despite not necessarily having expertise in robotics, programming languages, or computer science [4]–[7].

As robotic systems are increasingly deployed in dynamic, real-world settings, several efforts have sought to specify missions using structured, expressive, and scalable formalisms [4], [5], [8]. Effective mission specification for single- and multi-robot systems demands approaches that balance clarity, modularity, and ease of use with the ability to adapt during execution. Despite sustained research and industrial interest, however, no standard or widely accepted formalism has emerged that adequately addresses these requirements across diverse application domains. Instead, the state of the art is fragmented across multiple formalisms, including Behavior Trees (BT) [8]–[11], State Machines (SM)¹ [8], [12], [13], Hierarchical Task Networks (HTN) [14], and Business Process Model and Notation (BPMN) [15]–[18]. While BT and SM are widely adopted in robotics due to their relative simplicity and execution efficiency, they provide limited support for integrating human-driven tasks and external workflows.
Conversely, HTN and BPMN offer richer abstractions for coordination and integration but introduce additional modeling complexity and have yet to achieve broad adoption in robotics [19].

This fragmentation is mirrored in practice. High-profile projects, such as NASA's Europa Lander mission², have experimented with both HTN-based planning and BPMN-based workflow modeling to coordinate robotic activities [16], [20], [21], underscoring the absence of a dominant solution. In industrial contexts, vendors largely rely on proprietary graphical or block-based languages (e.g., Dobot, KUKA, Universal Robots), which further limits portability and reuse. Although some companies, such as PAL Robotics and Bosch, have begun exploring standard formalisms like SM [12], BT, and HTN³, the lack of consensus and systematic comparison continues to hinder informed selection and adoption.

Each of these formalisms offers different levels of abstraction, expressiveness, and control, which in turn shape how robotic systems are designed, verified, and executed. The choice of formalism is therefore not neutral, but depends on factors such as mission complexity, required adaptability, and execution constraints. Prior work has begun to examine these trade-offs. For instance, [8] analyzes key language concepts in Behavior Trees and contrasts them with State Machines, which remain the de facto standard for behavior modeling in robotics. Complementarily, [4] reports a controlled experiment evaluating the effectiveness and efficiency of BT and SM when used by end users to specify robot missions. While these studies provide valuable insights into individual formalisms, they do not offer a comprehensive comparison across the broader space of mission-specification approaches or address their suitability for complex, real-world robotic missions.

1. For presentation purposes, we use the term State Machine (SM) to encompass both Finite State Machines (FSMs) and Hierarchical Finite State Machines (HFSMs).
2. https://ai.jpl.nasa.gov/public/projects/europa-lander/
3. https://docs.pal-robotics.com/ari/sdk/23.12/development/intro-development.html

“This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible.”

In this work, we analyze Behavior Trees (BT) and State Machines (SM) from a practical perspective, focusing on their control structures and mission abstractions, as well as their distinctive characteristics, limitations, and tool support. In addition to BT and SM, we also consider Hierarchical Task Networks (HTN) and Business Process Model and Notation (BPMN). For clarity, we collectively refer to BT, SM, HTN, and BPMN as “the formalisms”. By systematically comparing these approaches, we aim to provide practitioners and researchers with concrete insights into their applicability for robotic system modeling. Our analysis assumes that robots and their underlying software components are already implemented. We therefore focus exclusively on mission description, understood as a natural-language or domain-specific specification of the activities a robot must perform [6].

To ensure conceptual clarity, we adopt the terminology introduced in RobMoSys [22] and used in subsequent studies [3], [8]. Specifically, we refer to a skill as a programmed action executable by a robot and typically implemented as a software component; a task refers to a simple, coordinated behavior composed of multiple skills; and a mission represents a coordinated sequence of tasks that enables the robot to achieve its overall objective.
To systematically compare the formalisms and address the lack of consolidated guidance for mission specification, we structure our analysis around the following research questions:

• RQ1: How can control structures and mission concepts be modeled with the formalisms?
Rationale: Since each formalism is built on different execution and control abstractions, understanding how they represent core mission constructs (e.g., sequencing, branching, concurrency, and coordination) is essential to assess their expressive power and suitability for mission-level modeling.

• RQ2: What are the peculiarities and limitations of modeling missions with the formalisms?
Rationale: While a formalism may be expressive in principle, its practical applicability depends on how naturally and effectively it supports realistic mission scenarios. This question investigates modeling trade-offs, abstraction levels, and limitations that emerge when specifying complex single- and multi-robot missions.

• RQ3: Which publicly available tools support the formalisms, and to what extent?
Rationale: Tool support is a key factor for real-world adoption. By examining available modeling, execution, and verification tools, this question evaluates the maturity and practicality of each formalism beyond its theoretical foundations.

Together, these research questions examine how the formalisms model mission control structures (RQ1), reveal their practical strengths and limitations when specifying realistic robotic missions (RQ2), and evaluate the maturity and effectiveness of available tool support for mission specification and execution (RQ3).

We validated our findings through an expert questionnaire survey conducted according to established guidelines [23], using purposive sampling [24] to recruit authors of the reference works underlying our analysis.
Participants self-assessed their expertise and evaluated only the formalisms they knew, rating our results on completeness, correctness, and alignment via Likert-type items complemented by mandatory justifications for neutral-or-lower ratings; we then analyzed responses a posteriori and followed up with selected experts to clarify and deepen critical feedback.

Paper outline: Section 2 provides an overview of the four formalisms analyzed in this paper. Section 3 describes the research method we defined to perform the study, together with the analysis corpus used to compare the considered formalisms. Section 4 answers RQ1 by analyzing the formalisms in terms of control structures and mission concepts. Section 5 answers RQ2 by describing peculiarities and limitations of modeling missions with the formalisms. Section 6 answers RQ3 by discussing available tools supporting the formalisms. We validated the findings with experts in the formalisms; the validation of each RQ is reported in the respective section. Section 7 discusses the findings of the study. Section 8 discusses related work. The paper concludes with final remarks and future work in Section 9.

2 THE FORMALISMS

This section provides an overview of the four formalisms analyzed in this paper by providing a lightweight description of their components and semantics and briefly discussing their origins.

2.1 Behavior Trees

BTs were originally developed for the video game industry as an approach to designing the artificial intelligence of non-player characters (NPCs), as an alternative to SMs [11], [25]. In recent years, BTs have gained popularity in robotics industry and research as a modular and flexible approach to describing robot behaviors by structuring decision-making logic hierarchically, with states represented as leaves in a tree [8].
The work in [26] provides a comprehensive overview of the functional and non-functional properties of BTs that are relevant to the robotics community, how they relate to each other, and the metrics used to measure BTs.

A BT is a directed rooted tree whose internal nodes are called control flow nodes and whose leaf nodes are called execution nodes. The tree is executed through ticks: periodic signals that are sent from the root and propagate through its children. When a node receives a tick, it executes its behavior (which can be a flow control task or the execution of a robotic skill) and immediately returns a status to its parent node: Success, if the execution completed successfully; Running, if the execution is still in progress; and Failure otherwise. In their most classical formulation [11], the core elements (i.e., nodes) of a BT consist of two types of execution nodes (Action and Condition nodes) and four types of control flow nodes (Sequence, Fallback, Parallel, and Decorator), as shown in Figure 1.

Fig. 1. Core BT elements (execution nodes: Action, Condition; control flow nodes: Sequence, Fallback, Parallel, Decorator).

Action nodes execute specific commands. In robotics, they typically map to skills, which are reusable, parameterized behaviors such as navigating to a location, grasping an object, or manipulating a tool. When ticked, action nodes perform their associated skill and return Success, Failure, or Running, as explained above. Condition nodes evaluate boolean expressions related to the system state, e.g., checking whether an object is detected or whether a robot has reached its destination. They return Success if the condition holds and Failure otherwise. They never return Running, since they do not represent actions to be executed.

Control flow nodes manage the tick propagation through the tree according to their specific semantics.
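To make the tick protocol concrete, the following is a minimal Python sketch, not an implementation taken from any BT library. It models the three return statuses, the two execution node types, and memoryless Sequence and Fallback nodes following the classical semantics of [11]. The `world` dictionary, the skill functions, and the two-tick duration of the approach skill are hypothetical choices made only for illustration.

```python
from enum import Enum
from typing import Callable, List


class Status(Enum):
    SUCCESS = "Success"
    RUNNING = "Running"
    FAILURE = "Failure"


class Node:
    def tick(self) -> Status:
        raise NotImplementedError


class Action(Node):
    """Execution node: runs a skill, which reports its own status."""

    def __init__(self, skill: Callable[[], Status]) -> None:
        self.skill = skill

    def tick(self) -> Status:
        return self.skill()


class Condition(Node):
    """Execution node: checks a predicate and never returns RUNNING."""

    def __init__(self, predicate: Callable[[], bool]) -> None:
        self.predicate = predicate

    def tick(self) -> Status:
        return Status.SUCCESS if self.predicate() else Status.FAILURE


class Sequence(Node):
    """Succeeds only if every child succeeds; stops at the first other status."""

    def __init__(self, children: List[Node]) -> None:
        self.children = children

    def tick(self) -> Status:
        for child in self.children:
            status = child.tick()
            if status is not Status.SUCCESS:
                return status
        return Status.SUCCESS


class Fallback(Node):
    """Fails only if every child fails; stops at the first other status."""

    def __init__(self, children: List[Node]) -> None:
        self.children = children

    def tick(self) -> Status:
        for child in self.children:
            status = child.tick()
            if status is not Status.FAILURE:
                return status
        return Status.FAILURE


# Hypothetical world state and skills (assumptions, not part of [11]).
world = {"ball_close": False, "approach_ticks_left": 2}


def approach_ball() -> Status:
    # Pretend that approaching takes two ticks before the ball is close.
    world["approach_ticks_left"] -= 1
    if world["approach_ticks_left"] > 0:
        return Status.RUNNING
    world["ball_close"] = True
    return Status.SUCCESS


def instant_skill() -> Status:
    return Status.SUCCESS  # stand-in for Find Ball, Grasp Ball, Place Ball


pick_ball = Sequence([
    Fallback([Condition(lambda: world["ball_close"]), Action(approach_ball)]),
    Action(instant_skill),  # Grasp Ball
])
root = Sequence([Action(instant_skill), pick_ball, Action(instant_skill)])

# Tick the root repeatedly until the mission is no longer RUNNING.
history = [root.tick()]
while history[-1] is Status.RUNNING:
    history.append(root.tick())
print([status.value for status in history])
```

Because the Sequence here is memoryless, every tick restarts from the first child, so the stand-in skills must be safe to re-run. Implementations that remember the running child behave differently in that respect.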
Sequence nodes tick their children in order, returning Success if and only if all children return Success. If a child returns Failure or Running, the remaining children are not ticked, and the node returns Failure or Running, accordingly. Fallback nodes also tick their children in order, but return Failure if and only if all children return Failure. Similarly, if a child returns Success or Running, the remaining children are not ticked, and the node returns Success or Running, accordingly. The Parallel node ticks all its children (possibly) simultaneously and returns Success or Failure if at least a certain number of children returned Success or Failure, respectively. The Decorator node is a custom control-flow node that has exactly one child and whose behavior is user-defined via a so-called policy. Typical examples of decorator nodes are the Inverter, which inverts the child's return status, and the Repeater, which forces repeated executions.

Fig. 2. Example of mission expressed as a BT.

Fig. 3. Core SM elements (State with entry, doActivity, and exit behaviors; Initial, Choice, and Fork/Join pseudostates; Final State; Transition; Composite State).

Figure 2 shows an exemplar mission expressed as a BT, adapted from [11]. Its execution is driven by ticks that are sent to the root sequence node at a certain frequency. This node propagates the tick to the Find Ball action node. When that node returns Success, the tick is propagated to the Pick Ball subtree (see the dashed-bordered box) and then to the Fallback node. From there, the Ball Close condition is checked. If the ball is not close to the robot, the condition node returns Failure, and the tick is propagated to Approach Ball. If the latter ends successfully, the fallback node returns Success.
If instead the condition check returns Success, the fallback node returns Success as well. In both cases, the tick is then propagated to the Grasp Ball action node. If the latter returns Success, the sequence node in the subtree returns Success as well, and the tick is finally propagated to the Place Ball action node. Note that, should any of the nodes in the sequence return Running, the root sequence node returns Running as well. For this reason, the tree is ticked repeatedly to allow its complete execution.

2.2 State Machines

The main idea of the state machine model is to describe a complex system's behavior through states and events. In robotics, state machines have become a common choice for modeling task-level control and reactive behaviors, as they offer a way to specify how a robot should respond to internal or external events [27]. Given the typical complexity of robotic systems, with numerous states and events, an SM must be structured in a modular and hierarchical way to avoid unstructured or chaotic models [28], [29].

Although many semantics are available for state machines, the most commonly referenced is the one defined in the UML standard [30]. Therefore, we present the core concepts of state machines as outlined in the standard. A State is a situation in the SM in which a specific constraint holds. While in this state, activities linked to its status can be carried out. Specifically, a state may include an entry behavior executed upon entering the state and an exit behavior executed upon leaving the state. Additionally, it can include a doActivity behavior that begins after the completion of the entry behavior. If a state contains sub-states, it is referred to as a Composite State, allowing the definition of a hierarchical structure among states. States are linked by Transitions, labeled with events that trigger the transition between states.
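The state and transition concepts just described can be illustrated with a small, flat state machine in Python. This is a simplified sketch under stated assumptions rather than an implementation of the UML semantics: it covers only entry and exit behaviors and event-labeled transitions, and it leaves out doActivity behaviors, composite states, and guards. The class names, the `dispatch` method, and the `success` and `timeout` event names are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class State:
    name: str
    on_entry: Optional[Callable[[], None]] = None  # entry behavior
    on_exit: Optional[Callable[[], None]] = None   # exit behavior


class StateMachine:
    def __init__(self, initial: State) -> None:
        # Transitions are keyed by (source state name, event label).
        self.transitions: Dict[Tuple[str, str], State] = {}
        self.current = initial
        if initial.on_entry:
            initial.on_entry()

    def add_transition(self, source: State, event: str, target: State) -> None:
        self.transitions[(source.name, event)] = target

    def dispatch(self, event: str) -> bool:
        """Fire the transition labeled `event`, if the current state has one."""
        target = self.transitions.get((self.current.name, event))
        if target is None:
            return False  # the event is ignored in the current state
        if self.current.on_exit:
            self.current.on_exit()
        self.current = target
        if target.on_entry:
            target.on_entry()
        return True


log: List[str] = []
find = State("Find Ball",
             on_entry=lambda: log.append("enter Find Ball"),
             on_exit=lambda: log.append("exit Find Ball"))
grasp = State("Grasp Ball", on_entry=lambda: log.append("enter Grasp Ball"))
place = State("Place Ball")
final = State("Final")

sm = StateMachine(find)
sm.add_transition(find, "success", grasp)
sm.add_transition(grasp, "success", place)
sm.add_transition(place, "success", final)

# "timeout" has no transition from Grasp Ball, so it is not handled.
handled = [sm.dispatch(e) for e in ["success", "timeout", "success", "success"]]
print(handled, sm.current.name, log)
```

Keying transitions by source state and event keeps the lookup explicit: an event only has an effect if the current state declares a transition for it.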
Moreover, the standard adopts pseudostates to abstract different types of elements that define the transition flow. Among them, a Choice represents a conditional decision, where the behavior is constrained by evaluating the transition guards associated with the pseudostate. In contrast, Fork and Join pseudostates serve to split or join multiple transitions. Finally, the activation and completion of a behavior are regulated by an Initial pseudostate representing the starting point and by a Final state representing the end of the behavior. The visual representation of the core elements is depicted in Fig. 3.

Fig. 4. Example of mission expressed as an SM.

The execution is event-driven: the SM is traversed from the initial to the final state based on the triggered transitions. For instance, in the mission example shown in Figure 4, the state machine begins by activating the Find Ball state. Upon successful completion of this state, the Pick Ball composite state is activated. Within this composite state, the Approach Ball state is executed only if the ball is not already near the robot; otherwise, the Grasp Ball state is executed directly. If the composite state completes successfully, the Place Ball state is activated. Once this state achieves a successful outcome, the final state is reached.

2.3 Hierarchical Task Networks

Hierarchical Task Networks (HTNs) are an automated planning formalism in which high-level tasks are decomposed into progressively simpler subtasks until executable primitive actions are reached [31]. An HTN planning problem starts with an initial task network consisting of tasks and constraints.
Primitive tasks correspond to actions that can be directly executed, while non-primitive (compound) tasks must be refined using methods, i.e., predefined decompositions into subtasks that preserve ordering and constraint relationships. Planning proceeds by repeatedly selecting non-primitive tasks and replacing them with the corresponding task networks from applicable methods until only primitive tasks remain; a successful plan is then a fully expanded, executable sequence of primitive actions that satisfies all task constraints.

HTNs formalize planning problems for robot missions by representing a mission as a task network that must be refined into an executable course of action. An HTN model distinguishes compound (non-primitive) tasks from primitive tasks (actions). Compound tasks are refined via methods, i.e., domain-defined decomposition rules that replace a task with a (partially ordered) network of subtasks, optionally subject to ordering and state constraints. Planning proceeds by repeatedly selecting a compound task and applying an applicable method until a primitive task network is obtained; a plan is valid if the resulting primitive actions can be linearized to satisfy all ordering/causal constraints and are executable in the world state [31].

Fig. 5. Core HTN elements (primitive tasks as leaves, compound tasks as non-leaf nodes, sequential and unordered methods, method preconditions).

In robotics, HTNs are typically used as a deliberative mechanism: they encode domain knowledge to generate structured mission plans before execution, and they support controlled refinement and repair during execution when conditions change.
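The decomposition process described above can be sketched in a few lines of Python for the simplest case, in which every method is totally ordered. This is an illustrative simplification, not an HTN planner: it evaluates preconditions against a fixed initial state, does not model action effects, and neither backtracks nor handles partially ordered task networks. The `METHODS` table, the `decompose` function, and the `ball_close` state variable are hypothetical names built around the running pick-and-place example.

```python
from typing import Callable, Dict, List, Tuple

WorldState = Dict[str, bool]
# A method pairs a precondition with an ordered list of subtasks.
Method = Tuple[Callable[[WorldState], bool], List[str]]

# Compound tasks and their methods; any task not listed here is primitive.
METHODS: Dict[str, List[Method]] = {
    "Pick and Place": [
        (lambda s: True, ["Find Ball", "Pick Ball", "Place Ball"]),
    ],
    "Pick Ball": [
        (lambda s: not s["ball_close"], ["Approach Ball", "Grasp Ball"]),
        (lambda s: s["ball_close"], ["Grasp Ball"]),
    ],
}


def decompose(task: str, state: WorldState) -> List[str]:
    """Refine `task` into a sequence of primitive tasks."""
    if task not in METHODS:
        return [task]  # primitive task: directly executable
    for precondition, subtasks in METHODS[task]:
        if precondition(state):
            plan: List[str] = []
            for subtask in subtasks:
                plan.extend(decompose(subtask, state))
            return plan
    raise ValueError(f"no applicable method for task {task!r}")


far_plan = decompose("Pick and Place", {"ball_close": False})
near_plan = decompose("Pick and Place", {"ball_close": True})
print(far_plan)
print(near_plan)
```

The two calls differ only in the world state, which is enough to select a different method for Pick Ball and therefore a different primitive plan.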
This deliberative role aligns with distributed deliberative architectures for multi-robot missions, where an offline-computed hierarchical plan is executed by local supervisors that manage their allocated plan parts and perform hierarchical repair to handle failures while reducing communication demands [14].

There is no universally accepted graphical standard for HTNs. In this work, we adopt a graphical notation consistent with prior robotics literature, where tasks are represented as nodes and hierarchical decomposition is shown by connecting compound tasks to their subtasks through method links, often annotated with ordering constraints or guard conditions [14], [32], [33]. This notation distinguishes between compound tasks (which decompose into subtasks) and primitive tasks (which correspond to executable steps), supporting intuitive interpretation of task hierarchies and dependencies. Notably, in [14], [32], [33] compound tasks are referred to as abstract tasks, while primitive tasks are referred to as elementary or concrete. In this paper, we use the terms compound and primitive, consistent with the HTN literature [34]–[36].

Fig. 5 reports the graphical representation of the HTN elements we use in this work. Tasks are represented as ellipses. Primitive tasks are the leaves of the tree, while compound tasks are the internal nodes. Methods are depicted according to their execution order: diamonds represent sequential ordering, while parallelograms represent unordered relations among their subtasks. Sequential ordering prescribes tasks that must be executed in sequence (in the graphical representation, from the left-most to the right-most task). The subtasks of unordered methods can be performed in parallel (depending on the execution platform, e.g., if assigned to different robots) or in any sequential order. Methods can optionally be guarded by preconditions, reported in a rectangular box.
Preconditions are required to select the proper method for decomposing a compound task according to the current conditions.

The example Pick and Place mission, shown in Fig. 6, is represented in HTN as a compound task that is decomposed through a method (m_pick_place) into the subtasks Find Ball, Pick Ball, and Place Ball. Subtasks may themselves be compound and further refined using alternative methods, as shown for Pick Ball, which branches depending on contextual conditions (e.g., whether the ball is close). These conditions guide the selection of appropriate methods (e.g., m_approach or m_grasp), ultimately yielding primitive tasks such as Approach Ball and Grasp Ball that can be directly executed by the robot. This hierarchical decomposition explicitly captures task structure, decision points, and execution dependencies, making HTNs well-suited for modeling complex, goal-directed robotic missions.

Fig. 6. Example of mission expressed as an HTN.

2.4 Business Process Model and Notation

Business Process Management (BPM) is a discipline widely adopted by organizations to ensure consistent outcomes and identify improvement opportunities [37]. In particular, BPM manages the chain of events, activities, and decisions connected to an organization. These chains are represented as business process models, and BPM includes concepts, methods, and techniques to support their design, enactment, monitoring, and analysis [38]. In recent years, with the spread of autonomous and interconnected devices, novel solutions have focused on applying BPM techniques to also specify and drive robotic missions [18], [39].
Business process models are mostly expressed via the BPMN standard [40], which is adopted by industry and academia to enable clear, versatile representations for business users and technical developers. To foster the use and interchange of BPMN models between different tools, the models can be shared in a standard manner. Indeed, the standard defines a single XML-based notation in which a business process is described in a tree-structured way, carrying all the information required to reproduce the elements composing the diagram: each BPMN element can be mapped to an XML fragment containing semantic and visual information. BPMN allows the design of different kinds of diagrams: process, collaboration, choreography, and conversation diagrams. Specifically, collaboration diagrams can be used to depict processes in a distributed system. Within these diagrams, various BPMN elements are used to model the intended behavior of the referenced system. Notably, the BPMN standard defines more than 200 distinct elements [41], providing a highly expressive and structured notation for designing collaborative processes.

In the following, we describe the core concepts of BPMN elements. Given the richness of the notation, we provide an abstract overview of these elements rather than an exhaustive description.

Fig. 7. Core BPMN elements (Pool, Activity, Gateway, Event, Sequence Flow).

Pools are used to represent participants or organizations involved in the collaboration and include details on internal process specifications and related elements. Activities are used to represent specific work to be performed within a process. Events are used to represent something that can happen. Gateways are used to manage the flow of a process. Notably, activities, events, and gateways can be marked in different ways to indicate the corresponding execution behavior (e.g., a cross symbol in a gateway marks an exclusive choice).
Finally, Sequence Flows are used to specify the internal flow of the process, i.e., the execution order of elements in the same pool. The visual representation of the core elements is shown in Fig. 7.

The execution semantics of BPMN is token-based [40, Sec. 7.1.1]. Starting from a start event, a token traverses the sequence flows of the process and passes through its elements, enabling their execution; finally, an end event consumes the token when the process terminates. Process elements acquire one or more tokens from incoming sequence flows for execution. Once finished, they may produce one or more tokens on outgoing sequence flows, depending on their behavior.

Considering the example in Fig. 8, the diagram contains one pool, named Robot. The execution starts with one token in the start event, which sequentially traverses the process model. This activates the first activity (i.e., Find Ball), followed by the Pick Ball subprocess. The subprocess executes the Approach Ball task only if the ball close condition evaluates to false; after that, it executes the Grasp Ball task. The execution continues with the Place Ball task and completes when the token reaches the end event.

Following the execution semantics, BPMN process models can be directly executed by BPMN engines. These engines can consume and execute processes provided in the correct format. Notably, the standardization of format and semantics by BPMN ensures that the execution behavior remains consistent across different engines [42].

Fig. 8. Example of mission expressed in BPMN.

Fig. 9. Research method (sources of information: literature and documentation, scenarios from different domains, repository search; comparative analysis: RQ1 mapping of the formalisms' control structures and mission concepts, RQ2 analysis of peculiarities and limitations, RQ3 tool analysis; validation: survey and follow-up with experts).
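As a rough illustration of the token-based execution semantics of Section 2.4, the Python sketch below moves a single token through a flattened version of the Fig. 8 process. It is a toy interpreter written under several assumptions, not a BPMN engine: the Pick Ball subprocess is inlined, only start and end events, tasks, and an exclusive gateway are modeled, and joins, message flows, and the XML interchange format are ignored. The `PROCESS` table, the element kinds, and the `run` function are hypothetical.

```python
from typing import Any, Dict, List, Tuple

Context = Dict[str, bool]

# Each element maps to (kind, outgoing). For an exclusive gateway ("xor"),
# outgoing is a list of (guard, target) pairs; otherwise a list of targets.
PROCESS: Dict[str, Tuple[str, List[Any]]] = {
    "start": ("event", ["Find Ball"]),
    "Find Ball": ("task", ["ball close?"]),
    "ball close?": ("xor", [
        (lambda ctx: ctx["ball_close"], "Grasp Ball"),
        (lambda ctx: not ctx["ball_close"], "Approach Ball"),
    ]),
    "Approach Ball": ("task", ["Grasp Ball"]),
    "Grasp Ball": ("task", ["Place Ball"]),
    "Place Ball": ("task", ["end"]),
    "end": ("end", []),
}


def run(process: Dict[str, Tuple[str, List[Any]]], context: Context) -> List[str]:
    """Return the order in which elements are enabled by tokens."""
    trace: List[str] = []
    tokens = ["start"]  # one token is placed on the start event
    while tokens:
        element = tokens.pop(0)
        trace.append(element)
        kind, outgoing = process[element]
        if kind == "end":
            continue  # the end event consumes the token
        if kind == "xor":
            # Exclusive choice: exactly one outgoing flow receives the token.
            target = next(t for guard, t in outgoing if guard(context))
            tokens.append(target)
        else:
            # Events and tasks emit one token per outgoing sequence flow.
            tokens.extend(outgoing)
    return trace


far_trace = run(PROCESS, {"ball_close": False})
near_trace = run(PROCESS, {"ball_close": True})
print(far_trace)
print(near_trace)
```

Keeping the tokens in a list rather than a single variable leaves room for elements that emit more than one token, although synchronizing such tokens at a join is outside the scope of this sketch.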
3 RESEARCH METHOD AND ANALYSIS CORPUS

This section describes the research method adopted in our study and the analysis corpus used to compare the considered formalisms. Specifically, we outline the sources and materials that informed our analysis, including (i) primary documentation and representative applications of the formalisms in robotics, (ii) a set of robotic scenarios spanning multiple domains, and (iii) a collection of publicly available tools that support, to varying extents, the four formalisms. Together, these elements provide the empirical basis for a systematic and practice-oriented comparison.

Figure 9 summarizes our research method for addressing the three RQs, organized into three phases: data collection, comparative analysis, and validation of the results.

3.1 Sources of Information

Our study started by gathering background knowledge on the four considered formalisms. To this end, we collected sources through a targeted literature search on Scopus focusing on each of the four formalisms. A search string was composed for each formalism, using keyword combinations including the formalism name and common variants (e.g., “hierarchical task network”, “HTN”, etc.) together with robotics- and mission-related terms. Following this strategy, we obtained four search strings with the pattern “(<formalism>) AND (robot OR robotic) AND (mission OR mission specification OR mission execution)”, where <formalism> was replaced with the string composed of the name of the formalism and its variants. The obtained results were filtered according to the following inclusion criteria: (i) the paper focuses on the use of the formalism for robotic mission specification or execution; and (ii) it presents, applies, or discusses the formalism in a robotic context.
As exclusion criteria, we discarded works in which the formalism was used for purposes other than mission modeling/execution (e.g., modeling physical space, or state machines in control-theoretic contexts). After filtering the papers, we applied snowballing to identify additional potential sources.

The resulting set of sources consisted of papers specifically discussing the properties and applications of BTs [9]–[11], [25], [26], [43] and SMs [8] in robotics, comparisons between BTs and SMs [4], [44], applications of HTN in robotics [14], [32], [33], [45], and applications of BPMN in robotics [16]–[18], [39]. Moreover, we looked for sources that document the formalisms regardless of the application domain, including scientific papers [31], informal documentation, and standard definitions [30], [40].

The obtained sources were further exploited to identify (i) a set of robotic scenarios spanning multiple domains (e.g., logistics, healthcare, households, and agriculture), and (ii) a set of publicly available tools that support, to different extents, the four formalisms.

Concerning the identification of the scenarios, besides the aforementioned sources, we also considered the RoboMAX exemplars collection [19]. The scenarios were selected by applying the following inclusion criteria: (i) the missions in the scenarios should involve multi-purpose robot capabilities rather than fixed, single-purpose behaviors, and (ii) missions take place in dynamic environments, thereby excluding, for example, single-purpose industrial robots. As exclusion criteria, we did not include simple missions concerning fixed sequences of tasks (e.g., pick and place). In total, we identified 11 scenarios, representing robot behavior at different abstraction levels over different domains.
In particular, we considered the Pick Ball [11, Example 2.1] and Humanoid Robot [11, Figure 2.4] scenarios, the Vital Signs Monitoring, Keeping Clean, Food Logistics, Lab Samples Logistics, Welcome People to Hospital, and Deliver Goods exemplars from RoboMAX [19], the SUAVE use case from [46], the Smart Agriculture use case from [18], and the Warehouse Automation scenario from [15]. The detailed description of each scenario, as well as the missions modeled using the formalisms, is available in the dedicated section of the replication package [47].

Finally, the tools supporting the formalisms were selected starting from those mentioned, analyzed, or used in the selected information sources. Additionally, we searched GitHub repositories and reviewed tools mentioned in the literature, grouping them by purpose. To this end, we searched for different strings containing the formalism name (both in full and as an acronym) and “robot” or “robotics” (e.g., “bt robot”, “bt robotics”, “behavior tree robot”, etc.). As inclusion criteria, we considered (i) tools designed specifically for robotic missions, and (ii) general-purpose tools for the considered formalisms that can be adapted to robotics. As exclusion criteria, we considered (i) lack of publicly available documentation, (ii) educational or prototype implementations used as toy examples, and (iii) unmaintained tools, i.e., tools whose last commit is older than three years⁴.

3.2 Comparative Analysis

Starting from the literature and the documentation of the formalisms, we analyzed the formalisms based on how they support the modeling of (i) the control structures for describing the flow of actions to be performed in the robotic mission, and (ii) the main concepts related to robotic missions, identified from the literature and within the scenarios.
The result of the analysis conducted for RQ1 consists of a mapping of the base elements offered by the formalisms to the aforementioned control structures and robotic mission concepts. Results are presented in Section 4.

By leveraging the results of the analysis performed for RQ1 and the existing literature, we analyze the peculiarities and limitations of the formalisms in modeling robotic missions (RQ2). To this end, we modeled the mission of each of the 11 identified scenarios using the four formalisms, employing existing straightforward tools when available, and evaluated the resulting models. Each scenario was modeled by keeping the same abstraction level as in the scenario description. We considered as base actions (i.e., skills) the ones that are reported in the scenario description. We leveraged the control structures and concept modeling identified in the scope of RQ1 and the modeling tools that were previously selected to stress them in the modeling of complex behaviors. The modeled scenarios allowed us to analyze the formalisms' expressiveness by scoring their suitability in expressing particular aspects of the mission.

The analysis process was carried out according to the following methodology:
1) two co-authors modeled independently and separately different scenarios using the different formalisms;
2) the models obtained by one of the two co-authors were reviewed by the other to check for model correctness, and vice versa;
3) a third co-author reviewed the models and facilitated the discussion for the identification of the model characteristics.

Steps (1) and (2) were essential because, similarly to software programming, there is no single, uniquely correct way to model a scenario. Multiple valid representations may exist, and this process allowed us to cross-check the soundness of the models while mitigating individual modeling biases.

Footnote 4: The search was done in July 2025.
The output of the analysis conducted for RQ2 highlighted the strengths and weaknesses of each formalism in modeling different aspects of robotic missions. The results of the analysis are reported in Section 5.

Finally, we analyzed the tools associated with the formalisms by considering their scope and their usability in robotic missions, focusing on those that are actively maintained. We also examined the baseline tools that support each formalism and have served as the foundation for the development of current ROS-compatible packages. The tool analysis allowed us to (i) support the results obtained for RQ2, in particular concerning the analysis of the strengths and weaknesses of the formalisms, as some of the tools provide implementation-level solutions for expressing robotics-related concerns, and (ii) draw an overview of the major currently available tools supporting the formalisms within the robotic domain.

3.3 Validation

To validate the findings derived from our analysis, we conducted questionnaire surveys with domain experts. For each research question, the experts evaluated our results in terms of completeness, correctness, and alignment with established formalisms and best practices for their use in robotics. We followed the empirical standard for questionnaire surveys and its essential attributes [23]. Specifically, this standard prescribes the systematic collection of data from a defined sample of participants through a structured set of questions, typically managed via computerized forms. Participants were selected through purposive sampling [24], focusing on authors of the scientific works that we used as references for this study, as they have direct expertise in the corresponding formalisms and their application to robotic systems. Each expert was contacted via email and received timely reminders to encourage participation. The questionnaire was custom-designed to facilitate targeted and reliable evaluation.
To reduce respondent burden and ensure relevance, each participant was asked to declare their expertise in each of the formalisms using a 5-point self-assessment scale (1 = not familiar, 2 = heard of it, 3 = some experience, 4 = used many times, 5 = expert user). Participants were asked to reply only to the questions related to the formalism(s) in which they had acknowledged expertise (i.e., expertise higher than or equal to 3). The survey primarily consisted of closed-ended Likert-type questions (1 = strongly disagree to 5 = strongly agree), assessing agreement with statements about the completeness, correctness, and alignment of our findings. To strengthen the interpretability of the results, every closed-ended question was complemented with an open text field, mandatory for responses rated less than or equal to 3 (neutral or lower), requiring participants to justify their assessment. This design choice ensured that lower evaluations were always supported with qualitative explanations.

Responses were collected in a structured spreadsheet for a posteriori analysis. Following the analysis of the questionnaire responses and an internal discussion among the authors, we complemented the survey with a round of in-depth follow-up interactions. Participants in the follow-up round were selected based on: (i) their declared level of expertise, ensuring at least one self-reported 5/5 expert for each formalism; (ii) whether their expertise spans multiple formalisms; (iii) the presence of particularly critical or insightful questionnaire responses; (iv) their background knowledge of the formalism, also beyond robotic applications; and (v) their declared willingness to be contacted for follow-up questions.

Questionnaire data collection was carried out over a period of four weeks in January 2026, resulting in a total of 29 complete responses out of 83 invitations, corresponding to a response rate of 34.94%.
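The gating rules of the questionnaire and the reported response rate can be illustrated with a minimal sketch. The function and constant names are hypothetical and not part of the study's tooling; only the thresholds (expertise of at least 3, justification for scores of at most 3) and the counts (29 of 83) come from the text above.

```python
# Hypothetical helper names; thresholds and counts are taken from the text above.
EXPERTISE_THRESHOLD = 3       # participants answer only for formalisms rated >= 3
JUSTIFICATION_THRESHOLD = 3   # the open text field is mandatory for Likert scores <= 3


def eligible_formalisms(self_ratings):
    """Return the formalisms a participant is asked to evaluate."""
    return sorted(f for f, r in self_ratings.items() if r >= EXPERTISE_THRESHOLD)


def needs_justification(likert_score):
    """True when a qualitative explanation is mandatory for this answer."""
    return likert_score <= JUSTIFICATION_THRESHOLD


def response_rate(complete, invited):
    """Response rate as a percentage, rounded to two decimals."""
    return round(100 * complete / invited, 2)


# Illustrative self-assessment on the 1-5 scale (assumed values).
ratings = {"BT": 5, "SM": 4, "HTN": 3, "BPMN": 2}
print(eligible_formalisms(ratings))                      # BPMN is skipped (rated 2)
print(needs_justification(3), needs_justification(4))
print(response_rate(29, 83))                             # 34.94
```

The last call reproduces the reported figure: 29 / 83 ≈ 0.3494.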
Table 1 overviews the profiles of the questionnaire participants. The respondent group comprised experts across different profiles and career stages, including 4 PhD students, 6 postdoctoral researchers, 3 researchers, 11 professors, and 5 roboticists and industry practitioners. Regarding expertise in the formalisms, participants self-reported experience as follows: 27 experts in SMs, 22 in BTs, 14 in BPMN, and 10 in HTN, with an average experience of ∼8 years in robotic software engineering.

Follow-up interactions were carried out over two weeks in February 2026. Seven participants were invited for follow-up discussions, of whom four confirmed their availability. Specifically, we collected detailed feedback from four participants, whose profiles are highlighted in Table 1. Feedback was collected both synchronously and asynchronously, according to participant availability. Three participants ([par:5], [par:10], and [par:17]) were interviewed in live sessions, while one ([par:20]) provided written responses to a set of follow-up questions. The purpose of this extended feedback collection was to deepen the discussion of the comparative analysis, focusing on critical points and clarification requests raised in the questionnaire, as well as on additional issues that emerged from participants' comments and required further rationale or refinement.

TABLE 1
Questionnaire participants overview (rows marked with * indicate interviewed participants; grey-highlighted in the original).

ID        Profile       Years in role   Declared expertise
                                        BT    SM    HTN   BPMN
par:1     Postdoc        1              2/5   4/5   1/5   5/5
par:2     Postdoc        5              5/5   5/5   3/5   3/5
par:3     Industry       4              4/5   3/5   5/5   3/5
par:4     PhD student    5              5/5   5/5   5/5   3/5
par:5*    Roboticist     7              5/5   4/5   3/5   2/5
par:6     Professor     13              4/5   4/5   2/5   1/5
par:7     Professor     18              3/5   4/5   5/5   2/5
par:8     Postdoc        7              3/5   3/5   3/5   5/5
par:9     Professor     20              4/5   4/5   2/5   1/5
par:10*   Professor      3              3/5   5/5   1/5   4/5
par:11    Professor     10              2/5   3/5   5/5   2/5
par:12    Postdoc        8              5/5   4/5   2/5   1/5
par:13    Professor      5              3/5   5/5   4/5   5/5
par:14    PhD student    6              5/5   3/5   2/5   2/5
par:15    Researcher     7              4/5   5/5   1/5   1/5
par:16    Researcher     3              3/5   3/5   1/5   5/5
par:17*   Professor      3              2/5   2/5   1/5   5/5
par:18    Professor     26              3/5   5/5   1/5   1/5
par:19    Postdoc        5              2/5   3/5   2/5   2/5
par:20*   Researcher     3              2/5   5/5   2/5   2/5
par:21    Postdoc        3              3/5   4/5   4/5   5/5
par:22    Professor      4              1/5   2/5   1/5   5/5
par:23    PhD student    4              5/5   3/5   1/5   2/5
par:24    Professor     21              5/5   5/5   2/5   2/5
par:25    Industry      15              3/5   5/5   2/5   1/5
par:26    Professor     10              3/5   5/5   2/5   3/5
par:27    PhD student    6              3/5   4/5   1/5   3/5
par:28    Roboticist     5              4/5   4/5   3/5   1/5
par:29    Roboticist    11              2/5   4/5   2/5   5/5

The questionnaire is available online in the replication package [47]. Supplementary material also includes the anonymized answers and the interview summary.

4 CONTROL STRUCTURES AND MISSION CONCEPTS (RQ1)

This section addresses RQ1 by analyzing how the considered formalisms model control structures and mission concepts. Following the first phase of our research method (Section 3), we draw on (i) insights from primary documentation and representative robotics applications, and (ii) a set of robotic scenarios spanning multiple domains. Grounding the analysis in both language constructs and concrete mission scenarios enables a systematic assessment of how each formalism represents mission control flow and core mission abstractions.
4.1 Control Structures

Our analysis of control flow considers sequential, conditional, and loop constructs, which are not specific to robotics and originate from structured programming and flowchart principles [48]. They form the fundamental building blocks for control flow both in most general-purpose programming languages and in many different modeling frameworks (e.g., UML activity diagrams, business process models, and process algebra), including the ones considered in this paper. Additionally, we consider the parallel control structure, since both single multi-purpose robots and multi-robot missions commonly require actions to be performed concurrently, and parallelism is a widely adopted construct across the aforementioned modeling frameworks. Table 2 presents how the formalisms support control structures that drive the flow of a mission. Additionally, we provide a graphical representation of these control structures to exemplify their functionality.

Sequential: To realize a sequential task execution, BTs employ the sequence and fallback control flow nodes. These nodes demand the execution (i.e., ticking) of their children from left to right, interrupting the sequence when a child returns Failure (sequence nodes) or Success (fallback nodes). In contrast, SMs do not have an explicit control structure to model a sequence of actions; rather, an SM transitions from the initial to the final state, reacting dynamically to triggered events as the system evolves, i.e., the sequence is driven by the events that trigger state changes. In HTN, task sequences are realized through sequential methods, where all the method's children are executed from left to right. Finally, a BPMN process is traversed from the start event to the end event, following the sequence flow.

Parallel: Concurrent execution flow is necessary to model parallel behaviors. BT offers the parallel node to compose child nodes that must be executed concurrently.
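The tick semantics of the BT sequence, fallback, and parallel nodes described above can be illustrated with a minimal sketch. It is not tied to any BT library: all class names are hypothetical, and the success threshold used by the parallel node is an assumed policy (one common choice), not something prescribed by the text.

```python
from enum import Enum


class Status(Enum):
    SUCCESS = "Success"
    FAILURE = "Failure"
    RUNNING = "Running"


class Action:
    """Leaf node returning a fixed status; records each tick in a shared log."""

    def __init__(self, name, result, log):
        self.name, self.result, self.log = name, result, log

    def tick(self):
        self.log.append(self.name)
        return self.result


class Sequence:
    """Ticks children left to right; stops at the first child that does not succeed."""

    def __init__(self, children):
        self.children = children

    def tick(self):
        for child in self.children:
            status = child.tick()
            if status is not Status.SUCCESS:
                return status
        return Status.SUCCESS


class Fallback:
    """Ticks children left to right; stops at the first child that does not fail."""

    def __init__(self, children):
        self.children = children

    def tick(self):
        for child in self.children:
            status = child.tick()
            if status is not Status.FAILURE:
                return status
        return Status.FAILURE


class Parallel:
    """Ticks every child; succeeds once `threshold` children have succeeded (assumed policy)."""

    def __init__(self, children, threshold):
        self.children, self.threshold = children, threshold

    def tick(self):
        results = [child.tick() for child in self.children]
        if results.count(Status.SUCCESS) >= self.threshold:
            return Status.SUCCESS
        # Fail as soon as the threshold can no longer be reached.
        if results.count(Status.FAILURE) > len(self.children) - self.threshold:
            return Status.FAILURE
        return Status.RUNNING


log = []
a1 = Action("A1", Status.FAILURE, log)
a2 = Action("A2", Status.SUCCESS, log)
print(Sequence([a1, a2]).tick(), log)   # the sequence stops after A1 fails; A2 is not ticked
log.clear()
print(Fallback([a1, a2]).tick(), log)   # the fallback moves on to A2 after A1 fails
```

Note that this parallel node ticks its children one after another within a single tick, so concurrency is only approximated by interleaving.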
In an SM, a fork pseudostate can be used to split the incoming transition into multiple transitions, without guards, activating the corresponding states. HTN offers unordered methods, where all the children can be executed in any order, even in parallel if possible. It is worth noting that this type of method does not explicitly demand or constrain parallel execution of tasks; rather, the control architecture that executes the tasks is responsible for managing their parallel execution. In BPMN, the AND gateway receives an incoming token and splits it into one token for each outgoing flow, thus enabling concurrent flow execution.

Observation 1.1 – Execution parallelism: Concurrency cannot always be fully realized in practical implementations, as it depends on the robotic platform and controller implementation. For instance, parallelism across tasks that share the same resources can often be approximated only by interleaving tasks.

Conditional: Modeling a conditional flow is necessary to regulate execution based on specific conditions. In a BT, this can be achieved by a fallback node with sequence and condition nodes [11]. Specifically, the evaluation of each condition node determines whether the action that follows it is executed: only the action following the condition node that evaluates to true is performed.

Observation 1.2 – BT nodes failure semantics: A Failure state can be returned either by a condition node, because the condition evaluates to false, or by an action node, because the action fails (e.g., due to errors). To avoid the "spurious" execution of actions, BTs require checking both the condition and its negation. This allows failure states arising from errors and from condition evaluation to be disambiguated.

TABLE 2
Control structures. (The diagrams of the original table are omitted; each entry reports the construct and the behavior illustrated by the corresponding diagram.)

BT
  Sequential: From left to right from the Sequence or Fallback node. A1 is executed before A2.
  Parallel: Parallel Node. A1 and A2 are executed in parallel.
  Conditional: Combination of Fallback, Sequence, and Condition Nodes. If C is true, then A1 is performed; if C is false, then A2 is performed.
  Loop: Decorator (possible implementation). A1 is executed n times.

SM
  Sequential: From initial to final state following triggered events. When e1 is triggered, state S remains active until e2 is triggered.
  Parallel: Fork pseudostate. The fork pseudostate splits the incoming transition into two transitions, activating S1 and S2.
  Conditional: Choice pseudostate. If g1 is true, S1 is executed; otherwise, if g2 is true, S2 is executed.
  Loop: Transition cycles. e1 reactivates S's action, while e2 terminates it.

HTN
  Sequential: Method with sequential relationship. T1 is executed before T2.
  Parallel: Method with unordered relation. T1 and T2 are executed in any order, in parallel if possible.
  Conditional: Methods combination with different preconditions. If m1's preconditions hold (p1), T2 is executed; if m2's preconditions hold (p2), T3 is executed.
  Loop: Achievable using recursive methods and conditions. As long as the loop preconditions (p1) hold, T2 is executed, then T1 recursively runs the loop; when p1 does not hold anymore, m2 realizes the loop exit.

BPMN
  Sequential: From the start to the end event following the sequence flow. The flow transitions to activity A1, then to A2, after which the process terminates.
  Parallel: AND Gateway. A1 and A2 are executed in parallel.
  Conditional: XOR Gateway. If c1 is true, A1 is executed; if c2 is true, A2 is executed; if neither c1 nor c2 is true, the default flow is taken and A3 is executed.
  Loop: (a) Combination of XOR gateways; (b) Loop Activity; (c) Multi-instance Activity. (a) A is executed repeatedly until condition c1 is no longer true; (b) A is executed until a specified condition is met; (c) A is sequentially executed for a given number of times.

An SM utilizes the choice pseudostate to evaluate the guards of outgoing transitions (i.e., g1 and g2 in Table 2), determining the subsequent flow of execution. HTN does not explicitly model choices and does not have dedicated constructs to evaluate conditions. However, preconditions can be associated with the methods that refine compound tasks: different methods can be associated with the same compound task, and their preconditions specify which conditional behavior to follow. Similar to an SM, BPMN uses the XOR gateway, which evaluates conditions on sequence flows to determine the direction of execution. Moreover, BPMN allows the explicit specification of a default flow: if none of the conditions are satisfied, the process follows the default branch; if no default flow is defined, an error is raised.

Observation 1.3 – Mutually-exclusive guards: Conditional semantics differ across formalisms. BTs resolve simultaneous conditions through implicit prioritization (tick order) [43], whereas SMs and HTNs require the modeler to ensure guard mutual exclusivity and exhaustiveness. In BPMN, XOR gateways evaluate conditions in order and select the first one that is true [40, p.435], while optionally supporting an explicit default flow.

Loop: Finally, iterations allow an execution to be repeated multiple times. In BT, a repeat decorator node can be defined to tick the child node n times or until the child returns Success.
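To make the BT conditional structure and Observation 1.2 concrete, the following minimal sketch models nodes as plain Python callables. All names are hypothetical, and the `repeat` decorator implements only one possible loop policy, which is an assumption rather than part of the formalism.

```python
SUCCESS, FAILURE = "Success", "Failure"


def sequence(*children):
    """Composite that fails as soon as one child fails (children ticked left to right)."""
    def tick():
        for child in children:
            if child() == FAILURE:
                return FAILURE
        return SUCCESS
    return tick


def fallback(*children):
    """Composite that succeeds as soon as one child succeeds (children ticked left to right)."""
    def tick():
        for child in children:
            if child() == SUCCESS:
                return SUCCESS
        return FAILURE
    return tick


def condition(predicate):
    """Condition leaf: Success when the predicate holds, Failure otherwise."""
    return lambda: SUCCESS if predicate() else FAILURE


def if_else(c, a1, a2):
    """Fallback over two guarded sequences: (C then A1) or (not C then A2)."""
    return fallback(
        sequence(condition(c), a1),
        sequence(condition(lambda: not c()), a2),
    )


def repeat(child, n, until_success=False):
    """Hypothetical decorator: tick the child up to n times.

    With until_success=True the loop stops at the first Success;
    otherwise it stops early only if the child fails.
    """
    def tick():
        status = FAILURE
        for _ in range(n):
            status = child()
            if status == SUCCESS and until_success:
                return SUCCESS
            if status == FAILURE and not until_success:
                return FAILURE
        return status
    return tick


executed = []


def a1():
    executed.append("A1")
    return FAILURE          # A1 fails because of an error, not because C is false


def a2():
    executed.append("A2")
    return SUCCESS


guarded = if_else(lambda: True, a1, a2)
naive = fallback(sequence(condition(lambda: True), a1), a2)

print(guarded(), executed)  # A2 is not executed: its guard (not C) fails
executed.clear()
print(naive(), executed)    # without the negated guard, A2 runs spuriously after A1's error
```

In the guarded variant the failure of A1 propagates to the root, whereas the naive variant masks it by executing A2.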
It is worth remarking that, although it is provided by default by the main BT implementations, such a decorator is not defined within the formalism. However, the behavior of decorators is by definition customizable [11], allowing different loop policies to be defined. In an SM, a transition can loop over a state, keeping it active until the guard in the cycle is triggered. In HTN, iterative behavior is not represented through explicit loop constructs but is realized implicitly through recursive task decomposition. In particular, a compound task can be refined by a method whose subtasks include the same compound task, provided that the method's preconditions remain satisfied. The repetition continues as long as these preconditions hold, and terminates when no recursive method is applicable, thereby encoding loop-like behavior through conditional recursion. In BPMN, three structures support repetitions. Using XOR gateways, a structured loop repeats the flow inside the gateways as long as the condition remains true. Alternatively, a single activity can be marked as a loop and configured to be executed until a given condition evaluates to true, optionally subject to a maximum number of repetitions. Moreover, BPMN also supports multi-instance markers to run a given number of activity instances. These instances may run sequentially or, when appropriate for the scenario (e.g., dispatching tasks to multiple robots), in parallel.

4.2 Mission Concepts

Regarding mission concepts, we adopt the terminology introduced in [3], [8] and rely on the layered organization proposed in RobMoSys [49] to structure the representation of robotic capabilities. Table 3 reports such layers as separate concepts that represent different abstraction levels, each providing a lower-level specification of the concept on top of it.
In line with the RobMoSys abstraction layers, and given our focus on comparing formalisms for high-level mission specification, we do not consider concepts below the service layer. These layers address low-level, hardware-dependent execution aspects that are outside the scope of the mission-level modeling considered in this study. Instead, we consider the skill, task (task plot), and mission layers, as follows:
• A skill is a programmed action that represents a basic capability of the robot. Typically, it is implemented by software experts by leveraging the lower-level components, abstracting the implementation details. It provides access to the functionalities realized within the robot's components and makes them accessible to the task level.
• A task is a symbolic representation of a robotic behavior realized as a combination of skills. Tasks specify what must be done and only partially how, abstracting away from the concrete implementations provided by the skills composed to realize them.
• A mission represents the global high-level objective that the robotic system has to accomplish, defined as a set of coordinated tasks that are subject to precedence constraints and can be either organized in sequences or executed in parallel. Within the scope of this paper, the mission constitutes the fundamental element specified using the selected formalisms.

TABLE 3
Abstraction levels in robotic systems (adapted from [49]).

Abstraction Level    Example
Mission              Serve customers; serve as a butler
Task plot            Deliver coffee
Skill                Grasp object with constraints
Service              Move manipulator
Function             Inverse kinematics (IK) solver
Execution Control    Activity
OS / Middleware      pthread; socket; FIFO scheduler
Hardware             Manipulator; laser scanner; CPU architecture; mobile platform

Additionally, we consider further concepts that are involved in the mission specification. In particular, we consider the capability of a formalism to express the concepts of data, communication, events, errors, and pre/post-conditions. These concepts were derived both from the literature and from the analysis of the scenarios mentioned above, where specific needs naturally emerged.
• Data specification encompasses the configuration concern [49] of the system and the management of knowledge propagation throughout the mission. It is required to provide the needed information for skills, tasks, and control structures. Data can be static, provided as an input that configures or parametrizes skills/tasks (e.g., the target location for a navigation task), or dynamic, being produced, managed, and propagated across the skills/tasks performed by robots within a mission (e.g., the status of environmental conditions affecting the mission).
• Communication in the mission specification is required to address the communication and coordination concerns [49] of the system, particularly when the mission is defined in a multi-robot context [18] or when robots must interact with humans or external systems [15]. The explicit specification of communication defines how robots share state information (e.g., task execution status) and propagate mission-relevant data (e.g., environmental conditions or context variables).
• Events enable the definition of how the system responds to internal or external events that arise during mission execution and require explicit management from a mission-level perspective (e.g., executing additional skills/tasks, or reconfiguring them). Handling such events allows for modeling the robot's reactive behavior [11].
• Errors specify the management of a particular class of events that arise from faults, failures, or any unexpected conditions preventing the mission from being executed without proper handling. Explicit error handling allows for the definition of fault-tolerant and resilient behavior [11], [50].
• Pre- and post-conditions formalize the states of the system before and after the execution of a task or skill. In particular, pre-conditions define the requirements that must hold before executing one or more actions (e.g., the robot has to be in the designated location for picking an object). Instead, post-conditions define the expected system or environment state after the successful execution of an action (e.g., the robot holding an object). These explicit definitions help ensure consistency among the sequences of tasks and skills in the mission specification and their dependencies, and enable support for planning and verification.

Table 4 overviews and compares how the formalisms can be used to model the elements a user may need to represent in the robotic mission. The table also reports the main references for the reported solutions and uses the notation [⟨website⟩] to refer to technical or non-academic documentation. Notably, the mission concept is not reported in the comparison, since for all the formalisms we consider the mission as the whole model.

TABLE 4
Mission concepts.

BT
  Skill: Action Node [9], [11]
  Task: Sub-tree [9], [11]
  Data: Data inputs to action nodes through ports and blackboard storage [behaviortree.dev] [py-trees.readthedocs.io]
  Communication: Rely on action implementation [10]
  Events: Not explicitly modeled. Achievable through the reactive nature of BTs, with condition nodes that are reactively checked
  Errors: Not explicitly modeled. Achievable through the reactive nature and Failure propagation
  Pre/post-conditions: Not explicitly modeled. Achievable through the Postcondition-Precondition-Action (PPA) pattern [11]

SM
  Skill: Simple State [30, p.308]
  Task: Composite State [30, p.308]
  Data: Data handled inside states [30] and global variables configuration [flexbe.readthedocs.io]
  Communication: Rely on behavior implementation
  Events: Each transition is associated with an event, but a transition can only be activated if the related state is active
  Errors: A transition can be related to an error, but a transition can only be activated if the related state is active
  Pre/post-conditions: Transition guards [30, p.315]. A specialization named Protocol Transition supports pre- and post-conditions [uml-diagrams.org]

HTN
  Skill: Primitive Task
  Task: Compound Task
  Data: In the task header (e.g., travel(d)) or as a parameter in the task definition
  Communication: Rely on primitive task implementation
  Events: None
  Errors: None
  Pre/post-conditions: In the task and method definition

BPMN
  Skill: Task [40, p.154], [18], [51]
  Task: Call Activity or Sub-process [40, p.430]
  Data: Data Objects [40, p.224] and process variables [docs.camunda.org]
  Communication: Message or Signal Events [40, p.269-272], [18]
  Events: Multiple event types capable of representing different situations [40, p.232], [18], [39]
  Errors: Error Events [40, p.264], [18], [39]
  Pre/post-conditions: Not natively supported. Intermediate events can be used as a workaround. Some works provide extensions to support these conditions [52], [53]

Skill: Skills are modeled as atomic elements in all the formalisms we are considering. In particular, in BTs, skills can be modeled as leaves in the tree through action nodes representing either actuation or sensing operations. In SMs, skills can be modeled through simple states, which are regulated by the internal state behaviors (i.e., entry, doActivity, and exit behaviors). In HTNs, skills can be modeled through primitive tasks. In BPMN, they can be modeled through tasks, i.e., atomic activities in the notation's standard. Depending on the skill's objective, a task can be classified into different types, such as a service task, which directly calls a robot service [51], or a script task, which embeds robot-specific code within the BPMN task [18].

Task: In BTs, tasks are represented by sub-trees. In SMs, a task can be modeled with composite states, which enhance the modularity of the model by nesting the simple states that enable task achievement.
In HTNs, they can be represented as compound tasks, which are refined into primitive tasks by methods. Finally, in BPMN a task can be modeled through a call activity or sub-process. The main difference is that a call activity references an external process, while a sub-process is embedded within the original process definition. The primary use case for a call activity is to enable a reusable process definition that can be invoked from multiple other process definitions. For instance, in the example missions in Section 2, the task Pick Ball is modeled as a subtree in the BT in Figure 2, as a composite state in the SM in Figure 4, as a compound task in the HTN in Figure 6, and as a sub-process in the BPMN model in Figure 8.

[Fig. 10. PPA pattern in BTs.]

Data: SMs and HTNs natively support the provision of data inputs to states and tasks. Within SMs, data can be added inside states (both simple and composite), while HTNs support the provision of data within the task header or within the set of variables associated with tasks. Concerning BTs, there is no standard way to define parameters for tasks, but some implementations allow adding inputs to action nodes through ports. Instead, BPMN supports Data Objects, which represent an object or a collection of objects that can be written and read by the activities in the process. Alternatively, some BPMN implementations allow the definition of the inputs and outputs that are associated with both a single activity and the whole process. Regarding data storage, the available support is mainly implementation-specific for all the formalisms. Many BT implementations rely on blackboards, a centralized key-value storage, as a mechanism for sharing data between execution nodes. Similarly, SMs can leverage global variables, dynamically updated within states. In BPMN, data can be configured to create process variables that are accessible within the process scope.
In contrast, HTNs do not provide mechanisms for data storage.

Communication: Among the four formalisms, BPMN is the only one that supports the explicit modeling of communication in the mission, specifically through message or signal events. Message events can represent a one-to-one communication, while signal events can express a broadcast communication [18]. In contrast, BTs, SMs, and HTNs only rely on the implementation of actions and primitive tasks to realize the communication.

Pre/post-conditions: HTNs natively support their specification both in the task and method definitions. A specialization of SMs, namely Protocol Transition, enables their support. BPMN does not support their specification natively, but some extensions enable it, e.g., in [52], [53]. Alternatively, BPMN intermediate events can be employed as a workaround to constrain task execution based on the satisfaction of certain conditions before or after an activity, although this does not formally capture the semantics of pre- and post-conditions. Finally, within BTs, condition nodes can be used to specify both pre- and post-conditions through the Postcondition-Precondition-Action (PPA) pattern [11]. Figure 10 shows a general case of PPA: the post-condition C is specified as a condition node placed as the first child of a fallback node, whereas the possible actions to reach C are specified within sibling nodes; pre-conditions are specified as condition nodes placed as the left sibling of an action node with a sequence node as a parent (either action A1 or A2 can be executed to reach C, with pre-conditions C1 or C2, respectively).

Events: Even if events are not explicitly modeled in BTs, their intrinsic reactive nature allows event handling without any dedicated constructs.
In fact, since the tree is continuously ticked, the condition nodes that check for the occurrence of a given event are continuously ticked as well: if a condition node succeeds because the event occurred, the behavior the robot should exhibit in response can be performed as a consequence. This behavior can be modeled by leveraging the conditional structure shown in Table 2 [11]. Similarly, in SMs the representation of events is supported by default, since events are the triggers for state changes. Each transition must be associated with an event: if the currently active state has an outgoing transition related to the occurred event, the system moves to the target state to handle it. This implies that if a recurrent event needs to be handled by different states of the SM, each state must have an outgoing transition associated with this event. In contrast, BPMN provides elements to explicitly model events that occur within the system. These events can vary in type, such as time-driven, condition-driven, or communication-driven, and can be placed on the boundary of activities, used as starting points for processes or subprocesses, or integrated into the execution flow. Events can also be modeled as interrupting or non-interrupting, meaning that when the event occurs, the main flow is either interrupted or allowed to continue running, respectively. Finally, HTNs do not offer mechanisms for explicitly modeling events, which need to be realized within the robot's mission execution platform.

Errors: As a specific case of events, BTs and SMs model errors by leveraging their reactive and event-based nature. BTs handle errors arising from action nodes through the Failure result these nodes may return. Similarly to events, SMs require a transition related to an error from each of the states to properly handle the event during mission execution.
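The requirement that each SM state declares its own transition for a recurring event or error can be illustrated with a minimal sketch. The class, state, and event names below are hypothetical and not tied to any SM library; they only mirror the rule that a transition can fire solely from the active state.

```python
class StateMachine:
    """Minimal event-driven state machine (illustrative, not tied to any SM library)."""

    def __init__(self, initial, transitions):
        # transitions maps (source_state, event) -> target_state
        self.state = initial
        self.transitions = transitions

    def dispatch(self, event):
        """Fire the transition for `event` from the active state, if one exists."""
        target = self.transitions.get((self.state, event))
        if target is None:
            return False      # event ignored: the active state has no such transition
        self.state = target
        return True


# Hypothetical mission: state and event names are assumptions for illustration.
transitions = {
    ("Navigate", "arrived"): "Pick",
    ("Pick", "picked"): "Deliver",
    # The error event must be declared on every state that should handle it.
    ("Navigate", "low_battery"): "Recharge",
    ("Pick", "low_battery"): "Recharge",
}

sm = StateMachine("Navigate", transitions)
sm.dispatch("arrived")                        # Navigate -> Pick
handled = sm.dispatch("low_battery")          # Pick -> Recharge
print(sm.state, handled)

sm = StateMachine("Deliver", transitions)
print(sm.dispatch("low_battery"), sm.state)   # no transition declared on Deliver: event ignored
```

Because the Deliver state declares no low_battery transition, the error goes unhandled there, which is the modeling burden discussed above.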
BPMN offers error event elements for their explicit representation, while HTNs do not offer support for error specification at modeling time.

Observation 1.4 – HTN event and error handling: While HTNs do not provide explicit first-class constructs for events or errors, these aspects can be handled through method preconditions and replanning mechanisms. Events and errors can be addressed at the planning and execution monitoring level, rather than being explicitly modeled in the HTN.

4.3 Validation

To validate the findings related to RQ1, we asked experts to evaluate the correctness and completeness of our comparison for each formalism through two main validation questions (VQ):
VQ1.1 Do you agree with the usage of control structures?
VQ1.2 Do you agree with the modeling of concepts?

For each VQ, respondents rated their agreement on a 5-point Likert scale (1 = strongly disagree, 5 = strongly agree). For responses rated ≤ 3, participants were required to provide qualitative feedback suggesting clarifications or corrections. Specifically, for VQ1.1, participants were invited to indicate whether they would suggest modifications or further clarifications. For VQ1.2, they were asked to specify whether any concepts were represented incorrectly or misleadingly, and whether any relevant concepts were missing from the comparison.

Results and discussion: The overall agreement scores for BTs, SMs, HTNs, and BPMN are illustrated in Figure 11, Figure 12, Figure 13, and Figure 14, respectively. Participants mostly agreed with the proposed mapping, with very few disagreements (never more than 3, with at most one "strongly disagree" for each formalism-related question).

In the following, we first report broader (minor) concerns that affect all four formalisms, then we discuss the updates for each formalism individually. Regarding control structures, participants mostly suggested minor refinements to improve the clarity of the descriptions in Table 2.
For instance, some participants pointed out that the definition of parallel task execution does not coincide with parallel execution in a program. This led us to include Observation 1.1 in the text. Moreover, the description of the conditional flow and the mutual exclusivity of guards emerged as a cross-cutting concern for BTs, SMs, and BPMN. We introduced Observation 1.3 after interviews with [par:20], [par:17], and [par:10] to clarify this concern. Finally, the explanations of loop modeling in SMs and BPMN were revised to reduce ambiguities and better reflect their realization. These clarifications address some of the lower agreement scores observed for SM and BPMN control structures (see Figure 12 and Figure 14).

For BTs, participants mostly agreed with the proposed mapping, while disagreements (3 for control structures and 2 for concept modeling, over 22 respondents) were reported for the representation of the conditional control structure (which, initially, did not include the check of the condition C and its negation), and for the description of both the parallel node and the loop structures. Several participants highlighted that the initial BT excerpt we provided did not clearly express an if-else semantics comparable to that of the other formalisms, and that the conditional constructs across the formalisms are not strictly equivalent due to the semantics of the failure state. Following the interview with [par:5], we discussed the practical implications of such semantics and refined the modeling and description of the BT conditional control structure to align with the if-else mechanism, leading to Observation 1.2. Additionally, we discussed best practices for expressing conditions, which we incorporated into the text.

For SM, only two disagreements were reported by participants (a strong disagree and a disagree for both control structures and concepts, respectively, over 27 respondents).
Concerning control structures, they reported errors in the description of the mapping with loop and conditional structures in Table 2, which we fixed accordingly. Moreover, some participants questioned the selection policy when multiple condition guards evaluate to true, and the handling of non-satisfied guards (i.e., when none evaluates to true). As also confirmed by the interaction with [par:20], we observed that the formalism does not enforce mutual exclusivity of guards and default behaviors, leading to Observation 1.3. Concerning the mapping of concepts, the mapping of skills to states was debated among respondents. While most agreed on the mapping, others (e.g., [par:10]) saw a better-fitting mapping of skills to the state's activity. The subsequent interview with [par:10] allowed us to clarify this mapping, i.e., skills are represented through states and implemented through do activities and entry/exit actions within states. The interview with [par:5] confirmed that the proposed mapping of tasks and skills is consistent with common practice in robotic SM-based controllers.

For HTN, 3 disagreements were reported overall over the 10 respondents. Besides discussing the possible overlapping of method preconditions (included in Observation 1.3), respondents pointed out that events and errors are handled in practice through preconditions, implementation-based failure handling, and replanning. We included this in Observation 1.4.

Fig. 11. Likert responses for BTs (22 respondents). [Rows: Control structures, Modeling concepts; axis: Number of responses; legend: Strongly disagree, Disagree, Neutral, Agree, Strongly agree.]

Finally, for BPMN, besides the discussion on the exclusivity of outgoing conditions in XOR gateways summarized in Observation 1.3, the only disagreement (over 14 respondents) concerned an imprecise description of the loop structure, which has been fixed.
Following the interview with [par:17], we further discussed the possibility of modeling loops through multi-instance tasks. We integrated this alternative representation into Table 2 and added the corresponding description in the text. Concerning the concept mapping, a participant raised concerns regarding the expression of pre- and post-conditions in BPMN. This led us to clarify that, although not natively supported, intermediate events can be used as a workaround to approximate the intended semantics of pre- and post-conditions.

Summary of RQ1. All four formalisms are capable of expressing the fundamental control structures (sequential, conditional, loop, and parallel) through different modeling mechanisms. They differ in how mission concepts are modeled. BT and SM provide explicit constructs for skills and tasks, but offer limited native support for communication or pre- and post-conditions. HTN provides elements for expressing tasks and skills and encodes pre- and post-conditions directly, though data flow and communication remain implicit. BPMN offers the most explicit and comprehensive support for mission concepts, with dedicated elements for data, communication, events, errors, and task types. However, it does not natively support pre- and post-conditions. Overall, as highlighted by the experts, all the concepts can be represented either using appropriate modeling patterns or at the implementation level.

5 PECULIARITIES AND LIMITATIONS OF MODELING MISSIONS WITH THE FORMALISMS (RQ2)

This section addresses the second research question (RQ2) by analyzing the expressiveness of the formalisms with respect to how they support specific modeling concerns related to robotic missions, obtained by leveraging the scenarios presented in Section 2.
We first discuss how the concerns are supported by the formalisms, then we discuss their strengths and weaknesses in modeling robotic missions.

Fig. 12. Likert responses for SMs (27 respondents). [Rows: Control structures, Modeling concepts; axis: Number of responses; legend: Strongly disagree, Disagree, Neutral, Agree, Strongly agree.]

Fig. 13. Likert responses for HTNs (10 respondents). [Same rows, axis, and legend as Fig. 12.]

Fig. 14. Likert responses for BPMN (14 respondents). [Same rows, axis, and legend as Fig. 12.]

5.1 Modeling of mission concerns

From the selected scenarios, we identified the set of concerns that affect the modeling of robotic missions, stressing the model expressiveness. Specifically:
• reactive behavior: behaviors that allow the robot to respond to events or errors requiring the performance of additional actions;
• decision making: choices made at runtime based on the current system state or overall context;
• time-dependent behavior: behaviors that have to be executed periodically after a specified interval, triggered after a certain delay, or constrained by timeouts;
• task status: tracking the execution status of an action, i.e., whether it is completed, ongoing, or an error occurred;
• robot-robot interaction: direct interactions between multiple robots involved in the mission, such as inter-robot communication and synchronization;
• human-robot interaction: direct interaction between humans and robots, involving explicit communication from the robot to the human and vice versa, e.g., prompting commands/instructions and getting human feedback, or tasks to be executed together with or only by humans;
• robot-external systems interaction: explicit communication between the robot and external systems, e.g., user interfaces, web services, and databases, for sending/receiving data, commands, etc.;
• state saving and task resuming: pausing and resuming the current execution, allowing a temporary interruption of the current task in order to execute extraordinary actions (e.g., if an event or an error occurs), and resuming the mission afterwards (also mentioned as event handler in [5]);
• explicit waiting: holding the robot in a busy form of waiting for specific events or conditions before starting or proceeding with the execution of the mission.

Table 5 reports the concerns identified within each of the scenarios. Each concern is evaluated according to the extent to which it is supported by each of the formalisms, based on the insights arising from the models obtained from the identified scenarios described in Section 3. We score the support provided by each formalism for a given concern on three different levels, as follows:
• Full support, if the formalism provides either native constructs (i.e., elements, operators, or control structures) associated with the concern, or modeling patterns to express it, allowing its modeling to be unambiguous and consistent across the missions without requiring workarounds;
• Partial support, if the formalism does not provide an explicit or dedicated construct to express the concern, but it can still be modeled indirectly through workarounds or ad hoc solutions that leverage other constructs;
• No support, if the formalism cannot express the concern, neither directly nor through workarounds, hence requiring the realization of such concern using external mechanisms or by realizing it at a different abstraction level. This does not mean that the concern is not addressable when using a given formalism, but that it requires implementation-level effort.

Table 6 summarizes the support provided by each of the formalisms in expressing the mission concerns.
Reactive behavior: Regarding the expression of reactive behavior, BTs, SMs, and BPMN fully support its modeling, as a direct effect of their reactive nature (BTs and SMs) or by leveraging event elements (BPMN). In particular, BTs support this through the continuous tree ticking, which allows the evaluation of the whole tree and its associated condition nodes at each tick, hence enabling the execution of the actions guarded by condition nodes that check the occurrence of a certain event, following the conditional control structure in Table 2. Figure 15 shows an excerpt of the BT modeling the SUAVE mission, where the (mission-wide) reactive behavior is controlled through a fallback operator placed at the root of the subtree modeling the tasks that have to be preempted when an event occurs, with a sequence node as its child (placed to the left of the tasks to be preempted): when the condition node's tick returns Success, the associated response behavior is executed (i.e., when Thrust Failure holds, then Enter Recovery is executed). Otherwise, the tick is propagated to the next child of the fallback node. On their side, SMs support this through the intrinsic event-based nature of the model, where events drive the transitions between states. In this case, transitions (labelled with the event to react to) connect the system states to be preempted to the state(s) modeling the actions to perform as a response. In BPMN, reactive behavior is expressed by exploiting boundary events or event sub-processes. Boundary events are attached to activities that may require a reaction to events and enable the execution flow to directly transition to other activities, modeling the corresponding response. Event sub-processes can be employed to handle events or errors that may occur at any point during the mission execution. Both strategies allow actions in response to events to be executed either interrupting the "normal" mission execution flow or as a parallel process. Conversely, HTN does not provide support for reactive behavior at runtime.

Fig. 15. Example of reactive behavior realized through BT. [Nodes: fallback (?), sequence (→), Thrust Failure, Enter Recovery.]

TABLE 5
Robotic scenarios and associated concerns. Acronyms used in the table: PB (Pick Ball), HR (Humanoid Robot), VS (Vital Signs Monitoring), KC (Keeping Clean), FL (Food Logistics), LSL (Lab Samples Logistics), WPH (Welcome People to Hospital), DG (Deliver Goods), SUAVE, SA (Smart Agriculture), WA (Warehouse Automation).

Concern PB HR VS KC FL LSL WPH DG SUAVE SA WA
Reactive behavior ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓
Decision making ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓
Time-dependent behavior ✓ ✓
Task status ✓
Human-robot interaction ✓ ✓ ✓ ✓ ✓ ✓
Robot-robot interaction ✓ ✓ ✓
Robot-external system interaction ✓ ✓ ✓ ✓ ✓ ✓
State saving & task resuming ✓ ✓ ✓
Explicit waiting ✓ ✓ ✓

TABLE 6
Summary of formalism expressivity for mission concerns. (●: full support; ◐: partial support; ○: no support)

Mission concern | BT | SM | HTN | BPMN | Rationale
Reactive behavior | ● | ● | ○ | ● | BT and SM leverage their reactive nature and event-handling structures. BPMN features boundary events and event sub-processes. HTN does not support it.
Decision making | ● | ● | ● | ● | SM and BPMN feature conditional control structures for runtime decision-making. In BTs it is realized by combining fallback and sequence nodes. HTN relies on planning according to pre- and post-conditions associated with methods.
Time-dependent behavior | ◐ | ◐ | ○ | ● | In BT and SM it has to be realized by manually implementing the control of timing through condition nodes and state implementations, respectively. BPMN features timer events. HTN does not support it.
Task status | ● | ● | ○ | ● | BT, SM, and BPMN support it through the value returned by the tick, the events outgoing from a state, and the activity lifecycle, respectively. HTN does not support it.
Robot-robot interaction | ◐ | ◐ | ◐ | ● | BT, SM, and HTN rely on the implementation of action nodes, states, and tasks, respectively. BPMN has dedicated structures, i.e., message and signal events for communication.
Human-robot interaction | ◐ | ◐ | ◐ | ● | BT, SM, and HTN rely on the implementation of action nodes, states, and tasks, respectively. BPMN supports user and manual tasks, and the explicit modeling of communication with message/signal events.
Robot-external system interaction | ◐ | ◐ | ◐ | ● | BT, SM, and HTN rely on the implementation of action nodes, states, and tasks, respectively. BPMN has dedicated task types (service task, send task, receive task) and message/signal events.
State saving and task resuming | ● | ◐ | ○ | ○ | BT offers control flow nodes with memory to keep track of the overall status of the mission, or can rely on shared knowledge (blackboards) to keep track of already executed tasks. SM requires ad hoc states to serve as history states and events to first pause and then resume the execution. HTN and BPMN do not support it.
Explicit waiting | ◐ | ◐ | ◐ | ● | BT, SM, and HTN rely on the ad hoc implementation of action nodes, states, and tasks, respectively, since they do not provide dedicated elements. BPMN supports intermediate events of different types.

Fig. 16. Example of decision-making realized through HTN. [Elements: m_clean, Go To Room, Enter, m_occupied (Room Occupied), Send Message, Abort Mission, Check Occupied, m_free (Room Free), Mark Occupied, Enter Room.]

Decision making: Concerning the modeling of the runtime decision-making of the system, all the formalisms, although at different levels, support it by applying the conditional control structures reported in Table 2. In particular, BT, SM, and BPMN have explicit control structures to control the runtime behavior by switching between different tasks according to the runtime conditions. HTN, on the other side, relies on the pre-conditions of the defined methods to define alternative behaviors.
The association of methods to abstract tasks is done through planning [34], which needs to be performed at runtime in order to consider runtime conditions that are not accessible beforehand. Figure 16 shows an excerpt of the HTN modeling the Keeping Clean mission, where the robot's actions refining the abstract task Enter have to be decided according to the room's status. In this case, two different methods are defined, associated with different pre-conditions: m_occupied, defining the behavior when the room is occupied, and m_free, defining the behavior when the room is free. Runtime planning takes into account such pre-conditions to associate the proper method to refine the abstract task Enter.

Time-dependent behavior: Concerning the modeling of time-dependent behavior like timeouts and time triggers, BTs and SMs have to rely on specific implementations of action and condition nodes (BTs), or states and events (SMs), that check ad hoc timers and react to them accordingly, as described for the reactive behavior. HTN does not support this feature. BPMN provides timer events, which can be defined either for a specific date and time or for a duration (e.g., every two hours). These events can be used in different parts of the mission to constrain the start of the process to a given time, act as interrupting triggers during execution, or pause the flow for a specified duration. Figure 17 shows the initial part of the Vital Signs Monitoring scenario, which prescribes that all patients' vital signs be checked every two hours. In BPMN, this periodicity is captured using a timer start event, which triggers the mission execution at the required two-hour interval.

Fig. 17. Example of time-dependent behavior realized through BPMN. [Elements: timer start event (2 hours), GoTo Room, ...]

Task status: Concerning the ability to keep track of the status of a mission task, BTs support this through the returned value of the tick on a node: Success, Running, or Failure, as described in Section 2.
It is worth noting that the Failure value returned by condition nodes has a different semantics than the one returned by action nodes: the former indicates a condition that is currently not holding; the latter indicates possible failures during the action execution. SMs do not prescribe predefined execution statuses; however, task status can be represented either by dedicated states (e.g., Success, Failure) or by the events or outcomes emitted by a state, which may trigger transitions to different successor states depending on whether a task completes successfully or fails. Figure 18 shows an excerpt of the SM modeling the Pick Ball scenario, in which task failure is modeled both via outgoing transitions labeled as fault and through a dedicated state for the recovery (Wait for Help). In this case, if a fault occurs within the Find Ball or Approach Ball states, the robot waits for help. Afterwards, if the help was successful (help ok transition) the robot goes to the Success state; otherwise, it goes to the Failure state. In BPMN, task status is encoded in the activity lifecycle, which includes states such as Ready, Active, Withdrawn, Completed, and Failed. Transitions between these states determine how tokens progress through the process and enable the specification of different behaviors depending on the execution outcome of an activity [40, p.428]. In contrast, HTN does not handle this concern.

Robot-robot interaction: The interaction with other robots is not explicitly supported by BTs, SMs, and HTNs, as they do not offer ad hoc constructs to model the communication with other parties. To realize this kind of interaction, such formalisms must rely on the implementation of the action nodes (BT), states (SM), and tasks (HTN), which have to be realized ad hoc.
Conversely, BPMN offers different constructs allowing interactions, such as explicit message events for one-to-one communication, or signal events, which can be exploited for explicitly modeling multicast communication among multiple robots. Figure 19 shows an excerpt of the model built for the Smart Agriculture scenario, where the robot-to-robot communication is modeled through a signal event: the Drone shares the position of a weed via the weed_position signal send event, and a currently active Tractor can catch it through the signal receive event.

Fig. 18. Example of task status handling with SM. [States: Find Ball, Approach Ball, Wait for Help, Success, Failure; transitions: ball found, fault, help ok, help not ok.]

Fig. 19. Example of robot-robot interaction with BPMN. [Participants: Drone, Tractor; signal: weed_position; data: Weed Position.]

Human-robot interaction: Similarly to the previous concern, BTs, SMs, and HTNs have to rely on the implementation of action nodes, states, and tasks, respectively, to explicitly model this kind of concern. BPMN offers a richer set of modeling elements that enable explicit representation of human involvement. User and Manual tasks [40, p.160] allow the specification of activities performed by humans, either with system involvement in the case of User Tasks, or without system support in the case of Manual Tasks. Moreover, human behavior can be integrated directly into the mission model by assigning it to a dedicated lane that executes tasks interleaved within the robot's workflow [15] (see Figure 20, showing an excerpt of the Warehouse scenario), or by defining a separate interacting process that exchanges messages with the robot through message or signal events.

Robot-external system interaction: Similarly, BTs, SMs, and HTNs have to rely on the implementation of the action nodes, states, and tasks to model this kind of concern.
BPMN offers dedicated mechanisms that allow these interactions to be modeled directly within the process. The approach is analogous to the one used for human involvement, with the main difference lying in the specific task types employed. Interactions with external systems can be modeled using Service Tasks, which represent automated operations carried out by external software components, or through message flows that capture communication between the robot process and external participants or system components. Notably, Service Tasks may also be used to invoke a corresponding robotic activity, such as navigation, via connectors when the BPMN process executes outside the robot itself [15], [16].

Fig. 20. Example of robot-human interaction with BPMN. [Pool: Warehouse Process; lanes: Robot (MoveTo Material, Check Material Weight), Worker (Load Material).]

Fig. 21. Examples of explicit waiting realized through BT. (a) Time-based waiting. [Nodes: Patient Unavailable, Wait 5 minutes.] (b) Event-based waiting. [Nodes: Drawer Open, Open Drawer, Sample Deposited, Wait Deposit, Drawer Close, Close Drawer.]

State saving and task resuming: State saving and task resuming require the mission to be first paused (state saving) and then resumed from the point where it was paused (task resuming). This concern allows the execution of exceptional behavior in response to an event, as discussed for the reactive behavior modeling, and the restoration of the normal mission behavior afterwards. Interestingly, none of the formalisms fully supports this concern explicitly.
BTs can achieve the resuming of the task execution by either (i) using control flow nodes with memory [11] to keep track of the ongoing state of the mission by internally storing the results returned by their children, until the node returns Success or Failure to its parent, or (ii) designing the whole tree in such a way that, before ticking an action node, a condition node checks the overall mission status using the backchaining paradigm [11]. In both cases, the solution prevents nodes from being ticked again if previously completed (i.e., prevents tasks from being re-executed when not needed), hence allowing the mission to be resumed from where it was interrupted. However, both solutions have limitations: in the first case, the use of control flow nodes with memory limits the overall reactivity of the tree [11], [54], while, in the latter case, task status may have to be manually stored in separate structures, such as blackboards, which are not part of the formalism, although commonly supported by BT implementations. In contrast, SMs need ad hoc event handling, with events outgoing from states, for pausing the mission execution (as described for the reactive behavior), while state transitions labelled with the previously interrupted task have to be redirected to the corresponding state for resuming. This requires the information about the previously interrupted task to be carried along manually [44]. In any case, the pause of the actions within the currently running node or the currently active state has to be manually implemented within the event response. However, since the support offered by BTs is more advanced than that of SMs (BTs provide built-in nodes with memory), we evaluated them as offering full support, while SMs as offering partial support. HTN and BPMN, in contrast, do not offer solutions or workarounds to address this concern.

TABLE 7
Overview of formalism expressivity strengths and weaknesses.

BT
Strengths:
• Easily express reactive behavior when the robot has to continuously react to changing conditions.
• Clear task status semantics, making progress and failure handling a first-class concern.
Weaknesses:
• The semantics of Failure is overloaded: action and condition nodes return Failure for different reasons (system-level failures or conditions not holding), requiring additional disambiguation through ad hoc nodes to catch possible errors.
• No standard way of expressing waiting, temporal constraints, and interactions, which must be manually implemented in the action/condition nodes.
• The use of nodes with memory for keeping track of the mission execution state hinders the tree reactivity by limiting the overall tree re-evaluation.

SM
Strengths:
• Natively model event-driven behavior, defining how the system should react to events and transition between states.
• Easily handle task status as distinct states and via transitions outgoing from them.
Weaknesses:
• Handling errors and events requires outgoing transitions from all the potentially affected states, hence requiring explicit transitions from every state for handling system-level events.
• Task resuming requires ad hoc history states and transitions towards each state.
• Interactions, messaging, and coordination with other robots, systems, or humans have to be manually realized.
• Time-dependent behaviors require the manual integration of timers within the state implementation.

HTN
Strengths:
• Provides support for decision making through runtime planning, achieved by decomposing tasks into sub-tasks using methods.
• Natively allows explicit definition of pre- and post-conditions.
Weaknesses:
• No expressiveness for reactive behavior, which has to be realized at a different level outside of the mission model.
• No explicit expression of static choices, which are always delegated to planning.
• No task status tracking, which is entirely delegated to the HTN executor implementation.
• Interactions, waiting, and time-driven behaviors are not natively supported and must be manually implemented within task definitions.

BPMN
Strengths:
• Decision-making logic is easy to model thanks to the process-oriented structure.
• Rich notation explicitly supporting diverse event and error handling, making it very expressive for time-dependent behavior, event/error handling, and waiting.
• Explicitly allows modeling of human activities and external systems, and offers a clear representation of robot-robot, human-robot, and robot-system interactions.
Weaknesses:
• No native support for specifying the resuming of tasks or missions after they are interrupted.
• Reactions to highly frequent events cannot be optimally achieved.
• Can become complex and overloaded for detailed and extended robotic missions.

Observation 2.1 – State saving and task resuming in BPMN: State saving and task resuming can be realized at the execution engine level by storing the process-instance state (e.g., token positions and the status of activity instances) upon interruption, and reinstating it to resume the process from the same point. Moreover, BPMN compensation mechanisms could support rollback-like behavior, though their application to robotic missions remains largely unexplored.

Explicit waiting: BT, SM, and HTN do not offer constructs for modeling an explicit busy form of waiting during the mission execution. All of them have to rely on ad hoc implementations of action nodes, states, and tasks, respectively. In contrast, BPMN events can deal with this concern. Indeed, events can be placed within the process flow to model waiting for different situations, such as a specified duration, the satisfaction of a condition, or the receipt of a message or signal event. Figure 21 illustrates two BTs modeling an example of time-based waiting from the Vital Signs Monitoring scenario.
Here, the timer event requires that, after assessing that the patient is not available and leaving the room, the robot wait for 5 minutes before re-entering the room. In this case, the mission is paused until 5 minutes have elapsed. The time-based waiting (Figure 21a) is realized by leveraging the conditional control structure reported in Table 2: if the patient is not available, then the 5-minute waiting is realized through an ad hoc action node whose implementation has to keep the robot in a waiting state by returning Running until 5 minutes have elapsed. In contrast, the tree enabling event waiting (Figure 21b) can be realized⁵ by pairing each action node with a condition node through a fallback (explicit success condition pattern in [11]). In particular, after having opened the drawer, the robot checks if the sample is deposited and, if not, the Wait Deposit action realizes the waiting by idling, i.e., by returning Running at each tick. By leveraging the reactive nature realized through the continuous tree ticking, after the sample is deposited, the Sample Deposited condition will return Success and the execution proceeds towards the next branch (controlling the drawer closing).

⁵ We recall that there is not only a single way to realize a given behavior. We report one out of the many possible modeling solutions.

Observation 2.2 – Implementation-specific concerns: Time-dependent behaviors, explicit waiting, and state saving and task resuming can be realized by leveraging the formalisms' elements in ad hoc solutions, e.g., ad hoc decorators for waiting in BTs, and blackboards or nodes with memory and history state nodes for state saving and task resuming in BTs and SMs, respectively. Some implementations of BT, SM, and HTN provide extensions enabling the modeling of the aforementioned concerns.
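The two waiting patterns of Figure 21 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in the study: the classes `Status`, `Wait`, `Condition`, `Idle`, and `Fallback` are hypothetical names introduced here, and the injectable `clock` parameter is an assumption made only so that the elapsed time can be controlled in a test.

```python
import time
from enum import Enum


class Status(Enum):
    SUCCESS = "Success"
    FAILURE = "Failure"
    RUNNING = "Running"


class Wait:
    """Ad hoc action node: returns RUNNING until `duration` seconds have elapsed."""

    def __init__(self, duration, clock=time.monotonic):
        self.duration = duration
        self.clock = clock          # hypothetical hook, injectable for testing
        self.started_at = None

    def tick(self):
        now = self.clock()
        if self.started_at is None:
            self.started_at = now   # first tick starts the timer
        if now - self.started_at >= self.duration:
            self.started_at = None  # reset so the node can be reused
            return Status.SUCCESS
        return Status.RUNNING


class Condition:
    """Condition node: SUCCESS if the predicate holds, FAILURE otherwise."""

    def __init__(self, predicate):
        self.predicate = predicate

    def tick(self):
        return Status.SUCCESS if self.predicate() else Status.FAILURE


class Idle:
    """Action node that only idles (e.g., Wait Deposit): always RUNNING."""

    def tick(self):
        return Status.RUNNING


class Fallback:
    """Ticks children left to right; stops at the first non-FAILURE result."""

    def __init__(self, *children):
        self.children = children

    def tick(self):
        for child in self.children:
            status = child.tick()
            if status is not Status.FAILURE:
                return status
        return Status.FAILURE


# Event-based waiting (Figure 21b): condition paired with an idling action.
world = {"sample_deposited": False}
wait_for_sample = Fallback(
    Condition(lambda: world["sample_deposited"]),  # Sample Deposited ?
    Idle(),                                        # Wait Deposit
)
```

In this sketch, `wait_for_sample.tick()` keeps returning `Running` while the sample is missing and returns `Success` on the first tick after `world["sample_deposited"]` becomes true, which is what lets a parent sequence move on to the drawer-closing branch.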
5.2 Formalism expressivity strengths and weaknesses

Table 7 reports the main strengths and weaknesses of the formalisms with respect to the expressiveness of the identified concerns. They have been drawn from the analysis of the expressiveness conducted in the previous section.

BT: In general, the major strength of BTs lies in the expression of reactive behavior, thanks to the tick-based execution strategy that allows for the continuous evaluation of the entire tree. This makes the evaluation of mission-level conditions and the monitoring of events (i.e., those conditions that must be checked throughout the whole mission, and the events that affect the overall mission) easy to specify within the model. Moreover, the Success, Failure, or Running result returned by all the nodes after each tick allows the continuous evaluation and control of the mission execution status. On the other side, as also mentioned in Section 4, the semantics of the Failure state is overloaded, since it is returned by both action and condition nodes for two possibly different reasons: for action nodes, it is usually associated with failures in the action being executed, while for condition nodes it is associated with a condition that is currently not met (see, for instance, the conditional structure in Table 2 or the PPA pattern in Figure 10, leveraging the condition nodes for controlling the tree ticking). This overloading requires disambiguation, so as to avoid the execution of unintended behavior. For instance, if the Enter Recovery action in Figure 15 fails for some reason, the resulting Failure can be interpreted by the fallback node at the tree root in the same way as if there was no thrust failure, causing the mission to proceed as if no failure was detected. Also, BT lacks dedicated constructs for time-related behavior and interactions with other robots, systems, and humans. They have to be manually implemented within the action nodes.
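The Failure overloading described above can be reproduced with a small Python sketch of the structure in Figure 15. The node classes and the `run_once` helper are hypothetical names introduced for illustration, and the `Continue Mission` leaf is an assumed stand-in for the tasks to be preempted, which are not detailed in the excerpt.

```python
from enum import Enum


class Status(Enum):
    SUCCESS = "Success"
    FAILURE = "Failure"
    RUNNING = "Running"


class Leaf:
    """Condition or action node; `fn` returns the Status of one tick."""

    def __init__(self, name, fn):
        self.name = name
        self.fn = fn

    def tick(self, trace):
        trace.append(self.name)  # record which leaves were ticked
        return self.fn()


class Sequence:
    """Ticks children left to right; stops at the first non-Success result."""

    def __init__(self, *children):
        self.children = children

    def tick(self, trace):
        for child in self.children:
            status = child.tick(trace)
            if status is not Status.SUCCESS:
                return status
        return Status.SUCCESS


class Fallback:
    """Ticks children left to right; stops at the first non-Failure result."""

    def __init__(self, *children):
        self.children = children

    def tick(self, trace):
        for child in self.children:
            status = child.tick(trace)
            if status is not Status.FAILURE:
                return status
        return Status.FAILURE


def run_once(thrust_failed, recovery_ok):
    """Tick the tree once; return (root status, names of ticked leaves)."""
    tree = Fallback(
        Sequence(
            Leaf("Thrust Failure",
                 lambda: Status.SUCCESS if thrust_failed else Status.FAILURE),
            Leaf("Enter Recovery",
                 lambda: Status.SUCCESS if recovery_ok else Status.FAILURE),
        ),
        # Assumed placeholder for the preemptable mission tasks.
        Leaf("Continue Mission", lambda: Status.RUNNING),
    )
    trace = []
    return tree.tick(trace), trace


no_event = run_once(thrust_failed=False, recovery_ok=True)
failed_recovery = run_once(thrust_failed=True, recovery_ok=False)
```

Both `no_event` and `failed_recovery` end with the root returning `Running` and the `Continue Mission` leaf being ticked: from the fallback's point of view, a failed recovery is indistinguishable from the absence of a thrust failure. Only the trace of ticked leaves differs, which is why an additional ad hoc node is needed to catch the action-level error.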
SM: In contrast to BTs, SMs employ transitions from states to explicitly distinguish between different events and different outcomes of the actions executed by the robots within each state. This allows the mission model to unambiguously distinguish different events and error causes, and to explicitly redirect the flow towards the states modeling their handling. As a drawback, handling events that can potentially affect every state requires outgoing transitions from each of them, hence making the model more complex. Organizing states hierarchically [28] (as in Figure 4) reduces such complexity. Similarly, for resuming a temporarily interrupted mission, transitions towards all the possible interrupted states have to be modeled. This also requires the specification of ad hoc selector states [44], hence contributing to the model complexity. As for BTs, the handling of temporal-related behaviors has to be manually realized within the logic of each state, as does the interaction with other robots, systems, and humans.

HTN: HTNs have the strength (unique among all the considered formalisms) of explicitly expressing pre- and post-conditions, which provide the support for binding abstract tasks to multiple methods that can realize decision-making through runtime planning. The task decomposition obtained by refining abstract tasks into methods naturally realizes a modular structure. As a drawback, choices cannot be embedded explicitly into the model, since planning is always required. Moreover, no support for reactiveness and no mechanisms to handle task status are provided within the model definition, relegating these features to external documentation/modeling and to behavior implementations.

BPMN: BPMN provides the richest expressivity among the considered formalisms, thanks to its comprehensive standard notation, which enables many mission-relevant concepts to be modeled explicitly.
Communication can be represented through message and signal events, while the variety of task types allows the specification of interactions with humans, robots, and external systems. Decision-making and control logic are naturally expressed through gateways. However, BPMN offers limited support for resuming tasks after interruption: once a token leaves an activity, its execution state is lost, requiring the activity to restart. As discussed in Observation 2.1, a possible workaround can be realized by acting on the process execution engine, by extending it to keep track of the executed tasks. However, this should be implemented on top of BPMN execution engines. Reacting to frequent events or modeling fine-grained robotic missions often leads to large and complex diagrams. For instance, continuous condition monitoring (e.g., battery level) can be modeled with event subprocesses, but their execution typically overrides the main flow unless the engine provides specialized handling. Alternatively, one may attach boundary events to each task or add further gateways, both of which increase model complexity.

5.3 Validation

To validate the findings related to RQ2, we asked experts to evaluate their agreement with the support of formalism expressivity and with the identified strengths and weaknesses. Hence, for each identified mission concern, we defined the following VQs:

VQ2.1 Do you agree with the formalism expressivity for the concern?
VQ2.2 Do you agree with the identified strengths and weaknesses?

For each VQ, respondents rated their agreement on a 5-point Likert scale. For responses rated ≤ 3, participants were required to provide qualitative feedback by explaining the reason for the disagreement. Specifically, for VQ2.1, participants were asked to rate their agreement with each of the identified mission concerns, while VQ2.2 was asked for each formalism they acknowledged expertise with.

[Fig. 22. Likert responses for VQ2.1. Number of responses (Strongly disagree, Disagree, Neutral, Agree, Strongly agree) per mission concern: Reactive behavior; Decision making; Time-dep. behavior; Task status; Robot-robot interaction; Human-robot interaction; Robot-ext. sys. interaction; State saving and task resuming; Explicit waiting.]

[Fig. 23. Likert responses for VQ2.2. Number of responses (Strongly disagree, Disagree, Neutral, Agree, Strongly agree) per formalism: BT; SM; HTN; BPMN.]

Results and discussion: Figure 22 and Figure 23 show the overall agreement scores obtained for VQ2.1 and VQ2.2, respectively. Most of the participants agreed with both the expressivity assigned to the mission concerns and the identified strengths and weaknesses.

Concerning expressivity (VQ2.1), some disagreements were associated with the time-dependent behavior, the state saving, and the representation of the explicit waiting (7, 6, and 4 out of 29 respondents, respectively). In particular, they argued that some implementations of BT, SM, and HTN provide extensions to enable the modeling of time-dependent behavior, state saving and task resuming, and explicit waiting, and that such concerns can be realized without difficulty using the formalism elements. We included this in Observation 2.2. Additionally, some respondents noted that, for BPMN, state saving and task resuming could potentially be implemented at the execution-engine level. The follow-up discussion with [par:17] confirmed this possibility, and further hinted at the possibility of using compensation to obtain rollback-like behaviors, leading to Observation 2.1. Moreover, a small number of respondents (3 out of 29) disagreed on the human-robot and robot-external system interactions. As motivation, they suggested that interactions can be, in general, considered as normal actions and modeled as for other skills.
Follow-up discussions with [par:5] also clarified that interactions with external systems can be interpreted as part of the interactions with the robot’s environment, where an implicit communication is mediated through sensing actions. Similarly, some respondents argued that explicit waiting can be modeled using standard nodes, delegating the waiting logic to their execution-level implementation. After follow-up discussions, interviewed participants converged toward agreement with the assigned levels of support for the discussed concerns, as, per the defined support levels, the absence of dedicated explicit structures is marked as “partial support”.

Concerning the identified strengths and weaknesses (VQ2.2), respondents were mostly concerned about the ones identified for BTs (1 strongly disagree, 2 disagree), mostly arising from the following (now removed) weakness: “it is difficult to trace the origin of failures returned by nodes deeper in the tree”. After follow-up interactions with [par:5], we removed the statement from Table 7 because it applies to all the considered formalisms, particularly those supporting hierarchy, and more generally to programs using try-catch mechanisms. Moreover, the first weakness listed for BTs in Table 7 already captures this aspect through the use of dedicated nodes for error handling. Regarding SMs, the only reported disagreement concerned the treatment of task resuming. After follow-up discussions with [par:10] and [par:20], we clarified that history states can be used to model task saving and resuming, and we reflected this both in the text and in Table 7.

Summary of RQ2. The four formalisms have different expressivity properties. Overall, BT and SM, even though not always explicitly, are able to cover all the considered mission concerns by leveraging their base elements. HTN proved to be the most “rigid”, being unable to express reactive behavior and delegating the decision-making capabilities to runtime planning.
BPMN offers the richest set of elements, although the models can be complex if the mission specification is fine-grained. The identified concerns, even if not explicitly supported, could be addressed either by relying on workarounds leveraging the existing formalism elements (for partially-supported concerns), or at the implementation level through execution-layer mechanisms (for non-supported concerns).

6 AVAILABLE TOOLS SUPPORTING THE FORMALISMS (RQ3)

This section provides the analysis and evaluation of tools supporting the four formalisms. We first analyze the landscape of tools supporting the formalisms, focusing on those that are actively maintained and on their scope and usability in robotic missions.

6.1 Tools analysis

Tables 8, 9, 10, and 11 report the identified tools for each formalism, together with their scope, a brief description, and the corresponding documentation references. The rows highlighted in grey denote the tools that, although not designed and realized for robotic systems, have served as the foundation for the development of current ROS-compatible packages.

Regarding the scope, we distinguish five objectives that a tool may support: modeling, execution, monitoring, debugging, and planning. Modeling indicates that the tool provides graphical interfaces to create, edit, or visualize mission specifications using the constructs of the corresponding formalism. Execution refers to the tool’s ability to interpret, execute, or run the specified model, either as a standalone engine or as a component integrated within ROS. Monitoring refers to the provision of runtime information during execution, offering real-time insights into mission status and ongoing tasks. Debugging supports developers in diagnosing undesired behaviors through features such as trace visualization, state status, and logging of internal execution events. In addition to these scopes, tools for HTNs may also support a dedicated planning capability. This refers to the automatic generation of a task decomposition or action sequence from an HTN domain description.

TABLE 8
Identified tools for BT (grey-highlighted rows indicate baseline tools supporting the core formalism and used as foundations for ROS-based packages).
Name | Scope | Description | Documentation reference
BehaviorTree.CPP | Execution; Debugging | C++ library for executing BT. Provides a rich set of control nodes, logging facilities, and interfaces for debugging purposes. | [behaviortree.dev]
PyTrees | Execution; Monitoring; Debugging | Python library for executing BT, designed to facilitate the rapid development of medium-sized decision-making engines. Offers minimal visualizations for monitoring and debugging purposes. | [py-trees.readthedocs.io]
Groot | Modeling; Monitoring; Debugging | Graphical editor for BT. Allows tree design, log playback, and real-time introspection when connected to BehaviorTree.CPP. | [behaviortree.dev/groot]
Forester | Execution | Orchestration engine implementing BT. Supports exporting trees for use with the ROS navigation library. | [forester-bt.github.io]
BehaviorTree.ROS2 | Execution | ROS-compatible implementation of BehaviorTree.CPP. | [behaviortree.dev/docs/ros2_integration]
PyTrees ROS | Execution | ROS-compatible implementation of PyTrees. | [py-trees-ros.readthedocs.io]
ros2_ros_bt_py | Execution; Monitoring | Python library supporting runtime execution and integration with ROS2 components, with introspection capabilities for monitoring behavior. | [fzi-forschungszentrum-informatik.github.io/ros2_ros_bt_py/]

BT: is supported by several tools (see Table 8).
Among the most mature and used ones, BehaviorTree.CPP, PyTrees, and Groot provide graphical editors, execution engines, logging tools, and visualization support that facilitate the design and runtime analysis of BT-based behaviors. Moreover, ROS integration is natively supported through dedicated packages, i.e., BehaviorTree.ROS2 and PyTrees ROS. Similarly, the Forester tool provides a BT engine that enables the definition of behavior trees through its own DSL. Although it does not include ROS-specific components, it supports exporting a Forester tree into a format compatible with the ROS navigation library. Additionally, ros2_ros_bt_py is a ROS2-based Python library for defining and executing BTs. The library enables the specification of BTs directly in code and provides tight integration with ROS2 components, supporting execution and runtime monitoring. In addition to these tools, other solutions, such as CoSTAR [55], have been developed as prototypes released together with academic publications. As a consequence, they do not provide long-term maintenance or general-purpose applicability. Finally, it is worth noting the [github.com/narcispr/py_trees_meet_groot] module, which enables loading Groot-generated BTs into the PyTrees library, thus providing an automatic mapping from the BehaviorTree.CPP semantics to the PyTrees one.

Observation 3.1 – BT-related tools: The most widely adopted BT tools are the BehaviorTree.CPP ecosystem (including the Groot editor) and the PyTrees suite. These libraries are actively maintained, well-documented, and supported by a large and active community. However, they adopt different default execution semantics: BehaviorTree.CPP implements memoryful control nodes, while PyTrees employs a stateless approach. This implementation-level characteristic influences modeling choices and practical usage of the tools.
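The practical effect of memoryful versus stateless control nodes can be sketched independently of either library. The code below is an illustrative assumption, not the API of BehaviorTree.CPP or PyTrees: Status, Leaf, and Sequence are hypothetical classes, and the `memory` flag simply switches between the two semantics for a sequence node.

```python
from enum import Enum


class Status(Enum):
    SUCCESS = 1
    FAILURE = 2
    RUNNING = 3


class Leaf:
    """Leaf that replays a scripted list of statuses and counts its ticks."""

    def __init__(self, name, script):
        self.name = name
        self.script = list(script)
        self.ticks = 0

    def tick(self):
        # Once the script is exhausted, the last status is repeated.
        status = self.script[min(self.ticks, len(self.script) - 1)]
        self.ticks += 1
        return status


class Sequence:
    """Sequence node; `memory` selects between the two semantics."""

    def __init__(self, children, memory):
        self.children = children
        self.memory = memory
        self.resume_index = 0

    def tick(self):
        # Memoryful: resume from the child that was Running.
        # Stateless: always restart from the first child.
        start = self.resume_index if self.memory else 0
        for index in range(start, len(self.children)):
            status = self.children[index].tick()
            if status is Status.RUNNING:
                self.resume_index = index if self.memory else 0
                return status
            if status is Status.FAILURE:
                self.resume_index = 0
                return status
        self.resume_index = 0
        return Status.SUCCESS
```

For a sequence made of a condition followed by a long-running action, the memoryful variant evaluates the condition once and then keeps ticking the action, whereas the stateless variant re-evaluates the condition at every tick. The same model can therefore react differently to a condition that changes while the action is running, which is why the default semantics influences modeling choices.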
SM: unlike for the other formalisms, we did not identify established baseline tools outside the robotics domain that have served as foundations for ROS-compatible solutions (see Table 9). Instead, the most widely adopted tools in robotics are those developed directly within the ROS ecosystem itself. In particular, SMACH and FlexBE are mature solutions that provide modeling, execution, and visualizations specifically tailored for robotic behaviors. Additionally, the YASMIN tool [56] has been proposed to address the initial lack of ROS2 compatibility in SMACH and FlexBE. It is actively maintained and provides execution support and modeling facilities for SM-based robotic mission specification within ROS2. Similarly, SMACC2 is a ROS2-oriented library designed to address real-world industrial scenarios with real-time requirements. It does not provide graphical modeling support, as state machines are defined directly in code. Nevertheless, it provides execution capabilities, along with monitoring and debugging support, through built-in runtime visualization and diagnostic tools. Finally, RAFCON [29] offers a hierarchical state machine framework that features concurrent state execution for representing complex robot programs. It includes a graphical user interface for creating and editing state machines and provides IDE-like debugging mechanisms to support development and runtime monitoring.

TABLE 9
Identified tools for SM.
Name | Scope | Description | Documentation reference
SMACH | Execution; Monitoring | Python library for implementing hierarchical SM in ROS. Provides a runtime viewer that displays active states but offers no modeling interfaces. | [github.com/ros/executive_smach]
FlexBE | Modeling; Execution; Monitoring; Debugging | Behavior engineering toolkit with a graphical SM editor, execution engine, onboard monitoring, and debugging tools. | [github.com/flexbe]
YASMIN | Modeling; Execution; Monitoring; Debugging | ROS package for designing and executing SM using Python. Provides a shared blackboard for data exchange and a lightweight execution engine. | [github.com/uleroboticsgroup/yasmin]
SMACC2 | Execution; Monitoring; Debugging | Event-driven, asynchronous hierarchical state machine library for ROS2 in C++. | [smacc2.robosoft.ai]
RAFCON | Modeling; Execution; Monitoring; Debugging | Graphical tool for hierarchical and concurrent SM. Includes a graphical interface with visualization, variable inspection, breakpoints, and step-by-step debugging. | [github.com/DLR-RM/RAFCON]

Observation 3.2 – SM-related tools: SM tools have generally been developed independently, without building upon an established baseline tool. Among the most widely adopted are SMACH and FlexBE, while YASMIN and SMACC2 are gaining interest as the community transitions toward the ROS2 framework. Feedback from the evaluation indicated that these tools can present usability challenges, and that SMACH in particular appears to be approaching the end of its life.

HTN: is supported by a set of tools focused on task decomposition and planning (see Table 10). Planners like the SHOP family provide mature planning engines that allow the specification of tasks and methods in a hierarchical manner, supporting automated planning and reasoning over complex behaviors. A few tools, like InductorHTN and Hierarchical Task Network Planning AI, were specifically developed for controlling agents within videogames. The latter, in particular, provides a complete toolset, with a graphical interface, to assist developers in realizing HTNs, computing plans (with runtime replanning support), simulating their execution, executing them over the Unreal Engine environment, and debugging. In robotics, frameworks like ROSPlan [57] and PlanSys2 [58] extend HTN planning capabilities to the ROS ecosystem, enabling the integration of task planning within robot behavior.
We note that ROSPlan natively supports PDDL rather than HTN representations. Nevertheless, HTN models can be translated into PDDL under specific restrictions [59]. Additionally, only a few prototype repositories can be found; they are either tightly coupled to specific experimental setups, such as [github.com/Robertorocco/Pick_Place_Blocksworld_Environment], or no longer actively maintained, like [github.com/Leontes/ros_htn], which limits their practical reuse. Moreover, a few works, such as [32], [33], [45], proposed HTN implementations for ROS-based mission execution. However, they all employ ad hoc representations of HTNs, describing already-instantiated HTN trees resulting from planning. These are either manually provided or obtained using one of the aforementioned tools, and all their abstract tasks are already bound one-to-one with the methods refining them.

Observation 3.3 – HTN-related tools: Although the SHOP family of planners is relatively dated and no longer actively maintained, many HTN-based tools still build upon it or derive from its principles. In the ROS ecosystem, tools have been primarily developed around PDDL, or aim to integrate planning results with execution mechanisms such as BTs to enable direct deployment in robotic systems.

BPMN: is widely supported by an ecosystem of tools that cover the entire lifecycle of a process model, including process modeling, enactment, and monitoring (see Table 11). Several industrial and open-source platforms, such as Camunda and bpmn.io, provide editors, execution engines, and dashboards that facilitate the design and execution of BPMN workflows. These tools can form the basis for BPMN solutions that specify and execute robotic missions, as they provide infrastructures that can be adapted for robotic applications. However, as this formalism belongs to a business and organizational domain, only a few solutions exist to operationalize BPMN in the robotic domain.
Among them, the FaMe framework [18] provides support for modeling robotic missions and configuring them to be ROS-compliant. It enables mission execution through a BPMN engine implemented as a ROS node, developed by extending the functionalities of the bpmn.io toolsuite. Similarly, the TRACE tool [16] is intended to support both the modeling and execution of planned and contingent activities in robotic space missions. Unlike most other tools, it also includes model verification capabilities to assess feasibility before execution. However, the publicly available implementation appears to provide only execution functionality and lacks comprehensive documentation. Finally, B2XKlaim [60] takes a different approach by translating BPMN diagrams into executable multi-robot coordination code in the Klaim language. It enables users to visually design robot missions in BPMN, while the generated Klaim code supports their execution.

TABLE 10
Identified tools for HTN (grey-highlighted rows indicate baseline tools supporting the core formalism and used as foundations for ROS-based packages).
Name | Scope | Description | Documentation reference
Pyhop | Planning | Lightweight Python HTN planner. Suitable for prototyping due to its simple implementation. | [pyhop]
SHOP family | Planning | Domain-independent automated-planning systems based on ordered task decomposition. | [cs.umd.edu/projects/shop]
InductorHTN | Planning | Python and C++ lightweight HTN planning engine based on a Prolog compiler. It has been developed and used for providing planning support to agents in mobile games. | [github.com/EricZinda/InductorHtn]
HTN Planning AI | Modeling; Planning; Execution; Debugging | Plugin for Unreal Engine providing support for planning game characters’ AI. It supports modeling HTNs through a graphical interface, generating plans out of them, and running and debugging the plans. | [maksmaisak.github.io/htn]
ROSPlan | Planning; Execution | ROS-integrated planning framework. Generates plans and dispatches them to ROS components. | [kcl-planning.github.io/ROSPlan]
PlanSys2 | Planning; Execution | ROS package providing a planning system. After computing a plan, it automatically converts it into an executable BT, enabling integration with robot controllers. | [plansys2.github.io]

TABLE 11
Identified tools for BPMN (grey-highlighted rows indicate baseline tools supporting the core formalism and used as foundations for ROS-based packages).
Name | Scope | Description | Documentation reference
Camunda | Modeling; Execution; Monitoring; Debugging | Process orchestration BPMN platform. Provides graphical modeling tools, workflow engines, REST interfaces, and dashboards for runtime monitoring and debugging. | [camunda.com]
bpmn.io | Modeling | Web-based BPMN editor and viewer for designing and visualizing process models. Serves as the front-end basis for many BPMN-based applications. Provides many add-ons that can extend its scope. | [bpmn.io]
FaMe | Modeling; Execution; Monitoring | BPMN-driven framework for multi-robot system development. Provides BPMN modeling, automatic ROS-compliant mission configuration, and an execution engine implemented as a ROS node. | [pros.unicam.it/fame]
TRACE | Modeling; Execution | A BPMN execution engine providing a connector with the ROS framework. | [github.com/nasa/trace-executive] [github.com/nasa/trace-ros-connector]
B2XKlaim | Modeling; Execution | Translates BPMN diagrams into executable multi-robot coordination code in Klaim, enabling visual mission design and automated code generation. | [github.com/khalidbourr/B2XKlaim]

We acknowledge the existence of an additional BPMN-based solution presented in [61].
However, we do not include it among the available tools, as it is integrated into a broader and highly specialized workflow-management suite rather than being focused on robotic mission specification and execution, and its corresponding repository is outdated. For these reasons, we mention this work for completeness but do not list it as a usable tool in our analysis.

Observation 3.4 – BPMN-based tools: Existing BPMN-ROS solutions build on established tool suites that provide mature execution engines and modeling environments whose functionalities can be extended (e.g., to support ROS-based functionality). However, we observed that these tools have mainly been developed within the scope of specific projects or research publications. As a result, their broader adoption and long-term impact on the robotics community require further investigation.

6.2 Validation

To ensure the accuracy and completeness of the tools analysis, we asked the experts to confirm whether they were familiar with or had used the identified tools, to assess the accuracy of their classification in terms of scope and capabilities, and to indicate whether any relevant tools or aspects had been overlooked. Specifically, each expert was asked the following VQs:

VQ3.1 Are you familiar with or have you previously used any of the listed tools?
VQ3.2 Based on your experience, do you agree with the assigned scope?
VQ3.3 Are there further ROS-related tools that are missing?

The feedback collected during this interview was used to refine and strengthen the results presented in this section and to gather additional insights into the tools presented. For VQ3.1 and VQ3.3, participants responded to open-ended questions, whereas for VQ3.2, they rated their agreement on a 5-point Likert scale. Specifically, in VQ3.1, participants were asked to indicate which tools from the presented table they were familiar with, or to state “none” otherwise.
If familiarity with any tool was acknowledged, in VQ3.2, participants evaluated the correctness of the assigned tool scope. Finally, in VQ3.3, participants were invited to suggest additional formalism-related ROS tools that may have been missing from our analysis. Furthermore, participants were invited to provide feedback on tools they were familiar with, allowing us to capture end-user experiences.

Results and discussion: With respect to tool familiarity (VQ3.1), Figure 24 summarizes the distribution of tool knowledge across respondents. To provide a comprehensive view, the figure also includes tools suggested in response to VQ3.3, which are marked with an ∗.

For BT, most respondents are familiar with the BehaviorTree.CPP ecosystem, while PyTrees is slightly less commonly known. For SM, familiarity is primarily associated with SMACH, although it is known by only approximately half of the respondents (14 out of 27). Regarding HTN, a small subset of respondents reported familiarity with the SHOP family, while other solutions appear to be less widely recognized. Finally, for BPMN, the baseline tool suites Camunda and bpmn.io are the most commonly known among participants. We also analyzed the proportion of “none” responses, i.e., participants who indicated no familiarity with the proposed tools and did not suggest alternatives. This proportion amounts to approximately 27.3% of BT experts (6/22), 33.3% of SM experts (9/27), 50% of HTN experts (5/10), and 28.6% of BPMN experts (4/14). This distribution suggests that, specifically for HTN, knowledge may be more widespread at a conceptual or theoretical level than at the level of concrete tool usage and adoption.

Concerning the agreement with the assigned tool scope (VQ3.2), respondents expressed overall positive evaluations. Only one participant suggested a modification, noting that YASMIN also provides debugging support.
This observation has been incorporated into Table 9 and in the corresponding tool description. Additionally, another participant raised concerns about ROSPlan’s primary support for PDDL, noting that integrating HTNs would require translation into PDDL. We revised the description of the ROSPlan framework to explicitly clarify this aspect, while keeping its inclusion as a potential solution for integrating HTN-based approaches within the ROS ecosystem. Figure 25 reports the distribution of agreement levels. The number of respondents considered for this analysis includes only those who indicated familiarity with at least one tool in VQ3.1. Overall, the agreement levels confirm the appropriateness of the adopted scope classification.

Regarding missing tools (VQ3.3), respondents suggested additional solutions. In particular, ros2_ros_bt_py was included among the BT tools, and SMACC2 was added to the SM ones. We also discussed in the text [github.com/narcispr/py_trees_meet_groot], which provides an automatic translation between Groot and PyTrees, and represents a solution of interest for future investigation. Furthermore, for BPMN, TRACE and B2XKlaim were explicitly mentioned and have now been incorporated into Table 11 and discussed in the text. TRACE was already known from the literature; however, as its implementation was not initially mapped to a publicly available repository, it was not listed in the original table.

Finally, we report the qualitative feedback collected from respondents regarding their practical experience with the analyzed tools. With respect to BTs, respondents generally provided positive feedback on the maturity and usability of the BehaviorTree.CPP and PyTrees ecosystems. However, some participants noted differences in semantics between BehaviorTree.CPP and PyTrees, suggesting that these differences may influence expressivity and modeling choices.
This aspect has been summarized in Observation 3.1. Regarding SM tools, the feedback was more heterogeneous. Several respondents highlighted usability challenges, reporting that some SM tools can be difficult to configure or use in practice. We reported this in Observation 3.2. At the same time, a contrasting experience was reported (targeting SMACH and FlexBE), which were described as being used in academic environments and relatively easy for students to use. These diverging perspectives suggest that usability may depend significantly on context and usage objectives. For HTN-related tools, a participant emphasized that the SHOP family frameworks, although still considered reference implementations, are relatively old and not actively maintained, as reported in Observation 3.3. Finally, for BPMN, feedback was largely positive. In particular, bpmn.io was explicitly appreciated for its flexibility and suitability for adapting BPMN models to mission-specific requirements. This highlights the modeling and execution support provided by BPMN tools, even if their adoption in robotics remains less widespread.

Summary of RQ3. Publicly available tool support is uneven across the four formalisms. BTs and SMs benefit from several actively maintained, ROS-oriented frameworks that primarily target modeling, execution, monitoring, and debugging. HTN support is more limited and focuses on planners, with only a few tools providing integration with robotic execution. BPMN is supported by a mature ecosystem of business-process tools with strong modeling, execution, monitoring, and debugging capabilities; however, its integration into robotic systems remains at an early stage.
Feedback from the validation further highlighted differences in usability and maturity: BT tools and BPMN baseline tools were generally perceived as robust and well supported, SM tools were often noted as less user-friendly, and HTN tools were considered comparatively dated and less actively maintained.

[Fig. 24. Tool mentions grouped by formalism (∗-marked tools represent the ones suggested by respondents). Panels: Respondents familiarity BTs; Respondents familiarity HTN; Respondents familiarity SMs; Respondents familiarity BPMN.]

[Fig. 25. Likert responses for VQ3.2. Number of responses (Strongly disagree, Disagree, Neutral, Agree, Strongly agree) per formalism: BT; SM; HTN; BPMN.]

7 DISCUSSION

This section discusses the implications of our results from two complementary perspectives. First, we reflect on how well the four formalisms capture mission concerns that arise in real-world service and multi-robot settings, including variability introduced by human involvement and runtime uncertainty. Second, we discuss adoption-oriented factors, such as readability, reuse, accuracy, and tooling integration, that often determine whether a formalism is viable beyond controlled examples.

7.1 Formalisms expressiveness for real-world needs

Missions are a major driver of variability for service robots [3].
In particular, mission specification formalisms should support key sources of variability [3]: (i) the expertise of the human operator varies and is often domain-dependent, (ii) the means of human–robot interaction range from traditional interfaces to gesture- or voice-based interaction, and (iii) humans may share the environment with robots and participate at different levels, from passive to active to proactive involvement [62]. Human presence, therefore, requires abilities beyond merely completing the mission safely: robots increasingly operate with configurable degrees of autonomy and may need to adapt their behavior to human intentions and, potentially, affective cues (e.g., deciding when control should shift from the robot to the human to avoid ethically problematic situations) [62]. These aspects pose requirements on mission formalisms: they should be expressive enough, and they should enable operators, given their expertise and available interaction mechanisms, to specify missions correctly, safely, and at an appropriate level of detail [3].

Within this context, our analysis indicates that the four formalisms exhibit expressiveness gaps that are often addressed through external code or additional artifacts. Besides our results, prior work highlights the advantages of BTs, such as flexibility [8], reactivity, and modularity [10]. However, several mission concerns (e.g., interaction logic, temporal constraints, and waiting) are typically delegated to action-node implementations, as the core syntax provides no dedicated constructs for these aspects. This tendency also makes BTs tightly coupled with "behavioral glue code" that links the model to the underlying software system [43]. Moreover, some behavioral aspects (notably concurrency) are not strictly defined by the BT formalism and are often left to user-defined execution policies and implementations.
A similar conclusion applies to SMs, where many concerns are pushed into state implementations and supporting infrastructure. HTNs are less suited to missions where key choices cannot be fully committed at design time because they depend on runtime observations or evolving conditions (e.g., in the FL scenario, some decisions cannot be taken in advance [33]). While this limitation can be mitigated by providing multiple alternative methods for the same abstract task, this shifts the burden to online method selection and (re-)planning, which typically requires dedicated mission-management components (state monitoring, re-planning triggers, and safe plan replacement) and non-trivial engineering effort. The challenge is amplified in multi-robot missions, where decisions depend on distributed state (e.g., teammate availability and communication delays), making consistent replanning and coordinated task allocation harder to implement and validate.

Finally, our findings suggest that the formalisms are better suited to different abstraction levels. BPMN is well aligned with mission-level modeling, as it supports explicit orchestration and can facilitate the integration of robots with other devices as well as with human workflows. Compared to the other formalisms, the main advantage of HTN is that decision making is specified declaratively (tasks, alternative methods, and preconditions), rather than being hard-coded through explicit control flow; in automated planning terms, this corresponds to deliberation, i.e., reasoning about which course of action to take based on goals and the current state [34]. The resulting HTN specification can then be executed operationally like a process, similarly to what is done in distributed and architecture-oriented robotics planning frameworks [14], [32]. In contrast, BTs are often a good fit for task-level specifications, while SMs are typically suitable for well-scoped skills or modes with clear event-driven transitions.
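To make the declarative style attributed to HTNs above more concrete (tasks, alternative methods, and preconditions), the following minimal Python sketch decomposes an abstract task by selecting the first method whose precondition holds in the current state. It is an illustration under stated assumptions, not the behavior of any specific planner: all task names and state keys are hypothetical, and the decomposition is deliberately simplified (no backtracking and no modeling of operator effects, both of which a full HTN planner would handle).

```python
from typing import Callable, Dict, List, Tuple

State = Dict[str, object]
# A method pairs a precondition over the state with an ordered list of subtasks.
Method = Tuple[Callable[[State], bool], List[str]]

# Hypothetical primitive tasks (directly executable by the robot).
PRIMITIVES = {"navigate_to_target", "open_door", "hand_over", "notify_teammate"}

# Hypothetical abstract tasks, each with alternative methods tried in order.
METHODS: Dict[str, List[Method]] = {
    "deliver_item": [
        (lambda s: s["battery"] >= 30, ["reach_target", "hand_over"]),
        (lambda s: True, ["notify_teammate"]),  # fallback when battery is low
    ],
    "reach_target": [
        (lambda s: not s["door_closed"], ["navigate_to_target"]),
        (lambda s: s["door_closed"], ["open_door", "navigate_to_target"]),
    ],
}


def decompose(task: str, state: State) -> List[str]:
    """Return a flat list of primitive tasks for `task` in the given state."""
    if task in PRIMITIVES:
        return [task]
    for precondition, subtasks in METHODS[task]:
        if precondition(state):
            plan: List[str] = []
            for sub in subtasks:
                plan.extend(decompose(sub, state))
            return plan
    raise ValueError(f"no applicable method for {task!r}")


if __name__ == "__main__":
    print(decompose("deliver_item", {"battery": 80, "door_closed": True}))
    print(decompose("deliver_item", {"battery": 10, "door_closed": False}))
```

The choice between methods is expressed as data (preconditions) rather than as explicit control flow. Because the state is only read once at decomposition time in this sketch, any change observed later would require running the decomposition again, which is the online method selection and re-planning burden discussed above.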
Importantly, these formalisms are not mutually exclusive: as also suggested by some interviewees, a pragmatic approach is to combine them so that each is used where it provides the strongest modeling support (e.g., BPMN for orchestration and human/robot handoffs, HTN for deliberation and plan synthesis, and BTs/SMs for reactive execution and skill-level control).

7.2 Formalisms adoption in real-world settings

Beyond expressivity (Section 4), additional factors influence whether a mission specification formalism is adopted in practice, especially when intended users may include domain experts rather than robotics specialists. In what follows, we discuss six recurrent adoption drivers: (i) simplicity (including complexity and readability), (ii) scalability, (iii) extensibility, (iv) reusability, (v) accuracy, and (vi) integrability with available tooling and with heterogeneous devices (e.g., IoT sensors).

Simplicity, complexity, and readability. We interpret complexity as the number of modeling elements and the amount of connections, relationships, and interdependencies that must be managed to specify a mission, and readability as the extent to which a human can understand, maintain, and debug the model without extensive effort, both aspects being central for adoption [44]. Among the considered formalisms, BTs are often perceived as less immediate for non-experts: their tick-based semantics make the control flow implicit and not directly readable from the tree shape, so understanding the runtime behavior often requires mentally simulating the ticking mechanism and the propagation of status values. In addition, some constructs can become fragile when striving for robustness: since both conditions and actions return the same status domain, failures can trigger the same fallback behavior, which may inadvertently steer execution to alternative branches unless additional guarding logic is introduced.
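The point about the shared status domain can be illustrated with a minimal, self-contained Python sketch of tick-based execution. It is a simplified model written for this illustration (memoryless Sequence and Fallback nodes, hypothetical leaf names), not the API of any BT library: a failed condition ("door closed") and a failed action ("motor fault") both surface as FAILURE, so the Fallback node takes the same detour in either case.

```python
from enum import Enum
from typing import Callable, List


class Status(Enum):
    SUCCESS = "success"
    FAILURE = "failure"
    RUNNING = "running"


class Leaf:
    """Condition or action: both return a value from the same Status domain."""

    def __init__(self, name: str, fn: Callable[[], Status]) -> None:
        self.name = name
        self.fn = fn

    def tick(self) -> Status:
        return self.fn()


class Sequence:
    """Ticks children in order; stops at the first child that is not SUCCESS."""

    def __init__(self, children: list) -> None:
        self.children = children

    def tick(self) -> Status:
        for child in self.children:
            status = child.tick()
            if status is not Status.SUCCESS:
                return status
        return Status.SUCCESS


class Fallback:
    """Ticks children in order; stops at the first child that is not FAILURE."""

    def __init__(self, children: list) -> None:
        self.children = children

    def tick(self) -> Status:
        for child in self.children:
            status = child.tick()
            if status is not Status.FAILURE:
                return status
        return Status.FAILURE


def run(door_open: bool, motor_ok: bool) -> List[str]:
    """Tick a hypothetical tree once and return the ticked leaves plus root status."""
    ticked: List[str] = []

    def leaf(name: str, ok: bool) -> Leaf:
        def fn() -> Status:
            ticked.append(name)
            return Status.SUCCESS if ok else Status.FAILURE
        return Leaf(name, fn)

    tree = Fallback([
        Sequence([leaf("door_open?", door_open), leaf("pass_door", motor_ok)]),
        leaf("take_detour", True),
    ])
    ticked.append(tree.tick().name)
    return ticked


if __name__ == "__main__":
    print(run(door_open=False, motor_ok=True))   # condition fails
    print(run(door_open=True, motor_ok=False))   # action fails
```

In both failing runs the root reports SUCCESS after taking the detour, and nothing in the returned status distinguishes an expected condition failure from an unexpected action failure. Separating the two would require extra guarding nodes or information passed outside the status value.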
As a result, seemingly simple concerns (e.g., implementing a wait or carefully isolating failure causes) may lead to non-trivial modeling patterns and increased structural complexity. HTNs, while also tree-shaped, often offer a more linear "reading" of execution (e.g., depth-first decomposition), which can make the intended flow easier to grasp at a high level. In contrast, SMs and BPMN expose control flow more explicitly through their transition-/token-based semantics, which often improves traceability of execution paths. At the same time, BPMN provides a rich and standardized notation, which can support communication and documentation, but its breadth of constructs can raise the entry barrier and require deeper familiarity; this suggests that BPMN may be particularly suitable for documenting and communicating mission workflows at higher abstraction levels.

Scalability. In terms of scalability, i.e., the ability to model increasingly large and detailed missions while preserving readability and manageability, hierarchical structuring and modularization are key. BTs typically scale well through subtree composition and reuse, although scalability degrades when many mission concerns are implemented inside action nodes, splitting the logic between the model and code. SMs can become difficult to maintain as missions grow due to state/transition explosion, even if hierarchy mitigates it partially. HTNs scale effectively as domain libraries by adding methods/operators, but large method sets increase maintenance and debugging effort and often require additional infrastructure (e.g., monitoring and replanning) to remain reactive. BPMN can remain readable at scale when using subprocess decomposition and clear conventions, but highly detailed robot behaviors may still lead to large process models and extra effort to connect process-level logic to execution-level mechanisms.

Extensibility.
Extensibility is a desired characteristic of mission specification formalisms [44]: as missions evolve, engineers need to introduce new behaviors and cross-cutting concerns without rewriting large portions of the model. BTs can often support localized extensions by composing or replacing subtrees and by using decorators to wrap existing behaviors (e.g., retries, timeouts, guards, and recovery) with limited impact on otherwise independent parts. In contrast, some formalisms may "explode" under incremental change: SMs can suffer from state/transition explosion when new interrupts, priorities, and exception paths must be integrated across many states, while BPMN models can become unwieldy if low-level contingencies and exception handling are explicitly encoded in the process. This motivates combining complementary models so that extensions remain localized to the abstraction layer they affect.

Multi-robot settings further stress extensibility because they introduce task interruption, save-and-resume behavior, and team-level coordination concerns. For example, when priorities change due to limited resources (e.g., battery level), robots may need to interrupt one task and later resume it from the previous computational state; none of the analyzed formalisms supports task save-and-resume as a native capability, making resumption logic an additional engineering concern. Moreover, multi-robot missions inherently require task assignment and coordination: tasks must be allocated based on capabilities and availability, and execution must be synchronized through dependencies, rendezvous points, mutual exclusion over shared resources, and communication protocols. These concerns are rarely captured end-to-end by a single formalism, and extending missions in practice often entails evolving not only the behavior model but also the surrounding coordination mechanisms that ensure coherent team-level execution.

Reusability.
We define reusability as the extent to which mission fragments can be reused across missions with minimal adaptation. In principle, all formalisms support some form of modularization (e.g., hierarchical states, sub-processes, subtree composition, method libraries), but practical reuse depends on how well models can be separated from system-specific code and tooling. For BTs, reuse is often advertised through subtree composition and libraries of nodes; however, empirical evidence indicates that reuse mechanisms in robotics projects are frequently simple and that models may be deeply intertwined with "behavioral glue code" connecting them to the underlying software system, which hinders reuse and makes model-level manipulation (visualization, testing, reuse outside the original system) more difficult [43]. Similar risks exist for other formalisms whenever mission logic is split between the model and extensive external code (e.g., action implementations, event dispatchers, or custom runtime adapters), reducing portability of reusable fragments. As a standardized notation, BPMN provides explicit constructs for modularization, such as sub-processes and call activities, which can facilitate reuse when models are designed accordingly [18]. Nevertheless, the actual degree of reuse depends on modeling discipline: tight coupling with specific execution engines or custom extensions may limit portability, whereas well-structured and implementation-agnostic models can be more easily reused across missions and systems.

Accuracy. We interpret accuracy as the ability of the formalism to capture relevant mission concerns without relying on undocumented assumptions or external artifacts that are essential for understanding the intended behavior.
In real deployments, accuracy is challenged whenever crucial concerns are systematically delegated to low-level implementations (e.g., synchronization protocols, timing/waiting policies, interaction contracts), because the model ceases to be a self-contained representation of the mission. This is particularly problematic for validation and assurance: stakeholders may read the model as complete, while key behaviors are implicitly defined elsewhere. From this point of view, none of the considered formalisms is exempt from accuracy limitations, since all of them, in different ways, delegate the expression of concerns to ad hoc behavior implementation or to the underlying execution infrastructure. For instance, BTs and SMs often rely on the implementation of action nodes or state behaviors to encode coordination and communication mechanisms or timing constraints. HTNs externalize reactive behavior to the planner. BPMN, while explicitly modeling many concerns, typically assumes engine-level mechanisms for task suspension and resumption.

Integrability. Finally, integrability concerns the ease of embedding the specification into an operational robotic system and its ecosystem, leveraging existing tools and interfaces without requiring extensive bespoke adapters. This includes runtime execution support, monitoring/debugging facilities, and interoperability with external devices and services. BTs often integrate high-level decision-making with low-level control through mature robotics-oriented libraries and runtime infrastructures. BPMN is well supported in business-process platforms and can naturally integrate with enterprise and IoT ecosystems; however, a known limitation is that business processes are typically specified a priori and can behave like rigid action plans at runtime [63].
Recent proposals, therefore, combine process execution with automated planning to recover from exceptional situations and preserve progress during execution [64]–[66]. In particular, automated planning (including HTN-based approaches) has been argued to be well suited to synthesize at runtime the content of underspecified activities, i.e., generating sub-processes of appropriate granularity when it becomes clear what must be done at a given point in the process [66].

8 RELATED WORKS

Among the considered formalisms, BTs have attracted the most significant attention in robotics as a modular and reactive formalism for structuring robot behaviors. Prior work has discussed BTs from both conceptual and practical perspectives, including their modeling principles, typical control-flow constructs, and the engineering motivations behind their adoption in robotic systems. In particular, Ghzouli et al. [43] analyze key BT characteristics and modeling concepts, relate them to UML state and activity diagrams through a language-level mapping, and complement this discussion with an empirical analysis of how BTs are used in practice by mining GitHub repositories (e.g., adopted libraries, language elements, and reuse patterns). Complementary works provide broader background on BTs in robotics and their benefits: Colledanchise et al. discuss advantages of BTs over alternative control architectures [10] and further elaborate on BT design principles and expressiveness considerations [9], while Iovino et al. [25] offer a survey of BTs in robotics that synthesizes common patterns of use, implementation practices, and recurring challenges.

Beyond BT-focused studies, several contributions explicitly compare BTs with SMs, highlighting differences in execution semantics and their practical implications. Ghzouli et al.
[8] compare BTs and SMs through the lens of widely used DSL-based implementations (e.g., BehaviorTree.CPP and PyTrees for BTs, and SMACH and FlexBE for SMs), contrasting their modeling constructs and semantics and further analyzing their adoption in open-source projects mined from GitHub. In addition to conceptual and tooling-oriented comparisons, controlled empirical evidence has been reported on the effects of using BTs versus SMs in robot mission specification tasks [4], offering a user-centric view on the trade-offs between the two formalisms. Finally, comparative discussions have also been extended to broader mission-specification perspectives that position BTs and SMs with respect to complementary modeling approaches and their support for mission concerns [44].

In summary, existing work establishes BTs as a practical and widely adopted formalism in robotics [9], [10], [25], [43] and clarifies key trade-offs between BTs and SMs [4], [8], [44]. Our work builds on these foundations by adopting a mission-specification viewpoint and extends the comparative analysis beyond BTs and SMs to cover additional formalisms and the mission concerns they (explicitly or implicitly) support.

9 CONCLUSIONS AND FUTURE WORK

This paper compares four mission specification formalisms for robotics, namely BTs, SMs, HTNs, and BPMN, to clarify their expressiveness for real-world missions and the implications for adoption. We address three research questions by analyzing how each formalism represents core control structures and mission concepts across representative scenarios, synthesizing their strengths and weaknesses with respect to recurring mission concerns, and validating the analysis through an expert questionnaire survey complemented by targeted follow-up interactions.
Our results show that the formalisms are complementary rather than interchangeable: each is strongest at a particular abstraction level, while other concerns are offloaded to external artifacts or implementation code. BPMN best supports mission-level orchestration and integration with human workflows and heterogeneous devices; HTN supports declarative decision making (deliberation) and can synthesize executable structures from tasks, methods, and preconditions; BTs suit task-level reactive execution and modular composition; and SMs fit well-scoped skills and mode-based control. However, key mission concerns (e.g., temporal constraints, waiting, interaction protocols, and aspects of concurrency) are frequently delegated to action/state implementations, reducing the model's self-containment. In multi-robot missions, task interruption with save-and-resume, task assignment, and coordination over distributed state remain largely unsupported as first-class constructs and typically require additional mission-management infrastructure.

A key takeaway is therefore that these formalisms should be viewed as complementary rather than competing. In practice, combining them can be a pragmatic strategy to keep specifications readable, maintainable, and evolvable: for example, using BPMN or HTN at the mission level for orchestration and deliberation, while delegating execution-level robustness to BTs and skill/mode logic to SMs. This layered use also helps mitigate scalability and extensibility issues that may arise when a single formalism is stretched across all mission concerns.
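As a small illustration of why task interruption with save-and-resume, mentioned above, ends up as additional engineering outside the mission model, the following Python sketch keeps an explicit checkpoint next to a task. Everything in it is an assumption made for illustration: the class name, the step-level granularity of the checkpoint, and the `budget` parameter that stands in for an external interruption (e.g., a low-battery event).

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ResumableTask:
    """Hypothetical task wrapper that stores its own progress checkpoint."""

    steps: List[str]
    next_step: int = 0  # checkpoint: index of the first step not yet executed
    log: List[str] = field(default_factory=list)

    def run(self, budget: int) -> bool:
        """Execute at most `budget` steps; return True once the task is complete."""
        while self.next_step < len(self.steps) and budget > 0:
            self.log.append(self.steps[self.next_step])  # stand-in for real execution
            self.next_step += 1
            budget -= 1
        return self.next_step == len(self.steps)


if __name__ == "__main__":
    patrol = ResumableTask(["wp1", "wp2", "wp3"])
    recharge = ResumableTask(["dock", "charge"])

    done = patrol.run(budget=2)   # interrupted after two waypoints
    recharge.run(budget=10)       # higher-priority task runs to completion
    done = patrol.run(budget=10)  # resumes from the stored checkpoint
    print(done, patrol.log)
```

The checkpoint (`next_step`) lives in supporting code rather than in a BT, SM, HTN, or BPMN model, which mirrors the observation that resumption logic is not a first-class construct. A real system would also need to persist richer state and decide whether a resumed step is still valid.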
Future work should focus on (i) principled guidelines for multi-formalism mission specifications and their interfaces (e.g., plan-to-execution dispatch, monitoring feedback, and recovery), (ii) reusable patterns and tool support for recurring concerns such as interruption/resumption, distributed coordination, and human-in-the-loop adaptation, and (iii) shared benchmarks and empirical studies on larger systems to quantify trade-offs in scalability, maintainability, and correctness across formalisms and combinations thereof.

ACKNOWLEDGMENTS

This work has been partially funded by (a) the MUR (Italy) Department of Excellence 2023 - 2027, (b) the European HORIZON-KDT-JU research project MATISSE "Model-based engineering of Digital Twins for early verification and validation of Industrial Systems", HORIZON-KDT-JU-2023-2-RIA, Proposal number: 101140216-2, KDT232RIA 00017, (c) the PRIN project P2022RSW5W - RoboChor: Robot Choreography, (d) the PRIN project 2022JKA4SL - HALO: etHical-aware AdjustabLe autOnomous systems.

REFERENCES

[1] G.-Z. Yang, J. Bellingham, P. Dupont, P. Fischer, L. Floridi, R. Full, N. Jacobstein, V. Kumar, M. McNutt, R. Merrifield, B. Nelson, B. Scassellati, M. Taddeo, R. Taylor, M. Veloso, Z. L. Wang, and R. Wood, "The grand challenges of science robotics," Science Robotics, vol. 3, no. eaar7650, Jan. 2018. [Online]. Available: https://robotics.sciencemag.org/content/3/14/eaar7650
[2] P. Schillinger, S. García, A. Makris, K. Roditakis, M. Logothetis, K. Alevizos, W. Ren, P. Tajvar, P. Pelliccione, A. Argyros, K. J. Kyriakopoulos, and D. V. Dimarogonas, "Adaptive heterogeneous multi-robot collaboration from formal task specifications," Robot. Auton. Syst., vol. 145, no. C, Nov. 2021. [Online]. Available: https://doi.org/10.1016/j.robot.2021.103866
[3] S. García, D. Strüber, D. Brugali, A. D. Fava, P. Pelliccione, and T.
Berger, "Software variability in service robotics," Empirical Software Engineering, vol. 28, no. 1, p. 24, 2023.
[4] S. Dragule, E. Bainomugisha, P. Pelliccione, and T. Berger, "Effects of specifying robotic missions in behavior trees and state machines," Journal of Computer Languages, vol. 85, p. 101330, 2025.
[5] S. García, P. Pelliccione, C. Menghi, T. Berger, and T. Bures, "High-level mission specification for multiple robots," in Proceedings of the 12th ACM SIGPLAN International Conference on Software Language Engineering, ser. SLE 2019. New York, NY, USA: Association for Computing Machinery, 2019, p. 127–140. [Online]. Available: https://doi.org/10.1145/3357766.3359535
[6] C. Menghi, C. Tsigkanos, P. Pelliccione, C. Ghezzi, and T. Berger, "Specification patterns for robotic missions," IEEE Transactions on Software Engineering, vol. 47, no. 10, pp. 2208–2224, 2021.
[7] C. Menghi, C. Tsigkanos, M. Askarpour, P. Pelliccione, G. Vázquez, R. Calinescu, and S. García, "Mission specification patterns for mobile robots: Providing support for quantitative properties," IEEE Transactions on Software Engineering, vol. 49, no. 4, pp. 2741–2760, 2023.
[8] R. Ghzouli, T. Berger, E. B. Johnsen, A. Wasowski, and S. Dragule, "Behavior trees and state machines in robotics applications," IEEE Transactions on Software Engineering, vol. 49, no. 9, pp. 4243–4267, 2023.
[9] M. Colledanchise and L. Natale, "On the implementation of behavior trees in robotics," IEEE Robotics and Automation Letters, vol. 6, no. 3, pp. 5929–5936, 2021.
[10] M. Colledanchise, A. Marzinotto, D. V. Dimarogonas, and P. Oegren, "The advantages of using behavior trees in mult-robot systems," in Proceedings of ISR 2016: 47st International Symposium on Robotics, 2016, pp. 1–8.
[11] M. Colledanchise and P. Ögren, Behavior trees in robotics and AI: An introduction. CRC Press, 2018.
[12] J. M. Zutell, D. C. Conner, and P.
Schillinger, "Ros 2-based flexible behavior engine for flexible navigation," in SoutheastCon 2022, 2022, pp. 674–681.
[13] P. Schillinger, S. Kohlbrecher, and O. von Stryk, "Human-robot collaborative high-level control with application to rescue robotics," in 2016 IEEE International Conference on Robotics and Automation (ICRA), 2016, pp. 2796–2802.
[14] C. Lesire, G. Infantes, T. Gateau, and M. Barbier, "A distributed architecture for supervision of autonomous multi-robot missions - application to air-sea scenarios," Auton. Robots, vol. 40, no. 7, pp. 1343–1362, 2016. [Online]. Available: https://doi.org/10.1007/s10514-016-9603-z
[15] R. Rey, M. Corzetto, J. A. Cobano, L. Merino, and F. Caballero, "Human-robot co-working system for warehouse automation," in International Conference on Emerging Technologies and Factory Automation, ETFA. IEEE, 2019, pp. 578–585. [Online]. Available: https://doi.org/10.1109/ETFA.2019.8869178
[16] J.-P. de la Croix and G. Lim, "Event-driven modeling and execution of robotic activities and contingencies in the europa lander mission concept using bpmn," 2020.
[17] J. Whitaker, J. Swedeen, and G. Droge, "Mission planning and execution architecture for robotic systems using bpmn," in Intermountain Engineering, Technology and Computing (IETC). IEEE, 2024, pp. 34–39.
[18] F. Corradini, S. Pettinari, B. Re, L. Rossi, and F. Tiezzi, "A BPMN-driven framework for multi-robot system development," Robotics and Autonomous Systems, vol. 160, p. 104322, 2023.
[19] M. Askarpour, C. Tsigkanos, C. Menghi, R. Calinescu, P. Pelliccione, S. García, R. Caldas, T. J. von Oertzen, M. Wimmer, L. Berardinelli, M. Rossi, M. M. Bersani, and G. S. Rodrigues, "Robomax: Robotic mission adaptation exemplars," in 2021 International Symposium on Software Engineering for Adaptive and Self-Managing Systems (SEAMS), 2021, pp. 245–251.
[20] C. Basich, J. A. Russino, S. A. Chien, and S.
Zilberstein, "A sampling based approach to robust planning for a planetary lander," in IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS. IEEE, 2022, pp. 4106–4111. [Online]. Available: https://doi.org/10.1109/IROS47612.2022.9981083
[21] D. Wang, J. A. Russino, C. Basich, and S. A. Chien, "Analyzing the efficacy of flexible execution, replanning, and plan optimization for a planetary lander," in International Conference on Automated Planning and Scheduling, ICAPS. AAAI Press, 2022, pp. 518–526. [Online]. Available: https://ojs.aaai.org/index.php/ICAPS/article/view/19838
[22] C. Schlegel, A. Lotz, and D. Stampfer, "Robmosys composable models and software for robotics systems, d2.1 - deliverable d2.1: Modeling foundation guidelines and meta-meta-model structures," RobMoSys Project, EU H2020 Project Deliverable D2.1, 2017, robMoSys Deliverable. [Online]. Available: https://robmosys.eu/wp-content/uploads/2017/03/D2.1 Final.pdf
[23] ACM SIGSOFT, "Empirical Standards for Software Engineering," https://github.com/acmsigsoft/EmpiricalStandards/blob/master/docs/standards/QuestionnaireSurveys.md, 2020.
[24] I. Etikan, S. A. Musa, R. S. Alkassim et al., "Comparison of convenience sampling and purposive sampling," American journal of theoretical and applied statistics, vol. 5, no. 1, pp. 1–4, 2016.
[25] M. Iovino, E. Scukins, J. Styrud, P. Ögren, and C. Smith, "A survey of behavior trees in robotics and ai," Robotics and Autonomous Systems, vol. 154, p. 104096, 2022.
[26] S. Gugliermo, D. Cáceres Domínguez, M. Iannotta, T. Stoyanov, and E. Schaffernicht, "Evaluating behavior trees," Robotics and Autonomous Systems, vol. 178, p. 104714, 2024. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S0921889024000976
[27] J. Bohren and S. Cousins, "The SMACH high-level executive [ROS news]," IEEE Robotics & Automation Magazine, vol. 17, no. 4, pp. 18–20, 2010.
[28] D.
Harel, "Statecharts: A visual formalism for complex systems," Science of computer programming, vol. 8, no. 3, pp. 231–274, 1987.
[29] S. G. Brunner, F. Steinmetz, R. Belder, and A. Dömel, "RAFCON: A graphical tool for engineering complex, robotic tasks," in IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS. IEEE, 2016, pp. 3283–3290. [Online]. Available: https://doi.org/10.1109/IROS.2016.7759506
[30] OMG, "Unified Modeling Language," 2017.
[31] K. Erol, J. Hendler, and D. S. Nau, "HTN planning: complexity and expressivity," in Proceedings of the Twelfth AAAI National Conference on Artificial Intelligence, ser. AAAI'94. AAAI Press, 1994, p. 1123–1128.
[32] G. Rodrigues, R. Caldas, G. Araujo, V. de Moraes, G. Rodrigues, and P. Pelliccione, "An architecture for mission coordination of heterogeneous robots," Journal of Systems and Software, vol. 191, p. 111363, 2022.
[33] G. Filippone, J. A. Piñera García, M. Autili, and P. Pelliccione, "Handling uncertainty in the specification of autonomous multi-robot systems through mission adaptation," in Proceedings of the 19th International Symposium on Software Engineering for Adaptive and Self-Managing Systems, ser. SEAMS '24. New York, NY, USA: Association for Computing Machinery, 2024, p. 25–36.
[34] M. Ghallab, D. Nau, and P. Traverso, Automated planning and acting. Cambridge University Press, 2016.
[35] I. Georgievski and M. Aiello, "An overview of hierarchical task network planning," 2014. [Online]. Available: https://arxiv.org/abs/1403.7426
[36] D. Höller, G. Behnke, P. Bercher, S. Biundo, H. Fiorino, D. Pellier, and R. Alford, "Hddl: An extension to pddl for expressing hierarchical planning problems," Proceedings of the AAAI Conference on Artificial Intelligence, vol. 34, no. 06, pp. 9883–9891, Apr. 2020.
[37] M. Dumas, M. L. Rosa, J. Mendling, and H. A. Reijers, Fundamentals of Business Process Management, Second Edition. Springer, 2018.
[38] M.
Weske, Business Process Management - Concepts, Languages, Architectures. Springer, 2019.
[39] J.-P. de la Croix, G. Lim, J. Vander Hook, A. Rahmani, G. Droge, A. Xydes, and C. Scrapper Jr, "Mission modeling, planning, and execution module for teams of unmanned vehicles," in Unmanned Systems Technology XIX, vol. 10195. SPIE, 2017, pp. 160–172.
[40] OMG, "Business Process Model and Notation (BPMN) v. 2.0," 2011.
[41] I. Compagnucci, F. Corradini, F. Fornari, and B. Re, "A study on the usage of the BPMN notation for designing process collaboration, choreography, and conversation models," Bus. Inf. Syst. Eng., vol. 66, no. 1, pp. 43–66, 2024.
[42] M. Geiger, S. Harrer, J. Lenhard, M. Casar, A. Vorndran, and G. Wirtz, "BPMN conformance in open source engines," in Symposium on Service-Oriented System Engineering. IEEE Computer Society, 2015, pp. 21–30.
[43] R. Ghzouli, T. Berger, E. B. Johnsen, S. Dragule, and A. Wasowski, "Behavior trees in action: a study of robotics applications," in Proceedings of the 13th ACM SIGPLAN International Conference on Software Language Engineering, ser. SLE 2020. New York, NY, USA: Association for Computing Machinery, 2020, p. 196–209.
[44] M. Iovino, J. Förster, P. Falco, J. J. Chung, R. Siegwart, and C. Smith, "Comparison between behavior trees and finite state machines," IEEE Trans Autom. Sci. Eng., vol. 22, pp. 21 098–21 117, 2025. [Online]. Available: https://doi.org/10.1109/TASE.2025.3610090
[45] E. B. Gil, G. N. Rodrigues, P. Pelliccione, and R. Calinescu, "Mission specification and decomposition for multi-robot systems," Robotics and Autonomous Systems, vol. 163, p. 104386, 2023. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S0921889023000258
[46] G. R. Silva, J. Päßler, J. Zwanepol, E. Alberts, S. L. T. Tarifa, I. Gerostathopoulos, E. B. Johnsen, and C. H.
Corbato, "Suave: An exemplar for self-adaptive underwater vehicles," in 2023 IEEE/ACM 18th Symposium on Software Engineering for Adaptive and Self-Managing Systems (SEAMS), 2023, pp. 181–187.
[47] G. Filippone, S. Pettinari, and P. Pelliccione, "Formalisms for robotic mission specification and execution: A comparative analysis - replication package," 2026. [Online]. Available: https://gssi-robotics.github.io/robotic-mission-formalisms
[48] I. Nassi and B. Shneiderman, "Flowchart techniques for structured programming," SIGPLAN Not., vol. 8, no. 8, p. 12–26, Aug. 1973.
[49] "RobMoSys Wiki," https://robmosys.eu/wiki-sn-02/start.
[50] D. Crestani, K. Godary-Dejean, and L. Lapierre, "Enhancing fault tolerance of autonomous mobile robots," Robotics and Autonomous Systems, vol. 68, pp. 140–155, 2015. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S0921889014003157
[51] C. Schmidbauer, S. Zafari, B. Hader, and S. Schlund, "An empirical study on workers' preferences in human–robot task assignment in industrial assembly systems," IEEE Transactions on Human-Machine Systems, vol. 53, no. 2, pp. 293–302, 2023.
[52] B. Intrigila, G. D. Penna, and A. D'Ambrogio, "A lightweight BPMN extension for business process-oriented requirements engineering," Computers, vol. 10, no. 12, p. 171, 2021.
[53] C. Tsai, H. Luo, and F. Wang, "Constructing a BPM environment with BPMN," in International Workshop on Future Trends of Distributed Computing Systems. IEEE Computer Society, 2007, pp. 164–172.
[54] I. Millington, AI for Games. CRC Press, 2019.
[55] C. Paxton, A. Hundt, F. Jonathan, K. Guerin, and G. D. Hager, "Costar: Instructing collaborative robots with behavior trees and vision," in 2017 IEEE International Conference on Robotics and Automation (ICRA), 2017, pp. 564–571.
[56] M. Á. G. Santamarta, F. J. Rodríguez-Lera, V. M. Olivera, and C. F.
Llamas, “Y ASMIN: yet another state machine,” in ROBOT Iberian Robotics Conference - Advances in Robotics , ser . Lecture Notes in Networks and Systems, vol. 590. Springer , 2022, pp. 528–539. [Online]. A vailable: https://doi.org/10.1007/978- 3- 031- 21062- 4 43 [57] M. Cashmore, M. Fox, D. Long, D. Magazzeni, B. Ridder , A. Carrera, N. Palomeras, N. Hurt ´ os, and M. Carreras, “Rosplan: 30 Planning in the robot operating system,” in International Conference on Automated Planning and Scheduling, ICAPS . AAAI Press, 2015, pp. 333–341. [Online]. A vailable: http://www .aaai. org/ocs/index.php/ICAPS/ICAPS15/paper/view/10619 [58] F . Mart ´ ın, J. G. Clavero, V . Matell ´ an, and F . J. Rodr ´ ıguez, “Plansys2: A planning system framework for ROS2,” in IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS . IEEE, 2021, pp. 9742–9749. [Online]. A vailable: https://doi.org/10.1109/IROS51168.2021.9636544 [59] R. Alford, U. Kuter , and D. S. Nau, “T ranslating HTNs to PDDL: A small amount of domain knowledge can go a long way ,” in International Joint Conference on Artiﬁcial Intelligence (IJCAI) , 2009, pp. 1629–1634. [60] K. Bourr , F . Tiezzi, L. Bettini, and S. Seriani, “T ranslating bpmn models into x-klaim programs for developing multi-robot mis- sions,” International Journal on Software T ools for T echnology T ransfer , pp. 1–19, 2026. [61] W . Ochoa, J. Legaristi, F . Larrinaga, and A. Perez, “Dynamic context-aware workﬂow management architectur e for efﬁcient manufacturing: A ros-based case study ,” Future Generation Com- puter Systems , vol. 153, pp. 505–520, 2024. [62] M. Autili, M. De Sanctis, P . Inverardi, and P . Pelliccione, “Engineering digital systems for humanity: A research roadmap,” ACM T rans. Softw. Eng. Methodol. , vol. 34, no. 5, May 2025. [Online]. A vailable: https://doi.org/10.1145/3712006 [63] M. Reichert and B. W eber , Enabling Flexibility in Process-A ware Information Systems - Challenges, Methods, T echnologies . 
BIOGRAPHY SECTION

Gianluca Filippone is a Postdoctoral Researcher in Computer Science at Gran Sasso Science Institute (GSSI, Italy). He received his Ph.D. from the University of L’Aquila, Italy, in 2023. His research topic is software engineering, with focus on autonomous, self-adaptive, and robotic systems. His work spans from service-oriented and distributed architectures for self-adaptive systems to software engineering approaches for the specification and adaptation of robotic and multi-robot missions.

Sara Pettinari is a Postdoctoral Researcher in Computer Science at the Gran Sasso Science Institute (GSSI, Italy). She earned her PhD in Computer Science from the University of Camerino. Her research focuses on business process management and process mining, particularly for developing and analyzing robotic systems. Additionally, her work explores the integration of ethical aspects in the design and development of autonomous systems.

Patrizio Pelliccione is a Professor in Computer Science at Gran Sasso Science Institute (GSSI, Italy) and Director of the Computer Science area. Patrizio is also adjunct professor at the University of Bergen, Norway. His research topics are mainly in software engineering, software architecture modeling and verification, autonomous systems, and formal methods. He received his PhD in computer science from the University of L’Aquila (Italy). Thereafter, he worked as a senior researcher at the University of Luxembourg in Luxembourg, then assistant professor at the University of L’Aquila in Italy, then Associate Professor at both Chalmers | University of Gothenburg in Sweden and University of L’Aquila. He has been on the organization and program committees for several top conferences and he is a reviewer for top journals in the software engineering domain. He is very active in European and National projects. In his research activity, he has collaborated with several companies. More information is available at http://patriziopelliccione.com.
