The HIVE Tool for Informed Swarm State Space Exploration
Swarm verification and parallel randomised depth-first search are very effective parallel techniques to hunt bugs in large state spaces. In case bugs are absent, however, scalability of the parallelisation is completely lost. In recent work, we proposed a mechanism to inform the workers which parts of the state space to explore. This mechanism is compatible with any action-based formalism, where a state space can be represented by a labelled transition system. With this extension, each worker can be strictly bounded to explore only a small fraction of the state space at a time. In this paper, we present the HIVE tool together with two search algorithms which were added to the LTSmin tool suite to both perform a preprocessing step, and execute a bounded worker search. The new tool is used to coordinate informed swarm explorations, and the two new LTSmin algorithms are employed for preprocessing a model and performing the individual searches.
💡 Research Summary
The paper addresses a fundamental scalability issue in parallel model checking techniques such as Swarm Verification (SV) and parallel randomised depth‑first search. While these methods excel at quickly finding bugs by assigning each worker a distinct DFS ordering, they lose all parallel benefit when the system under verification is bug‑free, because every worker ends up exhaustively exploring the entire reachable state space. To overcome this, the authors propose Informed Swarm Verification (ISV), a framework that informs each worker about a specific, bounded portion of the state space to explore.
ISV relies on three preconditions: (1) the system specification can be represented as a labelled transition system (LTS) with deterministic labels; (2) the system consists of multiple parallel processes; and (3) at least one subsystem (denoted P₀) exhibits finite behaviour, i.e., it has a finite set of traces. The key idea is to pre‑compute all possible traces of P₀ and assign each trace to a worker as a “swarm trace”.
The preprocessing step (P1) builds the LTS of P₀ and computes a trace‑counting weight function tc(s) for each state s. The authors introduce a Trace‑Counting Depth‑First Search algorithm that assigns tc(s) = 1 to deadlock states and otherwise sets tc(s) to the sum of the tc values of the successors of s, so that tc of the initial state equals the total number of traces of P₀. Consequently, each trace can be uniquely identified by a natural number, allowing the HIVE tool to select traces efficiently.
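The trace-counting idea can be illustrated with a minimal Python sketch. It is not the LTSmin implementation (which is written in C): the toy LTS, the names `trace_counts` and `trace_of`, and the 0-based numbering in successor-list order are all assumptions made for illustration.

```python
# Hypothetical toy LTS for P0: state -> list of (action, successor).
# It must be acyclic, matching the finite-behaviour precondition.
P0_LTS = {
    "s0": [("a", "s1"), ("b", "s2")],
    "s1": [("c", "s3"), ("d", "s3")],
    "s2": [("e", "s3")],
    "s3": [],  # deadlock state
}


def trace_counts(lts, init):
    """Return tc: state -> number of maximal traces starting in that state."""
    tc = {}

    def visit(state):
        if state not in tc:
            successors = lts[state]
            # A deadlock state ends exactly one trace; otherwise sum over successors.
            tc[state] = sum(visit(t) for _, t in successors) if successors else 1
        return tc[state]

    visit(init)
    return tc


def trace_of(lts, tc, init, n):
    """Return the n-th trace (0-based, an assumed numbering) in successor order."""
    if not 0 <= n < tc[init]:
        raise ValueError("trace index out of range")
    trace, state = [], init
    while lts[state]:
        for action, target in lts[state]:
            if n < tc[target]:
                trace.append(action)
                state = target
                break
            n -= tc[target]  # skip all traces that pass through this successor
    return trace


tc = trace_counts(P0_LTS, "s0")
print(tc["s0"])  # 3
print([trace_of(P0_LTS, tc, "s0", n) for n in range(tc["s0"])])
# [['a', 'c'], ['a', 'd'], ['b', 'e']]
```

With this numbering, a coordinator only needs to pick a number below tc of the initial state to obtain a trace, and traces that share a prefix receive consecutive numbers.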
During the actual verification, the HIVE tool selects a trace σ (F1) and launches a worker that runs the Informed Swarm Search (ISS) algorithm, implemented as an extension of the LTSmin tool suite. ISS explores the full system P but restricts the behaviour of P₀ to follow the assigned trace σ. At each step i of σ, the algorithm expands two sets: (a) Next – successors generated by actions of the remaining processes (P \ P₀); and (b) Step – successors generated by the single action σ(i) of P₀. When the current Step set becomes empty, i is incremented and the next action of σ is used. This process continues until the entire trace σ has been consumed, after which the worker terminates.
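The search described above might be sketched as follows. This is a simplified reading rather than the LTSmin code: the function name `informed_swarm_search`, the `succ` callback, and the toy two-process system are hypothetical, and the sketch assumes that the worker advances to position i+1 once all states at position i have been expanded. It also records which P₀ actions were seen enabled at each position.

```python
def informed_swarm_search(succ, init, p0_actions, sigma):
    """Explore the full system while P0 may only follow the trace sigma.

    succ(state) yields (action, successor) pairs; p0_actions is the set of
    actions owned by P0. Returns the visited states and, per trace position,
    the set of P0 actions that were enabled in some visited state.
    """
    visited = {init}
    frontier = [init]   # states in which P0 is at position i of sigma
    observed = []
    for i in range(len(sigma) + 1):
        step = []        # successors reached via the P0 action sigma[i]
        seen_here = set()
        stack = list(frontier)
        while stack:
            state = stack.pop()
            for action, target in succ(state):
                if action in p0_actions:
                    seen_here.add(action)
                    if i == len(sigma) or action != sigma[i]:
                        continue  # P0 may not deviate from sigma
                    bucket = step
                else:
                    bucket = stack  # "Next": moves of the other processes
                if target not in visited:
                    visited.add(target)
                    bucket.append(target)
        observed.append(seen_here)
        frontier = step
    return visited, observed


# Hypothetical system: P0 interleaved with one other process Q, no synchronisation.
P0 = {"p0": [("a", "p1"), ("b", "p2")], "p1": [("c", "p3")], "p2": [], "p3": []}
Q = {"q0": [("x", "q1")], "q1": []}


def succ(state):
    p, q = state
    moves = [(a, (t, q)) for a, t in P0[p]]
    moves += [(a, (p, t)) for a, t in Q[q]]
    return moves


visited, observed = informed_swarm_search(succ, ("p0", "q0"), {"a", "b", "c"}, ["a", "c"])
print(len(visited))  # 6
print(observed[0] == {"a", "b"}, observed[1] == {"c"})  # True True
```

In this toy run the worker never enters states where P₀ has taken `b`, so it visits 6 of the 8 reachable states; the remaining two are left to the worker that follows the trace starting with `b`.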
While exploring, ISS records, for each position i of the trace, the set Fᵢ of P₀ actions actually observed. Upon completion, HIVE receives the Fᵢ sets and uses them to prune the global list of remaining swarm traces (F3). Because each trace corresponds to a contiguous range of natural numbers, pruning can be performed efficiently by merging ranges. This feedback mechanism also mitigates over‑approximation caused by synchronisations between P₀ and the rest of the system, ensuring that workers do not waste effort on infeasible traces.
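One possible reading of the pruning step is sketched below in Python. It assumes a numbering in which all traces sharing a prefix occupy a contiguous half-open index range, and that a P₀ action never observed after a given prefix makes every trace extending that prefix with the action infeasible. The helper names (`prefix_range`, `subtract`, `prune`) and the toy LTS are hypothetical, not part of HIVE.

```python
# Hypothetical acyclic LTS for P0 with precomputed trace counts tc(s).
P0_LTS = {
    "s0": [("a", "s1"), ("b", "s2")],
    "s1": [("c", "s3"), ("d", "s3")],
    "s2": [("e", "s3")],
    "s3": [],
}
TC = {"s0": 3, "s1": 2, "s2": 1, "s3": 1}


def prefix_range(lts, tc, init, prefix):
    """Half-open index range [lo, hi) of all traces that start with prefix."""
    lo, state = 0, init
    for wanted in prefix:
        for action, target in lts[state]:
            if action == wanted:
                state = target
                break
            lo += tc[target]  # traces through earlier successors come first
        else:
            raise ValueError("prefix is not a trace prefix of P0")
    return lo, lo + tc[state]


def subtract(ranges, cut):
    """Remove the half-open range cut from a list of disjoint ranges."""
    lo, hi = cut
    out = []
    for a, b in ranges:
        if b <= lo or hi <= a:
            out.append((a, b))
        else:
            if a < lo:
                out.append((a, lo))
            if hi < b:
                out.append((hi, b))
    return out


def prune(lts, tc, init, remaining, sigma, observed):
    """Drop the explored trace sigma and every trace shown to be infeasible."""
    remaining = subtract(remaining, prefix_range(lts, tc, init, sigma))
    state = init
    for i, wanted in enumerate(sigma):
        for action, _ in lts[state]:
            if action not in observed[i]:
                bad = list(sigma[:i]) + [action]
                remaining = subtract(remaining, prefix_range(lts, tc, init, bad))
        state = next(t for a, t in lts[state] if a == wanted)
    return remaining


# A worker followed ["a", "c"] and never saw "d" enabled at position 1.
print(prune(P0_LTS, TC, "s0", [(0, 3)], ["a", "c"], [{"a", "b"}, {"c"}]))
# [(2, 3)]
```

Here the explored trace (index 0) and the infeasible trace `a d` (index 1) are both removed, leaving only the range holding `b e`. Because whole prefixes map to single ranges, one report can discard many traces with a few range operations.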
The authors sketch a correctness argument by contradiction: suppose some reachable state remains unvisited after all swarm traces of P₀ have been processed. Reaching that state would require P₀ to perform, at some position i, an action different from σ(i) for the trace σ being followed. However, the resulting sequence of P₀ actions is itself a prefix of some other swarm trace σ′, which contradicts the assumption that every σ′ has been used. Hence, exhaustive exploration of all swarm traces guarantees complete coverage of the state space.
Implementation-wise, the new algorithms are added to LTSmin version 1.6‑19 in C, reusing existing data structures and duplicate‑detection mechanisms. Communication between HIVE and workers is lightweight: only the list of actions constituting the swarm trace is transmitted. A limitation noted by the authors is the lack of automatic extraction of P₀ from the full specification; users must manually identify the finite subsystem, which currently hampers broader experimentation.
The experimental evaluation uses a DRM protocol model where the iPod component serves as P₀. The trace‑counting DFS identifies 14 distinct traces. In bug‑free runs, the workload is evenly distributed among workers, leading to a substantial reduction in total verification time compared with naïve SV. In buggy runs, the diversity of traces ensures that a bug is discovered quickly, often after only a few workers have completed their assigned trace.
Future work includes automating the extraction of P₀, dynamic load balancing of traces, and scaling the approach to larger clusters. Overall, the HIVE tool and the ISV methodology introduce a novel “bounded exploration + feedback‑driven pruning” paradigm that alleviates the time‑explosion problem in explicit‑state model checking and restores the scalability of parallel verification even when the system under test is correct.