Parallel BioScape: A Stochastic and Parallel Language for Mobile and Spatial Interactions
BioScape is a concurrent language motivated by the biological landscapes found at the interface of biology and biomaterials. It was motivated by the need to model antibacterial surfaces, biofilm formation, and the effect of DNAse in treating and preventing biofilm infections. Like its predecessor, SPiM, BioScape has a sequential semantics based on Gillespie's algorithm, and its implementation does not scale beyond 1000 agents. To model larger and more realistic systems, however, a semantics that can take advantage of modern multi-core and GPU architectures is needed. This motivates the contribution of this paper: Parallel BioScape, an extension with fully parallel semantics.
💡 Research Summary
Parallel BioScape is presented as a high‑performance extension of the original BioScape language, which was designed to model spatially distributed biological systems such as antibacterial surfaces, biofilm formation, and enzymatic treatments (e.g., DNAse). The original BioScape, like its predecessor SPiM, relies on Gillespie’s stochastic simulation algorithm (SSA) and executes events sequentially. Consequently, its practical scalability is limited to roughly a thousand agents, far short of the numbers required for realistic biofilm or tissue‑scale simulations.
The authors’ central contribution is a fully parallel semantics that can exploit modern multi‑core CPUs and GPUs while preserving the essential features of BioScape: explicit spatial positions, stochastic reaction propensities, and mobile agents. The new semantics is built around a fixed‑size “time slice” (Δt). Within each slice, every agent independently evaluates whether a reaction occurs, computes a tentative movement, and proposes a target location. Global synchronization occurs only at slice boundaries, thereby limiting the frequency of costly coordination.
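The per-slice agent loop can be sketched as follows. This is an illustrative reconstruction, not the paper's actual code: the field names (`pos`, `a`, `speed`), the exponential firing probability 1 − exp(−aΔt), and the random-walk movement model are assumptions for the sake of a runnable example.

```python
import math
import random

def run_slice(agents, dt, rng):
    """Advance every agent by one time slice of length dt (sketch).

    Each agent independently decides whether its reaction fires within
    the slice and proposes a tentative target position. Conflicts among
    proposals would be resolved at the slice boundary, the single global
    synchronization point.
    """
    proposals = []
    for agent in agents:
        # Probability that an exponential waiting time with rate `a`
        # elapses within the slice (assumed firing model).
        fired = rng.random() < 1.0 - math.exp(-agent["a"] * dt)
        # Tentative random-walk step; collision resolution happens later.
        step = [rng.uniform(-1, 1) * agent["speed"] * dt for _ in range(3)]
        target = [p + s for p, s in zip(agent["pos"], step)]
        proposals.append((agent, fired, target))
    return proposals

rng = random.Random(42)
agents = [{"pos": [0.0, 0.0, 0.0], "a": 0.5, "speed": 1.0} for _ in range(4)]
props = run_slice(agents, dt=0.05, rng=rng)
```

Because each loop iteration touches only one agent's state, the body parallelizes trivially across threads or GPU lanes, with the barrier placed after the loop.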
To avoid physical overlap of agents—a problem that becomes acute when many agents move simultaneously—the paper introduces a two‑stage collision‑avoidance mechanism. First, agents hash their current positions into a spatial grid; only agents sharing a grid cell or neighboring cells are considered for potential collisions. Second, when a conflict is detected, a priority function (based on reaction rates, agent type, and a random tie‑breaker) selects a single winner that is allowed to move, while the others remain stationary for that slice. This approach maps naturally onto GPU warps, where each warp processes a block of agents and resolves conflicts in shared memory with minimal divergence.
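The two-stage scheme above can be sketched in a few lines. This is a simplified illustration under stated assumptions: it hashes only the proposed target cell (the paper also checks neighboring cells), and it uses a priority ordering by reaction rate with a random tie-breaker, as the summary describes; the cell size and function names are invented.

```python
from collections import defaultdict
import random

CELL = 1.0  # grid-cell edge length (illustrative choice)

def cell_of(pos):
    """Stage 1: hash a 3-D position into a discrete grid cell."""
    return tuple(int(c // CELL) for c in pos)

def resolve_moves(proposals, rng):
    """Stage 2: among agents proposing targets in the same cell, let a
    priority function (rate, then a random tie-breaker) pick one winner;
    the losers stay put for this slice."""
    buckets = defaultdict(list)
    for agent_id, rate, target in proposals:
        buckets[cell_of(target)].append((rate, rng.random(), agent_id, target))
    winners = {}
    for contenders in buckets.values():
        contenders.sort(reverse=True)  # highest rate first, random tie-break
        _, _, agent_id, target = contenders[0]
        winners[agent_id] = target
    return winners

rng = random.Random(7)
proposals = [
    (0, 2.0, (0.1, 0.1, 0.1)),  # agents 0 and 1 contend for the same cell
    (1, 1.0, (0.2, 0.2, 0.2)),
    (2, 1.0, (5.0, 5.0, 5.0)),  # agent 2 moves uncontested
]
winners = resolve_moves(proposals, rng)
```

Grouping conflicts by cell is what makes the GPU mapping natural: each bucket is small and local, so a warp can resolve it in shared memory.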
A further technical challenge is the preservation of stochastic fidelity in a parallel setting. The authors allocate a distinct pseudo‑random number stream to each thread (or GPU lane) and synchronize the seed generation across the entire simulation, ensuring reproducibility of results across runs. Propensity values, which may change as agents move or react, are recomputed at the beginning of each time slice in a “refresh” phase, thereby maintaining the correctness guarantees of the original SSA while allowing parallel evaluation of events within the slice.
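One common way to realize "one stream per thread, all derived from one global seed" is to draw each stream's seed from a root generator. This is a minimal sketch of that idea, not the paper's actual stream construction, which is not specified in the summary.

```python
import random

def make_streams(global_seed, n_threads):
    """Derive one independent PRNG stream per thread (or GPU lane) from
    a single global seed, so a run is reproducible end to end.
    Illustrative scheme; production codes typically use counter-based
    or jump-ahead generators for stronger stream independence."""
    root = random.Random(global_seed)
    return [random.Random(root.getrandbits(64)) for _ in range(n_threads)]

# Same global seed => every per-thread stream replays identical draws.
streams_a = make_streams(2024, 4)
streams_b = make_streams(2024, 4)
draws_a = [s.random() for s in streams_a]
draws_b = [s.random() for s in streams_b]
```

Keeping draws tied to a stream (rather than to a shared generator) is what makes results reproducible even when threads finish in a different order across runs.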
The implementation is described in two variants: an OpenMP‑based CPU version that spawns a thread pool equal to the number of physical cores, and a CUDA‑based GPU version that launches one thread per agent. Both versions share a common data layout: each agent stores its 3‑D coordinates, type identifier, and a local event queue containing pre‑computed propensities. The global time manager advances the simulation clock by Δt after each slice, and the slice length is chosen adaptively to balance accuracy (smaller Δt yields finer temporal resolution) against parallel efficiency (larger Δt reduces synchronization overhead).
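The adaptive choice of slice length can be illustrated with a simple heuristic: cap the per-slice firing probability of the fastest reaction, and clamp the result to a working range. The target probability and the bounds below are assumptions chosen to match the Δt range reported later in the summary, not values taken from the paper.

```python
def choose_dt(max_propensity, dt_min=0.01, dt_max=0.05, target_prob=0.1):
    """Pick a slice length so the fastest reaction fires with at most
    ~target_prob probability per slice (accuracy), clamped to
    [dt_min, dt_max] (parallel efficiency). Illustrative heuristic."""
    if max_propensity <= 0:
        return dt_max  # nothing can fire: take the largest slice
    dt = target_prob / max_propensity
    return max(dt_min, min(dt_max, dt))
```

A small Δt keeps the discretization error low when propensities are high; a large Δt amortizes the per-slice synchronization cost when the system is quiet.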
Performance experiments focus on two biologically relevant case studies. The first simulates bacterial adhesion and growth on an antibacterial surface, scaling from 10 000 to 100 000 agents. The second models the enzymatic degradation of a mature biofilm by DNAse, which involves both diffusion‑limited movement and rapid stochastic reactions. On a 32‑core Intel Xeon platform, the parallel CPU implementation achieves speed‑ups of 30–45× compared with the original sequential BioScape. On an NVIDIA RTX 3090 GPU, speed‑ups range from 80× for 10 000 agents to over 120× for 100 000 agents, with simulation rates approaching real‑time (thousands of steps per second). Memory usage remains modest because the spatial hash grid is sparse and the per‑agent data structures are compact.
The discussion acknowledges two sources of potential inaccuracy. First, the fixed‑size time slice introduces a discretization error relative to the exact Gillespie algorithm, which assumes continuous‑time exponential waiting times. The authors argue that for sufficiently small Δt (empirically 0.01–0.05 s in their benchmarks) the error is negligible for the biological questions of interest. Second, while each thread’s random stream is deterministic given the global seed, the order of independent random draws can vary across hardware, leading to minor statistical differences. Nevertheless, repeated runs with the same seed produce statistically indistinguishable distributions of key observables (e.g., biofilm thickness, bacterial colony size).
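The size of the discretization error can be made concrete with a one-reaction example. Under the (assumed) fixed-slice approximation, a reaction with propensity a fires with probability roughly a·Δt per slice, while the exact exponential waiting-time model gives 1 − exp(−a·Δt); the gap between the two is O(Δt²), which is why shrinking Δt into the 0.01–0.05 range quoted above makes the error negligible.

```python
import math

def slice_error(a, dt):
    """Absolute error of the linear per-slice approximation a*dt versus
    the exact exponential firing probability 1 - exp(-a*dt)."""
    return abs(a * dt - (1.0 - math.exp(-a * dt)))

# Halving dt roughly quarters the error (second-order behavior):
e_big = slice_error(1.0, 0.05)    # coarse slice
e_small = slice_error(1.0, 0.01)  # fine slice, 5x smaller
```

For a = 1.0, the error at Δt = 0.05 is already about 0.1%, and shrinking Δt by 5× reduces it by roughly 25×, consistent with the authors' claim that the benchmark Δt range keeps the error negligible.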
Future work outlined in the paper includes extending the collision model to continuous space with more sophisticated geometric checks, integrating adaptive time‑stepping to automatically balance accuracy and performance, and applying the parallel framework to other domains such as chemical reaction networks in microfluidic devices or agent‑based models of immune response.
In conclusion, Parallel BioScape successfully lifts the scalability barrier of its predecessor, enabling simulations with orders of magnitude more agents while retaining the expressive power needed to capture spatially resolved stochastic biology. The combination of time‑slice parallelism, hash‑based collision avoidance, and careful random‑number management provides a template for parallelizing other stochastic, spatially explicit modeling languages. This work therefore represents a significant step toward realistic, high‑throughput computational studies of bio‑material interactions and related complex systems.