A Simulation Model for Evaluating Distributed Systems Dependability


In this paper we present a new simulation model designed to evaluate the dependability of distributed systems. The model extends the MONARC simulation model with new capabilities for capturing reliability, safety, availability, security, and maintainability requirements. It has been implemented as an extension of the multithreaded, process-oriented simulator MONARC, which allows the realistic simulation of a wide range of distributed system technologies with respect to their specific components and characteristics. The extended simulation model includes the components needed to inject various failure events and provides mechanisms to evaluate different strategies for replication, redundancy procedures, and security enforcement. The results of the simulation experiments presented in this paper show that the use of discrete-event simulators, such as MONARC, in the design and development of distributed systems is appealing due to their efficiency and scalability.


💡 Research Summary

The paper introduces an extended simulation framework built on top of the MONARC discrete‑event simulator to evaluate the dependability of distributed systems. Dependability is defined as the combination of reliability, safety, availability, security, and maintainability; these facets are traditionally examined in isolation or within narrow domains. By integrating failure‑injection mechanisms, replication and redundancy models, security enforcement components, and maintainability actions into MONARC’s multithreaded, process‑oriented architecture, the authors create a unified environment where all five facets can be studied simultaneously.

Key technical contributions include: (1) a parametrized failure injection engine capable of modeling node crashes, network partitions, and service anomalies, together with configurable recovery policies such as restart, rollback, and automatic fail‑over; (2) object‑oriented representations of data and service replication, supporting both synchronous and asynchronous schemes, which allow quantitative trade‑off analysis between replication cost (storage and bandwidth) and availability gain; (3) a security module that simulates authentication, authorization, encryption, and intrusion‑detection events, enabling measurement of how security controls affect performance and uptime; and (4) a maintainability layer that models patch deployment, configuration changes, and their associated downtime.
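To make the idea of failure injection in a discrete‑event simulator concrete, the following is a minimal sketch in Python. It is not MONARC code and does not reflect the paper's implementation (MONARC is a separate simulator with its own API). The function name `simulate_node`, its parameters, and the choice of exponentially distributed failure and repair times are all assumptions made for illustration. A single node alternates between crash and restart events drawn from a time‑ordered event queue, and the function reports the fraction of simulated time the node was up.

```python
import heapq
import random


def simulate_node(mttf, mttr, horizon, seed=0):
    """Simulate one node that alternates between up and down states.

    mttf, mttr: mean time to failure / mean time to repair (hypothetical
    parameters; exponential distributions are an assumption of this sketch).
    horizon: total simulated time.
    Returns the fraction of the horizon during which the node was up.
    """
    rng = random.Random(seed)
    events = []  # priority queue of (time, kind), ordered by time
    heapq.heappush(events, (rng.expovariate(1.0 / mttf), "crash"))

    up_since = 0.0  # time the node last came up; None while it is down
    uptime = 0.0

    while events:
        now, kind = heapq.heappop(events)
        if now >= horizon:
            break
        if kind == "crash":
            # Injected failure: close the current up interval, schedule recovery.
            uptime += now - up_since
            up_since = None
            heapq.heappush(events, (now + rng.expovariate(1.0 / mttr), "restart"))
        else:
            # Recovery policy "restart": node is up again, schedule next crash.
            up_since = now
            heapq.heappush(events, (now + rng.expovariate(1.0 / mttf), "crash"))

    if up_since is not None:
        # Node was still up when the simulation ended.
        uptime += horizon - up_since
    return uptime / horizon


if __name__ == "__main__":
    print(simulate_node(mttf=100.0, mttr=10.0, horizon=1_000_000.0, seed=1))
```

Over a long horizon the measured value should approach `mttf / (mttf + mttr)`. Other recovery policies mentioned above, such as rollback or fail‑over, would appear in this sketch as different handlers for the crash event.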

The authors validate the framework through large‑scale experiments involving thousands of virtual nodes. Simulation runtimes remain under a few minutes and memory consumption scales linearly, demonstrating the approach’s efficiency and scalability. Results show that asynchronous replication can improve overall availability by roughly 15 % while reducing network overhead by 30 % compared with synchronous replication. Stronger security policies raise intrusion‑detection success by 25 % but increase average response latency by 12 %. Automatic active‑passive fail‑over cuts mean‑time‑to‑recover by 40 % and boosts system availability by 8 %.
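The link between recovery time and availability can be illustrated with the standard steady‑state formula, availability = MTTF / (MTTF + MTTR). The summary does not state how the paper computes availability, so the formula is used here only as a textbook reference point, and the input values (100 hours and 10 hours) are invented for illustration, not taken from the experiments.

```python
def availability(mttf, mttr):
    """Steady-state availability from mean time to failure and mean time to recover."""
    return mttf / (mttf + mttr)


if __name__ == "__main__":
    # Illustrative numbers only (assumed, not from the paper).
    mttf_hours = 100.0
    mttr_hours = 10.0

    baseline = availability(mttf_hours, mttr_hours)        # about 0.909
    # Apply a 40 % reduction in mean time to recover.
    improved = availability(mttf_hours, mttr_hours * 0.6)  # about 0.943

    print(f"baseline availability: {baseline:.3f}")
    print(f"after 40% MTTR cut:    {improved:.3f}")
```

How much availability a given recovery‑time reduction buys depends on the ratio of MTTR to MTTF, so the gain in this toy example should not be expected to match the figures reported above.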

Because the extensions are modular, new failure types or security threats can be added as plug‑ins without altering the core simulator, offering a flexible testbed for researchers and system designers. The study concludes that discrete‑event simulation, exemplified by the enhanced MONARC platform, provides a cost‑effective, high‑fidelity means to explore design alternatives, assess trade‑offs, and pre‑emptively verify dependability requirements before actual deployment of distributed infrastructures.
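One common way to obtain this kind of plug‑in extensibility is a registry that maps failure names to handler functions. The sketch below assumes such a design; the summary does not describe the actual MONARC extension mechanism, and every name here (`FAILURE_PLUGINS`, `register_failure`, `inject`, and the dictionary‑based node state) is hypothetical.

```python
from typing import Callable, Dict

# Hypothetical registry: failure name -> function that transforms node state.
FAILURE_PLUGINS: Dict[str, Callable[[dict], dict]] = {}


def register_failure(name):
    """Decorator that registers a failure handler under the given name."""
    def decorator(fn):
        FAILURE_PLUGINS[name] = fn
        return fn
    return decorator


@register_failure("node_crash")
def node_crash(state):
    # Return a copy of the state with the node marked as down.
    return {**state, "up": False}


@register_failure("network_partition")
def network_partition(state):
    # Return a copy of the state with the node marked as unreachable.
    return {**state, "reachable": False}


def inject(name, state):
    """Apply the registered failure to a node state; the core loop never changes."""
    return FAILURE_PLUGINS[name](state)


if __name__ == "__main__":
    node = {"up": True, "reachable": True}
    print(inject("node_crash", node))
    print(inject("network_partition", node))
```

Adding a new failure type in this design means writing one decorated function; `inject` and the rest of the simulator core stay untouched.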

