Approximate Model-Based Diagnosis Using Greedy Stochastic Search

We propose a StochAstic Fault diagnosis AlgoRIthm, called SAFARI, which trades off guarantees of computing minimal diagnoses for computational efficiency. We empirically demonstrate, using the 74XXX and ISCAS-85 suites of benchmark combinatorial circuits, that SAFARI achieves several orders-of-magnitude speedup over two well-known deterministic algorithms, CDA* and HA*, for multiple-fault diagnoses; further, SAFARI can compute a range of multiple-fault diagnoses that CDA* and HA* cannot. We also prove that SAFARI is optimal for a range of propositional fault models, such as the widely-used weak-fault models (models with ignorance of abnormal behavior). We discuss the optimality of SAFARI in a class of strong-fault circuit models with stuck-at failure modes. By modeling the algorithm itself as a Markov chain, we provide exact bounds on the minimality of the diagnosis computed. SAFARI also displays strong anytime behavior, and will return a diagnosis after any non-trivial inference time.

💡 Research Summary

The paper introduces SAFARI (StochAstic Fault diagnosis AlgoRIthm), a novel greedy stochastic search algorithm for model‑based diagnosis (MBD) that deliberately sacrifices the strict guarantee of minimal diagnoses in order to achieve dramatic gains in computational efficiency. The authors begin by framing the classic MBD problem: given a system model (typically a propositional description of a combinational circuit) and a set of observations, the task is to find a set of components whose abnormality explains the observations. Traditional exact algorithms such as CDA* and HA* explore the space of candidate diagnoses exhaustively, guaranteeing minimality but suffering exponential blow‑up in time and memory, especially when multiple faults are present.
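To make the problem statement concrete, the following is a minimal sketch of a weak‑fault diagnosis problem solved by brute force. The two‑inverter circuit, the names `COMPONENTS`, `consistent`, `all_diagnoses`, and `minimal`, and the use of subset‑minimality are illustrative assumptions, not taken from the paper.

```python
from itertools import product

# Hypothetical toy system: two inverters in series, x -> inv1 -> y -> inv2 -> z.
COMPONENTS = ("inv1", "inv2")

def consistent(faulty, x, z):
    """Weak-fault check: healthy inverters must invert, faulty ones are unconstrained."""
    for y in (0, 1):  # try every value of the unobserved internal wire
        ok1 = "inv1" in faulty or y == 1 - x
        ok2 = "inv2" in faulty or z == 1 - y
        if ok1 and ok2:
            return True
    return False

def all_diagnoses(x, z):
    """Enumerate every fault set that is consistent with the observation (x, z)."""
    found = []
    for bits in product((0, 1), repeat=len(COMPONENTS)):
        faulty = frozenset(c for c, b in zip(COMPONENTS, bits) if b)
        if consistent(faulty, x, z):
            found.append(faulty)
    return found

def minimal(diagnoses):
    """Keep only diagnoses with no proper subset that is also a diagnosis."""
    return [d for d in diagnoses if not any(other < d for other in diagnoses)]

# Observation: input 1, output 0 (a healthy circuit would output 1).
minimal_diagnoses = minimal(all_diagnoses(x=1, z=0))
```

With this observation the empty fault set is inconsistent, and either inverter alone explains the output. The enumeration visits all 2^n fault sets, which is the kind of exponential cost that motivates an approximate search.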

SAFARI departs from exhaustive search by treating the diagnosis process as a stochastic walk over the space of candidate diagnoses. At each iteration the algorithm evaluates the cost of the current candidate (normally the cardinality of the fault set) and greedily moves to a neighboring candidate with lower cost. To avoid becoming trapped in local minima, it occasionally makes a random jump to another candidate, governed by a small mutation probability. This combination of greedy descent and occasional random perturbation yields anytime behavior: a valid (though possibly non‑minimal) diagnosis is available after any non‑trivial amount of computation, and solution quality improves as more time is allocated.
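The loop described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: it assumes a weak‑fault model (so blaming every component is always a valid starting point), models the random perturbation as re‑blaming one healthy component, and uses a hypothetical conflict‑set oracle `hits_all_conflicts` in place of a real consistency checker. The function name and its parameters `steps`, `p_mutate`, and `seed` are likewise made up for the example.

```python
import random

def greedy_stochastic_diagnosis(components, is_consistent, steps=200,
                                p_mutate=0.02, seed=0):
    """Return the smallest consistent fault set seen within `steps` iterations."""
    rng = random.Random(seed)
    current = set(components)      # trivial diagnosis: everything is faulty
    best = set(current)            # anytime answer: always a valid diagnosis
    for _ in range(steps):
        healthy = [c for c in components if c not in current]
        if healthy and rng.random() < p_mutate:
            # Random perturbation: blame one more component again.
            # Under a weak-fault model a superset of a diagnosis stays consistent.
            current.add(rng.choice(healthy))
            continue
        if not current:
            break                  # the empty fault set cannot be improved
        # Greedy move: try to exonerate one randomly chosen faulty component.
        candidate = current - {rng.choice(sorted(current))}
        if is_consistent(candidate):
            current = candidate
            if len(current) < len(best):
                best = set(current)
    return best

# Hypothetical toy oracle: a fault set is consistent iff it hits every conflict.
CONFLICTS = [{"a", "b"}, {"b", "c"}]

def hits_all_conflicts(faulty):
    return all(faulty & conflict for conflict in CONFLICTS)

result = greedy_stochastic_diagnosis(["a", "b", "c"], hits_all_conflicts)
```

Because `best` is only ever replaced by a consistent candidate, stopping the loop at any iteration still returns a valid diagnosis, which is the anytime property in miniature. Depending on the random choices, the walk may settle on `{"b"}` or on the larger but still subset‑minimal `{"a", "c"}`.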

A central theoretical contribution is the formal modeling of SAFARI as a discrete‑time Markov chain. The state space consists of all possible diagnoses, and transition probabilities are defined by the greedy selection rule and the mutation rate. Using this model the authors derive exact bounds on the probability of reaching a minimal diagnosis and on the expected number of steps required. They prove that SAFARI is optimal for weak‑fault models, in which the abnormal behavior of a component is left completely unspecified (i.e., “ignorance” of the fault): regardless of the mutation probability, the Markov chain is absorbing in the set of minimal diagnoses, guaranteeing convergence to a minimal solution with probability one. For strong‑fault models, particularly the stuck‑at fault models common in digital circuit diagnosis, optimality holds under additional structural constraints (e.g., tree‑like circuit topologies) and when the number of simultaneous faults is bounded.
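As an illustration of the kind of quantity such an analysis yields, the sketch below computes the expected number of steps to absorption in a toy chain. It is not the chain analyzed in the paper: the states are candidate sizes only, each step either shrinks the candidate by one component with a hypothetical success probability or leaves it unchanged, and size 1 is assumed to be the absorbing minimal diagnosis. Under those assumptions the expectation satisfies E[k] = 1/p_k + E[k-1].

```python
from fractions import Fraction

def expected_steps_to_absorption(success_prob):
    """Expected steps to reach the absorbing size from each transient size.

    success_prob[k] is the (hypothetical) probability that a single step
    shrinks a size-k candidate to size k-1; otherwise the chain stays at k.
    The absorbing state is the size just below the smallest key.
    """
    expected = {}
    total = Fraction(0)
    for size in sorted(success_prob):
        # Waiting time at this size is geometric with mean 1 / p_k.
        total += 1 / Fraction(success_prob[size])
        expected[size] = total
    return expected

# Toy numbers: from size 3 every flip succeeds, from size 2 half of them do.
expected = expected_steps_to_absorption({3: Fraction(1), 2: Fraction(1, 2)})
```

Here a walk starting from three blamed components needs three steps on average: one to leave size 3 and two to leave size 2.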

Empirical evaluation is performed on two well‑known benchmark suites: the 74XXX family of combinational circuits and the ISCAS‑85 benchmark set. For each circuit, the authors generate random multiple‑fault scenarios (2–5 simultaneous faults) and compare SAFARI against CDA* and HA* in terms of runtime, memory consumption, and diagnostic quality (size of the returned diagnosis). The results show that SAFARI achieves speed‑ups ranging from two to four orders of magnitude while using a fraction of the memory required by the deterministic algorithms. Moreover, in many instances CDA* and HA* either exceed the allotted time limit or run out of memory, whereas SAFARI still returns a useful diagnosis. The anytime property is demonstrated by plotting diagnosis cost versus elapsed time; SAFARI quickly produces a coarse diagnosis and refines it steadily, eventually reaching minimality in the cases where the theoretical guarantees apply.

The paper also discusses practical parameter selection. The mutation probability (typically set between 0.01 and 0.05) balances exploration and exploitation, while a depth limit proportional to the assumed maximum number of faults prevents unnecessary wandering in large search spaces. Incorporating component failure probabilities into the cost function further tailors the algorithm to real‑world reliability data, improving the relevance of the returned diagnoses.
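One common way to fold failure probabilities into a cost is the negative log‑probability of a health assignment under independent component failures. The sketch below shows that choice. Whether the paper uses exactly this form is not stated in this summary, and the function name `diagnosis_cost` and the probabilities are hypothetical.

```python
import math

def diagnosis_cost(faulty, failure_prob):
    """Negative log-probability of a fault set, assuming independent failures."""
    cost = 0.0
    for component, p in failure_prob.items():
        # A faulty component contributes -log(p), a healthy one -log(1 - p).
        cost -= math.log(p if component in faulty else 1.0 - p)
    return cost

# Hypothetical reliability data: component "b" fails far more often than "a" or "c".
failure_prob = {"a": 0.01, "b": 0.2, "c": 0.01}
cost_b = diagnosis_cost({"b"}, failure_prob)
cost_ac = diagnosis_cost({"a", "c"}, failure_prob)
```

With equal failure probabilities this cost orders diagnoses by cardinality. With unequal ones it prefers blaming unreliable components, so `{"b"}` scores lower than `{"a", "c"}` here.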

In conclusion, SAFARI frames diagnosis as a stochastic process and rigorously analyzes its Markovian dynamics. This gives the authors both theoretical guarantees (optimality for weak‑fault models and certain strong‑fault cases) and practical performance benefits (speed‑ups of several orders of magnitude, a low memory footprint, and anytime behavior). The work opens avenues for extending stochastic search to richer system models (e.g., dynamic or hybrid systems), integrating learning‑based transition models, and applying the approach to domains beyond digital circuits, such as software debugging and fault detection in cyber‑physical systems.