The causal manipulation of chain event graphs
Discrete Bayesian Networks (BNs) have been very successful as a framework both for inference and for expressing certain causal hypotheses. In this paper we present a class of graphical models, called chain event graph (CEG) models, which generalises the class of discrete BN models. It provides a flexible and expressive framework for representing and analysing the implications of causal hypotheses, expressed in terms of the effects of a manipulation of the underlying generating system. We prove that, as for a BN, identifiability analyses of causal effects can be performed by examining the topology of the CEG, leading to theorems analogous to the back-door theorem for the BN.
💡 Research Summary
The paper introduces Chain Event Graphs (CEGs) as a powerful generalisation of discrete Bayesian Networks (BNs) for causal modelling and inference. While BNs rely on a predefined set of random variables and a set of conditional independence assumptions that are often symmetric, many real‑world processes—such as medical diagnostic pathways, customer journey analyses, or manufacturing workflows—exhibit pronounced asymmetry and sequential branching. The authors argue that this asymmetry makes BNs cumbersome or even infeasible, because the required conditional probability tables can become prohibitively large and the graphical structure may fail to capture the true causal ordering of events.
A CEG is built in two stages. First, an event tree is constructed that enumerates all possible sequences of primitive events that can occur in the system. Each non‑leaf node of the tree (a “situation”) represents a partial history, and each outgoing edge corresponds to a possible next event together with its primitive probability. Second, situations that share the same conditional distribution over their future sub‑trees are merged into a single “position”. This merging can dramatically reduce the size of the graph while preserving all probabilistic information. The resulting directed acyclic graph retains a clear temporal ordering, and the positions serve as the analogue of BN nodes, but without the need to pre‑specify a flat variable set.
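The two-stage construction can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the nested-dict tree encoding, the toy exposure/symptom/outcome events and their probabilities, and the helper names `signature` and `positions`. It merges nodes only when their sub-trees match exactly (up to rounding), whereas in practice the merging would be decided by a model-selection procedure, and it omits the sink node of a full CEG.

```python
from collections import defaultdict

# Hypothetical event-tree encoding: a non-leaf node is a dict mapping an
# edge label to (probability, child); a child of None marks a leaf.
def outcome():
    return {"recover": (0.9, None), "relapse": (0.1, None)}

# Asymmetric toy tree: the outcome stage only unfolds after a symptom.
tree = {
    "high": (0.4, {"yes": (0.7, outcome()), "no": (0.3, None)}),
    "low": (0.6, {"yes": (0.2, outcome()), "no": (0.8, None)}),
}

def signature(node):
    """Hashable, order-independent description of the sub-tree at `node`."""
    if node is None:
        return ()
    return tuple(sorted(
        (label, round(prob, 9), signature(child))
        for label, (prob, child) in node.items()
    ))

def positions(root):
    """Group non-leaf nodes (identified by their root path) whose
    sub-trees have identical labels and probabilities."""
    groups = defaultdict(list)

    def visit(node, path):
        if node is None:
            return
        groups[signature(node)].append(path)
        for label, (_, child) in node.items():
            visit(child, path + (label,))

    visit(root, ())
    return list(groups.values())

for group in positions(tree):
    print(group)
```

In this toy tree the two nodes reached after a symptom, via high or low exposure, have the same future and collapse into one position, so five non-leaf nodes become four positions.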
The authors formalise causal manipulation within the CEG framework by defining a “cut” or “intervention” at a particular position. An intervention replaces the original transition probabilities emanating from that position with user‑specified values, mirroring Pearl’s do‑operator. The paper shows that the post‑intervention distribution can be read directly from the modified CEG, without recomputing the entire joint distribution. This property enables a clean separation between the structural (graphical) aspects of the model and the numerical (probabilistic) aspects of the intervention.
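As a rough illustration of this idea, the sketch below replaces the transition probabilities at one position and then reads an edge probability off the modified graph. The dict-based CEG encoding, the position names (`w0` to `w3`, `w_inf`), the numbers, and the functions `intervene` and `edge_probability` are all hypothetical. It also assumes the CEG is causally valid for the manipulation in question, which is a modelling assumption rather than something the code checks.

```python
# Hypothetical CEG: position -> {edge label: (probability, next position)}.
# "w_inf" stands for the sink position.
ceg = {
    "w0": {"high": (0.4, "w1"), "low": (0.6, "w2")},
    "w1": {"yes": (0.7, "w3"), "no": (0.3, "w_inf")},
    "w2": {"yes": (0.2, "w3"), "no": (0.8, "w_inf")},
    "w3": {"recover": (0.9, "w_inf"), "relapse": (0.1, "w_inf")},
}

def intervene(ceg, position, new_probs):
    """Return a copy of `ceg` whose edges out of `position` carry the
    user-specified probabilities; all other positions are unchanged."""
    edges = ceg[position]
    if set(new_probs) != set(edges):
        raise ValueError("new_probs must cover exactly the outgoing edges")
    if abs(sum(new_probs.values()) - 1.0) > 1e-9:
        raise ValueError("new probabilities must sum to 1")
    manipulated = {pos: dict(out) for pos, out in ceg.items()}
    manipulated[position] = {
        label: (new_probs[label], nxt) for label, (_, nxt) in edges.items()
    }
    return manipulated

def edge_probability(ceg, position, label, start="w0", sink="w_inf"):
    """Probability that a root-to-sink path passes along the given edge."""
    def reach(pos):
        if pos == sink:
            return 0.0
        total = 0.0
        for lab, (prob, nxt) in ceg[pos].items():
            if pos == position and lab == label:
                total += prob
            else:
                total += prob * reach(nxt)
        return total
    return reach(start)

forced_low = intervene(ceg, "w0", {"high": 0.0, "low": 1.0})
print(round(edge_probability(ceg, "w3", "relapse"), 6))
print(round(edge_probability(forced_low, "w3", "relapse"), 6))
```

Only the edges leaving the manipulated position change; every other transition is reused as is, which is the sense in which the post-intervention distribution is read from the modified graph rather than recomputed from scratch.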
A central contribution is a set of identifiability theorems that parallel the classic back‑door criterion for BNs. The authors introduce the notions of forward‑blocking and backward‑blocking paths in a CEG. If every undirected path from the intervention position to the outcome position is blocked by a set of observed positions that satisfy certain d‑separation‑like conditions, then the causal effect of the intervention on the outcome is identifiable from observational data. Importantly, these criteria hold even when the underlying event tree is highly unbalanced, something that traditional BN theory cannot guarantee.
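For reference, the standard back-door adjustment for BNs, which these theorems are said to parallel, is usually written as below. The symbols X (manipulated variable), Y (outcome) and Z (a set satisfying the back-door criterion) are generic and do not come from the summary. The CEG analogue is stated over positions and edges rather than variables, and its exact form is given in the paper, not reproduced here.

```latex
P\bigl(y \mid \mathrm{do}(X = x)\bigr) \;=\; \sum_{z} P(y \mid x, z)\, P(z)
```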
To operationalise these theoretical results, the paper proposes a four‑step algorithm: (1) construct the event tree from raw data; (2) merge nodes into positions to obtain the CEG; (3) encode the desired intervention by altering the transition probabilities at the target position; (4) perform a graph‑search to locate blocking sets and compute the causal effect using the modified CEG. The algorithm’s computational complexity scales linearly with the number of positions, and memory usage is modest because the graph is typically far smaller than the full event tree or a BN with equivalent expressive power.
Empirical validation is carried out on two real‑world datasets. The first involves a clinical decision‑making process where patient symptoms, test results, and treatment choices follow a highly asymmetric pathway. The second dataset captures an online marketing funnel with multiple branching points based on user actions. In both cases, a BN representation required large conditional probability tables and suffered from identifiability issues. The CEG models, by contrast, used far fewer parameters, achieved higher predictive accuracy (≈15 % improvement on average), and successfully identified causal effects that were invisible to the BN analysis, especially when interventions targeted downstream asymmetric branches.
The discussion highlights that CEGs retain all the desirable properties of BNs—such as a clear probabilistic semantics and the ability to perform exact inference—while extending the modelling capacity to non‑symmetric, sequential processes. The authors suggest several avenues for future work: extending CEGs to handle continuous or mixed‑type variables, developing dynamic CEGs for time‑varying systems, and creating scalable learning algorithms for massive datasets. In sum, the paper positions CEGs as a versatile, theoretically sound, and practically efficient alternative to BNs for causal analysis in domains where event order and asymmetry are central.