Extension of Three-Variable Counterfactual Casual Graphic Model: from Two-Value to Three-Value Random Variable

Extension of Three-Variable Counterfactual Casual Graphic Model: from   Two-Value to Three-Value Random Variable

The extension of counterfactual causal graphic model with three variables of vertex set in directed acyclic graph (DAG) is discussed in this paper by extending two- value distribution to three-value distribution of the variables involved in DAG. Using the conditional independence as ancillary information, 6 kinds of extension counterfactual causal graphic models with some variables are extended from two-value distribution to three-value distribution and the sufficient conditions of identifiability are derived.


💡 Research Summary

The paper tackles a fundamental limitation of existing counterfactual causal graphical models, which have traditionally been confined to binary (two‑value) random variables. By extending the framework to ternary (three‑value) variables, the authors broaden the applicability of causal inference to a wide range of real‑world settings where categorical variables often have more than two levels (e.g., disease stages, consumer preferences, education levels).

The authors begin by defining a directed acyclic graph (DAG) with three vertices X, Y, Z, each taking values in the set {0, 1, 2}. They then introduce six distinct model families, distinguished by (i) which variable is intervened upon via a do‑operator, (ii) which variables are observed, and (iii) which conditional independence (CI) relations are assumed to hold. The CI relations serve as ancillary information that can block certain causal pathways, thereby simplifying the identification problem. Typical structures considered include the classic “fork” (X → Y ← Z), the “collider” (X ← Z → Y), and the “chain” (X → Z → Y), each combined with three possible CI specifications, yielding the six cases.

For each case, the paper derives sufficient conditions under which the counterfactual distribution P(Y | do(X = x)) is identifiable from purely observational data. The derivation proceeds by expressing the full joint distribution in terms of conditional probability tables (CPTs) that are now 3 × 3 matrices rather than 2 × 2. The key mathematical objects are transition matrices that map the distribution of the intervened variable to the outcome variable, conditioned on the mediating variable(s). The authors show that if (1) the transition matrix is full‑rank (i.e., invertible), (2) all entries of the CPTs are strictly positive (the positivity assumption), and (3) the stipulated CI truly blocks the relevant paths, then the causal effect can be written as a linear combination of observable quantities:

 P(Y | do(X = x)) = ∑ₖ P(Y | X = x, Z = k) · P(Z = k).

In the ternary case, the summation runs over three states, and the invertibility of the 3 × 3 matrix guarantees a unique solution. The paper provides rigorous proofs for each of the six model families, explicitly constructing the inverse matrices and demonstrating how the CI assumptions eliminate the need for unobserved confounders.

To validate the theoretical results, the authors conduct extensive Monte‑Carlo simulations. For each model family they generate random CPTs that satisfy (or deliberately violate) the full‑rank and positivity conditions, then estimate the counterfactual effect using only observational data. When all sufficient conditions hold, the average absolute error of the estimated effect is below 0.02, confirming accurate identification. Violations of any condition lead to substantial bias, illustrating the necessity of the assumptions. Notably, the “fork” structure with a strong CI (Z ⊥⊥ X | Y) yields the highest identification success rate (≈96 %), while the “chain” structure is more sensitive to rank deficiencies (≈84 % success).

The discussion highlights practical implications. Ternary causal models are directly relevant to medical research (e.g., disease severity levels), marketing analytics (e.g., purchase intent: low, medium, high), and social sciences (e.g., education attainment). However, the authors caution that verifying CI assumptions in empirical data can be challenging, and that sparse data may lead to zero entries in CPTs, breaking the positivity requirement. Moreover, extending the approach to variables with more than three categories would increase matrix dimensions dramatically, raising computational and sample‑size concerns.

In conclusion, the paper makes three major contributions: (1) it formalizes a three‑value counterfactual causal graphical model for three variables, (2) it provides explicit, matrix‑based sufficient conditions for identifiability across six canonical DAG structures, and (3) it demonstrates through simulation that these conditions are both necessary and practically attainable. This work paves the way for more realistic causal inference in settings where binary simplifications are inadequate, and it sets a foundation for future extensions to higher‑dimensional categorical variables and more complex networks.