Causal Inference: A Tale of Three Frameworks
Causal inference is a central goal across many scientific disciplines. Over the past several decades, three major frameworks have emerged to formalize causal questions and guide their analysis: the potential outcomes framework, structural equation models, and directed acyclic graphs. Although these frameworks differ in language, assumptions, and philosophical orientation, they often lead to compatible or complementary insights. This paper provides a comparative introduction to the three frameworks, clarifying their connections, highlighting their distinct strengths and limitations, and illustrating how they can be used together in practice. The discussion is aimed at researchers and graduate students with some background in statistics or causal inference who are seeking a conceptual foundation for applying causal methods across a range of substantive domains.
💡 Research Summary
The paper “Causal Inference: A Tale of Three Frameworks” offers a systematic comparative review of the three dominant paradigms used to formalize causal questions: the Potential Outcomes (PO) framework, Non‑Parametric Structural Equation Models (NPSEM), and Directed Acyclic Graphs (DAG). The authors begin by motivating the importance of causal reasoning in modern data‑driven fields such as machine learning and artificial intelligence, noting that reliable causal conclusions are essential for building trustworthy, transparent, and robust systems. They observe that while each paradigm has been extensively studied in isolation, there is a paucity of concise side‑by‑side treatments that translate assumptions and results across the three approaches.
The first substantive section introduces each framework in turn. The PO framework, rooted in Neyman (1923) and Rubin (1974), treats causal effects as comparisons between potential outcomes Y(0) and Y(1) for each unit. Causal identification relies on two core assumptions: consistency (the observed outcome equals the potential outcome under the actually received treatment) and ignorability (treatment assignment is independent of potential outcomes conditional on observed covariates L). The authors discuss common estimands such as the average causal effect (ACE), conditional average treatment effect (CATE), effect of treatment on the treated (ETT), and quantile treatment effects, emphasizing that only population‑level quantities are identifiable without further assumptions.
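The identification logic in this paragraph can be made concrete with a small simulation. The data-generating process and variable names below are illustrative, not taken from the paper: under consistency and ignorability given a single covariate L, the average causal effect is recovered by standardization (the g-formula), while the naive contrast of observed group means is confounded.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Illustrative setup: confounder L, binary treatment A, outcome Y.
L = rng.binomial(1, 0.5, n)
A = rng.binomial(1, 0.3 + 0.4 * L)            # treatment assignment depends on L
Y = 2.0 * A + 1.5 * L + rng.normal(0, 1, n)   # true ACE = 2.0 by construction

# Naive contrast of observed means: biased, because L drives both A and Y.
naive = Y[A == 1].mean() - Y[A == 0].mean()

# Standardization (g-formula): average the within-stratum treated/untreated
# contrasts over the marginal distribution of L. Valid under consistency
# and ignorability given L.
ace = sum(
    (Y[(A == 1) & (L == l)].mean() - Y[(A == 0) & (L == l)].mean()) * (L == l).mean()
    for l in (0, 1)
)
print(naive, ace)  # naive is biased upward; ace is close to 2.0
```

The same stratify-and-average step, applied within levels of L before marginalizing, is what makes the population-level ACE identifiable while individual-level effects remain out of reach.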
NPSEM is presented as a functional representation of each variable X as X = f_X(Pa_X, ε_X), where Pa_X denotes the set of parent variables and ε_X is an exogenous error term. The model’s autonomy property allows an intervention on a variable A to be modeled by replacing A’s structural equation while leaving all other equations unchanged. When the additional independent‑error assumption (NPSEM‑IE) is imposed, the joint distribution of counterfactuals under different interventions is fully determined, so that cross‑world quantities become well‑defined alongside the ordinary interventional distributions of the system.
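The autonomy property described above can be sketched directly in code. This is a toy three-variable NPSEM of my own construction, not the paper's example: each variable is a function of its parents and an exogenous error, and an intervention do(A = a) simply swaps out A's equation while every other equation is left intact.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000

# Toy NPSEM over (L, A, Y): each variable is a deterministic function
# of its parents and an exogenous error term.
def f_L(eps_L):        return eps_L
def f_A(L, eps_A):     return (eps_A < 0.3 + 0.4 * L).astype(int)
def f_Y(L, A, eps_Y):  return 2.0 * A + 1.5 * L + eps_Y

def simulate(do_A=None):
    """Run the NPSEM; if do_A is given, replace A's equation (autonomy)."""
    eps_L = rng.binomial(1, 0.5, n)
    eps_A = rng.uniform(0, 1, n)
    eps_Y = rng.normal(0, 1, n)
    L = f_L(eps_L)
    A = np.full(n, do_A) if do_A is not None else f_A(L, eps_A)
    Y = f_Y(L, A, eps_Y)
    return Y

# Interventional contrast E[Y(1)] - E[Y(0)], obtained purely by
# equation replacement; f_L and f_Y are untouched.
ace = simulate(do_A=1).mean() - simulate(do_A=0).mean()
print(ace)
```

Because only A's equation is replaced, the mechanism generating Y is the same in both runs, which is exactly what autonomy asserts.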
DAGs encode causal assumptions graphically. The authors explain modularity (an intervention corresponds to deleting the edges into the intervened node) and causal sufficiency (no hidden common causes). Under these assumptions, graphical criteria such as the back‑door and front‑door criteria provide conditions for identifiability that can be checked directly on the assumed graph. The back‑door criterion, in particular, corresponds to the conditional ignorability assumption of the PO framework, while the front‑door criterion can identify effects even in the presence of certain unmeasured confounders.
The second major contribution of the paper is a detailed exposition of the formal relationships among the three paradigms. The authors show that (1) potential outcomes arise naturally from NPSEM by interpreting each structural equation’s counterfactual version as a potential outcome; (2) conversely, a collection of potential outcomes can be assembled into a canonical NPSEM representation; (3) any DAG can be turned into an NPSEM by attaching independent error terms to each node and specifying deterministic functions, a result known as the functional representation lemma; and (4) Single‑World Intervention Graphs (SWIGs) provide a bridge between DAGs and PO by splitting each intervened node into a fixed treatment node and a set of potential‑outcome nodes, preserving the graphical structure while making counterfactual variables explicit. This synthesis demonstrates that the three frameworks are not competing but rather complementary lenses on the same underlying causal structure.
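For the simple chain used throughout this summary (L → A → Y, with L → Y), correspondence (1) can be written out explicitly. The notation below follows the generic NPSEM form X = f_X(Pa_X, ε_X) introduced earlier; the specific three-variable graph is an illustration, not the paper's own derivation:

```latex
% Recursive substitution in the NPSEM for L -> A -> Y (with L -> Y):
% fixing A to a value a defines the potential outcome
L = f_L(\varepsilon_L), \qquad
A = f_A(L, \varepsilon_A), \qquad
Y(a) = f_Y(L, a, \varepsilon_Y).
% Consistency then falls out of the model rather than being assumed:
% the observed outcome is
Y = f_Y(L, A, \varepsilon_Y) = Y(A).
```

Reading the counterfactual version of each structural equation as a potential outcome in this way is precisely what lets SWIGs split an intervened node into a fixed treatment value and its potential-outcome descendants.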
In the comparative analysis, the authors evaluate the frameworks along several dimensions: philosophical orientation (realist counterfactuals vs. structuralist graph theory), expressive capacity (single‑stage vs. multi‑stage interventions), and identification power. They argue that NPSEM‑IE can yield stronger identification results because the independent‑error assumption rules out certain dependencies among unobserved disturbances, licensing identification of quantities that DAG‑based analyses alone would treat as non‑identifiable. However, they caution that the independence assumption may be overly restrictive in practice, and that DAGs offer a more flexible, testable way to encode uncertainty about unmeasured confounders.
The paper concludes with practical guidance for applied researchers. When the research question involves a single, well‑defined intervention, the PO framework with a carefully chosen set of covariates often suffices. For more complex causal structures, researchers should first construct a DAG to visualize assumptions, use graphical criteria to select adjustment sets, and then translate those sets into conditioning variables for a PO analysis. If the analyst is comfortable assuming independent errors and wishes to identify a broader class of interventional distributions, an NPSEM‑IE may be adopted, but sensitivity analyses are recommended to assess robustness to violations of the independence assumption. The authors also highlight the utility of SWIGs for communicating causal assumptions across interdisciplinary teams.
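The workflow step of translating a graph-derived adjustment set into a PO analysis admits more than one estimator. As a complement to direct standardization, the hedged sketch below uses inverse probability weighting on the same illustrative data-generating process as before (again my own toy example, not the paper's):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 200_000

# L plays the role of the adjustment set selected from the DAG,
# carried over as the conditioning variable of the PO analysis.
L = rng.binomial(1, 0.5, n)
A = rng.binomial(1, 0.3 + 0.4 * L)
Y = 2.0 * A + 1.5 * L + rng.normal(0, 1, n)   # true ACE = 2.0

# Propensity score P(A=1 | L), estimated within strata of L.
e = np.where(L == 1, A[L == 1].mean(), A[L == 0].mean())

# Inverse probability weighting: reweight each arm by its inverse
# probability of the treatment actually received.
ace_ipw = np.mean(A * Y / e) - np.mean((1 - A) * Y / (1 - e))
print(ace_ipw)
```

That the DAG supplies the adjustment set while the PO machinery supplies the estimator is exactly the division of labor the authors recommend.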
Finally, the authors point to emerging research that seeks to unify these perspectives, including automated causal graph learning, Bayesian structural equation modeling, and the integration of causal inference with reinforcement learning. They suggest that future methodological developments will increasingly blend the graphical intuition of DAGs, the counterfactual rigor of PO, and the functional flexibility of NPSEM to provide a coherent, transparent workflow for causal inference across scientific domains.