A Bayesian Approach to Constraint Based Causal Inference

We target the problem of accuracy and robustness in causal inference from finite data sets. Some state-of-the-art algorithms produce clear output complete with solid theoretical guarantees but are susceptible to propagating erroneous decisions, while others are very adept at handling and representing uncertainty, but need to rely on undesirable assumptions. Our aim is to combine the inherent robustness of the Bayesian approach with the theoretical strength and clarity of constraint-based methods. We use a Bayesian score to obtain probability estimates on the input statements used in a constraint-based procedure. These are subsequently processed in decreasing order of reliability, letting more reliable decisions take precedence in case of conflicts, until a single output model is obtained. Tests show that a basic implementation of the resulting Bayesian Constraint-based Causal Discovery (BCCD) algorithm already outperforms established procedures such as FCI and Conservative PC. It can also indicate which causal decisions in the output have high reliability and which do not.


💡 Research Summary

The paper tackles the long‑standing challenge of achieving both accuracy and robustness in causal discovery when only finite data are available. Classical constraint‑based algorithms such as PC, FCI, and Conservative‑PC rely on a series of binary conditional independence (CI) tests. While these methods enjoy strong theoretical guarantees and produce a single, interpretable graph, they are vulnerable to error propagation: a single mistaken CI decision can cascade through the orientation rules and corrupt large portions of the final model. On the opposite end of the spectrum, Bayesian approaches assign posterior probabilities to entire graph structures, thereby capturing uncertainty directly. However, they typically require strong prior assumptions, are computationally intensive, and often return a distribution over graphs rather than a single, decisive output.

The authors propose a hybrid framework called Bayesian Constraint‑based Causal Discovery (BCCD) that merges the two paradigms. The core idea is to compute a Bayesian score (e.g., BDeu for discrete data or BGe for continuous data) for each candidate CI statement that would be used in a conventional constraint‑based procedure. This score is transformed into a posterior probability p(statement | data), which serves as a quantitative “reliability” measure for that statement. All statements are then sorted in descending order of reliability. The algorithm proceeds exactly as a standard constraint‑based method—removing edges, identifying colliders, applying orientation rules—but it does so by processing the most reliable statements first. When a conflict arises (e.g., a high‑reliability statement suggests an edge while a lower‑reliability one suggests its removal), the lower‑reliability decision is overridden or discarded. In this way, the algorithm respects the logical constraints of the PC/FCI family while allowing the most trustworthy evidence to dominate the search.
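
The step from scores to reliabilities can be illustrated with a minimal sketch. It assumes each competing hypothesis about a statement already has a log marginal likelihood (for example from a BDeu or BGe computation, which is not implemented here) and a uniform prior unless one is supplied. The function names, statement labels, and numbers are hypothetical and do not come from the paper.

```python
import math


def posteriors_from_log_scores(log_scores, log_priors=None):
    """Normalise log marginal likelihoods into posterior probabilities.

    log_scores maps a hypothesis label to log p(data | hypothesis).
    A uniform prior is assumed when log_priors is None (an assumption
    made for this sketch, not a choice taken from the paper).
    """
    names = list(log_scores)
    if log_priors is None:
        log_priors = {name: 0.0 for name in names}
    joint = [log_scores[name] + log_priors[name] for name in names]
    # Log-sum-exp: subtract the maximum before exponentiating to avoid underflow.
    peak = max(joint)
    log_norm = peak + math.log(sum(math.exp(v - peak) for v in joint))
    return {name: math.exp(v - log_norm) for name, v in zip(names, joint)}


def rank_statements(statement_probs):
    """Return (statement, probability) pairs, most reliable first."""
    return sorted(statement_probs.items(), key=lambda item: item[1], reverse=True)


# Hypothetical scores for two competing hypotheses about one pair of variables.
example_scores = {"X indep Y given Z": -1050.2, "X dep Y given Z": -1052.2}
example_posteriors = posteriors_from_log_scores(example_scores)
example_ranking = rank_statements(example_posteriors)
```

With a log-score gap of 2 and a uniform prior, the independence hypothesis receives a posterior of roughly 0.88, so it would be ranked ahead of its competitor.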

Key technical contributions include:

  1. Reliability‑Weighted CI Testing – By attaching a Bayesian posterior to each CI test, BCCD replaces the traditional binary accept/reject decision with a graded confidence that can be directly compared across tests.
  2. Conflict Resolution Policy – The algorithm defines a deterministic rule: higher‑reliability statements always take precedence, effectively preventing error propagation from low‑confidence tests.
  3. Single‑Model Output with Confidence Annotation – After all constraints have been satisfied, BCCD yields a single directed acyclic graph (or a maximal ancestral graph when latent variables are present) together with edge‑wise posterior probabilities, enabling users to assess which causal claims are robust and which are tentative.
  4. Scalable Implementation – Although Bayesian scoring adds computational overhead, the authors show that the sorting step and the reuse of scores across multiple CI tests keep the overall runtime comparable to that of conventional PC/FCI on moderate‑size problems (up to a few hundred variables).
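
The conflict resolution policy in item 2 can be sketched on a toy problem. The sketch assumes each statement is a simple claim that an edge is either present or absent, tagged with a reliability. The actual BCCD procedure operates on logical causal statements and orientation rules, which this example does not attempt to reproduce. All names and values are hypothetical.

```python
def resolve_by_reliability(statements):
    """Process edge claims in decreasing order of reliability.

    statements is a list of (reliability, edge, claim) tuples, where edge
    is a pair of variable names and claim is "present" or "absent".
    The first (most reliable) claim about an edge is kept; any later
    claim that contradicts it is discarded.
    """
    decided = {}    # frozenset(edge) -> (claim, reliability)
    discarded = []  # contradicted statements, in processing order
    ordered = sorted(statements, key=lambda s: s[0], reverse=True)
    for reliability, edge, claim in ordered:
        key = frozenset(edge)  # treat (X, Y) and (Y, X) as the same edge
        if key not in decided:
            decided[key] = (claim, reliability)
        elif decided[key][0] != claim:
            discarded.append((reliability, edge, claim))
    return decided, discarded


# Hypothetical input: two statements disagree about the X-Y edge.
example_statements = [
    (0.60, ("X", "Y"), "absent"),
    (0.95, ("Y", "X"), "present"),
    (0.80, ("Y", "Z"), "absent"),
]
example_decided, example_discarded = resolve_by_reliability(example_statements)
```

Here the 0.95 claim that the X-Y edge is present is processed first, so the contradicting 0.60 claim is discarded instead of removing the edge. The stored reliabilities can then serve as the kind of confidence annotation described in item 3.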

The empirical evaluation consists of two parts. First, synthetic data are generated from known DAGs with varying numbers of variables (10–100) and sample sizes (100–500). BCCD consistently outperforms FCI and Conservative‑PC in terms of precision, recall, and F1‑score, especially in low‑sample regimes where traditional CI tests are most error‑prone. Second, real‑world benchmarks—including a Beijing air‑quality dataset and a gene‑expression regulatory network—demonstrate that BCCD recovers known causal relationships with fewer false positives and provides meaningful reliability scores that correlate with external validation experiments.
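
For reference, precision, recall, and F1 over recovered edges can be computed as in the sketch below. It assumes the comparison is made on undirected adjacencies between a known ground-truth graph and the algorithm output. The summary does not state whether orientations are also scored, so that choice is an assumption here, and the edge lists are hypothetical.

```python
def edge_metrics(true_edges, predicted_edges):
    """Precision, recall, and F1 for predicted adjacencies.

    Edges are pairs of variable names; direction is ignored, so
    ("A", "B") and ("B", "A") count as the same adjacency.
    """
    true_set = {frozenset(edge) for edge in true_edges}
    pred_set = {frozenset(edge) for edge in predicted_edges}
    true_positives = len(true_set & pred_set)
    precision = true_positives / len(pred_set) if pred_set else 0.0
    recall = true_positives / len(true_set) if true_set else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical ground truth and prediction: two of three edges are recovered.
example_true = [("A", "B"), ("B", "C"), ("C", "D")]
example_pred = [("B", "A"), ("C", "D"), ("A", "D")]
example_precision, example_recall, example_f1 = edge_metrics(example_true, example_pred)
```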

Despite its strengths, the paper acknowledges several limitations. The Bayesian scores depend on the choice of priors (e.g., equivalent sample size for BDeu), which can influence reliability estimates; automated prior selection or empirical Bayes methods are suggested as future work. Moreover, the current implementation scales quadratically with the number of variables due to repeated score calculations; the authors propose parallelization and sampling‑based approximations to address large‑scale settings. Finally, extensions to handle temporal data, non‑linear relationships, or mixed observational/interventional datasets remain open research directions.

In summary, BCCD offers a principled way to blend the interpretability and theoretical rigor of constraint‑based causal discovery with the uncertainty quantification inherent in Bayesian inference. By ranking CI statements according to posterior reliability and letting that ranking guide graph construction, the method mitigates error propagation, yields a single causal model annotated with reliability estimates, and outperforms FCI and Conservative‑PC in the reported experiments on synthetic and real data. The work is thus a step toward more trustworthy causal inference in practical, data‑limited scenarios.