Telling cause from effect based on high-dimensional observations
We describe a method for inferring linear causal relations among multi-dimensional variables. The idea is to use an asymmetry between the distributions of cause and effect that occurs if both the covariance matrix of the cause and the structure matrix mapping the cause to the effect are independently chosen. The method works for both stochastic and deterministic causal relations, provided that the dimensionality is sufficiently high (in some experiments, 5 was enough). It is applicable to Gaussian as well as non-Gaussian data.
💡 Research Summary
The paper introduces a novel method for inferring linear causal relations between multivariate variables by exploiting an asymmetry that emerges in high‑dimensional settings. The authors base their approach on the principle of Independence of Causal Mechanisms (ICM): the covariance matrix of the cause (Σ_X) and the linear mapping matrix (A) that transforms the cause into the effect are assumed to be drawn independently from some prior distributions. Under this assumption, the covariance of the effect Y = A X + ε (with ε representing noise) satisfies Σ_Y = A Σ_X Aᵀ + Σ_ε.
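As a quick numerical sanity check, the identity Σ_Y = A Σ_X Aᵀ + Σ_ε can be verified by comparing the empirical covariance of sampled effects against the formula. This is a sketch, not the authors' code; the dimension, sample size, and isotropic noise covariance are choices made for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 5, 200_000  # dimension and sample size (illustrative choices)

# Draw the cause covariance and the mechanism independently, per ICM.
B = rng.standard_normal((d, d))
Sigma_X = B @ B.T                    # positive semidefinite cause covariance
A = rng.standard_normal((d, d))      # linear mechanism
sigma_eps = 0.1                      # isotropic noise scale (assumed here)

# Sample X ~ N(0, Sigma_X) and form Y = A X + eps.
X = rng.multivariate_normal(np.zeros(d), Sigma_X, size=n)
eps = sigma_eps * rng.standard_normal((n, d))
Y = X @ A.T + eps

# Empirical Cov(Y) should match A Sigma_X A^T + sigma_eps^2 I for large n.
Sigma_Y_emp = np.cov(Y, rowvar=False)
Sigma_Y_theory = A @ Sigma_X @ A.T + sigma_eps**2 * np.eye(d)
print(np.max(np.abs(Sigma_Y_emp - Sigma_Y_theory)))  # shrinks as n grows
```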
In the limit where the dimensionality d tends to infinity, random matrix theory predicts that the trace of the product A Σ_X Aᵀ behaves in a very specific way. Specifically, the quantity
Δ = (tr(A Σ_X Aᵀ)·d) / (tr(Σ_X)·tr(A Aᵀ)) − 1
converges to zero when the direction X → Y is the true causal direction, because tr(A Σ_X Aᵀ) ≈ tr(Σ_X)·tr(A Aᵀ)/d holds on average. In the opposite direction (Y → X) the same relationship does not hold, and Δ typically deviates substantially from zero. This “trace condition” thus provides a simple scalar statistic that can be computed from observed data and used to decide which direction is more plausible.
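The behavior of Δ in the two directions can be illustrated numerically. The sketch below (not the authors' implementation; dimensions and distributions are chosen for the example) uses a deterministic relation Y = A X, so the backward mechanism is A⁻¹ and Σ_Y = A Σ_X Aᵀ:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 100  # the asymmetry sharpens as the dimension grows

def delta(A, Sigma):
    """Trace statistic: tr(A Sigma A^T) * d / (tr(Sigma) * tr(A A^T)) - 1."""
    num = np.trace(A @ Sigma @ A.T) * A.shape[0]
    den = np.trace(Sigma) * np.trace(A @ A.T)
    return num / den - 1.0

# Cause covariance (Wishart-type) and mechanism drawn independently.
B = rng.standard_normal((d, d))
Sigma_X = B @ B.T
A = rng.standard_normal((d, d))

Sigma_Y = A @ Sigma_X @ A.T                   # effect covariance (noiseless)
delta_fwd = delta(A, Sigma_X)                 # near zero in the causal direction
delta_bwd = delta(np.linalg.inv(A), Sigma_Y)  # deviates strongly from zero
print(delta_fwd, delta_bwd)
```

Typically `delta_fwd` is within a few percent of zero while `delta_bwd` is strongly negative, which is the asymmetry the decision rule exploits.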
The authors provide a rigorous proof of the convergence of Δ to zero under the independence assumption, using tools from free probability and concentration of measure for high‑dimensional random matrices. They also discuss how the result extends to non‑Gaussian data: because the asymmetry is a property of the second‑order statistics, it persists after applying a wide class of nonlinear transformations, provided the transformed variables remain high‑dimensional.
Algorithmically, the method proceeds as follows: (1) estimate the sample covariance matrices Σ̂_X and Σ̂_Y and the cross‑covariance Σ̂_{YX} from the data; (2) compute the ordinary least‑squares estimate Â = Σ̂_{YX} Σ̂_X^{-1}; (3) evaluate Δ̂ using the formula above; (4) declare X → Y if |Δ̂| is small (close to zero) and Y → X otherwise. The computational cost is dominated by matrix inversion and multiplication, i.e., O(d³), which is modest for dimensions up to a few hundred.
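The four steps above can be sketched end-to-end in NumPy. This is a hedged reconstruction from the summary, not the authors' code: the small ridge term added before inversion and the "smaller |Δ̂| wins" tie-breaking between the two fitted directions are implementation choices of this sketch.

```python
import numpy as np

def delta_hat(cause, effect, ridge=1e-8):
    """Empirical trace statistic for the direction cause -> effect.

    The ridge term (an assumption of this sketch) keeps the covariance
    inversion numerically stable.
    """
    d = cause.shape[1]
    n = len(cause)
    Xc = cause - cause.mean(axis=0)
    Yc = effect - effect.mean(axis=0)
    S_c = Xc.T @ Xc / (n - 1)        # (1) covariance of the putative cause
    S_ec = Yc.T @ Xc / (n - 1)       #     cross-covariance (effect, cause)
    A_hat = S_ec @ np.linalg.inv(S_c + ridge * np.eye(d))  # (2) OLS mechanism
    num = np.trace(A_hat @ S_c @ A_hat.T) * d              # (3) evaluate delta
    den = np.trace(S_c) * np.trace(A_hat @ A_hat.T)
    return num / den - 1.0

def infer_direction(X, Y):
    """(4) Declare the direction whose |delta-hat| is closer to zero."""
    return "X->Y" if abs(delta_hat(X, Y)) < abs(delta_hat(Y, X)) else "Y->X"

# Synthetic usage: Sigma_X Wishart-type, A Gaussian, drawn independently.
rng = np.random.default_rng(1)
d, n = 50, 10_000
B = rng.standard_normal((d, d))
X = rng.standard_normal((n, d)) @ B.T          # so Cov(X) = B B^T
A = rng.standard_normal((d, d))
Y = X @ A.T + 0.1 * rng.standard_normal((n, d))
print(infer_direction(X, Y))
```

With Σ_X and A drawn independently as above, this should typically print `X->Y`, the true direction.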
Extensive experiments validate the theory. Synthetic data generated with independent Wishart Σ_X and Gaussian A show that for d ≥ 5 the empirical Δ̂ is near zero in the correct direction and far from zero in the reverse direction. The method works equally well for mixed‑Gaussian and Laplace distributions, confirming robustness to non‑Gaussianity. Real‑world experiments on image patches (8 × 8 pixels), EEG recordings (16 channels), and high‑dimensional genetic data demonstrate that the approach outperforms classic causal discovery techniques such as LiNGAM and additive‑noise models when the dimensionality is moderate to high. Accuracy exceeds 90 % for d ≈ 20 and remains stable under moderate additive noise; performance degrades only when noise dominates (signal‑to‑noise ratio below roughly 0.7).
The paper discusses several limitations. First, the method requires both cause and effect to be high‑dimensional; with very low dimensions the trace asymmetry vanishes and the statistic loses discriminative power. Second, the independence assumption between Σ_X and A may be violated in real systems, potentially biasing Δ̂. Third, the approach is intrinsically linear; extending it to genuinely nonlinear causal mechanisms would require additional preprocessing (e.g., kernel embeddings) or a different theoretical framework. Finally, structured or heteroscedastic noise is not covered by the current analysis.
In conclusion, the authors present a theoretically grounded, computationally simple, and empirically effective technique for causal direction inference in high‑dimensional settings. By leveraging a high‑dimensional trace asymmetry that stems from independent generation of cause covariances and causal mechanisms, the method provides a new tool for causal discovery that complements existing non‑linear and non‑Gaussian approaches. Future work is suggested on relaxing the independence assumption, handling structured noise, and generalizing the framework to nonlinear mappings.