Robust Causal Discovery in Real-World Time Series with Power-Laws
Exploring causal relationships in stochastic time series is a challenging yet crucial task with a vast range of applications, including finance, economics, neuroscience, and climate science. Many algorithms for Causal Discovery (CD) have been proposed; however, they often exhibit a high sensitivity to noise, resulting in spurious causal inferences in real data. In this paper, we observe that the frequency spectra of many real-world time series follow a power-law distribution, notably due to an inherent self-organizing behavior. Leveraging this insight, we build a robust CD method based on the extraction of power-law spectral features that amplify genuine causal signals. Our method consistently outperforms state-of-the-art alternatives on both synthetic benchmarks and real-world datasets with known causal structures, demonstrating its robustness and practical relevance.
💡 Research Summary
The paper introduces PLaCy (Power‑Law Causal discovery), a causal discovery framework designed specifically for real‑world time series that exhibit power‑law spectral characteristics. Traditional causal discovery methods, especially Granger‑type approaches, rely on assumptions of stationarity, Gaussian noise, and a single characteristic time scale. These assumptions are routinely violated in domains such as finance, climate science, and neuroscience, where data often display long‑range dependencies and scale‑free 1/f^α spectra. Consequently, classic methods are prone to spurious edges and lose robustness under noisy, non‑stationary conditions.
PLaCy tackles this problem by moving the analysis from the raw time domain to the frequency domain. Each multivariate series is segmented into overlapping windows (the stride can be as small as one sample). For every window, a discrete Fourier transform (DFT) is computed, and the resulting power spectrum A(f) is fitted in log‑log space to a straight line: log A(f) = a − λ log f. The intercept a and the spectral exponent λ (the negated slope of the fit) become two new time‑varying features that summarize the underlying stochastic process. Tracking (a(t), λ(t)) over time captures subtle, temporally localized changes in the spectral structure, which is often where causal influence manifests.
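The windowed fit described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `spectral_features`, the default window length, the ordinary least-squares fit, and the small constant added before taking logarithms are all assumptions made for the example.

```python
import numpy as np

def spectral_features(x, window=128, stride=1):
    """Fit log A(f) = a - lam * log f on each sliding window of x.

    Returns an array of shape (n_windows, 2) with columns (a, lam).
    Hypothetical helper for illustration only.
    """
    x = np.asarray(x, dtype=float)
    freqs = np.fft.rfftfreq(window)[1:]        # drop f = 0, where log f is undefined
    log_f = np.log(freqs)
    # Design matrix for the line a - lam * log f, so the solution is (a, lam).
    design = np.column_stack([np.ones_like(log_f), -log_f])
    feats = []
    for start in range(0, len(x) - window + 1, stride):
        seg = x[start:start + window]
        # Power spectrum of the mean-removed window, without the DC bin.
        power = np.abs(np.fft.rfft(seg - seg.mean()))[1:] ** 2
        log_power = np.log(power + 1e-12)      # assumed guard against log(0)
        coef, *_ = np.linalg.lstsq(design, log_power, rcond=None)
        feats.append(coef)
    return np.array(feats)
```

Calling `spectral_features(series, window=64, stride=8)` on a one-dimensional array yields one (a, λ) pair per window; for a multivariate series the function would be applied to each variable separately.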
The core insight is that genuine causal interactions tend to modulate the spectral exponent λ (and to a lesser extent the intercept a) more consistently than they affect raw amplitudes. Therefore, after extracting these feature series for all variables, PLaCy applies multivariate Granger causality tests not to the original signals but to the (a, λ) trajectories. Specifically, for a candidate pair (i, j), the test asks whether the past of (a_i, λ_i) improves prediction of λ_j beyond what λ_j’s own past provides. This “spectral‑Granger” test inherits the statistical power of classic Wald‑type Granger testing while being far less sensitive to high‑frequency noise and non‑stationary drifts, because the spectral fitting step acts as a denoising filter.
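To make the pairwise test concrete, here is a rough sketch of a Granger-style comparison on feature trajectories. It substitutes a plain F statistic from two nested least-squares regressions for the Wald-type test mentioned above, and the helper names `lagged` and `granger_f_stat`, the lag order, and the absence of a p-value are assumptions of this example rather than details from the paper.

```python
import numpy as np

def lagged(series, lags):
    """Columns series[t-1], ..., series[t-lags] for t = lags, ..., T-1."""
    T = len(series)
    return np.column_stack([series[lags - k:T - k] for k in range(1, lags + 1)])

def granger_f_stat(target, drivers, lags=2):
    """F statistic for 'the past of drivers improves prediction of target'.

    target:  shape (T,), e.g. the lam_j trajectory.
    drivers: shape (T,) or (T, d), e.g. the columns (a_i, lam_i).
    Hypothetical helper; a larger value is stronger evidence for i -> j.
    """
    target = np.asarray(target, dtype=float)
    drivers = np.asarray(drivers, dtype=float).reshape(len(target), -1)
    y = target[lags:]
    # Restricted model: intercept plus the target's own past.
    restricted = np.hstack([np.ones((len(y), 1)), lagged(target, lags)])
    # Full model: additionally include the past of every driver column.
    full = np.hstack(
        [restricted] + [lagged(drivers[:, j], lags) for j in range(drivers.shape[1])]
    )

    def rss(X):
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        resid = y - X @ beta
        return float(resid @ resid)

    rss_r, rss_f = rss(restricted), rss(full)
    q = full.shape[1] - restricted.shape[1]   # number of added regressors
    dof = len(y) - full.shape[1]              # residual degrees of freedom
    return ((rss_r - rss_f) / q) / (rss_f / dof)
```

With feature arrays for variables i and j, `granger_f_stat(lam_j, np.column_stack([a_i, lam_i]))` scores the candidate edge i → j. Turning the statistic into a decision would require comparing it against an F distribution with (q, dof) degrees of freedom, which is omitted here.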
The authors provide a theoretical guarantee (Theorem 1) stating that, under the assumptions of (1) an underlying linear VAR causal process and (2) a common power‑law spectral shape across frequencies, the transformation from raw series to (a, λ) preserves the causal graph. The proof proceeds by showing that the feature series still satisfy the identifiability conditions required for VAR‑based causal inference, and that any causal mechanism that influences the original series will necessarily induce a detectable change in the spectral exponent. Consequently, applying Granger causality to the transformed series recovers exactly the same directed edges as a time‑domain VAR analysis would, but with markedly improved robustness.
Empirical evaluation is extensive. Synthetic benchmarks are generated with controlled levels of non‑stationarity, nonlinear noise, and varying power‑law exponents. Across a grid of signal‑to‑noise ratios, PLaCy consistently achieves lower structural Hamming distance, higher precision, and higher recall than state‑of‑the‑art baselines such as PCMCI, PCMCIΩ, DYNOTEARS, Rhino, and frequency‑domain methods like Geweke‑NP. Real‑world experiments involve (i) a financial dataset comprising interest rates, stock indices, and exchange rates, where known macro‑economic causal links (e.g., rates → equities) are correctly identified, and (ii) a climate dataset containing ENSO indices, temperature, and precipitation records, where established teleconnections (ENSO → precipitation) are recovered. In both cases, competing methods either miss true edges or produce many false positives, especially in high‑volatility periods, whereas PLaCy remains stable.
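For readers unfamiliar with the graph-recovery metrics named above, the sketch below computes them from binary adjacency matrices. The function name `graph_metrics` is hypothetical, and the structural Hamming distance convention used here (each missing or extra directed edge counts once, so a reversed edge counts twice) is an assumption; the paper may use a different variant.

```python
import numpy as np

def graph_metrics(true_adj, pred_adj):
    """Precision, recall and structural Hamming distance for directed graphs.

    Entry [i, j] is truthy when the graph contains the edge i -> j.
    Assumed SHD convention: number of differing adjacency entries.
    """
    t = np.asarray(true_adj, dtype=bool)
    p = np.asarray(pred_adj, dtype=bool)
    tp = int(np.sum(t & p))     # edges correctly recovered
    fp = int(np.sum(~t & p))    # spurious edges
    fn = int(np.sum(t & ~p))    # missed edges
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return {"precision": precision, "recall": recall, "shd": fp + fn}
```

For a true chain 0 → 1 → 2 and a prediction containing 0 → 1 and 2 → 0, the function reports one correct edge, one spurious edge, and one missed edge.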
The paper also discusses practical considerations. Window length and stride are hyper‑parameters that trade off temporal resolution against reliable exponent estimation; the authors propose an adaptive selection based on the p‑value of the spectral fit. They acknowledge that when the underlying process lacks a clear power‑law spectrum (e.g., pure white noise) the method’s advantage diminishes, and that the current formulation assumes linear causal effects, leaving nonlinear extensions as future work.
In summary, PLaCy leverages the ubiquitous power‑law spectral structure of many real‑world time series to construct a denoised, scale‑invariant representation of dynamics. By applying Granger causality to the evolution of spectral exponents, it delivers a causal discovery tool that is markedly more robust to noise, non‑stationarity, and scale‑free dynamics than existing time‑domain or frequency‑domain approaches, opening the door to more reliable causal inference across a broad spectrum of scientific and engineering domains.