Unraveling Spurious Properties of Interaction Networks with Tailored Random Networks

We investigate interaction networks that we derive from multivariate time series with methods frequently employed in diverse scientific fields such as biology, quantitative finance, physics, earth and climate sciences, and the neurosciences. Mimicking experimental situations, we generate time series with finite length and varying frequency content but from independent stochastic processes. Using the correlation coefficient and the maximum cross-correlation, we estimate interdependencies between these time series. With clustering coefficient and average shortest path length, we observe unweighted interaction networks, derived via thresholding the values of interdependence, to possess non-trivial topologies as compared to Erdős-Rényi networks, which would indicate small-world characteristics. These topologies reflect the mostly unavoidable finiteness of the data, which limits the reliability of typically used estimators of signal interdependence. We propose random networks that are tailored to the way interaction networks are derived from empirical data. Through an exemplary investigation of multichannel electroencephalographic recordings of epileptic seizures - known for their complex spatial and temporal dynamics - we show that such random networks help to distinguish network properties of interdependence structures related to seizure dynamics from those spuriously induced by the applied methods of analysis.

💡 Research Summary

The paper investigates how interaction networks derived from multivariate time‑series can exhibit spurious topological features when the data are of limited length and possess non‑uniform frequency content. The authors generate synthetic data from independent stochastic processes, varying both the number of samples and the spectral composition (e.g., different ratios of low‑ to high‑frequency power). Pairwise interdependence is quantified using two widely employed estimators: the Pearson correlation coefficient (capturing instantaneous linear coupling) and the maximum cross‑correlation (allowing for time‑lagged linear coupling). After computing these measures for every pair of channels, a threshold is applied; only pairs whose statistic exceeds the threshold are linked, producing an unweighted interaction network.
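The pipeline described above (independent series, pairwise interdependence, thresholding) can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function names, the use of absolute values, the lag normalization (Pearson correlation on the overlapping samples), and the example threshold of 0.1 are all assumptions made here.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def max_cross_correlation(x, y, max_lag):
    """Largest absolute Pearson correlation over lags -max_lag..max_lag.

    Assumption: each lag is evaluated on the overlapping samples only.
    """
    best = 0.0
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            a, b = x[lag:], y[:len(y) - lag]
        else:
            a, b = x[:len(x) + lag], y[-lag:]
        best = max(best, abs(pearson(a, b)))
    return best

def threshold_network(series, measure, threshold):
    """Symmetric 0/1 adjacency matrix: link i-j if measure exceeds threshold."""
    n = len(series)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if measure(series[i], series[j]) > threshold:
                adj[i][j] = adj[j][i] = 1
    return adj

rng = random.Random(0)
# Independent white-noise channels: any link found here is spurious.
series = [[rng.gauss(0.0, 1.0) for _ in range(100)] for _ in range(10)]
adj = threshold_network(series, lambda x, y: abs(pearson(x, y)), 0.1)
```

Because the channels are independent by construction, every link in `adj` comes from estimator variance at finite length, which is the effect the paper studies.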

For each constructed network the clustering coefficient (C) and the average shortest‑path length (L) are calculated. These metrics are compared against those of Erdős–Rényi (ER) random graphs having the same number of nodes and average degree. The authors find that, even though the underlying processes are independent, finite sample size and spectral imbalance cause C to be markedly inflated and L to be reduced relative to the ER baseline. Consequently, the networks appear to possess “small‑world” characteristics (C≫C_ER while L≈L_ER) purely as an artifact of the estimation procedure.
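As a rough sketch of how C and L might be computed and compared with a random baseline, the following uses plain adjacency matrices. Several choices are assumptions made for illustration: C is taken as the mean local clustering coefficient, unreachable node pairs are ignored when averaging path lengths, and the baseline fixes the number of edges exactly (a G(n, m) graph) rather than an edge probability.

```python
import random
from collections import deque

def clustering_coefficient(adj):
    """Mean local clustering coefficient (nodes with degree < 2 count as 0)."""
    n = len(adj)
    total = 0.0
    for i in range(n):
        nbrs = [j for j in range(n) if adj[i][j]]
        k = len(nbrs)
        if k < 2:
            continue
        # Count links among the neighbours of node i.
        links = sum(adj[u][v] for a, u in enumerate(nbrs) for v in nbrs[a + 1:])
        total += 2.0 * links / (k * (k - 1))
    return total / n

def average_shortest_path(adj):
    """Mean BFS distance over all connected ordered pairs of distinct nodes."""
    n = len(adj)
    dist_sum, pairs = 0, 0
    for s in range(n):
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in range(n):
                if adj[u][v] and v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        dist_sum += sum(dist.values())
        pairs += len(dist) - 1
    return dist_sum / pairs if pairs else float("inf")

def erdos_renyi_like(adj, rng):
    """Random graph with the same node and edge count as adj."""
    n = len(adj)
    m = sum(map(sum, adj)) // 2
    all_pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    er = [[0] * n for _ in range(n)]
    for i, j in rng.sample(all_pairs, m):
        er[i][j] = er[j][i] = 1
    return er

# Example: an 8-node ring and a random graph with the same edge count.
ring = [[1 if abs(i - j) in (1, 7) else 0 for j in range(8)] for i in range(8)]
baseline = erdos_renyi_like(ring, random.Random(0))
c_ring, l_ring = clustering_coefficient(ring), average_shortest_path(ring)
c_er, l_er = clustering_coefficient(baseline), average_shortest_path(baseline)
```

In practice one would average C and L over many baseline realizations before comparing them with the values of the derived network.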

To address this methodological bias, the authors propose “tailored random networks” as a more appropriate null model. The key idea is to preserve all aspects of the data‑generation pipeline (sample length, variance, power spectrum, and the specific dependence estimator) while destroying any genuine inter‑channel coupling. This is achieved either by phase‑randomizing each time series (randomizing Fourier phases while keeping the amplitude spectrum) or by bootstrapping or resampling procedures that keep individual channel statistics intact but break cross‑channel relationships. The same pipeline (correlation coefficient or maximum cross‑correlation, followed by thresholding) is then applied to the surrogate data, yielding a distribution of C and L under the null hypothesis that no true interaction exists but all methodological constraints remain.
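Phase randomization as described here can be sketched as follows. This is an illustrative version under stated assumptions, not the authors' code: it uses a naive O(n²) discrete Fourier transform from the standard library to stay self-contained (a real analysis would use an FFT routine), and the helper names `dft` and `phase_randomized_surrogate` are hypothetical. Each channel must be randomized independently so that cross-channel phase relationships are destroyed.

```python
import cmath
import math
import random

def dft(x):
    """Naive discrete Fourier transform (O(n^2)); fine for short examples."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def phase_randomized_surrogate(x, rng):
    """Surrogate with the same amplitude spectrum as x but random phases."""
    n = len(x)
    spec = dft(x)
    new = list(spec)
    # Randomize phases of positive frequencies and mirror them as complex
    # conjugates so that the inverse transform is real-valued. The mean
    # (k = 0) and, for even n, the Nyquist component are left unchanged.
    for k in range(1, (n - 1) // 2 + 1):
        phi = rng.uniform(0.0, 2.0 * math.pi)
        new[k] = abs(spec[k]) * cmath.exp(1j * phi)
        new[n - k] = new[k].conjugate()
    # Inverse transform; the imaginary part is only rounding error.
    return [sum(new[k] * cmath.exp(2j * math.pi * k * t / n)
                for k in range(n)).real / n
            for t in range(n)]

rng = random.Random(0)
x = [rng.gauss(0.0, 1.0) for _ in range(32)]
surrogate = phase_randomized_surrogate(x, rng)
```

Applying the same interdependence estimator and threshold to many such surrogate sets gives one realization of a tailored random network per set.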

The utility of the tailored null model is demonstrated on real multichannel electroencephalographic (EEG) recordings from patients experiencing epileptic seizures. The EEG data are segmented into pre‑seizure, seizure, and post‑seizure windows, and networks are built for each segment. The seizure segment shows a pronounced increase in network density and clustering and a decrease in average path length, and an ER‑based comparison would attribute many of these changes to small‑world organization. In contrast, when compared against the tailored random networks, only the seizure‑related changes exceed the 95 % confidence interval, indicating that they reflect genuine physiological synchronization rather than methodological artifacts.
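The comparison against a 95 % interval can be illustrated with a simple percentile check. The sketch below assumes that the interval is taken as the central 95 % of the values obtained from tailored random networks; the summary does not specify how the interval was computed, and the function names are hypothetical.

```python
import math

def percentile_interval(values, level=0.95):
    """Central interval of the sorted values, rounded outward to sample points."""
    s = sorted(values)
    n = len(s)
    lo_idx = int(math.floor((1.0 - level) / 2.0 * (n - 1)))
    hi_idx = int(math.ceil((1.0 + level) / 2.0 * (n - 1)))
    return s[lo_idx], s[hi_idx]

def exceeds_null(observed, null_values, level=0.95):
    """True if the observed statistic lies outside the null interval."""
    lo, hi = percentile_interval(null_values, level)
    return observed < lo or observed > hi
```

Here `null_values` would hold, for example, the clustering coefficients of many tailored random networks, and `observed` the clustering coefficient of the network derived from one EEG segment.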

The paper concludes that (1) limited data length and spectral heterogeneity systematically bias network‑topology measures; (2) conventional ER random graphs are insufficient as null models for interaction networks derived from time‑series; and (3) surrogate‑based, pipeline‑preserving random networks provide a statistically sound baseline that isolates true dynamical features. The authors further argue that the approach generalizes to other dependence measures (e.g., mutual information, transfer entropy) and to diverse fields such as finance, climate science, and genomics, where similar data constraints are common. By adopting tailored random networks as the standard null hypothesis, researchers can avoid over‑interpreting spurious small‑world or other complex‑network signatures that arise solely from the analysis methodology.

📜 Original Paper Content