Empirical Mode Decomposition and Graph Transformation of the MSCI World Index: A Multiscale Topological Analysis for Graph Neural Network Modeling
This study applies Empirical Mode Decomposition (EMD) to the MSCI World index and converts the resulting intrinsic mode functions (IMFs) into graph representations to enable modeling with graph neural networks (GNNs). Using CEEMDAN, we extract nine IMFs spanning high-frequency fluctuations to long-term trends. Each IMF is transformed into a graph using four time-series-to-graph methods: natural visibility, horizontal visibility, recurrence, and transition graphs. Topological analysis shows clear scale-dependent structure: high-frequency IMFs yield dense, highly connected small-world graphs, whereas low-frequency IMFs produce sparser networks with longer characteristic path lengths. Visibility-based methods are more sensitive to amplitude variability and typically generate higher clustering, while recurrence graphs better preserve temporal dependencies. These results provide guidance for designing GNN architectures tailored to the structural properties of decomposed components, supporting more effective predictive modeling of financial time series.
💡 Research Summary
This paper presents a novel multiscale framework that combines Empirical Mode Decomposition (EMD) with graph‑theoretic transformations to enable Graph Neural Network (GNN) modeling of the MSCI World index. The authors first verify that the daily closing price series (January 2012 – November 2025, 3,490 observations) satisfies the prerequisite conditions for EMD: non‑stationarity (ADF and KPSS tests), non‑linearity (BDS, Ljung‑Box, AR‑GARCH analyses), and pronounced amplitude‑frequency variability (volatility windows and zero‑crossing counts).
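The amplitude‑frequency checks mentioned above (volatility windows and zero‑crossing counts) can be illustrated with a minimal sketch. It uses only the Python standard library; the function names, the choice of log returns, and the window length are illustrative assumptions, not the authors' code, and the ADF, KPSS, and BDS tests are not reproduced here.

```python
import math

def log_returns(prices):
    """Log return between consecutive closing prices."""
    return [math.log(b / a) for a, b in zip(prices, prices[1:])]

def rolling_std(values, window):
    """Sample standard deviation over a sliding window (a simple volatility proxy)."""
    out = []
    for end in range(window, len(values) + 1):
        chunk = values[end - window:end]
        mean = sum(chunk) / window
        var = sum((v - mean) ** 2 for v in chunk) / (window - 1)
        out.append(math.sqrt(var))
    return out

def zero_crossings(values):
    """Number of sign changes of the mean-centered series."""
    mean = sum(values) / len(values)
    centered = [v - mean for v in values]
    return sum(1 for a, b in zip(centered, centered[1:]) if a * b < 0)

# Hypothetical toy prices, not MSCI World data.
prices = [100.0, 101.5, 100.8, 102.3, 101.1, 103.0, 102.2]
returns = log_returns(prices)
volatility = rolling_std(returns, window=3)
crossings = zero_crossings(returns)
```

A rolling volatility that changes markedly over time, together with zero‑crossing counts that differ between sub‑periods, is the kind of evidence that motivates an adaptive decomposition such as EMD over a fixed‑basis one.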
Using the Complete Ensemble Empirical Mode Decomposition with Adaptive Noise (CEEMDAN), the index is decomposed into nine Intrinsic Mode Functions (IMFs). The first three IMFs capture high‑frequency market fluctuations, while the last three represent long‑term trends. For each IMF, four time‑series‑to‑graph conversion methods are applied: Natural Visibility Graph (NVG), Horizontal Visibility Graph (HVG), Recurrence Graph, and Transition Graph. NVG and HVG encode geometric visibility relationships; the recurrence approach builds an adjacency matrix from phase‑space embeddings (optimal delay τ and embedding dimension d are selected via mutual information and false‑nearest‑neighbors, respectively); the transition method creates directed edges based on one‑step transition probabilities.
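The four conversions can be sketched in plain Python. This is a minimal illustration under stated assumptions: the visibility criteria follow the standard definitions, while the recurrence threshold `eps`, the quantile binning used for the transition graph, and all function names are assumptions made here rather than details confirmed by the paper. The embedding parameters `dim` and `tau` are passed in directly; their selection via mutual information and false nearest neighbors is not shown.

```python
import math

def natural_visibility_graph(y):
    """Connect i < j when every intermediate point lies strictly below the line joining them."""
    edges = set()
    for i in range(len(y) - 1):
        for j in range(i + 1, len(y)):
            if all(y[k] < y[j] + (y[i] - y[j]) * (j - k) / (j - i)
                   for k in range(i + 1, j)):
                edges.add((i, j))
    return edges

def horizontal_visibility_graph(y):
    """Connect i < j when every intermediate point is strictly lower than both endpoints."""
    edges = set()
    for i in range(len(y) - 1):
        for j in range(i + 1, len(y)):
            if all(y[k] < min(y[i], y[j]) for k in range(i + 1, j)):
                edges.add((i, j))
    return edges

def delay_embed(y, dim, tau):
    """Time-delay embedding: point i is (y[i], y[i + tau], ..., y[i + (dim - 1) * tau])."""
    n_points = len(y) - (dim - 1) * tau
    return [tuple(y[i + k * tau] for k in range(dim)) for i in range(n_points)]

def recurrence_graph(y, dim, tau, eps):
    """Connect embedded points whose Euclidean distance is at most eps (no self-loops)."""
    pts = delay_embed(y, dim, tau)
    return {(i, j)
            for i in range(len(pts))
            for j in range(i + 1, len(pts))
            if math.dist(pts[i], pts[j]) <= eps}

def transition_graph(y, n_bins):
    """Directed graph over quantile states, weighted by one-step transition probability."""
    ranked = sorted(y)
    cuts = [ranked[(len(y) * b) // n_bins] for b in range(1, n_bins)]
    states = [sum(v >= c for c in cuts) for v in y]  # state = number of cut points at or below v
    counts, totals = {}, {}
    for a, b in zip(states, states[1:]):
        counts[(a, b)] = counts.get((a, b), 0) + 1
        totals[a] = totals.get(a, 0) + 1
    return {(a, b): c / totals[a] for (a, b), c in counts.items()}

# Hypothetical toy series standing in for one IMF.
imf = [3, 1, 2, 4]
nvg = natural_visibility_graph(imf)
hvg = horizontal_visibility_graph(imf)
```

Every horizontal visibility edge is also a natural visibility edge, so the HVG is always a subgraph of the NVG for the same series. The visibility loops above are cubic in the worst case, which is acceptable for a sketch but slow for a series of several thousand points.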
Topological metrics—clustering coefficient (C), average shortest‑path length (L), global efficiency, degree distribution, betweenness centrality, and assortativity—are computed for every graph. High‑frequency IMFs (NVG/HVG) exhibit dense, small‑world structures (C≈0.45‑0.52, L≈2.1‑2.4) with exponential‑like degree distributions, indicating rapid information diffusion across many nodes. Low‑frequency IMFs generate sparser networks (C≈0.12‑0.18, L≈5.8‑7.3) with power‑law tails in the degree distribution, reflecting the emergence of hub‑like nodes around major trend‑change points. Recurrence graphs maintain moderate density (C≈0.30, L≈3.5) and preserve temporal dependencies, making them well‑suited for capturing nonlinear dynamics across all scales. Transition graphs, being directed and relatively low‑clustering, highlight causal flow but provide limited community structure.
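Three of the metrics above (clustering coefficient C, average shortest‑path length L, and global efficiency) can be computed for an undirected edge set with a short standard‑library sketch. The function names are illustrative assumptions, and the toy graph is not taken from the paper. Nodes with fewer than two neighbors are assigned a clustering of zero, and L is averaged over reachable pairs only.

```python
from collections import deque

def build_adjacency(n_nodes, edges):
    """Undirected adjacency sets from a collection of (i, j) pairs."""
    adj = {v: set() for v in range(n_nodes)}
    for i, j in edges:
        adj[i].add(j)
        adj[j].add(i)
    return adj

def bfs_distances(adj, source):
    """Hop distances from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def average_clustering(adj):
    """Mean over nodes of (links among neighbors) / (possible links among neighbors)."""
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue  # contributes zero
        links = sum(1 for a in nbrs for b in nbrs if a < b and b in adj[a])
        total += 2 * links / (k * (k - 1))
    return total / len(adj)

def path_length_and_efficiency(adj):
    """Average shortest-path length over reachable pairs, and global efficiency."""
    n = len(adj)
    total_dist, reachable, inv_sum = 0, 0, 0.0
    for src in adj:
        for node, d in bfs_distances(adj, src).items():
            if node != src:
                total_dist += d
                reachable += 1
                inv_sum += 1.0 / d
    avg_len = total_dist / reachable if reachable else float("inf")
    efficiency = inv_sum / (n * (n - 1)) if n > 1 else 0.0
    return avg_len, efficiency

# Toy graph: a triangle (0, 1, 2) with a pendant node 3 attached to node 2.
adj = build_adjacency(4, {(0, 1), (1, 2), (0, 2), (2, 3)})
C = average_clustering(adj)
L, E = path_length_and_efficiency(adj)
```

For the toy graph this gives C = 7/12, L = 4/3, and an efficiency of 5/6. A high C combined with a short L is the small‑world signature reported for the high‑frequency IMFs.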
The authors translate these findings into concrete GNN design recommendations. For dense, small‑world graphs derived from high‑frequency IMFs, deep GCN or GraphSAGE architectures with multiple propagation hops are recommended, complemented by batch normalization and dropout to mitigate overfitting. For sparse, long‑path graphs from low‑frequency IMFs, attention‑based models such as Graph Attention Networks (GAT) or Heterogeneous Graph Attention Networks (HAN) are advantageous, and hierarchical pooling (Top‑K, DiffPool) can extract global representations. Recurrence graphs benefit from temporal‑graph models (TGAT, Temporal Graph Networks, Temporal Graph Convolution) that explicitly encode time‑ordered edges and can be enriched with Recurrence Quantification Analysis (RQA) features. The paper also proposes a multi‑scale GNN pipeline in which each IMF‑specific graph is processed by a dedicated sub‑network, and the resulting embeddings are fused in a final aggregation layer, enabling the model to leverage both short‑term volatility and long‑term trend information.
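The data flow of such a multi‑scale pipeline can be sketched without any deep‑learning library. In the sketch below, a parameter‑free mean‑aggregation hop stands in for each trained sub‑network (GCN, GraphSAGE, GAT, and so on), mean pooling stands in for the readout, and concatenation stands in for the fusion layer. All of these choices, along with the function names and toy inputs, are assumptions made for illustration; nothing here is trained, and it is not the authors' architecture.

```python
def mean_aggregate(adj, features):
    """One propagation hop: each node averages its own vector with its neighbors' vectors."""
    out = {}
    for v, nbrs in adj.items():
        group = [features[v]] + [features[u] for u in nbrs]
        out[v] = [sum(col) / len(group) for col in zip(*group)]
    return out

def graph_embedding(adj, features, hops):
    """Apply several hops, then mean-pool node vectors into one graph-level vector."""
    h = features
    for _ in range(hops):
        h = mean_aggregate(adj, h)
    vectors = list(h.values())
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def fuse(embeddings):
    """Fusion by concatenation of the per-IMF graph embeddings."""
    return [x for emb in embeddings for x in emb]

# Hypothetical inputs: two small graphs standing in for a high- and a low-frequency IMF.
dense_adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
sparse_adj = {0: {1}, 1: {0, 2}, 2: {1}}
node_values = {0: [1.0], 1: [0.0], 2: [1.0]}

fused = fuse([
    graph_embedding(dense_adj, node_values, hops=2),   # more hops for the dense graph
    graph_embedding(sparse_adj, node_values, hops=1),
])
```

In a trained model, each hop would carry learnable weights and the fused vector would feed a prediction head; the sketch only shows how one embedding per scale is produced and combined.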
In summary, the study demonstrates that EMD‑derived IMFs of a financial index possess distinct, scale‑dependent network topologies when mapped to graphs. These topologies inform the choice of GNN architecture for each component, offering a systematic pathway from raw price data to multiscale predictive models for financial forecasting, risk assessment, and anomaly detection.