DyG-Mamba: Continuous State Space Modeling on Dynamic Graphs

DyG-Mamba: Continuous State Space Modeling on Dynamic Graphs
Notice: This research summary and analysis were automatically generated using AI technology. For absolute accuracy, please refer to the [Original Paper Viewer] below or the Original ArXiv Source.

Dynamic graph modeling aims to uncover evolutionary patterns in real-world systems, enabling accurate social recommendation and early detection of cancer cells. Inspired by the success of recent state space models in efficiently capturing long-term dependencies, we propose DyG-Mamba by translating dynamic graph modeling into a long-term sequence modeling problem. Specifically, inspired by Ebbinghaus’ forgetting curve, we treat the irregular timespans between events as control signals, allowing DyG-Mamba to dynamically adjust the forgetting of historical information. This mechanism ensures effective usage of irregular timespans, thereby improving both model effectiveness and inductive capability. In addition, inspired by Ebbinghaus’ review cycle, we redefine core parameters to ensure that DyG-Mamba selectively reviews historical information and filters out noisy inputs, further enhancing the model’s robustness. Through exhaustive experiments on 12 datasets covering dynamic link prediction and node classification tasks, we show that DyG-Mamba achieves state-of-the-art performance on most datasets, while demonstrating significantly improved computational and memory efficiency. Code is available at https://github.com/Clearloveyuan/DyG-Mamba.


💡 Research Summary

**
DyG‑Mamba introduces a continuous‑time state‑space model (SSM) for dynamic graph representation learning, addressing two major shortcomings of existing approaches: (1) the inability to efficiently capture long‑range temporal dependencies, and (2) susceptibility to noisy events. Traditional recurrent models (e.g., JODIE, TGN) suffer from gradient vanishing/exploding on long sequences, while Transformer‑based methods (e.g., DyGFormer, SimpleDyG) incur quadratic O(N²) time and memory costs. DyG‑Mamba reframes dynamic graph modeling as a long‑term sequence problem and leverages the Mamba architecture’s parallel‑scan optimization to achieve linear O(N) complexity.

The core novelty lies in treating irregular inter‑event times as control signals. Inspired by Ebbinghaus’s forgetting curve, the step‑size Δt is redefined as a monotonic, learnable function of the elapsed timespan: Δtₖ = w₁ ⊙ (1 − exp(‑w₂·(tₖ₊₁ − tₖ)/(τ − t₁))). This makes the state transition matrix Aₖ = exp(Δtₖ·Aₖ) decay faster for larger gaps, implementing a “fast‑then‑slow” forgetting pattern that aligns with empirical memory decay. Consequently, the model automatically compresses historical information proportionally to the time elapsed, improving both effectiveness and inductive capability on unseen timestamps.

To enhance robustness, DyG‑Mamba redefines the input‑dependent projection matrices B and C as linear functions of the current input, and imposes spectral‑norm constraints to guarantee Lipschitz continuity. This input‑adaptive parameterization enables selective “review” of past states—mirroring Ebbinghaus’s review cycle—while suppressing the influence of noisy events. The model thus filters out irrelevant history without sacrificing the ability to recall important patterns.

The encoding pipeline extracts four modalities for each node’s interaction sequence: node features, edge features, absolute temporal encodings (cosine functions of remaining time), and co‑occurrence frequency encodings (capturing shared neighbor statistics). Each modality is linearly projected to a common dimension, concatenated, and processed by a 1‑D convolution followed by SiLU activation, expanding the representation to 8d. This enriched sequence feeds the continuous SSM, which updates hidden states using the redefined Δt, A, B, and C, and finally produces node embeddings via an element‑wise product of the SSM output and a skip‑connected transformed input.

Extensive experiments on twelve benchmark datasets covering dynamic link prediction and node classification demonstrate that DyG‑Mamba consistently outperforms state‑of‑the‑art baselines (JODIE, DyRep, TGN, CAWN, TGA‑T, EdgeBank, GraphMixer, TCL, DyGFormer). It achieves higher AUC/ACC scores, especially on long sequences where Transformer‑based methods become infeasible. Under identical GPU memory constraints, DyG‑Mamba processes full sequences without pooling, resulting in up to 40 % faster inference and 50 % lower memory consumption. When 50 % of temporal edges are replaced with random noise, performance degradation stays below 10 %, far better than the >20 % drop observed in competing models, confirming its robustness.

In summary, DyG‑Mamba is the first work to apply continuous‑time SSMs to irregular‑timestamp dynamic graphs, integrating a timespan‑aware forgetting mechanism and input‑adaptive review dynamics. It delivers linear scalability, strong resistance to noise, and superior predictive accuracy, while requiring only a modest number of trainable parameters (a single learnable step size vector and small linear layers). Future directions include extending the architecture to multi‑hop neighborhoods, hierarchical graph pooling, and exploring hypergraph extensions. The code and datasets are publicly released for reproducibility.


Comments & Academic Discussion

Loading comments...

Leave a Comment