Sequence Diffusion Model for Temporal Link Prediction in Continuous-Time Dynamic Graph
Temporal link prediction in dynamic graphs is a fundamental problem in many real-world systems. Existing temporal graph neural networks mainly focus on learning representations of historical interactions. Despite their strong performance, these models are still purely discriminative, producing point estimates for future links and lacking an explicit mechanism to capture the uncertainty and sequential structure of future temporal interactions. In this paper, we propose SDG, a novel sequence-level diffusion framework that unifies dynamic graph learning with generative denoising. Specifically, SDG injects noise into the entire historical interaction sequence and jointly reconstructs all interaction embeddings through a conditional denoising process, thereby enabling the model to capture more comprehensive interaction distributions. To align the generative process with temporal link prediction, we employ a cross-attention denoising decoder to guide the reconstruction of the destination sequence and optimize the model in an end-to-end manner. Extensive experiments on various temporal graph benchmarks show that SDG consistently achieves state-of-the-art performance in the temporal link prediction task.
💡 Research Summary
Temporal link prediction in continuous‑time dynamic graphs is a cornerstone task for applications such as recommendation, anomaly detection, and knowledge‑graph completion. Existing temporal graph neural networks (TGNNs) – whether memory‑based (e.g., JODIE, DyRep, TGN) or memory‑free (e.g., TGAT, DyFormer, CRAFT) – are fundamentally discriminative: they encode historical interactions and output a pointwise score for each candidate future edge. This paradigm suffers from two critical drawbacks. First, it provides no explicit quantification of uncertainty in the stochastic evolution of future interactions. Second, it treats each future link independently, ignoring the sequential dependencies that naturally arise when multiple events unfold over time.
The paper “Sequence Diffusion Model for Temporal Link Prediction in Continuous‑Time Dynamic Graph” (SDG) reframes temporal link prediction as a conditional generation problem and solves it with a diffusion‑based generative framework. The authors construct a target sequence Tᵤ,ₜ that consists of the source node’s recent L−1 neighbors (shifted by one step) followed by the true destination node. The embeddings of this sequence, denoted X₀, are treated as the clean data for a denoising diffusion probabilistic model (DDPM). Unlike conventional diffusion for images or text, SDG injects Gaussian noise into the entire sequence (both historical neighbor embeddings and the destination embedding) at each diffusion step k, yielding a noisy version X_k = √ᾱ_k X₀ + √(1−ᾱ_k) ε.
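The forward corruption above is the standard DDPM noising rule applied to the whole (L, d) sequence at once rather than to a single embedding. A minimal sketch, assuming a linear β schedule (the paper's actual schedule β_k is not specified here) and illustrative function names:

```python
import numpy as np

def forward_noise(x0, k, alpha_bar, rng):
    """Corrupt the entire target sequence at diffusion step k:
    X_k = sqrt(abar_k) * X_0 + sqrt(1 - abar_k) * eps.

    x0: (L, d) clean sequence embeddings; alpha_bar: (K,) cumulative
    products of (1 - beta_k). Names are illustrative, not from the
    paper's code."""
    eps = rng.standard_normal(x0.shape)
    xk = np.sqrt(alpha_bar[k]) * x0 + np.sqrt(1.0 - alpha_bar[k]) * eps
    return xk, eps

# Toy usage: a linear beta schedule, as in standard DDPM setups.
K = 100
betas = np.linspace(1e-4, 0.02, K)
alpha_bar = np.cumprod(1.0 - betas)
rng = np.random.default_rng(0)
x0 = rng.standard_normal((8, 16))   # L = 8 positions, d = 16 dims
xk, eps = forward_noise(x0, 50, alpha_bar, rng)
```

Because the same closed form is invertible given ε, the clean sequence can be recovered exactly from (X_k, ε), which is what makes x₀-prediction a well-posed training target.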
During the reverse process, a conditional Markov chain reconstructs the whole sequence from pure Gaussian noise. The conditioning signal c is the encoded historical interaction sequence Z₁:ₗ, obtained by feeding the source’s neighbor embeddings (with sinusoidal positional encodings) into a causal transformer. The denoising network f_θ receives three inputs: (i) the noisy sequence X_k, (ii) a timestep embedding γ(k) processed by an MLP, and (iii) the context Z₁:ₗ. The timestep embedding is added to each position of X_k, after which a cross‑attention transformer uses Z₁:ₗ as queries and the time‑conditioned noisy sequence as keys/values. This cross‑attention decoder predicts the clean sequence X̂₀, from which the final destination embedding is extracted (the last token).
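The decoder's data flow can be sketched with a single cross-attention head in NumPy. This is a minimal sketch of the conditioning pattern described above (queries from Z₁:ₗ, keys/values from the time-conditioned noisy sequence), not the paper's actual multi-layer architecture; all weight matrices are random stand-ins for learned parameters:

```python
import numpy as np

def softmax(a, axis=-1):
    a = a - a.max(axis=axis, keepdims=True)
    e = np.exp(a)
    return e / e.sum(axis=axis, keepdims=True)

def denoise_step(z, xk, gamma_k, Wq, Wk, Wv, Wo):
    """Single-head cross-attention read: the encoded history Z_{1:L}
    forms the queries; the timestep embedding gamma(k) is added to
    every position of the noisy sequence X_k, which then supplies the
    keys and values. Returns the predicted clean sequence X_0-hat."""
    h = xk + gamma_k                        # broadcast gamma(k) over positions
    q, k_, v = z @ Wq, h @ Wk, h @ Wv
    attn = softmax(q @ k_.T / np.sqrt(q.shape[-1]))
    return (attn @ v) @ Wo

L, d = 8, 16
rng = np.random.default_rng(1)
z = rng.standard_normal((L, d))             # encoded historical sequence
xk = rng.standard_normal((L, d))            # noisy target sequence X_k
gamma_k = rng.standard_normal(d)            # MLP-processed timestep embedding
Wq, Wk, Wv, Wo = (rng.standard_normal((d, d)) * 0.1 for _ in range(4))
x0_hat = denoise_step(z, xk, gamma_k, Wq, Wk, Wv, Wo)
dst_hat = x0_hat[-1]                        # destination = last token
```

Taking the last token as the destination prediction mirrors the target-sequence layout, where the true destination occupies the final position.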
Training departs from the standard DDPM objective in two ways. First, the model predicts the clean data x₀ directly rather than the noise ε, which is more stable when historical interactions contain irregular, noisy components. Second, the authors replace the usual mean‑squared error with a cosine‑based reconstruction loss: L_diff = ∑ᵢ₌₁ᴸ (1 − cos(X̂₀,ᵢ, X₀,ᵢ))². This loss is invariant to embedding scale, aligns better with ranking‑based evaluation, and can be derived as a valid ELBO variant under a cosine similarity measure.
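The scale-invariance claim is easy to see in code: rescaling a perfect reconstruction leaves the cosine loss at (near) zero, while MSE would grow with the scale factor. A minimal sketch with an illustrative function name and a small ε added for numerical stability (an implementation detail assumed here, not stated in the paper):

```python
import numpy as np

def cosine_recon_loss(x0_hat, x0, eps=1e-8):
    """L_diff = sum_i (1 - cos(x0_hat_i, x0_i))^2 over positions i = 1..L.
    Summing per-position squared cosine distances over the sequence."""
    num = (x0_hat * x0).sum(axis=-1)
    den = (np.linalg.norm(x0_hat, axis=-1)
           * np.linalg.norm(x0, axis=-1) + eps)
    return float(((1.0 - num / den) ** 2).sum())

rng = np.random.default_rng(2)
x0 = rng.standard_normal((8, 16))
# A reconstruction that is correct up to scale incurs (near-)zero loss;
# mean-squared error would instead penalize the rescaling.
loss_scaled = cosine_recon_loss(3.0 * x0, x0)
loss_flipped = cosine_recon_loss(-x0, x0)   # worst case: cos = -1 per position
```

Per position the loss is bounded in [0, 4], with the maximum attained at cos = −1, which keeps gradients well-behaved compared with unbounded MSE on mis-scaled embeddings.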
Extensive experiments were conducted on six public continuous‑time dynamic graph benchmarks (Reddit, Wikipedia, MOOC, Github, DBLP, Yelp). The evaluation protocol follows recent work (e.g., CRAFT) by treating link prediction as a ranking task and reporting HR@10, MRR, and Recall@50. SDG consistently outperforms all baselines, achieving improvements of 3–7 percentage points on average and up to 12 pp on long sequences (L ≥ 50). Ablation studies confirm that (a) injecting noise into the whole sequence yields superior performance compared to corrupting only the destination embedding; (b) the cosine loss outperforms MSE in ranking metrics; and (c) the cross‑attention decoder is more effective than a simple convolutional decoder.
The paper also discusses limitations. The sequence‑level diffusion incurs O(L · d) memory and compute per diffusion step, which can become costly for very long histories. Moreover, the performance is sensitive to the noise schedule β_k and the chosen sequence length L, suggesting that adaptive scheduling or hierarchical diffusion could be fruitful future directions.
In summary, SDG introduces a novel generative perspective to temporal link prediction: by treating the entire interaction history as a noisy sequence and learning to denoise it conditioned on past context, the model captures both uncertainty and long‑range temporal dependencies that discriminative TGNNs miss. The proposed architecture—historical transformer encoder, cross‑attention denoising decoder, and cosine‑based loss—delivers state‑of‑the‑art results across diverse dynamic graph datasets, establishing diffusion‑based sequence modeling as a promising new paradigm for dynamic graph learning.