Dynamic Topic Evolution with Temporal Decay and Attention in Large Language Models


This paper proposes a modeling framework for dynamic topic evolution based on temporal large language models. The method first uses a large language model to obtain contextual embeddings of text and then introduces a temporal decay function and an attention mechanism. These components allow the model to adjust the importance of semantic units according to time intervals and capture topic variations across different periods. The temporal representations are then mapped into a latent topic space, where a state transition matrix is applied to describe the dynamic evolution of topics. A joint optimization objective constrains both semantic modeling and temporal consistency, ensuring diversity and smoothness in topic generation. The design emphasizes the unified modeling of semantic representation and temporal evolution, which improves topic coherence and diversity while enhancing stability and interpretability over time. Experiments on real-world corpora show that the framework effectively captures the generation, expansion, and decline of topics and outperforms existing models across multiple metrics. Overall, the proposed method provides a systematic solution for understanding dynamic semantic patterns in large-scale text, enriches the research paradigm of topic modeling, and supports complex text analysis tasks in multiple domains.


💡 Research Summary

The paper introduces a unified framework for modeling the evolution of topics over time by leveraging large language models (LLMs) as the semantic backbone and augmenting them with temporal decay and attention mechanisms. First, a pre‑trained LLM (e.g., GPT‑3, LLaMA) encodes each document or sentence into a high‑dimensional contextual embedding that captures rich linguistic nuances. To reflect the intuition that older information should gradually lose influence, the authors apply a temporal decay function w(Δt) that scales each embedding according to the time gap Δt between its creation and the current analysis point. Both exponential (w=exp(−γΔt)) and linear (w=max(0,1−αΔt)) decay forms are examined, with γ or α controlling the speed of forgetting.
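The two decay forms can be sketched directly from the formulas above. This is a minimal NumPy illustration, not the authors' code; the embedding values, the time gaps, and the γ and α settings are placeholder assumptions.

```python
import numpy as np

def exponential_decay(dt, gamma=0.5):
    """w(Δt) = exp(-γΔt): influence fades smoothly and never reaches exactly zero."""
    return np.exp(-gamma * np.asarray(dt, dtype=float))

def linear_decay(dt, alpha=0.1):
    """w(Δt) = max(0, 1 - αΔt): influence hits zero once Δt exceeds 1/α."""
    return np.maximum(0.0, 1.0 - alpha * np.asarray(dt, dtype=float))

# Scale each document embedding by its decay weight (placeholder data).
embeddings = np.random.default_rng(0).standard_normal((4, 8))  # 4 docs, 8-dim
gaps = np.array([0, 1, 5, 20])          # time since each document appeared
weighted = embeddings * exponential_decay(gaps)[:, None]
```

With γ = 0.5, a document one step old keeps about 61% of its weight; a document twenty steps old is effectively forgotten, matching the intuition that older information should gradually lose influence.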

After decay, a multi‑head self‑attention layer is employed across all documents belonging to the same time slice. This attention step allows the model to capture intra‑slice interactions, letting different semantic streams influence each other while preserving the temporal weighting introduced earlier. The attention‑augmented vectors are then projected into a K‑dimensional latent topic space, where each dimension corresponds to a latent topic.
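The intra-slice attention step can be sketched as scaled dot-product self-attention over the decayed document vectors, followed by a softmax projection into the K-dimensional topic space. This is an assumed NumPy reconstruction: the random projection matrices stand in for learned parameters, and the head count and dimensions are illustrative.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_self_attention(X, num_heads=2, rng=None):
    """Self-attention across the documents of one time slice.
    X: (n_docs, d) decayed embeddings; projections here are random placeholders."""
    rng = np.random.default_rng(rng)
    n, d = X.shape
    dh = d // num_heads
    out = np.empty_like(X)
    for h in range(num_heads):
        Wq, Wk, Wv = (rng.standard_normal((d, dh)) / np.sqrt(d) for _ in range(3))
        Q, K, V = X @ Wq, X @ Wk, X @ Wv
        A = softmax(Q @ K.T / np.sqrt(dh), axis=-1)  # (n, n) doc-to-doc weights
        out[:, h * dh:(h + 1) * dh] = A @ V
    return out

X = np.random.default_rng(0).standard_normal((5, 8))  # one time slice, 5 docs
H = multi_head_self_attention(X, num_heads=2, rng=0)
W_topic = np.random.default_rng(1).standard_normal((8, 3))
Z = softmax(H @ W_topic, axis=-1)  # (5, 3) per-document topic distributions
```

Because attention operates only within a time slice, documents influence each other's representations without disturbing the cross-slice weighting already applied by the decay function.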

Within this latent space, a state‑transition matrix Φ∈ℝ^{K×K} governs the dynamics: the vector at time t+1 is approximated by Φz_t, where z_t denotes the topic distribution at time t. Elements φ_{ij} encode the probability that topic i evolves into topic j, and the matrix is regularized to be sparse and row‑normalized, which both curbs spurious cross‑topic leakage and enhances interpretability.
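The sparsity and row-normalization constraints on Φ, and the one-step prediction z_{t+1} ≈ Φz_t, can be sketched as follows. The threshold value and the entries of the raw matrix are invented for illustration; the summary does not specify how sparsity is enforced, so a simple hard threshold is assumed here.

```python
import numpy as np

def normalize_rows(phi, threshold=0.05):
    """Zero out small entries (sparsity), then renormalize each row to sum to 1,
    so that row i of Φ reads as a distribution over the topics that topic i can become."""
    phi = np.where(phi < threshold, 0.0, phi)
    return phi / phi.sum(axis=1, keepdims=True)

raw = np.array([[0.80, 0.10, 0.02],
                [0.05, 0.90, 0.30],
                [0.20, 0.01, 0.70]])
Phi = normalize_rows(raw)              # (K, K) with K = 3 topics

z_t = np.array([0.7, 0.2, 0.1])        # topic distribution at time t
z_next = Phi @ z_t                     # one-step prediction, following the summary's z_{t+1} ≈ Φz_t
```

Thresholding kills weak cross-topic entries (spurious leakage), while row normalization keeps each row interpretable as "where topic i's mass goes next."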

Training optimizes a joint loss composed of (1) a reconstruction term L_rec = Σ_t ‖X_t − X̂_t‖_2^2 that forces the latent representation to faithfully reconstruct the original LLM embeddings, and (2) a temporal‑consistency term L_temp = Σ_t ‖z_{t+1} − Φz_t‖_2^2 that penalizes deviations from the prescribed transition dynamics. The full objective L = L_rec + λL_temp + βR(Φ) includes hyper‑parameters λ and β to balance semantic fidelity, temporal smoothness, and a regularization term R(Φ) that enforces sparsity and normalization. Optimization proceeds with Adam on mini‑batches, allowing the model to learn both meaningful embeddings and a coherent transition matrix simultaneously.
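The full objective can be written out term by term. This NumPy sketch evaluates the loss for given arrays; it uses an L1 penalty as an assumed concrete form of R(Φ), since the summary names sparsity but not the exact regularizer, and in practice the gradients would be handled by an autodiff framework with Adam.

```python
import numpy as np

def joint_loss(X, X_hat, Z, Phi, lam=1.0, beta=0.1):
    """L = L_rec + λ·L_temp + β·R(Φ), following the objective described above.
    X, X_hat: (T, d) original and reconstructed embeddings per time step.
    Z: (T, K) latent topic vectors; Phi: (K, K) transition matrix."""
    l_rec = np.sum((X - X_hat) ** 2)                     # Σ_t ‖X_t − X̂_t‖²
    l_temp = sum(np.sum((Z[t + 1] - Phi @ Z[t]) ** 2)    # Σ_t ‖z_{t+1} − Φz_t‖²
                 for t in range(len(Z) - 1))
    r_phi = np.sum(np.abs(Phi))                          # L1 sparsity penalty (assumed R(Φ))
    return l_rec + lam * l_temp + beta * r_phi
```

Setting λ high favors smooth topic trajectories at the cost of reconstruction fidelity; setting it low lets each slice fit its own embeddings with little temporal coupling.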

Empirical evaluation spans three heterogeneous corpora: news articles (Reuters), social‑media posts (Twitter), and scientific papers (arXiv). Baselines include LDA, Dynamic Topic Model (DTM), and the recent BERTopic approach. The authors assess topic coherence, diversity, and a newly defined temporal smoothness metric. Across all datasets, the proposed method outperforms baselines by 8–12 % on average, delivering more coherent and diverse topics while exhibiting smoother evolution curves. Visualizations of Φ reveal clear patterns of topic birth, growth, and decay—for instance, the rapid rise and subsequent decline of pandemic‑related topics in the news corpus—something that static or weakly dynamic models fail to capture.

In summary, the work demonstrates that integrating LLM‑based semantic embeddings with time‑aware decay and attention yields a powerful, interpretable model of topic dynamics. It bridges the gap between high‑quality contextual representation and explicit temporal modeling, enabling applications such as real‑time trend detection, policy‑impact monitoring, and business intelligence on streaming text. The authors suggest future extensions toward multimodal inputs and online learning schemes for truly continuous topic tracking.

