NTLRAG: Narrative Topic Labels derived with Retrieval Augmented Generation
Topic modeling has become an important means of identifying evident or hidden topics within large collections of text documents. Topic modeling approaches are often used for analyzing and making sense of social media discussions consisting of millions of short text messages. However, assigning meaningful topic labels to document clusters remains challenging, as users are commonly presented with unstructured keyword lists that may fail to capture the core topic. In this paper, we introduce Narrative Topic Labels derived with Retrieval Augmented Generation (NTLRAG), a scalable and extensible framework that generates semantically precise and human-interpretable narrative topic labels. Our narrative topic labels provide a context-rich, intuitive concept for describing topic model output. In particular, NTLRAG uses retrieval augmented generation (RAG) techniques and considers multiple retrieval strategies as well as chain-of-thought elements to provide high-quality output. NTLRAG can be combined with any standard topic model to generate, validate, and refine narratives, which then serve as narrative topic labels. We evaluated NTLRAG with a user study and three real-world datasets consisting of more than 6.7 million social media messages sent by more than 2.7 million users. The user study involved 16 human evaluators, who found that our narrative topic labels offer superior interpretability and usability compared to traditional keyword lists. An implementation of NTLRAG is publicly available for download.
💡 Research Summary
The paper addresses a long‑standing problem in topic modeling for short social‑media texts: the output is typically a list of representative keywords that often fails to convey the underlying meaning of a topic cluster to human users. To overcome this limitation, the authors propose NTLRAG (Narrative Topic Labels derived with Retrieval‑Augmented Generation), a modular framework that transforms raw topic model outputs into human‑readable narrative labels.
NTLRAG consists of four stages. First, any standard topic model (LDA, NMF, BERTopic, etc.) produces a set of topics and assigns documents to them. For each topic, a small set of representative short texts is extracted. Second, a dual‑retriever component fetches additional context: (a) more short social‑media posts that are similar to the representatives, and (b) validated news articles from reputable outlets that discuss the same underlying event or theme. This multi‑source retrieval mitigates the sparsity of short posts and injects reliable background knowledge.
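The dual-retriever idea can be illustrated with a minimal sketch. The paper does not specify the similarity model, so the code below assumes a simple bag-of-words cosine similarity over whitespace tokens (a production system would more plausibly use dense sentence embeddings); the function names and corpora are hypothetical.

```python
import math
from collections import Counter

def cosine_sim(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in set(a) & set(b))
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def retrieve(query_texts, corpus, k=2):
    """Return the k corpus documents most similar to the topic representatives."""
    query_vec = Counter(w for t in query_texts for w in t.lower().split())
    scored = [(cosine_sim(query_vec, Counter(doc.lower().split())), doc)
              for doc in corpus]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in scored[:k] if score > 0]

def dual_retrieve(representatives, post_corpus, news_corpus, k=2):
    """Dual retriever: similar social-media posts plus background news articles."""
    return {
        "posts": retrieve(representatives, post_corpus, k),
        "news": retrieve(representatives, news_corpus, k),
    }
```

The two retrieval channels are kept separate so the prompt-building stage can mark which passages are user-generated and which come from vetted news sources.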
Third, the retrieved passages are fed to a large language model (LLM) using carefully engineered prompts that incorporate a chain‑of‑thought (CoT) reasoning pattern. The LLM is asked to construct a four‑element narrative schema: (1) actor(s) (individual, group, institution, etc.), (2) action (verb predicate), (3) event (incident or context cluster), and (4) a concise natural‑language sentence that weaves the three elements together. The CoT approach forces the model to reason step‑by‑step, reducing hallucinations and ensuring logical consistency across the generated narrative.
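A chain-of-thought prompt over the four-element schema might be assembled as follows. The wording is illustrative only; the paper's actual prompts are carefully engineered and not reproduced here.

```python
def build_cot_prompt(keywords, representatives, retrieved_context):
    """Assemble an illustrative chain-of-thought prompt asking an LLM to
    fill the four-element narrative schema (actor, action, event, sentence)."""
    reps = "\n".join(f"- {text}" for text in representatives)
    context = "\n".join(f"- {passage}" for passage in retrieved_context)
    return (
        "You are labeling a topic cluster from social media.\n"
        f"Topic keywords: {', '.join(keywords)}\n"
        f"Representative posts:\n{reps}\n"
        f"Retrieved background passages:\n{context}\n"
        "Reason step by step:\n"
        "Step 1: Identify the main actor(s) (person, group, or institution).\n"
        "Step 2: Identify the central action (verb predicate).\n"
        "Step 3: Identify the event or context the action belongs to.\n"
        "Step 4: Combine actor, action, and event into one concise sentence.\n"
        "Answer as JSON with keys: actor, action, event, narrative."
    )
```

Forcing the model to emit the intermediate steps before the final sentence is what lets the validation stage check each schema element individually.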
Finally, a validation module checks the presence and coherence of the four elements, and a conditional refiner incorporates human evaluator feedback to iteratively improve the prompts or regenerate the narrative. The validation step can be fully automated (rule‑based checks) or semi‑automated (human‑in‑the‑loop).
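The fully automated variant of the validation step could look like the rule-based check below. This is a sketch under the assumption that the LLM returns the schema as a dictionary; the specific rules (non-empty fields, actor and action appearing in the final sentence) are plausible examples, not the authors' exact checks.

```python
def validate_narrative(candidate: dict):
    """Rule-based validation: all four schema elements must be present and
    non-empty, and the narrative sentence must mention the actor and action.
    Returns (is_valid, list_of_problems)."""
    problems = []
    for key in ("actor", "action", "event", "narrative"):
        if not candidate.get(key, "").strip():
            problems.append(f"missing or empty: {key}")
    narrative = candidate.get("narrative", "").lower()
    for key in ("actor", "action"):
        value = candidate.get(key, "").strip().lower()
        if value and value not in narrative:
            problems.append(f"{key} not mentioned in narrative")
    return (not problems, problems)
```

A failed check would trigger the conditional refiner, which adjusts the prompt or regenerates the narrative; in the semi-automated mode, the problem list is shown to a human evaluator instead.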
The authors evaluated NTLRAG on three real‑world datasets comprising over 6.7 million social‑media messages from more than 2.7 million users, covering political, financial, and entertainment domains. A user study with 16 participants compared traditional keyword lists to NTLRAG’s narrative labels across three criteria: interpretability, usefulness, and efficiency (5‑point Likert scale). Narrative labels achieved average scores of 4.5, 4.3, and 4.2 respectively, significantly outperforming keyword lists (3.2, 3.0, 3.1). Generation time averaged 1.2 seconds per topic, demonstrating scalability.
Key contributions include: (1) defining a structured narrative schema for topic labeling; (2) integrating Retrieval‑Augmented Generation with chain‑of‑thought prompting to produce context‑rich, coherent narratives; (3) introducing a dual‑retriever strategy that blends user‑generated short texts with external verified news sources; (4) providing a modular, model‑agnostic pipeline that can be attached to any existing topic model; and (5) empirical evidence that narrative labels improve human understanding of topic clusters.
Limitations are acknowledged: dependence on external news sources may introduce source bias; LLM inference costs and API rate limits can affect large‑scale deployment; and the current implementation is optimized for English and German news, requiring further work for multilingual settings. Future directions suggested include extending the retriever to multimodal content (images, video), automating domain‑specific prompt generation, developing quantitative metrics for narrative quality, and exploring downstream applications such as trend forecasting or policy analysis using the narrative labels.
In summary, NTLRAG offers a novel, scalable solution that converts abstract topic model outputs into intuitive, story‑like labels, thereby bridging the gap between algorithmic clustering and human sense‑making in massive short‑text corpora.