CATCH: A Controllable Theme Detection Framework with Contextualized Clustering and Hierarchical Generation

Reading time: 5 minutes

📝 Original Info

  • Title: CATCH: A Controllable Theme Detection Framework with Contextualized Clustering and Hierarchical Generation
  • ArXiv ID: 2512.21715
  • Date: 2025-12-25
  • Authors: Rui Ke¹, Jiahui Xu¹, Shenghao Yang¹, Kuang Wang¹, Feng Jiang²* (corresponding author), Haizhou Li¹,³,⁴. Affiliations: ¹ Shenzhen Research Institute of Big Data, The School of Data Science, The Chinese University of Hong Kong, Shenzhen; ² Artificial Intelligence Research Institute, Shenzhen University of Advanced Technology; ³ The School of Artificial Intelligence, The Chinese University of Hong Kong, Shenzhen; ⁴ Department of Electrical and Computer Engineering, National University of Singapore

📝 Abstract

Theme detection is a fundamental task in user-centric dialogue systems, aiming to identify the latent topic of each utterance without relying on predefined schemas. Unlike intent induction, which operates within fixed label spaces, theme detection requires cross-dialogue consistency and alignment with personalized user preferences, posing significant challenges. Existing methods often struggle to build accurate topic representations from sparse, short utterances and fail to capture user-level thematic preferences across dialogues. To address these challenges, we propose CATCH (Controllable Theme Detection with Contextualized Clustering and Hierarchical Generation), a unified framework that integrates three core components: (1) context-aware topic representation, which enriches utterance-level semantics using surrounding topic segments; (2) preference-guided topic clustering, which jointly models semantic proximity and personalized feedback to align themes across dialogues; and (3) a hierarchical theme generation mechanism designed to suppress noise and produce robust, coherent topic labels. Experiments on a multi-domain customer dialogue benchmark (DSTC-12) demonstrate the effectiveness of CATCH with an 8B LLM in both theme clustering and topic generation quality.
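To make components (1) and (2) more concrete, here is a minimal Python sketch. It is not the authors' implementation: it swaps in bag-of-words vectors for learned embeddings, a fixed neighbour window for the paper's topic segments, and a simple threshold-based union-find grouping for the clustering step. The function names (`contextual_vector`, `cluster`), the `should_link` feedback format, and all weights and thresholds are assumptions made for illustration only.

```python
import math
from collections import Counter
from itertools import combinations


def bag_of_words(text):
    """Lowercased whitespace tokens as a sparse count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def contextual_vector(dialogue, i, window=1, context_weight=0.5):
    """Enrich utterance i with down-weighted tokens from its neighbours.

    Assumption: a fixed window stands in for surrounding topic segments.
    """
    vec = Counter()
    lo, hi = max(0, i - window), min(len(dialogue), i + window + 1)
    for j in range(lo, hi):
        weight = 1.0 if j == i else context_weight
        for tok, count in bag_of_words(dialogue[j]).items():
            vec[tok] += weight * count
    return vec


def cluster(vectors, threshold=0.4, should_link=()):
    """Group vectors that are similar enough OR that feedback says belong together.

    Assumption: user preference arrives as index pairs in `should_link`.
    Returns one cluster label (a representative index) per vector.
    """
    parent = list(range(len(vectors)))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    preferred = {frozenset(pair) for pair in should_link}
    for i, j in combinations(range(len(vectors)), 2):
        similar = cosine(vectors[i], vectors[j]) >= threshold
        if similar or frozenset((i, j)) in preferred:
            parent[find(i)] = find(j)
    return [find(i) for i in range(len(vectors))]


utterances = [
    "what is the interest rate today",
    "tell me the current interest rate",
    "i want to change my pin",
    "please reset my card pin",
]
vectors = [bag_of_words(u) for u in utterances]
semantic_only = cluster(vectors)
with_feedback = cluster(vectors, should_link=[(2, 3)])
```

In this toy run, the two interest-rate utterances are grouped by similarity alone, while the two PIN utterances are merged only once the hypothetical feedback pair `(2, 3)` is supplied. That mirrors, in a very reduced form, how preference signals can change theme granularity beyond what surface semantics suggest.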

📄 Full Content

CATCH: A Controllable Theme Detection Framework with Contextualized Clustering and Hierarchical Generation

Rui Ke¹, Jiahui Xu¹, Shenghao Yang¹, Kuang Wang¹, Feng Jiang²*, Haizhou Li¹,³,⁴

¹ Shenzhen Research Institute of Big Data, The School of Data Science, The Chinese University of Hong Kong, Shenzhen
² Artificial Intelligence Research Institute, Shenzhen University of Advanced Technology
³ The School of Artificial Intelligence, The Chinese University of Hong Kong, Shenzhen
⁴ Department of Electrical and Computer Engineering, National University of Singapore
jiangfeng@suat-sz.edu.cn

Abstract

Theme detection is a fundamental task in user-centric dialogue systems, aiming to identify the latent topic of each utterance without relying on predefined schemas. Unlike intent induction, which operates within fixed label spaces, theme detection requires cross-dialogue consistency and alignment with personalized user preferences, posing significant challenges. Existing methods often struggle to build accurate topic representations from sparse, short utterances and fail to capture user-level thematic preferences across dialogues. To address these challenges, we propose CATCH (Controllable Theme Detection with Contextualized Clustering and Hierarchical Generation), a unified framework that integrates three core components: (1) context-aware topic representation, which enriches utterance-level semantics using surrounding topic segments; (2) preference-guided topic clustering, which jointly models semantic proximity and personalized feedback to align themes across dialogues; and (3) a hierarchical theme generation mechanism designed to suppress noise and produce robust, coherent topic labels. Experiments on a multi-domain customer dialogue benchmark (DSTC-12) demonstrate the effectiveness of CATCH with an 8B LLM in both theme clustering and topic generation quality.
Introduction

In real-world customer service domains such as banking, finance, travel, and insurance, accurately identifying the underlying theme of each user utterance is essential for enhancing service efficiency, understanding user needs, and retrieving contextually relevant knowledge. Unlike intent induction (Gung et al. 2023), which typically maps utterances to a predefined label space (Pu et al. 2022; Costa et al. 2023), theme detection aims to uncover latent and potentially novel topics without prior knowledge. Effective theme detection first requires precise topic assignment within a single dialogue (Nguyen et al. 2022; Du, Buntine, and Johnson 2013a), but, more importantly, themes should be consistent across multiple dialogues and align with user preferences (Mendonça et al. 2025), which regularizes inter-dialogue theme consolidation, as illustrated in Figure 1. These challenges underscore the need for models that can generalize beyond surface-level semantics and adapt to diverse real-world conversational scenarios.

[Figure 1: Illustration of the controllable theme detection task. Given a set of dialogues with unlabeled utterances, a theme is generated for each utterance. The theme granularity is influenced by auxiliary inputs such as user preferences (Mendonça et al. 2025). The figure shows dialogues within a specific domain, their separated intra-dialogue theme spaces (for example, "check interest rate", "open bank account", "check account balance", "change PIN"), and an inter-dialogue alignment theme space shaped by user preference and semantic conclusion (for example, "inquire about bank account").]

*Corresponding Author. Copyright © 2026, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.
However, existing approaches fall short on real-world controllable theme detection because of three key challenges. First, short utterances often lead to sparse and ambiguous semantic signals, making it difficult for conventional topic modeling methods (Blei, Ng, and Jordan 2003; Pham et al. 2024) to construct reliable topic representations. Second, while topic clustering methods (Chatterjee and Sengupta 2020; Gung et al. 2023) group utterances based on surface-level semantics, they typically overlook user-specific preferences, resulting in inconsistent clustering across dialogues even when the underlying intent is similar. Third, most previous work lacks a structured and controllable theme generation mechanism (Perkins and Yang 2019; Zeng et al. 2021), causing the generated topic labels to vary arbitrarily between contexts and limiting their applicability in downstream applications.

To address these challenges, we propose CATCH (Controllable And Thematic Clustering with Hierarchy), a controllable theme detection framework that combines intra-dialogue context modeling with inter-dialogue user preference alignment. Specifically, CATCH comprises th

Reference

This content is AI-processed based on open access ArXiv data.
