Multi-Intent Spoken Language Understanding: Methods, Trends, and Challenges

Reading time: 5 minutes

📝 Original Info

  • Title: Multi-Intent Spoken Language Understanding: Methods, Trends, and Challenges
  • ArXiv ID: 2512.11258
  • Date: 2025-12-12
  • Authors: Di Wu¹²†, Ruiyu Fang²†, Liting Jiang¹†, Shuangyong Song²†, Xiaomeng Huang², Shiquan Wang², Zhongqiu Li², Lingling Shi², Mengjiao Bao², Yongxiang Li²* (liyx25@chinatelecom.cn), Hao Huang¹³* (huanghao@xju.edu.cn)
  • Affiliations: ¹ School of Computer Science and Technology, Xinjiang University; ² Institute of Artificial Intelligence (TeleAI), China Telecom Corp Ltd.; ³ Joint International Research Laboratory of Silk Road Multilingual Cognitive Computing, Urumqi, China
  • † Co-first authors (equal contribution); * Corresponding authors

📝 Abstract

Multi-intent spoken language understanding (SLU) involves two tasks: multiple intent detection and slot filling, which jointly handle utterances containing more than one intent. Owing to this characteristic, which closely reflects real-world applications, the task has attracted increasing research attention, and substantial progress has been achieved. However, there remains a lack of a comprehensive and systematic review of existing studies on multi-intent SLU. To this end, this paper presents a survey of recent advances in multi-intent SLU. We provide an in-depth overview of previous research from two perspectives: decoding paradigms and modeling approaches. On this basis, we further compare the performance of representative models and analyze their strengths and limitations. Finally, we discuss the current challenges and outline promising directions for future research. We hope this survey will offer valuable insights and serve as a useful reference for advancing research in multi-intent SLU.

💡 Deep Analysis

Figure 1

📄 Full Content

Keywords: Multiple intent detection, Slot filling, Joint training

1 Introduction

Spoken Language Understanding (SLU) typically involves two core tasks: intent detection and slot filling [1]. The former identifies the goal underlying the user's request, while the latter extracts entities related to that goal from the utterance.
As a key component of task-oriented dialogue systems, SLU enables systems to interpret user input and respond appropriately. As shown in Fig. 1(a), given the user utterance "Show me the airports serviced by tower Air", the intent is "atis_airport", and the corresponding slot annotations are "O O O O O O B-airline_name I-airline_name".

Fig. 1 Examples from the SLU dataset. (a) shows a single-intent SLU sample, while (b) presents a multi-intent SLU sample. For clarity, slot labels marked as "O" are omitted.

In real-world scenarios, a single utterance may express multiple intents; for example, about 52% of utterances in Amazon's internal dataset contain multiple intents [2]. As shown in Fig. 1(b), the user utterance "Show me the type of aircraft that cp uses and what time zone is denver in?" conveys multiple intents ("atis_aircraft#atis_city") with corresponding slot annotations "O O O O O O O B-airline_code O O O O O O B-city_name O".

Although multi-intent SLU shares a similar objective with single-intent SLU, namely predicting intents and slot labels from utterances, it poses unique challenges. Multi-intent utterances often comprise multiple clauses, making it difficult for single-intent models to capture all semantic dependencies [3–7]. The challenges mainly arise from two aspects: 1. Complex decoding mechanisms.
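To make the annotation scheme above concrete, the two Fig. 1 samples can be written as plain data structures that pair the utterance with its intent label(s) and align each token with a BIO slot tag. This is only an illustrative sketch: the dict layout, field names, and the `extract_slots` helper are assumptions for this post, not a format defined by the survey.

```python
# Illustrative encoding of the Fig. 1 samples (field names are assumptions).
# Each token aligns 1:1 with a BIO slot tag; a multi-intent sample simply
# carries more than one intent label.
single_intent_sample = {
    "utterance": "Show me the airports serviced by tower Air".split(),
    "intents": ["atis_airport"],
    "slots": ["O", "O", "O", "O", "O", "O",
              "B-airline_name", "I-airline_name"],
}

multi_intent_sample = {
    "utterance": ("Show me the type of aircraft that cp uses "
                  "and what time zone is denver in?").split(),
    "intents": ["atis_aircraft", "atis_city"],
    "slots": ["O", "O", "O", "O", "O", "O", "O", "B-airline_code",
              "O", "O", "O", "O", "O", "O", "B-city_name", "O"],
}

def extract_slots(tokens, tags):
    """Collect (slot_name, text) spans from token-aligned BIO tags."""
    spans, current = [], None
    for tok, tag in zip(tokens, tags):
        if tag.startswith("B-"):          # a new span starts here
            if current:
                spans.append(current)
            current = (tag[2:], [tok])
        elif tag.startswith("I-") and current:
            current[1].append(tok)        # continue the open span
        else:                             # "O" closes any open span
            if current:
                spans.append(current)
            current = None
    if current:
        spans.append(current)
    return [(name, " ".join(toks)) for name, toks in spans]

# Sanity check: slot tags must align 1:1 with tokens.
for sample in (single_intent_sample, multi_intent_sample):
    assert len(sample["utterance"]) == len(sample["slots"])
```

Running `extract_slots` on the multi-intent sample recovers the two entities tied to the two intents: `("airline_code", "cp")` and `("city_name", "denver")`.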
Single-intent SLU models usually encode an utterance into a single sentence-level representation for intent prediction, which limits their ability to distinguish multiple co-occurring intents. 2. Increased interaction complexity between intent and slot features. As illustrated in Fig. 1, intent detection and slot filling are interdependent. However, in multi-intent utterances, cross-clause interference exacerbates the difficulty of modeling this interaction effectively.

Recognizing the importance of multi-intent SLU for building more capable dialogue systems, researchers have made significant progress in recent years. The number of related publications has grown steadily...
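The first challenge above, decoding, can be made concrete. A common single-intent decoder applies a softmax over one sentence representation and takes the argmax, which by construction yields exactly one intent; a common multi-intent decoder instead scores each intent independently (e.g. a sigmoid per label) and keeps every label above a threshold. The sketch below illustrates only this decoding-step contrast; the logits, the label set, and the 0.5 threshold are illustrative assumptions, not values from the survey.

```python
import math

def softmax(logits):
    m = max(logits)  # subtract max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def decode_single_intent(logits, labels):
    # Single-intent SLU: softmax + argmax can only ever emit one label.
    probs = softmax(logits)
    return labels[probs.index(max(probs))]

def decode_multi_intent(logits, labels, threshold=0.5):
    # Multi-intent SLU: independent per-label probabilities; keep all
    # labels whose probability clears the threshold (possibly none or many).
    return [lab for lab, z in zip(labels, logits) if sigmoid(z) > threshold]

labels = ["atis_aircraft", "atis_city", "atis_airport"]
logits = [2.1, 1.3, -3.0]  # hypothetical scores for one utterance

print(decode_single_intent(logits, labels))  # "atis_aircraft" only
print(decode_multi_intent(logits, labels))   # ["atis_aircraft", "atis_city"]
```

The same logits thus decode to one intent under the single-intent scheme but to both co-occurring intents under the multi-label scheme, which is why multi-intent models need the more complex decoding mechanism the survey describes.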

📸 Image Gallery

0.png 1.png intent_decoding_1.png intent_decoding_2.png intent_decoding_3.png intent_decoding_4.png interaction.png slot_decoding_1.png slot_decoding_2.png

Reference

This content is AI-processed based on open access ArXiv data.
