Electrical Engineering and Systems Science / Audio Processing

Improving Code-Switching Speech Recognition with TTS Data Augmentation

February 09, 2026

📝 Original Info

Title: Improving Code-Switching Speech Recognition with TTS Data Augmentation
ArXiv ID: 2601.00935
Date: 2026-01-02
Authors: Yue Heng Yeo, Yuchen Hu, Shreyas Gopal, Yizhou Peng, Hexin Liu, Eng Siong Chng

📝 Abstract

Automatic speech recognition (ASR) for conversational code-switching speech remains challenging due to the scarcity of realistic, high-quality labeled speech data. This paper explores multilingual text-to-speech (TTS) models as an effective data augmentation technique to address this shortage. Specifically, we fine-tune the multilingual CosyVoice2 TTS model on the SEAME dataset to generate synthetic conversational Chinese-English code-switching speech, significantly increasing the quantity and speaker diversity of available training data. Our experiments demonstrate that augmenting real speech with synthetic speech reduces the mixed error rate (MER) from 12.1% to 10.1% on DevMan and from 17.8% to 16.0% on DevSGE, absolute reductions of 2.0 and 1.8 percentage points, respectively. These results confirm that multilingual TTS is an effective and practical tool for enhancing ASR robustness in low-resource, conversational code-switching scenarios.
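The abstract reports results as mixed error rate (MER) but this page does not show how it is computed. As a minimal sketch, assuming the common convention for Chinese-English code-switching in which each Chinese character and each English word counts as one token, MER is the token-level edit distance divided by the number of reference tokens. The function names, the regular expression, and the lowercasing step below are illustrative assumptions, not the authors' scoring script.

```python
import re

# Assumed tokenization: one token per CJK character, one token per run of
# Latin letters, digits, or apostrophes. Everything else is ignored.
_TOKEN = re.compile(r"[\u4e00-\u9fff]|[A-Za-z0-9']+")


def tokenize(text):
    """Split mixed Chinese-English text into MER tokens (hypothetical helper)."""
    return _TOKEN.findall(text.lower())


def edit_distance(ref, hyp):
    """Levenshtein distance between two token lists (unit costs)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, h in enumerate(hyp, 1):
            curr.append(min(
                prev[j] + 1,             # deletion of a reference token
                curr[j - 1] + 1,         # insertion of a hypothesis token
                prev[j - 1] + (r != h),  # substitution or match
            ))
        prev = curr
    return prev[-1]


def mixed_error_rate(refs, hyps):
    """Corpus-level MER: total edits divided by total reference tokens."""
    errors = 0
    total = 0
    for ref, hyp in zip(refs, hyps):
        ref_tokens = tokenize(ref)
        errors += edit_distance(ref_tokens, tokenize(hyp))
        total += len(ref_tokens)
    if total == 0:
        raise ValueError("reference contains no tokens")
    return errors / total
```

For example, scoring the hypothesis "我想shopping more" against the reference "我想去shopping mall" counts one deletion and one substitution over five reference tokens. Real evaluation pipelines usually add text normalization (punctuation, numerals, hesitation markers) before scoring, which this sketch omits.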

📄 Full Content

...(The full text is long and has been omitted here. Please see the site for the complete text.)
