Multilingual VLM Training: Adapting an English-Trained VLM to French


📝 Original Info

  • Title: Multilingual VLM Training: Adapting an English-Trained VLM to French
  • ArXiv ID: 2512.10336
  • Date: 2025-12-11
  • Authors: Jules Lahmi, Alexis Roger

📝 Abstract

Artificial intelligence has made great progress in recent years, particularly in the development of Vision-Language Models (VLMs) that understand both visual and textual data. However, these advances remain largely limited to English, reducing their accessibility for non-English speakers, so it is essential to extend these capabilities to a broader range of languages. This paper explores the challenges of adapting an English-trained VLM to other languages. To this end, we explore and compare several methods, assessing both their performance and their computational cost: a translation-based pipeline, LoRA finetuning, and a two-stage finetuning strategy that separates vision adaptation from language adaptation. To evaluate these methods, we use a combination of standard multimodal benchmarks translated into the target language and manual assessments by native experts. The results reveal that dataset translation remains a major bottleneck in multilingual VLM performance, with data quality limiting the effectiveness of both training and evaluation. These findings suggest that future efforts should focus on native-language dataset collection and improved translation strategies.
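The LoRA finetuning mentioned in the abstract adapts a frozen pretrained weight matrix W by adding a trainable low-rank update scaled by alpha/r. The sketch below is a minimal NumPy illustration of that idea with hypothetical dimensions, not the authors' implementation (which would typically use a library such as Hugging Face PEFT on the VLM's attention layers):

```python
import numpy as np

def lora_forward(x, W, A, B, alpha=16, r=4):
    """Forward pass through a frozen weight W plus a LoRA update.

    W: (d_out, d_in) frozen pretrained weight
    A: (r, d_in)     trainable down-projection, Gaussian init
    B: (d_out, r)    trainable up-projection, zero init

    Because B starts at zero, the scaled update (alpha / r) * B @ A
    is zero at step 0, so finetuning begins exactly at the
    pretrained model's behavior.
    """
    return x @ (W + (alpha / r) * (B @ A)).T

rng = np.random.default_rng(0)
d_in, d_out, r = 8, 6, 4
W = rng.normal(size=(d_out, d_in))   # frozen base weight
A = rng.normal(size=(r, d_in))       # trainable
B = np.zeros((d_out, r))             # trainable, zero-initialized
x = rng.normal(size=(2, d_in))       # a batch of 2 inputs

# Before any training, the LoRA output equals the frozen base output.
assert np.allclose(lora_forward(x, W, A, B), x @ W.T)
```

Only A and B (r * (d_in + d_out) parameters) are updated during adaptation, which is why LoRA is attractive for language adaptation at low computational cost.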

📄 Full Content

...(The full text is omitted here due to length; see the original site for the complete article.)
