Language Steering for Multilingual In-Context Learning
While multilingual large language models have gained widespread adoption, their performance on non-English languages remains substantially inferior to English. This disparity is particularly evident in in-context learning, where providing demonstrations in English but testing on non-English inputs leads to significant performance degradation. In this paper, we hypothesize that LLMs develop a universal semantic space for understanding languages, in which different languages are encoded as distinct directions. Based on this hypothesis, we propose language vectors – a training-free language steering approach that leverages activation differences between source and target languages to guide model behavior. We steer model generations by adding the vector to intermediate activations during inference, shifting the model's internal representations toward the target-language space without any parameter updates. We evaluate our method on three datasets, 19 languages, and three models. Our results show consistent improvements over baselines on multilingual in-context learning across all tasks and languages tested. Beyond performance gains, hierarchical clustering of the steering vectors reveals meaningful linguistic structure aligned with language families. The vectors also transfer successfully across tasks, demonstrating that these representations are task-agnostic.
💡 Research Summary
The paper addresses a persistent gap in multilingual large language models (LLMs): when using in‑context learning (ICL), performance drops sharply when the few‑shot demonstrations are in English but the test query is in a low‑resource language. The authors hypothesize that LLMs embed a universal semantic space in which each language occupies a distinct direction. If this hypothesis holds, one can shift the model’s internal representations from the “English direction” toward a target‑language direction without any weight updates.
To operationalize this idea, they introduce language steering vectors. The method proceeds in three steps. First, they collect parallel question‑answer pairs in English (source) and a target language. For each pair they concatenate several (k=6) QA examples into a single prompt, feed it through the model, and extract hidden states at a chosen layer t. They average‑pool across all token positions to obtain a single vector per sample for each language. Second, they compute the difference between the target‑language and English vectors for each sample and average these differences over N samples, yielding a steering vector v(t) ∈ ℝ^d. This vector captures the systematic activation shift associated with the target language. Third, during inference, they add α·v(t) to the hidden states at layer t for a selected set of token positions (e.g., all few‑shot tokens, only the boundary token, only the test question, or the entire prompt). The addition is performed via a forward hook, requiring no gradient computation or parameter modification. Hyperparameters (layer t, scaling factor α, and token set P) are tuned on a validation split.
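The three steps above can be sketched in code. The snippet below is a minimal illustration using a toy model in place of a real LLM; all names (`ToyLM`, `hidden_at_layer`, `steering_vector`, `steered_forward`) are hypothetical, and a real implementation would hook a transformer layer of, e.g., Llama‑3.1‑8B‑Instruct via its hidden states.

```python
# Illustrative sketch (not the authors' code) of: (1) extracting and
# mean-pooling hidden states at layer t, (2) averaging target-minus-English
# differences into v(t), and (3) adding alpha * v(t) via a forward hook.
import torch
import torch.nn as nn

torch.manual_seed(0)
D = 16  # toy hidden size, standing in for the model's d

class ToyLM(nn.Module):
    """Minimal stand-in for an LLM: an embedding plus two 'layers'."""
    def __init__(self, vocab=100, d=D, n_layers=2):
        super().__init__()
        self.embed = nn.Embedding(vocab, d)
        self.layers = nn.ModuleList(nn.Linear(d, d) for _ in range(n_layers))
    def forward(self, ids):
        h = self.embed(ids)
        for layer in self.layers:
            h = torch.tanh(layer(h))
        return h

def hidden_at_layer(model, ids, t):
    """Step 1: run a prompt and capture the activations at layer t."""
    captured = {}
    handle = model.layers[t].register_forward_hook(
        lambda _m, _inp, out: captured.__setitem__("h", out))
    model(ids)
    handle.remove()
    return captured["h"]  # (seq_len, d)

def steering_vector(model, en_prompts, tgt_prompts, t):
    """Steps 1-2: mean-pool over token positions, take the
    target-minus-English difference per sample, average over N samples."""
    diffs = [
        hidden_at_layer(model, tgt, t).mean(dim=0)
        - hidden_at_layer(model, en, t).mean(dim=0)
        for en, tgt in zip(en_prompts, tgt_prompts)
    ]
    return torch.stack(diffs).mean(dim=0)  # v(t) in R^d

def steered_forward(model, ids, v, t, alpha, positions):
    """Step 3: add alpha * v(t) at layer t for the chosen token set P,
    via a forward hook -- no gradients, no parameter updates."""
    def hook(_m, _inp, out):
        out = out.clone()
        out[positions] += alpha * v  # shift toward the target language
        return out  # returned value replaces the layer's output
    handle = model.layers[t].register_forward_hook(hook)
    out = model(ids)
    handle.remove()
    return out

model = ToyLM()
# Stand-ins for N=3 parallel English / target-language QA prompts.
en = [torch.randint(0, 100, (8,)) for _ in range(3)]
tgt = [torch.randint(0, 100, (8,)) for _ in range(3)]
v = steering_vector(model, en, tgt, t=0)
query = torch.randint(0, 100, (8,))
out = steered_forward(model, query, v, t=0, alpha=1.5,
                      positions=list(range(8)))  # here P = all positions
```

The forward-hook mechanism is what makes the method training-free: the hook simply replaces the layer's output with a shifted copy during inference.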
The authors evaluate the approach on three instruction‑tuned models—Llama‑3.1‑8B‑Instruct, Qwen2.5‑7B‑Instruct, and Qwen2.5‑14B‑Instruct—across three benchmark suites: MGSM (multilingual grade‑school math), XNLI (cross‑lingual natural language inference), and MSVAMP (multilingual arithmetic word problems). They test 19 languages spanning high‑resource (e.g., French, Chinese) and low‑resource (e.g., Swahili, Urdu) settings. Two baselines are considered: (B) English few‑shot prompts with target‑language queries, and (MFS) multilingual few‑shot prompts that mix demonstrations from several languages. An oracle condition (target‑language few‑shot demonstrations) serves as an upper bound.
Results show consistent gains over the English‑only baseline. On MGSM, steering improves average accuracy from 61.01% to 65.87% on Llama‑3.1, with similar lifts on the Qwen models (up to +3% absolute). Gains are especially pronounced for languages that are typologically distant from English, suggesting the vector compensates for language‑specific encoding differences. On XNLI, improvements are more variable but still positive on average; the multilingual few‑shot baseline sometimes matches or exceeds steering, reflecting the benefit of diverse demonstrations for inference tasks. On MSVAMP, the multilingual few‑shot baseline is competitive, yet steering still yields a modest boost. Across all models, the most effective steering layer lies in the middle of the network (layers 10–20), and scaling factors around 1.0–2.0 work best.
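The hyperparameter tuning mentioned above (layer t and scale α selected on a validation split) amounts to a small grid search. The sketch below illustrates the shape of that search; `validation_accuracy` is a hypothetical stand-in for running the steered model on held-out data, with its peak placed to mimic the reported sweet spot (middle layers, α between 1.0 and 2.0).

```python
# Hedged sketch of the (layer, alpha) grid search on a validation split.
from itertools import product

def validation_accuracy(t, alpha):
    # Hypothetical score surface peaking at a mid-network layer and
    # alpha near 1.5. A real implementation would decode answers with
    # the steered model and score them against validation labels.
    return 1.0 - abs(t - 15) / 30 - abs(alpha - 1.5) / 4

layers = range(5, 30, 5)          # candidate steering layers t
alphas = [0.5, 1.0, 1.5, 2.0]     # candidate scaling factors
best = max(product(layers, alphas), key=lambda p: validation_accuracy(*p))
print(best)  # best (t, alpha) under this toy score: (15, 1.5)
```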
Beyond performance, the authors analyze the steering vectors themselves. Hierarchical clustering of vectors groups languages according to known families (Romance, Germanic, Slavic, etc.), indicating that the vectors capture genuine linguistic structure. Moreover, vectors derived from one task transfer to other tasks without re‑computation, confirming that they encode language‑mode information rather than task‑specific cues.
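The clustering analysis can be reproduced in miniature: hierarchically cluster per-language steering vectors and check that languages from the same family land in the same cluster. The vectors below are synthetic (a family centroid plus noise), standing in for the learned v(t) of each of the 19 languages; the family assignments are illustrative, not the paper's exact dendrogram.

```python
# Sketch of the family-structure analysis with synthetic steering vectors.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(0)
families = {
    "Romance": ["fr", "es", "it"],
    "Germanic": ["de", "nl", "sv"],
    "Slavic": ["ru", "pl", "cs"],
}
langs, vecs = [], []
for fam, members in families.items():
    centroid = rng.normal(size=32) * 3.0   # hypothetical family direction
    for lang in members:
        langs.append(lang)
        vecs.append(centroid + rng.normal(size=32) * 0.3)  # per-language v(t)

# Ward-linkage hierarchical clustering, cut into three clusters.
Z = linkage(np.stack(vecs), method="ward")
labels = fcluster(Z, t=3, criterion="maxclust")
for lang, lab in zip(langs, labels):
    print(lang, lab)  # same-family languages share a cluster label
```

If real steering vectors behave as the paper reports, the cut recovers the language families, which is the sense in which the vectors encode linguistic structure rather than task-specific cues.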
In summary, the paper contributes (1) a simple, training‑free technique to “steer” multilingual LLMs into a desired language mode, yielding measurable performance gains in cross‑lingual ICL; and (2) a diagnostic tool that reveals how multilingual models internally organize language information. The method’s low computational overhead and lack of parameter updates make it attractive for deployment in resource‑constrained settings or for languages with scarce labeled data. Future work could explore multi‑language steering, integration of steering vectors into pre‑training objectives, or application to even lower‑resource or dialectal languages.