On Fairness of Task Arithmetic: The Role of Task Vectors


Model editing techniques, particularly task arithmetic with task vectors, offer an efficient alternative to full fine-tuning by enabling direct parameter updates through simple arithmetic operations. While this approach promises substantial computational savings, its impact on fairness has remained largely unexplored, despite growing concern over biased outcomes in high-stakes applications such as hate speech detection. In this work, we present the first systematic study of group fairness in task arithmetic for binary text and image classification, comparing it against full fine-tuning (FFT) and Low-Rank Adaptation (LoRA). We evaluate multiple language models and datasets using standard group fairness metrics, including Demographic Parity and Equalized Odds. Our analysis shows that task vectors can be tuned to achieve competitive accuracy while reducing disparities, and that merging subgroup-specific task vectors provides a practical mechanism for steering fairness outcomes. We further derive a theoretical bound linking task-vector scaling to fairness metrics, offering insight into the observed trade-offs. Together, these findings establish task arithmetic not only as a cost-efficient editing method but also as a fairness-aware alternative to existing adaptation techniques within the standard group-fair classification setting, laying the groundwork for responsible deployment of large language models.


💡 Research Summary

This paper presents the first systematic investigation of group fairness in the emerging model‑editing paradigm known as task arithmetic, where “task vectors”—the parameter difference between a fine‑tuned model and its base—are added, subtracted, or scaled to modify model behavior without additional gradient updates. The authors compare task‑vector editing against full fine‑tuning (FFT) and Low‑Rank Adaptation (LoRA) across both natural‑language and vision domains, focusing on binary classification tasks (hate‑speech detection, toxicity detection, and age classification) while evaluating multiple demographic subgroups (gender and race).
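The arithmetic described above can be sketched in a few lines. In this minimal sketch, model weights are represented as flat name-to-value dictionaries (a stand-in for real parameter tensors), and `lam` plays the role of the scaling coefficient λ discussed below:

```python
# Minimal sketch of task arithmetic, assuming weights are flat
# name -> value dictionaries (real models would use parameter tensors).

def task_vector(base, finetuned):
    """tau = theta_ft - theta_base: the 'task vector' of the edit."""
    return {k: finetuned[k] - base[k] for k in base}

def apply_task_vector(base, tau, lam=1.0):
    """Edit the base model: theta = theta_base + lam * tau.
    lam > 0 adds the learned behavior; lam < 0 negates (subtracts) it;
    no gradient updates are involved."""
    return {k: base[k] + lam * tau[k] for k in base}

base = {"w1": 0.2, "w2": -0.5}   # pre-trained weights
ft   = {"w1": 0.6, "w2": -0.1}   # fine-tuned weights
tau = task_vector(base, ft)              # both coordinates ~0.4
edited = apply_task_vector(base, tau, lam=0.5)
```

Negating a vector (`lam=-1`) is the same operation, which is why the one mechanism supports adding, subtracting, and scaling behaviors.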

Key methodological contributions include: (1) defining a single global scalar λ that uniformly scales a task vector, providing a one‑dimensional control knob for fairness‑accuracy trade‑offs; (2) proposing the merging of subgroup‑specific task vectors via linear combination, enabling targeted performance boosts for under‑represented groups; and (3) deriving a theoretical upper bound that links λ to the standard fairness metrics Demographic Parity Difference (DPD) and Equalized Odds Difference (EOD), assuming a 1‑Lipschitz relationship between parameter changes and model logits.
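Contribution (2), merging subgroup-specific vectors, is a linear combination θ = θ_base + Σ_g λ_g τ_g. A minimal sketch, again with flat weight dictionaries (the subgroup names and coefficient values here are illustrative, not from the paper):

```python
# Sketch of merging subgroup-specific task vectors:
# theta = theta_base + sum_g lam_g * tau_g.
# Each per-group coefficient lam_g is a knob for steering fairness.

def merge_task_vectors(base, taus, lams):
    """Apply a linear combination of subgroup task vectors as one edit."""
    merged = dict(base)
    for tau, lam in zip(taus, lams):
        for k in merged:
            merged[k] += lam * tau[k]
    return merged

base = {"w": 0.0}
tau_gender = {"w": 0.8}   # vector from fine-tuning on one subgroup's data
tau_race   = {"w": -0.2}  # vector from another subgroup's data
model = merge_task_vectors(base, [tau_gender, tau_race], [0.5, 1.0])
# model["w"] = 0.0 + 0.5*0.8 + 1.0*(-0.2) ~ 0.2
```

Upweighting the vector of an under-represented group (a larger λ_g) is the mechanism the paper uses to boost that group's performance without retraining.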

Experiments use LLaMA‑7B, DistilBERT, and Qwen2.5‑0.5B for text, and ViT‑Base/16 for vision. Datasets include the Berkeley D‑Lab hate‑speech corpus (with fine‑grained gender and race annotations), Civil Comments (toxicity), and UTKFace (age, gender, race). For each setting the authors report subgroup‑wise accuracy, macro‑averaged and worst‑group DPD/EOD, and overall accuracy.
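The two reported metrics follow their standard definitions: DPD is the largest gap in positive-prediction rate across groups, and EOD is the largest gap in true-positive or false-positive rate. A self-contained sketch of both:

```python
# Standard group-fairness metrics (Demographic Parity Difference and
# Equalized Odds Difference). y_true/y_pred are 0/1 labels; groups
# assigns each example to a demographic subgroup.

def selection_rate(y_pred, mask):
    sel = [p for p, m in zip(y_pred, mask) if m]
    return sum(sel) / len(sel)

def dpd(y_pred, groups):
    """Demographic Parity Difference: max gap in P(y_hat=1) across groups."""
    rates = [selection_rate(y_pred, [g == gr for g in groups])
             for gr in set(groups)]
    return max(rates) - min(rates)

def eod(y_true, y_pred, groups):
    """Equalized Odds Difference: worst gap in TPR or FPR across groups."""
    def rate(gr, label):
        idx = [i for i, g in enumerate(groups)
               if g == gr and y_true[i] == label]
        return sum(y_pred[i] for i in idx) / len(idx)
    gaps = []
    for label in (0, 1):   # label=1 -> TPR gap, label=0 -> FPR gap
        rs = [rate(gr, label) for gr in set(groups)]
        gaps.append(max(rs) - min(rs))
    return max(gaps)

y_true = [1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0]
groups = ["a", "a", "a", "b", "b", "b"]
# dpd(y_pred, groups) -> 1/3; eod(y_true, y_pred, groups) -> 0.5
```

The "worst-group" variants reported in the paper take the max over subgroups rather than the macro average.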

Results show that a modest scaling (λ≈0.5–0.8) yields accuracy comparable to FFT (within 1–2 %) while reducing DPD and EOD by roughly 30 % on average. Merging subgroup‑specific vectors further lowers error rates for minority groups with negligible overall performance loss, demonstrating that task arithmetic can be tuned to satisfy fairness constraints post‑hoc. Simple vector addition sometimes causes “negative transfer”—improving one group at the expense of another—but the same λ‑grid search can mitigate these effects.
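The λ-grid search referred to above can be sketched as a simple sweep that keeps the most accurate λ among candidates meeting a fairness budget. The `evaluate` callback, the budget value, and the toy trade-off curve below are all hypothetical, for illustration only:

```python
# Illustrative lambda grid search: choose the scaling that maximizes
# held-out accuracy subject to a cap on disparity (DPD).
# evaluate(lam) is a hypothetical callback returning (accuracy, dpd)
# for the edited model theta_base + lam * tau on a validation set.

def search_lambda(evaluate, grid, max_dpd=0.05):
    best_lam, best_acc = None, -1.0
    for lam in grid:
        acc, disparity = evaluate(lam)
        if disparity <= max_dpd and acc > best_acc:
            best_lam, best_acc = lam, acc
    return best_lam

# Toy trade-off: accuracy peaks near lam=0.7, disparity grows with lam.
def toy_evaluate(lam):
    return 1.0 - (lam - 0.7) ** 2, 0.06 * lam

grid = [0.1 * i for i in range(11)]   # 0.0, 0.1, ..., 1.0
lam = search_lambda(toy_evaluate, grid, max_dpd=0.05)
```

Because each candidate only requires re-adding a scaled vector (no gradient steps), the sweep is cheap relative to re-running FFT or LoRA per configuration.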

The theoretical analysis confirms that increasing λ linearly amplifies group‑wise selection‑rate disparities, matching empirical observations and offering a principled way to predict fairness impact before deployment. Compared to LoRA, which reduces computational load but does not inherently address bias, task arithmetic adds negligible overhead (no extra training) and provides interpretability: each vector can be inspected to understand which demographic behavior it encodes.
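One way to formalize that linearity, my own illustrative reconstruction under the summary's Lipschitz assumption rather than the paper's exact statement: if each group's selection rate is an L-Lipschitz function of the parameters, then for the edited model each rate moves by at most Lλ‖τ‖, and DPD, being a difference of two such rates, moves by at most twice that:

```latex
% Illustrative bound (not the paper's exact statement): assume each
% group's selection rate is L-Lipschitz in the model parameters.
\[
\theta_\lambda = \theta_{\mathrm{base}} + \lambda\,\tau
\quad\Longrightarrow\quad
\bigl|\,\mathrm{DPD}(\theta_\lambda) - \mathrm{DPD}(\theta_{\mathrm{base}})\,\bigr|
\;\le\; 2L\,\lambda\,\lVert \tau \rVert .
\]
```

Under this reading, disparity growth is at most linear in λ for a fixed task vector, which is consistent with the empirical trend the authors report.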

Limitations include the focus on binary, multi‑group classification; extensions to multi‑label, regression, or generative tasks remain open. Additionally, λ is tuned on a held‑out validation set, which may not capture distribution shifts in real‑world deployment, suggesting a need for adaptive or online scaling mechanisms.

In conclusion, the study demonstrates that task‑vector based model editing is not only a cost‑effective alternative to traditional fine‑tuning but also a viable fairness‑aware tool. By adjusting a single scalar or merging subgroup vectors, practitioners can achieve competitive performance while substantially reducing demographic disparities, paving the way for more responsible deployment of large language and vision models.

