Dataset Creation and Baseline Models for Sexism Detection in Hausa

February 22, 2026

Reading time: 2 minutes

...

📝 Original Info

Title: Dataset Creation and Baseline Models for Sexism Detection in Hausa
ArXiv ID: 2510.27038
Date: 2025-10-30
Authors: Author information (e.g., names, affiliations) stated in the paper was not provided.

📝 Abstract

Sexism reinforces gender inequality and social exclusion by perpetuating stereotypes, bias, and discriminatory norms. Because online platforms enable various forms of sexism to thrive, there is a growing need for effective sexism detection and mitigation strategies. While computational approaches to sexism detection are widespread in high-resource languages, progress remains limited in low-resource languages, where scarce linguistic resources and cultural differences affect how sexism is expressed and perceived. This study introduces the first Hausa sexism detection dataset, developed through community engagement, qualitative coding, and data augmentation. To capture cultural nuances and linguistic representation, we conducted a two-stage user study (n=66) involving native speakers to explore how sexism is defined and articulated in everyday discourse. We further experiment with both traditional machine learning classifiers and pre-trained multilingual language models, and evaluate the effectiveness of few-shot learning in detecting sexism in Hausa. Our findings highlight challenges in capturing cultural nuance, particularly with clarification-seeking and idiomatic expressions, and reveal a tendency toward false positives in such cases.


