AHaSIS: Shared Task on Sentiment Analysis for Arabic Dialects
📝 Abstract
The hospitality industry in the Arab world increasingly relies on customer feedback to shape services, driving the need for advanced Arabic sentiment analysis tools. To address this challenge, the Sentiment Analysis on Arabic Dialects in the Hospitality Domain shared task focuses on Sentiment Detection in Arabic Dialects. This task leverages a multi-dialect, manually curated dataset derived from hotel reviews originally written in Modern Standard Arabic (MSA) and translated into Saudi and Moroccan (Darija) dialects. The dataset consists of 538 sentiment-balanced reviews spanning positive, neutral, and negative categories. Translations were validated by native speakers to ensure dialectal accuracy and sentiment preservation. This resource supports the development of dialect-aware NLP systems for real-world applications in customer experience analysis. More than 40 teams registered for the shared task, and 12 submitted systems during the evaluation phase. The top-performing system achieved an F1 score of 0.81, demonstrating both the feasibility and the ongoing challenges of sentiment analysis across Arabic dialects.
📄 Content
1 Introduction
Arabic Sentiment Analysis (ASA) has become an increasingly prominent field within Natural Language Processing (NLP), spurred by the growing volume of Arabic content across digital platforms and the pressing need for automated systems to gauge public opinion. In contrast to high-resource languages, ASA continues to face enduring challenges due to the linguistic complexity of Arabic, its diglossic nature, and the considerable variation across regional dialects (Habash et al., 2013). These challenges are particularly evident in informal domains such as social media and hospitality, where sentiment expressions differ significantly across dialects.
To date, the majority of available resources for ASA have concentrated on Modern Standard Arabic (MSA), offering limited applicability to dialectal variants (Aladeemy et al., 2024). Consequently, models trained on MSA frequently struggle to generalise across dialects, leading to diminished performance in practical settings (Khrisat and Al-Harthy, 2015). Additionally, the development of robust, dialect-sensitive models has been hindered by a notable lack of high-quality, annotated datasets.
In response to these limitations, we present the Ahasis 2025 Shared Task, which seeks to advance sentiment classification techniques across Arabic dialects within the hospitality domain. This shared task provides a balanced dataset comprising hotel reviews written in Saudi Arabic and Moroccan Darija, each annotated with sentiment labels. Participants are invited to explore both traditional and neural classification approaches under conditions of limited training data. The task aims to evaluate the effectiveness of various modelling strategies in identifying sentiment from dialect-rich, user-generated content.
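For orientation, the sketch below shows one way an F1 score over the three sentiment classes (positive, neutral, negative) could be computed. The abstract reports an F1 score without stating the averaging scheme in this excerpt, so macro-averaging is an assumption here, and the function names `per_class_f1` and `macro_f1` are hypothetical, not part of the task's official scorer.

```python
# Illustrative macro-F1 for a three-class sentiment task.
# Assumption: macro-averaging; the official scoring script may differ.

LABELS = ("positive", "neutral", "negative")


def per_class_f1(gold, pred, label):
    """F1 for one label, treating it as the positive class."""
    pairs = list(zip(gold, pred))
    tp = sum(1 for g, p in pairs if g == label and p == label)
    fp = sum(1 for g, p in pairs if g != label and p == label)
    fn = sum(1 for g, p in pairs if g == label and p != label)
    if tp == 0:
        # No true positives: precision or recall is zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(gold, pred, labels=LABELS):
    """Unweighted mean of the per-class F1 scores."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    return sum(per_class_f1(gold, pred, lab) for lab in labels) / len(labels)


if __name__ == "__main__":
    gold = ["positive", "positive", "neutral", "negative"]
    pred = ["positive", "neutral", "neutral", "negative"]
    print(round(macro_f1(gold, pred), 4))
```

Macro-averaging weights each class equally, which is a natural fit for a sentiment-balanced dataset; with unbalanced labels, a weighted or micro average would give a different picture.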
The remainder of this paper is organised as follows: Section 2 reviews the relevant literature; Section 3 details the shared task and its setup; Section 4 describes the dataset; Section 5 presents the evaluation results; and finally, the paper concludes with key findings and outlines future directions.
2 Related Work
Arabic sentiment analysis has witnessed growing attention in recent years, with early studies laying the foundation by addressing the lack of dialect-specific annotations and lexical resources (Nabil et al., 2015). Aladeemy et al. (2024) critically reviewed the state of sentiment annotation in Arabic dialects, highlighting the prevalence of manual labelling techniques and the limited use of automated methods due to a shortage of robust linguistic resources. Their findings emphasise that machine learning approaches dominate the field, while lexicon-based systems remain underutilised.
Recent literature has placed emphasis on tackling dialectal diversity, recognising that Arabic dialects differ significantly in syntax, morphology, and vocabulary. A systematic review by Matrane et al. (2023) identified key preprocessing stages, such as normalisation, feature extraction, and sentiment tagging, as decisive factors in improving classification performance. The review also underscored the importance of handling negation and morphological variation, both of which are vital to interpreting sentiment in dialectal contexts.
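The preprocessing stages named above can be made concrete with a small sketch. It is not the pipeline from Matrane et al. (2023) or from any participating system; it only illustrates common Arabic normalisation conventions (stripping diacritics and tatweel, unifying alef and ya variants, collapsing elongated letters) followed by naive negation marking. The function names, the `NOT_` prefix, and the short negator list are assumptions chosen for illustration.

```python
import re

# Arabic short-vowel marks and related signs (U+064B..U+0652), plus dagger alef.
_DIACRITICS = re.compile("[\u064B-\u0652\u0670]")
_TATWEEL = "\u0640"  # elongation character used for visual stretching
_CHAR_MAP = str.maketrans({
    "\u0623": "\u0627",  # alef with hamza above -> bare alef
    "\u0625": "\u0627",  # alef with hamza below -> bare alef
    "\u0622": "\u0627",  # alef with madda -> bare alef
    "\u0649": "\u064A",  # alef maqsura -> ya
})
# Three or more repeats of one character, as in emphatic spellings.
_ELONGATION = re.compile(r"(.)\1{2,}")

# Assumption: a tiny, non-exhaustive set of negation particles.
NEGATORS = {"لا", "ما", "مو", "ماشي", "ليس"}


def normalise(text: str) -> str:
    """Apply orthographic normalisation and collapse whitespace."""
    text = _DIACRITICS.sub("", text)
    text = text.replace(_TATWEEL, "")
    text = text.translate(_CHAR_MAP)
    text = _ELONGATION.sub(r"\1", text)  # keep a single copy of the letter
    return " ".join(text.split())


def mark_negation(tokens):
    """Prefix the token that directly follows a negator with NOT_."""
    marked, negate = [], False
    for tok in tokens:
        if tok in NEGATORS:
            marked.append(tok)
            negate = True
        elif negate:
            marked.append("NOT_" + tok)
            negate = False
        else:
            marked.append(tok)
    return marked


if __name__ == "__main__":
    review = "ما عجبني الفندق"
    print(mark_negation(normalise(review).split()))
```

Marking only the next token is deliberately crude: it misses circumfix negation such as the Moroccan "ما...ش" pattern, where the negation is split around the verb, so a morphology-aware step would be needed in practice.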
Deep learning architectures, including convolutional neural networks (CNNs) and recurrent models like LSTM (Hochreiter and Schmidhuber, 1997) and GRU (Chung et al., 2014), have shown strong results in Arabic sentiment tasks (Baali and Ghneim, 2019). However, preprocessing remains a critical bottleneck. Guellil et al. (2020) stressed the necessity of standardised pipelines to improve performance consistency across tasks and datasets.
In parallel, researchers have explored cross-lingual methods to augment Arabic sentiment resources. Saadany and Orasan (2020) investigated the preservation of sentiment polarity in neural machine-translated Arabic reviews and identified frequent distortions introduced by automated translation tools. Similarly, Poncelas et al. (2020) examined the impact of machine translation on downstream sentiment classification, revealing that models trained on original data outperform those trained on translated corpora, especially in sentiment-sensitive applications.
Finally, while most progress has been made in MSA, Aladeemy et al. (2024) emphasise that Arabic dialects remain underrepresented in sentiment analysis research. They call for a shift towards developing dialect-aware resources and models that address the linguistic variation inherent to Arabic. The Ahasis shared task responds to this call by offering a domain-specific, multidialectal dataset and encouraging participants to experiment with resource-efficient and generative learning paradigms.
3 Task Description
The Ahasis 2025 Shared Task centres on sentiment analysis within the hospitality domain, specifically targeting hotel reviews written in regional Arabic dialects. Given Arabic’s linguistic richness, marked by the coexistence of MSA and a wide range of spoken dialects, sentiment clas