NileTMRG at SemEval-2017 Task 4: Arabic Sentiment Analysis

📝 Original Info

  • Title: NileTMRG at SemEval-2017 Task 4: Arabic Sentiment Analysis
  • ArXiv ID: 1710.08458
  • Date: 2017-10-25
  • Authors: Samhaa R. El-Beltagy, Mona El Kalamawy, Abu Bakr Soliman

📝 Abstract

This paper describes two systems that the authors used to address Arabic Sentiment Analysis as part of SemEval-2017, Task 4. The authors participated, under the team name NileTMRG, in three Arabic-related subtasks: Subtask A (Message Polarity Classification), Subtask B (Topic-Based Message Polarity Classification) and Subtask D (Tweet Quantification). For Subtask A, we made use of our previously developed sentiment analyzer, which we augmented with a scored lexicon. For Subtasks B and D, we used an ensemble of three different classifiers. The first classifier was a convolutional neural network for which we trained (word2vec) word embeddings. The second classifier consisted of a Multi-Layer Perceptron, while the third classifier was a logistic regression model that takes the same input as the second classifier. Voting between the three classifiers was used to determine the final outcome. The output from Subtask B was quantified to produce the results for Subtask D. In all three Arabic-related subtasks in which NileTMRG participated, the team ranked first.

📄 Full Content

NileTMRG at SemEval-2017 Task 4: Arabic Sentiment Analysis

Samhaa R. El-Beltagy (1), Mona El Kalamawy (2), Abu Bakr Soliman (1)
(1) Center for Informatics Sciences, Nile University, Juhayna Square, Sheikh Zayed City, Giza, Egypt
(2) Faculty of Computers and Information, Cairo University, Ahmed Zewail St, Giza, Egypt
samhaa@computer.org, mona.elkalamawy@fci-cu.edu.eg, ab.soliman@nu.edu.eg

Abstract

This paper describes two systems that were used by NileTMRG for addressing Arabic Sentiment Analysis as part of SemEval-2017, Task 4. NileTMRG participated in three Arabic-related subtasks: Subtask A (Message Polarity Classification), Subtask B (Topic-Based Message Polarity Classification) and Subtask D (Tweet Quantification). For Subtask A, we made use of our previously developed sentiment analyzer, which we augmented with a scored lexicon. For Subtasks B and D, we used an ensemble of three different classifiers. The first classifier was a convolutional neural network for which we trained (word2vec) word embeddings. The second classifier consisted of a Multi-Layer Perceptron, while the third classifier was a logistic regression model that takes the same input as the second classifier. Voting between the three classifiers was used to determine the final outcome. The output from Subtask B was quantified to produce the results for Subtask D. In all three Arabic-related subtasks in which NileTMRG participated, the team ranked first.
1 Introduction

Because of the potential impact of understanding how people react to certain products, events, people, etc., sentiment analysis is an area that has attracted much attention over the past number of years. The consistent increase in Arabic social media content since 2011 (Neal 2013; Anon 2012; Farid 2013) resulted in increased interest in Arabic sentiment analysis. A lack of Arabic resources (datasets and lexicons) initially hindered research efforts in the area, but the area gradually gained attention, with research effort focusing either on building missing resources (El-Beltagy 2016; Refaee & Rieser 2014; El-Beltagy 2017) or on experimenting with different classifiers and features while creating needed resources, as is briefly described in the related work section.

In this paper we present our approach to addressing the following three SemEval-related sentiment analysis subtasks (Arabic):
A) Message Polarity Classification: given a tweet/some text, the task is to determine whether the tweet reflects positive, negative, or neutral sentiment.
B) Topic-Based Message Polarity Classification: given some text and a topic, determine whether the sentiment embodied by the text is positive or negative towards the given topic.
D) Tweet quantification: given a set of tweets about a given topic, estimate their distribution across the positive and negative classes.
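As a rough illustration of what subtask D asks for, the sketch below estimates a class distribution from per-tweet polarity labels by simple counting (often called "classify and count"). This is an assumption made for illustration only: the text states that the subtask B output was quantified, but does not say how, and the function name `classify_and_count` and the label strings are hypothetical.

```python
from collections import Counter


def classify_and_count(predicted_labels, classes=("positive", "negative")):
    """Estimate class prevalence as the share of predictions per class.

    Hypothetical helper: labels outside `classes` are ignored, and the
    returned proportions sum to 1 over `classes`.
    """
    counts = Counter(label for label in predicted_labels if label in classes)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no predictions belong to the requested classes")
    # Counter returns 0 for a class that never occurs.
    return {c: counts[c] / total for c in classes}


# Example: four tweets about one topic, already labelled by a classifier.
topic_predictions = ["positive", "negative", "positive", "positive"]
distribution = classify_and_count(topic_predictions)
print(distribution)
```

In practice the labels would come from a topic-based polarity classifier run over all tweets of a topic; counting is only the simplest possible quantifier.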
Two systems have been used to address these tasks. The first system is a slightly altered version of that presented in (El-Beltagy et al. 2016). The second is composed of an ensemble of three different classifiers: a convolutional neural network (Kim 2014), a Multi-Layer Perceptron, and a logistic regression classifier.
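The abstract states that voting between the three classifiers determines the final outcome. A minimal sketch of one such scheme, plain majority voting over per-tweet labels, is given below. The exact voting rule is an assumption, as the text here does not specify it, and the function names and label strings are hypothetical.

```python
from collections import Counter


def majority_vote(cnn_label, mlp_label, logreg_label):
    """Return the label chosen by most of the three classifiers.

    With two classes and three voters a strict majority always exists.
    """
    votes = Counter([cnn_label, mlp_label, logreg_label])
    label, _count = votes.most_common(1)[0]
    return label


def ensemble_predict(cnn_preds, mlp_preds, logreg_preds):
    """Apply majority voting tweet by tweet across three prediction lists."""
    return [
        majority_vote(cnn, mlp, logreg)
        for cnn, mlp, logreg in zip(cnn_preds, mlp_preds, logreg_preds)
    ]


# Example with made-up predictions for three tweets.
cnn = ["positive", "negative", "negative"]
mlp = ["positive", "positive", "negative"]
logreg = ["negative", "positive", "negative"]
print(ensemble_predict(cnn, mlp, logreg))
```

Using an odd number of voters on a two-class problem avoids ties, which may be one reason a three-classifier ensemble suits the positive/negative setting of subtask B.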

The rest of this paper is organized as follows: section 2 presents a brief overview of related work, section 3 describes the datasets used for training, section 4 overviews the developed systems, section 5 presents the evaluation results, and section 6 concludes the paper.

2 Related Work
2.1 Task A

Research in Arabic sentiment analysis has been gaining momentum over the past couple of years. The work of (El-Beltagy & Ali 2013) outlined the challenges faced in carrying out Arabic sentiment analysis and presented a simple lexicon-based approach for the task. (Abdulla et al. 2013) compared machine learning and lexicon-based techniques for Arabic sentiment analysis on tweets written in the Jordanian dialect. The best obtained results were reported to be those of SVM and Naive Bayes. The work presented in (Shoukry & Rafea 2012) targeted tweets written in the Egyptian dialect and focused on examining the effect of different preprocessing steps on the task of sentiment analysis. The authors used an SVM classifier in all their experiments. (Salamah & Elkhlifi 2014) developed a system for extracting sentiment from the Kuwaiti dialect. They experimented with a manually annotated dataset comprised of 340,000 tweets, using SVM, J48, ADTREE, and Random Tree classifiers. The best result was obtained using SVM. (Duwairi et al. 2014) presented a sentiment analysis tool for Jordanian Arabic tweets. The authors experimented with Naïve Bayes (NB), SVM and KNN classifiers. The NB classifier performed best in their experiments. (Shoukry & Rafea 2015) presented an approach that combines sentiment scores obtaine

…(Full text truncated)…

Reference

This content is AI-processed based on ArXiv data.
