Skin Tone Emoji and Sentiment on Twitter

Reading time: 5 minutes

📝 Original Info

  • Title: Skin Tone Emoji and Sentiment on Twitter
  • ArXiv ID: 1805.00444
  • Date: 2018-05-01
  • Authors: Saif M. Mohammad; Svetlana Kiritchenko; Yoon Kim; Timothy J. O’Connor; Mark Dredze

📝 Abstract

In 2015, the Unicode Consortium introduced five skin tone emoji that can be used in combination with emoji representing human figures and body parts. In this study, use of the skin tone emoji is analyzed geographically in a large sample of data from Twitter. It can be shown that values for the skin tone emoji by country correspond approximately to the skin tone of the resident populations, and that a negative correlation exists between tweet sentiment and darker skin tone at the global level. In an era of large-scale migrations and continued sensitivity to questions of skin color and race, understanding how new language elements such as skin tone emoji are used can help frame our understanding of how people represent themselves and others in terms of a salient personal appearance attribute.

💡 Deep Analysis

Figure 1

📄 Full Content

Unicode code points are used not only to map the characters of the world's languages, but since 2009 also for emoji characters that often depict faces or human forms. Introduced by Japanese telecommunications providers in the 1990s, emoji were implemented in the popular iOS and Android mobile operating systems as well as on social media platforms such as Facebook, Twitter, or Instagram shortly after their canonization in the Unicode scheme. In 2015 the Unicode Consortium introduced a new set of emoji characters that include code points allowing users to select from five different skin tones, in addition to a default skin tone (usually yellow, Fig. 1), for a set of emoji characters that depict persons and body parts [1]. The skin tones, derived from the Fitzpatrick scale used in dermatology, are applied to a face or body-part emoji by appending the Unicode code point for the skin tone to the code point for the face or body part.
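The appending mechanism can be illustrated with a short Python sketch. The five modifier code points (U+1F3FB through U+1F3FF) are the ones defined by Unicode; the function names and the tone labels used as dictionary keys are illustrative choices and do not come from the paper.

```python
from typing import List

# Fitzpatrick skin tone modifiers, U+1F3FB through U+1F3FF.
# The string labels are illustrative names chosen for this sketch.
SKIN_TONE_MODIFIERS = {
    "type-1-2": "\U0001F3FB",
    "type-3": "\U0001F3FC",
    "type-4": "\U0001F3FD",
    "type-5": "\U0001F3FE",
    "type-6": "\U0001F3FF",
}


def apply_skin_tone(base_emoji: str, tone: str) -> str:
    """Append the modifier code point for `tone` to a single base emoji."""
    return base_emoji + SKIN_TONE_MODIFIERS[tone]


def find_skin_tones(text: str) -> List[str]:
    """Return the tone label of every modifier code point found in `text`."""
    by_char = {char: label for label, char in SKIN_TONE_MODIFIERS.items()}
    return [by_char[ch] for ch in text if ch in by_char]


waving_hand = "\U0001F44B"  # WAVING HAND SIGN
toned = apply_skin_tone(waving_hand, "type-4")

# The result is two code points that a supporting renderer shows as one glyph.
print([f"U+{ord(ch):04X}" for ch in toned])  # ['U+1F44B', 'U+1F3FD']
```

Because the modifier is a separate code point, skin tone use in a message can be detected by scanning for the five modifier characters, as `find_skin_tones` does.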

In this study the use of the skin tone emoji in a large global dataset of messages collected from Twitter is investigated. After characterizing the global distribution of skin tone emoji, a sentiment analysis is conducted. The correlation of skin tone emoji and sentiment may reflect demographic and economic realities but can also shed light on evolving attitudes towards skin color, race and ethnicity. Sentiment analysis, or the automatic extraction of opinions or emotions from text data, is an important topic in Natural Language Processing. Approaches in sentiment analysis range from lexicon-based frequency counts (the “bag-of-words” model) to the use of machine learning techniques based on the automatic extraction of features in multi-dimensional vector space or the use of neural networks (for an overview, see [2]). The approach adopted in this paper utilizes an existing emoji sentiment classification scale [3] to annotate tweets with sentiment.
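As a rough illustration of lexicon-based annotation, the sketch below assigns a tweet the mean score of the emoji it contains. The lexicon values are hypothetical placeholders rather than values from [3], and the averaging rule is an assumption made for this example; the text above does not specify how scores are aggregated per tweet.

```python
from typing import Dict, Optional

# Hypothetical scores in [-1, 1]. These are placeholders for illustration,
# NOT the values published in the emoji sentiment scale [3].
EMOJI_SENTIMENT: Dict[str, float] = {
    "\U0001F602": 0.25,    # FACE WITH TEARS OF JOY
    "\u2764": 0.75,        # HEAVY BLACK HEART
    "\U0001F62D": -0.125,  # LOUDLY CRYING FACE
}


def tweet_sentiment(text: str, lexicon: Dict[str, float]) -> Optional[float]:
    """Return the mean lexicon score of the emoji in `text`.

    Returns None when the text contains no emoji from the lexicon, so that
    unscored tweets can be told apart from neutral ones.
    """
    scores = [lexicon[ch] for ch in text if ch in lexicon]
    if not scores:
        return None
    return sum(scores) / len(scores)


print(tweet_sentiment("so good \U0001F602\u2764", EMOJI_SENTIMENT))  # 0.5
print(tweet_sentiment("plain text only", EMOJI_SENTIMENT))           # None
```

An advantage of this kind of approach for multilingual data is that it needs no language-specific resources: only the emoji in a message are scored.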

In the next section related work on emoji and skin tone emoji is described, as well as methods for sentiment analysis relevant to the present research. In Section 3, the collection and processing of a data set from the Twitter APIs and the tools and methods used to undertake the analysis are introduced. In Section 4, the results of two experiments are presented. In Section 5, the results are interpreted, a preliminary conclusion is reached, and an outlook for further investigation of skin tone emoji is offered.

Due to the newness of the phenomenon, analyses of skin tone emoji use are relatively few, but some research has investigated patterns of emoji usage in general. Emoticons, older ASCII-character sequences used to represent mainly facial expressions, have a longer history in Computer-mediated Communication (CMC), and have been subject to several analyses, including of their use on Twitter [4,5,6,7].

For emoji, Barbieri et al. [8] used vector space representations to compare the meanings of emoji in Twitter corpora of American English, British English, peninsular Spanish and Italian. They note that while the semantics of emoji across languages and varieties are relatively stable, some emoji are used quite differently in the corpora.

McGill [9] drew attention to the underrepresentation of lighter skin-tone emoji in the United States, and suggested that while the default yellow skin tone may be used by some as a stand-in for lighter skin tones, people of European descent in the United States may also be fearful of asserting their racial identity. Kralj-Novak et al. [3] engaged annotators to rate the sentiment of Twitter messages containing emoji in 13 languages. The derived sentiment values for individual emoji are utilized in Section 4 to assign sentiment to the data collected for this study.
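One simple way to derive per-emoji sentiment values from annotated messages is sketched below, assuming each message carries a negative, neutral, or positive label coded as -1, 0, or +1. The function name, the label coding, and the sample data are assumptions made for illustration; the exact procedure used in [3] is not described in the text above and may differ.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

# Assumed numeric coding of annotator labels.
LABEL_VALUES = {"negative": -1, "neutral": 0, "positive": 1}


def derive_emoji_scores(
    rated_tweets: Iterable[Tuple[str, str]], emoji_set: Set[str]
) -> Dict[str, float]:
    """Average the labels of all tweets in which each emoji occurs."""
    totals: Dict[str, int] = defaultdict(int)
    counts: Dict[str, int] = defaultdict(int)
    for text, label in rated_tweets:
        # Count each emoji once per tweet, however often it is repeated.
        for ch in set(text) & emoji_set:
            totals[ch] += LABEL_VALUES[label]
            counts[ch] += 1
    return {ch: totals[ch] / counts[ch] for ch in counts}


# Made-up sample data for demonstration only.
JOY, HEART = "\U0001F602", "\u2764"
sample = [
    ("great day " + JOY, "positive"),
    ("ugh " + JOY + JOY, "negative"),
    ("love this " + JOY + " " + HEART, "positive"),
]
scores = derive_emoji_scores(sample, {JOY, HEART})
print(scores[HEART])  # 1.0
```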

Ljubešić and Fišer [10] demonstrated that Twitter users who make use of emoji tend to be more active on the platform than non-emoji users, as well as have more followers and friends. They note that the “Emoji modifier Fitzpatrick type-1-2”, encoding light skin tone, is one of the most frequent emoji in their data set, comprising 2.3% of all emoji forms (85). In terms of geographic distribution, they note that clustering nations on the basis of emoji probability distributions results in a stratification of the skin tone emoji, with lighter skin tones among the most characteristic types in “first- and second-world” nations and darker skin tones more characteristic for the “fourth-world” cluster comprising mainly African nations (86-87).

Many sentiment analysis studies have utilized data from Twitter [11,12], and sentiment analysis of monolingual labelled data can typically attain high rates of precision and accuracy. Sentiment analysis of multilingual data, on the other hand, poses various problems: For some languages there are no existing resources such as sentiment lexicons or sentiment-labelled corpora with which supervised models could be trained. Where multilingual sentiment analysis has been undertaken, it often targets specific language pai

Reference

This content is AI-processed based on open access ArXiv data.
