EVOKE: Emotion Vocabulary Of Korean and English


This paper introduces EVOKE, a parallel dataset of emotion vocabulary in English and Korean. The dataset offers comprehensive coverage of emotion words in each language, in addition to many-to-many translations between words in the two languages and identification of language-specific emotion words. The dataset contains 1,427 Korean words and 1,399 English words, and we systematically annotate 819 Korean and 924 English adjectives and verbs. We also annotate multiple meanings of each word and their relationships, identifying polysemous emotion words and emotion-related metaphors. The dataset is, to our knowledge, the most comprehensive, systematic, and theory-agnostic dataset of emotion words in both Korean and English to date. It can serve as a practical tool for emotion science, psycholinguistics, computational linguistics, and natural language processing, allowing researchers to adopt different views on the resource reflecting their needs and theoretical perspectives. The dataset is publicly available at https://github.com/yoonwonj/EVOKE.


💡 Research Summary

The paper presents EVOKE, a bilingual parallel dataset of emotion vocabulary in Korean and English, designed to address longstanding gaps in existing emotion‑lexicon resources. The authors first compile candidate emotion words from prior English and Korean studies, arriving at 1,399 English and 1,427 Korean entries. They then establish many‑to‑many translation mappings using two Korean‑English bilingual dictionaries and monolingual definitions, ensuring that each word is linked to all plausible equivalents rather than forcing a one‑to‑one correspondence.
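A many‑to‑many mapping of this kind can be pictured as a bipartite graph with an index in each direction. The sketch below is only an illustration: the word pairs are toy examples, and the function names (`build_mappings`, `unmapped`) are hypothetical and do not reflect the file format or tooling of the EVOKE repository.

```python
from collections import defaultdict

# Toy translation pairs for illustration only; NOT taken from the EVOKE files.
PAIRS = [
    ("슬프다", "sad"),
    ("슬프다", "sorrowful"),
    ("서글프다", "sad"),
]


def build_mappings(pairs):
    """Index (korean, english) pairs in both directions.

    Each word maps to the set of all its listed equivalents, so no
    one-to-one correspondence is forced.
    """
    ko_to_en = defaultdict(set)
    en_to_ko = defaultdict(set)
    for ko, en in pairs:
        ko_to_en[ko].add(en)
        en_to_ko[en].add(ko)
    return dict(ko_to_en), dict(en_to_ko)


def unmapped(words, mapping):
    """Return the words with no equivalent in the other language."""
    return sorted(w for w in words if w not in mapping)


ko_to_en, en_to_ko = build_mappings(PAIRS)
print(ko_to_en["슬프다"])                      # all English equivalents of one Korean word
print(unmapped(["sad", "dismay"], en_to_ko))  # candidate language-specific words
```

Keeping both directions makes it cheap to ask which words on either side have no counterpart, which is one way to surface candidate language‑specific emotion words.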

Annotation focuses on adjectives and verbs, the parts of speech most directly involved in expressing affective states. Six native speakers (three Korean, three English) received a week of training and then spent ten weeks labeling the words. The annotation scheme consists of 14 binary questions grouped into four sections: (1) acceptability judgments in four prototypical sentence frames (“I feel X”, “They feel X”, “It feels X”, “I am X”), (2) follow‑up semantic experiencer judgments (subjectivity, evaluation, causation), (3) exclusion criteria (pure bodily sensation, behavioral expression, pure epistemic state), and (4) polysemy detection (additional meanings, domain distinctness, systematic relatedness). Annotators could also select “unsure”; these cases were discussed in weekly meetings, but consensus was not forced, so genuine ambiguity is preserved. A 10% overlap among annotators enabled inter‑rater reliability checks.
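One common way to run a reliability check on an overlapping subset is Cohen's κ, which corrects raw agreement between two annotators for the agreement expected by chance. The summary does not state which statistic the authors used, so the following is a minimal sketch under that assumption; the label vectors are invented for illustration.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labeling the same items.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement
    and p_e is the agreement expected by chance from each annotator's
    own label frequencies.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[k] * freq_b[k] for k in set(freq_a) | set(freq_b)) / (n * n)
    if expected == 1.0:
        # Both annotators used one identical label throughout.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Invented answers to one binary question on four shared items.
annotator_1 = [1, 1, 0, 0]
annotator_2 = [1, 0, 0, 0]
kappa = cohens_kappa(annotator_1, annotator_2)
print(kappa)
```

Because the function counts arbitrary labels, an “unsure” answer can be passed as a third category instead of being dropped, although whether that is appropriate depends on the analysis.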

The final dataset contains annotations for 819 Korean and 924 English adjectives/verbs. Approximately 12% of Korean and 15% of English entries were marked as polysemous. Translation mappings average 1.8 target equivalents per source word, illustrating the many‑to‑many nature of affective semantics (e.g., English “dismay” lacks a direct Korean counterpart, while Korean “슬프다” maps to English “sad,” “sorrowful,” and “depressed”).

EVOKE’s contributions are threefold. First, it offers the most comprehensive lexical coverage of emotion words for the two languages to date, including rare or culture‑specific terms such as Korean 정 (chong) and 한 (han). Second, its theory‑agnostic annotation scheme lets researchers filter the data according to any preferred definition of “emotion word,” supporting diverse theoretical frameworks. Third, the explicit representation of polysemy and metaphorical extensions provides a foundation for cross‑linguistic studies of affective semantics, enabling quantitative analyses of how emotions are metaphorically extended across cultures.
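The idea of taking different theoretical “views” on the same annotations can be sketched as predicates over per‑word answers. Everything in this example is assumed for illustration: the field names, the two views, and the toy answers are hypothetical and are not the repository's schema or actual annotations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """Hypothetical record holding a few binary answers for one word."""
    word: str
    i_feel: bool             # acceptable in "I feel X"
    i_am: bool               # acceptable in "I am X"
    bodily_sensation: bool   # flagged as pure bodily sensation
    behavioral: bool         # flagged as behavioral expression


def strict_view(entry):
    """Narrow definition: felt states only, with exclusions applied."""
    return entry.i_feel and not entry.bodily_sensation and not entry.behavioral


def broad_view(entry):
    """Permissive definition: acceptable in either sentence frame."""
    return entry.i_feel or entry.i_am


def select(entries, predicate):
    """Return the words that satisfy a given theoretical view."""
    return [e.word for e in entries if predicate(e)]


# Invented answers, for demonstration only.
ENTRIES = [
    Entry("sad", True, True, False, False),
    Entry("hungry", True, True, True, False),
    Entry("weeping", False, True, False, True),
]

print(select(ENTRIES, strict_view))
print(select(ENTRIES, broad_view))
```

The point of the design is that the resource itself stays neutral: a narrower or broader notion of “emotion word” becomes a different predicate over the same records, not a different dataset.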

Potential applications span emotion science, psycholinguistics, computational linguistics, and natural language processing. Researchers can use EVOKE to build or refine sentiment analysis tools, train multilingual emotion‑recognition models, or conduct comparative cultural studies of affective vocabularies. The dataset is publicly released on GitHub (https://github.com/yoonwonj/EVOKE), which supports reproducibility and invites extensions.

The authors acknowledge limitations: the annotation team is small, which may affect coverage of dialectal or subcultural variations; the “unsure” label remains in the released data, requiring downstream users to decide how to handle it; and the resource is currently limited to Korean and English. Future work includes expanding to additional languages, increasing annotator diversity, and providing richer reliability metrics (e.g., Cohen’s κ) to further validate the annotations. Despite these constraints, EVOKE represents a significant step toward a unified, richly annotated, cross‑linguistic emotion lexicon.

