A Uniform Approach to Analogies, Synonyms, Antonyms, and Associations

📝 Original Info

  • Title: A Uniform Approach to Analogies, Synonyms, Antonyms, and Associations
  • ArXiv ID: 0809.0124
  • Date: 2008-09-02
  • Authors: Peter D. Turney (National Research Council of Canada, Institute for Information Technology)

📝 Abstract

Recognizing analogies, synonyms, antonyms, and associations appear to be four distinct tasks, requiring distinct NLP algorithms. In the past, the four tasks have been treated independently, using a wide variety of algorithms. These four semantic classes, however, are a tiny sample of the full range of semantic phenomena, and we cannot afford to create ad hoc algorithms for each semantic phenomenon; we need to seek a unified approach. We propose to subsume a broad range of phenomena under analogies. To limit the scope of this paper, we restrict our attention to the subsumption of synonyms, antonyms, and associations. We introduce a supervised corpus-based machine learning algorithm for classifying analogous word pairs, and we show that it can solve multiple-choice SAT analogy questions, TOEFL synonym questions, ESL synonym-antonym questions, and similar-associated-both questions from cognitive psychology.

📄 Full Content

arXiv:0809.0124v1 [cs.CL] 31 Aug 2008

A Uniform Approach to Analogies, Synonyms, Antonyms, and Associations

Peter D. Turney
National Research Council of Canada
Institute for Information Technology
M50 Montreal Road
Ottawa, Ontario, Canada K1A 0R6
peter.turney@nrc-cnrc.gc.ca

Abstract

Recognizing analogies, synonyms, antonyms, and associations appear to be four distinct tasks, requiring distinct NLP algorithms. In the past, the four tasks have been treated independently, using a wide variety of algorithms. These four semantic classes, however, are a tiny sample of the full range of semantic phenomena, and we cannot afford to create ad hoc algorithms for each semantic phenomenon; we need to seek a unified approach. We propose to subsume a broad range of phenomena under analogies. To limit the scope of this paper, we restrict our attention to the subsumption of synonyms, antonyms, and associations. We introduce a supervised corpus-based machine learning algorithm for classifying analogous word pairs, and we show that it can solve multiple-choice SAT analogy questions, TOEFL synonym questions, ESL synonym-antonym questions, and similar-associated-both questions from cognitive psychology.

1 Introduction

A pair of words (petrify:stone) is analogous to another pair (vaporize:gas) when the semantic relations between the words in the first pair are highly similar to the relations in the second pair. Two words (levied and imposed) are synonymous in a context (levied a tax) when they can be interchanged (imposed a tax); they are antonymous when they have opposite meanings (black and white), and they are associated when they tend to co-occur (doctor and hospital).

On the surface, it appears that these are four distinct semantic classes, requiring distinct NLP algorithms, but we propose a uniform approach to all four. We subsume synonyms, antonyms, and associations under analogies.
In essence, we say that X and Y are antonyms when the pair X:Y is analogous to the pair black:white, X and Y are synonyms when they are analogous to the pair levied:imposed, and X and Y are associated when they are analogous to the pair doctor:hospital.

There is past work on recognizing analogies (Reitman, 1965), synonyms (Landauer and Dumais, 1997), antonyms (Lin et al., 2003), and associations (Lesk, 1969), but each of these four tasks has been examined separately, in isolation from the others. As far as we know, the algorithm proposed here is the first attempt to deal with all four tasks using a uniform approach. We believe that it is important to seek NLP algorithms that can handle a broad range of semantic phenomena, because developing a specialized algorithm for each phenomenon is a very inefficient research strategy.

It might seem that a lexicon, such as WordNet (Fellbaum, 1998), contains all the information we need to handle these four tasks. However, we prefer to take a corpus-based approach to semantics. Veale (2004) used WordNet to answer 374 multiple-choice SAT analogy questions, achieving an accuracy of 43%, but the best corpus-based approach attains an accuracy of 56% (Turney, 2006). Another reason to prefer a corpus-based approach to a lexicon-based approach is that the former requires less human labour, and thus it is easier to extend to other languages.

In Section 2, we describe our algorithm for recognizing analogies. We use a standard supervised machine learning approach, with feature vectors based on the frequencies of patterns in a large corpus. We use a support vector machine (SVM) to learn how to classify the feature vectors (Platt, 1998; Witten and Frank, 1999). Section 3 presents four sets of experiments.
We apply our algorithm for recognizing analogies to multiple-choice analogy questions from the SAT college entrance test, multiple-choice synonym questions from the TOEFL (test of English as a foreign language), ESL (English as a second language) practice questions for distinguishing synonyms and antonyms, and a set of word pairs that are labeled similar, associated, and both, developed for experiments in cognitive psychology.

We discuss the results of the experiments in Section 4. The accuracy of the algorithm is competitive with other systems, but the strength of the algorithm is that it is able to handle all four tasks, with no tuning of the learning parameters to the particular task. It performs well, although it is competing against specialized algorithms, developed for single tasks.

Related work is examined in Section 5 and limitations and future work are considered in Section 6. We conclude in Section 7.

2 Classifying Analogous Word Pairs

An analogy, A:B::C:D, asserts that A is to B as C is to D; for example, traffic:street::water:riverbed asserts that traffic is to street as water is to riverbed; that is, the semantic relations between traffic and street are highly similar to the semantic relations between water and riverbed. We may view the task of recogn

…(Full text truncated)…
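
The introduction's central idea, that a pair X:Y is labeled an antonym, synonym, or association according to which reference pair (black:white, levied:imposed, doctor:hospital) it is analogous to, can be illustrated with a small Python sketch. This is not the paper's implementation: the toy corpus, the pattern scheme (the words joining X to Y in a sentence), and all function names are assumptions made for illustration, and a cosine nearest-neighbour comparison stands in for the SVM that the paper trains on pattern-frequency feature vectors from a large corpus.

```python
from collections import Counter
import math

# Hypothetical toy corpus; the paper relies on a large corpus instead.
CORPUS = [
    "the choice is either black or white",
    "neither black nor white",
    "taxes were levied or imposed by the state",
    "the fee was levied that is imposed",
    "the doctor works at the hospital",
    "a doctor in the hospital",
    "the water is either hot or cold",
    "neither hot nor cold",
]

# Reference pairs taken from the introduction, one per semantic class.
SEEDS = {
    ("black", "white"): "antonym",
    ("levied", "imposed"): "synonym",
    ("doctor", "hospital"): "associated",
}


def pattern_counts(x, y, corpus):
    """Count the word sequences joining x to y, written as 'X ... Y' patterns.

    Assumed pattern scheme for illustration only.
    """
    counts = Counter()
    for sentence in corpus:
        tokens = sentence.split()
        if x in tokens and y in tokens:
            i, j = tokens.index(x), tokens.index(y)
            if i < j:
                counts["X " + " ".join(tokens[i + 1:j]) + " Y"] += 1
    return counts


def cosine(a, b):
    """Cosine similarity between two sparse pattern-frequency vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def classify(x, y, corpus=CORPUS):
    """Label x:y with the class of the most similar seed pair.

    Nearest-neighbour stand-in for the paper's SVM classifier.
    """
    target = pattern_counts(x, y, corpus)
    best = max(SEEDS, key=lambda pair: cosine(target, pattern_counts(*pair, corpus)))
    return SEEDS[best]


print(classify("hot", "cold"))
```

In this toy setting, hot:cold shares the joining patterns "X or Y" and "X nor Y" with black:white, so it receives that pair's label. With a realistic corpus, one would expect many more patterns and a trained classifier in place of the single-seed comparison.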

Reference

This content is AI-processed based on ArXiv data.
