📝 Original Info
- Title: A Uniform Approach to Analogies, Synonyms, Antonyms, and Associations
- ArXiv ID: 0809.0124
- Date: 2008-09-02
- Authors: Peter D. Turney (National Research Council of Canada, Institute for Information Technology)
📝 Abstract
Recognizing analogies, synonyms, antonyms, and associations appear to be four distinct tasks, requiring distinct NLP algorithms. In the past, the four tasks have been treated independently, using a wide variety of algorithms. These four semantic classes, however, are a tiny sample of the full range of semantic phenomena, and we cannot afford to create ad hoc algorithms for each semantic phenomenon; we need to seek a unified approach. We propose to subsume a broad range of phenomena under analogies. To limit the scope of this paper, we restrict our attention to the subsumption of synonyms, antonyms, and associations. We introduce a supervised corpus-based machine learning algorithm for classifying analogous word pairs, and we show that it can solve multiple-choice SAT analogy questions, TOEFL synonym questions, ESL synonym-antonym questions, and similar-associated-both questions from cognitive psychology.
📄 Full Content
arXiv:0809.0124v1 [cs.CL] 31 Aug 2008
A Uniform Approach to Analogies, Synonyms, Antonyms, and Associations
Peter D. Turney
National Research Council of Canada
Institute for Information Technology
M50 Montreal Road
Ottawa, Ontario, Canada
K1A 0R6
peter.turney@nrc-cnrc.gc.ca
Abstract
Recognizing analogies, synonyms, antonyms, and associations appear to be four distinct tasks, requiring distinct NLP algorithms. In the past, the four tasks have been treated independently, using a wide variety of algorithms. These four semantic classes, however, are a tiny sample of the full range of semantic phenomena, and we cannot afford to create ad hoc algorithms for each semantic phenomenon; we need to seek a unified approach. We propose to subsume a broad range of phenomena under analogies. To limit the scope of this paper, we restrict our attention to the subsumption of synonyms, antonyms, and associations. We introduce a supervised corpus-based machine learning algorithm for classifying analogous word pairs, and we show that it can solve multiple-choice SAT analogy questions, TOEFL synonym questions, ESL synonym-antonym questions, and similar-associated-both questions from cognitive psychology.
1 Introduction
A pair of words (petrify:stone) is analogous to another pair (vaporize:gas) when the semantic relations between the words in the first pair are highly similar to the relations in the second pair. Two words (levied and imposed) are synonymous in a context (levied a tax) when they can be interchanged (imposed a tax), they are antonymous when they have opposite meanings (black and white), and they are associated when they tend to co-occur (doctor and hospital).
On the surface, it appears that these are four distinct semantic classes, requiring distinct NLP algorithms, but we propose a uniform approach to all four. We subsume synonyms, antonyms, and associations under analogies. In essence, we say that X and Y are antonyms when the pair X:Y is analogous to the pair black:white, X and Y are synonyms when they are analogous to the pair levied:imposed, and X and Y are associated when they are analogous to the pair doctor:hospital.
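The framing above can be illustrated with a small sketch: a word pair receives the label of the seed pair it is most analogous to. This is only an illustration of the idea, not the paper's method (the paper trains a supervised classifier). The cosine measure, the function names, and the three-dimensional vectors with made-up numbers are all assumptions introduced here.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

# Hypothetical relation vectors for the three seed pairs named in the text.
# The numbers are invented for illustration; they are not corpus statistics.
SEEDS = {
    "antonym": [5.0, 0.0, 1.0],     # stands in for black:white
    "synonym": [0.0, 4.0, 1.0],     # stands in for levied:imposed
    "associated": [1.0, 0.0, 6.0],  # stands in for doctor:hospital
}

def label_by_analogy(pair_vector, seeds=SEEDS):
    """Return the label of the seed pair whose vector is most similar."""
    return max(seeds, key=lambda label: cosine(pair_vector, seeds[label]))

# A pair whose (made-up) vector points in roughly the same direction as black:white.
print(label_by_analogy([4.0, 1.0, 0.0]))
```

Under this reading, the three tasks differ only in which seed pairs supply the labels, which is what makes a single algorithm plausible for all of them.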
There is past work on recognizing analogies (Reitman, 1965), synonyms (Landauer and Dumais, 1997), antonyms (Lin et al., 2003), and associations (Lesk, 1969), but each of these four tasks has been examined separately, in isolation from the others. As far as we know, the algorithm proposed here is the first attempt to deal with all four tasks using a uniform approach. We believe that it is important to seek NLP algorithms that can handle a broad range of semantic phenomena, because developing a specialized algorithm for each phenomenon is a very inefficient research strategy.
It might seem that a lexicon, such as WordNet (Fellbaum, 1998), contains all the information we need to handle these four tasks. However, we prefer to take a corpus-based approach to semantics. Veale (2004) used WordNet to answer 374 multiple-choice SAT analogy questions, achieving an accuracy of 43%, but the best corpus-based approach attains an accuracy of 56% (Turney, 2006). Another reason to prefer a corpus-based approach to a lexicon-based approach is that the former requires less human labour, and thus it is easier to extend to other languages.
In Section 2, we describe our algorithm for recognizing analogies. We use a standard supervised machine learning approach, with feature vectors based on the frequencies of patterns in a large corpus. We use a support vector machine (SVM) to learn how to classify the feature vectors (Platt, 1998; Witten and Frank, 1999).
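As a rough sketch of what "feature vectors based on the frequencies of patterns" might look like, the following Python builds one vector per word pair from a toy corpus. The pattern definition (the words found between X and Y, up to `max_gap` of them), the function names, and the four-sentence corpus are assumptions made for illustration; the paper's actual pattern construction is described in its Section 2. The SVM training step is omitted here.

```python
import re
from collections import Counter

def pattern_counts(corpus, x, y, max_gap=4):
    """Count the word sequences that link x to y within each sentence.

    A pattern is the text between x and y with the two words replaced by
    the placeholders X and Y (an assumed, simplified pattern definition).
    """
    counts = Counter()
    for sentence in corpus:
        tokens = re.findall(r"[a-z']+", sentence.lower())
        for i, token in enumerate(tokens):
            if token != x:
                continue
            # Search for y with at most max_gap intervening words.
            for j in range(i + 1, min(i + max_gap + 2, len(tokens))):
                if tokens[j] == y:
                    counts[" ".join(["X"] + tokens[i + 1:j] + ["Y"])] += 1
                    break
    return counts

def feature_vector(counts, patterns):
    """Frequencies of the given patterns, in a fixed order."""
    return [counts[p] for p in patterns]

# Toy corpus, invented for this example.
TOY_CORPUS = [
    "The doctor works in the hospital.",
    "A teacher works in the school.",
    "Black is the opposite of white.",
    "Hot is the opposite of cold.",
]

pairs = [("doctor", "hospital"), ("teacher", "school"),
         ("black", "white"), ("hot", "cold")]
all_counts = {pair: pattern_counts(TOY_CORPUS, *pair) for pair in pairs}
patterns = sorted(set().union(*all_counts.values()))
vectors = {pair: feature_vector(c, patterns) for pair, c in all_counts.items()}

for pair, vector in vectors.items():
    print(pair, vector)
```

Pairs that stand in the same relation end up with the same vector in this toy setting, which is the property a classifier such as an SVM would exploit when trained on labeled pairs.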
Section 3 presents four sets of experiments. We apply our algorithm for recognizing analogies to multiple-choice analogy questions from the SAT college entrance test, multiple-choice synonym questions from the TOEFL (test of English as a foreign language), ESL (English as a second language) practice questions for distinguishing synonyms and antonyms, and a set of word pairs that are labeled similar, associated, and both, developed for experiments in cognitive psychology.
We discuss the results of the experiments in Section 4. The accuracy of the algorithm is competitive with other systems, but the strength of the algorithm is that it is able to handle all four tasks, with no tuning of the learning parameters to the particular task. It performs well, although it is competing against specialized algorithms, developed for single tasks.
Related work is examined in Section 5 and limitations and future work are considered in Section 6. We conclude in Section 7.
2 Classifying Analogous Word Pairs
An analogy, A:B::C:D, asserts that A is to B as C is to D; for example, traffic:street::water:riverbed asserts that traffic is to street as water is to riverbed; that is, the semantic relations between traffic and street are highly similar to the semantic relations between water and riverbed. We may view the task of recogn
…(Full text truncated)…