📝 Original Info
- Title: Adaptively Learning the Crowd Kernel
- ArXiv ID: 1105.1033
- Date: 2013-07-19
- Authors: Omer Tamuz (Microsoft Research New England, Weizmann Institute of Science); Ce Liu (Microsoft Research New England); Serge Belongie (University of California, San Diego); Ohad Shamir (Microsoft Research New England); Adam Tauman Kalai (Microsoft Research New England)
📝 Abstract
We introduce an algorithm that, given n objects, learns a similarity matrix over all n^2 pairs, from crowdsourced data alone. The algorithm samples responses to adaptively chosen triplet-based relative-similarity queries. Each query has the form "is object 'a' more similar to 'b' or to 'c'?" and is chosen to be maximally informative given the preceding responses. The output is an embedding of the objects into Euclidean space (like MDS); we refer to this as the "crowd kernel." SVMs reveal that the crowd kernel captures prominent and subtle features across a number of domains, such as "is striped" among neckties and "vowel vs. consonant" among letters.
📄 Full Content
Adaptively Learning the Crowd Kernel
Omer Tamuz
omertamuz@weizmann.ac.il
Microsoft Research New England and Weizmann Institute of Science
Ce Liu
celiu@microsoft.com
Microsoft Research New England
Serge Belongie
sjb@cs.ucsd.edu
UC San Diego
Ohad Shamir
ohadsh@microsoft.com
Adam Tauman Kalai
adum@microsoft.com
Microsoft Research New England
Abstract
We introduce an algorithm that, given n objects, learns a similarity matrix over all n^2 pairs, from crowdsourced data alone. The algorithm samples responses to adaptively chosen triplet-based relative-similarity queries. Each query has the form “is object a more similar to b or to c?” and is chosen to be maximally informative given the preceding responses. The output is an embedding of the objects into Euclidean space (like MDS); we refer to this as the “crowd kernel.” SVMs reveal that the crowd kernel captures prominent and subtle features across a number of domains, such as “is striped” among neckties and “vowel vs. consonant” among letters.
1. Introduction
Essential to the success of machine learning on a new domain is determining a good “similarity function” between objects (or alternatively defining good object “features”). With such a “kernel,” one can perform a number of interesting tasks, e.g. binary classification using Support Vector Machines, clustering, interactive database search, or any of a number of other off-the-shelf kernelized applications. Because determining a kernel is most often the step that is not yet routinized, effective systems for achieving this step are desirable, as they hold the potential for completely removing the machine learning researcher from “the loop.” Such systems could allow practitioners with no machine learning expertise to employ learning on their domain. In many domains, people have a good sense of what similarity is, and in these cases the similarity function may be determined based upon crowdsourced human responses alone.
Appearing in Proceedings of the 28th International Conference on Machine Learning, Bellevue, WA, USA, 2011. Copyright 2011 by the author(s)/owner(s).
The problem of capturing and extrapolating a human notion of perceptual similarity has received increasing attention in recent years, including areas such as vision (Agarwal et al., 2007), audition (McFee & Lanckriet, 2009), information retrieval (Schultz & Joachims, 2003) and a variety of others represented in the UCI Datasets (Xing et al., 2003; Huang et al., 2010). Concretely, the goal of these approaches is to estimate a similarity matrix K over all pairs of n objects given a (potentially exhaustive) subset of human perceptual measurements on tuples of objects. In some cases the set of human measurements represents ‘side information’ to computed descriptors (MFCC, SIFT, etc.), while in other cases, the present work included, one proceeds exclusively with human-reported data. When K is a positive semidefinite matrix induced purely from distributed human measurements, we refer to it as the crowd kernel for the set of objects.
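To make the positive semidefinite requirement on K concrete, the following minimal sketch builds K from an embedding as a matrix of inner products. The coordinates and the helper names `gram_matrix`, `squared_distance` and `quadratic_form` are illustrative assumptions and do not come from the paper. Any K formed this way is positive semidefinite, and squared Euclidean distances between objects can be read off from its entries.

```python
def gram_matrix(embedding):
    """Return K with K[i][j] = <x_i, x_j> for a list of coordinate tuples."""
    return [[sum(u * v for u, v in zip(x, y)) for y in embedding]
            for x in embedding]


def squared_distance(K, i, j):
    """Squared Euclidean distance between objects i and j, using K alone."""
    return K[i][i] + K[j][j] - 2 * K[i][j]


def quadratic_form(K, z):
    """Compute z^T K z, which is non-negative for every z when K is PSD."""
    n = len(K)
    return sum(z[i] * K[i][j] * z[j] for i in range(n) for j in range(n))


# Hypothetical 2-D embedding of three objects (made-up coordinates).
embedding = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]
K = gram_matrix(embedding)

print(squared_distance(K, 1, 2))           # distance between objects 1 and 2
print(quadratic_form(K, [1.0, -3.0, 2.0]))  # never negative for a Gram matrix
```

Working with K rather than raw coordinates is what lets off-the-shelf kernelized methods such as SVMs use the result directly.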
arXiv:1105.1033v2 [cs.LG] 25 Jun 2011
Figure 1. A sample top-level of a similarity search system that enables a user to search for objects by similarity. In this case, since the user clicked on the middle-left tile, she will “zoom-in” and be presented with similar tiles.
Given such a kernel, one can exploit it for a variety of purposes including exploratory data analysis or embedding visualization (as in Multidimensional Scaling) and relevance-feedback based interactive search. As discussed in the above works and (Kendall & Gibbons, 1990), using a triplet-based representation of relative similarity, in which a subject is asked “is object a more similar to b or to c,” has a number of desirable properties over the classical approach employed in Multi-Dimensional Scaling (MDS), i.e., asking for a numerical estimate of “how similar is object a to b.” These advantages include reducing fatigue on human subjects and alleviating the need to reconcile individuals’ scales of similarity. The obvious drawback with the triplet-based method, however, is the potential O(n^3) complexity. It is therefore expedient to seek methods of obtaining high quality approximations of K from as small a subset of human measurements as possible. Accordingly, the primary contribution of this paper is an efficient method for estimating K via an information theoretic adaptive sampling approach.
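The cubic growth mentioned above can be checked by counting the distinct triplet queries directly. The sketch below assumes a query is determined by a reference object a and an unordered pair {b, c} of other objects. The function name `num_triplet_queries` is made up for illustration.

```python
from math import comb


def num_triplet_queries(n):
    """Count distinct queries 'is a more similar to b or to c?' on n objects.

    There are n choices for the reference object a and C(n - 1, 2) unordered
    pairs {b, c} among the remaining objects, so the count grows like n^3 / 2.
    """
    return n * comb(n - 1, 2)


for n in (10, 100, 1000):
    print(n, num_triplet_queries(n))
```

Even at n = 1000 the exhaustive set runs to hundreds of millions of questions, which is why asking only a small, well-chosen subset matters.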
At the heart of our approach is a new scale-invariant kernel approximation model. The choice of model is shown to be crucial in terms of the adaptive triples that are produced, and the new model produces effective triples to label. Although this model is nonconvex, we prove that it can be optimized under certain assumptions.
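This excerpt does not give the model's formula, so the sketch below only illustrates how a response model and the choice of adaptive triples interact. It assumes a ratio-of-squared-distances response model, which is scale-invariant when the smoothing parameter `mu` is zero, and scores a candidate query by the mutual information between the answer and the unknown position of object a. The function names, the `mu` parameter, the two-point prior and all coordinates are hypothetical, and this is not presented as the paper's procedure.

```python
import math


def sq_dist(x, y):
    """Squared Euclidean distance between two coordinate tuples."""
    return sum((u - v) ** 2 for u, v in zip(x, y))


def prob_b_over_c(a, b, c, mu=0.0):
    """Assumed model for P(a is judged more similar to b than to c).

    With mu == 0 the value depends only on a ratio of squared distances, so
    rescaling every coordinate by the same nonzero constant leaves it
    unchanged. Requires mu > 0 or a distinct from at least one of b, c.
    """
    d_ab, d_ac = sq_dist(a, b), sq_dist(a, c)
    return (mu + d_ac) / (2 * mu + d_ab + d_ac)


def binary_entropy(p):
    """Entropy in bits of a yes/no answer that is 'b' with probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def information_gain(prior, candidates_a, b, c, mu=0.0):
    """Mutual information (bits) between the answer and a's position."""
    probs = [prob_b_over_c(a, b, c, mu) for a in candidates_a]
    p_mean = sum(w * p for w, p in zip(prior, probs))
    expected = sum(w * binary_entropy(p) for w, p in zip(prior, probs))
    return binary_entropy(p_mean) - expected


# Two equally likely hypothetical positions for object a.
candidates_a = [(0.0, 0.0), (4.0, 0.0)]
prior = [0.5, 0.5]

# Query 1 separates the two hypotheses; query 2 cannot tell them apart.
gain_1 = information_gain(prior, candidates_a, (0.0, 0.0), (4.0, 0.0))
gain_2 = information_gain(prior, candidates_a, (2.0, 1.0), (2.0, -1.0))
print(gain_1, gain_2)
```

Under these assumptions an adaptive scheme would ask query 1, since its answer resolves the uncertainty about a while query 2's answer carries no information.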
We construct an end-to-end system for interactive visual search and browsing using our kernel acquisition algorithm. The input to this system is a set of images