arXiv:0807.1997v4 [cs.LG] 13 May 2009
Multi-Instance Learning by Treating Instances As Non-I.I.D. Samples
Zhi-Hua Zhou, Yu-Yin Sun, and Yu-Feng Li
National Key Laboratory for Novel Software Technology,
Nanjing University, Nanjing 210093, China
{zhouzh, sunyy, liyf}@lamda.nju.edu.cn
Abstract. Previous studies on multi-instance learning typically treated
instances in the bags as independently and identically distributed. The
instances in a bag, however, are rarely independent in real tasks, and
a better performance can be expected if the instances are treated in
a non-i.i.d. way that exploits relations among instances. In this paper,
we propose two simple yet effective methods. In the first method, we
explicitly map every bag to an undirected graph and design a graph
kernel for distinguishing the positive and negative bags. In the second
method, we implicitly construct graphs by deriving affinity matrices and
propose an efficient graph kernel considering the clique information. The
effectiveness of the proposed methods is validated by experiments.
1 Introduction
In multi-instance learning [11], each training example is a bag of instances. A
bag is positive if it contains at least one positive instance, and negative other-
wise. Although the labels of the training bags are known, the labels
of the instances in the bags are unknown. The goal is to construct a learner to
classify unseen bags. Multi-instance learning has been found useful in diverse
domains such as image categorization [6, 7], image retrieval [35], text catego-
rization [2, 24], computer security [22], face detection [27, 32], computer-aided
medical diagnosis [12], etc.
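The standard multi-instance assumption above can be sketched in a few lines of Python. Here `instance_is_positive` is a hypothetical instance-level oracle used only for illustration; in real multi-instance tasks the instance labels are unknown and only the bag labels are observed:

```python
# A minimal sketch of the standard multi-instance assumption:
# a bag is positive iff at least one of its instances is positive.

def bag_label(bag, instance_is_positive):
    """Return True (positive) if any instance in the bag is positive."""
    return any(instance_is_positive(x) for x in bag)

# Toy 1-D instances: call an instance "positive" if it exceeds 0.5.
pos_bag = [0.1, 0.7, 0.3]   # one positive instance -> positive bag
neg_bag = [0.1, 0.2, 0.3]   # no positive instance  -> negative bag
assert bag_label(pos_bag, lambda x: x > 0.5) is True
assert bag_label(neg_bag, lambda x: x > 0.5) is False
```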
A prominent advantage of multi-instance learning mainly lies in the fact that
many real objects have inherent structures, and by adopting the multi-instance
representation we are able to represent such objects more naturally and capture
more information than simply using the flat single-instance representation. For
example, suppose we can partition an image into several parts. In contrast to
representing the whole image as a single instance, if we represent each part as
an instance, then the partition information is captured by the multi-instance
representation; and if the partition is meaningful (e.g., each part corresponds to
a region of saliency), the additional information captured by the multi-instance
representation may be helpful to make the learning task easier to deal with.
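As an illustrative sketch only (not the paper's actual pipeline), an image could be partitioned into a grid of blocks, with each block summarized by simple statistics, so that the image becomes a bag of instances; the grid partition and the mean/variance features below are assumptions made for the example:

```python
# Illustrative sketch: turn an image into a bag of per-part instances.
import numpy as np

def image_to_bag(image, grid=2):
    """Split a 2-D array into grid x grid blocks; one feature vector per block."""
    h, w = image.shape
    bh, bw = h // grid, w // grid
    bag = []
    for i in range(grid):
        for j in range(grid):
            block = image[i * bh:(i + 1) * bh, j * bw:(j + 1) * bw]
            bag.append([block.mean(), block.var()])  # simple per-part features
    return np.array(bag)

img = np.arange(16, dtype=float).reshape(4, 4)
bag = image_to_bag(img, grid=2)
print(bag.shape)  # (4, 2): four parts, two features each
```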
It is obviously not a good idea to apply multi-instance learning techniques
everywhere: if the single-instance representation is sufficient, using a multi-instance
representation just gilds the lily. Even on tasks where the objects have
inherent structures, we should keep in mind that the power of multi-instance
representation lies in its ability to capture structural information. However,
as Zhou and Xu [36] indicated, previous studies on multi-instance learning
typically treated the instances in the bags as independently and identically dis-
tributed; this neglects the fact that the relations among the instances convey im-
portant structure information. Considering the above image task again, treating
the different image parts as inter-correlated samples is evidently more meaning-
ful than treating them as unrelated samples. Actually, the instances in a bag are
rarely independent, and a better performance can be expected if the instances
are treated in a non-i.i.d. way that exploits the relations among instances.
In this paper, we propose two multi-instance learning methods which do not
treat the instances as i.i.d. samples. Our basic idea is to regard each bag as
an entity to be processed as a whole, and regard instances as inter-correlated
components of the entity. Experiments show that our proposed methods achieve
performances highly competitive with state-of-the-art multi-instance learning
methods.
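The basic idea of regarding each bag as a graph can be sketched under simple assumptions: nodes are instances, and an edge links two instances whose distance falls below a threshold. The toy "kernel" below only compares a mean node feature and the edge density and is purely illustrative; the paper's actual graph kernels are richer and the threshold `tau` is an assumed parameter:

```python
# A minimal sketch: a bag becomes a graph via a thresholded affinity matrix,
# and two bags are compared through simple node/edge summary statistics.
import numpy as np

def bag_to_affinity(bag, tau=1.0):
    """Adjacency matrix: edge (i, j) iff ||x_i - x_j|| < tau, no self-loops."""
    d = np.linalg.norm(bag[:, None, :] - bag[None, :, :], axis=-1)
    a = (d < tau).astype(float)
    np.fill_diagonal(a, 0.0)
    return a

def toy_graph_kernel(bag1, bag2, tau=1.0):
    """Gaussian similarity of mean node features times that of edge densities."""
    a1, a2 = bag_to_affinity(bag1, tau), bag_to_affinity(bag2, tau)
    node_sim = np.exp(-np.linalg.norm(bag1.mean(0) - bag2.mean(0)) ** 2)
    edge_sim = np.exp(-(a1.mean() - a2.mean()) ** 2)
    return node_sim * edge_sim

b1 = np.array([[0.0, 0.0], [0.5, 0.0], [3.0, 3.0]])
b2 = np.array([[0.1, 0.0], [0.6, 0.1], [3.1, 2.9]])
print(toy_graph_kernel(b1, b2))
```

Both factors are Gaussian kernels on summary statistics, so their product stays positive semi-definite; similar bags with similar connectivity score close to 1.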
The rest of this paper is organized as follows. We briefly review related work
in Section 2, propose the new methods in Section 3, report on our experiments
in Section 4, and conclude the paper in Section 5.
2 Related Work
Many multi-instance learning methods have been developed during the past
decade. These include Diverse Density [16], the k-nearest neighbor algorithm
Citation-kNN [29], the decision trees RELIC [22] and MITI [4], the neural
networks BP-MIP and RBF-MIP [33], the rule learning algorithm RIPPER-MI [9],
the ensemble algorithms MIBoosting [31] and MILBoosting [3], and the logistic
regression algorithm MI-LR [20].
Kernel methods for multi-instance learning have been studied by many
researchers. Gärtner et al. [14] defined the MI-Kernel by regarding each bag as
a set of feature vectors and applying a set kernel directly. Andrews et al.
[2] proposed mi-SVM and MI-SVM. mi-SVM tries to identify a maximal-margin
hyperplane for the instances, subject to the constraints that at least
one instance of each positive bag lies in the positive half-space while all
instances of negative bags
…(Full text truncated)…