Multi-Instance Learning by Treating Instances As Non-I.I.D. Samples

Reading time: 6 minutes

📝 Original Info

  • Title: Multi-Instance Learning by Treating Instances As Non-I.I.D. Samples
  • ArXiv ID: 0807.1997
  • Date: 2009-05-13
  • Authors: Zhi-Hua Zhou, Yu-Yin Sun, Yu-Feng Li

📝 Abstract

Previous studies on multi-instance learning typically treated instances in the bags as independently and identically distributed. The instances in a bag, however, are rarely independent in real tasks, and a better performance can be expected if the instances are treated in a non-i.i.d. way that exploits relations among instances. In this paper, we propose two simple yet effective methods. In the first method, we explicitly map every bag to an undirected graph and design a graph kernel for distinguishing the positive and negative bags. In the second method, we implicitly construct graphs by deriving affinity matrices and propose an efficient graph kernel considering the clique information. The effectiveness of the proposed methods is validated by experiments.


📄 Full Content

arXiv:0807.1997v4 [cs.LG] 13 May 2009

Multi-Instance Learning by Treating Instances As Non-I.I.D. Samples

Zhi-Hua Zhou, Yu-Yin Sun, and Yu-Feng Li
National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210093, China
{zhouzh, sunyy, liyf}@lamda.nju.edu.cn

Abstract. Previous studies on multi-instance learning typically treated instances in the bags as independently and identically distributed. The instances in a bag, however, are rarely independent in real tasks, and a better performance can be expected if the instances are treated in a non-i.i.d. way that exploits relations among instances. In this paper, we propose two simple yet effective methods. In the first method, we explicitly map every bag to an undirected graph and design a graph kernel for distinguishing the positive and negative bags. In the second method, we implicitly construct graphs by deriving affinity matrices and propose an efficient graph kernel considering the clique information. The effectiveness of the proposed methods is validated by experiments.

1 Introduction

In multi-instance learning [11], each training example is a bag of instances. A bag is positive if it contains at least one positive instance, and negative otherwise. Although the labels of the training bags are known, the labels of the instances in the bags are unknown. The goal is to construct a learner to classify unseen bags. Multi-instance learning has been found useful in diverse domains such as image categorization [6, 7], image retrieval [35], text categorization [2, 24], computer security [22], face detection [27, 32], computer-aided medical diagnosis [12], etc.

A prominent advantage of multi-instance learning lies in the fact that many real objects have inherent structures, and by adopting the multi-instance representation we are able to represent such objects more naturally and capture more information than simply using the flat single-instance representation.
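The bag-labeling rule stated above can be made concrete with a minimal sketch. Instance labels are hidden at training time in real MIL tasks; the function name `bag_label` and the 0/1 instance-label encoding are illustrative, not from the paper:

```python
def bag_label(instance_labels):
    """Return +1 if the bag contains at least one positive instance,
    -1 otherwise (the standard multi-instance labeling rule)."""
    return 1 if any(y == 1 for y in instance_labels) else -1

# One hidden positive instance makes the whole bag positive.
print(bag_label([0, 0, 1]))  # -> 1
print(bag_label([0, 0, 0]))  # -> -1
```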
For example, suppose we can partition an image into several parts. In contrast to representing the whole image as a single instance, if we represent each part as an instance, then the partition information is captured by the multi-instance representation; and if the partition is meaningful (e.g., each part corresponds to a region of saliency), the additional information captured by the multi-instance representation may help make the learning task easier to deal with.

It is obviously not a good idea to apply multi-instance learning techniques everywhere, since if the single-instance representation is sufficient, using the multi-instance representation just gilds the lily. Even on tasks where the objects have inherent structures, we should keep in mind that the power of the multi-instance representation lies in its ability to capture structure information. However, as Zhou and Xu [36] indicated, previous studies on multi-instance learning typically treated the instances in the bags as independently and identically distributed; this neglects the fact that the relations among the instances convey important structure information. Considering the above image task again, treating the different image parts as inter-correlated samples is evidently more meaningful than treating them as unrelated samples. Actually, the instances in a bag are rarely independent, and a better performance can be expected if the instances are treated in a non-i.i.d. way that exploits the relations among instances.

In this paper, we propose two multi-instance learning methods which do not treat the instances as i.i.d. samples. Our basic idea is to regard each bag as an entity to be processed as a whole, and to regard instances as inter-correlated components of the entity. Experiments show that our proposed methods achieve performances highly competitive with state-of-the-art multi-instance learning methods.

The rest of this paper is organized as follows.
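The idea of regarding a bag as a graph of inter-correlated instances can be sketched as follows. This excerpt does not specify the paper's exact graph construction, so the choice of Euclidean distance and the threshold `eps` below are illustrative assumptions: instances closer than `eps` are connected by an edge.

```python
import math

def bag_to_adjacency(bag, eps=1.0):
    """Hedged sketch: turn a bag (list of feature vectors) into a 0/1
    adjacency matrix, linking instances whose Euclidean distance is
    below eps. The distance and threshold are illustrative choices."""
    n = len(bag)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(bag[i], bag[j]) < eps:
                adj[i][j] = adj[j][i] = 1
    return adj

# Two nearby image parts get an edge; a distant part stays isolated.
adj = bag_to_adjacency([[0.0, 0.0], [0.5, 0.0], [5.0, 5.0]], eps=1.0)
```

A learner can then compare bags via a kernel on such graphs, using both node features (the instances) and edge structure (their relations), rather than treating the instances as an unordered i.i.d. set.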
We briefly review related work in Section 2, propose the new methods in Section 3, report on our experiments in Section 4, and finally conclude the paper in Section 5.

2 Related Work

Many multi-instance learning methods have been developed during the past decade. To name a few: Diverse Density [16], the k-nearest neighbor algorithm Citation-kNN [29], the decision trees RELIC [22] and MITI [4], the neural networks BP-MIP and RBF-MIP [33], the rule learning algorithm RIPPER-MI [9], the ensemble algorithms MIBoosting [31] and MILBoosting [3], the logistic regression algorithm MI-LR [20], etc.

Kernel methods for multi-instance learning have been studied by many researchers. Gärtner et al. [14] defined the MI-Kernel by regarding each bag as a set of feature vectors and then applying a set kernel directly. Andrews et al. [2] proposed mi-SVM and MI-SVM. mi-SVM tries to identify a maximal margin hyperplane for the instances subject to the constraints that at least one instance of each positive bag locates in the positive half-space while all instances of negative bags
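The set-kernel idea attributed to Gärtner et al. above can be sketched in a few lines: the kernel between two bags is the sum of a base instance kernel over all cross-bag instance pairs, which ignores within-bag relations. The RBF base kernel and the `gamma` parameter below are illustrative choices, not necessarily the paper's exact setup:

```python
import math

def rbf(x, y, gamma=0.5):
    """Illustrative RBF base kernel between two instances."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))

def set_kernel(bag1, bag2, gamma=0.5):
    """Sketch of an MI-Kernel-style set kernel: sum the instance
    kernel over all pairs across the two bags. Note this treats each
    bag as an unordered set, discarding relations among instances."""
    return sum(rbf(x, y, gamma) for x in bag1 for y in bag2)
```

The contrast with this baseline motivates the graph-based kernels proposed in the paper, which additionally exploit edge (relation) information among the instances in a bag.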

…(Full text truncated)…

Reference

This content is AI-processed based on ArXiv data.
