arXiv:0807.1997v4 [cs.LG] 13 May 2009
Multi-Instance Learning by Treating Instances As Non-I.I.D. Samples
Zhi-Hua Zhou, Yu-Yin Sun, and Yu-Feng Li
National Key Laboratory for Novel Software Technology,
Nanjing University, Nanjing 210093, China
{zhouzh, sunyy, liyf}@lamda.nju.edu.cn
Abstract. Previous studies on multi-instance learning typically treated
instances in the bags as independently and identically distributed. The
instances in a bag, however, are rarely independent in real tasks, and
a better performance can be expected if the instances are treated in
a non-i.i.d. way that exploits relations among instances. In this paper,
we propose two simple yet effective methods. In the first method, we
explicitly map every bag to an undirected graph and design a graph
kernel for distinguishing the positive and negative bags. In the second
method, we implicitly construct graphs by deriving affinity matrices and
propose an efficient graph kernel considering the clique information. The
effectiveness of the proposed methods is validated by experiments.
1 Introduction
In multi-instance learning [11], each training example is a bag of instances. A
bag is positive if it contains at least one positive instance, and negative other-
wise. Although the labels of the training bags are known, the labels
of the instances in the bags are unknown. The goal is to construct a learner to
classify unseen bags. Multi-instance learning has been found useful in diverse
domains such as image categorization [6, 7], image retrieval [35], text catego-
rization [2, 24], computer security [22], face detection [27, 32], computer-aided
medical diagnosis [12], etc.
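The standard multi-instance assumption above can be sketched in a few lines of Python. Here `instance_is_positive` is a hypothetical instance-level oracle used only for illustration; in real multi-instance tasks the instance labels are unknown and only the bag labels are observed:

```python
# A minimal sketch of the standard multi-instance assumption:
# a bag is positive iff at least one of its instances is positive.

def bag_label(bag, instance_is_positive):
    """Return True (positive) if any instance in the bag is positive."""
    return any(instance_is_positive(x) for x in bag)

# Toy 1-D instances: call an instance "positive" if it exceeds 0.5.
pos_bag = [0.1, 0.7, 0.3]   # one positive instance -> positive bag
neg_bag = [0.1, 0.2, 0.3]   # no positive instance  -> negative bag
assert bag_label(pos_bag, lambda x: x > 0.5) is True
assert bag_label(neg_bag, lambda x: x > 0.5) is False
```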
A prominent advantage of multi-instance learning mainly lies in the fact that
many real objects have inherent structures, and by adopting the multi-instance
representation we are able to represent such objects more naturally and capture
more information than simply using the flat single-instance representation. For
example, suppose we can partition an image into several parts. In contrast to
representing the whole image as a single instance, if we represent each part as
an instance, then the partition information is captured by the multi-instance
representation; and if the partition is meaningful (e.g., each part corresponds to
a region of saliency), the additional information captured by the multi-instance
representation may be helpful to make the learning task easier to deal with.
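As an illustrative sketch only (not the paper's actual pipeline), an image could be partitioned into a grid of blocks, with each block summarized by simple statistics, so that the image becomes a bag of instances; the grid partition and the mean/variance features below are assumptions made for the example:

```python
# Illustrative sketch: turn an image into a bag of per-part instances.
import numpy as np

def image_to_bag(image, grid=2):
    """Split a 2-D array into grid x grid blocks; one feature vector per block."""
    h, w = image.shape
    bh, bw = h // grid, w // grid
    bag = []
    for i in range(grid):
        for j in range(grid):
            block = image[i * bh:(i + 1) * bh, j * bw:(j + 1) * bw]
            bag.append([block.mean(), block.var()])  # simple per-part features
    return np.array(bag)

img = np.arange(16, dtype=float).reshape(4, 4)
bag = image_to_bag(img, grid=2)
print(bag.shape)  # (4, 2): four parts, two features each
```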
It is obviously not a good idea to apply multi-instance learning techniques
everywhere: if the single-instance representation is sufficient, using a multi-instance
representation just gilds the lily. Even on tasks where the objects have
inherent structures, we should keep in mind that the power of multi-instance
representation lies in its ability to capture structural information. However,
as Zhou and Xu [36] indicated, previous studies on multi-instance learning
typically treated the instances in the bags as independently and identically dis-
tributed; this neglects the fact that the relations among the instances convey im-
portant structure information. Considering the above image task again, treating
the different image parts as inter-correlated samples is evidently more meaning-
ful than treating them as unrelated samples. Actually, the instances in a bag are
rarely independent, and a better performance can be expected if the instances
are treated in a non-i.i.d. way that exploits the relations among instances.
In this paper, we propose two multi-instance learning methods which do not
treat the instances as i.i.d. samples. Our basic idea is to regard each bag as
an entity to be processed as a whole, and regard instances as inter-correlated
components of the entity. Experiments show that our proposed methods achieve
performances highly competitive with state-of-the-art multi-instance learning
methods.
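The basic idea of regarding each bag as a graph can be sketched under simple assumptions: nodes are instances, and an edge links two instances whose distance falls below a threshold. The toy "kernel" below only compares a mean node feature and the edge density and is purely illustrative; the paper's actual graph kernels are richer and the threshold `tau` is an assumed parameter:

```python
# A minimal sketch: a bag becomes a graph via a thresholded affinity matrix,
# and two bags are compared through simple node/edge summary statistics.
import numpy as np

def bag_to_affinity(bag, tau=1.0):
    """Adjacency matrix: edge (i, j) iff ||x_i - x_j|| < tau, no self-loops."""
    d = np.linalg.norm(bag[:, None, :] - bag[None, :, :], axis=-1)
    a = (d < tau).astype(float)
    np.fill_diagonal(a, 0.0)
    return a

def toy_graph_kernel(bag1, bag2, tau=1.0):
    """Gaussian similarity of mean node features times that of edge densities."""
    a1, a2 = bag_to_affinity(bag1, tau), bag_to_affinity(bag2, tau)
    node_sim = np.exp(-np.linalg.norm(bag1.mean(0) - bag2.mean(0)) ** 2)
    edge_sim = np.exp(-(a1.mean() - a2.mean()) ** 2)
    return node_sim * edge_sim

b1 = np.array([[0.0, 0.0], [0.5, 0.0], [3.0, 3.0]])
b2 = np.array([[0.1, 0.0], [0.6, 0.1], [3.1, 2.9]])
print(toy_graph_kernel(b1, b2))
```

Both factors are Gaussian kernels on summary statistics, so their product stays positive semi-definite; similar bags with similar connectivity score close to 1.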
The rest of this paper is organized as follows. We briefly review related work
in Section 2, propose the new methods in Section 3, report on our experiments
in Section 4, and conclude the paper in Section 5.
2 Related Work
Many multi-instance learning methods have been developed during the past
decade. These include Diverse Density [16], the k-nearest neighbor algorithm
Citation-kNN [29], the decision trees RELIC [22] and MITI [4], the neural
networks BP-MIP and RBF-MIP [33], the rule learning algorithm RIPPER-MI [9],
the ensemble algorithms MIBoosting [31] and MILBoosting [3], and the logistic
regression algorithm MI-LR [20].
Kernel methods for multi-instance learning have been studied by many
researchers. Gärtner et al. [14] defined the MI-Kernel by regarding each bag as
a set of feature vectors and applying a set kernel directly. Andrews et al.
[2] proposed mi-SVM and MI-SVM. mi-SVM tries to identify a maximal-margin
hyperplane for the instances, subject to the constraints that at least
one instance of each positive bag lies in the positive half-space while all
instances of negative bags
…(Full text truncated)…