Active Learning with Multiple Views
Active learners alleviate the burden of labeling large amounts of data by detecting and asking the user to label only the most informative examples in the domain. We focus here on active learning for multi-view domains, in which there are several disjoint subsets of features (views), each of which is sufficient to learn the target concept. In this paper we make several contributions. First, we introduce Co-Testing, which is the first approach to multi-view active learning. Second, we extend the multi-view learning framework by also exploiting weak views, which are adequate only for learning a concept that is more general/specific than the target concept. Finally, we empirically show that Co-Testing outperforms existing active learners on a variety of real-world domains such as wrapper induction, Web page classification, advertisement removal, and discourse tree parsing.
💡 Research Summary
The paper addresses the costly problem of obtaining labeled data by introducing a novel active‑learning algorithm designed specifically for multi‑view learning scenarios, called Co‑Testing. In a multi‑view setting the feature space is partitioned into several disjoint subsets (views), each of which is sufficient on its own to learn the target concept. Traditional multi‑view methods are semi‑supervised: they exploit unlabeled data to bootstrap classifiers across views but they do not decide which examples should be labeled. Co‑Testing fills this gap by iteratively (1) training a separate classifier on each view using the currently labeled set, and (2) scanning the pool of unlabeled instances to find “contention points” – examples on which the view‑specific classifiers disagree. By definition, at least one view makes a mistake on a contention point, so asking the user to label such an instance guarantees that the erroneous view receives highly informative feedback.
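The loop described above can be sketched in a few lines of Python. This is a minimal illustration, not the paper's implementation: the per-view learner here is a toy 1-nearest-neighbour classifier on a single numeric feature, and all helper names (`one_nn`, `co_testing`) are invented for this sketch.

```python
import random

def one_nn(train_x, train_y, x):
    # Toy per-view learner: 1-nearest-neighbour on one numeric feature.
    i = min(range(len(train_x)), key=lambda j: abs(train_x[j] - x))
    return train_y[i]

def co_testing(views, labels, n_queries, seed=0):
    """views: one feature list per view; labels: oracle labels, revealed
    only when an example is queried. Returns the labeled indices."""
    rng = random.Random(seed)
    labeled = [0, len(labels) - 1]               # small initial labeled set
    unlabeled = set(range(len(labels))) - set(labeled)
    for _ in range(n_queries):
        # 1) train one classifier per view on the current labeled set
        predictors = [
            (lambda x, v=view: one_nn([v[i] for i in labeled],
                                      [labels[i] for i in labeled], x))
            for view in views
        ]
        # 2) contention points: unlabeled examples the views disagree on
        contention = [i for i in sorted(unlabeled)
                      if len({p(views[k][i])
                              for k, p in enumerate(predictors)}) > 1]
        if not contention:
            break
        # 3) Naive Co-Testing: query a randomly chosen contention point
        q = rng.choice(contention)
        labeled.append(q)                        # oracle reveals labels[q]
        unlabeled.remove(q)
    return labeled
```

Because every contention point is misclassified by at least one view, each query in step 3 is guaranteed to correct some view-specific error, which is the source of Co-Testing's sample efficiency.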
A key contribution is the relaxation of the strong‑view assumption. The authors introduce the notion of weak views, which can only learn a concept that is more general or more specific than the target. Although a weak view alone cannot recover the target, its predictions still contribute to the generation of contention points, thereby increasing the pool of useful queries. This allows Co‑Testing to exploit all available sources of information without additional feature‑engineering effort.
The algorithm supports several query‑selection strategies. The simplest “Naïve Co‑Testing” picks a random contention point. More sophisticated variants estimate the expected information gain or expected error reduction for each contention point and select the one with the highest expected benefit. Because the candidate set is already limited to contention points, these estimations are far cheaper than the full‑pool uncertainty‑sampling or version‑space reduction methods that must evaluate every unlabeled example.
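The two families of strategies can be contrasted in a short sketch. The function names are illustrative, and `utility` stands for whatever per-point benefit estimate a variant uses (e.g. expected information gain or expected error reduction); the key point is that either way, scoring is restricted to the contention points rather than the whole unlabeled pool.

```python
import random

def naive_query(contention, rng):
    """Naive Co-Testing: every contention point is informative,
    so simply pick one at random."""
    return rng.choice(contention)

def best_query(contention, utility):
    """Scored variant: evaluate the utility estimate on contention
    points only and query the highest-scoring one."""
    return max(contention, key=utility)
```

With a pool of thousands of unlabeled examples but only a handful of contention points per iteration, `best_query` performs a handful of utility evaluations where full-pool uncertainty sampling would need thousands.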
Another important advantage is that Co‑Testing makes minimal assumptions about the base learner. Conventional uncertainty‑reduction approaches require a classifier that can produce reliable confidence scores (e.g., logistic regression, SVMs, Naïve Bayes). Co‑Testing only needs the ability to produce a class label per view, making it applicable to a wide range of learners, including decision trees, rule learners, or even non‑probabilistic models. Consequently, each view can employ the learner that best fits its feature representation, and the algorithm remains agnostic to the underlying learning paradigm.
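To illustrate this view-agnosticism, the sketch below pairs two deliberately different toy learners, a numeric decision stump and a keyword rule (both invented for this example), and detects disagreement using nothing but their predicted labels:

```python
def stump(threshold):
    # Numeric view: decision stump returning 1 above the threshold.
    return lambda x: int(x > threshold)

def keyword_rule(keywords):
    # Text view: rule that fires when any keyword appears.
    return lambda text: int(any(k in text for k in keywords))

def contention_points(pool, predictors):
    """Disagreement detection needs only hard labels, never confidences:
    any callable mapping a view's features to a class label will do."""
    return [i for i, example in enumerate(pool)
            if len({p(example[v]) for v, p in enumerate(predictors)}) > 1]
```

Nothing in `contention_points` depends on how either predictor was trained, which is exactly why each view can use whatever learner suits its representation.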
Empirical evaluation is performed on four real‑world tasks that naturally admit multiple views: (1) wrapper induction for information extraction, (2) web‑page classification using page text versus hyperlink text, (3) advertisement removal from broadcast video using audio versus visual cues, and (4) discourse‑tree parsing using lexical versus syntactic features. For each domain the authors compare Co‑Testing against state‑of‑the‑art pool‑based active learners such as Query‑by‑Committee, uncertainty sampling, and expected‑error‑minimization. Results consistently show that Co‑Testing reaches higher accuracy with fewer labeled instances. The benefit is especially pronounced when weak views are incorporated, confirming the theoretical claim that weak views boost the number of informative contention points.
In summary, Co‑Testing provides an efficient, view‑agnostic active‑learning framework that (a) leverages disagreement among multiple classifiers to identify the most informative queries, (b) accommodates both strong and weak views without extra engineering, (c) reduces computational overhead by limiting query evaluation to contention points, and (d) imposes virtually no restrictions on the underlying learning algorithms. These properties make Co‑Testing a compelling choice for modern multimodal, multisource, or otherwise richly featured learning problems where labeling resources are scarce.