Multiple-instance learning: Difference between revisions

In [[machine learning]], '''multiple-instance learning''' (MIL) is a variation on [[supervised learning]]. Instead of receiving a set of individually labeled instances, the learner receives a set of labeled ''bags'', each containing many instances. In the simple case of multiple-instance [[binary classification]], a bag is labeled negative if all the instances in it are negative, and positive if at least one instance in it is positive. From a collection of labeled bags, the learner tries to either (i) induce a concept that will label individual instances correctly or (ii) learn how to label bags without inducing the concept.
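The bag-labeling rule for the binary case can be illustrated with a minimal Python sketch. The function name <code>bag_label</code> and the toy bags are hypothetical and chosen only for illustration; the sketch also assumes the instance labels are known, which is normally not the case in MIL, where only bag labels are observed.

```python
from typing import Sequence


def bag_label(instance_labels: Sequence[bool]) -> bool:
    """Return the bag label under the rule described above.

    A bag is positive if at least one instance is positive,
    and negative if all instances are negative.
    """
    return any(instance_labels)


# Hypothetical toy data: each bag is a list of (normally hidden) instance labels.
bags = {
    "bag_a": [False, False, False],  # all instances negative
    "bag_b": [False, True, False],   # one positive instance
}

labels = {name: bag_label(instances) for name, instances in bags.items()}
```

Here <code>labels["bag_a"]</code> is <code>False</code> and <code>labels["bag_b"]</code> is <code>True</code>. The learner sees only these bag-level labels and has to work backwards from them.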
 
Take the image classification example in {{harvtxt|Amores|2013}}. Given an image, we want to know its target class based on its visual content. For instance, the target class might be "beach", where the image contains both "sand" and "water". In '''MIL''' terms, the image is described as a ''bag'' <math>X = \{X_1,\ldots,X_N\}</math>, where each <math>X_i</math> is the feature vector (called an ''instance'') extracted from the corresponding <math>i</math>-th region in the image and <math>N</math> is the total number of regions (instances) partitioning the image. The bag is labeled ''positive'' ("beach") if and only if it contains both a "sand" region instance and a "water" region instance.
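The "beach" example can be sketched in Python as follows. This is only an illustration under stated assumptions: <code>toy_concept</code> is a hypothetical stand-in for a real region classifier, the one-dimensional feature vectors and their thresholds are invented, and the names <code>beach_bag_label</code> and <code>required</code> do not come from the cited source.

```python
from typing import Callable, FrozenSet, Sequence

Instance = Sequence[float]  # feature vector X_i for one image region


def toy_concept(x: Instance) -> str:
    """Hypothetical instance-level concept: maps a 1-D feature to a region type."""
    if x[0] < 0.33:
        return "sand"
    if x[0] < 0.66:
        return "water"
    return "other"


def beach_bag_label(
    bag: Sequence[Instance],
    concept_of: Callable[[Instance], str],
    required: FrozenSet[str] = frozenset({"sand", "water"}),
) -> bool:
    """Label the bag positive iff every required concept occurs in some instance."""
    found = {concept_of(x) for x in bag}
    return required <= found


# A bag X = {X_1, ..., X_N} with N = 3 regions (invented feature values).
image_with_beach = [[0.1], [0.5], [0.9]]   # sand, water, other
image_without_water = [[0.1], [0.9]]       # sand, other

is_beach = beach_bag_label(image_with_beach, toy_concept)
is_not_beach = beach_bag_label(image_without_water, toy_concept)
```

Note that this rule needs two different kinds of instance to be present at once, so it is not the same as the simple binary rule in which a single positive instance makes the bag positive.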
 
Multiple-instance learning was originally proposed under this name by {{harvtxt|Dietterich|Lathrop|Lozano-Pérez|1997}}, but earlier examples of similar research exist, for instance in the work on [[handwriting|handwritten]] [[Numerical digit|digit]] [[optical character recognition|recognition]] by {{harvtxt|Keeler|Rumelhart|Leow|1990}}. Recent reviews of the MIL literature include {{harvtxt|Amores|2013}}, which provides an extensive review and comparative study of the different paradigms, and {{harvtxt|Foulds|Frank|2010}}, which provides a thorough review of the different assumptions used by different paradigms in the literature.