Winnow (algorithm): Difference between revisions

Browse history interactively

← Previous edit

Content deleted Content added

VisualWikitext

Revision as of 18:21, 29 January 2015 edit 2601:2:4d00:27b:30ed:fbde:48c8:c301 (talk) No edit summary ← Previous edit		Latest revision as of 03:32, 13 February 2020 edit undo GreenC bot (talk \| contribs) Bots 3,051,499 edits Move 1 url. Wayback Medic 2.5
(13 intermediate revisions by 11 users not shown)
Line 1: The '''winnow algorithm'''<ref name="littlestone88"> Nick Littlestone (1988). "Learning Quickly When Irrelevant Attributes Abound: A New Linear-threshold Algorithm", ''[~~http~~https://~~www~~doi.~~springerlink~~org/10.~~com/content/j0k7t38567325716/~~1023%2FA%3A1022869011914 ''Machine Learning'' 285–318(2)].</ref> is a technique from [[machine learning]] for learning a [[linear classifier]] from labeled examples. It is very similar to the [[perceptron\|perceptron algorithm]]. However, the perceptron algorithm uses an additive weight-update scheme, while Winnow uses a [[Multiplicative Weight Update Method\|multiplicative scheme]] that allows it to perform much better when many dimensions are irrelevant (hence its name [[winnowing\|winnow]]). It is a simple algorithm that scales well to high-dimensional data. During training, Winnow is shown a sequence of positive and negative examples. From these it learns a decision [[hyperplane]] that can then be used to label novel examples as positive or negative. The algorithm can also be used in the [[Online machine learning\|online learning]] setting, where the learning and the classification phase are not clearly separated.▼ ~~The '''winnow algorithm'''<ref name="littlestone88">~~ ▲Nick Littlestone (1988). "Learning Quickly When Irrelevant Attributes Abound: A New Linear-threshold Algorithm", ''[http://www.springerlink.com/content/j0k7t38567325716/ Machine Learning 285–318(2)].</ref> is a technique from [[machine learning]] for learning a [[linear classifier]] from labeled examples. It is very similar to the [[perceptron\|perceptron algorithm]]. However, the perceptron algorithm uses an additive weight-update scheme, while Winnow uses a multiplicative scheme that allows it to perform much better when many dimensions are irrelevant (hence its name). It is a simple algorithm that scales well to high-dimensional data. During training, Winnow is shown a sequence of positive and negative examples. From these it learns a decision [[hyperplane]] that can then be used to label novel examples as positive or negative. The algorithm can also be used in the [[Online machine learning\|online learning]] setting, where the learning and the classification phase are not clearly separated. == Algorithm == The basic algorithm, Winnow1, is as follows. The instance space is <math>X=\{0,1\}^n</math>, that is, each instance is described as a set of [[Boolean-valued]] [[features (pattern recognition)\|features]]. The algorithm maintains non-negative weights <math>w_i</math> for <math>i\in \{1,\ldots,n\}</math>, which are initially set to 1, one weight for each feature. When the learner is given an example <math>(x_1,\ldots,x_n)</math>, it applies the typical prediction rule for linear classifiers: Line 14 ⟶ 13: * If an example is correctly classified, do nothing. * If an example is predicted toincorrectly ~~be 1 but~~and the correct result was 0, ~~all~~for ofeach ~~the~~feature ~~weights implicated in~~<math>x_{i}=1</math>, the ~~mistake~~corresponding ~~are divided by~~weight <math>~~\alpha~~w_{i}</math> is set to 0 (demotion step). : <math>\forall x_{i} = 1, w_{i} = 0</math> If an example is predicted toincorrectly ~~be 0 but~~and the correct result was 1, ~~all~~for ofeach ~~the~~feature ~~weights implicated in~~<math>x_{i}=1</math>, the ~~mistake~~corresponding ~~are multiplied by~~weight <math>~~\alpha~~w_{i}</math> multiplied by {{mvar\|α}}(promotion step). *: <math>\forall x_{i} = 1, w_{i} = \alpha w_{i}</math> ~~Here, "implicated" means weights on features of the instance that have value 1.~~ A typical value for ~~<math>\~~{{mvar\|&alpha~~</math>~~;}} is 2. There are many variations to this basic approach. ''Winnow2''<ref name="littlestone88"/> is similar except that in the demotion step the weights are divided by ~~<math>\~~{{mvar\|&alpha~~</math>~~;}} instead of being set to 0. ''Balanced Winnow'' maintains two sets of weights, and thus two hyperplanes. This can then be generalized for [[multi-label classification]]. ==Mistake bounds==