{{Use dmy dates|date=May 2024}}
The '''Ho–Kashyap algorithm''' is an iterative method for finding a linear [[decision boundary]] between two classes of linearly separable data.
== Setup ==
Given a training set consisting of samples from two classes, the Ho–Kashyap algorithm seeks a weight vector and a margin vector satisfying the linear system
<math display="block"> \mathbf{Yw} = \mathbf{b} </math>
where <math>\mathbf{Y}</math> is the augmented data matrix with samples from both classes (with appropriate sign conventions, e.g., samples from class 2 are negated), <math>\mathbf{w}</math> is the weight vector to be determined, and <math>\mathbf{b}</math> is a positive margin vector.
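The construction of <math>\mathbf{Y}</math> can be sketched as follows (a minimal example with hypothetical toy data; the variable names are illustrative, not from any standard library):

```python
import numpy as np

# Hypothetical toy data: three samples per class, two features each.
class1 = np.array([[1.0, 2.0], [2.0, 1.5], [1.5, 3.0]])
class2 = np.array([[4.0, 5.0], [5.0, 4.5], [4.5, 6.0]])

# Augment each sample with a leading 1, which absorbs the bias term
# into the weight vector.
aug1 = np.hstack([np.ones((len(class1), 1)), class1])
aug2 = np.hstack([np.ones((len(class2), 1)), class2])

# Sign convention: negate the class-2 samples, so that a separating
# weight vector w satisfies Y @ w > 0 for every row of Y.
Y = np.vstack([aug1, -aug2])
print(Y.shape)  # (6, 3)
```

With this convention, correct classification of every sample is equivalent to all components of <math>\mathbf{Yw}</math> being positive.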
== Algorithm ==
The idea of the Ho–Kashyap algorithm is to minimize the squared error <math>\|\mathbf{Yw} - \mathbf{b}\|^2</math> jointly over <math>\mathbf{w}</math> and <math>\mathbf{b}</math>, while keeping <math>\mathbf{b}</math> positive:
* Given any <math>\mathbf{b}</math>, the corresponding <math>\mathbf{w}</math> is known: It is simply <math>\mathbf{w} = \mathbf{Y}^+ \mathbf{b}</math>, where <math>\mathbf{Y}^+</math> denotes the [[Moore–Penrose inverse|Moore–Penrose pseudoinverse]] of <math>\mathbf{Y}</math>.
* Therefore, it only remains to find <math>\mathbf{b}</math> by gradient descent.
* However, a gradient-descent step may ''decrease'' some coordinates of <math>\mathbf{b}</math>, which could drive them negative, and this is undesirable. Therefore, coordinates of <math>\mathbf{b}</math> that would decrease are left unchanged, while coordinates that would increase are updated as usual.
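The steps above can be sketched in code. This is a minimal illustration, assuming the standard update <math>\mathbf{b} \leftarrow \mathbf{b} + \eta(\mathbf{e} + |\mathbf{e}|)</math> with error <math>\mathbf{e} = \mathbf{Yw} - \mathbf{b}</math>, where adding <math>\mathbf{e} + |\mathbf{e}|</math> zeroes out the negative components and so leaves decreasing coordinates unchanged; the function name, learning rate, and stopping rule are illustrative choices, not a definitive implementation:

```python
import numpy as np

def ho_kashyap(Y, eta=0.1, n_iter=5000, tol=1e-6):
    """Sketch of the Ho-Kashyap iteration (assumed signature).

    Y   : augmented data matrix with class-2 rows negated
    eta : learning rate, assumed to lie in (0, 1)
    Returns a weight vector w and margin vector b.
    """
    n = Y.shape[0]
    Y_pinv = np.linalg.pinv(Y)       # Moore-Penrose pseudoinverse of Y
    b = np.ones(n)                   # start from a positive margin vector
    w = Y_pinv @ b                   # least-squares w for the current b
    for _ in range(n_iter):
        e = Y @ w - b                # error vector
        # e + |e| is zero where e < 0, so coordinates of b that would
        # decrease are left unchanged; increasing ones grow by 2*eta*e.
        b = b + eta * (e + np.abs(e))
        w = Y_pinv @ b               # recompute w for the updated b
        if np.all(np.abs(e) < tol):  # converged: Yw is approximately b > 0
            break
    return w, b
```

On linearly separable data this iteration reaches a solution with <math>\mathbf{Yw} > 0</math>; on non-separable data the error vector provides evidence of non-separability.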
== Relationship to other algorithms ==
* [[Perceptron]] algorithm: Both seek linear separators. The perceptron updates weights incrementally based on individual misclassified samples, while the Ho–Kashyap algorithm performs batch updates using all samples through a least-squares criterion and the pseudoinverse.
* [[Linear discriminant analysis]] (LDA): LDA assumes underlying Gaussian distributions with equal covariances for the classes and derives the decision boundary from these statistical assumptions.
* [[Support vector machine]]s (SVM): For linearly separable data, SVMs aim to find the maximum-margin hyperplane. The Ho–Kashyap algorithm also produces a separating hyperplane when one exists, but it does not explicitly maximize the minimum margin.
== Variants ==
Modified versions of the algorithm adjust the error criterion or the update rule, for example to improve convergence behavior.

Kernel versions apply the [[kernel method|kernel trick]] to obtain nonlinear decision boundaries.
== See also ==
== References ==
<references />
* {{cite book |last1=Duda |first1=R. O. |last2=Hart |first2=P. E. |last3=Stork |first3=D. G. |year=2001 |title=Pattern Classification |edition=2nd |chapter=5.9. The Ho–Kashyap Procedures |publisher=Wiley |isbn=978-0-471-05669-0}}
* {{cite book |last1=Bishop |first1=C. M. |year=2006 |title=Pattern Recognition and Machine Learning |publisher=Springer |isbn=978-0-387-31073-2}}
* {{cite book |last1=Theodoridis |first1=S. |last2=Koutroumbas |first2=K. |year=2008 |title=Pattern Recognition |edition=4th |publisher=Academic Press |isbn=978-1-59749-272-0}}