The hinge loss function is defined as
:<math>V(f(\vec{x}),y) = \max(0, 1-yf(\vec{x})) = |1 - yf(\vec{x}) |_{+}</math>
The hinge loss provides a relatively tight, convex upper bound on the 0–1 [[indicator function]]. Specifically, the hinge loss equals the 0–1 loss when <math>\operatorname{sgn}(f(\vec{x})) = y</math> and <math>|y f(\vec{x})| \geq 1</math>.
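The relationship between the two losses can be illustrated with a short numerical sketch (the function names and sample scores below are illustrative, not from the article):

```python
import numpy as np

def hinge_loss(y, fx):
    """Hinge loss V(f(x), y) = max(0, 1 - y*f(x))."""
    return np.maximum(0.0, 1.0 - y * fx)

def zero_one_loss(y, fx):
    """0-1 indicator loss: 1 if sgn(f(x)) != y, else 0."""
    return (np.sign(fx) != y).astype(float)

y = np.array([1, 1, -1, -1])          # true labels
fx = np.array([2.0, 0.3, -1.5, 0.4])  # classifier scores f(x)

print(hinge_loss(y, fx))     # [0.  0.7 0.  1.4]
print(zero_one_loss(y, fx))  # [0. 0. 0. 1.]
```

Note that the hinge loss is never below the 0–1 loss, and the two coincide (both zero) exactly on the examples with <math>y f(\vec{x}) \geq 1</math>.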
While the hinge loss function is both convex and continuous, it is not smooth (that is, it is not differentiable) at <math>yf(\vec{x})=1</math>. Consequently, the hinge loss function cannot be used with gradient descent or stochastic gradient descent methods, which rely on differentiability over the entire ___domain. However, the hinge loss does have a subgradient at <math>yf(\vec{x})=1</math>, which allows for the use of subgradient descent methods. (cite Utah)
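A minimal sketch of subgradient descent on the regularized hinge loss for a linear classifier <math>f(\vec{x}) = \vec{w}\cdot\vec{x}</math> (the function name, step size, and regularization constant are illustrative assumptions, not from the article):

```python
import numpy as np

def svm_subgradient_descent(X, y, lam=0.01, lr=0.1, epochs=200):
    """Minimize (1/n) * sum_i max(0, 1 - y_i * w.x_i) + (lam/2) * ||w||^2
    by subgradient descent.

    Where y_i * w.x_i < 1 the hinge term contributes -y_i * x_i to the
    subgradient; where y_i * w.x_i > 1 it contributes 0.  At the kink
    y_i * w.x_i = 1 any convex combination is a valid subgradient; this
    sketch uses 0 there.
    """
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(epochs):
        margins = y * (X @ w)
        active = margins < 1              # points violating the margin
        g = -(y[active, None] * X[active]).sum(axis=0) / n + lam * w
        w -= lr * g                       # subgradient step
    return w

# Usage on a small linearly separable set:
X = np.array([[2.0, 1.0], [1.0, 2.0], [-1.0, -2.0], [-2.0, -1.0]])
y = np.array([1, 1, -1, -1])
w = svm_subgradient_descent(X, y)
print(np.sign(X @ w))  # predictions on the training points
```

Because the hinge loss is convex, subgradient descent with a suitably decreasing step size converges to the minimum despite the kink at <math>yf(\vec{x})=1</math>.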