For binary classification, the most natural loss function is the 0–1 loss,
:<math>V(f(\vec{x}),y)=\mathbf{\theta}(-yf(\vec{x}))</math>
where <math>\mathbf{\theta}</math> indicates the [[Heaviside step function]].
However, this loss function is non-convex and non-smooth, and finding its minimizer is an [[NP-hard]] combinatorial optimization problem. (cite utah) As a result, one instead minimizes continuous, convex '''loss function surrogates''' that are tractable for learning algorithms. Beyond computational tractability, the convexity of these surrogates makes it possible to prove probabilistic bounds on the estimation error with respect to the true distribution. (cite uci)
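A minimal numerical sketch of the 0–1 loss above (the function name and example scores are illustrative, not from the article; the Heaviside convention <math>\theta(0)=1</math> is assumed, so a zero margin counts as an error):

```python
import numpy as np

def zero_one_loss(y, f_x):
    # V(f(x), y) = theta(-y f(x)): 1 when the score f(x) disagrees
    # in sign with the label y in {-1, +1}, else 0.
    # Convention assumed here: theta(0) = 1, so margin 0 is an error.
    return np.where(y * f_x <= 0, 1.0, 0.0)

y = np.array([1, -1, 1, -1])          # true labels in {-1, +1}
f_x = np.array([0.8, 0.3, -0.2, -1.5])  # hypothetical classifier scores
print(zero_one_loss(y, f_x))  # [0. 1. 1. 0.]
```

Because this loss is piecewise constant in <math>f</math>, its gradient is zero almost everywhere, which is one way to see why gradient-based learning algorithms cannot minimize it directly.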
== Square Loss ==
While more commonly used in regression, the square loss can be rewritten as a function <math>\phi(yf(\vec{x}))</math> of the margin and used for classification. Defined as
:<math>V(f(\vec{x}),y) = (1-yf(\vec{x}))^2</math>
the square loss function is both convex and smooth, and it agrees with the 0–1 [[indicator function]] at <math>yf(\vec{x}) = 0</math> and at <math>yf(\vec{x}) = 1</math>.
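The agreement with the 0–1 loss at margins 0 and 1 can be checked directly; the function name below is illustrative:

```python
import numpy as np

def square_loss(y, f_x):
    # V(f(x), y) = (1 - y f(x))^2, a convex, smooth function of the margin y f(x)
    return (1.0 - y * f_x) ** 2

# At margin 0 the square loss is 1 (matching a misclassification under
# the 0-1 loss), and at margin 1 it is 0 (matching a correct prediction).
margins = np.array([0.0, 1.0])
print((1.0 - margins) ** 2)  # [1. 0.]
```

Note that the square loss keeps growing for margins above 1, so it also penalizes confidently correct predictions, a property the hinge loss discussed next avoids.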
== Hinge Loss ==