For binary classification, the most natural loss function is the 0–1 loss,
:<math>V(f(\vec{x}),y)=\mathbf{\theta}(-yf(\vec{x}))</math>
where <math>\mathbf{\theta}</math> indicates the [[Heaviside step function]].
However, this loss function is non-convex and non-smooth, and finding its minimizer is an [[NP-hard]] combinatorial optimization problem. (cite utah) As a result, one instead minimizes continuous, convex '''loss function surrogates''' that are tractable for learning algorithms. Beyond computational tractability, the convexity of these surrogates makes it possible to prove probabilistic bounds on the estimation error with respect to the true distribution. (cite uci)
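A minimal numerical sketch of the 0–1 loss above (the function name and example scores are illustrative, not from the article; the Heaviside convention <math>\theta(0)=1</math> is assumed, so a zero margin counts as an error):

```python
import numpy as np

def zero_one_loss(y, f_x):
    # V(f(x), y) = theta(-y f(x)): 1 when the score f(x) disagrees
    # in sign with the label y in {-1, +1}, else 0.
    # Convention assumed here: theta(0) = 1, so margin 0 is an error.
    return np.where(y * f_x <= 0, 1.0, 0.0)

y = np.array([1, -1, 1, -1])          # true labels in {-1, +1}
f_x = np.array([0.8, 0.3, -0.2, -1.5])  # hypothetical classifier scores
print(zero_one_loss(y, f_x))  # [0. 1. 1. 0.]
```

Because this loss is piecewise constant in <math>f</math>, its gradient is zero almost everywhere, which is one way to see why gradient-based learning algorithms cannot minimize it directly.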
== Square Loss ==
While more commonly used in regression, the square loss can be rewritten as a function <math>\phi(yf(\vec{x}))</math> of the margin and used for classification. Defined as
:<math>V(f(\vec{x}),y) = (1-yf(\vec{x}))^2</math>
the square loss function is both convex and smooth, and it agrees with the 0–1 [[indicator function]] at <math>yf(\vec{x}) = 0</math> and at <math>yf(\vec{x}) = 1</math>.
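The agreement with the 0–1 loss at margins 0 and 1 can be checked directly; the function name below is illustrative:

```python
import numpy as np

def square_loss(y, f_x):
    # V(f(x), y) = (1 - y f(x))^2, a convex, smooth function of the margin y f(x)
    return (1.0 - y * f_x) ** 2

# At margin 0 the square loss is 1 (matching a misclassification under
# the 0-1 loss), and at margin 1 it is 0 (matching a correct prediction).
margins = np.array([0.0, 1.0])
print((1.0 - margins) ** 2)  # [1. 0.]
```

Note that the square loss keeps growing for margins above 1, so it also penalizes confidently correct predictions, a property the hinge loss discussed next avoids.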
== Hinge Loss ==