Loss functions for classification: Difference between revisions

impacts the optimal <math>f^{*}</math> which minimizes empirical risk, as well as the computational complexity of the learning algorithm.
 
Given the binary nature of classification, a natural selection for a loss function (assuming equal cost for [[false positives and false negatives]]) would be the 0-1 [[indicator function]], which takes the value 0 if the predicted classification equals the true class and 1 if it does not. Consequently, we could choose the loss function:
:<math>V(f(\vec{x}),y)=\mathbf{\theta}(-yf(\vec{x}))</math>
where <math>\mathbf{\theta}</math> indicates the [[Heaviside step function]].
However, this loss function is non-convex and non-smooth, and solving for the optimal solution is an [[NP-hard]] combinatorial optimization problem. (cite utah) As a result, we seek continuous, convex '''loss function surrogates''' which are tractable for our learning algorithms. Some of these surrogates are described below.
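As a concrete illustration of the 0-1 loss defined above, the following sketch evaluates <math>\theta(-yf(\vec{x}))</math> on a few hypothetical predictions (the function name and sample values are illustrative, not from the article):

```python
import numpy as np

def zero_one_loss(f_x, y):
    """0-1 loss theta(-y * f(x)), with theta the Heaviside step function.

    y is the true label in {-1, +1}; f_x is the real-valued prediction,
    whose sign is the predicted class. The loss is 1 exactly when the
    margin y * f(x) is negative, i.e. the prediction disagrees in sign
    with the true label.
    """
    return np.where(-y * f_x > 0, 1.0, 0.0)

# Hypothetical predictions and true labels.
f_x = np.array([2.3, -0.7, 0.5, -1.2])
y   = np.array([1,    1,  -1,   -1  ])

print(zero_one_loss(f_x, y))  # 1.0 exactly where sign(f(x)) != y
```

Because this loss is piecewise constant, its gradient is zero almost everywhere, which is one way to see why gradient-based learning algorithms cannot optimize it directly and convex surrogates are used instead.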
 
== Square Loss ==