Loss functions for classification: Difference between revisions

Kjross (talk | contribs)
'''Loss functions for classification''' are computationally feasible surrogate [[loss functions]] representing the price paid for inaccurate predictions in classification problems.<ref>{{cite doi|10.1162/089976604773135104}}</ref> Specifically, if <math>g: X \to \{-1,1\}</math> is the true mapping of a feature vector <math>\vec{x} \in X</math> to its class label <math>y \in \{-1,1\}</math>, we wish to find a function <math>f: X \to \mathbb{R}</math> that best approximates <math>g</math>.{{citation needed}} Because the label is binary, it is standard practice to define a classification loss solely in terms of the margin, the product <math>yf(\vec{x})</math> of the true label and the predicted value, i.e. as <math>\phi(yf(\vec{x}))</math> for some function <math>\phi: \mathbb{R} \to \mathbb{R}</math>.{{citation needed}} The choice of <math>\phi</math> determines both the optimal <math>f^{*}</math> minimizing the empirical risk and the computational complexity of the resulting learning algorithm.
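The margin formulation above can be sketched in code. The snippet below is an illustrative example, not part of the article: it takes the square loss <math>\phi(v) = (1-v)^2</math> as one possible surrogate (the function names <code>phi</code> and <code>empirical_risk</code> are invented for illustration) and averages it over a small labelled sample to form the empirical risk.

```python
# Illustrative sketch: a margin-based surrogate loss phi(y * f(x))
# and the empirical risk it induces over a labelled sample.
# The choice phi(v) = (1 - v)^2 (square loss) is just one example.

def phi(v):
    """Square loss written as a function of the margin v = y * f(x)."""
    return (1.0 - v) ** 2

def empirical_risk(ys, fxs):
    """Average surrogate loss over a sample.

    ys  : true labels in {-1, +1}
    fxs : real-valued predictions f(x)
    """
    return sum(phi(y * fx) for y, fx in zip(ys, fxs)) / len(ys)

# Small worked example: two correct predictions, two incorrect ones.
ys  = [ 1,  -1,   1,  -1 ]
fxs = [0.8, -0.5, -0.2, 0.1]
risk = empirical_risk(ys, fxs)  # margins: 0.8, 0.5, -0.2, -0.1
```

Note that the misclassified examples (negative margins) contribute much larger terms to the average than the correctly classified ones, which is the behaviour a surrogate loss is meant to encode.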
 
== Square loss ==