Loss functions for classification

Loss function surrogates for classification are computationally feasible loss functions that represent the price paid for inaccurate predictions in classification problems. ^[1] Specifically, if $g:X\to \{-1,1\}$ maps a vector ${\vec {x}}\in X$ to its class label $y\in \{-1,1\}$, we wish to find a function $f:X\to \mathbb {R}$ whose sign best approximates the true mapping $g$ . (citation needed) Because a prediction is correct exactly when $f({\vec {x}})$ and $y$ have the same sign, it is standard practice to define loss functions for classification as functions of a single variable, the product of the true label $y$ and the predicted value $f({\vec {x}})$ . (citation needed) Selection of a loss function in this manner

V(f({\vec {x}}),y)=\phi (yf({\vec {x}}))

impacts the optimal $f^{*}$ which minimizes empirical risk, as well as the computational complexity of the learning algorithm.
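To make the role of $\phi$ concrete, the following Python sketch computes the empirical risk as the sample average of $\phi (yf({\vec {x}}))$. The names `empirical_risk` and `linear_score`, the toy data, and the placeholder $\phi$ are illustrative assumptions and do not come from the source.

```python
from typing import Callable, Sequence

Vector = Sequence[float]


def empirical_risk(
    phi: Callable[[float], float],
    f: Callable[[Vector], float],
    xs: Sequence[Vector],
    ys: Sequence[int],
) -> float:
    """Average of phi(y * f(x)) over the sample (labels must be -1 or +1)."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("xs and ys must be non-empty and of equal length")
    return sum(phi(y * f(x)) for x, y in zip(xs, ys)) / len(xs)


def linear_score(w: Vector) -> Callable[[Vector], float]:
    """Hypothetical real-valued predictor f(x) = <w, x>."""
    return lambda x: sum(wi * xi for wi, xi in zip(w, x))


# Toy sample, made up for illustration.
xs = [(1.0, 2.0), (-1.0, 0.5), (0.5, -1.5)]
ys = [1, -1, -1]


def phi(margin: float) -> float:
    """Placeholder phi; any function of the margin y * f(x) can be plugged in."""
    return max(0.0, -margin)


risk_good = empirical_risk(phi, linear_score((1.0, 1.0)), xs, ys)
risk_bad = empirical_risk(phi, linear_score((1.0, -1.0)), xs, ys)
```

Swapping in a different `phi` changes which predictor attains the lowest risk, which is the dependence of $f^{*}$ on the loss described above.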

Given the binary nature of classification, a natural choice of loss function (assuming false positives and false negatives are equally costly) is the 0-1 indicator function, which takes the value 0 if the predicted class matches the true class and 1 if it does not. Consequently, we could choose the loss function:

V(f({\vec {x}}),y)=\mathbf {\theta } (-yf({\vec {x}}))

where $\mathbf {\theta }$ denotes the Heaviside step function. However, this loss function is neither convex nor continuous, which makes direct minimization of the empirical risk intractable in general. (citation needed) As a result, we seek continuous, convex surrogate loss functions that are tractable for our learning algorithms. Some of these surrogates are described below.
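The non-convexity of the 0-1 loss can be checked numerically. The sketch below is a minimal illustration; the function names are hypothetical, and it assumes the convention $\theta (0)=1$, so a score of exactly zero counts as an error.

```python
def heaviside(t: float) -> float:
    """Step function; the value at t == 0 is a convention (1 is assumed here)."""
    return 1.0 if t >= 0 else 0.0


def zero_one_loss(score: float, label: int) -> float:
    """V(f(x), y) = theta(-y * f(x)): 1 for a misclassification, else 0."""
    return heaviside(-label * score)


# Convexity would require phi((a + b) / 2) <= (phi(a) + phi(b)) / 2 for all a, b.
# Fixing y = +1 lets the score play the role of the margin y * f(x).
a, b = -2.0, 1.0
midpoint_value = zero_one_loss((a + b) / 2, 1)  # loss at margin -0.5
chord_value = (zero_one_loss(a, 1) + zero_one_loss(b, 1)) / 2
violates_convexity = midpoint_value > chord_value
```

Here the loss at the midpoint exceeds the average of the losses at the endpoints, so the convexity inequality fails for this pair of margins.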

Square Loss

Hinge Loss

V(f({\vec {x}}),y)=\max(0,1-yf({\vec {x}}))=|1-yf({\vec {x}})|_{+}
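As a short sketch of the formula above, the Python below evaluates the hinge loss on a few margins and checks that it is never smaller than the 0-1 loss on those points. The function names are illustrative, and the 0-1 loss again assumes that a margin of exactly zero counts as an error.

```python
def hinge_loss(score: float, label: int) -> float:
    """V(f(x), y) = max(0, 1 - y * f(x))."""
    return max(0.0, 1.0 - label * score)


def zero_one_loss(score: float, label: int) -> float:
    """0-1 loss, counting a margin of exactly 0 as an error (a convention)."""
    return 1.0 if label * score <= 0 else 0.0


# With y = +1 the score equals the margin y * f(x).
margins = [-2.0, -0.5, 0.0, 0.5, 1.0, 3.0]
hinge_values = [hinge_loss(m, 1) for m in margins]
# hinge_values == [3.0, 1.5, 1.0, 0.5, 0.0, 0.0]

upper_bounds_zero_one = all(
    hinge_loss(m, 1) >= zero_one_loss(m, 1) for m in margins
)
```

The loss is zero only once the margin reaches 1, so correctly classified points with a small margin are still penalized.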

Logistic Loss

References

[1] doi:10.1162/089976604773135104