Loss functions for classification
For computational ease, it is standard practice to write [[loss functions]] as functions of only one variable. Within classification, loss functions are generally written solely in terms of the product of the true label <math>y</math> and the predicted value <math>f(\vec{x})</math>.<ref name="robust"> {{Citation | last= Masnadi-Shirazi | first= Hamed | last2= Vasconcelos | first2= Nuno | title= On the Design of Loss Functions for Classification: theory, robustness to outliers, and SavageBoost | publisher= Statistical Visual Computing Laboratory, University of California, San Diego | url= http://www.svcl.ucsd.edu/publications/conference/2008/nips08/NIPS08LossesWITHTITLE.pdf | accessdate= 6 December 2014}}</ref> Selection of a loss function within this framework
:<math>V(f(\vec{x}),y)=\phi(yf(\vec{x}))</math>
impacts the optimal <math>f^{*}_S</math> which [[empirical risk minimization|minimizes empirical risk]], as well as the computational complexity of the learning algorithm. Loss functions of this form are known as margin losses, since they depend on the data only through the margin <math>yf(\vec{x})</math>.
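Commonly used examples of margin losses include the [[hinge loss]] and the exponential loss,
:<math>\phi(v)=\max(0,1-v) \quad \text{(hinge loss)}, \qquad \phi(v)=e^{-v} \quad \text{(exponential loss)},</math>
where <math>v=yf(\vec{x})</math> denotes the margin.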
 
Given the binary nature of classification, a natural selection for a loss function (assuming equal cost for [[false positives and false negatives]]) would be the [[0-1 loss function]] (0–1 [[indicator function]]), which takes the value 0 if the predicted classification equals the true class and 1 if it does not. This selection is modeled by