A benefit of the square loss function is that its structure lends itself to easy cross validation of regularization parameters. Specifically for [[Tikhonov regularization]], one can solve for the regularization parameter using leave-one-out [[cross-validation (statistics) |cross-validation]] in the same time as it would take to solve a single problem.<ref>{{Citation| last= Rifkin| first= Ryan M.| last2= Lippert| first2= Ross A.| title= Notes on Regularized Least Squares| publisher= MIT Computer Science and Artificial Intelligence Laboratory| date= 1 May 2007|url=https://dspace.mit.edu/bitstream/handle/1721.1/37318/MIT-CSAIL-TR-2007-025.pdf?sequence=1}}</ref>
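As a minimal illustrative sketch (toy data and variable names are invented, not taken from the cited note), the leave-one-out residuals of a Tikhonov-regularized (ridge) least-squares fit can all be recovered from one fit via the hat-matrix identity <math>e_i^\text{LOO} = e_i/(1-H_{ii})</math>, which is what makes cross-validating the regularization parameter cheap:

```python
import numpy as np

# Hypothetical toy regression data; the point is the hat-matrix identity.
rng = np.random.default_rng(1)
X = rng.normal(size=(30, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=30)
lam = 0.5  # Tikhonov (ridge) regularization parameter

# Ridge fit and hat matrix H = X (X^T X + lam I)^{-1} X^T
A = np.linalg.solve(X.T @ X + lam * np.eye(3), X.T)
H = X @ A
residuals = y - H @ y

# Leave-one-out residuals for ALL points from this single fit:
# e_i^LOO = e_i / (1 - H_ii)
loo = residuals / (1 - np.diag(H))

# Brute-force check: actually refit with point 0 held out
i = 0
Xi, yi = np.delete(X, i, axis=0), np.delete(y, i)
w_i = np.linalg.solve(Xi.T @ Xi + lam * np.eye(3), Xi.T @ yi)
brute = y[i] - X[i] @ w_i
```

Repeating the shortcut for a grid of values of `lam` gives the full leave-one-out curve at the cost of one fit per candidate parameter, rather than one fit per candidate parameter per data point.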
 
The minimizer of <math>I[f]</math> for the square loss function can be directly found from equation (1) as
 
:<math>f^*_\text{Square}= 2\eta-1=2p(1\mid x)-1.</math>
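This minimizer can be checked numerically with a short sketch (illustrative only): for a fixed <math>\eta = p(1\mid x)</math>, the conditional expected square loss <math>\eta(1-f)^2 + (1-\eta)(1+f)^2</math> is minimized at <math>f = 2\eta - 1</math>.

```python
import numpy as np

def conditional_risk(f, eta):
    # Expected square loss at a point with P(y = 1 | x) = eta:
    # y = +1 contributes (1 - f)^2, y = -1 contributes (1 + f)^2
    return eta * (1 - f)**2 + (1 - eta) * (1 + f)**2

eta = 0.7
fs = np.linspace(-1, 1, 2001)  # grid search over candidate predictions
best = fs[np.argmin(conditional_risk(fs, eta))]
# best should be close to 2*eta - 1 = 0.4
```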
 
== Logistic loss ==
The logistic loss function can be generated using (2) and Table-I as follows
 
:<math>\phi(v)=C[f^{-1}(v)]+\left(1-f^{-1}(v)\right)C'[f^{-1}(v)] =\frac{1}{\log(2)}\left[\frac{-e^v}{1+e^v}\log\left(\frac{e^v}{1+e^v}\right)-\left(1-\frac{e^v}{1+e^v}\right)\log\left(1-\frac{e^v}{1+e^v}\right)\right]+\left(1-\frac{e^v}{1+e^v}\right)\left[\frac{-1}{\log(2)}\log\left(\frac{\frac{e^v}{1+e^v}}{1-\frac{e^v}{1+e^v}}\right)\right]=\frac{1}{\log(2)}\log(1+e^{-v}).</math>
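The final simplification can be verified numerically; the following sketch (illustrative, with invented variable names) evaluates the generated expression <math>C[f^{-1}(v)]+(1-f^{-1}(v))C'[f^{-1}(v)]</math> term by term and compares it with the closed form <math>\tfrac{1}{\log 2}\log(1+e^{-v})</math>:

```python
import numpy as np

def phi_generated(v):
    # eta = f^{-1}(v) for the logistic link f(eta) = log(eta / (1 - eta))
    eta = np.exp(v) / (1 + np.exp(v))
    # C[eta]: binary entropy in bits
    C = (-eta * np.log(eta) - (1 - eta) * np.log(1 - eta)) / np.log(2)
    # C'[eta]: derivative of the binary entropy
    Cp = -np.log(eta / (1 - eta)) / np.log(2)
    return C + (1 - eta) * Cp

v = np.linspace(-5, 5, 101)
closed_form = np.log1p(np.exp(-v)) / np.log(2)
# phi_generated(v) agrees with closed_form on the grid
```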
 
The logistic loss is convex and grows linearly for negative values, which makes it less sensitive to outliers than the square loss.
This function displays a convergence rate similar to that of the hinge loss function, and since it is continuous, [[gradient descent]] methods can be utilized. However, unlike the hinge loss, the logistic loss does not assign zero penalty to any point. Instead, functions that correctly classify points with high confidence (i.e., with high values of <math>|f(\vec{x})|</math>) are penalized less. Because every point retains a nonzero penalty, the logistic loss remains more sensitive to outliers than the hinge loss.
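The use of gradient descent can be illustrated with a minimal sketch (the toy data, learning rate, and iteration count are invented for illustration): plain gradient descent on the mean logistic loss of a linear classifier with labels in <math>\{-1,+1\}</math>.

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy two-class data with labels in {-1, +1}
X = np.vstack([rng.normal(-1.5, 1, (50, 2)), rng.normal(1.5, 1, (50, 2))])
y = np.array([-1] * 50 + [1] * 50)

w = np.zeros(2)
lr = 0.1
for _ in range(500):
    margins = y * (X @ w)
    # Gradient of mean log(1 + exp(-y w.x)); the 1/log(2) factor is a
    # positive constant and does not change the minimizer, so it is dropped.
    grad = -(X.T @ (y / (1 + np.exp(margins)))) / len(y)
    w -= lr * grad

accuracy = np.mean(np.sign(X @ w) == y)
```

Because the loss is smooth and convex in <math>w</math>, this plain first-order update converges without the subgradient machinery the (non-differentiable) hinge loss would require.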
 
The minimizer of <math>I[f]</math> for the logistic loss function can be directly found from equation (1) as
 
:<math>f^*_\text{Logistic}= \log\left(\frac{\eta}{1-\eta}\right)=\log\left(\frac{p(1\mid x)}{1-p(1\mid x)}\right).</math>
 
This function is undefined when <math>p(1\mid x)=1</math> or <math>p(1\mid x)=0</math> (tending toward ∞ and −∞, respectively), but it is a smooth function of <math>p(1\mid x)</math> that grows as <math>p(1\mid x)</math> increases and equals 0 when <math>p(1\mid x)= 0.5</math>.<ref name="mitlec" />
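This log-odds minimizer can also be checked numerically with a short sketch (illustrative only): for a fixed <math>\eta = p(1\mid x)</math>, the conditional expected logistic loss <math>\eta\log(1+e^{-f}) + (1-\eta)\log(1+e^{f})</math> is minimized at <math>f = \log\bigl(\eta/(1-\eta)\bigr)</math>.

```python
import numpy as np

def conditional_risk(f, eta):
    # Expected logistic loss at a point with P(y = 1 | x) = eta;
    # the constant 1/log(2) factor does not affect the minimizer.
    return eta * np.log1p(np.exp(-f)) + (1 - eta) * np.log1p(np.exp(f))

eta = 0.8
fs = np.linspace(-5, 5, 100001)  # grid search over candidate predictions
best = fs[np.argmin(conditional_risk(fs, eta))]
# best should be close to log(0.8 / 0.2) = log(4) ≈ 1.386
```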