Loss functions for classification

One can solve for the minimizer of <math>I[f]</math> by taking the functional derivative of the last equality with respect to <math>f</math> and setting the derivative equal to 0. This will result in the following equation
 
:<math>\frac{\partial \phi(f)}{\partial f}\eta + \frac{\partial \phi(-f)}{\partial f}(1-\eta)=0, \;\;\;\;\;(1)</math>{{Citation needed|date=February 2023}}
 
:where <math>\eta=p(y=1|\vec{x})</math>, which is also equivalent to setting the derivative of the conditional risk equal to zero.
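
As an illustration (using the exponential loss as an assumed example, not a choice made in this section), take <math>\phi(v)=e^{-v}</math>. Equation (1) then reads

:<math>-e^{-f}\eta + e^{f}(1-\eta)=0,</math>

so that <math>e^{2f}=\frac{\eta}{1-\eta}</math> and the minimizing function is <math>f^{*}(\vec{x})=\tfrac{1}{2}\ln\frac{\eta}{1-\eta}</math>, i.e. half the log-odds of the positive class.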
 
Given the binary nature of classification, a natural choice for a loss function (assuming equal cost for [[false positives and false negatives]]) would be the [[0-1 loss function]] (0–1 [[indicator function]]), which takes the value 0 if the predicted classification equals the true class and 1 if it does not. This selection is modeled by