Loss functions for classification

== Cross entropy loss (Log Loss) ==
{{main|Cross entropy}}
Using the alternative label convention <math>t=(1+y)/2</math> so that <math>t \in \{0,1\}</math>, the binary cross entropy loss is defined as
 
:<math>V(f(\vec{x}),t) = -t\ln(\sigma(\vec{x}))-(1-t)\ln(1-\sigma(\vec{x}))</math>
 
where <math>\sigma</math> denotes the logistic sigmoid function:
 
:<math>\sigma(\vec{x}) = \frac{1}{1+\exp(-f(\vec{x}))}</math>
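
For illustration, the following is a minimal sketch (not part of the original formulation) of this definition in Python, where the variable <code>score</code> stands in for the raw classifier output <math>f(\vec{x})</math>:

<syntaxhighlight lang="python">
import math

def binary_cross_entropy(score, t):
    """Binary cross entropy for a raw score f(x) and a label t in {0, 1}."""
    p = 1.0 / (1.0 + math.exp(-score))          # logistic sigmoid sigma(x)
    return -t * math.log(p) - (1 - t) * math.log(1 - p)

# Hypothetical example: score f(x) = 2.0 with true label t = 1
print(binary_cross_entropy(2.0, 1))             # ~0.127, equal to ln(1 + e^-2)
</syntaxhighlight>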
 
The [[logistic loss]] (above) and the binary cross entropy loss are in fact the same, up to a multiplicative constant of <math>1/\ln(2)</math>.
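
Substituting the two possible values of <math>t</math> into the definition above makes this explicit:

:<math>t=1:\quad V = -\ln\sigma(\vec{x}) = \ln\left(1+e^{-f(\vec{x})}\right),</math>
:<math>t=0:\quad V = -\ln\left(1-\sigma(\vec{x})\right) = \ln\left(1+e^{f(\vec{x})}\right),</math>

so that in both cases <math>V(f(\vec{x}),t) = \ln\left(1+e^{-yf(\vec{x})}\right)</math>, which is <math>\ln(2)</math> times the logistic loss.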
 
The cross entropy loss is closely related to the [[Kullback–Leibler divergence]] between the empirical distribution and the predicted distribution. Unlike the margin-based losses above, this form of the loss is not written directly as a function of the product <math>yf(\vec{x})</math> of the true label and the predicted value, but it is convex in <math>f(\vec{x})</math> and can be minimized using [[stochastic gradient descent]] methods. The cross entropy loss is ubiquitous in modern [[deep learning|deep neural networks]].
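
As a sketch of such a minimization (an illustrative example rather than a reference implementation; the linear model <math>f(\vec{x}) = \vec{w}\cdot\vec{x}</math>, the learning rate, and the toy data below are assumptions), each stochastic gradient descent step updates <math>\vec{w}</math> using the gradient <math>(\sigma(\vec{x})-t)\,\vec{x}</math>:

<syntaxhighlight lang="python">
import math
import random

def sgd_binary_cross_entropy(data, lr=0.1, epochs=200):
    """Minimize the binary cross entropy of a linear score f(x) = w . x by SGD."""
    w = [0.0] * len(data[0][0])
    for _ in range(epochs):
        random.shuffle(data)
        for x, t in data:                       # t in {0, 1}
            p = 1.0 / (1.0 + math.exp(-sum(wi * xi for wi, xi in zip(w, x))))
            # gradient of the loss with respect to w is (sigma - t) * x
            w = [wi - lr * (p - t) * xi for wi, xi in zip(w, x)]
    return w

# Toy data (assumed for illustration): the first component acts as a bias feature
data = [([1.0, 2.0], 1), ([1.0, 3.0], 1), ([1.0, -1.5], 0), ([1.0, -2.0], 0)]
print(sgd_binary_cross_entropy(data))
</syntaxhighlight>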