Revision as of 17:46, 4 May 2021 by 198.11.28.44: update URL for Piyush PDF since old URL broken
Revision as of 17:53, 2 November 2021 by Tmlabonte: Bayes consistency is \phi'(0)<0; originally said \phi'(0)=0, which is incorrect. See Thm 2 in Bartlett et al. 2006.
A loss function is said to be ''classification-calibrated'' or ''Bayes consistent'' if its optimal <math>f^*_{\phi}</math> is such that <math>f^*_{0/1}(\vec{x}) = \operatorname{sgn}(f^*_{\phi}(\vec{x}))</math> and is thus optimal under the Bayes decision rule. A Bayes consistent loss function allows us to find the Bayes optimal decision function <math>f^*_{\phi}</math> by directly minimizing the expected risk, without having to explicitly model the probability density functions. For a convex margin loss <math>\phi(\upsilon)</math>, it can be shown that <math>\phi(\upsilon)</math> is Bayes consistent if and only if it is differentiable at 0 and <math>\phi'(0)<0</math>.<ref>{{Cite journal|last1=Bartlett|first1=Peter L.|last2=Jordan|first2=Michael I.|last3=Mcauliffe|first3=Jon D.|date=2006|title=Convexity, Classification, and Risk Bounds|journal=Journal of the American Statistical Association|volume=101|issue=473|pages=138–156|issn=0162-1459|jstor=30047445|doi=10.1198/016214505000000907|s2cid=2833811}}</ref><ref name="mit" /> This result, however, does not exclude the existence of non-convex Bayes consistent loss functions. A more general result states that Bayes consistent loss functions can be generated using the following formulation:<ref name=":0">{{Cite journal|last1=Masnadi-Shirazi|first1=Hamed|last2=Vasconcelos|first2=Nuno|date=2008|title=On the Design of Loss Functions for Classification: Theory, Robustness to Outliers, and SavageBoost|url=https://papers.nips.cc/paper/3591-on-the-design-of-loss-functions-for-classification-theory-robustness-to-outliers-and-savageboost.pdf|journal=Proceedings of the 21st International Conference on Neural Information Processing Systems|series=NIPS'08|location=USA|publisher=Curran Associates Inc.|pages=1049–1056|isbn=9781605609492}}</ref>
:<math>\phi(v)=C[f^{-1}(v)]+(1-f^{-1}(v))C'[f^{-1}(v)] \;\;\;\;\;(2)</math>
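The criterion for convex margin losses (differentiable at 0 with a negative derivative there) can be checked numerically. The following is a minimal sketch, not part of the cited sources: the function names, the finite-difference step <code>h</code> and the tolerance <code>tol</code> are illustrative assumptions, and the check is only meaningful for losses that are already known to be convex.

```python
import math

# A few convex margin losses phi(v), written as plain Python functions.
def hinge(v):
    return max(0.0, 1.0 - v)

def logistic(v):
    return math.log1p(math.exp(-v))

def exponential(v):
    return math.exp(-v)

def square(v):
    return (1.0 - v) ** 2

# Two convex losses that fail the criterion (illustrative counterexamples).
def kinked(v):
    # max(0, -v): the one-sided derivatives at 0 differ (-1 and 0).
    return max(0.0, -v)

def flat_at_zero(v):
    # v**2: differentiable at 0, but the derivative there is 0, not negative.
    return v ** 2

def one_sided_derivatives(phi, h=1e-6):
    """Approximate the left and right derivatives of phi at 0."""
    left = (phi(0.0) - phi(-h)) / h
    right = (phi(h) - phi(0.0)) / h
    return left, right

def satisfies_criterion(phi, h=1e-6, tol=1e-4):
    """Numerical stand-in for: phi differentiable at 0 and phi'(0) < 0.

    Assumes phi is convex; h and tol are hypothetical choices for this sketch.
    """
    left, right = one_sided_derivatives(phi, h)
    differentiable = abs(left - right) < tol
    slope = 0.5 * (left + right)
    return differentiable and slope < -tol

if __name__ == "__main__":
    for phi in (hinge, logistic, exponential, square, kinked, flat_at_zero):
        print(phi.__name__, satisfies_criterion(phi))
```

Under these assumptions the hinge, logistic, exponential and square losses pass the check, while the two counterexamples do not: one has a kink at 0 and the other has zero slope there.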

Loss functions for classification: Difference between revisions