where <math>H</math> indicates the [[Heaviside step function]].
However, this loss function is non-convex and non-smooth, and solving for the optimal solution is an [[NP-hard]] combinatorial optimization problem.<ref name="Utah">{{Citation | last= Rai | first= Piyush | title= Support Vector Machines (Contd.), Classification Loss Functions and Regularizers | publisher= Utah CS5350/6350: Machine Learning | date= 13 September 2011 | url= https://cis.temple.edu/~latecki/Courses/AI-Fall12/Lectures/SVM.pdf | access-date= 4 May 2021}}</ref> As a result, it is preferable to substitute '''loss function surrogates''' that are tractable for commonly used learning algorithms, as they have convenient properties such as convexity and smoothness. In addition to being computationally tractable, one can show that the solutions to the learning problem based on these loss surrogates allow recovery of the actual solution to the original classification problem.<ref name="uci">{{Citation | last= Ramanan | first= Deva | title= Lecture 14 | publisher= UCI ICS273A: Machine Learning | date= 27 February 2008 | url= http://www.ics.uci.edu/~dramanan/teaching/ics273a_winter08/lectures/lecture14.pdf | access-date= 6 December 2014}}</ref> Some of these surrogates are described below.
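
The practical benefit of a surrogate can be seen in a minimal sketch (assuming NumPy and synthetic data; the hinge loss <math>\max(0, 1 - yf(\vec{x}))</math>, a convex upper bound of the 0–1 loss, stands in for any convex surrogate): the empirical 0–1 risk is piecewise constant and provides no gradient information, whereas the surrogate risk can be minimized by subgradient descent.

<syntaxhighlight lang="python">
import numpy as np

# Synthetic data: labels y in {-1, +1}, features x in R^2 (illustrative only).
rng = np.random.default_rng(0)
n = 200
X = rng.normal(size=(n, 2))
y = np.sign(X[:, 0] + 0.5 * X[:, 1] + 0.1 * rng.normal(size=n))

def zero_one_risk(w):
    # Empirical 0-1 risk: fraction of points with y * f(x) <= 0.
    # Piecewise constant in w, so it gives no usable gradient.
    margins = y * (X @ w)
    return np.mean(margins <= 0)

def hinge_risk(w):
    # Convex surrogate: max(0, 1 - y f(x)) upper-bounds the 0-1 loss.
    margins = y * (X @ w)
    return np.mean(np.maximum(0.0, 1.0 - margins))

def hinge_subgradient(w):
    # Subgradient of the averaged hinge risk with respect to w.
    margins = y * (X @ w)
    active = (margins < 1.0).astype(float)   # points inside the margin contribute a slope
    return -(active * y) @ X / len(y)

# Subgradient descent on the surrogate risk.
w = np.zeros(2)
for t in range(1, 501):
    w -= (0.1 / np.sqrt(t)) * hinge_subgradient(w)

print("hinge (surrogate) risk:", hinge_risk(w))
print("0-1 (true) risk:       ", zero_one_risk(w))
</syntaxhighlight>

In this sketch, driving down the convex surrogate risk also tends to drive down the 0–1 risk, which is the sense in which minimizing the surrogate recovers a solution to the original classification problem.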
 
In practice, the probability distribution <math>p(\vec{x},y)</math> is unknown. Consequently, utilizing a training set of <math>n</math> [[iid|independently and identically distributed]] sample points