Regularization perspectives on support vector machines
where <math>\mathcal{H}</math> is a [[hypothesis space]]<ref>A hypothesis space is the set of functions used to model the data in a machine learning problem. Each function corresponds to a hypothesis about the structure of the data. Typically the functions in a hypothesis space form a [[Hilbert space]] of functions with norm formed from the loss function.</ref> of functions, <math>V:\mathbf Y \times \mathbf Y \to \mathbb R</math> is the loss function, <math>||\cdot||_\mathcal H</math> is a [[norm (mathematics)|norm]] on the hypothesis space of functions, and <math>\lambda\in\mathbb R</math> is the [[regularization parameter]].<ref>For insight on choosing the parameter, see, e.g., {{cite journal|last=Wahba|first=Grace|author2=Yonghua Wang |title=When is the optimal regularization parameter insensitive to the choice of the loss function|journal=Communications in Statistics - Theory and Methods|year=1990|volume=19|issue=5|pages=1685–1700|doi=10.1080/03610929008830285|url=http://www.tandfonline.com/doi/abs/10.1080/03610929008830285}}</ref>
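As a concrete illustration of this objective, the following is a minimal numerical sketch, assuming the hinge loss <math>V(y,f(x)) = \max(0, 1 - yf(x))</math> for labels <math>y \in \{-1,+1\}</math> and an empirical-mean data-fit term; the function names are illustrative rather than taken from any library.

<syntaxhighlight lang="python">
import numpy as np

def hinge_loss(y, fx):
    """Hinge loss V(y, f(x)) = max(0, 1 - y*f(x)) for labels y in {-1, +1}."""
    return np.maximum(0.0, 1.0 - y * fx)

def regularized_risk(y, fx, f_norm_sq, lam):
    """Tikhonov objective: mean empirical loss plus lam * ||f||^2_H."""
    return hinge_loss(y, fx).mean() + lam * f_norm_sq
</syntaxhighlight>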
 
When <math>\mathcal{H}</math> is a [[reproducing kernel Hilbert space]], there exists a [[kernel function]] <math>K: \mathbf X \times \mathbf X \to \mathbb R</math> whose evaluations on the training data form an <math>n\times n</math> [[symmetric]] [[Positive-definite kernel|positive definite]] [[matrix (mathematics)|matrix]] <math>\mathbf K</math> with entries <math>\mathbf K_{ij} = K(x_i,x_j)</math>. By the [[representer theorem]],<ref>See {{cite journal|last=Schölkopf|first=Bernhard|author2=Ralf Herbrich|author3=Alex Smola|title=A Generalized Representer Theorem|journal=Computational Learning Theory: Lecture Notes in Computer Science|year=2001|volume=2111|pages=416–426|doi=10.1007/3-540-44581-1_27|url=http://www.springerlink.com/content/v1tvba62hd4837h9/?MUD=MP}}</ref> the minimizer can be written as <math>f(x_i) = \sum_{j=1}^n c_j \mathbf K_{ij}</math> for some <math>c \in \mathbb R^n</math>, and <math> \|f\|^2_{\mathcal H} = \langle f,f\rangle_{\mathcal H} = \sum_{i=1}^n\sum_{j=1}^n c_ic_jK(x_i,x_j) = c^T\mathbf K c.</math>
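The sketch below illustrates these computations: it forms the Gram matrix <math>\mathbf K</math>, solves for the coefficients <math>c</math>, and evaluates <math>f(x_i)</math> and <math>c^T\mathbf K c</math>. The Gaussian RBF kernel is an assumed example, and the square loss is substituted for the hinge loss so that <math>c</math> has a closed form; with the hinge loss, <math>c</math> would instead come from the usual SVM optimization.

<syntaxhighlight lang="python">
import numpy as np

def rbf_kernel(X, Z, gamma=1.0):
    """Gaussian RBF kernel K(x, z) = exp(-gamma * ||x - z||^2), one common choice."""
    d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

# Toy data: n points in R^2 with labels y in {-1, +1}.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 2))
y = np.sign(X[:, 0] + X[:, 1])

K = rbf_kernel(X, X)   # n x n symmetric positive-definite Gram matrix
lam, n = 0.1, len(y)

# With the square loss (used here only because it admits a closed form),
# minimizing (1/n)||y - Kc||^2 + lam * c^T K c over c reduces to solving
# the linear system (K + lam * n * I) c = y.
c = np.linalg.solve(K + lam * n * np.eye(n), y)

f_at_train = K @ c     # f(x_i) = sum_j c_j K_ij
f_norm_sq = c @ K @ c  # ||f||^2_H = c^T K c
</syntaxhighlight>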
 
==Special properties of the hinge loss==
==References==
{{Reflist}}
 
*{{cite journal|last=Evgeniou|first=Theodoros|author2=Massimiliano Pontil|author3=Tomaso Poggio|title=Regularization Networks and Support Vector Machines|journal=Advances in Computational Mathematics|year=2000|volume=13|issue=1|pages=1–50|doi=10.1023/A:1018946025316|url=http://cbcl.mit.edu/projects/cbcl/publications/ps/evgeniou-reviewall.pdf}}
 
*{{cite web|last=Joachims|first=Thorsten|title=SVMlight|url=http://svmlight.joachims.org/}}