==Approximations==
Because support vector machines and other models employing the [[kernel trick]] do not scale well to large numbers of training samples or large numbers of features in the input space, several approximations to the RBF kernel (and similar kernels) have been devised.<ref>Andreas Müller (2012). [http://peekaboo-vision.blogspot.de/2012/12/kernel-approximations-for-efficient.html Kernel Approximations for Efficient SVMs (and other feature extraction methods)].</ref>
Typically, these take the form of a function ''z'' that maps a single vector to a vector of higher dimensionality, approximating the kernel:
:<math>\langle z(\mathbf{x}), z(\mathbf{x'})\rangle \approx \langle \varphi(\mathbf{x}), \varphi(\mathbf{x'})\rangle = K(\mathbf{x}, \mathbf{x'})</math>
where <math>\textstyle\varphi</math> is the implicit mapping embedded in the RBF kernel.
One way to construct such a ''z'' is to randomly sample from the [[Fourier transform]] of the kernel.<ref>Ali Rahimi and Benjamin Recht (2007). Random features for large-scale kernel machines. Neural Information Processing Systems.</ref> Another approach uses the [[Nyström method]] to approximate the [[eigendecomposition]] of the [[Gramian matrix|Gram matrix]] ''K'', using only a random sample of the training set.<ref>{{cite journal |authors=Williams, C.K.I. and Seeger, M. |title=Using the Nyström method to speed up kernel machines |journal=Advances in Neural Information Processing Systems |year=2001}}</ref>
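The random-sampling construction can be sketched in a few lines of NumPy (a minimal illustration, not the implementation from the cited paper; the dimensions, the value of the width parameter ''γ'', and all variable names are chosen for the example). For the Gaussian RBF kernel <math>K(\mathbf{x}, \mathbf{x'}) = \exp(-\gamma\|\mathbf{x} - \mathbf{x'}\|^2)</math>, the Fourier transform is itself Gaussian, so the sampled frequencies are drawn from a normal distribution:

```python
import numpy as np

rng = np.random.default_rng(0)
d, D, gamma = 5, 2000, 0.5  # input dim, feature dim, kernel width (illustrative values)

# Frequencies sampled from the Fourier transform of exp(-gamma * ||x - x'||^2),
# which is a Gaussian with covariance 2*gamma*I, plus uniform random phases.
W = rng.normal(scale=np.sqrt(2 * gamma), size=(D, d))
b = rng.uniform(0, 2 * np.pi, size=D)

def z(x):
    """Random feature map: the dot product z(x) . z(x') approximates K(x, x')."""
    return np.sqrt(2.0 / D) * np.cos(W @ x + b)

x, y = rng.normal(size=d), rng.normal(size=d)
exact = np.exp(-gamma * np.linalg.norm(x - y) ** 2)
approx = z(x) @ z(y)
```

A downstream linear model trained on the explicit features ''z''('''x''') then behaves approximately like a kernel machine using ''K'', at a cost that grows with ''D'' rather than with the number of training samples.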
==External links==