Radial basis function kernel: Difference between revisions

Content deleted Content added
Line 53:
where <math>\textstyle\varphi</math> is the implicit mapping embedded in the RBF kernel.
 
One way to construct such a ''z'' is to randomly sample from the [[Fourier transformation]] of the kernel<ref>Ali Rahimi and Benjamin Recht (2007). [http://www.eecs.berkeley.edu/~brecht/papers/07.rah.rec.nips.pdf "Random features for large-scale kernel machines"]. ''Neural Information Processing Systems''.</ref><math display="block">\varphi(x) = \frac{1}{\sqrt D}[\cos\langle w_1, x\rangle, \sin\langle w_1, x\rangle, \cdots \cos\langle w_D, x\rangle, \sin\langle w_D, x\rangle]^T</math>where <math>w_1, ..., w_D</math> are independent samples from the normal distribution <math>N(0, \sigma^2 I)</math>. The proof boils down to evaluating the integral <math>\int_{-\infty}^{\infty} \frac{\cos (k x) e^{-x^2 / 2}}{\sqrt{2 \pi}} d x=e^{-k^2 / 2}</math>.
 
where <math>w_1, ..., w_D</math> are independent samples from the normal distribution <math>N(0, \sigma^2 I)</math>.
The variance of the estimator is <math>\propto D^{-1}</math>. (Appendix A.2<ref>{{Cite journal |last=Peng |first=Hao |last2=Pappas |first2=Nikolaos |last3=Yogatama |first3=Dani |last4=Schwartz |first4=Roy |last5=Smith |first5=Noah A. |last6=Kong |first6=Lingpeng |date=2021-03-19 |title=Random Feature Attention |url=http://arxiv.org/abs/2103.02143 |journal=arXiv:2103.02143 [cs]}}</ref>).
 
Another approach uses the [[Nyström method]] to approximate the [[eigendecomposition]] of the [[Gramian matrix|Gram matrix]] ''K'', using only a random sample of the training set.<ref>{{cite journal |authors=C.K.I. Williams and M. Seeger|title=Using the Nyström method to speed up kernel machines |journal=Advances in Neural Information Processing Systems |year=2001 |volume=13 |url= http://papers.nips.cc/paper/1866-using-the-nystrom-method-to-speed-up-kernel-machines}}</ref>