where <math>\textstyle\varphi</math> is the implicit mapping embedded in the RBF kernel.
One way to construct such a ''z'' is to randomly sample from the [[Fourier transformation]] of the kernel<ref>
'''Theorem:''' <math>\mathbb E[\langle \varphi(x), \varphi(y)\rangle] = e^{-\frac{\|x-y\|^2}{2\sigma^2}}</math>.
'''Proof:''' It suffices to prove the case of <math>D=1</math>. Use the trigonometric identity <math>\cos(a-b) = \cos(a)\cos(b) + \sin(a)\sin(b)</math> and the spherical symmetry of the Gaussian distribution, then evaluate the integral <math>\int_{-\infty}^{\infty} \frac{\cos (k x) e^{-x^2 / 2}}{\sqrt{2 \pi}} d x=e^{-k^2 / 2}</math>.
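The steps of the proof can be written out as a sketch. It assumes the feature map has the form <math>\varphi(x) = D^{-1/2}[\cos\langle w_1, x\rangle, \sin\langle w_1, x\rangle, \ldots, \cos\langle w_D, x\rangle, \sin\langle w_D, x\rangle]^T</math> with <math>w_i</math> drawn independently from <math>N(0, \sigma^{-2} I)</math>; this definition is not restated in this passage and is taken here as an assumption. For <math>D=1</math> and a single frequency <math>w</math>, the trigonometric identity gives <math>\langle \varphi(x), \varphi(y)\rangle = \cos\langle w, x\rangle \cos\langle w, y\rangle + \sin\langle w, x\rangle \sin\langle w, y\rangle = \cos\langle w, x-y\rangle</math>. By spherical symmetry, <math>\langle w, x-y\rangle</math> has the same distribution as <math>k t</math> with <math>t \sim N(0,1)</math> and <math>k = \|x-y\|/\sigma</math>, so <math>\mathbb E[\cos\langle w, x-y\rangle] = \int_{-\infty}^{\infty} \frac{\cos (k t) e^{-t^2 / 2}}{\sqrt{2 \pi}} d t = e^{-k^2/2} = e^{-\frac{\|x-y\|^2}{2\sigma^2}}</math>. For general <math>D</math>, the inner product is the average of <math>D</math> such terms, so its expectation is unchanged.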
The variance of the estimator is proportional to <math>D^{-1}</math>, so it shrinks as more random features are used (see Appendix A.2 of Peng et al.<ref>{{Cite journal |last=Peng |first=Hao |last2=Pappas |first2=Nikolaos |last3=Yogatama |first3=Dani |last4=Schwartz |first4=Roy |last5=Smith |first5=Noah A. |last6=Kong |first6=Lingpeng |date=2021-03-19 |title=Random Feature Attention |url=http://arxiv.org/abs/2103.02143 |journal=arXiv:2103.02143 [cs]}}</ref>).
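The theorem and the variance scaling can be checked numerically. The following is a minimal sketch, not a reference implementation. It assumes the feature map <math>\varphi(x) = D^{-1/2}[\cos\langle w_i, x\rangle, \sin\langle w_i, x\rangle]_{i=1}^{D}</math> with <math>w_i \sim N(0, \sigma^{-2} I)</math>. All function names (<code>sample_frequencies</code>, <code>feature_map</code>, <code>estimate</code>, <code>estimator_variance</code>) are hypothetical and chosen only for illustration.

```python
import math
import random


def sample_frequencies(D, dim, sigma, rng):
    """Draw D frequency vectors w_i ~ N(0, sigma^-2 I) in R^dim (assumed sampling scheme)."""
    return [[rng.gauss(0.0, 1.0 / sigma) for _ in range(dim)] for _ in range(D)]


def feature_map(x, freqs):
    """Assumed feature map: D^{-1/2} [cos<w_1,x>, sin<w_1,x>, ..., cos<w_D,x>, sin<w_D,x>]."""
    scale = 1.0 / math.sqrt(len(freqs))
    features = []
    for w in freqs:
        t = sum(wi * xi for wi, xi in zip(w, x))
        features.append(scale * math.cos(t))
        features.append(scale * math.sin(t))
    return features


def rbf_kernel(x, y, sigma):
    """Exact RBF kernel exp(-||x - y||^2 / (2 sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def estimate(x, y, D, sigma, rng):
    """One draw of the random-feature estimator <phi(x), phi(y)>."""
    freqs = sample_frequencies(D, len(x), sigma, rng)
    phi_x = feature_map(x, freqs)
    phi_y = feature_map(y, freqs)
    return sum(a * b for a, b in zip(phi_x, phi_y))


def estimator_variance(x, y, D, sigma, rng, trials=2000):
    """Sample variance of the estimator over independent draws of the frequencies."""
    values = [estimate(x, y, D, sigma, rng) for _ in range(trials)]
    mean = sum(values) / trials
    return sum((v - mean) ** 2 for v in values) / (trials - 1)


if __name__ == "__main__":
    rng = random.Random(0)
    x, y, sigma = [0.3, -0.5], [1.0, 0.2], 1.0
    print("exact kernel:      ", rbf_kernel(x, y, sigma))
    print("estimate (D=5000): ", estimate(x, y, 5000, sigma, rng))
    # Quadrupling D should cut the variance by roughly a factor of four.
    print("variance at D=10:  ", estimator_variance(x, y, 10, sigma, rng))
    print("variance at D=40:  ", estimator_variance(x, y, 40, sigma, rng))
```

With a large <math>D</math> the estimate should land close to the exact kernel value, and the two variance figures should differ by a factor of about four, in line with the <math>D^{-1}</math> scaling.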