Reproducing kernel Hilbert space: Difference between revisions

Content deleted Content added
Provided some clarification to the statement in the introduction about pointwise convergence not implying convergence under the norm of the reproducing kernel Hilbert space. However, the example I gave does not actually apply to reproducing kernel Hilbert spaces [I used the supremum norm as my example, but the supremum norm cannot be induced by any inner product]. It would be much better if someone had an actual Hilbert space example - I just didn't want perfect to be the enemy of the good here.
m (1) Corrected the missing scaling factor \sigma in the norm for the Laplacian kernel. (2) Changed the usage of the ReLU function, as the previous formulation ReLU(x) was redundant because x >= 0 on the domain.
Line 210:
*::<math> K(x,y) = e^{-\frac{\|x - y\|}{\sigma}}, \qquad \sigma > 0 </math>
*:The squared norm of a function <math>f</math> in the RKHS <math>H</math> with this kernel is:<ref>Berlinet, Alain and Thomas-Agnan, Christine. ''[https://books.google.com/books?hl=en&lr=&id=bX3TBwAAQBAJ&oi=fnd&pg=PP11&dq=%22Reproducing+kernel+Hilbert+spaces+in+Probability+and+Statistics%22&ots=jV1gYX6vJ5&sig=um-eULpDSuKtXcYhzTYXwX8ZZzA#v=onepage&q=%22Reproducing%20kernel%20Hilbert%20spaces%20in%20Probability%20and%20Statistics%22&f=false Reproducing kernel Hilbert spaces in Probability and Statistics]'', Kluwer Academic Publishers, 2004</ref>
*::<math>\|f\|_H^2 = \frac{1}{2} \int_{\mathbb R} \Big( \frac{1}{\sigma} f(x)^2 + \sigma f'(x)^2 \Big) \,\mathrm{d}x.</math>
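*:A quick numerical sanity check of this norm is possible via the reproducing property, which forces <math>\|K_y\|_H^2 = K(y,y) = 1</math>. The sketch below (an illustration, not from the cited reference) evaluates the squared norm <math>\tfrac12\int_{\mathbb R}(f(x)^2/\sigma + \sigma f'(x)^2)\,\mathrm dx</math> for <math>f = K_0</math>, i.e. <math>f(x) = e^{-|x|/\sigma}</math>, by trapezoidal quadrature:

```python
import numpy as np

# Sanity check of the assumed Laplacian-kernel RKHS norm
#   ||f||_H^2 = (1/2) * Integral( f(x)^2 / sigma + sigma * f'(x)^2 ) dx.
# The reproducing property forces ||K_y||_H^2 = K(y, y) = 1, which we
# verify numerically for f = K_0, i.e. f(x) = exp(-|x| / sigma).

sigma = 0.7

# f is even, so integrate over (0, 40] and double; the tail beyond 40
# is negligible because f(40)^2 = exp(-2 * 40 / sigma) is tiny.
x = np.linspace(1e-8, 40.0, 400_001)
f = np.exp(-x / sigma)
fp = -f / sigma                      # f'(x) for x > 0

integrand = f**2 / sigma + sigma * fp**2
dx = x[1] - x[0]
# Composite trapezoid rule: full sum minus half of the two endpoints.
half_line = dx * (integrand.sum() - 0.5 * (integrand[0] + integrand[-1]))
norm_sq = 0.5 * (2.0 * half_line)    # the 1/2 prefactor, doubled for symmetry

print(norm_sq)                        # close to K(0, 0) = 1
```

*:Without the factor <math>\tfrac12</math> the same computation would return 2 rather than <math>K(0,0)=1</math>, which is why the prefactor is needed.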
 
===[[Bergman kernel]]s===
Line 295:
This implies <math>K_y=K(\cdot, y)</math> reproduces <math>f</math>.
 
Moreover, the minimum function on <math> X\times X = [0,\infty)\times [0,\infty) </math> has the following representations with the ReLU function:
 
: <math> \min(x,y) = x -\operatorname{ReLU}(x-y) = y - \operatorname{ReLU}(y-x). </math>
where the ReLU function is given by
: <math>\operatorname{ReLU}(x)=
\begin{cases}
x, & \text{if } x\geq 0\\
0, & \text{otherwise.}
\end{cases}</math>
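The two representations above can be checked directly; the following short sketch (an illustration, not part of the article) verifies <math>\min(x,y) = x - \operatorname{ReLU}(x-y) = y - \operatorname{ReLU}(y-x)</math> on a grid of nonnegative points:

```python
# Check the two ReLU representations of the minimum on [0, inf) x [0, inf):
#   min(x, y) = x - ReLU(x - y) = y - ReLU(y - x)

def relu(t: float) -> float:
    """Rectified linear unit: max(t, 0)."""
    return t if t > 0 else 0.0

def min_via_relu(x: float, y: float) -> float:
    return x - relu(x - y)

# Exact binary fractions, so the identities hold without rounding error.
grid = [0.0, 0.5, 1.0, 2.0, 3.75]
ok = all(
    min_via_relu(x, y) == min(x, y) == y - relu(y - x)
    for x in grid
    for y in grid
)
print(ok)  # True
```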
 
Using this formulation, one can apply the [[representer theorem]] to this RKHS, which can be used to argue for the optimality of ReLU activations in neural network settings.{{Citation needed|date=January 2022|reason=Optimal in what sense?}}