Reproducing kernel Hilbert space: Difference between revisions

Browse history interactively

← Previous edit

Content deleted Content added

VisualWikitext

Revision as of 07:26, 29 April 2025 edit Luca Innocenti (talk \| contribs) 454 edits m notational consistency Tag: Visual edit ← Previous edit		Latest revision as of 07:22, 17 August 2025 edit undo DrinksOrCoffee (talk \| contribs) Extended confirmed users 1,466 edits Link suggestions feature: 2 links added. Tags: Visual edit Mobile edit Mobile web edit Advanced mobile edit Newcomer task Suggested: add links
(8 intermediate revisions by 4 users not shown)
Line 4: In [[functional analysis]], a '''reproducing kernel Hilbert space''' ('''RKHS''') is a [[Hilbert space]] of functions in which point evaluation is a continuous [[linear functional]]. Specifically, a Hilbert space <math>H</math> of functions from a set <math>X</math> (to <math>\mathbb{R}</math> or <math>\mathbb{C}</math>) is an RKHS if the point-evaluation functional <math>L_x:H\to\mathbb{C}</math>, <math>L_x(f)=f(x)</math>, is continuous for every <math>x\in X</math>. Equivalently, <math>H</math> is an RKHS if there exists a function <math>K_x \in H</math> such that, for all <math>f \in H</math>,<math display="block">\langle f, K_x \rangle = f(x).</math>The function <math>K_x</math> is then called the ''reproducing kernel'', and it reproduces the value of <math>f</math> at <math>x</math> via the inner product. An immediate consequence of this property is that convergence in norm implies [[uniform convergence]] on any subset of <math>X</math> on which <math>\\|K_x\\|</math> is bounded. However, the converse does not necessarily hold. Often the set <math>X</math> carries a topology, and <math>\\|K_x\\|</math> depends continuously on <math>x\in X</math>, in which case: convergence in norm implies uniform convergence on compact subsets of <math>X</math>. It is not entirely straightforward to construct natural examples of a Hilbert space which are not an RKHS in a non-trivial fashion.<ref>Alpay, D., and T. M. Mills. "A family of Hilbert spaces which are not reproducing kernel Hilbert spaces." J. Anal. Appl. 1.2 (2003): 107–111.</ref> Some examples, however, have been found.<ref> Z. Pasternak-Winiarski, "On weights which admit reproducing kernel of Bergman type", ''International Journal of Mathematics and Mathematical Sciences'', vol. 15, Issue 1, 1992. </ref><ref> T. Ł. Żynda, "On weights which admit reproducing kernel of Szegő type", ''Journal of Contemporary Mathematical Analysis'' (Armenian Academy of Sciences), 55, 2020. </ref> Line 12: An RKHS is associated with a kernel that reproduces every function in the space in the sense that for every <math>x</math> in the set on which the functions are defined, "evaluation at <math>x</math>" can be performed by taking an inner product with a function determined by the kernel. Such a ''reproducing kernel'' exists if and only if every evaluation functional is continuous. The reproducing kernel was first introduced in the 1907 work of [[Stanisław Zaremba (mathematician)\|Stanisław Zaremba]]{{fact\|date=June 2025}} concerning [[boundary value problem]]s for [[Harmonic function\|harmonic]] and [[Biharmonic equation\|biharmonic functions]]. [[James Mercer (mathematician)\|James Mercer]] simultaneously examined [[Positive-definite kernel\|functions]] which satisfy the reproducing property in the theory of [[integral equation]]s. The idea of the reproducing kernel remained untouched for nearly twenty years until it appeared in the dissertations of [[Gábor Szegő]], [[Stefan Bergman]], and [[Salomon Bochner]]. The subject was eventually systematically developed in the early 1950s by [[Nachman Aronszajn]] and Stefan Bergman.<ref>Okutmustur</ref> These spaces have wide applications, including [[complex analysis]], [[harmonic analysis]], and [[quantum mechanics]]. Reproducing kernel Hilbert spaces are particularly important in the field of [[statistical learning theory]] because of the celebrated [[representer theorem]] which states that every function in an RKHS that minimises an empirical risk functional can be written as a [[linear combination]] of the kernel function evaluated at the training points. This is a practically useful result as it effectively simplifies the [[empirical risk minimization]] problem from an infinite dimensional to a finite dimensional optimization problem. Line 103: :'''Theorem'''. Suppose ''K'' is a symmetric, [[positive definite kernel]] on a set ''X''. Then there is a unique Hilbert space of functions on ''X'' for which ''K'' is a reproducing kernel. '''Proof'''. For all ''x'' in ''X'', define ''K<sub>x</sub>'' = ''K''(''x'', ⋅ ). Let ''H''<sub>0</sub> be the [[linear span]] of {''K<sub>x</sub>'' : ''x'' ∈ ''X''}. Define an inner product on ''H''<sub>0</sub> by :<math> \left\langle \sum_{j=1}^n b_j K_{y_j}, \sum_{i=1}^m a_i K_{x_i} \right \rangle_{H_0} = \sum_{i=1}^m \sum_{j=1}^n {a_i} b_j K(y_j, x_i),</math> Line 134: We may characterize a symmetric positive definite kernel <math>K</math> via the integral operator using [[Mercer's theorem]] and obtain an additional view of the RKHS. Let <math>X</math> be a compact space equipped with a strictly positive finite [[Borel measure]] <math>\mu</math> and <math>K: X \times X \to \R</math> a continuous, symmetric, and positive definite function. Define the integral operator <math>T_K: L_2(X) \to L_2(X)</math> as :<math> [T_K f](\cdot) =\int_X K({}\cdot{},t) f(t)\, d\mu(t) </math> where <math>L_2(X)</math> is the space of square integrable functions with respect to <math> \mu </math>. Line 140: Mercer's theorem states that the spectral decomposition of the integral operator <math>T_K</math> of <math>K</math> yields a series representation of <math>K</math> in terms of the eigenvalues and eigenfunctions of <math> T_K </math>. This then implies that <math>K</math> is a reproducing kernel so that the corresponding RKHS can be defined in terms of these eigenvalues and eigenfunctions. We provide the details below. Under these assumptions <math>T_K</math> is a compact, continuous, self-adjoint, and positive operator. The [[spectral theorem]] for self-adjoint operators implies that there is an at most countable decreasing sequence <math>(\sigma_i)_i_{i \geq 0} </math> such that <math display="inline">\lim_{i \to \infty}\sigma_i = 0</math> and <math>T_K\varphi_i(x) = \sigma_i\varphi_i(x)</math>, where the <math>\{\varphi_i\}</math> form an orthonormal basis of <math>L_2(X)</math>. By the positivity of <math>T_K, \sigma_i > 0</math> for all <math>i.</math> One can also show that <math>T_K </math> maps continuously into the space of continuous functions <math>C(X)</math> and therefore we may choose continuous functions as the eigenvectors, that is, <math>\varphi_i \in C(X)</math> for all <math>i.</math> Then by Mercer's theorem <math> K </math> may be written in terms of the eigenvalues and continuous eigenfunctions as Line 159: :<math> \left\langle f,g \right\rangle_H = \sum_{i=1}^\infty \frac{\left\langle f,\varphi_i \right\rangle_{L_2}\left\langle g,\varphi_i \right\rangle_{L_2}}{\sigma_i}. </math> This representation of the RKHS has application in probability and statistics, for example to the [[Karhunen–Loève theorem\|~~Karhunen-Loève~~Karhunen–Loève representation]] for stochastic processes and [[kernel PCA]]. ==Feature maps==