The form of the kernel {{math|Γ}} induces both the representation of the [[feature space]] and the structure of the output across tasks. A natural simplification is to choose a ''separable kernel,'' which factors into separate kernels on the input space {{mathcal|X}} and on the tasks <math> \{1,...,T\} </math>. In this case the kernel relating scalar components <math> f_t </math> and <math> f_s </math> is given by <math display="inline"> \gamma((x_i,t),(x_j,s)) = k(x_i,x_j)k_T(s,t)=k(x_i,x_j)A_{s,t} </math>. For vector-valued functions <math> f\in \mathcal H </math> we can write <math>\Gamma(x_i,x_j)=k(x_i,x_j)A</math>, where {{mvar|k}} is a scalar reproducing kernel and {{mvar|A}} is a symmetric positive semi-definite <math>T\times T</math> matrix. Henceforth, denote the set of symmetric positive semi-definite <math>T\times T</math> matrices by <math> S_+^T \subset \mathbb R^{T \times T} </math>.
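A minimal sketch (not part of the article) of evaluating a separable matrix-valued kernel <math>\Gamma(x,x')=k(x,x')A</math>, assuming a Gaussian scalar kernel {{mvar|k}} and an illustrative coupling matrix {{mvar|A}}; the function names and parameter values are hypothetical.

<syntaxhighlight lang="python">
import numpy as np

def scalar_kernel(x, x_prime, length_scale=1.0):
    """Gaussian (RBF) scalar kernel k(x, x') -- an illustrative choice."""
    diff = np.asarray(x, dtype=float) - np.asarray(x_prime, dtype=float)
    return np.exp(-np.dot(diff, diff) / (2.0 * length_scale ** 2))

def separable_kernel(x, x_prime, A, length_scale=1.0):
    """Matrix-valued kernel Gamma(x, x') = k(x, x') A, a T x T matrix."""
    return scalar_kernel(x, x_prime, length_scale) * A

# Example: T = 2 tasks, with A a symmetric PSD matrix encoding task coupling.
A = np.array([[1.0, 0.5],
              [0.5, 1.0]])
Gamma = separable_kernel([0.0, 1.0], [0.5, 0.5], A)
print(Gamma)  # 2 x 2 matrix equal to k(x, x') * A
</syntaxhighlight>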
This factorization property, separability, implies that the input feature-space representation does not vary by task; that is, there is no interaction between the input kernel and the task kernel. The structure on tasks is represented solely by {{mvar|A}}. Methods for non-separable kernels {{math|Γ}} are a current field of research.
For the separable case, the representer theorem reduces to <math display="inline">f(x)=\sum _{i=1} ^N k(x,x_i)Ac_i</math>. The model output on the training data is then {{mvar|KCA}}, where {{mvar|K}} is the <math>N \times N</math> empirical kernel matrix with entries <math display="inline">K_{i,j}=k(x_i,x_j)</math>, and {{mvar|C}} is the <math>N \times T</math> matrix whose rows are the <math>c_i</math>.
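A minimal sketch (not part of the article) of the separable-case representer theorem: it evaluates <math display="inline">f(x)=\sum_i k(x,x_i)Ac_i</math> at a new point and computes the training-data outputs {{mvar|KCA}}. The coefficient matrix {{mvar|C}} is assumed given here; in practice it would be produced by a learning algorithm such as regularized least squares. All concrete values and helper names are illustrative.

<syntaxhighlight lang="python">
import numpy as np

def rbf(x, y, length_scale=1.0):
    """Gaussian scalar kernel k(x, y) -- an illustrative choice."""
    d = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    return np.exp(-np.dot(d, d) / (2.0 * length_scale ** 2))

def predict(x, X_train, C, A, length_scale=1.0):
    """Evaluate f(x) = sum_i k(x, x_i) A c_i; returns a length-T vector."""
    scalars = np.array([rbf(x, xi, length_scale) for xi in X_train])  # (N,)
    # sum_i k(x, x_i) A c_i  =  A (C^T scalars)
    return A @ (C.T @ scalars)

# Toy data: N = 3 training inputs, T = 2 tasks.
X_train = np.array([[0.0], [0.5], [1.0]])
C = np.array([[ 0.2, -0.1],
              [ 0.4,  0.3],
              [-0.5,  0.1]])        # N x T coefficient matrix (rows c_i)
A = np.array([[1.0, 0.5],
              [0.5, 1.0]])          # T x T symmetric PSD task matrix

# Empirical kernel matrix K and model outputs on the training data: K C A.
K = np.array([[rbf(xi, xj) for xj in X_train] for xi in X_train])  # N x N
train_outputs = K @ C @ A           # N x T matrix of model outputs

print(predict(np.array([0.25]), X_train, C, A))
print(train_outputs)
</syntaxhighlight>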