Multiple kernel learning

This is an old revision of this page, as edited by Tamhok (talk | contribs) at 18:57, 8 December 2014. The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.


Multiple kernel learning refers to a set of machine learning methods that use a predefined set of kernels and learn an optimal linear or non-linear combination of kernels as part of the algorithm. Reasons to use multiple kernel learning include a) the ability to select an optimal kernel and parameters from a larger set of kernels, reducing bias due to kernel selection while allowing for more automated machine learning methods, and b) the ability to combine data from different sources (e.g. sound and images from a video) that have different notions of similarity and thus require different kernels. Instead of creating a new kernel, multiple kernel algorithms can be used to combine kernels already established for each individual data source.

Multiple kernel learning algorithms have been developed for supervised, semi-supervised, and unsupervised learning. Most work has been done on the supervised learning case with linear combinations of kernels. The basic idea behind multiple kernel learning algorithms is as follows: we begin with a set of kernels $K = \{K_1, K_2, \dots, K_n\}$. In the linear case, we introduce a new kernel $K' = \sum_{i=1}^n \beta_i K_i$, where $\beta$ is a vector of coefficients for each kernel. For a set of data $X$ with labels $Y$, the minimization problem can then be written as

$$\min_{\beta, c} \mathrm{E}(Y, K'c) + R(K, c)$$

where $\mathrm{E}$ is an error function and $R$ is a regularization term. $\mathrm{E}$ is typically the square loss function (Tikhonov regularization) or the hinge loss function (for SVM algorithms), and $R$ is usually an $\ell_n$ norm or some combination of the norms (i.e. elastic net regularization).
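The linear combination step above can be sketched in a few lines of NumPy. This is a minimal illustration, not a full MKL algorithm: the weights $\beta$ are fixed by hand rather than learned, and the two base kernels (linear and Gaussian RBF) and their parameters are arbitrary choices made for the example. It shows the one property the formulation relies on: a nonnegative combination of valid kernels yields a Gram matrix that is still symmetric and positive semidefinite, i.e. still a valid kernel.

```python
import numpy as np

def linear_kernel(X):
    # Gram matrix of the linear kernel K(x, y) = x . y
    return X @ X.T

def rbf_kernel(X, gamma=0.5):
    # Gram matrix of the Gaussian (RBF) kernel
    # K(x, y) = exp(-gamma * ||x - y||^2)
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2 * X @ X.T
    return np.exp(-gamma * d2)

rng = np.random.default_rng(0)
X = rng.normal(size=(6, 3))  # toy data: 6 points in 3 dimensions

# K' = sum_i beta_i * K_i with hand-picked nonnegative weights
# (a real MKL method would learn beta as part of the optimization)
beta = np.array([0.3, 0.7])
kernels = [linear_kernel(X), rbf_kernel(X)]
K_combined = sum(b * K for b, K in zip(beta, kernels))

# The combined Gram matrix is symmetric and positive semidefinite,
# so it can be fed to any kernel method (e.g. an SVM) as-is.
assert np.allclose(K_combined, K_combined.T)
assert np.all(np.linalg.eigvalsh(K_combined) > -1e-9)
```

In practice the combined Gram matrix could be passed directly to a solver that accepts precomputed kernels, with the weights updated by the chosen MKL optimization scheme.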

MKL Libraries

Available MKL libraries include