Multiple kernel learning

This is an old revision of this page, as edited by Tamhok (talk | contribs) at 19:11, 8 December 2014 (Algorithms). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

This sandbox is in the article namespace. Either move this page into your userspace, or remove the {{User sandbox}} template.

Multiple kernel learning refers to a set of machine learning methods that use a predefined set of kernels and learn an optimal linear or non-linear combination of kernels as part of the algorithm. Reasons to use multiple kernel learning include a) the ability to select for an optimal kernel and parameters from a larger set of kernels, reducing bias due to kernel selection while allowing for more automated machine learning methods, and b) combining data from different sources (e.g. sound and images from a video) that have different notions of similarity and thus require different kernels. Instead of creating a new kernel, multiple kernel algorithms can be used to combine kernels already established for each individual data source.

Algorithms

Multiple kernel learning algorithms have been developed for supervised, semi-supervised, as well as unsupervised learning. Most work has been done on the supervised learning case with linear combinations of kernels. The basic idea behind multiple kernel learning algorithms is as follows: we begin with a set of   kernels  . In the linear case, we introduce a new kernel  , where   is a vector of coefficients for each kernel. For a set of data   with labels  , the minimization problem can then be written as

 

where   is an error function and   is a regularization term.   is typically the square loss function (Tikhonov regularization) or the hinge loss function (for SVM algorithms), and   is usually an   norm or some combination of the norms (i.e. elastic net regularization).

For supervised learning, there are many other algorithms that use different methods to learn the form of the kernel. The following categorization has been proposed by Gonen and Alpaydın (2011) [1] 1. Fixed rules, such as the linear combination algorithm described above. These do not require parameterization and use rules like summation and multiplication to combine the kernels. The weighting is learned in the algorithm. 2.

For more information on these methods, see Gonen and Alpaydın (2011) [1]

MKL Libraries

Available MKL libraries include