===Unsupervised learning===
[[Unsupervised learning|Unsupervised]] multiple kernel learning algorithms have also been proposed by Zhuang et al. The problem is defined as follows. Let <math>U=\{x_i\}</math> be a set of unlabeled data. The kernel definition is the linear combined kernel <math>K'=\sum_{m=1}^M\beta_mK_m</math>. In this problem, the data needs to be "clustered" into groups based on the kernel distances. Let <math>B_i</math> be the group or cluster of which <math>x_i</math> is a member. We define the loss function as <math>\sum^n_{i=1}\left\Vert x_i - \sum_{x_j\in B_i} K(x_i,x_j)x_j\right\Vert^2</math>. Furthermore, we minimize the distortion by minimizing <math>\sum_{i=1}^n\sum_{x_j\in B_i}K(x_i,x_j)\left\Vert x_i - x_j \right\Vert^2</math>. Finally, we add a regularization term to avoid overfitting. Combining these terms, we can write the minimization problem as follows.
 
:<math>\min_{\beta,B}\sum^n_{i=1}\left\Vert x_i - \sum_{x_j\in B_i} K(x_i,x_j)x_j\right\Vert^2 + \gamma_1\sum_{i=1}^n\sum_{x_j\in B_i}K(x_i,x_j)\left\Vert x_i - x_j \right\Vert^2 + \gamma_2\sum_i |B_i|</math>
 
where <math>B_i</math> is defined as a set of local bases for each <math>x_i</math>. One formulation of this is defined as follows. Let <math>D\in \{0,1\}^{n\times n}</math> be a matrix such that <math>D_{ij}=1</math> means that <math>x_i</math> and <math>x_j</math> are neighbors. Then <math>B_i=\{x_j:D_{ij}=1\}</math>. Note that these groups must be learned as well. Zhuang et al. solve this problem by an alternating minimization method over <math>K</math> and the groups <math>B_i</math>; a schematic version of the objective and the alternation is sketched below. For more information, see Zhuang et al.<ref>J. Zhuang, J. Wang, S. C. H. Hoi & X. Lan. [http://jmlr.csail.mit.edu/proceedings/papers/v20/zhuang11/zhuang11.pdf Unsupervised Multiple Kernel Learning]. Jour. Mach. Learn. Res. 20:129–144, 2011.</ref>
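
To make the objective concrete, the following is a minimal NumPy sketch that evaluates the combined kernel <math>K'</math> and the three terms of the minimization problem for fixed weights <math>\beta</math> and fixed groups <math>B_i</math>, with each <math>B_i</math> encoded as a boolean row of the neighbor matrix <math>D</math>. The function names and data layout are illustrative assumptions, not code from Zhuang et al.

<syntaxhighlight lang="python">
import numpy as np

def combined_kernel(betas, kernels):
    # K' = sum_m beta_m * K_m: linear combination of the M base kernel matrices
    return sum(b * K for b, K in zip(betas, kernels))

def umkl_objective(X, betas, kernels, D, gamma1, gamma2):
    """Evaluate the unsupervised-MKL objective for fixed weights and groups.

    X       : (n, d) array, one row per data point x_i
    betas   : (M,) kernel weights
    kernels : list of M precomputed (n, n) base kernel matrices K_m
    D       : (n, n) boolean array; D[i, j] = True iff x_j is in B_i
    """
    K = combined_kernel(betas, kernels)
    recon = distort = 0.0
    size = 0
    for i in range(X.shape[0]):
        idx = np.flatnonzero(D[i])                 # members of the group B_i
        w = K[i, idx]                              # the weights K(x_i, x_j)
        # reconstruction: || x_i - sum_{x_j in B_i} K(x_i, x_j) x_j ||^2
        recon += np.sum((X[i] - w @ X[idx]) ** 2)
        # distortion: sum_{x_j in B_i} K(x_i, x_j) ||x_i - x_j||^2
        distort += np.sum(w * np.sum((X[i] - X[idx]) ** 2, axis=1))
        size += idx.size                           # |B_i| regularization term
    return recon + gamma1 * distort + gamma2 * size
</syntaxhighlight>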
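
The alternating structure can then be sketched as below, reusing the two helpers above. The greedy nearest-neighbor update for <math>D</math> and the coarse search over the weight simplex for <math>\beta</math> are simplified stand-ins for the actual subroutines of Zhuang et al.; they only illustrate how the two sets of variables are updated in turn.

<syntaxhighlight lang="python">
def alternate_minimize(X, kernels, k=5, gamma1=1.0, gamma2=1.0, n_iter=10):
    """Schematic alternating minimization over (beta, D)."""
    n, M = X.shape[0], len(kernels)
    betas = np.full(M, 1.0 / M)            # start from uniform kernel weights
    for _ in range(n_iter):
        # Step 1: fix betas and rebuild each group B_i as the k points
        # with the largest combined-kernel value to x_i (a heuristic).
        K = combined_kernel(betas, kernels)
        D = np.zeros((n, n), dtype=bool)
        for i in range(n):
            D[i, np.argsort(-K[i])[:k]] = True
        # Step 2: fix D and move betas toward the best candidate;
        # averaging with a one-hot vector keeps betas on the simplex.
        candidates = [betas] + [0.5 * (betas + np.eye(M)[m]) for m in range(M)]
        scores = [umkl_objective(X, b, kernels, D, gamma1, gamma2)
                  for b in candidates]
        betas = candidates[int(np.argmin(scores))]
    return betas, D
</syntaxhighlight>

Given <math>M</math> precomputed Gram matrices (for example, several RBF kernels with different bandwidths), <code>alternate_minimize(X, kernels)</code> returns the learned weights and neighborhoods. Note that step 2 can never increase the objective, since the current weights remain among the candidates, while the neighbor update in step 1 is purely heuristic.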
 
==MKL Libraries==