===Semisupervised learning===
[[Semisupervised learning]] approaches to multiple kernel learning are similar to other extensions of supervised learning approaches. An inductive procedure has been developed for image categorization that uses a log-likelihood empirical loss and group LASSO regularization with conditional expectation consensus on unlabeled data. The decision function takes the form
:<math>f(x)=\alpha_0+\sum_{i=1}^{|L|}\alpha_iK_i(x)</math>
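A minimal sketch of evaluating such a decision function is shown below. It assumes that <math>K_i(x)</math> denotes the combined kernel evaluated between <math>x</math> and the <math>i</math>-th labeled example; the function and variable names are illustrative and not taken from the cited method.
<syntaxhighlight lang="python">
import numpy as np

def decision_function(alpha0, alpha, kernel, X_labeled, x):
    # f(x) = alpha_0 + sum_{i=1}^{|L|} alpha_i * K(x_i, x), where X_labeled
    # holds the |L| labeled points and `kernel` is the combined kernel.
    return alpha0 + sum(a_i * kernel(x_i, x) for a_i, x_i in zip(alpha, X_labeled))

# Illustrative usage with an RBF kernel standing in for the combined kernel.
rbf = lambda u, v: np.exp(-np.sum((u - v) ** 2))
X_labeled = np.array([[0.0, 0.0], [1.0, 1.0]])
alpha = np.array([0.5, -0.3])
print(decision_function(0.1, alpha, rbf, X_labeled, np.array([0.5, 0.5])))
</syntaxhighlight>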
The problem can be written as
:<math>\min_f L(f) + \lambda R(f) + \gamma\Theta(f)</math>
where <math>L</math> is the loss function (a weighted negative log-likelihood in this case), <math>R</math> is the regularizer ([[Proximal_gradient_methods_for_learning#Exploiting_group_structure|group LASSO]] in this case), <math>\Theta</math> is the conditional expectation consensus (CEC) penalty on unlabeled data, and <math>\lambda</math> and <math>\gamma</math> are trade-off parameters. The CEC penalty is defined as follows. Let the marginal kernel density (MKD) for all the data be
:<math>g^{\pi}_m(x)=\langle\phi^{\pi}_m,\psi_m(x)\rangle</math>
where <math>\psi_m(x)=[K_m(x_1,x),\ldots,K_m(x_L,x)]^T</math> (the kernel distances between the labeled data and all of the labeled and unlabeled data) and <math>\phi^{\pi}_m</math> is a non-negative random vector with a 2-norm of 1. The value of <math>\Pi</math> is the number of times each kernel is projected. Expectation regularization is then performed on the MKD, resulting in a reference expectation <math>q^{\pi}_m(y|g^{\pi}_m(x))</math> and model expectation <math>p^{\pi}_m(f(x)|g^{\pi}_m(x))</math>. Then, we define
:<math>\Theta=\frac{1}{\Pi}\sum^{\Pi}_{\pi=1}\sum^{M}_{m=1}D(q^{\pi}_m(y|g^{\pi}_m(x))\|p^{\pi}_m(f(x)|g^{\pi}_m(x)))</math>
<ref>Wang, Shuhui; et al. [http://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=6177671 S3MKL: Scalable Semi-Supervised Multiple Kernel Learning for Real-World Image Applications]. IEEE Transactions on Multimedia, Vol. 14, No. 4, August 2012.</ref>
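The sketch below illustrates how the three terms of the objective might be assembled. It assumes a precomputed weighted negative log-likelihood, a group-LASSO penalty over per-kernel coefficient groups, and a KL divergence standing in for the divergence <math>D</math> in the CEC penalty; all names and shapes are illustrative only, not the cited implementation.
<syntaxhighlight lang="python">
import numpy as np

def objective(alpha_groups, nll_loss, q_ref, p_model, lam, gamma):
    # Sketch of L(f) + lambda * R(f) + gamma * Theta(f).
    # alpha_groups   : per-kernel coefficient vectors (the group-LASSO groups)
    # nll_loss       : weighted negative log-likelihood on the labeled data
    # q_ref, p_model : paired reference / model expectation distributions,
    #                  one pair per (projection, kernel) combination

    # Group LASSO: sum of the l2 norms of each kernel's coefficient group.
    group_lasso = sum(np.linalg.norm(a_m) for a_m in alpha_groups)

    # CEC-style penalty: mean divergence between reference and model
    # expectations (KL divergence is used here purely for illustration).
    def kl(q, p, eps=1e-12):
        q, p = np.asarray(q) + eps, np.asarray(p) + eps
        return float(np.sum(q * np.log(q / p)))

    cec = np.mean([kl(q, p) for q, p in zip(q_ref, p_model)])

    return nll_loss + lam * group_lasso + gamma * cec

# Illustrative usage with toy values.
alpha_groups = [np.array([0.5, -0.3]), np.array([0.2])]
q_ref = [[0.7, 0.3]]
p_model = [[0.6, 0.4]]
print(objective(alpha_groups, nll_loss=1.2, q_ref=q_ref, p_model=p_model,
                lam=0.1, gamma=0.5))
</syntaxhighlight>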
===Unsupervised learning===