Multiple kernel learning

 
===Semisupervised learning===
[[Semi-supervised learning]] approaches to multiple kernel learning are similar to other extensions of supervised learning approaches. An inductive procedure has been developed that uses a log-likelihood empirical loss and group LASSO regularization with conditional expectation consensus on unlabeled data for image categorization.<ref>Wang, Shuhui et al. [http://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=6177671 S3MKL: Scalable Semi-Supervised Multiple Kernel Learning for Real-World Image Applications]. IEEE Transactions on Multimedia, Vol. 14, No. 4, August 2012.</ref> The problem can be defined as follows. Let <math>L=\{(x_i,y_i)\}</math> be the labeled data, and let <math>U=\{x_i\}</math> be the set of unlabeled data. The decision function can then be written as
 
:<math>f(x)=\alpha_0+\sum_{i=1}^{|L|}\alpha_iK_i(x)</math>
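
A minimal sketch of evaluating this decision function in Python, assuming the kernel values <math>K_i(x)</math> between the query point and the labeled examples have already been computed; the function and variable names here are illustrative, not taken from the cited paper:

<syntaxhighlight lang="python">
import numpy as np

def decision_function(alpha0, alpha, k_x):
    """Evaluate f(x) = alpha_0 + sum_i alpha_i * K_i(x).

    alpha0 : float, the bias term alpha_0
    alpha  : array of shape (|L|,), the coefficients alpha_i
    k_x    : array of shape (|L|,), the kernel values K_i(x)
             between each labeled point x_i and the query x
    """
    return alpha0 + float(np.dot(alpha, k_x))
</syntaxhighlight>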
 
The problem can be written as
 
:<math>\min_f L(f) + \lambda R(f) + \gamma\Theta(f)</math>
 
where <math>L</math> is the loss function (a weighted negative log-likelihood in this case), <math>R</math> is the regularization term ([[Proximal_gradient_methods_for_learning#Exploiting_group_structure|group LASSO]] in this case), and <math>\Theta</math> is the conditional expectation consensus (CEC) penalty on unlabeled data. The CEC penalty is defined as follows. Let the marginal kernel density (MKD) for all the data be
 
:<math>g^{\pi}_m(x)=\langle\phi^{\pi}_m,\psi_m(x)\rangle</math>
 
where <math>\psi_m(x)=[K_m(x_1,x),\ldots,K_m(x_L,x)]^T</math> (the vector of kernel values between the labeled points and the point <math>x</math>, which may be labeled or unlabeled) and <math>\phi^{\pi}_m</math> is a non-negative random vector with a 2-norm of 1. <math>\Pi</math> is the number of times each kernel is projected. Expectation regularization is then performed on the MKD, resulting in a reference expectation <math>q^{\pi}_m(y|g^{\pi}_m(x))</math> and a model expectation <math>p^{\pi}_m(f(x)|g^{\pi}_m(x))</math>. The CEC penalty is then defined as
:<math>\Theta=\frac{1}{\Pi}\sum^{\Pi}_{\pi=1}\sum^{M}_{m=1}D(q^{\pi}_m(y|g^{\pi}_m(x))\,\|\,p^{\pi}_m(f(x)|g^{\pi}_m(x)))</math>
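
As a rough illustration of how the CEC penalty could be assembled, the sketch below draws the random projections <math>\phi^{\pi}_m</math> as non-negative unit-norm vectors, projects the kernel vector <math>\psi_m(x)</math> to obtain the MKD, and averages KL divergences between the reference and model expectations. It assumes those expectations have already been estimated as discrete distributions; all names are illustrative, and the expectation-regularization step itself is omitted:

<syntaxhighlight lang="python">
import numpy as np

def random_projection(dim, rng):
    """Draw a non-negative random vector phi with 2-norm equal to 1."""
    phi = np.abs(rng.standard_normal(dim))
    return phi / np.linalg.norm(phi)

def marginal_kernel_density(phi, psi_x):
    """MKD: g(x) = <phi, psi_m(x)>, the projected kernel vector."""
    return float(np.dot(phi, psi_x))

def kl_divergence(q, p, eps=1e-12):
    """Discrete KL divergence D(q || p), with clipping for stability."""
    q = np.clip(q, eps, None)
    p = np.clip(p, eps, None)
    return float(np.sum(q * np.log(q / p)))

def cec_penalty(q_dists, p_dists):
    """Theta = (1/Pi) * sum_{pi=1}^{Pi} sum_{m=1}^{M} D(q^pi_m || p^pi_m).

    q_dists, p_dists : nested lists indexed [pi][m]; each entry is a
    discrete distribution (array summing to 1) over the labels.
    """
    num_projections = len(q_dists)
    total = 0.0
    for q_row, p_row in zip(q_dists, p_dists):
        for q, p in zip(q_row, p_row):
            total += kl_divergence(q, p)
    return total / num_projections
</syntaxhighlight>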
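Putting the pieces together, the overall objective <math>L(f)+\lambda R(f)+\gamma\Theta(f)</math> could be evaluated as sketched below, assuming the weighted negative log-likelihood and the CEC penalty are computed elsewhere and that the coefficients are grouped by kernel for the group LASSO term; as before, the names are illustrative:

<syntaxhighlight lang="python">
import numpy as np

def group_lasso(alpha_groups):
    """Group LASSO regularizer R(f) = sum_m ||alpha_m||_2, where
    alpha_groups is a list of coefficient arrays, one per kernel."""
    return float(sum(np.linalg.norm(a) for a in alpha_groups))

def objective(neg_log_likelihood, alpha_groups, theta, lam, gamma):
    """Overall objective L(f) + lambda * R(f) + gamma * Theta(f)."""
    return neg_log_likelihood + lam * group_lasso(alpha_groups) + gamma * theta
</syntaxhighlight>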
 
===Unsupervised learning===