Multiple kernel learning

The problem can be written as
 
:<math>\min_f L(f) + \lambda R(f)+\gamma\Theta(f)</math>
 
where <math>L</math> is the loss function (weighted negative log-likelihood in this case), <math>R</math> is the regularization penalty ([[Proximal_gradient_methods_for_learning#Exploiting_group_structure|Group LASSO]] in this case), and <math>\Theta</math> is the conditional expectation consensus (CEC) penalty on unlabeled data. The CEC penalty is defined as follows. Let the marginal kernel density for all the data be