'''Proximal gradient methods for learning''' is an area of research in [[statistical learning theory]] which studies algorithms for a general class of [[Convex function|convex]] [[Regularization (mathematics)|regularization]] problems where the regularization penalty may not be [[Differentiable function|differentiable]]. One such example is <math>\ell_1</math> regularization (also known as the [[Lasso (statistics)|lasso]]) of the form
:<math>\min_{w\in\mathbb{R}^d} \frac{1}{n}\sum_{i=1}^n (y_i- \langle w,x_i\rangle)^2+ \lambda \|w\|_1, \quad \text{ where } x_i\in \mathbb{R}^d\text{ and } y_i\in\mathbb{R}.</math>
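As an illustration, a minimal Python sketch of a proximal gradient (ISTA-style) iteration for the lasso problem above may look as follows; the soft-thresholding step is the proximal operator of the <math>\ell_1</math> penalty, and the function names and synthetic data are illustrative assumptions, not part of the article.

<syntaxhighlight lang="python">
import numpy as np

def soft_threshold(v, t):
    # Proximal operator of t * ||.||_1: elementwise soft thresholding.
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def lasso_proximal_gradient(X, y, lam, gamma, num_iters=500):
    # Proximal gradient (ISTA) iteration for
    #   min_w (1/n) * ||y - X w||^2 + lam * ||w||_1.
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(num_iters):
        grad = (2.0 / n) * X.T @ (X @ w - y)               # gradient of the smooth least-squares term
        w = soft_threshold(w - gamma * grad, gamma * lam)  # proximal step on the l1 penalty
    return w

# Illustrative usage with synthetic data; the step size 1/L uses the
# Lipschitz constant L of the gradient of the least-squares term.
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 20))
w_true = np.zeros(20)
w_true[:3] = [2.0, -1.0, 0.5]
y = X @ w_true + 0.01 * rng.standard_normal(100)
L = (2.0 / X.shape[0]) * np.linalg.norm(X, 2) ** 2
w_hat = lasso_proximal_gradient(X, y, lam=0.1, gamma=1.0 / L)
</syntaxhighlight>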
===Adaptive step size===
Consider a problem of the form <math>\min_w F(w) + R(w),</math> where <math>F(w)</math> is convex and differentiable and <math>R(w)</math> is convex (for example, <math>F</math> may be a differentiable [[Loss function|loss function]] and <math>R</math> a possibly nondifferentiable regularization penalty). For a fixed step size <math>\gamma>0</math>, the proximal gradient method is the fixed point iteration
:<math>w^{k+1} = \operatorname{prox}_{\gamma R}\left(w^k-\gamma \nabla F\left(w^k\right)\right).</math>
One standard modification is to let the step size <math>\gamma_k</math> vary with the iteration number <math>k</math>, which gives the scheme
:<math>w^{k+1} = \operatorname{prox}_{\gamma_k R}\left(w^k-\gamma_k \nabla F\left(w^k\right)\right).</math>
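One concrete way to choose <math>\gamma_k</math> adaptively is a backtracking line search on the smooth term <math>F</math>, shrinking the step size until a quadratic upper bound holds at the candidate point. The Python sketch below assumes this backtracking rule; the helper names and the small lasso instance are illustrative assumptions rather than a prescribed scheme from the article.

<syntaxhighlight lang="python">
import numpy as np

def backtracking_prox_gradient(w, F, grad_F, prox_R, gamma0=1.0, beta=0.5, num_iters=200):
    # Proximal gradient iteration with a backtracking choice of the step size gamma_k.
    gamma = gamma0
    for _ in range(num_iters):
        g = grad_F(w)
        while True:
            w_new = prox_R(w - gamma * g, gamma)
            d = w_new - w
            # Accept gamma once the quadratic upper bound on F holds at the candidate point.
            if F(w_new) <= F(w) + g @ d + (d @ d) / (2.0 * gamma):
                break
            gamma *= beta  # step size too large: shrink and retry
        w = w_new
    return w

# Illustrative lasso instance: F is the least-squares term, prox_R soft-thresholds the l1 term.
rng = np.random.default_rng(0)
X, lam = rng.standard_normal((100, 20)), 0.1
y = X[:, 0] - 2.0 * X[:, 1] + 0.01 * rng.standard_normal(100)
F = lambda w: np.mean((y - X @ w) ** 2)
grad_F = lambda w: (2.0 / len(y)) * X.T @ (X @ w - y)
prox_R = lambda v, g: np.sign(v) * np.maximum(np.abs(v) - g * lam, 0.0)
w_hat = backtracking_prox_gradient(np.zeros(20), F, grad_F, prox_R)
</syntaxhighlight>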