In the fixed point iteration scheme
:<math>w^{k+1} = \operatorname{prox}_{\gamma R}\left(w^k-\gamma \nabla F\left(w^k\right)\right),</math>
one can allow a variable step size <math>\gamma_k</math> in place of a constant <math>\gamma</math>. Numerous adaptive step size schemes have been proposed in the literature.<ref name=combettes /><ref name=bauschke /><ref>{{cite journal|last=Loris|first=I. |author2=Bertero, M. |author3=De Mol, C. |author4=Zanella, R. |author5=Zanni, L. |title=Accelerating gradient projection methods for <math>\ell_1</math>-constrained signal recovery by steplength selection rules|journal=Applied & Comp. Harmonic Analysis|volume=27|issue=2|pages=247–254|year=2009|doi=10.1016/j.acha.2009.02.003|arxiv=0902.4424 |s2cid=18093882 }}</ref><ref>{{cite journal|last=Wright|first=S.J.|author2=Nowak, R.D. |author3=Figueiredo, M.A.T. |title=Sparse reconstruction by separable approximation|journal=IEEE Trans. Image Process.|year=2009|volume=57|issue=7|pages=2479–2493|doi=10.1109/TSP.2009.2016892|bibcode=2009ITSP...57.2479W|citeseerx=10.1.1.115.9334|s2cid=7399917 }}</ref> Applications of these schemes<ref name=structSparse /><ref>{{cite journal|last=Loris|first=Ignace|title=On the performance of algorithms for the minimization of <math>\ell_1</math>-penalized functionals|journal=Inverse Problems|year=2009|volume=25|issue=3|doi=10.1088/0266-5611/25/3/035008|page=035008|arxiv=0710.4082|bibcode=2009InvPr..25c5008L|s2cid=14213443}}</ref> suggest that they can substantially reduce the number of iterations required for fixed point convergence.
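To make the role of a variable step size concrete, the following is a minimal sketch of the iteration above for the lasso case <math>F(w) = \tfrac{1}{2}\|Aw-y\|_2^2</math>, <math>R(w) = \lambda\|w\|_1</math>, where <math>\gamma_k</math> is chosen by a simple backtracking rule rather than by any of the specific steplength selection rules cited above; the function names are illustrative.

<syntaxhighlight lang="python">
import numpy as np

def soft_threshold(v, t):
    # Proximity operator of t * ||.||_1 (soft thresholding).
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def prox_grad_backtracking(A, y, lam, w0, beta=0.5, gamma0=1.0,
                           max_iter=500, tol=1e-8):
    # Proximal gradient iteration w^{k+1} = prox_{gamma_k R}(w^k - gamma_k grad F(w^k))
    # for F(w) = 0.5*||Aw - y||^2 and R(w) = lam*||w||_1,
    # with gamma_k shrunk by backtracking until a sufficient-decrease
    # condition (the quadratic upper-bound test) holds.
    w = w0.copy()
    gamma = gamma0
    F = lambda w: 0.5 * np.sum((A @ w - y) ** 2)
    grad = lambda w: A.T @ (A @ w - y)
    for _ in range(max_iter):
        g = grad(w)
        fw = F(w)
        while True:
            w_next = soft_threshold(w - gamma * g, gamma * lam)
            diff = w_next - w
            # Accept gamma once F(w_next) is below its quadratic model at w.
            if F(w_next) <= fw + g @ diff + np.sum(diff ** 2) / (2 * gamma):
                break
            gamma *= beta
        if np.linalg.norm(w_next - w) <= tol * max(1.0, np.linalg.norm(w)):
            return w_next
        w = w_next
    return w
</syntaxhighlight>

Here the backtracking loop enforces a standard sufficient-decrease condition; the steplength selection rules in the references above adapt <math>\gamma_k</math> differently, for instance using Barzilai–Borwein-type curvature estimates.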
 
=== Elastic net (mixed norm regularization) ===