===Adaptive step size===
Consider a problem of the form <math>\min_w F(w) + R(w),</math> where <math>F(w)</math> is convex and differentiable and <math>R(w)</math> is convex (for example, for differentiable [[Loss_function|loss functions]] empirical
:<math>w^{k+1} = \operatorname{prox}_{\gamma R}\left(w^k-\gamma \nabla F\left(w^k\right)\right).</math>
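One standard modification is to let the step size <math>\gamma</math> vary with the iteration index <math>k</math>, which gives the scheme
:<math>w^{k+1} = \operatorname{prox}_{\gamma_k R}\left(w^k-\gamma_k \nabla F\left(w^k\right)\right).</math>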
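As an illustration of letting <math>\gamma</math> depend on <math>k</math>, the following is a minimal sketch in Python rather than a reference implementation. Everything in it is an assumption made for the example: the helper names <code>soft_threshold</code> and <code>proximal_gradient</code>, the toy objective with <math>F(w) = \tfrac{1}{2}\sum_i a_i (w_i - b_i)^2</math> and <math>R(w) = \lambda \|w\|_1</math>, and the step-size schedule, which is chosen so that every step stays below <math>2/L</math>, where <math>L = \max_i a_i</math> is the Lipschitz constant of <math>\nabla F</math>.

```python
import math


def soft_threshold(v, t):
    """Proximity operator of t * ||.||_1, applied coordinate-wise."""
    return [math.copysign(max(abs(x) - t, 0.0), x) for x in v]


def proximal_gradient(grad_F, prox_R, w0, step, n_iter=200):
    """Iterate w <- prox_{gamma_k R}(w - gamma_k * grad_F(w)).

    step(k) returns the step size gamma_k used at iteration k
    (a hypothetical interface chosen for this sketch).
    """
    w = list(w0)
    for k in range(n_iter):
        gamma = step(k)
        g = grad_F(w)
        # Forward (gradient) step on the smooth part F.
        v = [wi - gamma * gi for wi, gi in zip(w, g)]
        # Backward (proximal) step on R, scaled by the same gamma_k.
        w = prox_R(v, gamma)
    return w


# Toy problem (illustrative values only).
a = [1.0, 4.0]
b = [3.0, -0.1]
lam = 1.0
L = max(a)  # Lipschitz constant of grad F for this separable quadratic


def grad_F(w):
    return [ai * (wi - bi) for ai, wi, bi in zip(a, w, b)]


def prox_R(v, gamma):
    return soft_threshold(v, gamma * lam)


def step(k):
    # Decreases from 1.5/L towards 1/L, so gamma_k < 2/L for all k.
    return (1.0 + 0.5 / (k + 1)) / L


w = proximal_gradient(grad_F, prox_R, [0.0, 0.0], step)
```

For this separable objective the minimizer can be computed by hand: each coordinate equals <math>b_i</math> soft-thresholded at <math>\lambda / a_i</math>, so the iterates should approach <math>(2, 0)</math>. The schedule here is only one possible choice; other rules for selecting <math>\gamma_k</math> fit the same loop by swapping out <code>step</code>.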