Subgradient projection methods are often applied to large-scale problems using decomposition techniques. Such decomposition methods often permit a simple distributed method for a problem.
==Basic subgradient update==
Let <math>f:\mathbb{R}^n \to \mathbb{R}</math> be a [[convex function]] with ___domain <math>\mathbb{R}^n</math>. A classical subgradient method iterates
:<math>x^{(k+1)} = x^{(k)} - \alpha_k g^{(k)} \ </math>
where <math>g^{(k)}</math> denotes a [[subgradient]] of <math> f \ </math> at <math>x^{(k)} \ </math>, and <math>\alpha_k</math> is the step size at iteration <math>k</math>. If <math>f \ </math> is differentiable, then its only subgradient at each point is the gradient vector <math>\nabla f</math> itself.
It may happen that <math>-g^{(k)}</math> is not a descent direction for <math>f \ </math> at <math>x^{(k)}</math>. We therefore maintain a running record <math>f_{\rm{best}} \ </math> of the lowest objective function value found so far, i.e.
:<math>f_{\rm{best}}^{(k)} = \min\{f_{\rm{best}}^{(k-1)} , f(x^{(k)}) \}.</math>
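The update and the best-value bookkeeping above can be sketched in Python; this is an illustrative implementation, not part of the article, and the function and variable names (<code>subgradient_method</code>, <code>step</code>) are chosen here for clarity:

```python
import numpy as np

def subgradient_method(f, subgrad, x0, step, n_iter=200):
    """Iterate x^{(k+1)} = x^{(k)} - alpha_k g^{(k)},
    tracking the lowest objective value found so far."""
    x = np.asarray(x0, dtype=float)
    f_best, x_best = f(x), x.copy()
    for k in range(n_iter):
        g = subgrad(x)            # any subgradient of f at x^{(k)}
        x = x - step(k) * g
        if f(x) < f_best:         # -g^{(k)} need not be a descent direction,
            f_best, x_best = f(x), x.copy()   # so keep the best point seen
    return x_best, f_best

# Example: minimize f(x) = |x|, nondifferentiable at its minimizer 0,
# with the diminishing step sizes alpha_k = 1/(k+1).
f = lambda x: abs(x[0])
subgrad = lambda x: np.array([1.0 if x[0] > 0 else (-1.0 if x[0] < 0 else 0.0)])
x_best, f_best = subgradient_method(f, subgrad, [5.0], step=lambda k: 1.0 / (k + 1))
```

Because the objective value can increase between iterations, the method reports <code>x_best</code> rather than the final iterate.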
===Step size rules===
Many different step size rules are used by subgradient methods. Common choices include:
*Constant step size, <math>\alpha_k = \alpha.</math>
*Nonsummable diminishing step lengths, i.e. <math>\alpha_k = \gamma_k/\lVert g^{(k)} \rVert_2</math>, where
:<math>\gamma_k\geq0,\qquad \lim_{k\to\infty} \gamma_k = 0,\qquad \sum_{k=1}^\infty \gamma_k = \infty.</math>
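The three conditions on <math>\gamma_k</math> can be checked numerically for a concrete sequence. The choice <math>\gamma_k = a/\sqrt{k+1}</math> below is one common illustrative example, not a rule prescribed by the article:

```python
import math

# Illustrative choice gamma_k = a / sqrt(k + 1) with a > 0:
# gamma_k >= 0, gamma_k -> 0, yet sum_k gamma_k diverges.
def gamma(k, a=1.0):
    return a / math.sqrt(k + 1)

# gamma_k tends to zero ...
tail = gamma(10**6 - 1)       # 1/sqrt(10^6) = 1e-3 for a = 1
# ... but the partial sums grow without bound:
# sum_{k=0}^{n-1} gamma_k ~ 2*sqrt(n), so they exceed any fixed level.
partial = sum(gamma(k) for k in range(10_000))
```

Dividing such a <math>\gamma_k</math> by <math>\lVert g^{(k)} \rVert_2</math> gives a valid nonsummable diminishing step length.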
===Convergence results===