Proximal gradient method

{{Short description|Form of projection}}
{{more footnotes|date=November 2013}}
 
<math>
\min_{\mathbf{x} \in \mathbb{R}^d} \sum_{i=1}^n f_i(\mathbf{x})
</math>
 
where <math>f_i: \mathbb{R}^d \rightarrow \mathbb{R},\ i = 1, \dots, n</math> are possibly non-differentiable [[convex functions]]. The lack of differentiability rules out conventional smooth optimization techniques like the [[Gradient descent|steepest descent method]] and the [[conjugate gradient method]], but proximal gradient methods can be used instead.
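As an illustration, here is a minimal NumPy sketch of a proximal gradient iteration for a common instance of this problem with <math>n = 2</math>: a smooth least-squares term plus an <math>\ell_1</math> penalty. The function names, step size, and test problem are illustrative assumptions, not part of the article:

```python
import numpy as np

def soft_threshold(v, t):
    # Proximity operator of t*||.||_1: shrinks each entry toward zero by t.
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def proximal_gradient(A, b, lam, step, iters=500):
    # Sketch: minimize 0.5*||A x - b||^2 + lam*||x||_1 by alternating a
    # gradient step on the smooth term with the prox of the l1 term.
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        grad = A.T @ (A @ x - b)                          # gradient of smooth part
        x = soft_threshold(x - step * grad, step * lam)   # proximal step
    return x
```

For a suitable step size (e.g. the reciprocal of the largest eigenvalue of <math>A^T A</math>), each iteration decreases the composite objective even though the <math>\ell_1</math> term is non-differentiable.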
 
Proximal gradient methods start with a splitting step, in which the functions <math>f_1, \dots, f_n</math> are used individually so as to yield an easily [[wikt:implementable|implementable]] algorithm. They are called [[proximal]] because each non-differentiable function among <math>f_1, \dots, f_n</math> is involved via its [[Proximal operator|proximity operator]]. The iterative shrinkage thresholding algorithm,<ref>
<math>
x_{k+1} = P_{C_1} P_{C_2} \cdots P_{C_n} x_k
</math>
</math>
However, beyond such problems, [[projection operator]]s are not appropriate, and more general operators are required to tackle them. Among the various generalizations of the notion of a convex projection operator that exist, proximity operators are best suited for other purposes.
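The alternating-projection iteration above can be sketched in NumPy for two simple convex sets in the plane; the set choices and function names below are illustrative assumptions, not from the article:

```python
import numpy as np

def project_nonneg(x):
    # Projection onto C2 = {x : x >= 0 componentwise}: clip negatives to zero.
    return np.maximum(x, 0.0)

def project_diagonal(x):
    # Projection onto C1 = {x : x1 = x2}: replace both entries by their mean.
    return np.full_like(x, x.mean())

def pocs(x, iters=50):
    # x_{k+1} = P_{C1} P_{C2} x_k: apply the projections in sequence each step.
    for _ in range(iters):
        x = project_diagonal(project_nonneg(x))
    return x
```

When the sets intersect, the iterates converge to a point of the intersection; here, starting from <math>(3, -1)</math>, the sequence reaches the fixed point <math>(1.5, 1.5)</math>, which lies in both sets.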
 
== Examples ==