{{Short description|Form of projection}}
{{more footnotes|date=November 2013}}
'''Proximal gradient methods''' are a generalized form of projection used to solve non-differentiable [[convex optimization]] problems.
[[File:Frank_Wolfe_vs_Projected_Gradient.webm|thumb|A comparison between the iterates of the projected gradient method (in red) and the [[Frank–Wolfe algorithm|Frank-Wolfe method]] (in green).]]
Many interesting problems can be formulated as convex optimization problems of the form
<math>
\min_{x \in \mathbb{R}^N} \sum_{i=1}^n f_i(x)
</math>
where <math>f_i: \mathbb{R}^N \rightarrow \mathbb{R},\ i = 1, \dots, n,</math> are [[convex function]]s, some of which may be non-differentiable.
Proximal gradient methods start with a splitting step, in which the functions <math>f_1, \dots, f_n</math> are used individually so as to yield an easily [[wikt:implementable|implementable]] algorithm. They are called [[proximal]] because each non-differentiable function among <math>f_1, \dots, f_n</math> is involved via its [[Proximal operator|proximity operator]]. The iterative shrinkage thresholding algorithm,<ref>
For the theory of proximal gradient methods from the perspective of and with applications to [[statistical learning theory]], see [[proximal gradient methods for learning]].
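As a concrete sketch of these ideas (not part of the article's own text): the proximity operator of the scaled <math>\ell_1</math> norm is soft thresholding, and one iteration of the iterative shrinkage-thresholding algorithm alternates a gradient step on a smooth term with that operator. The least-squares smooth term and the names <code>soft_threshold</code> and <code>ista_step</code> below are illustrative assumptions, not standard definitions:

```python
import numpy as np

# Soft thresholding: the proximity operator of lam * ||x||_1.
def soft_threshold(x, lam):
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

# One ISTA iteration for min_x 0.5*||A x - b||^2 + lam*||x||_1
# with step size t: gradient step on the smooth term, then prox.
def ista_step(x, A, b, lam, t):
    grad = A.T @ (A @ x - b)  # gradient of the least-squares term
    return soft_threshold(x - t * grad, t * lam)
```

With <code>A</code> the identity this reduces to a single soft-thresholding of <code>b</code>, which makes the splitting easy to check by hand.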
== Projection onto convex sets (POCS) ==
If the problem is to find a point in the intersection of closed convex sets <math>C_1, \dots, C_n</math>, each iteration projects successively onto every set:
<math>
x_{k+1} = P_{C_1} P_{C_2} \cdots P_{C_n} x_k
</math>
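The POCS iteration can be sketched as alternating projections; the two sets below (a Euclidean ball and a halfspace) are chosen purely for illustration:

```python
import numpy as np

# Projection onto the Euclidean ball of a given radius.
def project_ball(x, radius=1.0):
    n = np.linalg.norm(x)
    return x if n <= radius else x * (radius / n)

# Projection onto the halfspace {x : x[0] >= 0.5}.
def project_halfspace(x):
    y = x.copy()
    y[0] = max(y[0], 0.5)
    return y

# Alternating projections drive the iterate into the intersection.
x = np.array([-2.0, 2.0])
for _ in range(100):
    x = project_ball(project_halfspace(x))
```

Here the iterates approach the point of the unit circle with first coordinate 0.5, which lies in both sets.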
However, beyond such feasibility problems [[projection operator]]s are not appropriate, and more general operators are required. Among the various generalizations of the notion of a convex projection operator that exist, the proximity operator is the one employed by proximal gradient methods.
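One way to see that the proximity operator generalizes the projection: the proximity operator of the indicator function of a convex set is exactly the projection onto that set. A minimal sketch, using the nonnegative orthant as an assumed example set:

```python
import numpy as np

# prox of the indicator i_C of a convex set C:
#   prox_{i_C}(v) = argmin_x i_C(x) + 0.5*||x - v||^2 = P_C(v),
# i.e. the projection onto C. Here C = {x : x >= 0}.
def prox_indicator_nonneg(v):
    return np.maximum(v, 0.0)  # componentwise projection onto the orthant
```
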
== Examples ==
== See also ==
* [[Frank–Wolfe algorithm]]
* [[Proximal operator]]
* [[Proximal gradient methods for learning]]
== Notes ==