{{Short description|Form of projection}}
{{more footnotes|date=November 2013}}
'''Proximal gradient methods''' are a generalized form of projection used to solve non-differentiable [[convex optimization]] problems.
[[File:Frank_Wolfe_vs_Projected_Gradient.webm|thumb|A comparison between the iterates of the projected gradient method (in red) and the [[Frank–Wolfe algorithm|Frank-Wolfe method]] (in green).]]
Many interesting problems can be formulated as convex optimization problems of the form
<math>
\min_{x \in \mathbb{R}^d} \sum_{i=1}^n f_i(x)
</math>
where <math>f_i: \mathbb{R}^d \rightarrow \mathbb{R},\ i = 1, \dots, n</math> are possibly non-differentiable [[convex functions]]. The lack of differentiability rules out conventional smooth optimization techniques like the [[Gradient descent|steepest descent method]] and the [[conjugate gradient method]], but proximal gradient methods can be used instead.
Proximal gradient methods start with a splitting step, in which the functions <math>f_1, \dots, f_n</math> are handled individually so as to yield an easily [[wikt:implementable|implementable]] algorithm. They are called [[proximal]] because each non-differentiable function among <math>f_1, \dots, f_n</math> is involved via its [[Proximal operator|proximity operator]]. The iterative shrinkage-thresholding algorithm,<ref>
{{cite journal | last1=Daubechies | first1=I | last2=Defrise | first2 = M | last3 = De Mol| first3 = C|author3-link= Christine De Mol | title=An iterative thresholding algorithm for linear inverse problems with a sparsity constraint |journal= Communications on Pure and Applied Mathematics|volume=57 | issue=11 |year=2004|pages=1413–1457| bibcode=2003math......7152D | arxiv=math/0307152 |doi=10.1002/cpa.20042}}</ref> [[Landweber iteration|projected Landweber]], projected gradient, [[alternating projection]]s, [[Alternating direction method of multipliers#Alternating direction method of multipliers|alternating-direction method of multipliers]], alternating
split [[Bregman method|Bregman]] are special instances of proximal algorithms.<ref>Details of proximal methods are discussed in {{cite arXiv |last1=Combettes |first1=Patrick L. |last2= Pesquet |first2=Jean-Christophe |title=Proximal Splitting Methods in Signal Processing|page=|year=2009 |eprint=0912.3522|class=math.OC }}</ref>
For the theory of proximal gradient methods from the perspective of and with applications to [[statistical learning theory]], see [[proximal gradient methods for learning]].
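As an illustrative sketch (not drawn from the article's sources), the proximal gradient iteration can be written in a few lines of NumPy for the lasso problem <math>\min_x \tfrac{1}{2}\|Ax-b\|_2^2 + \lambda\|x\|_1</math>, an instance of the splitting above with a smooth term and a non-differentiable term. The proximity operator of <math>\lambda\|\cdot\|_1</math> is elementwise soft-thresholding, which makes this the iterative shrinkage-thresholding algorithm mentioned above; the step size <math>1/\|A\|_2^2</math> is an assumed choice matching the Lipschitz constant of the gradient.

```python
import numpy as np

def soft_threshold(v, t):
    # Proximity operator of t*||.||_1: elementwise soft-thresholding.
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def ista(A, b, lam, n_iter=500):
    """Proximal gradient (ISTA) for min 0.5*||Ax - b||^2 + lam*||x||_1."""
    step = 1.0 / np.linalg.norm(A, 2) ** 2   # 1/L, L = Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ x - b)             # gradient of the smooth term
        x = soft_threshold(x - step * grad, step * lam)  # proximal step on the l1 term
    return x
```

Each iteration takes a gradient step on the smooth term and then applies the proximity operator of the non-smooth term; with <math>\lambda = 0</math> the method reduces to ordinary gradient descent.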
== Projection onto convex sets (POCS) ==
One of the most widely used convex optimization algorithms is [[projections onto convex sets]] (POCS), which is employed to recover or synthesize a signal that simultaneously satisfies several convex constraints. Let <math>f_i</math> be the indicator function of a non-empty closed convex set <math>C_i</math> modeling a constraint. The problem then reduces to the convex feasibility problem: find a point lying in the intersection of all the sets <math>C_i</math>. In the POCS method, each set <math>C_i</math> is incorporated through its [[projection operator]] <math>P_{C_i}</math>, so in each [[iteration]] <math>x</math> is updated as
: <math>
x_{k+1} = P_{C_1} P_{C_2} \cdots P_{C_n} x_k
</math>
Beyond such feasibility problems, however, [[projection operator]]s are no longer appropriate, and more general operators are required. Among the various generalizations of the notion of a convex projection operator, proximity operators are the best suited for this purpose.
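As a minimal sketch of the POCS update above (the two sets and the starting point are illustrative assumptions, not from the article), consider two convex sets in the plane whose projection operators have closed forms: the unit ball and a half-plane.

```python
import numpy as np

def proj_ball(x):
    # Projection onto the closed unit ball C1 = {x : ||x||_2 <= 1}.
    n = np.linalg.norm(x)
    return x if n <= 1.0 else x / n

def proj_halfplane(x):
    # Projection onto the half-plane C2 = {x : x[0] >= 0.5}.
    y = x.copy()
    y[0] = max(y[0], 0.5)
    return y

# POCS iteration: x_{k+1} = P_{C_1} P_{C_2} x_k
x = np.array([-2.0, 2.0])   # arbitrary starting point outside both sets
for _ in range(100):
    x = proj_ball(proj_halfplane(x))
```

When the intersection <math>C_1 \cap C_2</math> is non-empty, the iterates converge to a point in it; here they approach <math>(0.5, \sqrt{3}/2)</math>, which lies on the boundary of both sets.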
== Examples ==
*[[Alternating projection]]
*[[Alternating direction method of multipliers#Alternating direction method of multipliers|Alternating-direction method of multipliers]]
== See also ==
* [[Proximal operator]]
* [[Proximal gradient methods for learning]]
* [[Frank–Wolfe algorithm]]
== Notes ==
{{reflist}}
== References ==
* {{cite book
| last1 = Combettes
| first1 = Patrick L.
| last2 = Pesquet
| first2 = Jean-Christophe
| chapter = Proximal Splitting Methods in Signal Processing
| title = Fixed-Point Algorithms for Inverse Problems in Science and Engineering
| volume = 49
| year = 2011
}}
==External links==
* Stephen Boyd and Lieven Vandenberghe's book.
* [https://github.com/kul-forbes/ProximalOperators.jl ProximalOperators.jl]: a [[Julia (programming language)|Julia]] package implementing proximal operators.
* [https://github.com/kul-forbes/ProximalAlgorithms.jl ProximalAlgorithms.jl]: a [[Julia (programming language)|Julia]] package implementing algorithms based on the proximal operator, including the proximal gradient method.