Proximal gradient method
<math>
\min_{x \in \mathbb{R}^N} \sum_{i=1}^n f_i(x)
</math>
 
where <math>f_i: \mathbb{R}^N \rightarrow \mathbb{R},\ i = 1, \dots, n</math> are possibly non-differentiable [[convex functions]]. The lack of differentiability rules out conventional smooth optimization techniques like the [[Gradient descent|steepest descent method]] and the [[conjugate gradient method]], but proximal gradient methods can be used instead.
These methods proceed by a splitting step, in which the functions <math>f_1, \dots, f_n</math> are used individually so as to yield an easily [[wikt:implementable|implementable]] algorithm. They are called [[proximal]] because each non-differentiable function among <math>f_1, \dots, f_n</math> is involved via its [[Proximal operator|proximity operator]]. The iterative shrinkage thresholding algorithm,<ref>
{{cite journal | last1=Daubechies | first1=I | last2=Defrise | first2 = M | last3 = De Mol| first3 = C|author3-link= Christine De Mol | title=An iterative thresholding algorithm for linear inverse problems with a sparsity constraint |journal= Communications on Pure and Applied Mathematics|volume=57 | issue=11 |year=2004|pages=1413–1457| bibcode=2003math......7152D | arxiv=math/0307152 |doi=10.1002/cpa.20042}}</ref> [[Landweber iteration|projected Landweber]], projected gradient, [[alternating projection]]s, [[Alternating direction method of multipliers#Alternating direction method of multipliers|alternating-direction method of multipliers]], alternating
split [[Bregman method|Bregman]] are special instances of proximal algorithms.<ref>Details of proximal methods are discussed in {{cite arXiv |last1=Combettes |first1=Patrick L. |last2= Pesquet |first2=Jean-Christophe |title=Proximal Splitting Methods in Signal Processing|page=|year=2009 |eprint=0912.3522|class=math.OC }}</ref> For the theory of proximal gradient methods from the perspective of and with applications to [[statistical learning theory]], see [[proximal gradient methods for learning]].
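As an illustration (not part of the article's own material), the iterative shrinkage thresholding algorithm mentioned above can be sketched for the lasso problem <math>\min_x \tfrac{1}{2}\|Ax-b\|_2^2 + \lambda\|x\|_1</math>: the smooth squared-error term contributes a gradient step, and the non-smooth <math>\ell_1</math> term enters only through its proximity operator, which is soft thresholding. Function names and the fixed step size below are illustrative choices, not from the article.

```python
import numpy as np

def soft_threshold(v, tau):
    # Proximity operator of tau * ||.||_1 (soft thresholding).
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def ista(A, b, lam, n_iter=500):
    # Proximal gradient (ISTA) sketch for min 0.5*||Ax-b||^2 + lam*||x||_1.
    # Step size 1/L, where L = ||A||_2^2 is the Lipschitz constant of the
    # gradient of the smooth term.
    L = np.linalg.norm(A, 2) ** 2
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ x - b)                   # gradient of the smooth part
        x = soft_threshold(x - grad / L, lam / L)  # proximal step on the l1 part
    return x
```

For an orthonormal <math>A</math> the iteration reaches the closed-form solution <math>\operatorname{soft}(A^\mathsf{T} b, \lambda)</math> in a single step, which makes the splitting structure easy to check by hand.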
 
== Notations and terminology ==