{{Short description|Primal-dual optimization algorithm for convex problems}}
In [[mathematics]], the '''Chambolle-Pock algorithm''' is an algorithm used to solve [[convex optimization]] problems. It was introduced by Antonin Chambolle and Thomas Pock<ref name=":0" /> in 2011 and has since become a widely used method in various fields, including [[image processing]] and [[computer vision]].
The algorithm is based on a primal-dual formulation, which allows for simultaneous updates of primal and [[Duality (mathematics)|dual]] variables. By employing the [[proximal operator]], the Chambolle-Pock algorithm efficiently handles non-smooth regularization terms, such as the [[total variation]], which are common in imaging problems.
==Problem statement==
Let <math> \mathcal{X}, \mathcal{Y} </math> be two real [[Vector space|vector spaces]] equipped with an [[Inner product space|inner product]] <math> \langle \cdot, \cdot \rangle </math> and a [[Norm (mathematics)|norm]] <math> \lVert \,\cdot \,\rVert = \langle \cdot, \cdot \rangle^{\frac{1}{2}} </math>. In what follows, a function <math>F</math> is called ''simple'' if its [[proximal operator]] <math> \text{prox}_{\tau F} </math> has a [[Closed-form expression|closed-form representation]] or can be accurately computed, for <math>\tau >0</math>
:<math display="block"> x = \text{prox}_{\tau F}(\tilde{x}) = \text{arg } \min_{x'\in \mathcal{X}}\left\{
\frac{\lVert x'-\tilde{x}\rVert^2}{2\tau} + F(x')
\right\}</math>
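For example, both the <math>\ell_1</math> norm and the squared <math>\ell_2</math> norm are simple in this sense: their proximal operators reduce to componentwise soft-thresholding and shrinkage, respectively. A minimal NumPy sketch (the function names are illustrative, not from any specific library):

```python
import numpy as np

def prox_l1(x, tau):
    # prox of F(x) = ||x||_1: componentwise soft-thresholding,
    # shrinking each entry toward zero by tau and clamping small entries to 0
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def prox_sq(x, tau):
    # prox of F(x) = 0.5 * ||x||^2: uniform shrinkage toward the origin
    return x / (1.0 + tau)
```

Both maps can be evaluated exactly in closed form, which is what makes the corresponding functions usable inside the algorithm below.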
Consider the following primal problem:<ref name=":0" />
:<math display="block"> \min_{x\in\mathcal{X}} F(Kx) + G(x) </math>
where <math>K:\mathcal{X} \rightarrow \mathcal{Y}</math> is a [[Bounded operator|bounded linear operator]], <math>F:\mathcal{Y} \rightarrow [0, +\infty), G:\mathcal{X} \rightarrow [0, +\infty) </math> are [[Convex function|convex]], [[Semi-continuity|lower semicontinuous]] and simple.<ref name=":0" />
The minimization problem has the corresponding dual problem<ref name=":0" />
:<math display="block"> \max_{y\in\mathcal{Y}} -\left(G^*(-K^*y) + F^*(y)\right) </math>
where <math>F^*, G^*</math> are the [[Convex conjugate|convex conjugates]] of <math> F, G </math> and <math> K^*</math> is the [[Adjoint of a linear map|adjoint]] of <math> K </math>.<ref name=":0" />
Assume that the primal and the dual problems have at least one solution <math> (\hat{x}, \hat{y}) \in \mathcal{X}\times \mathcal{Y} </math>, that is, they satisfy<ref name=":4" />
<math display="block">
K\hat{x} \in \partial F^*(\hat{y}), \qquad -\left(K^*\hat{y}\right) \in \partial G(\hat{x}),
</math>
where <math> \partial F^* </math> and <math> \partial G</math> are the [[Subderivative|subgradients]] of the convex functions <math> F^* </math> and <math> G </math>, respectively.<ref name=":4" />
The primal and dual problems can be combined into an equivalent saddle-point problem:<ref name=":0" />
:<math display="block"> \min_{x\in\mathcal{X}} \max_{y\in\mathcal{Y}} \langle Kx, y \rangle + G(x) - F^*(y) </math>
==Algorithm==
The Chambolle-Pock algorithm solves the saddle-point problem by alternating a proximal ascent step in the dual variable <math>y</math> and a proximal descent step in the primal variable <math>x</math>, followed by an over-relaxation of the primal variable:<ref name=":0" />
{{algorithm-begin|name=Chambolle-Pock algorithm}}
{{nowrap|Input: <math> F, G, K, \tau, \sigma >0, \, \theta \in[0,1],\, (x^0,y^0)\in\mathcal{X}\times\mathcal{Y}</math>, set <math> \overline{x}^0 = x^0</math>,}} and a ''stopping criterion''.
{{nowrap|<math> k \leftarrow 0 </math>}}
'''do while''' ''stopping criterion is not satisfied''
{{nowrap|<math> y^{k+1} \leftarrow \text{prox}_{\sigma F^*}\left(y^{k} + \sigma K\overline{x}^{k}\right)</math>}}
{{nowrap|<math> x^{k+1} \leftarrow \text{prox}_{\tau G}\left(x^{k} - \tau K^*y^{k+1}\right)</math>}}
{{nowrap|<math> \overline{x}^{k+1} \leftarrow x^{k+1} + \theta \left(x^{k+1} - x^{k}\right)</math>}}
{{nowrap|<math>k\leftarrow k+1</math>}}
'''end do'''
{{algorithm-end}}
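In NumPy, the iteration above can be sketched as follows; the ridge-regression test problem and the function names are illustrative, chosen only because both proximal operators have closed forms:

```python
import numpy as np

def chambolle_pock(K, prox_tau_G, prox_sigma_Fstar, tau, sigma, theta=1.0, n_iter=500):
    # Generic Chambolle-Pock iteration for min_x F(Kx) + G(x)
    m, n = K.shape
    x = np.zeros(n); y = np.zeros(m); x_bar = x.copy()
    for _ in range(n_iter):
        y = prox_sigma_Fstar(y + sigma * K @ x_bar, sigma)   # dual ascent step
        x_new = prox_tau_G(x - tau * K.T @ y, tau)           # primal descent step
        x_bar = x_new + theta * (x_new - x)                  # over-relaxation
        x = x_new
    return x

# Toy problem: min_x 0.5*||K x - b||^2 + 0.5*lam*||x||^2  (ridge regression)
rng = np.random.default_rng(0)
K = rng.standard_normal((20, 10)); b = rng.standard_normal(20); lam = 1.0

# F(y) = 0.5*||y - b||^2  =>  prox_{sigma F*}(v) = (v - sigma*b) / (1 + sigma)
prox_Fstar = lambda v, sigma: (v - sigma * b) / (1.0 + sigma)
# G(x) = 0.5*lam*||x||^2  =>  prox_{tau G}(v) = v / (1 + tau*lam)
prox_G = lambda v, tau: v / (1.0 + tau * lam)

L = np.linalg.norm(K, 2)          # operator norm of K
tau = sigma = 0.99 / L            # ensures tau * sigma * ||K||^2 <= 1
x_cp = chambolle_pock(K, prox_G, prox_Fstar, tau, sigma, n_iter=5000)

# closed-form ridge solution for comparison
x_ref = np.linalg.solve(K.T @ K + lam * np.eye(10), K.T @ b)
```

Since this toy problem is strongly convex on both the primal and the dual side, the iterates agree with the closed-form solution to high accuracy.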
Chambolle and Pock proved<ref name=":0" /> that the algorithm converges if <math>\theta = 1</math> and <math>\tau \sigma \lVert K \rVert^2 \leq 1</math>, with <math>\mathcal{O}(1/N)</math> as the rate of convergence for the primal-dual gap. This has been extended by S. Banert et al.<ref>{{Cite arXiv |last1=Banert |first1=Sebastian|last2=Upadhyaya |first2=Manu|last3=Giselsson |first3=Pontus |date=2023 |title=The Chambolle-Pock method converges weakly with <math>\theta > 1/2</math> and <math> \tau \sigma \lVert L \rVert^{2} < 4 / ( 1 + 2 \theta )</math> |class=math.OC |eprint=2309.03998 }}</ref> to hold whenever <math>\theta>1/2</math> and <math>\tau \sigma \lVert K \rVert^2 < 4 / (1+2\theta)</math>.
The semi-implicit Arrow-Hurwicz method<ref>{{cite book |first=H. |last=Uzawa |chapter=Iterative methods for concave programming |editor1-first=K. J. |editor1-last=Arrow |editor2-first=L. |editor2-last=Hurwicz |editor3-first=H. |editor3-last=Uzawa |title=Studies in linear and nonlinear programming |url=https://archive.org/details/studiesinlinearn0000arro |url-access=registration |publisher=Stanford University Press |year=1958 }}</ref> coincides with the particular choice of <math>\theta = 0</math> in the Chambolle-Pock algorithm.
==Acceleration==
There are special cases in which the rate of convergence admits a theoretical speed-up.<ref name=":0" /> In fact, if <math>G </math>, respectively <math> F^* </math>, is [[Convex function#Uniformly convex functions|uniformly convex]], then <math> G^* </math>, respectively <math> F </math>, has a [[Lipschitz continuity|Lipschitz continuous]] gradient. In that case, the rate of convergence can be improved to <math> \mathcal{O}(1/N^2)</math>, provided the Chambolle-Pock algorithm is slightly modified.
If <math> G </math> is uniformly convex, with uniform-convexity constant <math> \gamma>0 </math>, the modified algorithm becomes<ref name=":0" />
{{algorithm-begin|name=Accelerated Chambolle-Pock algorithm}}
{{nowrap|Input: <math> F, G, K, \tau_0, \sigma_0 >0</math> with <math> \tau_0 \sigma_0 \lVert K \rVert^2 \leq 1, \, (x^0,y^0)\in\mathcal{X}\times\mathcal{Y}</math>, set <math> \overline{x}^0 = x^0</math>,}} and a ''stopping criterion''.
{{nowrap|<math> k \leftarrow 0 </math>}}
'''do while''' ''stopping criterion is not satisfied''
{{nowrap|<math> y^{k+1} \leftarrow \text{prox}_{\sigma_k F^*}\left(y^{k} + \sigma_k K\overline{x}^{k}\right)</math>}}
{{nowrap|<math> x^{k+1} \leftarrow \text{prox}_{\tau_k G}\left(x^{k} - \tau_k K^*y^{k+1}\right)</math>}}
{{nowrap|<math> \theta_k \leftarrow \frac{1}{\sqrt{1+2\gamma\tau_k}},\quad \tau_{k+1} \leftarrow \theta_k\tau_k,\quad \sigma_{k+1} \leftarrow \frac{\sigma_k}{\theta_k}</math>}}
{{nowrap|<math> \overline{x}^{k+1} \leftarrow x^{k+1} + \theta_k\left(x^{k+1} - x^{k}\right)</math>}}
{{nowrap|<math>k\leftarrow k+1</math>}}
'''end do'''
{{algorithm-end}}
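The step-size schedule of this accelerated variant keeps the product <math>\tau_k \sigma_k</math> constant, since <math>\tau_{k+1}\sigma_{k+1} = \theta_k\tau_k \cdot \sigma_k/\theta_k = \tau_k\sigma_k</math>, so the admissibility condition <math>\tau_k \sigma_k \lVert K \rVert^2 \leq 1</math> never degrades while <math>\tau_k</math> decays. A short sketch of the schedule alone (the function name is illustrative):

```python
import math

def accelerated_steps(gamma, tau0, sigma0, n_iter):
    # Step-size schedule of the accelerated variant: theta_k < 1 shrinks tau
    # and enlarges sigma, keeping the product tau_k * sigma_k constant.
    tau, sigma, out = tau0, sigma0, []
    for _ in range(n_iter):
        theta = 1.0 / math.sqrt(1.0 + 2.0 * gamma * tau)
        out.append((theta, tau, sigma))
        tau, sigma = theta * tau, sigma / theta
    return out
```

The decay of <math>\tau_k</math> (roughly like <math>1/(\gamma k)</math> asymptotically) is what yields the improved <math>\mathcal{O}(1/N^2)</math> rate.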
Moreover, the convergence of the algorithm slows down when <math>L</math>, the norm of the operator <math>K</math>, cannot be estimated easily or might be very large. By choosing proper [[Preconditioner|preconditioners]] <math>T</math> and <math>\Sigma</math> and modifying the proximal operator with the [[Inner product space#Euclidean vector space|norms induced]] by the operators <math>T</math> and <math>\Sigma</math>, the convergence of the resulting preconditioned algorithm can still be ensured.
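For instance, Pock and Chambolle proposed diagonal preconditioners <math>T_{jj} = 1/\textstyle\sum_i \lvert K_{ij}\rvert^{2-\alpha}</math> and <math>\Sigma_{ii} = 1/\textstyle\sum_j \lvert K_{ij}\rvert^{\alpha}</math> with <math>\alpha\in[0,2]</math>, which satisfy <math>\lVert \Sigma^{1/2} K T^{1/2}\rVert^2 \leq 1</math> without any knowledge of <math>\lVert K \rVert</math>. A sketch for a dense matrix (assuming no zero rows or columns; the function name is illustrative):

```python
import numpy as np

def diagonal_preconditioners(K, alpha=1.0):
    # Diagonal preconditioners in the style of Pock & Chambolle (2011):
    #   T_jj     = 1 / sum_i |K_ij|^(2 - alpha)   (one entry per column)
    #   Sigma_ii = 1 / sum_j |K_ij|^alpha         (one entry per row)
    A = np.abs(K)
    tau = 1.0 / (A ** (2.0 - alpha)).sum(axis=0)
    sigma = 1.0 / (A ** alpha).sum(axis=1)
    return tau, sigma
```

The diagonal entries replace the scalar steps <math>\tau</math> and <math>\sigma</math>, so no operator-norm estimate is needed.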
==Application==
A typical application of this algorithm is in the image [[Noise reduction|denoising]] framework, based on total variation.<ref name=":2" />
Define the inner product on <math> \mathcal{X} </math> as<ref name=":0" />
:<math display="block">
\langle u, v\rangle_{\mathcal{X}} = \sum_{i,j} u_{i,j}v_{i,j}, \qquad u,v\in\mathcal{X},
</math>
and let <math>\nabla : \mathcal{X} \rightarrow \mathcal{Y}</math> be the discrete [[gradient]] operator, computed by finite differences, with components <math> (\nabla u)_{i,j} = \left((\nabla u)^1_{i,j}, (\nabla u)^2_{i,j}\right)</math>.
On <math>\mathcal{Y}</math>, an [[Square-integrable function|<math> L^1</math>-based norm]] is defined as<ref name=":0" />
:<math display="block">
\lVert p \rVert_1 = \sum_{i,j} \sqrt{\left(p_{i,j}^1\right)^2 + \left(p_{i,j}^2\right)^2}, \quad p\in \mathcal{Y}.
</math>
The total-variation denoising problem to be solved is<ref name=":0" />
:<math display="block"> \min_{u\in\mathcal{X}} \lVert \nabla u \rVert_1 + \frac{\lambda}{2} \lVert u - g \rVert^2_2, </math>
where <math> g \in \mathcal{X}</math> is the given noisy image and <math> \lambda > 0</math> is the regularization parameter. Setting <math> K = \nabla</math>, <math> F(y) = \lVert y \rVert_1</math> and <math> G(u) = \frac{\lambda}{2}\lVert u - g \rVert^2_2</math>, both functions are simple: the proximal operator of <math> F^*</math> is a pointwise projection onto the unit ball, while the proximal operator of <math> G</math> is given pointwise by
:<math display="block">\begin{align}
u &= \text{prox}_{\tau G}(\tilde{u}), \\
u_{i,j} &= \frac{\tilde{u}_{i,j}+\tau\lambda g_{i,j}}{1+\tau \lambda}
\end{align}
</math>
The image total-variation denoising problem can also be treated with other algorithms, such as the [[alternating direction method of multipliers]] (ADMM), projected [[Subgradient method|(sub)gradient]] methods, or fast iterative shrinkage thresholding.
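Under these choices the full iteration is compact enough to sketch in NumPy; the forward-difference gradient, its adjoint (the negative discrete divergence), and the helper names below are illustrative:

```python
import numpy as np

def grad(u):
    # Forward-difference discrete gradient with Neumann boundary conditions
    gx = np.zeros_like(u); gy = np.zeros_like(u)
    gx[:-1, :] = u[1:, :] - u[:-1, :]
    gy[:, :-1] = u[:, 1:] - u[:, :-1]
    return np.stack([gx, gy])

def div(p):
    # Discrete divergence, the negative adjoint of grad: <grad u, p> = -<u, div p>
    px, py = p
    dx = np.zeros_like(px); dy = np.zeros_like(py)
    dx[0, :] = px[0, :]; dx[1:-1, :] = px[1:-1, :] - px[:-2, :]; dx[-1, :] = -px[-2, :]
    dy[:, 0] = py[:, 0]; dy[:, 1:-1] = py[:, 1:-1] - py[:, :-2]; dy[:, -1] = -py[:, -2]
    return dx + dy

def tv_denoise(g, lam, n_iter=200):
    # Chambolle-Pock iteration for min_u ||grad u||_1 + (lam/2) * ||u - g||_2^2
    tau = sigma = 1.0 / np.sqrt(8.0)   # ||grad||^2 <= 8, so tau*sigma*L^2 <= 1
    u = g.copy(); u_bar = g.copy(); p = np.zeros((2,) + g.shape)
    for _ in range(n_iter):
        # dual step: pointwise projection onto unit balls (prox of sigma*F*)
        p = p + sigma * grad(u_bar)
        p = p / np.maximum(1.0, np.sqrt((p ** 2).sum(axis=0)))
        # primal step: closed-form prox of tau*G (note K* p = -div p)
        u_new = (u + tau * div(p) + tau * lam * g) / (1.0 + tau * lam)
        u_bar = 2.0 * u_new - u        # over-relaxation with theta = 1
        u = u_new
    return u
```

The step sizes use the standard bound <math>\lVert \nabla \rVert^2 \leq 8</math> for the forward-difference gradient, so the condition <math>\tau\sigma\lVert K \rVert^2 \leq 1</math> holds without estimating the operator norm numerically.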
== Implementation ==
* The Manopt.jl<ref>{{Cite web |title=Chambolle-Pock · Manopt.jl |url=https://docs.juliahub.com/Manopt/h1Pdc/0.3.8/solvers/ChambollePock.html |access-date=2023-07-07 |website=docs.juliahub.com |language=en}}</ref> package implements the algorithm in [[Julia (programming language)|Julia]].
* [[Gabriel Peyré]] implements the algorithm in [[MATLAB]],{{refn|group="note"|These codes were used to obtain the images in the article.}} Julia, [[R (programming language)|R]] and [[Python (programming language)|Python]].
* In the Operator Discretization Library (ODL),<ref>{{Cite web |title=Chambolle-Pock solver — odl 0.6.1.dev0 documentation |url=https://odl.readthedocs.io/guide/chambolle_pock_guide.html#chambolle-pock-guide |access-date=2023-07-07 |website=odl.readthedocs.io}}</ref> a Python library for inverse problems, {{code|chambolle_pock_solver}} implements the method.
{{Optimization algorithms|convex}}