Frank–Wolfe algorithm
:Initialization. Let <math>k \leftarrow 0</math> and let <math>\mathbf{x}_0 \!</math> be any point in <math>\mathcal{D}</math>.
 
:'''Step 1.''' Direction-finding subproblem. Find <math>\mathbf{s}_k</math> solving
::Minimize <math> \mathbf{s}^T \nabla f(\mathbf{x}_k)</math>
::Subject to <math>\mathbf{s} \in \mathcal{D}</math>
:Interpretation: Minimize the linear approximation of the problem given by the first-order [[Taylor series|Taylor approximation]] of <math>f</math> around <math>\mathbf{x}_k \!</math>.
 
:'''Step 2.''' Step size determination. Set <math>\gamma \leftarrow \frac{2}{k+2}</math>, or alternatively find <math>\gamma</math> that minimizes <math> f(\mathbf{x}_k+\gamma(\mathbf{s}_k -\mathbf{x}_k))</math> subject to <math>0 \le \gamma \le 1</math>.
 
:'''Step 3.''' Update. Let <math>\mathbf{x}_{k+1}\leftarrow \mathbf{x}_k+\gamma(\mathbf{s}_k-\mathbf{x}_k)</math>, let <math>k \leftarrow k+1</math> and go back to Step 1.
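The three steps above can be sketched in code. The following is a minimal illustration, not part of the article: the quadratic objective, the probability-simplex feasible set, and all function names (`frank_wolfe`, `simplex_lmo`) are assumptions chosen so that the direction-finding subproblem of Step 1 has a closed-form solution.

```python
import numpy as np

def frank_wolfe(grad, lmo, x0, num_iters=10000):
    """Frank-Wolfe with the predetermined step size gamma = 2/(k+2).

    grad: gradient of f.
    lmo: linear minimization oracle solving min_{s in D} s^T g (Step 1).
    x0: any feasible starting point (Initialization).
    """
    x = np.asarray(x0, dtype=float)
    for k in range(num_iters):
        g = grad(x)
        s = lmo(g)                   # Step 1: direction-finding subproblem
        gamma = 2.0 / (k + 2.0)      # Step 2: step size determination
        x = x + gamma * (s - x)      # Step 3: update
    return x

# Illustrative problem: minimize f(x) = ||x - b||^2 over the probability
# simplex. Its LMO returns the vertex e_i with i = argmin_i g_i.
b = np.array([0.2, 0.5, 0.3])
grad = lambda x: 2.0 * (x - b)

def simplex_lmo(g):
    s = np.zeros_like(g)
    s[np.argmin(g)] = 1.0
    return s

x = frank_wolfe(grad, simplex_lmo, x0=np.array([1.0, 0.0, 0.0]))
```

Since every iterate is a convex combination of the starting point and simplex vertices, `x` stays feasible throughout, and it approaches the minimizer `b` at the <math>O(1/k)</math> rate discussed below.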
 
==Properties==
 
While the worst-case convergence rate of <math>O(1/k)</math> cannot be improved in general, faster convergence can be obtained for special problem classes, such as some strongly convex problems.<ref>{{Cite book|title=Nonlinear Programming|first= Dimitri |last=Bertsekas|year= 2003|page= 222|publisher= Athena Scientific| isbn =1-886529-00-0}}</ref>
 
==Lower bounds on the solution value==
 
Since <math>f</math> is convex, for any two points <math>\mathbf{x}, \mathbf{y} \in \mathcal{D}</math> the value <math>f(\mathbf{y})</math> lies above the [[Tangent|tangent plane]] of <math>f</math> at <math>\mathbf{x}</math>:
 
:<math>
f(\mathbf{y}) \geq f(\mathbf{x}) + (\mathbf{y} - \mathbf{x})^T \nabla f(\mathbf{x})
</math>
 
This holds in particular for the optimal solution <math>\mathbf{x}^*</math>. The best lower bound with respect to a given point <math>\mathbf{x}</math> is given by
 
:<math>
f(\mathbf{x}^*) \geq \min_{\mathbf{y} \in \mathcal{D}} \left\{ f(\mathbf{x}) + (\mathbf{y} - \mathbf{x})^T \nabla f(\mathbf{x}) \right\} = f(\mathbf{x}) - \mathbf{x}^T \nabla f(\mathbf{x}) + \min_{\mathbf{y} \in \mathcal{D}} \mathbf{y}^T \nabla f(\mathbf{x})
</math>
 
The latter optimization problem is solved in every iteration of the Frank–Wolfe algorithm; therefore, the solution <math>\mathbf{s}_k</math> of the direction-finding subproblem of the <math>k</math>-th iteration can be used to determine nondecreasing lower bounds <math>u_k</math> during each iteration by setting <math>u_0 = - \infty</math> and
 
:<math>
u_k := \max \left( u_{k - 1},\; f(\mathbf{x}_k) + (\mathbf{s}_k - \mathbf{x}_k)^T \nabla f(\mathbf{x}_k) \right)
</math>
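Tracking this bound costs nothing extra, since <math>\mathbf{s}_k</math> is already computed in Step 1. A minimal sketch, reusing the same illustrative simplex problem as above (the objective, feasible set, and function names are assumptions, not from the article):

```python
import numpy as np

def frank_wolfe_with_bound(f, grad, lmo, x0, num_iters=2000):
    """Frank-Wolfe that also tracks the running lower bound
    u_k = max(u_{k-1}, f(x_k) + (s_k - x_k)^T grad f(x_k)).
    The difference f(x_k) - u_k then bounds the suboptimality
    f(x_k) - f(x*)."""
    x = np.asarray(x0, dtype=float)
    u = -np.inf
    for k in range(num_iters):
        g = grad(x)
        s = lmo(g)
        # s_k solves min_{y in D} y^T grad f(x_k), so this is the best
        # tangent-plane lower bound on f(x*) available at x_k:
        u = max(u, f(x) + (s - x) @ g)
        x = x + (2.0 / (k + 2.0)) * (s - x)
    return x, u

# Illustrative problem: minimize f(x) = ||x - b||^2 over the simplex,
# so the optimal value is 0 (b itself is feasible).
b = np.array([0.2, 0.5, 0.3])
f = lambda x: float(np.sum((x - b) ** 2))
grad = lambda x: 2.0 * (x - b)

def simplex_lmo(g):
    s = np.zeros_like(g)
    s[np.argmin(g)] = 1.0
    return s

x, u = frank_wolfe_with_bound(f, grad, simplex_lmo, np.array([1.0, 0.0, 0.0]))
```

Since <math>u_k \le f(\mathbf{x}^*)</math> at every iteration, the gap `f(x) - u` is a certificate of how far the current iterate can be from optimal; here it shrinks toward zero as the iterates approach `b`.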
 
==Notes==