The '''Frank–Wolfe algorithm''' is an [[iterative method|iterative]] [[First-order approximation|first-order]] [[Mathematical optimization|optimization]] [[algorithm]] for [[constrained optimization|constrained]] [[convex optimization]]. Also known as the '''conditional gradient method''', the '''reduced gradient algorithm''' and the '''convex combination algorithm''', the method was originally proposed by [[Marguerite Frank]] and [[Philip Wolfe]] in 1956.<ref>{{cite journal |last1=Frank |first1=M. |last2=Wolfe |first2=P. |title=An algorithm for quadratic programming |journal=Naval Research Logistics Quarterly |volume=3 |number=1–2 |year=1956 |pages=95–110 |doi=10.1002/nav.3800030109}}</ref> In each iteration, the Frank–Wolfe algorithm considers a [[linear approximation]] of the objective function, and moves slightly towards a minimizer of this linear function (taken over the same ___domain).
While competing methods such as [[gradient descent]] for constrained optimization require a [[Projection (mathematics)|projection step]] back to the feasible set in each iteration, the Frank–Wolfe algorithm only needs the solution of a linear problem over the same set in each iteration, and automatically stays in the feasible set.
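To illustrate the projection-free property described above, the following sketch (not from the article; the function name and radius parameter are illustrative assumptions) shows that for the <math>\ell_1</math>-ball the linear subproblem is solved in closed form by a signed vertex, so no projection onto the feasible set is ever needed:

```python
# Illustrative sketch: linear minimization oracle for the l1-ball
# D = {x : ||x||_1 <= r}. The minimizer of s^T g over D is a signed
# vertex of the ball, found by inspecting the gradient g directly.
import numpy as np

def l1_ball_oracle(g, r=1.0):
    i = np.argmax(np.abs(g))       # coordinate with the largest |gradient|
    s = np.zeros_like(g, dtype=float)
    s[i] = -r * np.sign(g[i])      # move against the gradient's sign
    return s

g = np.array([0.5, -2.0, 1.0])
s = l1_ball_oracle(g)              # -> [0., 1., 0.], with s^T g = -2
```

Here the oracle attains <math>\min_{s} \mathbf{s}^T \mathbf{g} = -r\|\mathbf{g}\|_\infty</math>, whereas a projected-gradient step would require projecting an arbitrary point back onto the ball.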
:Minimize <math> f(\mathbf{x})</math>
:subject to <math> \mathbf{x} \in \mathcal{D}</math>.
where the function <math>f</math> is [[Convex function|convex]] and [[differentiable function|differentiable]], and the ___domain or feasible set <math>\mathcal{D}</math> is a [[Compact space|compact]] [[convex set]].
==Algorithm==
Initialization. Let <math>k \leftarrow 0</math> and let <math>\mathbf{x}_0 \!</math> be any point in <math>\mathcal{D}</math>.
'''Step 1.''' Direction-finding subproblem.
:Minimize <math>\mathbf{s}^T \nabla f(\mathbf{x}_k)</math>
:Subject to <math>\mathbf{s} \in \mathcal{D}</math>
Interpretation: Minimize the linear approximation of the problem given by the first-order [[Taylor series|Taylor approximation]] of f around <math>\mathbf{x}_k \!</math>.
'''Step 2.''' Step size determination. Set <math>\gamma \leftarrow \frac{2}{k+2}</math>, or alternatively find <math>\gamma \!</math> that minimizes <math> f(\mathbf{x}_k+\gamma(\mathbf{s}-\mathbf{x}_k))</math> subject to <math>0 \le \gamma \le 1</math>.
'''Step 3.''' Update. Let <math>\mathbf{x}_{k+1}\leftarrow \mathbf{x}_k+\gamma(\mathbf{s}-\mathbf{x}_k)</math>, let <math>k \leftarrow k+1</math> and go back to Step 1.
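The three steps above can be sketched in code as follows (an illustrative sketch, not from the article; the quadratic objective, simplex ___domain, and all function names are assumptions chosen so the direction-finding subproblem has a closed-form solution):

```python
# Illustrative Frank-Wolfe sketch: minimize f(x) = 0.5 * ||x - b||^2
# over the probability simplex {x : x >= 0, sum(x) = 1}. Over the
# simplex, Step 1 (minimize s^T grad f(x_k)) is solved by the vertex
# whose gradient entry is smallest.
import numpy as np

def frank_wolfe(grad, linear_oracle, x0, iterations=200):
    x = x0.astype(float)
    for k in range(iterations):
        s = linear_oracle(grad(x))   # Step 1: direction-finding subproblem
        gamma = 2.0 / (k + 2.0)      # Step 2: default step size 2/(k+2)
        x = x + gamma * (s - x)      # Step 3: convex-combination update
    return x

def simplex_oracle(g):
    s = np.zeros_like(g, dtype=float)
    s[np.argmin(g)] = 1.0            # best vertex of the simplex
    return s

b = np.array([0.1, 0.7, 0.2])        # optimum lies inside the simplex
grad = lambda x: x - b
x_star = frank_wolfe(grad, simplex_oracle, np.ones(3) / 3)
```

Note that every iterate is a convex combination of points in the feasible set, so the iterates stay feasible without any projection, matching the discussion in the introduction.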
==Comments==