The '''Frank–Wolfe algorithm''' is an [[iterative method|iterative]] [[First-order approximation|first-order]] [[Mathematical optimization|optimization]] [[algorithm]] for [[constrained optimization|constrained]] [[convex optimization]]. Also known as the '''conditional gradient method''', the '''reduced gradient algorithm''' and the '''convex combination algorithm''', the method was originally proposed by [[Marguerite Frank]] and [[Philip Wolfe]] in 1956.<ref>{{cite journal |last1=Frank |first1=M. |last2=Wolfe |first2=P. |title=An algorithm for quadratic programming |journal=Naval Research Logistics Quarterly |volume=3 |number=1–2 |year=1956 |pages=95–110 |doi=10.1002/nav.3800030109}}</ref> In each iteration, the Frank–Wolfe algorithm considers a [[linear approximation]] of the objective function, and moves slightly towards a minimizer of this linear function (taken over the same ___domain).
While competing methods such as [[gradient descent]] for constrained optimization require a [[Projection (mathematics)|projection step]] back to the feasible set in each iteration, the Frank–Wolfe algorithm only needs the solution of a linear problem over the same set in each iteration, and automatically stays in the feasible set.
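To illustrate the projection-free property described above, the following sketch (not from the article; the function name and radius parameter are illustrative assumptions) shows that for the <math>\ell_1</math>-ball the linear subproblem is solved in closed form by a signed vertex, so no projection onto the feasible set is ever needed:

```python
# Illustrative sketch: linear minimization oracle for the l1-ball
# D = {x : ||x||_1 <= r}. The minimizer of s^T g over D is a signed
# vertex of the ball, found by inspecting the gradient g directly.
import numpy as np

def l1_ball_oracle(g, r=1.0):
    i = np.argmax(np.abs(g))       # coordinate with the largest |gradient|
    s = np.zeros_like(g, dtype=float)
    s[i] = -r * np.sign(g[i])      # move against the gradient's sign
    return s

g = np.array([0.5, -2.0, 1.0])
s = l1_ball_oracle(g)              # -> [0., 1., 0.], with s^T g = -2
```

Here the oracle attains <math>\min_{s} \mathbf{s}^T \mathbf{g} = -r\|\mathbf{g}\|_\infty</math>, whereas a projected-gradient step would require projecting an arbitrary point back onto the ball.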
:Minimize <math> f(\mathbf{x})</math>
:subject to <math> \mathbf{x} \in \mathcal{D}</math>.
where the function <math>f</math> is [[Convex function|convex]] and [[differentiable function|differentiable]], and the ___domain or feasible set <math>\mathcal{D}</math> is a [[Compact space|compact]] [[convex set]].
==Algorithm==
Initialization. Let <math>k \leftarrow 0</math> and let <math>\mathbf{x}_0 \!</math> be any point in <math>\mathcal{D}</math>.
'''Step 1.''' Direction-finding subproblem.
:Minimize <math>\mathbf{s}^T \nabla f(\mathbf{x}_k)</math>
:Subject to <math>\mathbf{s} \in \mathcal{D}</math>
Interpretation: Minimize the linear approximation of the problem given by the first-order [[Taylor series|Taylor approximation]] of f around <math>\mathbf{x}_k \!</math>.
'''Step 2.''' Step size determination. Set <math>\gamma \leftarrow \frac{2}{k+2}</math>, or alternatively find <math>\gamma \!</math> that minimizes <math> f(\mathbf{x}_k+\gamma(\mathbf{s}-\mathbf{x}_k))</math> subject to <math>0 \le \gamma \le 1</math>.
'''Step 3.''' Update. Let <math>\mathbf{x}_{k+1}\leftarrow \mathbf{x}_k+\gamma(\mathbf{s}-\mathbf{x}_k)</math>, let <math>k \leftarrow k+1</math> and go back to Step 1.
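The three steps above can be sketched in code as follows (an illustrative sketch, not from the article; the quadratic objective, simplex ___domain, and all function names are assumptions chosen so the direction-finding subproblem has a closed-form solution):

```python
# Illustrative Frank-Wolfe sketch: minimize f(x) = 0.5 * ||x - b||^2
# over the probability simplex {x : x >= 0, sum(x) = 1}. Over the
# simplex, Step 1 (minimize s^T grad f(x_k)) is solved by the vertex
# whose gradient entry is smallest.
import numpy as np

def frank_wolfe(grad, linear_oracle, x0, iterations=200):
    x = x0.astype(float)
    for k in range(iterations):
        s = linear_oracle(grad(x))   # Step 1: direction-finding subproblem
        gamma = 2.0 / (k + 2.0)      # Step 2: default step size 2/(k+2)
        x = x + gamma * (s - x)      # Step 3: convex-combination update
    return x

def simplex_oracle(g):
    s = np.zeros_like(g, dtype=float)
    s[np.argmin(g)] = 1.0            # best vertex of the simplex
    return s

b = np.array([0.1, 0.7, 0.2])        # optimum lies inside the simplex
grad = lambda x: x - b
x_star = frank_wolfe(grad, simplex_oracle, np.ones(3) / 3)
```

Note that every iterate is a convex combination of points in the feasible set, so the iterates stay feasible without any projection, matching the discussion in the introduction.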
==Comments==