Frank–Wolfe algorithm


In mathematical optimization, the reduced gradient method of Frank and Wolfe is an iterative method for nonlinear programming. Also known as the Frank–Wolfe algorithm and the convex combination algorithm, the reduced gradient method was proposed by Marguerite Frank and Phil Wolfe in 1956 as an algorithm for solving quadratic programming problems. The method is initialized by finding a feasible solution to the linear constraints. At each iteration, the method takes a descent step in the negative gradient direction, thereby reducing the objective function; this gradient descent step is "reduced" so that the iterate remains in the polyhedral feasible region defined by the linear constraints. Because quadratic programming is a generalization of linear programming, the reduced gradient method is a generalization of Dantzig's simplex algorithm for linear programming.

More generally, the reduced gradient method can be applied to nonlinear programming problems beyond quadratic programming. While the method is slower than competing methods and has been abandoned as a general-purpose method for nonlinear programming, it remains widely used for specially structured problems of large-scale optimization. In particular, the reduced gradient method remains popular and effective for finding approximate minimum-cost flows in transportation networks, which often have hundreds of thousands of nodes.

Problem statement

Minimize f(x) = xᵀEx + hᵀx
subject to x ∈ P,

where the n×n matrix E is positive semidefinite, h is an n×1 vector, and P represents a feasible region defined by a mix of linear inequality and equality constraints (for example Ax ≤ b, Cx = d).
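
As a small concrete illustration of this problem statement, the following sketch evaluates the quadratic objective and its gradient in Python with NumPy. The particular E, h, and test point are assumptions chosen for the example, not values from the article.

```python
# Minimal sketch of the quadratic objective f(x) = x^T E x + h^T x
# for an assumed 2-dimensional instance (E and h chosen for illustration).
import numpy as np

E = np.eye(2)                   # positive semidefinite n x n matrix (here n = 2)
h = np.array([-2.0, -3.0])      # n x 1 vector


def f(x):
    """Quadratic objective x^T E x + h^T x."""
    return x @ E @ x + h @ x


def grad_f(x):
    """Gradient 2 E x + h (valid for symmetric E)."""
    return 2.0 * E @ x + h


x = np.array([1.0, 1.5])
print(f(x), grad_f(x))          # -3.25 and the zero vector: x is the unconstrained minimizer
```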

Algorithm

Step 1. Initialization. Let k = 0 and let x_0 be any point in P.

Step 2. Convergence test. If ∇f(x_k) = 0 then Stop; we have found the minimum.

Step 3. Direction-finding subproblem. Solve the approximation of the problem obtained by replacing the function f with its first-order Taylor expansion around x_k. That is, solve for x̄_k:

Minimize ∇f(x_k)ᵀx̄
Subject to x̄ ∈ P

(Note that this is a linear program: x_k is fixed during Step 3, while the minimization takes place by varying x̄, and it is equivalent to minimization of ∇f(x_k)ᵀ(x̄ − x_k).)

Step 4. Step size determination. Find λ that minimizes f(x_k + λ(x̄_k − x_k)) subject to 0 ≤ λ ≤ 1. If λ = 0 then Stop; we have found the minimum.

Step 5. Update. Let x_{k+1} = x_k + λ(x̄_k − x_k), let k = k + 1, and go back to Step 2.
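
A minimal sketch of these steps in Python follows. It is an illustration rather than a reference implementation: it assumes the feasible region is given only by inequality constraints Ax ≤ b, uses SciPy's linprog for the direction-finding linear program of Step 3, performs the exact line search of Step 4 in closed form for the quadratic objective, and the problem data at the end are assumed values for demonstration.

```python
# Sketch of the Frank-Wolfe / reduced gradient iteration for
# minimizing f(x) = x^T E x + h^T x over {x : A x <= b}.
import numpy as np
from scipy.optimize import linprog


def frank_wolfe(E, h, A, b, x0, max_iter=100, tol=1e-8):
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        grad = 2.0 * E @ x + h                       # gradient of the quadratic objective

        # Step 3: direction-finding subproblem, a linear program in x_bar.
        lp = linprog(c=grad, A_ub=A, b_ub=b,
                     bounds=[(None, None)] * len(x))
        x_bar = lp.x
        d = x_bar - x                                # feasible direction toward the LP vertex

        # Stopping test (Steps 2 and 4): no feasible direction gives further descent.
        if grad @ d > -tol:
            break

        # Step 4: exact line search over lambda in [0, 1] for the quadratic objective.
        denom = 2.0 * (d @ E @ d)
        lam = 1.0 if denom <= tol else min(1.0, -(grad @ d) / denom)

        # Step 5: update the iterate and repeat.
        x = x + lam * d
    return x


# Assumed example: minimize x1^2 + x2^2 - 2*x1 - 3*x2 over the box 0 <= x <= 2,
# written as A x <= b.  The iterates converge toward the minimizer (1.0, 1.5).
E = np.eye(2)
h = np.array([-2.0, -3.0])
A = np.vstack([np.eye(2), -np.eye(2)])
b = np.array([2.0, 2.0, 0.0, 0.0])
print(frank_wolfe(E, h, A, b, x0=np.zeros(2)))
```

Because each new iterate is a convex combination of x_k and the linear-programming vertex x̄_k, every iterate stays inside the polyhedral feasible region, which is the sense in which the gradient step is "reduced".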

Comments

The algorithm generally makes good progress towards the optimum during the first few iterations, but convergence often slows down substantially near the minimum point. For this reason the algorithm is perhaps best used to find an approximate solution. It can be shown that the worst-case convergence rate is sublinear; however, in practice the convergence rate has been observed to improve when there are many constraints.[1]
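
For context, the standard textbook form of that sublinear bound (not stated in this revision) for a convex objective over a bounded feasible region is

f(x_k) − f(x*) ≤ 2C / (k + 2),

where C is a curvature constant depending on f and on the diameter of P; the error therefore decays only on the order of 1/k rather than geometrically.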

References

Notes

  1. ^ "Nonlinear Programming", Dimitri Bertsekas, 2003, page 222. Athena Scientific, ISBN 1-886529-00-0.