Levenberg–Marquardt algorithm


The Levenberg-Marquardt algorithm provides a numerical solution to the mathematical problem of minimizing a sum of squares of several generally nonlinear functions that depend on a common set of parameters.

This minimization problem arises especially in least squares curve fitting (see also: nonlinear programming).

The Levenberg-Marquardt algorithm (LMA) interpolates between the Gauss-Newton algorithm (GNA) and the method of gradient descent. The LMA is more robust than the GNA, which means that in many cases it finds a solution even if it starts very far from the final minimum. On the other hand, for well-behaved functions and reasonable starting parameters, the LMA tends to be a bit slower than the GNA. The LMA is the most popular curve-fitting algorithm; it is used in almost every software package that provides a generic curve-fitting tool, and few users will ever need another curve-fitting algorithm.

The problem

We are given m functions f1, ..., fm of n parameters p1, ..., pn, with m ≥ n. It is convenient to use vector notation for both the functions and the parameters,

fT = (f1, ..., fm) and pT = (p1, ..., pn).

The least squares problem consists in finding the parameter vector p for which the cost function

S(p) = fTf = Σi=1,...,m [fi(p)]2

becomes minimal.
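
In code, the cost function is simply the squared Euclidean norm of the residual vector. The following Python sketch (using NumPy; the particular residual function f is a made-up example, not part of the article) illustrates the definition:

    import numpy as np

    def f(p):
        # Hypothetical residual vector: m = 3 functions of n = 2 parameters.
        return np.array([p[0] - 1.0,
                         p[1] - 2.0,
                         p[0] * p[1] - 2.0])

    def S(p):
        # Cost function: S(p) = f(p)^T f(p), the sum of squared residuals.
        r = f(p)
        return r @ r

    print(S(np.array([1.0, 2.0])))  # prints 0.0: all residuals vanish here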

The main application is in the least squares curve fitting problem: given a set of empirical data pairs (ti,yi), optimize the parameters p of the model curve c(t|p) so that the sum of the squares of the deviations

fi(p) = yi - c(ti|p)

becomes minimal.

(A word on notation: we avoid the letter x because in the literature it is sometimes used in place of our p and sometimes in place of our t.)
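
For the curve-fitting application, a minimal sketch, assuming an exponential model curve c(t|p) = p0 exp(p1 t) and a small synthetic data set (both chosen purely for illustration):

    import numpy as np

    t = np.array([0.0, 1.0, 2.0, 3.0])   # empirical abscissae t_i
    y = np.array([2.0, 2.7, 3.6, 4.9])   # empirical ordinates y_i

    def c(t, p):
        # Hypothetical model curve c(t|p); any parametric model would do.
        return p[0] * np.exp(p[1] * t)

    def residuals(p):
        # f_i(p) = y_i - c(t_i|p), the deviations to be minimized.
        return y - c(t, p)

    p = np.array([2.0, 0.3])
    print(np.sum(residuals(p) ** 2))     # the sum of squared deviations S(p)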

The solution

Like other numeric minimization algorithms, the Levenberg-Marquardt algorithm is an iterative procedure. To start a minimization, the user has to provide an initial guess for the parameter vector p. In many cases, an uninformed standard guess like pT=(1,1,...,1) will work fine; in other cases, the algorithm converges only if the initial guess is already somewhat close to the final solution.

In each iteration step, the parameter vector p is replaced by a new estimate p+q. To determine q, the functions fi(p+q) are approximated by their linearizations

f(p+q) ≈ f(p) + Jq

where J is the Jacobian of f at p.
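
This linear approximation can be checked numerically. The sketch below estimates the Jacobian of the residuals from the previous example by forward differences (an illustrative choice; analytic derivatives would also work) and compares f(p+q) with f(p) + Jq for a small step q:

    import numpy as np

    t = np.array([0.0, 1.0, 2.0, 3.0])
    y = np.array([2.0, 2.7, 3.6, 4.9])

    def residuals(p):
        return y - p[0] * np.exp(p[1] * t)

    def jacobian(f, p, h=1e-7):
        # Forward-difference Jacobian: J[i, j] ≈ ∂f_i/∂p_j at p.
        f0 = f(p)
        J = np.empty((f0.size, p.size))
        for j in range(p.size):
            dp = np.zeros_like(p)
            dp[j] = h
            J[:, j] = (f(p + dp) - f0) / h
        return J

    p = np.array([2.0, 0.3])
    q = np.array([0.01, -0.02])
    J = jacobian(residuals, p)
    print(residuals(p + q))        # exact residuals at p + q
    print(residuals(p) + J @ q)    # linearization; nearly identical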

The linearized sum of squares S(p+q) ≈ (f + Jq)T(f + Jq) becomes minimal where its gradient with respect to q vanishes, ∇qS = 0. Differentiating and setting the result to zero leads to the equation

(JTJ)q = -JTf

from which q can be obtained by solving this linear system (formally, by inverting JTJ); this is exactly the step of the Gauss-Newton algorithm.
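
In code, the step is obtained with a linear solver rather than an explicit matrix inverse. A minimal sketch, assuming a residual vector f and Jacobian J are given as NumPy arrays (the numbers are placeholders):

    import numpy as np

    def gauss_newton_step(J, f):
        # Solve the normal equations (J^T J) q = -J^T f for the step q.
        # np.linalg.solve is preferred over forming inv(J^T J) explicitly.
        return np.linalg.solve(J.T @ J, -J.T @ f)

    J = np.array([[1.0, 0.0],
                  [0.0, 1.0],
                  [2.0, 1.0]])
    f = np.array([0.5, -0.3, 0.2])
    print(gauss_newton_step(J, f))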

Levenberg's contribution is to replace the normal equations above by a damped version,

(JTJ + λI)q = -JTf,

where I is the identity matrix.

The damping factor λ is redetermined at every iteration: if a step reduces S, it is accepted and λ is decreased, moving the algorithm closer to the Gauss-Newton algorithm; if a step fails to reduce S, λ is increased, moving it closer to gradient descent.

If the length of the computed step q, or the reduction of the sum of squares achieved from the latest parameter vector p, falls below predefined limits, the iteration is stopped and the last parameter vector p is considered to be the solution.
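
Putting these pieces together, one possible realization of the damped iteration with a simple λ-update and the stopping tests described above might look as follows. This is a minimal sketch: the update factors 10 and 0.1, the tolerances, and the finite-difference Jacobian are illustrative assumptions, not prescribed by the algorithm.

    import numpy as np

    def levenberg_marquardt(f, jac, p, lam=1e-3, tol_step=1e-8,
                            tol_reduction=1e-12, max_iter=100):
        # Minimize S(p) = f(p)^T f(p) by the damped iteration
        # (J^T J + lam*I) q = -J^T f.
        r = f(p)
        S = r @ r
        for _ in range(max_iter):
            J = jac(p)   # recomputed even after a rejected step; kept simple
            q = np.linalg.solve(J.T @ J + lam * np.eye(p.size), -J.T @ r)
            r_new = f(p + q)
            S_new = r_new @ r_new
            if S_new < S:
                # Stopping tests: small step or small reduction of S.
                if np.linalg.norm(q) < tol_step or S - S_new < tol_reduction:
                    return p + q
                # Step accepted: decrease damping (more Gauss-Newton-like).
                p, r, S = p + q, r_new, S_new
                lam *= 0.1
            else:
                # Step rejected: increase damping (more gradient-descent-like).
                lam *= 10.0
        return p

    # Usage: fit c(t|p) = p0 * exp(p1 * t) to synthetic data.
    t = np.array([0.0, 1.0, 2.0, 3.0])
    y = 2.0 * np.exp(0.3 * t)

    def residuals(p):
        return y - p[0] * np.exp(p[1] * t)

    def jac(p, h=1e-7):
        f0 = residuals(p)
        J = np.empty((f0.size, p.size))
        for j in range(p.size):
            dp = np.zeros_like(p)
            dp[j] = h
            J[:, j] = (residuals(p + dp) - f0) / h
        return J

    print(levenberg_marquardt(residuals, jac, np.array([1.0, 1.0])))
    # should print approximately [2.0, 0.3]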

References

The algorithm was first published in 1944 by Kenneth Levenberg, while working at the Frankford Army Arsenal. It was rediscovered in 1963 by Donald Marquardt.

  • Levenberg, K. "A Method for the Solution of Certain Problems in Least Squares." Quart. Appl. Math. 2, 164-168, 1944.
  • Marquardt, D. "An Algorithm for Least-Squares Estimation of Nonlinear Parameters." SIAM J. Appl. Math. 11, 431-441, 1963.