The (non-negative) damping factor {{tmath|\lambda}} is adjusted at each iteration. If reduction of {{tmath|S}} is rapid, a smaller value can be used, bringing the algorithm closer to the [[Gauss–Newton algorithm]], whereas if an iteration gives insufficient reduction in the residual, {{tmath|\lambda}} can be increased, giving a step closer to the gradient-descent direction. Note that the [[gradient]] of {{tmath|S}} with respect to {{tmath|\boldsymbol\beta}} equals <math>-2\left (\mathbf J^{\mathrm T}\left [\mathbf y - \mathbf f\left (\boldsymbol\beta\right )\right ]\right )^{\mathrm T}</math>. Therefore, for large values of {{tmath|\lambda}}, the step will be taken approximately in the direction opposite to the gradient. If either the length of the calculated step {{tmath|\boldsymbol\delta}} or the reduction of the sum of squares from the latest parameter vector {{tmath|\boldsymbol\beta + \boldsymbol\delta}} falls below predefined limits, iteration stops, and the last parameter vector {{tmath|\boldsymbol\beta}} is considered to be the solution.
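The adjustment rule described above can be sketched as follows. This is a minimal illustration with identity damping and a simple multiply-or-divide-by-10 schedule; the function names and the factor of 10 are illustrative choices, not prescribed by the algorithm:

```python
import numpy as np

def levenberg_marquardt(f, jac, y, beta, lam=1e-3, tol=1e-8, max_iter=100):
    """Minimal Levenberg-Marquardt sketch (identity damping).

    f(beta)   -> model predictions, shape (m,)
    jac(beta) -> Jacobian of f at beta, shape (m, n)
    y         -> observed data, shape (m,)
    """
    for _ in range(max_iter):
        r = y - f(beta)                     # current residuals
        J = jac(beta)
        # Solve the damped normal equations (J^T J + lam I) delta = J^T r
        A = J.T @ J + lam * np.eye(J.shape[1])
        delta = np.linalg.solve(A, J.T @ r)
        new_beta = beta + delta
        if np.sum((y - f(new_beta)) ** 2) < np.sum(r ** 2):
            beta = new_beta
            lam /= 10.0                     # good step: move toward Gauss-Newton
        else:
            lam *= 10.0                     # bad step: move toward gradient descent
        if np.linalg.norm(delta) < tol:     # step length below limit: stop
            break
    return beta
```

Fitting a straight line <code>y = 2x + 1</code>, for example, converges in a handful of iterations because the damped step is nearly the exact Gauss–Newton step once {{tmath|\lambda}} becomes small.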
To make the solution scale invariant, Marquardt's algorithm solved a modified problem with each component of the gradient scaled according to the curvature. This provides larger movement along directions where the gradient is smaller, which avoids slow convergence along directions of small gradient. Fletcher in his 1971 paper ''A modified Marquardt subroutine for non-linear least squares'' simplified the form, replacing the identity matrix {{tmath|\mathbf I}} with the diagonal matrix consisting of the diagonal elements of {{tmath|\mathbf J^\text{T}\mathbf J}}:
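Written out in the notation above, Fletcher's modified update is (a reconstruction of the standard form of his damping term, consistent with the equations in this section):

<math>\left [\mathbf J^{\mathrm T} \mathbf J + \lambda \operatorname{diag}\left (\mathbf J^{\mathrm T} \mathbf J\right )\right ] \boldsymbol\delta = \mathbf J^{\mathrm T} \left [\mathbf y - \mathbf f\left (\boldsymbol\beta\right )\right ]</math>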