Revision as of 02:29, 10 March 2020 edit MathRB (talk \| contribs) 17 edits →The solution ← Previous edit		Revision as of 15:27, 27 March 2020 edit undo 23.117.21.175 (talk) Fixed typo which was just the sentence “N.” I removed it, but the prior sentence needs some clarifications Tags: Visual edit Mobile edit Mobile web edit Next edit →
Line 46: The (non-negative) damping factor {{tmath\|\lambda}} is adjusted at each iteration. If reduction of {{tmath\|S}} is rapid, a smaller value can be used, bringing the algorithm closer to the [[Gauss–Newton algorithm]], whereas if an iteration gives insufficient reduction in the residual, {{tmath\|\lambda}} can be increased, giving a step closer to the gradient-descent direction. Note that the [[gradient]] of {{tmath\|S}} with respect to {{tmath\|\boldsymbol\beta}} equals <math>-2\left (\mathbf J^{\mathrm T}\left [\mathbf y - \mathbf f\left (\boldsymbol\beta\right )\right ]\right )^{\mathrm T}</math>. Therefore, for large values of {{tmath\|\lambda}}, the step will be taken approximately in the direction opposite to the gradient. If either the length of the calculated step {{tmath\|\boldsymbol\delta}} or the reduction of sum of squares from the latest parameter vector {{tmath\|\boldsymbol\beta + \boldsymbol\delta}} fall below predefined limits, iteration stops, and the last parameter vector {{tmath\|\boldsymbol\beta}} is considered to be the solution. Levenberg's algorithm has the disadvantage that if the value of damping factor {{tmath\|\lambda}} is large, inverting {{tmath\|\mathbf J^\text{T}\mathbf J + \lambda\mathbf I}} is not used at all~~. R~~. Fletcher provided the insight that we can scale each component of the gradient according to the curvature, so that there is larger movement along the directions where the gradient is smaller. This avoids slow convergence in the direction of small gradient. Therefore, Fletcher in his 1971 paper ''A modified Marquardt subroutine for non-linear least squares'' replaced the identity matrix {{tmath\|\mathbf I}} with the diagonal matrix consisting of the diagonal elements of {{tmath\|\mathbf J^\text{T}\mathbf J}}, thus making the solution scale invariant: :<math>\left [\mathbf J^{\mathrm T} \mathbf J + \lambda \operatorname{diag}\left (\mathbf J^{\mathrm T} \mathbf J\right )\right ] \boldsymbol\delta = \mathbf J^{\mathrm T}\left [\mathbf y - \mathbf f\left (\boldsymbol\beta\right )\right ].</math>

Levenberg–Marquardt algorithm: Difference between revisions