Levenberg's algorithm has the disadvantage that if the value of the damping factor, λ, is large, the term λ'''I''' dominates and the curvature information in '''J'''<sup>T</sup>'''J''' is effectively not used at all. Marquardt provided the insight that each component of the gradient can be scaled according to the curvature, so that there is larger movement along the directions in which the gradient is smaller. This avoids slow convergence in the directions of small gradient. Marquardt therefore replaced the identity matrix, '''I''', with the diagonal matrix consisting of the diagonal elements of '''J'''<sup>T</sup>'''J''', resulting in the Levenberg–Marquardt algorithm:
:<math>\left[\mathbf{J}^\mathrm{T} \mathbf{J} + \lambda \operatorname{diag}\left(\mathbf{J}^\mathrm{T} \mathbf{J}\right)\right] \boldsymbol\delta = \mathbf{J}^\mathrm{T} \left[\mathbf{y} - \mathbf{f}\left(\boldsymbol\beta\right)\right]</math>.
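As a concrete illustration of this update, the following minimal sketch computes the damped step with Marquardt's diagonal scaling and applies it to a small synthetic fitting problem. The function name <code>lm_step</code>, the exponential test model, and the simple halve/double rule for adjusting λ are illustrative assumptions for this sketch, not a prescribed implementation of the algorithm.

<syntaxhighlight lang="python">
import numpy as np

def lm_step(J, r, lam):
    # Solve (J^T J + lam * diag(J^T J)) delta = J^T r for the update delta,
    # where r = y - f(beta) is the current residual vector.
    JTJ = J.T @ J
    damped = JTJ + lam * np.diag(np.diag(JTJ))  # Marquardt's curvature-based damping
    return np.linalg.solve(damped, J.T @ r)

# Illustrative fit of the model f(t; beta) = exp(beta * t) to noisy data.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 20)
y = np.exp(1.5 * t) + 0.01 * rng.standard_normal(t.size)

beta = np.array([0.0])  # initial guess for the parameter
lam = 1e-2              # initial damping factor
for _ in range(50):
    f = np.exp(beta[0] * t)
    J = (t * f)[:, None]        # Jacobian of f with respect to beta (one column)
    r = y - f
    delta = lm_step(J, r, lam)
    # Accept the step only if it reduces the sum of squared residuals;
    # otherwise increase the damping and retry on the next iteration.
    if np.sum((y - np.exp((beta + delta)[0] * t)) ** 2) < np.sum(r ** 2):
        beta, lam = beta + delta, lam * 0.5
    else:
        lam *= 2.0

print(beta)  # approaches 1.5 for this synthetic data
</syntaxhighlight>

Because the damping term is proportional to the diagonal of '''J'''<sup>T</sup>'''J''', parameters with small curvature receive relatively larger steps, which is exactly the scaling behaviour described above.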
A similar damping factor appears in [[Tikhonov regularization]], which is used to solve linear [[ill-posed problems]], as well as in [[ridge regression]], an [[estimation theory|estimation]] technique in [[statistics]].