Steffensen's method

In numerical analysis, Steffensen's method is a root-finding method. It is similar to Newton's method and it also achieves quadratic convergence, but it does not use derivatives. The method is named after Johan Frederik Steffensen.

Simple description

The simplest form of the formula for Steffensen's method occurs when it is used to find the zeros, or roots, of a function $f$ , that is, to find the input value $x_{\star }$ that satisifies $f(x_{\star })=0$ . Near the solution $x_{\star }$ , the function $f$ is supposed to approximately satisfy $-1<f'(x_{\star })<0$ , which makes it adequate as an correction function for finding its own solution, although it is not required to be efficient. For some functions, Steffensen's method can work even if this condition is not met, but in such a case, the starting value $x_{0}$ must be very close to the actual solution $x_{\star }$ , and convergence to the solution may be slow.

Given an adequate starting value $x_{0}\$ , a sequence of values $x_{0},\ x_{1},\ x_{2},\dots ,\ x_{n},\dots$ can be generated. When it works, each value in the sequence is much closer to the solution $x_{\star }$ than the prior value. The value $x_{n}\$ from the current step generates the value $x_{n+1}\$ for the next step, via this formula^[1]:

x_{n+1}=x_{n}-{\frac {f(x_{n})}{g(x_{n})}}

for n = 0, 1, 2, 3, ... , where the slope function $g(x_{n})$ is a composite of the original function $f$ given by the following formula:

g(x_{n})={\frac {f(x_{n}+f(x_{n}))-f(x_{n})}{f(x_{n})}}

The function $g$ is the average slope of the function ƒ between the last sequence point $(x,y)=(x_{n},\ f(x_{n}))$ and the auxiliary point $(x,y)=(x_{n}+h,\ f(x_{n}+h))$ , with the step $h=f(x_{n})\$ . It is only for the purpose of finding this auxiliary point that the value of the function $f$ must be an adequate correction to get closer to its own solution. For all other parts of the calculation, Steffensen's method only requires the function $f$ to be continuous, and to actually have a nearby solution. Several modest modifications of the step $h$ in the slope calculation $g$ exist to accommodate functions $f$ that do not quite meet this requirement.

The main advantage of Steffensen's method is that it can find the roots of an equation $f$ just as "quickly" as Newton's method but the formula does not require a separate function for the derivative, so it can be programmed for any generic function. In this case quickly means that the number of correct digits in the answer doubles with each step. The cost for the quick convergence is the double function evaluation: both $f(x_{n})$ and $f(x_{n}+f(x_{n}))$ must be calculated, which might be time-consuming if $f$ is a complicated function. For comparison, the secant method needs only one function evaluation per step, so allowing for two function evaluations the secant method can do two steps and two steps of the secant method increase the number of correct digits by a factor 2.6 while one step of Steffensen's (or Newton's) method increases it by a factor 2.

Similar to Newton's method and most other quadratically convergent methods, the crucial weakness with the method is the choice of the starting value $x_{0}$ . If the value of $x_{0}$ is not "close enough" to the actual solution, the method will fail and the sequence of values $x_{0},x_{1},x_{2},x_{3},\dots$ will either flip flop between two extremes, or diverge to infinity (possibly both!).

Implementation in Matlab

Here is the source for an implementation of Steffensen's Method in MATLAB.

function Steffensen(f,p0,tol)
% This function takes as inputs: a fixed point iteration function, f, 
% and initial guess to the fixed point, p0, and a tolerance, tol.
% The fixed point iteration function is assumed to be input as an
% inline function. 
% This function will calculate and return the fixed point, p, 
% that makes the expression f(x) = p true to within the desired 
% tolerance, tol. 

format compact % This shortens the output.
format long    % This prints more decimal places. 

for i=1:1000   % get ready to do a large, but finite, number of iterations.
               % This is so that if the method fails to converge, we won't
               % be stuck in an infinite loop.
    p1=f(p0);  % calculate the next two guesses for the fixed point.
    p2=f(p1);
    p=p0-(p1-p0)^2/(p2-2*p1+p0) % use Aitken's delta squared method to
                                % find a better approximation to p0.
    if abs(p-p0)<tol  % test to see if we are within tolerance.
        break         % if we are, stop the iterations, we have our answer.
    end
    p0=p;              % update p0 for the next iteration.
end
if abs(p-p0)>tol       % If we fail to meet the tolerance, we output a
                       % message of failure.
    'failed to converge in 1000 iterations.'
end

Generalisation

Steffensen's method can also be used to find an input $x=x_{\star }$ of the function $f$ that produces the same output: $x_{\star }=f(x_{\star })$ . Such solutions are called "fixed points". Many of such functions can be used to find their own solutions by repeatedly recycling the result back as input, but the rate of convergence can be slow, or the function can fail to converge at all, depending on the individual function. Steffensen's method accelerates convergence.

This method for finding fixed points of a real-valued function has been generalised for functions $f:X\to X$ on a Banach space $X$ . The generalised method assumes that a family of bounded linear operators $\{L(u,v):u,v\in X\}$ associated with $u\$ and $v\$ are can be found to satisfy the condition^[2]

f(u)-f(v)=L(u,v)\ (u-v)

.

In the original form (given in the section above), where the function $f\$ simply takes in and produces real numbers, the operators are divided differences. In the general form, the operators $L\$ are the analogue of divided differences in the Banach space.

Steffensen's method is then very similar to the Newton's method, except that it uses the divided difference $L(f(x),x)\$ instead of the derivative $Df(x)\$ . It is thus defined by

x_{n+1}=x_{n}+[I-L(f(x_{n}),x_{n})]^{-1}(f(x_{n})-x_{n})\

,

for $n=1,\ 2,\ 3,\ ...$ , and where $I\$ is the identity operator. If the operator $L\$ satisfies

\|L(u,v)-L(x,y)\|\leq K{\big (}\|u-x\|+\|v-y\|{\big )}

for some constant $K\$ , then the method converges quadratically to a fixed point of ƒ if the initial approximation $x_{0}\$ is sufficiently close to the desired solution $x_{\star }$ , that satisfies $x_{\star }=f(x_{\star })$ .

References

^ Germund Dahlquist, Åke Björck, tr. Ned Anderson (1974) Numerical Methods, pp. 230-231, Prentice Hall, Englewood Cliffs, NJ
^ L. W. Johnson; D. R. Scholz (1968) On Steffensen's Method, SIAM Journal on Numerical Analysis (June 1968), vol. 5, no. 2., pp. 296-302. Stable URL: [1]

[1] Germund Dahlquist, Åke Björck, tr. Ned Anderson (1974) Numerical Methods, pp. 230-231, Prentice Hall, Englewood Cliffs, NJ

[2] L. W. Johnson; D. R. Scholz (1968) On Steffensen's Method, SIAM Journal on Numerical Analysis (June 1968), vol. 5, no. 2., pp. 296-302. Stable URL: [1]

[1]

[2]