{{short description|Newton-like root-finding algorithm that does not use derivatives}}
In [[numerical analysis]], '''Steffensen's method''' is an [[iterative method]] named after [[Johan Frederik Steffensen]] for numerical [[root-finding method|root-finding]] that is similar to [[Newton's method]], but does not require [[derivative]]s.
Steffensen's method has the drawback that it requires two function evaluations per step, whereas the secant method requires only one evaluation per step, so it is not necessarily the most efficient in terms of [[computational cost]], depending on the number of iterations each requires. Newton's method also requires evaluating two functions per step – for the function and for its derivative – and its computational cost varies between being at best the same as the secant method, and at worst the same as Steffensen's method. For most functions, calculation of the derivative is just as computationally costly as calculating the original function, and so the normal case is that Newton's method is equally costly as Steffensen's.{{efn|
For rare special case functions the derivative for Newton's method can be calculated at negligible cost, by using saved parts from evaluation of the main function. If optimized in this way, Newton's method becomes only slightly more costly per step than the secant method, and benefits from slightly faster convergence.
}}

Steffensen's method can be derived as an adaptation of [[Aitken's delta-squared process]] applied to [[fixed-point iteration]]. Viewed in this way, Steffensen's method naturally generalizes to efficient fixed-point calculation in general [[Banach space]]s, whenever fixed points are guaranteed to exist and fixed-point iteration is guaranteed to converge, although possibly slowly, by the [[Banach fixed-point theorem]].
==Simple description==
The simplest form of the formula for Steffensen's method occurs when it is used to find a [[zero of a function|zero]] of a [[real function]] <math>\ f\ ;</math> that is, to find the value <math>\ x_\star\ </math> that satisfies <math>\ f(x_\star) = 0 ~.</math> Near the solution <math>\ x_\star\ ,</math> the derivative of the function, <math>\ f'\ ,</math> is supposed to approximately satisfy <math>\ -1 < f'(x_\star) < 0 ~.</math>{{efn|name=deriv_cdx_note| The condition <math> -1 < f'(x_\star) < 0\ </math> ensures that if <math>\ f\ </math> is used as the correction in the fixed-point iteration <math>\ x \mapsto x + f(x)\ ,</math> the slope of that map near <math>\ x_\star\ </math> lies strictly between 0 and 1, so the iteration converges. }} For some functions, Steffensen's method can work even if this condition is not met, but in such a case, the starting value <math>\ x_0\ </math> must be ''very'' close to the actual solution <math>\ x_\star\ ,</math> and convergence to the solution may be slow. Adjustment of the size of the method's intermediate step, mentioned later, can improve convergence in some of these cases.

Given an adequate starting value <math>\ x_0\ ,</math> a sequence of values <math>\ x_0,\ x_1,\ x_2,\ \ldots\ </math> can be generated using the formula below. When the method works, each value in the sequence is much closer to the solution <math>\ x_\star\ </math> than the prior value. The value <math>\ x_n\ </math> from the current step generates the value <math>\ x_{n+1}\ </math> for the next step, via the formula
:<math> x_{n+1} = x_n - \frac{\ f( x_n )\ }{\ g( x_n )\ } </math>

for <math>\ n = 0, 1, 2, 3, \ldots\ ,</math> where the slope function <math>\ g( x_n )\ </math> is a composite of the original function <math>\ f\ </math> given by the formula

:<math> g( x_n ) = \frac{\ f\bigl(\ x_n + f( x_n )\ \bigr)\ }{\ f( x_n )\ } - 1 </math>

or perhaps more clearly,

:<math> g( x ) = \frac{\ f( x + h ) - f( x )\ }{\ h\ } \approx f'( x )\ ,</math>

where <math>\ h = f( x )\ </math> is a step-size between the last iteration point, <math>\ x\ ,</math> and an auxiliary point located at <math>\ x + h ~.</math>
Technically, the function <math>\ g\ </math> is called the first-order [[divided difference]] of <math>\ f\ </math> between those two points.{{efn|The divided difference may be of either forward type or backward type, depending on the sign of <math>\ h ~.</math>}}

Because the value of <math>\ g\ </math> is an approximation for <math>\ f'\ ,</math> its value can optionally be checked to see if it meets the condition <math>\ -1 < g < 0\ ,</math> which is required of <math>\ f'\ </math> to guarantee convergence: Any large departure from this condition is a warning that Steffensen's method is liable to fail for the given starting point.

It is only for the purpose of finding <math>\ h\ </math> for this auxiliary point that the value of the function <math>\ f\ </math> must be an adequate correction for getting closer to its own solution; for all other parts of the calculation, Steffensen's method only requires the function <math>\ f\ </math> to be [[continuous function|continuous]] and to actually have a nearby solution.
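The update formula above can be sketched directly in code. The following is a minimal illustrative sketch, separate from the reference implementations in the code example section below; the function name, default tolerance, and iteration cap are arbitrary choices:

:<syntaxhighlight lang="python">
def steffensen_root(f, x0, tol=1e-12, max_iter=100):
    """Minimal sketch of x_{n+1} = x_n - f(x_n)/g(x_n), with h = f(x_n)."""
    x = x0
    for _ in range(max_iter):
        fx = f(x)
        if abs(fx) < tol:            # close enough to a zero of f
            break
        gx = (f(x + fx) - fx) / fx   # divided difference with step h = f(x)
        x = x - fx / gx              # Steffensen update
    return x

# Example: the root of f(x) = x**2 - 2 near 1.5 is sqrt(2) = 1.41421356...
print(steffensen_root(lambda x: x**2 - 2, 1.5))
</syntaxhighlight>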
==Advantages and drawbacks==
The main advantage of Steffensen's method is that it has [[quadratic convergence]]<ref name=Dahlquist-Björck-1974/> like [[Newton's method]] – that is, both methods find roots to an equation <math>\ f\ </math> just as "quickly". In this case, ''quickly'' means that for both methods, the number of correct digits in the answer roughly doubles with each step. But the formula for Newton's method requires evaluation of the function's derivative <math>\ f'\ </math> as well as the function <math>\ f\ ,</math> while Steffensen's method only requires <math>\ f\ </math> itself. This is important when the derivative is not easily or efficiently available.

The price for the quick convergence is the double function evaluation: Both <math>\ f( x_n )\ </math> and <math>\ f\bigl(\ x_n + f( x_n )\ \bigr)\ </math> must be calculated, which might be time-consuming if <math>\ f\ </math> is a complicated function.
For comparison, the [[secant method]] needs only one function evaluation per step, so two steps of the secant method cost as much as one step of Steffensen's method. Since the secant method can carry out twice as many steps in the same time as Steffensen's method,{{efn|
Because <math>\ f\bigl(\ x_n + f( x_n )\ \bigr)\ </math> requires the prior calculation of <math>\ f( x_n )\ ,</math> the two evaluations must be done sequentially – the algorithm ''per se'' cannot be made faster by running the two function evaluations in parallel.
}} in practical use the secant method actually converges faster than Steffensen's method, when both algorithms succeed: The secant method achieves a factor of about {{nowrap|(1.6)<sup>2</sup> ≈ 2.6 times}} as many digits for every two steps (two function evaluations), compared to Steffensen's factor of 2 for every one step (two function evaluations).
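Expressed in terms of the error <math>\ \varepsilon_n = x_n - x_\star\ ,</math> this comparison is a brief restatement of the evaluation counts above, with <math>\ \varphi \approx 1.6\ </math> the order of convergence of the secant method:

:<math>\ \bigl|\ \varepsilon_{n+2}\ \bigr| \sim \bigl|\ \varepsilon_n\ \bigr|^{\varphi^2} \quad \text{(secant method: two steps, two evaluations)}\ , \qquad \bigl|\ \varepsilon_{n+1}\ \bigr| \sim \bigl|\ \varepsilon_n\ \bigr|^{2} \quad \text{(Steffensen's method: one step, two evaluations)}\ ,</math>

and <math>\ \varphi^2 \approx 2.6 > 2\ </math> gives the secant method its practical edge per function evaluation.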
Similar to most other [[Root-finding algorithm#Iterative methods|iterative root-finding algorithms]], the crucial weakness in Steffensen's method is choosing an "adequate" starting value <math>\ x_0 ~.</math> If the value of <math>\ x_0\ </math> is not "close enough" to the actual solution <math>\ x_\star\ ,</math> the method may fail, and the sequence of values <math>\ x_0, x_1, x_2, x_3, \ldots\ </math> may either erratically flip-flop between two extremes, or diverge to infinity, or both.
==Derivation using Aitken's delta-squared process==
The version of Steffensen's method implemented in the [[MATLAB]] code shown below can be found using [[Aitken's delta-squared process]] for accelerating the convergence of a sequence. This method assumes starting with a linearly convergent sequence and increases the rate of convergence of that sequence. If the signs of <math>\ p_n,\ p_{n+1},\ p_{n+2}\ </math> agree and <math>\ p_n\ </math> is sufficiently close to the desired limit <math>\ p\ </math> of the sequence, then we can assume the following:
:<math>\frac{\ p_{n+1} - p\ }{\ p_n - p\ } \approx \frac{\ p_{n+2} - p\ }{\ p_{n+1} - p\ }\ ,</math>

so that

:<math>\ \bigl(\, p_{n+1} - p \,\bigr)^2 \approx \bigl(\, p_{n+2} - p \,\bigr)\bigl(\, p_n - p \,\bigr) ~.</math>

Solving for the desired limit of the sequence <math>\ p\ </math> gives:

:<math> p \approx \frac{\, p_{n+2}\ p_n - p_{n+1}^2 \,}{\, p_{n+2} - 2 \, p_{n+1} + p_n \,}</math>

:<math> =~ \frac{\, (\, p_{n}^2 + p_{n} \, p_{n+2} - 2 \, p_{n} \, p_{n+1} \,) - (\, p_{n}^2 - 2 \, p_{n} \, p_{n+1} + p_{n+1}^2 \, ) \, }{\, p_{n+2} - 2 \, p_{n+1} + p_n \,}</math>

:<math> =~ p_n - \frac{\, (\, p_{n+1} - p_n \,)^2 \,}{\, p_{n+2} - 2 \, p_{n+1} + p_n \,}\ ,</math>
which results in the more rapidly convergent sequence:
:<math> p \approx p_{n+3} = p_n - \frac{\, (\, p_{n+1} - p_n \,)^2 \,}{\, p_{n+2} - 2 \, p_{n+1} + p_n \,} ~.</math>
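In code, the accelerated sequence above can be generated as follows. This is a minimal sketch of the delta-squared step for a scalar fixed-point map, not the article's reference implementations; the names and stopping rule are illustrative only:

:<syntaxhighlight lang="python">
import math

def aitken_steffensen(fixed_point_map, p0, tol=1e-12, max_iter=100):
    """Accelerate the iteration p_{n+1} = fixed_point_map(p_n) with Aitken's
    delta-squared formula, restarting from each accelerated estimate
    (which is exactly Steffensen's method)."""
    p = p0
    for _ in range(max_iter):
        p1 = fixed_point_map(p)
        p2 = fixed_point_map(p1)
        denominator = p2 - 2*p1 + p
        if denominator == 0:           # successive iterates are (nearly) equal,
            break                      # so p is already a good estimate.
        p_next = p - (p1 - p)**2 / denominator   # the delta-squared formula
        if abs(p_next - p) < tol:
            return p_next
        p = p_next
    return p

# Example: the fixed point of cos is approximately 0.7390851332151607
print(aitken_steffensen(math.cos, 1.0))
</syntaxhighlight>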
== Code example ==
===In MATLAB===
Here is the source for an implementation of Steffensen's method in [[MATLAB]].
:<syntaxhighlight lang="matlab">
function Steffensen(f, p0, tol)
% This function takes as inputs: a fixed point iteration function, f,
% an initial guess for the fixed point, p0, and a tolerance, tol.
% It prints successive estimates, p, of the fixed point of x = f(x) + x,
% stopping once two successive estimates agree to within tol.
converged = false;
for i = 1:1000            % run the loop a large, but finite, number of times.
                          % This is so that if the method fails to converge, we won't
                          % be stuck in an infinite loop.
    p1 = f(p0) + p0;      % calculate the next two guesses for the fixed point.
    p2 = f(p1) + p1;
    p = p0-(p1-p0)^2/(p2-2*p1+p0) % use Aitken's delta squared method to
                                  % find a better approximation to p0.
    if abs(p - p0) < tol  % test whether we are within tolerance;
        converged = true; % if so, stop the iterations:
        break             % we have our answer.
    end
    p0 = p;               % update p0 for the next iteration.
end
if ~converged             % If we fail to meet the tolerance, we output a
                          % message of failure.
    disp('failed to converge in 1000 iterations.')
end
</syntaxhighlight>
===In Python===
Here is the source for an implementation of Steffensen's method in [[Python (programming language)|Python]].
:<syntaxhighlight lang="python">
from typing import Callable, Iterator

Func = Callable[[float], float]


def g(f: Func, x: float, fx: float) -> Func:
    """First-order divided difference function.

    Arguments:
        f: Function input to g
        x: Point at which to evaluate g
        fx: Function f evaluated at x
    """
    return lambda x: f(x + fx) / fx - 1


def steff(f: Func, x: float, tol: float) -> Iterator[float]:
    """Steffensen algorithm for finding roots.

    This generator yields the x_{n+1} value first; then, when the generator iterates,
    it yields x_{n+2}, x_{n+3}, and so on, from the following steps.

    Arguments:
        f: Function whose root we are searching for
        x: Starting value upon first call; at each step n of the iteration, x is x_n
        tol: Tolerance; iteration stops once |f(x)| < tol
    """
    n = 0
    while True:
        if n >= 1000:  # Give up after 1000 iterations, so that a
                       # non-converging sequence cannot loop forever.
            print("failed to converge in 1000 iterations")
            break
        else:
            n = n + 1
            fx = f(x)
            if abs(fx) < tol:  # The residual is within tolerance,
                break          # so x is close enough to a root.
            else:
                gx = g(f, x, fx)(x)
                x = x - fx / gx  # Update to x_{n+1}
                yield x  # Yield value
</syntaxhighlight>
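As a usage sketch (not part of the article's code; the test function and starting value are arbitrary), the generator can be driven with a bounded loop:

:<syntaxhighlight lang="python">
from itertools import islice

# Find the root of f(x) = x**2 - 2 (that is, sqrt(2)) starting from x = 1.5,
# taking at most 10 Steffensen steps:
for approximation in islice(steff(lambda x: x**2 - 2, 1.5, 1e-12), 10):
    print(approximation)
</syntaxhighlight>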
==Generalization to Banach space==
Steffensen's method can also be used to find an input <math>\ x = x_\star\ </math> for a different kind of function <math>\ F\ </math> that produces output the same as its input: <math>\ x_\star = F(x_\star)\ </math> for the special value <math>\ x_\star ~.</math> Solutions like <math>\ x_\star\ </math> are called ''[[fixed point (mathematics)|fixed point]]s''. Many of these functions can be used to find their own solutions by repeatedly recycling the result back as input, but the rate of convergence can be slow, or the function can fail to converge at all, depending on the individual function. Steffensen's method accelerates this convergence, to make it [[quadratic convergence|quadratic]].
This method for finding fixed points of a real-valued function has been generalised to functions <math>\ F : X \to X\ </math> that map a [[Banach space]] <math>\ X\ </math> onto itself. The generalized method assumes that a [[indexed family|family]] of [[bounded linear operator]]s <math>\ \bigl\{\ G(u, v) : u, v \in X\ \bigr\}\ </math> associated with <math>\ u\ </math> and <math>\ v\ </math> can be devised that (locally) satisfies the condition
{{anchor|⸎}}{{NumBlk|:|<math>\ F(u) - F(v) = G(u, v)\ \bigl[\ u - v\ \bigr] ~.</math>|{{big|({{math|⸎}})}}}}
The operator <math>\ G\ </math> is roughly equivalent to a [[Matrix (mathematics)|matrix]] whose entries are all functions of [[vector (mathematics)|vector]] [[Argument of a function|arguments]] <math>\ u\ </math> and <math>\ v ~.</math> Refer back to the [[Function of a real variable|simple function]] <math>\ f\ ,</math> given in the first section, where the function merely takes in and puts out real numbers: There, the function <math>\ g\ </math> is a ''[[divided difference]]''. In the generalized form here, the operator <math>\ G\ </math> is the analogue of a divided difference for use in the [[Banach space]].
''If division is possible'' in the [[Banach space]], the linear operator <math>\ G\ </math> can be obtained from
:<math>\ G\left( u, v \right) = \bigl[\ F\left( u \right)- F\left( v \right)\ \bigr] \bigl[\ u - v\ \bigr]^{-1}\ ,</math>
which may provide some insight: Expressed in this way, the linear operator <math>\ G\ </math> can be more easily seen to be an elaborate version of the [[divided difference]] <math>\ g\ </math> discussed in the first section, above. The quotient form is shown here for orientation only; it is ''not'' required ''per se''. Note also that division within the Banach space is not necessary for the elaborated Steffensen's method to be viable; the only requirement is that the operator <math>\ G\ </math> satisfy the [[#⸎|equation marked]] with the [[Coronis (textual symbol)|coronis]], {{big|({{math|⸎}})}}.
Steffensen's method is then very similar to Newton's method, except that it uses the divided difference <math>\ G\bigl(\ F(x_n),\ x_n\ \bigr)\ </math> instead of the derivative <math>\ F'(x_n) ~.</math>
In the case that division is possible in the Banach space, the generalized iteration formula is given by
: <math> x_{n+1} = x_n + \Bigl[\ I - G\bigl( F\left( x_n \right), x_n \bigr)\ \Bigr]^{-1}\Bigl[\ F\left( x_n \right) - x_n\ \Bigr]\ ,</math>
for <math>\ n = 1, 2, 3, \ldots\ ,</math> where <math>\ I\ </math> is the identity operator. When no convenient inverse is available, the next iterate can instead be obtained by solving the linear equation

: <math> \Bigl[\ I - G\bigl( F\left( x_n \right), x_n \bigr)\ \Bigr] \bigl[\ x_{n+1} - x_n\ \bigr] = \Bigl[\ F\left( x_n \right) - x_n\ \Bigr]\ </math>

for the unknown difference <math>\ x_{n+1} - x_n ~.</math>
Equivalently, one may seek the solution <math>\ x_{n+1}\ </math> to the somewhat reduced form
: <math> \Bigl[\ I - G\bigl( F\left( x_n \right), x_n \bigr)\ \Bigr]\ x_{n+1} = \Bigl[\ F\left( x_n \right) - G\bigl( F\left( x_n \right), x_n \bigr) \ x_n\ \Bigr]\ ,</math>
with all the values inside square brackets being independent of <math>\ x_{n+1}\ :</math> The bracketed terms all only depend on <math>\ x_n ~.</math> However, the second form may be less numerically stable than the first, which finds <math>\ x_{n+1}\ </math> by adding a (hopefully small) correction to <math>\ x_n ~.</math>
If the linear operator <math>\ G\ </math> satisfies

: <math> \Bigl\|\ G \left( u, v \right) - G \left( x, y \right)\ \Bigr\| \le k \biggl(\ \Bigl\| u - x \Bigr\| + \Bigl\| v - y \Bigr\|\ \biggr) </math>

for some positive real constant <math>\ k\ ,</math> then the method converges quadratically to a fixed point of <math>\ F\ </math> whenever the initial approximation <math>\ x_0\ </math> is sufficiently close to the desired solution <math>\ x_\star\ </math> that satisfies <math>\ x_\star = F( x_\star ) ~.</math>
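To make the generalized iteration concrete, the following sketch applies it in <math>\ \R^n\ </math> with NumPy, constructing one possible divided-difference approximation of <math>\ G\ </math> column by column and solving the bracketed linear system at each step. This is an illustrative construction under the stated assumptions, not the article's algorithm in full generality; the function name and step guard are arbitrary:

:<syntaxhighlight lang="python">
import numpy as np

def steffensen_fixed_point(F, x, tol=1e-10, max_iter=100):
    """Find x with x = F(x) for F mapping R^n to R^n, via the iteration
    x_{n+1} = x_n + [I - G]^{-1} (F(x_n) - x_n), where G approximates a
    divided-difference operator for F at x_n."""
    n = x.size
    identity = np.eye(n)
    for _ in range(max_iter):
        Fx = F(x)
        residual = Fx - x
        if np.linalg.norm(residual) < tol:
            break
        # One way to build a divided-difference operator: difference F along
        # each coordinate, with the step taken from the current residual.
        G = np.empty((n, n))
        for j in range(n):
            step = np.zeros(n)
            step[j] = residual[j] if residual[j] != 0 else tol  # avoid a zero step
            G[:, j] = (F(x + step) - Fx) / step[j]
        # Solve [I - G] (x_{n+1} - x_n) = F(x_n) - x_n for the update:
        x = x + np.linalg.solve(identity - G, residual)
    return x

# Example: componentwise cosine has fixed point (0.739085..., 0.739085...):
print(steffensen_fixed_point(np.cos, np.array([1.0, 1.0])))
</syntaxhighlight>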
==Notes==
{{notelist}}
{{Root-finding algorithms}}
[[Category:Articles with example MATLAB/Octave code]]
[[Category:Articles with example Python (programming language) code]]
[[Category:Quasi-Newton methods]]