Content deleted Content added
spelling |
m convert special characters found by Wikipedia:Typo Team/moss (via WP:JWB) |
||
(23 intermediate revisions by 4 users not shown) | |||
Line 1:
{{short description|Newton-like root-finding algorithm that does not use derivatives}}
In [[numerical analysis]], '''Steffensen's method''' is an [[iterative method]] named after [[Johan Frederik Steffensen]] for numerical [[root-finding method|root-finding
Steffensen's method has the drawback that it requires two function evaluations per step, whereas the secant method requires only one evaluation per step, so it is not necessarily most efficient in terms of [[computational cost]], depending on the number of iterations each requires. Newton's method also requires evaluating two functions per step – for the function and for its derivative – and its computational cost varies between being at best the same as the secant method, and at worst the same as Steffensen's method.
For rare special case functions the derivative for Newton's method can be calculated at negligible cost, by using saved parts from evaluation of the main function. If optimized in this way, Newton's method becomes only slightly more costly per step than the secant method,
}}
Line 9:
==Simple description==
The simplest form of the formula for Steffensen's method occurs when it is used to find a [[zero of a function|zero]] of a [[real function]] <math>f</math>; that is, to find the real value <math>\ x_\star\ </math> that satisfies <math>\ f(x_\star) = 0 ~.</math> Near the solution <math>\ x_\star\ ,</math> the derivative of the function, <math>\ f'\ ,</math> needs to either exactly or very nearly satisfy <math> -1 < f'(x_\star) < 0
name=deriv_cdx_note| The condition <math> -1 < f'(x_\star) < 0\ </math> ensures that if <math>\ f\ </math> ''was'' used as a correction-function for <math>\ x\ ,</math> for finding its ''own'' solution, it would step in the direction of the solution (<math>\ f' < 0\ </math>), and that the new value would tend to lie in between the solution and the prior value (<math> -1\ < f'\ </math>). }} For some functions, Steffensen's method can work even if this condition is not met, but in such a case, the starting value <math>\ x_0\ </math> must be ''very'' close to the actual solution <math>\ x_\star\ ,</math> and convergence to the solution may be slow. Adjustment of the size of the method's intermediate step, mentioned later, can improve convergence in some of these cases. Given an adequate starting value <math>\ x_0\ ,</math> a sequence of values <math>\ x_0,\ \ x_1,\ x_2,\ \dots,\ x_n,\ \dots\ </math> can be generated using the formula below. When it works, each value in the sequence is much closer to the solution <math>\ x_\star\ </math> than the prior value. The value <math>\ x_n\ </math> from the current step generates the value <math>\ x_{n+1}\ </math> for the next step, via the formula<ref name=Dahlquist-Björck-1974/>
Line 29 ⟶ 33:
Because the value of <math>\ g\ </math> is an approximation for <math>\ f'\ ,</math> its value can optionally be checked to see if it meets the condition <math>\ -1 < g < 0\ ,</math> which is required to guarantee convergence of Steffensen's algorithm. Although slight non-conformance may not necessarily be dire, any large departure from the condition warns that Steffensen's method is liable to fail, and temporary use of some fallback algorithm is warranted (e.g. the more robust [[Illinois algorithm]], or plain [[regula falsi]]).
It is only for the purpose of finding <math>\ h\ </math> for this auxiliary point that the value of the function <math>\ f\ </math> must
==Advantages and drawbacks==
The main advantage of Steffensen's method is that it has [[quadratic convergence]]<ref name=Dahlquist-Björck-1974/> like [[Newton's method]] – that is, both methods find roots to an equation <math>\ f\ </math> just as "quickly". In this case, ''quickly'' means that for both methods, the number of correct digits in the answer doubles with each step. But the formula for Newton's method requires evaluation of the function's derivative <math>\ f'\ </math> as well as the function <math>\ f\ ,</math> while Steffensen's method only requires <math>\ f\ </math> itself. This is important when the derivative is not easily or efficiently available.
The price for the quick convergence is the double function evaluation: Both <math>\ f(x_n)\ </math> and <math>\ f(x_n + h)\ </math> must be calculated, which might be time-consuming if <math>\ f\ </math> is complicated. For comparison, both [[regula falsi]] and the [[secant method]] only need one function evaluation per step. The secant method increases the number of correct digits by "only" a factor of roughly
Because <math>\ f( x_n + h )\ </math> requires the prior calculation of <math>\ h \equiv f(x_n)\ ,</math> the two evaluations must be done sequentially – the algorithm ''per se'' cannot be made faster by running the function evaluations in parallel. This is yet another disadvantage of Steffensen's method.
}} in practical use the secant method actually converges faster than Steffensen's method, when both algorithms succeed:
Similar to most other [[Root-finding algorithm#Iterative methods|iterative root-finding algorithms]], the crucial weakness in Steffensen's method is choosing a "sufficiently close" starting value <math>\ x_0 ~.</math> If the value of <math>\ x_0\ </math> is not "close enough" to the actual solution <math>\ x_\star\ ,</math> the method may fail, and the sequence of values <math>\ x_0, \, x_1, \, x_2, \, x_3, \, \dots\ </math> may either erratically flip-flop between two (or more) extremes, or diverge to infinity, or both.
==Derivation using Aitken's delta-squared process==
Line 89 ⟶ 93:
p0 = p; % update p0 for the next iteration.
end
if abs(p - p0) > tol % If we fail to meet the tolerance, we output a
% message of failure.
Line 114 ⟶ 119:
def steff(f: Func, x: float, tol: float) -> Iterator[float]:
"""
This recursive generator yields the x_{n+1} value first then, when the generator iterates,
Line 123 ⟶ 128:
x: Starting value upon first call, each level n that the function recurses x is x_n
"""
while True:
print("failed to converge in 1000 iterations")
break
else:
n = n + 1
fx = f(x)
▲ if abs(fx) <= tol:
if abs(fx) < tol:
break
else:
Line 134 ⟶ 148:
==Generalization to Banach space==
Steffensen's method can also be used to find an input <math>\ x = x_\star\ </math> for a different kind of function <math>\ F\ </math> that produces output the same as its input: <math>\ x_\star = F(x_\star)\ </math> for the special value <math>\ x_\star ~.</math>
This method for finding fixed points of a real-valued function has been generalized for functions <math>\ F : X \to X\ </math> that map a [[Banach space]] <math>\ X\ </math> onto itself or even more generally <math>\ F : X \to Y\ </math> that map from one [[Banach space]] <math>X </math> into another [[Banach space]] <math>\ Y ~.</math>
{{NumBlk|:|<math> F\left( u \right) - F\left( v \right) = G\left( u, v \right)\ \bigl
If division is possible in the [[Banach space]], then the linear operator <math>\ G\ </math> can be obtained from
:<math> G\left( u, v \right) = \bigl[\ F\left( u \right)- F\left( v \right)\ \bigr]\ \bigl
which may provide some insight: Expressed in this way, the linear operator <math>\ G\ </math> can be more easily seen to be an elaborate version of the [[divided difference]] <math>\ g\ </math> discussed in the first section, above. The quotient form is shown here for orientation only; it is ''not'' required ''per se''. Note also that division within the Banach space is not necessary for the elaborated Steffensen's method to be viable; the only requirement is that the operator <math>\ G\ </math> satisfy ({{EquationNote|1}}).
Steffensen's method is then very similar to the Newton's method, except that it uses the divided difference <math>\ G \bigl( F\left( x \right), x \bigr)\ </math> instead of the derivative <math>\ F'(x) ~.</math>
In the case that division is possible in the Banach space, the generalized iteration formula is given by
: <math> x_{n+1} = x_n + \Bigl[\ I - G\bigl( F\left( x_n \right), x_n \bigr)\ \Bigr]^{-1}\Bigl[\ F\left( x_n \right) - x_n\ \Bigr]\ ,</math>
for <math>\ n = 1,\
: <math> \Bigl[\ I - G\bigl( F\left( x_n \right), x_n \bigr)\ \Bigr] \
Equivalently, one may seek the solution <math>\ x_{n+1}\ </math> to the somewhat reduced form
: <math> \Bigl[\ I - G\bigl( F\left( x_n \right), x_n \bigr)\ \Bigr]\ x_{n+1} = \Bigl[\ F\left( x_n \right) - G\bigl( F\left( x_n \right), x_n \bigr) \ x_n\ \Bigr]\ ,</math>
with all the values inside square brackets being independent of <math>\ x_{n+1}\ :</math>
If the linear operator <math>\ G\ </math> satisfies
: <math> \Bigl\| G \left( u, v \right) - G \left( x, y \right) \Bigr\| \le k \biggl( \Bigl\|u - x \Bigr\| + \Bigr\| v - y \Bigr\| \biggr) </math>
for some positive real constant <math>\ k\ ,</math>
==Notes==
|