Steffensen's method: Difference between revisions

Browse history interactively

← Previous edit

Content deleted Content added

VisualWikitext

Revision as of 02:23, 18 June 2025 edit 12.75.128.3 (talk) minor text edit ← Previous edit		Latest revision as of 18:14, 24 July 2025 edit undo Beland (talk \| contribs) Autopatrolled, Administrators 259,161 edits m convert special characters found by Wikipedia:Typo Team/moss (via WP:JWB) Tag: JWB
(11 intermediate revisions by 4 users not shown)
Line 1: {{short description\|Newton-like root-finding algorithm that does not use derivatives}} In [[numerical analysis]], '''Steffensen's method''' is an [[iterative method]] named after [[Johan Frederik Steffensen]] for numerical [[root-finding method\|root-finding~~]] named after [[Johan Frederik Steffensen~~]] that is similar to the [[secant method]] and to [[Newton's method]]. '''Steffensen's method''' achieves a quadratic [[order of convergence]] without using [[derivative]]s, whereas the more familiar Newton's method also converges quadratically, but requires derivatives and the secant method does not require derivatives but also converges less quickly than quadratically. Steffensen's method has the drawback that it requires two function evaluations per step, whereas the secant method requires only one evaluation per step, so it is not necessarily most efficient in terms of [[computational cost]], depending on the number of iterations each requires. Newton's method also requires evaluating two functions per step – for the function and for its derivative – and its computational cost varies between being at best the same as the secant method, and at worst the same as Steffensen's method. ~~(for~~For most functions~~, where~~ calculation of the derivative is just as computationally costly as calculating the original function), and so the normal case is that Newton's method is equally costly as Steffensen's.{{efn\| For rare special case functions the derivative for Newton's method can be calculated at negligible cost, by using saved parts from evaluation of the main function. If optimized in this way, Newton's method becomes only slightly more costly per step than the secant method, ~~with~~and benefits from slightly faster convergence. }} Line 38: The main advantage of Steffensen's method is that it has [[quadratic convergence]]<ref name=Dahlquist-Björck-1974/> like [[Newton's method]] – that is, both methods find roots to an equation <math>\ f\ </math> just as "quickly". In this case, ''quickly'' means that for both methods, the number of correct digits in the answer doubles with each step. But the formula for Newton's method requires evaluation of the function's derivative <math>\ f'\ </math> as well as the function <math>\ f\ ,</math> while Steffensen's method only requires <math>\ f\ </math> itself. This is important when the derivative is not easily or efficiently available. The price for the quick convergence is the double function evaluation: Both <math>\ f(x_n)\ </math> and <math>\ f(x_n + h)\ </math> must be calculated, which might be time-consuming if <math>\ f\ </math> is complicated. For comparison, both [[regula falsi]] and the [[secant method]] only need one function evaluation per step. The secant method increases the number of correct digits by "only" a factor of roughly ~~ ~~ 1.6 per step,~~ ~~ but one can do twice as many steps of the secant method within a given time. Since the secant method can carry out twice as many steps in the same time as Steffensen's method,{{efn\| Because <math>\ f( x_n + h )\ </math> requires the prior calculation of <math>\ h \equiv f(x_n)\ ,</math> the two evaluations must be done sequentially – the algorithm ''per se'' cannot be made faster by running the function evaluations in parallel. This is yet another disadvantage of Steffensen's method. }} in practical use the secant method actually converges faster than Steffensen's method, when both algorithms succeed: The secant method achieves a factor of about ~~ ~~ {{nobr\|{{math\|(1.6){{sup\|2}} ~~&hairsp;~~ ≈ ~~&hairsp;~~ 2.6}} times}} ~~ ~~ as many digits for every two steps (two function evaluations), compared to Steffensen's factor of ~~ ~~ {{math\|2}} ~~ ~~ for every one step (two function evaluations). Similar to most other [[Root-finding algorithm#Iterative methods\|iterative root-finding algorithms]], the crucial weakness in Steffensen's method is choosing a "sufficiently close" starting value <math>\ x_0 ~.</math> If the value of <math>\ x_0\ </math> is not "close enough" to the actual solution <math>\ x_\star\ ,</math> the method may fail, and the sequence of values <math>\ x_0, \, x_1, \, x_2, \, x_3, \, \dots\ </math> may either erratically flip-flop between two (or more) extremes, or diverge to infinity, or both. ==Derivation using Aitken's delta-squared process== Line 93: p0 = p; % update p0 for the next iteration. end if abs(p - p0) > tol % If we fail to meet the tolerance, we output a % message of failure. Line 118 ⟶ 119: def steff(f: Func, x: float, tol: float) -> Iterator[float]: """~~Steffenson~~Steffensen algorithm for finding roots. This recursive generator yields the x_{n+1} value first then, when the generator iterates, Line 127 ⟶ 128: x: Starting value upon first call, each level n that the function recurses x is x_n """ ~~while True:~~ n = 0 while True: if ~~abs(fx)~~n <=> ~~tol~~1000:▼ print("failed to converge in 1000 iterations") break else: n = n + 1 fx = f(x) ▲ if abs(fx) <= tol: if abs(fx) < tol: break else: Line 140 ⟶ 150: Steffensen's method can also be used to find an input <math>\ x = x_\star\ </math> for a different kind of function <math>\ F\ </math> that produces output the same as its input: <math>\ x_\star = F(x_\star)\ </math> for the special value <math>\ x_\star ~.</math> Solutions like <math>\ x_\star\ </math> are called ''[[fixed point (mathematics)\|fixed point]]s''. Many of these functions can be used to find their own solutions by repeatedly recycling the result back as input, but the rate of convergence can be slow, or the function can fail to converge at all, depending on the individual function. Steffensen's method accelerates this convergence, to make it [[quadratic convergence\|quadratic]]. AsMomentarily ignoring the issues of a more general [[Banach space]] vs. basic [[real numbers]] for the sake of an example: toTo re-orient the reader to the earlier section, a ~~pseudo-version~~simple of[[toy ~~the~~model]] fixed-point function, <math>\ \tilde F\ ,</math> ~~can~~using ~~be concocted from a~~any root function <math>\ f\ ,</math> ~~simply~~can bybe made with <math>\ \tilde F(x) = x + \varepsilon\ f(x)\ ,~.</math> ~~where~~Here <math>\ \varepsilon\ </math> is ~~some scalar~~a constant with the appropriate sign that is small enough in magnitude to make <math>\ \tilde F\ </math> stable under iteration, but large enough for the [[non-linearity]] of the function <math>\ f\ </math> to be appreciable~~. All issues of a more general [[Banach space]] vs. basic [[real numbers]] being momentarily ignored for the sake of the comparison~~. This method for finding fixed points of a real-valued function has been generalized for functions <math>\ F : X \to X\ </math> that map a [[Banach space]] <math>\ X\ </math> onto itself or even more generally <math>\ F : X \to Y\ </math> that map from one [[Banach space]] <math>X </math> into another [[Banach space]] <math>\ Y ~.</math> The generalized method assumes that a [[Indexed family\|family]] of [[Bounded set\|bounded]] [[linear operators]] <math>\ \bigl\{\ G(u,v): u, v \in X\ \bigr\}\ </math> associated with <math>\ u\ </math> and <math>\ v\ </math> can be devised that (locally) satisfies the condition<ref name=Johnson-Scholz-1968/> {{NumBlk\|:\|<math> F\left( u \right) - F\left( v \right) = G\left( u, v \right)\ \bigl(\ u - v\ \bigr) \quad</math>\|{{EquationRef\|1}}}}