{{Short description|Algorithms for zeros of functions}}
In [[numerical analysis]], a '''root-finding algorithm''' is an [[algorithm]] for finding [[Zero of a function|zeros]], also called "roots", of [[continuous function]]s. A [[zero of a function]] {{math|''f''}}, from the [[real number]]s to real numbers or from the [[complex number]]s to the complex numbers, is a number {{math|''x''}} such that {{math|1=''f''(''x'') = 0}}. As, generally, the zeros of a function cannot be computed exactly nor expressed in [[closed-form expression|closed form]], root-finding algorithms provide approximations to zeros. These are expressed either as [[floating-point arithmetic|floating-point]] numbers without error bounds, or as floating-point values together with error bounds. The latter, approximations with error bounds, are equivalent to small isolating [[interval (mathematics)|intervals]] for real roots, or [[disk (mathematics)|disks]] for complex roots (an interval or disk output being equivalent to an approximate output together with an error bound).<ref>{{Cite book |last1=Press |first1=W. H. |title=Numerical Recipes: The Art of Scientific Computing |last2=Teukolsky |first2=S. A. |last3=Vetterling |first3=W. T. |last4=Flannery |first4=B. P. |publisher=Cambridge University Press |year=2007 |isbn=978-0-521-88068-8 |edition=3rd |publication-place=New York |chapter=Chapter 9. Root Finding and Nonlinear Sets of Equations |chapter-url=http://apps.nrbook.com/empanel/index.html#pg=442}}</ref>
 
[[Equation solving|Solving an equation]] {{math|1=''f''(''x'') = ''g''(''x'')}} is the same as finding the roots of the function {{math|1=''h''(''x'') = ''f''(''x'') − ''g''(''x'')}}. Thus root-finding algorithms can be used to solve any [[equation (mathematics)|equation]] of continuous functions. However, most root-finding algorithms do not guarantee that they will find all the roots of a function, and if such an algorithm does not find any root, that does not necessarily mean that no root exists.
 
Most numerical root-finding methods are [[iterative method|iterative methods]], producing a [[sequence]] of numbers that ideally converges towards a root as a [[Limit of a sequence|limit]]. They require one or more ''initial guesses'' of the root as starting values; then, each iteration of the algorithm produces a successively more accurate approximation to the root. Since the iteration must be stopped at some point, these methods produce an approximation to the root, not an exact solution. Many methods compute subsequent values by evaluating an auxiliary function on the preceding values. The limit is thus a [[Fixed point (mathematics)|fixed point]] of the auxiliary function, which is chosen for having the roots of the original equation as fixed points and for converging rapidly to these fixed points.
 
The behavior of general root-finding algorithms is studied in [[numerical analysis]]. However, for polynomials specifically, the study of root-finding algorithms belongs generally to [[computer algebra]], since algebraic properties of polynomials are fundamental for the most efficient algorithms. The efficiency and applicability of an algorithm may depend sensitively on the characteristics of the given functions. For example, many algorithms use the [[derivative]] of the input function, while others work on every [[continuous function]]. In general, numerical algorithms are not guaranteed to find all the roots of a function, so failing to find a root does not prove that there is no root. However, for [[polynomial]]s, there are specific algorithms that use algebraic properties for certifying that no root is missed and for locating the roots in separate intervals (or [[disk (mathematics)|disks]] for complex roots) that are small enough to ensure the convergence of numerical methods (typically [[Newton's method]]) to the unique root within each interval (or disk).
 
== Bracketing methods ==
Bracketing methods determine successively smaller intervals (brackets) that contain a root. When the interval is small enough, then a root is considered found. These generally use the [[intermediate value theorem]], which asserts that if a continuous function has values of opposite signs at the end points of an interval, then the function has at least one root in the interval. Therefore, they require starting with an interval such that the function takes opposite signs at the end points of the interval. However, in the case of [[polynomial]]s there are other methods (such as [[Descartes' rule of signs]], [[Budan's theorem]] and [[Sturm's theorem]]) for bounding or determining the number of roots in an interval. They lead to efficient algorithms for [[real-root isolation]] of polynomials, which find all real roots with a guaranteed accuracy.
 
=== Bisection method ===
The simplest root-finding algorithm is the [[bisection method]]. Let {{math|''f''}} be a [[continuous function]], for which one knows an interval {{math|[''a'', ''b'']}} such that {{math|''f''(''a'')}} and {{math|''f''(''b'')}} have opposite signs (a bracket). Let {{math|1=''c'' = (''a'' + ''b'')/2}} be the middle of the interval (the midpoint, or the point that bisects the interval). Then either {{math|''f''(''a'')}} and {{math|''f''(''c'')}}, or {{math|''f''(''c'')}} and {{math|''f''(''b'')}} have opposite signs, and the size of the interval containing a root has been halved. Although the bisection method is robust, it gains one and only one [[bit]] of accuracy with each iteration. Therefore, the number of function evaluations required for finding an ''ε''-approximate root is <math>\log_2\frac{b-a}{\varepsilon}</math>. Other methods, under appropriate conditions, can gain accuracy faster.
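
The method can be illustrated by the following Python sketch (illustrative only; the function name and the tolerance parameter are not part of any standard library):

<syntaxhighlight lang="python">
def bisect(f, a, b, tol=1e-12):
    """Find a root of f in [a, b], assuming f(a) and f(b) have opposite signs."""
    if f(a) * f(b) > 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    while (b - a) / 2 > tol:
        c = (a + b) / 2          # midpoint of the current bracket
        if f(c) == 0:
            return c             # exact root found
        if f(a) * f(c) < 0:      # root lies in [a, c]
            b = c
        else:                    # root lies in [c, b]
            a = c
    return (a + b) / 2
</syntaxhighlight>

For example, {{code|bisect(lambda x: x*x - 2, 1, 2)}} approximates the root {{math|{{sqrt|2}}}} of {{math|1=''x''<sup>2</sup> − 2 = 0}}.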
 
=== False position (''regula falsi'') ===
The [[false position method]], also called the ''regula falsi'' method, is similar to the bisection method, but instead of using the midpoint of the interval it uses the {{mvar|x}}-intercept of the line that connects the function values at the end points of the interval, that is
:<math>c= \frac{af(b)-bf(a)}{f(b)-f(a)}.</math>
 
False position is similar to the [[secant method]], except that, instead of retaining the last two points, it makes sure to keep one point on either side of the root. The false position method can be faster than the bisection method and will never diverge like the secant method. However, it may fail to converge in some naive implementations due to roundoff errors that may lead to a wrong sign for {{math|''f''(''c'')}}. Typically, this may occur if the [[derivative]] of {{mvar|f}} is large in the neighborhood of the root.
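
A deliberately naive Python sketch of the iteration, using the formula above for {{math|''c''}}, is shown below (illustrative only; a robust implementation would also guard against the stagnation and roundoff issues mentioned above):

<syntaxhighlight lang="python">
def false_position(f, a, b, tol=1e-12, max_iter=100):
    """Find a root of f in [a, b], assuming f(a) and f(b) have opposite signs."""
    fa, fb = f(a), f(b)
    if fa * fb > 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    for _ in range(max_iter):
        # x-intercept of the line through (a, f(a)) and (b, f(b))
        c = (a * fb - b * fa) / (fb - fa)
        fc = f(c)
        if abs(fc) < tol:
            return c
        if fa * fc < 0:          # keep the sub-interval [a, c] that brackets the root
            b, fb = c, fc
        else:                    # keep [c, b]
            a, fa = c, fc
    return c
</syntaxhighlight>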
 
=== ITP method ===
The [[ITP Method|ITP method]] is the only known method to bracket the root with the same worst-case guarantees as the bisection method while guaranteeing superlinear convergence to the root of smooth functions, as the secant method does. It is also the only known method guaranteed to outperform the bisection method on average for any continuous distribution on the ___location of the root (see [[ITP Method#Analysis]]). It does so by keeping track of both the bracketing interval and the minmax interval, in which any point therein converges as fast as the bisection method. The construction of the queried point {{mvar|c}} follows three steps: interpolation (similar to the regula falsi), truncation (adjusting the regula falsi similar to [[Regula falsi#Improvements in regula falsi|Regula falsi § Improvements in ''regula falsi'']]) and then projection onto the minmax interval. The combination of these steps produces a simultaneously minmax-optimal method with guarantees similar to interpolation-based methods for smooth functions and, in practice, it will outperform both the bisection method and interpolation-based methods for both smooth and non-smooth functions.
 
== Interpolation ==
Many root-finding processes work by [[interpolation]]. This consists of using the last computed approximate values of the root to approximate the function by a [[polynomial]] of low degree, which takes the same values at these approximate roots. Then the root of the polynomial is computed and used as a new approximate value of the root of the function, and the process is iterated.
 
Interpolating two values yields a line: a polynomial of degree one. This is the basis of the [[secant method]]. ''Regula falsi'' is also an interpolation method that interpolates two points at a time, but it differs from the secant method by using two points that are not necessarily the last two computed points. Three values define a parabolic curve: a [[quadratic function]]. This is the basis of [[Muller's method]].
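
A single step of Muller's method may be sketched in Python as follows (illustrative only; {{code|cmath.sqrt}} is used so that the root of the fitted parabola may be complex):

<syntaxhighlight lang="python">
import cmath

def muller_step(f, x0, x1, x2):
    """One step of Muller's method: fit a parabola through the three
    latest points and return its root closest to x2."""
    f0, f1, f2 = f(x0), f(x1), f(x2)
    h1, h2 = x1 - x0, x2 - x1
    d1, d2 = (f1 - f0) / h1, (f2 - f1) / h2
    a = (d2 - d1) / (h2 + h1)            # quadratic coefficient
    b = a * h2 + d2                      # slope of the parabola at x2
    disc = cmath.sqrt(b * b - 4 * a * f2)
    # choose the denominator of larger magnitude for numerical stability
    denom = b + disc if abs(b + disc) > abs(b - disc) else b - disc
    return x2 - 2 * f2 / denom
</syntaxhighlight>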
 
== Iterative methods ==
Although all root-finding algorithms proceed by [[iteration]], an [[iterative method|iterative]] root-finding method generally uses a specific type of iteration, consisting of defining an auxiliary function, which is applied to the last computed approximations of a root to obtain a new approximation. The iteration stops when a [[fixed-point iteration|fixed point]] of the auxiliary function is reached to the desired precision, i.e., when a new computed value is sufficiently close to the preceding ones.
 
=== Newton's method (and similar derivative-based methods) ===
[[Newton's method]] assumes the function ''f'' to have a continuous [[derivative]]. Newton's method may not converge if started too far away from a root. However, when it does converge, it is faster than the bisection method; its [[order of convergence]] is usually quadratic whereas the bisection method's is linear. Newton's method is also important because it readily generalizes to higher-dimensional problems. [[Householder's method]]s are a class of Newton-like methods with higher orders of convergence. The first one after Newton's method is [[Halley's method]], with cubic order of convergence.
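
A minimal Python sketch of the iteration is given below (illustrative only; the caller must supply the derivative {{code|df}}, and the parameter names are not standardized):

<syntaxhighlight lang="python">
def newton(f, df, x0, tol=1e-12, max_iter=50):
    """Newton's method: x_{n+1} = x_n - f(x_n) / f'(x_n)."""
    x = x0
    for _ in range(max_iter):
        fx, dfx = f(x), df(x)
        if dfx == 0:
            raise ZeroDivisionError("zero derivative; Newton step undefined")
        x_new = x - fx / dfx
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    return x
</syntaxhighlight>

For example, {{code|newton(lambda x: x*x - 2, lambda x: 2*x, 1.0)}} converges to {{math|{{sqrt|2}}}} in a few iterations.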
 
=== Secant method ===
Replacing the derivative in Newton's method with a [[finite difference]], we get the [[secant method]]. This method does not require the computation (nor the existence) of a derivative, but the price is slower convergence (the order of convergence is the [[golden ratio]], approximately 1.62<ref>{{Cite web |last=Chanson |first=Jeffrey R. |date=October 3, 2024 |title=Order of Convergence |url=https://math.libretexts.org/Bookshelves/Applied_Mathematics/Numerical_Methods_(Chasnov)/02%3A_Root_Finding/2.04%3A_Order_of_Convergence |access-date=October 3, 2024 |website=LibreTexts Mathematics}}</ref>). A generalization of the secant method in higher dimensions is [[Broyden's method]].
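
A minimal Python sketch (illustrative only; it starts from two initial guesses rather than a bracket):

<syntaxhighlight lang="python">
def secant(f, x0, x1, tol=1e-12, max_iter=100):
    """Secant method: replace f'(x_n) in Newton's method by the finite
    difference (f(x_n) - f(x_{n-1})) / (x_n - x_{n-1})."""
    f0, f1 = f(x0), f(x1)
    for _ in range(max_iter):
        denom = f1 - f0
        if denom == 0:
            break                        # flat secant line; cannot continue
        x2 = x1 - f1 * (x1 - x0) / denom
        if abs(x2 - x1) < tol:
            return x2
        x0, f0, x1, f1 = x1, f1, x2, f(x2)
    return x1
</syntaxhighlight>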
 
=== Steffensen's method ===
If we use a polynomial fit to remove the quadratic part of the finite difference used in the secant method, so that it better approximates the derivative, we obtain [[Steffensen's method]], which has quadratic convergence, and whose behavior (both good and bad) is essentially the same as Newton's method but does not require a derivative.
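
One common form of the iteration, which estimates the derivative by the divided difference {{math|1=''g''(''x'') = (''f''(''x'' + ''f''(''x'')) − ''f''(''x''))/''f''(''x'')}}, may be sketched in Python as follows (illustrative only):

<syntaxhighlight lang="python">
def steffensen(f, x0, tol=1e-12, max_iter=50):
    """Steffensen's method: a Newton-like iteration with the derivative
    estimated as g(x) = (f(x + f(x)) - f(x)) / f(x), so no derivative
    of f is required."""
    x = x0
    for _ in range(max_iter):
        fx = f(x)
        if fx == 0:
            return x
        g = (f(x + fx) - fx) / fx        # derivative estimate
        if g == 0:
            break                        # cannot take a step
        x_new = x - fx / g
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    return x
</syntaxhighlight>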
 
=== Fixed point iteration method ===
In [[fixed-point iteration]], the equation {{math|1=''f''(''x'') = 0}} is rewritten in the form {{math|1=''x'' = ''g''(''x'')}}, and the iteration {{math|1=''x''<sub>''n''+1</sub> = ''g''(''x''<sub>''n''</sub>)}} is applied. If the iteration converges, its limit is a fixed point of {{mvar|g}} and therefore a root of {{mvar|f}}.
== Finding roots in higher dimensions ==
 
The [[bisection method]] has been generalized to higher dimensions; these methods are called '''generalized bisection methods'''.<ref name=":0">{{Cite journal |last1=Mourrain |first1=B. |last2=Vrahatis |first2=M. N. |last3=Yakoubsohn |first3=J. C. |date=2002-06-01 |title=On the Complexity of Isolating Real Roots and Computing with Certainty the Topological Degree |url=https://www.sciencedirect.com/science/article/pii/S0885064X01906363 |journal=Journal of Complexity |language=en |volume=18 |issue=2 |pages=612–640 |doi=10.1006/jcom.2001.0636 |issn=0885-064X |doi-access=free}}</ref><ref>{{Cite book |last=Vrahatis |first=Michael N. |date=2020 |editor-last=Sergeyev |editor-first=Yaroslav D. |editor2-last=Kvasov |editor2-first=Dmitri E. |chapter=Generalizations of the Intermediate Value Theorem for Approximating Fixed Points and Zeros of Continuous Functions |chapter-url=https://link.springer.com/chapter/10.1007/978-3-030-40616-5_17 |title=Numerical Computations: Theory and Algorithms |series=Lecture Notes in Computer Science |language=en |___location=Cham |publisher=Springer International Publishing |volume=11974 |pages=223–238 |doi=10.1007/978-3-030-40616-5_17 |isbn=978-3-030-40616-5 |s2cid=211160947}}</ref> At each iteration, the ___domain is partitioned into two parts, and the algorithm decides, based on a small number of function evaluations, which of these two parts must contain a root. In one dimension, the criterion for decision is that the function has opposite signs. The main challenge in extending the method to multiple dimensions is to find a criterion that can be computed easily and that guarantees the existence of a root.
 
The [[Poincaré–Miranda theorem]] gives a criterion for the existence of a root in a rectangle, but it is hard to verify because it requires evaluating the function on the entire boundary of the rectangle.
 
Another criterion is given by a theorem of [[Leopold Kronecker|Kronecker]].<ref>{{Cite book |last1=Ortega |first1=James M. |last2=Rheinboldt |first2=Werner C. |title=Iterative solution of nonlinear equations in several variables |date=2000 |publisher=Society for Industrial and Applied Mathematics |isbn=978-0-89871-461-6}}</ref>{{Page needed|date=December 2024}} It says that, if the [[Degree of a continuous mapping|topological degree]] of a function ''f'' on a rectangle is non-zero, then the rectangle must contain at least one root of ''f''. This criterion is the basis for several root-finding methods, such as those of Stenger<ref>{{Cite journal |last=Stenger |first=Frank |date=1975-03-01 |title=Computing the topological degree of a mapping in ''R''<sup>''n''</sup> |url=https://doi.org/10.1007/BF01419526 |journal=Numerische Mathematik |language=en |volume=25 |issue=1 |pages=23–38 |doi=10.1007/BF01419526 |s2cid=122196773 |issn=0945-3245 |url-access=subscription}}</ref> and Kearfott.<ref>{{Cite journal |last=Kearfott |first=Baker |date=1979-06-01 |title=An efficient degree-computation method for a generalized method of bisection |url=https://doi.org/10.1007/BF01404868 |journal=Numerische Mathematik |volume=32 |issue=2 |pages=109–127 |doi=10.1007/BF01404868 |s2cid=122058552 |issn=0029-599X |url-access=subscription}}</ref> However, computing the topological degree can be time-consuming.
 
A third criterion is based on a ''characteristic polyhedron''. This criterion is used by a method called Characteristic Bisection.<ref name=":0" />{{Rp|page=19--}} It does not require computing the topological degree; it only requires computing the signs of function values. The number of required evaluations is at least <math>\log_2(D/\epsilon)</math>, where ''D'' is the length of the longest edge of the characteristic polyhedron.<ref name=":2">{{Cite journal |last1=Vrahatis |first1=M. N. |last2=Iordanidis |first2=K. I. |date=1986-03-01 |title=A Rapid Generalized Method of Bisection for Solving Systems of Non-linear Equations |url=https://doi.org/10.1007/BF01389620 |journal=Numerische Mathematik |language=en |volume=49 |issue=2 |pages=123–138 |doi=10.1007/BF01389620 |issn=0945-3245 |s2cid=121771945 |url-access=subscription}}</ref>{{Rp|page=11|___location=Lemma 4.7}} Note that Vrahatis and Iordanidis<ref name=":2" /> prove a lower bound on the number of evaluations, not an upper bound.
 
A fourth method uses an [[intermediate value theorem]] on simplices.<ref>{{Cite journal |last=Vrahatis |first=Michael N. |date=2020-04-15 |title=Intermediate value theorem for simplices for simplicial approximation of fixed points and zeros |journal=Topology and Its Applications |language=en |volume=275 |pages=107036 |doi=10.1016/j.topol.2019.107036 |s2cid=213249321 |issn=0166-8641 |doi-access=free}}</ref> Again, no upper bound on the number of queries is given.
 
== See also ==
== References ==
{{reflist}}
== Further reading ==
* Victor Yakovlevich Pan: "Solving a Polynomial Equation: Some History and Recent Progress", SIAM Review, Vol. 39, No. 2, pp. 187–220 (June 1997).
* John Michael McNamee: ''Numerical Methods for Roots of Polynomials - Part I'', Elsevier, ISBN 978-0-444-52729-5 (2007).
* John Michael McNamee and Victor Yakovlevich Pan: ''Numerical Methods for Roots of Polynomials - Part II'', Elsevier, ISBN 978-0-444-52730-1 (2013).
 
 
{{Root-finding algorithms}}
{{Data structures and algorithms}}
 
[[Category:Root-finding algorithms| ]]