{{Short description|Root-finding algorithm for polynomials}}
The '''Jenkins–Traub algorithm''' for polynomial zeros is a fast globally convergent iterative polynomial [[root-finding algorithm]] published in 1970 by Michael A. Jenkins and [[Joseph F. Traub]].

This article describes the complex variant. Given a polynomial ''P'',
:<math>P(z)=\sum_{i=0}^na_iz^{n-i}, \quad a_0=1,\quad a_n\ne 0,</math>
with complex coefficients it computes approximations to the ''n'' zeros <math>\alpha_1,\alpha_2,\dots,\alpha_n</math> of ''P''(''z''), one at a time in roughly increasing order of magnitude. After each root is computed, its linear factor is removed from the polynomial. Using this ''deflation'' guarantees that each root is computed only once and that all roots are found.

The real variant follows the same pattern, but computes two roots at a time, either two real roots or a pair of complex conjugate roots. By avoiding complex arithmetic, the real variant can be faster (by a factor of four) than the complex variant. The Jenkins–Traub algorithm has stimulated considerable research on theory and software for methods of this type.
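Deflation itself is ordinary synthetic division by the computed linear factor. The following Python sketch illustrates the idea only; the helper name <code>deflate</code> is hypothetical and this is not the published code.

```python
def deflate(coeffs, r):
    """Divide P by (z - r) via synthetic division (Horner's scheme).

    coeffs lists P highest-degree first; r is an already computed root.
    Returns the quotient's coefficients; the remainder P(r) is dropped,
    since it is (numerically) zero when r is a root."""
    out = [coeffs[0]]
    for c in coeffs[1:-1]:
        out.append(c + r * out[-1])   # Horner step produces quotient coefficients
    return out

# P(z) = (z - 1)(z - 2)(z - 3) = z^3 - 6z^2 + 11z - 6; remove the root z = 1:
q = deflate([1, -6, 11, -6], 1.0)    # q represents z^2 - 5z + 6
```

After deflation the algorithm continues on the lower-degree quotient, which is why each root is found only once.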
==Overview==
The Jenkins–Traub algorithm is a three-stage process for calculating the zeros of a polynomial with complex coefficients.
A description can also be found in Ralston and [[Philip Rabinowitz (mathematician)|Rabinowitz]]<ref>Ralston, A. and Rabinowitz, P. (1978), A First Course in Numerical Analysis, 2nd ed., McGraw-Hill, New York.</ref> p. 383.
The algorithm is similar in spirit to the two-stage algorithm studied by Traub.<ref>Traub, J. F. (1966), A Class of Globally Convergent Iteration Functions Applied to the Computation of Polynomial Zeros, Math. Comp., 20(93), 113–138.</ref>
=== Root-finding procedure ===
Starting with the current polynomial ''P''(''X'') of degree ''n'', the smallest root <math>\alpha_1</math> of ''P''(''X'') is computed.
To that end, a sequence of so-called ''H'' polynomials is constructed. These polynomials are all of degree ''n'' − 1 and are supposed to converge to the factor <math>\bar H(X)</math> of ''P''(''X'') containing (the linear factors of) all the remaining roots. The sequence of ''H'' polynomials occurs in two variants, an unnormalized variant that allows easy theoretical insights and a normalized variant of <math>\bar H</math> polynomials that keeps the coefficients in a numerically sensible range.
The construction of the ''H'' polynomials <math>\left(H^{(\lambda)}(z)\right)_{\lambda=0,1,2,\dots}</math> depends on a sequence of complex numbers <math>s_\lambda</math> called shifts. These shifts themselves depend, at least in the third stage, on the previous ''H'' polynomials. The ''H'' polynomials are defined as the solution to the implicit recursion
<math display="block">
H^{(0)}(z)=P^\prime(z)
</math> and <math display="block">
(X-s_\lambda)\cdot H^{(\lambda+1)}(X)\equiv H^{(\lambda)}(X)\pmod{P(X)}\ .
</math>
A direct solution to this implicit equation is
<math display="block">
H^{(\lambda+1)}(X)
=\frac1{X-s_\lambda}\cdot
\left(H^{(\lambda)}(X)-\frac{H^{(\lambda)}(s_\lambda)}{P(s_\lambda)}\,P(X)\right),
</math>
where the polynomial division is exact.
Algorithmically, one would use the [[Horner scheme]] or [[Ruffini rule]] to evaluate the polynomials at <math>s_\lambda</math> and obtain the quotients at the same time. With the resulting quotients ''p''(''X'') and ''h''(''X'') as intermediate results the next ''H'' polynomial is obtained as
<math display="block">
\left.\begin{align}
P(X)&=p(X)\cdot(X-s_\lambda)+P(s_\lambda)\\
H^{(\lambda)}(X)&=h(X)\cdot(X-s_\lambda)+H^{(\lambda)}(s_\lambda)
\end{align}\right\}
\implies
H^{(\lambda+1)}(X)
=h(X)-\frac{H^{(\lambda)}(s_\lambda)}{P(s_\lambda)}\,p(X)\,.
</math>
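One step of this computation can be sketched in Python as follows (coefficient lists are highest-degree first; the helper names <code>horner</code> and <code>next_H</code> are hypothetical, and this illustrates the formulas above rather than reproducing the published code):

```python
def horner(coeffs, s):
    """Evaluate the polynomial at s with Horner's scheme.

    Returns (value, quotient), where quotient lists the coefficients
    of the exact quotient on division by (X - s)."""
    q = [coeffs[0]]
    for c in coeffs[1:]:
        q.append(c + s * q[-1])
    return q[-1], q[:-1]

def next_H(P, H, s):
    """One step of the unnormalized recursion:
    H_next(X) = h(X) - (H(s)/P(s)) * p(X)."""
    Ps, p = horner(P, s)   # p(X) has degree n-1
    Hs, h = horner(H, s)   # h(X) has degree n-2
    t = Hs / Ps
    h = [0.0] * (len(p) - len(h)) + h   # pad h on the left to align degrees
    return [hc - t * pc for hc, pc in zip(h, p)]

# P(z) = z^2 - 3z + 2 with roots 1 and 2; H0 = P' = 2z - 3, shift s = 0:
H1 = next_H([1, -3, 2], [2, -3], 0.0)   # H1 represents 1.5z - 2.5
```

Note that the leading coefficient of the result is <math>-H^{(\lambda)}(s_\lambda)/P(s_\lambda)</math> times the leading coefficient of ''p''(''X''), matching the statement in the text.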
Since the highest degree coefficient is obtained from ''P(X)'', the leading coefficient of <math>H^{(\lambda+1)}(X)</math> is <math>-\tfrac{H^{(\lambda)}(s_\lambda)}{P(s_\lambda)}</math>. If this is divided out the normalized ''H'' polynomial is
<math display="block">
\bar H^{(\lambda+1)}(X)
=\frac1{X-s_\lambda}\cdot
\left(P(X)-\frac{P(s_\lambda)}{\bar H^{(\lambda)}(s_\lambda)}\,\bar H^{(\lambda)}(X)\right)\,.
</math>
==== Stage One: No-Shift Process ====
For <math>\lambda = 0,1,\dots, M-1</math> set <math>s_\lambda=0</math>. Usually ''M'' = 5 is chosen for polynomials of moderate degrees up to ''n'' = 50.
==== Stage Two: Fixed-Shift Process ====
The shift for this stage is determined as some point close to the smallest root of the polynomial. It is quasi-randomly located on the circle with the inner root radius, which in turn is estimated as the positive solution of the equation
<math display="block">
R^n+|a_1|\,R^{n-1}+\dots+|a_{n-1}|\,R=|a_n|\,.
</math>
Since the left side is a convex function and increases monotonically from zero to infinity, this equation is easy to solve, for instance by [[Newton's method]].
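A minimal Python sketch of this radius estimate, assuming the monic coefficient convention used above (the helper name <code>inner_root_radius</code> is hypothetical):

```python
def inner_root_radius(coeffs, tol=1e-12, max_iter=100):
    """Newton's method for the positive solution R of
    R^n + |a_1| R^{n-1} + ... + |a_{n-1}| R - |a_n| = 0,
    with coeffs = [a_0 = 1, a_1, ..., a_n] listed highest-degree first."""
    b = [abs(c) for c in coeffs]
    b[-1] = -b[-1]                      # constant term moves to the left-hand side
    R = abs(coeffs[-1]) ** (1.0 / (len(coeffs) - 1))   # crude initial guess
    for _ in range(max_iter):
        f, df = b[0], 0.0
        for c in b[1:]:                 # Horner scheme for value and derivative
            df = df * R + f
            f = f * R + c
        step = f / df
        R -= step
        if abs(step) < tol * max(abs(R), 1.0):
            break
    return R
```

Since the modified polynomial is convex and increasing for positive ''R'', Newton's method converges monotonically after at most one overshoot.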
Now choose <math>s=R\cdot \exp(i\,\phi_\text{random})</math> on the circle of this radius. The sequence of polynomials <math>H^{(\lambda+1)}(z)</math>, <math>\lambda=M,M+1,\dots,L-1</math>, is generated with the fixed shift value <math>s_\lambda = s</math>. This creates an asymmetry relative to the previous stage which increases the chance that the ''H'' polynomial moves towards the cofactor of a single root.
During this iteration, the current approximation for the root
<math display="block">
t_\lambda=s-\frac{P(s)}{\bar H^{(\lambda+1)}(s)}
</math>
is traced. The second stage is finished successfully if the conditions
<math display="block">
|t_{\lambda+1}-t_\lambda|<\tfrac12\,|t_\lambda|
</math> and <math display="block">
|t_\lambda-t_{\lambda-1}|<\tfrac12\,|t_{\lambda-1}|
</math>
are simultaneously met. This limits the relative step size of the iteration, ensuring that the approximation sequence stays in the range of the smaller roots. If there was no success after some number of iterations, a different random point on the circle is tried. Typically one uses a number of 9 iterations for polynomials of moderate degree, with a doubling strategy for the case of multiple failures.
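The stopping test above is straightforward to express in code; as a hypothetical Python helper (not part of the published implementation):

```python
def stage_two_finished(t_prev, t_curr, t_next):
    """True when both consecutive relative steps of the traced root
    approximation have dropped below 1/2, the success criterion above.
    Works for complex approximations as well, since abs() is the modulus."""
    return (abs(t_next - t_curr) < 0.5 * abs(t_curr) and
            abs(t_curr - t_prev) < 0.5 * abs(t_prev))
```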
==== Stage Three: Variable-Shift Process ====
The <math>H^{(\lambda+1)}(X)</math> polynomials are now generated using the variable shifts <math>s_{\lambda},\ \lambda=L,L+1,\dots,</math> which are generated by
<math display="block">
s_L=s-\frac{P(s)}{\bar H^{(L)}(s)}
</math>
being the last root estimate of the second stage and
<math display="block">
s_{\lambda+1}=s_\lambda-\frac{P(s_\lambda)}{\bar H^{(\lambda+1)}(s_\lambda)}\,,
</math>
where <math>\bar H^{(\lambda+1)}(X)</math> is <math>H^{(\lambda+1)}(X)</math> divided by its leading coefficient.
If the step size in stage three does not fall fast enough to zero, then stage two is restarted using a different random point. If this does not succeed after a small number of restarts, the number of steps in stage two is doubled.
==== Convergence ====
It can be shown that, provided ''L'' is chosen sufficiently large, ''s''<sub>λ</sub> always converges to a root of ''P''.
The algorithm converges for any distribution of roots, but may fail to find all roots of the polynomial. Furthermore, the convergence is slightly faster than the [[Rate of convergence|quadratic convergence]] of the Newton–Raphson method.
==What gives the algorithm its power?==
Compare with the [[Newton's method|Newton–Raphson iteration]]
:<math>z_{i+1}=z_i - \frac{P(z_i)}{P^{\prime}(z_i)}.</math>
The iteration uses the given ''P'' and <math>\scriptstyle P^{\prime}</math>. In contrast, the third stage of Jenkins–Traub,
<math display="block">
s_{\lambda+1}
=s_\lambda- \frac{P(s_\lambda)}{\bar H^{(\lambda+1)}(s_\lambda)}\,,
</math>
is precisely a Newton–Raphson iteration performed on certain rational functions. More precisely, Newton–Raphson is being performed on a sequence of rational functions
:<math>W^{(\lambda)}(z)=\frac{P(z)}{H^{(\lambda)}(z)}\,.</math>
For <math>\lambda</math> sufficiently large,
:<math>\frac{P(z)}{\bar H^{(\lambda)}(z)}=W^{(\lambda)}(z)\cdot LC(H^{(\lambda)})</math>
is as close as desired to a first degree polynomial
:<math>z-\alpha_1\,,</math>
where <math>\alpha_1</math> is one of the zeros of <math>P</math>. Even though Stage 3 is precisely a Newton–Raphson iteration, differentiation is not performed.
=== Analysis of the ''H'' polynomials ===
Let <math>\alpha_1,\dots,\alpha_n</math> be the roots of ''P''(''X''). The so-called Lagrange factors of ''P''(''X'') are the cofactors of these roots,
<math display="block">P_m(X)=\frac{P(X)}{X-\alpha_m}\,.</math>
If all roots are different, then the Lagrange factors form a basis of the space of polynomials of degree at most ''n'' − 1. By analysis of the recursion procedure one finds that the ''H'' polynomials have the coordinate representation
<math display="block">
H^{(\lambda)}(X)
=\sum_{m=1}^n
\left[
\prod_{\kappa=0}^{\lambda-1}(\alpha_m-s_\kappa)^{-1}
\right]\cdot P_m(X)\,.
</math>
Each Lagrange factor has leading coefficient 1, so that the leading coefficient of the H polynomials is the sum of the coefficients. The normalized H polynomials are thus
<math display="block">
\bar H^{(\lambda)}(X)
=\frac{\sum_{m=1}^n
\left[
\prod_{\kappa=0}^{\lambda-1}(\alpha_m-s_\kappa)^{-1}
\right]\cdot P_m(X)
}{
\sum_{m=1}^n
\left[
\prod_{\kappa=0}^{\lambda-1}(\alpha_m-s_\kappa)^{-1}
\right]
}
= \frac{P_1(X)+\sum_{m=2}^n
\left[
\prod_{\kappa=0}^{\lambda-1}\frac{\alpha_1-s_\kappa}{\alpha_m-s_\kappa}
\right]\cdot P_m(X)
}{
1+\sum_{m=2}^n
\left[
\prod_{\kappa=0}^{\lambda-1}\frac{\alpha_1-s_\kappa}{\alpha_m-s_\kappa}
\right]
}\,.
</math>
Under the condition that <math>|\alpha_1|<|\alpha_2|\le\dots\le|\alpha_n|</math>,
one gets the asymptotic estimates for
*stage 1: <math display="block">
H^{(\lambda)}(X)
=P_1(X)+O\left(\left|\frac{\alpha_1}{\alpha_2}\right|^\lambda\right).
</math>
*for stage 2, if ''s'' is close enough to <math>\alpha_1</math>: <math display="block">
H^{(\lambda)}(X)
= P_1(X)
+O\left(
\left|\frac{\alpha_1}{\alpha_2}\right|^M
\cdot
\left|\frac{\alpha_1-s}{\alpha_2-s}\right|^{\lambda-M}\right)
</math> and <math display="block">
s-\frac{P(s)}{\bar H^{(\lambda)}(s)}
= \alpha_1+O\left(\ldots\cdot|\alpha_1-s|\right).</math>
*and for stage 3: <math display="block">
H^{(\lambda)}(X)
=P_1(X)
+O\left(
\prod_{\kappa=L}^{\lambda-1}
\left|\frac{\alpha_1-s_\kappa}{\alpha_2-s_\kappa}\right|
\right)
</math> and <math display="block">
s_{\lambda+1}=
s_\lambda-\frac{P(s_\lambda)}{\bar H^{(\lambda+1)}(s_\lambda)}
=\alpha_1+O\left(
\frac{|\alpha_1-s_\lambda|^2}{|\alpha_2-s_\lambda|}
\right),
</math>
:giving rise to a higher than quadratic convergence order of <math>\phi^2=1+\phi\approx 2.618</math>, where <math>\phi=\tfrac12(1+\sqrt5)</math> is the [[golden ratio]].
=== Interpretation as inverse power iteration ===
All stages of the Jenkins–Traub complex algorithm may be represented as the linear algebra problem of determining the eigenvalues of a special matrix. This matrix is the coordinate representation of a linear map in the ''n''-dimensional space of polynomials of degree ''n'' − 1 or less. The principal idea of this map is to interpret the factorization
<math display="block">P(X)=(X-\alpha_1)\cdot P_1(X)</math>
with a root <math>\alpha_1\in\C</math> and <math>P_1(X) = P(X) / (X-\alpha_1)</math> the remaining factor of degree ''n'' − 1 as the eigenvector equation for multiplication by the variable ''X'' followed by reduction modulo ''P''(''X''),
<math display="block">M_X(H)=(X\cdot H(X)) \bmod P(X)\,.</math>
This maps polynomials of degree at most ''n'' − 1 to polynomials of degree at most ''n'' − 1. The eigenvalues of this map are the roots of ''P''(''X''), since the eigenvector equation reads
<math display="block">0=(M_X-\alpha\cdot \operatorname{id})(H)=((X-\alpha)\cdot H(X)) \bmod P(X)\,,</math>
which implies that <math>(X-\alpha)\cdot H(X)=C\cdot P(X)</math>, that is, <math>\alpha=\alpha_m</math> is one of the roots of ''P''(''X'') and <math>H(X)=C\cdot P_m(X)</math> is a multiple of the corresponding Lagrange factor. In the monomial basis
the resulting linear map <math>M_X</math> is represented by the [[companion matrix]] of ''P''(''X''),
<math display="block">M_X=
\begin{pmatrix}
0 & 0 & \dots & 0 & -a_n \\
1 & 0 & \dots & 0 & -a_{n-1} \\
0 & 1 & \dots & 0 & -a_{n-2} \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & \dots & 1 & -a_1
\end{pmatrix}\,.</math>
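For illustration, the matrix can be built and its eigenvalues checked numerically with NumPy. This is only a sketch of the linear-algebra interpretation; the algorithm itself works in polynomial arithmetic, not with explicit matrices.

```python
import numpy as np

def companion(coeffs):
    """Companion-type matrix M_X for monic P with
    coeffs = [a_0 = 1, a_1, ..., a_n] listed highest-degree first."""
    n = len(coeffs) - 1
    M = np.zeros((n, n))
    M[1:, :-1] = np.eye(n - 1)            # ones on the subdiagonal
    M[:, -1] = -np.array(coeffs[:0:-1])   # last column: -a_n, ..., -a_1
    return M

# P(z) = z^2 - 3z + 2: the eigenvalues of M_X are the roots 1 and 2.
M = companion([1, -3, 2])
roots = sorted(np.linalg.eigvals(M).real)
```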
To this matrix the [[inverse power iteration]] is applied in the three variants of no shift, constant shift and generalized [[Rayleigh quotient iteration]] for the three stages of the algorithm. It is more efficient to perform the linear algebra operations in polynomial arithmetic rather than by matrix operations; however, the properties of the inverse power iteration remain the same.
==Real coefficients==
The Jenkins–Traub algorithm described above works for polynomials with complex coefficients. The same authors also created a three-stage algorithm for polynomials with real coefficients.<ref>Jenkins, M. A. and Traub, J. F. (1970), A Three-Stage Algorithm for Real Polynomials Using Quadratic Iteration, SIAM J. Numer. Anal., 7(4), 545–566.</ref> The algorithm finds either a linear or a quadratic factor, working completely in real arithmetic. If the complex and real algorithms are applied to the same real polynomial, the real algorithm is about four times as fast.
==A connection with the shifted QR algorithm==
There is a surprising connection with the shifted QR algorithm for computing matrix eigenvalues. See Dekker and Traub.<ref>Dekker, T. J. and Traub, J. F. (1971), [http://linkinghub.elsevier.com/retrieve/pii/0024379571900358 The shifted QR algorithm for Hermitian matrices], Lin. Algebra Appl., 4(2), 137–154.</ref>
==Software and testing==
The software for the complex algorithm was published as Jenkins and Traub ''Algorithm 419: Zeros of a Complex Polynomial''.<ref>Jenkins, M. A. and Traub, J. F. (1972), Algorithm 419: Zeros of a Complex Polynomial, Comm. ACM, 15(2), 97–99.</ref> The software for the real algorithm was published as Jenkins ''Algorithm 493: Zeros of a Real Polynomial''.<ref>Jenkins, M. A. (1975), Algorithm 493: Zeros of a Real Polynomial, ACM Trans. Math. Software, 1(2), 178–189.</ref>
The methods have been extensively tested by many people.{{Who|date=December 2021}} As predicted they enjoy faster than quadratic convergence for all distributions of zeros.
However, there are polynomials which can cause loss of precision,<ref>{{Cite web|date=8 August 2005|title=William Kahan Oral history interview by Thomas Haigh|url=http://history.siam.org/oralhistories/kahan.htm|access-date=2021-12-03|website=The History of Numerical Analysis and Scientific Computing|publication-place=Philadelphia, PA}}</ref> as illustrated by the following example. The polynomial has all its zeros lying on two half-circles of different radii. [[James H. Wilkinson|Wilkinson]] recommends that, for stable deflation, smaller zeros be computed first. The second-stage shifts are chosen so that the zeros on the smaller half-circle are found first. After deflation the polynomial with the zeros on the larger half-circle is known to be ill-conditioned if the degree is large; see Wilkinson,<ref>Wilkinson, J. H. (1963), Rounding Errors in Algebraic Processes, Prentice Hall, Englewood Cliffs, N.J.</ref> p. 64. The original polynomial was of degree 60 and suffered severe deflation instability.
==References==
{{Reflist}}
==External links==
*[http://www.hvks.com/Numerical/winsolve.html A free downloadable Windows application using the Jenkins–Traub Method for polynomials with real and complex coefficients]
*[https://github.com/sweeneychris/RpolyPlusPlus RPoly++] An SSE-optimized C++ implementation of the RPOLY algorithm.
{{Root-finding algorithms}}
{{DEFAULTSORT:Jenkins-Traub Algorithm}}
[[Category:Numerical analysis]]
[[Category:Root-finding algorithms]]