Jenkins–Traub algorithm


The Jenkins-Traub algorithm for polynomial zeros is a fast, globally convergent iterative method. It has been described as practically a standard in black-box polynomial root-finders.

Given a polynomial P,

P(z) = a_n z^n + a_{n-1} z^{n-1} + \cdots + a_1 z + a_0, \qquad a_n \neq 0,

with complex coefficients, compute approximations to the n zeros \alpha_1, \alpha_2, \dots, \alpha_n of P(z). There is a variation of the Jenkins-Traub algorithm which is faster if the coefficients are real. The Jenkins-Traub algorithm has stimulated considerable research on theory and software for methods of this type.

Overview

The Jenkins-Traub algorithm is a three-stage process for calculating the zeros of a polynomial with complex coefficients. See Jenkins and Traub.[1] A description can also be found in Ralston and Rabinowitz,[2] p. 383. The algorithm is similar in spirit to the two-stage algorithm studied by Traub.[3]

During the stages of the algorithm, a sequence H^{(\lambda)}(z) of polynomials of degree n - 1 and a sequence of complex shifts s_\lambda are constructed. The stages differ in the determination of the shifts; the construction of the polynomials is a recursive procedure depending on the sequence of shifts, where H^{(0)}(z) = P'(z) and

H^{(\lambda+1)}(z) = \frac{1}{z - s_\lambda} \left( H^{(\lambda)}(z) - \frac{H^{(\lambda)}(s_\lambda)}{P(s_\lambda)} P(z) \right).

Since the numerator has by construction a root at z = s_\lambda, the division can be carried out without remainder, for instance by using the Horner scheme or Ruffini rule to compute quotient and remainder in

H^{(\lambda)}(z) - \frac{H^{(\lambda)}(s_\lambda)}{P(s_\lambda)} P(z) = (z - s_\lambda) H^{(\lambda+1)}(z) + 0.
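The quotient-and-remainder step and the resulting H-polynomial update can be sketched as follows. The helper names (`horner_divide`, `h_step`) and the coefficient convention (highest degree first) are illustrative choices, not taken from the original papers.

```python
# A sketch of the Horner/Ruffini division step and one H-polynomial update.
# Coefficient lists are ordered highest degree first.

def horner_divide(coeffs, s):
    """Divide a polynomial by (z - s) with the Horner/Ruffini scheme.

    Returns (quotient coefficients, remainder); the remainder equals
    the value of the polynomial at s.
    """
    work = [coeffs[0]]
    for c in coeffs[1:]:
        work.append(c + s * work[-1])
    return work[:-1], work[-1]

def h_step(p, h, s):
    """One update H -> (H - (H(s)/P(s)) * P) / (z - s) of the H polynomials."""
    _, p_s = horner_divide(p, s)
    _, h_s = horner_divide(h, s)
    c = h_s / p_s
    # numerator H(z) - c * P(z); deg H = deg P - 1, so pad H with the -c*p0 term
    num = [-c * p[0]] + [hc - c * pc for hc, pc in zip(h, p[1:])]
    quotient, _remainder = horner_divide(num, s)   # remainder is 0 by construction
    return quotient
```

For P(z) = z^2 - 3z + 2 and H^{(0)} = P'(z) = 2z - 3, a no-shift step `h_step([1, -3, 2], [2, -3], 0.0)` yields 1.5z - 2.5, and repeated steps drive H toward a multiple of z - 2, the cofactor of the smallest root 1.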

The three stages of the algorithm are these:

  • Stage One: No-Shift Process. For \lambda = 0, 1, \dots, M - 1 set s_\lambda = 0. Usually M = 5 is chosen for polynomials of moderate degree up to n = 50. This stage is not necessary from theoretical considerations alone, but is useful in practice.
  • Stage Two: Fixed-Shift Process. One tries to locate the smallest root of the polynomial by estimating the inner root radius \beta and choosing a random point s = \beta e^{i\varphi} on the circle of this radius. The sequence of polynomials H^{(\lambda)}(z), \lambda = M, M + 1, \dots, L - 1, is generated with the fixed shift value s_\lambda = s. Typically one uses L = M + 9 for polynomials of moderate degree.
  • Stage Three: Variable-Shift Process. The H^{(\lambda)}(z), \lambda = L, L + 1, \dots, are now generated using the variable shifts s_\lambda, which are generated by

      s_{\lambda+1} = s_\lambda - \frac{P(s_\lambda)}{\bar H^{(\lambda+1)}(s_\lambda)},

where \bar H^{(\lambda+1)}(z) is H^{(\lambda+1)}(z) divided by its leading coefficient.

If the step size in stage three does not shrink to zero fast enough, stage two is restarted using a different random point. If this does not succeed after a small number of restarts, the number of steps taken in stage two is doubled.

It can be shown that, provided L is chosen sufficiently large, s_\lambda always converges to a zero of P. After an approximate zero has been found, the degree of P is reduced by one by deflation, and the algorithm is performed on the deflated polynomial until all the zeros have been computed.

The algorithm converges for any distribution of zeros. Furthermore, the convergence is faster than the quadratic convergence of Newton-Raphson iteration.
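The overall flow (stage 1, stage 2 with restarts, stage 3, and deflation) can be sketched in Python. This is a simplified illustration, not the published implementation: the inner-radius estimate, the iteration counts, the tolerances, and the restart policy are ad-hoc assumptions, and the H polynomials are renormalized after every step to avoid overflow.

```python
# Simplified sketch of the three-stage complex Jenkins-Traub iteration.
# Coefficient lists are ordered highest degree first.
import cmath
import random

def poly_eval(c, z):
    """Horner evaluation of a polynomial given by coefficients c."""
    result = 0j
    for a in c:
        result = result * z + a
    return result

def deriv(c):
    """Coefficients of the derivative, highest degree first."""
    n = len(c) - 1
    return [a * (n - i) for i, a in enumerate(c[:-1])]

def h_step(p, h, s):
    """One H-polynomial update H -> (H - (H(s)/P(s)) P) / (z - s), normalized."""
    c = poly_eval(h, s) / poly_eval(p, s)
    num = [-c * p[0]] + [hc - c * pc for hc, pc in zip(h, p[1:])]
    q = [num[0]]                      # synthetic division by (z - s);
    for a in num[1:-1]:               # the remainder vanishes by construction
        q.append(a + s * q[-1])
    return [a / q[0] for a in q]      # normalize to leading coefficient 1

def one_root(p, rng):
    """Stages 1-3 for one root of p (p must not vanish at 0)."""
    n = len(p) - 1
    h = deriv(p)
    for _ in range(5):                             # stage 1: no-shift, M = 5
        h = h_step(p, h, 0j)
    radius = abs(p[-1] / p[0]) ** (1.0 / n)        # crude inner-radius guess
    for _attempt in range(20):                     # restart on failure
        s = radius * cmath.exp(2j * cmath.pi * rng.random())
        h2 = h
        for _ in range(20):                        # stage 2: fixed shift
            h2 = h_step(p, h2, s)
        for _ in range(50):                        # stage 3: variable shift
            if abs(poly_eval(p, s)) < 1e-12:
                return s
            h2 = h_step(p, h2, s)
            s = s - poly_eval(p, s) / poly_eval(h2, s)
    raise RuntimeError("no convergence; polynomial may need rescaling")

def all_roots(p, seed=1):
    """Find all roots by repeated root finding and deflation."""
    rng = random.Random(seed)
    p = [complex(a) for a in p]
    roots = []
    while len(p) > 2:
        r = one_root(p, rng)
        roots.append(r)
        q = [p[0]]                    # deflate: divide p by (z - r)
        for a in p[1:-1]:
            q.append(a + r * q[-1])
        p = q
    roots.append(-p[1] / p[0])
    return roots
```

On P(z) = (z - 1)(z - 2)(z - 3), `all_roots([1, -6, 11, -6])` should return close approximations of 1, 2 and 3, in the order found.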

What gives the algorithm its power?

Compare with the Newton-Raphson iteration

z_{i+1} = z_i - \frac{P(z_i)}{P'(z_i)}.

The iteration uses the given P and its derivative P'. In contrast, the third stage of Jenkins-Traub,

s_{\lambda+1} = s_\lambda - \frac{P(s_\lambda)}{\bar H^{(\lambda+1)}(s_\lambda)},

is precisely a Newton-Raphson iteration performed on certain rational functions. More precisely, Newton-Raphson is being performed on a sequence of rational functions

W^{(\lambda)}(z) = \frac{P(z)}{H^{(\lambda)}(z)}.

For \lambda sufficiently large,

\frac{P(z)}{\bar H^{(\lambda)}(z)}

is as close as desired to a first degree polynomial

z - \alpha_1,

where \alpha_1 is one of the zeros of P. Even though Stage 3 is precisely a Newton-Raphson iteration, no differentiation is performed.
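The limit case can be checked numerically with illustrative values: if \bar H is exactly the monic cofactor P(z)/(z - \alpha_1), then P/\bar H is the linear polynomial z - \alpha_1, and a single step of the form s - P(s)/\bar H(s) reaches the root from any starting point.

```python
# Tiny check of the limit case: with Hbar equal to the exact monic cofactor,
# the stage-3 style step is Newton's method on the linear function z - 1.

def peval(c, z):
    """Horner evaluation; coefficients highest degree first."""
    r = 0.0
    for a in c:
        r = r * z + a
    return r

p = [1.0, -6.0, 11.0, -6.0]     # P(z) = (z - 1)(z - 2)(z - 3)
hbar = [1.0, -5.0, 6.0]         # monic cofactor (z - 2)(z - 3) of the root 1
s = 0.5
step = s - peval(p, s) / peval(hbar, s)
print(step)                      # prints 1.0: the root, reached in one step
```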

As inverse power iteration

All stages of the Jenkins-Traub complex algorithm may be represented as the linear algebra problem of determining the eigenvalues of a special matrix. This matrix is the coordinate representation of a linear map in the n-dimensional space of polynomials of degree n - 1 or less. The principal idea of this map is to interpret the factorization

P(X) = (X - \alpha_1) \cdot P_1(X)

with a root \alpha_1 and P_1(X) the remaining factor of degree n - 1 as the eigenvector equation for the multiplication with the variable X, followed by remainder computation with divisor P(X),

M_X(H) = (X \cdot H(X)) \bmod P(X).

This maps polynomials of degree at most n - 1 to polynomials of degree at most n - 1. The eigenvalues of this map are the roots of P(X), since the eigenvector equation reads

0 = (M_X - \alpha \cdot \mathrm{id})(H) = ((X - \alpha) \cdot H(X)) \bmod P(X),

which implies that (X - \alpha) \cdot H(X) = C \cdot P(X), that is, X - \alpha is a linear factor of P(X). In the monomial basis the linear map M_X is represented by a companion matrix of the polynomial P; assuming P monic with P(X) = X^n + p_{n-1} X^{n-1} + \cdots + p_1 X + p_0, one has

M_X(X^k) = X^{k+1} \text{ for } k = 0, \dots, n - 2 \quad\text{and}\quad M_X(X^{n-1}) = -p_0 - p_1 X - \cdots - p_{n-1} X^{n-1},

the resulting coefficient matrix is

A = \begin{pmatrix} 0 & 0 & \cdots & 0 & -p_0 \\ 1 & 0 & \cdots & 0 & -p_1 \\ 0 & 1 & \cdots & 0 & -p_2 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & \cdots & 1 & -p_{n-1} \end{pmatrix}.

To this matrix the inverse power iteration is applied, in three variants corresponding to the three stages of the algorithm. It is more efficient to perform the linear algebra operations in polynomial arithmetic and not by matrix operations; however, the properties of the inverse power iteration remain the same.
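A minimal sketch of the map M_X in plain polynomial arithmetic (assuming a monic P and highest-degree-first coefficient lists, both illustrative conventions) makes the eigenvector property concrete: applied to a cofactor of P, the map multiplies it by the corresponding root.

```python
# Sketch of M_X: multiply by X, then reduce mod the monic polynomial P.

def mx(p, h):
    """Compute (X * H(X)) mod P(X) for monic P and deg H <= deg P - 1."""
    xh = h + [0.0]                    # multiply H by X
    if len(xh) == len(p):             # degree reached deg P: reduce mod P
        lead = xh[0]
        xh = [a - lead * b for a, b in zip(xh[1:], p[1:])]
    return xh

p = [1.0, -6.0, 11.0, -6.0]           # P(X) = (X - 1)(X - 2)(X - 3)
h = [1.0, -3.0, 2.0]                  # cofactor (X - 1)(X - 2) of the root 3
print(mx(p, h))                       # prints [3.0, -9.0, 6.0] = 3 * H
```

The output is exactly three times the input cofactor, i.e. H is an eigenvector of M_X with eigenvalue 3, one of the roots of P.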

Stage 1: Here the inverse power iteration without shift,

H^{(\lambda+1)} = M_X^{-1}(H^{(\lambda)}) \quad\text{resp.}\quad \bar H^{(\lambda+1)} = \frac{M_X^{-1}(\bar H^{(\lambda)})}{\| M_X^{-1}(\bar H^{(\lambda)}) \|},

is used. In the second variant, \bar H^{(\lambda+1)} is a normalized multiple of M_X^{-1}(\bar H^{(\lambda)}), with the normalization fixing the norm of the vector, or some coordinate of it, to be 1. This variant approximates the eigenvector belonging to the smallest eigenvalue, that is, the smallest root of P(X). The convergence is of linear order with a factor that is the quotient of the smallest (in absolute value) and next-to-smallest eigenvalue. This only works if there is only one eigenvalue with a minimal distance to zero.

Stage 2: This step uses the inverse power iteration in the variant with a constant shift s. In consequence, it is computing the smallest eigenvalue of M_X - s \cdot \mathrm{id} by

H^{(\lambda+1)} = (M_X - s \cdot \mathrm{id})^{-1}(H^{(\lambda)}) \quad\text{resp.}\quad \bar H^{(\lambda+1)} = \frac{(M_X - s \cdot \mathrm{id})^{-1}(\bar H^{(\lambda)})}{\| (M_X - s \cdot \mathrm{id})^{-1}(\bar H^{(\lambda)}) \|}.

If the shift s is close to the smallest eigenvalue \alpha_1, then the smallest eigenvalue of the shifted map is \alpha_1 - s, which is close to zero. The convergence is linear with contraction factor

\left| \frac{\alpha_1 - s}{\alpha_2 - s} \right|,

where \alpha_2 is the next-smallest eigenvalue; the convergence is the faster the smaller |\alpha_1 - s| is.

Stage 3: Here the shift is updated in every step, using

H^{(\lambda+1)} = (M_X - s_\lambda \cdot \mathrm{id})^{-1}(H^{(\lambda)}) \quad\text{and}\quad \bar H^{(\lambda+1)} = \frac{H^{(\lambda+1)}}{\| H^{(\lambda+1)} \|},

to obtain an update to s_\lambda from the growth factor between H^{(\lambda)} and H^{(\lambda+1)},

s_{\lambda+1} = s_\lambda + \frac{L(H^{(\lambda)})}{L(H^{(\lambda+1)})},

with some linear functional L. If \varepsilon_\lambda denotes the distance of the normalized iteration vector \bar H^{(\lambda)} from the eigenspace of \alpha_1, then one gets for the convergence speed the recursions

\varepsilon_{\lambda+1} = O(|s_\lambda - \alpha_1| \cdot \varepsilon_\lambda) \quad\text{and}\quad |s_{\lambda+1} - \alpha_1| = O(|s_\lambda - \alpha_1|^2 \cdot \varepsilon_\lambda),

resulting in an asymptotic convergence order of \tfrac{3+\sqrt 5}{2} = 1 + \varphi \approx 2.618, where \varphi = \tfrac{1+\sqrt 5}{2} is the golden ratio.
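The order can be recovered with a small calculation. Assuming the shift error e and the eigenvector distance d shrink per step like e' ~ e^2 * d and d' ~ e * d (one consistent model of the stage-3 recursions; an assumption of this sketch), the error exponents evolve by the matrix [[2, 1], [1, 1]], and the ratio of successive exponents tends to its dominant eigenvalue:

```python
# Error-exponent model for stage 3: e' ~ e^2 * d and d' ~ e * d.
import math

e_exp, d_exp = 1.0, 1.0            # error ~ eps**e_exp, distance ~ eps**d_exp
for _ in range(40):
    prev = e_exp
    e_exp, d_exp = 2.0 * e_exp + d_exp, e_exp + d_exp
order = e_exp / prev               # ratio of successive shift-error exponents
# order tends to (3 + sqrt(5)) / 2 = 1 + golden ratio, about 2.618
```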


To solve the polynomial identity H^{(\lambda)}(X) = ((X - s) \cdot H^{(\lambda+1)}(X)) \bmod P(X) representing the inverse power iteration, one has to find a factor q(X) such that

(X - s) \cdot H^{(\lambda+1)}(X) = H^{(\lambda)}(X) - q(X) \cdot P(X).

Evaluating at X = s one arrives at q(s) = H^{(\lambda)}(s)/P(s), and by degree computation one concludes that q(X) is a constant polynomial. Then the equation is solved by exact polynomial division giving

H^{(\lambda+1)}(X) = \frac{1}{X - s} \left( H^{(\lambda)}(X) - \frac{H^{(\lambda)}(s)}{P(s)} P(X) \right),

which is the unnormalized iteration for the polynomial sequence H^{(\lambda)}. Comparing the leading coefficients (LC) of this sequence one finds that

LC(H^{(\lambda+1)}) = -\frac{H^{(\lambda)}(s)}{P(s)} \, LC(P),

such that the iteration for the normalized H polynomials \bar H^{(\lambda)} = H^{(\lambda)}/LC(H^{(\lambda)}) reads as

\bar H^{(\lambda+1)}(X) = \frac{1}{X - s} \left( \bar P(X) - \frac{\bar P(s)}{\bar H^{(\lambda)}(s)} \bar H^{(\lambda)}(X) \right),

where \bar P(X) = P(X)/LC(P).

In the additional evaluation of the third stage one only needs the fraction involving the leading coefficients: for monic P,

LC(H^{(\lambda+1)}) = -\frac{H^{(\lambda)}(s_\lambda)}{P(s_\lambda)} \quad\text{and thus}\quad \frac{H^{(\lambda)}(s_\lambda)}{H^{(\lambda+1)}(s_\lambda)} = -\frac{P(s_\lambda)}{\bar H^{(\lambda+1)}(s_\lambda)},

from where the update formula

s_{\lambda+1} = s_\lambda + \frac{H^{(\lambda)}(s_\lambda)}{H^{(\lambda+1)}(s_\lambda)} = s_\lambda - \frac{P(s_\lambda)}{\bar H^{(\lambda+1)}(s_\lambda)}

results.
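The leading-coefficient relation can be spot-checked numerically. The data and helper names below are illustrative, not from the original papers; for a monic P, one unnormalized step should produce LC(H^{(λ+1)}) = -H^{(λ)}(s)/P(s).

```python
# Numeric spot-check of the leading-coefficient relation for a monic P.

def peval(c, z):
    """Horner evaluation; coefficients highest degree first."""
    r = 0.0
    for a in c:
        r = r * z + a
    return r

def h_step_unnormalized(p, h, s):
    """H -> (H - (H(s)/P(s)) * P) / (z - s) without normalization."""
    c = peval(h, s) / peval(p, s)
    num = [-c * p[0]] + [hc - c * pc for hc, pc in zip(h, p[1:])]
    q = [num[0]]                      # synthetic division by (z - s)
    for a in num[1:-1]:
        q.append(a + s * q[-1])
    return q

p = [1.0, -3.0, 2.0]                  # monic P(z) = (z - 1)(z - 2)
h = [2.0, -3.0]                       # H = P'
s = 0.5
h_next = h_step_unnormalized(p, h, s)
predicted_lc = -peval(h, s) / peval(p, s)
# h_next[0] agrees with predicted_lc (both equal 8/3 here)
```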

Real coefficients

The Jenkins-Traub algorithm described earlier works for polynomials with complex coefficients. The same authors also created a three-stage algorithm for polynomials with real coefficients. See Jenkins and Traub A Three-Stage Algorithm for Real Polynomials Using Quadratic Iteration.[4] The algorithm finds either a linear or quadratic factor working completely in real arithmetic. If the complex and real algorithms are applied to the same real polynomial, the real algorithm is about four times as fast. The real algorithm always converges and the rate of convergence is greater than second order.

A connection with the shifted QR algorithm

There is a surprising connection with the shifted QR algorithm for computing matrix eigenvalues. See Dekker and Traub The shifted QR algorithm for Hermitian matrices.[5] Again the shifts may be viewed as Newton-Raphson iteration on a sequence of rational functions converging to a first degree polynomial.

Software and testing

The software for the Jenkins-Traub algorithm was published as Jenkins and Traub Algorithm 419: Zeros of a Complex Polynomial.[6] The software for the real algorithm was published as Jenkins Algorithm 493: Zeros of a Real Polynomial.[7]

The methods have been extensively tested by many people. As predicted, they enjoy faster than quadratic convergence for all distributions of zeros. They have been described as practically a standard in black-box polynomial root finders; see Press et al., Numerical Recipes,[8] p. 380.

However, there are polynomials which can cause loss of precision, as illustrated by the following example. The polynomial has all its zeros lying on two half-circles of different radii. Wilkinson recommends that it is desirable for stable deflation that smaller zeros be computed first. The second-stage shifts are chosen so that the zeros on the smaller half-circle are found first. After deflation, the polynomial with the zeros on the remaining half-circle is known to be ill-conditioned if the degree is large; see Wilkinson,[9] p. 64. The original polynomial was of degree 60 and suffered severe deflation instability.

References

  1. ^ Jenkins, M. A. and Traub, J. F. (1970), A Three-Stage Variable-Shift Iteration for Polynomial Zeros and Its Relation to Generalized Rayleigh Iteration, Numer. Math. 14, 252-263.
  2. ^ Ralston, A. and Rabinowitz, P. (1978), A First Course in Numerical Analysis, 2nd ed., McGraw-Hill, New York.
  3. ^ Traub, J. F. (1966), A Class of Globally Convergent Iteration Functions for the Solution of Polynomial Equations, Math. Comp., 20(93), 113-138.
  4. ^ Jenkins, M. A. and Traub, J. F. (1970), A Three-Stage Algorithm for Real Polynomials Using Quadratic Iteration, SIAM J. Numer. Anal., 7(4), 545-566.
  5. ^ Dekker, T. J. and Traub, J. F. (1971), The shifted QR algorithm for Hermitian matrices, Lin. Algebra Appl., 4(2), 137-154.
  6. ^ Jenkins, M. A. and Traub, J. F. (1972), Algorithm 419: Zeros of a Complex Polynomial, Comm. ACM, 15, 97-99.
  7. ^ Jenkins, M. A. (1975), Algorithm 493: Zeros of a Real Polynomial, ACM TOMS, 1, 178-189.
  8. ^ Press, W. H., Teukolsky, S. A., Vetterling, W. T. and Flannery, B. P. (2002), Numerical Recipes in C++: The Art of Scientific Computing, 2nd. ed., Cambridge University Press, New York.
  9. ^ Wilkinson, J. H. (1963), Rounding Errors in Algebraic Processes, Prentice Hall, Englewood Cliffs, N.J.