Lehmer–Schur algorithm: Difference between revisions

Content deleted Content added
References: fix spelling
Windeman (talk | contribs)
No edit summary
Line 1:
In [[mathematics]], the '''Lehmer–Schur algorithm''' (named after [[Derrick Henry Lehmer]] and [[Issai Schur]]) is a [[root-finding algorithm]] extending the idea of enclosing roots like in the one-dimensional [[bisection method]] to the complex plane. It uses the Schur–Cohn test to test increasingly smaller disks for the presence or absence of roots. Alternative methods like Wilf's algorithm apply different tests to differently shaped areas but keep the idea of descent by subdivision.
 
==The =Lehmer-Schur methodalgorithm===
The Schur–Cohn test described below allows one to determine if a polynomial has no roots in the unit disk and in some cases to determine the exact number of roots. The method proposed by Lehmer test for the presence of roots of a polynomial <math>p(z)</math> on a collection of disks <math>D(c,\rho)</math> in the complex plane by applying the Schur–Cohn test to the shifted and scaled polynomial <math>p(c+\rho z).</math>
 
In [[mathematics]], the '''Lehmer-Schur algorithm''' (named after [[Derrick Henry Lehmer]] and [[Issai Schur]]) is a [[root-finding algorithm]] for [[complex]] [[polynomial|polynomials]], extending the idea of enclosing roots like in the one-dimensional [[bisection method]] to the complex plane. It uses the Schur-Cohn test to test increasingly smaller disks for the presence or absence of roots.\\
Starting with ''c''=0 and &rho;=1, the radius in increased or decreased by factors of 2 until the annulus <math>\rho\le|z|\le2\rho</math> is found to contain roots. Then the method is recursively applied to the 8 disks with center <math>c_k=\frac53\rho e^{i\frac{k\pi}4}</math>, <math>k=0,1,\dots,7</math> and initial radius <math>\rho</math> (originally <math>\frac56\rho</math>, which is slightly too small to cover the full annulus).
 
If after some recursions a small disk is found that contains only one root, this root is further approximated using [[Newton's method]] and then the polynomial is deflated by splitting off the corresponding linear factor. After that, the whole procedure is restarted.
 
 
===Schur transformation of polynomials===
===Schur-Cohn algorithm===
Consider, as before, a polynomial with complex coefficients
This [[algorithm]] allows to find the distribution of the zeros of a complex polynomial with respect to the [[unit circle]] in the complex plane.
<ref>{{cite journal |last1=Cohn |first1=A |title="Uber die Anzahl der Wurzeln einer algebraischen Gleichung in einem Kreise. |journal=Math.Z. |date=1922 |volume=14 |page=110-148 |doi=10.1007/BF01215894}}</ref>
<ref name="Henrici">{{cite book |last1=Henrici |first1=Peter |title=Applied and computational complex analysis. Volume I: Power series- integration-conformal mapping-___location of zeros. |date=1988 |publisher=New York etc.: John Wiley |isbn=0-471-60841-6 |pages=xv + 682 |edition=Repr. of the orig., publ. 1974 by John Wiley \& Sons Ltd., Paperback ed.}}</ref>
<ref>{{cite book |last1=Marden |first1=Morris |title=The geometry of the zeros of a polynomial in a complex variable. |date=1949 |publisher=Mathematical Surveys. No. 3. New York: American Mathematical Society (AMS). |page=p.148 }}</ref>
It is based on two auxiliary polynomials, introduced by Schur.<ref>{{cite journal |last1=Schur |first1=I |title=Über Potenzreihen, die im Innern des Einheitskreises beschränkt sind. |journal=Journal für die reine und angewandte Mathematik (Crelle's Journal) |date=1917 |volume=1917 |issue=147 |page=205-232 |doi=10.1515/crll.1917.147.205 |url=http://gdz.sub.uni-goettingen.de/dms/load/img/?PID=GDZPPN002168510&physid=PHYS_0210}}</ref>
For a complex polynmial <math>p</math> of [[degree of a polynomial|degree]] <math>n</math> its ''reciprocal adjoint polynomial'' <math>p^{*}</math> is defined by <math>p^{*}(z) = z^{n}\overline{p(\bar{z}^{-1})} </math> and its ''Schurtransform'' <math>Tp</math> by
 
:<math>
Tp =\overline{p(0)}p-\overline{p^*(0)}p^*,
p(z)=a_n z^n+\dots+a_1 z+a_0.
</math>
where a bar denotes [[complex conjugate | complex conjugation]].
Denote the reverse conjugate polynomial as <math>p^*(z)=z^n\bar p(z^{-1})=z^n\overline p(\bar z^{-1})</math>.
 
Then the Schur transform <math>Tp</math> of <math>p</math> is the polynomial
 
:<math>
So, if <math> p(z) = a_{n}z^{n}+\cdots+a_{1}z +a_{0} </math> with <math>a_n\neq 0</math>, then
(Tp)(z)=\bar a_0p(z)-a_n p^*(z).
<math> p^{*}(z) = \bar{a}_{0}z^{n}+ \bar{a}_{1}z^{n-1}+\cdots+\bar{a}_{n}</math>,
</math>
with [[polynomial|leading zero-terms]], if any, removed. The [[polynomial|coefficients]] of <math>Tp</math> can therefore be directly expressed in those of <math>p</math> and, since one or more leading coefficients cancel, <math>Tp</math> has lower degree than <math>p</math>. The roots of <math>p</math>, <math>p^{*}</math>, and <math>Tp</math> are related as follows.
Since the highest degree coefficients cancel, <math>\deg Tp<\deg p</math>, and the constant coefficient of <math>Tp</math> is
 
<math>\delta_1=(Tp)(0)=|a_0|^2-|a_n|^2.</math>
 
;Lemma
:IfLet <math>\delta_1>0p</math>, thenbe <math>p</math>a complex polynomial and <math>\delta = (Tp)(0)</math>. have the same number of roots inside the unit disk and on the unit circle.
*The roots of <math>p^{*}</math>, including their [[Multiplicity (mathematics)|multiplicities]], are the images under [[Circle inversion | inversion]] in the unit circle of the non-zero roots of <math>p</math>.
* If <math>\delta \neq 0</math>, then <math>p,\,p^*</math>, and <math>Tp</math> share roots on the unit circle, including their multiplicities.
* If <math>\delta>0</math>, then <math>p</math> and <math>Tp</math> have the same number of roots inside the unit circle.
* If <math>\delta<0</math>, then <math>p^*</math> and <math>Tp</math> have the same number of roots inside the unit circle.
 
:If <math>\delta_1<0</math>, then <math>p</math> has the same number of roots inside the unit disk as <math>Tp</math> (and <math>p^*</math>) has outside, and both have the same number of roots on the unit circle.
 
;Proof
This result is a consequence of [[Rouché's theorem]].
For <math>z\neq 0</math> we have <math>p^*(z)=z^n \overline{p(z/|z|^2)}</math> and, in particular,
<math>|p^*(z)| = |p(z)|</math> for <math>|z|=1</math>.
Also <math>\delta \neq 0</math> implies <math>|p(0)| \neq |p^*(0)|</math>. From this and the definitions above the first two statements follow.
The other two statements are a consequence of [[Rouché's theorem]] applied on the unit circle to the functions <math>\overline{p(0)}p(z)/r(z)</math> and <math>-\overline{p^*(0)}p^*(z)/r(z)</math> , where <math>r</math> is a polynomial that has as its roots the roots of <math>p</math> on the unit circle, with the same multiplicities. □
 
===Schur–Cohn test===
 
For a more accessible representation of the lemma,
Apply the Schur transform repeatedly, <math>T^{k+1}p=T(T^kp)</math>, let ''K'' be the first index with <math>T^{K+1}p=0</math>. Denote <math>d_k=\deg T^kp</math>, <math>d_{k+1}<d_k</math> and <math>\delta_k=(T^kp)(0)</math>.
let <math>n^-_p,n^0_p</math>, and <math>n^+_p</math> denote the number of roots of <math>p</math> inside, on, and outside the unit circle respectively and similarly for <math>Tp</math>.
Moreover let <math>d</math> be the difference in degree of <math>p</math> and <math>Tp</math>. Then the lemma implies that
<math> (n^-_p,\;n^0_p,\;n^+_p)=(n^-_{Tp},\;n^0_{Tp},\;n^+_{Tp}+d)</math> if <math>\delta>0</math> and
<math> (n^-_p,\;n^0_p,\;n^+_p)=(n^+_{Tp}+d,\;n^0_{Tp},\;n^-_{Tp})</math> if <math>\delta<0</math>
(note the interchange of <math>^+</math> and <math>^-</math>).
 
Now consider the sequence of polynomials <math>T^{k}p</math> <math>(k=0,1,\ldots)</math>, where <math>T^0 p=p</math> and <math>T^{k+1}p=T(T^{k}p)</math>.
;Theorem
Application of the foregoing to each pair of consecutive members of this sequence gives the following result.
:If <math>\delta_k>0</math> for all ''k''&nbsp;=&nbsp;1,&nbsp;2,&nbsp;...,&nbsp;''K'', then <math>p</math> has no roots inside the unit disk.
 
:If <math>\delta_{\bar k}<0</math> exactly once, <math>\delta_k>0</math> for <math>k\ne \bar k</math>, then ''p'' has <math>d_{\bar k}</math> roots inside the unit disk.
 
;Theorem[Schur-Cohn test]
The first follows from the root number preserving
propertyLet of<math>p</math> thebe Schura transform.complex Forpolynomial the second,with <math>T^{Tp\barneq0</math> k}p,and \dots,let T^Kp<math>K</math> have no roots insidebe the unitsmallest disknumber orsuch on the unit circle.that <math>T^{\bar kK+1}p=0</math>. hasMoreover let <math>d_{\bar k}delta_k=\deg (T^{\bar k}p)(0)</math> rootsfor outside the unit disk<math>k=1,\ldots,K</math> so thatand <math>T^d_{k}= \bardeg T^{k-1}p</math> and thus alsofor <math>T^{\bar k-2}p=0,\dotsldots,Tp,pK</math> have the same number of roots inside the unit disk.
* All roots of <math>p</math> lie inside the unit circle if and only if
<math>\delta_1<0</math>, <math>\delta_k>0</math> for <math>k=2,\ldots,K</math>, and <math>d_{K}=0</math>.
* All roots of <math>p</math> lie outside the unit circle if and only if
<math>\delta_k>0</math> for <math>k=1,\ldots,K</math> and <math>d_{K}=0</math>.
* If <math>d_{K}=0</math> and if <math>\delta_k < 0 </math> for <math>k=k_0,k_1 \ldots k_m</math> (in increasing order) and <math>\delta_k > 0 </math> otherwise, then <math>p</math> has no roots on the unit circle and the number of roots of <math>p</math> inside the unit circle is
:<math>
\sum_{i=0}^m (-1)^i d_{k_i-1}
</math>.
 
==Variations on the subdivision idea==
 
===Wilf's global bisection algorithm===
 
More generally, the distribution of the roots of a polynomial <math>p</math> with respect to an arbitrary circle in the complex plane, say one with centre <math>c</math> and radius <math>\rho</math>, can be found by application of the Schur-Cohn test to the 'shifted and scaled' polynomial <math>q</math> defined by <math>q(z)=p(c+\rho\,z)</math>.
The aim of this algorithm is to find the roots of a function of one complex variable inside any rectangular region of the function's [[Holomorphic function|holomorphicity]] (''i.e.'', [[Analytic function|analyticity]]).
 
Not every scaling factor is allowed, however, for the Schur-Cohn test can be applied to the polynomial <math>q</math> only if none of the following equalities occur: <math>T^{k}q(0)=0</math> for some <math>k=1,\ldots K</math> or <math>T^{K}q=0</math> while <math>d_K>0</math>. Now, the coefficients of the polynomials <math>T^{k}q</math> are polynomials in <math>\rho</math> and the said equalities result in polynomial equations for <math>\rho</math>, which therefore hold for only finitely many values of <math>\rho</math>. So a suitable scaling factor can always be found, even arbitrarily close to <math>1</math>.
The rectangle in question is quadrisected into four, [[Congruence (geometry)|congruent]] quarter rectangles. For each quarter, the image of the boundary is a curve in the complex plane. The [[argument principle]] is then applied to this path to find the [[winding number]] about the origin. Given that the function is [[Analytic function|analytic]] within each of these quarters, a nonzero [[winding number]] ''N'' (always an integer) identifies ''N'' zeros of the function inside the quarter in question by [[Rouché's theorem]], each zero counted as many times as its [[Multiplicity (mathematics)|multiplicity]].
 
Analogously with the bisection method, the algorithm is then applied recursively to any quarter whose boundary has nonzero winding number to further refine the estimates of the zeros. The recursion is repeated until the zero-containing rectangles are either small enough that their centres give sufficiently accurate zero estimates or, alternatively, that another root-finding algorithm can be applied to the estimates to further refine them.
 
===Lehmer's method===
==References==
{{Cite check|date=June 2011}}
* {{citation
|author= D. H. Lehmer,
|title=A Machine Method for Solving Polynomial Equations
|journal=Journal of the ACM
|volume=8
|number=2
|date=April 1961
|pages=151–162
|doi=10.1145/321062.321064
}}
 
Lehmers method is as follows.
* {{citation
<ref>{{cite journal |last1=Lehmer |first1=D.H. |title=A machine method for solving polynomial equations. |journal=J. Assoc. Comput. Mach. |date=1961 |volume=8 |pages=151-162 |doi=10.1145/321062.321064}}</ref>
|author=Herbert S. Wilf
For a given complex polynomial <math>p</math>, with the Schur-Cohn test a circular disk can be found large enough to contain all roots of <math>p</math>. Next this disk can be covered with a set of overlapping smaller disks, one of them placed concentrically and the remaining ones evenly spread over the annulus yet to be covered. From this set, using the test again, disks containing no root of <math>p</math> can be removed. With each of the remaining disks this procedure of covering and removal can be repeated and so any number of times, resulting in a set of arbitrarily small disks that together contain all roots of <math>p</math>.
|title=A Global Bisection Algorithm for Computing the Zeros of Polynomials in the Complex Plane
|journal=Journal of the ACM
|year=1978
|volume=25
|number=3
|doi=10.1145/322077.322084
}}
 
The merits of the method are that it consists of repetition of a single procedure and that all roots are found simultaneously, whether they are real or complex, single, multiple or clustered. Also deflation, i.e. removal of roots allready found, is not needed and every test starts with the full-precision, original polynomial. Also, remarkably, this polynomial has never to be evaluared.
* {{citation
|author=Acton, Forman S.
|title=Numerical Methods That Work
|year=1970
|publication-date=1990
|edition=corrected
|place=Washington
|publisher=[[Mathematical Association of America]]
|chapter=Chapter 7
|isbn=0-88385-450-3}}.
 
However, the smaller the disks become, the more the coefficients of the corresponding 'scaled' polynomials will differ in relative magnitude. This may cause overflow or underflow of computer computations, thus limiting the radii of the disks from below and thereby the precision of the computed roots.
* {{citation
<ref name="Henrici"></ref>
|author1=W. H. Press
<ref>{{cite journal |last1=Stewart |first1=G.W.III |title=On Lehmer's method for finding the zeros of a polynomial. |journal=Math. Comput. |date=1969 |volume=23 |pages=829-835 |doi=10.2307/2004970}}</ref>.
|author2=B. P. Flannery
To avoid extreme scaling, or just for the sake of efficiency, one may start with testing a number of concentric disks for the number of included zeros and thus reduce the region where zeros occur to a number of narrow , concentric annuli. Repeating this procedure with another centre and combining the results, the said region becomes the union of intersections of such annuli.
|author3=S. A. Teukolsky
<ref>{{cite journal |last1=Loewenthal |first1=Dan |title=Improvement on the Lehmer-Schur root detection method. |journal=J. Comput. Phys. |date=1993 |volume=109 |issue=2 |pages=164-168 |doi=10.1006/jcph.1993.1209}}</ref>
|author4=W. T. Vetterling
Finally, when a small disk is found that contains a single root, that root may be further approximated using other methods, e.g. [[Newton's method]].
|title=[[Numerical Recipes|Numerical Recipes in C: The Art of Scientific Computing]]
|publisher=Cambridge University Press
|year=1992
|chapter=9.5 Roots of Polynomials
|page=369
|ISBN=0-521-43108-5
|chapter-url=http://www.nrbook.com/a/bookcpdf.html
}}
 
==External linksNotes==
* [https://svn.ece.lsu.edu/svn/dmk/trunk/benchmarks/simtools/scpu2k/gap/polystff.g GAP Library, Lehrstuhl D fuer Mathematik, RWTH Aachen, Germany]
* [[Jan van Leeuwen]] (January 1979): [http://www.cs.uu.nl/research/techreps/repo/CS-1979/1979-01.pdf "On program efficiency and algebraic complexity"] (or: how to compute the Schur transform of a complex polynomial). technical report RUU-CS-79~1. Department of Computer Science, University of Utrecht
 
{{DEFAULTSORT:Lehmer-Schur Algorithm}}