Division algorithm

This squaring of the error at each iteration step{{snd}} the so-called [[Newton's method#Practical considerations|quadratic convergence]] of Newton–Raphson's method{{snd}} has the effect that the number of correct digits in the result roughly ''doubles for every iteration'', a property that becomes extremely valuable when the numbers involved have many digits (e.g. in the large integer ___domain). But it also means that the initial convergence of the method can be comparatively slow, especially if the initial estimate <math>X_0</math> is poorly chosen.
 
====Initial estimate====
For the subproblem of choosing an initial estimate <math>X_0</math>, it is convenient to apply a bit-shift to the divisor ''D'' to scale it so that 0.5&nbsp;≤&nbsp;''D''&nbsp;≤&nbsp;1. Applying the same bit-shift to the numerator ''N'' ensures the quotient does not change. Once within a bounded range, a simple polynomial [[approximation]] can be used to find an initial estimate.
:<math>X_0 = T_0 + T_1 D \approx \frac{1}{D} \,</math>
 
The linear [[approximation]] with minimum worst-case absolute value of the error on the interval <math>[0.5,1]</math> is:
:<math>X_0 = {48 \over 17} - {32 \over 17} D. \,</math>
The coefficients of the linear approximation <math>T_0 + T_1 D</math> are determined as follows. The absolute value of the error is <math>|\varepsilon_0| = |1 - D(T_0 + T_1 D)|</math>. The minimum of the maximum absolute value of the error is determined by the [[Equioscillation theorem|Chebyshev equioscillation theorem]] applied to <math>F(D) = 1 - D(T_0 + T_1 D)</math>. The local minimum of <math>F(D)</math> occurs when <math>F'(D) = 0</math>, which has solution <math>D = -T_0/(2T_1)</math>. The function at that minimum must be of opposite sign as the function at the endpoints, namely, <math>F(1/2) = F(1) = -F(-T_0/(2T_1))</math>. The two equations in the two unknowns have a unique solution <math>T_0 = 48/17</math> and <math>T_1 = -32/17</math>, and the maximum error is <math>F(1) = 1/17</math>. Using this approximation, the absolute value of the error of the initial value is less than
:<math>\vert \varepsilon_0 \vert \leq {1 \over 17} \approx 0.059 . \,</math>
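As an illustrative check (not part of the article), the error bound of this linear estimate can be verified numerically; the sampling grid below is an arbitrary choice:

```python
# Numerically verify that the linear initial estimate
# X0 = 48/17 - 32/17*D has worst-case error |1 - D*X0| of about 1/17
# on the interval [0.5, 1].
max_err = 0.0
n = 10_000  # number of sample points (arbitrary)
for i in range(n + 1):
    d = 0.5 + 0.5 * i / n
    x0 = 48 / 17 - 32 / 17 * d
    max_err = max(max_err, abs(1 - d * x0))
print(max_err)  # ≈ 1/17 ≈ 0.0588
```

The maximum is attained at both endpoints and at the interior extremum, as the equioscillation theorem requires.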
It is possible to generate a polynomial fit of degree larger than 1, computing the coefficients using the [[Remez algorithm]]. The trade-off is that the initial guess requires more computational cycles, hopefully in exchange for fewer iterations of Newton–Raphson.
 
The best quadratic fit to <math>1/D</math> in the interval is
:<math> X := \frac{140}{33} + D \cdot \left(\frac{-64}{11} + D \cdot \frac{256}{99}\right) .</math>
It is chosen to make the error equal to a re-scaled third order [[Chebyshev polynomial]] of the first kind, and gives an absolute value of the error less than or equal to 1/99. This improvement is equivalent to <math>\log_2(\log 99/\log 17) \approx 0.7</math> Newton–Raphson iterations, at a computational cost of less than one iteration.
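The 1/99 bound of the quadratic fit can be checked the same way (an illustrative sketch, not from the article):

```python
# Numerically verify that the quadratic initial estimate has
# worst-case error |1 - D*X0| of about 1/99 on [0.5, 1].
max_err = 0.0
n = 10_000  # number of sample points (arbitrary)
for i in range(n + 1):
    d = 0.5 + 0.5 * i / n
    x0 = 140 / 33 + d * (-64 / 11 + d * 256 / 99)
    max_err = max(max_err, abs(1 - d * x0))
print(max_err)  # ≈ 1/99 ≈ 0.0101
```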
 
Since for this method the [[rate of convergence|convergence]] is exactly quadratic, it follows that, from an initial error <math>\epsilon_0</math>, <math>S</math> iterations will give an answer accurate to
:<math>P = -2^S \log_2 \epsilon_0 - 1 = 2^S \log_2(1/\epsilon_0) - 1</math>
binary places. Typical values are:
{|class=wikitable style="text-align:right;"
|+ Bits of reciprocal accuracy
!rowspan=2| <math>\epsilon_0</math> ||colspan=5| Iterations
|-
! 0 || 1 || 2 || 3 || 4
|-
! <math>1/17</math>
| {{round|{{#expr: (ln17/ln2) - 1}}|2}}
| {{round|{{#expr: 2*(ln17/ln2) - 1}}|2}}
| {{round|{{#expr: 4*(ln17/ln2) - 1}}|2}}
| {{round|{{#expr: 8*(ln17/ln2) - 1}}|2}}
| {{round|{{#expr:16*(ln17/ln2) - 1}}|2}}
|-
! <math>1/99</math>
| {{round|{{#expr: (ln99/ln2) - 1}}|2}}
| {{round|{{#expr: 2*(ln99/ln2) - 1}}|2}}
| {{round|{{#expr: 4*(ln99/ln2) - 1}}|2}}
| {{round|{{#expr: 8*(ln99/ln2) - 1}}|2}}
| {{round|{{#expr:16*(ln99/ln2) - 1}}|2}}
|}
A quadratic initial estimate plus two iterations is accurate enough for IEEE [[single precision]], but three iterations are marginal for [[double precision]]. A linear initial estimate plus four iterations is sufficient for both double and [[extended precision|double extended]] formats.
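The table values and the precision claims can be reproduced with a small helper (the function name is ours, not from the article):

```python
import math

def correct_bits(eps0, s, order=2):
    """Bits of reciprocal accuracy after s iterations of a method of
    convergence order `order`, starting from initial error eps0,
    per P = order**s * log2(1/eps0) - 1."""
    return order ** s * math.log2(1 / eps0) - 1

print(round(correct_bits(1 / 99, 2), 2))  # quadratic estimate + 2 iterations: > 24 (single)
print(round(correct_bits(1 / 99, 3), 2))  # ~52: marginal against the 53-bit double significand
print(round(correct_bits(1 / 17, 4), 2))  # linear estimate + 4 iterations: > 64 (double extended)
```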
 
====Pseudocode====
For example, for a double-precision floating-point division, this method uses 10 multiplies, 9 adds, and 2 shifts.
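A minimal Python sketch of the method (illustrative only; it assumes ''D'' has already been scaled into [0.5, 1] by the bit-shift described above, uses the linear initial estimate, and applies the standard reciprocal update ''X'' := ''X'' + ''X''(1 − ''DX'')):

```python
def nr_divide(n, d, iterations=4):
    """Approximate n/d by Newton-Raphson reciprocal iteration.

    Assumes d has been pre-scaled into [0.5, 1]; a real
    implementation would apply the same bit-shift to n.
    """
    x = 48 / 17 - 32 / 17 * d      # linear initial estimate of 1/d
    for _ in range(iterations):
        x = x + x * (1 - d * x)    # error is squared on each pass
    return n * x

print(nr_divide(1.0, 0.6))  # ≈ 1.6666..., i.e. 1/0.6
```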
 
====Cubic iteration====
There is an iteration which uses three multiplications to cube the error:
:<math> E := 1 - D \cdot X </math>
:<math> Y := X \cdot E </math>
:<math> X := X + Y + Y \cdot E </math>
The ''Y''&sdot;''E'' term is new; without it, the update <math>X := X + Y = X(2 - DX)</math> is the ordinary quadratic Newton–Raphson step. With the extra term, the new error is <math>1 - D(X + Y + YE) = (1 - DX)^3</math>, so each pass exactly cubes the error.
 
Because <math>\log 3 / \log 2 \approx 1.585 > 3/2</math>, tripling the number of correct bits with three multiplications converges slightly faster, per multiplication, than doubling it with two.
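An illustrative numeric check of the cubing behaviour, starting from the quadratic initial estimate (the divisor value is chosen arbitrarily):

```python
d = 0.7                                        # assumed already scaled into [0.5, 1]
x = 140 / 33 + d * (-64 / 11 + d * 256 / 99)   # quadratic initial estimate of 1/d
for _ in range(3):
    e = 1 - d * x
    y = x * e
    x = x + y + y * e    # three multiplications; new error is e**3
    print(abs(1 - d * x))  # shrinks roughly as |e|**3 each pass
```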
The number of correct bits after <math>S</math> iterations is
:<math>P = -3^S \log_2 \epsilon_0 - 1 = 3^S \log_2(1/\epsilon_0) - 1</math>
binary places. Typical values are:
{|class=wikitable style="text-align:right;"
|+ Bits of reciprocal accuracy
!rowspan=2| <math>\epsilon_0</math> ||colspan=4| Iterations
|-
! 0 || 1 || 2 || 3
|-
! <math>1/17</math>
| {{round|{{#expr: (ln17/ln2) - 1}}|2}}
| {{round|{{#expr: 3*(ln17/ln2) - 1}}|2}}
| {{round|{{#expr: 9*(ln17/ln2) - 1}}|2}}
| {{round|{{#expr:27*(ln17/ln2) - 1}}|2}}
|-
! <math>1/99</math>
| {{round|{{#expr: (ln99/ln2) - 1}}|2}}
| {{round|{{#expr: 3*(ln99/ln2) - 1}}|2}}
| {{round|{{#expr: 9*(ln99/ln2) - 1}}|2}}
| {{round|{{#expr:27*(ln99/ln2) - 1}}|2}}
|}
It is also possible to use a mixture of quadratic and cubic iterations.
 
Using higher degree polynomials in either the initialization or the iteration results in a degradation of performance because the extra multiplications required would be better spent on doing more iterations.{{cn|date=January 2025}}
 
==={{anchor|AEGP}}Goldschmidt division===