Division algorithm: Difference between revisions

Content deleted Content added
Formatting
Citation bot (talk | contribs)
Added bibcode. Removed URL that duplicated identifier. Removed access-date with no URL. Removed parameters. | Use this bot. Report bugs. | Suggested by Headbomb | Linked from Wikipedia:WikiProject_Academic_Journals/Journals_cited_by_Wikipedia/Sandbox | #UCB_webform_linked 709/1032
 
(8 intermediate revisions by 5 users not shown)
Line 166:
Following this algorithm, the quotient is in a non-standard form consisting of digits of −1 and +1. This form needs to be converted to binary to form the final quotient. Example:
{| border="0" cellpadding="0"
|colspan=2 | Convert the following quotient to the digit set {0,1}:
|-
| colspan="2" | Convert the following quotient to the digit set {0,1}:
|Start: || <math>Q = 111\bar{1}1\bar{1}1\bar{1}</math>
|-
|1. Form the positive termStart: || <math>PQ = 11101010111\,bar{1}1\bar{1}1\bar{1}</math>
|-
|2 1. MaskForm the negativepositive term:{{NoteTag|Signed binary notation with [[ones' complement]] without [[two's complement]].}} || <math>MP = 0001010111101010\,</math>
|-
| 2. Mask the negative term:{{NoteTag|Signed binary notation with [[ones' complement]] without [[two's complement]].}} || <math>M = 00010101\,</math>
|3. Subtract: <math>P - M</math>: ||<math>Q = 11010101\,</math>
|-
| 3. Subtract: <math>P - M</math>: || <math>Q = 11010101\,</math>
|colspan=2 | {{reflist|group=note}}
|-
| colspan="2" | {{reflist|group=note}}
|}
 
Line 195 ⟶ 196:
 
==={{anchor|SRT}}SRT division===
SRT division is a popular method for division in many [[microprocessor]] implementations.<ref>{{cite tech report |url=http://pages.hmc.edu/harris/research/srtlong.pdf |title=SRT Division: Architectures, Models, and Implementations |first1=David L. |last1=Harris |first2=Stuart F. |last2=Oberman |first3=Mark A. |last3=Horowitz |publisher=Stanford University |date=9 September 1998 |access-date=23 December 2016 |archive-date=24 December 2016 |archive-url=https://web.archive.org/web/20161224030439/http://pages.hmc.edu/harris/research/srtlong.pdf |url-status=live }}</ref><ref>{{cite journal |url=https://ieeexplore.ieee.org/document/614875 |title=SRT Division Algorithms as Dynamical Systems |first1=Mark |last1=McCann |first2=Nicholas |last2=Pippenger |journal=SIAM Journal on Computing |volume=34 |issue=6 |pages=1279–1301 |year=2005 |doi=10.1137/S009753970444106X |hdl=2429/12179 |citeseerx=10.1.1.72.6993 |access-date=2022-08-24 |archive-date=2022-08-24 |archive-url=https://web.archive.org/web/20220824213238/https://ieeexplore.ieee.org/document/614875 |url-status=live }}</ref> The algorithm is named after D. W. Sweeney of [[IBM]], James E. Robertson of [[University of Illinois]], and [[K. D. Tocher]] of [[Imperial College London]]. They all developed the algorithm independently at approximately the same time (published in February 1957, September 1958, and January 1958 respectively).<ref>{{Citation |title=High speed arithmetic in a parallel device |last1=Cocke |first1=John |pages=20 |url=https://www.computerhistory.org/collections/catalog/102632302 |publication-date=11 February 1957 |last2=Sweeney |first2=D.W. |date=11 February 1957 |type=Company Memo |publication-place=IBM |access-date=24 August 2022 |archive-date=24 August 2022 |archive-url=https://web.archive.org/web/20220824212341/https://www.computerhistory.org/collections/catalog/102632302 |url-status=live }}</ref><ref>{{Cite journal |title=A New Class of Digital Division Methods |journal=IRE Transactions on Electronic Computers |url=https://ieeexplore.ieee.org/document/5222579 |last=Robertson |first=James |date=1958-09-01 |volume=EC-7 |issue=3 |pages=218–222 |publisher=IEEE |doi=10.1109/TEC.1958.5222579 |hdl=2027/uiuo.ark:/13960/t0gt7529c |hdl-access=free |access-date=2022-08-24 |archive-date=2022-08-24 |archive-url=https://web.archive.org/web/20220824213239/https://ieeexplore.ieee.org/document/5222579 |url-status=live |url-access=subscription }}</ref><ref>{{Cite journal |journal=The Quarterly Journal of Mechanics and Applied Mathematics |url=https://academic.oup.com/qjmam/article-abstract/11/3/364/1883426 |last=Tocher |first=K.D. |title=Techniques of Multiplication and Division for Automatic Binary Computers |date=1958-01-01 |issue=3 |volume=11 |pages=364–384 |doi=10.1093/qjmam/11.3.364 |access-date=2022-08-24 |archive-date=2022-08-24 |archive-url=https://web.archive.org/web/20220824214400/https://academic.oup.com/qjmam/article-abstract/11/3/364/1883426 |url-status=live |url-access=subscription }}</ref>
 
SRT division is similar to non-restoring division, but it uses a [[lookup table]] based on the dividend and the divisor to determine each quotient digit.
Line 206 ⟶ 207:
 
== Fast division methods ==
 
=== Newton–Raphson division ===
Newton–Raphson uses [[Newton's method]] to find the [[Multiplicative inverse|reciprocal]] of <math>D</math> and multiply that reciprocal by <math>N</math> to find the {{nowrap|final quotient <math>Q</math>.}}
Line 248:
:<math>P = -2^S \log_2 \varepsilon_0 - 1 = 2^S \log_2(1/\varepsilon_0) - 1</math>
binary places. Typical values are:
 
{| class="wikitable" style="text-align: right;"
|+ Binary digits of reciprocal accuracy
|-
! rowspan="2" | <math>\varepsilon_0</math> || colspan="5" | Iterations
|-
! 0 || 1 || 2 || 3 || 4
Line 268 ⟶ 270:
| {{round|{{#expr:16*(ln99/ln2) - 1}}|2}}
|}
 
A quadratic initial estimate plus two iterations is accurate enough for IEEE [[single precision]], but three iterations are marginal for [[double precision]]. A linear initial estimate plus four iterations is sufficient for both double and [[extended precision|double extended]] formats.
 
Line 273 ⟶ 276:
The following computes the quotient of {{var|N}} and {{var|D}} with a precision of {{var|P}} binary places:
 
<div style="font-family: monospace, monospace;">
<pre<includeonly />>
Express D as M × 2<sup>e</sup> where 1 &le; M < 2 (standard floating point representation)<br>
D' := D / 2<sup>e+1</sup> ''// scale between 0.5 and 1, can be performed with bit shift / exponent subtraction''<br>
N' := N / 2<sup>e+1</sup><br>
X := 48/17 − 32/17 × D' ''// precompute constants with same precision as D''<br>
{{nowrap|'''repeat''' <math>\left \lceil \log_2 \frac{P + 1}{\log_2 17} \right \rceil \,</math> '''times'''}} ''// can be precomputed based on fixed P''<br>
X := X + X × (1 - D' × X)<br>
'''end'''<br>
'''return''' N' × X<br>
</prediv>
 
For example, for a double-precision floating-point division, this method uses 10 multiplies, 9 adds, and 2 shifts.
Line 312 ⟶ 315:
:<math>P = -3^S \log_2 \varepsilon_0 - 1 = 3^S \log_2(1/\varepsilon_0) - 1</math>
binary places. Typical values are:
{| class="wikitable" style="text-align: right;"
|+ Bits of reciprocal accuracy
|-
! rowspan="2" | <math>\varepsilon_0</math> || colspan="4" | Iterations
|-
! 0 || 1 || 2 || 3
Line 330 ⟶ 334:
| {{round|{{#expr:27*(ln99/ln2) - 1}}|2}}
|}
 
A quadratic initial estimate plus two cubic iterations provides ample precision for an IEEE double-precision result. It is also possible to use a mixture of quadratic and cubic iterations.
 
Line 342 ⟶ 347:
Using higher degree polynomials in either the initialization or the iteration results in a degradation of performance because the extra multiplications required would be better spent on doing more iterations.{{cn|date=January 2025}}
 
=== {{anchor|AEGP}}Goldschmidt division ===
Goldschmidt division<ref>{{cite thesis |first=Robert E. |last=Goldschmidt |title=Applications of Division by Convergence |series=M.Sc. dissertation |publisher=M.I.T. |date=1964 |oclc=34136725 |url=http://dspace.mit.edu/bitstream/handle/1721.1/11113/34136725-MIT.pdf |access-date=2015-09-15 |archive-date=2015-12-10 |archive-url=https://web.archive.org/web/20151210223340/http://dspace.mit.edu/bitstream/handle/1721.1/11113/34136725-MIT.pdf |url-status=live }}</ref> (after Robert Elliott Goldschmidt)<ref>{{cite journal |title=Authors |url=https://ieeexplore.ieee.org/document/5392026 |journal=IBM Journal of Research and Development | year=1967 | volume=11 | pages=125–127 | doi=10.1147/rd.111.0125 |archive-url=https://web.archive.org/web/20180718114413/https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=5392026 |archive-date=18 July 2018|url-access=subscription }}</ref> uses an iterative process of repeatedly multiplying both the dividend and divisor by a common factor ''F''<sub>''i''</sub>, chosen such that the divisor converges to 1. This causes the dividend to converge to the sought quotient ''Q'':
 
Line 394 ⟶ 399:
It is not necessary to use specifically (1/''D''); any value (''X''/''Y'') that reduces to (1/''D'') may be used. For example, for division by 3, the factors 1/3, 2/6, 3/9, or 194/582 could be used. Consequently, if ''Y'' were a power of two the division step would reduce to a fast right bit shift. The effect of calculating ''N''/''D'' as (''N''·''X'')/''Y'' replaces a division with a multiply and a shift. Note that the parentheses are important, as ''N''·(''X''/''Y'') will evaluate to zero.
 
However, unless ''D'' itself is a power of two, there is no ''X'' and ''Y'' that satisfies the conditions above. Fortunately, (''N''·''X'')/''Y'' gives exactly the same result as ''N''/''D'' in integer arithmetic even when (''X''/''Y'') is not exactly equal to 1/''D'', but "close enough" that the error introduced by the approximation is in the bits that are discarded by the shift operation.<ref>{{cite journal |title=Division by Invariant Integers using Multiplication |first1=Torbjörn |last1=Granlund |first2=Peter L. |last2=Montgomery |journal=SIGPLAN Notices |volume=29 |issue=6 |date=June 1994 |pages=61–72 |doi=10.1145/773473.178249 |url=http://gmplib.org/~tege/divcnst-pldi94.pdf |citeseerx=10.1.1.1.2556 |access-date=2015-12-08 |archive-date=2019-06-06 |archive-url=https://web.archive.org/web/20190606211506/https://gmplib.org/~tege/divcnst-pldi94.pdf |url-status=live }}</ref><ref>{{cite journal |title=Improved Division by Invariant Integers |first1=Niels |last1=Möller |first2=Torbjörn |last2=Granlund |journal=IEEE Transactions on Computers |date=February 2011 |volume=60 |issue=2 |pages=165–175 |doi=10.1109/TC.2010.143 |bibcode=2011ITCmp..60..165M |s2cid=13347152 |url=http://gmplib.org/~tege/division-paper.pdf |access-date=2015-12-08 |archive-date=2015-12-22 |archive-url=https://web.archive.org/web/20151222160554/https://gmplib.org/~tege/division-paper.pdf |url-status=live }}</ref><ref>ridiculous_fish.
[http://ridiculousfish.com/files/faster_unsigned_division_by_constants.pdf "Labor of Division (Episode III): Faster Unsigned Division by Constants"] {{Webarchive|url=https://web.archive.org/web/20220108225258/http://ridiculousfish.com/files/faster_unsigned_division_by_constants.pdf |date=2022-01-08 }}.
2011.</ref> [[Barrett reduction]] uses powers of 2 for the value of ''Y'' to make division by ''Y'' a simple right shift.{{efn|1=Modern [[compilers]] commonly perform this integer multiply-and-shift optimization; for a constant only known at run-time, however, the program must implement the optimization itself.<ref>{{cite web |last1=ridiculous_fish |title=libdivide, optimized integer division |url=https://libdivide.com/ |access-date=6 July 2021 |archive-date=23 November 2021 |archive-url=https://web.archive.org/web/20211123015446/https://libdivide.com/ |url-status=live }}</ref>}}
Line 424 ⟶ 429:
 
== See also ==
* [[Galley division]]
* [[Multiplication algorithm]]
* [[Pentium FDIV bug]]
 
== Notes ==