Content deleted Content added
m Bot: http → https |
|||
(61 intermediate revisions by 36 users not shown) | |||
Line 1:
{{short description|Algorithm to multiply two numbers}}
{{Use dmy dates|date=May 2019|cs1-dates=y}}
A '''multiplication algorithm''' is an [[algorithm]] (or method) to [[multiplication|multiply]] two numbers. Depending on the size of the numbers, different algorithms are more efficient than others.
The oldest and simplest method, known since [[Ancient history|antiquity]] as '''long multiplication''' or '''grade-school multiplication''', consists of multiplying every digit in the first number by every digit in the second and adding the results. This has a [[time complexity]] of <math>O(n^2)</math>, where ''n'' is the number of digits. When done by hand, this may also be reframed as [[grid method multiplication]] or [[lattice multiplication]]. In software, this may be called "shift and add" due to [[bitshifts]] and addition being the only two operations needed.
In 1960, [[Anatoly Karatsuba]] discovered [[Karatsuba multiplication]], unleashing a flood of research into fast multiplication algorithms. This method uses three multiplications rather than four to multiply two two-digit numbers. (A variant of this can also be used to multiply [[complex numbers]] quickly.) Done [[recursively]], this has a time complexity of <math>O(n^{\log_2 3})</math>. Splitting numbers into more than two parts results in [[Toom-Cook multiplication]]; for example, using three parts results in the '''Toom-3''' algorithm. Using many parts can set the exponent arbitrarily close to 1, but the constant factor also grows, making it impractical.
In 1968, the [[Schönhage-Strassen algorithm]], which makes use of a [[Fourier transform]] over a [[Modulus (modular arithmetic)|modulus]], was discovered. It has a time complexity of <math>O(n\log n\log\log n)</math>. In 2007, [[Martin Fürer]] proposed an algorithm with complexity <math>O(n\log n 2^{\Theta(\log^* n)})</math>. In 2014, Harvey, [[Joris van der Hoeven]], and Lecerf proposed one with complexity <math>O(n\log n 2^{3\log^* n})</math>, thus making the [[implicit constant]] explicit; this was improved to <math>O(n\log n 2^{2\log^* n})</math> in 2018. Lastly, in 2019, Harvey and van der Hoeven came up with a [[galactic algorithm]] with complexity <math>O(n\log n)</math>. This matches a guess by Schönhage and Strassen that this would be the optimal bound, although this remains a [[conjecture]] today.
Integer multiplication algorithms can also be used to multiply polynomials by means of the method of [[Kronecker substitution]].
==Long multiplication==
Line 24 ⟶ 32:
====Other notations====
In some countries such as [[Germany]], the above multiplication is depicted similarly but with the original product kept horizontal and computation starting with the first digit of the multiplier:<ref>{{Cite web |title=Multiplication |url=
23958233 · 5830
Line 31 ⟶ 39:
191665864
71874699
00000000
———————————————
139676498390
Line 57 ⟶ 65:
Computers initially used a very similar algorithm to long multiplication in base 2, but modern processors have optimized circuitry for fast multiplications using more efficient algorithms, at the price of a more complex hardware realization.{{cn|date=March 2022}} In base two, long multiplication is sometimes called '''"shift and add"''', because the algorithm simplifies and just consists of shifting left (multiplying by powers of two) and adding. Most currently available microprocessors implement this or other similar algorithms (such as [[Booth encoding]]) for various integer and floating-point sizes in [[hardware multiplier]]s or in [[microcode]].{{cn|date=March 2022}}
On currently available processors, a bit-wise shift instruction is usually (but not always) faster than a multiply instruction and can be used to multiply (shift left) and divide (shift right) by powers of two. Multiplication by a constant and [[division algorithm#Division by a constant|division by a constant]] can be implemented using a sequence of shifts and adds or subtracts. For example, there are several ways to multiply by 10 using only bit-shift and addition.
<syntaxhighlight lang="php">
((x << 2) + x) << 1 # Here 10*x is computed as (x*2^2 + x)*2
Line 96 ⟶ 104:
|}
followed by addition to obtain 442, either in a single sum (see right), or through forming the row-by-row totals
: (300 + 40) + (90 + 12) = 340 + 102 = 442. This calculation approach (though not necessarily with the explicit grid arrangement) is also known as the [[partial products algorithm]]. Its essence is the calculation of the simple multiplications separately, with all addition being left to the final gathering-up stage.
Line 227 ⟶ 236:
</math>
In the case where <math>x</math> and <math>y</math> are integers, we have that
:<math> (x+y)^2 \equiv (x-y)^2 \bmod 4</math>
because <math>x+y</math> and <math>x-y</math> are either both even or both odd. This means that
:<math>\begin{align}
xy &= \frac14(x+y)^2 - \frac14(x-y)^2 \\
\end{align}</math>
and it's sufficient to (pre-)compute the integral part of squares divided by 4 like in the following example.
====Examples ====
Line 273 ⟶ 259:
====History of quarter square multiplication====
In prehistoric time, quarter square multiplication involved [[Floor and ceiling functions|floor function]]; that some sources<ref>{{citation |title= Quarter Tables Revisited: Earlier Tables, Division of Labor in Table Construction, and Later Implementations in Analog Computers |last=McFarland |first=David|url=
Antoine Voisin published a table of quarter squares from 1 to 1000 in 1817 as an aid in multiplication. A larger table of quarter squares from 1 to 100000 was published by Samuel Laundy in 1856,<ref>{{Citation |title=Reviews |journal=The Civil Engineer and Architect's Journal |year=1857 |pages=54–55 |url=https://books.google.com/books?id=gcNAAAAAcAAJ&pg=PA54 |postscript=.}}</ref> and a table from 1 to 200000 by Joseph Blater in 1888.<ref>{{Citation|title=Multiplying with quarter squares |first=Neville |last=Holmes| journal=The Mathematical Gazette |volume=87 |issue=509 |year=2003 |pages=296–299 |jstor=3621048|postscript=.|doi=10.1017/S0025557200172778 |s2cid=125040256 }}</ref>
Line 287 ⟶ 273:
{{unsolved|computer science|What is the fastest algorithm for multiplication of two <math>n</math>-digit numbers?}}
A line of research in [[theoretical computer science]] is about the number of single-bit arithmetic operations necessary to multiply two <math>n</math>-bit integers. This is known as the [[computational complexity]] of multiplication. Usual algorithms done by hand have asymptotic complexity of <math>O(n^2)</math>, but in 1960 [[Anatoly Karatsuba]] discovered that better complexity was possible (with the [[Karatsuba algorithm]]).<ref>{{cite web | url=https://youtube.com/watch?v=AMl6EJHfUWo | title= The Genius Way Computers Multiply Big Numbers| website=[[YouTube]]| date= 2 January 2025}}</ref>
Currently, the algorithm with the best computational complexity is a 2019 algorithm of [[David Harvey (mathematician)|David Harvey]] and [[Joris van der Hoeven]], which uses the strategies of using [[number-theoretic transform]]s introduced with the [[Schönhage–Strassen algorithm]] to multiply integers using only <math>O(n\log n)</math> operations.<ref>{{cite journal | last1 = Harvey | first1 = David | last2 = van der Hoeven | first2 = Joris | author2-link = Joris van der Hoeven | doi = 10.4007/annals.2021.193.2.4 | issue = 2 | journal = [[Annals of Mathematics]] | mr = 4224716 | pages = 563–617 | series = Second Series | title = Integer multiplication in time <math>O(n \log n)</math> | volume = 193 | year = 2021| s2cid = 109934776 | url = https://hal.archives-ouvertes.fr/hal-02070778v2/file/nlogn.pdf }}</ref> This is conjectured to be the best possible algorithm, but lower bounds of <math>\Omega(n\log n)</math> are not known.
Line 294 ⟶ 280:
{{Main|Karatsuba algorithm}}
Karatsuba multiplication is an O(''n''<sup>log<sub>2</sub>3</sup>) ≈ O(''n''<sup>1.585</sup>) divide and
By rewriting the formula, one makes it possible to do sub calculations / recursion. By doing recursion, one can solve this in a fast manner.
Let <math>x</math> and <math>y</math> be represented as <math>n</math>-digit strings in some base <math>B</math>. For any positive integer <math>m</math> less than <math>n</math>, one can write the two given numbers as
Line 341 ⟶ 326:
By exploring patterns after expansion, one see following:
<math display="block">\begin{alignat}{5} (x_1 B^{ m} + x_0) (y_1 B^{m} + y_0) (z_1 B^{ m} + z_0) (a_1 B^{ m} + a_0) &=
\end{alignat}</math>
<math> 2^{N+1}-1 </math>,
If we express this in fewer terms, we get:
<math
</math>, where <math> c(i,j) </math> means digit in number i at position j. Notice that <math> c(i,j) \in \{0,1\} </math>
<math display="block">
\begin{align}
z_{0} &= \prod_{j=1}^N x_{j,0}
\\
z_{N} &= \prod_{j=1}^N x_{j,1}▼
\\
▲z_{N} = \prod_{j=1}^N x_{j,1}
z_{N-1} &= \prod_{j=1}^N (x_{j,0} + x_{j,1}) - \sum_{i \ne N-1}^{N} z_i▼
\end{align}
▲z_{N-1} = \prod_{j=1}^N (x_{j,0} + x_{j,1}) - \sum_{i \ne N-1}^{N} z_i
</math>
Line 381 ⟶ 363:
{{Main|Schönhage–Strassen algorithm}}
[[File:Integer multiplication by FFT.svg|thumb|350px|Demonstration of multiplying 1234 × 5678 = 7006652 using fast Fourier transforms (FFTs). [[Number-theoretic transform]]s in the integers modulo 337 are used, selecting 85 as an 8th root of unity. Base 10 is used in place of base 2<sup>''w''</sup> for illustrative purposes.]]
Every number in base B, can be written as a polynomial:
<math display="block"> X = \sum_{i=0}^N {x_iB^i} </math>
Furthermore, multiplication of two numbers could be thought of as a product of two polynomials:
<math display="block">XY = (\sum_{i=0}^N {x_iB^i})(\sum_{j=0}^N {y_iB^j}) </math>
Because,for <math> B^k </math>: <math>c_k =\sum_{(i,j):i+j=k}
we have a convolution.
By using fft (fast fourier transformation) with convolution rule, we can get
<math display="block"> \hat{f}(a * b) = \hat{f}(\sum_{i=0}^k {a_ib_{k-i}}) = \
is the corresponding coefficient in fourier space. This can also be written as: <math>\mathrm{fft}(a * b) = \mathrm{fft}(a) \bullet \mathrm{fft}(b)</math>.
We have the same coefficient due to linearity under fourier transformation, and because these polynomials
<math> \hat{f}(X * Y) = \hat{X}</math> • <math>\hat{Y} </math>▼
only consist of one unique term per coefficient:
<math display="block"> \hat{f}(x^n) = \left(\frac{i}{2\pi}\right)^n \delta^{(n)} </math> and
<math display="block"> \hat{f}(a\, X(\xi) + b\, Y(\xi)) = a\, \hat{X}(\xi) + b\, \hat{Y}(\xi)</math>
▲* Convolution rule: <math> \hat{f}(X * Y) = \ \hat{
We have reduced our convolution problem
to product problem, through fft.
By finding ifft (polynomial interpolation), for each <math>c_k </math>, one get the desired
Algorithm uses divide and conquer strategy, to divide problem to subproblems.
It has a time complexity of O(''n'' log(''n'') log(log(''n''))).
Line 413 ⟶ 400:
==== History ====
=== Further improvements ===
In 2007 the [[asymptotic complexity]] of integer multiplication was improved by the Swiss mathematician [[Martin Fürer]] of Pennsylvania State University to
In 2014, Harvey, [[Joris van der Hoeven]] and Lecerf<ref>{{cite journal
Line 432 ⟶ 419:
| year = 2016}}</ref> gave a new algorithm that achieves a running time of <math>O(n\log n \cdot 2^{3\log^* n})</math>, making explicit the implied constant in the <math>O(\log^* n)</math> exponent. They also proposed a variant of their algorithm which achieves <math>O(n\log n \cdot 2^{2\log^* n})</math> but whose validity relies on standard conjectures about the distribution of [[Mersenne prime]]s. In 2016, Covanov and Thomé proposed an integer multiplication algorithm based on a generalization of [[Fermat primes]] that conjecturally achieves a complexity bound of <math>O(n\log n \cdot 2^{2\log^* n})</math>. This matches the 2015 conditional result of Harvey, van der Hoeven, and Lecerf but uses a different algorithm and relies on a different conjecture.<ref>{{cite journal |first1=Svyatoslav |last1=Covanov |first2=Emmanuel |last2=Thomé |title=Fast Integer Multiplication Using Generalized Fermat Primes |journal=[[Mathematics of Computation|Math. Comp.]] |volume=88 |year=2019 |issue=317 |pages=1449–1477 |doi=10.1090/mcom/3367 |arxiv=1502.02800 |s2cid=67790860 }}</ref> In 2018, Harvey and van der Hoeven used an approach based on the existence of short lattice vectors guaranteed by [[Minkowski's theorem]] to prove an unconditional complexity bound of <math>O(n\log n \cdot 2^{2\log^* n})</math>.<ref>{{cite journal |first1=D. |last1=Harvey |first2=J. |last2=van der Hoeven |year=2019 |title=Faster integer multiplication using short lattice vectors |journal=The Open Book Series |volume=2 |pages=293–310 |doi=10.2140/obs.2019.2.293 |arxiv=1802.07932|s2cid=3464567 }}</ref>
In March 2019, [[David Harvey (mathematician)|David Harvey]] and [[Joris van der Hoeven]] announced their discovery of an {{nowrap|''O''(''n'' log ''n'')}} multiplication algorithm.<ref>{{Cite magazine|url=https://www.quantamagazine.org/mathematicians-discover-the-perfect-way-to-multiply-20190411/|title=Mathematicians Discover the Perfect Way to Multiply|last=Hartnett|first=Kevin|magazine=Quanta Magazine|date=11 April 2019|access-date=2019-05-03}}</ref> It was published in the ''[[Annals of Mathematics]]'' in 2021.<ref>{{cite journal | last1 = Harvey | first1 = David | last2 = van der Hoeven | first2 = Joris | author2-link = Joris van der Hoeven | doi = 10.4007/annals.2021.193.2.4 | issue = 2 | journal = [[Annals of Mathematics]] | mr = 4224716 | pages = 563–617 | series = Second Series | title = Integer multiplication in time <math>O(n \log n)</math> | volume = 193 | year = 2021| s2cid = 109934776 | url = https://hal.archives-ouvertes.fr/hal-02070778v2/file/nlogn.pdf }}</ref> Because Schönhage and Strassen predicted that ''n'' log(''n'') is the
===Lower bounds===
There is a trivial lower bound of [[Big O notation#Family of Bachmann–Landau notations|Ω]](''n'') for multiplying two ''n''-bit numbers on a single processor; no matching algorithm (on conventional machines, that is on Turing equivalent machines) nor any sharper lower bound is known. The [[Hartmanis–Stearns conjecture]] would imply that <math>O(n)</math> cannot be achieved. Multiplication lies outside of [[ACC0|AC<sup>0</sup>[''p'']]] for any prime ''p'', meaning there is no family of constant-depth, polynomial (or even subexponential) size circuits using AND, OR, NOT, and MOD<sub>''p''</sub> gates that can compute a product. This follows from a constant-depth reduction of MOD<sub>''q''</sub> to multiplication.<ref>{{cite book |first1=Sanjeev |last1=Arora |first2=Boaz |last2=Barak |title=Computational Complexity: A Modern Approach |publisher=Cambridge University Press |date=2009 |isbn=978-0-521-42426-4 |url={{GBurl|8Wjqvsoo48MC|pg=PR7}}}}</ref> Lower bounds for multiplication are also known for some classes of [[branching program]]s.<ref>{{cite journal |first1=F. |last1=Ablayev |first2=M. |last2=Karpinski |title=A lower bound for integer multiplication on randomized ordered read-once branching programs |journal=Information and Computation |volume=186 |issue=1 |pages=78–89 |date=2003 |doi=10.1016/S0890-5401(03)00118-4 |url=https://core.ac.uk/download/pdf/82445954.pdf}}</ref>
==Complex number multiplication==
Line 465 ⟶ 452:
This algorithm uses only three multiplications, rather than four, and five additions or subtractions rather than two. If a multiply is more expensive than three adds or subtracts, as when calculating by hand, then there is a gain in speed. On modern computers a multiply and an add can take about the same time so there may be no speed gain. There is a trade-off in that there may be some loss of precision when using floating point.
For [[fast Fourier transform]]s (FFTs) (or any [[Linear map|linear transformation]]) the complex multiplies are by constant coefficients ''c'' + ''di'' (called [[twiddle factor]]s in FFTs), in which case two of the additions (''d''−''c'' and ''c''+''d'') can be precomputed. Hence, only three multiplies and three adds are required.<ref>{{cite journal |first1=P. |last1=Duhamel |first2=M. |last2=Vetterli |title=Fast Fourier transforms: A tutorial review and a state of the art |journal=Signal Processing |volume=19 |issue=4 |pages=259–299 See Section 4.1 |date=1990 |doi=10.1016/0165-1684(90)90158-U |bibcode=1990SigPr..19..259D |url=https://core.ac.uk/download/pdf/147907050.pdf}}</ref> However, trading off a multiplication for an addition in this way may no longer be beneficial with modern [[floating-point unit]]s.<ref>{{cite journal |first1=S.G. |last1=Johnson |first2=M. |last2=Frigo |title=A modified split-radix FFT with fewer arithmetic operations |journal=IEEE Trans. Signal Process. |volume=55 |issue= 1|pages=111–9 See Section IV |date=2007 |doi=10.1109/TSP.2006.882087 |bibcode=2007ITSP...55..111J |s2cid=14772428 |url=https://www.fftw.org/newsplit.pdf }}</ref>
==Polynomial multiplication==
Line 509 ⟶ 496:
* [[Horner scheme]] for evaluating of a polynomial
* [[Logarithm]]
* [[Matrix multiplication algorithm]]
* [[Mental calculation]]
* [[Number-theoretic transform]]
Line 528 ⟶ 516:
===Basic arithmetic===
* [
* [
* [
===Advanced algorithms===
* [
{{Number-theoretic algorithms}}
|