{{Short description|Algorithmic runtime requirements for matrix multiplication}}
{{CS1 config|mode=cs1}}
{{unsolved|computer science|What is the fastest algorithm for matrix multiplication?}}
In [[theoretical computer science]], the '''computational complexity of matrix multiplication''' dictates [[Analysis of algorithms|how quickly]] the operation of [[matrix multiplication]] can be performed. [[Matrix multiplication algorithm]]s are a central subroutine in theoretical and [[numerical algorithm|numerical]] algorithms for [[numerical linear algebra]] and [[optimization]], so finding the fastest algorithm for matrix multiplication is of major practical relevance.
}}</ref> The optimal number of field operations needed to multiply two square {{math|''n'' × ''n''}} matrices [[big O notation|up to constant factors]] is still unknown. This is a major open question in [[theoretical computer science]].
{{As of|2024|01}}, the best known bound on the [[Time complexity|asymptotic complexity]] of a matrix multiplication algorithm is {{math|O(''n''<sup>2.371339</sup>)}}.<ref name="adwxxz24">
{{cite arXiv |eprint=2404.16349 |class=cs.DS |first1=Josh |last1=Alman |first2=Ran |last2=Duan |first3=Virginia Vassilevska |last3=Williams |first4=Yinzhan |last4=Xu |first5=Zixuan |last5=Xu |first6=Renfei |last6=Zhou |title=More Asymmetry Yields Faster Matrix Multiplication |year=2024}}</ref> However, this and similar improvements on Strassen's algorithm are not used in practice, because they are [[galactic algorithm]]s: the constant coefficient hidden by the [[big O notation]] is so large that they are only worthwhile for matrices that are too large to handle on present-day computers.<ref>{{cite journal
| last = Iliopoulos
| first = Costas S.
== Simple algorithms ==
If ''A'' and ''B'' are two {{math|''n'' × ''n''}} matrices over a field, then their product ''AB'' is also an {{math|''n'' × ''n''}} matrix over that field, defined entrywise as
:<math>
(AB)_{ij} = \sum_{k = 1}^n A_{ik} B_{kj}.
</math>
'''output''' ''C'' (as A*B)
This [[algorithm]] requires {{math|''n''<sup>3</sup>}} field multiplications and {{math|''n''<sup>3</sup> − ''n''<sup>2</sup>}} field additions, for an overall complexity of {{math|O(''n''<sup>3</sup>)}}.
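As a concrete illustration, the entrywise definition above translates directly into the schoolbook triple loop. The sketch below is plain Python with no libraries assumed; `naive_matmul` is an illustrative helper name, not from any cited source.

```python
# Schoolbook matrix multiplication: C[i][j] = sum over k of A[i][k] * B[k][j].
# Each of the n^2 entries costs n multiplications, so n^3 in total.

def naive_matmul(A, B):
    n = len(A)
    C = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            s = 0.0
            for k in range(n):
                s += A[i][k] * B[k][j]
            C[i][j] = s
    return C

A = [[1.0, 2.0], [3.0, 4.0]]
B = [[5.0, 6.0], [7.0, 8.0]]
print(naive_matmul(A, B))  # [[19.0, 22.0], [43.0, 50.0]]
```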
=== Strassen's algorithm ===
{{Main|Strassen algorithm}}
Strassen's algorithm improves on naive matrix multiplication through a [[Divide-and-conquer algorithm|divide-and-conquer]] approach. The key observation is that multiplying two {{math|2 × 2}} matrices can be done with only seven multiplications, instead of the usual eight, at the expense of additional addition and subtraction operations. Applying this identity recursively to matrix blocks gives a complexity of {{math|O(''n''<sup>log<sub>2</sub>7</sup>) ≈ O(''n''<sup>2.807</sup>)}}.
Unlike algorithms with faster asymptotic complexity, Strassen's algorithm is used in practice. The [[numerical stability]] is reduced compared to the naive algorithm,<ref>{{cite journal | last1=Miller | first1=Webb | title=Computational complexity and numerical stability | citeseerx = 10.1.1.148.9947 | year=1975 | journal=SIAM Journal on Computing | volume=4 | issue=2 | pages=97–107 | doi=10.1137/0204009}}</ref> but it is faster in cases where {{math|''n'' > 100}} or so<ref name="skiena">{{cite book |first=Steven |last=Skiena |date=2012 |author-link=Steven Skiena |title=The Algorithm Design Manual |url=https://archive.org/details/algorithmdesignm00skie_772 |url-access=limited |publisher=Springer |pages=[https://archive.org/details/algorithmdesignm00skie_772/page/n56 45]–46, 401–403 |doi=10.1007/978-1-84800-070-4_4|chapter=Sorting and Searching |isbn=978-1-84800-069-8 }}</ref> and appears in several libraries, such as [[Basic Linear Algebra Subprograms|BLAS]].<ref>{{cite book |last1=Press |first1=William H. |last2=Flannery |first2=Brian P. |last3=Teukolsky |first3=Saul A. |author3-link=Saul Teukolsky |last4=Vetterling |first4=William T. |title=Numerical Recipes: The Art of Scientific Computing |publisher=[[Cambridge University Press]] |edition=3rd |isbn=978-0-521-88068-8 |year=2007 |page=[https://archive.org/details/numericalrecipes00pres_033/page/n131 108]|title-link=Numerical Recipes }}</ref> Fast matrix multiplication algorithms cannot achieve ''component-wise stability'', but some can be shown to exhibit ''norm-wise stability''.<ref name="bdl16">{{cite journal | last1=Ballard | first1=Grey | last2=Benson | first2=Austin R.
| last3=Druinsky | first3=Alex | last4=Lipshitz | first4=Benjamin | last5=Schwartz | first5=Oded | title=Improving the numerical stability of fast matrix multiplication | year=2016 | journal=SIAM Journal on Matrix Analysis and Applications | volume=37 | issue=4 | pages=1382–1418 | doi=10.1137/15M1032168 | arxiv=1507.00687| s2cid=2853388 }}</ref> It is very useful for large matrices over exact domains such as [[finite field]]s, where numerical stability is not an issue.
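The seven products underlying Strassen's identity can be written out explicitly. The following sketch (an illustrative helper, shown with scalar entries for clarity) checks the identity against the schoolbook result; in the actual algorithm the eight inputs are matrix blocks and the identity is applied recursively.

```python
# Strassen's identity for a 2x2 product: seven multiplications (m1..m7)
# and a number of additions/subtractions, instead of the usual eight
# multiplications of the schoolbook method.

def strassen_2x2(a11, a12, a21, a22, b11, b12, b21, b22):
    m1 = (a11 + a22) * (b11 + b22)
    m2 = (a21 + a22) * b11
    m3 = a11 * (b12 - b22)
    m4 = a22 * (b21 - b11)
    m5 = (a11 + a12) * b22
    m6 = (a21 - a11) * (b11 + b12)
    m7 = (a12 - a22) * (b21 + b22)
    c11 = m1 + m4 - m5 + m7
    c12 = m3 + m5
    c21 = m2 + m4
    c22 = m1 - m2 + m3 + m6
    return c11, c12, c21, c22

# Agrees with the schoolbook product of [[1,2],[3,4]] and [[5,6],[7,8]]:
print(strassen_2x2(1, 2, 3, 4, 5, 6, 7, 8))  # (19, 22, 43, 50)
```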
| pages=234–235
| date=Jun 1979
| url-access=subscription
}}</ref>
|-
| 1981 || 2.522 || [[Arnold Schönhage|Schönhage]]<ref>
| journal=Theoretics
| doi=10.46298/theoretics.24.21
}}</ref><ref>{{Cite web|last=Hartnett|first=Kevin|title=Matrix Multiplication Inches Closer to Mythic Goal|url=https://www.quantamagazine.org/mathematicians-inch-closer-to-matrix-multiplication-goal-20210323/|access-date=2021-04-01|website=Quanta Magazine|date=23 March 2021 |language=en}}</ref>
|-
{{cite arXiv |eprint=2210.10173 |class=cs.DS |first1=Ran |last1=Duan |first2=Hongxun |last2=Wu |title=Faster Matrix Multiplication via Asymmetric Hashing |last3=Zhou |first3=Renfei |year=2022}}</ref>
|-
| 2024 || 2.371552 || [[Virginia Vassilevska Williams|Williams]], Xu, Xu, and Zhou<ref name="wxxz23"/>
|-
| 2024 || 2.371339 || Alman, Duan, [[Virginia Vassilevska Williams|Williams]], Xu, Xu, and Zhou<ref name="adwxxz24"/>
|}
All recent algorithms in this line of research use the ''laser method'', a generalization of the Coppersmith–Winograd algorithm, which was given by [[Don Coppersmith]] and [[Shmuel Winograd]] in 1990 and was the best matrix multiplication algorithm until 2010.<ref name="coppersmith">{{cite journal|doi=10.1016/S0747-7171(08)80013-2 |title=Matrix multiplication via arithmetic progressions |url=http://www.cs.umd.edu/~gasarch/TOPICS/ramsey/matrixmult.pdf |year=1990 |last1=Coppersmith |first1=Don |last2=Winograd |first2=Shmuel |journal=Journal of Symbolic Computation |volume=9|issue=3|pages=251–280|doi-access=free }}</ref> The conceptual idea of these algorithms is similar to Strassen's algorithm: a method is devised for multiplying two {{math|''k'' × ''k''}}-matrices with fewer than {{math|''k''<sup>3</sup>}} multiplications, and this technique is applied recursively. The laser method has limitations to its power: [[Andris Ambainis|Ambainis]], Filmus and [[François Le Gall|Le Gall]] proved, by analyzing higher and higher tensor powers of a certain identity of Coppersmith and Winograd, that it cannot be used to show that {{math|ω < 2.3725}}, nor that {{math|ω < 2.3078}} for a wide class of variants of this approach.<ref name="afl142">{{Cite book |last1=Ambainis |first1=Andris |title=Proceedings of the forty-seventh annual ACM symposium on Theory of Computing |last2=Filmus |first2=Yuval |last3=Le Gall |first3=François |date=2015-06-14 |publisher=Association for Computing Machinery |isbn=978-1-4503-3536-2 |series=STOC '15 |___location=Portland, Oregon, USA |pages=585–593 |chapter=Fast Matrix Multiplication |doi=10.1145/2746539.2746554 |chapter-url=https://doi.org/10.1145/2746539.2746554 |arxiv=1411.5414 |s2cid=8332797}}</ref> In 2022, Duan, Wu and Zhou devised a variant breaking the first of the two barriers, with {{math|ω < 2.37188}}.<ref name="dwz22" /> They do so by identifying a source of potential optimization in the laser method, termed ''combination loss'', which they compensate for using an asymmetric version of the hashing method in the Coppersmith–Winograd algorithm.
Nonetheless, the above are classic examples of [[Galactic algorithm#Matrix multiplication|galactic algorithms]]. By contrast, Strassen's algorithm of 1969 and [[Victor Pan|Pan's]] algorithm of 1978, whose respective exponents are slightly above and below 2.80, have constant coefficients that make them feasible.<ref>{{
=== Group theory reformulation of matrix multiplication algorithms ===
[[Henry Cohn]], [[Robert Kleinberg]], [[Balázs Szegedy]] and [[Chris Umans]] put methods such as the Strassen and Coppersmith–Winograd algorithms in an entirely different [[group theory|group-theoretic]] context, by utilising triples of subsets of finite groups which satisfy a disjointness property called the [[Triple product property|triple product property (TPP)]]. They also give conjectures that, if true, would imply that there are matrix multiplication algorithms with essentially quadratic complexity. This implies that the optimal exponent of matrix multiplication is 2, which most researchers believe is indeed the case.<ref name="robinson"/> One such conjecture is that families of [[wreath product]]s of [[Abelian group]]s with symmetric groups realise families of subset triples with a simultaneous version of the TPP.<ref>{{Cite book | last1 = Cohn | first1 = H. | last2 = Kleinberg | first2 = R. | last3 = Szegedy | first3 = B. | last4 = Umans | first4 = C. | chapter = Group-theoretic Algorithms for Matrix Multiplication | doi = 10.1109/SFCS.2005.39 | title = 46th Annual IEEE Symposium on Foundations of Computer Science (FOCS'05) | pages = 379 | year = 2005 | arxiv = math/0511460 | isbn = 0-7695-2468-0 | s2cid = 41278294 | url = https://authors.library.caltech.edu/23966/ }}</ref><ref>{{cite book |first1=Henry |last1=Cohn |first2=Chris |last2=Umans |chapter=A Group-theoretic Approach to Fast Matrix Multiplication |arxiv=math.GR/0307321 |title=Proceedings of the 44th Annual IEEE Symposium on Foundations of Computer Science, 11–14 October 2003 |year=2003 |publisher=IEEE Computer Society |pages=438–449 |doi=10.1109/SFCS.2003.1238217 |isbn=0-7695-2040-5 |s2cid=5890100 }}</ref> Several of their conjectures have since been disproven by Blasiak, Cohn, Church, Grochow, Naslund, Sawin, and Umans using the Slice Rank method.<ref name=":0">{{Cite book | last1 = Blasiak | first1 = J. | last2 = Cohn | first2 = H. | last3 = Church | first3 = T. | last4 = Grochow | first4 = J. 
| last5 = Naslund | first5= E. | last6 = Sawin | first6 = W. | last7=Umans | first7= C.| chapter= On cap sets and the group-theoretic approach to matrix multiplication | doi = 10.19086/da.1245 | title = Discrete Analysis | year = 2017 | page = 1245 | s2cid = 9687868 | url = http://discreteanalysisjournal.com/article/1245-on-cap-sets-and-the-group-theoretic-approach-to-matrix-multiplication}}</ref> Further, Alon, Shpilka and [[Chris Umans]] have recently shown that some of these conjectures implying fast matrix multiplication are incompatible with another plausible conjecture, the [[sunflower conjecture]],<ref>{{cite journal |journal=Electronic Colloquium on Computational Complexity |date=April 2011 |author1-link=Noga Alon |last1=Alon |first1=N. |last2=Shpilka |first2=A. |last3=Umans |first3=C. |url=http://eccc.hpi-web.de/report/2011/067/ |title=On Sunflowers and Matrix Multiplication |id=TR11-067 }}</ref> which in turn is related to the [[Cap set#Matrix multiplication algorithms|cap set problem.]]<ref name=":0" />
=== Lower bounds for ω ===
=== Rectangular matrix multiplication ===
Similar techniques also apply to rectangular matrix multiplication. The central object of study is <math>\omega(k)</math>, which is the smallest <math>c</math> such that one can multiply a matrix of size <math>n\times \lceil n^k\rceil</math> with a matrix of size <math>\lceil n^k\rceil \times n</math> with <math>O(n^{c + o(1)})</math> arithmetic operations. A result in algebraic complexity states that multiplying matrices of size <math>n\times \lceil n^k\rceil</math> and <math>\lceil n^k\rceil \times n</math> requires the same number of arithmetic operations as multiplying matrices of size <math>n\times \lceil n^k\rceil</math> and <math>n \times n</math> and of size <math>n \times n</math> and <math>n\times \lceil n^k\rceil</math>, so this encompasses the complexity of rectangular matrix multiplication.<ref name="gall18">{{cite conference
| last1 = Le Gall | first1 = François | last2 = Urrutia | first2 = Florent | editor-last = Czumaj | editor-first = Artur | arxiv = 1708.05622 | contribution = Improved rectangular matrix multiplication using powers of the Coppersmith–Winograd tensor | title = Proceedings of the Twenty-Ninth Annual ACM-SIAM Symposium on Discrete Algorithms | year = 2018 | isbn = 978-1-61197-503-1
}}</ref> This generalizes the square matrix multiplication exponent, since <math>\omega(1) = \omega</math>.
Since the output of the matrix multiplication problem is size <math>n^2</math>, we have <math>\omega(k) \geq 2</math> for all values of <math>k</math>. If one can prove for some values of <math>k</math> between 0 and 1 that <math>\omega(k) \leq 2</math>, then such a result shows that <math>\omega(k) = 2</math> for those <math>k</math>. The largest ''k'' such that <math>\omega(k) = 2</math> is known as the ''dual matrix multiplication exponent'', usually denoted ''α''. ''α'' is referred to as the "[[Duality (optimization)|dual]]" because showing that <math>\alpha = 1</math> is equivalent to showing that <math>\omega = 2</math>. Like the matrix multiplication exponent, the dual matrix multiplication exponent sometimes appears in the complexity of algorithms in numerical linear algebra and optimization.<ref>{{Cite journal|last1=Cohen|first1=Michael B.|last2=Lee|first2=Yin Tat|last3=Song|first3=Zhao|date=2021-01-05|title=Solving Linear Programs in the Current Matrix Multiplication Time|url=https://doi.org/10.1145/3424305|journal=Journal of the ACM|volume=68|issue=1|pages=3:1–3:39|doi=10.1145/3424305|issn=0004-5411|arxiv=1810.07896|s2cid=231955576 }}</ref>
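To make the trivial bounds on <math>\omega(k)</math> concrete: the naive algorithm spends <math>n \cdot n \cdot \lceil n^k\rceil</math> scalar multiplications, giving the easy upper bound <math>\omega(k) \le k + 2</math>, while the <math>n^2</math> output entries alone force <math>\omega(k) \ge 2</math>. A minimal sketch (illustrative function name) counting the naive cost:

```python
import math

# Number of scalar multiplications in the naive product of an (n x m)
# matrix with an (m x n) matrix, where m = ceil(n^k): one multiplication
# per (i, j, l) triple, i.e. n * n * m, roughly n^(2+k).

def naive_rect_mult_count(n, k):
    m = math.ceil(n ** k)
    return n * n * m

print(naive_rect_mult_count(100, 0.5))  # 100 * 100 * 10 = 100000
```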
===Minimizing number of multiplications===
Related to the problem of minimizing the number of arithmetic operations is minimizing the number of multiplications, since multiplication is typically a more costly operation than addition. An <math>O(n^\omega)</math> algorithm for matrix multiplication necessarily uses only <math>O(n^\omega)</math> multiplication operations, but these algorithms are impractical. Improving on the naive <math>n^3</math> multiplications of schoolbook multiplication, two <math>4\times 4</math> matrices over <math>\mathbb{Z}/2\mathbb{Z}</math> can be multiplied with 47 multiplications,<ref>See [https://www.nature.com/articles/s41586-022-05172-4/figures/6 Extended Data Fig. 1: Algorithm for multiplying 4 × 4 matrices in modular arithmetic (<math>\mathbb{Z}_{2}</math>) with 47 multiplications] in {{Cite journal |title=Discovering faster matrix multiplication algorithms with reinforcement learning | year=2022 |language=en |doi=10.1038/s41586-022-05172-4| pmid=36198780 | last1=Fawzi | first1=A. | last2=Balog | first2=M. | last3=Huang | first3=A. | last4=Hubert | first4=T. | last5=Romera-Paredes | first5=B. | last6=Barekatain | first6=M. | last7=Novikov | first7=A. | last8=Ruiz | first8=F. J. R. | last9=Schrittwieser | first9=J. | last10=Swirszcz | first10=G. | last11=Silver | first11=D. | last12=Hassabis | first12=D. | last13=Kohli | first13=P. | journal=Nature | volume=610 | issue=7930 | pages=47–53 | pmc=9534758 | bibcode=2022Natur.610...47F }}</ref> and <math>3\times 3</math> matrix multiplication over a commutative ring can be done in 21 multiplications<ref>{{cite journal
| last = Rosowski | first = Andreas | arxiv = 1904.07683 | journal = Journal of Symbolic Computation | mr = 4433063 | pages = 302–321 | title = Fast commutative matrix algorithms | volume = 114 | year = 2023 }} Also in {{cite journal |doi=10.1016/0041-5553(86)90203-X |title=An algorithm for multiplying 3×3 matrices |year=1986 |last1=Makarov |first1=O. M. |journal=USSR Computational Mathematics and Mathematical Physics |volume=26 |pages=179–180 }}</ref> (23 if non-commutative<ref>{{Cite journal |last=Laderman |first=Julian D. |date=1976 |title=A noncommutative algorithm for multiplying 3×3 matrices using 23 multiplications |url=https://www.ams.org/bull/1976-82-01/S0002-9904-1976-13988-2/ |journal=Bulletin of the American Mathematical Society |language=en |volume=82 |issue=1 |pages=126–128 |doi=10.1090/S0002-9904-1976-13988-2 |issn=0002-9904|doi-access=free }}</ref>). The lower bound of multiplications needed is 2''mn''+2''n''−''m''−2 (multiplication of ''n''×''m''-matrices with ''m''×''n''-matrices using the substitution method, {{math|''m'' ≥ ''n'' ≥ 3}}).
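The lower-bound formula quoted above is easy to evaluate directly. A small sketch of the arithmetic (illustrative function name): for <math>3\times 3</math> matrices it gives 19, which sits below Laderman's 23-multiplication algorithm.

```python
# Evaluating the lower bound 2mn + 2n - m - 2 on the number of
# multiplications for a few small shapes (the bound applies for m, n >= 3).

def mult_lower_bound(m, n):
    return 2 * m * n + 2 * n - m - 2

print(mult_lower_bound(3, 3))  # 19
print(mult_lower_bound(4, 4))  # 34
```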
==See also==