Computational complexity of matrix multiplication: Difference between revisions

Content deleted Content added
made lead normal
Citation bot (talk | contribs)
Removed URL that duplicated identifier. | Use this bot. Report bugs. | Suggested by Abductive | Category:Computational complexity theory | #UCB_Category 71/110
 
(118 intermediate revisions by 52 users not shown)
Line 1:
{{Short description|algorithmicAlgorithmic runtime requirements for matrix multiplication}}
{{CS1 config|mode=cs1}}
{{unsolved|computer science|What is the fastest algorithm for matrix multiplication?}}
In [[theoretical computer science]], the '''computational complexity of matrix multiplication''' dictates [[Analysis of algorithms|how quickly]] the operation of [[matrix multiplication]] can be performed. [[Matrix multiplication algorithm]]s are a central subroutine in theoretical and [[numerical algorithm|numerical]] algorithms for [[numerical linear algebra]] and [[optimization]], so finding the rightfastest amountalgorithm offor timematrix it should takemultiplication is of major practical relevance.
 
Directly applying the mathematical definition of matrix multiplication gives an algorithm that requires {{math|''n''<sup>3</sup>}} [[Field (mathematics)|field]] operations to multiply two {{math|''n'' × ''n''}} matrices over that field ({{math|Θ(''n''<sup>3</sup>)}} in [[big O notation]]). Surprisingly, algorithms exist that provide better running times than this straightforward "schoolbook algorithm". The first to be discovered was [[Strassen algorithm|Strassen's algorithm]], devised by [[Volker Strassen]] in 1969 and often referred to as "fast matrix multiplication".<ref name="strassen69">
Line 17 ⟶ 18:
}}</ref> The optimal number of field operations needed to multiply two square {{math|''n'' × ''n''}} matrices [[big O notation|up to constant factors]] is still unknown. This is a major open question in [[theoretical computer science]].
 
{{As of|20202024|1201}}, the matrixbest multiplicationbound algorithmon with bestthe [[Time complexity|asymptotic complexity]] runsof ina matrix multiplication algorithm is {{math|O(''n''<sup>2.3728596371339</sup>)}} time, given by Josh Alman and [[Virginia Vassilevska Williams]].<ref name="aw20adwxxz24">
}}</ref><ref>{{Citecite arXiv web|lasteprint=Hartnett2404.16349 |firstclass=Kevincs.DS |titlefirst1=MatrixJosh Multiplication|last1=Alman Inches|first2=Ran Closer|last2=Duan to|first3=Virginia MythicVassilevska Goal|urllast3=https://www.quantamagazine.org/mathematicians-inch-closer-to-matrix-multiplication-goal-20210323/Williams |access-datefirst4=2021-04-01Yinzhan|website last4=QuantaXu |first5=Zixuan |last5=Xu |first6=Renfei |last6=Zhou |title=More Asymmetry Yields Faster Matrix Multiplication Magazine|languageyear=en2024}}</ref> However, this and similar improvements to Strassen are not used in practice, because they are [[galactic algorithm]]s: the constant coefficient hidden by the [[Bigbig O notation]] is so large that they are only worthwhile for matrices that are too large to handle on present-day computers.<ref>{{citationcite journal
{{Citation
| last1=Alman
| first1=Josh
| last2=Williams
| first2=Virginia Vassilevska
| contribution=A Refined Laser Method and Faster Matrix Multiplication
| year = 2020
| arxiv=2010.05846
| title = 32nd Annual ACM-SIAM Symposium on Discrete Algorithms (SODA 2021)
|url=https://www.siam.org/conferences/cm/program/accepted-papers/soda21-accepted-papers
}}</ref><ref>{{Cite web|last=Hartnett|first=Kevin|title=Matrix Multiplication Inches Closer to Mythic Goal|url=https://www.quantamagazine.org/mathematicians-inch-closer-to-matrix-multiplication-goal-20210323/|access-date=2021-04-01|website=Quanta Magazine|language=en}}</ref> However, this and similar improvements to Strassen are not used in practice, because they are [[galactic algorithm]]s: the constant coefficient hidden by the [[Big O notation]] is so large that they are only worthwhile for matrices that are too large to handle on present-day computers.<ref>{{citation
| last = Iliopoulos
| first = Costas S.
Line 46 ⟶ 37:
| archive-date = 2014-03-05
| url-status = dead
}}</ref><ref name="robinson">{{Citationcite journal | last=Robinson | first=Sara | title=Toward an Optimal Algorithm for Matrix Multiplication | url=https://archive.siam.org/pdf/news/174.pdf | date=November 2005 | journal=SIAM News | volume=38 | issue=9 | quote=Even if someone manages to prove one of the conjectures—thereby demonstrating that {{math|1=''ω'' = 2}}—the wreath product approach is unlikely to be applicable to the large matrix problems that arise in practice. [...] the input matrices must be astronomically large for the difference in time to be apparent.}}</ref>
 
== Simple algorithms ==
 
If ''A'', ''B'' are two {{math|''n'' × ''n''}} matrices over a field, then their product ''AB'' is also an {{math|''n'' × ''n''}} matrix over that field, defined entrywise as
:<math>
(AB)_{ij} = \sum_{k = 1}^n A_{ik} B_{kj}.
Line 68 ⟶ 59:
'''output''' ''C'' (as A*B)
 
This [[algorithm]] requires, in the [[worst-case complexity|worst case]], {{tmath|n^3}} multiplications of scalars and {{tmath|n^3 - n^2}} additions of scalars for computing the product of two square {{math|''n''×''n''}} matrices. Its [[computational complexity]] is therefore {{tmath|O(n^3)}}, in a [[model of computation]] where field operations (addition and multiplication) take constant time (in practice, this is the case for [[floating point]] numbers, but not necessarily for integers).
 
=== Strassen's algorithm ===
{{Main|Strassen algorithm}}
Strassen's algorithm improves on naive matrix multiplication through a [[Divide-and-conquer algorithm|divide-and-conquer]] approach. The key observation is that multiplying two {{math|2 × 2}} matrices can be done with only 7seven multiplications, instead of the usual 8eight (at the expense of several11 additional addition and subtraction operations). This means that, treating the input {{math|''n''×''n''}} matrices as [[Block matrix|block]] {{math|2 × 2}} matrices, the task of multiplying two {{math|''n''×''n''}} matrices can be reduced to 7seven subproblems of multiplying two {{math|''n/2''×''n/2''}} matrices. Applying this recursively gives an algorithm needing <math>O( n^{\log_{2}7}) \approx O(n^{2.807})</math> field operations.
 
Unlike algorithms with faster asymptotic complexity, Strassen's algorithm is used in practice. The [[numerical stability]] is reduced compared to the naïvenaive algorithm,<ref>{{Citationcite journal | last1=Miller | first1=Webb | title=Computational complexity and numerical stability | citeseerx = 10.1.1.148.9947 | year=1975 | journal=SIAM News | volume=4 | issue=2 | pages=97–107 | doi=10.1137/0204009}}</ref> but it is faster in cases where {{math|''n'' > 100}} or so<ref name="skiena">{{cite book |first=Steven |last=Skiena |date=2012 |author-link=Steven Skiena |title=The Algorithm Design Manual |url=https://archive.org/details/algorithmdesignm00skie_772 |url-access=limited |publisher=Springer |year=2008 |pages=[https://archive.org/details/algorithmdesignm00skie_772/page/n56 45]–46, 401–403 |doi=10.1007/978-1-84800-070-4_4|chapter=Sorting and Searching |isbn=978-1-84800-069-8 }}</ref> and appears in several libraries, such as [[Basic Linear Algebra Subprograms|BLAS]].<ref>{{cite book |last1=Press |first1=William H. |last2=Flannery |first2=Brian P. |last3=Teukolsky |first3=Saul A. |author3-link=Saul Teukolsky |last4=Vetterling |first4=William T. |title=Numerical Recipes: The Art of Scientific Computing |publisher=[[Cambridge University Press]] |edition=3rd |isbn=978-0-521-88068-8 |year=2007 |page=[https://archive.org/details/numericalrecipes00pres_033/page/n131 108]|title-link=Numerical Recipes }}</ref> Fast matrix multiplication algorithms cannot achieve ''component-wise stability'', but some can be shown to exhibit ''norm-wise stability''.<ref name="bdl16">{{cite journal | last1=Ballard | first1=Grey | last2=Benson | first2=Austin R. | last3=Druinsky | first3=Alex | last4=Lipshitz | first4=Benjamin | last5=Schwartz | first5=Oded | title=Improving the numerical stability of fast matrix multiplication | year=2016 | journal=SIAM Journal on Matrix Analysis and Applications | volume=37 | issue=4 | pages=1382–1418 | doi=10.1137/15M1032168 | arxiv=1507.00687| s2cid=2853388 }}</ref> It is very useful for large matrices over exact domains such as [[finite field]]s, where numerical stability is not an issue.
 
== Matrix multiplication exponent ==
 
[[File:MatrixMultComplexity svg.svg|thumb|400px|right|Improvement of estimates of exponent {{math|ω}} over time for the computational complexity of matrix multiplication <math>O(n^\omega)</math>.]]
[[File:MatrixMultComplexity1990 svg.svg|thumb|400px|right|Closeup of 1990–2024]]
{| class="wikitable floatright"
|+ Timeline of matrix multiplication exponent
Line 85 ⟶ 77:
| 1969 || 2.8074 || [[Volker Strassen|Strassen]]<ref name="strassen69" />
|-
| 1978 || 2.796 || [[Victor Pan|Pan]]<ref>
{{cite book
| doi=10.1109/SFCS.1978.34
Line 97 ⟶ 89:
}}</ref>
|-
| 1979 || 2.780 || Bini, {{ill|Milvio Capovani|it|lt=Capovani}}, Romani<ref>
{{cite journal
| doi=10.1016/0020-0190(79)90113-3
Line 111 ⟶ 103:
| pages=234&ndash;235
| date=Jun 1979
| url-access=subscription
}}</ref>
|-
| 1981 || 2.522 || [[Arnold Schönhage|Schönhage]]<ref>
{{cite journal
| doi=10.1137/0210032
Line 137 ⟶ 130:
}}</ref><!-- Note: Romani and Coppersmith-Winograd do not cite each other, so these appear to be simultaneous. -->
|-
| 1981 || 2.496 || [[Don Coppersmith|Coppersmith]], [[Shmuel Winograd|Winograd]]<ref name="CW.81">
{{cite book
| doi=10.1109/SFCS.1981.27
Line 157 ⟶ 150:
| pages=49&ndash;54
| date=Oct 1986
| isbn=0-8186-0740-8
| s2cid=15077423
}}</ref>
Line 162 ⟶ 156:
| 1990 || 2.3755 || [[Don Coppersmith|Coppersmith]], [[Shmuel Winograd|Winograd]]<ref>
{{cite journal
| url=https://www.sciencedirect.com/science/article/pii/S0747717108800132
| author1=D. Coppersmith
| author2= S. Winograd
Line 172 ⟶ 165:
| date=Mar 1990
| doi=10.1016/S0747-7171(08)80013-2
| doi-access=free
}}</ref>
|-
| 2010 || 2.3737 || Stothers<ref>
Line 185 ⟶ 179:
}}</ref>
|-
| 20132012 || 2.3729 || [[Virginia Vassilevska Williams|Williams]]<ref>
{{cite book
| doi=10.1145/2213977.2214056
Line 196 ⟶ 190:
| pages=887&ndash;898
| year=2012
| isbn=978-1-4503-1245-5
}}</ref><ref name="Williams.2014">
| s2cid=14350287
}}</ref><ref name="Williams.2014">
{{cite report
| last=Williams
Line 208 ⟶ 204:
|-
| 2014 || 2.3728639 || Le Gall<ref name="LeGall2014">
{{cite book
{{Citation
| last1=Le Gall
| first1=François
| pages=296&ndash;303
| contribution=Powers of tensors and fast matrix multiplication
| year = 2014
| arxiv=1401.7714
| title = Proceedings of the 39th [[International Symposium on Symbolic and Algebraic Computation]] (- ISSAC) '14
| contributionchapter=PowersAlgebraic ofcomplexity tensorstheory and fast matrix multiplication
| doi=10.1145/2608628.2627493
| editor=Katsusuke Nabeshima
| isbn=978-1-4503-2501-1
| bibcode=2014arXiv1401.7714L
| s2cid=2597483
}}</ref>
|-
| 2020 || 2.3728596 || Alman, [[Virginia Vassilevska Williams|Williams]]<ref name="aw20"/>
{{cite journal
| last1=Alman
| first1=Josh
| last2=Williams
| first2=Virginia Vassilevska
| year = 20202024
| arxiv=2010.05846
| contributiontitle = A Refined Laser Method and Faster Matrix Multiplication
| journal=Theoretics
| doi=10.46298/theoretics.24.21
}}</ref><ref>{{Cite web|last=Hartnett|first=Kevin|title=Matrix Multiplication Inches Closer to Mythic Goal|url=https://www.quantamagazine.org/mathematicians-inch-closer-to-matrix-multiplication-goal-20210323/|access-date=2021-04-01|website=Quanta Magazine|date=23 March 2021 |language=en}}</ref>
|-
| 2022 || 2.371866 || Duan, Wu, Zhou<ref name="dwz22">
{{cite arXiv |eprint=2210.10173 |class=cs.DS |first1=Ran |last1=Duan |first2=Hongxun |last2=Wu |title=Faster Matrix Multiplication via Asymmetric Hashing |last3=Zhou |first3=Renfei |year=2022}}</ref>
|-
| 2024 || 2.371552 || [[Virginia Vassilevska Williams|Williams]], Xu, Xu, and Zhou<ref name="wxxz23">{{cite conference |last1=Vassilevska Williams |first1=Virginia |last2=Xu |first2=Yinzhan |last3=Xu |first3=Zixuan |last4=Zhou |first4=Renfei |title=New Bounds for Matrix Multiplication: from Alpha to Omega |conference=Proceedings of the 2024 Annual ACM-SIAM Symposium on Discrete Algorithms (SODA) |pages=3792–3835 |arxiv=2307.07970 |doi=10.1137/1.9781611977912.134}}
</ref>
|-
| 2024 || 2.371339 || Alman, Duan, [[Virginia Vassilevska Williams|Williams]], Xu, Xu, and Zhou<ref name="adwxxz24"/>
|}
 
The ''matrix multiplication exponent'', usually denoted {{math|ω}}, is the smallest real number for which any two <math>n\times n</math> matrixmatrices over a field can be multiplied together using <math>n^{\omega + o(1)}</math> field operations. This notation is commonly used in [[algorithm]]s research, so that algorithms using matrix multiplication as a subroutine have meaningful bounds on running time regardlessthat ofcan theupdate trueas valuebounds ofon {{math|ω}} improve.
 
Using a naive lower bound and schoolbook matrix multiplication for the upper bound, one can straightforwardly conclude that {{math|2 ≤ ω ≤ 3}}. Whether {{math|1=ω = 2}} is a major open question in [[theoretical computer science]], and there is a line of research developing matrix multiplication algorithms to get improved bounds on {{math|ω}}.
 
The current best bound on {{math|ω}} is {{math|ω < 2.3728596}}, by Josh Alman and [[Virginia Vassilevska Williams]].<ref name="aw20"/> This algorithm, like all otherAll recent algorithms in this line of research, usesuse the ''laser method'', a generalization of the Coppersmith–Winograd algorithm, which was given by [[Don Coppersmith]] and [[Shmuel Winograd]] in 1990, and was the best matrix multiplication algorithm until 2010, and has an asymptotic complexity of {{math|''O''(''n''<sup>2.375477</sup>)}}.<ref name="coppersmith">{{Citationcite journal|doi=10.1016/S0747-7171(08)80013-2 |title=Matrix multiplication via arithmetic progressions |url=http://www.cs.umd.edu/~gasarch/TOPICS/ramsey/matrixmult.pdf |year=1990 |last1=Coppersmith |first1=Don |last2=Winograd |first2=Shmuel |journal=Journal of Symbolic Computation |volume=9|issue=3|pages=251|doi-access=free }}</ref> The conceptual idea of these algorithms areis similar to Strassen's algorithm: a waymethod is devised for multiplying two {{math|''k'' × ''k''}}-matrices with fewer than {{math|''k''<sup>3</sup>}} multiplications, and this technique is applied recursively. The laser method has limitations to its power: [[Andris Ambainis|Ambainis]], Filmus and [[Jean-François Le Gall|Le Gall]] prove that it cannot be used to show that {{math|ω < 2.3725}} by analyzing higher and higher tensor powers of a certain identity of Coppersmith and Winograd and neither {{math|ω < 2.3078}} for a wide class of variants of this approach.<ref name="afl142">{{Cite journalbook |lastlast1=Ambainis |firstfirst1=Andris |title=Proceedings of the forty-seventh annual ACM symposium on Theory of Computing |last2=Filmus |first2=Yuval |last3=Le Gall |first3=François |date=2015-06-14 |titlepublisher=FastAssociation Matrixfor Multiplication:Computing LimitationsMachinery of|isbn=978-1-4503-3536-2 the|series=STOC Coppersmith-Winograd'15 Method|___location=Portland, Oregon, USA |pages=585–593 |chapter=Fast Matrix Multiplication |doi=10.1145/2746539.2746554 |chapter-url=https://doi.org/10.1145/2746539.2746554 |journalarxiv=Proceedings1411.5414 of|s2cid=8332797}}</ref> theIn forty-seventh2022 annualDuan, ACMWu symposiumand onZhou Theorydevised ofa Computing|series=STOCvariant '15|___location=Portland,breaking Oregon,the USA|publisher=Associationfirst forof Computingthe Machinerytwo barriers with {{math|pages=585–593|doi=10.1145/2746539.2746554|isbn=978-1-4503-3536-ω < 2.37188}},</ref name="dwz22" /> they do so by identifying a source of potential optimization in the laser method termed ''combination loss'' for which they compensate using an asymmetric version of the hashing method in the Coppersmith–Winograd algorithm.
 
Nonetheless, the above are classical examples of [[Galactic algorithm#Matrix multiplication|galactic algorithms]]. On the opposite, the above Strassen's algorithm of 1969 and [[Victor Pan|Pan's]] algorithm of 1978, whose respective exponents are slightly above and below 2.78, have constant coefficients that make them feasible.<ref>{{cite journal | last1=Laderman | first1=Julian | last2=Pan | first2=Victor |last3=Sha | first3=Xuan-He | title=On practical algorithms for accelerated matrix multiplication | year=1992 | journal=Linear Algebra and Its Applications | volume=162-164 | pages=557–588 | doi=10.1016/0024-3795(92)90393-O}}</ref>
 
=== Group theory reformulation of matrix multiplication algorithms ===
 
[[Henry Cohn]], [[Robert Kleinberg]], [[Balázs Szegedy]] and [[Chris Umans]] put methods such as the Strassen and Coppersmith–Winograd algorithms in an entirely different [[group theory|group-theoretic]] context, by utilising triples of subsets of finite groups which satisfy a disjointness property called the [[Triple product property|triple product property (TPP)]]. They also give conjectures that, if true, would imply that there are matrix multiplication algorithms with essentially quadratic complexity. This implies that the optimal exponent of matrix multiplication is 2, which most researchers believe is indeed the case.<ref name="robinson"/> One such conjecture is that families of [[wreath product]]s of [[Abelian group]]s with symmetric groups realise families of subset triples with a simultaneous version of the TPP.<ref>{{Cite book | last1 = Cohn | first1 = H. | last2 = Kleinberg | first2 = R. | last3 = Szegedy | first3 = B. | last4 = Umans | first4 = C. | chapter = Group-theoretic Algorithms for Matrix Multiplication | doi = 10.1109/SFCS.2005.39 | title = 46th Annual IEEE Symposium on Foundations of Computer Science (FOCS'05) | pages = 379 | year = 2005 | arxiv = math/0511460 | isbn = 0-7695-2468-0 | s2cid = 41278294 | url = https://authors.library.caltech.edu/23966/ }}</ref><ref>{{cite book |first1=Henry |last1=Cohn, |first2=Chris |last2=Umans. |chapter=A Group-theoretic Approach to Fast Matrix Multiplication. {{|arxiv|=math.GR/0307321}}. ''|title=Proceedings of the 44th Annual IEEE Symposium on Foundations of Computer Science'', 11–14 October 2003, Cambridge, MA,|year=2003 |publisher=IEEE Computer Society, pp.&nbsp;|pages=438–449 |doi=10.1109/SFCS.2003.1238217 |isbn=0-7695-2040-5 |s2cid=5890100 }}</ref> Several of their conjectures have since been disproven by Blasiak, Cohn, Church, Grochow, Naslund, Sawin, and Umans using the Slice Rank method.<ref name=":0">{{Cite book | last1 = Blasiak | first1 = J. | last2 = Cohn | first2 = H. | last3 = Church | first3 = T. | last4 = Grochow | first4 = J. | last5 = Naslund | first5= E. | last6 = Sawin | first6 = W. | last7=Umans | first7= C.| chapter= On cap sets and the group-theoretic approach to matrix multiplication | doi = 10.19086/da.1245 | title = Discrete Analysis | year = 2017 | page = 1245 | s2cid = 9687868 | url = http://discreteanalysisjournal.com/article/1245-on-cap-sets-and-the-group-theoretic-approach-to-matrix-multiplication}}</ref> Further, Alon, Shpilka and [[Chris Umans]] have recently shown that some of these conjectures implying fast matrix multiplication are incompatible with another plausible conjecture, the [[sunflower conjecture]].,<ref>[[{{cite journal |journal=Electronic Colloquium on Computational Complexity |date=April 2011 |author1-link=Noga Alon |last1=Alon]], |first1=N. |last2=Shpilka, |first2=A. |last3=Umans, [|first3=C. |url=http://eccc.hpi-web.de/report/2011/067/ |title=On Sunflowers and Matrix Multiplication] |id=TR11-067 }}</ref> which in turn is related to the [[Cap set#Matrix multiplication algorithms|cap set problem.]]<ref name=":0" />
 
=== Lower bounds for ω ===
 
There is a trivial lower bound of {{tmath|\omega \ge 2}}. Since any algorithm for multiplying two {{math|''n'' × ''n''}}-matrices has to process all {{math|2''n''<sup>2</sup>}} entries, there is a trivial asymptotic lower bound of {{math|Ω(''n''<sup>2</sup>)}} operations for any matrix multiplication algorithm. Thus {{tmath|2\le \omega < 2.37337188}}. It is unknown whether {{tmath|\omega > 2}}. The best known lower bound for matrix-multiplication complexity is {{math|Ω(''n''<sup>2</sup> log(''n''))}}, for bounded coefficient [[Arithmetic circuit complexity|arithmetic circuits]] over the real or complex numbers, and is due to [[Ran Raz]].<ref>{{cite journalbook | last1 = Raz | first1 = Ran | author-linktitle = RanProceedings Razof |the yearthiry-fourth =annual ACM symposium on Theory of computing 2002| titlechapter = On the complexity of matrix product | journalauthor-link = ProceedingsRan ofRaz the| Thirty-fourthyear Annual ACM Symposium on Theory of Computing= 2002| pages = 144144–151 | doi = 10.1145/509907.509932 | isbn = 1581134959 | s2cid = 9582328 }}</ref>
 
The exponent ω is defined to be a [[Accumulation point|limit point]], in that it is the infimum of the exponent over all matrix multiplication algorithms. It is known that this limit point is not achieved. In other words, under the model of computation typically studied, there is no matrix multiplication algorithm that uses precisely {{math|O(''n''<sup>ω</sup>)}} operations; there must be an additional factor of {{math|''n''<sup>o(1)</sup>}}.<ref name="CW.81"/>
 
=== Rectangular matrix multiplication ===
 
Similar techniques also apply to rectangular matrix multiplication. The central object of study is <math>\omega(k)</math>, which is the smallest <math>c</math> such that one can multiply a matrix of size <math>n\times \lceil n^k\rceil</math> with a matrix of size <math>\lceil n^k\rceil \times n</math> with <math>O(n^{c + o(1)})</math> arithmetic operations. A result in algebraic complexity states that multiplying matrices of size <math>n\times \lceil n^k\rceil</math> and <math>\lceil n^k\rceil \times n</math> requires the same number of arithmetic operations as multiplying matrices of size <math>n\times \lceil n^k\rceil</math> and <math>n \times n</math> and of size <math>n \times n</math> and <math>n\times \lceil n^k\rceil</math>, so this encompasses the complexity of rectangular matrix multiplication.<ref name="gall18">{{Citationcite conference
|last last1 = Gall |first first1 = Francois Le
|title last2 = Urrutia | first2 = Florent
| editor-last = Czumaj | editor-first = Artur
| arxiv = 1708.05622
| contribution = Improved Rectangularrectangular Matrixmatrix Multiplicationmultiplication using Powerspowers of the Coppersmith-Winograd Tensortensor
|date=2018-01-01|url=https://epubs.siam.org/ doi/ = 10.1137/1.9781611975031.67
|work=Proceedings of the 2018 Annual ACM-SIAM Symposium on Discrete Algorithmspages (SODA)|pages= 1029–1046
|series=Proceedings| publisher = Society for Industrial and Applied Mathematics
|doi title =10.1137/1.9781611975031.67|access Proceedings of the Twenty-date=2021Ninth Annual ACM-05-23|last2=Urrutia|first2=Florent|arxiv=1708.05622}}</ref>SIAM ThisSymposium generalizeson theDiscrete squareAlgorithms, matrixSODA multiplication2018, exponentNew Orleans, sinceLA, <math>\omega(1)USA, =January \omega</math>.7–10, 2018
| year = 2018| isbn = 978-1-61197-503-1
}}</ref> This generalizes the square matrix multiplication exponent, since <math>\omega(1) = \omega</math>.
 
Since the output of the matrix multiplication problem is size <math>n^2</math>, we have <math>\omega(k) \geq 2</math> alwaysfor all values of <math>k</math>. If one can prove for some values of <math>k</math> between 0 and 1 that <math>\omega(k) \leq 2</math>, sothen thesesuch resultsa showresult shows that <math>\omega(k) = 2</math> exactlyfor those <math>k</math>. The largest ''k'' such that <math>\omega(k) = 2</math> is known as the ''dual matrix multiplication exponent'', usually denoted ''α''. ''α'' is referred to as the "[[Duality (optimization)|dual]]" because showing that <math>\alpha = 1</math> is equivalent to showing that <math>\omega = 2</math>. Like the matrix multiplication exponent, the dual matrix multiplication exponent sometimes appears in the complexity of algorithms in numerical linear algebra and optimization.<ref>{{Cite journal|lastlast1=Cohen|firstfirst1=Michael B.|last2=Lee|first2=Yin Tat|last3=Song|first3=Zhao|date=2021-01-05|title=Solving Linear Programs in the Current Matrix Multiplication Time|url=https://doi.org/10.1145/3424305|journal=Journal of the ACM|volume=68|issue=1|pages=3:1–3:39|doi=10.1145/3424305|issn=0004-5411|arxiv=1810.07896|s2cid=231955576 }}</ref>
Of interest is proving that, for values of ''k'' between 0 and 1, that <math>\omega(k) \leq 2</math>.
Since the output of the matrix multiplication problem is size <math>n^2</math>, <math>\omega(k) \geq 2</math> always, so these results show that <math>\omega(k) = 2</math> exactly. The largest ''k'' such that <math>\omega(k) = 2</math> is known as the ''dual matrix multiplication exponent'', usually denoted ''α''. ''α'' is referred to as the "[[Duality (optimization)|dual]]" because showing that <math>\alpha = 1</math> is equivalent to showing that <math>\omega = 2</math>. Like the matrix multiplication exponent, the dual matrix multiplication exponent sometimes appears in the complexity of algorithms in numerical linear algebra and optimization.<ref>{{Cite journal|last=Cohen|first=Michael B.|last2=Lee|first2=Yin Tat|last3=Song|first3=Zhao|date=2021-01-05|title=Solving Linear Programs in the Current Matrix Multiplication Time|url=https://doi.org/10.1145/3424305|journal=Journal of the ACM|volume=68|issue=1|pages=3:1–3:39|doi=10.1145/3424305|issn=0004-5411}}</ref>
 
The first bound on ''α'' is by [[Don Coppersmith|Coppersmith]] in 1982, who showed that <math>\alpha > 0.17227</math>.<ref>{{Cite journal|last=Coppersmith|first=D.|date=1982-08-01|title=Rapid Multiplication of Rectangular Matrices|url=https://epubs.siam.org/doi/10.1137/0211037|journal=SIAM Journal on Computing|volume=11|issue=3|pages=467–471|doi=10.1137/0211037|issn=0097-5397|url-access=subscription}}</ref> The current best peer-reviewed bound on ''α'' is <math>\alpha >\geq 0.31389321334</math>, given by LeWilliams, GallXu, Xu, and UrrutiaZhou.<ref>{{Citation|last=Le Gall|firstname=Francois|title=Improved Rectangular Matrix Multiplication using Powers of the Coppersmith-Winograd Tensor|date=2018-01-01|url=https:"wxxz23"//epubs.siam.org/doi/10.1137/1.9781611975031.67|work=Proceedings of the 2018 Annual ACM-SIAM Symposium on Discrete Algorithms (SODA)|pages=1029–1046|series=Proceedings|publisher=Society for Industrial and Applied Mathematics|doi=10.1137/1.9781611975031.67|access-date=2021-05-23|last2=Urrutia|first2=Florent|arxiv=1708.05622}}</ref> This paper also contains bounds on <math>\omega(k)</math>.
 
==Related complexitiesproblems==
{{further|Computational complexity of mathematical operations#Matrix algebra}}
 
Line 253 ⟶ 284:
 
===Matrix inversion, determinant and Gaussian elimination===
In his 1969 paper, where he proved the complexity <math>O(n^{\log_2 7}) \approx O(n^{2.807})</math> for matrix computation, Strassen proved also that [[matrix inversion]], [[determinant]] and [[Gaussian elimination]] have, up to a multiplicative constant, the same [[computational complexity]] as matrix multiplication. The proof does not make any assumptions on matrix multiplication that is used, except that its complexity is <math>O(n^\omega)</math> for some <math>\omega \ge 2</math>
 
The starting point of Strassen's proof is using [[block matrix]] multiplication. Specifically, a matrix of even dimension {{math|2''n''×2''n''}} may be partitioned in four {{math|''n''×''n''}} blocks
Line 273 ⟶ 304:
I(2^k) &\le 2I(2^{k-1}) + 6M(2^{k-1})+ 4 A(2^{k-1})\\
&\le 2^2I(2^{k-2}) + 6(M(2^{k-1})+2M(2^{k-2})) + 4(A(2^{k-1}) + 2A(2^{k-2}))\\
&\,\,\,\vdots
&\ldots
\end{align}</math>
If <math>M(n)\le cn^\omega,</math> and <math>\alpha=2^\omega\ge 4,</math> one gets eventually
Line 279 ⟶ 310:
I(2^k) &\le 2^k I(1) + 6c(\alpha^{k-1}+2\alpha^{k-2} + \cdots +2^{k-1}\alpha^0) + k 2^{k+1}\\
&\le 2^k + 6c\frac{\alpha^k-2^k}{\alpha-2} + k 2^{k+1}\\
&\le d(2^k)^\omega.
\end{align}</math>
for some constant {{mvar|d}}.
Line 295 ⟶ 326:
:<math>\det \begin{bmatrix} {A} & {B} \\{C} & {D} \end{bmatrix} =
\det(A)\det(D-CA^{-1}B).</math>
 
===Minimizing number of multiplications===
 
Related to the problem of minimizing the number of arithmetic operations is minimizing the number of multiplications, which is typically a more costly operation than addition. A <math>O(n^\omega)</math> algorithm for matrix multiplication must necessarily only use <math>O(n^\omega)</math> multiplication operations, but these algorithms are impractical. Improving from the naive <math>n^3</math> multiplications for schoolbook multiplication, <math>4\times 4</math> matrices in <math>\mathbb{Z}/2\mathbb{Z}</math> can be done with 47 multiplications,<ref>See [https://www.nature.com/articles/s41586-022-05172-4/figures/6 Extended Data Fig. 1: Algorithm for multiplying 4 × 4 matrices in modular arithmetic (<math>\mathbb{Z}_{2}</math>)) with 47 multiplications] in {{Cite journal |title=Discovering faster matrix multiplication algorithms with reinforcement learning | year=2022 |language=en |doi=10.1038/s41586-022-05172-4| pmid=36198780 | last1=Fawzi | first1=A. | last2=Balog | first2=M. | last3=Huang | first3=A. | last4=Hubert | first4=T. | last5=Romera-Paredes | first5=B. | last6=Barekatain | first6=M. | last7=Novikov | first7=A. | last8=r Ruiz | first8=F. J. | last9=Schrittwieser | first9=J. | last10=Swirszcz | first10=G. | last11=Silver | first11=D. | last12=Hassabis | first12=D. | last13=Kohli | first13=P. | journal=Nature | volume=610 | issue=7930 | pages=47–53 | pmc=9534758 | bibcode=2022Natur.610...47F }}</ref> <math>3\times 3</math> matrix multiplication over a commutative ring can be done in 21 multiplications<ref>{{cite journal
| last = Rosowski | first = Andreas
| arxiv = 1904.07683
| doi = 10.1016/j.jsc.2022.05.002
| journal = Journal of Symbolic Computation
| mr = 4433063
| pages = 302–321
| title = Fast commutative matrix algorithms
| volume = 114
| year = 2023}}</ref><ref>{{cite journal |last1=Makarov |first1=O. M. |title=An algorithm for multiplying 3×3 matrices |journal=Zhurnal Vychislitel'noi Matematiki I Matematicheskoi Fiziki |volume=26 |issue=2 |year=1986 |pages=293–294 |url=https://www.mathnet.ru/php/archive.phtml?wshow=paper&jrnid=zvmmf&paperid=4056&option_lang=eng |access-date=5 October 2022}}
:Also in {{cite journal |doi=10.1016/0041-5553(86)90203-X |title=An algorithm for multiplying 3×3 matrices |year=1986 |last1=Makarov |first1=O. M. |journal=USSR Computational Mathematics and Mathematical Physics |volume=26 |pages=179–180 }}</ref> (23 if non-commutative<ref>{{Cite journal |last=Laderman |first=Julian D. |date=1976 |title=A noncommutative algorithm for multiplying 3×3 matrices using 23 multiplications |url=https://www.ams.org/bull/1976-82-01/S0002-9904-1976-13988-2/ |journal=Bulletin of the American Mathematical Society |language=en |volume=82 |issue=1 |pages=126–128 |doi=10.1090/S0002-9904-1976-13988-2 |issn=0002-9904|doi-access=free }}</ref>). The lower bound of multiplications needed is 2''mn''+2''n''−''m''−2 (multiplication of ''n''×''m'' matrices with ''m''×''n'' matrices using the substitution method, <math>m \ge n \ge 3</math>), which means n=3 case requires at least 19 multiplications and n=4 at least 34.<ref>{{Cite journal |last=Bläser |first=Markus |date=February 2003 |title=On the complexity of the multiplication of matrices of small formats |journal=Journal of Complexity |language=en |volume=19 |issue=1 |pages=43–60 |doi=10.1016/S0885-064X(02)00007-9|doi-access=free }}</ref> For n=2 optimal seven multiplications and 15 additions are minimal, compared to only four additions for eight multiplications.<ref>{{Cite journal |last=Winograd |first=S. |date=1971-10-01 |title=On multiplication of 2 × 2 matrices |journal=Linear Algebra and Its Applications |language=en |volume=4 |issue=4 |pages=381–388 |doi=10.1016/0024-3795(71)90009-7 |issn=0024-3795|doi-access=free }}</ref><ref>{{Cite book |last=L. |first=Probert, R. |title=On the complexity of matrix multiplication |date=1973 |publisher=University of Waterloo |oclc=1124200063}}</ref>
 
==See also==
Line 303 ⟶ 348:
* [[Matrix multiplication]], for abstract definitions
* [[Matrix multiplication algorithm]], for practical implementation details
* [[Sparse matrix-vectormatrix–vector multiplication]]
 
==References==
{{reflist|30em}}
 
== External links ==
*[https://fmm.univ-lille.fr/ Yet another catalogue of fast matrix multiplication algorithms]
*{{cite journal |last1=Fawzi |first1=A. |last2=Balog |first2=M. |last3=Huang |first3=A. |first4=T. |last4=Hubert |first5=B. |last5=Romera-Paredes |first6=M. |last6=Barekatain |first7=A. |last7=Novikov |first8=F.J.R. |last8=Ruiz |first9=J. |last9=Schrittwieser |first10=G. |last10=Swirszcz |first11=D. |last11=Silver |first12=D. |last12=Hassabis |first13=P. |last13=Kohli |title=Discovering faster matrix multiplication algorithms with reinforcement learning |journal=Nature |volume=610 |issue= 7930|pages=47–53 |date=2022 |doi=10.1038/s41586-022-05172-4 |pmid=36198780 |pmc=9534758 |bibcode=2022Natur.610...47F |url=}}
 
[[Category:Computer arithmetic algorithms]]