Matrix multiplication algorithm: Difference between revisions

Line 35:
| arxiv=2010.05846
| title = A Refined Laser Method and Faster Matrix Multiplication
| journal=TheoretiCS
| volume=3
| doi=10.46298/theoretics.24.21
Line 184:
| 1971 || Winograd<ref>{{cite journal |last=Winograd |first=Shmuel |author-link=Shmuel Winograd|title=On multiplication of 2×2 matrices |journal=Linear Algebra and Its Applications |volume=4 |issue= 4 |pages=381–388 |year=1971 |doi=10.1016/0024-3795(71)90009-7|doi-access=free }}</ref> || 7 || 15 || <math>6n^{\log_2 7}-5n^2</math> || <math>5\left(\frac{\sqrt{3}n}{\sqrt{M}}\right)^{\log_2 7}\cdot M-15n^2 +3M</math>
|-
| 2017 || Karstadt, Schwartz<ref>{{cite conference |url=https://dl.acm.org/doi/10.1145/3087556.3087579 |title=Matrix Multiplication, a Little Faster |last1=Karstadt |first1=Elaye |last2=Schwartz |first2=Oded |date=July 2017 |publisher= |book-title=Proceedings of the 29th ACM Symposium on Parallelism in Algorithms and Architectures |pages=101–110 |conference=SPAA '17 |doi=10.1145/3087556.3087579|url-access=subscription }}</ref> || 7 || 12 || <math>5n^{\log_2 7}-4n^2+3n^2\log_2n</math> || <math>4\left(\frac{\sqrt{3}n}{\sqrt{M}}\right)^{\log_2 7}\cdot M-12n^2 +3n^2\cdot\log_2\left(\frac{\sqrt{2}n}{\sqrt{M}}\right) +5M</math>
|-
| 2023 || Schwartz, Vaknin<ref>{{cite journal |url=https://doi.org/10.1137/22M1502719 |title=Pebbling Game and Alternative Basis for High Performance Matrix Multiplication |last1=Schwartz |first1=Oded |last2=Vaknin |first2=Noa |date=2023 |journal=SIAM Journal on Scientific Computing |pages=C277–C303 |doi=10.1137/22M1502719 |url-access=subscription }}</ref> || 7 || 12 || <math>5n^{\log_2 7}-4n^2+1.5n^2\log_2n</math> || <math>4\left(\frac{\sqrt{3}n}{\sqrt{M}}\right)^{\log_2 7}\cdot M-12n^2 +1.5n^2\cdot\log_2\left(\frac{\sqrt{2}n}{\sqrt{M}}\right) +5M</math>
|}
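
To make the table's operation counts concrete, the following minimal sketch (Python, written for this article rather than taken from the cited papers) performs one level of Winograd's 1971 variant of Strassen's algorithm: 7 multiplications and 15 additions/subtractions for a 2×2 product. Applied recursively to 2×2 block partitions, this scheme yields the <math>6n^{\log_2 7}-5n^2</math> arithmetic count listed above for Winograd (1971).

<syntaxhighlight lang="python">
def winograd_2x2(a11, a12, a21, a22, b11, b12, b21, b22):
    """One level of Winograd's (1971) variant of Strassen's algorithm:
    7 multiplications and 15 additions/subtractions for a 2x2 product.
    Entries are scalars here; in a recursive implementation they are
    sub-blocks and the 7 products become recursive calls."""
    # 8 pre-additions on the inputs
    s1 = a21 + a22
    s2 = s1 - a11
    s3 = a11 - a21
    s4 = a12 - s2
    t1 = b12 - b11
    t2 = b22 - t1
    t3 = b22 - b12
    t4 = t2 - b21
    # the 7 multiplications
    p1 = a11 * b11
    p2 = a12 * b21
    p3 = s4 * b22
    p4 = a22 * t4
    p5 = s1 * t1
    p6 = s2 * t2
    p7 = s3 * t3
    # 7 post-additions combine the products into the result
    u2 = p1 + p6
    u3 = u2 + p7
    u4 = u2 + p5
    return (p1 + p2,  # c11
            u4 + p3,  # c12
            u3 - p4,  # c21
            u3 + p5)  # c22
</syntaxhighlight>

Because no commutativity of the entries is used, the same scheme remains valid when the eight inputs are themselves matrix blocks, which is what makes the recursion possible.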
 
Line 215:
=== AlphaTensor ===
 
In 2022, [[DeepMind]] introduced AlphaTensor, a [[neural network]] that used a single-player game analogy to invent thousands of matrix multiplication algorithms, including some previously discovered by humans and some that were not.<ref>{{Cite web |title=Discovering novel algorithms with AlphaTensor |url=https://www.deepmind.com/blog/discovering-novel-algorithms-with-alphatensor |access-date=2022-11-01 |website=www.deepmind.com |date=5 October 2022 |language=en}}</ref> The search was restricted to bilinear, non-commutative algorithms, i.e. explicit decompositions of the matrix multiplication tensor, with coefficients taken either in normal arithmetic or in the [[GF(2)|finite field <math>\mathbb Z/2\mathbb Z</math>]] (mod 2 arithmetic). The best "practical" algorithm found (one given by an explicit low-rank decomposition of a matrix multiplication tensor) ran in O(n<sup>2.778</sup>).<ref name="alphatensor">{{Cite journal |last1=Fawzi |first1=Alhussein |last2=Balog |first2=Matej |last3=Huang |first3=Aja |last4=Hubert |first4=Thomas |last5=Romera-Paredes |first5=Bernardino |last6=Barekatain |first6=Mohammadamin |last7=Novikov |first7=Alexander |last8=R. Ruiz |first8=Francisco J. |last9=Schrittwieser |first9=Julian |last10=Swirszcz |first10=Grzegorz |last11=Silver |first11=David |last12=Hassabis |first12=Demis |last13=Kohli |first13=Pushmeet |date=October 2022 |title=Discovering faster matrix multiplication algorithms with reinforcement learning |journal=Nature |volume=610 |issue=7930 |pages=47–53 |doi=10.1038/s41586-022-05172-4 |pmid=36198780 |pmc=9534758 |bibcode=2022Natur.610...47F |issn=1476-4687}}</ref> Finding low-rank decompositions of such tensors (and beyond) is NP-hard; the optimal number of multiplications even for 3×3 matrices [[Computational complexity of matrix multiplication#Minimizing number of multiplications|remains unknown]], even over a commutative field.<ref name="alphatensor"/>

On 4×4 matrices, AlphaTensor unexpectedly discovered a solution with 47 multiplication steps, an improvement over the 49 required by Strassen's algorithm of 1969, albeit restricted to mod 2 arithmetic. Similarly, AlphaTensor solved 5×5 matrices with 96 rather than Strassen's 98 steps. Prompted by the surprising discovery that such improvements exist, other researchers quickly found a similar, independent 4×4 algorithm, and separately tweaked DeepMind's 96-step 5×5 algorithm down to 95 steps in mod 2 arithmetic and to 97<ref>{{Cite arXiv |last1=Kauers |first1=Manuel |last2=Moosbauer |first2=Jakob |date=2022-12-02 |title=Flip Graphs for Matrix Multiplication |class=cs.SC |eprint=2212.01175 }}</ref> in normal arithmetic.<ref>{{cite news |last1=Brubaker |first1=Ben |title=AI Reveals New Possibilities in Matrix Multiplication |url=https://www.quantamagazine.org/ai-reveals-new-possibilities-in-matrix-multiplication-20221123/ |access-date=26 November 2022 |work=Quanta Magazine |date=23 November 2022 |language=en}}</ref> Some algorithms were completely new: for example, (4, 5, 5) was improved to 76 steps from a baseline of 80 in both normal and mod 2 arithmetic.
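The tensor formulation that AlphaTensor searches over can be illustrated on a small case. The sketch below is illustrative code, not DeepMind's (their 47-multiplication 4×4 decomposition is too large to reproduce here): it encodes Strassen's classical rank-7 decomposition of the 2×2 multiplication tensor as coefficient matrices <math>U, V, W</math>. Each rank-1 term costs one scalar multiplication, so the rank of the decomposition is exactly the number of multiplication steps; reducing the coefficients mod 2 gives the <math>\mathbb Z/2\mathbb Z</math> setting in which AlphaTensor's 4×4 record holds.

<syntaxhighlight lang="python">
import numpy as np

# Strassen's rank-7 decomposition of the 2x2 matrix multiplication
# tensor.  Row r of U, V, W gives the coefficients of the r-th product
# m_r = (U[r] . a) * (V[r] . b) and its contribution c += W[r] * m_r,
# with A, B, C flattened row-major as [x11, x12, x21, x22].
U = np.array([[ 1, 0, 0,  1],   # m1 = (a11 + a22)(b11 + b22)
              [ 0, 0, 1,  1],   # m2 = (a21 + a22) b11
              [ 1, 0, 0,  0],   # m3 = a11 (b12 - b22)
              [ 0, 0, 0,  1],   # m4 = a22 (b21 - b11)
              [ 1, 1, 0,  0],   # m5 = (a11 + a12) b22
              [-1, 0, 1,  0],   # m6 = (a21 - a11)(b11 + b12)
              [ 0, 1, 0, -1]])  # m7 = (a12 - a22)(b21 + b22)
V = np.array([[ 1, 0, 0,  1],
              [ 1, 0, 0,  0],
              [ 0, 1, 0, -1],
              [-1, 0, 1,  0],
              [ 0, 0, 0,  1],
              [ 1, 1, 0,  0],
              [ 0, 0, 1,  1]])
W = np.array([[ 1, 0, 0,  1],   # c11 = m1 + m4 - m5 + m7, etc.
              [ 0, 0, 1, -1],
              [ 0, 1, 0,  1],
              [ 1, 0, 1,  0],
              [-1, 1, 0,  0],
              [ 0, 0, 0,  1],
              [ 1, 0, 0,  0]])

def multiply_via_decomposition(A, B, U, V, W):
    """Multiply 2x2 matrices using one scalar multiplication per
    rank-1 term of the decomposition (7 here, versus 8 naively).
    Reducing U, V, W mod 2 gives the analogous algorithm over Z/2Z."""
    a, b = A.reshape(4), B.reshape(4)
    m = (U @ a) * (V @ b)           # the 7 scalar products
    return (W.T @ m).reshape(2, 2)  # linear combinations only

A = np.arange(4).reshape(2, 2)
B = np.arange(4, 8).reshape(2, 2)
assert np.array_equal(multiply_via_decomposition(A, B, U, V, W), A @ B)
</syntaxhighlight>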
 
==Parallel and distributed algorithms==
Line 286:
The result is even faster on a two-layered cross-wired mesh, where only 2''n'' − 1 steps are needed.<ref>{{cite journal | last1 = Kak | first1 = S | year = 1988 | title = A two-layered mesh array for matrix multiplication | journal = Parallel Computing | volume = 6 | issue = 3| pages = 383–385 | doi = 10.1016/0167-8191(88)90078-6 | citeseerx = 10.1.1.88.8527 }}</ref> For repeated computations the performance improves further, approaching 100% efficiency.<ref>{{cite arXiv |last=Kak |first=S. |date=2014 |title=Efficiency of matrix multiplication on the cross-wired mesh array |class=cs.DC |eprint=1411.3273}}</ref> The cross-wired mesh array may be seen as a special case of a non-planar (i.e. multilayered) processing structure.<ref>{{cite journal | last1 = Kak | first1 = S | year = 1988 | title = Multilayered array computing | journal = Information Sciences | volume = 45 | issue = 3| pages = 347–365 | doi = 10.1016/0020-0255(88)90010-2 | citeseerx = 10.1.1.90.4753 }}</ref>
 
In a 3D mesh with ''n''<sup>3</sup> processing elements, two matrices can be multiplied in <math>\mathcal{O}(\log n)</math> time using the DNS algorithm.<ref>{{cite journal | last1 = Dekel | first1 = Eliezer | last2 = Nassimi | first2 = David | last3 = Sahni | first3 = Sartaj | year = 1981 | title = Parallel Matrix and Graph Algorithms | journal = SIAM Journal on Computing | volume = 10 | issue = 4 | pages = 657–675 | doi = 10.1137/0210049}}</ref>
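
The structure of the DNS algorithm can be sketched with a serial simulation (illustrative code, not the original formulation): each of the <math>n^3</math> virtual processors at position (''i'', ''j'', ''k'') performs the single product <math>A_{ik}B_{kj}</math>, and the partial results are then summed along the ''k'' dimension by pairwise tree reduction in <math>\lceil\log_2 n\rceil</math> rounds, which is the source of the <math>\mathcal{O}(\log n)</math> parallel time.

<syntaxhighlight lang="python">
import numpy as np

def dns_simulated(A, B):
    """Serial simulation of the DNS algorithm on an n x n x n mesh.
    Processor (i, j, k) computes the single product A[i, k] * B[k, j];
    a pairwise tree reduction over k then takes ceil(log2 n) parallel
    rounds, giving the O(log n) parallel running time."""
    # One scalar multiplication per processing element:
    # P[i, j, k] = A[i, k] * B[k, j]
    P = A[:, None, :] * B.T[None, :, :]
    # Tree reduction along k: halve the k-extent each round.
    while P.shape[2] > 1:
        if P.shape[2] % 2:  # pad odd extents with a zero slice
            P = np.concatenate([P, np.zeros_like(P[:, :, :1])], axis=2)
        P = P[:, :, 0::2] + P[:, :, 1::2]  # one parallel round
    return P[:, :, 0]

A = np.random.rand(8, 8)
B = np.random.rand(8, 8)
assert np.allclose(dns_simulated(A, B), A @ B)
</syntaxhighlight>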
 
==See also==