Matrix multiplication algorithm: Difference between revisions

Line 35:
| arxiv=2010.05846
| title = A Refined Laser Method and Faster Matrix Multiplication
| journal=TheoretiCS
| volume=3
| doi=10.46298/theoretics.24.21
Line 184:
| 1971 || Winograd<ref>{{cite journal |last=Winograd |first=Shmuel |author-link=Shmuel Winograd|title=On multiplication of 2×2 matrices |journal=Linear Algebra and Its Applications |volume=4 |issue= 4 |pages=381–388 |year=1971 |doi=10.1016/0024-3795(71)90009-7|doi-access=free }}</ref> || 7 || 15 || <math>6n^{\log_2 7}-5n^2</math> || <math>5\left(\frac{\sqrt{3}n}{\sqrt{M}}\right)^{\log_2 7}\cdot M-15n^2 +3M</math>
|-
| 2017 || Karstadt, Schwartz<ref>{{cite conference |url=https://dl.acm.org/doi/10.1145/3087556.3087579 |title=Matrix Multiplication, a Little Faster |last1=Karstadt |first1=Elaye |last2=Schwartz |first2=Oded |date=July 2017 |publisher= |book-title=Proceedings of the 29th ACM Symposium on Parallelism in Algorithms and Architectures |pages=101–110 |conference=SPAA '17 |doi=10.1145/3087556.3087579|url-access=subscription }}</ref> || 7 || 12 || <math>5n^{\log_2 7}-4n^2+3n^2\log_2n</math> || <math>4\left(\frac{\sqrt{3}n}{\sqrt{M}}\right)^{\log_2 7}\cdot M-12n^2 +3n^2\cdot\log_2\left(\frac{\sqrt{2}n}{\sqrt{M}}\right) +5M</math>
|-
| 2023 || Schwartz, Vaknin<ref>{{cite journal |url=https://doi.org/10.1137/22M1502719 |title=Pebbling Game and Alternative Basis for High Performance Matrix Multiplication |last1=Schwartz |first1=Oded |last2=Vaknin |first2=Noa |date=2023 |journal=SIAM Journal on Scientific Computing |pages=C277–C303 |doi=10.1137/22M1502719 |url-access=subscription }}</ref> || 7 || 12 || <math>5n^{\log_2 7}-4n^2+1.5n^2\log_2n</math> || <math>4\left(\frac{\sqrt{3}n}{\sqrt{M}}\right)^{\log_2 7}\cdot M-12n^2 +1.5n^2\cdot\log_2\left(\frac{\sqrt{2}n}{\sqrt{M}}\right) +5M</math>
|}
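
To make the table's operation counts concrete, the following minimal sketch (Python, written for this article rather than taken from the cited papers) performs one level of Winograd's 1971 variant of Strassen's algorithm: 7 multiplications and 15 additions/subtractions for a 2×2 product. Applied recursively to 2×2 block partitions, this scheme yields the <math>6n^{\log_2 7}-5n^2</math> arithmetic count listed above for Winograd (1971).

<syntaxhighlight lang="python">
def winograd_2x2(a11, a12, a21, a22, b11, b12, b21, b22):
    """One level of Winograd's (1971) variant of Strassen's algorithm:
    7 multiplications and 15 additions/subtractions for a 2x2 product.
    Entries are scalars here; in a recursive implementation they are
    sub-blocks and the 7 products become recursive calls."""
    # 8 pre-additions on the inputs
    s1 = a21 + a22
    s2 = s1 - a11
    s3 = a11 - a21
    s4 = a12 - s2
    t1 = b12 - b11
    t2 = b22 - t1
    t3 = b22 - b12
    t4 = t2 - b21
    # the 7 multiplications
    p1 = a11 * b11
    p2 = a12 * b21
    p3 = s4 * b22
    p4 = a22 * t4
    p5 = s1 * t1
    p6 = s2 * t2
    p7 = s3 * t3
    # 7 post-additions combine the products into the result
    u2 = p1 + p6
    u3 = u2 + p7
    u4 = u2 + p5
    return (p1 + p2,  # c11
            u4 + p3,  # c12
            u3 - p4,  # c21
            u3 + p5)  # c22
</syntaxhighlight>

Because no commutativity of the entries is used, the same scheme remains valid when the eight inputs are themselves matrix blocks, which is what makes the recursion possible.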
 
Line 215:
=== AlphaTensor ===
 
In 2022, [[DeepMind]] introduced AlphaTensor, a [[neural network]] that used a single-player game analogy to invent thousands of matrix multiplication algorithms, including some previously discovered by humans and some that were not.<ref>{{Cite web |title=Discovering novel algorithms with AlphaTensor |url=https://www.deepmind.com/blog/discovering-novel-algorithms-with-alphatensor |access-date=2022-11-01 |website=www.deepmind.com |date=5 October 2022 |language=en}}</ref> The search was restricted to bilinear, non-commutative algorithms, i.e. explicit decompositions of the matrix multiplication tensor, with coefficients taken either in normal arithmetic or in the [[GF(2)|finite field <math>\mathbb Z/2\mathbb Z</math>]] (mod 2 arithmetic). The best "practical" algorithm found (one given by an explicit low-rank decomposition of a matrix multiplication tensor) ran in O(n<sup>2.778</sup>).<ref name="alphatensor">{{Cite journal |last1=Fawzi |first1=Alhussein |last2=Balog |first2=Matej |last3=Huang |first3=Aja |last4=Hubert |first4=Thomas |last5=Romera-Paredes |first5=Bernardino |last6=Barekatain |first6=Mohammadamin |last7=Novikov |first7=Alexander |last8=R. Ruiz |first8=Francisco J. |last9=Schrittwieser |first9=Julian |last10=Swirszcz |first10=Grzegorz |last11=Silver |first11=David |last12=Hassabis |first12=Demis |last13=Kohli |first13=Pushmeet |date=October 2022 |title=Discovering faster matrix multiplication algorithms with reinforcement learning |journal=Nature |volume=610 |issue=7930 |pages=47–53 |doi=10.1038/s41586-022-05172-4 |pmid=36198780 |pmc=9534758 |bibcode=2022Natur.610...47F |issn=1476-4687}}</ref> Finding low-rank decompositions of such tensors (and beyond) is NP-hard; the optimal number of multiplications even for 3×3 matrices [[Computational complexity of matrix multiplication#Minimizing number of multiplications|remains unknown]], even over a commutative field.<ref name="alphatensor"/>

On 4×4 matrices, AlphaTensor unexpectedly discovered a solution with 47 multiplication steps, an improvement over the 49 required by Strassen's algorithm of 1969, albeit restricted to mod 2 arithmetic. Similarly, AlphaTensor solved 5×5 matrices with 96 rather than Strassen's 98 steps. Prompted by the surprising discovery that such improvements exist, other researchers quickly found a similar, independent 4×4 algorithm, and separately tweaked DeepMind's 96-step 5×5 algorithm down to 95 steps in mod 2 arithmetic and to 97<ref>{{Cite arXiv |last1=Kauers |first1=Manuel |last2=Moosbauer |first2=Jakob |date=2022-12-02 |title=Flip Graphs for Matrix Multiplication |class=cs.SC |eprint=2212.01175 }}</ref> in normal arithmetic.<ref>{{cite news |last1=Brubaker |first1=Ben |title=AI Reveals New Possibilities in Matrix Multiplication |url=https://www.quantamagazine.org/ai-reveals-new-possibilities-in-matrix-multiplication-20221123/ |access-date=26 November 2022 |work=Quanta Magazine |date=23 November 2022 |language=en}}</ref> Some algorithms were completely new: for example, (4, 5, 5) was improved to 76 steps from a baseline of 80 in both normal and mod 2 arithmetic.
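The tensor formulation that AlphaTensor searches over can be illustrated on a small case. The sketch below is illustrative code, not DeepMind's (their 47-multiplication 4×4 decomposition is too large to reproduce here): it encodes Strassen's classical rank-7 decomposition of the 2×2 multiplication tensor as coefficient matrices <math>U, V, W</math>. Each rank-1 term costs one scalar multiplication, so the rank of the decomposition is exactly the number of multiplication steps; reducing the coefficients mod 2 gives the <math>\mathbb Z/2\mathbb Z</math> setting in which AlphaTensor's 4×4 record holds.

<syntaxhighlight lang="python">
import numpy as np

# Strassen's rank-7 decomposition of the 2x2 matrix multiplication
# tensor.  Row r of U, V, W gives the coefficients of the r-th product
# m_r = (U[r] . a) * (V[r] . b) and its contribution c += W[r] * m_r,
# with A, B, C flattened row-major as [x11, x12, x21, x22].
U = np.array([[ 1, 0, 0,  1],   # m1 = (a11 + a22)(b11 + b22)
              [ 0, 0, 1,  1],   # m2 = (a21 + a22) b11
              [ 1, 0, 0,  0],   # m3 = a11 (b12 - b22)
              [ 0, 0, 0,  1],   # m4 = a22 (b21 - b11)
              [ 1, 1, 0,  0],   # m5 = (a11 + a12) b22
              [-1, 0, 1,  0],   # m6 = (a21 - a11)(b11 + b12)
              [ 0, 1, 0, -1]])  # m7 = (a12 - a22)(b21 + b22)
V = np.array([[ 1, 0, 0,  1],
              [ 1, 0, 0,  0],
              [ 0, 1, 0, -1],
              [-1, 0, 1,  0],
              [ 0, 0, 0,  1],
              [ 1, 1, 0,  0],
              [ 0, 0, 1,  1]])
W = np.array([[ 1, 0, 0,  1],   # c11 = m1 + m4 - m5 + m7, etc.
              [ 0, 0, 1, -1],
              [ 0, 1, 0,  1],
              [ 1, 0, 1,  0],
              [-1, 1, 0,  0],
              [ 0, 0, 0,  1],
              [ 1, 0, 0,  0]])

def multiply_via_decomposition(A, B, U, V, W):
    """Multiply 2x2 matrices using one scalar multiplication per
    rank-1 term of the decomposition (7 here, versus 8 naively).
    Reducing U, V, W mod 2 gives the analogous algorithm over Z/2Z."""
    a, b = A.reshape(4), B.reshape(4)
    m = (U @ a) * (V @ b)           # the 7 scalar products
    return (W.T @ m).reshape(2, 2)  # linear combinations only

A = np.arange(4).reshape(2, 2)
B = np.arange(4, 8).reshape(2, 2)
assert np.array_equal(multiply_via_decomposition(A, B, U, V, W), A @ B)
</syntaxhighlight>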
 
==Parallel and distributed algorithms==
Line 286:
The result is even faster on a two-layered cross-wired mesh, where only 2''n'' − 1 steps are needed.<ref>{{cite journal | last1 = Kak | first1 = S | year = 1988 | title = A two-layered mesh array for matrix multiplication | journal = Parallel Computing | volume = 6 | issue = 3| pages = 383–385 | doi = 10.1016/0167-8191(88)90078-6 | citeseerx = 10.1.1.88.8527 }}</ref> For repeated computations the performance improves further, approaching 100% efficiency.<ref>{{cite arXiv |last=Kak |first=S. |date=2014 |title=Efficiency of matrix multiplication on the cross-wired mesh array |class=cs.DC |eprint=1411.3273}}</ref> The cross-wired mesh array may be seen as a special case of a non-planar (i.e. multilayered) processing structure.<ref>{{cite journal | last1 = Kak | first1 = S | year = 1988 | title = Multilayered array computing | journal = Information Sciences | volume = 45 | issue = 3| pages = 347–365 | doi = 10.1016/0020-0255(88)90010-2 | citeseerx = 10.1.1.90.4753 }}</ref>
 
In a 3D mesh with ''n''<sup>3</sup> processing elements, two matrices can be multiplied in <math>\mathcal{O}(\log n)</math> time using the DNS algorithm.<ref>{{cite journal | last1 = Dekel | first1 = Eliezer | last2 = Nassimi | first2 = David | last3 = Sahni | first3 = Sartaj | year = 1981 | title = Parallel Matrix and Graph Algorithms | journal = SIAM Journal on Computing | volume = 10 | issue = 4 | pages = 657–675 | doi = 10.1137/0210049}}</ref>
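
The structure of the DNS algorithm can be sketched with a serial simulation (illustrative code, not the original formulation): each of the <math>n^3</math> virtual processors at position (''i'', ''j'', ''k'') performs the single product <math>A_{ik}B_{kj}</math>, and the partial results are then summed along the ''k'' dimension by pairwise tree reduction in <math>\lceil\log_2 n\rceil</math> rounds, which is the source of the <math>\mathcal{O}(\log n)</math> parallel time.

<syntaxhighlight lang="python">
import numpy as np

def dns_simulated(A, B):
    """Serial simulation of the DNS algorithm on an n x n x n mesh.
    Processor (i, j, k) computes the single product A[i, k] * B[k, j];
    a pairwise tree reduction over k then takes ceil(log2 n) parallel
    rounds, giving the O(log n) parallel running time."""
    # One scalar multiplication per processing element:
    # P[i, j, k] = A[i, k] * B[k, j]
    P = A[:, None, :] * B.T[None, :, :]
    # Tree reduction along k: halve the k-extent each round.
    while P.shape[2] > 1:
        if P.shape[2] % 2:  # pad odd extents with a zero slice
            P = np.concatenate([P, np.zeros_like(P[:, :, :1])], axis=2)
        P = P[:, :, 0::2] + P[:, :, 1::2]  # one parallel round
    return P[:, :, 0]

A = np.random.rand(8, 8)
B = np.random.rand(8, 8)
assert np.allclose(dns_simulated(A, B), A @ B)
</syntaxhighlight>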
 
==See also==