Matrix multiplication algorithm: Difference between revisions

Content deleted Content added
Monkbot (talk | contribs)
m Task 18 (cosmetic): eval 22 templates: hyphenate params (2×);
m As at target
Line 54:
In the idealized cache model, this algorithm incurs only {{math|Θ({{sfrac|''n''<sup>3</sup>|''b'' {{radic|''M''}}}})}} cache misses; the divisor {{math|''b'' {{radic|''M''}}}} amounts to several orders of magnitude on modern machines, so that the actual calculations dominate the running time, rather than the cache misses.<ref name="ocw"/>
 
==Divide -and -conquer algorithm==
An alternative to the iterative algorithm is the [[divide -and -conquer algorithm]] for matrix multiplication. This relies on the [[block matrix|block partitioning]]
 
:<math>C = \begin{pmatrix}
Line 88:
</math>
 
which consists of eight multiplications of pairs of submatrices, followed by an addition step. The divide -and -conquer algorithm computes the smaller multiplications [[recursion|recursively]], using the [[scalar multiplication]] {{math|''c''<sub>11</sub> {{=}} ''a''<sub>11</sub>''b''<sub>11</sub>}} as its base case.
 
The complexity of this algorithm as a function of {{mvar|n}} is given by the recurrence<ref name="clrs"/>
Line 164:
 
===Shared-memory parallelism===
The [[#Divide -and -conquer algorithm|divide -and -conquer algorithm]] sketched earlier can be [[parallel algorithm|parallelized]] in two ways for [[shared-memory multiprocessor]]s. These are based on the fact that the eight recursive matrix multiplications in
 
:<math>\begin{pmatrix}