Analysis of parallel algorithms

This article discusses the [[analysis of algorithms|analysis]] of [[parallel algorithm]]s running on [[parallel random-access machine]]s (PRAMs): models of [[parallel computing|parallel computation]] where multiple [[Central processing unit|processor]]s have [[shared memory architecture|shared access]] to a [[random-access memory]] (RAM). As in the analysis of "ordinary" sequential algorithms, one is typically interested in [[Asymptotic analysis|asymptotic]] bounds on the resource consumption (mainly time spent computing), but the analysis is performed in the presence of multiple processors that cooperate to perform computations. Thus, one can determine not only how many "steps" a computation takes, but also how much faster it becomes as the number of processors increases.
 
==Overview==
Suppose computations are executed on a machine that has {{mvar|p}} processors. Let {{mvar|T<sub>p</sub>}} denote the time that elapses between the start of the computation and its end. Analysis of the computation's [[Time complexity|running time]] focuses on the following notions:
 
* The ''work'' of a computation executed by {{mvar|p}} processors is the total number of primitive operations that the processors perform.<ref name="casanova">{{cite book |title=Parallel Algorithms |first1=Henri |last1=Casanova |first2=Arnaud |last2=Legrand |first3=Yves |last3=Robert |publisher=CRC Press |year=2008 |pages=10}}</ref> Ignoring communication overhead from synchronizing the processors, this is equal to the time used to run the computation on a single processor, denoted {{math|''T''<sub>1</sub>}}.
* The ''span'' (also called ''depth'' or ''critical path length'') is the time {{math|''T''<sub>∞</sub>}} the computation would take on an unlimited number of processors.
* The ''work law'' states that {{math|''T<sub>p</sub>'' ≥ ''T''<sub>1</sub> ∕ ''p''}}: in one step, {{mvar|p}} processors can perform at most {{mvar|p}} operations.
* The ''span law'' states that {{math|''T<sub>p</sub>'' ≥ ''T''<sub>∞</sub>}}: a machine with a limited number of processors cannot be faster than one with an unlimited number.
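The difference between work and parallel time can be illustrated with a small sketch (an illustration added here, with an assumed cost model of one time unit per addition and no synchronization overhead; it is not drawn from the cited sources). It simulates summing {{mvar|n}} numbers on {{mvar|p}} processors: each processor adds up a contiguous chunk, and the {{mvar|p}} partial sums are then combined. The work stays at {{math|''n'' − 1}} additions regardless of {{mvar|p}}, while {{mvar|T<sub>p</sub>}} shrinks as processors are added.

<syntaxhighlight lang="python">
# Toy cost model (assumption of this sketch): every addition costs one time unit,
# and synchronization and communication are free.
def simulate_parallel_sum(n, p):
    """Return (work, t_p) for summing n numbers on p processors."""
    assert 1 <= p <= n
    # Split the n numbers into p nearly equal contiguous chunks.
    chunk_sizes = [n // p + (1 if i < n % p else 0) for i in range(p)]
    per_proc_ops = [size - 1 for size in chunk_sizes]   # additions inside each chunk
    combine_ops = p - 1                                 # combining the p partial sums
    work = sum(per_proc_ops) + combine_ops              # total operations = n - 1
    t_p = max(per_proc_ops) + combine_ops               # chunks are summed concurrently
    return work, t_p

work_1, t_1 = simulate_parallel_sum(10**6, 1)    # T_1: everything on one processor
work_8, t_8 = simulate_parallel_sum(10**6, 8)
print(work_1, work_8)   # 999999 999999 -- the work does not change with p
print(t_1, t_8)         # 999999 125006 -- the parallel time drops roughly by a factor of p
</syntaxhighlight>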
Using these definitions and laws, the following measures of performance can be given:
 
* ''[[Speedup]]'' is the gain in speed made by parallel execution compared to sequential execution: {{math|''S<sub>p</sub>'' {{=}} ''T''<sub>1</sub> ∕ ''T<sub>p</sub>''}}. When the speedup is {{math|Ω(''p'')}} for {{mvar|p}} processors (using [[big O notation]]), the speedup is linear, which is optimal in simple models of computation because the work law implies that {{math|''T''<sub>1</sub> ∕ ''T<sub>p</sub>'' ≤ ''p''}} ([[Speedup#Super-linear speedup|super-linear speedup]] can occur in practice due to [[memory hierarchy]] effects). The situation {{math|''T''<sub>1</sub> ∕ ''T<sub>p</sub>'' {{=}} ''p''}} is called perfect linear speedup.<ref name="clrs"/> An algorithm that exhibits linear speedup is said to be [[scalability|scalable]].<ref name="casanova"/>
* ''Efficiency'' is the speedup per processor, {{math|''S<sub>p</sub>'' ∕ ''p''}}.<ref name="casanova"/>
* ''Parallelism'' is the ratio {{math|''T''<sub>1</sub> ∕ ''T''<sub>∞</sub>}}. It represents the maximum possible speedup on any number of processors. By the span law, the parallelism bounds the speedup: if {{math|''p'' > ''T''<sub>1</sub> ∕ ''T<sub>∞</sub>''}}, then {{math|''T''<sub>1</sub> ∕ ''T''<sub>p</sub> ≤ ''T''<sub>1</sub> ∕ ''T<sub>∞</sub>'' < ''p''}}.<ref name="clrs"/>
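As a worked example (the timings below are made up for illustration and are not taken from a source), the following sketch computes speedup, efficiency and parallelism from assumed values of {{math|''T''<sub>1</sub>}}, {{math|''T<sub>p</sub>''}} and {{math|''T''<sub>∞</sub>}}, and checks the bounds that the work and span laws impose on the speedup.

<syntaxhighlight lang="python">
def performance_measures(t1, tp, t_inf, p):
    """Speedup, efficiency and parallelism from measured or assumed running times."""
    speedup = t1 / tp                       # S_p = T_1 / T_p
    efficiency = speedup / p                # S_p / p
    parallelism = t1 / t_inf                # T_1 / T_inf, the best speedup achievable
    assert speedup <= p + 1e-9              # work law:  T_1 / T_p <= p
    assert speedup <= parallelism + 1e-9    # span law:  T_1 / T_p <= T_1 / T_inf
    return speedup, efficiency, parallelism

# Hypothetical example: T_1 = 1000 steps, T_inf = 50 steps, T_p = 150 steps on p = 8 processors.
s, e, par = performance_measures(1000, 150, 50, 8)
print(f"speedup = {s:.2f}, efficiency = {e:.2f}, parallelism = {par:.0f}")
# speedup = 6.67, efficiency = 0.83, parallelism = 20
</syntaxhighlight>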