Revision as of 12:09, 24 December 2019 edit John of Reading (talk \| contribs) Autopatrolled, Extended confirmed users, Pending changes reviewers 787,576 edits Typo fixing, replaced: are send → are sent (4), leafs → leaves (2) Tag: AWB ← Previous edit		Revision as of 08:15, 15 February 2020 edit undo Orangesherbet0 (talk \| contribs) 64 edits "The total size of a vector is defined as `n`" -> `m` was already defined as the total size of the vectors in question. Next edit →
Line 4: == Definition == Formally, reduce takes an [[Associative property\|associative]] (but not necessarily [[Commutative property\|commutative]]) operator <math>\oplus</math>, which can be evaluated in constant time and an input set <math>V = \{v_0 = \begin{pmatrix} e_0^0 \\ \vdots \\ e_0^{m-1}\end{pmatrix}, v_1 = \begin{pmatrix} e_1^0 \\ \vdots \\ e_1^{m-1}\end{pmatrix}, \dots, v_{p-1} = \begin{pmatrix} e_{p-1}^0 \\ \vdots \\ e_{p-1}^{m-1}\end{pmatrix}\}</math>of <math>p</math> vectors with <math>m</math> elements each~~. The total size of a vector is defined as <math>n</math>~~. The result <math>r</math> of the operation is the combination of the elements <math>r = \begin{pmatrix} e_0^0 \oplus e_1^0 \oplus \dots \oplus e_{p-1}^0 \\ \vdots \\ e_0^{m-1} \oplus e_1^{m-1} \oplus \dots \oplus e_{p-1}^{m-1}\end{pmatrix} = \begin{pmatrix} \bigoplus_{i=0}^{p-1} e_i^0 \\ \vdots \\ \bigoplus_{i=0}^{p-1} e_i^{m-1} \end{pmatrix}</math> and has to be stored at a specified root processor at the end of the execution. For example, the result of a reduction on the set <math>\{3,5,7,9\}</math>, where all vectors have size one is <math>3 + 5 + 7 + 9 = 24</math>. If the result <math>r</math> has to be available at every processor after the computation has finished, it is often called Allreduce. An optimal sequential linear-time algorithm for reduction can apply the operator successively from front to back, always replacing two vectors with the result of the operation applied to all its elements, thus creating an instance that has one vector less. It needs <math>(p-1)\cdot m </math> steps until only <math>r</math> is left. Sequential algorithms can not perform better than linear time, but parallel algorithms leave some space left to optimize. == Binomial tree algorithms ==

Reduce (parallel pattern): Difference between revisions