Revision as of 21:34, 28 June 2019 edit Elmidae (talk \| contribs) Extended confirmed users, Page movers, New page reviewers, Rollbackers 48,523 edits Nominated page for deletion using Page Curation (afd) ← Previous edit		Revision as of 18:03, 29 June 2019 edit undo McGucket (talk \| contribs) 130 edits Improve language. Next edit →
Line 4: <!-- End of AfD message, feel free to edit beyond this point --> ~~The~~ <math>d</math>-dimensional '''[[hypercube]]''' is a network topology for parallel computers with <math>2^d</math> processing elements. The topology allows for an efficient implementation of some basic communication primitives such as [[Broadcasting (networking)\|Broadcast]], All-[[Reduce (parallel pattern)\|Reduce]], and [[Prefix sum]].<ref>Grama, A.(2003). Introduction to Parallel Computing. Addison Wesley; Auflage: 2 ed. {{ISBN\|978-0201648652}}.</ref> The processing elements are numbered ~~from~~ <math>0</math> tothrough <math>2^d - 1</math>. Each processing element is ~~then~~ adjacent to processing elements whose numbers differ in ~~exactly~~one and only one bit. The algorithms described onin this page utilize this structure efficiently. == Algorithm Outline == Line 21: '''Output''' Each processing element iterates over its neighbors (the expression <math>i \text{ XOR } 2^k</math> negates the <math>k</math>-th bit in <math>i</math>'s binary representation, therefore obtaining the numbers of its neighbors). ~~During~~In aneach iteration, each processing element exchanges a message with the neighbor and processes the received message afterwards. The processing operation depends on the communication primitive. [[File:Hypergraph Communication Pattern.png\|thumb\|Algorithm outline applied to the <math>3</math>-dimensional hypercube. In the first step (before any communication), each processing element possesses one message (blue). Communication is marked red. After each step, the processing elements store the received message, but other operations are also possible.]] Line 28: === Prefix sum === AtIn the beginning of a [[prefix sum]] operation, each processing element <math>i</math> owns a message <math>m_i</math>. The goal is to ~~calculate~~compute <math>\bigoplus_{0 \le j \le i} m_j</math>, where <math>\oplus</math> is an associative operation. The following pseudo code describes the algorithm. '''Input''': message <math>m_i</math> of processor <math>i</math>. Line 83: The ESBT-broadcast (Edge-disjoint Spanning Binomial Tree) algorithm<ref>{{cite journal\|last1=Johnsson\|first1=S.L.\|last2=Ho\|first2=C.-T.\|title=Optimum broadcasting and personalized communication in hypercubes\|journal=IEEE Transactions on Computers\|volume=38\|issue=9\|year=1989\|pages=1249–1268\|issn=0018-9340\|doi=10.1109/12.29465}}</ref> is a pipelined broadcast algorithm with optimal runtime for clusters with hypercube network topology. The algorithm embeds <math>d</math> edge-disjoint binomial trees in the hypercube, such that each neighbor of processing element <math>0</math> is the root of a spanning binomial tree on <math>2^d - 1</math> nodes. To broadcast a message, the source node splits its message into <math>k</math> chunks of equal size and cyclically sends them to the roots of the binomial trees. Upon receiving a chunk, the binomial trees broadcast it. === Runtime === ~~The runtime of this algorithm is as follows.~~ In each step, the source node sends one of its <math>k</math> chunks to a binomial tree. Broadcasting the chunk within the binomial tree takes <math>d</math> steps. Thus, it takes <math>k</math> steps to distribute all chunks and additionally <math>d</math> steps until the last binomial tree broadcast has finished, resulting in <math>k + d</math> steps overall. Therefore, the runtime for a message of length <math>n</math> is <math>T(n, p, k) = \left(\frac{n}{k} T_\text{byte} + T_\text{start} \right) (k + d)</math>. With the optimal chunk size <math>k^* = \sqrt{\frac{nd \cdot T_\text{byte}}{T_\text{start}}}</math>, the optimal runtime of the algorithm is <math>T^*(n, p) = n \cdot T_\text{byte} + \log(p) \cdot T_\text{start} + \sqrt{n \log(p) \cdot T_\text{start} \cdot T_\text{byte}}</math>. === Construction of the Binomial Trees === Line 89 ⟶ 90: [[File:HypergraphESBT.png\|thumb\|A <math>3</math>-dimensional hypercubes with three ESBT embedded.]] This section describes how to construct the binomial trees systematically. First, construct a single binomial spanning tree von <math>2^d</math> nodes as follows. Number the nodes from <math>0</math> to <math>2^d - 1</math> and consider their binary representation. Then the children of each nodes are obtained by negating single leading zeroes. This results in a single binomial spanning tree. To obtain <math>d</math> edge-disjoint copies of the tree, translate and rotate the nodes: for the <math>k</math>-th copy of the tree, apply a XOR operation with <math>2^k</math> to each node. ~~Afterwards~~Subsequently, right -rotate all nodes by <math>k</math> digits. The resulting binomial trees are edge-disjoint and therefore, fulfill the requirements for the ESBT-broadcasting algorithm. ==References==

Hypercube (communication pattern): Difference between revisions