Content deleted Content added
fix description |
See also Hypercube internetwork topology |
||
(12 intermediate revisions by 10 users not shown) | |||
Line 1:
▲The <math>d</math>-dimensional '''[[hypercube]]''' is a network topology for parallel computers with <math>2^d</math> processing elements. The topology allows for an efficient implementation of some basic communication primitives such as [[Broadcasting_(networking)|Broadcast]], All-[[Reduce_(parallel_pattern)|Reduce]] and [[Prefix sum]].<ref>Grama, A.(2003). Introduction to Parallel Computing. Addison Wesley; Auflage: 2 ed. ISBN: 978-0201648652.</ref> The processing elements are numbered from <math>0</math> to <math>2^d - 1</math>. Each processing elements is then adjacent to processing elements whose numbers differ in exactly one bit. The algorithms described on this page utilize this structure efficiently.
Most of the communication primitives presented in this article share a common template.<ref>Foster, I.(1995). Designing and Building Parallel Programs: Concepts and Tools for Parallel Software Engineering. Addison Wesley; {{ISBN
▲== Algorithm Outline ==
▲Most of the communication primitives presented in this article share a common template.<ref>Foster, I.(1995). Designing and Building Parallel Programs: Concepts and Tools for Parallel Software Engineering. Addison Wesley; ISBN: 0201575949.</ref> Initially, each processing element possesses one message that must reach every other processing element during the course of the algorithm. The following pseudo code sketches the communication steps necessary. Hereby, '''Initialization''', '''Operation''' and '''Output''' are placeholders that depend on the given communication primitive (see next section).
'''Input:''' message <math>m</math>.
Line 13 ⟶ 11:
<math>y := i \text{ XOR } 2^k</math>
'''Send''' <math>s</math> '''to''' <math>y</math>
'''
'''Operation'''<math>(s, m)</math>
'''endfor'''
'''Output'''
Each processing element iterates over its neighbors (the expression <math>i \text{ XOR } 2^k</math> negates the <math>k</math>-th bit in <math>i</math>'s binary representation, therefore obtaining the numbers of its neighbors).
[[File:Hypergraph Communication Pattern.png|thumb|Algorithm outline applied to the <math>3</math>-dimensional hypercube. In the first step (before any communication), each processing element possesses one message (blue). Communication is marked red. After each step, the processing elements store the received message, but other operations are also possible.]]
== Communication
=== Prefix sum ===
'''Input''': message <math>m_i</math> of processor <math>i</math>.
'''Output''': prefix sum <math>\bigoplus_{0 \le j \le i} m_j</math> of processor <math>i</math>.
<math>x := m_i</math>
<math>\sigma := m_i</math>
Line 45 ⟶ 43:
[[File:Hypergraph Communication Steps for Prefix Sum.png|thumb|Example for a prefix sum calculation. Upper number: tentatetive prefix sum (variable <math>x</math>). Lower number: sum over all elements in the sub cube (variable <math>\sigma</math>).]]
=== All-
'''All-
'''Input''': message <math>x := m_i</math> at processing unit <math>i</math>.
'''Output''': all messages <math>m_1 \cdot m_2 \dots m_p</math>.
<math>x := m_i</math>
Line 62 ⟶ 60:
The same principle can be applied to the '''All-Reduce''' operations, but instead of concatenating the messages, it performs a reduction operation on the two messages. So it is a '''Reduce''' operation, where all processing units know the result. Compared to a normal reduce operation followed by a broadcast, All-Reduce in hypercubes reduces the number of communication steps.
=== All-to-
Here every processing element has a unique message for all other processing elements.
'''Input:''' message <math>m_{ij}</math> at processing element <math>i</math> to processing element <math>j</math>.
'''for''' <math>d > k \geq 0</math> '''do'''
'''Receive''' from processing element <math>i \text{ XOR } 2^k</math>:
all messages for my <math>k</math>-dimensional sub cube
'''Send''' to processing element <math>i \text{ XOR } 2^k</math>:
all messages for its <math>k</math>-dimensional sub cube
'''endfor'''
With each iteration a messages comes closer to its destination by one dimension, if it hasn't arrived yet. Hence, all messages have reached their target after at most <math>d = \log{p}</math> steps. In every step, <math>p / 2</math> messages are sent: in the first iteration, half of the messages aren't meant for the own sub cube. In every following step, the sub cube is only half the size as before, but in the previous step exactly the same number of messages arrived from another processing element.
This results in a run-time of <math>T(n,p) \approx \log{p} (T_\text{start} + \frac{p}{2}nT_\text{byte})</math>.
== ESBT-
The ESBT-broadcast (Edge-disjoint Spanning Binomial Tree) algorithm<ref>{{cite journal|last1=Johnsson|first1=S.L.|last2=Ho|first2=C.-T.|title=Optimum broadcasting and personalized communication in hypercubes|journal=IEEE Transactions on Computers|volume=38|issue=9|year=1989|pages=1249–1268|issn=
=== Runtime ===
=== Construction of the
[[File:HypergraphESBT.png|thumb|A <math>3</math>-dimensional hypercubes with three ESBT embedded.]]
This section describes how to construct the binomial trees systematically. First, construct a single binomial spanning tree von <math>2^d</math> nodes as follows. Number the nodes from <math>0</math> to <math>2^d - 1</math> and consider their binary representation. Then the children of each nodes are obtained by negating single leading zeroes. This results in a single binomial spanning tree. To obtain <math>d</math> edge-disjoint copies of the tree, translate and rotate the nodes: for the <math>k</math>-th copy of the tree, apply a XOR operation with <math>2^k</math> to each node.
== See also ==
* [[Hypercube internetwork topology]]
==
{{Reflist}}
[[
|