Loop-level parallelism

=== DOALL parallelism ===
 
=== DISTRIBUTED loop ===
When a loop has a loop-carried dependence, another way to parallelize it is to distribute the loop into several different loops. Statements that are not dependent on each other are separated into distinct loops so that the distributed loops can be executed in parallel. For example, consider the following code.
<syntaxhighlight lang="c">
for (int i = 1; i < n; i++) {
S1:     a[i] = a[i - 1] + b[i];
S2:     c[i] = c[i] + d[i];
}
</syntaxhighlight>
The loop has a loop-carried dependence <code>S1[i] ->T S1[i + 1]</code>, but S1 and S2 have no loop-carried dependence between them, so the code can be rewritten as follows.
<syntaxhighlight>
loop1: for (int i = 1; i < n; i ++) {
S1: a[i] = a[i -1] + b[i];
loop2: for (int i = 1; i < n; i ++) {
S2: c[i] = c[i] + d[i];
</syntaxhighlight>
Note that loop1 and loop2 can now be executed in parallel. Instead of a single instruction being performed in parallel on different data, as in data-level parallelism, here different loops perform different tasks on different data. This type of parallelism is called either function or task parallelism.
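
As a minimal sketch of how the two distributed loops could be dispatched to separate threads, the code below uses OpenMP sections; OpenMP, the function name <code>distributed_loops</code>, and the array parameters are assumptions made for this illustration and are not part of the technique's definition.

<syntaxhighlight lang="c">
#include <omp.h>

/* Illustrative sketch: run the two distributed loops as independent tasks.
   The function name and parameters are hypothetical. */
void distributed_loops(int n, double a[], const double b[],
                       double c[], const double d[])
{
    #pragma omp parallel sections
    {
        #pragma omp section
        for (int i = 1; i < n; i++)     /* loop1: stays sequential, because */
            a[i] = a[i - 1] + b[i];     /* a[i] depends on a[i - 1]         */

        #pragma omp section
        for (int i = 1; i < n; i++)     /* loop2: independent of loop1, so  */
            c[i] = c[i] + d[i];         /* it may run on another thread     */
    }
}
</syntaxhighlight>

Each <code>section</code> may be assigned to a different thread, so the loop that carries the dependence still runs sequentially while the independent loop proceeds concurrently; an OpenMP-capable compiler (for example <code>gcc -fopenmp</code>) is assumed.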
 
== References ==