Loop-level parallelism

|}
 
In order to preserve the sequential behaviour of a loop when run in parallel, true dependence must be preserved. Anti-dependence and output dependence can be dealt with by giving each process its own copy of the variables involved (known as privatization). <ref name="Solihin" />
 
=== Example of True Dependence ===
</syntaxhighlight>
Note that loop1 and loop2 can now be executed in parallel. Instead of a single instruction being performed in parallel on different data, as in data-level parallelism, here different loops perform different tasks on different data. This type of parallelism is called either function or task parallelism.
 
Let the execution times of S1 and S2 be <math>T_{S1}</math> and <math>T_{S2}</math>. The execution time for the sequential form of the above code is then <math>n \cdot (T_{S1}+T_{S2})</math>. Because the two statements have been split into two different loops, the execution time becomes <math>n \cdot T_{S1} + T_{S2}</math>.
 
=== DOALL parallelism ===
</source>
 
Let the execution time of one iteration of S1 be <math>T_{S1}</math>. The execution time for the sequential form of the above code is then <math>n \cdot T_{S1}</math>. Because DOALL parallelism exists when all iterations are independent, speed-up may be achieved by executing all iterations in parallel, giving an execution time of <math>T_{S1}</math>, the time taken for one iteration in sequential execution.
 
The following example, using simplified pseudocode, shows how a loop might be parallelized to execute each iteration independently.
<source>
begin_parallelism();
for (int i = 0; i < n; i++) {
    S1: a[i] = b[i] + c[i];
}
end_parallelism();
* Assign the value to <code>a[i]</code>
 
Calculating the value <code>a[i - 1] + b[i] + 1</code> and then performing the assignment can be decomposed into two statements (S1 and S2):
 
<source>
S1: int tmp = b[i] + 1;
S2: a[i] = a[i - 1] + tmp;
</source>
 
for (int i = 1; i < n; i++) {
    S1: int tmp = b[i] + 1;
    wait(i - 1);
    S2: a[i] = a[i - 1] + tmp;
    post(i);
}
</source>
Let the execution times of S1 and S2 be <math>T_{S1}</math> and <math>T_{S2}</math>. The execution time for the sequential form of the above code is then <math>n \cdot (T_{S1}+T_{S2})</math>. Because DOACROSS parallelism exists, speed-up may be achieved by executing iterations in a pipelined fashion, giving an execution time of <math>T_{S1} + n \cdot T_{S2}</math>.
 
=== DOPIPE parallelism ===
}
</source>
Let the execution times of S1 and S2 be <math>T_{S1}</math> and <math>T_{S2}</math>. The execution time for the sequential form of the above code is then <math>n \cdot (T_{S1}+T_{S2})</math>. Because DOPIPE parallelism exists, speed-up may be achieved by executing iterations in a pipelined fashion, giving an execution time of <math>n \cdot T_{S1} + (n/p) \cdot T_{S2}</math>, where <math>p</math> is the number of processors running in parallel.
 
== See also ==