'''Loop-level parallelism''' is a form of [[parallelism (computing)|parallelism]] in [[software programming]] that is concerned with extracting parallel tasks from [[Control_flow#Loops|loops]]. The opportunity for loop-level parallelism often arises in programs where data is stored in [[random access]] [[data structures]]. Where a sequential program iterates over the data structure and operates on indices one at a time, a program exploiting loop-level parallelism uses multiple [[thread (computing)|threads]] or [[process (computing)|processes]] which operate on some or all of the indices at the same time. Such parallelism provides a [[speedup]] to the overall execution time of the program, typically in line with [[Amdahl's law]].
 
== Description ==
 
For simple loops, where each iteration is independent of the others, loop-level parallelism can be [[embarrassingly parallel]], as parallelizing only requires assigning a process to handle each iteration. However, many algorithms are designed to run sequentially and fail when parallel processes [[race condition|race]] due to dependences within the code. Sequential algorithms are sometimes applicable to parallel contexts with slight modification, but they usually require [[synchronization (computer science)|process synchronization]]. Synchronization can be implicit, via [[message passing]], or explicit, via synchronization primitives such as [[Semaphore (programming)|semaphores]].
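
As a minimal sketch of explicit synchronization (assuming a C compiler with [[OpenMP]] support; the array, its contents, and the accumulator <code>sum</code> are illustrative, not taken from any particular algorithm), consider a loop whose iterations all update a shared variable:

<source lang="c">
#include <stdio.h>

int main(void) {
    enum { N = 1000 };
    long L[N];
    for (int i = 0; i < N; i++)
        L[i] = i;

    long sum = 0;
    /* The iterations all update the shared variable sum, so they are
       not independent: running them in parallel without
       synchronization would race. The critical section serializes
       each update so the result stays correct. */
    #pragma omp parallel for
    for (int i = 0; i < N; i++) {
        #pragma omp critical
        sum += L[i];
    }

    printf("sum = %ld\n", sum);   /* 499500 */
    return 0;
}
</source>

In practice, an OpenMP <code>reduction</code> clause would express the same computation without serializing every update; the critical section above simply makes the synchronization explicit.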
 
== Example ==
Consider the following code operating on a list <code>L</code> of length <code>n</code>.
 
<source lang="c">
for (int i = 0; i < n; i++) {
    S1: L[i] = L[i] + 10;
}
</source>
 
Each iteration of the loop takes the value at the current index of <code>L</code> and increments it by 10. If statement <code>S1</code> takes <code>T</code> time to execute, then the loop takes time <code>n * T</code> to execute sequentially, ignoring time taken by loop constructs. Now, consider a system with <code>p</code> processors, where <code>p ≥ n</code>. If <code>n</code> threads run in parallel, the time to execute all <code>n</code> steps is reduced to <code>T</code>.
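
A minimal sketch of this parallel execution, again assuming a C compiler with OpenMP support (the example above does not prescribe any particular threading mechanism; OpenMP is just one way to distribute iterations across threads):

<source lang="c">
void add_ten(int *L, int n) {
    /* Each iteration touches a distinct index of L, so the
       iterations are independent and need no synchronization.
       OpenMP distributes them across the available threads; with
       p >= n processors, all n iterations can run concurrently and
       the loop finishes in roughly the time T of one iteration. */
    #pragma omp parallel for
    for (int i = 0; i < n; i++) {
        L[i] = L[i] + 10;   /* statement S1 */
    }
}
</source>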
 
== Dependencies in code ==