Parallel programming model: Difference between revisions

Content deleted Content added
Bender the Bot (talk | contribs)
 
(51 intermediate revisions by 35 users not shown)
Line 1:
{{Short description|Abstraction of parallel computer architecture}}
In computer software, a '''parallel programming model''' is a model for writing [[parallel program]]s which can be compiled and executed. The value of a programming model can be judged on its generality: how well a range of different problems can be expressed for a variety of different architectures, and its performance: how efficiently they execute. The implementation of a programming model can take several forms such as libraries invoked from traditional [[sequential programming|sequential]] languages, language extensions, or complete new execution models.
In [[computing]], a '''parallel programming model''' is an [[Abstraction (software engineering)|abstraction]] of [[parallel computing|parallel computer]] architecture, with which it is convenient to express [[algorithms]] and their composition in [[Computer program|programs]]. The value of a programming model can be judged on its ''generality'': how well a range of different problems can be expressed for a variety of different architectures, and its ''performance'': how efficiently the compiled programs can execute.<ref>Skillicorn, David B., "Models for practical parallel computation", International Journal of Parallel Programming, 20.2 133–158 (1991), https://www.ida.liu.se/~chrke55/papers/modelsurvey.pdf</ref> The implementation of a parallel programming model can take the form of a [[Library (computing)|library]] invoked from a [[programming language]], or of an extension to an existing language.
 
Consensus around a particular programming model is important because it leads to different parallel computers being built with support for the model, thereby facilitating [[Software portability|portability]] of software. In this sense, programming models are referred to as ''[[bridging model|bridging]]'' between hardware and software, meaning that high-level languages can be efficiently compiled to the model and the model can be efficiently implemented in hardware.<ref name="Valiant1990">Leslie G. Valiant, "A bridging model for parallel computation", Communications of the ACM, Volume 33, Issue 8, August, 1990, pages 103–111.</ref>
 
==Classification of parallel programming models==
==Main classifications and paradigms==
Classifications of parallel programming models can be divided broadly into two areas: process interaction and problem decomposition.<ref>John E. Savage, Models of Computation: Exploring the Power of Computing, 2008, Chapter 7 (Parallel Computation), https://cs.brown.edu/~jes/book/ {{Webarchive|url=https://web.archive.org/web/20161105053330/http://cs.brown.edu/~jes/book/ |date=2016-11-05 }}</ref><ref>{{Cite web |title=1.3 A Parallel Programming Model |url=https://www.mcs.anl.gov/~itf/dbpp/text/node9.html |access-date=2024-03-21 |website=www.mcs.anl.gov}}</ref><ref name=":0">{{Cite web |title=Introduction to Parallel Computing Tutorial {{!}} HPC @ LLNL |url=https://hpc.llnl.gov/documentation/tutorials/introduction-parallel-computing-tutorial |access-date=2024-03-21 |website=hpc.llnl.gov}}</ref>
 
 
===Process interaction===
Process interaction relates to the mechanisms by which parallel processes are able to communicate with each other. The most common forms of interaction are shared memory and message passing, but interaction can also be implicit (invisible to the programmer).
 
Process interaction relates to the mechanisms by which parallel processes are able to communicate with each other. The most common forms of interaction are shared memory and message passing, but it can also be implicit.
====Shared memory====
{{main|Shared memory (interprocess communication)}}
Shared memory is an efficient means of passing data between processes. In a shared-memory model, parallel processes share a global address space that they read and write to asynchronously. Asynchronous concurrent access can lead to [[race condition]]s, and mechanisms such as [[Lock (computer science)|locks]], [[Semaphore (programming)|semaphores]] and [[Monitor (synchronization)|monitors]] can be used to avoid these. Conventional [[multi-core processor]]s directly support shared memory, which many parallel programming languages and libraries, such as [[Cilk (programming language)|Cilk]], [[OpenMP]] and [[Threading Building Blocks]], are designed to exploit.
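The role of a lock in the shared-memory model can be illustrated with a minimal Python sketch. It is not drawn from any of the libraries named above; the function name <code>parallel_count</code> and its parameters are hypothetical, chosen only for illustration. Several threads update one variable in their common address space, and a lock serialises each read-modify-write so that no update is lost. In CPython the global interpreter lock limits actual speed-up for this kind of work, so the sketch shows the programming model rather than a performance technique.

```python
import threading


def parallel_count(num_threads: int, increments: int) -> int:
    """Let several threads increment one shared counter under a lock."""
    counter = 0                  # lives in the address space shared by all threads
    lock = threading.Lock()      # guards the read-modify-write on `counter`

    def worker() -> None:
        nonlocal counter
        for _ in range(increments):
            with lock:           # without the lock, concurrent updates could be lost
                counter += 1

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()                 # wait until every worker has finished
    return counter
```

Calling <code>parallel_count(4, 1000)</code> starts four threads that each perform 1000 protected increments on the same counter.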
 
Shared memory is an efficient means of passing data between programs. Depending on context, programs may run on a single processor or on multiple separate processors. In this model, parallel tasks share a global address space which they read and write to asynchronously. This requires protection mechanisms such as locks, semaphores and monitors to control concurrent access. Shared memory can be emulated on distributed-memory systems, but non-uniform memory access (NUMA) times can come into play. Sometimes memory is also shared between different sections of code in the same program; for example, a for loop can create a thread for each iteration, each of which updates a variable in parallel.
 
====Message passing====
{{main|Message passing}}
In a message-passing model, parallel processes exchange data through passing messages to one another. These communications can be asynchronous, where a message can be sent before the receiver is ready, or synchronous, where the receiver must be ready. The [[Communicating sequential processes]] (CSP) formalisation of message passing uses synchronous communication channels to connect processes, and led to important languages such as [[Occam (programming language)|Occam]], [[Limbo (programming language)|Limbo]] and [[Go (programming language)|Go]]. In contrast, the [[actor model]] uses asynchronous message passing and has been employed in the design of languages such as [[D (programming language)|D]], [[Scala (programming language)|Scala]] and SALSA.
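Asynchronous message passing can be sketched in Python with the standard <code>queue</code> and <code>threading</code> modules. The function name <code>sum_via_messages</code> and the sentinel <code>DONE</code> are hypothetical names used only for this illustration. An unbounded queue acts as the channel, so the sender can deposit a message before the receiver is ready. A synchronous (rendezvous) channel as used in CSP is not shown.

```python
import queue
import threading


def sum_via_messages(values):
    """A producer sends values to a consumer, which returns their sum."""
    channel = queue.Queue()   # unbounded: put() returns before the receiver is ready
    result = queue.Queue()    # carries the single reply message back to the caller
    DONE = object()           # sentinel message marking the end of the stream

    def producer():
        for v in values:
            channel.put(v)
        channel.put(DONE)

    def consumer():
        total = 0
        while True:
            msg = channel.get()   # blocks until a message is available
            if msg is DONE:
                break
            total += msg
        result.put(total)

    threads = [threading.Thread(target=producer),
               threading.Thread(target=consumer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return result.get()
```

The two threads never mutate a common variable directly; all data moves through the queues, which is the defining property of the model.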
 
Message passing is a concept from computer science that is used extensively in the design and implementation of modern software applications; it is key to some models of concurrency and object-oriented programming. In a message-passing model, parallel tasks exchange data through passing messages to one another. These communications can be asynchronous or synchronous. The Communicating Sequential Processes (CSP) formalisation of message passing employed communication channels to 'connect' processes, and led to a number of important languages such as Joyce, occam and Erlang.

====Partitioned global address space====
{{main|Partitioned global address space}}
Partitioned Global Address Space (PGAS) models provide a middle ground between shared memory and message passing. PGAS provides a global memory address space abstraction that is logically partitioned, where a portion is local to each process. Parallel processes communicate by asynchronously performing operations (e.g. reads and writes) on the global address space, in a manner reminiscent of shared memory models. However, by semantically partitioning the global address space into portions with affinity to a particular process, they allow programmers to exploit [[locality of reference]] and enable efficient implementation on [[distributed memory]] parallel computers. PGAS is offered by many parallel programming languages and libraries, such as [[Fortran 2008]], [[Chapel (programming language)|Chapel]], [http://upcxx.lbl.gov UPC++], and [[SHMEM]].
 
====Implicit interaction====
{{main|Implicit parallelism}}
In an implicit model, no process interaction is visible to the programmer and instead the compiler and/or runtime is responsible for performing it. Two examples of implicit parallelism are with [[domain-specific language]]s where the concurrency within high-level operations is prescribed, and with [[functional programming|functional programming languages]] because the absence of [[Side effect (computer science)|side-effects]] allows non-dependent functions to be executed in parallel.<ref name="ParFuncProg">Hammond, Kevin. Parallel functional programming: An introduction. In International Symposium on Parallel Symbolic Computation, p. 46. 1994.</ref> However, this kind of parallelism is difficult to manage<ref>McBurney, D. L., and M. Ronan Sleep. "Transputer-based experiments with the ZAPP architecture." PARLE Parallel Architectures and Languages Europe. Springer Berlin Heidelberg, 1987.</ref> and functional languages such as [[Concurrent Haskell]] and [[Concurrent ML]] provide features to manage parallelism explicitly and correctly.
 
In an implicit model, no process interaction is visible to the programmer; instead, the compiler and/or runtime is responsible for performing it. This is most common with domain-specific languages, where the concurrency within a problem can be more prescribed.
 
===Problem decomposition===
A parallel program is composed of simultaneously executing processes. Problem decomposition relates to the way in which the constituent processes are formulated.<ref>{{Cite web |title=2.2 Partitioning |url=https://www.mcs.anl.gov/~itf/dbpp/text/node16.html |access-date=2024-03-21 |website=www.mcs.anl.gov}}</ref><ref name=":0" />
{{clear}}
{{Flynn's Taxonomy}}
 
A parallel program is composed of simultaneously executing processes. Problem decomposition relates to the way in which these processes are formulated. This classification may also be referred to as [[algorithmic skeleton]]s or parallel programming paradigms.
 
====Task parallelism====
{{main|Task parallelism}}
A task-parallel model focuses on processes, or threads of execution. These processes will often be behaviourally distinct, which emphasises the need for communication. Task parallelism is a natural way to express message-passing communication. In [[Flynn's taxonomy]], task parallelism is usually classified as [[Multiple instruction, multiple data|MIMD]]/[[Flynn's taxonomy#MPMD|MPMD]] or [[Multiple instruction, single data|MISD]].
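A minimal Python sketch of task parallelism, using the standard <code>concurrent.futures</code> module, is given below. The functions <code>word_count</code>, <code>longest_word</code> and <code>run_tasks</code> are hypothetical examples, not part of any library. The point is that the concurrently running tasks are behaviourally distinct: each executes different code over the same input.

```python
from concurrent.futures import ThreadPoolExecutor


def word_count(text: str) -> int:
    """Task A: count the words in the text."""
    return len(text.split())


def longest_word(text: str) -> str:
    """Task B: find the longest word (the first one wins on ties)."""
    return max(text.split(), key=len)


def run_tasks(text: str):
    """Run two different tasks concurrently and collect both results."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        count_future = pool.submit(word_count, text)      # task A
        longest_future = pool.submit(longest_word, text)  # task B
        return count_future.result(), longest_future.result()
```

Here the only communication is the return of each result through its future; tasks that must cooperate while running would typically exchange messages instead.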
 
A task-parallel model focuses on processes, or threads of execution. These processes will often be behaviourally distinct, which emphasises the need for communication. Task parallelism is a natural way to express message-passing communication. It is usually classified as [[MIMD]]/[[Flynn's taxonomy#MPMD|MPMD]] or [[MISD]].
 
====Data parallelism====
{{main|Data parallelism}}
A data-parallel model focuses on performing operations on a data set, typically a regularly structured array. A set of tasks will operate on this data, but independently on disjoint partitions. In [[Flynn's taxonomy]], data parallelism is usually classified as [[Multiple instruction, multiple data|MIMD]]/[[SPMD]] or [[Single instruction, multiple data|SIMD]].
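The idea of applying one operation independently to disjoint partitions can be sketched in Python as follows. The helper names <code>split</code> and <code>parallel_sum_of_squares</code> are hypothetical and introduced only for this illustration. Every worker runs the same code on its own chunk, and the partial results are combined at the end.

```python
from concurrent.futures import ThreadPoolExecutor


def split(data, parts):
    """Split `data` into `parts` contiguous, disjoint chunks of near-equal size."""
    size, extra = divmod(len(data), parts)
    chunks, start = [], 0
    for i in range(parts):
        end = start + size + (1 if i < extra else 0)  # first `extra` chunks get one more item
        chunks.append(data[start:end])
        start = end
    return chunks


def sum_of_squares(chunk):
    """The single operation that every worker applies to its own partition."""
    return sum(x * x for x in chunk)


def parallel_sum_of_squares(data, parts=4):
    """Apply the same operation to each partition, then combine the partial sums."""
    with ThreadPoolExecutor(max_workers=parts) as pool:
        partials = list(pool.map(sum_of_squares, split(data, parts)))
    return sum(partials)
```

Because the partitions do not overlap, the workers need no locks; the only coordination is the final reduction over the partial sums.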
 
A data-parallel model focuses on performing operations on a data set which is usually regularly structured in an array. A set of tasks will operate on this data, but independently on separate partitions. In a shared memory system, the data will be accessible to all, but in a distributed-memory system it will be divided between memories and worked on locally. Data parallelism is usually classified as [[SIMD]]/[[SPMD]].

====Stream parallelism====
Stream parallelism, also known as pipeline parallelism, focuses on dividing a computation
into a sequence of stages, where each stage processes a portion of the input
data. Each stage operates independently and concurrently, and the output of one
stage serves as the input to the next stage. Stream parallelism is particularly suitable
for applications with continuous data streams or pipelined computations.
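A pipeline of this kind can be sketched in Python with one thread per stage and a queue between neighbouring stages. The names <code>run_pipeline</code>, <code>_stage</code> and <code>_END</code> are hypothetical and used only for this illustration. While one stage is working on an item, the preceding stage can already be processing the next one.

```python
import queue
import threading

_END = object()  # sentinel that tells each stage the stream is finished


def _stage(func, inbox, outbox):
    """Repeatedly take an item from `inbox`, apply `func`, pass the result on."""
    while True:
        item = inbox.get()
        if item is _END:
            outbox.put(_END)   # forward the end marker to the next stage
            return
        outbox.put(func(item))


def run_pipeline(items, stages):
    """Feed `items` through the list of stage functions, one thread per stage."""
    queues = [queue.Queue() for _ in range(len(stages) + 1)]
    threads = [
        threading.Thread(target=_stage, args=(func, queues[i], queues[i + 1]))
        for i, func in enumerate(stages)
    ]
    for t in threads:
        t.start()
    for item in items:
        queues[0].put(item)    # the input stream enters the first stage
    queues[0].put(_END)

    results = []
    while True:
        out = queues[-1].get() # drain the output of the last stage
        if out is _END:
            break
        results.append(out)
    for t in threads:
        t.join()
    return results
```

Since each stage is a single thread reading from a first-in, first-out queue, the output order matches the input order.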
 
====Implicit parallelism====
{{main|Implicit parallelism}}
As with implicit process interaction, an implicit model of parallelism reveals nothing to the programmer as the compiler, the runtime or the hardware is responsible. For example, in compilers, [[automatic parallelization]] is the process of converting sequential code into parallel code, and in computer architecture, [[Superscalar processor|superscalar execution]] is a mechanism whereby [[instruction-level parallelism]] is exploited to perform operations in parallel.
 
==Terminology==
The systems fall into two categories.{{citation needed|date=April 2014}} The systems in the first category are characterized by the isolation of the abstract design space seen by the programmer from the parallel, distributed implementation. In this category, all processes are presented with equal access to some kind of shared memory space. In its loosest form, any process may attempt to access any item at any time.
Parallel programming models are closely related to [[model of computation|models of computation]]. A model of parallel computation is an [[abstraction]] used to analyze the cost of computational processes, but it does not necessarily need to be practical, in that it can be implemented efficiently in hardware and/or software. A programming model, in contrast, does specifically imply the practical considerations of hardware and software implementation.<ref>Skillicorn, David B., and Domenico Talia, Models and languages for parallel computation, ACM Computing Surveys, 30.2 123–169 (1998), https://www.cs.utexas.edu/users/browne/CS392Cf2000/papers/ModelsOfParallelComputation-Skillicorn.pdf</ref>
The second category considers machines in which the two levels are closer together and, in particular, those in which the programmer's world includes explicit parallelism. This category discards shared-memory-based cooperation in favour of some form of explicit message passing.
 
A parallel programming language may be based on one or a combination of programming models. For example, [[High Performance Fortran]] is based on shared-memory interactions and data-parallel problem decomposition, and [[Go (programming language)|Go]] provides mechanisms for shared-memory and message-passing interaction.
 
==Example parallel programming models==
* [[Algorithmic skeleton|Algorithmic Skeletons]]
* Components
* [[Distributed objects]]
* [[Remote Method Invocation]]
* Workflows
* [[Parallel Random Access Machine]]
* [[Stream processing]]
* [[Bulk synchronous parallel]]ism

{| class="wikitable"
! Name || Class of interaction || Class of decomposition || Example implementations
|-
| [[Actor model]]
| Asynchronous message passing
| Task
| [[D (programming language)|D]], [[Erlang (programming language)|Erlang]], [[Scala (programming language)|Scala]], SALSA
|-
| [[Bulk synchronous parallel]]
| Shared memory
| Task
| [[Apache Giraph]], [[Apache Hama]], [[BSPlib]]
|-
| [[Communicating sequential processes]]
| Synchronous message passing
| Task
| [[Ada (programming language)|Ada]], [[Occam (programming language)|Occam]], [[VerilogCSP]], [[Go (programming language)|Go]]
|-
| [[Circuit (computer science)|Circuits]]
| Message passing
| Task
| [[Verilog]], [[VHDL]]
|-
| [[Dataflow programming|Dataflow]]
| Message passing
| Task
| [[Lustre (programming language)|Lustre]], [[TensorFlow]], [[Apache Flink]]
|-
| [[Functional programming|Functional]]
| Message passing
| Task
| [[Concurrent Haskell]], [[Concurrent ML]]
|-
| [[LogP machine]]
| Synchronous message passing
| Not specified
| None
|-
| [[Parallel random access machine]]
| Shared memory
| Data
| [[Cilk (programming language)|Cilk]], [[CUDA]], [[OpenMP]], [[Threading Building Blocks]], [[XMTC]]
|-
| [[SPMD]] [[Partitioned global address space|PGAS]]
| Partitioned global address space
| Data
| [[Fortran 2008]], [[Unified Parallel C]], [http://upcxx.lbl.gov UPC++], [[SHMEM]]
|-
| Global-view [[Task parallelism]]
| Partitioned global address space
| Task
| [[Chapel (programming language)|Chapel]], [[X10 (programming language)|X10]]
|}
 
==See also==
* [[Automatic parallelization]]
* [[Bridging model]]
* [[Concurrent computing|Concurrency]]
* [[Degree of parallelism]]
* [[Explicit parallelism]]
* [[Partitioned global address space]]
* [[List of concurrent and parallel programming languages]]
* [[Optical Multi-Tree with Shuffle Exchange]]
* [[Parallel external memory (Model)]]
 
==References==
{{reflist}}
 
==Further reading==
* {{Citation | author = Blaise Barney | institution = Lawrence Livermore National Laboratory | title = Introduction to Parallel Computing | url = https://computing.llnl.gov/tutorials/parallel_comp/ | access-date = 2015-11-22 | archive-date = 2013-06-10 | archive-url = https://web.archive.org/web/20130610122229/https://computing.llnl.gov/tutorials/parallel_comp/ | url-status = dead }}
* H. Shan and J. Pal Singh. A comparison of MPI, SHMEM, and Cache-Coherent Shared Address Space Programming Models on a Tightly-Coupled Multiprocessor. International Journal of Parallel Programming, 29(3), 2001.
* {{Citation | author = Murray I. Cole. | title = Algorithmic Skeletons: Structured Management of Parallel Computation | institution = University of Glasgow | url = http://homepages.inf.ed.ac.uk/mic/Pubs/skeletonbook.pdf }}
* H. Shan and J. Pal Singh. Comparison of Three Programming Models for Adaptive Applications on the Origin 2000. Journal of Parallel and Distributed Computing, 62:241–266, 2002.
* {{Cite book|author1=J. Darlinton |author2=M. Ghanem |author3=H. W. To |title=Proceedings of Workshop on Programming Models for Massively Parallel Computers |chapter=Structured parallel programming |date=1993 |pages=160–169 |doi=10.1109/PMMP.1993.315543 |isbn=0-8186-4900-3 |s2cid=15265646 | url = https://www.researchgate.net/publication/3557907}}
* About [[structured parallel programming]]: Davide Pasetto and [[Marco Vanneschi]]. ''[http://portal.acm.org/citation.cfm?id=898142&coll=&dl=GUIDE&CFID=15151515&CFTOKEN=6184618 Machine independent Analytical models for cost evaluation of template--based programs]'', [[University of Pisa]], 1996
* {{Citation | author = Ian Foster | institution = Argonne National Laboratory | title = Designing and Building Parallel Programs | url = http://www.mcs.anl.gov/~itf/dbpp }}
 
==External links==
* [http://www.oracle.com/technetwork/server-storage/solarisstudio/documentation/oss-parallel-programs-170709.pdf Developing Parallel Programs — A Discussion of Popular Models] (Oracle White Paper September 2010)
* [http://www.mcs.anl.gov/~itf/dbpp/text/book.html Designing and Building Parallel Programs] (Section 1.3, 'A Parallel Programming Model')
* [http://computing.llnl.gov/tutorials/parallel_comp/ Introduction to Parallel Computing] (Section 'Parallel Programming Models')
 
{{Parallel Computing}}