Embarrassingly parallel
{{Short description|Parallel computing, a problem which is able to be trivially divided into parallelized tasks}}
In [[parallel computing]], an '''embarrassingly parallel''' workload or problem (also called '''embarrassingly parallelizable''', '''perfectly parallel''', '''delightfully parallel''' or '''pleasingly parallel''') is one where little or no effort is needed to split the problem into a number of parallel tasks.<ref>{{cite book|last1=Herlihy|first1=Maurice|last2=Shavit|first2=Nir|title=The Art of Multiprocessor Programming, Revised Reprint|date=2012|publisher=Elsevier|isbn=9780123977953|page=14|edition=revised|url=https://books.google.com/books?id=vfvPrSz7R7QC&q=embarrasingly|access-date=28 February 2016|quote=Some computational problems are “embarrassingly parallel”: they can easily be divided into components that can be executed concurrently.}}</ref> This is due to minimal or no dependency upon communication between the parallel tasks, or for results between them.<ref name=dbpp>Section 1.4.4 of: {{cite book
|url=http://www.mcs.anl.gov/~itf/dbpp/text/node10.html
|title=Designing and Building Parallel Programs
}}</ref>
 
These differ from [[distributed computing]] problems, which need communication between tasks, especially communication of intermediate results. They are easier to perform on [[server farm]]s which lack the special infrastructure used in a true [[supercomputer]] cluster. They are thus well suited to large, Internet-based [[volunteer computing]] platforms such as [[BOINC]], and suffer less from [[parallel slowdown]]. The opposite of embarrassingly parallel problems are [[inherently serial problem]]s, which cannot be parallelized at all.
 
A common example of an embarrassingly parallel problem is 3D video rendering handled by a [[graphics processing unit]], where each frame (forward method) or pixel ([[Ray tracing (graphics)|ray tracing]] method) can be handled with no interdependency.<ref name="ChalmersReinhard2011">{{cite book|author1=Alan Chalmers|author2=Erik Reinhard|author3=Tim Davis|title=Practical Parallel Rendering|url=https://books.google.com/books?id=loxhtzzG1FYC&q=%22embarrassingly+parallel%22|date=21 March 2011|publisher=CRC Press|isbn=978-1-4398-6380-0}}</ref> Some forms of [[password cracking]] are another embarrassingly parallel task that is easily distributed on [[central processing unit]]s, [[CPU core]]s, or clusters.
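The per-pixel independence described above can be sketched as a parallel map over pixels (an illustrative Python sketch, not part of any real renderer; <code>render_pixel</code> is a hypothetical stand-in for a real shading or ray-tracing function):

```python
from concurrent.futures import ProcessPoolExecutor

WIDTH, HEIGHT = 64, 48

def render_pixel(index):
    # Stand-in for a per-pixel renderer (e.g. one ray-traced sample);
    # the result depends only on this pixel's own coordinates,
    # so no pixel ever needs to wait for, or talk to, another.
    x, y = index % WIDTH, index // WIDTH
    return (x * x + y * y) % 256  # hypothetical shading function

if __name__ == "__main__":
    with ProcessPoolExecutor() as executor:
        # Pixels are rendered in any order, on any worker, with no coordination.
        framebuffer = list(executor.map(render_pixel, range(WIDTH * HEIGHT),
                                        chunksize=WIDTH))
    print(len(framebuffer))
```

Because each call depends only on its own input, the work divides evenly across any number of workers with no synchronization beyond collecting the results.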
 
==Etymology==
"Embarrassingly" is used here in the same sense as in the phrase "an [[embarrassment of riches]]", meaning an overabundance—hereto referringrefer to parallelization problems which are "embarrassingly easy".<ref>Matloff, Norman (2011). ''The Art of R Programming: A Tour of Statistical Software Design'', p.347. No Starch. {{ISBN|9781593274108}}.</ref> The term may also imply embarrassment on the part of developers or compilers: "Because so many important problems remain unsolved mainly due to their intrinsic computational complexity, it would be embarrassing not to develop parallel implementations of polynomial [[homotopy]] continuation methods."<ref>{{cite book|last1=Leykin|first1=Anton|last2=Verschelde|first2=Jan|last3=Zhuang|first3=Yan|title=Mathematical Software - ICMS 2006 |chapter=Parallel Homotopy Algorithms to Solve Polynomial Systems |year=2006|journal=Proceedings of ICMS |volume=4151|pages=225–234|doi=10.1007/11832225_22|series=Lecture Notes in Computer Science|isbn=978-3-540-38084-9}}</ref> The term is first found in the literature in a 1986 book on multiprocessors by [[MATLAB]]'s creator [[Cleve Moler]],<ref name=hcmp>{{cite book
| titlecontribution = Matrix Computation on Distributed Memory Multiprocessors
| author = Moler, Cleve
| publisher = Society for Industrial and Applied Mathematics, Philadelphia
| journaltitle = Hypercube Multiprocessors
| editor-last = Heath
| editor-first = Michael T.
Line 31:
 
==Examples==
A trivial example involves serving static data. It would take very little effort to have many processing units produce the same set of bits. Indeed, the famous [["Hello, World!" program|Hello World]] problem could easily be parallelized with few programming considerations or computational costs.
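Such a trivially parallel task can be sketched in a few lines of Python (an illustrative sketch, not from the literature): each worker produces its output with no communication at all.

```python
from multiprocessing import Pool

def task(i):
    # Each task produces its result independently; no task ever needs
    # data from, or sends data to, any other task.
    return f"Hello, World! from task {i}"

if __name__ == "__main__":
    with Pool(4) as pool:
        results = pool.map(task, range(8))  # tasks run concurrently, in any order
    for line in results:
        print(line)
```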
 
Some examples of embarrassingly parallel problems include:
* [[Monte Carlo method]]<ref name="Kontoghiorghes2005">{{cite book|author=Erricos John Kontoghiorghes|title=Handbook of Parallel Computing and Statistics|url=https://books.google.com/books?id=BnNnKPkFH2kC&q=%22embarrassingly+parallel%22|date=21 December 2005|publisher=CRC Press|isbn=978-1-4200-2868-3}}</ref>
* Distributed relational database queries using [http://www.mysqlperformanceblog.com/2011/05/14/distributed-set-processing-with-shard-query/ distributed set processing].
* [[Numerical integration]]<ref name="Deng2013">{{cite book|author=Yuefan Deng|title=Applied Parallel Computing|url=https://books.google.com/books?id=YS9wvVeWrXgC&q=%22embarrassingly+parallel%22|year=2013|publisher=World Scientific|isbn=978-981-4307-60-4}}</ref>
* Serving static files on a webserver to multiple users at once.{{citation needed|date=June 2019}}
* Bulk processing of unrelated files of similar nature in general, such as photo gallery resizing and conversion.
* The [[Mandelbrot set]], [[Perlin noise]] and similar images, where each point is calculated independently.
* [[Rendering (computer graphics)|Rendering]] of [[computer graphics]]. In [[computer animation]], each [[video frame|frame]] or pixel may be rendered independently {{xref|(see: [[Parallel rendering]])}}.
* Some [[brute-force search]]es in [[cryptography]].<ref>{{Cite journal|url=https://tools.ietf.org/html/rfc7914#page-2|title=The scrypt Password-Based Key Derivation Function|first1=Simon|last1=Josefsson|first2=Colin|last2=Percival|author-link2=Colin Percival|date=August 2016|website=tools.ietf.org|doi=10.17487/RFC7914 |access-date=2016-12-12}}</ref> Notable real-world examples include [[distributed.net]] and [[proof-of-work]] systems used in [[cryptocurrency]].
* [[BLAST (biotechnology)|BLAST]] searches in [[bioinformatics]] with split databases.<ref>{{cite journal |last1=Mathog |first1=DR |title=Parallel BLAST on split databases. |journal=Bioinformatics |date=22 September 2003 |volume=19 |issue=14 |pages=1865–6 |doi=10.1093/bioinformatics/btg250 |pmid=14512366|doi-access=free }}</ref>
* [[BLAST]] searches in [[bioinformatics]] for multiple queries (but not for individual large queries).<ref>[http://seqanswers.com/forums/showpost.php?p=21050&postcount=3 SeqAnswers forum]</ref>
* Large scale [[facial recognition system]]s that compare thousands of arbitrary acquired faces (e.g., a security or surveillance video via [[closed-circuit television]]) with similarly large number of previously stored faces (e.g., a ''[[rogues gallery]]'' or similar [[No Fly List|watch list]]).<ref>[http://lbrandy.com/blog/2008/10/how-we-made-our-face-recognizer-25-times-faster/ How we made our face recognizer 25 times faster] (developer blog post)</ref>
* Computer simulations comparing many independent scenarios.
* Event simulation and reconstruction in [[particle physics]].
* The [[marching squares]] algorithm.
* Sieving step of the [[quadratic sieve]] and the [[General number field sieve|number field sieve]].
* Tree growth step of the [[random forest]] machine learning technique.
* [[Discrete Fourier transform]] where each harmonic is independently calculated.
* [[Convolutional neural network]]s running on [[GPU]]s.
* [[Hyperparameter optimization#Grid_search|Hyperparameter grid search]] in machine learning.{{citation needed|date=May 2019}}
* Parallel search in [[constraint programming]]<ref name="HamadiSais2018">{{cite book|author1=Youssef Hamadi|author2=Lakhdar Sais|title=Handbook of Parallel Constraint Reasoning|url=https://books.google.com/books?id=w5JUDwAAQBAJ&q=%22embarrassingly+parallel%22|date=5 April 2018|publisher=Springer|isbn=978-3-319-63516-3}}</ref>
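The Monte Carlo entry above can be made concrete with a short sketch (an illustrative Python example, not drawn from the cited sources): estimating π by throwing random points at a unit square, with each worker counting hits independently and only a single final sum combining the results.

```python
import random
from multiprocessing import Pool

def count_hits(args):
    """Count random points falling inside the unit quarter circle."""
    seed, n = args
    rng = random.Random(seed)  # independent stream per task: no shared state
    hits = 0
    for _ in range(n):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            hits += 1
    return hits

if __name__ == "__main__":
    n_per_task = 100_000
    tasks = [(seed, n_per_task) for seed in range(8)]
    with Pool() as pool:
        # The only inter-task interaction is this final reduction.
        hits = sum(pool.map(count_hits, tasks))
    pi_estimate = 4.0 * hits / (len(tasks) * n_per_task)
    print(pi_estimate)
```

Each sample is generated and tested independently, so the work scales across workers with no communication until the closing sum, the hallmark of an embarrassingly parallel computation.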
 
==Implementations==
* In [[R (programming language)]] – The Simple Network of Workstations (SNOW) package implements a simple mechanism for using a set of workstations or a [[Beowulf cluster]] for embarrassingly parallel computations.<ref>[http://www.stat.uiowa.edu/~luke/R/cluster/cluster.html Simple Network of Workstations (SNOW) package]</ref> Similar R packages include "future", "parallel" and others.
 
==See also==
* [[Amdahl's law]] defines value ''[[Amdahl's law#Parallelization|P]]'', which would be almost or exactly equal to 1 for embarrassingly parallel problems.
* [[Cellular automaton]]
* [[Connection Machine]]
* [[CUDA|CUDA framework]]
* [[Manycore processor]]
* [[Map (parallel pattern)]]
* [[Massively parallel]]
* [[Multiprocessing]]
* [[Parallel computing]]
* [[Process-oriented programming]]
* [[Shared-nothing architecture]] (SN)
* [[Symmetric multiprocessing]] (SMP)
* [[Vector processor]]
 
==References==
<references />
 
==External links==
{{Wiktionary}}
* [http://www.phy.duke.edu/~rgb/Beowulf/beowulf_book/beowulf_book/node30.html Embarrassingly Parallel Computations], Engineering a Beowulf-style Compute Cluster
* "[http://www.cs.ucsb.edu/~gilbert/reports/hpec04.pdf Star-P: High Productivity Parallel Computing]"
 
{{Parallel computing}}
 
[[Category:Parallel computing]]
[[Category:Distributed computing problems]]
[[Category:Applications of distributed computing]]