Embarrassingly parallel

{{Short description|Parallel computing, a problem which is able to be trivially divided into parallelized tasks}}
In [[parallel computing]], an '''embarrassingly parallel''' workload or problem (also called '''embarrassingly parallelizable''', '''perfectly parallel''', '''delightfully parallel''' or '''pleasingly parallel'''{{citation needed|date=October 2020}}) is one where little or no effort is needed to split the problem into a number of parallel tasks.<ref>{{cite book|last1=Herlihy|first1=Maurice|last2=Shavit|first2=Nir|title=The Art of Multiprocessor Programming, Revised Reprint|date=2012|publisher=Elsevier|isbn=9780123977953|page=14|edition=revised|url=https://books.google.com/books?id=vfvPrSz7R7QC&q=embarrasingly|access-date=28 February 2016|quote=Some computational problems are “embarrassingly parallel”: they can easily be divided into components that can be executed concurrently.}}</ref> This is often the case where there is little or no dependency or need for communication between those parallel tasks, or for results between them.<ref name=dbpp>Section 1.4.4 of: {{cite book
|url=http://www.mcs.anl.gov/~itf/dbpp/text/node10.html
|title=Designing and Building Parallel Programs
}}</ref>
 
These differ from [[distributed computing]] problems, which need communication between tasks, especially communication of intermediate results. They are easier to perform on [[server farm]]s which lack the special infrastructure used in a true [[supercomputer]] cluster. They are thus well suited to large, Internet-based [[volunteer computing]] platforms such as [[BOINC]], and suffer less from [[parallel slowdown]]. The opposite of embarrassingly parallel problems are [[inherently serial problem]]s, which cannot be parallelized at all.
 
A common example of an embarrassingly parallel problem is 3D video rendering handled by a [[graphics processing unit]], where each frame (forward method) or pixel ([[Ray tracing (graphics)|ray tracing]] method) can be handled with no interdependency.<ref name="ChalmersReinhard2011">{{cite book|author1=Alan Chalmers|author2=Erik Reinhard|author3=Tim Davis|title=Practical Parallel Rendering|url=https://books.google.com/books?id=loxhtzzG1FYC&q=%22embarrassingly+parallel%22|date=21 March 2011|publisher=CRC Press|isbn=978-1-4398-6380-0}}</ref> Some forms of [[password cracking]] are another embarrassingly parallel task that is easily distributed on [[central processing unit]]s, [[CPU core]]s, or clusters.
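The per-pixel independence described above can be sketched in a few lines of Python. This is an illustrative sketch only: the <code>shade</code> function and image dimensions are hypothetical stand-ins for real per-pixel rendering work, and threads are used here for brevity (CPU-bound rendering would typically use separate processes).

```python
from concurrent.futures import ThreadPoolExecutor

WIDTH, HEIGHT = 8, 4  # a tiny "image", purely for illustration

def shade(pixel):
    """Hypothetical per-pixel work: depends only on that pixel's own coordinates."""
    x, y = pixel
    return (x * 31 + y * 17) % 256  # stand-in for a real shading computation

pixels = [(x, y) for y in range(HEIGHT) for x in range(WIDTH)]

# Each pixel is an independent task: no shared state and no communication
# between tasks, so the work can be split across any number of workers.
with ThreadPoolExecutor(max_workers=4) as pool:
    image = list(pool.map(shade, pixels))
```

Because the tasks never interact, the parallel result is identical to the serial loop <code>[shade(p) for p in pixels]</code>, regardless of how many workers are used.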
 
==Etymology==
"Embarrassingly" is used here in the same sense as in the phrase "an [[embarrassment of riches]]", meaning an overabundance, here referring to parallelization problems which are "embarrassingly easy".<ref>Matloff, Norman (2011). ''The Art of R Programming: A Tour of Statistical Software Design'', p.347. No Starch. {{ISBN|9781593274108}}.</ref> The term may also imply embarrassment on the part of developers or compilers: "Because so many important problems remain unsolved mainly due to their intrinsic computational complexity, it would be embarrassing not to develop parallel implementations of polynomial [[homotopy]] continuation methods."<ref>{{cite book|last1=Leykin|first1=Anton|last2=Verschelde|first2=Jan|last3=Zhuang|first3=Yan|title=Mathematical Software - ICMS 2006 |chapter=Parallel Homotopy Algorithms to Solve Polynomial Systems |year=2006|journal=Proceedings of ICMS |volume=4151|pages=225–234|doi=10.1007/11832225_22|series=Lecture Notes in Computer Science|isbn=978-3-540-38084-9}}</ref> The term is first found in the literature in a 1986 book on multiprocessors by [[MATLAB]]'s creator [[Cleve Moler]],<ref name=hcmp>{{cite book
| contribution = Matrix Computation on Distributed Memory Multiprocessors
| author = Moler, Cleve
| publisher = Society for Industrial and Applied Mathematics, Philadelphia
| title = Hypercube Multiprocessors
| editor-last = Heath
| editor-first = Michael T.
 
==Examples==
A trivial example involves serving static data. It would take very little effort to have many processing units produce the same set of bits. Indeed, the famous [["Hello, World!" program|Hello World]] problem could easily be parallelized with few programming considerations or computational costs.
 
Some examples of embarrassingly parallel problems include:
* [[Monte Carlo method]]<ref name="Kontoghiorghes2005">{{cite book|author=Erricos John Kontoghiorghes|title=Handbook of Parallel Computing and Statistics|url=https://books.google.com/books?id=BnNnKPkFH2kC&q=%22embarrassingly+parallel%22|date=21 December 2005|publisher=CRC Press|isbn=978-1-4200-2868-3}}</ref>
* Distributed relational database queries using [http://www.mysqlperformanceblog.com/2011/05/14/distributed-set-processing-with-shard-query/ distributed set processing].
* [[Numerical integration]]<ref name="Deng2013">{{cite book|author=Yuefan Deng|title=Applied Parallel Computing|url=https://books.google.com/books?id=YS9wvVeWrXgC&q=%22embarrassingly+parallel%22|year=2013|publisher=World Scientific|isbn=978-981-4307-60-4}}</ref>
* Bulk processing of unrelated files of similar nature in general, such as photo gallery resizing and conversion.
* Serving static files on a webserver to multiple users at once.{{citation needed|date=June 2019}}
* The [[Mandelbrot set]], [[Perlin noise]] and similar images, where each point is calculated independently.
* [[Rendering (computer graphics)|Rendering]] of [[computer graphics]]. In [[computer animation]], each [[video frame|frame]] or pixel may be rendered independently {{xref|(see: [[Parallel rendering]])}}.
* Some [[brute-force search]]es in [[cryptography]].<ref>{{Cite journal|url=https://tools.ietf.org/html/rfc7914#page-2|title=The scrypt Password-Based Key Derivation Function|first1=Simon|last1=Josefsson|first2=Colin|last2=Percival|author-link2=Colin Percival|date=August 2016|doi=10.17487/RFC7914 |access-date=2016-12-12}}</ref> Notable real-world examples include [[distributed.net]] and [[proof-of-work]] systems used in [[cryptocurrency]].
* [[BLAST (biotechnology)|BLAST]] searches in [[bioinformatics]] with split databases.<ref>{{cite journal |last1=Mathog |first1=DR |title=Parallel BLAST on split databases. |journal=Bioinformatics |date=22 September 2003 |volume=19 |issue=14 |pages=1865–6 |doi=10.1093/bioinformatics/btg250 |pmid=14512366|doi-access=free }}</ref>
* [[BLAST]] searches in [[bioinformatics]] for multiple queries (but not for individual large queries).<ref>[http://seqanswers.com/forums/showpost.php?p=21050&postcount=3 SeqAnswers forum]</ref>
* Large-scale [[facial recognition system]]s that compare thousands of arbitrarily acquired faces (e.g., from security or surveillance video via [[closed-circuit television]]) with a similarly large number of previously stored faces (e.g., a ''[[rogues gallery]]'' or similar [[No Fly List|watch list]]).<ref>[http://lbrandy.com/blog/2008/10/how-we-made-our-face-recognizer-25-times-faster/ How we made our face recognizer 25 times faster] (developer blog post)</ref>
* Computer simulations comparing many independent scenarios.
* Event simulation and reconstruction in [[particle physics]].
* The [[marching squares]] algorithm.
* Sieving step of the [[quadratic sieve]] and the [[General number field sieve|number field sieve]].
* Tree growth step of the [[random forest]] machine learning technique.
* [[Discrete Fourier transform]] where each harmonic is independently calculated.
* [[Convolutional neural network]]s running on [[GPU]]s.
* [[Hyperparameter optimization#Grid_search|Hyperparameter grid search]] in machine learning.{{citation needed|date=May 2019}}
* Parallel search in [[constraint programming]]<ref name="HamadiSais2018">{{cite book|author1=Youssef Hamadi|author2=Lakhdar Sais|title=Handbook of Parallel Constraint Reasoning|url=https://books.google.com/books?id=w5JUDwAAQBAJ&q=%22embarrassingly+parallel%22|date=5 April 2018|publisher=Springer|isbn=978-3-319-63516-3}}</ref>
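Many of the examples above share the same shape: a large number of fully independent trials followed by a trivial final reduction. A minimal Python sketch of this pattern, estimating π by the [[Monte Carlo method]] (worker count, sample sizes, and seeding scheme are illustrative choices, not a prescribed implementation):

```python
import random
from multiprocessing.dummy import Pool  # thread-backed Pool; CPU-bound work would use multiprocessing.Pool

def count_hits(args):
    """Count random points landing inside the unit quarter circle.

    Each worker gets its own generator and seed, so the trials
    are fully independent: no communication between tasks.
    """
    n, seed = args
    rng = random.Random(seed)
    return sum(1 for _ in range(n) if rng.random() ** 2 + rng.random() ** 2 <= 1.0)

TRIALS_PER_WORKER = 25_000
WORKERS = 4

with Pool(WORKERS) as pool:
    hits = pool.map(count_hits, [(TRIALS_PER_WORKER, seed) for seed in range(WORKERS)])

# The only step involving more than one task is this final sum;
# everything else runs with no coordination at all.
pi_estimate = 4 * sum(hits) / (TRIALS_PER_WORKER * WORKERS)
```

Because the tasks share nothing, doubling the number of workers (with the total trial count held fixed) changes only how the same independent trials are distributed, which is exactly what makes such workloads embarrassingly parallel.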
 
==Implementations==
* In [[R (programming language)]] – The Simple Network of Workstations (SNOW) package implements a simple mechanism for using a set of workstations or a [[Beowulf cluster]] for embarrassingly parallel computations.<ref>[http://www.stat.uiowa.edu/~luke/R/cluster/cluster.html Simple Network of Workstations (SNOW) package]</ref> Similar R packages include "future", "parallel" and others.
 
==See also==
* [[Amdahl's law]] defines value ''[[Amdahl's law#Parallelization|P]]'', which would be almost or exactly equal to 1 for embarrassingly parallel problems.
*[[Cellular automaton]]
*[[Connection Machine]]
*[[CUDA|CUDA framework]]
*[[Manycore processor]]
*[[Map (parallel pattern)]]
*[[Massively parallel]]
*[[Multiprocessing]]
*[[Parallel computing]]
*[[Process-oriented programming]]
*[[Shared-nothing architecture]] (SN)
*[[Symmetric multiprocessing]] (SMP)
*[[Vector processor]]
 
==References==
<references />
 
==External links==
{{Wiktionary}}
* [http://www.phy.duke.edu/~rgb/Beowulf/beowulf_book/beowulf_book/node30.html Embarrassingly Parallel Computations], Engineering a Beowulf-style Compute Cluster
* "[http://www.cs.ucsb.edu/~gilbert/reports/hpec04.pdf Star-P: High Productivity Parallel Computing]"
 
{{Parallel computing}}
 
[[Category:Parallel computing]]
[[Category:Distributed computing problems]]
[[Category:Applications of distributed computing]]