{{Short description|Method of improving computer program speed}}
{{Use dmy dates|date=January 2022}}
{{Use list-defined references|date=January 2022}}
'''Automatic parallelization''', also '''auto parallelization''', or '''autoparallelization''' refers to converting sequential [[source code|code]] into [[multi-threaded]] and/or [[Automatic vectorization|vectorized]] code in order to use multiple processors simultaneously in a shared-memory [[multiprocessor]] ([[Symmetric multiprocessing|SMP]]) machine.<ref name="Yehezkael_2000"/> Fully automatic parallelization of sequential programs is a challenge because it requires complex [[program analysis (computer science)|program analysis]] and the best approach may depend upon parameter values that are not known at compilation time.<ref name="Fox-Williams_Messina_1994"/>
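
For illustration, the following C fragment contrasts a sequential loop whose iterations are independent with one possible multi-threaded rendering, here written with an [[OpenMP]] pragma. This is a generic sketch of the transformation, not the output of any particular compiler:

<syntaxhighlight lang="c">
#include <stdio.h>

#define N 1000000

int main(void)
{
    static double a[N], b[N], c[N];

    for (int i = 0; i < N; i++) {   /* initialize inputs */
        b[i] = i * 0.5;
        c[i] = i * 2.0;
    }

    /* Sequential form: no iteration depends on another, so the loop
       is a legal candidate for parallelization. */
    for (int i = 0; i < N; i++)
        a[i] = b[i] + c[i];

    /* One possible parallelized form: iterations are distributed
       across threads.  The pragma is only an illustrative rendering
       of what an auto-parallelizing compiler might generate. */
    #pragma omp parallel for
    for (int i = 0; i < N; i++)
        a[i] = b[i] + c[i];

    printf("%f\n", a[N - 1]);
    return 0;
}
</syntaxhighlight>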
The [[Scheduling (computing)|scheduler]] produces a list of all tasks and their dependencies on one another in terms of execution and start times, and then derives a schedule that is optimal with respect to the number of processors used or the total execution time of the application.
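
The following toy C program sketches this kind of list scheduling for a hypothetical four-task, two-core example (the task set, durations, and core count are invented for illustration): each task starts once its dependencies have finished, on whichever core becomes free first.

<syntaxhighlight lang="c">
#include <stdio.h>

#define NTASKS 4
#define NCORES 2

int duration[NTASKS] = { 3, 2, 2, 1 };
/* dep[i][j] = 1 means task i must finish before task j starts:
   tasks 1 and 2 depend on task 0; task 3 depends on tasks 1 and 2. */
int dep[NTASKS][NTASKS] = {
    {0, 1, 1, 0},
    {0, 0, 0, 1},
    {0, 0, 0, 1},
    {0, 0, 0, 0},
};

int main(void)
{
    int finish[NTASKS] = {0}, core_free[NCORES] = {0};

    /* Tasks are already in a valid topological order (0, 1, 2, 3). */
    for (int t = 0; t < NTASKS; t++) {
        int ready = 0;                 /* earliest start allowed by deps */
        for (int p = 0; p < NTASKS; p++)
            if (dep[p][t] && finish[p] > ready)
                ready = finish[p];

        int best = 0;                  /* pick the core that frees up first */
        for (int c = 1; c < NCORES; c++)
            if (core_free[c] < core_free[best])
                best = c;

        int start = ready > core_free[best] ? ready : core_free[best];
        finish[t] = start + duration[t];
        core_free[best] = finish[t];

        printf("task %d: core %d, start %d, end %d\n",
               t, best, start, finish[t]);
    }
    return 0;
}
</syntaxhighlight>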
===Code generation===
The [[Scheduling (computing)|scheduler]] generates a list of all tasks together with the cores on which they will execute and the times for which they will run. The code generator then inserts special constructs into the code, which the scheduler reads during execution; these constructs tell the scheduler the core on which a particular task will execute, along with its start and end times.
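
As a sketch, generated code of this kind might look like the following C fragment, in which the runtime hooks (<code>sched_task_begin</code>, <code>sched_task_end</code>) are hypothetical names standing in for whatever constructs a real system inserts:

<syntaxhighlight lang="c">
#include <stdio.h>

/* Hypothetical runtime hooks read by the scheduler during execution. */
static void sched_task_begin(int task_id, int core, int start_time)
{
    printf("task %d pinned to core %d, scheduled start %d\n",
           task_id, core, start_time);
    /* a real runtime would set thread affinity here and delay the
       task until its scheduled start time */
}

static void sched_task_end(int task_id, int end_time)
{
    printf("task %d done, scheduled end %d\n", task_id, end_time);
}

static void task_body(void)
{
    /* the original sequential work of the task goes here */
}

int main(void)
{
    /* Constructs inserted by the code generator around the task: */
    sched_task_begin(0, /*core=*/1, /*start=*/0);
    task_body();
    sched_task_end(0, /*end=*/3);
    return 0;
}
</syntaxhighlight>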
Most research [[compiler]]s for automatic parallelization consider [[Fortran]] programs,{{Citation needed|date=July 2007}} because Fortran makes stronger guarantees about [[aliasing (computing)|aliasing]] than languages such as [[C (programming language)|C]]. Typical examples are:
* [http://www.ece.northwestern.edu/cpdc/Paradigm/Paradigm.html Paradigm compiler]
* [https://engineering.purdue.edu/Cetus/Documentation/manual/ch02s02.html Polaris compiler]
* [https://scholarship.rice.edu/handle/1911/16677 Rice Fortran D compiler]
* [[SUIF]] compiler
* [https://dl.acm.org/doi/10.1155/1999/304639 Vienna Fortran compiler]
In 2023, Aubert, Rubiano, Rusch, and [[Thomas Seiller|Seiller]]<ref>{{cite book |last1=Aubert |first1=Clément |last2=Rubiano |first2=Thomas |last3=Rusch |first3=Neea |last4=Seiller |first4=Thomas |chapter=Distributing and Parallelizing Non-canonical Loops |title=Verification, Model Checking, and Abstract Interpretation |series=Lecture Notes in Computer Science |date=2023 |volume=13881 |pages=91–108 |doi=10.1007/978-3-031-24950-1_1 |isbn=978-3-031-24949-5}}</ref> used a dependency analysis technique<ref>{{cite book |last1=Moyen |first1=Jean-Yves |last2=Rubiano |first2=Thomas |last3=Seiller |first3=Thomas |chapter=Loop Quasi-Invariant Chunk Detection |title=Automated Technology for Verification and Analysis |series=Lecture Notes in Computer Science |date=2017 |volume=10482 |pages=91–108 |doi=10.1007/978-3-319-68167-2_7 |isbn=978-3-319-68166-5}}</ref> to automatically parallelize loops in [[C (programming language)|C]] code.
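
The legality question that such dependency analyses decide can be illustrated with a generic C example (this illustrates the general problem, not the algorithm of the cited works): a loop whose iterations form a chain of dependences cannot run in parallel as written, while a loop whose iterations are independent can.

<syntaxhighlight lang="c">
#include <stdio.h>

#define N 8

int main(void)
{
    int a[N] = {1, 2, 3, 4, 5, 6, 7, 8};

    /* Loop-carried dependence: iteration i reads a[i-1], which was
       written by iteration i-1, so the iterations must run in order. */
    for (int i = 1; i < N; i++)
        a[i] += a[i - 1];

    /* No loop-carried dependence: each iteration touches only a[i],
       so a compiler may legally run the iterations in parallel. */
    for (int i = 0; i < N; i++)
        a[i] *= 2;

    for (int i = 0; i < N; i++)
        printf("%d ", a[i]);
    printf("\n");
    return 0;
}
</syntaxhighlight>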
==See also==
* [[Loop nest optimization]]
* [[Polytope model]] (also known as the polyhedral model)
* [[Scalable parallelism]]
* [[BMDFM]]
==References==
{{reflist|refs=
<ref name="Yehezkael_2000">{{cite
<ref name="Fox-Williams_Messina_1994">{{cite book |author-last1=Fox |author-first1=Geoffrey |author-first2=Roy |author-last2=Williams |author-first3=Paul |author-last3=Messina |title=Parallel Computing Works! |date=1994 |publisher=[[Morgan Kaufmann]] |pages=575, 593 |isbn=978-1-55860-253-3}}</ref>
<ref name="Campanoni-Jones-Holloway-Wei-Brooks_2012">{{cite book |title=The HELIX Project: Overview and Directions |author-first1=Simone |author-last1=Campanoni |author-first2=Timothy |author-last2=Jones |author-first3=Glenn |author-last3=Holloway |author-first4=Gu-Yeon |author-last4=Wei |author-first5=David |author-last5=Brooks |date=2012 |url=http://helix.eecs.harvard.edu/index.php/DAC2012}}</ref>
<ref name="Anantpur-Govindarajan_2013">{{cite web |title=Runtime dependence computation and execution of loops on heterogeneous systems |author-first1=J. |author-last1=Anantpur |author-first2=R. |author-last2=Govindarajan |url=http://hpc.serc.iisc.ernet.in/~jayvant/papers/CGO-2013.pdf |access-date=2015-10-05 |url-status=dead |archive-url=https://web.archive.org/web/20151006123251/http://hpc.serc.iisc.ernet.in/~jayvant/papers/CGO-2013.pdf |archive-date=2015-10-06}}</ref>
<ref name="Zhuang-Eichenberger-Luo-OBrien">{{
<ref name="Blitzprog">{{cite web |title=Automatic parallelism and data dependency |url=http://blitzprog.org/posts/automatic-parallelism-and-data-dependency |url-status=dead |archive-url=https://web.archive.org/web/20140714111836/http://blitzprog.org/posts/automatic-parallelism-and-data-dependency |archive-date=2014-07-14}}</ref>
<ref name="Rünger_2006">{{cite journal |title=Parallel Programming Models for Irregular Algorithms |author-last=Rünger |author-first=Gudula |journal=Parallel Algorithms and Cluster Computing |date=2006 |volume=52 |pages=3–23 |doi=10.1007/3-540-33541-2_1 |series=Lecture Notes in Computational Science and Engineering |isbn=978-3-540-33539-9}}</ref>
==Further reading==
* {{cite magazine |title=Configuring parallel programs, Part 1: The Occam Transpiler, now under development, will make writing software for parallel processing easier |author-first=Dick |author-last=Pountain |magazine=[[BYTE (magazine)|BYTE]] |publisher=[[McGraw-Hill, Inc.]] |issn=0360-5280 |volume=14 |number=13 |date=December 1989 |id=ark:/13960/t34188734 |pages=349–352 |url=https://archive.org/details/byte-magazine-1989-12/page/n382/mode/1up |access-date=2022-01-06}} (NB. Uses the term ''Occam transpiler'' as a synonym for a [[source-to-source compiler]].)
{{Compiler optimizations}}
{{DEFAULTSORT:Automatic Parallelization}}
[[Category:Articles with example Fortran code]]
[[Category:Compiler optimizations]]