General-purpose computing on graphics processing units: Difference between revisions

GreenC bot
Rescued 1 archive link; reformat 1 link. Wayback Medic 2.5
A relevant reference was added to the page.
Line 14:
In principle, any [[boolean function]], including addition, multiplication, and other mathematical functions, can be built up from a [[functional completeness|functionally complete]] set of logic operators. In 1987, [[Conway's Game of Life]] became one of the first examples of general-purpose computing using an early [[stream processing|stream processor]] called a [[blitter]] to invoke a special sequence of [[bit blit|logical operations]] on bit vectors.<ref>{{cite journal|last=Hull|first=Gerald|title=LIFE|journal=Amazing Computing|volume=2|issue=12|pages=81–84|date=December 1987|url=https://archive.org/stream/amazing-computing-magazine-1987-12/Amazing_Computing_Vol_02_12_1987_Dec#page/n81/mode/2up}}</ref>
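As an illustration of functional completeness, the following minimal sketch builds addition from a single NAND operator applied bitwise to whole words, much as a blitter applies one logical operation across a bit vector. The 8-bit word width (<code>MASK</code>) and all function names are assumptions chosen for the example; the left shift is treated as wiring between bit positions rather than as a logic gate.

```python
MASK = 0xFF  # assumed 8-bit word width, chosen only for this illustration


def nand(a, b):
    """Bitwise NAND on whole words: the only primitive logic operator used."""
    return ~(a & b) & MASK


def not_(a):
    return nand(a, a)


def and_(a, b):
    return not_(nand(a, b))


def or_(a, b):
    return nand(not_(a), not_(b))


def xor_(a, b):
    n = nand(a, b)
    return nand(nand(a, n), nand(b, n))


def add(a, b):
    """Addition modulo 256 built only from the NAND-derived operators.

    XOR gives the sum bits without carries, AND gives the carry bits,
    and the carries are shifted one position left and added again
    until none remain.
    """
    while b:
        carry = and_(a, b)
        a = xor_(a, b)
        b = (carry << 1) & MASK
    return a


print(add(5, 3))  # 8
```

Because every operator acts on all bits of a word at once, each step of the loop processes eight bit positions in parallel, which is the property early bit-vector hardware exploited.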
 
General-purpose computing on GPUs became more practical and popular after about 2001, with the advent of both programmable [[shader]]s and [[floating point]] support on graphics processors. Notably, problems involving [[matrix (mathematics)|matrices]] or [[vector (mathematics and physics)|vector]]s{{snd}}especially two-, three-, or four-dimensional vectors{{snd}}were easy to translate to a GPU, which supports those types natively and operates on them at full speed. A significant milestone for GPGPU came in 2003, when two research groups independently discovered GPU-based approaches to general linear algebra problems that ran faster than on CPUs.<ref>{{Cite journal |last=Krüger |first=Jens |last2=Westermann |first2=Rüdiger |date=July 2003 |title=Linear algebra operators for GPU implementation of numerical algorithms |url=https://dl.acm.org/doi/10.1145/882262.882363 |journal=ACM Transactions on Graphics |language=en |volume=22 |issue=3 |pages=908–916 |doi=10.1145/882262.882363 |issn=0730-0301}}</ref><ref>{{Cite journal |last=Bolz |first=Jeff |last2=Farmer |first2=Ian |last3=Grinspun |first3=Eitan |last4=Schröder |first4=Peter |date=July 2003 |title=Sparse matrix solvers on the GPU: conjugate gradients and multigrid |url=https://dl.acm.org/doi/10.1145/882262.882364 |journal=ACM Transactions on Graphics |language=en |volume=22 |issue=3 |pages=917–924 |doi=10.1145/882262.882364 |issn=0730-0301}}</ref> These early efforts to use GPUs as general-purpose processors required reformulating computational problems in terms of graphics primitives, as supported by the two major APIs for graphics processors, [[OpenGL]] and [[DirectX]].
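The reformulation in terms of graphics primitives can be sketched on the CPU as follows. This is an illustrative emulation under stated assumptions, not OpenGL or DirectX code: the input arrays are packed into two-dimensional "textures", a "render pass" invokes a "fragment shader" once per output pixel, and the result is read back and unpacked. All names (<code>pack</code>, <code>render_pass</code>, <code>add_shader</code>, <code>gpgpu_add</code>) and the texture width are hypothetical.

```python
def pack(values, width):
    """Pack a 1-D array into rows of a 2-D 'texture', zero-padding the last row."""
    rows = -(-len(values) // width)  # ceiling division
    padded = list(values) + [0.0] * (rows * width - len(values))
    return [padded[r * width:(r + 1) * width] for r in range(rows)]


def unpack(texture, n):
    """Read the first n values back out of a 2-D 'texture'."""
    return [v for row in texture for v in row][:n]


def render_pass(width, height, shader, *textures):
    """Sequential stand-in for rasterizing a full-screen quad:
    the shader runs once per output pixel, independently of all others."""
    return [[shader(x, y, *textures) for x in range(width)]
            for y in range(height)]


def add_shader(x, y, tex_a, tex_b):
    """'Fragment shader': sample both input textures at (x, y) and add."""
    return tex_a[y][x] + tex_b[y][x]


def gpgpu_add(a, b, width=4):
    """Element-wise vector addition expressed as a single render pass."""
    tex_a, tex_b = pack(a, width), pack(b, width)
    out = render_pass(width, len(tex_a), add_shader, tex_a, tex_b)
    return unpack(out, len(a))


print(gpgpu_add([1, 2, 3, 4, 5], [10, 20, 30, 40, 50]))
```

The packing and unpacking steps carry no computational meaning of their own; they exist only because the data has to be presented to the pipeline in graphical form.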
This cumbersome translation was obviated by the advent of general-purpose programming languages and APIs such as [[Lib Sh|Sh]]/[[RapidMind]], [[BrookGPU|Brook]] and Accelerator.<ref>{{cite journal |last1=Tarditi |first1=David |first2=Sidd |last2=Puri |first3=Jose |last3=Oglesby |title=Accelerator: using data parallelism to program GPUs for general-purpose uses |journal=ACM SIGARCH Computer Architecture News |volume=34 |issue=5 |date=2006|url=https://www.cs.cmu.edu/afs/cs/academic/class/15740-f07/public/discussion-papers/26-tarditi-asplos06.pdf|doi=10.1145/1168919.1168898 }}</ref><ref>{{cite journal |last1=Che |first1=Shuai |last2=Boyer |first2=Michael |last3=Meng |first3=Jiayuan |last4=Tarjan |first4=D. |last5=Sheaffer |first5=Jeremy W. |last6=Skadron |first6=Kevin |title=A performance study of general-purpose applications on graphics processors using CUDA |journal=J. Parallel and Distributed Computing |volume=68 |issue=10 |date=2008 |pages=1370–1380 |doi=10.1016/j.jpdc.2008.05.014 |df=dmy-all |citeseerx=10.1.1.143.4849 }}</ref><ref>{{cite journal |last1=Glaser |first1=J. |last2=Nguyen |first2=T. D. |last3=Anderson |first3=J. A. |last4=Lui |first4=P. |last5=Spiga |first5=F. |last6=Millan |first6=J. A. |last7=Morse |first7=D. C. |last8=Glotzer |first8=S. C. |date=2015 |title=Strong scaling of general-purpose molecular dynamics simulations on GPUs |url=https://doi.org/10.1016/j.cpc.2015.02.028 |journal=Computer Physics Communications |volume=192 |pages=97–107 |doi=10.1016/j.cpc.2015.02.028 |doi-access=free}}</ref>
 
These were followed by Nvidia's [[CUDA]], which allowed programmers to ignore the underlying graphical concepts in favor of more common [[high-performance computing]] concepts.<ref name="du">{{Cite journal |doi= 10.1016/j.parco.2011.10.002 |title= From CUDA to OpenCL: Towards a performance-portable solution for multi-platform GPU programming |journal= Parallel Computing |volume= 38 |issue= 8 |pages= 391–407 |year= 2012 |last1= Du |first1= Peng |last2= Weber |first2= Rick |last3= Luszczek |first3= Piotr |last4= Tomov |first4= Stanimire |last5= Peterson |first5= Gregory |last6= Dongarra |first6= Jack |author-link6= Jack Dongarra |df= dmy-all |citeseerx= 10.1.1.193.7712 }}</ref> Newer, hardware-vendor-independent offerings include Microsoft's [[DirectCompute]] and Apple/Khronos Group's [[OpenCL]].<ref name="du"/> This means that modern GPGPU pipelines can leverage the speed of a GPU without requiring full and explicit conversion of the data to a graphical form.
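By contrast, a compute-style formulation addresses data directly by thread index, with no textures or render passes. The sketch below emulates that model sequentially in Python for a SAXPY operation (<code>out = a*x + y</code>); it is not CUDA, DirectCompute, or OpenCL code, and the names <code>saxpy_kernel</code>, <code>launch</code>, <code>grid_dim</code>, and <code>block_dim</code> are assumptions that only loosely mirror the grid-of-blocks terminology used by such APIs.

```python
def saxpy_kernel(block_idx, thread_idx, block_dim, n, a, x, y, out):
    """Work done by one 'thread': compute a single output element."""
    i = block_idx * block_dim + thread_idx  # global index of this thread
    if i < n:  # guard: the last block may contain surplus threads
        out[i] = a * x[i] + y[i]


def launch(kernel, grid_dim, block_dim, *args):
    """Sequential stand-in for a parallel launch over grid_dim * block_dim threads."""
    for block_idx in range(grid_dim):
        for thread_idx in range(block_dim):
            kernel(block_idx, thread_idx, block_dim, *args)


def saxpy(a, x, y, block_dim=4):
    n = len(x)
    out = [0.0] * n
    grid_dim = (n + block_dim - 1) // block_dim  # enough blocks to cover n
    launch(saxpy_kernel, grid_dim, block_dim, n, a, x, y, out)
    return out


print(saxpy(2.0, [1, 2, 3, 4, 5], [1, 1, 1, 1, 1]))
```

The arrays are indexed as ordinary one-dimensional buffers, so no conversion of the data to a graphical form is needed.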