General-purpose computing on graphics processing units: Difference between revisions

WipEout! (talk | contribs)
No edit summary
WipEout! (talk | contribs)
No edit summary
Line 18:
For early fixed-function or limited-programmability graphics hardware (i.e. up to and including DirectX 8.1-compliant GPUs), this was sufficient because it is also the representation used in displays. This representation does have certain limitations, however. Given sufficient graphics processing power, even graphics programmers would like to use better formats, such as [[floating point]] data formats, to obtain effects such as [[high dynamic range imaging]]. Many GPGPU applications require floating point accuracy, which came with graphics cards conforming to the DirectX 9 specification.
 
DirectX 9 Shader Model 2.x suggested support for two precision types: full and partial precision. Full precision could be either FP24 (24-bit floating point per component) or greater, while partial precision was FP16. [[ATI Technologies|ATI’s]] [[Radeon R300|R300 series]] of GPUs supported only FP24 precision in the programmable fragment pipeline (although FP32 was supported in the vertex processors), while [[NVIDIA Corporation|NVIDIA’s]] [[GeForce FX|NV30]] series supported both FP16 and FP32; other vendors such as [[S3 Graphics]] and [[XGI Technology|XGI]] supported a mixture of formats up to FP24.
As of this writing ([[9 December]] [[2005]]), GPUs commonly support two floating point formats:
* 16 bits per component - half precision float
* 32 bits per component - single precision float
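
As a rough illustration of what the two formats listed above can represent, the following sketch rounds a value to half and to single precision using Python's standard <code>struct</code> module. The helper names <code>to_half</code> and <code>to_single</code> are hypothetical, and the code emulates IEEE-style rounding on a CPU; it does not reproduce the behavior of any particular GPU.

```python
import struct

def to_half(x):
    """Round x to the nearest 16-bit (half precision) float."""
    return struct.unpack("e", struct.pack("e", x))[0]

def to_single(x):
    """Round x to the nearest 32-bit (single precision) float."""
    return struct.unpack("f", struct.pack("f", x))[0]

# 1 + 2^-12 needs 13 significant bits: too many for half precision
# (11 bits), but well within single precision (24 bits).
value = 1.0 + 2.0 ** -12
print(to_half(value))    # 1.0
print(to_single(value))  # 1.000244140625
```

The small increment is lost in half precision but survives in single precision, which is why the choice of format matters for numerically sensitive work.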
While 64 bit floating point values (double precision float) are commonly available on CPUs, these are not currently available on GPUs. Some applications require at least double precision floating point values and thus cannot currently be ported to GPUs. There have been efforts to emulate double precision floating point values on GPUs{{ref|doublePrecisionOnGPU}}.
 
Shader Model 3.0 altered the specification, increasing full precision requirements to a minimum of FP32 support in the fragment pipeline. ATI’s Shader Model 3.0 compliant R5xx generation ([[Radeon R520|Radeon X1000 series]]) supports only FP32 throughout the pipeline, while NVIDIA’s [[GeForce 6 Series|NV4x]] and [[GeForce 7 Series|G7x]] series continued to support both FP32 full precision and FP16 partial precision. Although not stipulated by Shader Model 3.0, both ATI’s and NVIDIA’s Shader Model 3.0 GPUs introduced support for blendable FP16 render targets, making High Dynamic Range Rendering easier to support.
[[NVIDIA Corporation|NVIDIA]] GPUs currently support 32-bit values through the entire pipeline. [[ATI Technologies|ATI]] cards currently support 24-bit values throughout the pipeline, although their new X1000 series supports 32 bits. The implementations of floating point on GPUs are generally not [[IEEE floating-point standard|IEEE]] compliant, and generally do not match across vendors. This has implications for correctness, which some scientific applications consider important.
 
The implementations of floating point on GPUs are generally not [[IEEE floating-point standard|IEEE]] compliant, and generally do not match across vendors. This has implications for correctness, which some scientific applications consider important. While 64-bit floating point values (double precision float) are commonly available on CPUs, they are not currently available on GPUs. Some applications require at least double precision floating point values and thus cannot currently be ported to GPUs. There have been efforts to emulate double precision floating point values on GPUs{{ref|doublePrecisionOnGPU}}.
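
One general way to emulate extra precision is to represent each number as an unevaluated sum of two single precision values (a "high" and a "low" part). The sketch below illustrates that idea on a CPU; it is an assumption about how such emulation can work, not necessarily the method used in the work cited above. The names <code>f32</code>, <code>two_sum</code> and <code>ds_add</code> are hypothetical, and <code>f32</code> simulates single precision rounding with Python's standard <code>struct</code> module.

```python
import struct

def f32(x):
    """Round a Python float to the nearest single precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]

def two_sum(a, b):
    """Error-free addition of two single precision values.

    Returns (s, e) where s is the rounded sum and e is the rounding
    error, so that s + e equals a + b exactly.
    """
    s = f32(a + b)
    bb = f32(s - a)
    e = f32(f32(a - f32(s - bb)) + f32(b - bb))
    return s, e

def ds_add(x, y):
    """Add two (hi, lo) pairs, each built from single precision values."""
    s, e = two_sum(x[0], y[0])
    e = f32(e + f32(x[1] + y[1]))
    return two_sum(s, e)  # renormalize into a new (hi, lo) pair

small = 2.0 ** -30
print(f32(1.0 + small))                   # 1.0, the small term is lost
hi, lo = ds_add((1.0, 0.0), (small, 0.0))
print(hi + lo == 1.0 + small)             # True, the pair keeps it
```

Each emulated addition costs several single precision operations, so this approach trades speed for precision.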
Operations on the GPU operate in a vectorized fashion: a single operation can be performed on up to four values at once. For instance, if one color <R1, G1, B1> is to be modulated by another color <R2, G2, B2>, the GPU can produce the resulting color <R1*R2, G1*G2, B1*B2> in a single operation. This functionality is useful in graphics because almost every basic data type is a vector (2-, 3-, or 4-dimensional). Examples include vertices, colors, normal vectors, and texture coordinates. Many other applications can put this to good use, and because of this, vector instructions ([[SIMD]]) have already been added to CPUs.
 
Most operations on the GPU operate in a vectorized fashion: a single operation can be performed on up to four values at once. For instance, if one color <R1, G1, B1> is to be modulated by another color <R2, G2, B2>, the GPU can produce the resulting color <R1*R2, G1*G2, B1*B2> in a single operation. This functionality is useful in graphics because almost every basic data type is a vector (2-, 3-, or 4-dimensional). Examples include vertices, colors, normal vectors, and texture coordinates. Many other applications can put this to good use, and because of this, vector instructions ([[SIMD]]) have already been added to CPUs.
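
The color modulation described above is a component-wise product. The following sketch shows only the arithmetic; the function name <code>modulate</code> is hypothetical, and the loop here runs sequentially on a CPU, whereas a GPU would compute all components in a single vector operation.

```python
def modulate(c1, c2):
    """Component-wise product of two colors given as equal-length tuples."""
    if len(c1) != len(c2):
        raise ValueError("colors must have the same number of components")
    return tuple(a * b for a, b in zip(c1, c2))

# <R1, G1, B1> modulated by <R2, G2, B2> gives <R1*R2, G1*G2, B1*B2>.
print(modulate((0.5, 1.0, 0.25), (0.5, 0.5, 1.0)))  # (0.25, 0.5, 0.25)
```

The same function works for 2-, 3-, or 4-component vectors, matching the vector widths mentioned in the text.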
 
==GPGPU programming concepts==