Content deleted Content added
Guy Harris (talk | contribs) →SIMD on the web: Left hand was one position too far rightward. |
add ILLIAC IV reference. nyer nyer to NVIDIA who claimed they were first. sorry, ILLIAC beat ya by 50 years :) Tags: Mobile edit Mobile web edit Advanced mobile edit |
||
Line 10:
Such machines exploit [[Data parallelism|data level parallelism]], but not [[Concurrent computing|concurrency]]: there are simultaneous (parallel) computations, but each unit performs exactly the same instruction at any given moment (just with different data). A simple example is to add many pairs of numbers together, all of the SIMD units are performing an addition, but each one has different pairs of values to add. SIMD is especially applicable to common tasks such as adjusting the contrast in a [[digital image]] or adjusting the volume of [[digital audio]]. Most modern [[central processing unit]] (CPU) designs include SIMD instructions to improve the performance of [[multimedia]] use. In recent CPUs, SIMD units are tightly coupled with cache hierarchies and prefetch mechanisms, which minimize latency during large block operations. For instance, AVX-512-enabled processors can prefetch entire cache lines and apply fused multiply-add operations (FMA) in a single SIMD cycle.
SIMD has three different subcategories in [[Flynn's taxonomy#Single instruction stream, multiple data streams (SIMD)|Flynn's 1972 Taxonomy]], one of which is [[single instruction, multiple threads]] (SIMT).
Each hardware element (PU) working on individual data item sometimes also referred as SIMD lane or channel. Modern [[graphics processing unit]]s (GPUs) are often wide SIMD (typically >16 data lanes or channel) implementations.{{cn|date=July 2024}} Some newer GPUs go beyond simple SIMD and integrate mixed-precision SIMD pipelines, which allow concurrent execution of [[8-bit computing|8-bit]], [[16-bit computing|16-bit]], and [[32-bit computing|32-bit]] operations in different lanes. This is critical for applications like AI inference, where mixed precision boosts throughput.
|