Single instruction, multiple threads: Difference between revisions

Nyuzi GPGPU: add nyuzi performance analysis link
Description: ILLIAC IV having masked predication is a big damn deal as it predates NVIDIA and AMD by 30 years.
Line 31:
== Description ==
 
SIMT processors execute multiple "threads" (or "work-items" or "Sequence of SIMD Lane operations") in lock-step, under the control of a single central unit. The model shares common features with [[SIMD lanes]].<ref>{{cite book |author1=Michael McCool |author2=James Reinders |author3=Arch Robison |title=Structured Parallel Programming: Patterns for Efficient Computation |publisher=Elsevier |year=2013 |page=52}}</ref>
 
The [[ILLIAC IV]], the world's first known SIMT processor, had its [[ILLIAC_IV#Branches|"branching"]] mechanism extensively documented; in modern terminology, however, that mechanism turns out to be [[Predication_(computer_architecture)#SIMD,_SIMT_and_vector_predication|"predicate masking"]].
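
Predicate masking can be illustrated with a minimal Python sketch. It is a toy model, not a description of the ILLIAC IV or of any particular GPU: the function name <code>simt_if_else</code> and its parameters are hypothetical, and the lanes are stepped through sequentially to imitate lock-step execution. A single control unit evaluates the condition once for all lanes; both sides of the branch are then issued to every lane, and the mask decides which lanes commit a result.

```python
def simt_if_else(data, cond, then_op, else_op):
    """Toy model of an if/else executed in lock-step under a predicate mask."""
    # The central control unit evaluates the condition for every lane at once.
    mask = [cond(x) for x in data]
    out = list(data)

    # Pass 1: the "then" instruction is issued to all lanes;
    # only lanes whose mask bit is set commit a result.
    for lane, x in enumerate(data):
        if mask[lane]:
            out[lane] = then_op(x)

    # Pass 2: the "else" instruction is issued to all lanes
    # under the inverted mask.
    for lane, x in enumerate(data):
        if not mask[lane]:
            out[lane] = else_op(x)

    return out, mask


lanes = [3, -1, 4, -5]
result, mask = simt_if_else(
    lanes,
    cond=lambda x: x >= 0,    # the branch condition
    then_op=lambda x: x * 2,  # taken side: double the value
    else_op=lambda x: -x,     # not-taken side: negate the value
)
print(result)  # [6, 1, 8, 5]
print(mask)    # [True, False, True, False]
```

In this model no lane ever jumps: the cost of the construct is the sum of both passes, regardless of how many lanes take each side.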
 
Because the access time of all widespread [[random-access memory|RAM]] types (e.g. [[DDR SDRAM]], [[GDDR SDRAM]], [[XDR DRAM]], etc.) is still relatively high, engineers came up with the idea of hiding the latency that inevitably comes with each memory access. Strictly speaking, latency hiding is a feature of the zero-overhead scheduling implemented by modern GPUs.
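
The effect of zero-overhead scheduling on memory latency can be sketched with a small Python simulation. The model is deliberately simplified and its assumptions are illustrative only: every instruction is a memory load, the scheduler issues at most one instruction per cycle, switching between groups of threads costs nothing, and the names <code>simulate</code>, <code>num_warps</code>, <code>instrs_per_warp</code> and <code>mem_latency</code> are hypothetical. Real GPU schedulers are considerably more involved.

```python
def simulate(num_warps, instrs_per_warp, mem_latency):
    """Return the cycle at which the last load completes in a toy scheduler."""
    ready_at = [0] * num_warps                 # cycle at which each warp may issue again
    remaining = [instrs_per_warp] * num_warps  # loads each warp still has to issue
    cycle = 0
    while any(remaining):
        # Zero-overhead switch: pick the first warp that is ready, at no cost.
        for w in range(num_warps):
            if remaining[w] and ready_at[w] <= cycle:
                remaining[w] -= 1
                # The warp stalls until its data arrives; other warps may
                # issue in the meantime.
                ready_at[w] = cycle + mem_latency
                break
        cycle += 1
    return max(ready_at)


print(simulate(num_warps=1, instrs_per_warp=4, mem_latency=4))  # 16
print(simulate(num_warps=4, instrs_per_warp=4, mem_latency=4))  # 19
```

Under these assumptions a single warp needs 16 cycles for 4 loads because it idles while each load is outstanding, whereas four warps complete 16 loads in 19 cycles, since the scheduler always finds another warp ready to issue while the others wait on memory.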