Single instruction, multiple threads: Difference between revisions

No edit summary
No edit summary
Line 1:
'''Single instruction, multiple thread''' ('''SIMT''') is an execution model used in [[parallel computing]] where [[single instruction, multiple data]] (SIMD) is combined with [[Thread (computing)#Multithreading|multithreading]]. It differs from [[SPMD]] in that all instructions in all "threads" are executed in lock-step. The SIMT execution model has been implemented on several [[GPU]]s and is relevant for [[general-purpose computing on graphics processing units]] (GPGPU); for example, some [[supercomputer]]s combine CPUs with GPUs.
{{missing information||difference from [[SPMD]], if any}}
 
'''Single instruction, multiple thread''' ('''SIMT''') is an execution model used in [[parallel computing]] where [[single instruction, multiple data]] (SIMD) is combined with [[Thread (computing)#Multithreading|multithreading]].
 
==Overview==
The processors, say a number {{mvar|p}} of them, seem to execute many more than {{mvar|p}} tasks. This is achieved by each processor having multiple "threads" (or "work-items" or "Sequence of SIMD Lane operations"), which execute in lock-step, and are analogous to [[SIMD lanes]].<ref>{{cite book |author1=Michael McCool |author2=James Reinders |author3=Arch Robison |title=Structured Parallel Programming: Patterns for Efficient Computation |publisher=Elsevier |year=2013 |page=52}}</ref>
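The lock-step behaviour described above can be illustrated with a small toy model. The following Python sketch is not GPU code and does not correspond to any vendor API; the names <code>run_lockstep</code> and <code>GROUP_SIZE</code> are hypothetical, and the group size of 4 is chosen only to keep the example short. It assumes the common simplification that a conditional is handled by issuing each branch once for the whole group while an active mask decides which lanes actually update their state.

```python
# Toy model of SIMT lock-step execution (illustrative only; not a real GPU API).
# All names here (run_lockstep, GROUP_SIZE) are hypothetical.

GROUP_SIZE = 4  # assumed size of a lock-step group; kept small for readability


def run_lockstep(data):
    """Apply 'x // 2 if x is even else 3 * x + 1' to every lane in lock-step.

    Returns the new lane values and a trace of (instruction, active mask)
    pairs, one entry per instruction issued to the whole group.
    """
    lanes = list(data)
    assert len(lanes) == GROUP_SIZE
    trace = []

    # One instruction is issued for the whole group: every lane evaluates
    # the predicate on its own data element.
    mask_even = [x % 2 == 0 for x in lanes]
    trace.append(("test_even", [True] * GROUP_SIZE))

    # 'then' branch: issued once; only lanes with a true predicate are active.
    if any(mask_even):
        lanes = [x // 2 if m else x for x, m in zip(lanes, mask_even)]
        trace.append(("halve", mask_even))

    # 'else' branch: issued once; the remaining lanes are active. The mask
    # was computed before the 'then' branch ran, so no lane executes both.
    mask_odd = [not m for m in mask_even]
    if any(mask_odd):
        lanes = [3 * x + 1 if m else x for x, m in zip(lanes, mask_odd)]
        trace.append(("triple_plus_one", mask_odd))

    return lanes, trace
```

In this model, <code>run_lockstep([1, 2, 3, 4])</code> issues three instructions because the lanes disagree on the branch, whereas <code>run_lockstep([2, 4, 6, 8])</code> issues only two because the "else" branch has no active lanes and is skipped.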
 
==History==
The SIMT execution model has been implemented on several [[GPU]]s and is relevant for [[general-purpose computing on graphics processing units]] (GPGPU); for example, some [[supercomputer]]s combine CPUs with GPUs.
 
SIMT was introduced by [[Nvidia]]:<ref>{{cite web |url=http://www.nvidia.com/content/PDF/fermi_white_papers/NVIDIA_Fermi_Compute_Architecture_Whitepaper.pdf |title=Nvidia Fermi Compute Architecture Whitepaper |date=2009 |website=http://www.nvidia.com/ |publisher=NVIDIA Corporation |accessdate=2014-07-17}}</ref><ref name=teslaPaper>{{cite journal |title=NVIDIA Tesla: A Unified Graphics and Computing Architecture |date=2008 |page=6 {{subscription required|s}} |doi=10.1109/MM.2008.31 |volume=28 |issue=2 |journal=IEEE Micro|last1=Lindholm |first1=Erik |last2=Nickolls |first2=John |last3=Oberman |first3=Stuart |last4=Montrym |first4=John }}</ref>
Line 12 ⟶ 11:
 
[[ATI Technologies]] (now [[Advanced Micro Devices|AMD]]) released a competing product slightly later, on May 14, 2007: the [[TeraScale (microarchitecture)#TeraScale 1|TeraScale 1]]-based ''"R600"'' GPU chip.
 
== Description ==
 
Because the access time of all widespread [[random-access memory|RAM]] types (e.g. [[DDR SDRAM]], [[GDDR SDRAM]], [[XDR DRAM]]) is still relatively high, engineers came up with the idea of hiding the latency that inevitably comes with each memory access. Strictly speaking, latency hiding is a feature of the zero-overhead scheduling implemented by modern GPUs; it may or may not be considered a property of SIMT itself.
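The effect of zero-overhead scheduling on memory latency can be sketched with a toy cycle-counting model. The Python function below is purely illustrative: the name <code>cycles_to_finish</code>, the one-instruction-per-cycle issue rate, and the latency of 10 cycles used in the usage note are all assumptions made for the example, not figures for any real GPU.

```python
# Toy model of latency hiding; all names and cycle counts are hypothetical.


def cycles_to_finish(num_groups, loads_per_group, mem_latency):
    """Count cycles until every group has issued its loads and received data.

    Each cycle the scheduler issues at most one load, taken from the first
    group that is not waiting on memory. Switching between groups costs
    nothing, which is the "zero-overhead" assumption of this model.
    """
    remaining = [loads_per_group] * num_groups  # loads each group must still issue
    ready_at = [0] * num_groups                 # cycle at which a group may issue again
    cycle = 0
    while any(r > 0 for r in remaining):
        for g in range(num_groups):
            if remaining[g] > 0 and ready_at[g] <= cycle:
                remaining[g] -= 1
                # The group stalls until its data arrives.
                ready_at[g] = cycle + 1 + mem_latency
                break
        cycle += 1
    # Finished once the last outstanding load has returned.
    return max(max(ready_at), cycle)
```

With a latency of 10 cycles, a single group needs 44 cycles to complete 4 loads (<code>cycles_to_finish(1, 4, 10)</code>), while 8 groups complete 32 loads in 51 cycles (<code>cycles_to_finish(8, 4, 10)</code>): most of each group's waiting time is overlapped with loads issued by the other groups.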
Line 20 ⟶ 21:
 
{| class="wikitable" style="font-size:80%; text-align: center"
|+ SIMT Terminology
! Nvidia [[CUDA]] || [[OpenCL]] || Hennessy & Patterson<ref>{{cite book |author1=John L. Hennessy |author2=David A. Patterson|title=Computer Architecture: A Quantitative Approach |year=1990 |url=https://archive.org/details/computerarchitec00patt_045 |url-access=limited |publisher=Morgan Kaufmann |edition=6 |pages=[https://archive.org/details/computerarchitec00patt_045/page/n263 314] ff}}</ref>
|-