Revision as of 19:21, 4 February 2016 (ScotXW; minor edit: fix) compared with revision as of 09:09, 10 March 2016 (ScotXW; minor edit: ref fix)
Line 1:
'''Single instruction, multiple threads''' (SIMT) is a [[parallel computing|parallel]] execution model, used in some [[GPGPU]] platforms, where [[Thread (computing)#Multithreading|multithreading]] is simulated by [[SIMD]] processors. The processors, say a number {{mvar|p}} of them, seem to execute many more than {{mvar|p}} tasks. This is achieved by giving each processor multiple "threads" (or "work-items" or "Sequence of SIMD Lane operations"), which execute in lock-step and are analogous to SIMD "lanes".<ref>{{cite book |author1=Michael McCool |author2=James Reinders |author3=Arch Robison |title=Structured Parallel Programming: Patterns for Efficient Computation |publisher=Elsevier |year=2013 |page=52}}</ref> SIMT was introduced by [[Nvidia]]:<ref>{{cite web |url=http://www.nvidia.com/content/PDF/fermi_white_papers/NVIDIA_Fermi_Compute_Architecture_Whitepaper.pdf |title=Nvidia Fermi Compute Architecture Whitepaper |date=2009 |website=http://www.nvidia.com/ |publisher=NVIDIA Corporation |accessdate=2014-07-17}}</ref><ref name=teslaPaper>{{cite web |url=http://dx.doi.org/10.1109/MM.2008.31 |title=NVIDIA Tesla: A Unified Graphics and Computing Architecture |date=2008 |website=http://www.ieee.org/ |publisher=IEEE |accessdate=2014-08-07 |page=6 {{subscription required|s}} }}</ref>

Line 9:
<!-- Strictly, the latency-hiding is a feature of the zero-overhead scheduling implemented by modern GPUs... this might or might not be considered to be a property of 'SIMT' itself --> A downside of SIMT execution is that thread-specific control flow is performed using "masking", leading to poor utilisation when a processor's threads follow different control-flow paths.
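The masking idea can be illustrated with a small software simulation. This is only a sketch under stated assumptions: the function name `simt_if_else`, the example branch bodies, and the "issued paths" counter are hypothetical and chosen for illustration. Real hardware applies per-lane masks in the processor itself rather than in a loop like this.

```python
def simt_if_else(values):
    """Simulate one SIMT processor running, in lock-step,
        if v % 2 == 0: v = v // 2
        else:          v = 3 * v + 1
    for every lane. Returns (new lane values, number of paths issued).
    All names here are illustrative, not part of any real API."""
    lanes = list(values)
    # Per-lane predicate: True where the "if" condition holds.
    mask = [v % 2 == 0 for v in lanes]
    issued = 0

    # "if" path: issued once for the whole processor if any lane needs it.
    # Lanes whose mask is False are disabled and keep their old value.
    if any(mask):
        issued += 1
        lanes = [v // 2 if m else v for v, m in zip(lanes, mask)]

    # "else" path: issued if any lane failed the condition.
    # Lanes whose mask is True are disabled during this pass.
    if not all(mask):
        issued += 1
        lanes = [v if m else 3 * v + 1 for v, m in zip(lanes, mask)]

    return lanes, issued
```

With coherent input such as `[2, 4, 6, 8]`, only one path is issued. With divergent input such as `[1, 2, 3, 4]`, both paths are issued and each pass leaves half of the lanes idle, which is the utilisation loss described above.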
For instance, to handle an ''if''-''else'' block where various threads of a processor execute different paths, all threads must actually process both paths (as all threads of a processor always execute in lock-step), but masking is used to disable and enable the various threads as appropriate. Masking is avoided when control flow is coherent for the threads of a processor, i.e. they all follow the same path of execution. The masking strategy is what distinguishes SIMT from ordinary SIMD, and has the benefit of inexpensive synchronization between the threads of a processor.<ref>{{cite book |author1=Michael McCool |author2=James Reinders |author3=Arch Robison |title=Structured Parallel Programming: Patterns for Efficient Computation |publisher=Elsevier |year=2013 |pages=209 ff.}}</ref>

{| class="wikitable" style="font-size:80%; text-align: center"

Single instruction, multiple threads: Difference between revisions