Revision as of 16:54, 7 November 2017 by Nandushines (7 edits): m Rename Pratt to Patt
Revision as of 14:06, 13 February 2018 by 95.47.136.74: No edit summary
Line 12: [[ATI Technologies]] (now [[Advanced Micro Devices|AMD]]) released a competing product slightly later on May 14, 2007, the [[TeraScale (microarchitecture)#TeraScale 1|TeraScale 1]]-based ''"R600"'' GPU chip. As access time of all the widespread [[random-access memory|RAM]] types (e.g. [[DDR SDRAM]], [[GDDR SDRAM]], [[XDR DRAM]], etc.) is still relatively ~~low~~high, engineers came up with the idea to hide the latency that inevitably comes with each memory access. Strictly, the latency-hiding is a feature of the zero-overhead scheduling implemented by modern GPUs. This might or might not be considered to be a property of 'SIMT' itself. SIMT is intended to limit [[instruction fetching]] overhead,<ref>{{cite conference |first1=Sean |last1=Rul |first2=Hans |last2=Vandierendonck |first3=Joris |last3=D’Haene |first4=Koen |last4=De Bosschere |title=An experimental study on performance portability of OpenCL kernels |year=2010 |conference=Symp. Application Accelerators in High Performance Computing (SAAHPC)}}</ref> i.e. the latency that comes with memory access, and is used in modern GPUs (such as those of [[Nvidia]] and [[AMD]]) in combination with 'latency hiding' to enable high-performance execution despite considerable latency in memory-access operations. This is where the processor is oversubscribed with computation tasks, and is able to quickly switch between tasks when it would otherwise have to wait on memory. This strategy is comparable to [[Multithreading (computer architecture)|multithreading in CPUs]] (not to be confused with [[Multi-core processor|multi-core]]).<ref>{{cite web |url=http://www.cc.gatech.edu/~vetter/keeneland/tutorial-2011-04-14/12-advanced_topics_in_cuda.pdf |title=Advanced Topics in CUDA |date=2011 |website=cc.gatech.edu |accessdate=2014-08-28}}</ref>
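The latency-hiding idea described above (oversubscribe the processor, then switch tasks instead of waiting on memory) can be illustrated with a toy model. The following is a minimal sketch, not a description of any real GPU scheduler: the function name `simulate`, its parameters, and the simplifications (one issue slot per cycle, every instruction is a memory load, task switching costs zero cycles) are all assumptions made for illustration.

```python
def simulate(num_warps, instructions_per_warp, memory_latency):
    """Return the cycle at which the last load completes on a toy core.

    Assumptions (hypothetical model, not real hardware):
    - the core issues at most one instruction per cycle;
    - every instruction is a load that blocks its warp for
      `memory_latency` cycles after the issue cycle;
    - switching between warps costs zero cycles.
    """
    remaining = [instructions_per_warp] * num_warps  # loads left per warp
    ready_at = [0] * num_warps  # first cycle each warp may issue again
    cycle = 0
    finish = 0
    while any(remaining):
        # Issue from the first warp that still has work and is not waiting.
        for w in range(num_warps):
            if remaining[w] and ready_at[w] <= cycle:
                remaining[w] -= 1
                ready_at[w] = cycle + 1 + memory_latency
                finish = max(finish, ready_at[w])
                break
        # If no warp was ready, this cycle is an idle (stalled) cycle.
        cycle += 1
    return finish


if __name__ == "__main__":
    for warps in (1, 11):
        total = simulate(warps, instructions_per_warp=4, memory_latency=10)
        issued = warps * 4
        print(f"{warps} warp(s): {issued} loads in {total} cycles")
```

In this model a single warp needs 44 cycles for 4 loads, because the core idles during every memory wait, while 11 warps complete 44 loads in 54 cycles, because another warp is always ready to issue. The gain comes purely from overlapping waits, which is the sense in which the latency is "hidden" rather than reduced.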

Single instruction, multiple threads: Difference between revisions