{{short description|Microarchitecture of a microprocessor designed to serve a serial computing thread with low latency}}
{{Orphan|date=December 2016}}
'''Latency oriented processor architecture''' is the [[microarchitecture]] of a [[microprocessor]] designed to serve a serial computing [[Thread (computing)|thread]] with a low latency. This is typical of most [[central processing unit]]s (CPU) being developed since the 1970s. These architectures, in general, aim to execute as many instructions as possible belonging to a single serial thread, in a given window of time; however, the time to execute a single instruction completely from fetch to retire stages may vary from a few cycles to even a few hundred cycles in some cases.<ref>{{cite book| author1=John Paul Shen |author2=Mikko H. Lipasti |year=2013 |title=Modern Processor Design |publisher=McGraw-Hill Professional |isbn=978-1478607830}}</ref>{{page needed|date=November 2016}} Latency oriented processor architectures are the opposite of throughput-oriented processors, which concern themselves more with the total [[throughput]] of the system rather than the service [[Latency (engineering)|latencies]] for all individual threads that they work on.<ref name=YanSohilin2016>{{cite book|author=Yan Solihin |year=2016 |title=Fundamentals of Parallel Multicore Architecture |publisher=Chapman & Hall/CRC Computational Science |isbn=978-1482211184}}</ref>{{page needed|date=November 2016}}<ref name=GarlandKirk>{{cite journal|title=Understanding Throughput-Oriented Architectures |author1=Michael Garland |author2=David B. Kirk |journal=Communications of the ACM |volume=53 |number=11 |pages=58–66 |doi=10.1145/1839676.1839694|year=2010 |doi-access=free }}</ref>
==Flynn's taxonomy==
{{Main|Flynn's taxonomy}}
Typically, latency oriented processor architectures execute a single task operating on a single data stream, and so they fall under the [[Single instruction, single data|SISD]] classification of Flynn's taxonomy. Latency oriented processor architectures might also include [[Single instruction, multiple data|SIMD]] instruction set extensions of popular instruction sets, such as Intel [[MMX (instruction set)|MMX]] and [[Streaming SIMD Extensions|SSE]] instructions; even though these extensions operate on large data sets, their primary goal is also to reduce overall latency.<ref name=YanSohilin2016/>
==Implementation techniques==
===Instruction set architecture (ISA)===
{{Main|Instruction set}}
Most architectures today use shorter and simpler instructions, like the [[load/store architecture]], which help in optimizing the instruction pipeline for faster execution. Instructions are usually all of the same size, which also helps in optimizing the instruction fetch logic. Such an ISA is called a [[Reduced instruction set computing|RISC]] architecture.<ref>{{cite conference |last1=Bhandarkar |first1=Dileep |last2=Clark |first2=Douglas W. |chapter=Performance from architecture: Comparing a RISC and a CISC with similar hardware organization |title=Proceedings of the fourth international conference on Architectural support for programming languages and operating systems - ASPLOS-IV |date=1 January 1991 |pages=310–319 |doi=10.1145/106972.107003 |publisher=ACM |isbn=0897913809 |doi-access=free}}</ref>