Latency oriented processor architecture

{{Orphan|date=December 2016}}
 
'''Latency oriented processor architecture''' is the [[microarchitecture]] of a [[microprocessor]] designed to serve a serial computing [[Thread (computing)|thread]] with low latency. This is typical of most [[Central Processing Unit]]s (CPUs) developed since the 1970s. These architectures generally aim to execute as many instructions as possible belonging to a single serial thread in a given window of time; however, the time to execute a single instruction completely, from the fetch stage to the retire stage, can range from a few cycles to a few hundred cycles.<ref>{{cite book| author1=John Paul Shen |author2=Mikko H. Lipasti |year=2013 |title=Modern Processor Design |publisher=McGraw-Hill Professional |isbn=978-1478607830}}</ref>{{page needed|date=November 2016}} Latency oriented processor architectures are the opposite of throughput-oriented processors, which concern themselves more with the total [[throughput]] of the system than with the service [[Latency (engineering)|latencies]] of the individual threads they work on.<ref name=YanSohilin2016>{{cite book|author=Yan Solihin |year=2016 |title=Fundamentals of Parallel Multicore Architecture |publisher=Chapman & Hall/CRC Computational Science |isbn=978-1482211184}}</ref>{{page needed|date=November 2016}}<ref name=GarlandKirk>{{cite journal|title=Understanding Throughput-Oriented Architectures |author1=Michael Garland |author2=David B. Kirk |journal=Communications of the ACM |volume=53 |number=11 |pages=58–66 |doi=10.1145/1839676.1839694 |year=2010}}</ref>
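As an illustrative sketch of the latency-versus-throughput contrast (the numbers below are hypothetical, not taken from the cited sources): a latency-oriented core completes one thread's instructions quickly, while a throughput-oriented design accepts a higher per-thread latency in exchange for more instructions completed per cycle across all threads.

```c
/* Hypothetical comparison of two designs (all numbers are made up
   for illustration only):
   - Design A: 1 thread at 2.0 instructions per cycle (IPC)
   - Design B: 32 threads at 0.25 IPC each                       */

/* Cycles for ONE thread to finish, given its per-thread IPC. */
static double thread_latency(double instructions, double per_thread_ipc) {
    return instructions / per_thread_ipc;
}

/* Total instructions completed per cycle across all threads. */
static double aggregate_ipc(int threads, double per_thread_ipc) {
    return threads * per_thread_ipc;
}
```

With 1000 instructions per thread, design A finishes a thread in 500 cycles versus 4000 cycles for design B, yet design B sustains an aggregate 8.0 IPC versus 2.0 for design A: lower per-thread latency on one side, higher total throughput on the other.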
 
==Flynn's taxonomy==
 
==Implementation techniques==
There are many architectural techniques employed to reduce the overall latency of a single computing task. These typically involve adding hardware to the [[Pipeline (computing)|pipeline]] so that instructions are served as soon as they are fetched from [[Random-access memory|memory]] or the [[CPU cache|instruction cache]]. A notable characteristic of these architectures is that a significant area of the chip is used for parts other than the [[Execution unit|execution units]] themselves, because the intent is to bring down the time required to complete a 'typical' task in a computing environment. A typical computing task is a serial set of instructions with a high dependency on results produced by earlier instructions of the same task, so the microprocessor spends much of its time on work other than the calculations required by the individual instructions themselves. If the [[Hazard (computer architecture)|hazards]] encountered during computation are not resolved quickly, the latency of the thread increases: hazards stall the execution of subsequent instructions and, depending on the pipeline implementation, may either halt progress entirely until the dependency is resolved or trigger a cascade of further hazards in later instructions, further increasing the thread's execution time.<ref name="quant">{{cite book|author1=John L. Hennessy |author2=David A. Patterson |title=Computer Architecture: A Quantitative Approach |edition=Fifth |year=2013 |publisher=Morgan Kaufmann Publishers |isbn=978-0123838728}}</ref><ref name="interface">{{cite book|author1=David A. Patterson |author2=John L. Hennessy |title=Computer Organization and Design: The Hardware/software Interface |edition=Fifth |year=2013 |publisher=Morgan Kaufmann Publishers |isbn=9780124078864}}</ref>
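To make the dependency argument concrete, the following C sketch (illustrative only, not drawn from the cited sources) shows a loop whose iterations form a single read-after-write dependency chain, and a rewrite that splits the work into two independent chains so a pipelined core can overlap their execution:

```c
/* A serial dependency chain: every addition reads the result of the
   previous one (a read-after-write hazard on acc), so a pipelined CPU
   cannot overlap these operations, and a stall on any one instruction
   delays the entire chain. */
long dependent_sum(const long *v, int n) {
    long acc = 0;
    for (int i = 0; i < n; i++)
        acc += v[i];          /* depends on the acc written last iteration */
    return acc;
}

/* Splitting the work into two accumulators creates two independent
   dependency chains, exposing instruction-level parallelism that the
   hardware can exploit by executing them in overlapping pipeline slots. */
long interleaved_sum(const long *v, int n) {
    long a = 0, b = 0;
    int i = 0;
    for (; i + 1 < n; i += 2) {
        a += v[i];            /* chain 1 */
        b += v[i + 1];        /* chain 2, independent of chain 1 */
    }
    if (i < n)                /* handle a leftover element when n is odd */
        a += v[i];
    return a + b;
}
```

Both functions compute the same sum; only the shape of the dependency chains differs, which is exactly the property the latency-oriented techniques below are designed to exploit or work around.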
 
The design space of micro-architectural techniques is very large. Below are some of the most commonly employed techniques to reduce the overall latency for a thread.