Content deleted Content added
No edit summary |
|||
(22 intermediate revisions by 18 users not shown) | |||
Line 1:
{{Short description|Computer architecture hardware algorithm}}
'''
The major innovations of Tomasulo’s algorithm include [[register renaming]] in hardware, [[reservation station
[[Robert Tomasulo]] received the [[Eckert–Mauchly Award]] in 1997 for his work on the algorithm.<ref>{{cite web
Line 8 ⟶ 9:
| website =ACM Awards
| publisher =ACM
|
}}</ref>
Line 14 ⟶ 15:
[[File:Tomasulo Architecture.png|thumb|right|700px|Tomasulo's floating point unit]]
The following are the concepts necessary to the implementation of Tomasulo's
===Common data bus===
Line 29 ⟶ 30:
| date = Jan 1967
| issn = 0018-8646
| doi = 10.1147/rd.111.0025 | s2cid = 8445049
}}</ref>{{rp|33}} This has two important effects: #Functional units can access the result of any operation without involving a floating-point-register, allowing multiple units waiting on a result to proceed without waiting to resolve contention for access to register file read ports.
#Hazard Detection and control execution are distributed. The reservation stations control when an instruction can execute, rather than a single dedicated hazard unit.
===Instruction order===
{{Disputed section|Imprecise exceptions|date=December 2023}}
Instructions are issued sequentially so that the effects of a sequence of instructions, such as [[exception (computing)|exception]]s raised by these instructions, occur in the same order as they would on an in-order processor, regardless of the fact that they are being executed out-of-order (i.e. non-sequentially).
===Register renaming===
Tomasulo's
Each functional unit has a single reservation station. Reservation stations hold information needed to execute a single instruction, including the operation and the operands. The functional unit begins processing when it is free and when all source operands needed for an instruction are real.
===Exceptions===
Practically speaking, there may be exceptions for which not enough status information about an exception is available, in which case the processor may raise a special exception, called an
Programs that experience
==Instruction lifecycle==
The three stages listed below are the stages through which each instruction passes from the time it is issued to the time its execution is complete.
===
*RS - Reservation Status
*RegisterStat - Register Status; contains information about the registers.
*regs[x] - Value of register x
*Mem[A] - Value of memory at address A
*rd - destination register number
*rs, rt - source registration numbers
*imm - sign extended immediate field
*r - reservation station or buffer that the instruction is assigned to
====Reservation Station Fields====
*Op - represents the operation being performed on operands
Line 56 ⟶ 71:
*A - used to hold the memory address information for a load or store
*Busy - 1 if occupied, 0 if not occupied
*Qi - (Only register unit) the reservation station whose result should be stored in this register (if blank or 0, no values are destined for this register)▼
====Register Status Fields====
▲*Qi -
===Stage 1: issue===
Line 83 ⟶ 100:
!Action or bookkeeping
|-
|FP operation
|Station {{mono|r}} empty
|<
if (RegisterStat[rs].Qi¦0) {
RS[r].Qj ← RegisterStat[rs].Qi
Line 101 ⟶ 118:
}
RS[r].Busy ← yes;
RegisterStat[rd].
</syntaxhighlight>
|-
|Load or Store
|Buffer {{mono|r}} empty
|<
if (RegisterStat[rs].Qi¦0) {
RS[r].Qj ← RegisterStat[rs].Qi;
Line 116 ⟶ 133:
RS[r].A ← imm;
RS[r].Busy ← yes;
</syntaxhighlight>
|-
|Load only
|
|<
RegisterStat[rt].Qi ← r;
</syntaxhighlight>
|-
|Store only
|
|<
if (RegisterStat[rt].Qi¦0) {
RS[r].Qk ← RegisterStat[rt].Qi;
Line 134 ⟶ 151:
RS[r].Qk ← 0
};
</syntaxhighlight>
|}
[[File:Example of Tomasulo's Algorithm.gif|thumb|Example of Tomasulo's
===Stage 2: execute===
Line 158 ⟶ 175:
|-
|FP operation
| <
(RS[r].Qj = 0) and (RS[r].Qk = 0)
</syntaxhighlight>
|
Compute result: operands are in Vj and Vk
Line 166 ⟶ 183:
|Load/store step 1
|<code>RS[r].Qj = 0</code> & r is head of load-store queue
|<
RS[r].A ← RS[r].Vj + RS[r].A;
</syntaxhighlight>
|-
|Load step 2
Line 191 ⟶ 208:
|FP operation or load
|Execution complete at {{mono|r}} & CDB available
|<
∀x(if (RegisterStat[x].Qi = r) {
regs[x] ← result;
Line 205 ⟶ 222:
});
RS[r].Busy ← no;
</syntaxhighlight>
|-
|Store
|Execution complete at {{mono|1=r & RS[r].Qk = 0}}
|<
Mem[RS[r].A] ← RS[r].Vk;
RS[r].Busy ← no;
</syntaxhighlight>
|}
Line 220 ⟶ 237:
Reservation stations take on the responsibility of waiting for operands in the presence of data dependencies and other inconsistencies such as varying storage access time and circuit speeds, thus freeing up the functional units. This improvement overcomes long floating point delays and memory accesses. In particular the algorithm is more tolerant of cache misses. Additionally, programmers are freed from implementing optimized code. This is a result of the common data bus and reservation station working together to preserve dependencies as well as encouraging concurrency.<ref name="tomasulo"/>{{rp|33}}
By tracking operands for instructions in the reservation stations and register renaming in hardware the algorithm minimizes [[Read after write (Hazard)|read-after-write]] (RAW) and eliminates [[Write after write (hazard)#
An equally important improvement in the algorithm is the design is not limited to a specific pipeline structure. This improvement allows the algorithm to be more widely adopted by [[multiple issue|multiple-issue]] processors. Additionally, the algorithm is easily extended to enable branch speculation.<ref name="hennessy" /> {{rp|182}}
== Applications and legacy ==
Tomasulo's algorithm
# Once caches became commonplace, the
# Dynamic scheduling and
# Proliferation of mass-market software meant that programmers would not want to compile for a specific pipeline structure. The algorithm can function with any pipeline architecture and thus software requires few architecture-specific modifications.<ref name="hennessy" /> {{rp|183}}
Many modern processors implement dynamic scheduling schemes that are
| date = September 2014
| title = Intel 64 and IA-32 Architectures Software Developer's Manual
| url = http://www.intel.com/content/www/us/en/architecture-and-technology/64-ia-32-architectures-software-developer-vol-1-manual.html
| publisher = Intel
|
}}</ref>{{Failed verification|date=February 2017|reason=The word Tomasulo isn't even mentioned?}}<ref name="adusan">{{cite web|last1=Yoga|first1=Adarsh|title=Differences between Tomasulo's algorithm and dynamic scheduling in Intel Core microarchitecture|url=http://adusan.blogspot.com.au/2010/11/differences-between-tomasulos-algorithm.html|website=The boozier|
== See also ==
Line 249 ⟶ 266:
== External links ==
*
* [http://www.dcs.ed.ac.uk/home/hase/webhase/demo/tomasulo.html HASE Java applet simulation of the Tomasulo's
[[Category:Algorithms]]
|