Each stage requires one clock cycle, and an instruction passes through the stages sequentially. Without [[pipelining]], a new instruction is fetched in stage 1 only after the previous instruction finishes at stage 5, so each instruction takes five clock cycles to execute (CPI = 5 > 1). In this case, the processor is said to be ''subscalar''. With pipelining, a new instruction is fetched every clock cycle by exploiting [[instruction-level parallelism]]. Since up to five instructions can theoretically occupy the five pipeline stages at once (one instruction per stage), a different instruction completes stage 5 in every clock cycle, and on average each instruction takes one clock cycle to execute (CPI = 1). In this case, the processor is said to be ''scalar''.
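The arithmetic behind these two CPI figures can be sketched in a few lines of Python. This is only an illustrative model, assuming an ideal five-stage pipeline with no stalls or hazards; the function names are hypothetical and not taken from any library.

```python
STAGES = 5  # five one-cycle stages, as described above


def cycles_unpipelined(n_instructions, stages=STAGES):
    """Each instruction occupies the whole processor for `stages` cycles."""
    return n_instructions * stages


def cycles_pipelined(n_instructions, stages=STAGES):
    """Ideal pipeline: the first instruction takes `stages` cycles,
    then one further instruction completes on every following cycle."""
    if n_instructions == 0:
        return 0
    return stages + (n_instructions - 1)


def cpi(total_cycles, n_instructions):
    """Average clock cycles per instruction."""
    return total_cycles / n_instructions


n = 1000
print(cpi(cycles_unpipelined(n), n))  # 5.0
print(cpi(cycles_pipelined(n), n))    # 1.004
```

Under these assumptions, the pipelined CPI approaches 1 as the number of instructions grows, because the cycles needed to fill the pipeline are spread over more and more instructions.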
With a single-[[Execution unit|execution-unit]] processor, the best CPI attainable is 1. However, with a multiple-execution-unit processor, one may achieve even better CPI values (CPI < 1). In this case, the processor is said to be ''[[superscalar]]''. To get better CPI values without pipelining, the number of execution units must be greater than the number of stages. For example, with
==Examples==