Central processing unit

The idea of a stored-program computer was already present in the design of [[J. Presper Eckert]] and [[John William Mauchly]]'s ENIAC, but was initially omitted so that the machine could be finished sooner. On June&nbsp;30, 1945, before ENIAC was completed, mathematician [[John von Neumann]] distributed the paper entitled ''[[First Draft of a Report on the EDVAC]]''. It was the outline of a stored-program computer that would eventually be completed in August 1949.<ref>{{cite paper | author = [[John von Neumann]] | title = First Draft of a Report on the EDVAC | publisher = [[Moore School of Electrical Engineering]], [[University of Pennsylvania]] | url = http://www.virtualtravelog.net/entries/2003-08-TheFirstDraft.pdf | date = 1945 }}</ref> EDVAC was designed to perform a certain number of instructions (or operations) of various types. These instructions could be combined to create useful programs for the EDVAC to run. Significantly, the programs written for EDVAC were stored in high-speed [[Memory (computers)|computer memory]] rather than specified by the physical wiring of the computer. This overcame a severe limitation of ENIAC: the considerable time and effort required to reconfigure the computer to perform a new task. With von Neumann's design, the program, or software, that EDVAC ran could be changed simply by changing the contents of the memory.
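
The stored-program idea itself can be sketched in a few lines of code. The toy three-instruction machine below is a hypothetical illustration, not EDVAC's actual instruction set; it shows only that the program and the data it manipulates occupy the same memory:

<source lang="python">
# A toy stored-program machine: the program lives in the same memory
# it manipulates, so "reprogramming" means rewriting memory contents.
# The three-opcode instruction set is invented for illustration.

def run(memory):
    pc = 0   # program counter: index of the next instruction in memory
    acc = 0  # accumulator register
    while True:
        op, arg = memory[pc], memory[pc + 1]
        pc += 2
        if op == "LOAD":    # acc <- memory[arg]
            acc = memory[arg]
        elif op == "ADD":   # acc <- acc + memory[arg]
            acc += memory[arg]
        elif op == "HALT":
            return acc

# Instructions and data share one address space.
memory = ["LOAD", 6, "ADD", 7, "HALT", 0, 40, 2]
print(run(memory))  # 42
</source>

Running a different program requires nothing more than writing different values into memory, the flexibility that the ''First Draft'' design provided over ENIAC's physical rewiring.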
 
Early CPUs were custom-designed as a part of a larger, sometimes one-of-a-kind, computer. However, this method of designing custom CPUs for a particular application has largely given way to the development of mass-produced processors that are made for many purposes. This standardization began in the era of discrete [[transistor]] [[Mainframe computer|mainframes]] and [[minicomputer]]s and has rapidly accelerated with the popularization of the [[integrated circuit]]&nbsp;(IC). The IC has allowed increasingly complex CPUs to be designed and manufactured to tolerances on the order of [[nanometer]]s. Both the miniaturization and standardization of CPUs have increased the presence of digital devices in modern life far beyond the limited application of dedicated computing machines. Modern microprocessors appear in everything from [[automobile]]s to [[cell phone]]s and children's toys.
 
While von Neumann is most often credited with the design of the stored-program computer because of his design of EDVAC, others before him, such as [[Konrad Zuse]], had suggested and implemented similar ideas. The so-called [[Harvard architecture]] of the [[Harvard Mark I]], which was completed before EDVAC, also utilized a stored-program design using [[Punched tape|punched paper tape]] rather than electronic memory. The key difference between the von Neumann and Harvard architectures is that the latter separates the storage and treatment of CPU instructions and data, while the former uses the same memory space for both. Most modern CPUs are primarily von Neumann in design, but elements of the Harvard architecture are commonly seen as well.
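
The difference between the two memory models can be made concrete with a short sketch; the class and method names below are illustrative inventions rather than real machine interfaces:

<source lang="python">
# Von Neumann: one address space holds both instructions and data,
# so instruction fetches and data accesses go to the same store.
class VonNeumannMachine:
    def __init__(self, memory):
        self.memory = memory              # shared instruction/data store

    def fetch(self, pc):
        return self.memory[pc]            # same array serves both roles

    def load(self, addr):
        return self.memory[addr]

# Harvard: separate stores, so an instruction fetch and a data access
# never contend for the same memory.
class HarvardMachine:
    def __init__(self, instructions, data):
        self.instructions = instructions  # e.g. punched paper tape
        self.data = data                  # separate data store

    def fetch(self, pc):
        return self.instructions[pc]

    def load(self, addr):
        return self.data[addr]

vn = VonNeumannMachine(["ADD", 3, 0, 42])
print(vn.fetch(0), vn.load(3))            # ADD 42, one shared store

hv = HarvardMachine(["ADD", 3], [0, 0, 0, 42])
print(hv.fetch(0), hv.load(3))            # ADD 42, two separate stores
</source>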
== Clock rate ==
{{Main|Clock rate}}
 
The clock rate is the speed at which a microprocessor executes instructions. Every computer contains an internal clock that regulates the rate at which instructions are executed and synchronizes all the various computer components. The CPU requires a fixed number of clock ticks (or clock cycles) to execute each instruction. The faster the clock, the more instructions the CPU can execute per second.
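
As an illustrative calculation, that relationship can be written directly; the clock rate and cycles-per-instruction figures below are assumed values, not measurements of any particular CPU:

<source lang="python">
# Instructions executed per second = clock rate / clock cycles per
# instruction. Both figures are assumptions for illustration.
clock_rate_hz = 2_000_000_000   # an assumed 2 GHz clock
cycles_per_instruction = 4      # assumed fixed cost per instruction

instructions_per_second = clock_rate_hz / cycles_per_instruction
print(instructions_per_second)  # 500,000,000; doubling the clock
                                # rate would double this figure
</source>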
 
Most CPUs, and indeed most [[sequential logic]] devices, are [[Synchronous circuit|synchronous]] in nature.<ref>In fact, all synchronous CPUs use a combination of [[sequential logic]] and [[combinational logic]]. (See [[boolean logic]])</ref> That is, they are designed and operate on the assumption of a synchronization signal. This signal, known as a [[clock signal]], usually takes the form of a periodic [[square wave]]. By calculating the maximum time needed for electrical signals to propagate through the various branches of a CPU's many circuits, the designers can select an appropriate [[Frequency|period]] for the clock signal.
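
For example, under the simplifying assumption that a single worst-case propagation delay governs the whole design, the highest usable clock rate follows directly; the 0.4&nbsp;ns figure below is invented for illustration:

<source lang="python">
# The clock period must be no shorter than the worst-case
# (critical-path) propagation delay, which caps the clock frequency.
critical_path_delay_s = 0.4e-9        # assumed 0.4 ns worst case

min_period_s = critical_path_delay_s  # period >= slowest path
max_clock_rate_hz = 1 / min_period_s
print(max_clock_rate_hz)              # 2.5e9, i.e. at most 2.5 GHz
</source>
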
 
Further improvement upon the idea of instruction pipelining led to the development of a method that decreases the idle time of CPU components even further. Designs that are said to be ''superscalar'' include a long instruction pipeline and multiple identical execution units.<ref>{{cite web | last = Huynh | first = Jack | title = The AMD Athlon XP Processor with 512KB L2 Cache | publisher = University of Illinois&nbsp;— Urbana-Champaign | pages = 6–11 | url = http://courses.ece.uiuc.edu/ece512/Papers/Athlon.pdf | year = 2003 | accessdate = 2007-10-06 }}</ref> In a superscalar pipeline, multiple instructions are read and passed to a dispatcher, which decides whether or not the instructions can be executed in parallel (simultaneously). If so, they are dispatched to available execution units, allowing several instructions to be executed simultaneously. In general, the more instructions a superscalar CPU is able to dispatch simultaneously to waiting execution units, the more instructions will be completed in a given cycle.
 
Most of the difficulty in the design of a superscalar CPU architecture lies in creating an effective dispatcher. The dispatcher needs to be able to quickly and correctly determine whether instructions can be executed in parallel, as well as dispatch them in such a way as to keep as many execution units busy as possible. This requires that the instruction pipeline be filled as often as possible and gives rise to the need in superscalar architectures for significant amounts of [[CPU cache]]. It also makes [[Hazard (computer architecture)|hazard]]-avoiding techniques like [[branch prediction]], [[speculative execution]], and [[out-of-order execution]] crucial to maintaining high levels of performance. By attempting to predict which branch (or path) a conditional instruction will take, the CPU can minimize the number of times that the entire pipeline must wait until a conditional instruction is completed. Speculative execution often provides modest performance increases by executing portions of code that may not be needed after a conditional operation completes. Out-of-order execution rearranges the order in which instructions are executed to reduce delays due to data dependencies. In the case of single instruction, multiple data, where large amounts of data of the same type must be processed, modern processors can disable parts of the pipeline: when a single instruction is executed many times, the CPU skips the fetch and decode phases, greatly increasing performance in repetitive workloads such as video creation and photo processing software.
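
The heart of such a dispatcher can be sketched as follows. The two-wide issue width, the tuple encoding of instructions, and the read-after-write test are simplifying assumptions for illustration, not a description of any real design:

<source lang="python">
# Naive in-order superscalar dispatch: each cycle, issue up to WIDTH
# instructions, stopping early if an instruction reads a register
# written by one issued earlier in the same cycle (a read-after-write
# hazard). Results from previous cycles are assumed already available.
WIDTH = 2  # assumed issue width

def dispatch(program):
    cycles = []
    pending = list(program)  # each entry: (dest_register, source_registers)
    while pending:
        issued, written = [], set()
        while pending and len(issued) < WIDTH:
            dest, sources = pending[0]
            if any(src in written for src in sources):
                break  # hazard: must wait for an earlier result
            issued.append(pending.pop(0))
            written.add(dest)
        cycles.append(issued)
    return cycles

# r1 = ...; r2 depends on r1; r3 is independent of both
program = [("r1", ()), ("r2", ("r1",)), ("r3", ())]
for n, group in enumerate(dispatch(program), 1):
    print("cycle", n, [dest for dest, _ in group])
# cycle 1 ['r1']
# cycle 2 ['r2', 'r3']
</source>

In this run, the first cycle issues only r1 because r2 must wait for r1's result, wasting the second slot; noticing that the independent r3 could have filled that slot is precisely the reordering performed by out-of-order execution.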
 
In the case where only a portion of the CPU is superscalar, the part that is not suffers a performance penalty due to scheduling stalls. The Intel [[P5 (microarchitecture)|P5]] [[Pentium (brand)|Pentium]] had two superscalar ALUs that could each accept one instruction per clock, but its FPU could not. Thus the P5 was integer superscalar but not floating point superscalar. Intel's successor to the P5 architecture, [[P6 (microarchitecture)|P6]], added superscalar capabilities to its floating point features, and therefore afforded a significant increase in floating point instruction performance.
== References ==
<references />