Manycore processing unit: Difference between revisions

#REDIRECT [[Manycore processor]]
A many-core processing unit (MPU) is a type of [[microprocessor]] characterized by:
* many [[microprocessor]] [[Multi-core (computing)|cores]] that use a standard [[instruction set]],
* an integrated low-[[Latency (engineering)|latency]] [[memory controller]],
* hardware [[accelerator (computing)|acceleration]] features for [[packet]] handling.
MPUs have emerged since 2000 as a new class of processor used for [[Embedded system|embedded]] applications in [[telecommunications]], [[networking]] and other areas.
 
== History and evolution ==
 
=== Origins: network processing units (NPU) ===
 
The 1990s saw the emergence of a class of device called the [[Network Processing Unit]] (NPU). NPUs were offered as a flexible technology for high-speed processing of packet network data, one that could replace [[ASIC]] and [[FPGA]] designs and reduce development time by substituting [[software]] development for costly hardware design and validation.
 
NPUs did not achieve the market success their vendors had hoped for[http://www.instat.com/r/nrep/2004/IN0401340NT.htm]. Nor did they quite meet the promises of broad applicability or ease of use, owing to difficulties in mapping NPUs to specific applications[http://books.google.com/books?vid=ISBN0121981576]. Consequently, the start-up companies that brought these products to market [[List of defunct network processor companies|are mostly defunct]], a process also driven by the [[1990s]] [[Dot-com bubble|technology market boom and bust]].
 
=== Emergence of the MPU ===
 
[[Intel]]’s offering in the NPU market, the IXP[http://www.intel.com/design/network/products/npfamily/index.htm] product line, was a device with many processor cores that Intel called micro-engines. The micro-engines were specialized for NPU tasks and had a special, IXP-specific instruction set, but were general enough to allow the IXP to be programmed for a wider range of applications than some other NPUs. Because of the specialized nature of the micro-engines, however, users had to learn to program highly pipelined designs in a new environment, with new tools, on a specialized target.
 
SiByte[http://www.broadcom.com/products/Enterprise-Networking/Communications-Processors], a start-up subsequently acquired by [[Broadcom]], took a similar path but used many standard [[MIPS architecture|MIPS]] cores instead of the proprietary micro-engines of the IXP family. This offered several important benefits. A standard development tool-set could be used, including the [[GNU Project|GNU]] [[GNU Compiler Collection|compiler]], although this was not always seamless and some users found it necessary to program in MIPS [[assembly language]] to meet performance targets. It was also possible to run the [[Linux]] [[operating system]] and applications on SiByte.
SiByte was the first product in the new class of Many-core Processing Unit devices, or MPU for short.
 
=== Modern MPU market ===
 
Following the Broadcom SiByte, several MPUs have come to market:
* Cavium Networks’ OCTEON[http://www.cavium.com/OCTEON_MIPS64.html]
* Raza Microelectronics’ XLR[http://www.razamicroelectronics.com/products/xlr.htm]
* Sun Microsystems’ Niagara UltraSPARC[http://www.sun.com/processors/UltraSPARC-T1/index.xml]
* PA-Semi’s PWRficient[http://www.pasemi.com/processors/index.html]
 
== Defining the MPU processor class ==
 
The common characteristics that define the MPU processor class are examined in turn.
 
Not all MPUs exhibit all the characteristics presented below but all meet enough of them to be identified as MPUs.
 
=== System on a chip ===
 
In contrast to [[x86|Intel Architecture]] and [[PowerPC]] general-purpose processors (GPPs), MPUs are aimed at embedded applications. As such, MPUs integrate many [[Peripheral device|peripheral]] functions that GPPs do not, and as a result most can be regarded as [[system-on-a-chip]] (SOC) devices.
 
=== Many cores ===
 
Each MPU product family currently offers up to 8 or 16 processor cores. This stands in contrast to multi-core general-purpose Intel Architecture and PowerPC processors, which typically have two processor cores and occasionally four.
 
=== Standard RISC instruction set ===
 
MPU cores have standard instruction sets:
* [[MIPS architecture|MIPS64]]: Broadcom SiByte, Cavium OCTEON, Raza XLR
* [[PowerPC]]: PA-Semi PWRficient
* [[SPARC]]: Sun UltraSPARC
All of these are industry-standard [[Reduced Instruction Set Computer]] (RISC) processor cores.
 
This characteristic stands in contrast to NPUs, which, to the extent that they were programmable, used specialized proprietary instruction sets. It also stands in contrast to multi-core Intel Architecture processors, which use a [[Complex instruction set computer|CISC]] instruction set.
 
=== Integrated memory controllers ===
 
Performance of typical MPU applications, such as [[packet]] processing and [[network control protocols]] (e.g. [[Signalling (telecommunications)|signalling]] and [[call control]]), is often sensitive to first-access [[memory latency]], i.e. the time taken to access memory that is not [[Cache|cached]] on chip, because these applications have a high cache miss rate. This latency is sometimes more important than peak [[memory bandwidth]]. To achieve low first-access latency, MPUs have integrated [[memory controllers]]. This distinguishes them from [[Intel]] and [[IBM]] general-purpose processors, which use separate memory controller devices adjacent to the processors and are optimized more for maximum bulk memory throughput.
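
The sensitivity to miss latency can be illustrated with the standard average memory access time relation (hit time plus miss rate times miss penalty). The following is a minimal sketch, not a model of any particular MPU: the function name and every number in it are assumptions chosen only to show how a lower miss penalty lowers the average when the miss rate is high.

```python
def average_access_time_ns(hit_time_ns, miss_rate, miss_penalty_ns):
    """Average memory access time: hit time plus miss rate times miss penalty."""
    if not 0.0 <= miss_rate <= 1.0:
        raise ValueError("miss_rate must be between 0 and 1")
    return hit_time_ns + miss_rate * miss_penalty_ns


# Illustrative values only (assumptions, not measurements of any product).
HIT_TIME_NS = 2.0                 # on-chip cache hit
MISS_RATE = 0.20                  # high miss rate, as in packet workloads
EXTERNAL_CONTROLLER_NS = 120.0    # assumed miss penalty via a separate controller
INTEGRATED_CONTROLLER_NS = 70.0   # assumed miss penalty via an on-chip controller

external = average_access_time_ns(HIT_TIME_NS, MISS_RATE, EXTERNAL_CONTROLLER_NS)
integrated = average_access_time_ns(HIT_TIME_NS, MISS_RATE, INTEGRATED_CONTROLLER_NS)
print(f"external controller:   {external:.1f} ns per access")
print(f"integrated controller: {integrated:.1f} ns per access")
```

With these assumed figures the miss term dominates the average, so shortening the miss penalty helps more than any gain in peak bandwidth would.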
 
=== Integrated streaming packet IO hardware ===
 
[[Embedded systems|Embedded]] packet processing and network control applications have heavy packet [[Input/output|IO]] loads, so many MPUs implement streaming packet interface functions in on-chip [[Computer hardware|hardware]] to offload these tasks from [[software]] running on the processor cores. [[Data link layer|Layer-two]] [[protocol]] termination (e.g. the [[Ethernet]] [[Media Access Control|MAC]] layer) in hardware, combined with packet [[Input/output|input and output]] [[Queue (data structure)|queues]], is typical. This contrasts with general-purpose processors, which normally use memory address-space oriented [[Interface (computer science)|interfaces]] such as [[PCI]], [[PCI-Express]] or [[HyperTransport]].
 
=== Packet processing acceleration hardware ===
 
Many MPU applications can benefit from specialized [[Hardware acceleration|acceleration]] hardware for common packet-processing tasks:
* [[Traffic management]], such as [[Class of service]] [[Queue (data structure)|queues]] with [[Congestion control|congestion controls]] like [[Tail drop|tail-drop]] and [[random early detection]]
* [[Scheduling algorithm]]s such as [[Priority queue|strict priority]] and [[weighted fair queueing]]
* [[Computer security|Security]] functions including: bulk encryption and decryption, random number generation, packet authentication hash computation
* Packet [[parsing]] and inspection algorithms
* [[Regular expression]] processing
* [[Data compression|Compression]] and decompression, necessary for inspection of compressed data
* Fast interfaces to external search devices such as [[Content-addressable memory|ternary content-addressable memories]], deep packet [[parsers]], [[longest prefix match]] engines etc.
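
Two of the congestion controls and one of the schedulers listed above can be sketched in software to show what the acceleration hardware offloads. This is a simplified illustration under stated assumptions, not a description of any vendor's hardware: the names <code>ClassQueue</code>, <code>red_drop_probability</code> and <code>strict_priority_dequeue</code> are hypothetical, and the random early detection curve is the basic linear form without the refinements used in full implementations.

```python
import random
from collections import deque


def red_drop_probability(avg_depth, min_th, max_th, max_p):
    """Simplified RED curve: 0 below min_th, linear up to max_p, 1 from max_th on."""
    if avg_depth < min_th:
        return 0.0
    if avg_depth >= max_th:
        return 1.0
    return max_p * (avg_depth - min_th) / (max_th - min_th)


class ClassQueue:
    """One class-of-service queue with tail drop and simplified RED (hypothetical model)."""

    def __init__(self, capacity, min_th, max_th, max_p=0.1, weight=0.2, rng=None):
        self.packets = deque()
        self.capacity = capacity
        self.min_th, self.max_th, self.max_p = min_th, max_th, max_p
        self.weight = weight      # smoothing factor for the average depth
        self.avg_depth = 0.0      # exponentially weighted moving average of depth
        self.rng = rng or random.Random()

    def enqueue(self, packet):
        """Return True if the packet was queued, False if it was dropped."""
        depth = len(self.packets)
        self.avg_depth += self.weight * (depth - self.avg_depth)
        if depth >= self.capacity:
            return False  # tail drop: the queue is full
        p = red_drop_probability(self.avg_depth, self.min_th, self.max_th, self.max_p)
        if self.rng.random() < p:
            return False  # early drop, chosen at random
        self.packets.append(packet)
        return True


def strict_priority_dequeue(queues):
    """Serve the first non-empty queue; `queues` is ordered highest priority first."""
    for queue in queues:
        if queue.packets:
            return queue.packets.popleft()
    return None


# Thresholds are set above the capacity here, so only tail drop can occur.
high = ClassQueue(capacity=2, min_th=100, max_th=200)
low = ClassQueue(capacity=2, min_th=100, max_th=200)
low.enqueue("bulk-1")
results = [high.enqueue(name) for name in ("voice-1", "voice-2", "voice-3")]
print(results)                               # [True, True, False]
print(strict_priority_dequeue([high, low]))  # voice-1
```

In the usage lines the third high-priority packet is tail-dropped because the queue is full, and the scheduler serves the high-priority queue before the low-priority one. Weighted fair queueing, which shares bandwidth between queues instead of starving the lower ones, is not shown.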
 
== MPU applications ==
 
=== In telecommunications and networking equipment ===
 
MPUs are emerging as a class of technology relevant for [[Embedded system|embedded]] applications in [[telecommunications]] and [[Computer network|networking]] equipment such as:
* [[Signalling (telecommunications)|Signalling]] applications, often considered [[control plane]] applications, including [[softswitch]], control functions in [[IP Multimedia Subsystem]] (IMS), control server (x-CSCF), [[signalling gateway]] (SGW), [[mobile switching centres]] (MSC).
* Bearer applications which pass or manipulate bearer traffic including both [[Telecommunication circuit|circuit]] and [[Internet Protocol|IP]] [[media gateway]]s, [[access network]] aggregators, [[Base Station Subsystem|Base Station Controller]] (BSC) and [[Radio Network Controller]] (RNC).
* [[Transport layer|Transport]] applications that are part of the [[access network]] including [[Digital subscriber line access multiplexer|IP-DSLAM]], [[Passive optical network|optical network termination]], optical line termination.
* [[Base station]] applications, a specialized class covering the needs of wireless base stations serving [[WiMAX]], [[4G]] and [[3GPP]]/[[3GPP2]] networks.
 
The high integration and specialized features that characterize MPU-class processors make them more efficient, and thus more cost-effective, than general-purpose processors in many of these applications.
 
MPUs are also sufficiently general as processors (they can run standard [[Linux]] [[Symmetric multiprocessing|SMP]] or simple lightweight [[Execution (computers)|executives]]) that one MPU-based computer design can address a wide selection of the applications listed above using different software loads. This reduces the number of different module types needed in a product line that addresses both control and packet-processing network functions, which in turn reduces development costs and inventory volumes.