Systolic array

{{Use dmy dates|date=October 2019}}
In [[parallel computing|parallel]] [[computer architectures]], a '''systolic array''' is a homogeneous [[Graph (discrete mathematics)|network]] of tightly coupled [[data processing unit]]s (DPUs) called cells or [[Node (computer science)|node]]s. Each node or DPU independently computes a partial result as a function of the data received from its upstream neighbours, stores the result within itself and passes it downstream. Systolic arrays were first used in [[Colossus computer|Colossus]], an early computer used to break German [[Lorenz cipher|Lorenz]] ciphers during [[World War II]].<ref>{{YouTube |id=g2tMcMQqSbA |title=Colossus - The Greatest Secret in the History of Computing |time=41m45s}}</ref> Because of the classified nature of Colossus, they were independently invented or rediscovered by [[H. T. Kung]] and [[Charles Leiserson]], who described arrays for many dense linear algebra computations (matrix product, solving systems of [[linear equation]]s, [[LU decomposition]], etc.) for banded matrices. Early applications include computing [[greatest common divisor]]s of integers and polynomials.<ref>http://www.eecs.harvard.edu/~htk/publication/1984-ieeetoc-brent-kung.pdf</ref> They are sometimes classified as [[Multiple instruction, single data|multiple-instruction single-data]] (MISD) architectures under [[Flynn's taxonomy]], but this classification is questionable, because a strong argument can be made to distinguish systolic arrays from all four of Flynn's categories: [[Single instruction, single data|SISD]], [[Single instruction, multiple data|SIMD]], [[Multiple instruction, single data|MISD]] and [[Multiple instruction, multiple data|MIMD]], as discussed later in this article.
 
The parallel input [[data]] flows through a network of hard-wired [[Microprocessor|processor]] nodes, which combine, process, [[merge algorithm|merge]] or [[sorting algorithm|sort]] the input data into a derived result. The name ''systolic'' was coined from medical terminology: the wave-like propagation of data through a systolic array resembles the [[pulse]] of the human circulatory system, and the term is derived from [[systole]] as an analogy to the regular pumping of blood by the heart.
 
==Classification controversy==
While systolic arrays are typically classified as [[Multiple instruction, single data|MISD]] architectures, their classification is somewhat problematic. Because the input is typically a vector of independent values, the systolic array is definitely not [[Single instruction, single data|SISD]]. Since these [[input (computer science)|input]] values are merged and combined into the result(s) and do not maintain their [[independence]] as they would in a [[Single instruction, multiple data|SIMD]] vector processing unit, the [[array data structure|array]] cannot be classified as SIMD either. Consequently, the array also cannot be classified as [[Multiple instruction, multiple data|MIMD]], because MIMD can be viewed as a mere collection of smaller SISD and [[Single instruction, multiple data|SIMD]] machines.
 
Finally, because the data [[swarm]] is transformed as it passes through the array from [[Node (computer science)|node]] to node, the multiple nodes are not operating on the same data, which makes the MISD classification a [[misnomer]]. The other reason a systolic array should not qualify as '''MISD''' is the same one that disqualifies it from the SISD category: the input data is typically a vector, not a '''s'''ingle '''d'''ata value, although one could argue that any given input vector is a single item of data.
An example of a systolic [[algorithm]] might be designed for [[matrix multiplication]]. One [[matrix (math)|matrix]] is fed in a row at a time from the top of the array and is passed down the array, while the other matrix is fed in a column at a time from the left-hand side of the array and passes from left to right. Dummy values are then passed in until each processor has seen one whole row and one whole column. At this point, the result of the multiplication is stored in the array and can now be output a row or a column at a time, flowing down or across the array.<ref>[http://web.cecs.pdx.edu/~mperkows/temp/May22/0020.Matrix-multiplication-systolic.pdf Systolic Array Matrix Multiplication]</ref>
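
This dataflow can be sketched in software. The following simulation is a minimal illustration only, not any particular hardware design; the function name <code>systolic_matmul</code> and the register names are invented here. It models an output-stationary ''n''×''n'' array: each cell keeps one accumulator of the result, operands of the first matrix shift rightwards and operands of the second shift downwards on every clock tick, and rows and columns are injected with a one-cycle skew per index so that matching operands meet in the right cell.

<syntaxhighlight lang="python">
def systolic_matmul(A, B):
    """Cycle-by-cycle sketch of an n x n output-stationary systolic array."""
    n = len(A)
    acc = [[0] * n for _ in range(n)]       # each cell accumulates one element of C
    a_reg = [[None] * n for _ in range(n)]  # operands of A, moving left -> right
    b_reg = [[None] * n for _ in range(n)]  # operands of B, moving top -> bottom

    for t in range(3 * n - 2):              # enough clock ticks for the wavefront to drain
        # Communicate phase: shift right/down and inject skewed rows/columns at the edges.
        new_a = [[None] * n for _ in range(n)]
        new_b = [[None] * n for _ in range(n)]
        for i in range(n):
            k = t - i                       # row i of A enters the left edge i cycles late
            new_a[i][0] = A[i][k] if 0 <= k < n else None
            for j in range(1, n):
                new_a[i][j] = a_reg[i][j - 1]
        for j in range(n):
            k = t - j                       # column j of B enters the top edge j cycles late
            new_b[0][j] = B[k][j] if 0 <= k < n else None
            for i in range(1, n):
                new_b[i][j] = b_reg[i - 1][j]
        a_reg, b_reg = new_a, new_b

        # Compute phase: every cell holding both operands does one multiply-accumulate.
        for i in range(n):
            for j in range(n):
                if a_reg[i][j] is not None and b_reg[i][j] is not None:
                    acc[i][j] += a_reg[i][j] * b_reg[i][j]
    return acc


# Sanity check: [[1, 2], [3, 4]] times [[5, 6], [7, 8]] equals [[19, 22], [43, 50]]
print(systolic_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
</syntaxhighlight>

With this skew, element ''A''[''i''][''k''] and element ''B''[''k''][''j''] both reach cell (''i'', ''j'') at clock tick ''i'' + ''j'' + ''k'', so each accumulator collects exactly the products belonging to one output element, and 3''n'' − 2 ticks suffice for the last wavefront to leave the array.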
 
Systolic arrays are arrays of [[data processing unit|DPU]]s which are connected to a small number of nearest-neighbour DPUs in a mesh-like topology. DPUs perform a sequence of operations on data that flows between them. Because traditional systolic array synthesis methods have been based on algebraic algorithms, only uniform arrays with linear pipelines can be obtained, so that the architecture is the same in all DPUs. The consequence is that only applications with regular data dependencies can be implemented on classical systolic arrays. Like [[Single instruction, multiple data|SIMD]] machines, clocked systolic arrays compute in "lock-step", with each processor alternating between a compute phase and a communicate phase. Systolic arrays with asynchronous handshaking between DPUs are instead called ''wavefront arrays''.
One well-known systolic array is Carnegie Mellon University's [[iWarp]] processor, which was manufactured by Intel. An iWarp system has a linear array processor connected by data buses going in both directions.
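
The lock-step alternation of compute and communicate phases can also be illustrated on a linear array. The sketch below is again only an illustrative assumption, not a model of iWarp or any other machine; the name <code>systolic_fir</code> is invented here. It simulates a one-dimensional systolic array with stationary weights computing the correlation ''y''[''i''] = Σ<sub>''j''</sub> ''w''[''j'']·''x''[''i''+''j'']: partial sums flow rightwards, input samples flow leftwards, and values are injected on alternate ticks so that each partial sum meets exactly the samples it needs.

<syntaxhighlight lang="python">
def systolic_fir(w, x):
    """Linear systolic array with stationary weights: y[i] = sum_j w[j] * x[i + j]."""
    K, N = len(w), len(x) - len(w) + 1   # K cells, N finished outputs
    y_reg = [None] * K                   # partial sums, moving left -> right
    x_reg = [None] * K                   # input samples, moving right -> left
    out = []

    for t in range(2 * (N + K)):
        # Communicate phase: collect a finished result at the right edge,
        # shift partial sums right and samples left, inject new values at the edges.
        if y_reg[K - 1] is not None:
            out.append(y_reg[K - 1])
        i = (t - K + 1) // 2             # output index injected this tick, if any
        new_y = 0 if t >= K - 1 and (t - K + 1) % 2 == 0 and i < N else None
        y_reg = [new_y] + y_reg[:-1]
        new_x = x[t // 2] if t % 2 == 0 and t // 2 < len(x) else None
        x_reg = x_reg[1:] + [new_x]

        # Compute phase: every cell holding both a partial sum and a sample
        # performs one multiply-accumulate with its stationary weight.
        for j in range(K):
            if y_reg[j] is not None and x_reg[j] is not None:
                y_reg[j] += w[j] * x_reg[j]
    return out


# Sanity check: weights [1, 2] over samples [1, 2, 3, 4] gives [5, 8, 11]
print(systolic_fir([1, 2], [1, 2, 3, 4]))
</syntaxhighlight>

Because partial sums and samples travel in opposite directions at one cell per tick, a given partial sum can meet each sample only once, which is why both streams are injected on alternate ticks; a wavefront array would obtain the same result by replacing the global clock with local handshakes between neighbouring cells.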
 
 
==See also==
* [[Multiple instruction, single data|MISD]] – Multiple Instruction Single Data, Example: Systolic Arrays
* [[iWarp]] – Systolic Array Computer, VLSI, Intel/CMU
* [[WARP (systolic array)]] – Systolic Array Computer, GE/CMU