Systolic array: Difference between revisions

Content deleted Content added
Line 102:
If one is willing to use N/2 processing elements (PE) each with a comparator and two registers, elements arranged in a stack-like fashion, an array (or stream) of size N can thus be sorted in 2N time by pushing its elements in while on every level of the systolic stack the maximum of the pair of elements stored in each PE is pushed further down. And after all the elements are pushed in, the process is reversed with the minimum element in each PE being popped out (or "pushed up"), resulting in the stream of elements coming out sorted in ascending order.<ref>M. H. Alsuwaiyel, *Parallel Algorithms*, World Scientific, 2022, Sec. 9.5 "An On-chip Bubble Sorter" (in Ch. 9 "Systolic Computation").</ref>
 
Sorting input arrays of larger size (N > P) than the number of processing elements (P) is somewhat complex to do efficiently with such a system, but can be realized (by adding an external serial processor) in O(N log N/log P) time. The serial processor needs to manage a "bucket B-tree", where each node in the [[B-tree]] has P "buckets" that are eventually each sorted in O(P) time using the PEs.<ref>Mikhail J. Atallah, Greg N. Frederickson, S. Rao Kosaraju, "Sorting with efficient use of special-purpose sorters", Information Processing Letters 1988 https://doi.org/10.1016/0020-0190(88)90075-0; also as Purdue CSD-TR 87-695 https://docs.lib.purdue.edu/cstech/602/</ref>
 
==Implementations==