Merge algorithm: Difference between revisions

Content deleted Content added
Introduced literature on another category of parallel merging ("parallel merge of two lists" section).
fixed minor referencing issue of last edit
Line 126:
There are also algorithms that introduce parallelism within a single instance of merging of two sorted lists. These can be used in field-programmable gate arrays ([[FPGA]]s), specialized sorting circuits, as well as in modern processors with single-instruction multiple-data ([[SIMD]]) instructions.
 
Existing parallel algorithms are based on modifications of the merge part of either the [[bitonic sorter]] or [[odd-even mergesort]] <ref name="flimsj">{{cite journal |last1=Papaphilippou |first1=Philippos |last2=Luk |first2=Wayne |last3=Brooks |first3=Chris |title=FLiMS: a Fast Lightweight 2-way Merger for Sorting |journal=IEEE Transactions on Computers |date=2022 |pages=1–12 |doi=10.1109/TC.2022.3146509}}</ref>. In 2018, Saitoh M. et al. introduced MMS <ref>{{cite journal |last1=Saitoh |first1=Makoto |last2=Elsayed |first2=Elsayed A. |last3=Chu |first3=Thiem Van |last4=Mashimo |first4=Susumu |last5=Kise |first5=Kenji |title=A High-Performance and Cost-Effective Hardware Merge Sorter without Feedback Datapath |journal=2018 IEEE 26th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM) |date=April 2018 |pages=197–204 |doi=10.1109/FCCM.2018.00038}}</ref> for FPGAs, which focused on removing a multi-cycle feedback datapath that prevented efficient pipelining in hardware. Also in 2018, Papaphilippou P. et al. introduced FLiMS <ref>{{cite journal |last1name=Papaphilippou"flimsj" |first1=Philippos |last2=Luk |first2=Wayne |last3=Brooks |first3=Chris |title=FLiMS: a Fast Lightweight 2-way Merger for Sorting |journal=IEEE Transactions on Computers |date=2022 |pages=1–12 |doi=10.1109/TC.2022.3146509}}</ref> that improved the hardware utilization and performance by only requiring <math>\log_2(P)+1</math> pipeline stages of {{math|''P/2''}} compare-and-swap units to merge with a parallelism of {{math|''P''}} elements per FPGA cycle.
 
== Language support ==