Revision as of 15:27, 5 October 2022 edit 86.188.218.117 (talk) Introduced literature on another category of parallel merging ("parallel merge of two lists" section). ← Previous edit		Revision as of 08:33, 6 October 2022 edit undo 86.188.218.117 (talk) fixed minor referencing issue of last edit Next edit →
Line 126: There are also algorithms that introduce parallelism within a single instance of merging of two sorted lists. These can be used in field-programmable gate arrays ([[FPGA]]s), specialized sorting circuits, as well as in modern processors with single-instruction multiple-data ([[SIMD]]) instructions. Existing parallel algorithms are based on modifications of the merge part of either the [[bitonic sorter]] or [[odd-even mergesort]] <ref name="flimsj">{{cite journal \|last1=Papaphilippou \|first1=Philippos \|last2=Luk \|first2=Wayne \|last3=Brooks \|first3=Chris \|title=FLiMS: a Fast Lightweight 2-way Merger for Sorting \|journal=IEEE Transactions on Computers \|date=2022 \|pages=1–12 \|doi=10.1109/TC.2022.3146509}}</ref>. In 2018, Saitoh M. et al. introduced MMS <ref>{{cite journal \|last1=Saitoh \|first1=Makoto \|last2=Elsayed \|first2=Elsayed A. \|last3=Chu \|first3=Thiem Van \|last4=Mashimo \|first4=Susumu \|last5=Kise \|first5=Kenji \|title=A High-Performance and Cost-Effective Hardware Merge Sorter without Feedback Datapath \|journal=2018 IEEE 26th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM) \|date=April 2018 \|pages=197–204 \|doi=10.1109/FCCM.2018.00038}}</ref> for FPGAs, which focused on removing a multi-cycle feedback datapath that prevented efficient pipelining in hardware. Also in 2018, Papaphilippou P. et al. introduced FLiMS <ref~~>{{cite~~ ~~journal \|last1~~name=~~Papaphilippou~~"flimsj" \|first1=Philippos \|last2=Luk \|first2=Wayne \|last3=Brooks \|first3=Chris \|title=FLiMS: a Fast Lightweight 2-way Merger for Sorting \|journal=IEEE Transactions on Computers \|date=2022 \|pages=1–12 \|doi=10.1109/~~TC.2022.3146509}}</ref~~> that improved the hardware utilization and performance by only requiring <math>\log_2(P)+1</math> pipeline stages of {{math\|''P/2''}} compare-and-swap units to merge with a parallelism of {{math\|''P''}} elements per FPGA cycle. == Language support ==

Merge algorithm: Difference between revisions