Citation bot (talk | contribs) Removed URL that duplicated identifier. Removed access-date with no URL. | Use this bot. Report bugs. | #UCB_CommandLine |
BrainStack (talk | contribs) Link suggestions feature: 3 links added. |
Line 391:
In its simplest form, given <math>p</math> sorted sequences <math>S_1, ..., S_p</math> distributed evenly on <math>p</math> processors and a rank <math>k</math>, the task is to find an element <math>x</math> with global rank <math>k</math> in the union of the sequences. This element can then be used to divide each <math>S_i</math> into two parts at a splitter index <math>l_i</math>, where the lower part contains only elements smaller than <math>x</math>, while the elements larger than <math>x</math> are located in the upper part.
The presented [[sequential algorithm]] returns the indices of the splits in each sequence, i.e. the indices <math>l_i</math> in sequences <math>S_i</math> such that <math>S_i[l_i]</math> has a global rank less than <math>k</math> and <math>\mathrm{rank}\left(S_i[l_i+1]\right) \ge k</math>.<ref>{{cite web |author=Peter Sanders |date=2019 |title=Lecture ''Parallel algorithms'' |url=http://algo2.iti.kit.edu/sanders/courses/paralg19/vorlesung.pdf |access-date=2020-05-02}}</ref>
'''algorithm'''<nowiki> msSelect(S : Array of sorted Sequences [S_1,..,S_p], k : int) </nowiki>'''is'''
'''for''' i = 1 '''to''' p '''do'''
Line 397:
'''while''' there exists i: l_i < r_i '''do'''
// pick [[Pivot element|pivot element]] in S_j[l_j], .., S_j[r_j], choose j uniformly at random
v := pickPivot(S, l, r)
'''for''' i = 1 '''to''' p '''do'''
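The selection idea can be illustrated with a minimal Python sketch. It is not a transcription of the pseudocode: the name <code>ms_select</code>, the half-open search ranges, and the explicit handling of duplicate keys are assumptions made for this illustration. It returns, for each sequence, the number of leading elements that belong to the <math>k</math> globally smallest elements, which differs slightly from the index convention used for <math>l_i</math> above.

```python
import random
from bisect import bisect_left, bisect_right


def ms_select(seqs, k, rng=None):
    """Return split counts c with sum(c) == k such that every element of
    seqs[i][:c[i]] is <= every element of seqs[j][c[j]:] (illustrative sketch)."""
    rng = rng or random.Random(0)  # fixed seed only for reproducibility
    total = sum(len(s) for s in seqs)
    if not 0 <= k <= total:
        raise ValueError("k must lie between 0 and the total number of elements")

    # Half-open candidate range [lo[i], hi[i]) for the split point of each sequence.
    lo = [0] * len(seqs)
    hi = [len(s) for s in seqs]

    while any(l < h for l, h in zip(lo, hi)):
        # Pick a pivot from a randomly chosen non-empty candidate range.
        j = rng.choice([i for i in range(len(seqs)) if lo[i] < hi[i]])
        v = seqs[j][rng.randrange(lo[j], hi[j])]

        # One binary search per sequence: elements < v and elements <= v.
        below = [bisect_left(s, v, lo[i], hi[i]) for i, s in enumerate(seqs)]
        upto = [bisect_right(s, v, lo[i], hi[i]) for i, s in enumerate(seqs)]

        if sum(below) >= k:
            hi = below  # the k smallest elements are all smaller than v
        elif sum(upto) <= k:
            lo = upto  # every element <= v belongs to the k smallest
        else:
            # Rank k falls inside the run of elements equal to v:
            # take everything below v, then fill up with copies of v.
            need = k - sum(below)
            for i in range(len(seqs)):
                take = min(need, upto[i] - below[i])
                below[i] += take
                need -= take
            return below
    return lo


print(ms_select([[1, 4, 7], [2, 5, 8], [3, 6, 9]], 4))  # [2, 1, 1]
```

Each iteration either returns or strictly shrinks the candidate range of the sequence the pivot was drawn from, so the loop terminates. In a parallel setting the binary searches for the different sequences are independent of each other.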
Line 439:
==== Practical adaption and application ====
The multiway merge sort algorithm scales well because of its high degree of parallelism, which allows the use of many processors. This makes the algorithm a viable candidate for sorting large amounts of data, such as those processed in [[computer cluster]]s. Also, since memory is usually not a limiting resource in such systems, the [[space complexity]] disadvantage of merge sort is negligible. However, other factors that are not taken into account when modelling on a [[Parallel random-access machine|PRAM]] become important in such systems. These include the [[memory hierarchy]], when the data does not fit into the processor's cache, and the communication overhead of exchanging data between processors, which can become a bottleneck when the data can no longer be accessed via shared memory.
[[Peter Sanders (computer scientist)|Sanders]] et al. have presented a [[bulk synchronous parallel]] algorithm for multilevel multiway mergesort, which divides <math>p</math> processors into <math>r</math> groups of size <math>p'</math>. All processors sort locally first. Unlike single-level multiway mergesort, these sequences are then partitioned into <math>r</math> parts and assigned to the appropriate processor groups. These steps are repeated recursively within those groups. This reduces communication and, in particular, avoids problems with many small messages. The hierarchical structure of the underlying real network can be used to define the processor groups (e.g. [[19-inch rack|racks]], [[Computer cluster|clusters]], ...).<ref name=":0" />
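The recursive grouping can be illustrated with a small sequential simulation in Python. This is only a sketch under stated assumptions, not the algorithm from the cited paper: processors are modelled as plain lists, the name <code>multilevel_sort</code> is hypothetical, and the splitters are read off a full merge of all data for brevity, whereas a real distributed implementation would determine them without gathering the data in one place.

```python
from bisect import bisect_right
from heapq import merge


def multilevel_sort(procs, r):
    """Simulate multilevel multiway mergesort on `procs`, a list holding one
    list of keys per simulated processor. Returns one sorted list per
    processor such that their concatenation is globally sorted."""
    if r < 2:
        raise ValueError("need at least two groups per level")
    p = len(procs)
    local = [sorted(data) for data in procs]  # every processor sorts locally
    n = sum(len(data) for data in local)
    if p <= 1 or n == 0:
        return local

    r = min(r, p)  # never more groups than processors
    # Processor groups: r contiguous blocks of near-equal size.
    sizes = [(g + 1) * p // r - g * p // r for g in range(r)]
    groups = [[[] for _ in range(size)] for size in sizes]

    # Simplification (assumption): take r - 1 splitters from the merged data.
    everything = list(merge(*local))
    splitters = [everything[g * n // r] for g in range(1, r)]

    # Each processor cuts its sorted run into r parts and "sends" part g
    # to one processor of group g.
    for src, data in enumerate(local):
        cuts = [0] + [bisect_right(data, s) for s in splitters] + [len(data)]
        for g in range(r):
            target = groups[g][src % sizes[g]]
            target.extend(data[cuts[g]:cuts[g + 1]])

    # Recurse independently inside every group.
    result = []
    for group in groups:
        result.extend(multilevel_sort(group, r))
    return result


procs = [[9, 3, 7], [1, 8, 2], [6, 5, 4], [0, 11, 10]]
out = multilevel_sort(procs, 2)
print([x for part in out for x in part])  # keys 0..11 in ascending order
```

In this simulation each processor passes at most one block to each of the <math>r</math> groups per level, rather than one block to every other processor, which is the effect on message counts described above.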