Merge algorithm

Suppose we are given two sorted lists A={a_1<a_2<…<a_m } and B={b_1<b_2<…<b_n }.
 
The goal of the merging algorithm is to determine the linearly ordered list of their union. The union list can also be represented by an array. Merging A and B amounts to making a series of pairwise comparisons between an element of list A and an element of list B. Given multiple algorithms that solve this problem, we are interested in the efficiency of each algorithm. Specifically, we are interested in the maximum number of comparisons, K(m,n), required over all possible orderings of the input elements. We say that an algorithm is optimal when this number of comparisons K(m,n) is minimum<ref name=":0">Hwang, F., & Lin, S. (1972). A Simple Algorithm for Merging Two Disjoint Linearly Ordered Sets. ''SIAM Journal on Computing'', 31–39. doi:10.1137/0201004</ref>.

== Binary merge ==
The simplest merge algorithm is the one employed in classical merge sort. This ''binary merge'' algorithm takes two sorted lists as input, and repeatedly picks off an element from either the "left" or "right" list, whichever is smaller. When one list is exhausted, it produces all elements from the other list.
 
== Usages ==
The merge algorithm plays a critical role in the [[merge sort]] algorithm, a comparison-based sorting algorithm. Conceptually, the merge sort algorithm consists of two steps<ref>[[Merge sort]]</ref>:
 
1. Recursively divide the list into sublists until each sublist contains only one element. A list of one element is considered sorted.
 
2. Repeatedly merge sublists to produce new sorted sublists until a single list contains all elements. This single list is the sorted list.
 
The merge algorithm is used repeatedly in the merge sort algorithm.
[[File:Merge sort algorithm diagram.svg|centre|thumb|An example for merge sort]]
 
An example of merge sort is given above. We are given an unsorted array of 7 integers. The array is divided into 7 partitions, each containing only one element. The partitions are then repeatedly merged to produce new sorted sublists until only one sublist remains.
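The two steps can also be illustrated with a short top-down merge sort sketch (a minimal, illustrative implementation; the helper names mergeHalves and mergeSortRange are made up for this sketch, not taken from a particular library):

<syntaxhighlight lang="c++">
#include <vector>

// Illustrative top-down merge sort: step 1 recursively splits the range,
// step 2 merges the two sorted halves.
static void mergeHalves(std::vector<int>& a, std::vector<int>& aux, int lo, int mid, int hi)
{
    int i = lo, j = mid;
    for (int k = lo; k < hi; ++k) {            // pick the smaller front element
        if      (i == mid)     aux[k] = a[j++];
        else if (j == hi)      aux[k] = a[i++];
        else if (a[j] < a[i])  aux[k] = a[j++];
        else                   aux[k] = a[i++];
    }
    for (int k = lo; k < hi; ++k) a[k] = aux[k];   // copy the merged run back
}

static void mergeSortRange(std::vector<int>& a, std::vector<int>& aux, int lo, int hi)
{
    if (hi - lo <= 1) return;                  // a one-element sublist is already sorted
    int mid = lo + (hi - lo) / 2;
    mergeSortRange(a, aux, lo, mid);           // step 1: sort each half
    mergeSortRange(a, aux, mid, hi);
    mergeHalves(a, aux, lo, mid, hi);          // step 2: merge the sorted halves
}

static void mergeSort(std::vector<int>& a)
{
    std::vector<int> aux(a.size());
    mergeSortRange(a, aux, 0, static_cast<int>(a.size()));
}
</syntaxhighlight>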
 
== Some existing algorithms ==
 
=== The "tape merge" algorithm<ref name=":0" /> ===
The tape merge algorithm is the most widely used algorithm for merging two sorted lists into a single sorted list. Conceptually, the tape merge algorithm works as follows:
 
Given A={a_1<a_2<…<a_m },

B={b_1<b_2<…<b_n }.

1. Compare a_m with b_n.

2. If a_m<b_n, output b_n as the largest remaining element, set n=n-1 and repeat step 1.

3. If a_m>b_n, output a_m as the largest remaining element, set m=m-1 and repeat step 1.
 
With the tape merge algorithm, the total number of comparisons in the worst case is <math>K_t(m,n)=m+n-1</math>.
 
==== Java implementation of the tape merge algorithm<ref>http://introcs.cs.princeton.edu/java/42sort/Merge.java.html</ref> ====
<syntaxhighlight lang="java">
static void merge(Comparable[] a, Comparable[] aux, int lo, int mid, int hi) {
    int i = lo, j = mid;
    // merge a[lo..mid-1] with a[mid..hi-1] into aux[0..hi-lo-1]
    for (int k = lo; k < hi; k++) {
        if      (i == mid)                 aux[k - lo] = a[j++];
        else if (j == hi)                  aux[k - lo] = a[i++];
        else if (a[j].compareTo(a[i]) < 0) aux[k - lo] = a[j++];
        else                               aux[k - lo] = a[i++];
    }
    // copy back the resulting merged list
    for (int k = lo; k < hi; k++)
        a[k] = aux[k - lo];
}
</syntaxhighlight>
 
The method merge takes five parameters and merges two subarrays into one, using the auxiliary array aux[] as extra space. Comparable[] a is the array containing the two sorted subarrays that require merging: the first subarray is a[lo]..a[mid-1] and the second is a[mid]..a[hi-1]; they are merged into a[lo]..a[hi-1]. Both subarrays are required to be in ascending order before merging. The method uses the array aux as scratch space: the merged result is first stored in aux and then copied back into Comparable[] a starting at position a[lo].
 
=== Binary merge algorithm<ref name=":0" /> ===
The binary merge algorithm is the basis of many faster merging algorithms. Conceptually, the algorithm can be described as follows:
 
1. Insert a_m into B using binary search. Suppose a_m belongs immediately after b_u.
 
2. Output a_m together with the elements of B that are greater than a_m (these can be eliminated, because they are larger than all remaining elements of A). Set m=m-1, n=u. With the redefined m and n, repeat step 1.
 
In the worst-case scenario, a_m is always greater than b_n and therefore no element in B can be eliminated. The number of comparisons is <math>K_s(m,n)=m\lceil\log_2(n+1)\rceil</math>.
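The following is a minimal C++ sketch of this procedure (illustrative only; the function name binaryMerge is made up, and std::lower_bound supplies the binary-search step):

<syntaxhighlight lang="c++">
#include <algorithm>
#include <vector>

// Illustrative sketch of the binary merge step described above.
// A and B are sorted ascending; the merged result is returned.
static std::vector<int> binaryMerge(std::vector<int> A, std::vector<int> B)
{
    std::vector<int> out(A.size() + B.size());
    int k = static_cast<int>(out.size());      // fill the output from the back
    while (!A.empty()) {
        int am = A.back();                     // a_m, the largest remaining element of A
        // Binary search for u: the elements B[u..n-1] are >= a_m
        int u = static_cast<int>(std::lower_bound(B.begin(), B.end(), am) - B.begin());
        // Those elements are larger than every remaining element of A,
        // so they can be placed (followed by a_m) at the tail of the output.
        for (int i = static_cast<int>(B.size()) - 1; i >= u; --i) out[--k] = B[i];
        out[--k] = am;
        B.resize(u);                           // n = u
        A.pop_back();                          // m = m - 1
    }
    // Whatever remains of B is smaller than everything already placed
    for (int i = static_cast<int>(B.size()) - 1; i >= 0; --i) out[--k] = B[i];
    return out;
}
</syntaxhighlight>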
 
The binary merge algorithm plays an important role in the [[Timsort]] algorithm<ref>Auger, N., Nicaud, C., & Pivoteau, C. (2015). Merge Strategies: From Merge Sort to TimSort. HAL, (hal-01212839).</ref>, the algorithm used to sort arrays of non-primitive type in Java SE 7. Timsort creates runs, i.e. subsets in non-descending order, and then merges these runs using the binary merge algorithm. When Timsort merges two runs, it allocates temporary memory the size of the shorter run, copies the shorter run into it, and uses the original storage to hold the resulting sorted list. For each element of the shorter run, Timsort uses binary search to find the correct position in the longer run. In the worst case, Timsort takes O(n log n) comparisons to sort an array of n elements.
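As an illustration of this merging scheme only (it omits Timsort's galloping mode and the rest of its machinery; the function name mergeRuns is made up), a merge of two adjacent runs that buffers the shorter run might look like this:

<syntaxhighlight lang="c++">
#include <vector>

// Merge the adjacent sorted runs a[lo..mid) and a[mid..hi), buffering only the
// shorter run. Simplified illustration, not Timsort's actual merge routine.
static void mergeRuns(std::vector<int>& a, int lo, int mid, int hi)
{
    if (mid - lo <= hi - mid) {
        // Left run is shorter: copy it out, then merge forward into a[lo..hi)
        std::vector<int> tmp(a.begin() + lo, a.begin() + mid);
        int i = 0, j = mid, k = lo;
        while (i < static_cast<int>(tmp.size()) && j < hi)
            a[k++] = (a[j] < tmp[i]) ? a[j++] : tmp[i++];
        while (i < static_cast<int>(tmp.size())) a[k++] = tmp[i++];
        // If the buffer ran out first, the rest of the right run is already in place
    } else {
        // Right run is shorter: copy it out, then merge backward into a[lo..hi)
        std::vector<int> tmp(a.begin() + mid, a.begin() + hi);
        int i = mid - 1, j = static_cast<int>(tmp.size()) - 1, k = hi - 1;
        while (i >= lo && j >= 0)
            a[k--] = (tmp[j] < a[i]) ? a[i--] : tmp[j--];
        while (j >= 0) a[k--] = tmp[j--];
        // If the buffer ran out first, the rest of the left run is already in place
    }
}
</syntaxhighlight>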
 
=== In-place merge ===
The logic behind in-place merge is slightly tricky. We are given two consecutive sorted sublists x[first]..x[mid] and x[mid+1]..x[last]. We compare the left-list element x[left] and the right-list element x[right]. If x[left] is less than or equal to x[right], then it is already in place in the resulting sorted list, so we increment left. Otherwise, x[right] needs to move down into the place currently occupied by x[left], and the elements from x[left] through x[right-1] need to shift up by one position. Since the boundary of the left sublist moves up by one in the process, we increment the left, mid and right pointers<ref>http://penguin.ewu.edu/cscd300/Topic/AdvSorting/MergeSorts/InPlace.html</ref>.<syntaxhighlight lang="java">
static void inPlaceMerge(Comparable[] x, int first, int mid, int last) {
    int left = first, right = mid + 1;
    while (left <= mid && right <= last) {
        // Select from the left sublist: already in place, just advance left
        if (x[left].compareTo(x[right]) <= 0) {
            left++;
        // Select from the right sublist: rotate x[left..right] and adjust the pointers
        } else {
            Comparable tmp = x[right];                 // will move to position left
            System.arraycopy(x, left, x, left + 1, right - left);
            x[left] = tmp;
            left++; mid++; right++;
        }
    }
    // Whatever remains in x[right..last] is already in place
}
</syntaxhighlight>
 
==== Complexity ====
In the worst case, every step requires shifting the entire remaining unmerged portion of the list by one position. This requires O(n^2) data movements.
 
It is important to note that the above implementation is a basic version of in-place merge. With more sophisticated implementations, in-place merging can be linearithmic, performing O(n log n) comparisons.
 
 
== Parallel merge ==
In [[parallel computing]], [[Array data structure|array]]s of sorted values may be merged efficiently using an [[all nearest smaller values]] computation.<ref>{{citation |first1=Omer |last1=Berkman |first2=Baruch |last2=Schieber |first3=Uzi |last3=Vishkin |author3-link=Uzi Vishkin |title=Optimal double logarithmic parallel algorithms based on finding all nearest smaller values |journal=Journal of Algorithms |volume=14 |pages=344–370 |year=1993 |issue=3 |doi=10.1006/jagm.1993.1018}}</ref> Parallel merge can also be implemented using a divide-and-conquer algorithm, developed and shown in pseudo-code in.<ref>{{Harvnb|Cormen|Leiserson|Rivest|Stein|2009|p=800}}</ref> This algorithm performs well when combined with a fast sequential merge as a base case for merging of small arrays. Implementation using Intel's Threading Building Blocks (TBB) and Microsoft's Parallel Pattern Library (PPL) to run on [[multi-core processor]]s is shown to perform well in practice.<ref>[http://drdobbs.com/high-performance-computing/229204454 V. J. Duvanenko, "Parallel Merge", Dr. Dobb's Journal, February 2011]</ref>

Parallel programming can thus accelerate merge algorithms. This section shows how to transform a serial merge algorithm, which runs on only one process, into an algorithm that can run on multiple processes or even multiple machines. Take a simple serial merge algorithm that merges two sorted lists as an example<ref name=":1">Cormen, T., Leiserson, C., Rivest, R., & Stein, C. (2009). ''Introduction to Algorithms'' (3rd ed.). Cambridge, Massachusetts: The MIT Press.</ref>:

<syntaxhighlight lang="c++">
void merge_int(int* a_start, int* a_end, int* b_start, int* b_end, int* dst)
{
    // Repeatedly copy the smaller of the two front elements to the output
    while (a_start < a_end && b_start < b_end) {
        if (*a_start <= *b_start) *dst++ = *a_start++;
        else                      *dst++ = *b_start++;
    }
    // Copy whatever remains of either input
    while (a_start < a_end) *dst++ = *a_start++;
    while (b_start < b_end) *dst++ = *b_start++;
}
</syntaxhighlight>
 
a_start, a_end, b_start, and b_end are pointers that specify the start and end of lists a and b, respectively; dst is the ___location of the output.
In the first while loop, the if statement compares the first element of list a with the first element of list b and writes whichever is smaller to the output list dst. The loop terminates when the end of either list a or list b is reached. The second and third while loops output any remaining elements from the source lists. The complexity of this merge algorithm is O(n), where n is the number of output elements, i.e. the number of source elements combined.

Notice that this procedure contains loops that traverse the entire sequence, which means it cannot be broken up into smaller tasks and distributed to multiple processes in an obvious fashion. Divide-and-conquer algorithms, on the other hand, can usually be parallelized easily. Let us modify the serial merge algorithm into a divide-and-conquer merge algorithm<ref name=":1" />.<syntaxhighlight lang="c++">
void merge_int_dac(int* src, int a1, int a2, int b1, int b2, int* dst, int r)
{
    int l1 = a2 - a1 + 1;                 // length of the first range
    int l2 = b2 - b1 + 1;                 // length of the second range
    if (l1 < l2)
    {
        // Ensure the first range is the longer one by swapping the ranges
        int t = a1; a1 = b1; b1 = t;
        t = a2; a2 = b2; b2 = t;
        t = l1; l1 = l2; l2 = t;
    }
    if (l1 == 0) return;                  // both ranges are empty
    int q1 = (a1 + a2) / 2;               // median position of the longer range
    // Find the first position q2 in the shorter range whose element is >= src[q1]
    int q2 = b2 + 1;
    for (int i = b1; i <= b2; ++i)
    {
        if (src[q1] <= src[i]) { q2 = i; break; }
    }
    int q3 = r + (q1 - a1) + (q2 - b1);   // final position of src[q1] in the output
    dst[q3] = src[q1];
    // Recursively merge the elements smaller and larger than src[q1]
    merge_int_dac(src, a1, q1 - 1, b1, q2 - 1, dst, r);
    merge_int_dac(src, q1 + 1, a2, q2, b2, dst, q3 + 1);
}
</syntaxhighlight>
 
Transforming the divide-and-conquer version of the serial merge algorithm into a parallel one turns out to be very simple using Intel Threading Building Blocks (TBB) together with C++ lambda functions. The following modification uses TBB's parallel_invoke() function, which invokes the functions passed as arguments simultaneously; parallel_invoke() can invoke up to ten functions.
 
<syntaxhighlight lang="c++">
#include <tbb/parallel_invoke.h>

void merge_int_dac_mp(int* src, int a1, int a2, int b1, int b2, int* dst, int r)
{
    int l1 = a2 - a1 + 1;
    int l2 = b2 - b1 + 1;
    if (l1 < l2)
    {
        // Ensure the first range is the longer one by swapping the ranges
        int t = a1; a1 = b1; b1 = t;
        t = a2; a2 = b2; b2 = t;
        t = l1; l1 = l2; l2 = t;
    }
    if (l1 == 0) return;
    int q1 = (a1 + a2) / 2;               // median position of the longer range
    // Find the first position q2 in the shorter range whose element is >= src[q1]
    int q2 = b2 + 1;
    for (int i = b1; i <= b2; ++i)
    {
        if (src[q1] <= src[i]) { q2 = i; break; }
    }
    int q3 = r + (q1 - a1) + (q2 - b1);
    dst[q3] = src[q1];
    // Perform the two recursive merges in parallel
    tbb::parallel_invoke(
        [&] { merge_int_dac_mp(src, a1, q1 - 1, b1, q2 - 1, dst, r); },
        [&] { merge_int_dac_mp(src, q1 + 1, a2, q2, b2, dst, q3 + 1); }
    );
}
</syntaxhighlight>
 
The function runs each of the two subtasks (merge_int_dac_mp) as a separate task, and the child tasks subsequently distribute the further divided work across more threads. Notice that this naïve implementation of parallel merge actually runs a lot slower than the single-process version because of the overhead of fine-grained task spawning. For example, the program will spawn two tasks to merge two lists with one element each, which clearly costs more time in spawning than in actually performing the merge. Further improvement can be made by stopping the recursion once the number of source elements to be merged drops below a certain threshold, as sketched below.
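A minimal sketch of this hybrid (the function name merge_int_dac_hybrid and the cutoff value are illustrative assumptions, not measured results; a good threshold is normally found by benchmarking):

<syntaxhighlight lang="c++">
#include <tbb/parallel_invoke.h>

// Same divide-and-conquer merge as above, but inputs below the cutoff are handled
// by the serial merge_int, avoiding the overhead of spawning tiny tasks.
const int MERGE_CUTOFF = 8192;   // illustrative value only; tune by measurement

void merge_int_dac_hybrid(int* src, int a1, int a2, int b1, int b2, int* dst, int r)
{
    int l1 = a2 - a1 + 1;
    int l2 = b2 - b1 + 1;
    if (l1 + l2 <= MERGE_CUTOFF) {
        // Small problem: fall back to the serial merge shown earlier
        merge_int(src + a1, src + a2 + 1, src + b1, src + b2 + 1, dst + r);
        return;
    }
    if (l1 < l2) {
        // Ensure the first range is the longer one
        int t = a1; a1 = b1; b1 = t;
        t = a2; a2 = b2; b2 = t;
        t = l1; l1 = l2; l2 = t;
    }
    int q1 = (a1 + a2) / 2;                    // median position of the longer range
    int q2 = b2 + 1;                           // insertion point of src[q1] in the shorter range
    for (int i = b1; i <= b2; ++i) {
        if (src[q1] <= src[i]) { q2 = i; break; }
    }
    int q3 = r + (q1 - a1) + (q2 - b1);
    dst[q3] = src[q1];
    tbb::parallel_invoke(
        [&] { merge_int_dac_hybrid(src, a1, q1 - 1, b1, q2 - 1, dst, r); },
        [&] { merge_int_dac_hybrid(src, q1 + 1, a2, q2, b2, dst, q3 + 1); }
    );
}
</syntaxhighlight>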
 
== Merge algorithms in programming languages' standard libraries ==
Some [[Programming language|programming languages]] provide a built-in merge function for sorted lists or collections.
 
=== C++ ===
The [[C++]] [[Standard Template Library]] has the function {{mono|std::merge}}, which merges two sorted ranges of [[iterator]]s, {{mono|std::inplace_merge}}, which merges two consecutive sorted ranges ''in-place'', and {{mono|std::set_union}}. In addition, the {{mono|std::list}} (linked list) class has its own {{mono|merge}} method which merges another list into itself. The type of the elements merged must support the less-than ({{mono|<}}) operator, or it must be provided with a custom comparator.
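For instance, the {{mono|merge}} method of {{mono|std::list}} can be used as follows (a minimal sketch; the values are illustrative):

<syntaxhighlight lang="c++">
#include <iostream>
#include <list>

int main()
{
    std::list<int> a = {1, 3, 5, 7};
    std::list<int> b = {2, 4, 6, 8};
    a.merge(b);                              // splices b's elements into a; b becomes empty
    for (int x : a) std::cout << x << ' ';   // prints: 1 2 3 4 5 6 7 8
}
</syntaxhighlight>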
 
==== std::merge<ref>http://www.cplusplus.com/reference/algorithm/merge/</ref> ====
 
===== Parameters =====
std::merge takes six parameters: first1, last1, first2, last2, d_first, and comp. first1 and last1 delimit the first range of elements; first2 and last2 delimit the second range of elements to merge; d_first is the beginning of the destination range; comp is the comparison function object (for example, an object that satisfies the requirements of Compare; it should return true if the first argument is less than the second argument).
 
It is important to note that the destination range should not overlap either of the input ranges, and the elements in both input ranges should already be ordered according to the comp criterion.
 
===== Complexity =====
At most std::distance(first1, last1) + std::distance(first2, last2) - 1 comparisons.
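Putting the parameters together, a minimal usage sketch (the data values and the greater-than comparator are illustrative):

<syntaxhighlight lang="c++">
#include <algorithm>
#include <functional>
#include <vector>

int main()
{
    // Both inputs are sorted in descending order, so a matching comparator is supplied.
    std::vector<int> a = {9, 7, 5, 1};
    std::vector<int> b = {8, 6, 2};
    std::vector<int> out(a.size() + b.size());
    std::merge(a.begin(), a.end(),      // first1, last1
               b.begin(), b.end(),      // first2, last2
               out.begin(),             // d_first
               std::greater<int>());    // comp
    // out now holds: 9 8 7 6 5 2 1
}
</syntaxhighlight>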
 
==== std::inplace_merge ====
The difference between inplace_merge and the previous merge is that the resulting merged list of inplace_merge starts at the initial position of the first list. The two sorted input ranges of inplace_merge must be consecutive, whereas std::merge does not require its input ranges to be consecutive. inplace_merge performs element moves or swaps in addition to comparisons, and its complexity varies depending on whether enough extra memory is available.
 
===== Parameters =====
inplace_merge takes two consecutive sorted ranges [first, middle) and [middle, last) and combines them into a single sorted range [first, last).

inplace_merge takes four parameters:

first is the initial position of the first range. It is also the initial position of the resulting merged range.

middle is the initial position of the second range. Because the two ranges must be consecutive, middle is also the position just past the end of the first range.

last is the position just past the end of the second range. It is also the position just past the end of the resulting merged range.

comp is the comparison function object. It returns true when the first argument should go before the second argument.
 
===== Complexity =====
If enough extra memory is available, the complexity is linear in the distance between first and last (N = distance(first, last)): it performs N-1 comparisons and up to 2N element moves.

If memory is limited, the complexity is at most linearithmic: it performs up to N*log(N) element comparisons and up to N swaps.
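A minimal usage sketch with two consecutive sorted ranges held in one vector (the values are illustrative):

<syntaxhighlight lang="c++">
#include <algorithm>
#include <vector>

int main()
{
    // One vector holding two consecutive sorted ranges: v[0..4) and v[4..8)
    std::vector<int> v = {1, 4, 6, 9, 2, 3, 5, 8};
    std::inplace_merge(v.begin(), v.begin() + 4, v.end());   // first, middle, last
    // v is now: 1 2 3 4 5 6 8 9
}
</syntaxhighlight>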
 
=== Python ===