Content deleted Content added
optimality |
Why the name; Knuth; summation formula |
||
Line 1:
In [[computer science]], '''merge-insert sort''' or the '''Ford–Johnson algorithm''' is a [[comparison sort]]ing algorithm published in 1959 by [[L. R. Ford Jr.]] and [[Selmer M. Johnson]].{{r|fj|c4cs|distrib}} It uses fewer comparisons in the [[worst case analysis|worst case]] than the best previously known algorithms, [[insertion sort|binary insertion sort]] and [[merge sort]],{{r|fj}} and for 20 years it was the sorting algorithm with the fewest known comparisons.{{r|nonopt}} Although not of practical significance, it remains of theoretical interest in connection with the problem of sorting with a minimum number of comparisons.{{r|distrib}} The same algorithm may have also been independently discovered by S. Trybula and P. Czen.{{r|knuth}}
==Algorithm==
Merge-insert sort performs the following steps, on an input <math>X</math> of <math>n</math> elements:<ref>The original description by {{harvtxt|Ford|Johnson|1959}} sorted the elements in descending order. The steps listed here reverse the output,
#Group the elements of <math>X</math> into <math>\lfloor n/2\rfloor</math> pairs of elements, arbitrarily, leaving one element unpaired if there is an odd number of elements.
#Perform <math>\lfloor n/2\rfloor</math> comparisons, one per pair, to determine the larger of the two elements in each pair.
Line 25:
*<math>C(\lfloor n/2\rfloor)</math> comparisons for the recursive call,
*Some number of comparisons for the binary insertions used to insert the remaining elements.
In the third term, the worst-case number of comparisons for the elements in the first group is two, because each is inserted into a subsequence of <math>S</math> of length at most three. First, <math>y_4</math> is inserted into the three-element subsequence <math>(x_1,x_2,x_3)</math>. Then, <math>y_3</math> is inserted into some permutation of the three-element subsequence <math>(x_1,x_2,y_4)</math>, or in some cases into the two-element subsequence <math>(x_1,x_2)</math>. Similarly, the elements <math>y_6</math> and <math>y_5</math> of the second group are each inserted into a subsequence of length at most seven, using three comparisons. More generally, the worst-case number of comparisons for the elements in the <math>i</math>th group is <math>i+1</math>, because each is inserted into a subsequence of length at most <math>2^{i+1}-1</math>.{{r|fj|c4cs|distrib}} By summing the number of comparisons used for all the elements and solving the resulting [[recurrence relation]],
this analysis can be used to compute the values of <math>C(n)</math>, giving the formula<ref>{{harvtxt|Knuth|1998}} credits this formula to the 1960 Ph.D. thesis of A. Hadian.</ref>
:<math>C(n)=\sum_{i=1}^n \left\lceil \log_2 \frac{3i}{4} \right\rceil.</math>
:0, 1, 3, 5, 7, 10, 13, 16, 19, 22, 26, 30, 34, ... {{OEIS|A001768}}
[[Asymptotic analysis|Asymptotically]], these numbers are approximately{{r|fj}}
Line 35:
Merge-insert sort also performs fewer comparisons than the [[sorting number]]s, which count the comparisons made by binary insertion sort or merge sort in the worst case. The sorting numbers fluctuate between <math>n\log_2 n - 0.915n</math> and <math>n\log_2 n - n</math>, with the same leading term but a worse constant factor in the lower-order linear term.{{r|fj}}
==Relation to other comparison sorts==
The algorithm is called merge-insert sort because the initial comparisons that it performs before its recursive call (pairing up arbitrary items and comparing each pair) are the same as the initial comparisons of [[merge sort]],
For 20 years, merge-insert sort was the sorting algorithm with the fewest comparisons known (in the worst case, for each input length).▼
while the comparisons that it performs after the recursive call (using binary search to insert elements one by one into a sorted list) follow the same principal as [[insertion sort]]. In this sense, it is a [[hybrid algorithm]] that combines both merge sort and insertion sort.<ref>{{harvtxt|Knuth|1998}}, p. 184: "Since it involves some aspects of merging and some aspects of insertion, we call it ''merge insertion''."</ref>
Merge insert sort is the sorting algorithm with the fewest possible comparisons for <math>n</math> items whenever <math>n\le 14</math>, and for <math>20\le n\le 22</math>.{{r|pec}}
▲For 20 years, merge-insert sort was the sorting algorithm with the fewest comparisons known
However, in 1979 Glenn Manacher published another sorting algorithm that used even fewer comparisons, for large enough inputs.{{r|distrib|nonopt}}
It remains unknown exactly how many comparisons are needed for sorting, for all <math>n</math>, but Manacher's algorithm
Line 77 ⟶ 81:
| volume = 66
| year = 1959}}</ref>
<ref name=knuth>{{citation
| last = Knuth | first = Donald E. | author-link = Donald Knuth
| contribution = Merge insertion
| edition = 2nd
| pages = 184–186
| title = [[The Art of Computer Programming]], Vol. 3: Sorting and Searching
| year = 1998}}</ref>
<ref name=nonopt>{{citation
Line 87 ⟶ 99:
| title = The Ford-Johnson Sorting Algorithm Is Not Optimal
| volume = 26}}</ref>
<ref name=pec>{{citation
| last = Peczarski | first = Marcin
| doi = 10.1007/s00453-004-1100-7
| issue = 2
| journal = Algorithmica
| mr = 2072769
| pages = 133–145
| title = New results in minimum-comparison sorting
| volume = 40
| year = 2004}}</ref>
}}
|