Adaptive heap sort

{{Short description|Comparison-based sorting algorithm}}
In [[computer science]], '''adaptive heap sort''' is a [[Comparison sort|comparison-based]] [[sorting algorithm]] of the [[Adaptive sort|adaptive sort family]]. It is a variant of [[Heapsort|heap sort]] that performs better when the data contains existing order. Published by [[Christos Levcopoulos]] and [[Ola Petersson]] in 1992, the algorithm utilizes a new measure of presortedness, ''Osc'', defined as the number of oscillations.<ref name=":0">{{Cite journal|last1=Levcopoulos|first1=C.|last2=Petersson|first2=O.|date=1993-05-01|title=Adaptive Heapsort|journal=Journal of Algorithms|volume=14|issue=3|pages=395–413|doi=10.1006/jagm.1993.1021|issn=0196-6774}}</ref> Instead of putting all the data into the heap as traditional heap sort does, adaptive heap sort puts only part of the data into the heap, so that the running time is reduced significantly when the presortedness of the data is high.<ref name=":0" />
 
== Heapsort ==
Heap sort is a sorting algorithm that uses the [[binary heap]] [[data structure]]. The method treats an array as a complete [[binary tree]] and builds a Max-Heap or Min-Heap to achieve sorting.<ref name=":1">{{Cite journal|last1=Schaffer|first1=R.|last2=Sedgewick|first2=R.|date=1993-07-01|title=The Analysis of Heapsort|journal=Journal of Algorithms|volume=15|issue=1|pages=76–100|doi=10.1006/jagm.1993.1031|issn=0196-6774}}</ref> It usually involves the following four steps.
 
# Build a Max-Heap (Min-Heap): put all the data into the heap so that every node is greater than or equal to (less than or equal to, for a ''Min-Heap'') each of its child nodes.
 
=== Oscillations (''Osc'') ===
For a sequence <math>X = \langle x_1, x_2, x_3, \dots ,x_n \rangle</math>, ''Cross''(''x<sub>i</sub>'') is defined as the number of edges of the line plot of ''X'' that are intersected by a horizontal line through the point (''i, x<sub>i</sub>''). Mathematically, it is defined through the set <math>\mathit{Cross}(x_i) = \{ j \mid 1 \leq j < n \text{ and } \min\{ x_j, x_{j+1}\} < x_i < \max\{ x_j, x_{j+1}\} \} \text{, for } 1\leq i \leq n</math>. The oscillation (''Osc'') of ''X'' is the total number of intersections, defined as <math>\mathit{Osc}(X) = \textstyle \sum_{i=1}^n \displaystyle\lVert \mathit{Cross}(x_i) \rVert</math>.<ref name=":0" />
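
The definition above can be evaluated directly. The following is a minimal Python sketch, not taken from the cited paper: the function names <code>cross</code> and <code>osc</code> are illustrative assumptions, indices are 0-based rather than 1-based, and the quadratic running time is chosen for clarity, not efficiency.

```python
def cross(xs, i):
    """Return the set of edge indices j (0-based) such that the edge
    from xs[j] to xs[j + 1] is strictly crossed by the horizontal
    line through xs[i]."""
    xi = xs[i]
    return {
        j
        for j in range(len(xs) - 1)
        if min(xs[j], xs[j + 1]) < xi < max(xs[j], xs[j + 1])
    }


def osc(xs):
    """Sum the sizes of the Cross sets over all elements (O(n^2) time)."""
    return sum(len(cross(xs, i)) for i in range(len(xs)))
```

Under this definition a sorted sequence has no oscillations, while a zig-zag sequence such as <code>[1, 5, 2, 4, 3]</code> has several, because its later elements lie strictly between the endpoints of several edges.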
 
=== Other measures ===
Besides the original ''Osc'' measure, other known measures include the number of inversions ''Inv'', the number of runs ''Runs'', the number of blocks ''Block'', and the measures ''Max'', ''Exc'' and ''Rem''. Most of these measures are related for adaptive heap sort. Some measures dominate the others: every Osc-optimal algorithm is Inv-optimal and Runs-optimal; every Inv-optimal algorithm is Max-optimal; and every Block-optimal algorithm is Exc-optimal and Rem-optimal.<ref name=":2">{{Cite book|last1=Edelkamp|first1=Stefan|last2=Elmasry|first2=Amr|last3=Katajainen|first3=Jyrki|s2cid=10325857|date=2011|editor-last=Iliopoulos|editor-first=Costas S.|editor2-last=Smyth|editor2-first=William F.|chapter=Two Constant-Factor-Optimal Realizations of Adaptive Heapsort|title=Combinatorial Algorithms|volume=7056|series=Lecture Notes in Computer Science|publisher=Springer Berlin Heidelberg|pages=195–208|doi=10.1007/978-3-642-25011-8_16|isbn=9783642250118}}</ref>
 
== Algorithm ==
 
Adaptive heap sort is a variant of heap sort that seeks optimality ([[Asymptotically optimal algorithm|asymptotically optimal]]) with respect to the lower bound derived with the measure of presortedness by taking advantage of the existing order in the data. In heap sort, for a sequence <math>X = \langle x_1, x_2, x_3, \dots ,x_n \rangle</math>, all ''n'' elements are put into the heap and the maximum (or minimum) is then extracted ''n'' times. Since the time of each max-extraction is logarithmic in the size of the heap, the total running time of standard heap sort is [[Big O notation|<math>\color{Blue} O(n \log n)</math>]].<ref name=":1" /> In adaptive heap sort, instead of putting all the elements into the heap, only the possible maximums of the data (max-candidates) are put into the heap, so the heap stays smaller and each extraction of the maximum (or minimum) takes less work.
 
First, a [[Cartesian tree]] is built from the input in <math>O(n)</math> time by putting the data into a binary tree in which each node is greater (or smaller) than all of its child nodes, and the root of the Cartesian tree is inserted into an empty binary heap. The algorithm then repeatedly extracts the maximum from the binary heap, retrieves the corresponding node in the Cartesian tree, and adds its left and right children (if any), which are themselves roots of Cartesian trees, to the binary heap. If the input is already nearly sorted, the Cartesian tree will be very unbalanced, with few nodes having both left and right children. As a result the binary heap remains small, which allows the algorithm to sort more quickly than <math>O(n\log n)</math> for inputs that are already nearly sorted.<ref>{{Cite web|url=http://www.keithschwarz.com/interesting/code/?dir=cartesian-tree-sort|title=Archive of Interesting Code|website=www.keithschwarz.com|access-date=2019-10-31}}</ref>
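
The procedure described above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the function names <code>build_cartesian_tree</code> and <code>adaptive_heap_sort</code> are hypothetical, the tree is rooted at the minimum (so the output is ascending) instead of the maximum used in the text, and Python's standard <code>heapq</code> module stands in for the binary heap.

```python
import heapq


def build_cartesian_tree(xs):
    """Build a min-rooted Cartesian tree over xs in O(n) time.

    Returns (root, left, right), where left[i] and right[i] hold the
    child indices of node i, or -1 when a child is absent.
    """
    n = len(xs)
    left = [-1] * n
    right = [-1] * n
    stack = []  # indices on the rightmost path, root at the bottom
    for i in range(n):
        last = -1
        # Pop nodes larger than xs[i]; the last one popped becomes
        # the left child of the new node.
        while stack and xs[stack[-1]] > xs[i]:
            last = stack.pop()
        left[i] = last
        if stack:
            right[stack[-1]] = i
        stack.append(i)
    root = stack[0] if stack else -1
    return root, left, right


def adaptive_heap_sort(xs):
    """Return the elements of xs in ascending order."""
    root, left, right = build_cartesian_tree(xs)
    if root == -1:
        return []
    # The heap only ever holds the current candidates for the minimum.
    heap = [(xs[root], root)]
    out = []
    while heap:
        value, i = heapq.heappop(heap)
        out.append(value)
        for child in (left[i], right[i]):
            if child != -1:
                heapq.heappush(heap, (xs[child], child))
    return out
```

For an already sorted input, every node of this tree has only a right child, so the heap never holds more than one element and each extraction is cheap. Heap entries are <code>(value, index)</code> pairs, so ties between equal values are broken by position.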
{{reflist}}
 
[[Category:Sorting algorithms]]
[[Category:Comparison sorts]]
[[Category:Heaps (data structures)]]