Partial sorting: Difference between revisions

Content deleted Content added
No edit summary
Line 5:
==Offline problems==
===Heap-based solution===
[[Heap (data structure)|Heaps]] admit a simple single-pass partial sort when {{mvar|k}} is fixed: insert the first {{mvar|k}} elements of the input into a max-heap. Then make one pass over the remaining elements, add each to the heap in turn, and remove the largest element. Each insertion operation takes {{math|''O''(log ''k'')}} time, resulting in {{math|''O''(''n'' log ''k'')}} time overall; this "partial heapsort" algorithm is practical for small values of {{mvar|k}} and in [[online algorithm|online]] settings.<ref name="aofa04slides"/> AnotherAn option"online isheapselect" toalgorithm builddescribed below, based on a min-heap for all the values (the build takes {{math|''O''(''n'')}}) and take out the head of the heap K times, each remove operation takes {{math|''O''(log ''n'')}}. In that case the algorithm takes {{math|''O''(''n'' + ''k'' log ''n'')}}.<ref name="aofa04slides"/>
 
Common in C++ STL implementations is a pass of [[Heap (data structure)#Applications|heapselect]] for an ''unsorted'' list of ''k'' elements, followed by a [[heapsort]] for the final result.<ref>{{cite web |title=std::partial_sort |url=https://en.cppreference.com/w/cpp/algorithm/partial_sort |website=en.cppreference.com}}</ref>
 
===Solution by partitioning selection===
A further relaxation requiring only a list of the {{mvar|k}} smallest elements, but without requiring that these be ordered, makes the problem equivalent to [[Selection algorithm#Partition-based selection|partition-based selection]]; the original partial sorting problem can be solved by such a selection algorithm to obtain an array where the first {{mvar|k}} elements are the {{mvar|k}} smallest, and sorting these, at a total cost of {{math|''O''(''n'' + ''k'' log ''k'')}} operations. A popular choice to implement this algorithm scheme is to combine [[quickselect]] and [[quicksort]]; the result is sometimes called "quickselsort".<ref name="aofa04slides"/>
 
Common in current (as of 2022) C++ STL implementations is a pass of [[Heap (data structure)#Applications|heapselect]] for an ''unsorted'' list of ''k'' elements, followed by a [[heapsort]] for the final result.<ref>{{cite web |title=std::partial_sort |url=https://en.cppreference.com/w/cpp/algorithm/partial_sort |website=en.cppreference.com}}</ref>
 
==={{anchor|Partial quicksort}} Specialised sorting algorithms===
Line 23:
partial_quicksort(A, p+1, j, k)
 
The resulting algorithm is called partial quicksort and requires an ''expected'' time of only {{math|''O''(''n'' + ''k'' log ''k'')}}, and is quite efficient in practice, especially if a [[selection sort]] is used as a base case when {{mvar|k}} becomes small relative to {{mvar|n}}. However, the worst-case time complexity is still very bad, in the case of a bad pivot selection. Pivot selection along the lines of the worst-case linear time selection algorithm (see {{section link|Quicksort|Choice of pivot}}) could be used to get better worst-case performance. Partial quicksort, quickselect (including the multiple variant), and quicksort can all be generalized into what is known as a ''chunksort''.<ref name="aofa04slides"/>
 
==Incremental sorting==
Incremental sorting is a version of the partial sorting problem where the input is given up front but {{mvar|k}} is unknown: given a {{mvar|k}}-sorted array, it should be possible to extend the partially sorted part so that the array becomes {{math|(''k''+1)}}-sorted.{{r|paredes}}
 
[[Heap (data structure)|Heaps]] lead to an {{math|''O''(''n'' + ''k'' log ''n'')}} "online heapselect" solution to incremental partial sorting: first "heapify", in linear time, the complete input array to produce a min-heap. Then extract the minimum of the heap {{mvar|k}} times.<ref name="aofa04slides">{{cite conference |author=Conrado Martínez |year=2004 |title=On partial sorting |url=http://www.lsi.upc.edu/~conrado/research/talks/aofa04.pdf |conference=10th Seminar on the Analysis of Algorithms}}</ref>
 
A different incremental sort can be obtained by modifying quickselect. The version due to Paredes and Navarro maintains a [[stack (data structure)|stack]] of pivots across calls, so that incremental sorting can be accomplished by repeatedly requesting the smallest item of an array {{mvar|A}} from the following algorithm:<ref name="paredes">{{Cite conference| doi = 10.1137/1.9781611972863.16| chapter = Optimal Incremental Sorting| title = Proc. Eighth Workshop on Algorithm Engineering and Experiments (ALENEX)| pages = 171–182| year = 2006| last1 = Paredes | first1 = Rodrigo| last2 = Navarro | first2 = Gonzalo| isbn = 978-1-61197-286-3| citeseerx = 10.1.1.218.4119}}</ref>