In [[computer science]], '''partial sorting''' is a [[Relaxation (approximation)|relaxed]] variant of the [[Sorting algorithm|sorting]] problem. Total sorting is the problem of returning a list of items such that its elements all appear in order, while partial sorting is returning a list of the ''k'' smallest (or ''k'' largest) elements in order. The other elements (above the ''k'' smallest ones) may also be
In terms of indices, in a partially sorted list, for every index ''i'' from 1 to ''k'', the ''i''-th element is in the same place as it would be in the fully sorted list: element ''i'' of the partially sorted list contains [[order statistic]] ''i'' of the input list.
==Offline problems==
===Heap-based solution===
[[Heap (data structure)|Heaps]] admit a simple single-pass partial sort when {{mvar|k}} is fixed: insert the first {{mvar|k}} elements of the input into a max-heap. Then make one pass over the remaining elements, add each to the heap in turn, and remove the largest element. Each insertion operation takes {{math|''O''(log ''k'')}} time, resulting in {{math|''O''(''n'' log ''k'')}} time overall; this "partial heapsort" algorithm is practical for small values of {{mvar|k}} and in [[online algorithm|online]] settings.<ref name="aofa04slides"/>
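The single-pass scheme described above can be sketched in Python as follows. This is a minimal illustration rather than a reference implementation: the function name <code>partial_sort_heap</code> is hypothetical, and the max-heap is simulated by negating values in Python's min-heap <code>heapq</code> module, which assumes numeric input.

```python
import heapq

def partial_sort_heap(items, k):
    """Return the k smallest items in ascending order (hypothetical helper).

    Assumes numeric items: a max-heap is simulated by storing negated
    values in heapq's min-heap.
    """
    if k <= 0:
        return []
    heap = []  # holds the negations of the k smallest items seen so far
    for x in items:
        if len(heap) < k:
            # Fill phase: insert the first k elements.
            heapq.heappush(heap, -x)
        else:
            # Add x, then remove the largest element, in O(log k).
            heapq.heappushpop(heap, -x)
    # The heap now holds the k smallest items; order them for output.
    return sorted(-v for v in heap)
```

Because only {{mvar|k}} values are kept at any time, the input can be any iterable, including a stream whose length is not known in advance.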
===Solution by partitioning selection===
A further relaxation requiring only a list of the {{mvar|k}} smallest elements, but without requiring that these be ordered, makes the problem equivalent to [[Selection algorithm#Partition-based selection|partition-based selection]]. The original partial sorting problem can then be solved by using such a selection algorithm to obtain an array whose first {{mvar|k}} elements are the {{mvar|k}} smallest, and then sorting these, at a total cost of {{math|''O''(''n'' + ''k'' log ''k'')}} operations. A popular way to implement this scheme is to combine [[quickselect]] and [[quicksort]]; the result is sometimes called "quickselsort".<ref name="aofa04slides"/>
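The select-then-sort scheme might be sketched in Python as below. The names <code>quickselsort</code> and <code>_partition</code> are chosen for illustration, a random pivot with Lomuto partitioning is assumed, and Python's built-in <code>sorted</code> stands in for quicksort in the final step.

```python
import random

def _partition(a, lo, hi):
    """Partition a[lo:hi] around a random pivot; return the pivot's final index."""
    r = random.randrange(lo, hi)
    a[r], a[hi - 1] = a[hi - 1], a[r]
    pivot = a[hi - 1]
    store = lo
    for i in range(lo, hi - 1):
        if a[i] < pivot:
            a[i], a[store] = a[store], a[i]
            store += 1
    a[store], a[hi - 1] = a[hi - 1], a[store]
    return store

def quickselsort(a, k):
    """Rearrange list a in place so that a[:k] holds the k smallest items in order."""
    k = max(0, min(k, len(a)))
    if k == 0:
        return
    lo, hi, target = 0, len(a), k - 1
    # Quickselect: narrow [lo, hi) until position k-1 holds its final element.
    while hi - lo > 1:
        p = _partition(a, lo, hi)
        if p == target:
            break
        if p < target:
            lo = p + 1
        else:
            hi = p
    # Selection leaves the k smallest items in a[:k], unordered; sort them.
    a[:k] = sorted(a[:k])
```

The order of the elements after position {{mvar|k}} is left unspecified by this sketch.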
Common in current (as of 2022) C++ STL implementations is a pass of [[Heap (data structure)#Applications|heapselect]] for a list of ''k'' elements, followed by a [[heapsort]] for the final result.<ref>{{cite web |title=std::partial_sort |url=https://en.cppreference.com/w/cpp/algorithm/partial_sort |website=en.cppreference.com}}</ref>
==={{anchor|Partial quicksort}} Specialised sorting algorithms===
More efficient than the aforementioned are specialized partial sorting algorithms based on [[mergesort]] and [[quicksort]]. In the quicksort variant, there is no need to recursively sort partitions which only contain elements that would fall after the {{mvar|k}}'th place in the final sorted array (starting from the "left" boundary). Thus, if the pivot falls in position {{mvar|k}} or later, we recurse only on the left partition; otherwise, both partitions are sorted recursively.<ref>{{cite conference |last=Martínez |first=Conrado |title=Partial quicksort |conference=Proc. 6th ACM-SIAM Workshop on Algorithm Engineering and Experiments and 1st ACM-SIAM Workshop on Analytic Algorithmics and Combinatorics |year=2004 |url=}}</ref>
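A minimal Python sketch of the quicksort variant is given below. It assumes zero-based indexing, a random pivot, and Lomuto partitioning; the helper name <code>_partition</code> is hypothetical, and the code is an illustration of the idea rather than the formulation from the cited source.

```python
import random

def _partition(a, lo, hi):
    """Partition a[lo:hi] around a random pivot; return the pivot's final index."""
    r = random.randrange(lo, hi)
    a[r], a[hi - 1] = a[hi - 1], a[r]
    pivot = a[hi - 1]
    store = lo
    for i in range(lo, hi - 1):
        if a[i] < pivot:
            a[i], a[store] = a[store], a[i]
            store += 1
    a[store], a[hi - 1] = a[hi - 1], a[store]
    return store

def partial_quicksort(a, k, lo=0, hi=None):
    """Sort list a in place just enough that a[:k] is the sorted k smallest items."""
    if hi is None:
        hi = len(a)
    if hi - lo <= 1:
        return
    p = _partition(a, lo, hi)
    # The left partition always overlaps the first k positions when reached.
    partial_quicksort(a, k, lo, p)
    # The right partition matters only if it still contains one of the
    # first k positions, i.e. the pivot landed before index k - 1.
    if p < k - 1:
        partial_quicksort(a, k, p + 1, hi)
```

For example, calling <code>partial_quicksort(a, 4)</code> on a list <code>a</code> leaves its four smallest elements, in ascending order, in <code>a[:4]</code>.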
The resulting algorithm is called partial quicksort and requires an ''expected'' time of only {{math|''O''(''n'' + ''k'' log ''k'')}}. It is quite efficient in practice, especially if a [[selection sort]] is used as a base case when {{mvar|k}} becomes small relative to {{mvar|n}}. However, as with quicksort, the worst-case running time remains quadratic when pivots are chosen badly. Pivot selection along the lines of the worst-case linear time selection algorithm (see {{section link|Quicksort|Choice of pivot}}) could be used to get better worst-case performance. Partial quicksort, quickselect (including the multiple variant), and quicksort can all be generalized into what is known as a ''chunksort''.<ref name="aofa04slides"/>
==Incremental sorting==
Incremental sorting is
[[Heap (data structure)|Heaps]] lead to an {{math|''O''(''n'' + ''k'' log ''n'')}} "online heapselect" solution to
The stack {{mvar|S}} is initialized to contain only the length {{mvar|n}} of {{mvar|A}}. {{mvar|k}}-sorting the array is done by calling {{math|IQS(''A'', ''i'', ''S'')}} for {{math|''i'' {{=}} 0, 1, 2, ...}}; this sequence of calls has [[average-case complexity]] {{math|''O''(''n'' + ''k'' log ''k'')}}, which is asymptotically equivalent to {{math|''O''(''n'' + ''k'' log ''n'')}}. The worst-case time is quadratic, but this can be fixed by replacing the random pivot selection by the [[median of medians]] algorithm.{{r|paredes}}
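The calling convention described above can be illustrated with the following Python sketch of incremental quicksort. It assumes zero-based indexing, a Python list used as the stack {{mvar|S}}, and a random pivot with Lomuto partitioning; the names <code>iqs</code> and <code>_partition</code> are chosen for illustration.

```python
import random

def _partition(a, lo, hi, r):
    """Partition a[lo:hi] around the element at index r; return its final index."""
    a[r], a[hi - 1] = a[hi - 1], a[r]
    pivot = a[hi - 1]
    store = lo
    for i in range(lo, hi - 1):
        if a[i] < pivot:
            a[i], a[store] = a[store], a[i]
            store += 1
    a[store], a[hi - 1] = a[hi - 1], a[store]
    return store

def iqs(a, i, stack):
    """Return the element of rank i (zero-based), placing it at a[i].

    stack holds pivot positions left by earlier calls, with its top
    (the last list element) being the nearest one at or after i.
    Calls must be made with i = 0, 1, 2, ... in order.
    """
    while stack[-1] != i:
        # Partition the segment between i and the nearest known pivot.
        r = random.randrange(i, stack[-1])
        stack.append(_partition(a, i, stack[-1], r))
    # The top of the stack equals i, so a[i] is already in its final place.
    stack.pop()
    return a[i]
```

A typical use initializes <code>stack = [len(a)]</code> and then evaluates <code>iqs(a, i, stack)</code> for successive values of <code>i</code>, stopping whenever enough elements have been produced.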
== Language/library support ==
* The [[C++]] standard specifies a library function called <code>[https://en.cppreference.com/w/cpp/algorithm/partial_sort std::partial_sort]</code>.
* The [[Python (programming language)|Python]] standard library includes functions <code>[https://docs.python.org/library/heapq.html#heapq.nlargest nlargest]</code> and <code>[https://docs.python.org/library/heapq.html#heapq.nsmallest nsmallest]</code> in its <code>heapq</code> module.
* The [[Julia_(programming_language)|Julia]] standard library includes a <code>[https://docs.julialang.org/en/v1/base/sort/#Base.Sort.PartialQuickSort PartialQuickSort]</code> algorithm used in <code>[https://docs.julialang.org/en/v1/base/sort/#Base.Sort.partialsort! partialsort!]</code> and variants.
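As a brief usage illustration of the Python functions listed above (the sample data and variable names are made up for the example):

```python
import heapq

scores = [5, 2, 9, 1, 7, 3]

# The three smallest values, returned in ascending order.
smallest = heapq.nsmallest(3, scores)

# The three largest values, returned in descending order.
largest = heapq.nlargest(3, scores)

# Both functions accept an optional key function, as sorted() does.
words = ["pear", "fig", "banana"]
shortest = heapq.nsmallest(2, words, key=len)
```

Both functions return new lists and leave the input unchanged.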
== See also ==
* [[Selection algorithm]]
==References==
{{reflist}}
== External links ==
[[Category:Sorting algorithms]]
[[Category:Online sorts]]
[[Category:Articles with example pseudocode]]