Partial sorting: Difference between revisions

Content deleted Content added
AnomieBOT (talk | contribs)
m Dating maintenance tags: {{Citation needed}} {{Clarify}}
copyedit, rewrite pseudocode, formal reference
Line 13:
It is possible to transform the list into a [[binary heap]] in Θ(''n'') time, and then traverse the heap using a modified [[breadth-first search]] algorithm that places the elements in a [[priority queue]]{{clarify|reason=a heap is a priority queue!|date=January 2014}} (instead of the ordinary queue that is normally used in a BFS), and terminate the scan after traversing exactly ''k'' elements. As the queue size remains O(''k'') throughout the traversal, it would require O(''k'' log ''k'') time to complete, leading to a time bound of O(''n'' + ''k'' log ''k'') on this algorithm.{{citation needed|date=January 2014}}
 
== OptimisedSpecialised sorting algorithms ==
More efficient than any of these are specialized partial sorting algorithms based on [[mergesort]] and [[quicksort]]. The simplest isIn the quicksort variation:variant, there is no need to recursively sort partitions which only contain elements that would fall after the ''{{mvar|k'}}'th place in the endfinal sorted array (starting from the "left" boundary). Thus, if the pivot falls in position ''{{mvar|k''}} or later, we recurse only on the left partition:<ref>{{cite conference |last=Martínez |first=Conrado |title=Partial quicksort |conference=Proc. 6th ACM-SIAM Workshop on Algorithm Engineering and Experiments and 1st ACM-SIAM Workshop on Analytic Algorithmics and Combinatorics |year=2004 |url=http://www.lsi.upc.edu/~conrado/research/reports/ALCOMFT-TR-03-50.pdf}}</ref>
 
'''function''' quicksortFirstKpartial_quicksort(listA, lefti, rightj, k)
'''if''' righti >< leftj
selectp pivotIndex betweenpivot(A, lefti, and rightj)
pivotNewIndexp := partition(listA, lefti, rightj, pivotIndexp)
quicksortFirstKpartial_quicksort(listA, lefti, pivotNewIndexp-1, k)
'''if''' pivotNewIndexp < leftj + k
quicksortFirstKpartial_quicksort(listA, pivotNewIndexp+1, rightj, k)
 
The resulting algorithm requires an ''expected'' time of only {{math|''O''(''n'' + ''k'' log ''k'')}}, and is quite efficient in practice, especially if we substitute selection sort when ''{{mvar|k''}} becomes small relative to ''{{mvar|n''}}. However, the worst-case time complexity is still very bad, in the case of a bad pivot selection. Pivot selection along the lines of the worst-case linear time selection algorithm could be used to get better worst-case performance.
 
Even better is if we don't require those ''{{mvar|k''}} items to be themselves sorted. Losing that requirement means we can ignore all partitions that fall entirely before, '''or''' after the ''{{mvar|k''}}th place. We recurse only into the partition that actually contains the ''{{mvar|k''}}th element itself.
 
'''function''' quickfindFirstKpartial_quicksort(listA, lefti, rightj, k)
'''if''' righti >< leftj
selectp pivotIndex betweenpivot(A, lefti, and rightj)
pivotNewIndexp := partition(listA, lefti, rightj, pivotIndexp)
'''if''' pivotNewIndexp > lefti + k ''// new condition''
quickfindFirstKpartial_quicksort(listA, lefti, pivotNewIndexp-1, k)
'''if''' pivotNewIndexp < lefti + k
quickfindFirstKpartial_quicksort(listA, pivotNewIndexp+1, rightj, k+lefti-pivotNewIndexp-1)
 
The resulting algorithm requires an expected time of only {{math|''O''(''n'')}}, which is the best such an algorithm can hope for.
 
==Tournament Algorithm==
Line 45:
* The [[C++]] standard specifies a library function called <code>[http://en.cppreference.com/w/cpp/algorithm/partial_sort std::partial_sort]</code>.
* The [[Python (programming language)|Python]] standard library includes functions <code>[http://docs.python.org/library/heapq.html#heapq.nlargest nlargest]</code> and <code>[http://docs.python.org/library/heapq.html#heapq.nsmallest nsmallest]</code> in its <code>heapq</code> module.
 
==References==
{{reflist}}
 
== See also ==
Line 51 ⟶ 54:
== External links ==
* J.M. Chambers (1971). [http://dl.acm.org/citation.cfm?id=362602 Partial sorting]. [[Communications of the ACM|CACM]] '''14'''(5):357–358.
* [http://www.siam.org/meetings/analco04/abstracts/CMartinez.pdf Partial quicksort]
 
[[Category:Sorting algorithms]]