Content deleted Content added
rewrite stuff that relates to selection |
→Data structure-based solutions: rewrite |
||
Line 6:
A further relaxation requiring only a list of the {{mvar|k}} smallest elements, but without requiring that these be ordered, makes the problem equivalent to [[Selection algorithm#Partition-based selection|partition-based selection]]; the original partial sorting problem can be solved by such a selection algorithm to obtain an array where the first {{mvar|k}} elements are the {{mvar|k}} smallest, and sorting these, at a total expected cost of {{math|''O''(''n'' + ''k'' log ''k'')}} operations.
==
[[Binary heap]]s lead to an {{math|''O''(''n'' + ''k'' log ''n'')}} solution to partial sorting: partial [[heapsort]]. First "heapify", in linear time, the complete input array. Then extract the minimum of the heap {{mvar|k}} times.<ref name="aofa04slides">{{cite conference |author=Conrado Martínez |year=2004 |title=On partial sorting |url=http://www.lsi.upc.edu/~conrado/research/talks/aofa04.pdf |conference=10th Seminar on the Analysis of Algorithms}}</ref>
A [[streaming algorithm|streaming]], single-pass partial sort is also possible using heaps or other [[priority queue]] data structures. First, insert the first {{mvar|k}} elements of the input into the structure. Then make one pass over the remaining elements, add each to the structure in turn, and remove the largest element. Each insertion operation also takes {{math|''O''(log ''k'')}} time, resulting in {{math|''O''(''n'' log ''k'')}} time overall; this algorithm is practical for small values of {{mvar|k}} and in [[online algorithm|online]] settings.<ref name="aofa04slides"/>
It is possible to transform the list into a [[binary heap]] in Θ(''n'') time, and then traverse the heap using a modified [[breadth-first search]] algorithm that places the elements in a [[priority queue]]{{clarify|reason=a heap is a priority queue!|date=January 2014}} (instead of the ordinary queue that is normally used in a BFS), starting with the root node, iteratively adding the children of the next node in the queue, and terminating the scan after traversing exactly ''k'' elements. As the queue size remains O(''k'') throughout the traversal, it would require O(''k'' log ''k'') time to complete, leading to a time bound of O(''n'' + ''k'' log ''k'') on this algorithm.{{citation needed|date=January 2014}}
|