Selection algorithm: Difference between revisions

Content deleted Content added
Partition-based selection: correction: median-of-medians is O(n), which was confusing in the original text and also caused my previous edit to be misleading
merging two sentences for simplicity.
Line 37:
{{further|Quickselect}}
 
Linear performance can be achieved by a partition-based selection algorithm, most basically [[quickselect]]. Quickselect is a variant of [[quicksort]] – in both one chooses a pivot and then partitions the data by it, but while Quicksort recurses on both sides of the partition, Quickselect only recurses on one side, namely the side on which the desired ''k''th element is. As with Quicksort, this has optimal average performance, in this case linear, but poor worst-case performance, in this case quadratic. This occurs for instance by taking the first element as the pivot and searching for the maximum element, if the data is already sorted. In practice this can be avoided by choosing a random element as pivot, which yields [[almost certain]] linear performance. Alternatively, a more careful deterministic pivot strategy can be used, such as [[median of medians]]. These are combined in the hybrid [[introselect]] algorithm (analogous to [[introsort]]), which starts with Quickselect but falls back to median of medians if progress is slow, resulting in both fast average performance and optimal worst-case performance. The worst-case time complexity performance isof O(''n'').
 
The partition-based algorithms are generally done in place, which thus results in partially sorting the data. They can be done out of place, not changing the original data, at the cost of O(''n'') additional space.