Selection algorithm: Difference between revisions

Content deleted Content added
Ken7ken (talk | contribs)
m Added time complexity for the described algorithm. Also corrected bad use of complexity notation( O(n) is the standard, not "n" ). Quicksort and Quicksearch are names of algorhtms, should start with capital.
Line 37:
{{see|Quickselect}}
 
Linear performance can be achieved by a partition-based selection algorithm, most basically [[quickselect]]. Quickselect is a variant of [[quicksort]] – in both one chooses a pivot and then partitions the data by it, but while quicksortQuicksort recurses on both sides of the partition, quickselectQuickselect only recurses on one side, namely the side on which the desired ''k''th element is. As with quicksortQuicksort, this has optimal average performance, in this case linear, but poor worst-case performance, in this case quadratic. This occurs for instance by taking the first element as the pivot and searching for the maximum element, if the data is already sorted. In practice this can be avoided by choosing a random element as pivot, which yields [[almost certain]] linear performance. Alternatively, a more careful deterministic pivot strategy can be used, such as [[median of medians]]. These are combined in the hybrid [[introselect]] algorithm (analogous to [[introsort]]), which starts with quickselectQuickselect but falls back to median of medians if progress is slow, resulting in both fast average performance and optimal worst-case performance. The average time complexity performance is O(n).
 
The partition-based algorithms are generally done in place, which thus results in partially sorting the data. They can be done out of place, not changing the original data, at the cost of ''O(n'') + O(1) additional space.
 
=== Median selection as pivot strategy ===
{{see|Median of medians}}
A median-selection algorithm can be used to yield a general selection algorithm or sorting algorithm, by applying it as the pivot strategy in quickselectQuickselect or quicksortQuicksort; if the median-selection algorithm is asymptotically optimal (linear-time), the resulting selection or sorting algorithm is as well. In fact, an exact median is not necessary – an approximate median is sufficient. In the [[median of medians]] selection algorithm, the pivot strategy computes an approximate median and uses this as pivot, recursing on a smaller set to compute this pivot. In practice the overhead of pivot computation is significant, so these algorithms are generally not used, but this technique is of theoretical interest in relating selection and sorting algorithms.
 
In detail, given a median-selection algorithm, one can use it as a pivot strategy in quickselectQuickselect, obtaining a selection algorithm. If the median-selection algorithm is optimal, meaning O(''n''), then the resulting general selection algorithm is also optimal, again meaning linear. This is because quickselectQuickselect is a [[decrease and conquer]] algorithm, and using the median at each pivot means that at each step the search set decreases by half in size, so the overall complexity is a [[geometric series]] times the complexity of each step, and thus simply a constant times the complexity of a single step, in fact <math>2 = 1/(1-1/2)</math> times (summing the series).
 
Similarly, given a median-selection algorithm or general selection algorithm applied to find the median, one can use it as a pivot strategy in quicksortQuicksort, obtaining a sorting algorithm. If the selection algorithm is optimal, meaning O(''n''), then the resulting sorting algorithm is optimal, meaning O(''n'' log ''n''). The median is the best pivot for sorting, as it evenly divides the data, and thus guarantees optimal sorting, assuming the selection algorithm is optimal. A sorting analog to median of medians exists, using the pivot strategy (approximate median) in quicksortQuicksort, and similarly yields an optimal quicksortQuicksort.
 
== Incremental sorting by selection ==