Selection algorithm: Difference between revisions

Content deleted Content added
Kleinberg & Tardos
source introselect
Line 30:
*A variation of quickselect chooses a pivot by randomly sampling a subset of <math>r</math> data values, for some sample size <math>r</math>, and then recursively selecting an element near position <math>rk/n</math> of the sample to use as the pivot. This choice causes the pivot to be likely to be close in the sorted sequence to the eventual result of the overall selection algorithm, so that each pivoting step eliminates as many values as possible. With a careful choice of sample size, and with the index of the recursive selection call chosen either somewhat above or somewhat below <math>rk/n</math> (to control which side of <math>k</math> the pivot is likely to land on), this method can achieve an expected number of comparisons that is <math>n+\min(k,n-k)+O(\sqrt n)</math>.{{r|floriv}}
*The [[median of medians]] method partitions the input into sets of five elements, and then uses some other method (rather than a recursive call) to find the median of each of these sets in constant time per set. It then recursively calls the same selection algorithm to find the median of these <math>n/5</math> medians, using the result as its pivot. It can be shown that, for this choice of pivot, <math>\max(|L|,|R|)\le 7n/10</math>. Thus, a problem on <math>n</math> elements is reduced to two recursive problems on <math>n/5</math> and at most <math>7n/10</math> elements. The total size of these two recursive subproblems is at most <math>9n/10</math>, allowing the total time to be analyzed as a geometric series adding to <math>O(n)</math>. Unlike quickselect, this algorithm is deterministic, not randomized.{{r|clrs|bfprt}} It was the first linear-time deterministic selection algorithm known,{{r|bfprt}} and is commonly taught in undergraduate algorithms classes as an example of a [[divide and conquer]] algorithm that does not divide into two equal subproblems. However, the high constant factors in its <math>O(n)</math> time bound make it unsuitable for practical use.
*Hybrid algorithms such as [[introselect]] can be used to achieve the practical performance of quickselect with a fallback to medians of medians guaranteeing worst-case <math>O(n)</math> time.{{r|musser}}
 
===Sublinear data structures===
Line 138:
| volume = 28
| year = 1999}}</ref>
 
<ref name=musser>{{cite journal
| last = Musser | first = David R.
| date = August 1997
| doi = 10.1002/(sici)1097-024x(199708)27:8<983::aid-spe117>3.0.co;2-#
| issue = 8
| journal = Software: Practice and Experience
| pages = 983–993
| publisher = Wiley
| title = Introspective Sorting and Selection Algorithms
| volume = 27}}</ref>
 
<ref name=iso>Section 25.3.2 of ISO/IEC 14882:2003(E) and 14882:1998(E)</ref>