Selection algorithm: Difference between revisions

Browse history interactively

← Previous edit

Content deleted Content added

VisualWikitext

Revision as of 20:20, 31 August 2023 edit Patar knight (talk \| contribs) Administrators 49,166 edits WP:SDJARGON Tags: Shortdesc helper Reverted ← Previous edit		Latest revision as of 20:59, 28 January 2025 edit undo David Eppstein (talk \| contribs) Autopatrolled, Administrators 235,585 edits Undid revision 1272462687 by 49.207.232.175 (talk) the values must have an ordering. They may not be presented in that ordering. Tag: Undo
(7 intermediate revisions by 6 users not shown)
Line 1: {{Short description\|~~Computer~~Method ~~science~~for ~~algorithim~~finding kth smallest value}} {{For\|simulated natural selection in genetic algorithms\|Selection (genetic algorithm)}} {{Good article}} Line 60: == Exact numbers of comparisons == [[File:Median of 5.svg\|thumb\|upright=0.9\|Finding the median of five values using six comparisons. Each step shows the comparisons to be performed next as yellow line segments, and a [[Hasse diagram]] of the order relations found so far (with smaller=lower and larger=higher) as blue line segments. The red elements have already been found to be greater than three others and so cannot be the median. The larger of the two elements in the final comparison is the median.]] [[Donald Knuth\|Knuth]] supplies the following triangle of numbers summarizing pairs of <math>n</math> and <math>k</math> for which the exact number of comparisons needed by an optimal selection algorithm is known. The {{nowrap\|<math>n</math>th}} row of the triangle (starting with <math>n=1</math> in the top row) gives the numbers of comparisons for inputs of <math>n</math> values, and the {{nowrap\|<math>k</math>th}} number within each row gives the number of comparisons needed to select the {{nowrap\|<math>k</math>th}} smallest value from an input of that size. The rows are symmetric because selecting the {{nowrap\|<math>k</math>th}} smallest requires exactly the same number of comparisons, in the worst case, as selecting the {{nowrap\|<math>k</math>th}} {{nowrap\|largest.{{r\|knuth}}}} {{center\|0}} Line 78: Very few languages have built-in support for general selection, although many provide facilities for finding the smallest or largest element of a list. A notable exception is the [[Standard Template Library]] for [[C++]], which provides a templated <code>nth_element</code> method with a guarantee of expected linear time.{{r\|skiena}} [[Python (programming language)\|Python]]'s standard library ~~(since 2.4)~~ includes <code>heapq.nsmallest</code> and <code>heapq.nlargest</code> ~~subroutines~~functions for returning the smallest or largest elements from a collection, in sorted order. ~~Different versions of Python have used different algorithms for these subroutines. As of Python version 3.13, the~~The implementation maintains a [[binary heap]], limited to holding <math>k</math> elements, and initialized to the first <math>k</math> elements in the collection. Then, each subsequent ~~items~~item of the collection may replace the largest or smallest element in the heap ~~(respectively~~if ~~for~~it ~~<code>heapq~~is smaller or larger than this element.~~nsmallest</code>~~ ~~and~~The algorithm's memory usage is superior to heapselect (the former only holds <~~code~~math>~~heapq.nlargest~~k</~~code~~math>) ifelements itin ismemory ~~smaller~~at ora ~~larger~~time ~~than~~while ~~this~~the ~~element~~latter requires manipulating the entire dataset into memory). ~~The~~ ~~worst-case~~Running time ~~for~~depends ~~this~~on ~~implementation~~data ordering. The best case is <math>O((n - k) + k\log k)</math>, ~~worse~~for ~~than~~already ~~the~~sorted data. The worst-case is <math>O(n+k\log nk)</math> ~~that~~for ~~would~~reverse ~~be achieved~~sorted ~~by heapselect~~data. ~~However,~~In ~~for random~~average ~~input sequences~~cases, there are likely to be few heap updates and most input elements are processed with only a single comparison. For example, extracting the 100 largest or smallest values out of 10,000,000 random inputs makes 10,009,401 comparisons on average.{{r\|python}} Since 2017, [[Matlab]] has included <code>maxk()</code> and <code>mink()</code> functions, which return the maximal (minimal) <math>k</math> values in a vector as well as their indices. The Matlab documentation does not specify which algorithm these functions use or what their running {{nowrap\|time is.{{r\|matlab}}}} Line 128: \| title = Proceedings of the 17th Annual ACM Symposium on Theory of Computing, May 6–8, 1985, Providence, Rhode Island, USA \| year = 1985\| doi-access = free \| isbn = 0-89791-151-2 }}</ref> Line 172 ⟶ 173: \| title = Space-Efficient Data Structures, Streams, and Algorithms – Papers in Honor of J. Ian Munro on the Occasion of His 66th Birthday \| volume = 8066 \| year = 2013\| isbn = 978-3-642-40272-2 }}</ref> <ref name=brown>{{cite journal Line 212 ⟶ 213: \| volume = 711 \| year = 1993\| hdl = 11858/00-001M-0000-0014-B748-C \| isbn = 978-3-540-57182-7 \| hdl-access = free }}</ref> Line 420 ⟶ 422: \| title = 2nd Symposium on Simplicity in Algorithms, SOSA 2019, January 8–9, 2019, San Diego, CA, USA \| volume = 69 \| year = 2019~~}}</ref>~~\| doi-access = free }}</ref> <ref name=kletar>{{cite book Line 474 ⟶ 477: \| publisher = IEEE Computer Society \| title = 55th IEEE Annual Symposium on Foundations of Computer Science, FOCS 2014, Philadelphia, PA, USA, October 18–21, 2014 \| year = 2014\| isbn = 978-1-4799-6517-5 }}</ref> <ref name=prt>{{cite journal