Revision as of 06:44, 1 April 2023 by David Eppstein (→Sublinear data structures: ce), compared with revision as of 06:45, 1 April 2023 by David Eppstein (→Sublinear data structures: ce)
Line 36:
When data is already organized into a [[data structure]], it may be possible to perform selection in an amount of time that is sublinear in the number of values. As a simple case of this, for data already sorted into an array, selecting the {{nowrap|<math>k</math>th}} element may be performed by a single array lookup, in constant time. For values organized into a two-dimensional array of {{nowrap|size <math>m\times n</math>,}} with sorted rows and columns, selection may be performed in time {{nowrap|<math>O\bigl(m\log(2n/m)\bigr)</math>,}} or faster when <math>k</math> is small relative to the array dimensions.{{r|frejoh}} Selection from data in a [[binary heap]] takes {{nowrap|time <math>O(k)</math>.}} This is independent of the size <math>n</math> of the heap, and faster than the <math>O(k\log n)</math> time bound that would be obtained from {{nowrap|[[best-first search]].{{r|frederickson}}}} This same method can be applied more generally to data organized as any kind of heap-ordered tree (a tree in which each node stores one value and the parent of each non-root node has a smaller value than its child).
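As a rough illustration of the best-first search baseline mentioned above (not the faster <math>O(k)</math> method, which is considerably more involved), the following Python sketch selects the <math>k</math>th smallest value from a binary min-heap stored as a list. The function name <code>kth_smallest_in_heap</code> and the array layout (children of index <code>i</code> at <code>2i+1</code> and <code>2i+2</code>) are assumptions made for this example.

```python
import heapq


def kth_smallest_in_heap(heap, k):
    """Return the k-th smallest value (1-indexed) of a binary min-heap.

    `heap` is assumed to be a list in the usual array layout, where the
    children of index i sit at indices 2*i + 1 and 2*i + 2.
    """
    if not 1 <= k <= len(heap):
        raise ValueError("k must satisfy 1 <= k <= len(heap)")

    # Frontier of candidate nodes, kept as (value, index) pairs in an
    # auxiliary priority queue. Initially only the root is a candidate.
    frontier = [(heap[0], 0)]

    # Remove the k - 1 smallest values; each removed node exposes its
    # children as new candidates.
    for _ in range(k - 1):
        _, i = heapq.heappop(frontier)
        for child in (2 * i + 1, 2 * i + 2):
            if child < len(heap):
                heapq.heappush(frontier, (heap[child], child))

    # The smallest remaining candidate is the k-th smallest value overall.
    return frontier[0][0]


if __name__ == "__main__":
    data = [5, 3, 8, 1, 9, 2, 7]
    heapq.heapify(data)
    print([kth_smallest_in_heap(data, k) for k in range(1, len(data) + 1)])
```

The sketch never modifies the input heap; it only explores the nodes near the root, which is why its running time depends on how many values are removed rather than on scanning the whole heap.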
The <math>O(k)</math>-time method of performing selection in a heap has been applied to problems of listing multiple solutions to combinatorial optimization problems, such as finding the [[k shortest path routing|{{mvar|k}} shortest paths]] in a weighted graph, by defining a [[State space (computer science)|state space]] of solutions in the form of an [[implicit graph|implicitly defined]] heap-ordered tree, and then applying this selection algorithm to this {{nowrap|tree.{{r|kpaths}}}} In the other direction, linear-time selection algorithms have been used as a subroutine in a [[priority queue]] data structure related to the heap, improving the time for extracting its {{nowrap|<math>k</math>th}} item from <math>O(\log n)</math> to {{nowrap|<math>O(\log^* n+\log k)</math>;}} here <math>\log^* n</math> is the {{nowrap|[[iterated logarithm]].{{r|bks}}}}

For a collection of data values undergoing dynamic insertions and deletions, the [[order statistic tree]] augments a [[self-balancing binary search tree]] structure with a constant amount of additional information per tree node, allowing insertions, deletions, and selection queries that ask for the {{nowrap|<math>k</math>th}} element in the current set to all be performed in <math>O(\log n)</math> time per {{nowrap|operation.{{r|clrs}}}} Going beyond the comparison model of computation, faster times per operation are possible for values that are small integers, on which binary arithmetic operations are {{nowrap|allowed.{{r|pattho}}}}

It is not possible for a [[streaming algorithms|streaming algorithm]] with memory sublinear in both <math>n</math> and <math>k</math> to solve selection queries exactly for dynamic data, but the [[count–min sketch]] can be used to solve selection queries approximately, by finding a value whose position in the ordering of the elements (if it were added to them) would be within <math>\varepsilon n</math> steps of <math>k</math>, for a sketch whose size is within logarithmic factors of <math>1/\varepsilon</math>.{{r|cormut}}
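The order statistic tree idea described above can be sketched in a few lines of Python. This is a minimal sketch under a loud assumption: it uses a plain, unbalanced binary search tree, so the <math>O(\log n)</math> bound holds only when the tree happens to stay shallow; a real order statistic tree would maintain the same subtree-size field inside a self-balancing tree. The names <code>Node</code>, <code>insert</code>, and <code>select</code> are chosen for this example only, and deletion is omitted.

```python
class Node:
    """A binary search tree node augmented with its subtree size."""

    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
        self.size = 1  # number of nodes in the subtree rooted here


def size(node):
    """Subtree size, treating a missing child as an empty subtree."""
    return node.size if node is not None else 0


def insert(node, key):
    """Insert key below node (no rebalancing) and return the subtree root."""
    if node is None:
        return Node(key)
    if key < node.key:
        node.left = insert(node.left, key)
    else:
        node.right = insert(node.right, key)
    # Restore the augmented field on the way back up.
    node.size = 1 + size(node.left) + size(node.right)
    return node


def select(node, k):
    """Return the k-th smallest key (1-indexed) in the subtree at node."""
    while node is not None:
        left = size(node.left)
        if k <= left:
            node = node.left          # answer lies in the left subtree
        elif k == left + 1:
            return node.key           # this node is the k-th smallest
        else:
            k -= left + 1             # skip the left subtree and this node
            node = node.right
    raise IndexError("k out of range")


if __name__ == "__main__":
    root = None
    for x in [50, 20, 70, 10, 30, 60, 80]:
        root = insert(root, x)
    print(select(root, 3))
```

Each selection query follows a single root-to-leaf path, using the stored sizes to decide which way to descend, so its cost is proportional to the height of the tree.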

Selection algorithm: Difference between revisions