Content deleted Content added
Cascade heap |
|||
Line 36:
When data is already organized into a [[data structure]], it may be possible to perform selection in an amount of time that is sublinear in the number of values. As a simple case of this, for data already sorted into an array, selecting the <math>k</math> element may be performed by a single array lookup, in constant time.
For data organized as a [[binary heap]] it is possible to perform selection in {{nowrap|time <math>O(k)</math>,}} independent of the size <math>n</math> of the whole tree, and faster than the <math>O(k\log n)</math> time bound that would be obtained from {{nowrap|[[best-first search]].{{r|frederickson}}}} This same method can be applied more generally to data organized as any kind of heap-ordered tree (a tree in which each node stores one value in which the parent of each non-root node has a smaller value than its child). This method of performing selection in a heap has been applied to problems of listing multiple solutions to combinatorial optimization problems, such as finding the [[k shortest path routing|{{mvar|k}} shortest paths]] in a weighted graph, by defining a [[State space (computer science)|state space]] of solutions in the form of an [[implicit graph|implicitly defined]] heap-ordered tree, and then applying this selection algorithm to this {{nowrap|tree.{{r|kpaths}}}} In the other direction, linear time selection algorithms have been used as a subroutine in a [[priority queue]] data structure related to the heap, improving the time for extracting its {{nowrap|<math>k</math>th}} item from <math>O(\log n)</math> to <math>O(\log^* n+\log k)</math>; here <math>\log^* n</math> is the [[iterated logarithm]].{{r|bks}}
For a collection of data values undergoing dynamic insertions and deletions, it is possible to augment a [[self-balancing binary search tree]] structure with a constant amount of additional information per tree node, allowing insertions, deletions, and selection queries that ask for the {{nowrap|<math>k</math>th}} element in the current set to all be performed in <math>O(\log n)</math> time per {{nowrap|operation.{{r|clrs}}}} It is not possible for a [[streaming algorithms|streaming algorithm]] with memory sublinear in both <math>n</math> and <math>k</math> to solve selection queries exactly for dynamic data, but the [[count–min sketch]] can be used to solve selection queries approximately, by finding a value whose position in the ordering of the elements (if it were added to them) would be within <math>\varepsilon n</math> steps of <math>k</math>, for a sketch whose size is within logarithmic factors of <math>1/\varepsilon</math>.{{r|cormut}}
Line 109:
| volume = 7
| year = 1973}}</ref>
<ref name=bks>{{citation
| last1 = Babenko | first1 = Maxim
| last2 = Kolesnichenko | first2 = Ignat
| last3 = Smirnov | first3 = Ivan
| doi = 10.1007/s00224-018-9866-1
| issue = 4
| journal = Theory of Computing Systems
| mr = 3942251
| pages = 637–646
| title = Cascade heap: towards time-optimal extractions
| volume = 63
| year = 2019}}</ref>
<ref name=brown>{{cite journal
|