Search data structure: Difference between revisions

Content deleted Content added
m moved Search data structures to Search data structure: Oops, name must be singular by Wikipedia rules
Started stub with old contents "comparison of data structures"
Line 1:
In [[computer science]], a '''search data structure''' is any [[data structure]] that allows the efficient retrieval of specific items from a [[set (mathematics)|set]] of items, such as a specific [[record (computing)|record]] from a [[database]].
This is a comparison of [[data structure]]s. All times are listed in [[big O notation]].
 
The simplest, most general, and least efficient search strcuture is merely an unordered sequential [[list (computing)|list]] of all the items. Locating the desired item in such a list, by the [[linear search]] method, inevitably requires a number of operations proportional to the number ''n'' of items, in the [[worst case complexity|worst case]] as well as in the [[average case complexity|average case]]. Useful search data structures allow faster retrieval; however, they are limited to queries of some specific kind. Moreover, since the cost of building such structures is at least proportional to ''n'', they only pay off if several queries are to be performed on the same database (or on a database that changes little between queries).
O(1) is faster than O(log n) and it faster than O(n).
 
'''Static''' search structures are designed for answering many [[queries]] on a fixed database; '''dynamic''' structures also allow insertion, deletion, or modification of items between successive queries. In the dynamic case, one must also consider the cost of fixing the search structure to account for the changes in the database.
 
==Classification==
 
The simplest kind of query is to locate a record that has a specific field (the ''key'') equal to a specified value ''v''. Other common kinds of query are "find the item with smallest (or largest) key value", "find the item with largest key value not exceeding ''v''", "find all items with key values between specified bounds ''v''<sub>min</sub> and ''v''<sub>max</sub>".
 
In certain databases the key values may be points in some [[dimension (geometry)|multi-dimensional space]]. For example, the key may be a geographic position ([[latitude]] and [[longitude]]) on the [[Earth]]. In that case, common kinds of queries are ''find the record with a key closest to a given point ''v''", or "find all items whose key lies at a given distance from ''v''", or "find all ietms within a specified region ''R'' of the space".
 
A common special case of the latter are simultaneous range queries on two or more simple keys, such as "find all employee records with salary between 50,000 and 100,000 and hired between 1995 and 2007".
 
===Single ordered keys===
*[[Array data structure|Array]] if the key values span a moderately compact interval.
*Priority-sorted list; see [[linear search]]
*Key-sorted array; see [[binary search]]
*[[Self-balancing binary search tree]]
 
===Finding the smallest element===
*[[Heap (data structure)|Heap]]
 
====Asymptotic worst-case analysis====
 
In this table, the [[big-O notation|notation O(''f''(''n'')]] means "proportional to ''f''(''n'')" in the worst case".
 
{| class="wikitable"
Line 40 ⟶ 63:
| O(n)
|-
| [[Self-balancing binary search tree|Balanced binary tree]]
| [[Logarithm|O(log n)]]
| O(log n)