Substring index: Difference between revisions

Content deleted Content added
ce
Line 29:
Specific data structures that can be used as substring indexes include:
* The [[suffix tree]], a [[radix tree]] of the suffixes of the string, allowing substring search to be performed symbol-by-symbol<ref name=bst/><ref name=gv/>
* The [[suffix array]], a sorted array of the starting positions of suffixes of the string, allowing substring search to be performed by [[binary search]]<ref name=bst/><ref name=gv/> Augmenting a suffix array with an [[LCP array]] of the lengths of common prefixes of consecutive suffixes allows the search to be performed symbol-by-symbol, matching the search time of the suffix tree.<ref>{{citation
| last1 = Manber | first1 = Udi | author1-link = Udi Manber
| last2 = Myers | first2 = Gene | author2-link = Eugene Myers
| doi = 10.1137/0222058
| issue = 5
| journal = [[SIAM Journal on Computing]]
| mr = 1237156
| pages = 935–948
| title = Suffix arrays: a new method for on-line string searches
| volume = 22
| year = 1993}}</ref>
* The [[compressed suffix array]], a data structure that combines [[data compression]] with the suffix array, allowing the structure to be stored in space sublinear in the text length<ref name=bst/><ref name=gv>{{citation
| last1 = Grossi | first1 = Roberto
Line 47 ⟶ 57:
| doi = 10.1145/1082036.1082039
| issue = 4
| journal = [[Journal of the ACM]]
| mr = 2164632
| pages = 552–581