Revision as of 23:02, 25 October 2005 edit MeltBanana (talk \| contribs) 12,587 edits m →References: fix cat ← Previous edit		Revision as of 20:47, 4 December 2005 edit undo Gaius Cornelius (talk \| contribs) Administrators 169,502 edits m One word in this context: "straightforward". Next edit →
Line 29: 11 3 racadabra To find where a given pattern ''P'' is a substring of ''S'', two binary searches are used to find the range of suffixes [[prefix]]ed by ''P''. The output of the search is the list of starting positions in this range. A search for "br" in the given example would give a left border 6 and a right border 7, giving suffixes 9 and 2. This means that "br" is a substring of "abracadabra" both at position 2 and 9. If implemented ~~straight forward~~straightforward, this binary search takes O(''m'' log ''n'') time, as most of the pattern P is compared at every step in the binary search in the wost case. To avoid redoing comparisons, extra data structures giving information about the longest common prefixes (LCP's) of suffixes are constructed, giving O(''m'' + log ''n'') search time. The key insight of the suffix array is to denote each suffix by its starting position only (the second column above). The resulting array of numbers, combined with the original string, is a compact representation of the sorted suffix list, consuming one character and one integer for each character in the string.

Suffix array: Difference between revisions