Revision as of 23:56, 20 May 2020 edit Dhurandhar.bhatvadekar (talk \| contribs) 12 edits m removed unnecessary word Tag: Visual edit ← Previous edit		Revision as of 18:34, 11 August 2020 edit undo Enervation (talk \| contribs) Extended confirmed users 1,457 edits →Applications: Clarify search code Tag: Visual edit Next edit →
Line 186: <syntaxhighlight lang="python"> n = len(S) def search(P: str) -> Tuple[int, int]: """ Return indices (s, r) such that the interval A[s:r] (including the end index) represents all suffixes of S that start with the pattern P. """ # Find starting position of interval l = 0 r = n + 1 while l < r: mid = (l + r) // 2 # division rounding down to nearest integer # suffixAt(A[i]) is the ith smallest suffix if P > suffixAt(A[mid]): l = mid + 1 Line 196 ⟶ 202: r = mid s = l # Find ending position of interval r = n + 1 while l < r: mid = (l + r) // 2 if ~~P not a prefix of~~ suffixAt(A[mid]).startswith(P): r = mid▼ else:▼ l = mid + 1 ▲ else: ▲ r = mid return (s, r) </syntaxhighlight> Line 209 ⟶ 217: Suffix sorting algorithms can be used to compute the [[Burrows–Wheeler transform\|Burrows–Wheeler transform (BWT)]]. The [[Burrows–Wheeler transform\|BWT]] requires sorting of all cyclic permutations of a string. If this string ends in a special end-of-string character that is lexicographically smaller than all other character (i.e., $), then the order of the sorted rotated [[Burrows–Wheeler transform\|BWT]] matrix corresponds to the order of suffixes in a suffix array. The [[Burrows–Wheeler transform\|BWT]] can therefore be computed in linear time by first constructing a suffix array of the text and then deducing the [[Burrows–Wheeler transform\|BWT]] string: <math>BWT[i] = S[A[i]-1]</math>. Suffix arrays can also be used to look up substrings in [[~~Example~~example-~~Based~~based ~~Machine~~machine ~~Translation~~translation]], demanding much less storage than a full [[phrase table]] as used in [[Statistical machine translation]]. Many additional applications of the suffix array require the [[LCP array]]. Some of these are detailed in the [[LCP array#Applications\|application section]] of the latter.

Suffix array: Difference between revisions