LCP array: Difference between revisions

Content deleted Content added
OAbot (talk | contribs)
m Open access bot: hdl, doi added to citation with #oabot.
Definition: Change Example to use 1-based indexing to match definition section
Line 35:
 
== Definition ==
Let <math>A</math> be the [[suffix array]] of the string <math>S=s_1,s_2,\ldots s_ns_{n-1}\$</math> andof letlength <math>\operatorname{lcp}(vn</math>,w) where <math>\$</math> denoteis thea lengthsentinel ofletter thethat longestis commonunique prefixand between[[Lexicographic twoorder|lexicographically]] stringssmaller than any other character. Let <math>vS[i,j]</math> anddenote the substring of <math>wS</math>. Letranging further denotefrom <math>S[i,j]</math> theto substring<math>j</math>. ofThus, <math>S[A[i], n]</math> rangingis fromthe <math>i</math>th tosmallest suffix of <math>jS</math>.
 
Let <math>\operatorname{lcp}(v,w)</math> denote the length of the longest common prefix between two strings <math>v</math> and <math>w</math>. Then the LCP array <math>H[1,n]</math> is an integer array of size <math>n</math> such that <math>H[1]</math> is undefined and <math>H[i]=\operatorname{lcp}(S[A[i-1],n],S[A[i],n])</math> for every <math>1<i\leq n</math>. Thus <math>H[i]</math> stores the length of longest common prefix of the [[Lexicographical order|lexicographically]] <math>i</math>'th smallest suffix and its predecessor in the suffix array.
 
== Difference between suffixLCP array and LCPsuffix array ==:
 
* Suffix array: Represents the lexicographic rank of each suffix of an array.
* LCP array: Contains the maximum length prefix match between two consecutive suffixes, after they are sorted lexicographically.
 
== Example ==
Line 45 ⟶ 50:
|-
! {{left header}} | i
| 0 || 1 || 2 || 3 || 4 || 5 || 6 || 7
|-
! {{left header}} | S[i]
Line 53 ⟶ 58:
{| class="wikitable"
! {{left header}} | i
| 0 || 1 || 2 || 3 || 4 || 5 || 6 || 7
|-
! {{left header}} | A[i]
| 67 || 56 || 34 || 12 || 01 || 45 || 23
|}
 
Complete suffixSuffix array with sorted suffixes itselfwritten out underneath vertically:
 
{| class="wikitable"
! {{left header}} | i
| 0 || 1 || 2 || 3 || 4 || 5 || 6 || 7
|-
! {{left header}} | A[i]
| 67 || 56 || 34 || 12 || 01 || 45 || 23
|-
! {{left header}} | S[A[i], n][1]
| $ || a || a || a || b || n || n
|-
! {{left header}} | S[A[i], n][2]
| || $ || n || n || a || a || a
|-
! {{left header}} | S[A[i], n][3]
| || || a || a || n || $ || n
|-
! {{left header}} | S[A[i], n][4]
| || || $ || n || a || || a
|-
! {{left header}} | S[A[i], n][5]
| || || || a || n || || $
|-
! {{left header}} | S[A[i], n][6]
| || || || $ || a || ||
|-
! {{left header}} | S[A[i], n][7]
| || || || || $ || ||
 
Line 96 ⟶ 101:
|-
! {{left header}} | i
| 0 || 1 || 2 || 3 || 4 || 5 || 6 || 7
|-
! {{left header}} | H[i]
| undefined || 0 || 1 || 3 || 0 || 0 || 2
|}
 
So, for example, <math>H[3]=3</math> is the length of the longest common prefix <math>ana</math> shared by the suffixes <math> A[23]=S[4,7]=ana\$</math> and <math>A[34]=S[2,7]=anana\$</math>. Note that <math>H[01]=\bot</math> is undefined, since there is no lexicographically smaller suffix.
 
== Difference between suffix array and LCP array ==
Suffix array: Represents the lexicographic rank of each suffix of an array.
 
LCP array: Contains the maximum length prefix match between two consecutive suffixes, after they are sorted lexicographically.
 
== LCP array usage in finding the number of occurrences of a pattern ==