Knuth–Morris–Pratt algorithm: Difference between revisions

Content deleted Content added
Ryan Reich (talk | contribs)
Add C code example to the search algorithm.
Ryan Reich (talk | contribs)
Add C code example to the table algorithm.
Line 100:
!<math>i</math>
| -1
| 0
| 1
| 2
| 3
| 4
| 5
| 6
|-
!<math>P[i]</math>
|
| A
| B
| C
| D
| A
| B
| D
|-
!<math>T[i]</math>
| -1
| 0
| 0
| 0
| 0
| 1
| 2
| 0
|}
 
Line 137:
# If <math>i = n</math> then we are done; terminate the algorithm. Otherwise, compare <math>P[i]</math> with <math>P[j]</math>.
#* If they are equal, set <math>T[i] = j + 1</math>, <math>j = j + 1</math>, and <math>i = i + 1</math>.
#* If not, and if <math>j > 0</math>, set <math>j = T[j - 1]</math>.
#* Otherwise, set <math>T[i] = 0</math>, <math>i = i + 1</math>.
# Return to step 3.
 
===EfficiencyThe efficiency of the table-building algorithm===
 
The complexity of the table algorithm is <math>O(n)</math>, where <math>n</math> is the length of <math>P</math>. As except for some initialization all the work is done in step 3, it is sufficient to show that step 3 executes in <math>O(n)</math> time, which will be done by simultaneously examining the quantities <math>i</math> and <math>i - j</math>. In the first branch, <math>i - j</math> is preserved, as both <math>i</math> and <math>j</math> are incremented simultaneously, but naturally, <math>i</math> is increased. In the second branch, <math>j</math> is replaced by <math>T[j - 1]</math>, which we saw above is always strictly less than <math>j</math>, thus increasing <math>i - j</math>. In the third branch, <math>i</math> is incremented and <math>j</math> is not, so both <math>i</math> and <math>i - j</math> increase. Since <math>i \geq i - j</math>, this means that at each stage either <math>i</math> or a lower bound for <math>i</math> increases; therefore since the algorithm terminates once <math>i = n</math>, it must terminate after at most <math>n</math> iterations of step 3, since <math>i - j</math> begins at <math>1</math>. Therefore the complexity of the table algorithm is <math>O(n)</math>.
 
===Code example for the table-building algorithm===
 
A [[C programming language|C]] code example for the table-building algorithm is given here. As for the search algorithm, the bounds of <math>T</math> have been incremented by 1 to make the C code more natural. The additional variable <math>c</math> is employed to simulate the existence of <math>P[-1]</math>. It is also assumed that both this routine and the search routine are [[call]]ed as [[subroutine]]s of a [[wrapper]] which correctly [[Memory allocation|allocates]] [[Computer storage|memory]] for <math>T</math>.
 
void kmp_table(char *P)
{
extern int T[];
int i = 0;
int j = -1;
char c;
c = '\0';
T[0] = j;
while (P[i] != '\0') {
if (P[i] == c) {
T[i + 1] = j + 1;
++j;
++i;
} else if (j > 0) {
j = T[j];
} else {
T[i + 1] = 0;
++i;
}
c = T[j];
}
return;
}
 
==Efficiency of the KNP algorithm==