I think it would be nice if the article discussed extending the algorithm for 2 dimensional pattern matching, as well as giving some optimizations in the case of having varying string lengths for multi-pattern matching case.
=
In the pseudocode of the RK string search algorithm, if hs is given as hash(s[1..n]), the algorithm can not detect a target pattern which occurs on the first m bytes of a given text. This is why I reverted it to be hash(s[1...m]).
One arduous but impetuous, so-called, vandalism detector claims that it should be hash(s[1...n]), because, when m is larger than n, the code crashes. That is true, but I reckon such a range check matter is not a major issue in describing the fundamentals of the algorithm.
Considering the following sentence which is excerpted from the article and is supposed to be written by one of the original contributors,
"Lines 2,3, and 6 each require omega(m) time."
if one still thinks hs := hash(s[1...n]) is correct, why doesn't he/she also modify the above sentence too?
footnote) The line 3 of the pseudocode has been in the form of hash(s[1..m]) for the last six months or so since its original edition by the original writer.
--129.254.65.18 08:35, 23 May 2005 (UTC)
- I agree with you. We are assuming m ≤ n, and if [1..n] is used, the algorithm won't work, because it'd be trying to compare hashes of strings of length n to hashes of strings of length m. Deco 21:41, 23 May 2005 (UTC)