Talk:Rabin–Karp algorithm

This is an old revision of this page, as edited by 213.221.87.179 (talk) at 09:20, 28 July 2005 (Line 8 of the muli-pattern-search pseudo-code seems wrong to me). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

Latest comment: 20 years ago by Dcoetzee in topic =

I think it would be nice if the article discussed extending the algorithm for 2 dimensional pattern matching, as well as giving some optimizations in the case of having varying string lengths for multi-pattern matching case.

=

In the pseudocode of the RK string search algorithm, if hs is given as hash(s[1..n]), the algorithm can not detect a target pattern which occurs on the first m bytes of a given text. This is why I reverted it to be hash(s[1...m]).

One arduous but impetuous, so-called, vandalism detector claims that it should be hash(s[1...n]), because, when m is larger than n, the code crashes. That is true, but I reckon such a range check matter is not a major issue in describing the fundamentals of the algorithm.

Considering the following sentence which is excerpted from the article and is supposed to be written by one of the original contributors,

"Lines 2,3, and 6 each require omega(m) time."

if one still thinks hs := hash(s[1...n]) is correct, why doesn't he/she also modify the above sentence too?

footnote) The line 3 of the pseudocode has been in the form of hash(s[1..m]) for the last six months or so since its original edition by the original writer.

--129.254.65.18 08:35, 23 May 2005 (UTC)Reply

I agree with you. We are assuming m ≤ n, and if [1..n] is used, the algorithm won't work, because it'd be trying to compare hashes of strings of length n to hashes of strings of length m. Deco 21:41, 23 May 2005 (UTC)Reply

Line 8 of the muli-pattern-search pseudo-code seems wrong to me

Quote: if s[i..i+m-1] = a substring with hash hs This is exactly the same condition as in the line above (element of hash). As pointed out corretly in the single patters search - to rule out a hash collision we need to compare with the actual search string(s). (Plural only if by fat chance or denial of service attack, two search strings have the same hash.)

Or am I missing something obvious here?