Content deleted Content added
m Spelling; is -> it |
No edit summary |
||
Line 8:
== Hash function used ==
The key to Rabin-Karp performance is the efficient computation of [[hash value]]s of the successive substrings of the text. Rabin-Karp achieves this by treating every substring as a number in some base, the base being usually a big [[prime]]. For example, if the substring is "hi" and the base is 101, the hash value would be 104 * 101^1 + 105 * 101^0 = 10609 ([[ASCII]] of 'h' is 104 and of 'i' is 105). Technically, this algorithm is only similar to the true number in a non-decimal system representation, since for example we could have the "base" less than one of the "digits". See [[hash function]] for much more detailed discussion. The essential benefit achieved by such representation is that it is possible to compute the hash value of the next substring from the previous one by doing only a finite number of operations, independent of the substrings' length. For example, if we have text "abracadabra" and we are searching for a pattern of length 3, we can compute the hash of "bra" from the hash for "abr" (the previous substring) by subtracting the number added for the first 'a' of "abr", i.e. 97 * 101^2 (97 is ASCII for 'a' and 101 is the base we are using), multiplying by the base and adding
== Rabin-Karp and multiple pattern search ==
|