Rabin–Karp algorithm: Difference between revisions

Content deleted Content added
No edit summary
Dcoetzee (talk | contribs)
A simple application after the intro
Line 1:
'''Rabin-Karp algorithm''' is a [[string searching algorithm]] that seeks a pattern, i.e. a substring, within a text by using [[hashing]]. It is not widely used for single pattern matching, but is of considerable theoretical importance and is very effective for multiple pattern matching. Historically, it predates the very popular [[Knuth-Morris-Pratt algorithm]]. For text of length n and pattern of length m, its average and best case running time is O(n) (meaning proportional to n), but the worst case scenario is O(n*m), which is one of the reasons why it is not widely used.
 
One of the simplest practical applications of Rabin-Karp is in detection of plagiarism. Say, for example, that a student is writing an English paper on ''[[Moby Dick]]''. A cunning professor might locate a variety of source material on Moby Dick and automatically extract a list of all sentences in those materials. Then, Rabin-Karp can rapidly search through a particular paper for any instance of any of the sentences from the source materials. To avoid easily thwarting the system with minor changes, it can be made to ignore details such as case and punctuation by removing these first.
 
== Shifting substrings search and competing algorithms ==