Revision as of 07:13, 14 November 2004 edit Dcoetzee (talk \| contribs) 37,529 edits m →Shifting substrings search and competing algorithms: Hmm, a bit more neutral ← Previous edit		Revision as of 07:14, 14 November 2004 edit undo Dcoetzee (talk \| contribs) 37,529 edits Why use this in example Next edit →
Line 1: '''Rabin-Karp algorithm''' is a [[string searching algorithm]] that seeks a pattern, i.e. a substring, within a text by using [[hashing]]. It is not widely used for single pattern matching, but is of considerable theoretical importance and is very effective for multiple pattern matching. Historically, it predates the very popular [[Knuth-Morris-Pratt algorithm]]. For text of length ''n'' and pattern of length ''m'', its average and best case running time is [[Big-O notation\|O]](''n''), but the (highly unlikely) worst case performance is O(''nm''), which is one of the reasons why it is not widely used. However, it has the unique advantage of being able to find any one of ''k'' strings or less in O(''n'') time on average, regardless of the size of ''k''. One of the simplest practical applications of Rabin-Karp is in detection of plagiarism. Say, for example, that a student is writing an English paper on ''[[Moby Dick]]''. A cunning professor might locate a variety of source material on Moby Dick and automatically extract a list of all sentences in those materials. Then, Rabin-Karp can rapidly search through a particular paper for any instance of any of the sentences from the source materials. To avoid easily thwarting the system with minor changes, it can be made to ignore details such as case and punctuation by removing these first. Because the number of strings we're searching for, ''k'', is very large, other string searching algorithms are impractical. == Shifting substrings search and competing algorithms ==

Rabin–Karp algorithm: Difference between revisions