Revision as of 09:21, 21 August 2024 by LucasBrown (m, ce)
Revision as of 16:10, 29 August 2024 by Earlsofsandwich (→Rolling hash)
Line 212:
{{Main\|Rolling hash}}
{{See also\|Linear congruential generator}}
In some applications, such as [[string searching algorithm\|substring search]], one can compute a hash function {{math\|''h''}} for every {{math\|''k''}}-character [[substring]] of a given {{math\|''n''}}-character string by advancing a window of width {{math\|''k''}} characters along the string, where {{math\|''k''}} is a fixed integer and {{math\|''n'' > ''k''}}. The straightforward solution, which extracts the substring at every character position in the text and computes {{math\|''h''}} separately each time, requires a number of operations proportional to {{math\|''k''·''n''}}. However, with a proper choice of {{math\|''h''}}, one can use the rolling hash technique to compute all those hashes with an effort proportional to {{math\|''mk'' + ''n''}}, where {{math\|''m''}} is the number of occurrences of the substring.<ref>{{Cite book \|last=Singh \|first=N. B. \|url=https://books.google.com/books?id=ALIMEQAAQBAJ&pg=PT102&dq=rolling+hash&hl=en&newbks=1&newbks_redir=0&sa=X&ved=2ahUKEwijuvSVy5qIAxUUGDQIHSU5Ma4Q6AF6BAgIEAI#v=onepage&q=rolling%20hash&f=false \|title=A Handbook of Algorithms \|publisher=N.B. Singh \|language=en}}</ref>{{fix\|text=what is the choice of h?}} The most familiar algorithm of this type is [[Rabin-Karp]], with best- and average-case performance {{math\|''O''(''n''+''mk'')}} and worst-case performance {{math\|''O''(''n''·''k'')}}. The worst case is pathological: it arises, for example, when both the text string and the substring consist of a single repeated character, such as {{math\|''t''}}="AAAAAAAAAAA" and {{math\|''s''}}="AAA". The hash function used for the algorithm is usually the [[Rabin fingerprint]], designed to avoid collisions in 8-bit character strings, but other suitable hash functions are also used.
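The window-advancing idea described above can be illustrated with a minimal Python sketch. It uses a simple polynomial hash modulo a prime rather than the Rabin fingerprint proper, and the names <code>rolling_hashes</code> and <code>rabin_karp</code>, as well as the <code>base</code> and <code>mod</code> values, are assumptions chosen for illustration and do not come from the cited source. Each step removes the contribution of the outgoing character and adds the incoming one, so updating the hash costs a constant number of operations per position.

```python
def rolling_hashes(text, k, base=256, mod=1_000_000_007):
    """Yield (index, hash) for every k-character window of text.

    The hash is a polynomial hash: sum(ord(c) * base**j) mod `mod`,
    with the leftmost character carrying the highest power.
    `base` and `mod` are illustrative choices, not prescribed values.
    """
    n = len(text)
    if k <= 0 or k > n:
        return
    high = pow(base, k - 1, mod)  # weight of the leftmost character
    h = 0
    for ch in text[:k]:  # hash of the first window, O(k)
        h = (h * base + ord(ch)) % mod
    yield 0, h
    for i in range(1, n - k + 1):
        # Remove the outgoing character, shift, add the incoming one: O(1).
        h = (h - ord(text[i - 1]) * high) % mod
        h = (h * base + ord(text[i + k - 1])) % mod
        yield i, h


def rabin_karp(text, pattern, base=256, mod=1_000_000_007):
    """Return all start indices where pattern occurs in text."""
    k = len(pattern)
    if k == 0 or k > len(text):
        return []
    target = 0
    for ch in pattern:
        target = (target * base + ord(ch)) % mod
    # A hash match is only a candidate; compare the characters to confirm.
    # This O(k) check on each match is where the m*k term comes from.
    return [i for i, h in rolling_hashes(text, k, base, mod)
            if h == target and text[i:i + k] == pattern]
```

With the pathological input mentioned above, <code>rabin_karp("A" * 11, "AAA")</code> reports a match at every window position, so the confirming comparison runs at each of them and the total work grows in proportion to {{math\|''n''·''k''}}.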

Hash function: Difference between revisions