Content deleted Content added
Line 251:
::Actually, I think you're right about the general target level for hash functions. However, Rabin-Karp with Rabin fingerprinting necessarily implies knowledge of GF, and that belongs to a college upperclassman in mathematics, or a graduate student in computer science. An undergraduate will not likely come upon Rabin-Karp; it's useful only in exotic situations. I'm dithering over whether it's implicitly graduate level subject matter that I'm going to have to dumb down, and dumbing it down will make it readable, but useful to whom? I don't think I've ever used a rolling hash, and I've written editors and plagiarism detectors and other natural language processing software.[[User:Sbalfour|Sbalfour]] ([[User talk:Sbalfour|talk]]) 23:15, 27 September 2019 (UTC)
:::Modular arithmetic is high school mathematics. You don't need to know Galois theory to understand it. —[[User:David Eppstein|David Eppstein]] ([[User talk:David Eppstein|talk]]) 01:16, 28 September 2019 (UTC)
::::We shouldn't care too much about arithmetic, and more about the properties of the function. They used a modulus of 101 - if we're searching for a string of 3 characters out of a character set of 96 printable characters (but their base was actually 8-bit bytes), there are 96<sup>3</sup>/101 = ~10,000 strings that can map to each hash value. If we use their base, it's 2<sup>24</sup>/101 binary values = ~160,000 strings. What they used as a modulus close to the cardinality of the character set, is supposed to be the base, and the modulus is an irreducible polynomial over the field relatively prime to the pattern's polynomial. That polynomial is big and choosing it relies on some reduction properties of the field. They used numbers without understanding how operating in the field yields a unique representation of a string, i.e. it's fingerprint. The article is junk. [[User:Sbalfour|Sbalfour]] ([[User talk:Sbalfour|talk]]) 17:05, 28 September 2019 (UTC)
|