Content deleted Content added
JMarianczuk (talk | contribs) corrected summary statement about re-construction of the PHF (in non-dynamic case) |
m convert special characters (via WP:JWB) |
||
Line 23:
The important performance parameters for perfect hashing are the representation size, the evaluation time, the construction time, and additionally the range requirement <math>\frac{m}{n}</math>.<ref name="CHD"/> The evaluation time can be as fast as {{math|''O''(''1'')}}, which is optimal.<ref name="inventor"/><ref name="CHD"/> The construction time needs to be at least {{math|''O''(''n'')}}, because each element in {{mvar|S}} needs to be considered, and {{mvar|S}} contains {{mvar|n}} elements. This lower bound can be achieved in practice.<ref name="CHD"/>
The lower bound for the representation size depends on {{mvar|m}} and {{mvar|n}}. Let {{math|''m'' {{=}} (1+ε) ''n''}} and {{mvar|h}} a perfect hash function. A good approximation for the lower bound is <math>\log e - \varepsilon \log \frac{1+\varepsilon}{\varepsilon}</math> Bits per element. For minimal perfect hashing, {{math|ε {{=}} 0}}, the lower bound is {{math|log e
==Construction==
Line 49:
A more recent method for constructing a perfect hash function is described by {{harvtxt|Belazzougui|Botelho|Dietzfelbinger|2009}} as "hash, displace, and compress". Here a first-level hash function {{mvar|g}} is also used to map elements onto a range of {{mvar|r}} integers. An element {{math|''x''
Then, in descending order of size, each bucket's elements are hashed by a hash function of a sequence of independent fully random hash functions {{math|(Φ<sub>1</sub>, Φ<sub>2</sub>, Φ<sub>3</sub>, ...)}}, starting with {{math|Φ<sub>1</sub>}}. If the hash function does not produce any collisions for the bucket, and the resulting values are not yet occupied by other elements from other buckets, the function is chosen for that bucket. If not, the next hash function in the sequence is tested.<ref name="CHD" />
Line 64:
(2) Sort buckets B<sub>i</sub> in falling order according to size |B<sub>i</sub>|
(3) Initialize array T[0...m-1] with 0's
(4) '''for all''' i{{thin space}}
(5) '''for''' l{{thin space}}←{{thin space}}1,2,...
(6) '''repeat''' forming K<sub>i</sub>{{thin space}}←{{thin space}}{{{math|Φ}}<sub>l</sub>(x)|x{{thin space}}
(6) '''until''' |K<sub>i</sub>|=|B<sub>i</sub>| '''and''' K<sub>i</sub>∩{j|T[j]=1}={{thin space}}∅
(7) '''let''' σ(i):= the successful l
(8) '''for all''' j{{thin space}}
(9) Transform (σ<sub>i</sub>)<sub>0≤i<r</sub> into compressed form, retaining {{math|''O''(''1'')}} access.
Line 148:
===k-perfect hashing===
A hash function is {{mvar|k}}-perfect if at most {{mvar|k}} elements from {{mvar|S}} are mapped onto the same value in the range. The "hash, displace, and compress" algorithm can be used to construct {{mvar|k}}-perfect hash functions by allowing up to {{mvar|k}} collisions. The changes necessary to accomplish this are minimal, and are underlined in the adapted pseudocode below:
(4) '''for all''' i{{thin space}}
(5) '''for''' l{{thin space}}←{{thin space}}1,2,...
(6) '''repeat''' forming K<sub>i</sub>{{thin space}}←{{thin space}}{{{math|Φ}}<sub>l</sub>(x)|x{{thin space}}
(6) '''until''' |K<sub>i</sub>|=|B<sub>i</sub>| '''and''' K<sub>i</sub>∩{j|<u>T[j]=k</u>}={{thin space}}∅
(7) '''let''' σ(i):= the successful l
(8) '''for all''' j{{thin space}}
===Order preservation===
|