Cuckoo hashing: Difference between revisions

Content deleted Content added
AnomieBOT (talk | contribs)
m Dating maintenance tags: {{Clarify}}
Bender the Bot (talk | contribs)
m External links: HTTP to HTTPS for Blogspot
 
(11 intermediate revisions by 7 users not shown)
Line 12:
 
===Lookup===
Cuckoo hashing uses two hash tables, <math>T_1</math> and <math>T_2</math>. Assuming <math>r</math> is the length of each table, the hash functions for the two tables is defined as, <math>h_1,\ h_2\ :\ \cupS \rightarrow \{0, ..., r-1\}</math> and <math>\forall x \in S</math> where <math>x</math> is the key and <math>S</math> is the set whose keys are stored in <math>h_1(x)</math> of <math>T_1</math> or <math>h_2(x)</math> of <math>T_2</math>. The lookup operation is as follows:{{r|Cuckoo|p=124}}
{|
|- style="vertical-align:top"
Line 63:
In practice, cuckoo hashing is about 20–30% slower than [[linear probing]], which is the fastest of the common approaches.<ref name="Cuckoo"/>
The reason is that cuckoo hashing often causes two cache misses per search, to check the two locations where a key might be stored, while linear probing usually causes only one cache miss per search.
However, because of its worst case guarantees on search time, cuckoo hashing can still be valuable when [[Real-time computing|real-time response rates]] are required. One advantage of cuckoo hashing is its linked-list free property, which fits GPU processing well.{{Clarify|date=July 2024}}
 
==Example==
Line 84:
|-
!colspan=2| Key inserted
| 2053 || 50 || 5320 || 75 || 100 || 67 || 105 || 3 || 36 || 3945
|-
!colspan=2| h(k)
| 9 || 6 || 9 || 9 || 1 || 1 || 6 || 3 || 3 || 61
|-
! rowspan=11 {{vert header|va=middle|Hash table entries}}
Line 94:
|-
! 1
| || || || ||bgcolor=#ffffcc| 100 ||bgcolor=#ffffcc| 67 || 67 || 67 || 67 ||bgcolor=#ffffcc| 10045
|-
! 2
Line 109:
|-
! 6
| ||bgcolor=#ffffcc| 50 || 50 || 50 || 50 || 50 ||bgcolor=#ffffcc| 105 || 105 || 105 ||bgcolor=#ffffcc| 39105
|-
! 7
Line 118:
|-
! 9
|bgcolor=#ffffcc| 2053 || 2053 || bgcolor="#ffffcc" | 20 ||bgcolor=#ffffcc| 75 || 75 || 75 || 53 || 53 || 53 || 7553
|-
! 10
Line 136:
|-
!colspan=2| Key inserted
| 2053 || 50 || 5320 || 75 || 100 || 67 || 105 || 3 || 36 || 3945
|-
!colspan=2| h′(k)
| 14 || 4 || 41 || 6 || 9 || 6 || 9 || 0 || 3 || 34
|-
! rowspan=11 {{vert header|va=middle|Hash table entries}}
Line 146:
|-
! 1
|bgcolor=#ffffcc| || ||bgcolor=#ffffcc| || 20 || 20 || 20 || 20 || 20 || 20 || 20
|-
! 2
Line 152:
|-
! 3
| || || || || || || || ||bgcolor=#ffffcc| ||bgcolor=#ffffcc|
|-
! 4
| ||bgcolor=#ffffcc| ||bgcolor=#ffffcc| || 53 || 53 || 53 || 53 || 50 || 50 || 50 ||bgcolor=#ffffcc| 5350
|-
! 5
Line 161:
|-
! 6
| || || ||bgcolor=#ffffcc| || ||bgcolor=#ffffcc| || 75 || 75 || 75 || 6775
|-
! 7
Line 170:
|-
! 9
| || || || ||bgcolor=#ffffcc| || 100 ||bgcolor=#ffffcc| 100 || 100 || 100 || 105100
|-
! 10
Line 179:
 
===Cycle===
If you now attempt to insert the element 645, then you get into a cycle, and fail. In the last row of the table we find the same initial situation as at the beginning again.
 
<math>h\left(645\right)=645\bmod 11=61</math><br/>
<math>h'\left(645\right)=\left\lfloor\frac{645}{11}\right\rfloor\bmod 11=04</math><br/>
{| class="wikitable"
|-
Line 188:
! Table 2
|-
| 645 replaces 5067 in cell 61 || 5067 replaces 5375 in cell 46
|-
| 5375 replaces 7553 in cell 9 || 7553 replaces 6750 in cell 64
|-
| 6750 replaces 100105 in cell 16 || 100105 replaces 105100 in cell 9
|-
| 105100 replaces 645 in cell 61 || 645 replaces 353 in cell 04
|-
| 353 replaces 3675 in cell 39 || 3675 replaces 3967 in cell 36
|-
| 3967 replaces 105100 in cell 61 || 105100 replaces 100105 in cell 9
|-
| 100 replaces 67 in cell 1 || 67 replaces 75 in cell 6
|-
| 75 replaces 53 in cell 9 || 53 replaces 50 in cell 4
|-
| 50 replaces 39 in cell 6 || 39 replaces 36 in cell 3
|-
| 36105 replaces 350 in cell 36 || 350 replaces 645 in cell 04
|-
| 645 replaces 5067 in cell 61 || 5067 replaces 5375 in cell 46
|}
 
Line 278 ⟶ 272:
==Known users==
Cuckoo hashing is used in [[TikTok]]'s recommendation system to solve the problem of "embedding table collisions", which can result in reduced model quality.
The TikTok recommendation system "Monolith" takes advantage cuckoo hashing's collision resolution to prevent different concepts from being mapped to the same vectors.<ref>{{Citecite webarXiv |vauthors=Liu Z, Zou L, Zou X, Wang C, Zhang B, Tang D, Zhu B, Zhu Y, Wu P, Wang K, Cheng Y|date=27 Sep 2022|title=Monolith: TheReal Time Recommendation System BehindWith TikTokCollisionless |url=https://gantry.io/blog/papers-to-know-20230110/Embedding Table|access-datearxiv=2023-05-30 |website=gantry2209.io 07663|languageclass=encs.IR}}</ref>
 
==See also==
Line 292 ⟶ 286:
* [http://www.ru.is/faculty/ulfar/CuckooHash.pdf A cool and practical alternative to traditional hash tables] {{Webarchive|url=https://web.archive.org/web/20190407085017/http://www.ru.is/faculty/ulfar/CuckooHash.pdf |date=2019-04-07 }}, U. Erlingsson, M. Manasse, F. Mcsherry, 2006.
* [http://www.itu.dk/people/pagh/papers/cuckoo-undergrad.pdf Cuckoo Hashing for Undergraduates, 2006], R. Pagh, 2006.
* [httphttps://mybiasedcoin.blogspot.com/2007/06/cuckoo-hashing-theory-and-practice-part.html Cuckoo Hashing, Theory and Practice] (Part 1, [httphttps://mybiasedcoin.blogspot.com/2007/06/cuckoo-hashing-theory-and-practice-part_15.html Part 2] and [httphttps://mybiasedcoin.blogspot.com/2007/06/cuckoo-hashing-theory-and-practice-part_19.html Part 3]), Michael Mitzenmacher, 2007.
* {{cite conference |last=Naor |first=Moni |author2=Segev, Gil |author3=Wieder, Udi |title=History-Independent Cuckoo Hashing |book-title=International Colloquium on Automata, Languages and Programming (ICALP) |place=Reykjavik, Iceland |year=2008 |url=http://www.wisdom.weizmann.ac.il/~naor/PAPERS/cuckoo_hi_abs.html |access-date=2008-07-21}}
* [http://www.cs.princeton.edu/~mfreed/docs/cuckoo-eurosys14.pdf Algorithmic Improvements for Fast Concurrent Cuckoo Hashing], X. Li, D. Andersen, M. Kaminsky, M. Freedman. EuroSys 2014.
Line 298 ⟶ 292:
===Examples===
*[https://github.com/efficient/libcuckoo Concurrent high-performance Cuckoo hashtable written in C++]
* [httphttps://sourceforge.net/projects/cuckoo-cpp/ Cuckoo hash map written in C++]
* [http://www.theiling.de/projects/lookuptable.html Static cuckoo hashtable generator for C/C++]
* [http://hackage.haskell.org/packages/archive/hashtables/latest/doc/html/Data-HashTable-ST-Cuckoo.html Cuckoo hash table written in Haskell]