From Wikipedia, the free encyclopedia
Cuckoo hashing example. The arrows show the alternative location of each key. A new item would be inserted in the location of A by moving A to its alternative location, currently occupied by B, and moving B to its alternative location, which is currently vacant. Insertion of a new item in the location of H would not succeed: since H is part of a cycle (together with W), the new item would get kicked out again.

Cuckoo hashing is a scheme in computer programming for resolving hash collisions in a hash table, with worst-case constant lookup time. The name derives from the behavior of some species of cuckoo, whose chick pushes the other eggs or young out of the nest when it hatches, a variation of the behavior known as brood parasitism; analogously, inserting a new key into a cuckoo hashing table may push an older key to a different location in the table.

History


Cuckoo hashing was first described by Rasmus Pagh and Flemming Friche Rodler in a 2001 conference paper.[1] The paper was awarded the European Symposium on Algorithms Test-of-Time award in 2020.[2]: 122

Operations


Cuckoo hashing is a form of open addressing in which each non-empty cell of a hash table contains a key or key–value pair. A hash function is used to determine the location for each key, and its presence in the table (or the value associated with it) can be found by examining that cell of the table. However, open addressing suffers from collisions, which happen when more than one key is mapped to the same cell. The basic idea of cuckoo hashing is to resolve collisions by using two hash functions instead of only one. This provides two possible locations in the hash table for each key. In one of the commonly used variants of the algorithm, the hash table is split into two smaller tables of equal size, and each hash function provides an index into one of these two tables. It is also possible for both hash functions to provide indexes into a single table.[1]: 121–122

Lookup


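Cuckoo hashing uses two hash tables, $T_1$ and $T_2$. Assuming $r$ is the length of each table, the hash functions for the two tables are defined as $h_1, h_2 : U \to \{0, \ldots, r-1\}$, where $U$ is the universe of keys. For every $x \in S$, where $x$ is a key and $S$ is the set of stored keys, $x$ is stored either in cell $h_1(x)$ of $T_1$ or in cell $h_2(x)$ of $T_2$. The lookup operation is as follows:[1]: 124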

 function lookup(x) is
   return T_1[h_1(x)] = x ∨ T_2[h_2(x)] = x
 end function

The logical or ($\vee$) denotes that the key $x$ is found in either $T_1$ or $T_2$. Since at most two cells are examined, lookup takes $O(1)$ time in the worst case.[1]: 123
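The two-probe lookup can be sketched in Python as follows. This is a minimal illustration, not the paper's implementation: the names `t1`, `t2`, `h1`, `h2` and the particular hash functions are assumptions chosen for the example, and `None` stands in for an empty cell.

```python
class CuckooTables:
    """Two tables of r cells each; a key x lives in t1[h1(x)] or t2[h2(x)]."""

    def __init__(self, r=11):
        self.r = r
        self.t1 = [None] * r  # None marks an empty cell
        self.t2 = [None] * r

    # Illustrative hash functions for non-negative integers (an assumption).
    def h1(self, x):
        return x % self.r

    def h2(self, x):
        return (x // self.r) % self.r

    def lookup(self, x):
        # At most two cells are examined, whatever the table contents.
        return self.t1[self.h1(x)] == x or self.t2[self.h2(x)] == x


tables = CuckooTables()
tables.t1[tables.h1(53)] = 53  # 53 sits in its first candidate cell
tables.t2[tables.h2(20)] = 20  # 20 sits in its second candidate cell
print(tables.lookup(53), tables.lookup(20), tables.lookup(75))  # True True False
```

Because neither probe depends on the number of stored keys, the cost of a lookup does not grow with the table.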

Deletion


Deletion is performed in $O(1)$ time since probing is not involved. This ignores the cost of the shrinking operation if the table is too sparse.[1]: 124–125
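A hedged sketch of deletion under the same two-table layout: only the two candidate cells of the key are inspected, and shrinking is deliberately left out. The function names and hash functions are illustrative assumptions.

```python
R = 11  # cells per table (illustrative)


def h1(x):
    return x % R


def h2(x):
    return (x // R) % R


def delete(t1, t2, x):
    """Clear the cell holding x, if any; returns True when x was present."""
    i, j = h1(x), h2(x)
    if t1[i] == x:
        t1[i] = None
        return True
    if t2[j] == x:
        t2[j] = None
        return True
    return False


t1, t2 = [None] * R, [None] * R
t2[h2(20)] = 20
first = delete(t1, t2, 20)
second = delete(t1, t2, 20)
print(first, second)  # True False
```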

Insertion


When inserting a new item with key $x$, the first step involves examining if slot $h_1(x)$ of table $T_1$ is occupied. If it is not, the item is inserted in that slot. However, if the slot is occupied, the existing item $x'$ is removed and $x$ is inserted at $T_1[h_1(x)]$. Then, $x'$ is inserted into table $T_2$ by following the same procedure. The process continues until an empty position is found to insert the key.[1]: 124–125 To avoid an infinite loop, a threshold Max-Loop is specified. If the number of iterations exceeds this fixed threshold, both $T_1$ and $T_2$ are rehashed with new hash functions and the insertion procedure repeats. The following is pseudocode for insertion:[1]: 125

1    function insert(x) is
2      if lookup(x) then
3        return
4      end if
5      loop Max-Loop times
6        if T_1[h_1(x)] = ⊥ then
7          T_1[h_1(x)] := x
8          return
9        end if
10       x ↔ T_1[h_1(x)]
11       if T_2[h_2(x)] = ⊥ then
12         T_2[h_2(x)] := x
13         return
14       end if
15       x ↔ T_2[h_2(x)]
16     end loop
17     rehash()
18     insert(x)
19   end function

On lines 10 and 15, the "cuckoo approach" of kicking other keys which occupy $T_{1,2}[h_{1,2}(x)]$ repeats until every key has its own "nest", i.e. item $x$ is inserted into an empty slot in either of the two tables. The notation $x \leftrightarrow y$ expresses swapping $x$ and $y$.[1]: 124–125
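The insertion procedure might be rendered in Python roughly as below. Several details are assumptions made for the sketch rather than part of the source: the class name, the use of random salts with Python's built-in `hash` to stand in for "new hash functions", and the decision to double the table size during a rehash once the load factor reaches 50%.

```python
import random


class CuckooHashSet:
    def __init__(self, r=16, max_loop=32, seed=0):
        self.r = r                    # cells per table
        self.max_loop = max_loop      # the Max-Loop threshold
        self.rng = random.Random(seed)
        self._new_tables()

    def _new_tables(self):
        # Fresh salts play the role of freshly chosen hash functions.
        self.salt1 = self.rng.getrandbits(64)
        self.salt2 = self.rng.getrandbits(64)
        self.t1 = [None] * self.r
        self.t2 = [None] * self.r

    def h1(self, x):
        return hash((self.salt1, x)) % self.r

    def h2(self, x):
        return hash((self.salt2, x)) % self.r

    def lookup(self, x):
        return self.t1[self.h1(x)] == x or self.t2[self.h2(x)] == x

    def insert(self, x):
        if self.lookup(x):
            return
        for _ in range(self.max_loop):
            i = self.h1(x)
            if self.t1[i] is None:
                self.t1[i] = x
                return
            x, self.t1[i] = self.t1[i], x   # swap x with the occupant of T1
            j = self.h2(x)
            if self.t2[j] is None:
                self.t2[j] = x
                return
            x, self.t2[j] = self.t2[j], x   # swap x with the occupant of T2
        self.rehash()
        self.insert(x)                      # x is whichever key is homeless

    def rehash(self):
        keys = [k for k in self.t1 + self.t2 if k is not None]
        # Assumption: grow when the load factor would reach 50%.
        if len(keys) + 1 >= self.r:
            self.r *= 2
        self._new_tables()
        for k in keys:
            self.insert(k)


s = CuckooHashSet()
for key in range(100):
    s.insert(key)
print(all(s.lookup(key) for key in range(100)))
```

Note that the key handed to the final `insert` call is not necessarily the one the caller supplied: after the loop, `x` holds whichever key was displaced last.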

Theory


Insertions succeed in expected constant time,[1] even considering the possibility of having to rebuild the table, as long as the number of keys is kept below half of the capacity of the hash table, i.e., the load factor is below 50%.

One method of proving this uses the theory of random graphs: one may form an undirected graph called the "cuckoo graph" that has a vertex for each hash table location, and an edge for each hashed value, with the endpoints of the edge being the two possible locations of the value. Then, the greedy insertion algorithm for adding a set of values to a cuckoo hash table succeeds if and only if the cuckoo graph for this set of values is a pseudoforest, a graph with at most one cycle in each of its connected components. Any vertex-induced subgraph with more edges than vertices corresponds to a set of keys for which there are an insufficient number of slots in the hash table. When the hash function is chosen randomly, the cuckoo graph is a random graph in the Erdős–Rényi model. With high probability, for load factor less than 1/2 (corresponding to a random graph in which the ratio of the number of edges to the number of vertices is bounded below 1/2), the graph is a pseudoforest and the cuckoo hashing algorithm succeeds in placing all keys. The same theory also proves that the expected size of a connected component of the cuckoo graph is small, ensuring that each insertion takes constant expected time. However, also with high probability, a load factor greater than 1/2 will lead to a giant component with two or more cycles, causing the data structure to fail and need to be resized.[3]
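The pseudoforest condition can be checked directly: a connected component has at most one cycle exactly when it has no more edges than vertices. The sketch below tracks both counts with a union-find structure. The function name is hypothetical, and the hash functions are taken to be k mod 11 and ⌊k/11⌋ mod 11 purely for illustration.

```python
def is_pseudoforest(keys, h1, h2):
    """True if every component of the cuckoo graph has edges <= vertices."""
    parent, edges, size = {}, {}, {}

    def find(v):
        if v not in parent:            # first time this cell is seen
            parent[v], edges[v], size[v] = v, 0, 1
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for k in keys:
        a = find(("T1", h1(k)))        # vertex for the cell in table 1
        b = find(("T2", h2(k)))        # vertex for the cell in table 2
        if a != b:                     # the edge joins two components
            parent[b] = a
            edges[a] += edges[b]
            size[a] += size[b]
        edges[a] += 1
        if edges[a] > size[a]:         # a second cycle has appeared
            return False
    return True


def h1(k):
    return k % 11


def h2(k):
    return (k // 11) % 11


keys = [53, 50, 20, 75, 100, 67, 105, 3, 36]
print(is_pseudoforest(keys, h1, h2))         # True
print(is_pseudoforest(keys + [45], h1, h2))  # False
```

With these nine keys the largest component has seven cells and seven edges, so it is already at the limit; the edge for 45 joins two cells of that component and pushes it over.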

Since a theoretical random hash function requires too much space for practical usage, an important theoretical question is which practical hash functions suffice for cuckoo hashing. One approach is to use k-independent hashing. In 2009 it was shown[4] that $O(\log n)$-independence suffices, and at least 6-independence is needed. Another approach is to use tabulation hashing, which is not 6-independent, but was shown in 2012[5] to have other properties sufficient for cuckoo hashing. A third approach from 2014[6] is to slightly modify the cuckoo hash table with a so-called stash, which makes it possible to use nothing more than 2-independent hash functions.
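For readers unfamiliar with k-independent hashing, one standard construction evaluates a random polynomial of degree k − 1 over a prime field. The sketch below assumes integer keys smaller than the prime `P`; the function name is hypothetical, and the final reduction modulo `r` makes the output only approximately uniform.

```python
import random

P = (1 << 61) - 1  # a Mersenne prime; keys are assumed to be smaller than P


def make_k_independent_hash(k, r, rng):
    """Return h(x) = (random degree-(k-1) polynomial in x mod P) mod r."""
    coeffs = [rng.randrange(P) for _ in range(k)]

    def h(x):
        acc = 0
        for c in coeffs:           # Horner's rule
            acc = (acc * x + c) % P
        return acc % r

    return h


rng = random.Random(42)
h_a = make_k_independent_hash(k=6, r=11, rng=rng)
h_b = make_k_independent_hash(k=6, r=11, rng=rng)
print([h_a(x) for x in (53, 50, 20)], [h_b(x) for x in (53, 50, 20)])
```

Drawing two such functions independently gives a pair suitable for the two tables; the required value of k is the subject of the results cited above.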

Practice


In practice, cuckoo hashing is about 20–30% slower than linear probing, which is the fastest of the common approaches.[1] The reason is that cuckoo hashing often causes two cache misses per search, to check the two locations where a key might be stored, while linear probing usually causes only one cache miss per search. However, because of its worst case guarantees on search time, cuckoo hashing can still be valuable when real-time response rates are required.

Example


The following hash functions are given (the two least significant digits of k in base 11):

$h(k) = k \bmod 11$

$h'(k) = \lfloor k / 11 \rfloor \bmod 11$


The following two tables show the insertion of some example elements. Each column corresponds to the state of the two hash tables over time. The possible insertion locations for each new value are highlighted. The last column illustrates a failed insertion due to a cycle, details below.

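Table 1: uses h(k)

| Step         | 1  | 2  | 3  | 4  | 5   | 6  | 7   | 8   | 9   | 10  |
|--------------|----|----|----|----|-----|----|-----|-----|-----|-----|
| Key inserted | 53 | 50 | 20 | 75 | 100 | 67 | 105 | 3   | 36  | 45  |
| h(k)         | 9  | 6  | 9  | 9  | 1   | 1  | 6   | 3   | 3   | 1   |
| Cell 0       |    |    |    |    |     |    |     |     |     |     |
| Cell 1       |    |    |    |    | 100 | 67 | 67  | 67  | 67  | 45  |
| Cell 2       |    |    |    |    |     |    |     |     |     |     |
| Cell 3       |    |    |    |    |     |    |     | 3   | 36  | 36  |
| Cell 4       |    |    |    |    |     |    |     |     |     |     |
| Cell 5       |    |    |    |    |     |    |     |     |     |     |
| Cell 6       |    | 50 | 50 | 50 | 50  | 50 | 105 | 105 | 105 | 105 |
| Cell 7       |    |    |    |    |     |    |     |     |     |     |
| Cell 8       |    |    |    |    |     |    |     |     |     |     |
| Cell 9       | 53 | 53 | 20 | 75 | 75  | 75 | 53  | 53  | 53  | 53  |
| Cell 10      |    |    |    |    |     |    |     |     |     |     |

Table 2: uses h′(k)

| Step         | 1  | 2  | 3  | 4  | 5   | 6   | 7   | 8   | 9   | 10  |
|--------------|----|----|----|----|-----|-----|-----|-----|-----|-----|
| Key inserted | 53 | 50 | 20 | 75 | 100 | 67  | 105 | 3   | 36  | 45  |
| h′(k)        | 4  | 4  | 1  | 6  | 9   | 6   | 9   | 0   | 3   | 4   |
| Cell 0       |    |    |    |    |     |     |     |     | 3   | 3   |
| Cell 1       |    |    |    | 20 | 20  | 20  | 20  | 20  | 20  | 20  |
| Cell 2       |    |    |    |    |     |     |     |     |     |     |
| Cell 3       |    |    |    |    |     |     |     |     |     |     |
| Cell 4       |    |    | 53 | 53 | 53  | 53  | 50  | 50  | 50  | 50  |
| Cell 5       |    |    |    |    |     |     |     |     |     |     |
| Cell 6       |    |    |    |    |     |     | 75  | 75  | 75  | 75  |
| Cell 7       |    |    |    |    |     |     |     |     |     |     |
| Cell 8       |    |    |    |    |     |     |     |     |     |     |
| Cell 9       |    |    |    |    |     | 100 | 100 | 100 | 100 | 100 |
| Cell 10      |    |    |    |    |     |     |     |     |     |     |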
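The first nine steps of the example can be replayed with a short script. It takes h(k) = k mod 11 and h′(k) = ⌊k/11⌋ mod 11, which reproduce the h(k) and h′(k) rows above; the function names and the `max_loop` value are illustrative assumptions.

```python
R = 11  # cells per table, as in the example


def h(k):
    return k % R


def h_prime(k):
    return (k // R) % R


def insert(t1, t2, x, max_loop=20):
    """Cuckoo insertion; returns False if no empty cell is found in time."""
    for _ in range(max_loop):
        i = h(x)
        if t1[i] is None:
            t1[i] = x
            return True
        x, t1[i] = t1[i], x        # kick the occupant out of table 1
        j = h_prime(x)
        if t2[j] is None:
            t2[j] = x
            return True
        x, t2[j] = t2[j], x        # kick the occupant out of table 2
    return False


def occupied(table):
    return {cell: key for cell, key in enumerate(table) if key is not None}


t1, t2 = [None] * R, [None] * R
results = [insert(t1, t2, key) for key in [53, 50, 20, 75, 100, 67, 105, 3, 36]]

print(occupied(t1))  # {1: 67, 3: 36, 6: 105, 9: 53}
print(occupied(t2))  # {0: 3, 1: 20, 4: 50, 6: 75, 9: 100}
```

The printed contents match the step 9 column of the two tables.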

Cycle


Attempting to insert the element 45 leads to a cycle, and the insertion fails. The last row of the table below is identical to the first row, so the same sequence of displacements would repeat indefinitely.



| Table 1                    | Table 2                     |
|----------------------------|-----------------------------|
| 45 replaces 67 in cell 1   | 67 replaces 75 in cell 6    |
| 75 replaces 53 in cell 9   | 53 replaces 50 in cell 4    |
| 50 replaces 105 in cell 6  | 105 replaces 100 in cell 9  |
| 100 replaces 45 in cell 1  | 45 replaces 53 in cell 4    |
| 53 replaces 75 in cell 9   | 75 replaces 67 in cell 6    |
| 67 replaces 100 in cell 1  | 100 replaces 105 in cell 9  |
| 105 replaces 50 in cell 6  | 50 replaces 45 in cell 4    |
| 45 replaces 67 in cell 1   | 67 replaces 75 in cell 6    |
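The cycle can be reproduced by starting from the table contents after step 9 and logging each displacement. As before, h(k) = k mod 11 and h′(k) = ⌊k/11⌋ mod 11 are assumed, and `trace_kicks` is a hypothetical helper written for this illustration.

```python
R = 11


def h(k):
    return k % R


def h_prime(k):
    return (k // R) % R


def trace_kicks(t1, t2, x, loops):
    """Run up to `loops` rounds of displacement, recording every kick."""
    log = []
    for _ in range(loops):
        i = h(x)
        if t1[i] is None:
            t1[i] = x
            return log, None
        log.append((x, t1[i], "Table 1", i))
        x, t1[i] = t1[i], x
        j = h_prime(x)
        if t2[j] is None:
            t2[j] = x
            return log, None
        log.append((x, t2[j], "Table 2", j))
        x, t2[j] = t2[j], x
    return log, x  # x is the key left without a cell


# Table contents after step 9 of the example.
t1, t2 = [None] * R, [None] * R
for cell, key in {1: 67, 3: 36, 6: 105, 9: 53}.items():
    t1[cell] = key
for cell, key in {0: 3, 1: 20, 4: 50, 6: 75, 9: 100}.items():
    t2[cell] = key
start = (list(t1), list(t2))

log, homeless = trace_kicks(t1, t2, 45, loops=7)
for new, old, table, cell in log[:4]:
    print(f"{new} replaces {old} in {table} cell {cell}")
print(homeless, (t1, t2) == start)  # 45 True
```

After seven rounds (fourteen displacements) both tables are back in their starting state with 45 still unplaced, which is why an eighth round repeats the first row.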

Variations


Several variations of cuckoo hashing have been studied, primarily with the aim of improving its space usage by increasing the load factor that it can tolerate to a number greater than the 50% threshold of the basic algorithm. Some of these methods can also be used to reduce the failure rate of cuckoo hashing, causing rebuilds of the data structure to be much less frequent.

Generalizations of cuckoo hashing that use more than two alternative hash functions can be expected to utilize a larger part of the capacity of the hash table efficiently while sacrificing some lookup and insertion speed. Using just three hash functions increases the load to 91%.[7]

Another generalization of cuckoo hashing called blocked cuckoo hashing uses more than one key per bucket and a balanced allocation scheme. Using just 2 keys per bucket permits a load factor above 80%.[8]
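A rough sketch of the bucketized idea follows, with two slots per bucket. It is not the balanced allocation scheme from the cited work: when both candidate buckets are full it simply evicts a random occupant (a random-walk strategy chosen here for brevity). The class name and the salted built-in `hash` are likewise assumptions.

```python
import random


class BlockedCuckoo:
    def __init__(self, n_buckets=8, slots=2, max_loop=64, seed=1):
        self.n, self.slots, self.max_loop = n_buckets, slots, max_loop
        self.rng = random.Random(seed)
        self.salts = (self.rng.getrandbits(64), self.rng.getrandbits(64))
        self.buckets = [[] for _ in range(n_buckets)]

    def _h(self, which, x):
        return hash((self.salts[which], x)) % self.n

    def lookup(self, x):
        return x in self.buckets[self._h(0, x)] or x in self.buckets[self._h(1, x)]

    def insert(self, x):
        """Returns None on success, or the key left homeless on failure."""
        if self.lookup(x):
            return None
        for _ in range(self.max_loop):
            for which in (0, 1):
                bucket = self.buckets[self._h(which, x)]
                if len(bucket) < self.slots:
                    bucket.append(x)
                    return None
            # Both buckets are full: evict a random occupant and retry with it.
            bucket = self.buckets[self._h(self.rng.randrange(2), x)]
            victim = self.rng.randrange(self.slots)
            x, bucket[victim] = bucket[victim], x
        return x  # a real table would rehash or grow here


table = BlockedCuckoo()
keys = list(range(100, 112))           # 12 keys into 16 slots: 75% load
leftovers = {table.insert(k) for k in keys} - {None}
stored = {k for b in table.buckets for k in b}
print(len(stored), len(leftovers))
```

Every inserted key ends up either in a bucket or in `leftovers`, so the caller can decide how to handle a failed placement.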

Another variation of cuckoo hashing that has been studied is cuckoo hashing with a stash. The stash, in this data structure, is an array of a constant number of keys, used to store keys that cannot successfully be inserted into the main hash table of the structure. This modification reduces the failure rate of cuckoo hashing to an inverse-polynomial function with an exponent that can be made arbitrarily large by increasing the stash size. However, larger stashes also mean slower searches for keys that are not present or are in the stash. A stash can be used in combination with more than two hash functions or with blocked cuckoo hashing to achieve both high load factors and small failure rates.[9] The analysis of cuckoo hashing with a stash extends to practical hash functions, not just to the random hash function model commonly used in theoretical analysis of hashing.[10]
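The stash idea can be illustrated on the example key set used earlier, again assuming h(k) = k mod 11 and h′(k) = ⌊k/11⌋ mod 11. The class name, stash capacity, and loop limit are assumptions; the point of the sketch is that a key left homeless after the loop limit goes to the stash instead of forcing a rehash.

```python
R = 11


def h1(k):
    return k % R


def h2(k):
    return (k // R) % R


class StashedCuckoo:
    def __init__(self, max_loop=16, stash_size=4):
        self.t1, self.t2 = [None] * R, [None] * R
        self.stash = []                 # constant-size overflow area
        self.max_loop = max_loop
        self.stash_size = stash_size

    def lookup(self, x):
        # Two table probes plus a scan of the (small) stash.
        return self.t1[h1(x)] == x or self.t2[h2(x)] == x or x in self.stash

    def insert(self, x):
        if self.lookup(x):
            return
        for _ in range(self.max_loop):
            i = h1(x)
            if self.t1[i] is None:
                self.t1[i] = x
                return
            x, self.t1[i] = self.t1[i], x
            j = h2(x)
            if self.t2[j] is None:
                self.t2[j] = x
                return
            x, self.t2[j] = self.t2[j], x
        if len(self.stash) >= self.stash_size:
            raise RuntimeError("stash full: a rehash would be needed")
        self.stash.append(x)            # park the homeless key


table = StashedCuckoo()
keys = [53, 50, 20, 75, 100, 67, 105, 3, 36, 45]
for key in keys:
    table.insert(key)
print(all(table.lookup(k) for k in keys), len(table.stash))  # True 1
```

Inserting 45 runs into the cycle described in the example, so exactly one key ends up in the stash while all ten remain findable.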

A simplified generalization of cuckoo hashing called skewed-associative cache has been recommended for use in some CPU caches.[11]

Another variation of a cuckoo hash table, called a cuckoo filter, replaces the stored keys of a cuckoo hash table with much shorter fingerprints, computed by applying another hash function to the keys. In order to allow these fingerprints to be moved around within the cuckoo filter, without knowing the keys that they came from, the two locations of each fingerprint may be computed from each other by a bitwise exclusive or operation with the fingerprint, or with a hash of the fingerprint. This data structure forms an approximate set membership data structure with much the same properties as a Bloom filter: it can store the members of a set of keys, and test whether a query key is a member, with some chance of false positives (queries that are incorrectly reported as being part of the set) but no false negatives. However, it improves on a Bloom filter in multiple respects: its memory usage is smaller by a constant factor, it has better locality of reference, and (unlike Bloom filters) it allows for fast deletion of set elements with no additional storage penalty.[12]
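A compact sketch of a cuckoo filter along these lines is shown below. The bucket count, slot count, 8-bit fingerprints, and use of SHA-256 are illustrative assumptions; the number of buckets is required to be a power of two so that the exclusive-or step maps each bucket to its partner and back.

```python
import hashlib
import random


def _hash(data: bytes) -> int:
    return int.from_bytes(hashlib.sha256(data).digest()[:8], "big")


class CuckooFilter:
    def __init__(self, n_buckets=64, slots=4, max_kicks=200, seed=0):
        assert n_buckets & (n_buckets - 1) == 0, "power of two needed for XOR"
        self.mask, self.slots, self.max_kicks = n_buckets - 1, slots, max_kicks
        self.buckets = [[] for _ in range(n_buckets)]
        self.rng = random.Random(seed)

    def _alt(self, i, fp):
        # The partner bucket depends only on the fingerprint, not the key.
        return (i ^ _hash(bytes([fp]))) & self.mask

    def _locate(self, key: bytes):
        fp = _hash(b"fp" + key) % 255 + 1   # fingerprint in 1..255
        i1 = _hash(key) & self.mask
        return fp, i1, self._alt(i1, fp)

    def insert(self, key: bytes) -> bool:
        fp, i1, i2 = self._locate(key)
        for i in (i1, i2):
            if len(self.buckets[i]) < self.slots:
                self.buckets[i].append(fp)
                return True
        i = self.rng.choice((i1, i2))
        for _ in range(self.max_kicks):
            j = self.rng.randrange(self.slots)
            fp, self.buckets[i][j] = self.buckets[i][j], fp  # evict one
            i = self._alt(i, fp)
            if len(self.buckets[i]) < self.slots:
                self.buckets[i].append(fp)
                return True
        return False  # filter is too full; one fingerprint has been dropped

    def contains(self, key: bytes) -> bool:
        fp, i1, i2 = self._locate(key)
        return fp in self.buckets[i1] or fp in self.buckets[i2]

    def delete(self, key: bytes) -> bool:
        fp, i1, i2 = self._locate(key)
        for i in (i1, i2):
            if fp in self.buckets[i]:
                self.buckets[i].remove(fp)
                return True
        return False


f = CuckooFilter()
words = [f"key-{n}".encode() for n in range(50)]
inserted = [f.insert(w) for w in words]
print(all(f.contains(w) for w in words), f.delete(words[0]))
```

Deletion removes one copy of the fingerprint, so a later `contains` on the deleted key may still return true if another stored key shares the same fingerprint and bucket; that is the false-positive behaviour described above.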

Comparison with related structures

A study by Zukowski et al.[13] has shown that cuckoo hashing is much faster than chained hashing for small, cache-resident hash tables on modern processors. Kenneth Ross[14] has shown bucketized versions of cuckoo hashing (variants that use buckets that contain more than one key) to be faster than conventional methods also for large hash tables, when space utilization is high. The performance of the bucketized cuckoo hash table was investigated further by Askitis,[15] with its performance compared against alternative hashing schemes.

A survey by Mitzenmacher[7] presents open problems related to cuckoo hashing as of 2009.

Known users


Cuckoo hashing is used in TikTok's recommendation system to solve the problem of "embedding table collisions", which can result in reduced model quality. The TikTok recommendation system "Monolith" takes advantage of cuckoo hashing's collision resolution to prevent different concepts from being mapped to the same vectors.[16]


References

  1. Pagh, Rasmus; Rodler, Flemming Friche (2001). "Cuckoo Hashing". Algorithms — ESA 2001. Lecture Notes in Computer Science. Vol. 2161. CiteSeerX 10.1.1.25.4189. doi:10.1007/3-540-44676-1_10. ISBN 978-3-540-42493-2.
  2. "ESA - European Symposium on Algorithms: ESA Test-of-Time Award 2020". esa-symposium.org. Award committee: Uri Zwick, Samir Khuller, Edith Cohen. Archived from the original on 2025-08-06. Retrieved 2025-08-06.
  3. Kutzelnigg, Reinhard (2006). Bipartite random graphs and cuckoo hashing (PDF). Fourth Colloquium on Mathematics and Computer Science. Discrete Mathematics and Theoretical Computer Science. Vol. AG. pp. 403–406.
  4. Cohen, Jeffrey S.; Kane, Daniel M. (2009). "Bounds on the independence required for cuckoo hashing". ACM Transactions on Algorithms.
  5. Pătrașcu, Mihai; Thorup, Mikkel (2012). "The power of simple tabulation hashing". Journal of the ACM. 59 (3): 1–50.
  6. Aumüller, Martin; Dietzfelbinger, Martin; Woelfel, Philipp (2014). "Explicit and efficient hash families suffice for cuckoo hashing with a stash". Algorithmica. 70 (3): 428–456.
  7. Mitzenmacher, Michael (2025-08-06). "Some Open Questions Related to Cuckoo Hashing" (PDF). Proceedings of ESA 2009. Retrieved 2025-08-06.
  8. Dietzfelbinger, Martin; Weidling, Christoph (2007). "Balanced allocation and dictionaries with tightly packed constant size bins". Theoret. Comput. Sci. 380 (1–2): 47–68. doi:10.1016/j.tcs.2007.02.054. MR 2330641.
  9. Kirsch, Adam; Mitzenmacher, Michael D.; Wieder, Udi (2010). "More robust hashing: cuckoo hashing with a stash". SIAM J. Comput. 39 (4): 1543–1561. doi:10.1137/080728743. MR 2580539.
  10. Aumüller, Martin; Dietzfelbinger, Martin; Woelfel, Philipp (2014). "Explicit and efficient hash families suffice for cuckoo hashing with a stash". Algorithmica. 70 (3): 428–456. arXiv:1204.4431. doi:10.1007/s00453-013-9840-x. MR 3247374. S2CID 1888828.
  11. "Micro-Architecture".
  12. Fan, Bin; Andersen, Dave G.; Kaminsky, Michael; Mitzenmacher, Michael D. (2014). "Cuckoo filter: Practically better than Bloom". Proc. 10th ACM Int. Conf. Emerging Networking Experiments and Technologies (CoNEXT '14). pp. 75–88. doi:10.1145/2674005.2674994.
  13. Zukowski, Marcin; Heman, Sandor; Boncz, Peter (June 2006). "Architecture-Conscious Hashing" (PDF). Proceedings of the International Workshop on Data Management on New Hardware (DaMoN). Retrieved 2025-08-06.
  14. Ross, Kenneth (2025-08-06). Efficient Hash Probes on Modern Processors (PDF) (Research Report). IBM. RC24100. Retrieved 2025-08-06.
  15. Askitis, Nikolas (2009). "Fast and Compact Hash Tables for Integer Keys". Proceedings of the 32nd Australasian Computer Science Conference (ACSC 2009) (PDF). Vol. 91. pp. 113–122. ISBN 978-1-920682-72-9. Archived from the original (PDF) on 2025-08-06. Retrieved 2025-08-06.
  16. Liu Z, Zou L, Zou X, Wang C, Zhang B, Tang D, Zhu B, Zhu Y, Wu P, Wang K, Cheng Y (27 Sep 2022). "Monolith: Real Time Recommendation System With Collisionless Embedding Table". arXiv:2209.07663 [cs.IR].