{"id":1857,"date":"2019-01-04T12:00:00","date_gmt":"2019-01-04T11:00:00","guid":{"rendered":"https:\/\/kindsonthegenius.com\/blog\/perfect-hashing\/"},"modified":"2026-07-05T03:20:46","modified_gmt":"2026-07-05T01:20:46","slug":"perfect-hashing","status":"publish","type":"post","link":"https:\/\/kindsonthegenius.com\/blog\/perfect-hashing\/","title":{"rendered":"Perfect Hashing \u2013 How it Works"},"content":{"rendered":"<p>This is used when the keys stored in the hash table are expected to be static. In this case perfect hashing guarantees excellent average as well as worst-case performance. This is how real dictionaries work. You most of the time just need to read from a dictionary.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-386 size-medium\" src=\"https:\/\/www.kindsonthegenius.com\/wp-content\/uploads\/2020\/09\/Perfect-Hashing-300x129.jpg\" alt=\"How Perfect Hashing Works\" width=\"300\" height=\"129\" \/><\/p>\n<p>Some application of perfect hashing includes:<\/p>\n<ul>\n<li>data storage on a CD ROM<\/li>\n<li>set of reserved words in a programming language<\/li>\n<\/ul>\n<p>This hashing technique is called perfect hashing because it takes constant time, O(1) to search for an item in the table.<\/p>\n<ol>\n<li><a href=\"#t1\">Overview of Perfect Hashing<\/a><\/li>\n<li><a href=\"#t2\">Watch the Video<\/a><\/li>\n<li><a href=\"#t3\">Choosing a hash function<\/a><\/li>\n<li><a href=\"#t4\">Estimating expected collision<\/a><\/li>\n<li><a href=\"#t5\">Analysis of Perfect Hashing<\/a><\/li>\n<\/ol>\n<p>&nbsp;<\/p>\n<h5><strong id=\"t1\">1. Overview of Perfect Hashing<\/strong><\/h5>\n<p><strong>Perfect hashing<\/strong> is implemented using two hash tables, one at each level. Each of the table uses universal hashing. The first level is the same a hashing with chaining such that n elements is hashed into m slots in the hash table. This is done using a has function selected from a universal family of hash functions.<\/p>\n<p>The second level of uses a second hash table (instead of a linked list used in chaining). Elements that hash to the same slot j in the first hash table are stored in a second hash table. This second hash table is known as secondary hash table . The hash function\u00a0h<sub>j<\/sub> is carefully\u00a0 chosen such that there are no collision in the secondary table.<\/p>\n<p>To ensure there are no collisions in the secondary hash table\u00a0 S<sub>j<\/sub>,\u00a0 we need to make the size m<sub>j<\/sub> of the secondary table equal to the square of the number of keys hashing into slot j in the first table. That is:<\/p>\n<p>m<sub>j<\/sub> = n<sub>j<\/sub><sup>2<\/sup><\/p>\n<p>The hash functions for the primary hash table is carefully chosen so that we limit the expected total amount of space used to be O(n). Take some time to watch the video explanation of Perfect Hashing.<\/p>\n<p>&nbsp;<\/p>\n<h5><strong id=\"t2\">2. Watch the Video<\/strong><br \/>\n<iframe loading=\"lazy\" src=\"https:\/\/www.youtube.com\/embed\/zM77FKdS54Y\" width=\"560\" height=\"315\" frameborder=\"0\" allowfullscreen=\"allowfullscreen\"><\/iframe><\/h5>\n<p>&nbsp;<\/p>\n<h5><strong id=\"t3\">3. Choosing Hash Functions<\/strong><\/h5>\n<p>The first level hash function is chosen from the universal hash family\u00a0\u210b pm where p is a prime number larger than any key value. So the keys that hash into slot j of the first hash table are placed into a secondary hash table S<sub>j<\/sub> using the hash function h<sub>j<\/sub> which is chosen from another family of hash functions\u00a0\u210b <sub>p,m<sub>j<\/sub><\/sub>. The size of the secondary hash table is m<sub>j<\/sub>.<\/p>\n<p>The objective is to achieve a situation where there are no collisions in the secondary hash table. At least no expected collisions. Let&#8217;s examine the expected colliding elements.<\/p>\n<p>&nbsp;<\/p>\n<h5 id=\"t4\"><strong>4. Estimating Expected Collisions<\/strong><\/h5>\n<p><strong>Theorem<\/strong>: Assuming n keys are stored in a hash table of size m = using a hash function h which is chosen randomly from a universal family of hash functions. Then the probability that there would be any collision is less than 1\/2<\/p>\n<p>Proof: There are \u2329n choose 2\u232a pairs of keys\u00a0 that may collide. Each pair collides with a probability of 1\/m if h is chosen randomly from a universal class\u00a0\u210b\u00a0 of hash functions. We use a random variable X to count the number of collisions.<\/p>\n<p>Let E(X) = expected number of collisions<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-385 aligncenter\" src=\"https:\/\/www.kindsonthegenius.com\/wp-content\/uploads\/2020\/09\/Expected-Number-of-Collisions-in-Perfect-Hashing.jpg\" alt=\"\" width=\"270\" height=\"327\" \/><\/p>\n<p>It then follows that a hash function h that is chosen randomly from the set\u00a0\u210b\u00a0 is have a high likelihood of having no collisions.<\/p>\n<p>&nbsp;<\/p>\n<h5 id=\"t5\"><strong>5. Analysis of Perfect Hashing<\/strong><\/h5>\n<p>If\u00a0 the number of keys n is small we can get away with using a hash table of size m = n<sup>2<\/sup> and choosing a random has function from\u00a0\u210b . But if n is large, the m = n<sup>2<\/sup> would be expensive. Therefore, we resort to using two level hash hashing:<\/p>\n<p>in the first level, hash function h is used to hash n keys into m slots where m = n. Then the number of keys hashing to slot j is n<sub>j<\/sub>.<\/p>\n<p>in the second level, we now use a secondary hash table of size m<sub>j<\/sub> = n<sub>j<\/sub><sup>2<\/sup> which ensures a collision does not occur (given the table is static)<\/p>\n<p>&nbsp;<\/p>\n<h5><strong>Check these Data Structure and Algorithms Tutorials<\/strong><\/h5>\n<ul>\n<li><strong><a href=\"https:\/\/kindsonthegenius.com\/tempsite\/skip-lists-and-how-they-work\/\">Skip Lists and How They Work<\/a><\/strong><\/li>\n<li><strong><a href=\"https:\/\/kindsonthegenius.com\/tempsite\/perfect-hashing\/\">Perfect Hashing \u2013 How it Works<\/a><\/strong><\/li>\n<li><strong><a href=\"https:\/\/kindsonthegenius.com\/tempsite\/hashing\/\">Linear Probing, Quadratic Probing and Double Hashing<\/a><\/strong><\/li>\n<li><strong><a href=\"https:\/\/kindsonthegenius.com\/tempsite\/hashing-with-open-addressing\/\">Hashing With Open Addressing<\/a><\/strong><\/li>\n<li><strong><a href=\"https:\/\/kindsonthegenius.com\/tempsite\/universal-hashing\/\">Universal Hashing<\/a><\/strong><\/li>\n<li><strong><a href=\"https:\/\/kindsonthegenius.com\/tempsite\/search-time-under-simple-uniform-hashing\/\">Search Time Under Simple Uniform Hashing<\/a><\/strong><\/li>\n<li><strong><a href=\"https:\/\/kindsonthegenius.com\/tempsite\/hash-tables-hashing-with-chaining\/\">Hash Tables \u2013 Hashing With Chaining<\/a><\/strong><\/li>\n<li><strong><a href=\"https:\/\/kindsonthegenius.com\/tempsite\/introduction-to-hash-tables-and-direct-addressing\/\">Introduction to Hash Tables and Direct Addressing<\/a><\/strong><\/li>\n<li><strong><a href=\"https:\/\/kindsonthegenius.com\/tempsite\/recurrences-in-divide-and-conquer-algorithm\/\">Recurrences in Divide and Conquer Algorithm<\/a><\/strong><\/li>\n<li><strong><a href=\"https:\/\/kindsonthegenius.com\/tempsite\/how-bloom-filters-work\/\">How Bloom Filters Work<\/a><\/strong><\/li>\n<li><strong><a href=\"https:\/\/kindsonthegenius.com\/tempsite\/cuckoo-hashing\/\">How Cuckoo Hashing Works<\/a><\/strong><\/li>\n<li><strong><a href=\"https:\/\/kindsonthegenius.com\/tempsite\/introduction-to-avl-trees-with-examples\/\">Introduction to AVL Trees (With Examples)<\/a><\/strong><\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>This is used when the keys stored in the hash table are expected to be static. In this case perfect hashing guarantees excellent average as &hellip; <\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"pagelayer_contact_templates":[],"_pagelayer_content":"","footnotes":""},"categories":[35],"tags":[],"class_list":["post-1857","post","type-post","status-publish","format-standard","hentry","category-algorithms"],"_links":{"self":[{"href":"https:\/\/kindsonthegenius.com\/blog\/wp-json\/wp\/v2\/posts\/1857","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/kindsonthegenius.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/kindsonthegenius.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/kindsonthegenius.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/kindsonthegenius.com\/blog\/wp-json\/wp\/v2\/comments?post=1857"}],"version-history":[{"count":1,"href":"https:\/\/kindsonthegenius.com\/blog\/wp-json\/wp\/v2\/posts\/1857\/revisions"}],"predecessor-version":[{"id":2026,"href":"https:\/\/kindsonthegenius.com\/blog\/wp-json\/wp\/v2\/posts\/1857\/revisions\/2026"}],"wp:attachment":[{"href":"https:\/\/kindsonthegenius.com\/blog\/wp-json\/wp\/v2\/media?parent=1857"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/kindsonthegenius.com\/blog\/wp-json\/wp\/v2\/categories?post=1857"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/kindsonthegenius.com\/blog\/wp-json\/wp\/v2\/tags?post=1857"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}