{"id":1861,"date":"2019-01-01T12:00:00","date_gmt":"2019-01-01T11:00:00","guid":{"rendered":"https:\/\/kindsonthegenius.com\/blog\/how-bloom-filters-work\/"},"modified":"2026-07-05T03:20:58","modified_gmt":"2026-07-05T01:20:58","slug":"how-bloom-filters-work","status":"publish","type":"post","link":"https:\/\/kindsonthegenius.com\/blog\/how-bloom-filters-work\/","title":{"rendered":"Bloom Filters \u2013 A Clear Explanation of How They Work"},"content":{"rendered":"<p>We would cover the following:<\/p>\n<ol>\n<li><a href=\"#t1\">Introduction to Bloom Filters<\/a><\/li>\n<li><a href=\"#t2\">Applications of Bloom Filters<\/a><\/li>\n<li><a href=\"#t3\">How Bloom Filters Work<\/a><\/li>\n<li><a href=\"#t4\">Bloom Filter Setup<\/a><\/li>\n<li><a href=\"#t5\">Analysis of Bloom Filters<\/a><\/li>\n<li><a href=\"#t6\">Watch the Video<\/a><\/li>\n<\/ol>\n<p><img loading=\"lazy\" decoding=\"async\" class=\" wp-image-350 aligncenter\" src=\"https:\/\/www.kindsonthegenius.com\/wp-content\/uploads\/2020\/09\/How-Bloom-Filters-Work-300x179.jpg\" alt=\"\" width=\"256\" height=\"153\" \/><\/p>\n<p><strong id=\"t1\">1. Introduction to Bloom Filters<\/strong><\/p>\n<p>A Bloom Filter is a a data structure (based on hashing) that lets us determine whether an element is a member of a set. In bloom filters, it is possible for false positive to occur but with low probability. But false negatives is not possible. It is a space efficient and probabilistic data structure.<\/p>\n<p>Bloom filters allows us to used several hash functions for each key.<\/p>\n<p>&nbsp;<\/p>\n<p><strong id=\"t2\">2. Application of Bloom Filters<\/strong><\/p>\n<p>A common application of bloom filters ins in caching. The browser maintains certain levels of caches carefully placed to speed up loading of web pages that are accessed frequently. If a user requests a particular URL, then the browser needs to determine if the requested page is already in one of the caches and then serves the page from the cache if it is there.<\/p>\n<p>However, it may happen that the page may the reported to be in the cache while it is actually not there. This is called false positive. In this case, then fetches he page from the original url.\u00a0 Probability of this happening is small as we would see later.<\/p>\n<p>&nbsp;<\/p>\n<p><strong id=\"t3\">3. How Bloom Filters Work<\/strong><\/p>\n<p>Remember that we mentioned that bloom filters are used to determine if an element is a member of a set.<\/p>\n<p>So let&#8217;s assume we have a set of n elements S = {s1, s2, . . . , sn} which is taken from a very large universe U. (U may be a universe of all URLs and S is a set of urls existing in the cache). We want to allow for insertions into set S. Before insertion, the system checks if:<\/p>\n<p style=\"text-align: center;\">Given x\u00a0 \u2208\u00a0U, is x\u00a0\u2208 \u00a0 S?<\/p>\n<ul>\n<li>If the answer is No, then\u00a0 x is not in S (x \u2209\u00a0S)<\/li>\n<li>If the answer is Yes, then x may\u00a0 or may not be in S<\/li>\n<\/ul>\n<p>The objective is to perform this query as well as insertion into the cache in constant time O(1)<\/p>\n<p>&nbsp;<\/p>\n<p><strong id=\"t4\">4. Bloom Filter Setup<\/strong><\/p>\n<p>A bloom filter is a bit array of B of m bits and k independent hash functions, h<sub>1<\/sub>, . . . , h<sub>k<\/sub>. The has functions map each key in U to a set R<sub>m<\/sub> = {0, 1, . . . , m-1}<\/p>\n<p>We assume that each of the hash function h<sub>i<\/sub> maps a uniformly random key\u00a0 x\u00a0\u2208 \u00a0U to each element of R<sub>m<\/sub> with equal probability.<\/p>\n<p>Since the hash functions\u00a0h<sub>1<\/sub>, . . . , h<sub>k<\/sub> \u00a0are independent, then the result of the hashing, h<sub>1<\/sub>(x), . . . , h<sub>k<\/sub>(x) is equally likely to be any of the positions in Rm. All the m bits of B are initially set to 0. Therefore the process of insertion is as follows:<\/p>\n<p>To Insert:\u00a0 Evaluate\u00a0 h<sub>1<\/sub>(x), . . . ,\u00a0h<sub>k<\/sub>(x) and set each of the bits to 1<\/p>\n<p>To check is x\u00a0\u2208 S:\u00a0Evaluate\u00a0 h<sub>1<\/sub>(x), . . . ,\u00a0h<sub>k<\/sub>(x) . If all the bits are set to 1. then x\u00a0\u2208 \u00a0S else x\u00a0\u2209 S<\/p>\n<p>This is illustrated in Figure 1.0<\/p>\n<figure id=\"attachment_348\" aria-describedby=\"caption-attachment-348\" style=\"width: 667px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-348 \" src=\"https:\/\/www.kindsonthegenius.com\/wp-content\/uploads\/2020\/09\/Bloom-Filters.jpg\" alt=\"\" width=\"667\" height=\"226\" \/><figcaption id=\"caption-attachment-348\" class=\"wp-caption-text\">Figure 1.0: Bloom Filter. Initially all the bits are set to zero. Then we computer h<sub>k<\/sub>(x) and set the bits to 1 for each of the keys h<sub>k<\/sub>(x) maps to<\/figcaption><\/figure>\n<p>&nbsp;<\/p>\n<p><strong id=\"t5\">5. Analysis of Bloom Filters &#8211; Version 1<\/strong><\/p>\n<p>In this version, we use k hash functions and k bit arrays.<\/p>\n<p>To insert:<\/p>\n<ul>\n<li>h<sub>1<\/sub>(x) is stored in the first array<\/li>\n<li>h<sub>2<\/sub>(x is stored in the second array<\/li>\n<li>&#8230;<\/li>\n<li>h<sub>k<\/sub>(x) is stored in the k<sup>th<\/sup> array<\/li>\n<\/ul>\n<p>To search:<\/p>\n<p>Search h<sub>i<\/sub>(x) for each i and if h<sub>i<\/sub>(x) is not set to 1 for any i, then x is not in the set<\/p>\n<p>Let&#8217;s now calculate the probability of false positive<\/p>\n<p>Pr(false positive) occurs if h(y) = h(x<sub>i<\/sub>)\u00a0 where y\u00a0\u2209\u00a0 S. Probability that this happens is given by:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-349 aligncenter\" src=\"https:\/\/www.kindsonthegenius.com\/wp-content\/uploads\/2020\/09\/Bloom-Filter-Probability.jpg\" alt=\"\" width=\"342\" height=\"260\" \/><\/p>\n<p>We can also go ahead to calculate the number of bits required.<\/p>\n<p>&nbsp;<\/p>\n<p><strong id=\"t6\">6. Watch the Video<\/strong><\/p>\n<p><iframe loading=\"lazy\" src=\"https:\/\/www.youtube.com\/embed\/Opf_X1ceMsw\" width=\"560\" height=\"315\" frameborder=\"0\" allowfullscreen=\"allowfullscreen\"><\/iframe><\/p>\n<p>&nbsp;<\/p>\n<h5><strong>Check these Data Structure and Algorithms Tutorials<\/strong><\/h5>\n<ul>\n<li><strong><a href=\"https:\/\/kindsonthegenius.com\/tempsite\/skip-lists-and-how-they-work\/\">Skip Lists and How They Work<\/a><\/strong><\/li>\n<li><strong><a href=\"https:\/\/kindsonthegenius.com\/tempsite\/perfect-hashing\/\">Perfect Hashing \u2013 How it Works<\/a><\/strong><\/li>\n<li><strong><a href=\"https:\/\/kindsonthegenius.com\/tempsite\/hashing\/\">Linear Probing, Quadratic Probing and Double Hashing<\/a><\/strong><\/li>\n<li><strong><a href=\"https:\/\/kindsonthegenius.com\/tempsite\/hashing-with-open-addressing\/\">Hashing With Open Addressing<\/a><\/strong><\/li>\n<li><strong><a href=\"https:\/\/kindsonthegenius.com\/tempsite\/universal-hashing\/\">Universal Hashing<\/a><\/strong><\/li>\n<li><strong><a href=\"https:\/\/kindsonthegenius.com\/tempsite\/search-time-under-simple-uniform-hashing\/\">Search Time Under Simple Uniform Hashing<\/a><\/strong><\/li>\n<li><strong><a href=\"https:\/\/kindsonthegenius.com\/tempsite\/hash-tables-hashing-with-chaining\/\">Hash Tables \u2013 Hashing With Chaining<\/a><\/strong><\/li>\n<li><strong><a href=\"https:\/\/kindsonthegenius.com\/tempsite\/introduction-to-hash-tables-and-direct-addressing\/\">Introduction to Hash Tables and Direct Addressing<\/a><\/strong><\/li>\n<li><strong><a href=\"https:\/\/kindsonthegenius.com\/tempsite\/recurrences-in-divide-and-conquer-algorithm\/\">Recurrences in Divide and Conquer Algorithm<\/a><\/strong><\/li>\n<li><strong><a href=\"https:\/\/kindsonthegenius.com\/tempsite\/how-bloom-filters-work\/\">How Bloom Filters Work<\/a><\/strong><\/li>\n<li><strong><a href=\"https:\/\/kindsonthegenius.com\/tempsite\/cuckoo-hashing\/\">How Cuckoo Hashing Works<\/a><\/strong><\/li>\n<li><strong><a href=\"https:\/\/kindsonthegenius.com\/tempsite\/introduction-to-avl-trees-with-examples\/\">Introduction to AVL Trees (With Examples)<\/a><\/strong><\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>We would cover the following: Introduction to Bloom Filters Applications of Bloom Filters How Bloom Filters Work Bloom Filter Setup Analysis of Bloom Filters Watch &hellip; <\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"pagelayer_contact_templates":[],"_pagelayer_content":"","footnotes":""},"categories":[35],"tags":[],"class_list":["post-1861","post","type-post","status-publish","format-standard","hentry","category-algorithms"],"_links":{"self":[{"href":"https:\/\/kindsonthegenius.com\/blog\/wp-json\/wp\/v2\/posts\/1861","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/kindsonthegenius.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/kindsonthegenius.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/kindsonthegenius.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/kindsonthegenius.com\/blog\/wp-json\/wp\/v2\/comments?post=1861"}],"version-history":[{"count":1,"href":"https:\/\/kindsonthegenius.com\/blog\/wp-json\/wp\/v2\/posts\/1861\/revisions"}],"predecessor-version":[{"id":2030,"href":"https:\/\/kindsonthegenius.com\/blog\/wp-json\/wp\/v2\/posts\/1861\/revisions\/2030"}],"wp:attachment":[{"href":"https:\/\/kindsonthegenius.com\/blog\/wp-json\/wp\/v2\/media?parent=1861"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/kindsonthegenius.com\/blog\/wp-json\/wp\/v2\/categories?post=1861"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/kindsonthegenius.com\/blog\/wp-json\/wp\/v2\/tags?post=1861"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}