Efficient Update Control of Bloom Filter Replicas in Large Scale Distributed Systems

Efficient Update Control of Bloom Filter Replicas in Large Scale Distributed Systems

Yifeng Zhu (University of Maine, USA) and Hong Jiang (University of Nebraska - Lincoln, USA)
Copyright: © 2010 |Pages: 23
DOI: 10.4018/978-1-60566-661-7.ch034


This chapter discusses the false rates of Bloom filters in a distributed environment. A Bloom filter (BF) is a space-efficient data structure to support probabilistic membership query. In distributed systems, a Bloom filter is often used to summarize local services or objects and this Bloom filter is replicated to remote hosts. This allows remote hosts to perform fast membership query without contacting the original host. However, when the services or objects are changed, the remote Bloom replica may become stale. This chapter analyzes the impact of staleness on the false positive and false negative for membership queries on a Bloom filter replica. An efficient update control mechanism is then proposed based on the analytical results to minimize the updating overhead. This chapter validates the analytical models and the update control mechanism through simulation experiments.
Chapter Preview

Introduction To Bloom Filters

A standard Bloom filter (BF) (Bloom, 1970) is a lossy but space-efficient data structure to support membership queries within a constant delay. As shown in Figure 1, a BF includes k independent random hash functions and a vector B of a length of m bits. It is assumed that the BF represents a finite set S = {x1, x2,…,xn} of n elements from a universe . The hash functions hi(x), 1 ≤ ik, map the universe U to the bit address space [1,m], shown as follows,

H(x) = {hi(x) ​​| 1 ≤ hi(x) ≤ m for 1 ≤ ik} (1)Definition 1. For all xU, B[H(x)] ≡ {B[hi(x)] | 1 ≤ ik}.
Figure 1.

A Bloom filter with a bit vector of m bits, and k independent hash functions. When an element x is added into the set represented, all bits indexed by those hash functions are set to 1.

This notation facilitates the description of operations on the subset of addressed by the hash functions. For example, B[H(x)] = 1 represents the condition in which all the bits in B at the positions of h1(x),…, and hk(x) are 1. “Setting B[H(x)]” means that the bits at these positions in B are set to 1.

Representing the set S using a BF B is fast and simple. Initially, all the bits in B are set to 0. Then for each xS, an operation of setting B[H(x)] is performed. Given an element x, to check whether is in , one only needs to test whether B[H(x)] = 1. If no, then x is not a member of S; If yes, x is conjectured to be in S. Figure 1 shows the results after the element x is inserted into the Bloom filter.

A standard BF has two well-known properties that are described by the following two theorems.

<b>Theorem 1.</b><i>Zero false negative</i></div> For ∀<i>x</i> ∈ <i>U</i>, if ∃<i>i</i>, <i>B</i>[<i>h<sub>i</sub></i>(<i>x</i>)] ≠ 1, then <span class="xmlReaderInlineFormula"><span><a href="/sourcecontent/9781605666617_498/978-1-60566-661-7.ch034.m05.png" target="_blank"><img src="/sourcecontent/9781605666617_498/978-1-60566-661-7.ch034.m05.png" style="max-width: 100%;" /></a></span></span>. <p>For a static set <i>S</i> whose elements are not dynamically deleted, the bit vector indexed by those hash functions always never returns a false negative. The proof is easy and is not given in this chapter.</p></div><div class="preview-footer"><a href="javascript:__doPostBack('ctl00$cphFeatured$lnkAddToCart','')">Purchase this chapter to continue reading all 23 pages ></a></div></div><h2 id="key-terms">Key Terms in this Chapter</h2><p><a href="/dictionary/distributed-membership-query/8076">Distributed Membership Query</a>: Membership query is one fundamental function that reports where the target data, resource, or service is located. The membership query can be performed by a centralized server or by a group of distributed server. The latter approach has a stronger scalability and is referred as distributed memory query.</p><p><a href="/dictionary/false-positive/10897">False Positive</a>: A false positive happens when an element is not a member of the set that a Bloom filter represents but the Bloom filter mistakenly reports it is. The probability of false positives can be very slow when the Bloom filter is appropriately designed.</p><p><a href="/dictionary/bloom-filter-array/2676">Bloom Filter Array</a>: A Bloom filter array, consisted of multiple Bloom filters, represents multiple sets. It is a space-efficient data structure to evaluate whether an element is within these sets and which set this element belongs to if yes.</p><p><a href="/dictionary/false-negative/10895">False Negative</a>: A false negative happens when an element is a member of the set that a Bloom filter represents but the Bloom filter mistakenly reports it is not. A standard Bloom filter has no false negatives. However, in a distributed system, a Bloom filter replica can generate false negatives when the replica is not timely updated.</p><p><a href="/dictionary/bloom-filter-update-protocol/2678">Bloom Filter Update Protocol</a>: When the set that a Bloom filter represents is changed over time, the corresponding Bloom filter replica becomes out-dated. In order to reduce the probability that the Bloom filter replica reports the membership incorrectly, the replica needs to be updated frequently. The Bloom filter update protocol determines when a Bloom filter replica needs to be updated.</p><p><a href="/dictionary/bloom-filter/2675">Bloom Filter</a>: A Bloom filter is a space-efficient data structure that supports membership queries. It consists of a bit array and all bits are initially set to 0. It uses a fix number of predefined independent hash functions. For each element, all hashed bits are set to 1. To check whether an element belongs to the set represented by a Bloom filter, one simply checks all bits pointed by the hash functions are 1. If not, the element is not in the set. If yes, the element is consider as a member.</p><p><a href="/dictionary/bloom-filter-replica/2677">Bloom Filter Replica</a>: A Bloom filter replica is a replication of a Bloom filter. In a distributed environment, the original and replicated Bloom filters are typically stored on different servers for improved performance and fault tolerance. A Bloom filter replica will generate both false positives and false negatives.</p><div id="table-of-contents"><h2>Complete Chapter List</h2><div class="search-contents"><span class="text"> Search this Book: </span><span class="text-box-container"><input id="txtKeywords" type="text" maxlength="50" onkeypress="return SearchBookFulltextHandleEnter(event, 498);" placeholder="Full text search terms" title="Full text search terms" class="full-text-search-box" /></span><div class="inline-block search-contents-xs-full-width"><span class="search"><span class="search-button" onclick="RemoveSpecialCharacters();SearchBookFulltext(498);"></span></span><span class="reset"><span onclick="RemoveSpecialCharacters();SearchBookFulltextReset();" class="link-gray-s">Reset</span></span></div></div><div id="searchResults"></div><div id="full-toc"></div><div id="loading-toc" class="text-align-center"><div class="loading-icon-lg"></div></div><script type="text/javascript"> $(document).ready(function () { var bookId = 498; var titleId = 36434; var subjectId = 5; var compactView = 'True'; var onDemandDiscountDisplayPrice = ''; var onDemandDisplayPrice = '$37.50'; var chapterCount = 38; if (chapterCount !== 0) { GetBookToc(bookId, titleId, subjectId, compactView, onDemandDiscountDisplayPrice, onDemandDisplayPrice); } else { GetBookTocFromSubmissionSystem(bookId, titleId, subjectId, compactView, onDemandDiscountDisplayPrice, onDemandDisplayPrice); } }); </script></div></div></div></div><script type="text/javascript"> MenuAdjust(); $(window).on('resize orientationChange', function (event) { MenuAdjust(); }); </script><footer class="footer"><div class="container"><div class="row"><div class="top-margin"><div class="col-md-6"><div class="footer-header"> Learn More </div><div class="text"><a href="/about/" class="footer-link">About IGI Global</a> | <a href="/about/partnerships/" class="footer-link">Partnerships</a> | <a href="/contact/" class="footer-link">Contact</a> | <a href="/about/staff/job-opportunities/" class="footer-link">Job Opportunities</a> | <a href="/faq/" class="footer-link">FAQ</a> | <a href="/about/staff/" class="footer-link">Management Team</a></div><div class="footer-header header-margin-top"> Resources For </div><div class="text"><a href="/librarians/" class="footerlink">Librarians</a> | <a href="/publish/" class="footerlink">Authors/Editors</a> | <a href="/distributors/" class="footerlink">Distributors</a> | <a href="/course-adoption/" class="footerlink">Instructors</a> | <a href="/about/rights-permissions/translation-rights/" class="footerlink">Translators</a> | <a href="https://www.econtentpro.com/partners/referrer/2eeff007-a17a-e611-80c4-0cc47a0d221d?url=/copyediting" class="footerlink" target="_blank">Copy Editing Services</a></div><div class="footer-header header-margin-top"> Media Center </div><div class="text"><a href="/symposium/" class="footer-link">Webinars</a> | <a href="/newsroom/" class="footer-link">Blogs</a> | <a href="/catalogs/" class="footer-link">Catalogs</a> | <a href="/newsletters/" class="footer-link">Newsletters</a></div><div class="footer-header header-margin-top"> Policies </div><div class="text"><a href="/about/rights-permissions/privacy-policy/" class="footer-link">Privacy Policy</a> | <a href="/cookies-agreement/" class="footer-link">Cookie & Tracking Notice</a> | <a href="/about/rights-permissions/content-reuse/" class="footer-link">Fair Use Policy</a> | <a href="/about/rights-permissions/ethics-malpractice/" class="footer-link">Ethics and Malpractice</a></div></div><div class="col-md-6 td-r"><div class="td-r-t"><div class="footer-rightside-container"><div class="left"><div style="margin-bottom: 1em;"><a href="http://www.facebook.com/pages/IGI-Global/138206739534176?ref=sgm" target="_blank"><span class="fb"></span></a> <a href="http://twitter.com/igiglobal" target="_blank"><span class="tw"></span></a></div><div><a href="http://www.world-forgotten-children.org" target="_blank"><img src="/images/proud-supporter-of-wfcf-07282015.png" alt="World Forgotten Children's Foundation" title="Proud Supporter of the World Forgotten Children's Foundation" width="157" height="52" /></a></div></div><div class="right"><div class="conference-badges-container"><a class="conference-badge" href="https://www.charlestonlibraryconference.com/" target="_blank"><img src="/Images/Conferences/charleston-2018.png" width="250" height="124"></a></div></div></div></div><div class="text copyright-text">Copyright © 1988-2018, IGI Global - All Rights Reserved</div></div></div></div></div></footer><div class="aspNetHidden"><input type="hidden" name="__VIEWSTATEGENERATOR" id="__VIEWSTATEGENERATOR" value="679D6B48" /><input type="hidden" name="__EVENTVALIDATION" id="__EVENTVALIDATION" value="gl2LsyyXZfvNDttf+IFGCWBslimomvUQTEY3asADd0XKp7iiAuUg/VpeHrJjnYpgwhPW/0a8LMD6GgDHCgHu7dVo2y9/dZ913yy+xGfl3QjiJZSclPEOxOIw7UF/REvSiqn5PgqJjY1p7ersmtZC/O8yzUJtxXn2jJEuKYa8tRiP0VoGHcU55V7nFvrc4upbnDSTKO4tIuwlrjTfhxc5wK+cRN/B96dxsvKCpat9OVJu0hOQsFRR2S2rKO4L4R53pT1DAjlR8U9CCYyZBA3rmTOdk/UHkFgCYLNXEe5Yi0+3ToczKasQkBCuUE/otnboS5wFfVwHkeu1EoO26NIVnjdebGeJT/RwRvh7wdfbD7sBOtINmFiBD5kaJkHMjrublH0KnZP/FUVWOGQ9AP3hpABQ3kmkw/VgUkriZR/40Y5ZYLVT" /></div></form></body></html>