Algorithms For Dummies
Book image
Explore Book Buy On Amazon
Generally, you create Bloom filters for algorithms of a fixed size (recently developed versions allow you to resize the filter). You operate them by adding new elements to the filter and looking them up when already present. It's not possible to remove an element from the filter after adding it (the filter has an indelible memory).

When adding an element to a bit vector, the bit vector has some bits set to 1, as shown. In this case, the Bloom filter adds X to the bit vector.

Adding a single element to a bit vector.

You can add as many elements as is necessary to the bit vector. For example, the next figure shows what happens when adding another element, Y, to the bit vector. Note that bit 7 is the same for both X and Y. Consequently, bit 7 represents a collision between X and Y. These collisions are the source of the potential false positives; because of them, the algorithm could say that an element is already added to the bit vector when it isn't. Using a larger bit vector makes collisions less likely and improves the performance of the Bloom filter, but does so at the cost of both space and time.

Adding a second element can cause collisions.

About This Article

This article is from the book:

About the book authors:

John Paul Mueller has produced 102 books and more than 600 articles to date on topics ranging from networking to machine learning. Luca Massaron is a data scientist specializing in organizing and interpreting big data and transforming it into smart data by means of the simplest and most effective data mining and machine learning techniques.

This article can be found in the category: