Elements Added to Bloom Filters

By John Paul Mueller, Luca Massaron

Generally, you create Bloom filters for algorithms of a fixed size (recently developed versions allow you to resize the filter). You operate them by adding new elements to the filter and looking them up when already present. It’s not possible to remove an element from the filter after adding it (the filter has an indelible memory).

When adding an element to a bit vector, the bit vector has some bits set to 1, as shown. In this case, the Bloom filter adds X to the bit vector.

Adding a single element to a bit vector.

You can add as many elements as is necessary to the bit vector. For example, the next figure shows what happens when adding another element, Y, to the bit vector. Note that bit 7 is the same for both X and Y. Consequently, bit 7 represents a collision between X and Y. These collisions are the source of the potential false positives; because of them, the algorithm could say that an element is already added to the bit vector when it isn’t. Using a larger bit vector makes collisions less likely and improves the performance of the Bloom filter, but does so at the cost of both space and time.

Adding a second element can cause collisions.