Cachemere
Modular Caching Library for C++
Public Member Functions | List of all members
cachemere::policy::detail::BloomFilter< ItemHash > Class Template Reference

Probabilistic data structure for representing sets in a space-efficient format. More...

#include <bloom_filter.h>

Public Member Functions

 BloomFilter (uint32_t cardinality)
 Constructor. More...
 
template<typename ItemKey >
void add (const ItemKey &item)
 Add an item to the filter. More...
 
void clear ()
 Clear the filter while keeping the allocated memory.
 
template<typename ItemKey >
bool maybe_contains (const ItemKey &item) const
 Test membership of the specified item. More...
 
size_t memory_used () const noexcept
 Get an estimate of the memory consumption of the filter. More...
 
double saturation () const noexcept
 Get the saturation of the filter. More...
 

Detailed Description

template<typename ItemHash>
class cachemere::policy::detail::BloomFilter< ItemHash >

Probabilistic data structure for representing sets in a space-efficient format.

A bloom filter is a constant-sized data structure, which means that insertions will never make the filter allocate more memory. However, too many inserts will severly impact the accuracy of filter membership tests.

Template Parameters
ItemHashFunctor used for hashing the items inserted in the set.

Constructor & Destructor Documentation

◆ BloomFilter()

template<typename ItemHash >
cachemere::policy::detail::BloomFilter< ItemHash >::BloomFilter ( uint32_t  cardinality)

Constructor.

To use this data structure at its full potential, it's very important to have a good estimate for the cardinality of the set to be inserted.

Warning
Having an estimate much higher than the real cardinality will result in excessive memory usage, while having an estimate that is too low will drastically reduce the accuracy of the filter.
Parameters
cardinalityThe expected cardinality of the set to be inserted in the filter.

Member Function Documentation

◆ add()

template<typename ItemHash >
template<typename ItemKey >
void cachemere::policy::detail::BloomFilter< ItemHash >::add ( const ItemKey &  item)

Add an item to the filter.

Parameters
itemThe item to insert.

◆ maybe_contains()

template<typename ItemHash >
template<typename ItemKey >
bool cachemere::policy::detail::BloomFilter< ItemHash >::maybe_contains ( const ItemKey &  item) const

Test membership of the specified item.

A bloom filter can return false positives, but not false negatives. This method returning true only means that the set might contain the specified item, while a return value of false means that the set certainly does not contain the specified item.

Parameters
itemThe item to test.

◆ memory_used()

template<typename ItemHash >
size_t cachemere::policy::detail::BloomFilter< ItemHash >::memory_used
noexcept

Get an estimate of the memory consumption of the filter.

Returns
The memory used by the filter, in bytes.

◆ saturation()

template<typename ItemHash >
double cachemere::policy::detail::BloomFilter< ItemHash >::saturation
noexcept

Get the saturation of the filter.

As filter saturations increases, so will the probability of false positives. A filter saturation of 1.0 means that all underlying bits are set to 1, so every call to maybe_contains will return true. Increasing the filter cardinality will slow down the saturation of the filter, at the cost of using more memory.

Returns
The saturation of the filter, as a fraction

The documentation for this class was generated from the following files: