Cachemere
Modular Caching Library for C++
Public Member Functions | List of all members
cachemere::policy::detail::CountingBloomFilter< ItemHash > Class Template Reference

Space-efficient probabilistic data structure to estimate the number of times an item was inserted in a set. More...

#include <counting_bloom_filter.h>

Public Member Functions

 CountingBloomFilter (uint32_t cardinality)
 Constructor. More...
 
template<typename ItemKey >
void add (const ItemKey &item)
 Increment the count for a given item by one. More...
 
void clear ()
 Clear the filter while keeping the allocated memory.
 
void decay ()
 Divide counter values by two. More...
 
template<typename ItemKey >
uint32_t estimate (const ItemKey &item) const
 Get the counter estimate for a given item. More...
 
uint32_t cardinality () const noexcept
 Get the cardinality of the filter. More...
 
size_t memory_used () const noexcept
 Get an estimate of the memory consumption of the filter. More...
 
double saturation () const noexcept
 Get the saturation of the filter. More...
 

Detailed Description

template<typename ItemHash>
class cachemere::policy::detail::CountingBloomFilter< ItemHash >

Space-efficient probabilistic data structure to estimate the number of times an item was inserted in a set.

A counting bloom filter is a constant-sized data structure, which means that insertions will never make the filter allocate more memory. However, too many inserts will severely impact the accuracy of counter estimates.

Constructor & Destructor Documentation

◆ CountingBloomFilter()

template<typename ItemHash >
cachemere::policy::detail::CountingBloomFilter< ItemHash >::CountingBloomFilter ( uint32_t  cardinality)

Constructor.

To use this data structure at its full potential, it's very important to have a good estimate for the cardinality of the set to be inserted.

Warning
Having an estimate much higher than the real cardinality will result in excessive memory usage, while having an estimate that is too low will drastically reduce the accuracy of the filter.
Parameters
cardinalityThe expected cardinality of the set to be inserted in the filter.

Member Function Documentation

◆ add()

template<typename ItemHash >
template<typename ItemKey >
void cachemere::policy::detail::CountingBloomFilter< ItemHash >::add ( const ItemKey &  item)

Increment the count for a given item by one.

Parameters
itemThe item of the counter to increment.

◆ cardinality()

template<typename ItemHash >
uint32_t cachemere::policy::detail::CountingBloomFilter< ItemHash >::cardinality
noexcept

Get the cardinality of the filter.

Returns
The cardinality of the filter, as configured.

◆ decay()

template<typename ItemHash >
void cachemere::policy::detail::CountingBloomFilter< ItemHash >::decay

Divide counter values by two.

If the user of this filter doesn't rely on the absolute value of the counters, calling this regularly will decrease the saturation of the filter by removing items that are very rarely seen.

◆ estimate()

template<typename ItemHash >
template<typename ItemKey >
uint32_t cachemere::policy::detail::CountingBloomFilter< ItemHash >::estimate ( const ItemKey &  item) const

Get the counter estimate for a given item.

Parameters
itemThe item for which to estimate the count.

Similarly to the way a bloom filter can return false positives, but not false negatives, the counter estimation produced by a counting bloom filter is actually an upper bound of the real counter value - the real count is guaranteed to be less or equal to the estimate returned.

Returns
The counter estimate for the item.

◆ memory_used()

template<typename ItemHash >
size_t cachemere::policy::detail::CountingBloomFilter< ItemHash >::memory_used
noexcept

Get an estimate of the memory consumption of the filter.

Returns
The memory used by the filter, in bytes.

◆ saturation()

template<typename ItemHash >
double cachemere::policy::detail::CountingBloomFilter< ItemHash >::saturation
noexcept

Get the saturation of the filter.

As filter saturations increases, so will the probability of false positives. A filter saturation of 1.0 means that all underlying counters are non-zero, so every call to estimate will return a counter value greater than 1. Increasing the filter cardinality will slow down the saturation of the filter, at the cost of using more memory.

Returns
The saturation of the filter, as a fraction.

The documentation for this class was generated from the following files: