Slightly adapted after discussing with Bernhard. I also added one
further check.
* origin/fastpath:
fix segfault that could be caused by merging an empty bloom-filter with a bloom-filter already containing values.
CompositeHash.
We do this by hashing values added to a BloomFilter another time more
with a stable hash seeded only by either the filter's name or the
global_hash_seed (or Bro's random() seed if neither is defined).
I'm also adding a new bif bloomfilter_internal_state() that returns a
string representation of a Bloom filter's current internal state. This
is solely for writing tests that check that the filters end up
consistent when seeded with the same value.
This commit adds support for script-level specification of a seed to be used by
hashers. For example, if the given name of a Bloom filter is not empty, then
the seed used by the underlying hasher only depends on the Bloom filter name.
If the name is empty, we check whether the user defined a non-empty
global_hash_seed string variable at script and use it instead. If that script
variable does not exist, then we fall back to the initial seed computed a
Bro startup (which is affected ultimately by $BRO_SEED).
See Hasher::MakeSeed for details.
with a bloom-filter already containing values.
I assume that it is ok to merge an empty bloom-filter with any bloom-filter -
if not we have to change the patch to return an error in this case.
* topic/robin/bloom-filter-merge: (50 commits)
Support emptiness check on Bloom filters.
Refactor Bloom filter merging.
Add bloomfilter_clear() BiF.
Updating NEWS.
Broifying the code.
Implement and test Bloom filter merging.
Make hash functions equality comparable.
Make counter vectors mergeable.
Use half adder for bitwise addition and subtraction.
Fix and test counting Bloom filter.
Implement missing CounterVector functions.
Tweak hasher interface.
Add missing include for GCC.
Fixing for unserializion error.
Small fixes and style tweaks.
Only serialize Bloom filter type if available.
Create hash policies through factory.
Remove lingering debug code.
Factor implementation and change interface.
Expose Bro's linear congruence PRNG as utility function.
...
I'm moving the new files into a subdirectory probabilistic, and into a
corresponding namespace. We can later put code for the other
probabilistic data structures there as well.
* origin/topic/matthias/bloom-filter: (45 commits)
Implement and test Bloom filter merging.
Make hash functions equality comparable.
Make counter vectors mergeable.
Use half adder for bitwise addition and subtraction.
Fix and test counting Bloom filter.
Implement missing CounterVector functions.
Tweak hasher interface.
Add missing include for GCC.
Fixing for unserializion error.
Small fixes and style tweaks.
Only serialize Bloom filter type if available.
Create hash policies through factory.
Remove lingering debug code.
Factor implementation and change interface.
Expose Bro's linear congruence PRNG as utility function.
H3 does not check for zero length input.
Support seeding for hashers.
Add utility function to access first random seed.
Update H3 documentation (and minor style nits.)
Make H3 seed configurable.
...