When constructing a Bloom filter, one now has to pass a HashPolicy instance to
it. This separates more clearly the concerns of hashing and Bloom filter
management.
This commit also changes the interface to initialize Bloom filters: there exist
now two initialization functions, one for each type:
(1) bloomfilter_basic_init(fp: double,
capacity: count,
name: string &default=""): opaque of bloomfilter
(2) bloomfilter_counting_init(k: count,
cells: count,
max: count,
name: string &default=""): opaque of bloomfilter
The BiFs for adding elements and performing lookups remain the same. This
essentially gives us "BiF polymorphism" at script land, where the
initialization BiF constructs the most derived type while subsequent BiFs
adhere to the same interface.
The reason why we split up the constructor in this case is that we have not yet
derived the math that computes the optimal number of hash functions for
counting Bloom filters---users have to explicitly parameterize them for now.
Thanks to git this merge was less troublesome that I was afraid it
would be. Not all tests pass yet though (and file hashes have changed
unfortunately).
Conflicts:
cmake
doc/scripts/DocSourcesList.cmake
scripts/base/init-bare.bro
scripts/base/protocols/ftp/main.bro
scripts/base/protocols/irc/dcc-send.bro
scripts/test-all-policy.bro
src/AnalyzerTags.h
src/CMakeLists.txt
src/analyzer/Analyzer.cc
src/analyzer/protocol/file/File.cc
src/analyzer/protocol/file/File.h
src/analyzer/protocol/http/HTTP.cc
src/analyzer/protocol/http/HTTP.h
src/analyzer/protocol/mime/MIME.cc
src/event.bif
src/main.cc
src/util-config.h.in
testing/btest/Baseline/coverage.bare-load-baseline/canonified_loaded_scripts.log
testing/btest/Baseline/coverage.default-load-baseline/canonified_loaded_scripts.log
testing/btest/Baseline/istate.events-ssl/receiver.http.log
testing/btest/Baseline/istate.events-ssl/sender.http.log
testing/btest/Baseline/istate.events/receiver.http.log
testing/btest/Baseline/istate.events/sender.http.log
And changed the endianness parameter of bytestring_to_count() BIF to
default to false (big endian), mostly just to prove that the BIF parser
doesn't choke on default parameters.
observed elements.
Add methods to merge with and without pruning (before only merge
method was with pruning, which invalidates the number of total
observed elements)
I am not (entirely) sure that this is mathematically correct, but
I am (more and more) getting the feeling that it... might be.
In any case - this was the last step and now it should work
in cluster settings.
Note: merging top-k data structures is not yet possible (and is
actually quite awkward/expensive). I will have to think about
how to do that for a bit...
All tests pass with one exception: some Broxygen tests are broken
because dpd_config doesn't exist anymore. Need to update the mechanism
for auto-documenting well-known ports.
* origin/topic/matthias/opaque:
Add new unit test for opaque serialization.
Migrate entropy testing to opaque.
C++ify RandTest.*
Fix a hard-to-spot bug.
Use more descriptive error message.
Fix the fix :-/.
Fix initialization of hash values.
Be clearer about delegation.
Implement serialization of opaque types.
Update hash BiF documentation.
Migrate free SHA* functions to SHA*Val::digest().
Add missing type name that caused failing tests.
Update base scripts and unit tests.
Simplify hash function BiFs.
Add support for opaque hash values.
Adapt BiF & Bro parser to handle opaque types.
More lexer/parser work.
Implement equivalence relation for opaque types.
Support basic serialization of opaque.
Add opaque type to lexer, parser, and BroType.
Closes#925
Conflicts:
aux/broccoli
The srand()/rand() interface was being intermixed with the
srandom()/random() one. The later is now used throughout.
Changed the srand() and rand() BIFs to work deterministically if Bro
was given a seed file (addresses #825). They also now wrap the
system's srandom() and random() instead of srand() and rand() as per
the above.
* origin/topic/dnthayer/bif-tests:
Improve "fmt" BIF documentation comment
Improve tests of the type_name BIF
Improve test cases for "order" BIF
Fix documentation of sort BIF and add more tests
Fix documentation for system_env BIF
Deprecate the parse_dotted_addr BIF (use to_addr instead)
Improve tests for to_port and type_name BIFs
Improve tests for sort, order, and system_env BIFs
Fix the join_string_vec BIF and add more tests
Add more tests for previously-untested BIFs
Add more tests for previously-untested BIFs
Add more tests for previously-untested BIFs
Add more tests for previously-untested BIFs
Add tests for previously-untested strings BIFs