In deterministic mode, RAND_MAX is not related to the result of
zeek::random_number() (formerly bro_random()), but some logic was
using RAND_MAX as indication of the possible range of values. The
new zeek::max_random() will give the correct upper-bound regardless
of whether deterministic-mode is used.
The type used for storing the state of the RNG is changed from
`unsigned int` to `long int` since the former has a minimal range
of [0, 65,535] while the RNG function itself has a range of
[1, 2147483646]. A `long int` must be capable of
[−2147483647, +2147483647] and is also the return type of `random()`,
which is what zeek::prng() aims to roughly parity.
Merge adjustments:
- Preserved original `base_type_no_ref` argument type as ::TypeTag
- Removed superfluous #pragma guard around deprecated TableVal ctor
- Clarify NEWS regarding MetaHook{Pre,Post} deprecations
- Simplify some `::zeek::` qualifications to just `zeek::`
- Prefixed FORWARD_DECLARE_NAMESPACED macro with ZEEK_
* origin/topic/timw/266-namespaces:
Disable some deprecation diagnostics for GCC
Rename BroType to Type
Update NEWS
Review cleanup
Move Type types to zeek namespace
Move Flare/Pipe from the bro namespace to zeek::detail
Move Attr to the zeek::detail namespace
Move Trigger into the zeek::detail namespace
Move ID to the zeek::detail namespace
Move Anon.h into zeek::detail namespace
Mark all of the aliased classes in plugin/Plugin.h deprecated, and fix all of the plugins that were using them
Move all of the base plugin classes into the zeek::plugin namespace
Expr: move all classes into zeek::detail
Stmt: move Stmt classes into zeek::detail namespace
Add utility macro for creating namespaced aliases for classes
* origin/master:
Use zeek::detail namespace for fuzzer utils
Set terminating flag during fuzzer cleanup
Add missing include to standalone fuzzer driver
Improve standalone fuzzer driver error messages
Test fuzzers against seed corpus under CI ASan build
Update fuzzing README with OSS-Fuzz integration notes
Link fuzzers against shared library to reduce executable sizes
Improve FuzzBuffer chunking
Fix compiler warning in standalone fuzzer driver
Adjust minor fuzzing documentation
Exit immediately after running unit tests
Add OSS-Fuzz Zeek script search path to fuzzers
Assume libFuzzer when LIB_FUZZING_ENGINE file doesn't exist
Change handling of LIB_FUZZING_ENGINE
Change --enable-fuzzing to --enable-fuzzers
Add standalone driver for fuzz targets
Add basic structure for fuzzing targets
This commit moves some of the hash datastructures and code from
util.cc into Hash.cc - where it seems more appropriate.
It also starts to make more Keyed hash functions available - still
using siphash as the default 64 bit keyed hash, but also making
128 and 256 bit highway hashes available.
There already are a few other functions that are defined but not
yet implemented - these will be "static" keyed hashes - which use
an installation specific key. These will be used to, e.g., get
rid of md5 hashing for the generation of file UIDs.
Currently, siphash is used for strings up to 36 bytes. hmac-md5 is used
for longer strings.
This switch-over is a remnant of the previous hash-function that was
used, which apparently was slower with longer input strings.
This change serves no purpose anymore. I performed a few performance tests
on strings of varying sizes:
For a 40 byte string with 10 million iterations:
siphash: 0.31 seconds
hmac-md5: 3.8 seconds
For a 1080 byte string with 10 million iterations:
siphash: 4.2 seconds
hmac-md5: 17 seconds
For a 18360 byte string with 10 million iterations:
siphash: 69 seconds
hmac-md5: 240 seconds
Hence, this commit removes the use of hmac-md5.
This change causes reordering of lines in a few logs.
This commit also changes the datastructure for the seed in probabilistic/Hasher
to get rid of a type-punning warning.
This adds the entirety of the highwayhash implementation of Google.
This includes siphash as well as severl highwayhash variants - which
are faster.
This first commit only switches out the siphash implementation. All
hashes that are generated are exactly the same as before. However, this
does make all other hashes available to be used by us.
I did some performance tests vs the previous siphash implementation by
running the 2009-M57-day11-18 trace 100x through both cases. The average
runtime was virtually the same (within 0.014 seconds of each other).
Note that the way that I included the highwayhash implementation in our
cmake setup is... well, let's say hacky. This definitely needs to be
changed a bit before including this in a real build.
General changes:
* Add -D/--deterministic command line option as
convenience/alternative to -G/--load-seeds (i.e. no file needed, it just
uses zero-initialized random seeds). It also changes Broker data
stores over to using deterministic timing rather than real time.
* Add option to make Reporter abort on runtime scripting errors
Using an overload that takes a va_list argument potentially causes
accidental misuse on platforms (e.g. 32-bit) where va_list is
implemented as a type that may collide with commonly-used argument
types.
For example:
char* c = copy_string("hi");
fmt("%s", (const char*)c);
fmt("%s", c);
The first fmt() call correctly goes through fmt(const char*, ...) first,
but the second mistakenly goes through fmt(const char*, va_list) first
because variadic function overloads have lower priority during overload
resolution and va_list on a 32-bit system happens to be defined as a
pointer type that can match with "char*" but not "const char*".
During merge I split the test for bro_init/bro_done/bro_script_loaded
event errors into individual tests since the other testing of the zeek
versions of those events seemed fine to otherwise keep.
* origin/topic/robin/631-deprecation-v2:
Update NEWS for naming changes.
Small cleanup and updating submodules.
Remove test for legacy plugin.
Remove legancy symlinks in aux/.
Add warnings when loading scripts ending in ".bro", or using legacy environment variables.
Fix missing rename.
No longer symlink local.zeek to local.bro.
Update notice user agent.
Remove old_comm_usage_is_ok.
Remove bro-config.h.in and bro-path-dev.in.
Change Bro wrapper script to now abort when old executable names are still used.
Remove APIs that were explicitly deprecated to be removed in 3.1.
safe_snprintf and safe_vsnprintf just exist to ensure that the resulting strings are always null-terminated. The documentation for snprintf/vsnprintf states that the output of those methods are always null-terminated, thus making the safe versions obsolete.
More aspects of the cluster configuration to get fleshed out later,
but a basic cluster like one would use for a live deployment
can now be instantiated and run under supervision. The new
clusterized-pcap-processing supervisor mode is also not done yet.
The full process hierarchy isn't set up yet, but these changes
help prepare by doing two things:
- Add a -j option to enable supervisor-mode. Currently, just a single
"stem" process gets forked early on to be used as the basis for
further forking into real cluster nodes.
- Separates the parsing of command-line options from their consumption.
i.e. need to parse whether we're in -j supervisor-mode before
modifying any global state since that would taint the "stem" process.
The new intermediate structure containing the parsed options may
also serve as a way to pass configuration info from "stem" to its
descendent cluster node processes.