safe_snprintf and safe_vsnprintf just exist to ensure that the resulting strings are always null-terminated. The documentation for snprintf/vsnprintf states that the output of those methods are always null-terminated, thus making the safe versions obsolete.
More aspects of the cluster configuration to get fleshed out later,
but a basic cluster like one would use for a live deployment
can now be instantiated and run under supervision. The new
clusterized-pcap-processing supervisor mode is also not done yet.
The full process hierarchy isn't set up yet, but these changes
help prepare by doing two things:
- Add a -j option to enable supervisor-mode. Currently, just a single
"stem" process gets forked early on to be used as the basis for
further forking into real cluster nodes.
- Separates the parsing of command-line options from their consumption.
i.e. need to parse whether we're in -j supervisor-mode before
modifying any global state since that would taint the "stem" process.
The new intermediate structure containing the parsed options may
also serve as a way to pass configuration info from "stem" to its
descendent cluster node processes.
Replaced logic in strstrip() with a lambda to avoid deprecations:
- std::ptr_fun is deprecated in C++11, removed C++17
- std::not1 is deprecated in C++17. removed C++20
* origin/topic/timw/cleaner-utf8:
GHI-486: Switch over to using LLVM utf8-checking code to better validate characters
I addressed a buffer over-read during the merge and added test-cases for
it.
* topic/jsiwek/template-containers-merge:
Fix a potential usage of List::remove_nth(-1)
Change List::remote(const T&) to return a bool
Fix debug build due to old int_list usage within assert
Convert uses of loop_over_list to ranged-for loops
Remove loop_over_queue (as an example for later removing loop_over_list)
Change int_list in CCL.h to be a vector, fix uses of int_list to match
Remove List<> usage from strings.bif
Replace uses of the old Queue/PQueue generation code with new template versions
Convert BaseQueue/Queue/PQueue into templates, including iterator support
Replace uses of the old Dict generation code with new template versions
Convert PDict into template
Replace uses of the old List generation code with new template versions
Convert BaseList/List/PList into templates, including iterator support
* Generally squashed fixups from topic/timw/template-containers
* Add missing include file in List.h: <cassert>
Observed segfault accessing the local static std::map of zeekenv() from
a logging thread, but only in non-debug builds using Apple/Clang
compiler, not in a debug build or GCC. Don't quite get this behavior
since static local variable initialization is supposed to be thread-safe
since C++11, but moving to a global static works and is "more efficient"
anyway since there's no longer any run-time overhead.
* origin/topic/timw/150-to-json:
Update submodules for JSON work
Update unit tests for JSON logger to match new output
Modify JSON log writer to use the external JSON library
Update unit test output to match json.zeek being deprecated and slight format changes to JSON output
Add proper JSON serialization via C++, deprecate json.zeek
Add new method for escaping UTF8 strings for JSON output
Move do_sub method from zeek.bif to StringVal class method
Move record_fields method from zeek.bif to Val class method
Add ToStdString method for StringVal
For backward compatibility when reading values, we first check
the ZEEK-prefixed value, and if not set, then check the corresponding
BRO-prefixed value.
Most of these changes are either cmake-related or plugin-related.
Added a new test "plugins/legacy.zeek" to test that legacy Bro plugins
still work.
Also added a symlink bro-path-dev.in because some legacy Bro packages
won't install without it.
* origin/topic/robin/gh-239:
Undo a change to btest.cfg from a recent commit
Updating submodule.
Fix zeek-wrapper
Update for renaming BroControl to ZeekControl.
Updating submodule.
GH-239: Rename bro to zeek, bro-config to zeek-config, and bro-path-dev to zeek-path-dev.
This also installs symlinks from "zeek" and "bro-config" to a wrapper
script that prints a deprecation warning.
The btests pass, but this is still WIP. broctl renaming is still
missing.
#239
* is_valid_ip() is now implemented as a BIF instead of in
base/utils/addrs
* The IPv4 and IPv6 regular expressions provided by base/utils/addrs
have been improved/corrected (previously they could possibly match
some invalid IPv4 decimals, or various "zero compressed" IPv6 strings
with too many hextets)
* extract_ip_addresses() should give better results as a result of
the above two points
* 'master' of https://github.com/dnthayer/zeek:
Update tests and baselines due to renaming all scripts
Rename all scripts to have ".zeek" file extension
Update a few tests due to scripts with new file extension
Add test cases to verify new file extension is recognized
Fix the core/load-duplicates.bro test
Update script search logic for new file extension
Remove unnecessary ".bro" from @load directives
When searching for script files, look for both the new and old file
extensions. If a file with ".zeek" can't be found, then search for
a file with ".bro" as a fallback.
set_processing_status can be called before reporter is initialized or
after it is deleted. Work around by sending data to stderr instead.
Patch by Thomas Petersen.
Addig a new random seed for external tests.
I added a wrapper around the siphash() function to make calling it a
little bit safer at least.
BIT-1612 #merged
* origin/topic/johanna/bit-1612:
HLL: Fix missing typecast in test case.
Remove the -K/-J options for setting keys.
Add test checking the quality of HLL by adding a lot of elements.
Fix serializing probabilistic hashers.
Baseline updates after hash function change.
Also switch BloomFilters from H3 to siphash.
Change Hashing from H3 to Siphash.
HLL: Remove unnecessary comparison.
Hyperloglog: change calculation of Rho
The options were never really used and do not seem especially useful;
initialization with a seed file still works.
This also fixes a bug with the initialization of the siphash key.
This commit mostly changes the hash function that is used for Internal
hashing of data < 36 bytes from H3 to Siphash. This change is motivated
by the fact that it turns out that H3 apparently does not deliver a very
good source of data uniqueness; running HLL with H3 as a hashing
function results in quite poor results (up to of 75% off in my tests).
In difference, running HLL with Siphash (or HMAC-MD5) changes this
factor to ~2%.
This also fixes a long-standing bug in Hash.h which truncated our hash
values to 32 bit on most machines.
Furthermore, it once again fixes a problem with the Rank function in
HLL.
(Cleaned up some code a little bit.)
* origin/topic/seth/stats-improvement:
Fixing tests for stats improvements
Rename the reporting interval variable for stats.
Removing more broken functionality due to changed stats apis.
Removing some references to resource_usage()
Removing Broker stats, it was broken and incomplete.
Fixing default stats collection interval to every 5 minutes.
Add DNS stats to the stats.log
Small stats script tweaks and beginning broker stats.
Continued stats cleanup and extension.
More stats collection extensions.
More stats improvements
Slight change to Mach API for collecting memory usage.
Fixing some small mistakes.
Updating the cmake submodule for the stats updates.
Fix memory usage collection on Mac OS X.
Cleaned up stats collection.
BIT-1581 #merged
This was causing occasional problems with the time on processes
running lots of threads. The use of gmtime in the json
formatter is the likely culprit due to the fact that the
json formatter runs in threads. More evidence for this is that
the problem only appears to exhibit when logs are being written
as JSON.
More specifically, this removes the functions:
strcasecmp_n
strchr_n
strrchr_n
and replaces the calls with the respective C-library calls that should
be part of just about all operating systems by now.
Broke out the stats collection into a bunch of new Bifs
in stats.bif. Scripts that use stats collection functions
have also been updated. More work to do.
With this patch the model is:
- "print" cleans the data so that non-printable characters get
escaped. This is not necessarily reversible.
- to print in a reversible way, one can go through
escape_string(); this escapes backslashes as well to make the
decoding non-ambigious.
- Logging always escapes similar to escape_string(), making it
reversible.
Compared to master, we also change the escaping as follows:
- We now only escape with "\xXX", no more "^X" or "\0". Exception:
backslashes.
- We escape backlashes as "\\".
- There's no "alternative" output style anymore, i.e., fmt() '%A'
qualifier is gone.
Baselines in testing/btest are updated, external tests not yet.
Addresses BIT-1333.