Using an overload that takes a va_list argument potentially causes
accidental misuse on platforms (e.g. 32-bit) where va_list is
implemented as a type that may collide with commonly-used argument
types.
For example:
char* c = copy_string("hi");
fmt("%s", (const char*)c);
fmt("%s", c);
The first fmt() call correctly goes through fmt(const char*, ...) first,
but the second mistakenly goes through fmt(const char*, va_list) first
because variadic function overloads have lower priority during overload
resolution and va_list on a 32-bit system happens to be defined as a
pointer type that can match with "char*" but not "const char*".
The Zeek code base has very inconsistent #includes. Many sources
included a few headers, and those headers included other headers, and
in the end, nearly everything is included everywhere, so missing
#includes were never noticed. Another side effect was a lot of header
bloat which slows down the build.
First step to fix it: in each source file, its own header should be
included first to verify that each header's includes are correct, and
none is missing.
After adding the missing #includes, I replaced lots of #includes
inside headers with class forward declarations. In most headers,
object pointers are never referenced, so declaring the function
prototypes with forward-declared classes is just fine.
This patch speeds up the build by 19%, because each compilation unit
gets smaller. Here are the "time" numbers for a fresh build (with a
warm page cache but without ccache):
Before this patch:
3144.94user 161.63system 3:02.87elapsed 1808%CPU (0avgtext+0avgdata 2168608maxresident)k
760inputs+12008400outputs (1511major+57747204minor)pagefaults 0swaps
After this patch:
2565.17user 141.83system 2:25.46elapsed 1860%CPU (0avgtext+0avgdata 1489076maxresident)k
72576inputs+9130920outputs (1667major+49400430minor)pagefaults 0swaps
- Minor changes in merge: extended unit test, prefer emplace_back(),
remove unused "found" count in new function
* 'optimize_normalize_path' of https://github.com/MaxKellermann/zeek:
util: add a tokenize_string() overload which returns string_views
util: store std::string_view in "final_components" vector
util: use "auto" in normalize_path()
util: reserve space in normalize_path()
util: skip "." completely in normalize_path()
util: pass std::string_view to normalize_path()
util: pass std::string_view to tokenize_string()
util: don't modify the input string in tokenize_string()
Don't copy those path segments - instead, use std::string_view to
store references into the existing std::strings. This saves a good
amount of allocation overhead.
During merge I split the test for bro_init/bro_done/bro_script_loaded
event errors into individual tests since the other testing of the zeek
versions of those events seemed fine to otherwise keep.
* origin/topic/robin/631-deprecation-v2:
Update NEWS for naming changes.
Small cleanup and updating submodules.
Remove test for legacy plugin.
Remove legancy symlinks in aux/.
Add warnings when loading scripts ending in ".bro", or using legacy environment variables.
Fix missing rename.
No longer symlink local.zeek to local.bro.
Update notice user agent.
Remove old_comm_usage_is_ok.
Remove bro-config.h.in and bro-path-dev.in.
Change Bro wrapper script to now abort when old executable names are still used.
Remove APIs that were explicitly deprecated to be removed in 3.1.
* origin/topic/jsiwek/supervisor: (44 commits)
Add note that Supervisor script APIs are unstable until 4.0
Move command-line arg parsing functions to Options.{h,cc}
Add btests for supervisor stem/leaf process revival
Move supervisor control events into SupervisorControl namespace
Fix supervisor "destroy" call on nodes not currently alive
Move supervisor source files into supervisor/
Address supervisor code re-factoring feedback from Robin
Convert supervisor internals to rapidjson
Add Supervisor documentation
Add supervisor btests
Improve logging of supervised node errors
Fix supervised node inheritence of command-line script paths
Improve normalize_path() util function
Use a timer to check for death of supervised node's parent
Improve supervisor checks for parent process termination
Improve handling of premature supervisor stem exit
Improve supervisor signal handler safety
Remove unused supervisor config options
Cleanup minor Supervisor TODOs
Improve supervisor debug logging
...
- Removed the superfluous check for C++17 in the merge since that's
a requirement enforced at the CMake-level.
* 'getrandom' of https://github.com/MaxKellermann/zeek:
util: use getrandom() on Linux if available
Unlike /dev/urandom, getrandom() doesn't need a file descriptor and
works when there is no /dev. It requires Linux 3.17 and glibc 2.25,
but there is a fallback to the old code.
For simplicity, this patch uses __has_include() to detect the
availability of this API, but maybe we should move that to cmake.
(It might be useful to refactor the whole random gathering code to a
separate function.)
Tweaks:
- Small change to the logic for removing quotes around strings.
- Updated NEWS & COPYING.3rdparty
- Use of intrusive_ptr for stack-allocated StringVals
- Little bit of refactoring (I would love to merge the two BuildJSON() functions, too, but that's a larger task)
* origin/topic/timw/595-rapidjson:
Use the list of files from clang-tidy when searching for unit tests
Optimize json_escape_utf8 a bit by removing repeated calls to string methods
Expand unit test for json_escape_utf8 to include all of the strings from the ascii-json-utf8 btest
GHI-595: Convert from nlohmann/json to rapidjson for performance reasons
Convert type-checking macros to actual functions
* origin/topic/timw/util-unit-tests:
fixup! Add unit tests to util.cc and module_util.cc
Mark safe_snprintf and safe_vsnprintf as deprecated, remove uses of them
Add unit tests to util.cc and module_util.cc
safe_snprintf and safe_vsnprintf just exist to ensure that the resulting strings are always null-terminated. The documentation for snprintf/vsnprintf states that the output of those methods are always null-terminated, thus making the safe versions obsolete.
More aspects of the cluster configuration to get fleshed out later,
but a basic cluster like one would use for a live deployment
can now be instantiated and run under supervision. The new
clusterized-pcap-processing supervisor mode is also not done yet.
The full process hierarchy isn't set up yet, but these changes
help prepare by doing two things:
- Add a -j option to enable supervisor-mode. Currently, just a single
"stem" process gets forked early on to be used as the basis for
further forking into real cluster nodes.
- Separates the parsing of command-line options from their consumption.
i.e. need to parse whether we're in -j supervisor-mode before
modifying any global state since that would taint the "stem" process.
The new intermediate structure containing the parsed options may
also serve as a way to pass configuration info from "stem" to its
descendent cluster node processes.
Replaced logic in strstrip() with a lambda to avoid deprecations:
- std::ptr_fun is deprecated in C++11, removed C++17
- std::not1 is deprecated in C++17. removed C++20
* origin/topic/timw/cleaner-utf8:
GHI-486: Switch over to using LLVM utf8-checking code to better validate characters
I addressed a buffer over-read during the merge and added test-cases for
it.
* topic/jsiwek/template-containers-merge:
Fix a potential usage of List::remove_nth(-1)
Change List::remote(const T&) to return a bool
Fix debug build due to old int_list usage within assert
Convert uses of loop_over_list to ranged-for loops
Remove loop_over_queue (as an example for later removing loop_over_list)
Change int_list in CCL.h to be a vector, fix uses of int_list to match
Remove List<> usage from strings.bif
Replace uses of the old Queue/PQueue generation code with new template versions
Convert BaseQueue/Queue/PQueue into templates, including iterator support
Replace uses of the old Dict generation code with new template versions
Convert PDict into template
Replace uses of the old List generation code with new template versions
Convert BaseList/List/PList into templates, including iterator support
* Generally squashed fixups from topic/timw/template-containers
* Add missing include file in List.h: <cassert>
Observed segfault accessing the local static std::map of zeekenv() from
a logging thread, but only in non-debug builds using Apple/Clang
compiler, not in a debug build or GCC. Don't quite get this behavior
since static local variable initialization is supposed to be thread-safe
since C++11, but moving to a global static works and is "more efficient"
anyway since there's no longer any run-time overhead.
* origin/topic/timw/150-to-json:
Update submodules for JSON work
Update unit tests for JSON logger to match new output
Modify JSON log writer to use the external JSON library
Update unit test output to match json.zeek being deprecated and slight format changes to JSON output
Add proper JSON serialization via C++, deprecate json.zeek
Add new method for escaping UTF8 strings for JSON output
Move do_sub method from zeek.bif to StringVal class method
Move record_fields method from zeek.bif to Val class method
Add ToStdString method for StringVal