Commit graph

237 commits

Author SHA1 Message Date
Tim Wojtulewicz
fa9a568e8f Remove #include of some iosource files from Net.h 2020-01-31 09:34:54 -07:00
Tim Wojtulewicz
f16f0360ff Only allow a single trace file (-r) or interface (-i) option on the command-line 2020-01-31 09:34:54 -07:00
Max Kellermann
26da10ca05 util: add a tokenize_string() overload which returns string_views
Additionally, it uses a single "char" as delimiter, which is also
faster.

This patch speeds up Zeek startup by 10%.
2020-01-31 13:46:45 +01:00
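
A minimal sketch of the idea in 26da10ca05, not Zeek's actual code: the function name `tokenize_string_view` and its exact signature are assumptions made for illustration. It splits on a single `char` and returns views into the caller's buffer, so no token is copied.

```cpp
#include <cstddef>
#include <string_view>
#include <vector>

// Hypothetical stand-in for the overload described above; the real
// signature in Zeek's util code may differ.
std::vector<std::string_view> tokenize_string_view(std::string_view input,
                                                   char delim)
{
    std::vector<std::string_view> tokens;
    std::size_t start = 0;

    while (true) {
        // Searching for a single char needs no substring matching.
        const std::size_t pos = input.find(delim, start);
        if (pos == std::string_view::npos) {
            // Last token: everything after the final delimiter.
            tokens.push_back(input.substr(start));
            break;
        }
        tokens.push_back(input.substr(start, pos - start));
        start = pos + 1;
    }
    return tokens;
}
```

The returned views are only valid while the buffer behind `input` is alive, which is the trade-off for avoiding the copies.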
Max Kellermann
763afe6f5f util: store std::string_view in "final_components" vector
Don't copy those path segments - instead, use std::string_view to
store references into the existing std::strings.  This saves a good
amount of allocation overhead.
2020-01-31 13:46:45 +01:00
Max Kellermann
37cbd98e34 util: use "auto" in normalize_path() 2020-01-31 13:46:45 +01:00
Max Kellermann
53c4e30024 util: reserve space in normalize_path()
Pessimistic reservations to ensure that it does not need to be
reallocated.
2020-01-31 13:46:45 +01:00
Max Kellermann
5c0c336c6b util: skip "." completely in normalize_path()
Don't copy "." segments to the final_components list only to remove it
afterwards.
2020-01-31 13:46:45 +01:00
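
The normalize_path() changes above (reserving space up front, storing `std::string_view` components, skipping "." immediately) can be sketched roughly as follows. This is an illustration under assumptions, not Zeek's implementation: the name `normalize_path_sketch` is hypothetical, and details such as the result for an empty path are choices made for this example only.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical sketch; Zeek's normalize_path() may behave differently.
std::string normalize_path_sketch(std::string_view path)
{
    std::vector<std::string_view> parts;
    // Pessimistic reservation: there cannot be more components than
    // separators plus one, so the vector never reallocates.
    parts.reserve(static_cast<std::size_t>(
        std::count(path.begin(), path.end(), '/')) + 1);

    const bool absolute = !path.empty() && path.front() == '/';
    std::size_t start = 0;

    while (start <= path.size()) {
        std::size_t end = path.find('/', start);
        if (end == std::string_view::npos)
            end = path.size();
        const std::string_view part = path.substr(start, end - start);
        start = end + 1;

        // Skip empty and "." segments instead of storing and removing them.
        if (part.empty() || part == ".")
            continue;

        if (part == "..") {
            if (!parts.empty() && parts.back() != "..") {
                parts.pop_back();  // ".." cancels the previous component
                continue;
            }
            if (absolute)
                continue;          // "/.." collapses to "/"
        }
        parts.push_back(part);     // a view into `path`, no copy
    }

    std::string result;
    result.reserve(path.size() + 1);
    if (absolute)
        result += '/';
    for (std::size_t i = 0; i < parts.size(); ++i) {
        if (i > 0)
            result += '/';
        result.append(parts[i].data(), parts[i].size());
    }
    if (result.empty())
        result = ".";              // assumption: empty result means "."
    return result;
}
```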
Max Kellermann
0589f295fa util: pass std::string_view to normalize_path()
Reduce overhead in some callers.
2020-01-31 13:46:44 +01:00
Max Kellermann
f1566bda14 util: pass std::string_view to tokenize_string()
This saves some overhead because some callers pass a plain C string
here which needed to be copied to a temporary std::string.
2020-01-31 13:46:42 +01:00
Max Kellermann
e068ad8a53 util: don't modify the input string in tokenize_string()
This saves one full copy of the input string and avoids moving memory
around at O(n^2) in the erase() call in each loop iteration.
2020-01-31 13:42:30 +01:00
Max Kellermann
0b3317b1c2 util: optimize expand_escape() by avoiding sscanf()
sscanf() is notoriously slow, and the default scripts have lots of hex
escapes.  This patch reduces Zeek's startup time by 9%.

Before:

            245.04 msec task-clock:u              #    1.002 CPUs utilized
                 0      context-switches:u        #    0.000 K/sec
                 0      cpu-migrations:u          #    0.000 K/sec
            16,411      page-faults:u             #    0.067 M/sec
       629,238,575      cycles:u                  #    2.568 GHz
     1,237,236,556      instructions:u            #    1.97  insn per cycle
       262,223,957      branches:u                # 1070.142 M/sec
         3,351,083      branch-misses:u           #    1.28% of all branches

After:

            220.99 msec task-clock:u              #    1.002 CPUs utilized
                 0      context-switches:u        #    0.000 K/sec
                 0      cpu-migrations:u          #    0.000 K/sec
            16,419      page-faults:u             #    0.074 M/sec
       544,603,653      cycles:u                  #    2.464 GHz
     1,065,862,324      instructions:u            #    1.96  insn per cycle
       229,207,957      branches:u                # 1037.181 M/sec
         3,045,270      branch-misses:u           #    1.33% of all branches
2020-01-31 10:32:37 +01:00
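
For illustration, hex escapes can be decoded without sscanf() along these lines. This is a sketch only: `hex_digit_value` and `parse_hex_escape` are hypothetical names, and the limit of two hex digits per escape is an assumption, not something the commit message states.

```cpp
#include <cstddef>
#include <string_view>

// Returns the value of one hex digit, or -1 if c is not a hex digit.
constexpr int hex_digit_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Hypothetical helper: parses up to two hex digits at the start of `s`
// (the text following "\x"). Stores the number of characters consumed in
// `consumed` and returns the decoded value, or -1 if no hex digit is found.
int parse_hex_escape(std::string_view s, std::size_t& consumed)
{
    int value = 0;
    consumed = 0;
    while (consumed < 2 && consumed < s.size()) {
        const int d = hex_digit_value(s[consumed]);
        if (d < 0)
            break;
        value = value * 16 + d;
        ++consumed;
    }
    return consumed == 0 ? -1 : value;
}
```

A hand-written loop like this avoids the format-string parsing that sscanf() performs on every call.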
Jon Siwek
70b45d1aba Merge remote-tracking branch 'origin/topic/robin/631-deprecation-v2'
During merge I split the test for bro_init/bro_done/bro_script_loaded
event errors into individual tests since the other testing of the zeek
versions of those events seemed fine to otherwise keep.

* origin/topic/robin/631-deprecation-v2:
  Update NEWS for naming changes.
  Small cleanup and updating submodules.
  Remove test for legacy plugin.
  Remove legacy symlinks in aux/.
  Add warnings when loading scripts ending in ".bro", or using legacy environment variables.
  Fix missing rename.
  No longer symlink local.zeek to local.bro.
  Update notice user agent.
  Remove old_comm_usage_is_ok.
  Remove bro-config.h.in and bro-path-dev.in.
  Change Bro wrapper script to now abort when old executable names are still used.
  Remove APIs that were explicitly deprecated to be removed in 3.1.
2020-01-30 19:19:56 -08:00
Max Kellermann
32bb019e3a util, nb_dns: fix off-by-one bugs in strncpy() calls
Fortunately, these bugs had no effect because the following lines
overwrote the last character with a null byte.
2020-01-29 20:22:16 +01:00
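
The commit message does not show the affected calls, so the following is only a sketch of the usual bounded-copy pattern, with a hypothetical helper name `copy_bounded`. The copy is limited to one byte less than the buffer size so that the terminator always has room.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper illustrating the bounded-copy pattern.
void copy_bounded(char* dst, std::size_t dst_size, const char* src)
{
    if (dst_size == 0)
        return;
    // Copy at most dst_size - 1 bytes so one byte remains for the terminator.
    std::strncpy(dst, src, dst_size - 1);
    // strncpy() does not terminate the string when src is too long.
    dst[dst_size - 1] = '\0';
}
```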
Max Kellermann
aacf84e552 Type, util: add constexpr to static variables
This allows the compiler to move them to section `.rodata`.
2020-01-29 20:22:16 +01:00
Robin Sommer
6bcd583836 Merge remote-tracking branch 'origin/topic/jsiwek/supervisor'
* origin/topic/jsiwek/supervisor: (44 commits)
  Add note that Supervisor script APIs are unstable until 4.0
  Move command-line arg parsing functions to Options.{h,cc}
  Add btests for supervisor stem/leaf process revival
  Move supervisor control events into SupervisorControl namespace
  Fix supervisor "destroy" call on nodes not currently alive
  Move supervisor source files into supervisor/
  Address supervisor code re-factoring feedback from Robin
  Convert supervisor internals to rapidjson
  Add Supervisor documentation
  Add supervisor btests
  Improve logging of supervised node errors
  Fix supervised node inheritance of command-line script paths
  Improve normalize_path() util function
  Use a timer to check for death of supervised node's parent
  Improve supervisor checks for parent process termination
  Improve handling of premature supervisor stem exit
  Improve supervisor signal handler safety
  Remove unused supervisor config options
  Cleanup minor Supervisor TODOs
  Improve supervisor debug logging
  ...
2020-01-29 13:11:04 +00:00
Robin Sommer
649301b667 Add warnings when loading scripts ending in ".bro", or using legacy environment variables. 2020-01-29 12:08:10 +00:00
Jon Siwek
83874fa5fa Merge branch 'getrandom' of https://github.com/MaxKellermann/zeek
- Removed the superfluous check for C++17 in the merge since that's
  a requirement enforced at the CMake-level.

* 'getrandom' of https://github.com/MaxKellermann/zeek:
  util: use getrandom() on Linux if available
2020-01-28 12:45:15 -08:00
Max Kellermann
cb4258434c util: use getrandom() on Linux if available
Unlike /dev/urandom, getrandom() doesn't need a file descriptor and
works when there is no /dev.  It requires Linux 3.17 and glibc 2.25,
but there is a fallback to the old code.

For simplicity, this patch uses __has_include() to detect the
availability of this API, but maybe we should move that to cmake.

(It might be useful to refactor the whole random gathering code to a
separate function.)
2020-01-28 11:45:25 +01:00
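
A rough sketch of the approach described in cb4258434c: detect `<sys/random.h>` with `__has_include()`, use getrandom() when it is present, and otherwise read from /dev/urandom. The helper name `fill_random` and the macro `SKETCH_HAVE_GETRANDOM` are assumptions for this example and do not come from the Zeek sources.

```cpp
#include <cstddef>
#include <cstdio>

#if defined(__linux__) && defined(__has_include)
#if __has_include(<sys/random.h>)
#include <sys/random.h>
#define SKETCH_HAVE_GETRANDOM 1
#endif
#endif

// Hypothetical helper: fills buf with len random bytes.
// Returns true on success.
bool fill_random(void* buf, std::size_t len)
{
#ifdef SKETCH_HAVE_GETRANDOM
    // getrandom() may return fewer bytes than requested, so loop.
    auto* p = static_cast<unsigned char*>(buf);
    std::size_t done = 0;
    while (done < len) {
        const ssize_t n = getrandom(p + done, len - done, 0);
        if (n < 0)
            break;  // fall through to the /dev/urandom path below
        done += static_cast<std::size_t>(n);
    }
    if (done == len)
        return true;
#endif
    // Fallback: needs a file descriptor and an existing /dev.
    std::FILE* f = std::fopen("/dev/urandom", "rb");
    if (!f)
        return false;
    const std::size_t got = std::fread(buf, 1, len, f);
    std::fclose(f);
    return got == len;
}
```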
Jon Siwek
9c0d252c2b Merge branch 'master' into topic/jsiwek/supervisor 2020-01-21 12:17:56 -08:00
Robin Sommer
8170baabef Merge remote-tracking branch 'origin/topic/timw/595-rapidjson'
Tweaks:
    - Small change to the logic for removing quotes around strings.
    - Updated NEWS & COPYING.3rdparty
    - Use of intrusive_ptr for stack-allocated StringVals
    - Little bit of refactoring (I would love to merge the two BuildJSON() functions, too, but that's a larger task)

* origin/topic/timw/595-rapidjson:
  Use the list of files from clang-tidy when searching for unit tests
  Optimize json_escape_utf8 a bit by removing repeated calls to string methods
  Expand unit test for json_escape_utf8 to include all of the strings from the ascii-json-utf8 btest
  GHI-595: Convert from nlohmann/json to rapidjson for performance reasons
  Convert type-checking macros to actual functions
2020-01-18 10:49:15 +00:00
Jon Siwek
38cd56a3db Improve normalize_path() util function
It didn't always properly handle ".." when the preceding path component
was also the first component.
2020-01-16 13:08:01 -08:00
Tim Wojtulewicz
23f551876c Optimize json_escape_utf8 a bit by removing repeated calls to string methods 2020-01-14 15:43:25 -07:00
Tim Wojtulewicz
ee0619f999 Expand unit test for json_escape_utf8 to include all of the strings from the ascii-json-utf8 btest 2020-01-14 15:43:25 -07:00
Jon Siwek
520c6e3ebf Merge branch 'master' into topic/jsiwek/supervisor 2020-01-13 10:27:34 -08:00
Jon Siwek
a4089bc659 Enable LeakSanitizer for unit tests run via doctest 2020-01-08 21:14:40 -08:00
Jon Siwek
6046da9993 Merge branch 'master' into topic/jsiwek/supervisor 2020-01-07 16:57:58 -08:00
Jon Siwek
a4fab5327a Merge remote-tracking branch 'origin/topic/timw/util-unit-tests'
* origin/topic/timw/util-unit-tests:
  fixup! Add unit tests to util.cc and module_util.cc
  Mark safe_snprintf and safe_vsnprintf as deprecated, remove uses of them
  Add unit tests to util.cc and module_util.cc
2020-01-06 09:44:43 -08:00
Tim Wojtulewicz
67fcc9b5af Mark safe_snprintf and safe_vsnprintf as deprecated, remove uses of them
safe_snprintf and safe_vsnprintf exist only to ensure that the resulting strings are always null-terminated. The documentation for snprintf/vsnprintf states that the output of those functions is always null-terminated, which makes the safe versions obsolete.
2020-01-02 15:36:39 -07:00
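
As a small illustration of the point made in 67fcc9b5af, snprintf() terminates its output whenever the buffer size is non-zero, and its return value tells the caller whether truncation happened. The helper name `format_truncated` is hypothetical.

```cpp
#include <cstddef>
#include <cstdio>

// Hypothetical helper: formats `text` into buf and reports truncation.
// For size > 0, buf is null-terminated when snprintf() returns.
bool format_truncated(char* buf, std::size_t size, const char* text)
{
    // snprintf() returns the length the full output would have had,
    // not counting the terminator, or a negative value on error.
    const int needed = std::snprintf(buf, size, "%s", text);
    return needed < 0 || static_cast<std::size_t>(needed) >= size;
}
```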
Tim Wojtulewicz
6a52857f8f Add unit tests to util.cc and module_util.cc 2020-01-02 15:36:39 -07:00
Jon Siwek
29f386e388 Implement minimal supervised cluster configuration
More aspects of the cluster configuration to get fleshed out later,
but a basic cluster like one would use for a live deployment
can now be instantiated and run under supervision.  The new
clusterized-pcap-processing supervisor mode is also not done yet.
2019-10-23 17:37:53 -07:00
Jon Siwek
4959d438fa Initial structure for supervisor-mode
The full process hierarchy isn't set up yet, but these changes
help prepare by doing two things:

- Add a -j option to enable supervisor-mode.  Currently, just a single
  "stem" process gets forked early on to be used as the basis for
  further forking into real cluster nodes.

- Separates the parsing of command-line options from their consumption.
  i.e. need to parse whether we're in -j supervisor-mode before
  modifying any global state since that would taint the "stem" process.
  The new intermediate structure containing the parsed options may
  also serve as a way to pass configuration info from "stem" to its
  descendent cluster node processes.
2019-09-27 19:17:58 -07:00
Jon Siwek
bc18ca44e6 Fix Xcode deprecation warning for std::ptr_fun
Replaced logic in strstrip() with a lambda to avoid deprecations:

- std::ptr_fun is deprecated in C++11, removed in C++17
- std::not1 is deprecated in C++17, removed in C++20
2019-09-26 09:45:44 -07:00
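
The replacement of `std::not1(std::ptr_fun(isspace))` with a lambda might look roughly like this. It is a sketch, not the actual strstrip() body, and the name `strip_sketch` is made up for the example.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical sketch of whitespace stripping with a lambda predicate.
std::string strip_sketch(std::string s)
{
    // Taking unsigned char avoids undefined behavior when a negative
    // char value would otherwise be passed to std::isspace().
    auto not_space = [](unsigned char c) { return !std::isspace(c); };

    // Trim trailing whitespace, then leading whitespace.
    s.erase(std::find_if(s.rbegin(), s.rend(), not_space).base(), s.end());
    s.erase(s.begin(), std::find_if(s.begin(), s.end(), not_space));
    return s;
}
```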
Tim Wojtulewicz
54752ef9a1 Deprecate the internal int/uint types in favor of the cstdint types they were based on 2019-08-12 13:50:07 -07:00
Johanna Amann
486bf1e713 Merge remote-tracking branch 'origin/topic/timw/cleaner-utf8'
* origin/topic/timw/cleaner-utf8:
  GHI-486: Switch over to using LLVM utf8-checking code to better validate characters

I addressed a buffer over-read during the merge and added test-cases for
it.
2019-07-29 09:25:25 -07:00
Tim Wojtulewicz
ad19f1e1bb GHI-486: Switch over to using LLVM utf8-checking code to better validate characters 2019-07-24 10:58:00 -07:00
Jon Siwek
8c45937798 Merge branch 'topic/jsiwek/template-containers-merge'
* topic/jsiwek/template-containers-merge:
  Fix a potential usage of List::remove_nth(-1)
  Change List::remove(const T&) to return a bool
  Fix debug build due to old int_list usage within assert
  Convert uses of loop_over_list to ranged-for loops
  Remove loop_over_queue (as an example for later removing loop_over_list)
  Change int_list in CCL.h to be a vector, fix uses of int_list to match
  Remove List<> usage from strings.bif
  Replace uses of the old Queue/PQueue generation code with new template versions
  Convert BaseQueue/Queue/PQueue into templates, including iterator support
  Replace uses of the old Dict generation code with new template versions
  Convert PDict into template
  Replace uses of the old List generation code with new template versions
  Convert BaseList/List/PList into templates, including iterator support

* Generally squashed fixups from topic/timw/template-containers

* Add missing include file in List.h: <cassert>
2019-07-15 19:51:27 -07:00
Tim Wojtulewicz
e51f02737b Convert uses of loop_over_list to ranged-for loops 2019-07-15 19:00:24 -07:00
Johanna Amann
418ab0e33a Merge remote-tracking branch 'origin/topic/jsiwek/zeekenv-static-local-fix'
* origin/topic/jsiwek/zeekenv-static-local-fix:
  Fix potential thread safety issue with zeekenv util function
2019-07-11 13:30:50 -07:00
Jon Siwek
cb292af84d Fix a sign-compare compiler warning 2019-07-11 12:14:27 -07:00
Jon Siwek
9a72a7117d Fix potential thread safety issue with zeekenv util function
Observed segfault accessing the local static std::map of zeekenv() from
a logging thread, but only in non-debug builds using Apple/Clang
compiler, not in a debug build or GCC.  Don't quite get this behavior
since static local variable initialization is supposed to be thread-safe
since C++11, but moving to a global static works and is "more efficient"
anyway since there's no longer any run-time overhead.
2019-07-11 11:41:50 -07:00
Johanna Amann
1f329ad541 Merge remote-tracking branch 'origin/topic/timw/150-to-json'
* origin/topic/timw/150-to-json:
  Update submodules for JSON work
  Update unit tests for JSON logger to match new output
  Modify JSON log writer to use the external JSON library
  Update unit test output to match json.zeek being deprecated and slight format changes to JSON output
  Add proper JSON serialization via C++, deprecate json.zeek
  Add new method for escaping UTF8 strings for JSON output
  Move do_sub method from zeek.bif to StringVal class method
  Move record_fields method from zeek.bif to Val class method
  Add ToStdString method for StringVal
2019-07-11 11:17:32 -07:00
Tim Wojtulewicz
385de9b0e7 Add new method for escaping UTF8 strings for JSON output 2019-07-02 12:52:26 -07:00
Tim Wojtulewicz
965a99a781 Fix potential null-dereference in current_time() 2019-06-12 14:46:29 -07:00
Jon Siwek
7f0fb49612 Add an internal getenv wrapper function: zeekenv
It maps newer environment variable names starting with ZEEK to the
legacy names starting with BRO.
2019-05-23 20:42:42 -07:00
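
A getenv wrapper of the kind described in 7f0fb49612 could be sketched as below. The function name `env_with_legacy_fallback`, the table name, and the two name pairs listed are illustrative assumptions, not the actual table used by zeekenv(). The lookup order (new name first, legacy name second) is likewise an assumption of this sketch.

```cpp
#include <cstdlib>
#include <map>
#include <string>

// Illustrative subset of new-name to legacy-name pairs (assumed, not the
// real table). Defined at file scope rather than as a function-local static.
static const std::map<std::string, std::string> legacy_env_names = {
    {"ZEEKPATH", "BROPATH"},
    {"ZEEK_PLUGIN_PATH", "BRO_PLUGIN_PATH"},
};

// Hypothetical wrapper: returns the value of `name`, or of its legacy
// BRO-prefixed counterpart if `name` is unset, or nullptr if neither is set.
const char* env_with_legacy_fallback(const char* name)
{
    if (const char* value = std::getenv(name))
        return value;

    const auto it = legacy_env_names.find(name);
    if (it == legacy_env_names.end())
        return nullptr;

    return std::getenv(it->second.c_str());
}
```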
Daniel Thayer
1a74516db1 Rename all BRO-prefixed environment variables
For backward compatibility when reading values, we first check
the ZEEK-prefixed value, and if not set, then check the corresponding
BRO-prefixed value.
2019-05-22 00:12:31 -05:00
Daniel Thayer
fe3d508796 Additional Bro to Zeek renaming
Most of these changes are either cmake-related or plugin-related.
Added a new test "plugins/legacy.zeek" to test that legacy Bro plugins
still work.

Also added a symlink bro-path-dev.in because some legacy Bro packages
won't install without it.
2019-05-19 16:51:36 -05:00
Jon Siwek
6ad7099f7e Merge remote-tracking branch 'origin/topic/robin/gh-239'
* origin/topic/robin/gh-239:
  Undo a change to btest.cfg from a recent commit
  Updating submodule.
  Fix zeek-wrapper
  Update for renaming BroControl to ZeekControl.
  Updating submodule.
  GH-239: Rename bro to zeek, bro-config to zeek-config, and bro-path-dev to zeek-path-dev.
2019-05-14 13:27:40 -07:00
Robin Sommer
789cb376fd GH-239: Rename bro to zeek, bro-config to zeek-config, and bro-path-dev to zeek-path-dev.
This also installs symlinks from "bro" and "bro-config" to a wrapper
script that prints a deprecation warning.

The btests pass, but this is still WIP. broctl renaming is still
missing.

#239
2019-05-01 21:43:45 +00:00
Jon Siwek
7144661930 GH-340: Improve IPv4/IPv6 regexes, extraction, and validity functions
* is_valid_ip() is now implemented as a BIF instead of in
  base/utils/addrs

* The IPv4 and IPv6 regular expressions provided by base/utils/addrs
  have been improved/corrected (previously they could possibly match
  some invalid IPv4 decimals, or various "zero compressed" IPv6 strings
  with too many hextets)

* extract_ip_addresses() should give better results as a result of
  the above two points
2019-04-18 19:04:39 -07:00
Jon Siwek
f21e11d811 GH-237: add @load foo.bro -> foo.zeek fallback
When failing to locate a script with explicit .bro suffix, check for
whether one with a .zeek suffix exists and use it instead.
2019-04-16 17:49:37 -07:00
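
The fallback described in f21e11d811 can be sketched as follows. The function name `resolve_script` and the injected `exists` callback are assumptions made so the example stays self-contained. The real lookup searches Zeek's script path and is not shown in the commit message.

```cpp
#include <functional>
#include <optional>
#include <string>

// Hypothetical sketch: returns the path to load, or nullopt if nothing
// was found. `exists` stands in for whatever file lookup is actually used.
std::optional<std::string> resolve_script(
    const std::string& path,
    const std::function<bool(const std::string&)>& exists)
{
    if (exists(path))
        return path;

    const std::string old_suffix = ".bro";
    const bool has_old_suffix =
        path.size() >= old_suffix.size() &&
        path.compare(path.size() - old_suffix.size(), old_suffix.size(),
                     old_suffix) == 0;

    if (has_old_suffix) {
        // Swap the explicit ".bro" suffix for ".zeek" and try again.
        std::string candidate =
            path.substr(0, path.size() - old_suffix.size()) + ".zeek";
        if (exists(candidate))
            return candidate;
    }
    return std::nullopt;
}
```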