Commit graph

6667 commits

Author SHA1 Message Date
Max Kellermann
53c4e30024 util: reserve space in normalize_path()
Pessimistic reservations to ensure that it does not need to be
reallocated.
2020-01-31 13:46:45 +01:00
Max Kellermann
5c0c336c6b util: skip "." completely in normalize_path()
Don't copy "." segments to the final_components list only to remove it
afterwards.
2020-01-31 13:46:45 +01:00
Max Kellermann
0589f295fa util: pass std::string_view to normalize_path()
Reduce overhead in some callers.
2020-01-31 13:46:44 +01:00
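The three normalize_path() commits above (pessimistic reservation, skipping "." up front, taking a std::string_view) can be illustrated with a minimal sketch. This is not Zeek's implementation: the name normalize_path_sketch and the handling of "..", empty components, and absolute paths are assumptions made for illustration only.

```cpp
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical sketch, NOT Zeek's normalize_path().
std::string normalize_path_sketch(std::string_view path)
{
    const bool absolute = !path.empty() && path.front() == '/';

    std::vector<std::string_view> kept;
    // Pessimistic reservation: a path cannot hold more than one
    // component per two characters ("a/a/a"), so no reallocation occurs.
    kept.reserve(path.size() / 2 + 1);

    std::size_t pos = 0;
    while (pos <= path.size()) {
        std::size_t next = path.find('/', pos);
        if (next == std::string_view::npos)
            next = path.size();
        const std::string_view part = path.substr(pos, next - pos);
        pos = next + 1;

        if (part.empty() || part == ".")
            continue;  // skipped immediately, never stored and removed later

        if (part == "..") {
            if (!kept.empty() && kept.back() != "..") {
                kept.pop_back();  // ".." cancels the preceding component
                continue;
            }
            if (absolute)
                continue;  // assumption: "/.." collapses to "/"
        }
        kept.push_back(part);
    }

    std::string out;
    out.reserve(path.size() + 1);  // the result is never longer than the input
    if (absolute)
        out += '/';
    for (std::size_t i = 0; i < kept.size(); ++i) {
        if (i != 0)
            out += '/';
        out += kept[i];
    }
    return out;
}
```

Because the components are views into the caller's buffer, the only allocations are the two reserved containers. A caller holding a plain C string can pass it directly without building a temporary std::string.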
Max Kellermann
f1566bda14 util: pass std::string_view to tokenize_string()
This saves some overhead because some callers pass a plain C string
here which needed to be copied to a temporary std::string.
2020-01-31 13:46:42 +01:00
Max Kellermann
e068ad8a53 util: don't modify the input string in tokenize_string()
This saves one full copy of the input string and avoids moving memory
around at O(n^2) in the erase() call in each loop iteration.
2020-01-31 13:42:30 +01:00
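The idea behind the two tokenize_string() commits above can be sketched as follows: scan a read-only view with a moving offset instead of repeatedly calling erase() on a copy of the input. The function name tokenize_sketch and its single-character delimiter are assumptions; the real function's signature is not shown in this log.

```cpp
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical sketch, NOT Zeek's tokenize_string().
std::vector<std::string> tokenize_sketch(std::string_view input, char delim)
{
    std::vector<std::string> tokens;
    std::size_t pos = 0;

    while (true) {
        const std::size_t next = input.find(delim, pos);
        if (next == std::string_view::npos) {
            // Last token: everything after the final delimiter.
            tokens.emplace_back(input.substr(pos));
            break;
        }
        tokens.emplace_back(input.substr(pos, next - pos));
        pos = next + 1;  // advance the offset; the input itself is untouched
    }
    return tokens;
}
```

Each character is visited once, so the scan is linear in the input length, whereas erasing the consumed prefix in every iteration shifts the remaining bytes each time.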
Arne Welzel
800c0c7132 parse.y: Properly set location info for functions
When defining a function, remember the location where the function header
was and restore it before calling `end_func()`. Inside `end_func()`, a
`BroFunc` object is created using the current global location information.

This came up while experimenting with zeek script profiling and wondering
why the locations set for `BroFunc` were "somewhere" in the middle of
functions instead of spanning them.
2020-01-31 10:47:20 +01:00
Max Kellermann
0b3317b1c2 util: optimize expand_escape() by avoiding sscanf()
sscanf() is notoriously slow, and the default scripts have lots of hex
escapes.  This patch reduces Zeek's startup time by 9%.

Before:

            245.04 msec task-clock:u              #    1.002 CPUs utilized
                 0      context-switches:u        #    0.000 K/sec
                 0      cpu-migrations:u          #    0.000 K/sec
            16,411      page-faults:u             #    0.067 M/sec
       629,238,575      cycles:u                  #    2.568 GHz
     1,237,236,556      instructions:u            #    1.97  insn per cycle
       262,223,957      branches:u                # 1070.142 M/sec
         3,351,083      branch-misses:u           #    1.28% of all branches

After:

            220.99 msec task-clock:u              #    1.002 CPUs utilized
                 0      context-switches:u        #    0.000 K/sec
                 0      cpu-migrations:u          #    0.000 K/sec
            16,419      page-faults:u             #    0.074 M/sec
       544,603,653      cycles:u                  #    2.464 GHz
     1,065,862,324      instructions:u            #    1.96  insn per cycle
       229,207,957      branches:u                # 1037.181 M/sec
         3,045,270      branch-misses:u           #    1.33% of all branches
2020-01-31 10:32:37 +01:00
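A hand-rolled hex parser of the kind that can replace sscanf() for escapes might look like the sketch below. The helper names hex_digit_value and parse_hex_prefix are hypothetical, and the actual patch to expand_escape() may be structured differently.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical helper: value of one hex digit, or -1 if c is not a hex digit.
constexpr int hex_digit_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Hypothetical helper: parse at most max_digits hex digits from the start
// of s. Stores the number of digits read in `consumed` (0 if none).
unsigned parse_hex_prefix(std::string_view s, std::size_t max_digits,
                          std::size_t& consumed)
{
    unsigned value = 0;
    consumed = 0;
    while (consumed < s.size() && consumed < max_digits) {
        const int d = hex_digit_value(s[consumed]);
        if (d < 0)
            break;  // stop at the first non-hex character
        value = value * 16 + static_cast<unsigned>(d);
        ++consumed;
    }
    return value;
}
```

Unlike sscanf(), this involves no format-string interpretation or locale handling, only a few comparisons per digit.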
Jon Siwek
70b45d1aba Merge remote-tracking branch 'origin/topic/robin/631-deprecation-v2'
During merge I split the test for bro_init/bro_done/bro_script_loaded
event errors into individual tests since the other testing of the zeek
versions of those events seemed fine to otherwise keep.

* origin/topic/robin/631-deprecation-v2:
  Update NEWS for naming changes.
  Small cleanup and updating submodules.
  Remove test for legacy plugin.
  Remove legacy symlinks in aux/.
  Add warnings when loading scripts ending in ".bro", or using legacy environment variables.
  Fix missing rename.
  No longer symlink local.zeek to local.bro.
  Update notice user agent.
  Remove old_comm_usage_is_ok.
  Remove bro-config.h.in and bro-path-dev.in.
  Change Bro wrapper script to now abort when old executable names are still used.
  Remove APIs that were explicitly deprecated to be removed in 3.1.
2020-01-30 19:19:56 -08:00
Johanna Amann
0e00113f0e Merge remote-tracking branch 'origin/master' into topic/johanna/table-changes 2020-01-30 15:16:25 -08:00
Max Kellermann
662c3aab58 Desc: move realloc() call out of the loop 2020-01-30 19:53:23 +01:00
Max Kellermann
61c3be8e16 SerializationFormat: move realloc() call out of the loop
Reallocate and copy the data only once.
2020-01-30 19:53:23 +01:00
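Moving realloc() out of a growth loop, as in the two commits above, generally means computing the final capacity arithmetically and then reallocating once. The following is a sketch under that assumption; ensure_capacity and its doubling policy are hypothetical and not taken from Desc or SerializationFormat.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical sketch: grow buf so it can hold at least `needed` bytes.
bool ensure_capacity(char*& buf, std::size_t& capacity, std::size_t needed)
{
    if (needed <= capacity)
        return true;

    // The loop only does arithmetic; no memory is touched here.
    std::size_t new_capacity = capacity != 0 ? capacity : 16;
    while (new_capacity < needed) {
        if (new_capacity > SIZE_MAX / 2) {
            new_capacity = needed;  // avoid overflow when doubling
            break;
        }
        new_capacity *= 2;
    }

    // A single realloc(): the data is copied at most once.
    void* p = std::realloc(buf, new_capacity);
    if (p == nullptr)
        return false;  // buf is still valid and unchanged

    buf = static_cast<char*>(p);
    capacity = new_capacity;
    return true;
}
```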
Max Kellermann
ef15467757 PacketDumper: remove unused types 2020-01-30 19:53:23 +01:00
Jon Siwek
948cc32844 Fix leaked FDs in redirecting supervised node stdout/stderr 2020-01-29 16:05:39 -08:00
Jon Siwek
fd2c6c56a5 Add checks for failed fcntl calls 2020-01-29 16:04:46 -08:00
Jon Siwek
aac7f6e8f2 Set Pipe file descriptor flags correctly 2020-01-29 16:03:12 -08:00
Jon Siwek
f3e5728bcb Merge branch 'leaks' of https://github.com/MaxKellermann/zeek
* 'leaks' of https://github.com/MaxKellermann/zeek:
  Scope: fix memory leak by removing duplicate copy_string() call
  util, nb_dns: fix off-by-one bugs in strncpy() calls
  Type, util: add `constexpr` to static variables
  Net: remove unused variable
2020-01-29 11:50:09 -08:00
Max Kellermann
a458827292 Scope: fix memory leak by removing duplicate copy_string() call
The `ID` constructor also calls copy_string().
2020-01-29 20:22:16 +01:00
Max Kellermann
32bb019e3a util, nb_dns: fix off-by-one bugs in strncpy() calls
Fortunately, these bugs had no effect because the following lines
overwrote the last character with a null byte.
2020-01-29 20:22:16 +01:00
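The log does not show the affected call sites, so the following is only a generic sketch of the usual strncpy() pattern in which the length and the terminator agree. The helper copy_truncated is hypothetical.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: copy src into dst, truncating if necessary,
// and always leave dst null-terminated.
void copy_truncated(char* dst, std::size_t dst_size, const char* src)
{
    if (dst_size == 0)
        return;
    // Copy at most dst_size - 1 bytes so the last byte stays free
    // for the terminator; strncpy() does not add one on truncation.
    std::strncpy(dst, src, dst_size - 1);
    dst[dst_size - 1] = '\0';
}
```

Passing dst_size instead of dst_size - 1 would still produce the same final buffer here, because the last byte is overwritten by the terminator afterwards, which matches the commit's remark that the bugs had no visible effect.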
Max Kellermann
aacf84e552 Type, util: add constexpr to static variables
This allows the compiler to move them to section `.rodata`.
2020-01-29 20:22:16 +01:00
Max Kellermann
2694b4e2c8 Net: remove unused variable 2020-01-29 20:22:16 +01:00
Johanna Amann
ad18014bed Merge remote-tracking branch 'origin/topic/jsiwek/ssl-empty-files'
* origin/topic/jsiwek/ssl-empty-files:
  Skip file analysis for zero-length SSL/TLS data
2020-01-29 11:16:35 -08:00
Robin Sommer
6bcd583836 Merge remote-tracking branch 'origin/topic/jsiwek/supervisor'
* origin/topic/jsiwek/supervisor: (44 commits)
  Add note that Supervisor script APIs are unstable until 4.0
  Move command-line arg parsing functions to Options.{h,cc}
  Add btests for supervisor stem/leaf process revival
  Move supervisor control events into SupervisorControl namespace
  Fix supervisor "destroy" call on nodes not currently alive
  Move supervisor source files into supervisor/
  Address supervisor code re-factoring feedback from Robin
  Convert supervisor internals to rapidjson
  Add Supervisor documentation
  Add supervisor btests
  Improve logging of supervised node errors
  Fix supervised node inheritance of command-line script paths
  Improve normalize_path() util function
  Use a timer to check for death of supervised node's parent
  Improve supervisor checks for parent process termination
  Improve handling of premature supervisor stem exit
  Improve supervisor signal handler safety
  Remove unused supervisor config options
  Cleanup minor Supervisor TODOs
  Improve supervisor debug logging
  ...
2020-01-29 13:11:04 +00:00
Robin Sommer
649301b667 Add warnings when loading scripts ending in ".bro", or using legacy environment variables. 2020-01-29 12:08:10 +00:00
Robin Sommer
bbc308cb02 Fix missing rename. 2020-01-29 12:08:10 +00:00
Robin Sommer
d0b206fa36 Remove APIs that were explicitly deprecated to be removed in 3.1.
Special handling for bro_{init,done,script_loaded} events: if still
used, they cause Zeek to abort at startup.
2020-01-29 12:08:09 +00:00
Jon Siwek
83874fa5fa Merge branch 'getrandom' of https://github.com/MaxKellermann/zeek
- Removed the superfluous check for C++17 in the merge since that's
  a requirement enforced at the CMake-level.

* 'getrandom' of https://github.com/MaxKellermann/zeek:
  util: use getrandom() on Linux if available
2020-01-28 12:45:15 -08:00
Max Kellermann
cb4258434c util: use getrandom() on Linux if available
Unlike /dev/urandom, getrandom() doesn't need a file descriptor and
works when there is no /dev.  It requires Linux 3.17 and glibc 2.25,
but there is a fallback to the old code.

For simplicity, this patch uses __has_include() to detect the
availability of this API, but maybe we should move that to cmake.

(It might be useful to refactor the whole random gathering code to a
separate function.)
2020-01-28 11:45:25 +01:00
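The detection-plus-fallback approach described above can be sketched as follows. The macro SKETCH_HAVE_GETRANDOM and the function fill_random are hypothetical names, and the extra __linux__ check is an assumption added because other platforms ship a <sys/random.h> without getrandom().

```cpp
#include <cstddef>
#include <cstdio>

// __has_include is standard in C++17.
#if defined(__linux__) && __has_include(<sys/random.h>)
#include <sys/random.h>
#define SKETCH_HAVE_GETRANDOM 1
#endif

// Hypothetical sketch: fill buf with len random bytes.
bool fill_random(void* buf, std::size_t len)
{
#ifdef SKETCH_HAVE_GETRANDOM
    // No file descriptor needed, and it works without /dev.
    // A short read or an error (e.g. ENOSYS on kernels older
    // than 3.17) falls through to the old path below.
    if (getrandom(buf, len, 0) == static_cast<ssize_t>(len))
        return true;
#endif
    // Fallback: read from /dev/urandom.
    std::FILE* f = std::fopen("/dev/urandom", "rb");
    if (f == nullptr)
        return false;
    const std::size_t n = std::fread(buf, 1, len, f);
    std::fclose(f);
    return n == len;
}
```

The header check only proves that the C library offers the wrapper, so the runtime fallback is still required when the running kernel lacks the system call.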
Jon Siwek
069eedb736 Improve kerberos analyzer address and event handling
Adds a weird, "invalid_kerberos_addr_len", for invalid kerberos host
address lengths and also fixes a memory leak when processing KRB_KDC_REQ
and KRB_KDC_REP messages for message types that do not match a
known/expected type.
2020-01-27 17:24:49 -08:00
Jon Siwek
53363a9bd3 Move command-line arg parsing functions to Options.{h,cc} 2020-01-27 13:50:44 -08:00
Jon Siwek
5fb01caee6 Add btests for supervisor stem/leaf process revival 2020-01-27 10:58:40 -08:00
Johanna Amann
68f0fe9e8c Automatic bro table->brokerstore insert operations
We now have an &broker_store attribute which automatically sends
inserts/deletes into a set/table to broker.

This might work - I actually did not test if the data ends up in the
broker store in the end. A limitation is that the table/set currently
only can have a one-element type since Broker doesn't support the list
type.
2020-01-23 13:13:10 -08:00
Johanna Amann
c306fcf3d7 Make bro_broker::val_to_data take a const Val* instead of a Val 2020-01-23 12:15:38 -08:00
Robin Sommer
01b7db5b46 Merge remote-tracking branch 'origin/topic/jsiwek/smb-transaction-strings'
* origin/topic/jsiwek/smb-transaction-strings:
  Improve creation of SMB transaction data strings
2020-01-23 13:19:11 +00:00
Jon Siwek
fce4bb3f50 Improve FTP word/whitespace handling 2020-01-22 19:50:14 -08:00
Jon Siwek
f939bcad7e Skip file analysis for zero-length SSL/TLS data 2020-01-22 16:49:32 -08:00
Johanna Amann
98ad95d00b Merge remote-tracking branch 'origin/master' into topic/johanna/table-changes 2020-01-22 16:02:12 -08:00
Jon Siwek
c75519ca88 Improve creation of SMB transaction data strings 2020-01-22 15:41:50 -08:00
Jon Siwek
68b513a364 Fix supervisor "destroy" call on nodes not currently alive
This would mistakenly have the Stem process kill itself due to giving
PID 0 as argument to kill() where it really was being used to mean "that
node does not currently have any live process associated with it" and so
can just be removed without trying to kill/reap.
2020-01-22 13:17:38 -08:00
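The pitfall described above comes from POSIX semantics: kill(0, sig) delivers the signal to every process in the caller's process group, including the caller itself. A guard along the following lines avoids it; terminate_child is a hypothetical name and this is not the Supervisor's actual code.

```cpp
#include <signal.h>
#include <sys/types.h>

// Hypothetical sketch: signal a child only if it has a live process.
// Returns false when there was nothing to signal or kill() failed.
bool terminate_child(pid_t pid, int sig)
{
    // pid 0 is used here to mean "no live process"; handing it to
    // kill() would signal the whole process group. Negative values
    // also address process groups, so they are rejected as well.
    if (pid <= 0)
        return false;
    return kill(pid, sig) == 0;
}
```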
Jon Siwek
59e075acab Move supervisor source files into supervisor/ 2020-01-22 11:23:10 -08:00
Jon Siwek
718879735e Address supervisor code re-factoring feedback from Robin 2020-01-21 22:26:17 -08:00
Jon Siwek
172456fac0 Convert supervisor internals to rapidjson 2020-01-21 13:19:05 -08:00
Jon Siwek
9c0d252c2b Merge branch 'master' into topic/jsiwek/supervisor 2020-01-21 12:17:56 -08:00
Robin Sommer
8170baabef Merge remote-tracking branch 'origin/topic/timw/595-rapidjson'
Tweaks:
    - Small change to the logic for removing quotes around strings.
    - Updated NEWS & COPYING.3rdparty
    - Use of intrusive_ptr for stack-allocated StringVals
    - Little bit of refactoring (I would love to merge the two BuildJSON() functions, too, but that's a larger task)

* origin/topic/timw/595-rapidjson:
  Use the list of files from clang-tidy when searching for unit tests
  Optimize json_escape_utf8 a bit by removing repeated calls to string methods
  Expand unit test for json_escape_utf8 to include all of the strings from the ascii-json-utf8 btest
  GHI-595: Convert from nlohmann/json to rapidjson for performance reasons
  Convert type-checking macros to actual functions
2020-01-18 10:49:15 +00:00
Jon Siwek
8247c42368 Add Supervisor documentation
Minor additions/changes to improve API I noticed along the way
2020-01-17 18:36:32 -08:00
Robin Sommer
c8c6621a0e Merge remote-tracking branch 'origin/topic/timw/bit-fields'
* origin/topic/timw/bit-fields:
  Use bools instead of single-bit bitfields in Ident and TCP protocol analyzers
  Bit of code-modernization cleanup in BroString
  Use fixed types in NetbiosSSN.h and Timer.h instead of bit fields
2020-01-17 11:55:00 +00:00
Jon Siwek
1972190b89 Add supervisor btests 2020-01-16 19:21:53 -08:00
Jon Siwek
21c75b46eb Improve logging of supervised node errors
Now getting sent through standard Reporter framework in the Supervisor
process.
2020-01-16 14:23:08 -08:00
Jon Siwek
8a145ee1a2 Fix supervised node inheritance of command-line script paths
They're now converted to absolute paths in the argument parsing phase
such that if a supervised node switches working directory, it can still
load the referenced script.
2020-01-16 13:11:04 -08:00
Jon Siwek
38cd56a3db Improve normalize_path() util function
It didn't always properly handle ".." when the preceding path component
was also the first component.
2020-01-16 13:08:01 -08:00
Jon Siwek
dbca14e1fc Use a timer to check for death of supervised node's parent 2020-01-15 15:27:53 -08:00