Previously, a single `icmp_conn` record was built per ICMP "connection"
and re-used for all events generated from it. This may have been a
historical attempt at performance optimization, but:
* By default, Zeek does not load any scripts that handle ICMP events.
* The one script Zeek ships with that does handle ICMP events,
"detect-traceroute", is already noted as being disabled due to
potential performance problems of doing that kind of analysis.
* Re-use of the original `icmp_conn` record tends to misreport
TTL and length values since they come from the original packet instead
of the current one.
* Even if we chose to still re-use `icmp_conn` records and just fill
in a new TTL and length value each packet, a user script could have
stored a reference to the record and not be expecting those values
to be changed out from underneath them.
Now, a new `icmp_info` record is created/populated in all ICMP events
and should be used instead of `icmp_conn`. It also removes the
orig_h/resp_h fields as those are redundant with what's already
available in the connection record.
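
The aliasing hazard from the last bullet can be illustrated outside of Zeek. The following is a minimal Python sketch, not Zeek code: the `IcmpInfo` class, its fields, and both generator functions are hypothetical stand-ins used only to contrast a shared, mutated record with a fresh record per event.

```python
from dataclasses import dataclass


@dataclass
class IcmpInfo:
    """Hypothetical stand-in for a per-event ICMP record."""
    ttl: int
    length: int


def events_reusing_record(packets):
    """Old approach: one record, mutated in place for every packet."""
    shared = IcmpInfo(ttl=0, length=0)
    for ttl, length in packets:
        shared.ttl, shared.length = ttl, length
        yield shared


def events_with_fresh_record(packets):
    """New approach: a new record is created for every event."""
    for ttl, length in packets:
        yield IcmpInfo(ttl=ttl, length=length)


packets = [(64, 84), (63, 120)]

# A handler that keeps every record it receives, as a user script might.
stored_reused = list(events_reusing_record(packets))
stored_fresh = list(events_with_fresh_record(packets))

# With re-use, every stored reference ends up showing the last packet.
reused_ttls = [r.ttl for r in stored_reused]
fresh_ttls = [r.ttl for r in stored_fresh]
```

In the re-use case both stored references point at the same object, so the values of the first packet are lost once the second one arrives.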
- Added test case and adjusted whitespace in merge
* 'stats-logging-fix' of https://github.com/brittanydonowho/zeek:
Fixed stats.zeek to log all data before Zeek terminates rather than returning too soon
* "bro_is_terminating" is now "zeek_is_terminating"
* "bro_version" is now "zeek_version"
The old function names still exist for now, but are deprecated.
* All "Broxygen" usages have been replaced in
code, documentation, filenames, etc.
* Sphinx roles/directives like ":bro:see" are now ":zeek:see"
* The "--broxygen" command-line option is now "--zeexygen"
* origin/topic/jsiwek/empty-lines:
Add 'smtp_excessive_pending_cmds' weird
Fix SMTP command string comparisons
Improve handling of empty lines in several text protocol analyzers
Add rate-limiting sampling mechanism for weird events
Teach timestamp canonifier about timestamps before ~2001
By default, the generation of weird events is now rate-limited
according to these tunable options:
- Weird::sampling_whitelist
- Weird::sampling_threshold
- Weird::sampling_rate
- Weird::sampling_duration
The new get_reporter_stats() BIF also allows one to query the
total number of weirds generated (pre-sampling) which the new
policy/misc/weird-stats.bro script uses periodically to populate
a weird_stats.log.
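
As a rough illustration of how a threshold, a rate, a duration and a whitelist could interact, here is a minimal Python sketch. It is not Zeek's implementation. It assumes the first `threshold` weirds of a given name are always reported, that afterwards one in every `rate` is reported, that counters reset after `duration` seconds, and that whitelisted names bypass sampling. The `WeirdSampler` class and its method names are hypothetical.

```python
class WeirdSampler:
    """Hypothetical per-name rate limiter for weird events."""

    def __init__(self, threshold, rate, duration, whitelist=()):
        self.threshold = threshold      # weirds always reported before sampling
        self.rate = rate                # afterwards, report one in every `rate`
        self.duration = duration        # seconds before a counter is reset
        self.whitelist = set(whitelist)
        self.total_seen = 0             # pre-sampling total
        self._state = {}                # name -> (count, window_start)

    def permit(self, name, now):
        """Return True if this weird should be reported."""
        self.total_seen += 1
        if name in self.whitelist:
            return True
        count, started = self._state.get(name, (0, now))
        if now - started >= self.duration:
            count, started = 0, now     # counter expired, start over
        count += 1
        self._state[name] = (count, started)
        if count <= self.threshold:
            return True
        return (count - self.threshold) % self.rate == 0


sampler = WeirdSampler(threshold=2, rate=3, duration=100)
decisions = [sampler.permit("bad_checksum", now=t) for t in range(8)]
reported = sum(decisions)
```

Keeping `total_seen` separate from the reported count mirrors the idea of querying the pre-sampling total of generated weirds.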
There are also new reporter BIFs that allow generating weirds from the
script-layer such that they go through the same, internal
rate-limiting/sampling mechanisms:
- Reporter::conn_weird
- Reporter::flow_weird
- Reporter::net_weird
Some of the code was adapted from previous work by Johanna Amann.
* origin/topic/seth/dhcp-update:
Rework to the DHCP analyzer.
First step of DHCP analyzer rearchitecture.
Add .btest scripts for dhcp_ack and dhcp_discover messages verifying that new options are correctly reported in dhcp.log records.
Extend DHCP protocol analyzer with new options.
BIT-1924 #merged
Additional changes:
* Removed known-devices.bro as the only thing populating its table was
the already-removed known-devices-and-hostnames.bro. So a
known_devices.log will no longer be generated.
* In dhcp-options.pac, the process_relay_agent_inf_option had a memleak
and also process_auto_proxy_config_option looked like it accessed one
byte past the end of the available bytestring, so fixed those.
Highlights:
- Reduced all DHCP events into a single dhcp_message event. (removed legacy events since they weren't widely used anyway)
- Support many more DHCP options.
- DHCP log is completely reworked and now represents DHCP sessions
based on the transaction ID (and works on clusters).
- Removed the known-devices-and-hostnames script since it's generally
less relevant now with the updated log.
(Cleaned up some code a little bit.)
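
The idea of representing DHCP sessions by transaction ID can be sketched as a simple grouping step. This is a minimal Python illustration, not the analyzer's or the log script's actual logic. The `(xid, message_type)` tuples and the `group_by_transaction` function are hypothetical.

```python
from collections import defaultdict


def group_by_transaction(messages):
    """Group (xid, message_type) pairs into one session per transaction ID."""
    sessions = defaultdict(list)
    for xid, msg_type in messages:
        sessions[xid].append(msg_type)
    return dict(sessions)


# Two interleaved exchanges, distinguished only by their transaction IDs.
messages = [
    (0x1A2B, "DISCOVER"),
    (0x3C4D, "REQUEST"),
    (0x1A2B, "OFFER"),
    (0x3C4D, "ACK"),
    (0x1A2B, "REQUEST"),
    (0x1A2B, "ACK"),
]

sessions = group_by_transaction(messages)
```

Each resulting session would correspond to one log entry rather than one entry per message.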
* origin/topic/seth/stats-improvement:
Fixing tests for stats improvements
Rename the reporting interval variable for stats.
Removing more broken functionality due to changed stats apis.
Removing some references to resource_usage()
Removing Broker stats, it was broken and incomplete.
Fixing default stats collection interval to every 5 minutes.
Add DNS stats to the stats.log
Small stats script tweaks and beginning broker stats.
Continued stats cleanup and extension.
More stats collection extensions.
More stats improvements
Slight change to Mach API for collecting memory usage.
Fixing some small mistakes.
Updating the cmake submodule for the stats updates.
Fix memory usage collection on Mac OS X.
Cleaned up stats collection.
BIT-1581 #merged
Broke out the stats collection into a bunch of new Bifs
in stats.bif. Scripts that use stats collection functions
have also been updated. More work to do.
- Removed the gap_report event. It wasn't used anymore
and is functionally no more capable than scheduling events
and using the get_gap_summary bif.
- Added functionality to Dictionaries to count cumulative
numbers of inserts performed. This is further used to
measure the total number of connections of various types.
Previously only the number of active connections was
available.
- The Reassembler base class now tracks active reassembly
size for all subclasses (File/TCP/Frag & unknown).
- Improvements to the stats.log. Mostly, more information.
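
The difference between the number of active entries and the cumulative number of inserts, as described for Dictionaries above, can be sketched in a few lines of Python. This is an illustration only, not the C++ Dictionary code. The `CountingDict` class and its attribute names are hypothetical.

```python
class CountingDict:
    """Hypothetical dictionary wrapper that counts cumulative inserts."""

    def __init__(self):
        self._entries = {}
        self.cumulative_inserts = 0   # never decreases

    def insert(self, key, value):
        # Only a new key counts as an insert; overwriting does not.
        if key not in self._entries:
            self.cumulative_inserts += 1
        self._entries[key] = value

    def remove(self, key):
        self._entries.pop(key, None)

    def __len__(self):
        # Number of currently active entries.
        return len(self._entries)


conns = CountingDict()
for conn_id in ("c1", "c2", "c3"):
    conns.insert(conn_id, {"proto": "tcp"})
conns.remove("c1")
conns.remove("c2")

active = len(conns)
total = conns.cumulative_inserts
```

After two connections have ended, one is still active while the cumulative counter still reports all three.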
This allows the path for the default filter to be specified explicitly
when creating a stream and reduces the need to rely on the default path
function to magically supply the path.
The default path function is now only used if, when a filter is added to
a stream, it has neither a path nor a path function already.
Adapted the existing Log::create_stream calls to explicitly specify a
path value.
Addresses BIT-1324
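
The rule for when the default path function applies can be sketched as follows. This is a simplified Python illustration, not the logging framework's code: filters are modeled as plain dictionaries, the path is resolved immediately when the filter is added, and `add_filter` and `default_path_func` are hypothetical names with made-up behavior.

```python
def default_path_func(stream_id):
    """Hypothetical default: derive a path from the stream's module name."""
    return stream_id.split("::")[0].lower()


def add_filter(stream_id, filt):
    """Return a copy of `filt`, filling in a path only if nothing is set."""
    resolved = dict(filt)
    has_path = resolved.get("path") is not None
    has_path_func = resolved.get("path_func") is not None
    if not has_path and not has_path_func:
        # Fall back to the default path function only as a last resort.
        resolved["path"] = default_path_func(stream_id)
    return resolved


explicit = add_filter("HTTP::LOG", {"name": "default", "path": "http"})
custom = add_filter("HTTP::LOG", {"name": "by-host", "path_func": lambda sid: "hosts"})
fallback = add_filter("HTTP::LOG", {"name": "extra"})
```

Only the third filter, which has neither a path nor a path function, ends up relying on the default.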
These functions are now deprecated in favor of alternative versions that
return a vector of strings rather than a table of strings.
Deprecated functions:
- split: use split_string instead.
- split1: use split_string1 instead.
- split_all: use split_string_all instead.
- split_n: use split_string_n instead.
- cat_string_array: see join_string_vec instead.
- cat_string_array_n: see join_string_vec instead.
- join_string_array: see join_string_vec instead.
- sort_string_array: use sort instead.
- find_ip_addresses: use extract_ip_addresses instead.
Changed functions:
- has_valid_octets: uses a string_vec parameter instead of string_array.
Addresses BIT-924, BIT-757.
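
The practical difference for script authors is the shape of the return value. The following Python sketch models it under the assumption that the old table-based results were indexed by a count starting at 1, while a vector is indexed from 0. The two functions are hypothetical analogues written for illustration, not the BIFs themselves.

```python
import re


def split_as_table(s, pattern):
    """Analogue of the deprecated style: a table keyed by 1-based index."""
    return {i: part for i, part in enumerate(re.split(pattern, s), start=1)}


def split_as_vector(s, pattern):
    """Analogue of the new style: an ordered, 0-based vector of strings."""
    return re.split(pattern, s)


table_result = split_as_table("a,b,c", ",")
vector_result = split_as_vector("a,b,c", ",")

first_from_table = table_result[1]
first_from_vector = vector_result[0]
```

Code that indexes into the result therefore needs its offsets adjusted when moving to the vector-returning functions.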
The scripts for the others still remain and can be loaded explicitly,
but they reportedly may produce figures that are far from correct.
Addresses BIT-1171.
Changes:
- Changing semantics of the new_event() meta event: it's raised
only for events that have a handler defined. Too many checks in
Bro prevent events without a handler from even being prepared
for raising, so doing this differently is not practical.
- Adding test case.
* topic/robin/event-dumper:
New script misc/dump-events.bro, along with core support, that dumps events Bro is raising in an easily readable form.
Prettifying Describe() for record types.
This is for debugging purposes, obviously.
Example, including only SMTP events:
> bro -r smtp.trace misc/dump-events.bro DumpEvents::include=/smtp/
[...]
1254722768.219663 smtp_reply
[0] c: connection = [id=[orig_h=10.10.1.4, orig_p=1470/tcp, resp_h=74.53.140.153, [...]
[1] is_orig: bool = F
[2] code: count = 220
[3] cmd: string = >
[4] msg: string = xc90.websitewelcome.com ESMTP Exim 4.69 #1 Mon, 05 Oct 2009 01:05:54 -0500
[5] cont_resp: bool = T
1254722768.219663 smtp_reply
[0] c: connection = [id=[orig_h=10.10.1.4, orig_p=1470/tcp, resp_h=74.53.140.153, [...]
[1] is_orig: bool = F
[2] code: count = 220
[3] cmd: string = >
[4] msg: string = We do not authorize the use of this system to transport unsolicited,
[5] cont_resp: bool = T
[...]
Add a "broxygen" domain Sphinx extension w/ directives to allow
on-the-fly documentation to be generated w/ Bro and included in files.
This means all autogenerated reST docs are now done by Bro. The odd
CMake/Python glue scripts which used to generate some portions are now
gone. Bro and the Sphinx extension handle checking for outdated docs
themselves.
Parallel builds of the `make doc` target should now work (mostly because
I don't think there are any tasks that can be done in parallel anymore).
Overall, this seems to simplify things and make the Broxygen-generated
portions of the documentation visible/traceable from the main Sphinx
source tree. The one odd thing still is that per-script documentation
is rsync'd in to a shadow copy of the Sphinx source tree within the
build dir. This is less elegant than using the new broxygen extension
to make per-script docs, but rsync is faster and simpler. Simpler as in
less code because it seems like, in the best case, I'd need to write a
custom Sphinx Builder to be able to get that to even work.
They just duplicated the text from where the events are originally
declared and also it's not generally useful to Broxygen-style comment
event *handlers* (they're more of an implementation detail of a script,
not a user-facing element).