This adds two new functions: `Conn::register_removal_hook()` and
`Conn::unregister_removal_hook()` for registering a hook function to be
called back during `connection_state_remove`. The benefit of using hook
callback approach is better scalability: the overhead of unrelated
protocols having to dispatch no-op `connection_state_remove` handlers is
avoided.
Previously, more data than could effectively be utilized by any remote
Zeek was published (e.g. full list of pending commands or other
transient state that may add up to non-trivial amount of bytes).
And switch Zeek's base scripts over to using it in place of
"connection_state_remove". The difference between the two is
that "connection_state_remove" is raised for all events while
"successful_connection_remove" excludes TCP connections that were never
established (just SYN packets). There can be performance benefits
to this change for some use-cases.
There's also a new event called ``connection_successful`` and a new
``connection`` record field named "successful" to help indicate this new
property of connections.
Minor cleanup in merge: remove print statements and unnecessary @if
directive.
* 'topic/jsbarber/ftp-cluster-fix-patch' of https://github.com/jsbarber/zeek:
Publish ftp_data_expected updates to other workers for synchronization
* All "Broxygen" usages have been replaced in
code, documentation, filenames, etc.
* Sphinx roles/directives like ":bro:see" are now ":zeek:see"
* The "--broxygen" command-line option is now "--zeexygen"
The bytes_threshold_crossed event in the gridftp analyzer is not first
checking to see if the connection passed the initial criteria. This
causes the script to add the gridftp-data service to any connection that
crosses a threshold that is the same as or greater than the gridftp
size_threshold.
The server-reported file size was being collected poorly and if
a file name had a number in it, that was reported as the file
size instead of the actual size.
A new test is included to avoid reintroducing the problem.
* origin/topic/seth/more-file-type-ident-fixes:
File API updates complete.
Fixes for file type identification.
API changes to file analysis mime type detection.
Make HTTP 206 reassembly require ETags by default.
More file type identification improvements
Fix an issue with files having gaps before the bof_buffer is filled.
Fix an issue with packet loss in http file reporting.
Adding WOFF fonts to file type identification.
Extended JSON matching and added OCSP responses.
Another large signature update.
More signature updates.
Even more file type ident clean up.
Lots of fixes for file type identification.
BIT-1368 #merged
* origin/topic/johanna/conn-threshold:
Wrap threshold stuff up - fix two small bugs and update baselines.
update GridFTP analyzer to use connection thresholding instead of polling
Add high level api for thresholding that holds lists of thresholds and raises an event for each threshold exactly once.
Allow setting packet and byte thresholds for connections.
BIT-1377 #merged
Removed "file_mime_type" and "file_mime_types" event, replacing them
with a new event called "file_metadata_inferred". It has a record
argument of type "inferred_file_metadata", which contains the mime type
information that the earlier events used to supply. The idea here is
that future extensions to the record with new metadata will be less
likely to break user code than the alternatives (adding new events or
new event parameters).
Addresses BIT-1368.
This allows the path for the default filter to be specified explicitly
when creating a stream and reduces the need to rely on the default path
function to magically supply the path.
The default path function is now only used if, when a filter is added to
a stream, it has neither a path nor a path function already.
Adapted the existing Log::create_stream calls to explicitly specify a
path value.
Addresses BIT-1324
These functions are now deprecated in favor of alternative versions that
return a vector of strings rather than a table of strings.
Deprecated functions:
- split: use split_string instead.
- split1: use split_string1 instead.
- split_all: use split_string_all instead.
- split_n: use split_string_n instead.
- cat_string_array: see join_string_vec instead.
- cat_string_array_n: see join_string_vec instead.
- join_string_array: see join_string_vec instead.
- sort_string_array: use sort instead instead.
- find_ip_addresses: use extract_ip_addresses instead.
Changed functions:
- has_valid_octets: uses a string_vec parameter instead of string_array.
Addresses BIT-924, BIT-757.
* origin/topic/seth/faf-updates: (27 commits)
Undoing the FTP tests I updated earlier.
Update the last two btest FAF tests.
File analysis fixes and test updates.
Fix a bug with getting analyzer tags.
A few test updates.
Some tests work now (at least they all don't fail anymore!)
Forgot a file.
Added protocol description functions that provide a super compressed log representation.
Fix a bug where orig file information in http wasn't working right.
Added mime types to http.log
Clean up queued but unused file_over_new_connections event args.
Add jar files to the default MHR lookups.
Adding CAB files for MHR checking.
Improve malware hash registry script.
Fix a small issue with finding smtp entities.
Added support for files to the notice framework.
Make the custom libmagic database a git submodule.
Add an is_orig parameter to file_over_new_connection event.
Make magic for emitting application/msword mime type less strict.
Disable more libmagic builtin checks that override the magic database.
...
Conflicts:
doc/scripts/DocSourcesList.cmake
scripts/base/init-bare.bro
scripts/test-all-policy.bro
testing/btest/Baseline/coverage.bare-load-baseline/canonified_loaded_scripts.log
testing/btest/Baseline/coverage.default-load-baseline/canonified_loaded_scripts.log
- Fix examples/references in the file analysis how-to/usage doc.
- Add Broxygen-generated docs for file analyzer plugins.
- Break FTP::Info type declaration out in to its own file to get
rid of some circular dependencies (between s/b/p/ftp/main and
s/b/p/ftp/utils).
- Several places were just using old variable names or not loading
scripts correctly after they'd been renamed/moved.
- Revert/adjust a change in how HTTP file handles are generated that
broke partial content responses.
- Turn some libmagic builtin checks back on; seems some are actually
useful (e.g. text detection seems to be a builtin). The rule going
forward probably will be only to turn off a builtin if we confirm it
causes issues.
- Removed some tests that are redundant or not necessary anymore because
the generic file analysis tests cover them.
- A couple FTP tests still fail that I think need an actual solution via
script changes.
- This caused us to lose signatures for POP3 and Bittorrent. These will
need discovered in the repository again when we add scripts
for those analyzers.
- Recorrected the module name to Files.
- Added Files::analyzer_name to get a more readable name for a
file analyzer.
- Improved and just overall better handled multipart mime
transfers in HTTP and SMTP. HTTP now has orig_fuids and resp_fuids
log fields since multiple "files" can be transferred with
multipart mime in a single request/response pair. SMTP has
an fuids field which has file unique IDs for all parts
transferred. FTP and IRC have a log field named fuid added
because only a single file can be transferred per irc and ftp
log line.
Closes#1030.
* origin/topic/seth/packet-filter-updates:
Missed a test fix.
Updating test baselines.
Updates for the PacketFilter framework to simplify it.
Last test update for PacketFilter framework.
Several final fixes for PacketFilter framework.
Packet filter framework checkpoint.
Checkpoint on the packet filter framework.
Initial rework of packet filter framework.