This change adds two new hooks to the Intel framework that can be used
to intercept added and removed indicators and their type.
These hooks are fairly low-level. One immediate use-case is to count the
number of indicators loaded per Intel::Type and enable and disable the
corresponding event groups of the intel/seen scripts.
I attempted to gauge the overhead and while it's definitely there, loading
a file with ~500k DOMAIN entries takes somewhere around ~0.5 seconds hooks
when populated via the min_data_store store mechanism. While that
doesn't sound great, it actually takes the manager on my system 2.5
seconds to serialize and Cluster::publish() the min_data_store alone
and its doing that serially for every active worker. Mostly to say that
the bigger overhead in that area on the manager doing redundant work
per worker.
Co-authored-by: Mohan Dhawan <mohan@corelight.com>
This introduces a new hook into the Intel::seen() function that allows
users to directly interact with the result of a find() call via external
scripts.
This should solve the use-case brought up by @chrisanag1985 in
discussion #3256: Recording and acting on "no intel match found".
@Canon88 was recently asking on Slack about enabling HTTP logging for a
given connection only when an Intel match occurred and found that the
Intel::match() event would only occur on the manager. The
Intel::match_remote() event might be a workaround, but possibly running a
bit too late and also it's just an internal "detail" event that might not
be stable.
Another internal use case revolved around enabling packet recording
based on Intel matches which necessarily needs to happen on the worker
where the match happened. The proposed workaround is similar to the above
using Intel::match_remote().
This hook also provides an opportunity to rate-limit heavy hitter intel
items locally on the worker nodes, or even replacing the event approach
currently used with a customized approach.
This adds a redefinable const to the internals of the Intel framework, to allow
suppression of the manager sending its current min_data_store when a worker
connects. This feature is desirable for nodes that check in "late" to bring them
up to speed, but during testing it introduces nondeterminism.
This adds a "policy" hook into the logging framework's streams and
filters to replace the existing log filter predicates. The hook
signature is as follows:
hook(rec: any, id: Log::ID, filter: Log::Filter);
The logging manager invokes hooks on each log record. Hooks can veto
log records via a break, and modify them if necessary. Log filters
inherit the stream-level hook, but can override or remove the hook as
needed.
The distribution's existing log streams now come with pre-defined
hooks that users can add handlers to. Their name is standardized as
"log_policy" by convention, with additional suffixes when a module
provides multiple streams. The following adds a handler to the Conn
module's default log policy hook:
hook Conn::log_policy(rec: Conn::Info, id: Log::ID, filter: Log::Filter)
{
if ( some_veto_reason(rec) )
break;
}
By default, this handler will get invoked for any log filter
associated with the Conn::LOG stream.
The existing predicates are deprecated for removal in 4.1 but continue
to work.
* 'export_intel_events' of https://github.com/mauropalumbo75/zeek:
minor restyle and add comments
add an empty read_error event to the intel framework (in the export block, so that users can implement further checks with it)
move event Intel::read_entry to export block
Adjusted whitespace in merge.
* All "Broxygen" usages have been replaced in
code, documentation, filenames, etc.
* Sphinx roles/directives like ":bro:see" are now ":zeek:see"
* The "--broxygen" command-line option is now "--zeexygen"
This introduces the following redefinable string constants, empty by
default:
- InputAscii::path_prefix
- InputBinary::path_prefix
- Intel::path_prefix
When using ASCII or binary reades in the Input/Intel Framework with an
input stream source that does not have an absolute path, these
constants cause Zeek to prefix the resulting paths accordingly. For
example, in the following the location on disk from which Zeek loads
the input becomes "/path/to/input/whitelist.data":
redef InputAscii::path_prefix = "/path/to/input";
event bro_init()
{
Input::add_table([$source="whitelist.data", ...]);
}
These path prefixes can be absolute or relative. When an input stream
source already uses an absolute path, this path is preserved and the
new variables have no effect (i.e., we do not affect configurations
already using absolute paths).
Since the Intel framework builds upon the Input framework, the first
two paths also affect Intel file locations. If this is undesirable,
the Intel::path_prefix variable allows specifying a separate path:
when its value is absolute, the resulting source seen by the Input
framework is absolute, therefore no further changes to the paths
happen.
Namely these are now removed:
- Broker::relay
- Broker::publish_and_relay
- Cluster::relay_rr
- Cluster::relay_hrw
The idea being that Broker may eventually implement the necessary
routing (plus load balancing) functionality. For now, code that used
these should "manually" handle and re-publish events as needed.
a broctl print triggers this error
Reporter::ERROR no such index (Cluster::nodes[Intel::p$descr])
/usr/local/bro/share/bro/base/frameworks/intel/./cluster.bro, line 39
when broctl connects p$descr is empty. It should probably be set to
'control' somewhere inside broctl, but that would only fix broctl, not
other clients.
diff --git a/aux/bro-aux b/aux/bro-aux
index 02f710a43..43f4b90bb 160000
--- a/aux/bro-aux
+++ b/aux/bro-aux
@@ -1 +1 @@
-Subproject commit 02f710a436dfe285bae0d48d7f7bc498783e11a8
+Subproject commit 43f4b90bbaf87dae1a1073e7bf13301e58866011
diff --git a/aux/broctl b/aux/broctl
index e960be2c1..d3e6cdfba 160000
--- a/aux/broctl
+++ b/aux/broctl
@@ -1 +1 @@
-Subproject commit e960be2c192a02f1244ebca3ec31ca57d64e23dc
+Subproject commit d3e6cdfba496879bd55542c668ea959f524bd723
diff --git a/aux/btest b/aux/btest
index 2810ccee2..e638fc65a 160000
--- a/aux/btest
+++ b/aux/btest
@@ -1 +1 @@
-Subproject commit 2810ccee25f6f20be5cd241155f12d02a79d592a
+Subproject commit e638fc65aa12bd136594451b8c185a7a01ef3e9a
diff --git a/scripts/base/frameworks/intel/cluster.bro b/scripts/base/frameworks/intel/cluster.bro
index 820a5497a..e75bdd057 100644
--- a/scripts/base/frameworks/intel/cluster.bro
+++ b/scripts/base/frameworks/intel/cluster.bro
@@ -32,7 +32,7 @@ event remote_connection_handshake_done(p: event_peer)
{
# When a worker connects, send it the complete minimal data store.
# It will be kept up to date after this by the cluster_new_item event.
- if ( Cluster::nodes[p$descr]$node_type == Cluster::WORKER )
+ if ( p$descr in Cluster::nodes && Cluster::nodes[p$descr]$node_type == Cluster::WORKER )
{
send_id(p, "Intel::min_data_store");
}
When inserting, existance of the given subnet is checked using exact
matching instead of longest prefix matching. Before, inserting a subnet
would have updated the subnet item, which is the longest prefix of the
inserted subnet, if present.
File Analysis Framework related code has been moved into a separate
script. Using redefinitions of the corresponding records causes the
file-related columns to appear last.
The extension mechanism is basically the one that Seth introduced with
his intel extensions. The main difference lies in using a hook instead
of an event. An example policy implements whitelisting.
This patch allows users to provide the fuid or the connection id
directly, in case they do not have access to either in the event that
they handle.
An example for this is the handling of certificates in SSL, where the
fa_file record cannot be retained because this would create a cyclic
data structure.
This patch also provides file IDs for hostname matches in certificates,
which was not possible with the previous API.