Introduce two new events for analyzer confirmation and analyzer violation
reporting. The current analyzer_confirmation and analyzer_violation
events assume connection objects and analyzer ids are available which
is not always the case. We're already passing aid=0 for packet analyzers
and there's not currently a way to report violations from file analyzers
using analyzer_violation, for example.
These new events use an extensible Info record approach so that additional
(optional) information can be added later without changing the signature.
It would allow for per analyzer extensions to the info records to pass
analyzer specific info to script land. It's not clear that this would be
a good idea, however.
The previous analyzer_confirmation and analyzer_violation events
continue to exist, but are deprecated and will be removed with Zeek 6.1.
This adds a "policy" hook into the logging framework's streams and
filters to replace the existing log filter predicates. The hook
signature is as follows:
hook(rec: any, id: Log::ID, filter: Log::Filter);
The logging manager invokes hooks on each log record. Hooks can veto
log records via a break, and modify them if necessary. Log filters
inherit the stream-level hook, but can override or remove the hook as
needed.
The distribution's existing log streams now come with pre-defined
hooks that users can add handlers to. Their name is standardized as
"log_policy" by convention, with additional suffixes when a module
provides multiple streams. The following adds a handler to the Conn
module's default log policy hook:
hook Conn::log_policy(rec: Conn::Info, id: Log::ID, filter: Log::Filter)
{
if ( some_veto_reason(rec) )
break;
}
By default, this handler will get invoked for any log filter
associated with the Conn::LOG stream.
The existing predicates are deprecated for removal in 4.1 but continue
to work.
This adds two new functions: `Conn::register_removal_hook()` and
`Conn::unregister_removal_hook()` for registering a hook function to be
called back during `connection_state_remove`. The benefit of using hook
callback approach is better scalability: the overhead of unrelated
protocols having to dispatch no-op `connection_state_remove` handlers is
avoided.
- Updated the logic significantly: still filters out ICMP from being
considered an active service (like before) and adds a new
"Known::service_udp_requires_response" option (defaults to true) for
whether to require UDP server response before being considered an
active service.
* 'topic/dopheide/known-services' of https://github.com/dopheide-esnet/zeek:
Log services with unknown protocols
- All timers are now handled by a single global timer manager, which simplifies how they handled by the IOSource manager.
- This change flows down a number of changes to other parts of the code. The timer manager tag field is removed, which means that matching connections to a timer manager is also removed. This removes the ability to tag a connection as internal or external, since that's how the connections where differentiated. This in turn removes the `current_conns_extern` field from the `ConnStats` record type in the script layer.
Typically in base scripts, Log::create_stream() is called in zeek_init()
handler with &priority=5 such that it will have already been created
in the default zeek_init() &priority=0.
And switch Zeek's base scripts over to using it in place of
"connection_state_remove". The difference between the two is
that "connection_state_remove" is raised for all events while
"successful_connection_remove" excludes TCP connections that were never
established (just SYN packets). There can be performance benefits
to this change for some use-cases.
There's also a new event called ``connection_successful`` and a new
``connection`` record field named "successful" to help indicate this new
property of connections.
* 'known_services_multiprotocols' of https://github.com/mauropalumbo75/zeek:
improve logging with broker store
drop services starting with -
remove service from key for Cluster::publish_hrw
remove check for empty services
update tests
order list of services in store key
remove repeated services in logs if already seen
add multiprotocol known_services when Known::use_service_store = T
remove hyphen in front of some services (for example -HTTP, -SSL) In some cases, there is an hyphen before the protocol name in the field connection$service. This can cause problems in known_services and is removed here. It originates probably in some analyzer where it would be better removed in the future.
add multiprotocol known_services when Known::use_service_store = F
Changes during merge:
* whitespace
* add unit test
* 'empty_services' of https://github.com/mauropalumbo75/zeek:
remove empty services and include udp active connections when logging in connection_state_remove
In some cases, there is an hyphen before the protocol name in the field
connection$service. This can cause problems in known_services and
is removed here. It originates probably in some analyzer where it
would be better removed in the future.
* All "Broxygen" usages have been replaced in
code, documentation, filenames, etc.
* Sphinx roles/directives like ":bro:see" are now ":zeek:see"
* The "--broxygen" command-line option is now "--zeexygen"
The link-layer addresses are now part of the connection endpoints
following the originator-responder-pattern. The addresses are printed
with leading zeros. Additionally link-layer addresses are also extracted
for 802.11 plus RadioTap.
Lowered priority of a connection_state_remove event handler to ensure
that the "conn" field is initialized in the connection record before
attempting to add the VLAN tags.
* 'master' of https://github.com/aaronmbr/bro:
Copy-paste issue
Allow for logging of the VLAN data about a connection in conn.log
Save the inner vlan in the Packet object for Q-in-Q setups
This allows the path for the default filter to be specified explicitly
when creating a stream and reduces the need to rely on the default path
function to magically supply the path.
The default path function is now only used if, when a filter is added to
a stream, it has neither a path nor a path function already.
Adapted the existing Log::create_stream calls to explicitly specify a
path value.
Addresses BIT-1324
This is a larger internal change that moves the analyzer
infrastructure to a more flexible model where the available analyzers
don't need to be hardcoded at compile time anymore. While currently
they actually still are, this will in the future enable external
analyzer plugins. For now, it does already add the capability to
dynamically enable/disable analyzers from script-land, replacing the
old Analyzer::Available() methods.
There are three major parts going into this:
- A new plugin infrastructure in src/plugin. This is independent
of analyzers and will eventually support plugins for other parts
of Bro as well (think: readers and writers). The goal is that
plugins can be alternatively compiled in statically or loadead
dynamically at runtime from a shared library. While the latter
isn't there yet, there'll be almost no code change for a plugin
to make it dynamic later (hopefully :)
- New analyzer infrastructure in src/analyzer. I've moved a number
of analyzer-related classes here, including Analyzer and DPM;
the latter now renamed to Analyzer::Manager. More will move here
later. Currently, there's only one plugin here, which provides
*all* existing analyzers. We can modularize this further in the
future (or not).
- A new script interface in base/framework/analyzer. I think that
this will eventually replace the dpm framework, but for now
that's still there as well, though some parts have moved over.
I've also remove the dpd_config table; ports are now configured via
the analyzer framework. For exmaple, for SSH:
const ports = { 22/tcp } &redef;
event bro_init() &priority=5
{
...
Analyzer::register_for_ports(Analyzer::ANALYZER_SSH, ports);
}
As you can see, the old ANALYZER_SSH constants have more into an enum
in the Analyzer namespace.
This is all hardly tested right now, and not everything works yet.
There's also a lot more cleanup to do (moving more classes around;
removing no longer used functionality; documenting script and C++
interfaces; regression tests). But it seems to generally work with a
small trace at least.
The debug stream "dpm" shows more about the loaded/enabled analyzers.
A new option -N lists loaded plugins and what they provide (including
those compiled in statically; i.e., right now it outputs all the
analyzers).
This is all not cast-in-stone yet, for some things we need to see if
they make sense this way. Feedback welcome.