Includes submodule bumps for Broker (to pull in better handling of data
structures that are difficult to deserialize in Python), zeek-client (for the
get-config command), and a commit hash update for the external testsuite.
This adds an optional set of cluster node names to narrow the querying to. It
similarly expands the dispatch mechanism, since it most likely makes sense for
any such request to apply only to a subset of nodes.
Requests for invalid nodes trigger Response records in error state.
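A minimal sketch of what such a narrowed request could look like in Zeek
script (module, event, and parameter names here are illustrative assumptions,
not the actual API):

    module Example::API;

    export {
        ## Hypothetical request event: an empty node set targets all
        ## nodes; names that match no node yield error-state Responses.
        global example_request: event(reqid: string, nodes: set[string]);
    }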
When agents receive a configuration, we don't currently honor requested run
states (there's no such thing as registering a node but not running it, for
example). To reflect this, we now start off nodes in state PENDING as we
launch them via the Supervisor, and move them to RUNNING when they check
in with us via Management::Node::API::notify_node_hello.
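On the agent, the transition might look roughly like this (the state table and
string-valued states are stand-ins for the framework's actual bookkeeping, and
the notify_node_hello signature is an assumption):

    module Management::Agent;

    # Hypothetical per-node run-state bookkeeping.
    global g_node_states: table[string] of string = table();

    event Management::Node::API::notify_node_hello(node: string)
        {
        # A node we launched via the Supervisor has checked in:
        # promote it from PENDING to RUNNING.
        if ( node in g_node_states && g_node_states[node] == "PENDING" )
            g_node_states[node] = "RUNNING";
        }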
This adds support for retrieving the value of a global identifier from any
subset of cluster nodes. It relies on the lookup_ID() BiF to retrieve the
value, and on to_json() to render it to an easily parsed string. Ideally we'd
send the value directly, but this hits several roadblocks, including the fact
that Broker won't serialize arbitrary values.
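The core of the lookup might reduce to the following sketch. lookup_ID() and
to_json() are the BiFs named above; the event pair and its plumbing are
hypothetical:

    global get_id_value_request: event(reqid: string, id: string);
    global get_id_value_response: event(reqid: string, rendered: string);

    event get_id_value_request(reqid: string, id: string)
        {
        # lookup_ID() yields the identifier's current value (or an
        # error value if no such identifier exists).
        local val = lookup_ID(id);

        # to_json() renders the value to an easily parsed string,
        # sidestepping Broker's refusal to serialize arbitrary values.
        event get_id_value_response(reqid, to_json(val));
        }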
This adds request/response event pairs to enable the controller to dispatch
"actions" (pre-implemented Zeek script actions) on subsets of Zeek cluster nodes
and collect the results. Using generic events to carry multiple such "run X on
the nodes" scenarios simplifies adding these in the future.
This provides Broker-level plumbing that allows agents to reach out to their
managed Zeek nodes and collect responses.
As a first event, it establishes Management::Node::API::notify_agent_hello,
to notify the agent when the cluster node is ready to communicate.
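The event might be declared roughly as follows; only the event name comes from
this change, the parameter is an assumption:

    module Management::Node::API;

    export {
        ## Sent by a managed Zeek node once it is ready to communicate
        ## with its agent. (The parameter here is illustrative.)
        global notify_agent_hello: event(node: string);
    }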
Also a bit of comment rewording to replace use of "data cluster" with simply
"cluster", to avoid ambiguity with data nodes in SumStats, and expansion of
test-all-policy.zeek and related/dependent tests, since we're introducing new
scripts.
* origin/topic/vern/table-attr-fixes:
updates for btests - new cases to check, new baselines
fix for ill-formed (complex) &default function
type-checking for use of empty table constructors in expressions
catch empty constructors used for type inference
suppress repeated error messages
factoring to make checking of &default attributes externally accessible
bug fix for empty table constructors with &default attributes (plus a typo)
* origin/topic/vern/rec-constr-check:
associated btest
fix base scripts to include mandatory fields in record constructors
restored record constructor checking for missing-but-mandatory fields
Documentation is missing and will be added in the next couple of hours.
* origin/topic/johanna/tls12-decryption: (24 commits)
TLS decryption: add test, fix small issues
Address PR feedback
TLS decryption: refactoring, more comments, less bare pointers
Small code fix and test baseline update.
SSL decryption: refactor TLS12_PRF
SSL decryption: small style changes, a bit of documentation
Deprecation and warning fixes
Clang-format updates
add missing call to EVP_KDF_CTX_set_params
TLS decryption: remove payload from ssl_encrypted_data again.
TLS 1.2 decryption: adapt OpenSSL 3.0 changes for 1.1
ssl: adapt TLS-PRF to OpenSSL 3.0
ssl/analyzer: potentially fix memory leaks caused by bytestrings
analyzer/ssl: several improvements
analyzer/ssl: defensive key length check + more debug logging
testing: feature gate ssl/decryption test
testing: add ssl/decryption test
analyzer/ssl: handle missing <openssl/kdf.h>
analyzer/ssl: silence warning in DTLS analyzer
analyzer/ssl: move proc-{client,server}-hello into the respective analyzers
...
This addresses feedback on GH-1814. The most significant change is that the
CiphertextRecord can now remain &transient - which might lead to improved
speed.
- This gives the cluster controller and agent the common name "Management
framework" and moves the sources from "policy/frameworks/cluster" to
"policy/frameworks/management". This avoids ambiguity with the existing
cluster framework.
- It renames the "ClusterController" and "ClusterAgent" script modules to
"Management::Controller" and "Management::Agent", respectively. This allows us
to anchor tooling common to both controller and agent at the "Management"
module.
- It moves common configuration settings, logging, requests, types, and
utilities to the common "Management" module.
- It removes the explicit "::Types" submodule (so a request/response result is
now a Management::Result, not a Management::Types::Result), which makes
typenames more readable.
- It updates tests that depend on the module naming and the full set of scripts.
Mostly trivial changes, except for one aspect: if a module exports a record type
and that record bears Zeekygen comments, then redefs that add to the record in
another module cannot be private to that module. Zeekygen will complain with
"unknown target" errors, even when such redefs have Zeekygen comments. So this
commit also adds two export blocks that aren't technically required at this point.
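A hypothetical illustration of the constraint, with the added field invented
for the example:

    module Management;

    export {
        ## A record type documented for Zeekygen.
        type Result: record {
            ## The request identifier.
            reqid: string;
        };
    }

    module Management::Agent;

    # The redef must sit in an export block: if it stays private to
    # this module, Zeekygen reports "unknown target" errors for the
    # new field's documentation.
    export {
        redef record Management::Result += {
            ## An agent-supplied detail field (invented for this
            ## example).
            detail: string &optional;
        };
    }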
* topic/christian/cluster-controller-get-nodes:
Bump external cluster testsuite
Bump zeek-client for the get-nodes command
Add ClusterController::API::get_nodes_request/response event pair
Support optional listening ports for cluster nodes
Don't auto-publish Supervisor response events in the cluster agent
Make members of the ClusterController::Types::State enum all-caps
Be more conservative with triggering request timeout events
Move redefs of ClusterController::Request::Request to their places of use
Simplify ClusterController::API::set_configuration_request/response
This allows querying the status of Zeek nodes currently running in a cluster.
The controller relays the request to all instances and accumulates their
responses.
The response back to the client contains one Result record per instance
response, each of which carries a ClusterController::Types::NodeState vector in
its $data member to convey the state of each node at that instance.
The NodeState record tracks the name of the node, its role in the controller (if
any), its role in the data cluster (if any), as well as PID and listening port,
if any.
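The record's approximate shape, with field names and types inferred from the
description above rather than taken from the source:

    module ClusterController::Types;

    export {
        type NodeState: record {
            node: string;                    ##< Name of the node.
            mgmt_role: string &optional;     ##< Controller/agent role, if any.
            cluster_role: string &optional;  ##< Data-cluster role, if any.
            pid: int &optional;              ##< Process ID, if known.
            p: port &optional;               ##< Listening port, if any.
        };
    }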
This makes cluster node listening ports &optional, and maps absent values to
0/unknown, the value the cluster framework currently uses to indicate that
listening isn't desired.
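The mapping plausibly reduces to a one-liner like this (the variable names are
illustrative; 0/unknown is the cluster framework's "not listening" value):

    # Map an absent &optional port to the framework's 0/unknown.
    local p: port = node?$p ? node$p : 0/unknown;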
This commit changes DPD matching for TLS connections. A one-sided match
is enough to enable DPD now.
This commit also removes DPD for SSLv2 connections. SSLv2 connections
basically no longer happen in the wild. SSLv2 is also really finicky to
identify correctly - there is very little data required to match it, and
basically all matches today will be false positives. If DPD for SSLv2 is
still desired, the optional signature in policy/protocols/ssl/dpd-v2.sig
can be loaded.
Fixes GH-1952
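If the signature file resolves via ZEEKPATH, loading it from a site script
would presumably look like this:

    # Re-enable SSLv2 DPD via the optional signature.
    @load-sigs protocols/ssl/dpd-v2.sig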
It's easier to track outstanding controller/agent requests via a simple set of
pending agent names, and we can remove all of the result aggregation logic since
we can simply re-use the results reported by the agents.
This can serve as a template for request-response patterns where a client's
request triggers a request to all agents, followed by a response to the client
once all agents have responded. Once we have a few more of those, it'll become
clearer how to abstract this further.
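A minimal sketch of the pattern; the record layout, events, and helper names
are hypothetical:

    type ClientRequest: record {
        reqid: string;
        pending_agents: set[string];  # agents yet to respond
        results: vector of string;    # agent results, reused verbatim
    };

    global g_requests: table[string] of ClientRequest;

    global agent_response: event(reqid: string, agent: string,
        result: string);
    global client_response: event(reqid: string,
        results: vector of string);

    event agent_response(reqid: string, agent: string, result: string)
        {
        if ( reqid !in g_requests )
            return;

        local req = g_requests[reqid];
        delete req$pending_agents[agent];
        req$results += result;

        # Once the pending set drains, answer the client with the
        # agents' results as-is - no extra aggregation needed.
        if ( |req$pending_agents| == 0 )
            {
            event client_response(reqid, req$results);
            delete g_requests[reqid];
            }
        }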
This commit refactors TLS decryption, adds more comments in scripts and
in the C++ source code, and removes use of bare pointers, relying instead
on STL data types.
* origin/topic/vern/when-lambda:
explicitly provide the frame for evaluating a "when" timeout expression
attempt to make "when" btest deterministic
tests for new "when" semantics/errors
update existing test suite usage of "when" statements to include captures
update uses of "when" in base scripts to include captures
captures for "when" statements update Triggers to IntrusivePtr's and simpler AST traversal introduce IDSet type, migrate associated "ID*" types to "const ID*"
logic (other than in profiling) for assignments that yield separate values
option for internal use to mark a function type as allowing non-expression returns
removed some now-obsolete profiling functionality
minor commenting clarifications
This changes the agent-controller communication to remove the need for ongoing
pinging of the controller by agents not actively "in service". Instead, agents
now send the notify_agent_hello event to the controller merely to report their
identity. The controller puts them into service via an agent_welcome_request/
response pair, and takes them out of service via agent_standby_request/response.
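The handshake events might be declared along these lines; the event names come
from the text above, while signatures and module placement are assumptions:

    module Management::Agent::API;

    export {
        ## Controller -> agent: take this agent into service.
        global agent_welcome_request: event(reqid: string);
        global agent_welcome_response: event(reqid: string, result: string);

        ## Controller -> agent: stand down from service.
        global agent_standby_request: event(reqid: string);
        global agent_standby_response: event(reqid: string, result: string);
    }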
This removes the on_change handler from the set of agents that is ready for
service, because not every change to this set is now a suitable time to
potentially send out the configuration. We now invoke this check explicitly in
the two situations where it's warranted: when an agent reports ready for service,
and when we've received a new configuration.
This has no practical relevance other than allowing the two to be loaded at the
same time, which some of our (cluster-unrelated) tests require. Absence of
namespacing would trigger symbol clashes at this point.
This now features support for the test_timeout_request/response events, as
supported by the client, and also adds a timeout event for set_configuration, in
case agents do not respond in time.
Includes corresponding zeek-client submodule bump.
This establishes a timeout controlled via ClusterController::request_timeout,
triggering a ClusterController::Request::request_expired event whenever a
timeout rolls around before request state has been finalized by a request's
normal processing.
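In outline, the mechanism presumably resembles the following; the Request
record layout and helper names are hypothetical:

    module ClusterController::Request;

    export {
        type Request: record {
            id: string;
            finished: bool &default=F;
        };

        global request_expired: event(req: Request);
    }

    global g_requests: table[string] of Request;

    function create(id: string): Request
        {
        local req = Request($id=id);
        g_requests[id] = req;

        # Schedule expiration; the handler below is a no-op if normal
        # processing finalized the request first.
        schedule ClusterController::request_timeout { request_expired(req) };

        return req;
        }

    event request_expired(req: Request)
        {
        if ( req$finished )
            return;

        # The request timed out before completion; clean up its state.
        delete g_requests[req$id];
        }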
This changes the basic agent-management model to one in which the configurations
received from the client define not just the data cluster, but also the set of
acceptable instances. Unless connectivity already exists, the controller will
establish peerings with new agents that listen, or wait for ones that connect to
the controller to check in.
Once all required agents are available, the controller triggers the new
notify_agents_ready event, an agent/controller-level "cluster-is-ready"
event. The controller also uses this event to submit a pending config update to
the now-ready instances.
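A rough sketch of the readiness check; the globals and the event signature are
illustrative assumptions:

    # Hypothetical bookkeeping: instances the configuration requires
    # vs. those whose agents have checked in.
    global g_instances_requested: set[string] = set();
    global g_instances_ready: set[string] = set();

    global notify_agents_ready: event(instances: set[string]);

    function check_instances_ready()
        {
        # Fire the cluster-is-ready event once every required agent is
        # available; a pending config update then goes out to them.
        if ( g_instances_ready == g_instances_requested )
            event notify_agents_ready(g_instances_ready);
        }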