The Supervisor generates this event every time it receives a status update from
the stem, meaning a node got created or re-created. A corresponding
SupervisorControl::node_status event relays the same information for users
interacting with the Supervisor over Broker.
* topic/christian/management-cluster-dirs:
Management framework: bump zeek-client to pull in instance serialization fixes
Management framework: bump external cluster testsuite
Management framework: update agent-checkin test to reflect recent changes
Management framework: place each Zeek process in its own working dir
Management framework: set defaults for log rotation and persistent state
Management framework: add spool and state directory config settings
Management framework: establish stdout/stderr files also for cluster nodes
Management framework: default to having agents check in with the (local) controller
Management framework: move role variable from logging into framework-wide config
Management framework: distinguish supervisor/supervisee when loading agent/controller
Management framework: simplify agent and controller stdout/stderr files
Management framework: prefix the management logs with "management-"
Management framework: comment and layouting tweaks, no functional change
Management framework: rename env var that labels agents/controllers
Management framework: increase robustness of agent/controller naming
This establishes a directory "nodes" in Management::state_dir and places each
Zeek process into a subdirectory in it, named after the Zeek process. For
example, node "worker-01" runs with cwd <state_dir>/nodes/worker-01/.
Explicitly configured directories can override the naming logic, and also ignore
the state directory if they're absolute paths. One exception remains: the
Supervisor itself -- we'd have to use LogAscii::logdir to automatically place it
too in its own directory, but that feature currently does not interoperate with
log rotation.
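The directory selection described above can be sketched as follows. This is an illustrative Python model, not the framework's Zeek code: the function name node_working_dir is hypothetical, and resolving a relative explicit directory against the state directory is an assumption.

```python
import posixpath


def node_working_dir(state_dir, node_name, explicit_dir=""):
    """Pick the working directory for a Zeek process (hypothetical helper)."""
    if explicit_dir:
        # An explicitly configured absolute path ignores the state directory.
        if posixpath.isabs(explicit_dir):
            return explicit_dir
        # Assumption: a relative explicit directory is resolved against state_dir.
        return posixpath.join(state_dir, explicit_dir)
    # Default: <state_dir>/nodes/<node name>
    return posixpath.join(state_dir, "nodes", node_name)
```

With a state directory of /var/lib/zeek, node "worker-01" lands in /var/lib/zeek/nodes/worker-01 unless a directory is configured explicitly.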
This adds management/persistence.zeek to establish common configuration for log
rotation and persistent variable state. Log-writing Zeek processes initially
write locally in their working directory, and rotate into subdirectory
"log-queue" of the spool. Since agent and controller have no logger,
persistence.zeek puts in place compatible configurations for them.
Storage folders for Broker-backed tables and clusterized stores default to
subdirectories of the new Zeek-level state folder.
When setting the ZEEK_MANAGEMENT_TESTING environment variable, persistent state
is kept in the local directory, and log rotation remains disabled.
This also tweaks @loads a bit in favor of simply loading frameworks/management,
which is easier to keep track of.
This allows specifying spool and variable-state directories specifically for the
management framework. They default to the corresponding installation-level
folders.
Load the agent/controller bootstrapping code only from the Supervisor, and the
basic config only from a supervisee. When we're neither (which is likely a
mistake), we do nothing.
The fallback mechanism when no explicit agent/controller names are configured
didn't work properly, because many places in the code relied on accessing the
name via the variables meant for explicit configuration, such as
Management::Agent::name. Agent and controller now offer functions for computing
the correct effective name, and we use that throughout.
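A rough Python sketch of the "effective name" idea follows. The function name effective_name and the "<role>-<hostname>" fallback format are assumptions made for illustration; the description above only says that a fallback exists when no explicit name is configured.

```python
import socket


def effective_name(configured_name, role, hostname=None):
    """Return the configured name, or a computed fallback (hypothetical helper)."""
    if configured_name:
        return configured_name
    if hostname is None:
        hostname = socket.gethostname()
    # Assumed fallback format; the real framework may compute this differently.
    return f"{role}-{hostname}"
```

Callers that read the configuration variable directly see an empty string when nothing is configured, which is why going through one such function everywhere matters.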
When passing an empty string as a directory, the function would produce
filenames starting with a slash even when the given file_name is not an absolute
path. Defaulting to the root directory is likely never intended and might
conceivably be dangerous. The middle "/" is now also skipped if dir is an empty
string.
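The fixed behavior can be modeled in a few lines of Python. This is a sketch of the logic as described, not the actual Zeek function; the name join_dir_and_file is hypothetical.

```python
def join_dir_and_file(dir, file_name):
    """Join a directory and a file name without ever defaulting to "/"."""
    # Absolute file names are returned unchanged, and an empty directory
    # no longer contributes a leading slash.
    if file_name.startswith("/") or dir == "":
        return file_name
    return dir + "/" + file_name
```

Before the fix, an empty directory with file name "foo.log" would have produced "/foo.log"; the sketch returns "foo.log" instead.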
* origin/topic/vern/script-profiling:
tidy up after generating profile
test suite updates for refined script coverage, use of new BiF to speed startup
fix for coverage reporting for functions that use "when" statements
new global_options() BiF to speed up startup, plus a micro-preen
hooks for new --profile-scripts option
classes for managing script profiles
address some holes in script coverage
fix for script coverage missing on-exit activity
memory management fixes for loggers
make curr_CPU_time() broadly available rather than just isolated to ZAM
I needed to figure out which exact algorithm we use for our
probabilistic top-k measurements. It turns out that we do not mention
this in our source tree at all so far.
Includes submodule bumps for Broker (to pull in better handling of data
structures that are difficult to unserialize in Python), zeek-client (for the
get-config command), and a commit hash update for the external testsuite.
This adds an optional set of cluster node names to narrow the querying to. It
similarly expands the dispatch mechanism, since it likely makes most sense for any
such request to apply only to a subset of nodes.
Requests for invalid nodes trigger Response records in error state.
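As a minimal Python sketch of the node-subset handling: an empty request set means "all nodes", and unknown names yield error responses. The function name, the dictionary fields, and the error text are illustrative assumptions, not the framework's Response record.

```python
def dispatch_to_nodes(known_nodes, requested=None):
    """Return one response per targeted node (hypothetical model)."""
    # No explicit set of names: apply the request to every known node.
    targets = set(requested) if requested else set(known_nodes)
    responses = []
    for name in sorted(targets):
        if name in known_nodes:
            responses.append({"node": name, "success": True})
        else:
            # Requests for invalid nodes produce a response in error state.
            responses.append({"node": name, "success": False,
                              "error": "invalid node"})
    return responses
```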
When agents receive a configuration, we don't currently honor requested run
states (there's no such thing as registering a node but not running it, for
example). To reflect this, we now start off nodes in state PENDING as we
launch them via the Supervisor, and move them to RUNNING when they check
in with us via Management::Node::API::notify_node_hello.
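The state handling can be pictured with a small Python model. The class and method names are hypothetical; only the PENDING and RUNNING states and the check-in trigger come from the description above.

```python
from enum import Enum


class State(Enum):
    PENDING = "PENDING"
    RUNNING = "RUNNING"


class NodeTracker:
    """Tracks node states on the agent side (illustrative model only)."""

    def __init__(self):
        self.states = {}

    def launch(self, name):
        # Nodes start in PENDING when launched via the Supervisor.
        self.states[name] = State.PENDING

    def on_node_hello(self, name):
        # A check-in from a launched node moves it to RUNNING.
        # Check-ins from unknown nodes are ignored (an assumption).
        if name in self.states:
            self.states[name] = State.RUNNING
```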
This adds support for retrieving the value of a global identifier from any
subset of cluster nodes. It relies on the lookup_ID() BiF to retrieve the val,
and to_json() to render the value to an easily parsed string. Ideally we'd send
the val directly, but this hits several roadblocks, including the fact that
Broker won't serialize arbitrary values.
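The lookup-then-render approach can be sketched in Python, with a plain dictionary standing in for the node's global identifiers and json.dumps standing in for to_json(). The function name, the result fields, and the error handling for unknown identifiers are assumptions.

```python
import json


def get_id_value(globals_table, identifier):
    """Look up a global and render it as a JSON string (hypothetical model)."""
    if identifier not in globals_table:
        # Assumed error handling; not taken from the framework.
        return {"success": False, "error": f"unknown identifier {identifier}"}
    # Sending a string sidesteps serializing arbitrary values over the wire.
    return {"success": True, "data": json.dumps(globals_table[identifier])}
```

The receiver parses the string back with any JSON parser, at the cost of losing the original value's type information.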
This adds request/response event pairs to enable the controller to dispatch
"actions" (pre-implemented Zeek script actions) on subsets of Zeek cluster nodes
and collect the results. Using generic events to carry multiple such "run X on
the nodes" scenarios simplifies adding these in the future.
This provides Broker-level plumbing that allows agents to reach out to their
managed Zeek nodes and collect responses.
As a first event, it establishes Management::Node::API::notify_agent_hello,
to notify the agent when the cluster node is ready to communicate.
Also a bit of comment rewording to replace use of "data cluster" with simply
"cluster", to avoid ambiguity with data nodes in SumStats, and expansion of
test-all-policy.zeek and related/dependent tests, since we're introducing new
scripts.
* origin/topic/vern/table-attr-fixes:
updates for btests - new cases to check, new baselines
updates for btests - new cases to check, new baselines
fix for ill-formed (complex) &default function
type-checking for use of empty table constructors in expressions
catch empty constructors used for type inference; suppress repeated error messages
factoring to make checking of &default attributes externally accessible
bug fix for empty table constructors with &default attributes (plus a typo)
* origin/topic/vern/rec-constr-check:
associated btest
fix base scripts to include mandatory fields in record constructors
restored record constructor checking for missing-but-mandatory fields
Documentation is missing and will be added in the next couple of hours.
* origin/topic/johanna/tls12-decryption: (24 commits)
TLS decryption: add test, fix small issues
Address PR feedback
TLS decryption: refactoring, more comments, less bare pointers
Small code fix and test baseline update.
SSL decryption: refactor TLS12_PRF
SSL decryption: small style changes, a bit of documentation
Deprecation and warning fixes
Clang-format updates
add missing call to EVP_KDF_CTX_set_params
TLS decryption: remove payload from ssl_encrypted_data again.
TLS 1.2 decryption: adapt OpenSSL 3.0 changes for 1.1
ssl: adapt TLS-PRF to openSSL 3.0
ssl/analyzer: potentially fix memory leaks caused by bytestrings
analyzer/ssl: several improvements
analyzer/ssl: defensive key length check + more debug logging
testing: feature gate ssl/decryption test
testing: add ssl/decryption test
analyzer/ssl: handle missing <openssl/kdf.h>
analyzer/ssl: silence warning in DTLS analyzer
analyzer/ssl: move proc-{client,server}-hello into the respective analyzers
...
This addresses feedback to GH-1814. The most significant change is the
fact that the CiphertextRecord can now remain &transient, which might
lead to improved speed.