Mirror/zeek - git.uphillsecurity.com: We code.

mirror of https://github.com/zeek/zeek.git synced 2025-10-02 06:38:20 +00:00

Author	SHA1	Message	Date
Christian Kreibich	742f7fe340	Management framework: add auto-enumeration of metrics ports This is quite redundant with the enumeration for Broker ports, unfortunately. But the logic is subtly different: all nodes obtain a telemetry port, while not all nodes require a Broker port, for example, and in the metrics port assignment we also cross-check selected Broker ports. I found more unified code actually harder to read in the end. The logic for the two sets remains the same: from a start point, ports get enumerated sequentially that aren't otherwise taken. These ports are assumed available; there's nothing that checks their availability -- for now. The default start port is 9000. I considered 9090, to align with the Prometheus default, but counting upward from there is likely to hit trouble with the Broker default ports (9999/9997), used by the Supervisor. Counting downward is a bit unnatural, and shifting the Broker default ports brings subtle ordering issues. This also changes the node ordering logic slightly since it seems more intuitive to keep sequential ports on a given instance, instead of striping across them.	2024-07-08 23:05:24 -07:00
Josh Soref	21e0d777b3	Spelling fixes: scripts * accessing * across * adding * additional * addresses * afterwards * analyzer * ancillary * answer * associated * attempts * because * belonging * buffer * cleanup * committed * connects * database * destination * destroy * distinguished * encoded * entries * entry * hopefully * image * include * incorrect * information * initial * initiate * interval * into * java * negotiation * nodes * nonexistent * ntlm * occasional * omitted * otherwise * ourselves * paragraphs * particular * perform * received * receiver * referring * release * repetitions * request * responded * retrieval * running * search * separate * separator * should * synchronization * target * that * the * threshold * timeout * transaction * transferred * transmission * triggered * vetoes * virtual Signed-off-by: Josh Soref <2119212+jsoref@users.noreply.github.com>	2022-11-02 17:36:39 -04:00
Christian Kreibich	147283c8f5	Management framework: add websocket support to controller The controller now listens on an additional port, defaulting to 2149, for Broker connections via websockets. Configuration works as for the existing traditional Broker port (2150), via ZEEK_CONTROLLER_WEBSOCKET_ADDR and ZEEK_CONTROLLER_WEBSOCKET_PORT environment variables, as well as corresponding redef'able constants. To disable the websockets feature, leave ZEEK_CONTROLLER_WEBSOCKET_PORT unset and redefine Management::Controller::default_port_websocket to 0/unknown.	2022-10-24 15:59:26 -07:00
Christian Kreibich	e73b561dca	Update Management framework to new Supervisor::NodeConfig script fields	2022-09-02 12:12:19 -07:00
Christian Kreibich	6c3e545306	Management framework: fix early return condition for get-id-value This erroneously used connectedness of instances, not presence of a deployed cluster. Without a deployment, there's no point in trying to retrieve global ID values.	2022-08-09 14:07:16 -07:00
Christian Kreibich	e947e1d1c2	Management framework: additional context in a few log messages This adds request IDs in a few places that didn't mention them, and makes requests to the Supervisor that act on all current nodes explicit.	2022-07-11 13:00:35 -07:00
Christian Kreibich	3aa0409792	Management framework: edit pass over docstrings This expands cross-referencing in the doc strings and adds a bit more explanation.	2022-06-22 23:26:11 -07:00
Christian Kreibich	b9879a50a0	Management framework: node restart support This adds restart request/response event pairs that restart nodes in the running Zeek cluster. The implementation is very similar to get_id_value, which also involves distributing a list of nodes to agents and aggregating the responses.	2022-06-22 23:26:11 -07:00
Christian Kreibich	d994f33636	Management framework: log the controller's startup deployment attempt The controller now logs its deployment attempt of a persisted configuration at startup. This is generally helpful to see recorded, and also explains timeout of the underlying request in case of failure (which triggers a timeout message).	2022-06-22 23:26:11 -07:00
Christian Kreibich	05447c413f	Management framework: bugfix for a get_id_value corner case For the case of a running cluster with no connected agents, use the g_instances_known table instead of g_instances. The latter reflects the contents of the last deployed config, not the live scenario of actually attached agents.	2022-06-22 23:26:06 -07:00
Christian Kreibich	b2f9e29bae	Management framework: make "result" argument plural in multi-result response events No functional change, just a consistency tweak. Since agent and controller send response events via Broker::publish(), the arguments aren't named and so this only affects the API definition.	2022-06-22 23:25:15 -07:00
Christian Kreibich	2c1cd1d401	Management framework: rename set_configuration events to stage_configuration This reflects corresponding renaming of the client's set-config command to stage-config, to make it more clear what's happening.	2022-06-22 11:54:58 -07:00
Christian Kreibich	68558e2874	Management framework: trigger deployment when instances are ready More resilience: when an agent restarts, it checks in with the controller. If the controller has deployed a config, this check-in may lead to an internal notify_agents_ready event. At that point, we now trigger a deployment if one isn't already running. This ensures that any agents not yet running the current cluster will start to do so, and does nothing when those agents already run it, since they ignore the request in that case.	2022-06-21 17:22:45 -07:00
Christian Kreibich	1faf1ab8b7	Management framework: re-trigger deployment upon controller launch A resilience feature: when a booting controller has a previously deployed configuration (just reloaded from persistent state), it now triggers a deployment. When agents at this point run something else, this restores the controller's understanding of what's deployed, and if the agents do still run this configuration, does nothing since agents ignore deployment of a configuration they already run.	2022-06-21 17:22:45 -07:00
Christian Kreibich	c4862e7c5e	Management framework: move most deployment handling to internal function The controller now runs most of a config deployment via an internal function, allowing it to be called from multiple places instead of just the deploy_request event handler.	2022-06-21 17:22:45 -07:00
Christian Kreibich	3120fbc75e	Management framework: distinguish internally and externally requested deployments The controller's deployment request state now features a bit that indicates whether the deployment was requested by a client, or triggered internally. This affects logging and the transmission of deployment response events via Broker, which are skipped when the deployment is internal. This is in preparation of resilience features when the controller (re-)boots.	2022-06-21 17:22:45 -07:00
Christian Kreibich	7787d84739	Management framework: track instances by their Broker IDs This allows us to handle loss of Broker peerings, updating instance state as we see instances go away. This also tweaks logging slightly to differentiate between an instance checking in for the first time, and checking in when the controller already knows it.	2022-06-21 17:22:45 -07:00
Christian Kreibich	d7e88fc079	Management framework: make helper function a local	2022-06-21 17:22:45 -07:00
Christian Kreibich	a2525e44ba	Management framework: add a helper for rendering result vectors to a string	2022-06-21 17:22:45 -07:00
Christian Kreibich	d367f1bad9	Management framework: agents now skip re-deployment of current config When an agent is already running the configuration it's asked to deploy, it will now recognize this and by default do nothing. The requester can force it if needed, via a new argument to the deploy_request event.	2022-06-21 17:22:45 -07:00
Christian Kreibich	46db4a0e71	Management framework: introduce state machine for configs and persist them The controller now knows three states that a cluster configuration can be in: STAGED (as uploaded by the client), READY (with needed tweaks applied, e.g. to fill in ports), and DEPLOYED (as sent off to agents for deployment). These states aren't exclusive; they represent checkpoints that a config goes through from upload through deployment. A deployed configuration will also exist in its STAGED and READY versions, unless a client has uploaded a new configuration, which will overwrite the STAGED and READY ones. The controller saves all of these in a table, which lets us use Broker to persist all states to disk. We use &broker_allow_complex_type, since we only ever store entire configurations.	2022-06-21 17:22:45 -07:00
Christian Kreibich	77556e9f11	Management framework: introduce deployment API in controller This separates uploading a configuration from deploying it to the instances into separate event transactions. set_configuration_request/response remains, but now only conducts validation and storage of the new configuration (upon validation success, and not yet persisted to disk). The response event indicates success or the list of validation errors. Successful upload now returns the configuration's ID in the result record's data struct. The new deploy_request/response event takes a previously uploaded configuration and deploys it to the agents. The controller now tracks uploaded and deployed configurations separately. Uploading assigns g_config_staged; deployment assigns g_config_deployed. Deployment does not affect g_config_staged. The get_config_request/response event pair now allows selecting the configuration the caller would like to retrieve.	2022-06-21 17:22:45 -07:00
Christian Kreibich	0480b5f39c	Management framework: rename agent "set_configuration" to "deploy" This renames the agent's functionality for setting a configuration to reflect the controller's upcoming separation of set_configuration and deployment.	2022-06-21 17:22:45 -07:00
Christian Kreibich	3ac5fdfc59	Management framework: trivial changes and comment-only rewording	2022-06-21 17:22:45 -07:00
Christian Kreibich	d6042cf516	Management framework: add config validation During `set_configuration_request` handling the controller now validates received configurations, checking for a few common gotchas around naming and port use. Validation continues once it finds a problem, resulting in a list summarizing all identified problems.	2022-06-19 01:20:16 -07:00
Christian Kreibich	620db4d4eb	Management framework: improvements to port auto-enumeration The numbering process now accounts for the possibility of colliding with the agent port, as well as with ports explicitly assigned in the configuration. It also avoids nondeterminism that could result from traversal of sets.	2022-06-19 01:19:54 -07:00
Christian Kreibich	5592beaf31	Management framework: handle no-instances corner case in set-config correctly When the controller receives a configuration with no instances (and thus no nodes), it needs no roundtrip to agents and can send the response right away.	2022-06-19 01:19:47 -07:00
Christian Kreibich	64741b571e	Management framework: switch default network visibilities Up to now, agents and controllers listened locally only, and the Supervisor (which listens when we run an agent) listened globally. It's now the other way around: controllers and agents listen globally and the Supervisor, when listening, does so locally.	2022-06-08 15:00:19 -07:00
Christian Kreibich	9b4841912c	Management framework: also use send_set_configuration_response_error elsewhere	2022-06-07 13:42:07 -07:00
Christian Kreibich	ccf3c24e23	Management framework: minor log formatting tweak, for consistency	2022-06-07 13:41:47 -07:00
Christian Kreibich	7a471df1a1	Management framework: support auto-assignment of ports in cluster nodes This enables the controller to assign listening ports to managers, loggers, and proxies. (We don't currently make the workers listen.) The feature is controlled by the Management::Controller::auto_assign_ports flag. When enabled (the default), enumeration starts from Management::Controller::auto_assign_start_port, beginning with the manager, then the logger(s), then proxy(s). When the feature is disabled and nodes that require a port lack it, the controller rejects the configuration.	2022-06-07 13:38:04 -07:00
Christian Kreibich	c53044981a	Management framework: improve address and port handling The get-nodes command also benefits from showing the state on connected agents more broadly (as opposed to just the one for the current configuration). Also a bugfix: ensure we use an agent's IP address as seen by the controller. This avoids reporting "0.0.0.0" in some cases.	2022-06-03 02:14:07 -07:00
Christian Kreibich	0c47d45bb9	Management framework: broaden get_instances response data to connected instances This response so far contained only the connected instances that are relevant to the current configuration, but this isn't very helpful when troubleshooting instance connectivity. It now reports all currently connected instances, with network addresses & ports as known to Broker.	2022-06-03 02:13:30 -07:00
Christian Kreibich	72acf24f52	Management framework: expand notify_agent_hello event arguments This swaps the host event argument for the Broker ID. The latter is more useful, since the sending agent doesn't necessarily know its IP address as visible to the controller, and the controller can pull up the full Broker context via the ID. It also adds an explicit argument to the event to indicate whether the agent connected to the controller or vice versa. This simplifies the controller's internal logic. Also minor tweaks to logging to show Broker IDs.	2022-06-03 02:12:19 -07:00
Christian Kreibich	aa689807fa	Management framework: comment-only tweaks and typo fixes	2022-06-03 02:12:12 -07:00
Christian Kreibich	f10b94de39	Management framework: enable stdout/stderr reporting This uses the new frameworks/management/supervisor functionality to maintain stdout/stderr files, and hooks output context into set_configuration error results.	2022-05-31 12:55:21 -07:00
Christian Kreibich	49b9f1669c	Management framework: move to ResultVec in agent's set_configuration response We so far reported one result record per agent, which made it hard to report per-node outcomes for the new configuration. Agents now report one result record per node they're responsible for.	2022-05-31 12:55:21 -07:00
Christian Kreibich	83c60fd8ac	Management framework: tune request timeout granularity and interval When the controller relays requests to agents, we want agents to time out more quickly than the corresponding controller requests. This allows agents to respond with more meaningful errors, while the controller's timeout acts mostly as a last resort to ensure a response to the client actually happens. This dials down the table_expire_interval to 2 seconds in both agent and controller, for more predictable timeout behavior. It also dials the agent-side request expiration interval down to 5 seconds, compared to the controller's 10 seconds. We may have to revisit this to allow custom expiration intervals per request/response message type.	2022-05-31 12:55:21 -07:00
Christian Kreibich	c922f749c5	Management framework: a bit of debug-level logging for troubleshooting	2022-05-31 12:55:21 -07:00
Christian Kreibich	93ea03a081	Management framework: place each Zeek process in its own working dir This establishes a directory "nodes" in Management::state_dir and places each Zeek process into a subdirectory in it, named after the Zeek process. For example, node "worker-01" runs with cwd <state_dir>/nodes/worker-01/. Explicitly configured directories can override the naming logic, and also ignore the state directory if they're absolute paths. One exception remains: the Supervisor itself -- we'd have to use LogAscii::logdir to automatically place it too in its own directory, but that feature currently does not interoperate with log rotation.	2022-05-26 12:56:02 -07:00
Christian Kreibich	d1cd409e59	Management framework: set defaults for log rotation and persistent state This adds management/persistence.zeek to establish common configuration for log rotation and persistent variable state. Log-writing Zeek processes initially write locally in their working directory, and rotate into subdirectory "log-queue" of the spool. Since agent and controller have no logger, persistence.zeek puts in place compatible configurations for them. Storage folders for Broker-backed tables and clusterized stores default to subdirectories of the new Zeek-level state folder. When setting the ZEEK_MANAGEMENT_TESTING environment variable, persistent state is kept in the local directory, and log rotation remains disabled. This also tweaks @loads a bit in favor of simply loading frameworks/management, which is easier to keep track of.	2022-05-26 12:55:10 -07:00
Christian Kreibich	e305d9c613	Management framework: establish stdout/stderr files also for cluster nodes	2022-05-25 13:56:23 -07:00
Christian Kreibich	b96a4276eb	Management framework: move role variable from logging into framework-wide config The role isn't just about logging, it can also act as a general indicator to key in on in role-specific code elsewhere, such as @if.	2022-05-25 13:56:23 -07:00
Christian Kreibich	e78fdc39e4	Management framework: distinguish supervisor/supervisee when loading agent/controller Load the agent/controller bootstrapping code only from the Supervisor, and the basic config only from a supervisee. When we're neither (which is likely a mistake), we do nothing.	2022-05-25 13:56:23 -07:00
Christian Kreibich	d40bb6e85f	Management framework: simplify agent and controller stdout/stderr files Moving to a model in which every Zeek process runs out of its own working directory simplifies the handling of those files.	2022-05-25 13:56:23 -07:00
Christian Kreibich	bd6c1683a2	Management framework: comment and layouting tweaks, no functional change Also remove additional instances of the term "data cluster".	2022-05-25 13:56:23 -07:00
Christian Kreibich	d4d6f10299	Management framework: rename env var that labels agents/controllers Just a consistency tweak to avoid confusion with "cluster".	2022-05-25 13:56:23 -07:00
Christian Kreibich	d2903bb645	Management framework: increase robustness of agent/controller naming The fallback mechanism when no explicit agent/controller names are configured didn't work properly, because many places in the code relied on accessing the name via the variables meant for explicit configuration, such as Management::Agent::name. Agent and controller now offer functions for computing the correct effective name, and we use that throughout.	2022-05-25 13:56:23 -07:00
Christian Kreibich	001de561fc	Management framework: add get_configuration_request/response transaction Includes submodule bumps for Broker (to pull in better handling of data structures that are difficult to unserialize in Python), zeek-client (for the get-config command), and a commit hash update for the external testsuite.	2022-05-05 16:09:21 -07:00
Christian Kreibich	b23d292410	Management framework: consistency fixes around event() vs Broker::publish() Switch to using Broker::publish() for any event we only send to a peered entity, and not to drive local processing. Also minor indentation cleanup.	2022-04-26 23:23:58 -07:00
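
The port auto-enumeration described in 742f7fe340 and 620db4d4eb (ports enumerated sequentially from a start point, skipping any that are already taken) can be sketched roughly as follows. This is an illustrative Python sketch, not the framework's Zeek script code: the function name, the node names, the Broker start port 2200, and the explicitly assigned port 2201 are assumptions made for the example. Only the metrics start port 9000 and the Broker default ports 9999/9997 come from the commit messages.

```python
from typing import Dict, Iterable, List, Set


def enumerate_ports(nodes: List[str], start: int, taken: Iterable[int]) -> Dict[str, int]:
    """Give each node the next sequential port, from `start`, that is not taken.

    Nodes are processed in list order, so the result is deterministic.
    Ports are only checked against `taken`; nothing probes real availability.
    """
    used: Set[int] = set(taken)
    assigned: Dict[str, int] = {}
    port = start
    for name in nodes:
        while port in used:  # skip ports claimed elsewhere
            port += 1
        assigned[name] = port
        used.add(port)
        port += 1
    return assigned


# Hypothetical cluster layout: only some nodes need a Broker port,
# while every node receives a metrics port.
broker_nodes = ["manager", "logger-01", "proxy-01"]
all_nodes = broker_nodes + ["worker-01", "worker-02"]

# Assumed: Broker enumeration starts at 2200, and 2201 is already
# assigned explicitly in the configuration.
broker_ports = enumerate_ports(broker_nodes, start=2200, taken={2201})

# Metrics enumeration starts at 9000 and cross-checks the selected
# Broker ports as well as the Broker defaults 9999/9997.
metrics_taken = set(broker_ports.values()) | {9997, 9999}
metrics_ports = enumerate_ports(all_nodes, start=9000, taken=metrics_taken)

print(broker_ports)
print(metrics_ports)
```

Keeping two calls to one helper mirrors the point made in 742f7fe340 that the two enumerations share their core logic but differ in which nodes take part and which ports they must avoid.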

55 commits