Mirror/zeek - git.uphillsecurity.com: We code.

mirror of https://github.com/zeek/zeek.git synced 2025-10-02 06:38:20 +00:00

Author	SHA1	Message	Date
Tim Wojtulewicz	535df5e263	Remove deprecated Controller::auto_assign_ports and Controller::auto_assign_start_port	2024-08-07 11:58:21 -07:00
Christian Kreibich	8a4fb0ee19	Management framework: augment deployed configs with instance IP addresses The controller learns IP addresses from agents that peer with it, but that information has so far gotten lost when resulting configs get pushed out to the agents. This makes these updates include that information.	2024-07-08 23:05:24 -07:00
Christian Kreibich	742f7fe340	Management framework: add auto-enumeration of metrics ports This is quite redundant with the enumeration for Broker ports, unfortunately. But the logic is subtly different: all nodes obtain a telemetry port, while not all nodes require a Broker port, for example, and in the metrics port assignment we also cross-check selected Broker ports. I found more unified code actually harder to read in the end. The logic for the two sets remains the same: from a start point, ports get enumerated sequentially that aren't otherwise taken. These ports are assumed available; there's nothing that checks their availability -- for now. The default start port is 9000. I considered 9090, to align with the Prometheus default, but counting upward from there is likely to hit trouble with the Broker default ports (9999/9997), used by the Supervisor. Counting downward is a bit unnatural, and shifting the Broker default ports brings subtle ordering issues. This also changes the node ordering logic slightly since it seems more intuitive to keep sequential ports on a given instance, instead of striping across them.	2024-07-08 23:05:24 -07:00
Christian Kreibich	fa6361af56	Management framework: propagate metrics port from agent This propagates the metrics port from the node config passed through the supervisor all the way into the script layer.	2024-07-08 23:05:24 -07:00
Christian Kreibich	563704a26e	Management framework: add metrics port in management & Supervisor node records This allows setting a metrics port for creation in new nodes.	2024-07-08 23:05:24 -07:00
Christian Kreibich	737b1a2013	Remove the Supervisor's internal ClusterEndpoint struct. This eliminates one place in which we currently need to mirror changes to the script-land Cluster::Node record. Instead of keeping an exact in-core equivalent, the Supervisor now treats the data structure as opaque, and stores the whole cluster table as a JSON string. We may replace the script-layer Supervisor::ClusterEndpoint in the future, using Cluster::Node directly. But that's a more invasive change that will affect how people invoke Supervisor::create() and similar functions. Relying on JSON for serialization has the side-effect of removing the Supervisor's earlier quirk of using 0/tcp, not 0/unknown, to indicate unused ports in the Supervisor::ClusterEndpoint record.	2024-07-02 14:52:17 -07:00
Tim Wojtulewicz	5a3abbe364	Revert "Merge remote-tracking branch 'origin/topic/vern/at-if-analyze'" This reverts commit `4e797ddbbc`, reversing changes made to `3ac28ba5a2`.	2023-05-31 09:20:33 +02:00
Vern Paxson	890010915a	change base scripts to use run-time if's or @if ... &analyze	2023-05-19 13:26:27 -07:00
Josh Soref	21e0d777b3	Spelling fixes: scripts * accessing * across * adding * additional * addresses * afterwards * analyzer * ancillary * answer * associated * attempts * because * belonging * buffer * cleanup * committed * connects * database * destination * destroy * distinguished * encoded * entries * entry * hopefully * image * include * incorrect * information * initial * initiate * interval * into * java * negotiation * nodes * nonexistent * ntlm * occasional * omitted * otherwise * ourselves * paragraphs * particular * perform * received * receiver * referring * release * repetitions * request * responded * retrieval * running * search * separate * separator * should * synchronization * target * that * the * threshold * timeout * transaction * transferred * transmission * triggered * vetoes * virtual Signed-off-by: Josh Soref <2119212+jsoref@users.noreply.github.com>	2022-11-02 17:36:39 -04:00
Christian Kreibich	147283c8f5	Management framework: add websocket support to controller The controller now listens on an additional port, defaulting to 2149, for Broker connections via websockets. Configuration works as for the existing traditional Broker port (2150), via ZEEK_CONTROLLER_WEBSOCKET_ADDR and ZEEK_CONTROLLER_WEBSOCKET_PORT environment variables, as well as corresponding redef'able constants. To disable the websockets feature, leave ZEEK_CONTROLLER_WEBSOCKET_PORT unset and redefine Management::Controller::default_port_websocket to 0/unknown.	2022-10-24 15:59:26 -07:00
Christian Kreibich	e73b561dca	Update Management framework to new Supervisor::NodeConfig script fields	2022-09-02 12:12:19 -07:00
Christian Kreibich	fb733eb664	Management framework: log node set in dispatch requests cleanly Converting to a (sorted) vector both renders the empty set cleanly (without whitespace) and ensures consistent ordering.	2022-08-09 15:12:39 -07:00
Christian Kreibich	7d4dd22aba	Management framework: log additional node events	2022-08-09 15:12:10 -07:00
Christian Kreibich	63291ba2df	Management framework: upon deployment, make agent log multiple node results This erroneously only logged the result of the last node iterated over.	2022-08-09 15:11:31 -07:00
Christian Kreibich	6c3e545306	Management framework: fix early return condition for get-id-value This erroneously used connectedness of instances, not presence of a deployed cluster. Without a deployment, there's no point in trying to retrieve global ID values.	2022-08-09 14:07:16 -07:00
Christian Kreibich	ffebf99bad	Management framework: additional logging tweaks Ensure the framework's log stream exists prior to using it in zeek_init(), and use a node-is-live message similar to those in agent and controller also in launched nodes.	2022-07-12 17:53:35 -07:00
Christian Kreibich	e947e1d1c2	Management framework: additional context in a few log messages This adds request IDs in a few places that didn't mention them, and makes requests to the Supervisor that act on all current nodes explicit.	2022-07-11 13:00:35 -07:00
Christian Kreibich	f6597ffabf	Management framework: await Supervisor peering before sending agent's hello Failing to do so could open a race condition in which a quickly connecting controller could send instructions whose resulting Supervisor interactions got lost.	2022-07-11 13:00:35 -07:00
Christian Kreibich	a505a7814f	Management framework: remove outdated comment The agent has a request_expired timeout handler at this point.	2022-07-11 13:00:35 -07:00
Christian Kreibich	3aa0409792	Management framework: edit pass over docstrings This expands cross-referencing in the doc strings and adds a bit more explanation.	2022-06-22 23:26:11 -07:00
Christian Kreibich	b9879a50a0	Management framework: node restart support This adds restart request/response event pairs that restart nodes in the running Zeek cluster. The implementation is very similar to get_id_value, which also involves distributing a list of nodes to agents and aggregating the responses.	2022-06-22 23:26:11 -07:00
Christian Kreibich	bd39207772	Management framework: more consistent Supervisor interaction in the agent This declares our helper functions for sending events to the Supervisor, and makes them return the created request objects to enable the caller to modify them. It also adds a helper for restart and status requests, uses the helpers throughout the module, and makes all handlers more resilient in case Supervisor events other than the agent's arrive.	2022-06-22 23:26:11 -07:00
Christian Kreibich	d994f33636	Management framework: log the controller's startup deployment attempt The controller now logs its deployment attempt of a persisted configuration at startup. This is generally helpful to see recorded, and also explains timeout of the underlying request in case of failure (which triggers a timeout message).	2022-06-22 23:26:11 -07:00
Christian Kreibich	05447c413f	Management framework: bugfix for a get_id_value corner case For the case of a running cluster with no connected agents, use the g_instances_known table instead of g_instances. The latter reflects the contents of the last deployed config, not the live scenario of actually attached agents.	2022-06-22 23:26:06 -07:00
Christian Kreibich	1af9bba76e	Management framework: minor timeout bugfix The timeout result wasn't actually stored in requests timing out in the agent. (So far that's for deployment requests.) Also log the timing out of any request state, similar to the controller.	2022-06-22 23:25:15 -07:00
Christian Kreibich	b2f9e29bae	Management framework: make "result" argument plural in multi-result response events No functional change, just a consistency tweak. Since agent and controller send response events via Broker::publish(), the arguments aren't named and so this only affects the API definition.	2022-06-22 23:25:15 -07:00
Christian Kreibich	2c1cd1d401	Management framework: rename set_configuration events to stage_configuration This reflects corresponding renaming of the client's set-config command to stage-config, to make it more clear what's happening.	2022-06-22 11:54:58 -07:00
Christian Kreibich	68558e2874	Management framework: trigger deployment when instances are ready More resilience: when an agent restarts, it checks in with the controller. If the controller has deployed a config, this check-in may lead to an internal notify_agents_ready event. At that point, we now trigger a deployment when there currently isn't already one running. This ensures that any agents not yet running the current cluster will start to do so, and does nothing when those agents already run it, since they ignore the request in that case.	2022-06-21 17:22:45 -07:00
Christian Kreibich	a622e28eab	Management framework: more resilient node shutdown upon deployment When agents had to terminate existing Zeek cluster nodes at the beginning of a new deployment, they so far used their internal state to look up the nodes and fired off requests to the Supervisor to shut these down. This has a problem: when an agent restarts unexpectedly, it has no internal state, and when it then tries to create nodes that already exist, the Supervisor complains with error messages. To avoid this, the agent now tears down all Supervised nodes other than agents and controllers. In order to do so, it first needs to query the Supervisor for the current node status, which means there are now two such status requests: one upon deployment, and one during get_nodes requests. In order to disambiguate these contexts in the SupervisorControl::status_request/response transactions, we use the finish() callback in the corresponding request state to continue execution as needed.	2022-06-21 17:22:45 -07:00
Christian Kreibich	1faf1ab8b7	Management framework: re-trigger deployment upon controller launch A resilience feature: when a booting controller has a previously deployed configuration (just reloaded from persistent state), it now triggers a deployment. When agents at this point run something else, this restores the controller's understanding of what's deployed, and if the agents do still run this configuration, does nothing since agents ignore deployment of a configuration they already run.	2022-06-21 17:22:45 -07:00
Christian Kreibich	c4862e7c5e	Management framework: move most deployment handling to internal function The controller now runs most of a config deployment via an internal function, allowing it to be called from multiple places instead of just the deploy_request event handler.	2022-06-21 17:22:45 -07:00
Christian Kreibich	3120fbc75e	Management framework: distinguish internally and externally requested deployments The controller's deployment request state now features a bit that indicates whether the deployment was requested by a client, or triggered internally. This affects logging and the transmission of deployment response events via Broker, which are skipped when the deployment is internal. This is in preparation of resilience features when the controller (re-)boots.	2022-06-21 17:22:45 -07:00
Christian Kreibich	7787d84739	Management framework: track instances by their Broker IDs This allows us to handle loss of Broker peerings, updating instance state as we see instances go away. This also tweaks logging slightly to differentiate between an instance checking in for the first time, and checking in when the controller already knows it.	2022-06-21 17:22:45 -07:00
Christian Kreibich	633535d8da	Management framework: tweak Supervisor event logging We now log Supervisor event interaction just like we do transmission/receipt of other Management framework events.	2022-06-21 17:22:45 -07:00
Christian Kreibich	d7e88fc079	Management framework: make helper function a local	2022-06-21 17:22:45 -07:00
Christian Kreibich	35ea566223	Management framework: rename "log_level" to "level" "Management::Log::log_level" looks redundant.	2022-06-21 17:22:45 -07:00
Christian Kreibich	8bc142f73c	Management framework: add "finish" callback to requests These callbacks are handy for stringing together codepaths separated by event request/response transactions: when such a transaction completes, the callback allows locating a parent request for the finished one, to continue its processing.	2022-06-21 17:22:45 -07:00
Christian Kreibich	a2525e44ba	Management framework: add a helper for rendering result vectors to a string	2022-06-21 17:22:45 -07:00
Christian Kreibich	d367f1bad9	Management framework: agents now skip re-deployment of current config When an agent is already running the configuration it's asked to deploy, it will now recognize this and by default do nothing. The requester can force it if needed, via a new argument to the deploy_request event.	2022-06-21 17:22:45 -07:00
Christian Kreibich	a68ee13939	Management framework: suppress notify_agent_hello upon Supervisor peering The agent's Broker::peer_added handler now recognizes the Supervisor and does not trigger a notify_agent_hello event upon it. It might still send such events repeatedly as other things peer with the agent.	2022-06-21 17:22:45 -07:00
Christian Kreibich	46db4a0e71	Management framework: introduce state machine for configs and persist them The controller now knows three states that a cluster configuration can be in: STAGED (as uploaded by the client), READY (with needed tweaks applied, e.g. to fill in ports), and DEPLOYED (as sent off to agents for deployment). These states aren't exclusive; they represent checkpoints that a config goes through from upload through deployment. A deployed configuration will also exist in its STAGED and READY versions, unless a client has uploaded a new configuration, which will overwrite the STAGED and READY ones. The controller saves all of these in a table, which lets us use Broker to persist all states to disk. We use &broker_allow_complex_type, since we only ever store entire configurations.	2022-06-21 17:22:45 -07:00
Christian Kreibich	77556e9f11	Management framework: introduce deployment API in controller This separates uploading a configuration from deploying it to the instances into separate event transactions. set_configuration_request/response remains, but now only conducts validation and storage of the new configuration (upon validation success, and not yet persisted to disk). The response event indicates success or the list of validation errors. Successful upload now returns the configuration's ID in the result record's data struct. The new deploy_request/response event takes a previously uploaded configuration and deploys it to the agents. The controller now tracks uploaded and deployed configurations separately. Uploading assigns g_config_staged; deployment assigns g_config_deployed. Deployment does not affect g_config_staged. The get_config_request/response event pair now allows selecting the configuration the caller would like to retrieve.	2022-06-21 17:22:45 -07:00
Christian Kreibich	0480b5f39c	Management framework: rename agent "set_configuration" to "deploy" This renames the agent's functionality for setting a configuration to reflect the controller's upcoming separation of set_configuration and deployment.	2022-06-21 17:22:45 -07:00
Christian Kreibich	f353ac22a5	Management framework: consistency fixes to the Result record The instance and error fields are now optional instead of defaulting to empty strings, which caused minor output deviations in the client. Agents now ensure that any Result record they create has the instance field filled in.	2022-06-21 17:22:45 -07:00
Christian Kreibich	3ac5fdfc59	Management framework: trivial changes and comment-only rewording	2022-06-21 17:22:45 -07:00
Christian Kreibich	d6042cf516	Management framework: add config validation During `set_configuration_request` handling the controller now validates received configurations, checking for a few common gotchas around naming and port use. Validation continues once it finds a problem, resulting in a list summarizing all identified problems.	2022-06-19 01:20:16 -07:00
Christian Kreibich	620db4d4eb	Management framework: improvements to port auto-enumeration The numbering process now accounts for the possibility of colliding with the agent port, as well as with ports explicitly assigned in the configuration. It also avoids nondeterminism that could result from traversal of sets.	2022-06-19 01:19:54 -07:00
Christian Kreibich	0c20f16055	Management framework: control output-to-console in Supervisor It helps during testing to be able to control whether the Supervisor process also routes node output to the console, in addition to writing to output files. Since the Supervisor runs as the main process in Docker containers, its output becomes visible in "docker logs" that way, simplifying diagnostics.	2022-06-19 01:19:54 -07:00
Christian Kreibich	5592beaf31	Management framework: handle no-instances corner case in set-config correctly When the controller receives a configuration with no instances (and thus no nodes), it does not need to roundtrip to agents and can send the response right away.	2022-06-19 01:19:47 -07:00
Christian Kreibich	a3fcd1462d	Management framework: make agents support zeek-archiver invocations This makes agents handle log archival automatically. By default, they invoke zeek-archiver once every log rotation interval to archive rotated files from the log-queue spool directory into the installation's log directory. The user can disable the feature, customize the command to invoke, and adjust the rotation interval.	2022-06-14 12:32:17 -07:00
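
The port auto-enumeration described in 742f7fe340 and 620db4d4eb (count upward from a start port, skip ports that are otherwise taken, traverse nodes in a deterministic order, keep sequential ports on one instance) can be illustrated roughly as follows. This is a minimal Python sketch of the idea only, not the framework's Zeek script code: the function name `enumerate_ports`, the dict-based node records, and the `reserved` parameter are all hypothetical.

```python
def enumerate_ports(nodes, start_port, reserved=()):
    """Assign a port to every node that lacks one (hypothetical helper).

    nodes: list of dicts with "name", "instance", and optionally "port".
    reserved: ports that must never be handed out, e.g. an agent port.
    """
    # Ports explicitly set in the configuration count as taken, too.
    taken = set(reserved) | {n["port"] for n in nodes if n.get("port") is not None}
    assigned = {}
    port = start_port
    # Sorting by (instance, name) makes the result deterministic and keeps
    # consecutive ports together on one instance instead of striping.
    for node in sorted(nodes, key=lambda n: (n["instance"], n["name"])):
        if node.get("port") is not None:
            assigned[node["name"]] = node["port"]
            continue
        while port in taken:
            port += 1
        assigned[node["name"]] = port
        taken.add(port)
        port += 1
    return assigned


nodes = [
    {"name": "worker-b", "instance": "inst2"},
    {"name": "worker-a", "instance": "inst1"},
    {"name": "manager", "instance": "inst1", "port": 9001},
]
ports = enumerate_ports(nodes, start_port=9000, reserved={9002})
```

As in the commit message, the sketch only assumes the chosen ports are available; nothing checks that they can actually be bound.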


88 commits
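
The configuration checkpoints introduced in 46db4a0e71 and 77556e9f11 (STAGED, READY, DEPLOYED, stored together in one table) might be modeled along these lines. This is a hedged Python sketch, not the controller's implementation: `ConfigStore`, `stage`, `deploy`, and the `make_ready` callback are hypothetical names, and computing the READY version at upload time is an assumption made for brevity.

```python
import copy
from enum import Enum


class ConfigState(Enum):
    STAGED = "staged"      # as uploaded by the client
    READY = "ready"        # with needed tweaks applied, e.g. ports filled in
    DEPLOYED = "deployed"  # as sent off to agents


class ConfigStore:
    """One table keyed by state, so all versions can be persisted together."""

    def __init__(self):
        self.configs = {}

    def stage(self, config, make_ready):
        # A new upload overwrites STAGED and READY but leaves DEPLOYED alone.
        self.configs[ConfigState.STAGED] = copy.deepcopy(config)
        self.configs[ConfigState.READY] = make_ready(copy.deepcopy(config))

    def deploy(self):
        # Deployment checkpoints the READY config; STAGED is not affected.
        if ConfigState.READY not in self.configs:
            raise RuntimeError("no configuration has been staged")
        deployed = copy.deepcopy(self.configs[ConfigState.READY])
        self.configs[ConfigState.DEPLOYED] = deployed
        return deployed


def fill_port(config):
    # Hypothetical "tweak": fill in a port if the client left it unset.
    config.setdefault("port", 2200)
    return config


store = ConfigStore()
store.stage({"id": "a"}, fill_port)
store.deploy()
store.stage({"id": "b"}, fill_port)
```

After the second upload, the table holds configuration "b" in its STAGED and READY slots while "a" remains the DEPLOYED one, which matches the non-exclusive checkpoints the commit message describes.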