Commit graph

2937 commits

Author SHA1 Message Date
Christian Kreibich
9b4841912c Management framework: also use send_set_configuration_response_error elsewhere 2022-06-07 13:42:07 -07:00
Christian Kreibich
ccf3c24e23 Management framework: minor log formatting tweak, for consistency 2022-06-07 13:41:47 -07:00
Christian Kreibich
7a471df1a1 Management framework: support auto-assignment of ports in cluster nodes
This enables the controller to assign listening ports to managers, loggers, and
proxies. (We don't currently make the workers listen.) The feature is controlled
by the Management::Controller::auto_assign_ports flag. When enabled (the
default), enumeration starts from Management::Controller::auto_assign_start_port,
beginning with the manager, then the logger(s), then proxy(s). When the feature
is disabled and nodes that require a port lack it, the controller rejects the
configuration.
2022-06-07 13:38:04 -07:00
Tim Wojtulewicz
41aa8b2349 Merge remote-tracking branch 'origin/topic/christian/is_used_in_netcontrol_sumstats'
* origin/topic/christian/is_used_in_netcontrol_sumstats:
  Additional &is_used tags in the Netcontrol and Sumstats frameworks
2022-06-03 09:50:54 -07:00
Tim Wojtulewicz
febdc97f09 Merge remote-tracking branch 'origin/topic/christian/management-instance-handling'
* origin/topic/christian/management-instance-handling:
  Management framework: bump zeek-client to pull in rendering tweaks
  Management framework: bump external cluster testsuite
  Management framework: improve address and port handling
  Management framework: broaden get_instances response data to connected instances
  Management framework: expand notify_agent_hello event arguments
  Management framework: comment-only tweaks and typo fixes
2022-06-03 09:50:21 -07:00
Christian Kreibich
c53044981a Management framework: improve address and port handling
The get-nodes command also benefits from showing the state on connected agents
more broadly (as opposed to just the one for the current configuration).

Also a bugfix: ensure we use an agent's IP address as seen by the
controller. This avoids reporting "0.0.0.0" in some cases.
2022-06-03 02:14:07 -07:00
Christian Kreibich
0c47d45bb9 Management framework: broaden get_instances response data to connected instances
This response so far contained only the connected instances that are relevant to
the current configuration, but this isn't very helpful when troubleshooting
instance connectivity. It now reports all currently connected instances, with
network addresses & ports as known to Broker.
2022-06-03 02:13:30 -07:00
Christian Kreibich
72acf24f52 Management framework: expand notify_agent_hello event arguments
This swaps the host event argument for the Broker ID. The latter is more useful,
since the sending agent doesn't necessarily know its IP address as visible to
the controller, and the controller can pull up the full Broker context via the
ID.

It also adds an explicit argument to the event to indicate whether the agent
connected to the controller or vice versa. This simplifies the controller's
internal logic.

Also minor tweaks to logging to show Broker IDs.
2022-06-03 02:12:19 -07:00
Christian Kreibich
aa689807fa Management framework: comment-only tweaks and typo fixes 2022-06-03 02:12:12 -07:00
Christian Kreibich
edef3736fb Additional &is_used tags in the Netcontrol and Sumstats frameworks
When running a cluster, these functions only get called in select node types and
could trigger no-caller warnings on stderr.
2022-06-02 22:57:07 -07:00
Tim Wojtulewicz
535a6013aa Merge remote-tracking branch 'zeek-as-org/as-org'
* zeek-as-org/as-org:
  Mark lookup_asn() BIF as deprecated in v6.1
  Define geo_autonomous_system record type
  Add lookup_autonomous_system() BIF that returns AS number and org
2022-06-02 16:59:29 -07:00
Christian Kreibich
1cebdd569d Merge branch 'topic/christian/gh-2134-fix-intel-test-races'
* topic/christian/gh-2134-fix-intel-test-races:
  Expand scripts.base.frameworks.intel.cluster-transparency test
  Fix races in scripts.base.frameworks.intel.cluster-transparency-with-proxy test
  Add Intel::send_store_on_node_up boolean to control min_data_store delivery
2022-06-02 12:20:06 -07:00
Robin Sommer
d99f041ac5
Add WebSocket support for exchanging events with external clients.
This exposes Broker's new WebSocket support in Zeek. To enable it,
call `Broker::listen_websocket()`. Zeek will then start listening on
port 9997 for incoming WebSocket connections.

See the Broker documentation for a description of the message format
expected over these WebSocket connections.
2022-06-02 10:31:52 +02:00
Phil Rzewski
c08fe7c237 Define geo_autonomous_system record type 2022-06-01 18:39:07 -07:00
Christian Kreibich
892a3a8452 Add Intel::send_store_on_node_up boolean to control min_data_store delivery
This adds a redefinable const to the internals of the Intel framework, to allow
suppression of the manager sending its current min_data_store when a worker
connects. This feature is desirable for nodes that check in "late" to bring them
up to speed, but during testing it introduces nondeterminism.
2022-06-01 17:45:19 -07:00
Christian Kreibich
f10b94de39 Management framework: enable stdout/stderr reporting
This uses the new frameworks/management/supervisor functionality to maintain
stdout/stderr files, and hooks output context into set_configuration error
results.
2022-05-31 12:55:21 -07:00
Christian Kreibich
24a495da42 Management framework: Supervisor extensions for stdout/stderr handling
This improves the framework's handling of Zeek node stdout and stderr by
extending the (script-layer) Supervisor functionality.

- The Supervisor _either_ directs Zeek nodes' stdout/stderr to files _or_ lets
you hook into it at the script level. We'd like both: files make sense to allow
inspection outside of the framework, and the framework would benefit from
tapping into the streams e.g. for error context. We now provide the file
redirection functionality in the Supervisor, in addition to the hook
mechanism. The hook mechanism also builds up rolling windows of up to
100 lines (configurable) into stdout/stderr.

- The new Mangement::Supervisor::API::notify_node_exit event notifies
subscribers (agents, really) that a particular node has exited (and is possibly
being restarted by the Supervisor). The event includes the name of the node,
plus its recent stdout/stderr context.
2022-05-31 12:55:21 -07:00
Christian Kreibich
f74f21767a Management framework: disambiguate redef field names in agent and controller
During Zeekygen's doc generation both the agent's and controller's main.zeek get
loaded. This just happened to not throw errors so far because the redefs either
matched perfectly or used different field names.
2022-05-31 12:55:21 -07:00
Christian Kreibich
49b9f1669c Management framework: move to ResultVec in agent's set_configuration response
We so far reported one result record per agent, which made it hard to report
per-node outcomes for the new configuration. Agents now report one result record
per node they're responsible for.
2022-05-31 12:55:21 -07:00
Christian Kreibich
83c60fd8ac Management framework: tune request timeout granularity and interval
When the controller relays requests to agents, we want agents to time out more
quickly than the corresponding controller requests. This allows agents to
respond with more meaningful errors, while the controller's timeout acts mostly
as a last resort to ensure a response to the client actually happens.

This dials down the table_expire_interval to 2 seconds in both agent and
controller, for more predictable timeout behavior. It also dials the agent-side
request expiration interval down to 5 seconds, compared to the agent's 10
seconds.

We may have to revisit this to allow custom expiration intervals per
request/response message type.
2022-05-31 12:55:21 -07:00
Christian Kreibich
4371c17d4c Management framework: verify node starts when deploying a configuration
We so far hoped for the best when an agent asked the Supervisor to launch a
node. Since the Management::Node::API::notify_node_hello events arriving from
new nodes signal when such nodes are up and running, we can use those events to
track once/whether all launched nodes have checked in, and respond accordingly.

This delays the set_configuration_response event until these checkins have
occurred, or a timeout kicks in. In case of error, the agent's response to the
controller is in error state and has the remaining, unresponsive/failed  set of
nodes as its data member.
2022-05-31 12:55:21 -07:00
Christian Kreibich
c922f749c5 Management framework: a bit of debug-level logging for troubleshooting 2022-05-31 12:55:21 -07:00
Christian Kreibich
93bed5a261 Merge branch 'topic/christian/node-status-notification'
* topic/christian/node-status-notification:
  Add Supervisor::node_status notification event
2022-05-31 12:53:18 -07:00
Christian Kreibich
14188fc7a7 Add Supervisor::node_status notification event
The Supervisor generates this event every time it receives a status update from
the stem, meaning a node got created or re-created. A corresponding
SupervisorControl::node_status event relays the same information for users
interacting with the Supervisor over Broker.
2022-05-30 21:36:35 -07:00
Vern Paxson
07cf5cb089 deprecation messages for unused base script functions 2022-05-27 14:36:30 -07:00
Vern Paxson
6dc711c39e annotate orphan base script components with &deprecated 2022-05-26 17:39:17 -07:00
Vern Paxson
9b8ac44169 annotate base scripts with &is_used as needed 2022-05-26 17:39:17 -07:00
Christian Kreibich
415bbe17d6 Merge branch 'topic/christian/management-cluster-dirs'
* topic/christian/management-cluster-dirs:
  Management framework: bump zeek-client to pull in instance serialization fixes
  Management framework: bump external cluster testsuite
  Management framework: update agent-checkin test to reflect recent changes
  Management framework: place each Zeek process in its own working dir
  Management framework: set defaults for log rotation and persistent state
  Management framework: add spool and state directory config settings
  Management framework: establish stdout/stderr files also for cluster nodes
  Management framework: default to having agents check in with the (local) controller
  Management framework: move role variable from logging into framework-wide config
  Management framework: distinguish supervisor/supervisee when loading agent/controller
  Management framework: simplify agent and controller stdout/stderr files
  Management framework: prefix the management logs with "management-"
  Management framework: comment and layouting tweaks, no functional change
  Management framework: rename env var that labels agents/controllers
  Management framework: increase robustness of agent/controller naming
2022-05-26 16:10:14 -07:00
Christian Kreibich
93ea03a081 Management framework: place each Zeek process in its own working dir
This establishes a directory "nodes" in Management::state_dir and places each
Zeek process into a subdirectory in it, named after the Zeek process. For
example, node "worker-01" runs with cwd <state_dir>/nodes/worker-01/.

Explicitly configured directories can override the naming logic, and also ignore
the state directory if they're absolute paths. One exception remains: the
Supervisor itself -- we'd have to use LogAscii::logdir to automatically place it
too in its own directory, but that feature currently does not interoperate with
log rotation.
2022-05-26 12:56:02 -07:00
Christian Kreibich
d1cd409e59 Management framework: set defaults for log rotation and persistent state
This adds management/persistence.zeek to establish common configuration for log
rotation and persistent variable state. Log-writing Zeek processes initially
write locally in their working directory, and rotate into subdirectory
"log-queue" of the spool. Since agent and controller have no logger,
persistence.zeek puts in place compatible configurations for them.

Storage folders for Broker-backed tables and clusterized stores default to
subdirectories of the new Zeek-level state folder.

When setting the ZEEK_MANAGEMENT_TESTING environment variable, persistent state
is kept in the local directory, and log rotation remains disabled.

This also tweaks @loads a bit in favor of simply loading frameworks/management,
which is easier to keep track of.
2022-05-26 12:55:10 -07:00
Christian Kreibich
7708cbe500 Management framework: add spool and state directory config settings
This allows specifying spool and variable-state directories specifically for the
management framework. They default to the corresponding installation-level
folders.
2022-05-25 13:56:23 -07:00
Christian Kreibich
e305d9c613 Management framework: establish stdout/stderr files also for cluster nodes 2022-05-25 13:56:23 -07:00
Christian Kreibich
da016b8a68 Management framework: default to having agents check in with the (local) controller
This allows single-machine settings to work out of the box when agent and
cluster are loaded in Supervisor mode.
2022-05-25 13:56:23 -07:00
Christian Kreibich
b96a4276eb Management framework: move role variable from logging into framework-wide config
The role isn't just about logging, it can also act as a general indicator to key
in on in role-specific code elsewhere, such as @if.
2022-05-25 13:56:23 -07:00
Christian Kreibich
e78fdc39e4 Management framework: distinguish supervisor/supervisee when loading agent/controller
Load the agent/controller bootstrapping code only from the Supervisor, and the
basic config only from a supervisee. When we're neither (which is likely a
mistake), we do nothing.
2022-05-25 13:56:23 -07:00
Christian Kreibich
d40bb6e85f Management framework: simplify agent and controller stdout/stderr files
Moving to a model in which every Zeek process runs out of its own working
directory simplifies the handling of those files.
2022-05-25 13:56:23 -07:00
Christian Kreibich
f8f7fd97e8 Management framework: prefix the management logs with "management-"
These were still using "cluster-", a leftover from earlier days of the
framework.
2022-05-25 13:56:23 -07:00
Christian Kreibich
bd6c1683a2 Management framework: comment and layouting tweaks, no functional change
Also remove additional instances of the term "data cluster".
2022-05-25 13:56:23 -07:00
Christian Kreibich
d4d6f10299 Management framework: rename env var that labels agents/controllers
Just a consistency tweak to avoid confusion with "cluster".
2022-05-25 13:56:23 -07:00
Christian Kreibich
d2903bb645 Management framework: increase robustness of agent/controller naming
The fallback mechanism when no explicit agent/controller names are configured
didn't work properly, because many places in the code relied on accessing the
name via the variables meant for explicit configuration, such as
Management::Agent::name. Agent and controller now offer functions for computing
the correct effective name, and we use that throughout.
2022-05-25 13:56:23 -07:00
Tim Wojtulewicz
f8bc23d3e1 Propagate BPF_Program error message to script land 2022-05-25 09:41:35 -07:00
Christian Kreibich
d4ecfa0a67 Merge branch 'topic/christian/installation-dirs-in-scriptland'
* topic/christian/installation-dirs-in-scriptland:
  Add scripts.base.misc.installation btest
  Add base/misc/installation.zeek, with Zeek installation directories
  Ensure presence of Zeek-related directories in toplevel CMakeLists.txt
2022-05-24 12:12:05 -07:00
Christian Kreibich
84a09debe3 Add base/misc/installation.zeek, with Zeek installation directories
This makes several of the installation's main directories available to the
script layer.
2022-05-23 14:16:59 -07:00
Christian Kreibich
9d59a48ae2 Expand build_path() function to handle empty dir arguments gracefully
When passing an empty string as a directory, the function would produce
filenames starting with a slash even when the given file_name is not an absolute
path. Defaulting to the root directory is likely never intended and might
conveivably be dangerous. The middle "/" is now skipped also if dir is an empty
string.
2022-05-19 09:45:52 -07:00
Johanna Amann
5118e7f86b Merge remote-tracking branch 'origin/topic/johanna/cert-weak-key'
* origin/topic/johanna/cert-weak-key:
  Include certificate information in SSL::Weak_Key notice
2022-05-12 11:04:34 +01:00
Tim Wojtulewicz
8b0263cb39 Merge remote-tracking branch 'origin/topic/vern/script-profiling'
* origin/topic/vern/script-profiling:
  tidy up after generating profile
  test suite updates for refined script coverage, use of new BiF to speed startup
  fix for coverage reporting for functions that use "when" statements
  new global_options() BiF to speed up startup, plus a micro-preen
  hooks for new --profile-scripts option
  classes for managing script profiles
  address some holes in script coverage
  fix for script coverage missing on-exit activity
  memory management fixes for loggers
  make curr_CPU_time() broadly available rather than just isolated to ZAM
2022-05-11 12:56:41 -07:00
Johanna Amann
5f04f216bc Include certificate information in SSL::Weak_Key notice 2022-05-11 18:56:04 +01:00
Johanna Amann
3a7b127a71 Merge remote-tracking branch 'origin/topic/johanna/topk-cite'
* origin/topic/johanna/topk-cite:
  Add exact name of the Top-k algorithm.
2022-05-11 18:45:48 +01:00
Johanna Amann
25c33d2a29 Add exact name of the Top-k algorithm.
I needed to figure out which exact algorithm we use for our
probabilistic top-k measurements. It turns out that we do not mention
this in our source tree at all so far.
2022-05-11 13:21:26 +01:00
Christian Kreibich
7198c847e8 Merge branch 'topic/christian/management-get-config'
* topic/christian/management-get-config:
  Management framework: add get_configuration_request/response transaction
2022-05-05 18:10:46 -07:00