* topic/christian/external-testsuite-tweaks:
Add helpers for syncing commit files with external testsuites
Fix typo in update-timing target for external testsuites
This provides "make sync-repos" to check out all locally available testsuites at
the commits indicated in their commit files, and "make sync-commits" to update
the commit files to the HEADs of the local testsuite repos.
Also adds the commit -> repo sync for the Makefile init target so initialization
always lands on the right version, and removes the corresponding explicit
checkout from the CI repo setup.
This PR changes the way in which the SSL analyzer tracks the direction
of connections. So far, the SSL analyzer assumed that the originator of
a connection would send the client hello (and other associated
client-side events), and that the responder would be the SSL server.
In some circumstances this is not true, and the initiator of a
connection is the server, with the responder being the client. So far
this confused some of the internal statekeeping logic and could lead to
mis-parsing of extensions.
This reversal of roles can happen in DTLS if a connection uses STUN,
and potentially in some StartTLS protocols.
This PR tracks the direction of a TLS connection using the hello
request, client hello and server hello handshake messages. Furthermore,
it changes the SSL events from providing is_orig to providing is_client,
where is_client is true for the client side of a connection. Since the
argument positioning in the event has not changed, old scripts will
continue to work seamlessly; the new semantics are what everyone
writing SSL scripts will have expected in any case.
There is a new event that is raised when a connection is flipped. A
weird is raised if a flip happens repeatedly.
Addresses GH-2198.
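The direction tracking described above can be illustrated with a small Python
model. This is a sketch only, not the analyzer's code: the class name, the
message-type strings, and the choice to record a weird from the second flip
onward are assumptions made for illustration.

```python
# Hypothetical model of TLS direction tracking; all names are illustrative.
CLIENT_MESSAGES = {"client_hello"}
SERVER_MESSAGES = {"hello_request", "server_hello"}


class DirectionTracker:
    def __init__(self):
        # Initial assumption: the connection originator is the TLS client.
        self.client_is_orig = True
        self.flips = 0
        self.weirds = []

    def observe(self, msg_type, is_orig):
        """Process one handshake message; return is_client for its sender."""
        if msg_type in CLIENT_MESSAGES:
            sender_is_client = True
        elif msg_type in SERVER_MESSAGES:
            sender_is_client = False
        else:
            # Other messages carry no direction information, so the
            # current assumption decides which side is the client.
            return is_orig == self.client_is_orig

        # The sender's endpoint and role together imply whether the
        # originator is the client.
        implied_client_is_orig = (is_orig == sender_is_client)
        if implied_client_is_orig != self.client_is_orig:
            self.client_is_orig = implied_client_is_orig
            self.flips += 1
            if self.flips > 1:
                # Assumed policy: report every flip after the first.
                self.weirds.append("repeated_flip")
        return sender_is_client
```

In this model, a client hello sent by the responder flips the connection once,
and later messages from the responder are reported with is_client set to true.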
* topic/christian/management-deploy: (21 commits)
Management framework: bump external cluster testsuite
Management framework: bump zeek-client
Management framework: rename set_configuration events to stage_configuration
Management framework: trigger deployment upon when instances are ready
Management framework: more resilient node shutdown upon deployment
Management framework: re-trigger deployment upon controller launch
Management framework: move most deployment handling to internal function
Management framework: distinguish internally and externally requested deployments
Management framework: track instances by their Broker IDs
Management framework: tweak Supervisor event logging
Management framework: make helper function a local
Management framework: rename "log_level" to "level"
Management framework: add "finish" callback to requests
Management framework: add a helper for rendering result vectors to a string
Management framework: agents now skip re-deployment of current config
Management framework: suppress notify_agent_hello upon Supervisor peering
Management framework: introduce state machine for configs and persist them
Management framework: introduce deployment API in controller
Management framework: rename agent "set_configuration" to "deploy"
Management framework: consistency fixes to the Result record
...
* topic/christian/management-auto-assign-ports:
Management framework: bump zeek-client to pull in relaxed port handling
Management framework: bump external cluster testsuite
Management framework: also use send_set_configuration_response_error elsewhere
Management framework: minor log formatting tweak, for consistency
Management framework: support auto-assignment of ports in cluster nodes
This swaps the host event argument for the Broker ID. The latter is more useful,
since the sending agent doesn't necessarily know its IP address as visible to
the controller, and the controller can pull up the full Broker context via the
ID.
It also adds an explicit argument to the event to indicate whether the agent
connected to the controller or vice versa. This simplifies the controller's
internal logic.
Also minor tweaks to logging to show Broker IDs.
* topic/christian/gh-2134-fix-intel-test-races:
Expand scripts.base.frameworks.intel.cluster-transparency test
Fix races in scripts.base.frameworks.intel.cluster-transparency-with-proxy test
Add Intel::send_store_on_node_up boolean to control min_data_store delivery
This exposes Broker's new WebSocket support in Zeek. To enable it,
call `Broker::listen_websocket()`. Zeek will then start listening on
port 9997 for incoming WebSocket connections.
See the Broker documentation for a description of the message format
expected over these WebSocket connections.
This simply expands this test to match the behavior of
cluster-transparency-with-proxy, since the two are so similar. This test does
not seem to need disabling the worker's initial send of the data store.
This test was unstable for two reasons:
- Nothing verified whether the two workers had checked in with the proxy,
meaning that messages between the workers and proxies could get lost. This adds
an extra node_up event that the proxy generates synthetically, with values
recognizable to the manager, once the proxy sees both workers connected. This is
a test-level workaround for what should really be a cluster-is-ready event in
the cluster framework proper.
- More subtle: the Intel framework makes the manager send its current
min_data_store to newly connected workers, which in the case of this test
introduces a race: since the data store, arriving at the worker, replaces the
existing value, it could actually remove already established items if the timing
was right. This would lead to the count in the test reaching 3, assuming that 3
intel items are available, when in reality there were fewer, causing the
Intel::seen() call to do nothing. We now disable the sending of the data store
upon connect, via the global added in the previous commit.
This also expands the test slightly so that both workers call Intel::seen() for
the items inserted by the other worker. This is added validation for the second
point above, because in the presence of that race one occasionally sees one log
entry make it, and the other fail.
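The second race can be reproduced in a simplified model. The sketch below is
hypothetical Python, not Intel framework code: the Worker class, its method
names, and the item values are assumptions, a plain set stands in for
min_data_store, and the manager's snapshot is assumed to be empty.

```python
class Worker:
    """Toy stand-in for a worker's view of the intel data store."""

    def __init__(self):
        self.store = set()

    def insert(self, item):
        # An individual intel item arriving at the worker.
        self.store.add(item)

    def receive_full_store(self, snapshot):
        # The manager's store replaces the local one rather than merging.
        self.store = set(snapshot)

    def seen(self, item):
        return item in self.store


def run(send_store_on_node_up):
    worker = Worker()
    # Snapshot taken by the manager when the worker connected (assumed empty).
    snapshot = set()
    inserts_counted = 0
    for item in ("1.2.3.4", "example.com", "5.6.7.8"):
        worker.insert(item)
        inserts_counted += 1
        if inserts_counted == 2 and send_store_on_node_up:
            # The late snapshot arrives between inserts and wipes earlier items.
            worker.receive_full_store(snapshot)
    return inserts_counted, sorted(worker.store)
```

With the initial send enabled, the counter still reaches 3 while only the last
item remains in the store; with it disabled, all three items survive.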
* topic/christian/management-verify-nodestarts:
Management framework: bump external cluster testsuite
Management framework: bump zeek-client to pull in set-config rendering
Management framework: enable stdout/stderr reporting
Management framework: Supervisor extensions for stdout/stderr handling
Management framework: disambiguate redef field names in agent and controller
Management framework: move to ResultVec in agent's set_configuration response
Management framework: tune request timeout granularity and interval
Management framework: verify node starts when deploying a configuration
Management framework: a bit of debug-level logging for troubleshooting
This improves the framework's handling of Zeek node stdout and stderr by
extending the (script-layer) Supervisor functionality.
- The Supervisor _either_ directs Zeek nodes' stdout/stderr to files _or_ lets
you hook into it at the script level. We'd like both: files make sense to allow
inspection outside of the framework, and the framework would benefit from
tapping into the streams e.g. for error context. We now provide the file
redirection functionality in the Supervisor, in addition to the hook
mechanism. The hook mechanism also builds up rolling windows of up to
100 lines (configurable) into stdout/stderr.
- The new Management::Supervisor::API::notify_node_exit event notifies
subscribers (agents, really) that a particular node has exited (and is possibly
being restarted by the Supervisor). The event includes the name of the node,
plus its recent stdout/stderr context.
The Supervisor generates this event every time it receives a status update from
the stem, meaning a node got created or re-created. A corresponding
SupervisorControl::node_status event relays the same information for users
interacting with the Supervisor over Broker.
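A rolling window of recent output lines, as mentioned for the hook mechanism
above, is commonly built on a bounded queue. The following Python sketch is
illustrative only and does not reflect the Supervisor's script-layer
implementation; the class name and the max_lines parameter are assumptions, with
the default of 100 taken from the text.

```python
from collections import deque


class OutputWindow:
    """Keeps only the most recent lines of a node's stdout or stderr."""

    def __init__(self, max_lines=100):
        # A deque with maxlen drops the oldest entry once it is full.
        self.lines = deque(maxlen=max_lines)

    def add(self, line):
        self.lines.append(line)

    def context(self):
        # Oldest-to-newest copy, e.g. for attaching to an exit notification.
        return list(self.lines)
```

A window created with max_lines=3 that receives five lines retains only the
last three.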