* topic/christian/cluster-controller-next: (22 commits)
Remove periodic pinging of controller by agents
Move cluster controller/agent main.zeek scripts into their own modules
Bump zeek-client
First uses of request state timeouts
Add expiration mechanism to client request state.
Move get_instances_response event to using a Result record
Track successful config deployment in cluster controller
Bump zeek-client
Add ClusterController::API::notify_agents_ready event
Make all globals start with a "g_" prefix
Add missing debug() log function to log module's API
Add separate utility module for controller and agent
Bump zeek-client
Support for dropping instances no longer needed after config updates
Additional infrastructure for printing types
Bump zeek-client
Support on-demand peering with agents when receiving new cluster configuration
Expand requests support in the controller
Whitespace tweaks in cluster controller and agent scripts
Add Github action job for cluster tests
...
- Removes dependency on <regex.h>
- Replaces regex function with Zeek's standard regex functions
- Some replacements are workarounds that may be improved later via an
appropriate API
- Update test baseline, which seems to have been capturing a bug in the
existing code.
Edit pass by Robin Sommer. Note that our test doesn't cover all the code
paths, but it does go through the one with the most substantial change.
This changes the agent-controller communication to remove the need for ongoing
pinging of the controller by agents not actively "in service". Instead, agents
now use the notify_agent_hello event to the controller to report only their
identity. The controller puts them into service via an agent_welcome_request/
response pair, and takes them out of service via agent_standby_request/response.
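The handshake described above can be pictured as a small state machine. The following Python sketch is illustrative only: the event names come from the commit message, while the AgentRegistry class, its method names, and the two-state model are assumptions and not the actual Zeek script code.

```python
from enum import Enum


class AgentState(Enum):
    KNOWN = "known"   # agent said hello, but is not in service
    READY = "ready"   # agent was welcomed and is in service


class AgentRegistry:
    """Hypothetical controller-side bookkeeping for agents."""

    def __init__(self):
        self.agents = {}   # agent name -> AgentState
        self.sent = []     # (event name, agent name), recorded for illustration

    def on_notify_agent_hello(self, name):
        # The agent reports only its identity; no periodic pings follow.
        self.agents[name] = AgentState.KNOWN

    def put_in_service(self, name):
        self.sent.append(("agent_welcome_request", name))

    def on_agent_welcome_response(self, name):
        self.agents[name] = AgentState.READY

    def take_out_of_service(self, name):
        self.sent.append(("agent_standby_request", name))

    def on_agent_standby_response(self, name):
        self.agents[name] = AgentState.KNOWN

    def ready_agents(self):
        return sorted(n for n, s in self.agents.items() if s is AgentState.READY)
```

In this model an agent stays known to the controller after going on standby, so it can be welcomed again later without a new hello.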
This removes the on_change handler from the set of agents that is ready for
service, because not every change to this set is now a suitable time to
potentially send out the configuration. We now invoke this check explicitly in
the two situations where it's warranted: when an agent reports ready for service,
and when we've received a new configuration.
This has no practical relevance other than allowing the two to be loaded at the
same time, which some of our (cluster-unrelated) tests require. Absence of
namespacing would trigger symbol clashes at this point.
This now features support for the test_timeout_request/response events, as
supported by the client, and also adds a timeout event for set_configuration, in
case agents do not respond in time.
Includes corresponding zeek-client submodule bump.
This establishes a timeout controlled via ClusterController::request_timeout,
triggering a ClusterController::Request::request_expired event whenever a
timeout rolls around before request state has been finalized by a request's
normal processing.
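As a rough illustration of the expiration mechanism, here is a minimal Python sketch. It assumes a store that is polled with an explicit clock value; the RequestStore class and its methods are hypothetical, and only the request_timeout setting and the request_expired idea are taken from the text above.

```python
class RequestStore:
    """Hypothetical request-state table with timeout-based expiration."""

    def __init__(self, request_timeout, on_expired):
        self.request_timeout = request_timeout   # seconds
        self.on_expired = on_expired             # stands in for request_expired
        self.pending = {}                        # request id -> creation time

    def create(self, reqid, now):
        self.pending[reqid] = now

    def finish(self, reqid):
        # Normal processing finalizes the state. Returns False if the
        # request is unknown, for example because it already expired.
        return self.pending.pop(reqid, None) is not None

    def expire(self, now):
        # Expire every request whose state outlived the timeout.
        due = [r for r, t in self.pending.items()
               if now - t >= self.request_timeout]
        for reqid in due:
            del self.pending[reqid]
            self.on_expired(reqid)
        return due
```

A request that finishes normally before the timeout never triggers the expiration callback; one that is still pending when the timeout passes triggers it exactly once.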
This changes the basic agent-management model to one in which the configurations
received from the client define not just the data cluster, but also determine the set
of acceptable instances. Unless connectivity already exists, the controller will
establish peerings with new agents that listen, or wait for ones that connect to
the controller to check in.
Once all required agents are available, the controller triggers the new
notify_agents_ready event, an agent/controller-level "cluster-is-ready"
event. The controller also uses this event to submit a pending config update to
the now-ready instances.
Prior to this, static configuration needed to be in place to configure the
controller/agent layout. The configuration update can now include new instances
that the controller will connect to, assuming they're instances with a listening
agent.
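The flow described above (a configuration defines the required instances, the controller peers with listening agents, and the pending update goes out once everyone is ready) might look roughly like the Python sketch below. Everything here is a simplified assumption: the ConfigController class, the shape of the config dictionary, and the "deploy_config" label are hypothetical; only notify_agents_ready is a name from the commit message.

```python
class ConfigController:
    """Hypothetical model of config-driven instance management."""

    def __init__(self):
        self.required = set()        # instances named by the latest config
        self.ready = set()           # instances that reported ready
        self.pending_config = None
        self.connect_attempts = []   # instances the controller peers with
        self.events = []             # emitted events, for illustration

    def set_configuration(self, config):
        # Assumed shape: {"instances": {name: agent_listens (bool)}}
        instances = config["instances"]
        new_required = set(instances)
        self.ready &= new_required   # drop instances no longer needed
        for name in sorted(instances):
            if name not in self.ready and instances[name]:
                self.connect_attempts.append(name)   # agent listens: connect
            # otherwise: wait for the agent to connect and check in
        self.required = new_required
        self.pending_config = config
        self.check_ready()           # explicit check: new configuration

    def on_agent_ready(self, name):
        self.ready.add(name)
        self.check_ready()           # explicit check: agent reports ready

    def check_ready(self):
        if self.pending_config is None or not self.required <= self.ready:
            return
        names = tuple(sorted(self.required))
        self.events.append(("notify_agents_ready", names))
        self.events.append(("deploy_config", names))
        self.pending_config = None
```

The readiness check is called explicitly from the two places where it matters, rather than from a change handler on the ready set.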
Request records for configuration updates now store the full configuration. The
ClusterController::Request module now provides a to_string() function for
rendering requests to a string.
This job runs in sequence after the image build one, using its resulting image.
The actual tests live in the external zeek-testing-cluster testsuite, which
the new job clones and runs.
To specify a version of the testsuite to use, testing/external/ has a new
commit-hash.zeek-testing-cluster file that tracks the testsuite's relevant
commit ref.
Our test trace is extracted from https://www.cloudshark.org/captures/b9089aac6eee.
There actually seems to be a bug in the existing code: the URI passed to
bt_tracker_request() includes a partial HTTP version. This commit
includes the baseline as the current code produces it; we'll fix that in
a subsequent commit.
* origin/topic/vern/usage-usage:
fixes for double-delete and reducing '?' operator with constant alternatives
additional test suite updates for "-u" usage issues
test suite updates for "xform" and "usage" alternatives, plus test name change
removed unused script variable
correct usage info for -u flag; -uu no longer supported
fix typo in btest filename