The get-nodes command also benefits from showing the state of connected agents
more broadly (as opposed to just those relevant to the current configuration).
Also a bugfix: ensure we use an agent's IP address as seen by the
controller. This avoids reporting "0.0.0.0" in some cases.
So far, this response contained only the connected instances relevant to the
current configuration, which isn't very helpful when troubleshooting instance
connectivity. It now reports all currently connected instances, with network
addresses and ports as known to Broker.
This swaps the host event argument for the Broker ID. The latter is more useful,
since the sending agent doesn't necessarily know its IP address as visible to
the controller, and the controller can pull up the full Broker context via the
ID.
It also adds an explicit argument to the event to indicate whether the agent
connected to the controller or vice versa. This simplifies the controller's
internal logic.
Also minor tweaks to logging to show Broker IDs.
* zeek-as-org/as-org:
Mark lookup_asn() BIF as deprecated in v6.1
Define geo_autonomous_system record type
Add lookup_autonomous_system() BIF that returns AS number and org
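As a rough usage sketch (the address argument and the record's field names,
number and organization, are assumptions inferred from the commit subjects
above, not confirmed signatures):

    # Sketch: look up the autonomous system for an address. The field names
    # below assume geo_autonomous_system carries the AS number and the
    # owning organization, per the commit subjects above.
    event zeek_init()
        {
        local as_info = lookup_autonomous_system(8.8.8.8);

        if ( as_info?$number && as_info?$organization )
            print fmt("AS%d (%s)", as_info$number, as_info$organization);
        }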
* topic/christian/gh-2134-fix-intel-test-races:
Expand scripts.base.frameworks.intel.cluster-transparency test
Fix races in scripts.base.frameworks.intel.cluster-transparency-with-proxy test
Add Intel::send_store_on_node_up boolean to control min_data_store delivery
This exposes Broker's new WebSocket support in Zeek. To enable it,
call `Broker::listen_websocket()`. Zeek will then start listening on
port 9997 for incoming WebSocket connections.
See the Broker documentation for a description of the message format
expected over these WebSocket connections.
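For illustration, enabling it can be as simple as the following sketch; the
no-argument call relies on the default port mentioned above:

    # Sketch: enable Broker's WebSocket support at startup. With no
    # arguments, Zeek listens on the default WebSocket port (9997).
    event zeek_init()
        {
        Broker::listen_websocket();
        }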
This simply expands the test to match the behavior of
cluster-transparency-with-proxy, since the two are so similar. This test does
not seem to require disabling the initial send of the data store to the workers.
This test was unstable for two reasons:
- Nothing verified whether the two workers had checked in with the proxy,
meaning that messages between the workers and the proxy could get lost. This
adds an extra node_up event that the proxy generates synthetically, with values
recognizable to the manager, once it sees both workers connected. This is a
test-level workaround for what should really be a cluster-is-ready event in the
cluster framework proper.
- More subtle: the Intel framework makes the manager send its current
min_data_store to newly connected workers, which in the case of this test
introduces a race: since the data store, arriving at the worker, replaces the
existing value, it could actually remove already established items if the
timing was right. This would lead to the test's count reaching 3, implying that
3 intel items were available, when in reality there were fewer, causing the
Intel::seen() call to do nothing. We now disable the sending of the data store
upon connect, via the global added in the previous commit.
This also expands the test slightly so that both workers call Intel::seen() for
the items inserted by the other worker. This adds validation for the second
point above, because in the presence of that race one occasionally sees one log
entry make it while the other fails.
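For illustration only, the cross-check is roughly of this shape; the node names
and indicator addresses here are hypothetical, not the test's actual values:

    # Hypothetical sketch: each worker reports a sighting of an indicator
    # inserted by the other worker, so a successful match confirms that the
    # item propagated across the cluster.
    function crosscheck_intel()
        {
        if ( Cluster::node == "worker-1" )
            Intel::seen([$host=192.168.1.2, $where=Intel::IN_ANYWHERE]);
        else if ( Cluster::node == "worker-2" )
            Intel::seen([$host=192.168.1.1, $where=Intel::IN_ANYWHERE]);
        }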
This adds a redefinable const to the internals of the Intel framework to allow
suppressing the manager's sending of its current min_data_store when a worker
connects. This behavior is desirable for bringing nodes that check in "late" up
to speed, but during testing it introduces nondeterminism.
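A minimal sketch of how a test might use it, assuming the
Intel::send_store_on_node_up name from the branch summary above:

    # Sketch: suppress the manager's initial min_data_store delivery to
    # newly connected nodes, trading late-joiner catch-up for deterministic
    # behavior in tests.
    redef Intel::send_store_on_node_up = F;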