This class is a parameter of virtual methods of the Backend API for users
to implement and also a parameter to the HookPublishEvent() API. Seems it
shouldn't be in detail and instead we should own it.
Alternatively, could mark the cluster APIs as not-stable-yet, but I
think we can move forward and make it non-detail for 8.0.
* origin/topic/awelzel/4177-4178-custom-event-metadata-part-2:
Event: Bail on add_missing_remote_network_timestamp without add_network_timestamp
btest/plugin: Test custom metadata publish
NEWS: Add note about generic event metadata
cluster: Remove deprecated Event constructor
cluster: Remove some explicit timestamp handling
broker/Manager: Fetch and forward all metadata from events
Event/init-bare: Add add_missing_remote_network_timestamp logic
cluster/Backend/DoProcessEvent: Use generic metadata, not just timestamps
cluster/Event: Support moving args and metadata from event
cluster/serializer/broker: Support generic metadata
cluster/Event: Generic metadata support
Event: Use -1.0 for undefined/unset timestamps
cluster: Use shorter obj_desc versions
Desc: Add obj_desc() / obj_desc_short() overloads for IntrusivePtr
Also use the generic metadata version for publishing, keep the
ts-based API for now, but only add timestamps when
EventMetadata::add_network_timestamp is T. I'm not sure what the
right way forward here is, maybe deprecating Broker's publish event
variations and funneling through cluster.
This will be cleaned up later to just pass all contained metadata from
a cluster event to the queued event, but for now do this here, otherwise
we break some internal tests.
Since a reorg in the Broker library (commit b04195183) that revamped flow
control and that we pulled in with Zeek 5.0, this setting hasn't done
anything. Broker's endpoint::make_subscriber() and
endpoint::make_status_subscriber() take a queue size argument (with a default
value) that simply gets dropped in the eventual subscriber::make() call. See:
b041951835 (diff-5c0d2baa7981caeb6a4080708ddca6ad929746d10c73d66598e46d7c2c03c8deL34-R178)
This implements basic tracking of each peering's current fill level, the maximum
level over a recent time interval (via a new Broker::buffer_stats_reset_interval
tunable, defaulting to 1min), and the number of times a buffer overflows. For
the disconnect policy this is the number of depeerings, but for drop_newest and
drop_oldest it implies the number of messages lost.
This doesn't use "proper" telemetry metrics for a few reasons: this tracking is
Broker-specific, so we need to track each peering via endpoint_ids, while we
want the metrics to use Cluster node name labels, and the latter live in the
script layer. Using broker::endpoint_id directly as keys also means we rely on
their ability to hash in STL containers, which should be fast.
This does not track the buffer levels for Broker "clients" (as opposed to
"peers"), i.e. WebSockets, since we currently don't have a way to name these,
and we don't want to use ephemeral Broker IDs in their telemetry.
To make the stats accessible to the script layer the Broker manager (via a new
helper class that lives in the event_observer) maintains a TableVal mapping
Broker IDs to a new BrokerPeeringStats record. The table's members get updated
every time that table is requested. This minimizes new val instantiation and
allows the script layer to customize the BrokerPeeringStats record by redefing,
updating fields, etc. Since we can't use Zeek vals outside the main thread, this
requires some care so all table updates happen only in the Zeek-side table
updater, PeerBufferState::GetPeeringStatsTable().
The new Broker API allows us to provide a custom logger to Broker that
pulls previously unattainable context information out of Broker to put
them into broker.log for users of Zeek.
Since Broker log events happen asynchronously, we cache them in a queue
and use a flare to notify Zeek of activity. Furthermore, the Broker
manager now implements the `ProcessFd` function to avoid unnecessary
polling of the new log queue. As a side effect, data stores are polled
less as well.
This allows callers of Subscribe() to pass in a callback that will be invoked
once the subscription is established or failed to establish. It is the
backend's responsibility to execute the callback on the main thread either
synchronously, or preferably asynchronously at a later point, by
scheduling a task on the IO main loop.
This turns on ZMQ_XPUB_VERBOSE for ZeroMQ so that notifications about
subscriptions are raised even if the subscriptions has previously been
observed.
This allows configurability at the code level to decide what to do with
a received remote events and events produced by a backend. For now, only
enqueue events into the process's script layer, but for the WebSocket
interface, the action would be to send out the event on a WebSocket
connection instead.
The broker serializer leverages the existing data_to_val() function.
During unserialization, if the destination type is any, the logic
simply wraps the broker::data value into a Broker::Data record.
Therefore, events with any parameters are currently exposed to
the Broker::Data type.
There is a bigger issue in that re-publishing such Broker::Data
instances would encode them as a normal record. Explicitly prevent
this by serializing the contained data value directly instead, similar
to what Broker already did when publishing a record.
* topic/christian/disconnect-slow-peers:
Bump cluster testsuite to pull in Broker backpressure tests
Expand documentation of Broker events.
Add sleep() BiF.
Add backpressure disconnect notification to cluster.log and via telemetry
Remove unneeded @loads from base/misc/version.zeek
Add Cluster::nodeid_to_node() helper function
Support re-peering with Broker peers that fall behind
Add Zeek-level configurability of Broker slow-peer disconnects
Bump Broker to pull in disconnect feature and infinite-loop fix
No need to namespace Cluster:: functions in their own namespace
The old `c_str_safe` utility function allowed Zeek to operator on
`broker::data` and `broker::variant`. The former grants access to actual
`std::string` objects while the latter only provides access to fields
via `std::string_view`. Since the Zeek formatting functions need null
terminated strings, we need to copy the characters into a
null-terminated container first.
After removing support for `broker::data` and `broker::variant` from the
same code paths, we can drop `c_str_safe` and always do the copying
(since we are always dealing with `broker::variant` now).
Since the transition to broker::variant has been long finalized, there
is no more need to be able to go back to a pre-variant version of
Broker. Hence, we can drop various utilities that allow Zeek to run with
older Broker releases.
This was added previously in the 7.1 cycle. Now that MakeEvent() was
removed from cluster::Backend, there's no need for Broker to provide
this version.
Discussed with @J-Gras, calling Broker::publish() within a scheduled
should use the "intended timestamp" implicitly.
This is subtle, but supposedly more expected when running
a pcap replay cluster.