If Cluster::init() hasn't been invoked yet, Cluster::subscribe() with the
ZeroMQ backend would block because the main_inproc socket didn't
yet have a connection from the child thread. Prevent this by connecting
the main and child socket pair at construction time.
This will queue the subscriptions and start processing them once the
child thread has started.
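
A minimal sketch of the idea, assuming raw libzmq; main_inproc is the name
from this text, the endpoint and the rest are illustrative:

    #include <zmq.h>
    #include <cassert>

    struct Backend {
        void* ctx = zmq_ctx_new();
        void* main_inproc = zmq_socket(ctx, ZMQ_PAIR);
        void* child_inproc = zmq_socket(ctx, ZMQ_PAIR);

        Backend() {
            // Wire up the PAIR sockets at construction time: once the
            // connection exists, sends from the main thread are queued
            // even though the child thread hasn't started yet.
            int rc = zmq_bind(child_inproc, "inproc://main-child");
            assert(rc == 0);
            rc = zmq_connect(main_inproc, "inproc://main-child");
            assert(rc == 0);
        }

        void Subscribe(const char* topic, size_t len) {
            // No longer blocks before Cluster::init(): the message waits
            // in the PAIR queue until the child thread starts reading.
            zmq_send(main_inproc, topic, len, 0);
        }
    };
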
When either the XPUB socket's high-water mark (HWM) is reached or the
onloop queue is full, drop the events. Users can set the xpub_sndhwm and
onloop_queue_hwm options to 0 to avoid these drops, at the risk of unbounded
memory growth.
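
For reference, a sketch of how such an option maps onto libzmq (the wiring
is assumed; xpub_sndhwm is the option name from above):

    #include <zmq.h>

    void ConfigureXPub(void* xpub, int xpub_sndhwm) {
        // In ZeroMQ, a high-water mark of 0 means "no limit": nothing is
        // dropped, but memory use is unbounded if peers can't keep up.
        zmq_setsockopt(xpub, ZMQ_SNDHWM, &xpub_sndhwm, sizeof(xpub_sndhwm));
    }
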
After moving the log_push initialization from the constructor to the
DoInit() method, it's now possible that DoPublishLogWrites() is invoked
even if DoInit() was never called. Handle this by short-circuiting. This
is sort of an error, but can happen during tests if scripts are loaded
somewhat arbitrarily.
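
A hedged sketch of the short-circuit (log_push and DoPublishLogWrites() are
the names from above; the surrounding class body is illustrative):

    class ZeroMQBackend {
    public:
        bool DoPublishLogWrites() {
            // DoInit() never ran, e.g. in tests loading scripts in odd
            // orders, so there's no socket to publish on. Bail out.
            if ( ! log_push )
                return false;

            // ... serialize the log batch and send it via log_push ...
            return true;
        }

    private:
        void* log_push = nullptr;  // set in DoInit(), not the constructor
    };
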
ZeroMQ's IPv6 support isn't enabled by default, resulting in
"No such device" errors when attempting to listen on an IPv6
address. This change adds an ipv6 option to the ZeroMQ module
and enables it by default. Further, it adds a test configuring
everything to listen on IPv6 ::1, as well as one test to provoke
the original error. This also regularizes some error messages.
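
Enabling the option boils down to setting ZMQ_IPV6 before binding; a
standalone sketch with an illustrative endpoint:

    #include <zmq.h>
    #include <cassert>

    int main() {
        void* ctx = zmq_ctx_new();
        void* sock = zmq_socket(ctx, ZMQ_XPUB);

        // Without this, the bind below fails with "No such device".
        int ipv6 = 1;  // corresponds to the module's ipv6 option
        zmq_setsockopt(sock, ZMQ_IPV6, &ipv6, sizeof(ipv6));

        int rc = zmq_bind(sock, "tcp://[::1]:5556");
        assert(rc == 0);

        zmq_close(sock);
        zmq_ctx_term(ctx);
    }
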
The addr_to_uri() calls weren't actually needed, but they apparently do
not hurt and the result is easier on the eyes, so use them :-)
The ZeroMQ heuristic for "ready to publish" is to create a unique and
ephemeral subscription via the XSUB socket and observe it arrive on the
XPUB socket. At that point, visibility into other nodes' subscriptions
is provided.
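
Sketch of the heuristic, assuming raw libzmq and an illustrative topic
name; the real implementation differs in the details:

    #include <zmq.h>
    #include <cstring>
    #include <string>

    // xsub/xpub are this node's sockets; the subscription travels through
    // the central proxy and eventually arrives on the XPUB socket.
    bool WaitUntilReady(void* xsub, void* xpub) {
        // XSUB subscriptions are plain messages: 0x01, then the topic.
        std::string sub = std::string(1, '\x01') + "zeek.ready.<unique-id>";
        zmq_send(xsub, sub.data(), sub.size(), 0);

        char buf[256];
        while ( true ) {
            int n = zmq_recv(xpub, buf, sizeof(buf), 0);
            if ( n < 0 )
                return false;

            if ( static_cast<size_t>(n) == sub.size()
                 && std::memcmp(buf, sub.data(), n) == 0 )
                return true;  // the ephemeral subscription round-tripped
        }
    }
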
The internal ZeroMQ thread would call QueueForProcessing(), thereby
accessing the onloop member. As ThreadedBackend::DoTerminate() unsets that
member, this a) was reported as a data race by TSAN and b) could cause
events that were still to be queued to be missed.
Explicitly notify the internal thread about the shutdown via the
inproc socket pair. This ensures that the internal thread processes
all previous messages on the inproc socket before terminating.
This fixes the scenario where a backend is created, a few messages are
published, and the backend is then immediately terminated, as can happen
with WebSocket clients. Previously, some of the published messages might
still have been sitting in the inproc socket's queue and were simply
discarded.
Adds the same test for Broker and ZeroMQ backends.
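
A sketch of the shutdown handshake described above (the sentinel framing
is an assumption):

    #include <zmq.h>
    #include <cstring>

    static const char SHUTDOWN[] = "shutdown";  // hypothetical sentinel

    // Main thread: queued behind all pending publishes on the same socket,
    // so PAIR's in-order delivery guarantees those are processed first.
    void NotifyShutdown(void* main_inproc) {
        zmq_send(main_inproc, SHUTDOWN, sizeof(SHUTDOWN), 0);
    }

    // Internal thread: drain everything up to and including the sentinel.
    void ChildLoop(void* child_inproc) {
        char buf[4096];
        while ( true ) {
            int n = zmq_recv(child_inproc, buf, sizeof(buf), 0);
            if ( n == sizeof(SHUTDOWN) && std::memcmp(buf, SHUTDOWN, n) == 0 )
                break;  // all earlier messages have been handled

            // ... forward the publish to the XPUB socket ...
        }
    }
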
The normal life-cycle is that Terminate() / DoTerminate() is called
by zeek-setup code. If that doesn't happen, shut down and join the
threads in the destructor.
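
Roughly (illustrative member names, assuming the internal thread is a
std::thread):

    #include <thread>

    class ZeroMQBackend {
    public:
        ~ZeroMQBackend() {
            // Fallback if zeek-setup never called Terminate()/DoTerminate().
            DoTerminate();
        }

        void DoTerminate() {
            if ( terminated )
                return;
            terminated = true;

            // ... notify the internal thread via the inproc pair ...
            if ( child.joinable() )
                child.join();
        }

    private:
        bool terminated = false;
        std::thread child;
    };
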
try { } catch (...) suggested by Benjamin.
...util::fmt() uses a static buffer, so this is problematic.
I've dabbled a bit with replacing std::thread with threading::BasicThread,
which would offer Fmt(), but this makes things more complicated, primarily
because BasicThread is registered with the thread manager and the shutdown
interactions become entangled: the thread might be terminated before the
backend, or vice versa. It seems nicer for the thread to be owned by the
backend.
This allows callers of Subscribe() to pass in a callback that will be invoked
once the subscription has been established or has failed to establish. It is
the backend's responsibility to execute the callback on the main thread,
either synchronously or, preferably, asynchronously at a later point by
scheduling a task on the main IO loop.
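
The contract might look roughly like this (the exact types and names in
Zeek may differ; this only illustrates the shape):

    #include <functional>
    #include <string>
    #include <utility>
    #include <vector>

    using SubscribeCallback =
        std::function<void(const std::string& topic, bool success)>;

    class Backend {
    public:
        void Subscribe(const std::string& topic, SubscribeCallback cb) {
            // The backend must eventually run cb on the main thread,
            // preferably by scheduling it on the IO loop, not inline here.
            pending.emplace_back(topic, std::move(cb));
        }

    private:
        std::vector<std::pair<std::string, SubscribeCallback>> pending;
    };
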
This turns on ZMQ_XPUB_VERBOSE for ZeroMQ so that notifications about
subscriptions are raised even if a subscription has previously been
observed.
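
The option itself is a one-liner on the XPUB socket:

    #include <zmq.h>

    void EnableVerbose(void* xpub) {
        // With the default of 0, only the first subscription per topic is
        // delivered; with 1, duplicate subscriptions are delivered too.
        int verbose = 1;
        zmq_setsockopt(xpub, ZMQ_XPUB_VERBOSE, &verbose, sizeof(verbose));
    }
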
It is not safe to use the same socket from different threads, but the
current code used the xsub socket directly from the main thread (to set up
subscriptions) and from the internal thread (for polling and reading).
Leverage the PAIR socket already in use for forwarding publish operations
to the internal thread for subscribe and unsubscribe operations as well.
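
A sketch of that routing, reusing XSUB's wire format verbatim as the inproc
framing (an assumption, not necessarily how Zeek frames it):

    #include <zmq.h>
    #include <string>

    // Main thread: never touches xsub, just queues the operation.
    void Subscribe(void* main_inproc, const std::string& topic) {
        std::string msg = std::string(1, '\x01') + topic;  // '\x00' = unsub
        zmq_send(main_inproc, msg.data(), msg.size(), 0);
    }

    // Internal thread: the only place the xsub socket is ever used.
    void OnInprocMessage(void* xsub, const char* data, size_t len) {
        zmq_send(xsub, data, len, 0);  // forward verbatim to XSUB
    }
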
The failure mode was a bit annoying: closing the context would hang
indefinitely in zmq_ctx_term().
This allows configurability at the code level to decide what to do with
received remote events and events produced by a backend. For now, events
are only enqueued into the process's script layer, but for the WebSocket
interface, the action would instead be to send the event out on a WebSocket
connection.
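
Conceptually (names hypothetical), the hook is just a handler chosen at
construction time:

    #include <functional>
    #include <utility>

    struct Event { /* topic, handler name, arguments, ... */ };

    using EventHandler = std::function<void(Event)>;

    class Backend {
    public:
        explicit Backend(EventHandler h) : handler(std::move(h)) {}

        // Called for received remote events and backend-produced events.
        void Dispatch(Event ev) { handler(std::move(ev)); }

    private:
        EventHandler handler;
    };

    // The cluster backend passes a handler that enqueues into the script
    // layer; a WebSocket server would pass one that writes to the client.
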
This is a cluster backend implementation using a central XPUB/XSUB proxy
that by default runs on the manager node. Logging is implemented using
PUSH/PULL sockets between loggers and the other nodes, rather than going
through XPUB/XSUB.
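
The core of such a proxy is small; a standalone sketch with illustrative
endpoints:

    #include <zmq.h>

    int main() {
        void* ctx = zmq_ctx_new();

        void* xpub = zmq_socket(ctx, ZMQ_XPUB);  // nodes' XSUBs connect here
        void* xsub = zmq_socket(ctx, ZMQ_XSUB);  // nodes' XPUBs connect here
        zmq_bind(xpub, "tcp://127.0.0.1:5556");
        zmq_bind(xsub, "tcp://127.0.0.1:5555");

        // Shovel messages and subscriptions between the two sides until
        // the context is terminated.
        zmq_proxy(xsub, xpub, nullptr);

        zmq_close(xpub);
        zmq_close(xsub);
        zmq_ctx_term(ctx);
    }
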
The test-all-policy-cluster baseline changed: Previously, Broker::peer()
would be called from setup-connections.zeek, causing the IO loop to be
alive. With the ZeroMQ backend, the IO loop is only alive when
Cluster::init() is called, but that doesn't happen anymore.