This is a cluster backend implementation using a central XPUB/XSUB proxy
that by default runs on the manager node. Logging uses PUSH/PULL sockets
between the logger and the other nodes rather than going through
XPUB/XSUB.
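As a rough sketch of what opting into the backend could look like at the
script level (the option and enum names here are assumptions and may
differ between Zeek versions):

    # Assumed names; selects the ZeroMQ-based backend instead of Broker.
    redef Cluster::backend = Cluster::CLUSTER_BACKEND_ZEROMQ;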
The test-all-policy-cluster baseline changed: previously, Broker::peer()
was called from setup-connections.zeek, which kept the IO loop alive.
With the ZeroMQ backend, the IO loop is only alive once Cluster::init()
has been called, and that call no longer happens in this setup.
Using network_time() to calculate packet lag produces wrong results
when no packet is available but network time has not (yet) fallen back
to wall clock.
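A minimal sketch of the pitfall, using only standard BiFs:

    event zeek_init()
        {
        # When no packets arrive, network_time() lags wall clock until it
        # falls back, so this difference can look like lag even though
        # processing is keeping up.
        local apparent_lag = current_time() - network_time();
        print apparent_lag;
        }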
This now picks up additional typical misspellings, but also triggers on
more identifiers we use. I opted for fixing the obvious misspellings and
updated the allowlist for anything else.
This wasn't possible before #3028 was fixed, but now it's safe to set
the value in new_connection() and allow other users access to the
field much earlier. We do not have to deal with connection_flipped()
because the community-id hash is symmetric.
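As a hedged sketch of the earlier access (assuming the builtin
community_id_v1() function available in recent Zeek):

    event new_connection(c: connection)
        {
        # The hash depends only on c$id and is symmetric, so it is already
        # stable here and connection_flipped() needs no special handling.
        print community_id_v1(c$id);
        }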
Account for spaces being encoded as plus signs in the SQLi regex
detection, and remove one literal plus sign from the pattern to account
for a real plus in SQL. Also add test cases covering the space-to-plus
encoding, plus a follow-up fix for a missing semicolon.
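As an illustration of the encoding issue (this is not the actual
detect-sqli pattern): a '+' in a query string commonly encodes a space,
so it belongs in the token-separator class of the pattern rather than
being treated as a literal plus:

    # Illustrative only: accept blanks, control characters, or '+' as the
    # gap between SQL tokens in a URI.
    const sqli_token_gap = /[[:blank:]\x00-\x1f\+]+/ &redef;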
Adding a metric for the network time value itself makes it possible to
observe it stalling, or growing more slowly than real time, when Zeek
isn't able to keep up.
Also, modify the telemetry/log.zeek test to include misc/stats and to
log at a higher frequency with a more interesting pcap.
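A hedged sketch of what such a metric could look like at the script
level; the metric name and the use of the Telemetry::sync hook are
illustrative, not necessarily what misc/stats ends up doing:

    module ExampleNetTime;

    global net_time_gf = Telemetry::register_gauge_family([
        $prefix="zeek",
        $name="example_network_time",
        $unit="seconds",
        $help_text="Network time in seconds since the epoch"]);

    hook Telemetry::sync()
        {
        # Refresh on every collection; a stalled value shows up when
        # compared against wall clock on the Prometheus side.
        Telemetry::gauge_set(Telemetry::gauge_with(net_time_gf),
                             time_to_double(network_time()));
        }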
With Cluster::Node$metrics_port being optional, there's no real need
for the extra script. The new rule: if a metrics_port is set, the node
will attempt to listen on it.
Users can still redef Telemetry::metrics_port *after*
base/frameworks/telemetry has been loaded to change the port defined
in cluster-layout.zeek.
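For illustration, a cluster-layout.zeek fragment under the new rule,
plus the later override; the concrete node names and ports are made up:

    redef Cluster::nodes = {
        ["manager"]  = [$node_type=Cluster::MANAGER, $ip=127.0.0.1,
                        $p=2780/tcp, $metrics_port=9000/tcp],
        ["worker-1"] = [$node_type=Cluster::WORKER, $ip=127.0.0.1,
                        $p=2781/tcp, $manager="manager",
                        $metrics_port=9001/tcp]
    };

    # Still possible after base/frameworks/telemetry has loaded, to
    # override the value from cluster-layout.zeek on this node:
    redef Telemetry::metrics_port = 9100/tcp;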
The controller learns IP addresses from agents that peer with it, but
so far that information got lost when the resulting configs were pushed
out to the agents. These updates now include that information.
This is quite redundant with the enumeration for Broker ports,
unfortunately. But the logic is subtly different: all nodes obtain a telemetry
port, while not all nodes require a Broker port, for example, and in the metrics
port assignment we also cross-check selected Broker ports. I found more unified
code actually harder to read in the end.
The logic for the two sets remains the same: starting from a given
port, ports that aren't otherwise taken get assigned sequentially.
These ports are assumed to be available; nothing checks their actual
availability -- for now.
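A minimal sketch of that enumeration idea as a standalone helper (not
the actual management-framework code):

    # Walk upward from a starting port and return the next one not already
    # taken; actual availability on the host is not checked.
    function next_free_port(start: port, taken: set[port]): port
        {
        local c = port_to_count(start);
        while ( count_to_port(c, tcp) in taken )
            ++c;
        return count_to_port(c, tcp);
        }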
The default start port is 9000. I considered 9090, to align with the Prometheus
default, but counting upward from there is likely to hit trouble with the Broker
default ports (9999/9997), used by the Supervisor. Counting downward is a bit
unnatural, and shifting the Broker default ports brings subtle ordering issues.
This also changes the node ordering logic slightly since it seems more intuitive
to keep sequential ports on a given instance, instead of striping across them.
This eliminates one place in which we currently need to mirror changes to the
script-land Cluster::Node record. Instead of keeping an exact in-core equivalent, the
Supervisor now treats the data structure as opaque, and stores the whole cluster
table as a JSON string.
We may replace the script-layer Supervisor::ClusterEndpoint in the
future and use Cluster::Node directly, but that's a more invasive
change that would affect how people invoke Supervisor::create() and
similar functions.
Relying on JSON for serialization has the side effect of removing the
Supervisor's earlier quirk of using 0/tcp, rather than 0/unknown, to
indicate unused ports in the Supervisor::ClusterEndpoint record.
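A minimal sketch of the serialization idea (not the Supervisor's actual
code path):

    event zeek_init()
        {
        # The script-layer table travels as an opaque JSON string instead
        # of being mirrored into an equivalent in-core record type.
        local blob = to_json(Cluster::nodes);
        print blob;
        }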
The previous "fix" caused significant performance degradation without
the signature ever having a chance to trigger. Moving it to policy
seems the best compromise, the alternative being outright removing it.
rule_added_policy allows rules to be modified just after they have been
added. This enables more complex features, like changing rule states
depending on insertion in other plugins.
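A hedged example of hooking in; the parameter list shown here is an
assumption and may not match the hook's actual signature:

    # Hypothetical handler: give rules that a plugin reported as already
    # existing a fixed expiration (signature assumed).
    hook NetControl::rule_added_policy(r: NetControl::Rule,
                                       p: NetControl::PluginState,
                                       exists: bool, msg: string)
        {
        if ( exists )
            r$expire = 1hr;
        }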
Catch-and-release logs now include the plugin responsible for an
action. The log also records cases where a rule already existed and
where an error occurred during an operation.
ssl-log-ext had a bug that caused data present in the SSL connection
not to be logged in some cases. Specifically, the script relied on the
base ssl script to initialize some data structures; as a result,
protocol messages arriving before the base ssl script had handled a
message were not logged.
This commit changes ssl-log-ext to also initialize those data
structures; messages are now correctly included in the log in all cases.
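The shape of the fix, as a hedged sketch (the helper name is invented
and the real script's initialization may differ):

    # Make sure c$ssl exists before the extended script touches it, rather
    # than assuming base/protocols/ssl already created it.
    function ensure_ssl_info(c: connection)
        {
        if ( ! c?$ssl )
            c$ssl = SSL::Info($ts=network_time(), $uid=c$uid, $id=c$id);
        }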