Commit graph

1287 commits

Author SHA1 Message Date
Tim Wojtulewicz
1158757b2b Merge remote-tracking branch 'origin/topic/awelzel/move-broker-to-cluster-publish'
* origin/topic/awelzel/move-broker-to-cluster-publish:
  netcontrol: Move to Cluster::publish()
  openflow: Move to Cluster::publish()
  netcontrol/catch-and-release: Move to Cluster::publish()
  config: Move to Cluster::publish()
  ssl/validate-certs: Move to Cluster::publish()
  irc: Move to Cluster::publish()
  ftp: Move to Cluster::publish()
  dhcp: Move to cluster publish
  notice: Move to Cluster::publish()
  intel: Move to Cluster::publish()
  sumstats: Move to Cluster::publish()
2024-12-12 13:18:21 -07:00
Tim Wojtulewicz
25554fa668 Merge remote-tracking branch 'origin/topic/awelzel/fix-cluster-publish-any'
* origin/topic/awelzel/fix-cluster-publish-any:
  cluster/Backend: Handle unspecified table/set
  cluster: Fix Cluster::publish() of Broker::Data
  cluster: Be noisy when attempting to connect to an unknown node
2024-12-12 13:17:08 -07:00
Arne Welzel
3d55341690 netcontrol: Move to Cluster::publish() 2024-12-12 17:54:42 +01:00
Arne Welzel
b2df78c0bb openflow: Move to Cluster::publish() 2024-12-12 17:54:42 +01:00
Arne Welzel
66f6149662 config: Move to Cluster::publish() 2024-12-12 17:54:42 +01:00
Arne Welzel
a9243bafcc notice: Move to Cluster::publish() 2024-12-12 17:54:42 +01:00
Arne Welzel
347faf5e86 intel: Move to Cluster::publish() 2024-12-12 17:54:42 +01:00
Arne Welzel
f58a2c2ca8 sumstats: Move to Cluster::publish() 2024-12-12 17:54:42 +01:00
Arne Welzel
271fc15041 cluster: Be noisy when attempting to connect to an unknown node
Mostly due to spending too much time wondering why nodes didn't connect
when there was a mismatch between "manager" and "manager-1" in the
cluster layout. Remove manager from test-all-policy-cluster test to
avoid connection attempts in this test.
2024-12-12 13:01:04 +01:00
Justin Azoff
10438408a5 Pre-compute the node topics for all pool entries.
A zeek script profile showed a small percentage of time spent in
Cluster::node_topic, but this never changes and can be cached.
2024-12-11 15:57:01 -05:00
Arne Welzel
a2249f7ecb cluster: Add Cluster::node_id(), allow redef of node_topic(), nodeid_topic()
This provides a way for non-broker cluster backends to override a
node's identifier and its own topics that it listens on by default.
2024-12-10 20:33:02 +01:00
Christian Kreibich
1c42bfc715 Merge branch 'topic/christian/disconnect-slow-peers'
* topic/christian/disconnect-slow-peers:
  Bump cluster testsuite to pull in Broker backpressure tests
  Expand documentation of Broker events.
  Add sleep() BiF.
  Add backpressure disconnect notification to cluster.log and via telemetry
  Remove unneeded @loads from base/misc/version.zeek
  Add Cluster::nodeid_to_node() helper function
  Support re-peering with Broker peers that fall behind
  Add Zeek-level configurability of Broker slow-peer disconnects
  Bump Broker to pull in disconnect feature and infinite-loop fix
  No need to namespace Cluster:: functions in their own namespace
2024-12-09 23:33:35 -08:00
Tim Wojtulewicz
ccefd66d37 Move python signatures to a separate file 2024-12-09 11:08:30 -07:00
Christian Kreibich
ead6134501 Add backpressure disconnect notification to cluster.log and via telemetry
This adds a Broker-specific script to the cluster framework, loaded only when
Zeek is running in cluster mode. It adds logging in cluster.log as well as
telemetry via a metrics counter for Broker-observed backpressure disconnects.

The new zeek_broker_backpressure_disconnects counter, labeled by the neighboring
peer that the reporting node has determined to be unresponsive, counts the
number of unpeerings for this reason.

Here the node "worker" has observed node "proxy" falling behind once:

# HELP zeek_broker_backpressure_disconnects_total Number of Broker peering drops due to a neighbor falling too far behind in message I/O
# TYPE zeek_broker_backpressure_disconnects_total counter
zeek_broker_backpressure_disconnects_total{endpoint="worker",peer="proxy"} 1

Includes small btest baseline update to reflect @load of a new script.
2024-12-06 15:18:05 -08:00
Christian Kreibich
46a11ec37d Add Cluster::nodeid_to_node() helper function
This translates backend-specific node identifiers (like Broker IDs) to
cluster nodes and their names, if available.
2024-12-06 15:18:05 -08:00
Christian Kreibich
0010e65f6d Support re-peering with Broker peers that fall behind
This adds re-peering at the Broker level for peers that Broker decided to
unpeer. We keep this at the Broker level since this behavior is specific to
it (as opposed to other cluster backends).

Includes baseline updates for btests that pick up on the new script's @load.
2024-12-06 15:18:05 -08:00
Dominik Charousset
4c4eb4b8e2 Add Zeek-level configurability of Broker slow-peer disconnects 2024-12-06 15:18:05 -08:00
Christian Kreibich
e81856a4af No need to namespace Cluster:: functions in their own namespace 2024-12-06 15:18:05 -08:00
Tim Wojtulewicz
bbd7f56dcc Add signatures for Python bytecode for 3.8-3.14 2024-12-06 13:45:46 -07:00
Johanna Amann
7b582bc345 Merge remote-tracking branch 'origin/topic/johanna/sqlite-pragmas'
* origin/topic/johanna/sqlite-pragmas:
  Options for SQLite log writer, eliminate duplicate definitions
  Test synchronous/journal mode options for SQLite log writer
  Added default options for synchronous and journal mode
  Support for synchronous and journal_mode
2024-11-27 08:32:08 +00:00
Johanna Amann
d592942ccb Test synchronous/journal mode options for SQLite log writer
Also adds some small tweaks and adds the new feature to NEWS.
2024-11-26 12:26:38 +00:00
Arne Welzel
fc12be1f17 cluster/setup-connections: Switch to Cluster::subscribe(), short-circuit broker
For the time being, this is easiest, otherwise we'd need to
conditionally load a broker-specific policy script based on
Cluster::backend being set.
2024-11-26 12:58:23 +01:00
Arne Welzel
ef04a199c8 cluster: Add Cluster scoped bifs
... and a broker based test using Cluster::publish() and
Cluster::subscribe().
2024-11-26 12:58:23 +01:00
Mymaqn
3ca56f7e0f Added default options for synchronous and journal mode
Added enum options SQLITE_SYNCHRONOUS_DEFAULT and SQLITE_JOURNAL_MODE_DEFAULT
and changed the default to be these instead.
2024-11-26 11:08:30 +00:00
Mymaqn
6e026ba313 Support for synchronous and journal_mode 2024-11-26 11:08:18 +00:00
Arne Welzel
97f05b2f8c Merge remote-tracking branch 'origin/topic/awelzel/pluggable-cluster-backends-part1'
* origin/topic/awelzel/pluggable-cluster-backends-part1:
  btest: Test Broker::make_event() together with Cluster::publish_hrw()
  btest: Add cluster dir, minimal test for enum value
  broker: Add shim plugin adding a backend component
  zeek-setup: Instantiate backend::manager
  cluster: Add to src/CMakeLists.txt
  cluster: Add Components and ComponentManager for new components
  cluster/Backend: Interface for cluster backends
  cluster/Serializer: Interface for event and log serializers
  logging: Introduce logging/Types.h
  SerialTypes/Field: Allow default construction and add move constructor
  DebugLogger: Add cluster debugging stream
  plugin: Add component enums for pluggable cluster backends
  broker: Pass frame to MakeEvent()
2024-11-22 12:53:23 +01:00
Arne Welzel
fb23a06f6f cluster/Backend: Interface for cluster backends 2024-11-22 10:43:50 +01:00
Arne Welzel
91f5945f92 sumstat/non-cluster: Move last epoch processing to zeek_done()
@Sheco reported that standalone epoch processing may exclude scheduled
events when the final sumstat epoch runs before. For example, this easily
happens when attempting to do sumstat observations within connection_state_remove().

Delay final epoch processing to zeek_done() instead.

This doesn't deal with the clustered version - this would need something
more elaborate and potentially a mechanism to delay the shutdown of
other cluster nodes until/after sumstat processing completed.
2024-11-18 15:58:01 +01:00
Arne Welzel
aabc4a4114 sumstats: Remove copy() for Broker::publish() calls
Serialization happens immediately at Broker::publish() time, there
should be no caching issues.
2024-11-14 12:59:22 +01:00
Arne Welzel
6abb9d7eda broker/Eventhandler: Deprecate Broker::auto_publish() for v8.1
Relates to #3637
2024-11-14 12:59:22 +01:00
Arne Welzel
883ae3694c sumstats: Remove Broker::auto_publish() 2024-11-14 12:59:22 +01:00
Arne Welzel
b32153037a openflow: Remove Broker::auto_publish() 2024-11-14 12:59:22 +01:00
Arne Welzel
08f2198d3e frameworks/notice: Remove Broker::auto_publish() 2024-11-14 12:59:22 +01:00
Arne Welzel
219d621234 netcontrol: Replace Broker::auto_publish()
I'd think we could drop the cluster.zeek and non-cluster.zeek and
just unconditionally do the publish(), but keeping it for now.
2024-11-06 15:27:48 +01:00
Arne Welzel
93478a246e intel: Switch to Cluster::publish()
This isn't quite making things a lot nicer, but more explicit.
2024-11-06 15:27:48 +01:00
Christian Kreibich
66173633f4 Merge branch 'topic/christian/telemetry-make-bifs-primary'
* topic/christian/telemetry-make-bifs-primary:
  Telemetry framework: move BIFs to the primary-bif stage
  Minor comment tweaks for init-frameworks-and-bifs.zeek
2024-10-24 07:09:16 -07:00
Arne Welzel
70872673a1 telemetry: Invoke Telemetry::sync() only at scrape/collection time
This stops invoking Telemetry::sync() via a scheduled event and instead
only invokes it on-demand. This makes metric collection network time
independent and lazier, too.

With Prometheus scrape requests being processed on Zeek's main thread
now, we can safely invoke the script layer Telemetry::sync() hook.

Closes #3947
2024-10-22 18:49:11 +02:00
Christian Kreibich
71f7e89974 Telemetry framework: move BIFs to the primary-bif stage
This moves the Telemetry framework's BIF-defined functionalit from the
secondary-BIFs stage to the primary one. That is, this functionality is now
available from the end of init-bare.zeek, not only after the end of
init-frameworks-and-bifs.zeek.

This allows us to use script-layer telemetry in our Zeek's own code that get
pulled in during init-frameworks-and-bifs.

This change splits up the BIF features into functions, constants, and types,
because that's the granularity most workable in Func.cc and NetVar. It also now
defines the Telemetry::MetricsType enum once, not redundantly in BIFs and script
layer.

Due to subtle load ordering issues between the telemetry and cluster frameworks
this pushes the redef stage of Telemetry::metrics_port and address into
base/frameworks/telemetry/options.zeek, which is loaded sufficiently late in
init-frameworks-and-bifs.zeek to sidestep those issues. (When not doing this,
the effect is that the redef in telemetry/main.zeek doesn't yet find the
cluster-provided values, and Zeek does not end up listening on these ports.)

The need to add basic Zeek headers in script_opt/ZAM/ZBody.cc as a side-effect
of this is curious, but looks harmless.

Also includes baseline updates for the usual btests and adds a few doc strings.
2024-10-18 09:56:29 -07:00
Arne Welzel
6bb7b9d726 scripts/base/cluster: Move active node management into node_down()
With the idea of an alternative cluster backend, we should
not maintain Cluster state within low-level Broker events.
2024-09-27 15:32:09 +02:00
Robin Sommer
0d3296590d
Spicy: Register well-known ports through an event handler.
This avoids the earlier problem of not tracking ports correctly in
scriptland, while still supporting `port` in EVT files and `%port` in
Spicy files.

As it turns out we are already following the same approach for file
analyzers' MIME types, so I'm applying the same pattern: it's one
event per port, without further customization points. That leaves the
patch pretty small after all while fixing the original issue.
2024-08-22 10:24:55 +02:00
Tim Wojtulewicz
4e9d843cec Remove deprecated Cluster::Node::interface field 2024-08-07 11:58:22 -07:00
Tim Wojtulewicz
a716903f3a Remove deprecated time machine settings 2024-08-07 11:58:21 -07:00
Arne Welzel
bf9704f339 telemetry: Deprecate prometheus.zeek policy script
With Cluster::Node$metrics_port being optional, there's not really
a need for the extra script. New rule, if a metrics_port is set, the
node will attempt to listen on it.

Users can still redef Telemetry::metrics_port *after*
base/frameworks/telemetry was loaded to change the port defined
in cluster-layout.zeek.
2024-07-21 17:49:21 +02:00
Jan Grashoefer
0c06c604ab Add logging of disabled analyzers to analyzer.log 2024-07-09 18:22:43 +02:00
Christian Kreibich
fa6361af56 Management framework: propagate metrics port from agent
This propagates the metrics port from the node config passed through the
supervisor all the way into the script layer.
2024-07-08 23:05:24 -07:00
Christian Kreibich
563704a26e Management framework: add metrics port in management & Supervisor node records
This allows setting a metrics port for creation in new nodes.
2024-07-08 23:05:24 -07:00
Christian Kreibich
3ecacf4f50 Comment-only tweaks for telemetry-related settings.
These weren't quite accurate any more.
2024-07-08 23:05:24 -07:00
Christian Kreibich
737b1a2013 Remove the Supervisor's internal ClusterEndpoint struct.
This eliminates one place in which we currently need to mirror changes to the
script-land Cluster::Node record. Instead of keeping an exact in-core equivalent, the
Supervisor now treats the data structure as opaque, and stores the whole cluster
table as a JSON string.

We may replace the script-layer Supervisor::ClusterEndpoint in the future, using
Cluster::Node directly. But that's a more invasive change that will affect how
people invoke Supervisor::create() and similars.

Relying on JSON for serialization has the side-effect of removing the
Supervisor's earlier quirk of using 0/tcp, not 0/unknown, to indicate unused
ports in the Supervisor::ClusterEndpoint record.
2024-07-02 14:52:17 -07:00
Christian Kreibich
a98ec6b08b Provide a script-layer equivalent to Supervisor::__init_cluster().
If the script layer is able to access the current node's config via
Supervisor::node(), it can handle populating Cluster::nodes. That code
is much more straightforward than an equivalent in-core implementation
(especially with the upcoming change to the cluster table's implementation).
This introduces base/frameworks/cluster/supervisor.zeek and
Cluster::Supervisor::__init_cluster_nodes() for that purpose.

The @load of the Supervisor API in cluster/main.zeek isn't technically
necessary since we already load it explicitly even in init-bare.zeek,
but being explicit seems better.
2024-07-02 14:52:13 -07:00
Tim Wojtulewicz
d549e3d56a Add Telemetry::metrics_address option 2024-06-07 09:28:27 -07:00