Commit graph

71 commits

Author SHA1 Message Date
Tim Wojtulewicz
d95affde4d Remove deprecations tagged for v8.1 2025-08-12 10:19:03 -07:00
Benjamin Bannier
d5fd29edcd Prefer explicit construction to coercion in record initialization
While we support initializing records via coercion from an expression
list, e.g.,

    local x: X = [$x1=1, $x2=2];

this can sometimes obscure the code to readers, e.g., when assigning to
value declared and typed elsewhere. The language runtime has a similar
overhead since instead of just constructing a known type it needs to
check at runtime that the coercion from the expression list is valid;
this can be slower than just writing the readible code in the first
place, see #4559.

With this patch we use explicit construction, e.g.,

    local x = X($x1=1, $x2=2);
2025-07-11 16:28:37 -07:00
Christian Kreibich
68fadd0464 Lower listen/connect retry intervals in Broker and the cluster framework to 1sec
The former defaults (30sec, 1min) can slow down cluster startup and recovery
considerably, and other systems have more aggressive intervals still.
2025-04-25 10:22:35 -07:00
Christian Kreibich
841a40ff88 Switch Broker's default backpressure policy to drop_oldest, bump buffer sizes
At every site where we've dug into backpressure disconnect findings, it has been
the case that the default values were too small. 8192, so 4x the old default,
suffices at every site to drown out premature disconnects.

With metrics now available for the send buffers regardless of backpressure
overflow policy, this also switches the default from "disconnect" to
"drop_oldest" (for both peers and websockets), meaning that peerings remain
untouched but the oldest queued message simply gets dropped when a new message
is enqueued. With this policy, the number of backpressure overflows is then
simply the count of discarded messages, something that users can tune to see
drop to zero in everyday use.  Another benefit is that marginal overflows cause
less message loss than when an entire buffer's worth (plus potentially more
in-flight messages) gets thrown out with a disconnect.
2025-04-25 10:22:35 -07:00
Christian Kreibich
5008f586ea Deprecate Broker::congestion_queue_size and stop using it internally
Since a reorg in the Broker library (commit b04195183) that revamped flow
control and that we pulled in with Zeek 5.0, this setting hasn't done
anything. Broker's endpoint::make_subscriber() and
endpoint::make_status_subscriber() take a queue size argument (with a default
value) that simply gets dropped in the eventual subscriber::make() call. See:

b041951835 (diff-5c0d2baa7981caeb6a4080708ddca6ad929746d10c73d66598e46d7c2c03c8deL34-R178)
2025-04-25 10:22:35 -07:00
Christian Kreibich
f5fbad23ff Add peer buffer update tracking to the Broker manager's event_observer
This implements basic tracking of each peering's current fill level, the maximum
level over a recent time interval (via a new Broker::buffer_stats_reset_interval
tunable, defaulting to 1min), and the number of times a buffer overflows. For
the disconnect policy this is the number of depeerings, but for drop_newest and
drop_oldest it implies the number of messages lost.

This doesn't use "proper" telemetry metrics for a few reasons: this tracking is
Broker-specific, so we need to track each peering via endpoint_ids, while we
want the metrics to use Cluster node name labels, and the latter live in the
script layer. Using broker::endpoint_id directly as keys also means we rely on
their ability to hash in STL containers, which should be fast.

This does not track the buffer levels for Broker "clients" (as opposed to
"peers"), i.e. WebSockets, since we currently don't have a way to name these,
and we don't want to use ephemeral Broker IDs in their telemetry.

To make the stats accessible to the script layer the Broker manager (via a new
helper class that lives in the event_observer) maintains a TableVal mapping
Broker IDs to a new BrokerPeeringStats record. The table's members get updated
every time that table is requested. This minimizes new val instantiation and
allows the script layer to customize the BrokerPeeringStats record by redefing,
updating fields, etc. Since we can't use Zeek vals outside the main thread, this
requires some care so all table updates happen only in the Zeek-side table
updater, PeerBufferState::GetPeeringStatsTable().
2025-04-24 22:47:18 -07:00
Arne Welzel
ab25e5d24b broker/main: Reference Cluster::publish() for auto_publish() deprecation
In hindsight, this is the better thing to do and with Zeek 7.2 we should
be confident enough that it'll work.
2025-04-23 14:27:43 +02:00
Arne Welzel
a7423104e1 broker/main: Deprecate Broker::listen_websocket()
Optimistically deprecate Broker::listen_websocket() and promote
Cluster::listen_websocket() instead.
2025-04-23 14:27:43 +02:00
Christian Kreibich
549e678dff Use Broker peering directionality when re-peering after backpressure overflows
This avoids creating pointless connection reattempts to ephemeral TCP
client-side ports, which have been cluttering up the Broker logs since 7.1.
2025-04-21 14:08:42 -07:00
Christian Kreibich
b430d5235c Expand Broker APIs to allow tracking directionality of peering establishment
This provides ways to figure out for a given peer, or a given address/port pair,
whether the local node originally established the peering.
2025-04-21 14:08:42 -07:00
Arne Welzel
6bc36e8cf8 broker/main: Adapt enum values to agree with comm.bif
Logic to detect this error already existed, but due to enum identifiers
not having a value set, it never triggered before.

Should probably backport this one.
2025-04-04 15:36:42 +02:00
Arne Welzel
14697ea6ba Merge remote-tracking branch 'origin/topic/neverlord/broker-logging'
* origin/topic/neverlord/broker-logging:
  Integrate review feedback
  Hook into Broker logs via its new API
2025-03-31 18:53:43 +02:00
Dominik Charousset
20b3eca257 Integrate review feedback 2025-02-15 16:37:24 +01:00
Dominik Charousset
30615f425e Hook into Broker logs via its new API
The new Broker API allows us to provide a custom logger to Broker that
pulls previously unattainable context information out of Broker to put
them into broker.log for users of Zeek.

Since Broker log events happen asynchronously, we cache them in a queue
and use a flare to notify Zeek of activity. Furthermore, the Broker
manager now implements the `ProcessFd` function to avoid unnecessary
polling of the new log queue. As a side effect, data stores are polled
less as well.
2025-02-08 16:28:02 +01:00
Tim Wojtulewicz
c1a8f8b763 Fix errors from rst linting on the generated docs 2025-01-24 11:41:36 -07:00
Christian Kreibich
0010e65f6d Support re-peering with Broker peers that fall behind
This adds re-peering at the Broker level for peers that Broker decided to
unpeer. We keep this at the Broker level since this behavior is specific to
it (as opposed to other cluster backends).

Includes baseline updates for btests that pick up on the new script's @load.
2024-12-06 15:18:05 -08:00
Dominik Charousset
4c4eb4b8e2 Add Zeek-level configurability of Broker slow-peer disconnects 2024-12-06 15:18:05 -08:00
Arne Welzel
6abb9d7eda broker/Eventhandler: Deprecate Broker::auto_publish() for v8.1
Relates to #3637
2024-11-14 12:59:22 +01:00
Tim Wojtulewicz
128bf3fe9f Remove Broker metrics configuration values and methods 2024-05-31 13:30:31 -07:00
Tim Wojtulewicz
97a35011a7 Add necessary script-land changes 2024-05-31 13:30:31 -07:00
Arne Welzel
f35cf228dc broker/store: Extend SQLiteOptions around data safety and performance
Add configurability of synchronous and journal_mode for SQLite backed
Broker data stores. Setting these to synchronous=normal and journal_mode=wal
can significantly improve throughput at the cost of some durability in
the presence of power loss or OS crash. In the context of Zeek, this is
likely more than acceptable.

Additionally, add integrity_check and failure_mode options to support deleting
and re-opening a corrupted SQLite database at store creation.

Closes #2698
2023-01-30 10:25:37 +01:00
Josh Soref
21e0d777b3 Spelling fixes: scripts
* accessing
* across
* adding
* additional
* addresses
* afterwards
* analyzer
* ancillary
* answer
* associated
* attempts
* because
* belonging
* buffer
* cleanup
* committed
* connects
* database
* destination
* destroy
* distinguished
* encoded
* entries
* entry
* hopefully
* image
* include
* incorrect
* information
* initial
* initiate
* interval
* into
* java
* negotiation
* nodes
* nonexistent
* ntlm
* occasional
* omitted
* otherwise
* ourselves
* paragraphs
* particular
* perform
* received
* receiver
* referring
* release
* repetitions
* request
* responded
* retrieval
* running
* search
* separate
* separator
* should
* synchronization
* target
* that
* the
* threshold
* timeout
* transaction
* transferred
* transmission
* triggered
* vetoes
* virtual

Signed-off-by: Josh Soref <2119212+jsoref@users.noreply.github.com>
2022-11-02 17:36:39 -04:00
Dominik Charousset
6565b4862d Add missing bits for Broker::metrics_import_topics 2022-08-22 17:10:07 +02:00
Arne Welzel
c2ca92d772 Try adding Broker::metrics_import_topics, stuck 2022-08-08 17:20:13 +02:00
Robin Sommer
d99f041ac5
Add WebSocket support for exchanging events with external clients.
This exposes Broker's new WebSocket support in Zeek. To enable it,
call `Broker::listen_websocket()`. Zeek will then start listening on
port 9997 for incoming WebSocket connections.

See the Broker documentation for a description of the message format
expected over these WebSocket connections.
2022-06-02 10:31:52 +02:00
Christian Kreibich
90b7c6961e Simplify the supervisor's listen() on default address/port 2021-08-18 12:35:49 -07:00
Dominik Charousset
7767c3d36c Sync new broker options, fix name inconsistencies 2021-05-25 17:22:45 +02:00
Dominik Charousset
f9cd05f00b Integrate new Broker metric exporter parameters 2021-05-24 17:20:48 +02:00
Jon Siwek
6af436aad3 GH-1426: Improve handling of Broker data store creation failures
Broker::create_master() and Broker::create_clone() now return
a valid value even when there's a failure to open the backend database
(e.g. SQLite filesystem error).  In that case, the returned value can
still be passed into other data store operations, but they'll fail
immediately with an error.  Broker::is_closed() can now also be used to
determine whether the data store creation calls failed.
2021-03-06 02:32:29 -08:00
Jon Siwek
321a027d07 Remove unusable/broken RocksDB code and options
The Broker RockSDB data store backend was previously unusable
and broken, so all code and options related to it are now removed.
2021-01-11 11:12:59 -08:00
Christian Kreibich
1bd658da8f Support for log filter policy hooks
This adds a "policy" hook into the logging framework's streams and
filters to replace the existing log filter predicates. The hook
signature is as follows:

    hook(rec: any, id: Log::ID, filter: Log::Filter);

The logging manager invokes hooks on each log record. Hooks can veto
log records via a break, and modify them if necessary. Log filters
inherit the stream-level hook, but can override or remove the hook as
needed.

The distribution's existing log streams now come with pre-defined
hooks that users can add handlers to. Their name is standardized as
"log_policy" by convention, with additional suffixes when a module
provides multiple streams. The following adds a handler to the Conn
module's default log policy hook:

    hook Conn::log_policy(rec: Conn::Info, id: Log::ID, filter: Log::Filter)
            {
            if ( some_veto_reason(rec) )
                break;
            }

By default, this handler will get invoked for any log filter
associated with the Conn::LOG stream.

The existing predicates are deprecated for removal in 4.1 but continue
to work.
2020-09-30 12:32:45 -07:00
Robin Sommer
c3f4971eb2 Merge remote-tracking branch 'origin/topic/johanna/table-changes'
* origin/topic/johanna/table-changes: (26 commits)
  TableSync: try to make test more robust & add debug output
  Increase timeouts to see if FreeBSD will be happy with this.
  Try to make FreeBSD test happy with larger timeout.
  TableSync: refactor common functionality into function
  TableSync: don't raise &on_change, smaller fixes
  TableSync: rename auto_store -> table_store
  SyncTables: address feedback part 1 - naming (broker and zeek)
  BrokerStore <-> Zeek Tables: cleanup and bug workaround
  Zeek Table<->Brokerstore: cleanup, documentation, small fixes
  BrokerStore<->Zeek table: adopt to recent Zeek API changes
  BrokerStore<->Zeek Tables Fix a few small test failures.
  BrokerStore<->Zeek tables: allow setting storage location & tests
  BrokerStore<->Zeek tables: &backend works for in-memory stores.
  BrokerStore<->Zeek table - introdude &backend attribute
  BrokerStore<->Zeek tables: test for clones synchronizing to a master
  BrokerStore<->Zeek tables: load persistent tables on startup.
  Brokerstore<->Tables: attribute conflicts
  Zeek/Brokerstore updates: expiration
  Zeek/Brokerstore updates: add test that includes updates from clones
  Zeek/Brokerstore updates: first working end-to-end test
  ...
2020-07-21 15:39:39 +00:00
Johanna Amann
930a5c8ebd TableSync: rename auto_store -> table_store 2020-07-17 11:40:59 -07:00
Johanna Amann
6d2aa84952 SyncTables: address feedback part 1 - naming (broker and zeek)
This commit fixes capitalization issues.
2020-07-17 10:56:28 -07:00
Jon Siwek
c84a51ac09 GH-837: emit Reporter errors for Broker errors
Instead of only writing them in broker.log, which may be easy to
overlook.
2020-07-16 18:07:00 -07:00
Johanna Amann
2b2a40f49c Zeek Table<->Brokerstore: cleanup, documentation, small fixes
This commit adds script/c++ documentation and fixes a few loose ends.
It also adds tests for corner cases and massively improves error
messages.

This also actually introduces type-compatibility checking and introduces
a new attribute that lets a user override this if they really know what
they are doing. I am not quite sure if we should really let that stay in
- but it can be very convenient to have this functionality.

One test is continuing to fail - the expiry test is very flaky. This is,
I think, caused by delays of the broker store forwarding. I am unsure if
we can actually do anything about that.
2020-07-10 16:58:34 -07:00
Johanna Amann
318a72c303 BrokerStore<->Zeek table - introdude &backend attribute
The &backend attribute allows for a much more convenient way of
interacting with brokerstores. One does not need to create a broker
store anymore - instead all of this is done internally.

The current state of this partially works. This should work fine for
persistence - but clones are currently not yet correctly attached.
2020-06-30 16:33:52 -07:00
Jon Siwek
43e54c7930 GH-780: Prevent log batches from indefinite buffering
Logs that got sent sparsely or burstily would get buffered for long
periods of time since the logic to flush them only does so on the next
log write.  In the worst case, a subsequent log write could never happen
and cause a log entry to be indefinitely buffered.

This fix introduces a recurring event/timer to simply flush all pending
logs at frequency of Broker::log_batch_interval.
2020-02-05 13:06:52 -08:00
Jon Siwek
5b64c35185 Switch default CAF scheduler policy to work sharing
It may generally be better for our default use-case, as workers may
save a few percent cpu utilization as this policy does not have to
use any polling like the stealing policy does.

This also helps avoid a potential issue with the implementation of
spinlocks used in the work-stealing policy in current CAF versions,
where there's some conditions where lock contention causes a thread
to spin for long periods without relinquishing the cpu to others.
2019-06-28 16:34:33 -07:00
Jon Siwek
1ce0fcce49 GH-387: update Broker topic names to use "zeek/" prefix 2019-05-29 15:56:37 -07:00
Jon Siwek
7f0fb49612 Add an internal getenv wrapper function: zeekenv
It maps newer environment variable names starting with ZEEK to the
legacy names starting with BRO.
2019-05-23 20:42:42 -07:00
Daniel Thayer
1a74516db1 Rename all BRO-prefixed environment variables
For backward compatibility when reading values, we first check
the ZEEK-prefixed value, and if not set, then check the corresponding
BRO-prefixed value.
2019-05-22 00:12:31 -05:00
Daniel Thayer
be182aac83 More bro-to-zeek renaming in scripts and other files 2019-05-16 02:36:41 -05:00
Jon Siwek
cb6b9a1f1a Allow tuning Broker log batching via scripts
Via redefining "Broker::log_batch_size" or "Broker::log_batch_interval"
2019-05-08 12:44:55 -07:00
Jon Siwek
aebcb1415d GH-234: rename Broxygen to Zeexygen along with roles/directives
* All "Broxygen" usages have been replaced in
  code, documentation, filenames, etc.

* Sphinx roles/directives like ":bro:see" are now ":zeek:see"

* The "--broxygen" command-line option is now "--zeexygen"
2019-04-22 19:45:50 -07:00
Jon Siwek
a994be9eeb Merge remote-tracking branch 'origin/topic/seth/zeek_init'
* origin/topic/seth/zeek_init:
  Some more testing fixes.
  Update docs and tests for bro_(init|done) -> zeek_(init|done)
  Implement the zeek_init handler.
2019-04-19 11:24:29 -07:00
Seth Hall
8cefb9be42 Implement the zeek_init handler.
Implements the change and a test.
2019-04-14 08:37:35 -04:00
Daniel Thayer
18bd74454b Rename all scripts to have ".zeek" file extension 2019-04-11 21:12:40 -05:00
Jon Siwek
0d685efbf5 Add Broker::peer_counts_as_iosource option
Disabling this option allows one to read pcaps, but still initiate
Broker peerings and automatically exit when done processing the pcap
file.  The default behavior would normally cause Broker::peer() to
prevent shutting the process down even after done reading the pcap.
2019-01-16 19:03:35 -06:00
Jon Siwek
4bd6da7186 Update default Broker/CAF thread tuning 2018-09-07 17:50:28 -05:00