Commit graph

18186 commits

Author SHA1 Message Date
Arne Welzel
e459d96fb6 QUIC: Do not consume EncryptedLongPacketPayload
The payload is already consumed within the InitialPacket unit. Consuming
it again caused Zeek to ignore the remaining packets in UDP datagrams
that carry multiple packets. The baseline changes showing I followed by
a new H indicate that the INITIAL packet was followed by a HANDSHAKE
packet, which Zeek previously discarded.
2025-05-05 14:34:11 +02:00
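The effect described in this commit can be sketched with a toy model. The framing below (a 1-byte type tag followed by a 2-byte big-endian payload length) is an assumption made for illustration and is not the real QUIC long-header layout; `split_packets` and `make_packet` are hypothetical helpers, not Zeek or Spicy code.

```python
import struct


def make_packet(tag: str, payload: bytes) -> bytes:
    """Build one toy packet: type tag, 2-byte length, payload."""
    return tag.encode("ascii") + struct.pack(">H", len(payload)) + payload


def split_packets(datagram: bytes) -> list[tuple[str, bytes]]:
    """Split a datagram into (type, payload) pairs using the toy framing."""
    packets = []
    offset = 0
    while offset < len(datagram):
        tag = chr(datagram[offset])
        (length,) = struct.unpack_from(">H", datagram, offset + 1)
        start = offset + 3
        packets.append((tag, datagram[start:start + length]))
        # Advance past the payload exactly once. Skipping it a second time
        # would jump over the packets that follow in the same datagram.
        offset = start + length
    return packets


# One datagram carrying an "I" (initial) and an "H" (handshake) packet.
datagram = make_packet("I", b"initial") + make_packet("H", b"handshake")
history = "".join(tag for tag, _ in split_packets(datagram))
print(history)
```

With the payload consumed only once per packet, both packets in the datagram are visited, which corresponds to the I followed by H in the baselines.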
Arne Welzel
f63677fcd5 QUIC: Fix ACK frame parsing
Later tests will exercise this.
2025-04-30 15:54:42 +02:00
Arne Welzel
d5e1dc27c6 Merge branch 'topic/mohan/intel-event-groups' of https://github.com/Mohan-Dhawan/zeek
* 'topic/mohan/intel-event-groups' of https://github.com/Mohan-Dhawan/zeek:
  coalesce smtp handlers for ADDR
  Add fine-grained groups for Intel events
2025-04-29 15:00:58 +02:00
Mohan Dhawan
36c4d112c8 coalesce smtp handlers for ADDR 2025-04-29 16:30:31 +05:30
Arne Welzel
5bf660a9ce Merge remote-tracking branch 'origin/topic/awelzel/cluster-coverity-fixes'
* origin/topic/awelzel/cluster-coverity-fixes:
  broker/WebSocketShim: Check RegisterFd() return
  cluster/OnLoop: Fix coverity report about proc accessed without lock
2025-04-28 19:41:10 +02:00
Arne Welzel
540baa89af Merge remote-tracking branch 'origin/topic/awelzel/3045-no-holes-in-vectors'
* origin/topic/awelzel/3045-no-holes-in-vectors:
  broker/Data/data_to_val: Fail on vectors/lists with holes
2025-04-28 18:24:25 +02:00
Arne Welzel
7092db6318 broker/Data/data_to_val: Fail on vectors/lists with holes
Instead of simply removing holes from vectors or lists when converting
from Val to Broker format, error out as the receiver has no chance to
reconstruct where the hole might have been.

We could encode holes with broker::none, but this will put unnecessary
burden on language bindings and users due to the potential optionality.
Think of a std::vector<uint64_t> that technically needs to be a
std::vector<std::optional<uint64_t>> to represent optional elements
properly.

Closes #3045
2025-04-28 18:23:37 +02:00
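A minimal sketch of the trade-off this commit describes, written in Python rather than Zeek's C++. The names `to_wire`, `to_wire_dropping_holes`, and `HoleError` are hypothetical, and `None` stands in for an unset vector element.

```python
from typing import Optional, Sequence


class HoleError(ValueError):
    """Raised when a sequence with unset elements cannot be converted."""


def to_wire_dropping_holes(values: Sequence[Optional[int]]) -> list[int]:
    """Old-style behaviour: silently remove holes (loses positions)."""
    return [v for v in values if v is not None]


def to_wire(values: Sequence[Optional[int]]) -> list[int]:
    """New-style behaviour: refuse to convert a sequence that has holes."""
    for index, value in enumerate(values):
        if value is None:
            raise HoleError(f"hole at index {index}")
    return list(values)


# Two different inputs collapse to the same output once holes are dropped,
# so a receiver cannot tell where the hole was.
a = to_wire_dropping_holes([1, None, 3])
b = to_wire_dropping_holes([1, 3, None])

try:
    to_wire([1, None, 3])
    failed = False
except HoleError:
    failed = True
```

Failing early keeps the receiving side's element type a plain integer list instead of a list of optionals.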
Arne Welzel
d02588d25c broker/WebSocketShim: Check RegisterFd() return 2025-04-28 16:24:25 +02:00
Arne Welzel
4101efed4f cluster/OnLoop: Fix coverity report about proc accessed without lock
Coverity complains proc is set under a lock, but accessed in Process()
without a lock. Fix this by setting it in Close() also without locking.
The proc member should only ever be accessed by the main thread.
2025-04-28 16:23:08 +02:00
Tim Wojtulewicz
b9b268bd86 Merge remote-tracking branch 'origin/topic/timw/use-after-move'
* origin/topic/timw/use-after-move:
  Fix use-after-move in recent broker changes
2025-04-25 16:11:56 -07:00
Tim Wojtulewicz
f8d2f30cec Fix use-after-move in recent broker changes 2025-04-25 13:48:14 -07:00
Tim Wojtulewicz
223c5ab955 Start of 8.0.0 development 2025-04-25 11:59:08 -07:00
Tim Wojtulewicz
aefcae2e2e Update docs submodule [nomail] [skip ci] 2025-04-25 11:10:16 -07:00
Tim Wojtulewicz
82bf555f7d Merge branch 'topic/timw/4218-lowercase-http'
* topic/timw/4218-lowercase-http:
  Ignore case when matching prefix in http analyzer
2025-04-25 10:33:39 -07:00
Kshitiz Bartariya
40935c31b1 Ignore case when matching prefix in http analyzer 2025-04-25 10:33:11 -07:00
Tim Wojtulewicz
4f65b89edf Merge remote-tracking branch 'origin/topic/timw/seven-two-news'
* origin/topic/timw/seven-two-news:
  Updates for the various Broker changes
  Add versions of bundled dependencies
  Fix a few typos.
  Additional user contributions for NEWS
  NEWS addition for cluster backends
  NEWS additions for 7.2
  Reformat 7.2 NEWS entries for consistent line lengths
2025-04-25 10:25:12 -07:00
Christian Kreibich
fee65e83ee Updates for the various Broker changes 2025-04-25 10:24:07 -07:00
Tim Wojtulewicz
3d584011a0 Add versions of bundled dependencies 2025-04-25 10:24:07 -07:00
Christian Kreibich
3dbb5b98f3 Fix a few typos. 2025-04-25 10:24:07 -07:00
Christian Kreibich
03e4d084b3 Additional user contributions for NEWS
Beyond PRs these also include (non-trivial, non-support) GitHub issues -- bug
reports, feature requests, etc.
2025-04-25 10:24:07 -07:00
Arne Welzel
8295c35f4b NEWS addition for cluster backends 2025-04-25 10:24:07 -07:00
Tim Wojtulewicz
b41e07ae0f NEWS additions for 7.2 2025-04-25 10:24:07 -07:00
Tim Wojtulewicz
ad4fa22889 Reformat 7.2 NEWS entries for consistent line lengths 2025-04-25 10:24:07 -07:00
Christian Kreibich
ebd0207352 Merge branch 'topic/christian/broker-tuning'
* topic/christian/broker-tuning:
  Lower listen/connect retry intervals in Broker and the cluster framework to 1sec
  Bump cluster testsuite
  Switch Broker's default backpressure policy to drop_oldest, bump buffer sizes
  Deprecate Broker::congestion_queue_size and stop using it internally
2025-04-25 10:23:55 -07:00
Christian Kreibich
68fadd0464 Lower listen/connect retry intervals in Broker and the cluster framework to 1sec
The former defaults (30sec, 1min) can slow down cluster startup and recovery
considerably, and other systems have more aggressive intervals still.
2025-04-25 10:22:35 -07:00
Christian Kreibich
7540d48fd5 Bump cluster testsuite
This pulls in an update for the backpressure disconnect tests, which now need to
set the policy explicitly.
2025-04-25 10:22:35 -07:00
Christian Kreibich
841a40ff88 Switch Broker's default backpressure policy to drop_oldest, bump buffer sizes
At every site where we've dug into backpressure disconnect findings, it has been
the case that the default values were too small. 8192, so 4x the old default,
suffices at every site to drown out premature disconnects.

With metrics now available for the send buffers regardless of backpressure
overflow policy, this also switches the default from "disconnect" to
"drop_oldest" (for both peers and websockets), meaning that peerings remain
untouched but the oldest queued message simply gets dropped when a new message
is enqueued. With this policy, the number of backpressure overflows is then
simply the count of discarded messages, something that users can tune to see
drop to zero in everyday use.  Another benefit is that marginal overflows cause
less message loss than when an entire buffer's worth (plus potentially more
in-flight messages) gets thrown out with a disconnect.
2025-04-25 10:22:35 -07:00
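The drop_oldest policy described above can be sketched as a bounded queue. This is an illustrative model only, not Broker's implementation; the `SendBuffer` class is hypothetical, and its default capacity simply mirrors the 8192 figure from the commit message.

```python
from collections import deque


class SendBuffer:
    """Bounded send buffer that discards the oldest message when full."""

    def __init__(self, capacity: int = 8192):
        self.capacity = capacity
        self.queue = deque()
        self.overflows = 0  # with drop_oldest, this equals messages lost

    def enqueue(self, message) -> None:
        if len(self.queue) >= self.capacity:
            # Buffer is full: drop the oldest queued message and keep
            # the peering intact instead of disconnecting.
            self.queue.popleft()
            self.overflows += 1
        self.queue.append(message)


buf = SendBuffer(capacity=3)
for i in range(5):
    buf.enqueue(i)
print(list(buf.queue), buf.overflows)
```

Pushing five messages into a buffer of three leaves the three newest queued and records two overflows, one per discarded message.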
Christian Kreibich
5008f586ea Deprecate Broker::congestion_queue_size and stop using it internally
Since a reorg in the Broker library (commit b04195183) that revamped flow
control and that we pulled in with Zeek 5.0, this setting hasn't done
anything. Broker's endpoint::make_subscriber() and
endpoint::make_status_subscriber() take a queue size argument (with a default
value) that simply gets dropped in the eventual subscriber::make() call. See:

b041951835 (diff-5c0d2baa7981caeb6a4080708ddca6ad929746d10c73d66598e46d7c2c03c8deL34-R178)
2025-04-25 10:22:35 -07:00
Christian Kreibich
c1a5f70df8 Merge branch 'topic/christian/broker-backpressure-metrics'
* topic/christian/broker-backpressure-metrics:
  Add basic btest to verify that Broker peering telemetry is available.
  Add cluster framework telemetry for Broker's send-buffer use
  Add peer buffer update tracking to the Broker manager's event_observer
  Rename the Broker manager's LoggerAdapter
  Avoid race in the cluster/broker/publish-any btest
2025-04-25 10:04:09 -07:00
Christian Kreibich
35ab9d5c80 Add basic btest to verify that Broker peering telemetry is available. 2025-04-25 09:15:17 -07:00
Christian Kreibich
88a0cda8ca Add cluster framework telemetry for Broker's send-buffer use
This hooks into Telemetry::sync() to update Broker-level metrics tracking the
peerings' send buffer state. We do this in the cluster framework so we can label
the resulting metrics with Zeek cluster node names, not Broker's endpoint IDs.
2025-04-25 09:14:33 -07:00
Tim Wojtulewicz
6f52bdd29a Merge remote-tracking branch 'origin/topic/timw/clang-tidy-highway-hash'
* origin/topic/timw/clang-tidy-highway-hash:
  Skip linting on highwayhash and src/3rdparty files
2025-04-25 06:41:16 -07:00
Tim Wojtulewicz
c4613cf573 Merge remote-tracking branch 'origin/topic/timw/storage-framework-script-docs-updates'
* origin/topic/timw/storage-framework-script-docs-updates:
  Minor changes to storage framework script docs
2025-04-25 06:40:54 -07:00
Evan Typanski
154ee7720e Merge remote-tracking branch 'origin/topic/etyp/spicy-bump'
* origin/topic/etyp/spicy-bump:
  Bump Spicy
2025-04-25 08:41:02 -04:00
Evan Typanski
e98aae8b5f Bump Spicy 2025-04-25 13:07:02 +02:00
Arne Welzel
a852ecf913 Merge remote-tracking branch 'origin/topic/awelzel/backend-ready-callback-logic'
* origin/topic/awelzel/backend-ready-callback-logic:
  btest/cluster/websocket: Move no-subscriptions test
  cluster/websocket: Leverage ReadyToPublishCallback()
  cluster/zeromq: Implement DoReadyToPublishCallback()
  cluster/Backend: Add ReadyToPublishCallback() API
2025-04-25 10:06:36 +00:00
Arne Welzel
43a1bab960 btest/cluster/websocket: Move no-subscriptions test
...and also add one for broker.
2025-04-25 10:01:23 +00:00
Arne Welzel
2cd2a2b8a6 cluster/websocket: Leverage ReadyToPublishCallback()
Change WebSocket client handling to return only when the ready to
publish callback has been invoked.
2025-04-25 09:57:06 +00:00
Arne Welzel
643b926625 cluster/zeromq: Implement DoReadyToPublishCallback()
The ZeroMQ heuristic for "ready to publish" is to create a unique and
ephemeral subscription using the XSUB socket and observe it arrive on the
XPUB socket. At this point, visibility into other nodes' subscriptions
is provided.
2025-04-25 09:57:06 +00:00
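The probe heuristic from this commit can be modelled without any real sockets. The sketch below is an in-memory toy: `ToyProxy` and `ToyBackend` are hypothetical classes, no ZeroMQ library is used, and the probe topic naming is an assumption.

```python
import uuid
from typing import Callable


class ToyProxy:
    """In-memory stand-in for a central XPUB/XSUB proxy."""

    def __init__(self):
        self.listeners: list[Callable[[str], None]] = []

    def subscribe(self, topic: str) -> None:
        # Announce the new subscription to everyone watching the XPUB side.
        # Iterate over a copy because listeners may remove themselves.
        for listener in list(self.listeners):
            listener(topic)


class ToyBackend:
    """Reports readiness once its own probe subscription comes back."""

    def __init__(self, proxy: ToyProxy):
        self.proxy = proxy
        self.ready = False

    def ready_to_publish(self, callback: Callable[[], None]) -> None:
        probe = f"probe.{uuid.uuid4().hex}"  # unique, ephemeral topic

        def on_subscription(topic: str) -> None:
            if topic == probe and not self.ready:
                self.ready = True
                self.proxy.listeners.remove(on_subscription)
                callback()

        self.proxy.listeners.append(on_subscription)
        self.proxy.subscribe(probe)


events = []
backend = ToyBackend(ToyProxy())
backend.ready_to_publish(lambda: events.append("ready"))
```

Seeing the probe arrive is taken as evidence that subscription traffic is flowing, so publishing before that point risks messages being filtered on the sender side.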
Arne Welzel
e7a876da35 cluster/Backend: Add ReadyToPublishCallback() API
Provide a mechanism to allow a cluster backend to report when it is ready
for publish operations. This is primarily useful for ZeroMQ which has
sender-side filtering and is only really ready for publishing when it
has learned about subscriptions from other nodes.
2025-04-25 09:57:06 +00:00
Arne Welzel
b0ecc131d0 Merge remote-tracking branch 'origin/topic/awelzel/comment-out-broker-websocket-shim-two-endpoint-tests'
* origin/topic/awelzel/comment-out-broker-websocket-shim-two-endpoint-tests:
  broker/WebSocketShim/tests: Comment out two endpoint tests
  broker/WebSocketShim/tests: Replace hard-coded timeout values with vars
2025-04-25 09:03:14 +02:00
Christian Kreibich
f5fbad23ff Add peer buffer update tracking to the Broker manager's event_observer
This implements basic tracking of each peering's current fill level, the maximum
level over a recent time interval (via a new Broker::buffer_stats_reset_interval
tunable, defaulting to 1min), and the number of times a buffer overflows. For
the disconnect policy this is the number of depeerings, but for drop_newest and
drop_oldest it implies the number of messages lost.

This doesn't use "proper" telemetry metrics for a few reasons: this tracking is
Broker-specific, so we need to track each peering via endpoint_ids, while we
want the metrics to use Cluster node name labels, and the latter live in the
script layer. Using broker::endpoint_id directly as keys also means we rely on
their ability to hash in STL containers, which should be fast.

This does not track the buffer levels for Broker "clients" (as opposed to
"peers"), i.e. WebSockets, since we currently don't have a way to name these,
and we don't want to use ephemeral Broker IDs in their telemetry.

To make the stats accessible to the script layer the Broker manager (via a new
helper class that lives in the event_observer) maintains a TableVal mapping
Broker IDs to a new BrokerPeeringStats record. The table's members get updated
every time that table is requested. This minimizes new val instantiation and
allows the script layer to customize the BrokerPeeringStats record by redef'ing,
updating fields, etc. Since we can't use Zeek vals outside the main thread, this
requires some care so all table updates happen only in the Zeek-side table
updater, PeerBufferState::GetPeeringStatsTable().
2025-04-24 22:47:18 -07:00
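The three quantities this commit tracks per peering (current fill level, recent maximum, overflow count) can be sketched as follows. The class and field names are hypothetical and do not reflect the actual BrokerPeeringStats record fields; the 60 second default mirrors the 1min default mentioned above, and resetting the maximum to the current level is an assumed design choice.

```python
from dataclasses import dataclass


@dataclass
class PeeringStats:
    queued: int = 0               # current fill level
    max_queued_recently: int = 0  # maximum level since the last reset
    overflows: int = 0            # number of buffer overflows


class PeerBufferTracker:
    def __init__(self, reset_interval: float = 60.0):
        self.reset_interval = reset_interval
        self.last_reset = 0.0
        self.stats: dict[str, PeeringStats] = {}

    def update(self, peer: str, queued: int, now: float) -> None:
        if now - self.last_reset >= self.reset_interval:
            # Start a new interval: the maximum restarts at the level
            # each buffer had when the interval began.
            for entry in self.stats.values():
                entry.max_queued_recently = entry.queued
            self.last_reset = now
        entry = self.stats.setdefault(peer, PeeringStats())
        entry.queued = queued
        entry.max_queued_recently = max(entry.max_queued_recently, queued)

    def overflow(self, peer: str) -> None:
        self.stats.setdefault(peer, PeeringStats()).overflows += 1


tracker = PeerBufferTracker(reset_interval=60.0)
tracker.update("worker-1", 10, now=1.0)
tracker.update("worker-1", 4, now=2.0)
tracker.overflow("worker-1")
before_reset = PeeringStats(**vars(tracker.stats["worker-1"]))
tracker.update("worker-1", 2, now=61.0)
after_reset = tracker.stats["worker-1"]
```

Keying the table by peer and updating entries in place matches the commit's stated goal of avoiding new value instantiation on every request.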
Christian Kreibich
23554280e0 Rename the Broker manager's LoggerAdapter
This is about to do more than just log handling, so this renames it simply to
Observer, reflecting the fact that it implements broker::event_observer.
2025-04-24 13:09:10 -07:00
Christian Kreibich
89780514fa Avoid race in the cluster/broker/publish-any btest
On very busy machines the hardwired scheduling of the ping batches could move
around among the arriving pongs, causing baseline deviations. We now wait for
each batch to complete before triggering the next one.
2025-04-24 13:09:10 -07:00
Tim Wojtulewicz
3ab83a3f74 Minor changes to storage framework script docs 2025-04-24 11:11:08 -07:00
Mohan Dhawan
8314b18092 Add fine-grained groups for Intel events 2025-04-24 23:24:40 +05:30
Arne Welzel
63a75c26c4 broker/WebSocketShim/tests: Comment out two endpoint tests
Running the remote tests on a loaded system results in timeouts, even
after bumping the timeouts to 10 seconds. Comment them out for now.
2025-04-24 19:19:58 +02:00
Arne Welzel
8030ecf893 broker/WebSocketShim/tests: Replace hard-coded timeout values with vars 2025-04-24 19:19:58 +02:00
Arne Welzel
69a1ad2c3d Merge remote-tracking branch 'origin/topic/awelzel/cluster-fix-tsan-zeromq-do-terminate'
* origin/topic/awelzel/cluster-fix-tsan-zeromq-do-terminate:
  NEWS: Add entry about WebSocket client events
  btest/cluster: Testing cleanup
  cluster/websocket: Raise websocket_client_lost() after terminate
  cluster/ThreadedBackend: Invoke onloop->Process() during DoTerminate()
  cluster/ThreadedBackend: Remove Process()
  zeromq: Call super class DoTerminate() after stopping thread
2025-04-24 14:04:11 +02:00
Arne Welzel
7513d0ef1b NEWS: Add entry about WebSocket client events 2025-04-24 09:50:04 +02:00