Commit graph

5326 commits

Author SHA1 Message Date
Arne Welzel
aaddeb19ad cluster/websocket: Support configurable ping interval
Primarily for testing purposes and maybe the hard-coded 5 seconds is too
aggressive for some deployments, so makes sense for it to be
configurable.
2025-05-13 18:26:03 +02:00
Tim Wojtulewicz
fd10dd015f Move options to redis backend options instead of module-level options 2025-05-07 15:38:58 -07:00
Tim Wojtulewicz
824b91216f Add operation_timeout and command_timeout storage backend options 2025-05-07 15:38:58 -07:00
Tim Wojtulewicz
6f8924596f Merge remote-tracking branch 'origin/topic/johanna/fix-failed-service-logging'
* origin/topic/johanna/fix-failed-service-logging:
  Fix policy/protocols/conn/failed-service-logging.zeek
2025-05-07 10:29:54 -07:00
Tim Wojtulewicz
8096388904 Update opt.ZAM-bif-tracking baseline 2025-05-07 09:12:56 -07:00
Arne Welzel
8089f5bed4 Merge remote-tracking branch 'origin/topic/awelzel/more-terminate-while-queueing-hardening'
* origin/topic/awelzel/more-terminate-while-queueing-hardening:
  btest/cluster/generic/publish-any: Apply Christian's fix from broker/publish-any
  wstest/terminate-while-queueing: Patch close_socket()
2025-05-07 17:24:04 +02:00
Arne Welzel
3ec3205074 btest/cluster/generic/publish-any: Apply Christian's fix from broker/publish-any 2025-05-07 17:18:01 +02:00
Tim Wojtulewicz
58ee8d3c5c Add Storage::is_connected BIF 2025-05-07 08:13:16 -07:00
Arne Welzel
82731992d9 wstest/terminate-while-queueing: Patch close_socket()
I believe there's a bug/usage issue in the websockets library
where during send(), EOF is detected and stored, but the receiving
thread is then discarding the last received frame. Avoid the bug
by replacing the close_socket() implementation of the websockets
library just for that test and leave detecting the EOF condition
to the receiving thread.
2025-05-07 16:33:54 +02:00
Arne Welzel
ca02316671 cluster/websocket: Stop and wait for reply thread during Terminate()
The terminate-while-queueing test added for #4428 failed spuriously
indicating that sometimes WebSocket clients receive code 1000 instead of 1001.
This happens if the ixwebsocket server is shutdown before the reply thread had a
chance to process queued close messages.

Fix by signaling and waiting for the dispatcher's reply thread to terminate
before returning from Terminate().
2025-05-07 12:45:01 +02:00
Johanna Amann
f293d5a852 Fix policy/protocols/conn/failed-service-logging.zeek
In GH-4422 it was pointed out that the protocols/conn/failed-service-logging.zeek
policy script only works when
`DPD::track_removed_services_in_connection=T` is set.

This was caused by a logic error in the script. This commit fixes this
logic error and introduces an additional test that checks that
failed-service-logging works even when the option is not set to true.
2025-05-06 13:37:12 +01:00
Arne Welzel
3be7a9ce91 Merge remote-tracking branch 'origin/topic/awelzel/double-commented-btest-lines'
* origin/topic/awelzel/double-commented-btest-lines:
  testing/btest: Fix double commented @TEST- lines
2025-05-06 14:21:03 +02:00
Arne Welzel
bb06af601f Websocket: Close onloop during Terminate()
Terminate() is called when Zeek shuts down. If WebSocket client threads
were blocked in QueueForProcessing() due to reaching queue limits, these
previously would not exit QueueForProcessing() and instead block
indefinitely, resulting in the ixwebsocket library blocking and its
garbage collection thread running at 100%. Not great.

Closing the onloop instance will unblock the WebSocket client threads
for a timely shutdown.

Closes #4420
2025-05-06 14:19:08 +02:00
Arne Welzel
0e327a0c12 testing/btest: Fix double commented @TEST- lines
sed -i 's/^# # @/# @/g'
2025-05-06 14:06:29 +02:00
Tim Wojtulewicz
0393e4b84a Merge remote-tracking branch 'XueSongTap/master'
* XueSongTap/master:
  Add baseline for find_first test, update comments, and reorder function imports
  Add find_first string function
2025-05-05 13:40:40 -07:00
Arne Welzel
50ac8d1468 Merge remote-tracking branch 'origin/topic/awelzel/4405-quic-fragmented-crypto'
* origin/topic/awelzel/4405-quic-fragmented-crypto:
  Bump external/zeek-testing
  QUIC: Extract reset_crypto() function
  QUIC: Rename ConnectionIDInfo to Context
  QUIC: Switch initial_destination_conn_id to optional
  QUIC: Use initial destination conn_id for decryption
  QUIC: Handle CRYPTO frames across multiple INITIAL packets
  QUIC: Do not consume EncryptedLongPacketPayload
  QUIC: Fix ACK frame parsing
2025-05-05 14:40:59 +02:00
Arne Welzel
8fd3cbf7cc Bump external/zeek-testing 2025-05-05 14:34:38 +02:00
Arne Welzel
fe89a521d1 QUIC: Use initial destination conn_id for decryption
Ensure the client side also uses the initial destination connection ID
for decryption purposes instead of the one from the current long header
packet. PCAP from local WiFi hotspot.
2025-05-05 14:34:11 +02:00
Arne Welzel
ae90524027 QUIC: Handle CRYPTO frames across multiple INITIAL packets
Instead of sending the accumulated CRYPTO frames after processing an
INITIAL packet, add logic to determine the total length of the TLS
Client or Server Hello (by peeking into the first 4 byte). Once all
CRYPTO frames have arrived, flush the reassembled data to the TLS
analyzer at once.
2025-05-05 14:34:11 +02:00
Arne Welzel
e459d96fb6 QUIC: Do not consume EncryptedLongPacketPayload
The payload is already consumed within the InitialPacket unit. Consuming
it again resulted in UDP datagrams with multiple packets to ignore
the remaining packets in the same UDP datagram. The baseline changes
showing I being followed by a new H indicates that the INITIAL packet
was followed by a HANDSHAKE packet, but previously Zeek discarded
these.
2025-05-05 14:34:11 +02:00
yexiaochuan
fd7045e274 Add baseline for find_first test, update comments, and reorder function imports 2025-05-02 11:51:45 +08:00
Arne Welzel
d655c64e0b Merge remote-tracking branch 'origin/topic/awelzel/event-publish-hook'
* origin/topic/awelzel/event-publish-hook:
  NEWS: Add HookPublishEvent() note
  btest/plugin: Test for PublishEventHook()
  broker and cluster: Wire up HookPublishEvent
  plugin: Add HookPublishEvent hook
2025-04-30 17:57:46 +02:00
Arne Welzel
0bf3417d4c btest/plugin: Test for PublishEventHook() 2025-04-30 17:26:33 +02:00
Arne Welzel
f8b75426ee Merge remote-tracking branch 'origin/topic/awelzel/bif-tracking-no-zeromq'
* origin/topic/awelzel/bif-tracking-no-zeromq:
  ZAM-bif-tracking: Remove ZeroMQ dependency
2025-04-30 17:23:22 +02:00
Arne Welzel
90eb22ce73 ZAM-bif-tracking: Remove ZeroMQ dependency
Vern didn't have ZeroMQ installed and the test was skipped for him.
Generally would recommend anyone working on core Zeek to install
libzmq-dev or the equivalent for their environment, but until it is a
real required dependency, loosen the requirements on the test.
2025-04-30 17:08:21 +02:00
Tim Wojtulewicz
e56de061f9 Merge remote-tracking branch 'origin/topic/vern/zam-inlining-temps'
* origin/topic/vern/zam-inlining-temps:
  fixed incorrect ZAM optimization of expressions seen in single-statement inlined functions
2025-04-29 17:50:39 -07:00
Vern Paxson
d2762fb247 fixed incorrect ZAM optimization of expressions seen in single-statement inlined functions 2025-04-29 14:29:07 -07:00
yexiaochuan
6c240dc0bb Add find_first string function 2025-04-30 00:15:34 +08:00
Tim Wojtulewicz
2cf8497bf7 Merge remote-tracking branch 'origin/topic/timw/update-ct-ca-lists'
* origin/topic/timw/update-ct-ca-lists:
  External tests: add removed logs to CT list to prevent baseline changes
  Update Mozilla CA list and CT list to NSS 3.110
2025-04-29 08:53:04 -07:00
Arne Welzel
7092db6318 broker/Data/data_to_val: Fail on vectors/lists with holes
Instead of simply removing holes from vectors or lists when converting
from Val to Broker format, error out as the receiver has no chance to
reconstruct where the hole might have been.

We could encode holes with broker::none, but this will put unnecessary
burden on language bindings and users due to the potential optionality.
Think a std::vector<uint64_t> that technically needs to be a
std::vector<std::optional<uint64_t>> to represent optional elements
properly.

Closes #3045
2025-04-28 18:23:37 +02:00
Johanna Amann
28ec4e2f2a External tests: add removed logs to CT list to prevent baseline changes 2025-04-28 16:42:52 +01:00
Tim Wojtulewicz
223c5ab955 Start of 8.0.0 development 2025-04-25 11:59:08 -07:00
Kshitiz Bartariya
40935c31b1 Ignore case when matching prefix in http analyzer 2025-04-25 10:33:11 -07:00
Christian Kreibich
68fadd0464 Lower listen/connect retry intervals in Broker and the cluster framework to 1sec
The former defaults (30sec, 1min) can slow down cluster startup and recovery
considerably, and other systems have more aggressive intervals still.
2025-04-25 10:22:35 -07:00
Christian Kreibich
7540d48fd5 Bump cluster testsuite
This pulls in an update for the backpressure disconnect tests, which now need to
set the policy explicitly.
2025-04-25 10:22:35 -07:00
Christian Kreibich
c1a5f70df8 Merge branch 'topic/christian/broker-backpressure-metrics'
* topic/christian/broker-backpressure-metrics:
  Add basic btest to verify that Broker peering telemetry is available.
  Add cluster framework telemetry for Broker's send-buffer use
  Add peer buffer update tracking to the Broker manager's event_observer
  Rename the Broker manager's LoggerAdapter
  Avoid race in the cluster/broker/publish-any btest
2025-04-25 10:04:09 -07:00
Christian Kreibich
35ab9d5c80 Add basic btest to verify that Broker peering telemetry is available. 2025-04-25 09:15:17 -07:00
Christian Kreibich
88a0cda8ca Add cluster framework telemetry for Broker's send-buffer use
This hooks into Telemetry::sync() to update Broker-level metrics tracking the
peerings' send buffer state. We do this in the cluster framework so we can label
the resulting metrics with Zeek cluster node names, not Broker's endpoint IDs.
2025-04-25 09:14:33 -07:00
Arne Welzel
43a1bab960 btest/cluster/websocket: Move no-subscriptions test
...and also add one for broker.
2025-04-25 10:01:23 +00:00
Christian Kreibich
f5fbad23ff Add peer buffer update tracking to the Broker manager's event_observer
This implements basic tracking of each peering's current fill level, the maximum
level over a recent time interval (via a new Broker::buffer_stats_reset_interval
tunable, defaulting to 1min), and the number of times a buffer overflows. For
the disconnect policy this is the number of depeerings, but for drop_newest and
drop_oldest it implies the number of messages lost.

This doesn't use "proper" telemetry metrics for a few reasons: this tracking is
Broker-specific, so we need to track each peering via endpoint_ids, while we
want the metrics to use Cluster node name labels, and the latter live in the
script layer. Using broker::endpoint_id directly as keys also means we rely on
their ability to hash in STL containers, which should be fast.

This does not track the buffer levels for Broker "clients" (as opposed to
"peers"), i.e. WebSockets, since we currently don't have a way to name these,
and we don't want to use ephemeral Broker IDs in their telemetry.

To make the stats accessible to the script layer the Broker manager (via a new
helper class that lives in the event_observer) maintains a TableVal mapping
Broker IDs to a new BrokerPeeringStats record. The table's members get updated
every time that table is requested. This minimizes new val instantiation and
allows the script layer to customize the BrokerPeeringStats record by redefing,
updating fields, etc. Since we can't use Zeek vals outside the main thread, this
requires some care so all table updates happen only in the Zeek-side table
updater, PeerBufferState::GetPeeringStatsTable().
2025-04-24 22:47:18 -07:00
Christian Kreibich
89780514fa Avoid race in the cluster/broker/publish-any btest
On very busy machines the hardwired scheduling of the ping batches could move
around among the arriving pongs, causing baseline deviations. We now wait for
each batch to complete before triggering the next one.
2025-04-24 13:09:10 -07:00
Arne Welzel
2a6beae50b btest/cluster: Testing cleanup 2025-04-24 09:35:53 +02:00
Arne Welzel
23f0370e91 cluster/websocket: Short-circuit clients without subscriptions 2025-04-24 08:14:56 +02:00
Arne Welzel
f2e60fdaff btest/cluster: Add broker logging test for sanity
Not very related to the PR, but created to help provoke an issue
with the broker changes.
2025-04-23 14:27:43 +02:00
Arne Welzel
011029addc cluster/websocket: Make websocket dispatcher queue size configurable
Limit the number WebSocket events queued from external clients to
dispatcher instances to produce back pressure to the clients if
Zeek's IO loop is overloaded.
2025-04-23 14:27:43 +02:00
Arne Welzel
6bd624d9b2 cluster/zeromq: Attempt publish during termination
Explicitly notify the internal thread about the shutdown via the
inproc socket pair. This ensures that the internal thread processes
all previous messages on the inproc socket before terminating.

This fixes the scenario where a backend is created, a few messages published
and then immediately terminated as can be done with WebSocket clients.
Previously, some of the messages published might have still been in the
inproc socket's queue and were simply discarded.

Adds the same test for Broker and ZeroMQ backends.
2025-04-23 14:27:43 +02:00
Arne Welzel
0c8f52664d btest/cluster/websocket: Add tests using broker
Add tests to verify Cluster::listen_websocket() with the Broker backend
is functional.
2025-04-23 14:27:43 +02:00
Arne Welzel
3319615c65 btest/cluster/websocket: Move ZeroMQ test and use wstest.py
Adapt the test to be the same as Broker, to have "expected" behavior.
2025-04-23 14:27:43 +02:00
Arne Welzel
1191f6b66d btest/files: Introduce wstest.py
This adds a minimal helper library for reusing some of the code to
test WebSocket client access to Zeek using Python.
2025-04-23 14:27:43 +02:00
Arne Welzel
76c508f001 broker: Add WebSocketShim backend
This adds a cluster backend implementation using broker's hub primitive
to connect WebSocket clients with the local broker endpoint for pub/sub
functionality.
2025-04-23 14:27:43 +02:00