Mirror/zeek - git.uphillsecurity.com: We code.

mirror of https://github.com/zeek/zeek.git synced 2025-10-02 06:38:20 +00:00

Author	SHA1	Message	Date
Christian Kreibich	ace5c11048	Bugfix: accurately track Broker buffer overflows w/ multiple peerings When a node restarts or a peering between two nodes starts over for other reasons, the internal tracking in the Broker manager resets its state (since it's per-peering), and thus the message overflow counter. The script layer was unaware of this, and threw errors when trying to reset the corresponding counter metric down to zero at sync time. We now track past buffer overflows via a separate epoch table, using Broker peer ID comparisons to identify new peerings, and set the counter to the sum of past and current overflows. I considered just making this a gauge, but it seems more helpful to be able to look at a counter to see whether any messages have ever been dropped over the lifetime of the node process. As an aside, this now also avoids repeatedly creating the labels vector, re-using the same one for each metric. Thanks to @pbcullen for identifying this one!	2025-05-07 17:30:45 -07:00
Christian Kreibich	d9f11643a2	Use Broker peering directionality when re-peering after backpressure overflows This avoids creating pointless connection reattempts to ephemeral TCP client-side ports, which have been cluttering up the Broker logs since 7.1. (cherry picked from commit `549e678dff`)	2025-04-29 17:00:50 -07:00
Christian Kreibich	4372cdfe2a	Expand Broker APIs to allow tracking directionality of peering establishment This provides ways to figure out for a given peer, or a given address/port pair, whether the local node originally established the peering. (cherry picked from commit `b430d5235c`)	2025-04-29 17:00:30 -07:00
Christian Kreibich	458b887df1	Lower listen/connect retry intervals in Broker and the cluster framework to 1sec The former defaults (30sec, 1min) can slow down cluster startup and recovery considerably, and other systems have more aggressive intervals still. (cherry picked from commit `68fadd0464`)	2025-04-29 16:47:13 -07:00
Christian Kreibich	446f49e6bc	Switch Broker's default backpressure policy to drop_oldest, bump buffer sizes At every site where we've dug into backpressure disconnect findings, it has been the case that the default values were too small. 8192, so 4x the old default, suffices at every site to drown out premature disconnects. With metrics now available for the send buffers regardless of backpressure overflow policy, this also switches the default from "disconnect" to "drop_oldest" (for both peers and websockets), meaning that peerings remain untouched but the oldest queued message simply gets dropped when a new message is enqueued. With this policy, the number of backpressure overflows is then simply the count of discarded messages, something that users can tune to see drop to zero in everyday use. Another benefit is that marginal overflows cause less message loss than when an entire buffer's worth (plus potentially more in-flight messages) gets thrown out with a disconnect. (cherry picked from commit `841a40ff88`)	2025-04-29 16:47:13 -07:00
Christian Kreibich	e6705732ec	Add basic btest to verify that Broker peering telemetry is available. This differs in the upstream version in that it explicitly invokes Telemetry::sync(), since 7.0.x doesn't have the on-demand invocation of the hook at scrape & collection time. (cherry picked from commit `35ab9d5c80`)	2025-04-29 16:47:07 -07:00
Christian Kreibich	8b9b16d7a8	Add cluster framework telemetry for Broker's send-buffer use This hooks into Telemetry::sync() to update Broker-level metrics tracking the peerings' send buffer state. We do this in the cluster framework so we can label the resulting metrics with Zeek cluster node names, not Broker's endpoint IDs. (cherry picked from commit `88a0cda8ca`)	2025-04-29 15:19:38 -07:00
Tim Wojtulewicz	c78335c47b	Fix use-after-move in recent broker changes (cherry picked from commit `f8d2f30cec`)	2025-04-29 15:08:05 -07:00
Christian Kreibich	d5bbf05a32	Add peer buffer update tracking to the Broker manager's event_observer This implements basic tracking of each peering's current fill level, the maximum level over a recent time interval (via a new Broker::buffer_stats_reset_interval tunable, defaulting to 1min), and the number of times a buffer overflows. For the disconnect policy this is the number of depeerings, but for drop_newest and drop_oldest it implies the number of messages lost. This doesn't use "proper" telemetry metrics for a few reasons: this tracking is Broker-specific, so we need to track each peering via endpoint_ids, while we want the metrics to use Cluster node name labels, and the latter live in the script layer. Using broker::endpoint_id directly as keys also means we rely on their ability to hash in STL containers, which should be fast. This does not track the buffer levels for Broker "clients" (as opposed to "peers"), i.e. WebSockets, since we currently don't have a way to name these, and we don't want to use ephemeral Broker IDs in their telemetry. To make the stats accessible to the script layer the Broker manager (via a new helper class that lives in the event_observer) maintains a TableVal mapping Broker IDs to a new BrokerPeeringStats record. The table's members get updated every time that table is requested. This minimizes new val instantiation and allows the script layer to customize the BrokerPeeringStats record by redefing, updating fields, etc. Since we can't use Zeek vals outside the main thread, this requires some care so all table updates happen only in the Zeek-side table updater, PeerBufferState::GetPeeringStatsTable(). (cherry picked from commit `f5fbad23ff`)	2025-04-29 15:08:05 -07:00
Christian Kreibich	3bf709f705	Rename the Broker manager's LoggerAdapter This is about to do more than just log handling, so this renames it simply to Observer, reflecting the fact that it implements broker::event_observer. (cherry picked from commit `23554280e0`)	2025-04-29 15:08:05 -07:00
Christian Kreibich	ae14eff1f6	Add event observer via Broker's new API This is a heavily modified version of `30615f425e`, part of PR #3998, removing all of the logging-specific parts. It only establishes the basic adapter and the broker::logging() call to register it.	2025-04-29 15:08:05 -07:00
Christian Kreibich	6e8906c0d8	Update scripts.base.frameworks.telemetry.internal-metrics baseline With the Broker submodule bump, the broker_buffered_messages metric no longer exists.	2025-04-29 15:08:05 -07:00
Christian Kreibich	5b29dad2c7	Bump Broker to pull in new observer API	2025-04-29 14:49:15 -07:00
Christian Kreibich	186dbe085f	Expand documentation of Broker events. (cherry picked from commit `feb2aa890d`)	2025-04-08 15:09:44 -07:00
Christian Kreibich	90ecf7ff0d	Add backpressure disconnect notification to cluster.log and via telemetry This adds a Broker-specific script to the cluster framework, loaded only when Zeek is running in cluster mode. It adds logging in cluster.log as well as telemetry via a metrics counter for Broker-observed backpressure disconnects. The new zeek_broker_backpressure_disconnects counter, labeled by the neighboring peer that the reporting node has determined to be unresponsive, counts the number of unpeerings for this reason. Here the node "worker" has observed node "proxy" falling behind once: # HELP zeek_broker_backpressure_disconnects_total Number of Broker peering drops due to a neighbor falling too far behind in message I/O # TYPE zeek_broker_backpressure_disconnects_total counter zeek_broker_backpressure_disconnects_total{endpoint="worker",peer="proxy"} 1 Includes small btest baseline update to reflect @load of a new script. (cherry picked from commit `ead6134501`)	2025-04-08 15:09:44 -07:00
Christian Kreibich	67f135f57a	Remove unneeded @loads from base/misc/version.zeek This module is loaded by the telemetry framework, which we're now loading via the cluster framework, i.e. also in bare mode. The resulting additional thread (for creating reporter.log) trips up a number of btest baselines. version.zeek doesn't use any of the string helper functions. (cherry picked from commit `d260a5b7a9`)	2025-04-08 15:09:44 -07:00
Christian Kreibich	06fa47e21d	Add Cluster::nodeid_to_node() helper function This translates backend-specific node identifiers (like Broker IDs) to cluster nodes and their names, if available. (cherry picked from commit `46a11ec37d`)	2025-04-08 15:09:44 -07:00
Christian Kreibich	1cbbbc5c40	Support re-peering with Broker peers that fall behind This adds re-peering at the Broker level for peers that Broker decided to unpeer. We keep this at the Broker level since this behavior is specific to it (as opposed to other cluster backends). Includes baseline updates for btests that pick up on the new script's @load. (cherry picked from commit `0010e65f6d`)	2025-04-08 15:09:44 -07:00
Dominik Charousset	eeb0e7184d	Add Zeek-level configurability of Broker slow-peer disconnects (cherry picked from commit `4c4eb4b8e2`)	2025-04-08 15:09:44 -07:00
Christian Kreibich	f7e8fe1d68	Bump Broker to pull in disconnect feature and infinite-loop fix (cherry picked from commit `b9df1674b7`)	2025-04-08 15:09:41 -07:00
Christian Kreibich	11701d4734	No need to namespace Cluster:: functions in their own namespace (cherry picked from `e81856a4af`)	2025-04-08 14:50:50 -07:00
Christian Kreibich	2ad80f8fb2	Telemetry framework: move BIFs to the primary-bif stage This moves the Telemetry framework's BIF-defined functionalit from the secondary-BIFs stage to the primary one. That is, this functionality is now available from the end of init-bare.zeek, not only after the end of init-frameworks-and-bifs.zeek. This allows us to use script-layer telemetry in our Zeek's own code that get pulled in during init-frameworks-and-bifs. This change splits up the BIF features into functions, constants, and types, because that's the granularity most workable in Func.cc and NetVar. It also now defines the Telemetry::MetricsType enum once, not redundantly in BIFs and script layer. Due to subtle load ordering issues between the telemetry and cluster frameworks this pushes the redef stage of Telemetry::metrics_port and address into base/frameworks/telemetry/options.zeek, which is loaded sufficiently late in init-frameworks-and-bifs.zeek to sidestep those issues. (When not doing this, the effect is that the redef in telemetry/main.zeek doesn't yet find the cluster-provided values, and Zeek does not end up listening on these ports.) The need to add basic Zeek headers in script_opt/ZAM/ZBody.cc as a side-effect of this is curious, but looks harmless. Also includes baseline updates for the usual btests and adds a few doc strings. (cherry picked from commit `71f7e89974`)	2025-04-08 14:50:45 -07:00
Christian Kreibich	5503688758	Minor comment tweaks for init-frameworks-and-bifs.zeek (cherry picked from `acdd7a7934`)	2025-04-08 14:50:28 -07:00
Tim Wojtulewicz	3e5060018a	Update docs submodule to fix RTD [nomail] [skip ci]	2025-03-20 13:48:45 -07:00
Tim Wojtulewicz	9f8e27118e	Update CHANGES, VERSION, and NEWS for 7.0.6 release	2025-03-20 12:24:26 -07:00
Tim Wojtulewicz	89376095dc	Update zeekctl submodule to fix a couple btests	2025-03-19 13:04:31 -07:00
Tim Wojtulewicz	3e8af6497e	Update zeekjs to v0.16.0	2025-03-19 10:43:17 -07:00
Tim Wojtulewicz	5051cce720	Updating CHANGES and VERSION.	2025-03-19 10:43:02 -07:00
Tim Wojtulewicz	c30b835a14	Update mozilla-ca-list.zeek and ct-list.zeek to NSS 3.109	2025-03-18 17:59:01 -07:00
Tim Wojtulewicz	a041080e3f	Update core/vntag-in-vlan baseline to remove ip_proto field for 7.0	2025-03-18 17:03:05 -07:00
Tim Wojtulewicz	fc3001c76a	CI: Force rebuild of tumbleweed docker image	2025-03-18 16:33:45 -07:00
Tim Wojtulewicz	e2b2c79306	Merge remote-tracking branch 'origin/topic/timw/ci-macos-upgrade-pip' * origin/topic/timw/ci-macos-upgrade-pip: CI: Unconditionally upgrade pip on macOS (cherry picked from commit `e8d91c8227`)	2025-03-18 16:21:45 -07:00
Tim Wojtulewicz	ed32ee73fa	Merge remote-tracking branch 'origin/topic/timw/ci-macos-sequoia' * origin/topic/timw/ci-macos-sequoia: ci/init-external-repo.sh: Use regex to match macos cirrus task CI: Change macOS runner to Sequoia (cherry picked from commit `43f108bb71`)	2025-03-18 16:21:13 -07:00
Tim Wojtulewicz	eed9858bc4	CI: Update freebsd to 13.4 and 14.2	2025-03-18 16:20:06 -07:00
Tim Wojtulewicz	ed081212ae	Merge remote-tracking branch 'origin/topic/timw/vntag-in-vlan' * origin/topic/timw/vntag-in-vlan: Add analyzer registration from VLAN to VNTAG (cherry picked from commit `cb5e3d0054`)	2025-03-18 16:18:13 -07:00
Arne Welzel	ec04c925a0	Merge remote-tracking branch 'origin/topic/awelzel/2311-load-plugin-bare-mode' * origin/topic/awelzel/2311-load-plugin-bare-mode: scan.l: Fix @load-plugin scripts loading scan.l: Extract switch_to() from load_files() ScannedFile: Allow skipping canonicalization (cherry picked from commit `a3a08fa0f3`)	2025-03-18 16:16:39 -07:00
Arne Welzel	de8127f3cd	Merge remote-tracking branch 'origin/topic/awelzel/4198-4201-quic-maintenance' * origin/topic/awelzel/4198-4201-quic-maintenance: QUIC/decrypt_crypto: Rename all_data to data QUIC: Confirm before forwarding data to SSL QUIC: Parse all QUIC packets in a UDP datagram QUIC: Only slurp till packet end, not till &eod (cherry picked from commit `44304973fb`)	2025-03-18 16:15:34 -07:00
Arne Welzel	b5774f2de9	Merge remote-tracking branch 'origin/topic/vern/ZAM-field-assign-in-op' * origin/topic/vern/ZAM-field-assign-in-op: pre-commit: Bump spicy-format to 0.23 fix for ZAM optimization of assigning a record field to result of "in" operation (cherry picked from commit `991bc9644d`)	2025-03-18 16:13:01 -07:00
Tim Wojtulewicz	7c8a7680ba	Update CHANGES, VERSION, and NEWS for 7.0.5 release	2024-12-16 11:12:48 -07:00
Tim Wojtulewicz	26b50908e1	Merge remote-tracking branch 'security/topic/timw/7.0.5-patches' into release/7.0 * security/topic/timw/7.0.5-patches: QUIC/decrypt_crypto: Actually check if decryption was successful QUIC/decrypt_crypto: Limit payload_length to 10k QUIC/decrypt_crypto: Fix decrypting into too small stack buffer	2024-12-16 10:21:59 -07:00
Arne Welzel	c2f2388f18	QUIC/decrypt_crypto: Actually check if decryption was successful ...and bail if it wasn't. PCAP was produced using OSS-Fuzz input from issue 383379789.	2024-12-13 13:10:45 -07:00
Arne Welzel	d745d746bc	QUIC/decrypt_crypto: Limit payload_length to 10k Given we dynamically allocate memory for decryption, employ a limit that is unlikely to be hit, but allows for large payloads produced by the fuzzer or jumbo frames.	2024-12-13 13:10:45 -07:00
Arne Welzel	5fbb6b4599	QUIC/decrypt_crypto: Fix decrypting into too small stack buffer A QUIC initial packet larger than 1500 bytes could lead to crashes due to the usage of a fixed size stack buffer for decryption. Allocate the necessary memory dynamically on the heap instead.	2024-12-13 13:10:45 -07:00
Tim Wojtulewicz	7c463b5f92	Update docs submodule [nomail] [skip ci]	2024-12-13 13:08:51 -07:00
Tim Wojtulewicz	e7f694bcbb	Merge remote-tracking branch 'origin/topic/vern/ZAM-tbl-iteration-memory-mgt-fix' * origin/topic/vern/ZAM-tbl-iteration-memory-mgt-fix: fix for memory management associated with ZAM table iteration (cherry picked from commit `805e9db588`)	2024-12-13 12:27:16 -07:00
Arne Welzel	f54416eae4	Merge remote-tracking branch 'origin/topic/christian/fix-zam-analyzer-name' * origin/topic/christian/fix-zam-analyzer-name: Fix ZAM's implementation of Analyzer::name() BiF (cherry picked from commit `e100a8e698`)	2024-12-12 13:14:10 -07:00
Arne Welzel	68bfe8d1c0	Merge remote-tracking branch 'origin/topic/vern/zam-exception-leaks' * origin/topic/vern/zam-exception-leaks: More robust memory management for ZAM execution - fixes #4052 (cherry picked from commit `c3b30b187e`)	2024-12-12 13:05:13 -07:00
Arne Welzel	cf97ed6ac1	Merge remote-tracking branch 'origin/topic/awelzel/bump-zeekjs-0-14-0' * origin/topic/awelzel/bump-zeekjs-0-14-0: Bump zeekjs to v0.14.0 (cherry picked from commit `aac640ebff`)	2024-12-12 12:45:14 -07:00
Benjamin Bannier	35cd891d6e	Merge remote-tracking branch 'origin/topic/bbannier/doc-have-spicy' (cherry picked from commit `4a96d34af6`)	2024-12-12 12:43:43 -07:00
Tim Wojtulewicz	f300ddb9fe	Update CHANGES, VERSION, and NEWS for 7.0.4 release	2024-11-19 12:35:32 -07:00

1 2 3 4 5 ...

16818 commits