2990 commits

Author SHA1 Message Date
Christian Kreibich
1dcd13a019 Fix a typo. 2025-06-05 17:51:54 -07:00
Johanna Amann
58613f0313 Introduce new c$failed_analyzers field
This field is used internally to trace which analyzers already had a
violation. This is mostly used to prevent duplicate logging.

In the past, c$service_violation was used for a similar purpose -
however it has slightly different semantics. Where c$failed_analyzers
tracks analyzers that were removed due to a violation,
c$service_violation tracks violations - and doesn't care if an analyzer
was actually removed due to it.
2025-06-04 12:07:13 +01:00
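
The distinction drawn in 58613f0313 between the two fields can be sketched as follows. This is a minimal Python illustration, not Zeek script: the Connection class, the on_violation() helper and the log list are hypothetical stand-ins, and only the field names come from the commit message.

```python
from dataclasses import dataclass, field


@dataclass
class Connection:
    """Hypothetical stand-in for a Zeek connection record."""
    uid: str
    # Analyzers that were removed due to a violation (the c$failed_analyzers idea).
    failed_analyzers: set = field(default_factory=set)
    # Every analyzer that raised a violation, removed or not (the c$service_violation idea).
    service_violation: set = field(default_factory=set)


def on_violation(conn: Connection, analyzer: str, removed: bool, log: list) -> None:
    """Record a violation and log a failure at most once per analyzer."""
    conn.service_violation.add(analyzer)
    if not removed:
        return  # a violation alone is not a failure
    if analyzer in conn.failed_analyzers:
        return  # already logged for this connection, skip the duplicate
    conn.failed_analyzers.add(analyzer)
    log.append(f"{conn.uid} {analyzer} failed")
```

Calling on_violation() twice with removed=True for the same analyzer appends one log entry, while a call with removed=False only touches service_violation.
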
Johanna Amann
42ba2fcca0 Settle on analyzer.log for the dpd.log replacement
This commit renames analyzer-failed.log to analyzer.log, and updates the
respective news entry.
2025-06-03 17:33:36 +01:00
Johanna Amann
130c89a0a7 dpd->analyzer.log change - rename files
To address review feedback in GH-4362: rename analyzer-failed-log.zeek
to logging.zeek, analyzer-debug-log.zeek to debug-logging.zeek and
dpd-log.zeek to deprecated-dpd-log.zeek.

Includes respective test, NEWS, etc. updates.
2025-06-03 16:32:52 +01:00
Johanna Amann
af77a7a83b Analyzer failure logging: tweaks and test fixes
The main part of this commit is changes to tests. A lot of the tests
that previously relied on analyzer.log or dpd.log now use the new
analyzer-failed.log.

I verified all the changes and, as far as I can tell, everything
behaves as it should. This includes the external test baselines.

This change also enables logging of file and packet analyzers to
analyzer-failed.log and fixes some small behavior issues.

The analyzer_failed event is no longer raised when the removal of an
analyzer is vetoed.

If an analyzer is no longer active when an analyzer violation is raised,
currently the analyzer_failed event is raised. This can, e.g., happen
when an analyzer error happens at the very end of the connection. This
makes the behavior more similar to what happened in the past, and also
intuitively seems to make sense.

A bug introduced in the failed service logging was fixed.
2025-06-03 15:56:42 +01:00
Johanna Amann
8c814fa88c Introduce analyzer-failed.log, as a replacement for dpd.log
Analyzer-failed.log is, essentially, the replacement for dpd.log. The
name should make more sense, as it does now log analyzer failures. For
protocol analyzers specifically, these are failures that lead to the
analyzer being disabled.
2025-06-03 15:17:26 +01:00
Johanna Amann
c55e21da71 Rename analyzer.log to analyzer-debug.log; move to policy
The current analyzer.log is more useful for debugging than for
operational purposes. Hence this is disabled by default, moved to a
policy script, and the log is renamed to analyzer-debug.log.

Furthermore, logging of analyzer confirmations and disabling analyzers
are now enabled by default.
2025-06-03 15:17:26 +01:00
Johanna Amann
6183c5086b Move dpd.log to policy script
This is the first phase of moving from the current dpd log to a more
modern logfile, without some of the weirdnesses that the current dpd log
contains.

Tests will not pass in the current state; this is just splitting out
functionality.
2025-06-03 15:17:26 +01:00
Arne Welzel
0a34b39e7a Merge remote-tracking branch 'origin/topic/awelzel/4177-4178-custom-event-metadata-part-2'
* origin/topic/awelzel/4177-4178-custom-event-metadata-part-2:
  Event: Bail on add_missing_remote_network_timestamp without add_network_timestamp
  btest/plugin: Test custom metadata publish
  NEWS: Add note about generic event metadata
  cluster: Remove deprecated Event constructor
  cluster: Remove some explicit timestamp handling
  broker/Manager: Fetch and forward all metadata from events
  Event/init-bare: Add add_missing_remote_network_timestamp logic
  cluster/Backend/DoProcessEvent: Use generic metadata, not just timestamps
  cluster/Event: Support moving args and metadata from event
  cluster/serializer/broker: Support generic metadata
  cluster/Event: Generic metadata support
  Event: Use -1.0 for undefined/unset timestamps
  cluster: Use shorter obj_desc versions
  Desc: Add obj_desc() / obj_desc_short() overloads for IntrusivePtr
2025-06-02 17:33:22 +02:00
Arne Welzel
96f2d5d369 Event/init-bare: Add add_missing_remote_network_timestamp logic
Make defaulting to the local network timestamp for remote events opt-in.
2025-06-02 17:31:36 +02:00
Arne Welzel
7eb849ddf4 intel: Add indicator_inserted and indicator_removed hooks
This change adds two new hooks to the Intel framework that can be used
to intercept added and removed indicators and their type.

These hooks are fairly low-level. One immediate use-case is to count the
number of indicators loaded per Intel::Type and enable and disable the
corresponding event groups of the intel/seen scripts.

I attempted to gauge the overhead and, while it's definitely there, loading
a file with ~500k DOMAIN entries takes somewhere around 0.5 seconds with
hooks when populated via the min_data_store mechanism. While that
doesn't sound great, it actually takes the manager on my system 2.5
seconds to serialize and Cluster::publish() the min_data_store alone,
and it's doing that serially for every active worker. Mostly to say that
the bigger overhead in that area is the manager doing redundant work
per worker.

Co-authored-by: Mohan Dhawan <mohan@corelight.com>
2025-06-02 09:50:48 +02:00
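
The use case named in 7eb849ddf4 (counting indicators per type and toggling the matching event groups) can be sketched as below. This is Python rather than Zeek script, and the SeenGroups class and its method signatures are hypothetical; only the hook names indicator_inserted and indicator_removed are taken from the commit message.

```python
from collections import Counter


class SeenGroups:
    """Tracks indicator counts per type and which event groups are enabled."""

    def __init__(self) -> None:
        self.counts = Counter()  # indicator type -> number of loaded indicators
        self.enabled = set()     # indicator types whose event group is enabled

    def indicator_inserted(self, indicator: str, indicator_type: str) -> None:
        self.counts[indicator_type] += 1
        if self.counts[indicator_type] == 1:
            # First indicator of this type: the seen-events are now worth raising.
            self.enabled.add(indicator_type)

    def indicator_removed(self, indicator: str, indicator_type: str) -> None:
        if self.counts[indicator_type] == 0:
            return  # nothing tracked for this type
        self.counts[indicator_type] -= 1
        if self.counts[indicator_type] == 0:
            # Last indicator of this type is gone: disable the group again.
            self.enabled.discard(indicator_type)
```

With no DOMAIN indicators loaded, the DOMAIN group stays disabled, so the corresponding seen-events would never need to be processed.
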
Arne Welzel
93813a5079 logging/ascii/json: Make TS_MILLIS signed, add TS_MILLIS_UNSIGNED
It seems TS_MILLIS is specifically for Elasticsearch and, starting with
Elasticsearch 8.2, epoch_millis does (again?) support negative values,
so make Zeek produce that by default.

If this breaks a given deployment, they can switch Zeek back to TS_MILLIS_UNSIGNED.

https://discuss.elastic.co/t/migration-from-es-6-8-to-7-17-issues-with-negative-date-epoch-timestamp/335259
https://github.com/elastic/elasticsearch/pull/80208

Thanks to @timo-mue for reporting!

Closes #4494
2025-05-30 17:23:29 +02:00
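
The difference that 93813a5079 is about only shows up for timestamps before the Unix epoch. The Python sketch below is an illustration under stated assumptions: both function names are hypothetical, and the unsigned variant assumes the value wraps modulo 2^64, which the commit message does not state about Zeek's actual writer.

```python
def ts_millis_signed(ts_seconds: float) -> int:
    """Epoch milliseconds as a signed integer; pre-1970 times stay negative."""
    return int(ts_seconds * 1000)


def ts_millis_unsigned(ts_seconds: float) -> int:
    """Epoch milliseconds forced into an unsigned 64-bit range.

    Assumption for illustration: negative values wrap modulo 2**64.
    """
    return int(ts_seconds * 1000) % 2**64
```

For non-negative timestamps both functions agree. For a timestamp of -1.5 seconds the signed form yields -1500, while the wrapped unsigned form yields a very large positive number that no longer represents the original time.
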
Arne Welzel
544d571089 cluster/websocket: Deprecate $listen_host, introduce $listen_addr
This only changes the script-layer API, but keeps the std::string host
in the C++ layer's ServerOptions. Mostly because the ixwebsocket library
takes host as std::string. Also, maybe at some point we'd want to
support something scheme-based like unix:///var/run/zeek.sock, and placing
that in a string would not be totally wrong.

Add tests for IPv6, too.
2025-05-30 11:02:41 +02:00
Evan Typanski
b4429a995a spicy-redis: Separate error replies from success 2025-05-27 09:31:25 -04:00
Evan Typanski
d5b121db14 spicy-redis: Cleanup scripts and tests
- Recomputes checksums for pcaps to keep clean
- Removes some tests that had big pcaps or weren't necessary
- Cleans up scripting names and minor points
- Comments out Spicy code that causes a build failure now with a TODO to
  uncomment it
2025-05-27 09:29:13 -04:00
Evan Typanski
11777bd6d5 spicy-redis: Bring Redis analyzer into Zeek proper 2025-05-27 09:28:12 -04:00
Evan Typanski
7f28ec8bc5 spicy-redis: Add dpd signature and clean pcaps 2025-05-27 09:28:12 -04:00
Evan Typanski
f0e9f46c7c spicy-redis: Add some commands and touch up parsing 2025-05-27 09:28:12 -04:00
Evan Typanski
22bda56af3 spicy-redis: Add some script logic for logging
Also "rebrands" from RESP to Redis.
2025-05-27 09:28:12 -04:00
Evan Typanski
757cbbf902 spicy-redis: Separate client/server
This makes the parser more official and splits the client/server out
from each other. Apparently they're different enough to be separate.
2025-05-27 09:28:12 -04:00
Evan Typanski
f0f2969a66 spicy-redis: Touchup logging and Spicy issues 2025-05-27 09:28:12 -04:00
Evan Typanski
97d26a689d spicy-redis: Add synchronization and pipeline support
Also adds some command support
2025-05-27 09:28:12 -04:00
Evan Typanski
4210e62e57 spicy-redis: Begin Spicy Redis analyzer 2025-05-27 09:28:12 -04:00
Arne Welzel
53b0f0ad64 Event: Deprecate default network timestamp metadata
This deprecates the Event constructor and the ``ts`` parameter of Enqueue().
Instead, versions are introduced that take a detail::MetadataVectorPtr which
can hold the network timestamp metadata and is meant to be allocated by the
caller instead of automatically during Enqueue() or within the Event
constructor.

This also introduces a BifConst ``EventMetadata::add_network_timestamp`` to
opt in to adding network timestamps to events globally. It's disabled by
default as there are not a lot of known use cases that need this.
2025-05-23 19:32:23 +02:00
Arne Welzel
cc7dc60c1e EventRegistry/zeek.bif/init-bare: Add event metadata infrastructure
Introduce a new EventMetadata module and members on EventMgr to register
event metadata types.
2025-05-23 19:31:58 +02:00
Christian Kreibich
fdecfba6b4 Merge branch 'smoot-improve-from_json' of github.com:/stevesmoot/zeek
* 'smoot-improve-from_json' of github.com:/stevesmoot/zeek:
  update baseline for zam
  Update src/zeek.bif
  Change from_json to return an error rather than print it.
2025-05-19 11:06:29 -07:00
Jan Grashoefer
84cc4b890d Add STLS command to POP3 DPD signature 2025-05-14 16:37:25 +02:00
Arne Welzel
a61aff010f cluster/websocket: Propagate code and reason to websocket_client_lost()
This provides visibility into the reason why ixwebsocket or the
client decided to disconnect.

Closed #4440
2025-05-13 18:26:03 +02:00
Arne Welzel
aaddeb19ad cluster/websocket: Support configurable ping interval
Primarily for testing purposes, but the hard-coded 5 seconds may also be
too aggressive for some deployments, so it makes sense for it to be
configurable.
2025-05-13 18:26:03 +02:00
Christian Kreibich
738ce1c235 Bugfix: accurately track Broker buffer overflows w/ multiple peerings
When a node restarts or a peering between two nodes starts over for other
reasons, the internal tracking in the Broker manager resets its state (since
it's per-peering), and thus the message overflow counter. The script layer was
unaware of this, and threw errors when trying to reset the corresponding counter
metric down to zero at sync time.

We now track past buffer overflows via a separate epoch table, using Broker peer
ID comparisons to identify new peerings, and set the counter to the sum of past
and current overflows.

I considered just making this a gauge, but it seems more helpful to be able to
look at a counter to see whether any messages have ever been dropped over the
lifetime of the node process.

As an aside, this now also avoids repeatedly creating the labels vector,
re-using the same one for each metric.

Thanks to @pbcullen for identifying this one!
2025-05-07 17:27:38 -07:00
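
The epoch-table idea from 738ce1c235 can be sketched like this. The Python below is a simplified model, not the script-layer code: the Epoch and OverflowCounter classes and the sync() signature are hypothetical, and it assumes a new peering is recognized purely by a changed Broker peer ID, as the commit message describes.

```python
from dataclasses import dataclass


@dataclass
class Epoch:
    peer_id: str      # Broker peer ID of the peering currently being tracked
    past: int = 0     # overflows accumulated by earlier peerings
    current: int = 0  # overflows reported so far by the current peering


class OverflowCounter:
    """Keeps a monotonic overflow count per node across peering restarts."""

    def __init__(self) -> None:
        self.epochs = {}  # node name -> Epoch

    def sync(self, node: str, peer_id: str, overflows: int) -> int:
        """Return the counter value to publish for this node at sync time."""
        epoch = self.epochs.get(node)
        if epoch is None:
            epoch = self.epochs[node] = Epoch(peer_id)
        elif epoch.peer_id != peer_id:
            # New peering: Broker's per-peering count restarted, so fold the
            # old peering's total into the past sum before tracking the new one.
            epoch.past += epoch.current
            epoch.peer_id = peer_id
        epoch.current = overflows
        return epoch.past + epoch.current
```

Because the returned value is the sum of past and current overflows, it never decreases when a peering starts over, which is what a counter metric requires.
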
Tim Wojtulewicz
2cf8497bf7 Merge remote-tracking branch 'origin/topic/timw/update-ct-ca-lists'
* origin/topic/timw/update-ct-ca-lists:
  External tests: add removed logs to CT list to prevent baseline changes
  Update Mozilla CA list and CT list to NSS 3.110
2025-04-29 08:53:04 -07:00
Kshitiz Bartariya
40935c31b1 Ignore case when matching prefix in http analyzer 2025-04-25 10:33:11 -07:00
Christian Kreibich
68fadd0464 Lower listen/connect retry intervals in Broker and the cluster framework to 1sec
The former defaults (30sec, 1min) can slow down cluster startup and recovery
considerably, and other systems have more aggressive intervals still.
2025-04-25 10:22:35 -07:00
Christian Kreibich
841a40ff88 Switch Broker's default backpressure policy to drop_oldest, bump buffer sizes
At every site where we've dug into backpressure disconnect findings, it has been
the case that the default values were too small. 8192, so 4x the old default,
suffices at every site to drown out premature disconnects.

With metrics now available for the send buffers regardless of backpressure
overflow policy, this also switches the default from "disconnect" to
"drop_oldest" (for both peers and websockets), meaning that peerings remain
untouched but the oldest queued message simply gets dropped when a new message
is enqueued. With this policy, the number of backpressure overflows is then
simply the count of discarded messages, something that users can tune to see
drop to zero in everyday use.  Another benefit is that marginal overflows cause
less message loss than when an entire buffer's worth (plus potentially more
in-flight messages) gets thrown out with a disconnect.
2025-04-25 10:22:35 -07:00
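
The drop_oldest policy described in 841a40ff88 can be modeled with a bounded queue. This Python sketch is illustrative only: the SendBuffer class is hypothetical and says nothing about how Broker implements its buffers.

```python
from collections import deque


class SendBuffer:
    """Bounded message buffer that discards the oldest entry when full."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.queue = deque()
        self.overflows = 0  # equals the number of discarded messages

    def enqueue(self, message) -> None:
        if len(self.queue) >= self.capacity:
            self.queue.popleft()  # drop the oldest queued message
            self.overflows += 1
        self.queue.append(message)
```

Enqueuing five messages into a buffer of capacity three keeps the newest three and counts two overflows. Under a disconnect policy the same marginal overflow would instead cost the whole buffer's contents.
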
Christian Kreibich
5008f586ea Deprecate Broker::congestion_queue_size and stop using it internally
Since a reorg in the Broker library (commit b04195183) that revamped flow
control and that we pulled in with Zeek 5.0, this setting hasn't done
anything. Broker's endpoint::make_subscriber() and
endpoint::make_status_subscriber() take a queue size argument (with a default
value) that simply gets dropped in the eventual subscriber::make() call. See:

b041951835 (diff-5c0d2baa7981caeb6a4080708ddca6ad929746d10c73d66598e46d7c2c03c8deL34-R178)
2025-04-25 10:22:35 -07:00
Christian Kreibich
c1a5f70df8 Merge branch 'topic/christian/broker-backpressure-metrics'
* topic/christian/broker-backpressure-metrics:
  Add basic btest to verify that Broker peering telemetry is available.
  Add cluster framework telemetry for Broker's send-buffer use
  Add peer buffer update tracking to the Broker manager's event_observer
  Rename the Broker manager's LoggerAdapter
  Avoid race in the cluster/broker/publish-any btest
2025-04-25 10:04:09 -07:00
Christian Kreibich
88a0cda8ca Add cluster framework telemetry for Broker's send-buffer use
This hooks into Telemetry::sync() to update Broker-level metrics tracking the
peerings' send buffer state. We do this in the cluster framework so we can label
the resulting metrics with Zeek cluster node names, not Broker's endpoint IDs.
2025-04-25 09:14:33 -07:00
Christian Kreibich
f5fbad23ff Add peer buffer update tracking to the Broker manager's event_observer
This implements basic tracking of each peering's current fill level, the maximum
level over a recent time interval (via a new Broker::buffer_stats_reset_interval
tunable, defaulting to 1min), and the number of times a buffer overflows. For
the disconnect policy this is the number of depeerings, but for drop_newest and
drop_oldest it implies the number of messages lost.

This doesn't use "proper" telemetry metrics for a few reasons: this tracking is
Broker-specific, so we need to track each peering via endpoint_ids, while we
want the metrics to use Cluster node name labels, and the latter live in the
script layer. Using broker::endpoint_id directly as keys also means we rely on
their ability to hash in STL containers, which should be fast.

This does not track the buffer levels for Broker "clients" (as opposed to
"peers"), i.e. WebSockets, since we currently don't have a way to name these,
and we don't want to use ephemeral Broker IDs in their telemetry.

To make the stats accessible to the script layer the Broker manager (via a new
helper class that lives in the event_observer) maintains a TableVal mapping
Broker IDs to a new BrokerPeeringStats record. The table's members get updated
every time that table is requested. This minimizes new val instantiation and
allows the script layer to customize the BrokerPeeringStats record by redef'ing,
updating fields, etc. Since we can't use Zeek vals outside the main thread, this
requires some care so all table updates happen only in the Zeek-side table
updater, PeerBufferState::GetPeeringStatsTable().
2025-04-24 22:47:18 -07:00
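
The three values tracked per peering in f5fbad23ff (current fill level, maximum over a recent interval, overflow count) can be sketched as follows. The Python class below is a hypothetical model: its name, fields and update() signature are assumptions, and timestamps are passed in explicitly to keep the example deterministic.

```python
class PeeringStats:
    """Per-peering buffer statistics with a periodically reset maximum."""

    def __init__(self, reset_interval: float) -> None:
        self.reset_interval = reset_interval  # seconds, e.g. 60.0 for 1min
        self.current = 0      # most recently observed fill level
        self.max_recent = 0   # highest fill level in the current interval
        self.overflows = 0    # number of buffer overflows seen so far
        self._window_start = 0.0

    def update(self, now: float, fill_level: int, overflowed: bool = False) -> None:
        if now - self._window_start >= self.reset_interval:
            # Interval elapsed: start a new window and forget the old maximum.
            self._window_start = now
            self.max_recent = 0
        self.current = fill_level
        self.max_recent = max(self.max_recent, fill_level)
        if overflowed:
            self.overflows += 1


# One stats object per peering, keyed by a (hypothetical) endpoint ID string.
stats_by_peer = {"endpoint-a": PeeringStats(reset_interval=60.0)}
```

The overflow count is never reset here, whereas the maximum only reflects the most recent interval.
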
Tim Wojtulewicz
3ab83a3f74 Minor changes to storage framework script docs 2025-04-24 11:11:08 -07:00
Steve Smoot
9ef579b09e Change from_json to return an error rather than print it. 2025-04-23 15:56:12 -07:00
Tim Wojtulewicz
cb35da08bc Update Mozilla CA list and CT list to NSS 3.110 2025-04-23 10:41:19 -07:00
Arne Welzel
011029addc cluster/websocket: Make websocket dispatcher queue size configurable
Limit the number of WebSocket events queued from external clients to
dispatcher instances to produce back pressure to the clients if
Zeek's IO loop is overloaded.
2025-04-23 14:27:43 +02:00
Arne Welzel
ab25e5d24b broker/main: Reference Cluster::publish() for auto_publish() deprecation
In hindsight, this is the better thing to do and with Zeek 7.2 we should
be confident enough that it'll work.
2025-04-23 14:27:43 +02:00
Arne Welzel
a7423104e1 broker/main: Deprecate Broker::listen_websocket()
Optimistically deprecate Broker::listen_websocket() and promote
Cluster::listen_websocket() instead.
2025-04-23 14:27:43 +02:00
Arne Welzel
3d3b7a0759 cluster/Backend: Add ProcessError()
Allow backends to pass errors to a strategy. Locally, these raise
Cluster::Backend::error() events that are logged to the reporter
as errors.
2025-04-23 14:19:08 +02:00
Christian Kreibich
549e678dff Use Broker peering directionality when re-peering after backpressure overflows
This avoids creating pointless connection reattempts to ephemeral TCP
client-side ports, which have been cluttering up the Broker logs since 7.1.
2025-04-21 14:08:42 -07:00
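
The effect of 549e678dff can be sketched as a filter on lost peerings. This Python is a hypothetical model: the Peering class and repeer_targets() are illustrative names, and the outbound flag stands in for whatever the Broker API reports about who established the peering.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Peering:
    address: str
    port: int
    outbound: bool  # True if the local node originally established the peering


def repeer_targets(lost_peerings):
    """Return (address, port) pairs that are worth reconnecting to.

    Inbound peerings are skipped: their remote port is an ephemeral
    client-side TCP port that nothing listens on.
    """
    return [(p.address, p.port) for p in lost_peerings if p.outbound]
```

Only peerings the local node initiated are retried; for the others, the remote side is expected to reconnect on its own.
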
Christian Kreibich
b430d5235c Expand Broker APIs to allow tracking directionality of peering establishment
This provides ways to figure out for a given peer, or a given address/port pair,
whether the local node originally established the peering.
2025-04-21 14:08:42 -07:00
Arne Welzel
b8e573a3b9 ldap: Clean up from code review
Co-authored-by: Benjamin Bannier <benjamin.bannier@corelight.com>
2025-04-15 20:10:56 +02:00
Arne Welzel
07bf7f8b18 ldap: Add Sicily Authentication constants
The aduser1-ntlm.pcap contains bindRequest messages using Microsoft AD
specific Sicily Authentication [1]. Add the entries to the enum so we
don't log undefined for these and also check the NTLMSSP signature.

[1] https://learn.microsoft.com/en-us/openspecs/windows_protocols/ms-adts/8b9dbfb2-5b6a-497a-a533-7e709cb9a982
2025-04-15 20:10:56 +02:00
Tim Wojtulewicz
cb1ef47a31 Add STORAGE_ prefixes for backends and serializers 2025-04-14 10:11:13 -07:00