Commit graph

354 commits

Author SHA1 Message Date
Johanna Amann
2f712c3c24 Allow to track service violations in conn.log.
This introduces ian options, DPD::track_removed_services_in_connection.
It adds failed services to the services column, prefixed with a
"-".

Alternatively, this commit also adds
policy/protocols/conn/failed-services.zeek, which provides the same
information in a new column in conn.log.
2025-01-30 16:59:44 +00:00
Johanna Amann
c72c1cba6f DPD: change handling of pre-confirmation violations, remove max_violations
This commit revamps the handling of analyzer violations that happen
before an analyzer confirms the protocol.

The current state is that an analyzer is disabled after 5 violations, if
it has not been confirmed. If it has been confirmed, it is disabled
after a single violation.

The reason for this is a historic mistake. In Zeek up to versions 1.5,
analyzers were unconditianally removed when they raised the first
protocol violation.

When this script was ported to the new layout for Zeek 2.0 in
b4b990cfb5, a logic error was introduced
that caused analyzers to no longer be disabled if they were not
confirmed.

This was the state for ~8 years, till the DPD::max_violations options
was added, which instates the current approach of disabling unconfirmed
analyzers after 5 violations. Sadly, there is not much discussion about
this change - from my hazy memory, I think this was discovered during
performance tests and the new behavior was added without checking into
the history of previous changes.

This commit reinstates the originally intended behavior of DPD. When an
analyzer that has not been confirmed raises a protocol violation, it is
immediately removed from the connection. This also makes a lot of sense
- this allows the analyzer to be in a "tasting" phase at the beginning
of the connection, and to error out quickly once it realizes that it was
attached to a connection not containing the desired protocol.

This change also removes the DPD::max_violations option, as it no longer
serves any purpose after this change. (In practice, the option remains
with an &deprecated warning, but it is no longer used for anything).

There are relatively minimal test-baseline changes due to this; they are
mostly triggered by the removal of the data structure and by less
analyzer errors being thrown, as unconfirmed analyzers are disabled
after the first error.
2025-01-30 16:59:44 +00:00
Tim Wojtulewicz
0fcbc8546e Update btests for new local-only subnets 2025-01-09 22:16:42 -07:00
Arne Welzel
35c79ab2e3 cluster/backend/zeromq: Add ZeroMQ based cluster backend
This is a cluster backend implementation using a central XPUB/XSUB proxy
that by default runs on the manager node. Logging is implemented leveraging
PUSH/PULL sockets between logger and other nodes, rather than going
through XPUB/XSUB.

The test-all-policy-cluster baseline changed: Previously, Broker::peer()
would be called from setup-connections.zeek, causing the IO loop to be
alive. With the ZeroMQ backend, the IO loop is only alive when
Cluster::init() is called, but that doesn't happen anymore.
2024-12-10 20:33:02 +01:00
Christian Kreibich
1c42bfc715 Merge branch 'topic/christian/disconnect-slow-peers'
* topic/christian/disconnect-slow-peers:
  Bump cluster testsuite to pull in Broker backpressure tests
  Expand documentation of Broker events.
  Add sleep() BiF.
  Add backpressure disconnect notification to cluster.log and via telemetry
  Remove unneeded @loads from base/misc/version.zeek
  Add Cluster::nodeid_to_node() helper function
  Support re-peering with Broker peers that fall behind
  Add Zeek-level configurability of Broker slow-peer disconnects
  Bump Broker to pull in disconnect feature and infinite-loop fix
  No need to namespace Cluster:: functions in their own namespace
2024-12-09 23:33:35 -08:00
Tim Wojtulewicz
ccefd66d37 Move python signatures to a separate file 2024-12-09 11:08:30 -07:00
Christian Kreibich
0010e65f6d Support re-peering with Broker peers that fall behind
This adds re-peering at the Broker level for peers that Broker decided to
unpeer. We keep this at the Broker level since this behavior is specific to
it (as opposed to other cluster backends).

Includes baseline updates for btests that pick up on the new script's @load.
2024-12-06 15:18:05 -08:00
Arne Welzel
51836d08ae protocol: Add StreamEvent analyzer
This analyzer can be used to transport raw stream data for a given
connection to the script layer. For example, adding this analyzer into
the HTTP::upgrade_analyzer or using it to configure a child WebSocket
analyzer allows to get access to the raw stream data in script land
when no more appropriate protocol analyzer is available.
2024-12-06 16:12:40 +01:00
Arne Welzel
ef04a199c8 cluster: Add Cluster scoped bifs
... and a broker based test using Cluster::publish() and
Cluster::subscribe().
2024-11-26 12:58:23 +01:00
Tim Wojtulewicz
5e5aceb6f7 Rename protocol_id field to ip_proto and similar renaming for name field 2024-11-13 12:02:00 -07:00
Tim Wojtulewicz
35ec9733c0 Add conn.log entries for connections with unhandled IP protocols 2024-11-13 11:25:40 -07:00
Christian Kreibich
71f7e89974 Telemetry framework: move BIFs to the primary-bif stage
This moves the Telemetry framework's BIF-defined functionalit from the
secondary-BIFs stage to the primary one. That is, this functionality is now
available from the end of init-bare.zeek, not only after the end of
init-frameworks-and-bifs.zeek.

This allows us to use script-layer telemetry in our Zeek's own code that get
pulled in during init-frameworks-and-bifs.

This change splits up the BIF features into functions, constants, and types,
because that's the granularity most workable in Func.cc and NetVar. It also now
defines the Telemetry::MetricsType enum once, not redundantly in BIFs and script
layer.

Due to subtle load ordering issues between the telemetry and cluster frameworks
this pushes the redef stage of Telemetry::metrics_port and address into
base/frameworks/telemetry/options.zeek, which is loaded sufficiently late in
init-frameworks-and-bifs.zeek to sidestep those issues. (When not doing this,
the effect is that the redef in telemetry/main.zeek doesn't yet find the
cluster-provided values, and Zeek does not end up listening on these ports.)

The need to add basic Zeek headers in script_opt/ZAM/ZBody.cc as a side-effect
of this is curious, but looks harmless.

Also includes baseline updates for the usual btests and adds a few doc strings.
2024-10-18 09:56:29 -07:00
Vern Paxson
61258587bf BTest baseline update for more complete function/lambda names 2024-09-27 14:16:10 -07:00
Arne Welzel
cf9fe91705 pop3: Prevent unbounded state growth
The cmds list may grow unbounded due to the POP3 analyzer being in
multiLine mode after seeing `AUTH` in a Redis connection, but never
a `.` terminator. This can easily be provoked by the Redis ping
command.

This adds two heuristics: 1) Forcefully process the oldest commands in
the cmds list and cap it at max_pending_commands. 2) Start raising
analyzer violations if the client has been using more than
max_unknown_client_commands commands (default 10).

Closes #3936
2024-09-18 19:05:39 +02:00
Arne Welzel
a5d93c4dec btest: Update baselines for removal-hooks addition
The removal_hooks field exists in bare mode (seems fine) and moved within the
connection record to earlier, so a bunch of baselines changed
2024-09-17 18:15:15 +02:00
Tim Wojtulewicz
7ac7ce1d2b Process metric callbacks from the main-loop thread
This avoids the callbacks from being processed on the worker thread
spawned by Civetweb. It fixes data race issues with lookups involving
global variables, amongst other threading issues.
2024-08-02 15:30:47 -07:00
Jan Grashoefer
c6c8d078c0 Extend btest for logging of disabled analyzers 2024-07-09 20:15:46 +02:00
Tim Wojtulewicz
a63ea5a04e Btest updates due to recent changes 2024-05-31 13:30:31 -07:00
Vern Paxson
e84b60762a added a space when rendering some expressions so they're more readable 2024-05-29 12:40:05 -07:00
Arne Welzel
efc2681152 WebSocket: Introduce new analyzer and log
This adds a new WebSocket analyzer that is enabled with the HTTP upgrade
mechanism introduced previously. It is a first implementation in BinPac with
manual chunking of frame payload. Configuration of the analyzer is sketched
via the new websocket_handshake() event and a configuration BiF called
WebSocket::__configure_analyzer(). In short, script land collects WebSocket
related HTTP headers and can forward these to the analyzer to change its
parsing behavior at websocket_handshake() time. For now, however, there's
no actual logic that would change behavior based on agreed upon extensions
exchanged via HTTP headers (e.g. frame compression). WebSocket::Configure()
simply attaches a PIA_TCP analyzer to the WebSocket analyzer for dynamic
protocol detection (or a custom analyzer if set). The added pcaps show this
in action for tunneled ssh, http and https using wstunnel. One test pcap is
Broker's WebSocket traffic from our own test suite, the other is the
Jupyter websocket traffic from the ticket/discussion.

This commit further adds a basic websocket.log that aggregates the WebSocket
specific headers (Sec-WebSocket-*) headers into a single log.

Closes #3424
2024-01-22 18:54:38 +01:00
Arne Welzel
2a858d252e MIME: Cap nested MIME analysis depth to 100
OSS-Fuzz managed to produce a MIME multipart message construction with
thousands of nested entities (or that's what Zeek makes out of it anyhow).
Prevent such deep analysis by capping at a nesting depth of 100,
preventing unnecessary resource usage. A new weird named exceeded_mime_max_depth
is reported when this limit is reached.

This change reduces the runtime of the OSS-Fuzz reproducer from ~45 seconds
to ~2.5 seconds.

The test PCAP was produced from a Python script using the email package
and sending the rendered version via POST to a HTTP server.

Closes #208
2024-01-17 10:18:13 -07:00
Arne Welzel
14949941ce SMTP: Add BDAT support
Closes #3264
2024-01-12 10:18:07 +01:00
Christian Kreibich
4e45a3462b Update btest baselines to reflect introduction of mmdb.bif 2024-01-10 20:28:41 -08:00
Arne Welzel
e3796894c6 logging: Do not keep delay state persistent
If Log::remove_stream() and Log::create_stream() is called for a stream,
do not restore the previously used max delay or max queue size.
2023-11-29 11:53:11 +01:00
Arne Welzel
f0e67022fd logging: Introduce Log::delay() and Log::delay_finish()
This is a verbose, opinionated and fairly restrictive version of the log delay idea.
Main drivers are explicitly, foot-gun-avoidance and implementation simplicity.

Calling the new Log::delay() function is only allowed within the execution
of a Log::log_stream_policy() hook for the currently active log write.

Conceptually, the delay is placed between the execution of the global stream
policy hook and the individual filter policy hooks. A post delay callback
can be registered with every Log::delay() invocation. Post delay callbacks
can (1) modify a log record as they see fit, (2) veto the forwarding of the
log record to the log filters and (3) extend the delay duration by calling
Log::delay() again. The last point allows to delay a record by an indefinite
amount of time, rather than a fixed maximum amount. This should be rare and
is therefore explicit.

Log::delay() increases an internal reference count and returns an opaque
token value to be passed to Log::delay_finish() to release a delay reference.
Once all references are released, the record is forwarded to all filters
attached to a stream when the delay completes.

This functionality separates Log::log_stream_policy() and individual filter
policy hooks. One consequence is that a common use-case of filter policy hooks,
removing unproductive log records, may run after a record was delayed. Users
can lift their filtering logic to the stream level (or replicate the condition
before the delay decision). The main motivation here is that deciding on a
stream-level delay in per-filter hooks is too late. Attaching multiple filters
to a stream can additionally result in hard to understand behavior.

On the flip side, filter policy hooks are guaranteed to run after the delay
and can be used for further mangling or filtering of a delayed record.
2023-11-29 11:53:11 +01:00
Vern Paxson
23c08a05de descriptions of "for" statements now include their "value variable" if present 2023-11-10 09:56:51 +01:00
Arne Welzel
54a08a74da base/frameworks/spicy: Do not load base/misc/version
Unsure what it's used for today and also results in the situation that on
some platforms we generate a reporter.log in bare mode, while on others
where spicy is disabled, we do not.

If we want base/frameworks/version loaded by default, should put it into
init-bare.zeek and possibly remove the loading of the reporter framework
from it - Reporter::error() would still work and be visible on stderr,
just not create a reporter.log.
2023-10-24 13:15:21 +02:00
Tim Wojtulewicz
6d9d4523bc Add registration for GRE-over-UDP 2023-10-16 11:42:24 -07:00
Arne Welzel
07ac6fa074 btest/plugins/hooks: Run in bare mode
Motivation is basically the same as in 88bb527026.
For plugin.hooks, one example is that adding a new option in the default script
changes the baseline due registration of change handlers. Also, the connection
record is printed in various places, resulting in churn when the default
scripts change.
2023-10-09 16:13:59 +02:00
Johanna Amann
e18edfa452 Add extract_limit_includes_missing option for file extraction
Setting this option to false does not count missing bytes in files towards the
extraction limits, and allows to extract data up to the desired limit,
even when partial files are written.

When missing bytes are encountered, files are now written as sparse
files.

Using this option requires the underlying storage and utilities to support
sparse files.
2023-09-14 12:11:42 -07:00
Tim Wojtulewicz
5934e143aa Revert "Add extract_limit_includes_missing option for file extraction"
This reverts commit f4d0fdcd5c.
2023-09-14 12:10:40 -07:00
Johanna Amann
f4d0fdcd5c Add extract_limit_includes_missing option for file extraction
Setting this option to false does not count missing bytes in files towards the
extraction limits, and allows to extract data up to the desired limit,
even when partial files are written.

When missing bytes are encountered, files are now written as sparse
files.

Using this option requires the underlying storage and utilities to support
sparse files.

(cherry picked from commit afa6f3a0d3b8db1ec5b5e82d26225504c2891089)
2023-09-12 12:00:36 -07:00
Arne Welzel
cea7c0ab46 ID/Stmt: Introduce INIT_SKIP and use in ForStmt
Currently, loop vars are added to a function scope's inits and
initialized upon entering a function with default values. This
applies to vector, record and table types.

This is unnecessary for variables used in for loops as they are
guaranteed to be initialized while iterating.
2023-09-08 13:05:44 +02:00
Arne Welzel
af1714853f http: Prevent request/response de-synchronization and unbounded state growth
When http_reply events are received before http_request events, either
through faking traffic or possible re-ordering, it is possible to trigger
unbounded state growth due to later http_requests never being matched
again with responses.

Prevent this by synchronizing request/response counters when late
requests come in.

Also forcefully flush pending requests when http_replies are never
observed either due to the analyzer having been disabled or because
half-duplex traffic.

Fixes #1705
2023-08-28 15:02:58 +02:00
Arne Welzel
ee12a7a6e7 PPP: Add PPP analyzer to handle LINKTYPE_PPP (0x9)
Using pcaps from https://interop.seemann.io/ as samples for QUIC protocol
data didn't produce a conn.log for the contained data. `tcpdump -r`
and Wireshark do show the contained IP/UDP packets. Teach Zeek how
to handle link type DLT_PPP 0x09 using a new PPP analyzer based on the
PPPSerial analyzer code.

Usual update to files/x509 baseline after adding new analyzer due
to enum values changing.
2023-08-23 16:41:19 +02:00
Tim Wojtulewicz
5b74e717bc Fix plugin.hooks test for opaque-printing change 2023-07-20 10:43:36 -07:00
Tim Wojtulewicz
4229af6820 Remove deprecations tagged for v6.1 2023-06-14 10:07:22 -07:00
Arne Welzel
eef7acc1e9 cluster/main: Remove extra @if ( Cluster::is_enabled() )
These have been discussed in the context of "@if &analyze" [1] and
am much in favor for not disabling/removing ~100 lines (more than
fits on a single terminal) out from the middle of a file. There's no
performance impact for having these handlers enabled unconditionally.
Also, any future work on "@if &analyze" will look at them again which
we could also skip.

This also reverts back to the behavior where the Cluster::LOG stream
is created even in non cluster setups like in previous Zeek versions.
As long as no one writes to it there's essentially no difference. If
someone does write to Cluster::LOG, I'd argue not black holing these
messages is better. Schema generators using Log::active_streams will
continue to discover Cluster::LOG even if they run in non-cluster
mode.

https://github.com/zeek/zeek/pull/3062#discussion_r1200498905
2023-06-06 15:20:10 +02:00
Robin Sommer
a62e153dd3
Do not load Spicy scripts if Spicy is not available. 2023-05-16 10:21:21 +02:00
Robin Sommer
0040111955
Integrate the Spicy plugin into Zeek proper.
This reflects the `spicy-plugin` code as of `d8c296b81cc2a11`.

In addition to moving the code into Zeek's source tree, this comes
with a couple small functional changes:

- `spicyz` no longer tries to infer if it's running from the build
  directory. Instead `ZEEK_SPICY_LIBRARY` can be set to a custom
  location. `zeek-set-path.sh` does that now.

- ZEEK_CONFIG can be set to change what `spicyz -z` print out. This is
  primarily for backwards compatibility.

Some further notes on specifics:

- We raise the minimum Spicy version to 1.8 (i.e., current `main`
  branch).

- Renamed the `compiler/` subdirectory to `spicyz` to avoid
  include-path conflicts with the Spicy headers.

- In `cmake/`, the corresponding PR brings a new/extended version of
  `FindZeek`, which Spicy analyzer packages need. We also now install
  some of the files that the Spicy plugin used to bring for testing,
  so that existing packages keep working.

- For now, this all remains backwards compatible with the current
  `zkg` analyzer templates so that they work with both external and
  integrated Spicy support. Later, once we don't need to support any
  external Spicy plugin versions anymore, we can clean up the
  templates as well.

- All the plugin's tests have moved into the standard test suite. They
  are skipped if configure with `--disable-spicy`.

This holds off on adapting the new code further to Zeek's coding
conventions, so that it remains easier to maintain it in parallel to
the (now legacy) external plugin. We'll make a pass over the
formatting for (presumable) Zeek 6.1.
2023-05-16 10:17:45 +02:00
Arne Welzel
3ac877e20d scripts/smb2-main: Reset script-level state upon smb2_discarded_messages_state()
This is similar to what the external corelight/zeek-smb-clear-state script
does, but leverages the smb2_discarded_messages_state() event instead of
regularly checking on the state of SMB connections.

The pcap was created using the dperson/samba container image and mounting
a share with Linux's CIFS filesystem, then copying the content of a
directory with 100 files. The test uses a BPF filter to imitate mostly
"half-duplex" traffic.
2023-05-03 11:22:01 +02:00
Arne Welzel
004dce2cf2 Merge remote-tracking branch 'origin/topic/awelzel/zeekctl-multiple-loggers'
* origin/topic/awelzel/zeekctl-multiple-loggers:
  NEWS: Add entry for ZeekControl and multi-loggers
  Bump zeekctl to multi-logger version
  logging: Support rotation_postprocessor_command_env
2023-04-27 12:17:02 +02:00
Tim Wojtulewicz
7aa7909c94 Add forwarding from VLAN analyzer into LLC, SNAP, and Novell 802.3 analyzers 2023-04-25 12:29:55 -07:00
Tim Wojtulewicz
c5b8603218 Remove non-standard way of forwarding out of the Ethernet analyzer 2023-04-25 12:29:55 -07:00
Tim Wojtulewicz
7e88a2b3fb Add basic LLC, SNAP, and Novell 802.3 packet analyzers 2023-04-25 12:29:54 -07:00
Tim Wojtulewicz
f62f8e5cc9 Remove workaround for tunnels from IEEE 802.11 analyzer 2023-04-25 09:28:20 -07:00
Tim Wojtulewicz
5b1c6216bd Fix IEEE 802.11 analyzer to properly forward tunneled packets
This mostly happens with Aruba, but could possibly happen with other tunnels too.
2023-04-25 09:28:20 -07:00
Arne Welzel
ffb73e4de9 Merge remote-tracking branch 'origin/topic/awelzel/add-community-id'
* origin/topic/awelzel/add-community-id:
  testing/external: Bump hashes for community_id addition
  NEWS: Add entry for Community ID
  policy: Import zeek-community-id scripts into protocols/conn frameworks/notice
  Add community_id_v1() based on corelight/zeek-community-id
2023-04-24 10:12:56 +02:00
Christian Kreibich
99de7b7526 Add community_id_v1() based on corelight/zeek-community-id
"Community ID" has become an established flow hash for connection correlation
across different monitoring and storage systems. Other NSMs have had native
and built-in support for Community ID since late 2018. And even though the
roots of "Community ID" are very close to Zeek, Zeek itself has never provided
out-of-the-box support and instead required users to install an external plugin.

While we try to make that installation as easy as possible, an external plugin
always sets the bar higher for an initial setup and can be intimidating.
It also requires a rebuild operation of the plugin during upgrades. Nothing
overly complicated, but somewhat unnecessary for such popular functionality.

This isn't a 1:1 import. The options are parameters and the "verbose"
functionality  has been removed. Further, instead of a `connection`
record, the new bif works with `conn_id`, allowing computation of the
hash with little effort on the command line:

    $ zeek -e 'print community_id_v1([$orig_h=1.2.3.4, $orig_p=1024/tcp, $resp_h=5.6.7.8, $resp_p=80/tcp])'
    1:RcCrCS5fwYUeIzgDDx64EN3+okU

Reference: https://github.com/corelight/zeek-community-id/
2023-04-21 20:44:09 +02:00
Jan Grashoefer
3db8bb4a44 Generalize Cluster::worker_count. 2023-04-21 19:04:39 +02:00