OSS-Fuzz managed to produce a MIME multipart message construction with
thousands of nested entities (or that's what Zeek makes out of it anyhow).
Prevent such deep analysis by capping the nesting depth at 100, avoiding
unnecessary resource usage. A new weird named exceeded_mime_max_depth
is reported when this limit is reached.
This change reduces the runtime of the OSS-Fuzz reproducer from ~45 seconds
to ~2.5 seconds.
The test PCAP was produced from a Python script using the email package
and sending the rendered version via POST to an HTTP server.
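A minimal sketch of what such a generator could look like, assuming a
local test server; the nesting count, URL, and structure here are
illustrative, not the actual reproducer:

    import sys
    import urllib.request
    from email.mime.multipart import MIMEMultipart
    from email.mime.text import MIMEText

    sys.setrecursionlimit(10_000)   # rendering nested parts recurses

    # Build a message with deeply nested multipart entities.
    msg = MIMEMultipart()
    inner = msg
    for _ in range(1000):           # nesting count is made up
        child = MIMEMultipart()
        inner.attach(child)
        inner = child
    inner.attach(MIMEText("payload"))

    # POST the rendered message; capturing this traffic yields the PCAP.
    req = urllib.request.Request(
        "http://127.0.0.1:8080/",   # assumed local HTTP server
        data=msg.as_bytes(),
        headers={"Content-Type": msg["Content-Type"]})
    urllib.request.urlopen(req)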
Closes #208
(cherry picked from commit 4e5849fe82c6097df5d25cd1a74d69ab4fa50f46)
* origin/topic/awelzel/smb-unbounded-recent-files-growth:
smb: Fix &read_expire not in effect due to &default=string_set() usage
(cherry picked from commit 5740dbcf20)
Previously, seq was computed as |pending_commands| + 1. This
opened the possibility of overriding queued commands, as well as of
logging the same pending ftp reply multiple times.
For example, when commands 1, 2, 3 are pending, command 1 may be dequeued,
but the incoming command then receives seq 3 and overrides the already
pending command 3. The second scenario happens when ftp_reply() selects
command 3 as pending for logging, but is then followed by many ftp_request()
events. This resulted in command 3's response being logged again for
every following ftp_request().
Avoid both scenarios by tracking the command sequence as an absolute counter.
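An illustrative Python model of the two failure modes (Zeek's
implementation is C++; the names here are simplified):

    # Bug: deriving seq from the table size collides after a dequeue.
    pending = {1: "USER", 2: "PASS", 3: "CWD"}
    del pending[1]               # command 1 completed and was dequeued
    seq = len(pending) + 1       # == 3: clobbers the still-pending CWD
    pending[seq] = "EPRT"

    # Fix: an absolute counter never reuses a sequence number.
    next_seq = 3                 # highest seq handed out so far
    next_seq += 1
    pending[next_seq] = "EPRT"   # stored under 4; command 3 stays intact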
The previous fix also made it clear that the ssl_history field may grow
unbounded via the ssl_alert event. Prevent this by capping the field at a
configurable limit (default 100) and raising a weird once the limit is
reached.
Limit the number of events raised from an SSL record with content_type
alert (21) to a configurable maximum number (default 10). For TLS 1.3,
the limit is set to 1 as specified in the RFC. Add a new weird for
cases where the limit is exceeded.
OSS-Fuzz managed to generate a reproducer that raised ~660k ssl_plaintext
and ssl_alert events combined, given ~810kb of input data. This change
prevents that, hopefully with no negative side effects in the real world.
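The mechanism is a simple count-and-cap per record; a rough Python
sketch of the pattern (option and weird names are placeholders, not
Zeek's actual identifiers):

    MAX_ALERTS_PER_RECORD = 10      # placeholder; 1 for TLS 1.3

    def report_weird(name):         # stand-in for Zeek's weird reporting
        print("weird:", name)

    def raise_ssl_alert(alert):     # stand-in for the ssl_alert event
        print("ssl_alert:", alert)

    def handle_alerts(alerts_in_record):
        raised = 0
        for alert in alerts_in_record:
            if raised >= MAX_ALERTS_PER_RECORD:
                report_weird("too_many_alerts")   # placeholder name
                break                             # drop remaining alerts
            raise_ssl_alert(alert)
            raised += 1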
Setting this option to false does not count missing bytes in files towards the
extraction limits, and allows extracting data up to the desired limit,
even when partial files are written.
When missing bytes are encountered, files are now written as sparse
files.
Using this option requires the underlying storage and utilities to support
sparse files.
(cherry picked from commit afa6f3a0d3b8db1ec5b5e82d26225504c2891089)
In the past, we allocated a buffer of zeroes and wrote it out with
fwrite. Now we instead just fseek to the correct offset.
This changes the way in which the file extract limit is counted a bit;
skipped bytes no longer count against the file size limit.
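A minimal Python analogue of the new write path (the actual code is
C++ using fseek/fwrite):

    # Instead of writing a zero-filled buffer for missing bytes, seek
    # past the gap; on filesystems with sparse-file support the hole
    # occupies no disk space.
    with open("extracted.bin", "wb") as f:
        f.write(b"first chunk")
        f.seek(1_000_000, 1)     # 1 == os.SEEK_CUR: skip the missing bytes
        f.write(b"data after the gap")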
(cherry picked from commit 5071592e9b7105090a1d9de19689c499070749d4)
OSS-Fuzz generated a CWD request and reply followed by very many EPRT
requests. This caused Zeek to re-log the CWD request and invoke `build_url_ftp()`
over and over again, resulting in long processing times.
Avoid this scenario by not logging commands that are no longer pending.
(cherry picked from commit b05dd31667ff634ec7d017f09d122f05878fdf65)
A call to `extract_filename_from_content_disposition()` is only
efficient if the string is guaranteed to contain the pattern that
is removed by `sub()`. Due to missing brackets around the `[:blank:]`
character class (it should read `[[:blank:]]`), an overly long string
(756kb) ending in "Type:dtanameaa=" matched the wrong pattern, causing
`sub()` to exhibit quadratic runtime. Besides that, we may have
potentially extracted wrong information from a crafted header value.
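For illustration: in POSIX regexes a character class is only recognized
inside a bracket expression, i.e. it must be written `[[:blank:]]`.
Python's `re` has no POSIX classes at all, but it parses the broken
form the same way, as a plain character set:

    import re

    # "[:blank:]" is the set {':', 'b', 'l', 'a', 'n', 'k'}, not blanks:
    print(re.findall(r"[:blank:]+", "Type:dtanameaa="))  # [':', 'ana', 'aa']

    # What was intended, spelled portably for Python:
    print(re.findall(r"[ \t]+", "a b\tc"))               # [' ', '\t']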
(cherry picked from commit 6d385b1ca724a10444865e4ad38a58b31a2e2288)
* origin/topic/jazoff/gh-3268:
Fix check for emailed notices
Changes: Added a test-case printing email_delay_tokens to compare email vs
non-email notice types. Previously, both notice types would have email
delay tokens at that point in the flow.
(cherry picked from commit 7e11501d3c)
* origin/topic/awelzel/3145-dcerpc-state-clean:
dce-rpc: Test cases for unbounded state growth
dce-rpc: Handle smb2_close_request() in scripts
smb/dce-rpc: Cleanup DCE-RPC analyzers when fid is closed and limit them
dce-rpc: Do not repeatedly register removal hooks
(cherry picked from commit f9904511ab)
* topic/awelzel/3112-log-suffix-left-over-log-rotation:
cluster/logger: Fix leftover-log-rotation in multi-logger setups
cluster/logger: Fix global var reference
(cherry picked from commit f53aefdd5b)
* origin/topic/awelzel/cluster-at-if-removal:
test-all-policy: Do not load nodes-experimental/manager.zeek
cluster/main: Remove extra @if ( Cluster::is_enabled() )
(cherry picked from commit 98e44ee14f)
We previously would represent port ranges from EVT files element-wise.
This can potentially generate a lot of code (all on a single line,
though), which some versions of GCC seem to have trouble with, and which
also causes JIT overhead.
With this patch we switch to directly representing ranges. Single ports
are represented as ranges `[start, start]`.
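Conceptually (the generated code is C++; this just shows the shape of
the data):

    # Element-wise: one entry per port, potentially thousands on one line.
    ports_elementwise = list(range(4000, 5000))   # 1000 entries

    # Range-based: one (start, end) pair covers the same set; a single
    # port p becomes (p, p).
    ports_as_ranges = [(4000, 4999)]
    single_port = [(22, 22)]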
Closes #3094.
* origin/topic/vern/at-if-analyze:
updates reflecting review comments
change base scripts to use run-time if's or @if ... &analyze
a number of BTests updated with @if ... &analyze
update for scripting coverage BTest demonstrating utility of @if ... &analyze
BTests for new @if ... &analyze functionality
"if ( ... ) &analyze" language feature
classes for tracking "@if (...) &analyze" notion of code being/not being "activated"
RemoveGlobal() method for Scope class + simplifying interfaces
While writing documentation about troubleshooting and looking a bit
at the older stats.log, we realized we don't have packet lag
exposed as a metric via telemetry. Add it.
This is a Zeek instance lagging ~6 seconds behind in network time because
it's very overloaded:

    zeek_net_packet_lag_seconds{endpoint=""} 6.169406 1684848998092
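Conceptually, the lag is wall-clock time minus the network time
currently being processed; a hedged sketch, not Zeek's actual
implementation:

    import time

    def packet_lag_seconds(network_time: float) -> float:
        # wall clock minus the timestamp of the packet being processed
        return time.time() - network_time

    print(packet_lag_seconds(time.time() - 6.17))  # prints roughly 6.17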
Repeating the message for every new call to get_file_handle() is not
very useful. It's pretty much an analyzer configuration issue, so logging
it once should be enough.
OSS-Fuzz generated traffic containing a CWD command with a single very large
path argument (427kb) starting with ".___/` \x00\x00...". This is followed
by a large number of ftp replies with code 250. The directory logic in
ftp_reply() would match every incoming reply with the one pending CWD command,
triggering path buildup that ended up at something like 120MB in size.
Protect against re-using a directory command by setting a flag in the
CmdArg record once it has been consumed by the path traversal logic.
This doesn't prevent unbounded path build-up generally, but does prevent the
amplification of a single large command with very many small ftp_replies.
Re-using a pending path command seems like a bug as well.
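A sketch of the consumed-flag idea in Python (Zeek keeps the flag in
the CmdArg record; all names below are illustrative):

    def apply_path_traversal(arg):      # stand-in for the directory logic
        pass

    pending_cwd = {"cmd": "CWD", "arg": ".___/...", "consumed": False}

    def on_reply_250(cmd):
        if cmd["consumed"]:
            return                      # never re-use the directory command
        cmd["consumed"] = True
        apply_path_traversal(cmd["arg"])

    on_reply_250(pending_cwd)           # applies the path once
    on_reply_250(pending_cwd)           # subsequent replies are ignored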
* jgras/topic/jgras/cluster-active-node-count-fix:
Fix get_active_node_count for node types not present.
Changed over to an explicit existence check to avoid the set()
creation upon missed lookups.
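The pitfall maps directly onto Python's defaultdict; an explicit
membership test avoids materializing state for absent keys:

    from collections import defaultdict

    nodes = defaultdict(set)
    _ = nodes["proxy"]       # side effect: inserts an empty set for "proxy"

    plain = {}
    # Explicit existence check: nothing is created for missing node types.
    count = len(plain["proxy"]) if "proxy" in plain else 0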
This reflects the `spicy-plugin` code as of `d8c296b81cc2a11`.
In addition to moving the code into Zeek's source tree, this comes
with a couple small functional changes:
- `spicyz` no longer tries to infer if it's running from the build
directory. Instead `ZEEK_SPICY_LIBRARY` can be set to a custom
location. `zeek-set-path.sh` does that now.
- ZEEK_CONFIG can be set to change what `spicyz -z` prints out. This is
primarily for backwards compatibility.
Some further notes on specifics:
- We raise the minimum Spicy version to 1.8 (i.e., current `main`
branch).
- Renamed the `compiler/` subdirectory to `spicyz` to avoid
include-path conflicts with the Spicy headers.
- In `cmake/`, the corresponding PR brings a new/extended version of
`FindZeek`, which Spicy analyzer packages need. We also now install
some of the files that the Spicy plugin used to bring for testing,
so that existing packages keep working.
- For now, this all remains backwards compatible with the current
`zkg` analyzer templates so that they work with both external and
integrated Spicy support. Later, once we don't need to support any
external Spicy plugin versions anymore, we can clean up the
templates as well.
- All the plugin's tests have moved into the standard test suite. They
are skipped if configured with `--disable-spicy`.
This holds off on adapting the new code further to Zeek's coding
conventions, so that it remains easier to maintain it in parallel to
the (now legacy) external plugin. We'll make a pass over the
formatting for (presumably) Zeek 6.1.
Issue #3028 tracks how flipping a connection resets the connection's value,
including any state set during new_connection(). For the time being,
revert the community-id functionality back to the original connection_state_remove()
approach to avoid missing community_ids on flipped connections.
* amazing-pp/topic/fupeng/from_json_bif:
Implement from_json bif
Minor updates during merge: Moved ValFromJSON into zeek::detail for the
time being, removed gotos, normalized some error messages to lower case,
minimally extended a test, added a raw reader input framework test reading
"json lines" as a demo, and added notes about the implicit type
conversions.
When multiple loggers are configured in a Supervisor controlled cluster
configuration, encode extra information into the rotated filename to
identify which logger produced the log.
This is similar to the approach taken for ZeekControl, re-using the
log_suffix terminology, but as there's only a single zeek-archiver
process, no postprocessors, and no other side channel for additional
information, we encode extra metadata into the filename. zeek-archiver
is extended to recognize the special metadata part of the filename.
This also solves the issue that multiple loggers in a supervisor setup
overwrite each other's log files within a single log-queue directory.
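Purely for illustration, the idea looks like this; the actual filename
format is defined by Zeek and zeek-archiver, not by this sketch:

    # Encode which logger produced a rotated log into the filename so
    # that multiple loggers sharing one log-queue directory cannot
    # collide.
    def rotated_name(stream, log_suffix, ts_from, ts_to):
        return f"{stream}__{log_suffix}__{ts_from}__{ts_to}.log"

    print(rotated_name("conn", "logger-1",
                       "2023-05-23-12-00-00", "2023-05-23-13-00-00"))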
* origin/topic/awelzel/smb2-state-handling:
NEWS: Add entry about SMB::max_pending_messages and state discarding
scripts/smb2-main: Reset script-level state upon smb2_discarded_messages_state()
smb2: Limit per-connection read/ioctl/tree state
DTLSv1.3 changes the DTLS record format, introducing a completely new
header, which is a first for DTLS.
We don't currently completely parse this header, as this requires a bit
more statekeeping. This will be added in a future revision. This also
has few practical implications.
* topic/johanna/no-error-message-durning-tls-or-dtls-protocol-violations:
SSL: failing analyzer handling - address review feedback
SSL: do not try to disable failed analyzer
Also folds in minor feedback from GH-3012
It turns out that we never logged hello retry requests correctly in the
ssl_history field.
Hello retry requests are (in their final version) signaled by a specific
random value in the server random.
This commit fixes this oversight, and hello retry requests are now
correctly logged as such.
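For reference, that value is the fixed constant from RFC 8446,
section 4.1.3: the SHA-256 digest of the string "HelloRetryRequest".
A check looks roughly like:

    import hashlib

    # A ServerHello whose random equals this constant is a hello retry
    # request.
    HRR_RANDOM = hashlib.sha256(b"HelloRetryRequest").digest()

    def is_hello_retry_request(server_random: bytes) -> bool:
        return server_random == HRR_RANDOM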
Currently, if a TLS/DTLS analyzer fails with a protocol violation, we
will still try to remove the analyzer later, which results in the
following error message:

    error: connection does not have analyzer specified to disable

Now we instead no longer try to remove the analyzer after a
violation has occurred.