Repeating the message for every new call to get_file_handle() is not
very useful. It's pretty much an analyzer configuration issue, so logging
it once should be enough.
OSS-Fuzz generated traffic containing a CWD command with a single very large
path argument (427KB) starting with ".___/` \x00\x00...". This is followed
by a large number of FTP replies with code 250. The directory logic in
ftp_reply() would match every incoming reply with the one pending CWD command,
triggering path buildup that ended up around 120MB in size.
Protect against re-using a directory command by setting a flag in the
CmdArg record once it has been consumed by the path traversal logic.
This doesn't prevent unbounded path build-up in general, but it does prevent
the amplification of a single large command by very many small ftp_reply
events. Re-using a pending path command seems like a bug as well.
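A minimal script-level sketch of the approach; the field name added to the
record is an assumption here:

redef record FTP::CmdArg += {
    ## Hypothetical field: set once this command's argument has been
    ## consumed by the directory-tracking logic in ftp_reply().
    dir_consumed: bool &default=F;
};

The directory logic would then set the flag the first time it consumes the
pending CWD argument and ignore later replies mapping to the same command.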
* jgras/topic/jgras/cluster-active-node-count-fix:
Fix get_active_node_count for node types not present.
Changed to an explicit existence check to avoid creating a set()
on missed lookups.
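As a sketch of the pattern (names here are illustrative, not the actual
cluster script):

# Check for the node type explicitly instead of doing a lookup that
# would have to materialize an empty set() for absent types.
global nodes_by_type: table[Cluster::NodeType] of set[string];

function active_node_count(node_type: Cluster::NodeType): count
    {
    if ( node_type !in nodes_by_type )
        return 0;

    return |nodes_by_type[node_type]|;
    }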
This reflects the `spicy-plugin` code as of `d8c296b81cc2a11`.
In addition to moving the code into Zeek's source tree, this comes
with a couple small functional changes:
- `spicyz` no longer tries to infer whether it's running from the build
directory. Instead `ZEEK_SPICY_LIBRARY` can be set to a custom
location. `zeek-set-path.sh` does that now.
- `ZEEK_CONFIG` can be set to change what `spicyz -z` prints out. This is
primarily for backwards compatibility.
Some further notes on specifics:
- We raise the minimum Spicy version to 1.8 (i.e., current `main`
branch).
- Renamed the `compiler/` subdirectory to `spicyz` to avoid
include-path conflicts with the Spicy headers.
- In `cmake/`, the corresponding PR brings a new/extended version of
`FindZeek`, which Spicy analyzer packages need. We also now install
some of the files that the Spicy plugin used to bring for testing,
so that existing packages keep working.
- For now, this all remains backwards compatible with the current
`zkg` analyzer templates so that they work with both external and
integrated Spicy support. Later, once we don't need to support any
external Spicy plugin versions anymore, we can clean up the
templates as well.
- All the plugin's tests have moved into the standard test suite. They
are skipped if Zeek is configured with `--disable-spicy`.
This holds off on adapting the new code further to Zeek's coding
conventions, so that it remains easier to maintain it in parallel to
the (now legacy) external plugin. We'll make a pass over the
formatting for (presumably) Zeek 6.1.
* amazing-pp/topic/fupeng/from_json_bif:
Implement from_json bif
Minor updates during merge: Moved ValFromJSON into zeek::detail for the
time being, removed gotos, normalized some error messages to lower case,
extended a test minimally, added a raw reader input framework test reading
"JSON lines" as a demo, and added notes about the implicit type
conversions.
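A rough usage sketch; the exact signature and error handling of the bif may
differ from what's assumed here:

type Pair: record {
    x: count;
    msg: string;
};

event zeek_init()
    {
    # Assumed call form: from_json(text, type), returning an "any"
    # that can be type-checked and cast.
    local v = from_json("{\"x\": 42, \"msg\": \"hi\"}", Pair);

    if ( v is Pair )
        print (v as Pair)$x, (v as Pair)$msg;
    }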
When multiple loggers are configured in a Supervisor controlled cluster
configuration, encode extra information into the rotated filename to
identify which logger produced the log.
This is similar to the approach taken for ZeekControl, re-using the
log_suffix terminology, but as there's only a single zeek-archiver
process and no postprocessors and no other side-channel for additional
information, we encode extra metadata into the filename. zeek-archiver
is extended to recognize the special metadata part of the filename.
This also solves the issue that multiple loggers in a supervisor setup
overwrite each other's log files within a single log-queue directory.
* origin/topic/awelzel/smb2-state-handling:
NEWS: Add entry about SMB::max_pending_messages and state discarding
scripts/smb2-main: Reset script-level state upon smb2_discarded_messages_state()
smb2: Limit per-connection read/ioctl/tree state
DTLSv1.3 changes the DTLS record format, introducing a completely new
header, which is a first for DTLS.
We don't currently parse this header completely, as that requires a bit
more statekeeping. This will be added in a future revision. It also
has little practical impact for now.
* topic/johanna/no-error-message-durning-tls-or-dtls-protocol-violations:
SSL: failing analyzer handling - address review feedback
SSL: do not try to disable failed analyzer
Also folds in minor feedback from GH-3012
It turns out that we never logged hello retry requests correctly in the
ssl_history field.
Hello retry requests are (in their final version) signaled by a specific
random value in the server random.
This commit fixes this oversight, and hello retry requests are now
correctly logged as such.
Currently, if a TLS/DTLS analyzer fails with a protocol violation, we
will still try to remove the analyzer later, which results in the
following error message:
error: connection does not have analyzer specified to disable
Now we no longer try to remove the analyzer after a
violation has occurred.
This is similar to what the external corelight/zeek-smb-clear-state script
does, but leverages the smb2_discarded_messages_state() event instead of
regularly checking on the state of SMB connections.
The pcap was created using the dperson/samba container image and mounting
a share with Linux's CIFS filesystem, then copying the content of a
directory with 100 files. The test uses a BPF filter to imitate mostly
"half-duplex" traffic.
Users on Slack observed memory growth in an environment with a lot of
SMB traffic. jeprof memory profiling pointed at the offset and fid maps
kept per-connection for smb2 read requests.
These maps can grow unbounded if responses are seen before requests,
packets are dropped, only one side of the connection is visible, or we
fail to parse responses properly.
Forcefully wipe out these maps when they grow too large and raise
smb2_discarded_messages_state() to notify script land about this.
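A hedged script-land sketch of reacting to the new event; the event's exact
signature and the SMB state fields touched here are assumptions:

event smb2_discarded_messages_state(c: connection, state: string)
    {
    # "state" indicates which per-connection map was wiped. Reset any
    # script-level bookkeeping tied to it; pending_cmds is just one
    # illustrative candidate.
    if ( c?$smb_state && c$smb_state?$pending_cmds )
        c$smb_state$pending_cmds = table();
    }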
For low-level packet analysis use cases, these fields are currently
not accessible from script-land via raw_packet() and similar events. They
are accessible on the icmp_context record, but not on the actual ip4_hdr
record, so add them there.
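For reference, script-land access then looks like the following; only
long-standing ip4_hdr fields are shown, the newly added ones appear
alongside them:

event raw_packet(p: raw_pkt_hdr)
    {
    if ( p?$ip )
        # p$ip is the parsed IPv4 header (ip4_hdr).
        print p$ip$src, p$ip$dst, p$ip$ttl, p$ip$id;
    }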
* origin/topic/awelzel/zeekctl-multiple-loggers:
NEWS: Add entry for ZeekControl and multi-loggers
Bump zeekctl to multi-logger version
logging: Support rotation_postprocessor_command_env
This also fixes the GRE analyzer to forward into the IEEE 802.11 analyzer
if it encounters Aruba packets with the proper protocol types. This way
the QoS header can be handled correctly.
* 'topic/amazingpp/irc-fuid-missing' of github.com:AmazingPP/zeek:
Add irc_dcc_send_ack event and fix missing fields
I've moved IRC_Data back into the zeek::analyzer::file namespace, but
the declaration did move from protocol/file/File.h to protocol/irc/IRC.h.
If someone actually customized IRC_Data and didn't include protocol/irc/IRC.h
for other reasons, I'll be surprised (and would simply suggest updating the
include).
* origin/topic/awelzel/add-community-id:
testing/external: Bump hashes for community_id addition
NEWS: Add entry for Community ID
policy: Import zeek-community-id scripts into protocols/conn frameworks/notice
Add community_id_v1() based on corelight/zeek-community-id
"Community ID" has become an established flow hash for connection correlation
across different monitoring and storage systems. Other NSMs have had native
and built-in support for Community ID since late 2018. And even though the
roots of "Community ID" are very close to Zeek, Zeek itself has never provided
out-of-the-box support and instead required users to install an external plugin.
While we try to make that installation as easy as possible, an external plugin
always sets the bar higher for an initial setup and can be intimidating.
It also requires a rebuild operation of the plugin during upgrades. Nothing
overly complicated, but somewhat unnecessary for such popular functionality.
This isn't a 1:1 import. The plugin's options have become function
parameters, and the "verbose" functionality has been removed. Further,
instead of a `connection` record, the new bif works with `conn_id`,
allowing computation of the hash with little effort on the command line:
$ zeek -e 'print community_id_v1([$orig_h=1.2.3.4, $orig_p=1024/tcp, $resp_h=5.6.7.8, $resp_p=80/tcp])'
1:RcCrCS5fwYUeIzgDDx64EN3+okU
Reference: https://github.com/corelight/zeek-community-id/
This set contains the topics to reach all cluster nodes. Due to broker's
forwarding mechanism, we cannot define a single broadcast topic, as it
would create routing loops.
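Illustrative use of such a set; the name Cluster::broadcast_topics and the
helper function are assumptions:

global my_message: event(msg: string);

# Publish to each per-node topic in the set rather than to a single
# (loop-prone) broadcast topic.
function broadcast_to_cluster(msg: string)
    {
    for ( topic in Cluster::broadcast_topics )
        Broker::publish(topic, my_message, msg);
    }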
This new table provides a mechanism to add environment variables to the
postprocessor execution. The use case comes from ZeekControl, which injects
a suffix to be used when running with multiple loggers.
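A hedged example of filling such a table; the module prefix and the
environment variable name are assumptions:

# Assumed to be a redef-able table[string] of string that is added to the
# environment when the rotation postprocessor command runs.
redef Log::rotation_postprocessor_command_env += {
    ["MY_LOG_SUFFIX"] = "logger-1",
};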
While working on a rotation format function, I ran into Zeek crashing
when the function did not return a value. Fix this and recover the same
way as for scripting errors.
* security/topic/awelzel/152-smtp-validate-mail-transactions:
smtp: Validate mail transaction and disable SMTP analyzer if excessive
generic-analyzer-fuzzer: Detect disable_analyzer() from scripts
* security/topic/awelzel/148-ftp-skip-get-pending-commands-multi-line-response:
ftp/main: Special case for intermediate reply lines
ftp/main: Skip get_pending_command() for intermediate reply lines
The medium.trace in the private external test suite contains one
session/server that violates the multi-line reply protocol yet
happened to work out fairly well regardless, because we previously
looked up the pending commands unconditionally.
Continue to match up reply lines that "look like they contain status codes"
even if cont_resp = T. This still improves runtime for the OSS-Fuzz
generated test case and keeps the external baselines valid.
The affected session can be extracted as follows:
zcat Traces/medium.trace.gz | tcpdump -r - 'port 1491 and port 21'
We could push this into the analyzer, too; at a minimum, the RFC says:
> If an intermediary line begins with a 3-digit number, the Server
> must pad the front to avoid confusion.
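At the script level, the special case amounts to something like the sketch
below (simplified; treating a code of 0 as "no status code present" is an
assumption):

event ftp_reply(c: connection, code: count, msg: string, cont_resp: bool)
    {
    if ( cont_resp && code == 0 )
        # Intermediate line of a multi-line reply without a plausible
        # status code: don't look up or consume a pending command.
        return;

    # Otherwise the reply line gets matched against the pending commands
    # as before.
    }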