Commit graph

2772 commits

Author SHA1 Message Date
Arne Welzel
3f7881a57b segment_profiling: Remove SegmentProfiler and load_sample event
While it seems interesting functionality, this hasn't been documented,
maintained or knowingly leveraged for many years.

There are various other approaches today, too:

* We track the number of event handler invocations regardless of
  profiling. It's possible to approximate a load_sample event by
  comparing the result of two get_event_stats() calls. Or, visualize
  the corresponding counters in a Prometheus setup to get an idea of
  event/s broken down by event names.

* HookCallFunction() allows to intercept script execution, including
  measuring the time execution takes.

* The global call_stack and g_frame_stack can be used from plugins
  (and even external processes) to walk the Zeek script stack at certain
  points to implement a sampling profiler.

* USDT probes or more plugin hooks will likely be preferred over Zeek
  builtin functionality in the future.

Relates to #3458
2024-01-03 11:55:54 +01:00
Arne Welzel
fea8ee2260 smb: Fix &read_expire not in effect due to &default=string_set() usage
The SMB::State$recent_files field is meant to have expiring entries.
However, due to usage of &default=string_set(), the &read_expire
attribute is not respected causing unbounded state growth. Replace
&default=string_set() with &default=set().

Thanks to ya-sato on Slack for reporting!

Related: zeek/zeek-docs#179, #3513.
2023-12-17 15:02:05 +01:00
Arne Welzel
e3796894c6 logging: Do not keep delay state persistent
If Log::remove_stream() and Log::create_stream() is called for a stream,
do not restore the previously used max delay or max queue size.
2023-11-29 11:53:11 +01:00
Arne Welzel
fd096b1ce6 logging: delay documentation polishing
Based on PR feedback.
2023-11-29 11:53:11 +01:00
Arne Welzel
5e046eee58 logging/Manager: Implement DelayTokenType as an actual opaque
With a bit of tweaking in the JavaScript plugin to support opaque types, this
will allow the delay functionality to work there, too.

Making the LogDelayToken an actual opaque seems reasonable, too. It's not
supposed to be user inspected.
2023-11-29 11:53:11 +01:00
Arne Welzel
2dbb467ba2 logging: Implement get_delay_queue_size()
Primarily for introspection given that re-delaying may exceed
queue sizes.
2023-11-29 11:53:11 +01:00
Arne Welzel
f0e67022fd logging: Introduce Log::delay() and Log::delay_finish()
This is a verbose, opinionated and fairly restrictive version of the log delay idea.
Main drivers are explicitly, foot-gun-avoidance and implementation simplicity.

Calling the new Log::delay() function is only allowed within the execution
of a Log::log_stream_policy() hook for the currently active log write.

Conceptually, the delay is placed between the execution of the global stream
policy hook and the individual filter policy hooks. A post delay callback
can be registered with every Log::delay() invocation. Post delay callbacks
can (1) modify a log record as they see fit, (2) veto the forwarding of the
log record to the log filters and (3) extend the delay duration by calling
Log::delay() again. The last point allows to delay a record by an indefinite
amount of time, rather than a fixed maximum amount. This should be rare and
is therefore explicit.

Log::delay() increases an internal reference count and returns an opaque
token value to be passed to Log::delay_finish() to release a delay reference.
Once all references are released, the record is forwarded to all filters
attached to a stream when the delay completes.

This functionality separates Log::log_stream_policy() and individual filter
policy hooks. One consequence is that a common use-case of filter policy hooks,
removing unproductive log records, may run after a record was delayed. Users
can lift their filtering logic to the stream level (or replicate the condition
before the delay decision). The main motivation here is that deciding on a
stream-level delay in per-filter hooks is too late. Attaching multiple filters
to a stream can additionally result in hard to understand behavior.

On the flip side, filter policy hooks are guaranteed to run after the delay
and can be used for further mangling or filtering of a delayed record.
2023-11-29 11:53:11 +01:00
Johanna Amann
7c0f325d1b TLS: Update cipher consts and keyexchange parsing
Update cipher consts.

Furthermore some past updates have been applied to scriptland, but it
was not considered that some of these also have to be applied to binpac
code, to be able to correcly parse the ServerKeyExchange message.

(As a side-note - this was discovered due to a test discrepancy with the
Spicy parser)
2023-11-27 16:22:24 +00:00
Arne Welzel
37113b4de6 frameworks/software: Fix stale value used for stripping
There was some confusion around which value was used subsequent to a strip(),
but sub not respecting anchors make it appear to work. Also seems that the
`\(?` part seems redundant.
2023-11-17 14:37:28 +01:00
Arne Welzel
398122206e EventRegistry: Deprecate UsedHandlers() and UnusedHandlers()
and check_for_unused_event_handlers: UsageAnalyzer is more thorough
and the previous ones weren't extended to work with &is_used and
should probably be considered superseded by the UsageAnalyzer even
if that currently does not provide a public API and just prints
out deprecation warnings.

I'm also tempted to deprecate SetUsed() and Used() of EventHandler
for the same reason.

Closes #3187.
2023-11-07 16:06:17 +01:00
Arne Welzel
cd24acdfc8 time machine: Mark leftovers for removal in v7.1
I suspect we could just drop these directly, but lets follow the
deprecation cycle.
2023-11-07 16:06:16 +01:00
Arne Welzel
d88b147ac9 cluster: Deprecate the Cluster::Node$interface field
This field isn't required by a worker and it's certainly not used by a
worker to listen on that specific interface. It also isn't required to
be set consistently and its use in-tree limited to the old load-balancing
script.

There's a bif called packet_source() which on a worker will provide
information about the actually used packet source.

Relates to zeek/zeek#2877.
2023-11-07 16:06:16 +01:00
Johanna Amann
ff27eb5a69 SSL: Add new extension types and ECH test
This commit adds a multitude of new extension types that were added in
the last few years; it also adds grease values to extensions, curves,
and ciphersuites.

Furthermore, it adds a test that contains a encrypted-client-hello
key-exchange (which uses several extension types that we do not have in
our baseline so far).
2023-10-30 14:19:16 +00:00
Tim Wojtulewicz
091c849abe Merge remote-tracking branch 'security/topic/awelzel/200-pop-fuzzer-timeout'
* security/topic/awelzel/200-pop-fuzzer-timeout:
  ssl: Prevent unbounded ssl_history growth
  ssl: Cap number of alerts parsed from SSL record
2023-10-27 11:04:03 -07:00
Tim Wojtulewicz
d9534f687a Merge remote-tracking branch 'security/topic/awelzel/196-ftp-timeout-smaller-fix'
* security/topic/awelzel/196-ftp-timeout-smaller-fix:
  Update baselines
  ftp: Do not base seq on number of pending commands
2023-10-27 11:03:54 -07:00
Arne Welzel
560f8a4a84 ssl: Prevent unbounded ssl_history growth
The ssl_history field may grow unbounded (e.g., ssl_alert event). Prevent this
by capping using a configurable limit (default 100) and raise a weird once reached.
2023-10-25 09:35:45 +02:00
Arne Welzel
c960d279a2 ssl: Cap number of alerts parsed from SSL record
Limit the number of events raised from an SSL record with content_type
alert (21) to a configurable maximum number (default 10). For TLS 1.3,
the limit is set to 1 as specified in the RFC. Add a new weird cases
where the limit is exceeded.

OSS-Fuzz managed to generate a reproducer that raised ~660k ssl_plaintext
and ssl_alert events given ~810kb of input data. This change prevents this
with hopefully no negative side-effect in the real-world.
2023-10-25 09:35:10 +02:00
Arne Welzel
ce4cbac1ef ftp: Do not base seq on number of pending commands
Previously, seq was computed as the result of |pending_commands|+1. This
opened the possibility to override queued commands, as well as logging
the same pending ftp reply multiple times.

For example, when commands 1, 2, 3 are pending, command 1 may be dequeued,
but the incoming command then receives seq 3 and overrides the already
pending command 3. The second scenario happens when ftp_reply() selected
command 3 as pending for logging, but is then followed by many ftp_request()
events. This resulted in command 3's response being logged for every
following ftp_request() over and over again.

Avoid both scenarios by tracking the command sequence as an absolute counter.
2023-10-24 19:10:07 +02:00
Arne Welzel
54a08a74da base/frameworks/spicy: Do not load base/misc/version
Unsure what it's used for today and also results in the situation that on
some platforms we generate a reporter.log in bare mode, while on others
where spicy is disabled, we do not.

If we want base/frameworks/version loaded by default, should put it into
init-bare.zeek and possibly remove the loading of the reporter framework
from it - Reporter::error() would still work and be visible on stderr,
just not create a reporter.log.
2023-10-24 13:15:21 +02:00
Arne Welzel
688d68cbf6 zeek.bif: Switch mmdb stale check to network_time
Makes testing easier and aligns better with log rotation and timer
expiration. Should not have an effect in practice. Also, log detail
about whether inode or modification time changed, too.
2023-10-24 11:11:00 +02:00
Arne Welzel
6604010a05 quic: Bump maximum history length, make configurable
From zeek/spicy-quic#15
2023-10-20 20:42:30 +02:00
Arne Welzel
a503c2a672 Merge remote-tracking branch 'origin/topic/awelzel/quic-ldap-event-prototypes'
* origin/topic/awelzel/quic-ldap-event-prototypes:
  ldap: Use longer event names
  ldap: Add spicy-events.zeek
  quic: Add spicy-events.zeek
2023-10-19 11:08:36 +02:00
Arne Welzel
e1864ec131 ldap: Use longer event names
It's unusual to compress and shorten event names of protocol analyzers,
switch to a slightly longer name instead.
2023-10-19 10:49:19 +02:00
Arne Welzel
fb31ad0c6e ldap: Add spicy-events.zeek 2023-10-19 10:48:34 +02:00
Arne Welzel
2389f6f6c5 quic: Add spicy-events.zeek 2023-10-19 10:48:24 +02:00
Tim Wojtulewicz
6d9d4523bc Add registration for GRE-over-UDP 2023-10-16 11:42:24 -07:00
Arne Welzel
007bcefd09 Merge remote-tracking branch 'origin/topic/awelzel/2326-import-quic'
* origin/topic/awelzel/2326-import-quic:
  ci/btest: Remove spicy-quic helper, disable Spicy on CentOS 7
  btest/core/ppp: Run test in bare mode
  btest/quic: Update other tests
  testing/quic: Fixups and simplification after Zeek integration
  quic: Integrate as default analyzer
  quic: Include Copyright lines to the analyzer's source code contributed by Fox-IT
  quic: Squashed follow-ups: quic.log, tests, various fixes, performance
  quic: Initial implementation
2023-10-11 18:05:14 +02:00
Arne Welzel
94a8cf2a09 Merge remote-tracking branch 'origin/topic/awelzel/pcap-reading-configurable-buffer'
* origin/topic/awelzel/pcap-reading-configurable-buffer:
  iosource/pcap: Support configurable buffer size
  util/setvbuf: Respect buf argument
2023-10-11 15:20:17 +02:00
Arne Welzel
ee827eecf7 quic: Integrate as default analyzer 2023-10-11 14:10:22 +02:00
Arne Welzel
359f8d2ae6 quic: Squashed follow-ups: quic.log, tests, various fixes, performance 2023-10-11 14:10:22 +02:00
Joost
44d7c45723 quic: Initial implementation 2023-10-11 14:10:22 +02:00
Arne Welzel
72df1a0216 Merge remote-tracking branch 'origin/topic/bbannier/issue-3234'
* origin/topic/bbannier/issue-3234:
  Introduce dedicated `LDAP::Info`
  Remove redundant storing of protocol in LDAP logs
  Use LDAP `RemovalHook` instead of implementing `connection_state_remove`
  Tidy up LDAP code by using local references
  Pluralize container names in LDAP types
  Move LDAP script constants to their own file
  Name `LDAP::Message` and `LDAP::Search` `*Info`
  Make ports for LDAP analyzers fully configurable
  Require have-spicy for tests which log spicy-ldap information
  Fix LDAP analyzer setup for when Spicy analyzers are disabled
  Bump zeek-testing-private
  Integrate spicy-ldap test suite
  Move spicy-ldap into Zeek protocol analyzer tree
  Explicitly use all of spicy-ldap's modules
  Explicitly list `asn1.spicy` as spicy-ldap source
  Remove uses of `zeek` module in spicy-ldap
  Fix typos in spicy-ldap
  Remove project configuration files in spicy-ldap
  Integrate spicy-ldap into build
  Import zeek/spicy-ldap@57b5eff988
2023-10-10 20:07:03 +02:00
Benjamin Bannier
346d2c49a9 Introduce dedicated LDAP::Info 2023-10-10 18:49:25 +02:00
Benjamin Bannier
301d8722bf Remove redundant storing of protocol in LDAP logs 2023-10-10 18:49:25 +02:00
Benjamin Bannier
82b3a4048f Use LDAP RemovalHook instead of implementing connection_state_remove 2023-10-10 18:49:25 +02:00
Benjamin Bannier
1d4412a9e7 Tidy up LDAP code by using local references 2023-10-10 18:49:25 +02:00
Benjamin Bannier
3a60a60619 Pluralize container names in LDAP types 2023-10-10 18:49:25 +02:00
Benjamin Bannier
0c126f3c6b Move LDAP script constants to their own file 2023-10-10 18:28:13 +02:00
Benjamin Bannier
c43bc52e18 Name LDAP::Message and LDAP::Search *Info 2023-10-10 18:28:13 +02:00
Benjamin Bannier
9b02b93889 Make ports for LDAP analyzers fully configurable
This moves the ports the LDAP analyzers should be triggered on from the
EVT file to the Zeek module. This gives users full control over which
ports the analyzers are registered for while previously they could only
register them for additional ports (there is no Zeek script equivalent
of `Manager::UnregisterAnalyzerForPort`).

The analyzers could still be triggered via DPD, but this is intentional.
To fully disable analyzers users can use e.g.,

```zeek
event zeek_init()
    {
    Analyzer::disable_analyzer(Analyzer::ANALYZER_LDAP_TCP);
    }
```
2023-10-10 18:28:13 +02:00
Arne Welzel
7fac5837c3 iosource/pcap: Support configurable buffer size
On Linux with a default ext4 or tmpfs filesystem, the default buffer size for
reading a pcap is chosen as 4k (strace/gdb validated). When reading large pcaps
containing raw data transfers, the syscall overhead for read becomes visible
in profiles. Support configurability of the buffer size and default to 128kb.

When processing a ~830M PCAP (16 UDP connections, each transferring ~50MB) in
bare mode, this change improves runtime from 1.39 sec to 1.29 sec. Increasing
the buffer further didn't provide a noticeable boost.
2023-10-10 15:08:51 +02:00
Benjamin Bannier
53d4052d68 Fix LDAP analyzer setup for when Spicy analyzers are disabled 2023-10-10 09:21:57 +02:00
Benjamin Bannier
f172febbcb Move spicy-ldap into Zeek protocol analyzer tree 2023-10-10 09:21:57 +02:00
Johanna Amann
e18edfa452 Add extract_limit_includes_missing option for file extraction
Setting this option to false does not count missing bytes in files towards the
extraction limits, and allows to extract data up to the desired limit,
even when partial files are written.

When missing bytes are encountered, files are now written as sparse
files.

Using this option requires the underlying storage and utilities to support
sparse files.
2023-09-14 12:11:42 -07:00
Johanna Amann
9928f7efb7 File extraction: use fseek
In the past, we allocated a buffer with zeroes and wrote that with
fwrite. Now, instead we just fseek to the correct offset.

This changes the way in which the file extract limit is counted a bit;
skipped bytes do no longer count against the file size limit.

(cherry picked from commit 5071592e9b7105090a1d9de19689c499070749d4)
2023-09-14 12:11:37 -07:00
Tim Wojtulewicz
5934e143aa Revert "Add extract_limit_includes_missing option for file extraction"
This reverts commit f4d0fdcd5c.
2023-09-14 12:10:40 -07:00
Johanna Amann
f4d0fdcd5c Add extract_limit_includes_missing option for file extraction
Setting this option to false does not count missing bytes in files towards the
extraction limits, and allows to extract data up to the desired limit,
even when partial files are written.

When missing bytes are encountered, files are now written as sparse
files.

Using this option requires the underlying storage and utilities to support
sparse files.

(cherry picked from commit afa6f3a0d3b8db1ec5b5e82d26225504c2891089)
2023-09-12 12:00:36 -07:00
Arne Welzel
b2c40a22cb ftp: Do not log non-pending commands
OSS Fuzz generated a CWD request and reply followed by very many EPRT
requests. This caused Zeek to re-log the CWD request and invoke `build_url_ftp()`
over and over again resulting in long processing times.

Avoid this scenario by not logging commands that aren't pending anymore.

(cherry picked from commit b05dd31667ff634ec7d017f09d122f05878fdf65)
2023-09-12 12:00:36 -07:00
Arne Welzel
f6e7ea43c3 http/smtp: Fix wrong character class usage
A call to `extract_filename_from_content_disposition()` is only
efficient if the string is guaranteed to contain the pattern that
is removed by `sub()`. Due to missing brackets around the `[:blank:]`
character class, an overly long string (756kb) ending in
"Type:dtanameaa=" matched the wrong pattern causing `sub()` to
exhibit quadratic runtime. Besides that, we may have potentially
extracted wrong information from a crafted header value.

(cherry picked from commit 6d385b1ca724a10444865e4ad38a58b31a2e2288)
2023-09-12 12:00:36 -07:00
Arne Welzel
14a2c02f9d Merge remote-tracking branch 'origin/topic/awelzel/1705-http-pending-requests'
* origin/topic/awelzel/1705-http-pending-requests:
  http: Prevent request/response de-synchronization and unbounded state growth
2023-09-01 11:54:10 +02:00