Commit graph

253 commits

Author SHA1 Message Date
Christian Kreibich
66173633f4 Merge branch 'topic/christian/telemetry-make-bifs-primary'
* topic/christian/telemetry-make-bifs-primary:
  Telemetry framework: move BIFs to the primary-bif stage
  Minor comment tweaks for init-frameworks-and-bifs.zeek
2024-10-24 07:09:16 -07:00
Arne Welzel
daa358c840 Merge remote-tracking branch 'origin/topic/awelzel/3947-telemetry-hook-scrape'
* origin/topic/awelzel/3947-telemetry-hook-scrape:
  btest/telemetry: Fix "Note compilable" typo
  misc/stats: Add zeek_net_timestamp_seconds
  telemetry/Manager: Remove variant include
  telemetry: Invoke Telemetry::sync() only at scrape/collection time
2024-10-22 19:04:51 +02:00
Arne Welzel
70872673a1 telemetry: Invoke Telemetry::sync() only at scrape/collection time
This stops invoking Telemetry::sync() via a scheduled event and instead
only invokes it on-demand. This makes metric collection network time
independent and lazier, too.

With Prometheus scrape requests being processed on Zeek's main thread
now, we can safely invoke the script layer Telemetry::sync() hook.

Closes #3947
2024-10-22 18:49:11 +02:00
Christian Kreibich
71f7e89974 Telemetry framework: move BIFs to the primary-bif stage
This moves the Telemetry framework's BIF-defined functionalit from the
secondary-BIFs stage to the primary one. That is, this functionality is now
available from the end of init-bare.zeek, not only after the end of
init-frameworks-and-bifs.zeek.

This allows us to use script-layer telemetry in our Zeek's own code that get
pulled in during init-frameworks-and-bifs.

This change splits up the BIF features into functions, constants, and types,
because that's the granularity most workable in Func.cc and NetVar. It also now
defines the Telemetry::MetricsType enum once, not redundantly in BIFs and script
layer.

Due to subtle load ordering issues between the telemetry and cluster frameworks
this pushes the redef stage of Telemetry::metrics_port and address into
base/frameworks/telemetry/options.zeek, which is loaded sufficiently late in
init-frameworks-and-bifs.zeek to sidestep those issues. (When not doing this,
the effect is that the redef in telemetry/main.zeek doesn't yet find the
cluster-provided values, and Zeek does not end up listening on these ports.)

The need to add basic Zeek headers in script_opt/ZAM/ZBody.cc as a side-effect
of this is curious, but looks harmless.

Also includes baseline updates for the usual btests and adds a few doc strings.
2024-10-18 09:56:29 -07:00
Benjamin Bannier
cfd66ec6f3 Fix invalid Sphinx directive in docstring
Use of `:zeek::see:..` instead of `:zeek:see:..` caused a Sphinx build
failure which prevented automatic regeneration of docs.
2024-10-15 12:47:39 +02:00
Arne Welzel
0d925e935e logging: Dedicated log flush timer
Log flushing is currently triggered based on the threading heartbeat timer
of WriterBackends and the hard-coded WRITE_BUFFER_SIZE 1000.

This change introduces a separate timer that is managed by the logger
manager instead of piggy-backing on the heartbeat timer, as well as a
const &redef for the buffer size.

This allows to modify the log flush frequency and batch size independently
of the threading heartbeat interval. Later, this will allow to re-use the
buffering and flushing logic of writer frontends for non-Broker cluster
backends, too.

One change here is that even frontends that do not have a backend will
be flushed regularly. This is wanted for non-Broker backends and should be
very cheap. Possibly, Broker can piggy back on this timer down the road, too,
rather than using its own script-level timer (see Broker::log_flush()).
2024-09-27 15:30:35 +02:00
Arne Welzel
cf9fe91705 pop3: Prevent unbounded state growth
The cmds list may grow unbounded due to the POP3 analyzer being in
multiLine mode after seeing `AUTH` in a Redis connection, but never
a `.` terminator. This can easily be provoked by the Redis ping
command.

This adds two heuristics: 1) Forcefully process the oldest commands in
the cmds list and cap it at max_pending_commands. 2) Start raising
analyzer violations if the client has been using more than
max_unknown_client_commands commands (default 10).

Closes #3936
2024-09-18 19:05:39 +02:00
Evan Typanski
170276807b Add DNS TKEY event 2024-08-16 10:20:42 -04:00
Tim Wojtulewicz
a716903f3a Remove deprecated time machine settings 2024-08-07 11:58:21 -07:00
Tim Wojtulewicz
e2b03681d1 Remove EventRegistry::Used and EventRegistry::SetUsed 2024-08-07 11:58:21 -07:00
Tim Wojtulewicz
7ac7ce1d2b Process metric callbacks from the main-loop thread
This avoids the callbacks from being processed on the worker thread
spawned by Civetweb. It fixes data race issues with lookups involving
global variables, amongst other threading issues.
2024-08-02 15:30:47 -07:00
Tim Wojtulewicz
99e64aa113 Restore label_names field in MetricOpts record 2024-06-04 14:14:58 -07:00
Tim Wojtulewicz
433c257886 Move telmetry label names out of opts records, into main metric records 2024-06-04 14:14:58 -07:00
Tim Wojtulewicz
46ff48c29a Change all instruments to only handle doubles 2024-05-31 13:36:37 -07:00
Tim Wojtulewicz
e195d3d778 Fix some determinism issues with btests 2024-05-31 13:30:31 -07:00
Tim Wojtulewicz
84aa308527 Rework everything to access the prometheus-cpp objects more directly 2024-05-31 13:30:31 -07:00
Tim Wojtulewicz
17d09c657b Move base types from telemetry framework to init-bare 2024-05-31 13:30:31 -07:00
Tim Wojtulewicz
6821a41c4e Move the options from policy/tuning/defaults to actual Zeek defaults, deprecate that package 2024-05-06 11:13:04 -07:00
Arne Welzel
c1a685a05d websocket: Add Spicy parser version, too.
The Spicy analyzer is added as a child analyzer when enabled and the
WebSocket.cc logic dispatches between the BinPac and Spicy version.

It substantially slower when tested against a somewhat artificial
2.4GB PCAP. The first flamegraph indicates that the unmask() function
stands out with 35% of all samples, and above it shared_ptr samples.
2024-02-06 17:29:55 +01:00
Arne Welzel
e17655be61 websocket: Verify Sec-WebSocket-Key/Accept headers and review feedback
Don't log them, they are random and arbitrary in the normal case. Users
can do the following to log them if wanted.

    redef += WebSocket::Info$client_key += { &log };
    redef += WebSocket::Info$server_accept += { &log };
2024-01-22 18:54:38 +01:00
Arne Welzel
efc2681152 WebSocket: Introduce new analyzer and log
This adds a new WebSocket analyzer that is enabled with the HTTP upgrade
mechanism introduced previously. It is a first implementation in BinPac with
manual chunking of frame payload. Configuration of the analyzer is sketched
via the new websocket_handshake() event and a configuration BiF called
WebSocket::__configure_analyzer(). In short, script land collects WebSocket
related HTTP headers and can forward these to the analyzer to change its
parsing behavior at websocket_handshake() time. For now, however, there's
no actual logic that would change behavior based on agreed upon extensions
exchanged via HTTP headers (e.g. frame compression). WebSocket::Configure()
simply attaches a PIA_TCP analyzer to the WebSocket analyzer for dynamic
protocol detection (or a custom analyzer if set). The added pcaps show this
in action for tunneled ssh, http and https using wstunnel. One test pcap is
Broker's WebSocket traffic from our own test suite, the other is the
Jupyter websocket traffic from the ticket/discussion.

This commit further adds a basic websocket.log that aggregates the WebSocket
specific headers (Sec-WebSocket-*) headers into a single log.

Closes #3424
2024-01-22 18:54:38 +01:00
Arne Welzel
8ebd054abc HTTP: Add mechanism to instantiate Upgrade analyzer
When a HTTP upgrade request/reply is detected, lookup an analyzer tag
from HTTP::upgrade_analyzers, or if nothing is found, attach PIA_TCP.
2024-01-22 18:54:38 +01:00
Tim Wojtulewicz
13fde341d2 Merge remote-tracking branch 'security/topic/awelzel/topic/awelzel/208-http-mime-nested-v2'
* security/topic/awelzel/topic/awelzel/208-http-mime-nested-v2:
  MIME: Cap nested MIME analysis depth to 100
2024-01-21 19:31:14 -07:00
Christian Kreibich
ae2fd8f171 Fix typo in docstring [skip ci] 2024-01-18 16:14:27 -08:00
Arne Welzel
2a858d252e MIME: Cap nested MIME analysis depth to 100
OSS-Fuzz managed to produce a MIME multipart message construction with
thousands of nested entities (or that's what Zeek makes out of it anyhow).
Prevent such deep analysis by capping at a nesting depth of 100,
preventing unnecessary resource usage. A new weird named exceeded_mime_max_depth
is reported when this limit is reached.

This change reduces the runtime of the OSS-Fuzz reproducer from ~45 seconds
to ~2.5 seconds.

The test PCAP was produced from a Python script using the email package
and sending the rendered version via POST to a HTTP server.

Closes #208
2024-01-17 10:18:13 -07:00
Arne Welzel
14949941ce SMTP: Add BDAT support
Closes #3264
2024-01-12 10:18:07 +01:00
Arne Welzel
ffffd88bef Merge remote-tracking branch 'origin/topic/christian/mmdb-configurability'
* origin/topic/christian/mmdb-configurability:
  Modernize various C++/Zeek-isms in the MMDB code.
  Fix MMDB code to re-open explicitly opened DBs correctly
  Add btest to verify behavior of re-opened MMDBs opened directly via BIFs
  Simplify MMDB code by moving more lookup functionality into MMDB class
  Move MMDB logic out of mmdb.bif and into MMDB.cc/h.
  Fix mmdb.temporary-error testcase when MMDBs are installed on system
  Adapt MMDB BiF code to new script-layer variables
  Update btest baselines to reflect introduction of mmdb.bif
  Move MaxMind/GeoIP BiF functionality into separate file
  Provide script-level configurability of MaxMind DB placement on disk
  Sort toplevel .bif list in CMakeLists
2024-01-12 09:28:36 +01:00
Arne Welzel
fddbdf6232 init-bare: Default Tunnel::max_depth to 4
In AWS GLB environments, the max_depth of 2 is easily reached due to packets
being encapsulated with GENEVE and VXLAN [1]. Any additional encapsulation
layer causes Zeek raise a weird and ignore the inner traffic. Bump the default
maximum depth to 4, while not common it's not unusual either to observe
this in the wild.

[1] https://docs.aws.amazon.com/vpc/latest/mirroring/traffic-mirroring-packet-formats.html

Closes #3439
2024-01-11 10:22:36 +01:00
Christian Kreibich
8406959ae2 Move MaxMind/GeoIP BiF functionality into separate file 2024-01-10 20:28:37 -08:00
Christian Kreibich
06642d185b Provide script-level configurability of MaxMind DB placement on disk
This lifts the list of fallback directories in which Zeek will look for Maxmind
DBs into the script layer, and makes the names of the DB files themselves
(previously hardwired) configurable as well.

This does not yet change the in-core code; that commit follows.
2024-01-10 20:14:24 -08:00
Arne Welzel
3f7881a57b segment_profiling: Remove SegmentProfiler and load_sample event
While it seems interesting functionality, this hasn't been documented,
maintained or knowingly leveraged for many years.

There are various other approaches today, too:

* We track the number of event handler invocations regardless of
  profiling. It's possible to approximate a load_sample event by
  comparing the result of two get_event_stats() calls. Or, visualize
  the corresponding counters in a Prometheus setup to get an idea of
  event/s broken down by event names.

* HookCallFunction() allows to intercept script execution, including
  measuring the time execution takes.

* The global call_stack and g_frame_stack can be used from plugins
  (and even external processes) to walk the Zeek script stack at certain
  points to implement a sampling profiler.

* USDT probes or more plugin hooks will likely be preferred over Zeek
  builtin functionality in the future.

Relates to #3458
2024-01-03 11:55:54 +01:00
Arne Welzel
398122206e EventRegistry: Deprecate UsedHandlers() and UnusedHandlers()
and check_for_unused_event_handlers: UsageAnalyzer is more thorough
and the previous ones weren't extended to work with &is_used and
should probably be considered superseded by the UsageAnalyzer even
if that currently does not provide a public API and just prints
out deprecation warnings.

I'm also tempted to deprecate SetUsed() and Used() of EventHandler
for the same reason.

Closes #3187.
2023-11-07 16:06:17 +01:00
Arne Welzel
cd24acdfc8 time machine: Mark leftovers for removal in v7.1
I suspect we could just drop these directly, but lets follow the
deprecation cycle.
2023-11-07 16:06:16 +01:00
Tim Wojtulewicz
091c849abe Merge remote-tracking branch 'security/topic/awelzel/200-pop-fuzzer-timeout'
* security/topic/awelzel/200-pop-fuzzer-timeout:
  ssl: Prevent unbounded ssl_history growth
  ssl: Cap number of alerts parsed from SSL record
2023-10-27 11:04:03 -07:00
Arne Welzel
c960d279a2 ssl: Cap number of alerts parsed from SSL record
Limit the number of events raised from an SSL record with content_type
alert (21) to a configurable maximum number (default 10). For TLS 1.3,
the limit is set to 1 as specified in the RFC. Add a new weird cases
where the limit is exceeded.

OSS-Fuzz managed to generate a reproducer that raised ~660k ssl_plaintext
and ssl_alert events given ~810kb of input data. This change prevents this
with hopefully no negative side-effect in the real-world.
2023-10-25 09:35:10 +02:00
Arne Welzel
688d68cbf6 zeek.bif: Switch mmdb stale check to network_time
Makes testing easier and aligns better with log rotation and timer
expiration. Should not have an effect in practice. Also, log detail
about whether inode or modification time changed, too.
2023-10-24 11:11:00 +02:00
Arne Welzel
7fac5837c3 iosource/pcap: Support configurable buffer size
On Linux with a default ext4 or tmpfs filesystem, the default buffer size for
reading a pcap is chosen as 4k (strace/gdb validated). When reading large pcaps
containing raw data transfers, the syscall overhead for read becomes visible
in profiles. Support configurability of the buffer size and default to 128kb.

When processing a ~830M PCAP (16 UDP connections, each transferring ~50MB) in
bare mode, this change improves runtime from 1.39 sec to 1.29 sec. Increasing
the buffer further didn't provide a noticeable boost.
2023-10-10 15:08:51 +02:00
Tim Wojtulewicz
1dc9235cee Pass parsed file record information with ReadFile/WriteFile events 2023-08-07 13:44:38 -07:00
Tim Wojtulewicz
18fd384469 Add length field from header to ModbusHeaders record type 2023-08-07 13:44:37 -07:00
Tim Wojtulewicz
f9904511ab Merge remote-tracking branch 'origin/topic/awelzel/3145-dcerpc-state-clean'
* origin/topic/awelzel/3145-dcerpc-state-clean:
  dce-rpc: Test cases for unbounded state growth
  dce-rpc: Handle smb2_close_request() in scripts
  smb/dce-rpc: Cleanup DCE-RPC analyzers when fid is closed and limit them
  dce-rpc: Do not repeatedly register removal hooks
2023-07-11 16:17:12 -07:00
Arne Welzel
0d6174a5d6 Remove icmp_conn leftovers
Roughly 2.5 years ago all events taking the ``icmp_conn`` parameter were
removed with 44ad614094 and the NetVar.cc
type not populated anymore.

Remove the left-overs in script land, too.
2023-07-04 17:57:20 +02:00
Arne Welzel
6517ed94f2 smb/dce-rpc: Cleanup DCE-RPC analyzers when fid is closed and limit them
This patch does two things:

1) For SMB close requests, tear down any associated DCE-RPC
   analyzer if one exists.

2) Protect from fid_to_analyzer_map growing unbounded by introducing a
   new SMB::max_dce_rpc_analyzers limit and forcefully wipe the
   analyzers if exceeded. Propagate this to script land as event
   smb_discarded_dce_rpc_analyzers() for additional cleanup.

This is mostly to fix how the binpac SMB analyzer tracks individual
DCE-RPC analyzers per open fid. Connections that re-open the same or
different pipe may currently allocate unbounded number of analyzers.

Closes #3145.
2023-06-30 15:14:32 +02:00
Arne Welzel
480d52ca1f from_json: Support function to normalize key names
When a JSON document contains key names containing colons or other
special characters that are not valid in Zeek identifiers, from_json()
cannot be used to parse such input.

This change allows a customizable normalization function.

Closes #3142.
2023-06-29 15:57:49 +02:00
Tim Wojtulewicz
9a79b98a1e Remove analyzer_confirmation/analyzer_violation events (6.1 deprecation) 2023-06-14 10:07:22 -07:00
Tim Wojtulewicz
4229af6820 Remove deprecations tagged for v6.1 2023-06-14 10:07:22 -07:00
Arne Welzel
0b0f6e509f Stmt: Rework assertion hooks break semantics
Using break in either of the hooks allows to suppress the default reporter
error message rather than suppressing solely based on the existence of an
assertion_failure() handler.
2023-06-13 16:18:01 +02:00
Arne Welzel
25ea678626 Stmt: Introduce assert statement and related hooks
including two hooks called assertion_failure() and assertion_result() for
customization and tracking of assertion results.
2023-06-12 18:16:02 +02:00
Robin Sommer
a62e153dd3
Do not load Spicy scripts if Spicy is not available. 2023-05-16 10:21:21 +02:00
Robin Sommer
0040111955
Integrate the Spicy plugin into Zeek proper.
This reflects the `spicy-plugin` code as of `d8c296b81cc2a11`.

In addition to moving the code into Zeek's source tree, this comes
with a couple small functional changes:

- `spicyz` no longer tries to infer if it's running from the build
  directory. Instead `ZEEK_SPICY_LIBRARY` can be set to a custom
  location. `zeek-set-path.sh` does that now.

- ZEEK_CONFIG can be set to change what `spicyz -z` print out. This is
  primarily for backwards compatibility.

Some further notes on specifics:

- We raise the minimum Spicy version to 1.8 (i.e., current `main`
  branch).

- Renamed the `compiler/` subdirectory to `spicyz` to avoid
  include-path conflicts with the Spicy headers.

- In `cmake/`, the corresponding PR brings a new/extended version of
  `FindZeek`, which Spicy analyzer packages need. We also now install
  some of the files that the Spicy plugin used to bring for testing,
  so that existing packages keep working.

- For now, this all remains backwards compatible with the current
  `zkg` analyzer templates so that they work with both external and
  integrated Spicy support. Later, once we don't need to support any
  external Spicy plugin versions anymore, we can clean up the
  templates as well.

- All the plugin's tests have moved into the standard test suite. They
  are skipped if configure with `--disable-spicy`.

This holds off on adapting the new code further to Zeek's coding
conventions, so that it remains easier to maintain it in parallel to
the (now legacy) external plugin. We'll make a pass over the
formatting for (presumable) Zeek 6.1.
2023-05-16 10:17:45 +02:00
Arne Welzel
264284150b Merge remote-tracking branch 'amazing-pp/topic/fupeng/from_json_bif'
* amazing-pp/topic/fupeng/from_json_bif:
  Implement from_json bif

Minor updates during merge: Moved ValFromJSON into zeek::detail for the
time being, removed gotos, normalized some error messages to lower case,
minimal test extension and added a raw reader input framework test reading
"json lines" as a demo, adding notes about the implicit type
conversions.
2023-05-09 10:36:58 +02:00