Seems reasonable given we log the server SCID. Interestingly, the Chromium
examples actually have zero-length (empty) source connection IDs. I wonder
if that's part of their "protocol ossification avoidance" effort.
This introduces a new hook into the Intel::seen() function that allows
users to directly interact with the result of a find() call via external
scripts.
This should solve the use-case brought up by @chrisanag1985 in
discussion #3256: Recording and acting on "no intel match found".
@Canon88 was recently asking on Slack about enabling HTTP logging for a
given connection only when an Intel match occurred, and found that the
Intel::match() event would only occur on the manager. The
Intel::match_remote() event might be a workaround, but it possibly runs a
bit too late, and it's also just an internal "detail" event that might not
be stable.
Another internal use case revolved around enabling packet recording
based on Intel matches, which necessarily needs to happen on the worker
where the match happened. The proposed workaround is similar to the above,
using Intel::match_remote().
This hook also provides an opportunity to rate-limit heavy-hitter intel
items locally on the worker nodes, or even to replace the event approach
currently used with a customized one.
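As a sketch, a worker-local reaction to a missed lookup could look as
follows (assuming the hook surfaces as Intel::seen_policy with signature
hook(s: Intel::Seen, found: bool); the body is purely illustrative):

    hook Intel::seen_policy(s: Intel::Seen, found: bool)
        {
        # Runs on the node that performed the find(), so worker-local
        # actions like packet recording could be triggered here.
        if ( ! found && s?$indicator )
            print fmt("no intel match for %s", s$indicator);
        }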
Don't log them; they are random and arbitrary in the normal case. Users
can do the following to log them if wanted:

    redef record WebSocket::Info$client_key += { &log };
    redef record WebSocket::Info$server_accept += { &log };
This adds a new WebSocket analyzer that is enabled via the HTTP upgrade
mechanism introduced previously. It is a first implementation in BinPac with
manual chunking of the frame payload. Configuration of the analyzer is
sketched via the new websocket_handshake() event and a configuration BiF
called WebSocket::__configure_analyzer(). In short, script land collects
WebSocket-related HTTP headers and can forward these to the analyzer to
change its parsing behavior at websocket_handshake() time. For now, however,
there's no actual logic that would change behavior based on agreed-upon
extensions exchanged via HTTP headers (e.g. frame compression).
WebSocket::Configure() simply attaches a PIA_TCP analyzer to the WebSocket
analyzer for dynamic protocol detection (or a custom analyzer, if set). The
added pcaps show this in action for tunneled SSH, HTTP and HTTPS using
wstunnel. One test pcap is Broker's WebSocket traffic from our own test
suite, the other is the Jupyter WebSocket traffic from the ticket/discussion.
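A rough sketch of the intended flow (the event name comes from this change;
its exact signature and the shape of the configuration record passed to
WebSocket::Configure() are assumptions here):

    event websocket_handshake(c: connection, aid: count)
        {
        # By now, script land has collected the Sec-WebSocket-* headers
        # and could adjust parsing based on them. This just applies the
        # default: PIA_TCP for dynamic protocol detection of the payload.
        WebSocket::Configure(c, aid, WebSocket::AnalyzerConfig());
        }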
This commit further adds a basic websocket.log that aggregates the
WebSocket-specific (Sec-WebSocket-*) headers into a single log.
Closes #3424
OSS-Fuzz managed to produce a MIME multipart message construction with
thousands of nested entities (or that's what Zeek makes of it, anyhow).
Prevent such deep analysis by capping the nesting depth at 100, avoiding
unnecessary resource usage. A new weird named exceeded_mime_max_depth is
reported when this limit is reached.
This change reduces the runtime of the OSS-Fuzz reproducer from ~45 seconds
to ~2.5 seconds.
The test PCAP was produced from a Python script using the email package
and sending the rendered version via POST to an HTTP server.
Closes #208
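For operators, the new weird surfaces like any other; a minimal sketch for
reacting to it (assuming it arrives via conn_weird(); note that older Zeek
versions lack the source parameter):

    event conn_weird(name: string, c: connection, addl: string, source: string)
        {
        if ( name == "exceeded_mime_max_depth" )
            print fmt("MIME nesting cap hit in %s", c$uid);
        }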
* origin/topic/christian/mmdb-configurability:
Modernize various C++/Zeek-isms in the MMDB code.
Fix MMDB code to re-open explicitly opened DBs correctly
Add btest to verify behavior of re-opened MMDBs opened directly via BIFs
Simplify MMDB code by moving more lookup functionality into MMDB class
Move MMDB logic out of mmdb.bif and into MMDB.cc/h.
Fix mmdb.temporary-error testcase when MMDBs are installed on system
Adapt MMDB BiF code to new script-layer variables
Update btest baselines to reflect introduction of mmdb.bif
Move MaxMind/GeoIP BiF functionality into separate file
Provide script-level configurability of MaxMind DB placement on disk
Sort toplevel .bif list in CMakeLists
In AWS GLB environments, the max_depth of 2 is easily reached due to packets
being encapsulated with GENEVE and VXLAN [1]. Any additional encapsulation
layer causes Zeek to raise a weird and ignore the inner traffic. Bump the
default maximum depth to 4; while not common, it's not unusual either to
observe this in the wild.
[1] https://docs.aws.amazon.com/vpc/latest/mirroring/traffic-mirroring-packet-formats.html
Closes #3439
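Deployments that see even deeper nesting can raise the cap further via the
existing option:

    redef Tunnel::max_depth = 6;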
This lifts the list of fallback directories in which Zeek will look for
MaxMind DBs into the script layer, and makes the names of the DB files
themselves (previously hardwired) configurable as well.
This does not yet change the in-core code; that commit follows.
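The search directory itself has long been a script-level option; this change
extends the same idea to the fallback directories and file names (whose new
option names aren't spelled out here). For example:

    redef mmdb_dir = "/opt/maxmind";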
* origin/topic/awelzel/3504-ldap-logs-scalars:
Update external baselines
ldap: Use scalar values in logs where appropriate
ldap: Rename LDAP::search_result to LDAP::search_result_entry
Skimming through the RFC, the previous approach of having containers for most
fields seems unfounded for normal protocol operation. The new weirds could just
as well be considered protocol violations. Outside of duplicated or missed
data, they just shouldn't happen for well-behaved clients and servers.
Additionally, with non-conformant traffic it would be trivial to cause
unbounded state growth and immense log record sizes.
Unfortunately, things have become a bit clunky now.
Closes #3504
While it seems like interesting functionality, this hasn't been documented,
maintained or knowingly leveraged for many years.
There are various other approaches available today, too:
* We track the number of event handler invocations regardless of
  profiling. It's possible to approximate a load_sample event by
  comparing the results of two get_event_stats() calls (see the sketch
  after this list). Or, visualize the corresponding counters in a
  Prometheus setup to get an idea of events/s broken down by event name.
* HookCallFunction() allows intercepting script execution, including
  measuring the time the execution takes.
* The global call_stack and g_frame_stack can be used from plugins
(and even external processes) to walk the Zeek script stack at certain
points to implement a sampling profiler.
* USDT probes or more plugin hooks will likely be preferred over Zeek
builtin functionality in the future.
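A minimal sketch of the first approach, diffing two get_event_stats()
results on a timer (the sample_events event name is made up here; the field
names follow Zeek's EventStats record):

    global last_stats: EventStats;

    event sample_events()
        {
        local now_stats = get_event_stats();
        print fmt("events dispatched in last 10s: %d",
                  now_stats$dispatched - last_stats$dispatched);
        last_stats = now_stats;
        schedule 10sec { sample_events() };
        }

    event zeek_init()
        {
        last_stats = get_event_stats();
        schedule 10sec { sample_events() };
        }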
Relates to #3458
The SMB::State$recent_files field is meant to have expiring entries.
However, due to the use of &default=string_set(), the &read_expire
attribute is not respected, causing unbounded state growth. Replace
&default=string_set() with &default=set().
Thanks to ya-sato on Slack for reporting!
Related: zeek/zeek-docs#179, #3513.
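The pattern at issue, sketched (the real field lives in SMB::State and the
expiry interval here is made up):

    type State: record {
        # &default=set() keeps &read_expire effective on the created set;
        # the previous &default=string_set() did not.
        recent_files: set[string] &default=set() &read_expire=3min;
    };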
With a bit of tweaking in the JavaScript plugin to support opaque types, this
will allow the delay functionality to work there, too.
Making the LogDelayToken an actual opaque seems reasonable, too. It's not
supposed to be inspected by users.
This is a verbose, opinionated and fairly restrictive version of the log
delay idea. Its main drivers are explicitness, foot-gun avoidance and
implementation simplicity.
Calling the new Log::delay() function is only allowed within the execution
of a Log::log_stream_policy() hook for the currently active log write.
Conceptually, the delay is placed between the execution of the global stream
policy hook and the individual filter policy hooks. A post delay callback
can be registered with every Log::delay() invocation. Post delay callbacks
can (1) modify a log record as they see fit, (2) veto the forwarding of the
log record to the log filters and (3) extend the delay duration by calling
Log::delay() again. The last point allows delaying a record for an
indefinite amount of time, rather than a fixed maximum amount. This should
be rare and is therefore explicit.
Log::delay() increases an internal reference count and returns an opaque
token value to be passed to Log::delay_finish() to release a delay reference.
Once all references are released and the delay completes, the record is
forwarded to all filters attached to the stream.
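Put together, usage looks roughly like this (a sketch based on the
description above; the post-delay callback's exact signature is an
assumption):

    hook Log::log_stream_policy(rec: any, id: Log::ID)
        {
        if ( id != Conn::LOG )
            return;

        # Delay the currently active write. The callback runs when the
        # delay completes and may mangle the record, veto it by returning
        # F, or call Log::delay() again to extend the delay.
        local token = Log::delay(id, rec,
            function(rec2: any, id2: Log::ID): bool
                { return T; });

        # Stash the token and call Log::delay_finish(id, rec, token) once
        # enrichment is done, releasing the delay reference.
        }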
This functionality separates Log::log_stream_policy() and individual filter
policy hooks. One consequence is that a common use-case of filter policy hooks,
removing unproductive log records, may run after a record was delayed. Users
can lift their filtering logic to the stream level (or replicate the condition
before the delay decision). The main motivation here is that deciding on a
stream-level delay in per-filter hooks is too late. Attaching multiple filters
to a stream can additionally result in hard-to-understand behavior.
On the flip side, filter policy hooks are guaranteed to run after the delay
and can be used for further mangling or filtering of a delayed record.
Update cipher consts.
Furthermore, some past updates had been applied to script-land, but it
was not considered that some of these also have to be applied to the
BinPac code to correctly parse the ServerKeyExchange message.
(As a side note: this was discovered due to a test discrepancy with the
Spicy parser.)
There was some confusion around which value was used subsequent to a strip(),
but sub() not respecting anchors made it appear to work. The `\(?` part
also seems redundant.
and check_for_unused_event_handlers: The UsageAnalyzer is more thorough,
and the previous mechanisms weren't extended to work with &is_used. They
should probably be considered superseded by the UsageAnalyzer, even if it
currently does not provide a public API and just prints out deprecation
warnings.
I'm also tempted to deprecate SetUsed() and Used() of EventHandler
for the same reason.
Closes #3187.
This field isn't required by a worker, and it's certainly not used by a
worker to listen on that specific interface. It also isn't required to
be set consistently, and its in-tree use is limited to the old
load-balancing script.
There's a BiF called packet_source() which, on a worker, provides
information about the packet source actually in use.
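A worker-side sketch (the PacketSource record's field names are assumed
here):

    event zeek_init()
        {
        local ps = packet_source();
        print fmt("capturing from %s (live: %s)", ps$path, ps$live);
        }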
Relates to zeek/zeek#2877.
This commit adds a multitude of new extension types that were added in
the last few years; it also adds GREASE values to extensions, curves,
and ciphersuites.
Furthermore, it adds a test that contains an encrypted-client-hello
key exchange (which uses several extension types that we did not have in
our baseline so far).
The ssl_history field may grow unbounded (e.g., via the ssl_alert event).
Prevent this by capping it at a configurable limit (default 100) and
raising a weird once the limit is reached.
Limit the number of events raised from an SSL record with content_type
alert (21) to a configurable maximum (default 10). For TLS 1.3,
the limit is set to 1 as specified in the RFC. Add a new weird for cases
where the limit is exceeded.
OSS-Fuzz managed to generate a reproducer that raised ~660k ssl_plaintext
and ssl_alert events given ~810kb of input data. This change prevents this,
with hopefully no negative side effects in the real world.
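Both caps are meant to be tunable from script land; for instance, the
history cap could be raised like this (treat the option name as an
assumption):

    redef SSL::max_ssl_history_length = 200;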
Previously, seq was computed as |pending_commands|+1. This opened the
possibility of overwriting queued commands, as well as logging the same
pending FTP reply multiple times.
For example, when commands 1, 2, 3 are pending, command 1 may be dequeued,
but the next incoming command then receives seq 3 and overwrites the already
pending command 3. The second scenario happens when ftp_reply() selects
command 3 as pending for logging, but is then followed by many ftp_request()
events. This resulted in command 3's response being logged over and over
again for every following ftp_request().
Avoid both scenarios by tracking the command sequence as an absolute counter.
Unsure what it's used for today; it also results in the situation that on
some platforms we generate a reporter.log in bare mode, while on others,
where Spicy is disabled, we do not.
If we want base/frameworks/version loaded by default, we should put it into
init-bare.zeek and possibly remove the loading of the reporter framework
from it - Reporter::error() would still work and be visible on stderr,
just not create a reporter.log.
Makes testing easier and aligns better with log rotation and timer
expiration. Should not have an effect in practice. Also log detail
about whether the inode or the modification time changed.