* origin/topic/awelzel/3424-http-upgrade-websocket-v1:
websocket: Handle breaking from WebSocket::configure_analyzer()
websocket: Address review feedback for BinPac code
fuzzers: Add WebSocket fuzzer
websocket: Fix crash for fragmented messages
websocket: Verify Sec-WebSocket-Key/Accept headers and review feedback
btest/websocket: Test for coalesced reply-ping
HTTP/CONNECT: Also weird on extra data in reply
HTTP/Upgrade: Weird when more data is available
ContentLine: Add GetDeliverStreamRemainingLength() accessor
HTTP: Drain event queue after instantiating upgrade analyzer
btest/http: Explain switching-protocols test change as comment
WebSocket: Introduce new analyzer and log
HTTP: Add mechanism to instantiate Upgrade analyzer
The &transient attribute does not work well with $element as that won't
be available within &until anymore apparently.
Found after a few seconds building out the fuzzer.
Don't log them, they are random and arbitrary in the normal case. Users
can do the following to log them if wanted.
redef record WebSocket::Info$client_key += { &log };
redef record WebSocket::Info$server_accept += { &log };
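For context, RFC 6455 defines how the Sec-WebSocket-Accept value follows from the client's Sec-WebSocket-Key. The following is a minimal Python sketch of that relationship, not Zeek's implementation; the function names are made up for illustration.

```python
import base64
import hashlib

# Fixed GUID from RFC 6455 that is appended to the client key.
WEBSOCKET_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def expected_accept(client_key: str) -> str:
    """Return the Sec-WebSocket-Accept value matching a Sec-WebSocket-Key."""
    digest = hashlib.sha1((client_key + WEBSOCKET_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


def handshake_matches(client_key: str, server_accept: str) -> bool:
    """Hypothetical helper: check a header pair after trimming whitespace."""
    return expected_accept(client_key.strip()) == server_accept.strip()
```

With the sample key from RFC 6455, `expected_accept("dGhlIHNhbXBsZSBub25jZQ==")` yields `s3pPLMBiTxaQ9kYGzzhZRbK+xOo=`.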
Add a constructed PCAP where the HTTP/websocket server sends a WebSocket
ping message directly with the packet of the HTTP reply. Ensure this is
interpreted the same as if the WebSocket message is in a separate packet
following the HTTP reply.
For the server side this should work. For the client side we'd need to
suspend parsing in a synchronized way, as we currently cannot quite know
whether a pipelined HTTP request or upgraded protocol data follows, and we
don't have "suspend parsing" functionality here.
DPD enables HTTP based on the content of the WebSocket frames. However,
it's not HTTP: the protocol is x-kaazing-handshake and the server sends
some form of status/acknowledgment to the client first, so the HTTP
analyzer receives that as the first bytes of the response and bails.
Oh well.
This adds a new WebSocket analyzer that is enabled with the HTTP upgrade
mechanism introduced previously. It is a first implementation in BinPac with
manual chunking of frame payload. Configuration of the analyzer is sketched
via the new websocket_handshake() event and a configuration BiF called
WebSocket::__configure_analyzer(). In short, script land collects WebSocket
related HTTP headers and can forward these to the analyzer to change its
parsing behavior at websocket_handshake() time. For now, however, there's
no actual logic that would change behavior based on agreed upon extensions
exchanged via HTTP headers (e.g. frame compression). WebSocket::Configure()
simply attaches a PIA_TCP analyzer to the WebSocket analyzer for dynamic
protocol detection (or a custom analyzer if set). The added pcaps show this
in action for tunneled ssh, http and https using wstunnel. One test pcap is
Broker's WebSocket traffic from our own test suite, the other is the
Jupyter websocket traffic from the ticket/discussion.
This commit further adds a basic websocket.log that aggregates the
WebSocket-specific headers (Sec-WebSocket-*) into a single log.
Closes #3424
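To illustrate what parsing a WebSocket frame involves, here is a small Python sketch of decoding a frame header as laid out in RFC 6455. It is not the BinPac grammar from this change; the `FrameHeader` type and `parse_frame_header()` name are assumptions made for the example.

```python
import struct
from typing import NamedTuple, Optional


class FrameHeader(NamedTuple):
    fin: bool          # final fragment of a message
    opcode: int        # 0x1 text, 0x2 binary, 0x8 close, 0x9 ping, 0xA pong
    masked: bool       # client-to-server frames carry a masking key
    payload_len: int   # payload length in bytes
    header_len: int    # bytes consumed by the header, including any mask


def parse_frame_header(data: bytes) -> Optional[FrameHeader]:
    """Parse one frame header; return None if more bytes are needed."""
    if len(data) < 2:
        return None
    fin = bool(data[0] & 0x80)
    opcode = data[0] & 0x0F
    masked = bool(data[1] & 0x80)
    payload_len = data[1] & 0x7F
    offset = 2
    if payload_len == 126:
        # 16-bit extended payload length, network byte order.
        if len(data) < offset + 2:
            return None
        (payload_len,) = struct.unpack_from("!H", data, offset)
        offset += 2
    elif payload_len == 127:
        # 64-bit extended payload length, network byte order.
        if len(data) < offset + 8:
            return None
        (payload_len,) = struct.unpack_from("!Q", data, offset)
        offset += 8
    if masked:
        # Skip the 4-byte masking key.
        if len(data) < offset + 4:
            return None
        offset += 4
    return FrameHeader(fin, opcode, masked, payload_len, offset)
```

For example, the bytes `b"\x89\x05Hello"` describe an unmasked ping with a 5-byte payload. Because the payload length can be up to 64 bits wide, a parser has to deliver the payload in chunks rather than buffer it whole.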
The BDAT analyzer should support uint64_t-sized chunks reasonably well,
but the ContentLine analyzer does not. Also, I got the types for
RemainingChunkSize() and in DeliverStream() totally wrong, resulting in
overflows and segfaults when very large chunk sizes were used.
Tickled by OSS-Fuzz. Actually running the fuzzer locally only took a
few minutes to find the crash, too. Embarrassing.
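As a rough illustration of this class of bug, the Python sketch below mimics storing a 64-bit chunk size in a signed 32-bit variable. The commit message does not say which types were actually involved, so the 32-bit width and the `narrow_to_int32()` helper are assumptions for the example only.

```python
def narrow_to_int32(value: int) -> int:
    """Mimic assigning a wider integer to a signed 32-bit variable."""
    low = value & 0xFFFFFFFF          # keep only the low 32 bits
    return low - 2**32 if low >= 2**31 else low


# A chunk size just above 4 GiB silently shrinks to a tiny value ...
assert narrow_to_int32(2**32 + 5) == 5
# ... and one at 2 GiB turns negative, which breaks length arithmetic.
assert narrow_to_int32(2**31) == -(2**31)
```

Keeping the remaining chunk size in an unsigned 64-bit type end to end avoids both effects.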
OSS-Fuzz managed to produce a MIME multipart message construction with
thousands of nested entities (or that's what Zeek makes out of it anyhow).
Prevent such deep analysis by capping at a nesting depth of 100,
preventing unnecessary resource usage. A new weird named exceeded_mime_max_depth
is reported when this limit is reached.
This change reduces the runtime of the OSS-Fuzz reproducer from ~45 seconds
to ~2.5 seconds.
The test PCAP was produced from a Python script using the email package
and sending the rendered version via POST to an HTTP server.
Closes #208
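A deeply nested multipart message of this kind could be constructed roughly as follows with Python's email package. This is only a sketch under stated assumptions: it is not the script used for the test PCAP, the helper names are invented, and the HTTP POST step is left out.

```python
from email import message_from_bytes
from email.message import Message
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def build_nested(depth: int) -> Message:
    """Wrap a text part in `depth` layers of multipart containers."""
    msg: Message = MIMEText("innermost part")
    for _ in range(depth):
        wrapper = MIMEMultipart()
        wrapper.attach(msg)
        msg = wrapper
    return msg


def nesting_depth(msg: Message) -> int:
    """Count multipart layers by following the first subpart downwards."""
    depth = 0
    while msg.is_multipart():
        depth += 1
        msg = msg.get_payload()[0]
    return depth


# Render to bytes, as would be sent in a request body, and parse it back.
rendered = build_nested(10).as_bytes()
assert nesting_depth(message_from_bytes(rendered)) == 10
```

The depth is kept small here because rendering is recursive in the email package; a reproducer for the limit would need more than 100 levels.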
OSS-Fuzz found that providing an invalid BDAT line would tickle an
assert in UpdateState(). The BDAT state was never initialized, but
within UpdateState() that was expected.
This also removes the AnalyzerViolation() call for bad BDAT commands
and instead raises a weird. The SMTP analyzer is very lax, and not triggering
the violation allows parsing the server's response to such an invalid
command.
PCAP files produced by a custom Python SMTP client against Postfix.
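For reference, RFC 3030 defines the command as "BDAT <chunk-size> [LAST]". The Python sketch below shows one way to validate such a line and signal an invalid one without aborting; it is an illustration only, `parse_bdat()` is a hypothetical name, and Zeek's SMTP analyzer may accept a laxer syntax.

```python
import re
from typing import Optional, Tuple

# "BDAT" SP chunk-size [SP "LAST"], matched case-insensitively.
_BDAT_RE = re.compile(rb"BDAT ([0-9]+)( LAST)?", re.IGNORECASE)


def parse_bdat(line: bytes) -> Optional[Tuple[int, bool]]:
    """Return (chunk_size, is_last), or None for an invalid BDAT line."""
    m = _BDAT_RE.fullmatch(line.rstrip(b"\r\n"))
    if m is None:
        # A caller would report a weird here and keep parsing the session.
        return None
    return int(m.group(1)), m.group(2) is not None
```

Returning None instead of raising mirrors the idea of reporting a weird and carrying on, so that the server's reply to the bad command can still be parsed.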
* origin/topic/christian/mmdb-configurability:
Modernize various C++/Zeek-isms in the MMDB code.
Fix MMDB code to re-open explicitly opened DBs correctly
Add btest to verify behavior of re-opened MMDBs opened directly via BIFs
Simplify MMDB code by moving more lookup functionality into MMDB class
Move MMDB logic out of mmdb.bif and into MMDB.cc/h.
Fix mmdb.temporary-error testcase when MMDBs are installed on system
Adapt MMDB BiF code to new script-layer variables
Update btest baselines to reflect introduction of mmdb.bif
Move MaxMind/GeoIP BiF functionality into separate file
Provide script-level configurability of MaxMind DB placement on disk
Sort toplevel .bif list in CMakeLists
ssl-log-ext had a bug that caused data present in the SSL connection to
not be logged in some cases. Specifically, the script relied on the base
ssl script to initialize some data structures; however, this means that
protocol messages that arrive before a message is handled by the base
ssl script are not logged.
This commit changes the ssl-log-ext script to also initialize the data
structures; now messages are correctly included in the log in all cases.
In AWS GLB environments, the max_depth of 2 is easily reached due to packets
being encapsulated with GENEVE and VXLAN [1]. Any additional encapsulation
layer causes Zeek to raise a weird and ignore the inner traffic. Bump the default
maximum depth to 4; while not common, it's not unusual either to observe
this in the wild.
[1] https://docs.aws.amazon.com/vpc/latest/mirroring/traffic-mirroring-packet-formats.html
Closes #3439
The mmdb_open_location_db() and mmdb_open_asn_db() BiFs were untested, and Zeek
has a bug that makes any DBs opened that way fall back to looking up DBs via the
existing script-level config mechanism (via mmdb_dir), which is at least
unexpected and might well be unconfigured if somebody uses the direct BiFs.
The test would previously fail in settings where the user has Maxmind DBs
installed in the hardwired system locations, because the fallback logic still
picked those up.
The initial (prefix) and final (suffix) strings are specified individually
with a variable number of "any" matches that can occur between these.
The previous implementation assumed a single string and rendered it
as *<string>*.
Reported and PCAP provided by @martinvanhensbergen, thanks!
Closes zeek/spicy-ldap#27
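The rendering of a substring filter with separate initial, any and final components can be sketched in Python as below, following the string representation from RFC 4515. The function name is hypothetical, value escaping is omitted, and the exact output of the analyzer may differ.

```python
from typing import Optional, Sequence


def render_substrings(
    initial: Optional[str], any_parts: Sequence[str], final: Optional[str]
) -> str:
    """Join the components with '*'; a missing initial/final leaves an open end."""
    parts = [initial or ""] + list(any_parts) + [final or ""]
    return "*".join(parts)


assert render_substrings("ab", [], None) == "ab*"
assert render_substrings(None, ["x"], None) == "*x*"
assert render_substrings("a", ["b", "c"], "d") == "a*b*c*d"
```

Only the case with a single "any" component and no initial or final string matches the `*<string>*` form that the previous implementation always produced.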
* origin/topic/awelzel/3504-ldap-logs-scalars:
Update external baselines
ldap: Use scalar values in logs where appropriate
ldap: Rename LDAP::search_result to LDAP::search_result_entry
Skimming through the RFC, the previous approach of having containers for most
fields seems unfounded for normal protocol operation. The new weirds could just
as well be considered protocol violations. Outside of duplicated or missed data
they just shouldn't happen for well-behaved client/server behavior.
Additionally, with non-conformant traffic it would be trivial to cause
unbounded state growth and immense log record sizes.
Unfortunately, things have become a bit clunky now.
Closes #3504
While it seems interesting functionality, this hasn't been documented,
maintained or knowingly leveraged for many years.
There are various other approaches today, too:
* We track the number of event handler invocations regardless of
profiling. It's possible to approximate a load_sample event by
comparing the result of two get_event_stats() calls. Or, visualize
the corresponding counters in a Prometheus setup to get an idea of
event/s broken down by event names.
* HookCallFunction() allows intercepting script execution, including
measuring the time execution takes.
* The global call_stack and g_frame_stack can be used from plugins
(and even external processes) to walk the Zeek script stack at certain
points to implement a sampling profiler.
* USDT probes or more plugin hooks will likely be preferred over Zeek
builtin functionality in the future.
Relates to #3458
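The first approach, comparing two snapshots of event handler invocation counts, boils down to simple arithmetic. The Python sketch below shows it on plain dictionaries; these are hypothetical stand-ins for two get_event_stats() results, whose actual return type is not described here.

```python
from typing import Dict


def event_rates(
    before: Dict[str, int], after: Dict[str, int], elapsed_secs: float
) -> Dict[str, float]:
    """Approximate events per second for each event name between two snapshots."""
    if elapsed_secs <= 0:
        raise ValueError("elapsed_secs must be positive")
    return {
        name: (count - before.get(name, 0)) / elapsed_secs
        for name, count in after.items()
    }


# Two made-up snapshots taken 10 seconds apart.
first = {"new_connection": 100}
second = {"new_connection": 160, "http_request": 30}
assert event_rates(first, second, 10.0) == {"new_connection": 6.0, "http_request": 3.0}
```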
The produced coverage files are of little use in current local workflows
and usually just end up taking up disk space. ZEEK_PROFILER_FILE can be
set explicitly if there's a one-off need to produce these locally, too.
* origin/topic/vern/CSE-opt:
incorporate latest version of gen-zam to correctly generate indirect calls
added sub-directory for tracking ZAM maintenance issues
BTest to stress-test AST optimizer's assessment of side effects
reworked AST optimizer's analysis of side effects during aggregate operations & calls
script optimization support for tracking information associated with BiFs/functions
fix for AST analysis of inlined functions
improved AST optimizer's analysis of variable usage in inlined functions
new method for Stmt nodes to report whether they could execute a "return"
bug fixes for indirect function calls when using ZAM
minor fixes for script optimization, exporting of attr_name, script layout tweak
This change allows specifying a per-signature event, overriding
the default signature_match event. It further removes the message
parameter from such events if not provided in the signature.
This also tracks the message as StringValPtr directly to avoid
allocating the same StringVal for every DoAction() call.
Closes #3403