So far, when Zeek didn't see a connection's regular tear-down (e.g.,
because its state timed-out before we got to the end), we'd still
signal a regular end-of-data to Spicy parsers. As a result, they would
then typically raise a parse error because they were probably still
expecting data and would now declare it missing. That's not very
useful because semantically it's not really a protocol issue if the
data just doesn't make it over to us; it's a transport-layer issue
that Zeek already handles elsewhere. So we now switch to signaling
end-of-data to Spicy analyzers only if the connection indeed shuts
down regularly. This is also matches how BinPAC handles it.
This also comes with a test exercising various combinations of
end-of-data behavior so that we ensure consistent/desired behavior.
Closes#4007.
This admittedly is a quite esoteric combination of protocols. But - as
we do correctly support them, it seems nice to have a slightly more
complete testcase that covers this.
This change tracks the current offset (number of bytes fed into matchers)
on the top-level RuleEndpointState such that we can compute the relative ending
for matched texts individually.
Additionally, it adds the data_end_offset as a new optional parameter to
signature_match().
This commit prevents most non-Modbus TCP traffic on port 502 to be
reported as Modbus in conn.log as well as in modbus.log.
To do so, we have introduced two &enforce checks in the Modbus
protocol definition that checks that some specific fields of the
(supposedly) Modbus header are compatible with values specified in
the specs.
To ensure non-regression, with this commit we also introduce a
new btest.
Closes#3962
Processing out-of-order commands or finishing commands based on invalid
server responses resulted in inconsistent analyzer state, potentially
triggering null pointer references for crafted traffic.
This commit reworks cf9fe91705 such that
too many pending commands are simply discarded, rather than any attempt
being made to process them. Further, invalid server responses do not
result in command completion anymore.
Test PCAP was crafted based on traffic produced by the OSS-Fuzz reproducer.
Closes#215
The cmds list may grow unbounded due to the POP3 analyzer being in
multiLine mode after seeing `AUTH` in a Redis connection, but never
a `.` terminator. This can easily be provoked by the Redis ping
command.
This adds two heuristics: 1) Forcefully process the oldest commands in
the cmds list and cap it at max_pending_commands. 2) Start raising
analyzer violations if the client has been using more than
max_unknown_client_commands commands (default 10).
Closes#3936
This adds a protocol parser for the PostgreSQL protocol and a new
postgresql.log similar to the existing mysql.log.
This should be considered preliminary and hopefully during 7.1 and 7.2
with feedback from the community, we can improve on the events and logs.
Even if most PostgreSQL communication is encrypted in the real-world, this
will minimally allow monitoring of the SSLRequest and hand off further
analysis to the SSL analyzer.
This originates from github.com/awelzel/spicy-postgresql, with lots of
polishing happening in the past two days.
The current implementation would only log, if the password contains a
colon, the part before the first colon (e.g., the password
`password:password` would be logged as `password`).
A test has been added to confirm the expected behaviour.
This reworks the parser such that COM_CHANGE_USER switches the
connection back into the CONNECTION_PHASE so that we can remove the
EXPECT_AUTH_SWITCH special case in the COMMAND_PHASE. Adds two pcaps
produced with Python that actually do COM_CHANGE_USER as it seems
not possible from the MySQL CLI.
The ctu-sme-11-win7ad-1-ldap-tcp-50041.pcap file was harvested
from the CTU-SME-11 (Experiment-VM-Microsoft-Windows7AD-1) dataset
at https://zenodo.org/records/7958259 (DOI 10.5281/zenodo.7958258).
Closes#3853
Pcap was generated as follows. Doesn't seem wireshark even parses
this properly right now.
with common.get_connection() as c:
with c.cursor() as cur:
date1 = datetime.date(1987, 10, 18)
datetime1 = datetime.datetime(1990, 9, 26, 12, 13, 14)
cur.add_attribute("number1", 42)
cur.add_attribute("string1", "a string")
cur.add_attribute("date1", date1)
cur.add_attribute("datetime1", datetime1)
cur.execute("SELECT version()")
result = cur.fetchall()
print("result", result)
PCAP was produced with a local OpenLDAP server configured to support StartTLS.
This puts the Zeek calls into a separate ldap_zeek.spicy file/module
to separate it from LDAP.
ASN1Message(True) may go off parsing arbitrary input data as
"something ASN.1" This could be GBs of octet strings or just very
long sequences. Avoid this by open-coding some top-level types expected.
This also tries to avoid some of the &parse-from usages that result
in unnecessary copies of data.
Adds a locally generated PCAP with addRequest/addResponse that we
don't currently handle.
Mostly staring at the PCAPs and opened a few RFCs. For now, only if the
MS_KRB5 OID is used and accepted in a bind response, start stripping
KRB5 wrap tokens for both, client and server traffic.
Would probably be nice to forward the GSS-API data to the analyzer...
Closeszeek/spicy-ldap#29.
When Zeek flips roles of a HTTP connection subsequent to the HTTP analyzer
being attached, that analyzer would not update its own ContentLine analyzer
state, resulting in the wrong ContentLine analyzer being switched into
plain delivery mode.
In debug builds, this would result in assertion failures, in production
builds, the HTTP analyzer would receive HTTP bodies as individual header
lines, or conversely, individual header lines would be delivered as a
large chunk from the ContentLine analyzer.
PCAPs were generated locally using tcprewrite to select well-known-http ports
for both endpoints, then editcap to drop the first SYN packet.
Kudos to @JordanBarnartt for keeping at it.
Closes#3789
Changing the default_file_bof_buffer_size has subtle impact on
MIME type detection and changed the zeek-testing baseline. Do
not load this new script via test-all-policy to avoid this.
The new test was mainly an aid to understand what is actually going on.
In short, if default_file_bof_buffer_size is larger than the file MIME
detection only runs when the buffer is full, or when the file is removed.
When a file transfer happens over multiple HTTP connections, only
some or one of the http.log entries will have a proper response MIME type.
PCAP extracted from 2009-M57-day11-18.trace.gz.
A continuation frame has the same type as the first frame, but that
information wasn't used nor kept, resulting payload of continuation
frames not being forwarded. The pcap was created with a fake Python
server and a bit of message crafting.
The &transient attribute does not work well with $element as that won't
be available within &until anymore apparently.
Found after a few seconds building out the fuzzer.
Don't log them, they are random and arbitrary in the normal case. Users
can do the following to log them if wanted.
redef += WebSocket::Info$client_key += { &log };
redef += WebSocket::Info$server_accept += { &log };
Add a constructed PCAP where the HTTP/websocket server send a WebSocket
ping message directly with the packet of the HTTP reply. Ensure this is
interpreted the same as if the WebSocket message is in a separate packet
following the HTTP reply.
For the server side this should work, for the client side we'd need to
synchronize suspend parsing the client side as we currently cannot quite
know whether it's a pipelined HTTP request following, or upgraded protocol
data and we don't have "suspend parsing" functionality here.
This adds a new WebSocket analyzer that is enabled with the HTTP upgrade
mechanism introduced previously. It is a first implementation in BinPac with
manual chunking of frame payload. Configuration of the analyzer is sketched
via the new websocket_handshake() event and a configuration BiF called
WebSocket::__configure_analyzer(). In short, script land collects WebSocket
related HTTP headers and can forward these to the analyzer to change its
parsing behavior at websocket_handshake() time. For now, however, there's
no actual logic that would change behavior based on agreed upon extensions
exchanged via HTTP headers (e.g. frame compression). WebSocket::Configure()
simply attaches a PIA_TCP analyzer to the WebSocket analyzer for dynamic
protocol detection (or a custom analyzer if set). The added pcaps show this
in action for tunneled ssh, http and https using wstunnel. One test pcap is
Broker's WebSocket traffic from our own test suite, the other is the
Jupyter websocket traffic from the ticket/discussion.
This commit further adds a basic websocket.log that aggregates the WebSocket
specific headers (Sec-WebSocket-*) headers into a single log.
Closes#3424
The BDAT analyzer should be supporting uint64_t sized chunks reasonably well,
but the ContentLine analyzer does not, And also, I totally got types for
RemainingChunkSize() and in DeliverStream() wrong, resulting in overflows
and segfaults when very large chunk sizes were used.
Tickled by OSS-Fuzz. Actually running the fuzzer locally only took a
few minutes to find the crash, too. Embarrassing.
OSS-Fuzz managed to produce a MIME multipart message construction with
thousands of nested entities (or that's what Zeek makes out of it anyhow).
Prevent such deep analysis by capping at a nesting depth of 100,
preventing unnecessary resource usage. A new weird named exceeded_mime_max_depth
is reported when this limit is reached.
This change reduces the runtime of the OSS-Fuzz reproducer from ~45 seconds
to ~2.5 seconds.
The test PCAP was produced from a Python script using the email package
and sending the rendered version via POST to a HTTP server.
Closes#208
OSS-Fuzz found that providing an invalid BDAT line would tickle an
assert in UpdateState(). The BDAT state was never initialized, but
within UpdateState() that was expected.
This also removes the AnalyzerViolation() call for bad BDAT commands
and instead raises a weird. The SMTP analyzer is very lax and not triggering
the violation allows to parse the server's response to such an invalid
command.
PCAP files produced by a custom Python SMTP client against Postfix.
In AWS GLB environments, the max_depth of 2 is easily reached due to packets
being encapsulated with GENEVE and VXLAN [1]. Any additional encapsulation
layer causes Zeek raise a weird and ignore the inner traffic. Bump the default
maximum depth to 4, while not common it's not unusual either to observe
this in the wild.
[1] https://docs.aws.amazon.com/vpc/latest/mirroring/traffic-mirroring-packet-formats.htmlCloses#3439