This is similar to what the external corelight/zeek-smb-clear-state script
does, but leverages the smb2_discarded_messages_state() event instead of
regularly checking on the state of SMB connections.
The pcap was created using the dperson/samba container image and mounting
a share with Linux's CIFS filesystem, then copying the content of a
directory with 100 files. The test uses a BPF filter to imitate mostly
"half-duplex" traffic.
* security/topic/awelzel/152-smtp-validate-mail-transactions:
smtp: Validate mail transaction and disable SMTP analyzer if excessive
generic-analyzer-fuzzer: Detect disable_analyzer() from scripts
The medium.trace in the private external test suite contains one
session/server that violates the multi-line reply protocol and
happened to work out fairly well regardless due to how we looked
up the pending commands unconditionally before.
Continue to match up reply lines that "look like they contain status codes"
even if cont_resp = T. This still improves runtime for the OSS-Fuzz
generated test case and keeps the external baselines valid.
The affected session can be extracted as follows:
zcat Traces/medium.trace.gz | tcpdump -r - 'port 1491 and port 21'
We could push this into the analyzer, too, minimally the RFC says:
> If an intermediary line begins with a 3-digit number, the Server
> must pad the front to avoid confusion.
An invalid mail transaction is determined as
* RCPT TO command without a preceding MAIL FROM
* a DATA command without a preceding RCPT TO
and logged as a weird.
The testing pcap for invalid mail transactions was produced with a Python
script against a local exim4 configured to accept more errors and unknown
commands than 3 by default:
# exim4.conf.template
smtp_max_synprot_errors = 100
smtp_max_unknown_commands = 100
See also: https://www.rfc-editor.org/rfc/rfc5321#section-3.3
Intermediate lines of multiline replies usually do not contain valid status
codes (even if servers may opt to include them). Their content may be anything
and likely unrelated to the original command. There's little reason for us
trying to match them with a corresponding command.
OSS-Fuzz generated a large command reply with very many intermediate lines
which caused long processing times due to matching every line with all
currently pending commands.
This is a DoS vector against Zeek. The new ipv6-multiline-reply.trace and
ipv6-retr-samba.trace files have been extracted from the external ipv6.trace.
There was a misunderstanding whether to include them by default in
the dns.log, so remove them again.
There had also been a discussion and quirk that AD of a request would
always be overwritten by reply in the dns.log unless the reply is
missing. For now, let users extend dns.log themselves for what best
fits their requirements, rather than adding these flags by default.
Add a btest to print AD and CD flags for smoke testing still.
* 'dnssec-flag-parse' of github.com:micrictor/zeek-codespace:
Update external testing commit hash for DNS flag changes
Parse DNSSEC AD and CD bits
Updated dump-events baseline which seemed unrelated.
Parse authentic data (AD) and checking disabled (CD) bits according to
RFC 2535. Leaves the Z field as-is, in case users are already handling
this elsewhere and depend on the value being the integer for all 3 bits.
https://www.rfc-editor.org/rfc/rfc2535#section-6.1Fixes#2672
The user and password fields are replicated to each of the ftp.log
entries. Using a very large username (100s of KBs) allows to bloat
the log without actually sending much traffic. Further, limit the
arg and reply_msg columns to large, but not unbounded values.
We previously used the Spicy plugin's `Spicy::available` to test for
Spicy support. However, having Spicy support does not necessarily mean that we
have built Zeek with its in-tree Spicy analyzers: the Spicy plugin
could have been pulled in from external. The new BIF now reliably
tells us whether the Spicy analyzers are available; its result
corresponds to what `zeek-config --have-spicy-analyzers` returns as
well.
We also move the two current checks over to use this BIF.
(Note: I refrained from renaming the CMake-side `USE_SPICY_ANALYERS`
to `HAVE_SPICY_ANALYZERS`. We should do this eventually for
consistency, but I didn't want to make more changes than necessary
right now.)
As initial examples, this branch ports the Syslog and Finger analyzers
over. We leave the old analyzers in place for now and activate them
iff we compile without any Spicy.
Needs `zeek-spicy-infra` branches in `spicy/`, `spicy-plugin/`,
`CMake/`, and `zeek/zeek-testing-private`.
Note that the analyzer events remain associated with the Spicy plugin
for now: that's where they will show up with `-NN`, and also inside
the Zeekygen documentation.
We switch CMake over to linking the runtime library into the plugin,
vs. at the top-level through object libraries.
We were parsing MySQL using bigendian even though the protocol is
specified as with "least significant byte first" [1]. This is most
problematic when parsing length encoded strings with 2 byte length
fields...
Further, I think, the EOF_Packet parsing was borked, either due to
testing the CLIENT_DEPRECATE_EOF with the wrong endianness, or due to
the workaround in Resultset processing raising mysql_ok(). Introduce a
new mysql_eof() that triggers for EOF_Packet's and remove the fake
mysql_ok() Resultset invocation to fix. Adapt the mysql script and tests
to account for the new event.
This is a quite backwards incompatible change on the event level, but
due to being quite buggy in general, doubt this matters to many.
I think there is more buried, but this fixes the violation of the simple
"SHOW ENGINE INNODB STATUS" and the existing tests continue to
succeed...
[1] https://dev.mysql.com/doc/dev/mysql-server/latest/page_protocol_basic_dt_integers.html
* jeff-bb/patch-2:
Log raw keyboard value on best guess
Avoid excessive fmt calls, return default behavior on unknown
"Best Guess" unknown keyboard / language variants
Using "in" to query the language const. This also handles the case of not having a best guess and continue using the existing behavior.
Given
keyboard_layout = 1033 (0x0409), "keyboard-English - United States"
keyboard_layout = 66569 (0x00010409), "keyboard-English - United States (Best Guess)"
keyboard_layout = 12345 (0x3039), "keyboard-12345"
If the lookup table does not have an entry, it will just log as the raw decimal language/keyboard code. With this change, if we do not have an entry in the lookup table, we'll look at the low order / 4 least significant bits to see if we have a match. The high order / 4 most significant bits are flags/modifiers to the base language/keyboard code. We'll append that it is a "Best Guess"
(This is my first attempt at Zeek scripting, apologies upfront if I'm missing obvious language features. I feel like the const language lookup should return a success/fail return code that we would key off of, but unsure how to accomplish that so instead went for string matching on value in == value out).
From Vern in GH-846: This is a conscious decision in the TCP analysis to
consider a connection's "duration" to run up through the end of its
productive (= data can be delivered) lifetime, not extending beyond that. So
once it's closed, packets seen subsequently (until the state-holding for the
connection times out) get processed in terms of updating the associated
history, but not the duration. This can include (unnecessarily) retransmitted
data packets, like in one of the examples above. An advantage of this definition
of "duration" is it allows more accurate computation of connection data rates.
When a negotiate request offers no dialects, but the response contains
an ntlm record which selects a dialect, a script error is triggered.
$ zeek -C -r ./f2b0e.pcap 'DPD::ignore_violations+={ Analyzer::ANALYZER_SMB }'
1668615340.837882 expression error in /home/awelzel/corelight-oss/zeek/scripts/base/protocols/smb/./smb1-main.zeek, line 96: no such index (SMB1::c$smb_state$current_cmd$smb1_offered_dialects[SMB1::response$ntlm$dialect_index])
Script error triggered by fuzzing when testing Tim's all-the-fuzzing branch.
While unusual, analyzer_confirmation() may never be called for the
SSH analyzer, but still ssh_auth_attempted is invoked later indicating
successful authentication. I haven't checked how that is actually possible,
but seems prudent to check for the existence of c$ssh$analyzer_id before
referencing it (also in light of runtime enable/disabling of events).
This was found testing Tim's all-the-fuzzing branch on large system,
merging this should avoid oss-fuzz telling us about it.
$ zeek -C -r ./e83db.pcap 'DPD::ignore_violations+={ Analyzer::ANALYZER_SSH }'
1668610572.429058 expression error in scripts/base/protocols/ssh/./main.zeek, line 260: field value missing (SSH::c$ssh$analyzer_id)
This uses the v3 json as a source for the first time. The test needed
some updating because Google removed a couple more logs - in the future
this should hopefully not be neccessary anymore because I think v3
should retain all logs.
In theory this might be neat in 5.1.
Introduce two new events for analyzer confirmation and analyzer violation
reporting. The current analyzer_confirmation and analyzer_violation
events assume connection objects and analyzer ids are available which
is not always the case. We're already passing aid=0 for packet analyzers
and there's not currently a way to report violations from file analyzers
using analyzer_violation, for example.
These new events use an extensible Info record approach so that additional
(optional) information can be added later without changing the signature.
It would allow for per analyzer extensions to the info records to pass
analyzer specific info to script land. It's not clear that this would be
a good idea, however.
The previous analyzer_confirmation and analyzer_violation events
continue to exist, but are deprecated and will be removed with Zeek 6.1.
The current_entity tracking in HTTP assumes that client/server never
send HTTP entities at the same time. The attached pcap (generated
artificially) violates this and triggers:
1663698249.307259 expression error in <...>base/protocols/http/./entities.zeek, line 89: field value missing (HTTP::c$http$current_entity)
For the http-no-crlf test, include weird.log as baseline. Now that weird is
@load'ed from http, it is actually created and seems to make sense
to btest-diff it, too.
...the only known cases where the `-` for `connection$service` was
handled is to skip/ignore these analyzers.
Slight suspicion that join_string_set() should maybe become a bif
now determine_service() runs once for each connection.
Closes#2388
* micrictor/master:
Add a field to Modbus/TCP log to indicate the Modbus PDU type
Add modbus transaction and unit ids to logs
Enable modbus logging for requests
The next patch will have a test script rely on c$ssl$analyzer_id staying
around when disable_analyzer() wasn't successful.
I was tempted to remove the `delete` completely as neither RDP nor SSH
have that and not sure why SSL is special here.
A flood of DHCP traffic can result if very large log entries consisting
of many uids and/or msg_types. Such large log entries can disrupt a SIEM
ingestion pipeline. This change forcing a log entry to be written when
the number of uids or the number of msg_Types exceed a certain value.
The values are treated as options for easy configuration.
Add the `pdu_type` field to Modbus over TCP logs to indicate whether the Modbus
message was a request or a response. Due to the client/server nature of Modbus
over TCP/IP, all messages from the TCP session originator are requests, while
all messages from the TCP session responder are responses.
Adding this information to the default log surfaces protocol metadata in a way
that doesn't require users to understand the Modbus over TCP protocol.