Commit graph

929 commits

Author SHA1 Message Date
Tim Wojtulewicz
f701f1fc94 Merge remote-tracking branch 'security/topic/awelzel/152-smtp-validate-mail-transactions'
* security/topic/awelzel/152-smtp-validate-mail-transactions:
  smtp: Validate mail transaction and disable SMTP analyzer if excessive
  generic-analyzer-fuzzer: Detect disable_analyzer() from scripts
2023-04-11 15:16:25 -07:00
Arne Welzel
7665e808a2 ftp/main: Special case for intermediate reply lines
The medium.trace in the private external test suite contains one
session/server that violates the multi-line reply protocol and
happened to work out fairly well regardless due to how we looked
up the pending commands unconditionally before.

Continue to match up reply lines that "look like they contain status codes"
even if cont_resp = T. This still improves runtime for the OSS-Fuzz
generated test case and keeps the external baselines valid.

The affected session can be extracted as follows:

    zcat Traces/medium.trace.gz | tcpdump -r  - 'port 1491 and port 21'

We could push this into the analyzer, too, minimally the RFC says:

    > If an intermediary line begins with a 3-digit number, the Server
    > must pad the front  to avoid confusion.
2023-04-03 14:05:13 +02:00
Arne Welzel
b8dc6ad120 smtp: Validate mail transaction and disable SMTP analyzer if excessive
An invalid mail transaction is determined as

* RCPT TO command without a preceding MAIL FROM
* a DATA command without a preceding RCPT TO

and logged as a weird.

The testing pcap for invalid mail transactions was produced with a Python
script against a local exim4 configured to accept more errors and unknown
commands than 3 by default:

    # exim4.conf.template
    smtp_max_synprot_errors = 100
    smtp_max_unknown_commands = 100

See also: https://www.rfc-editor.org/rfc/rfc5321#section-3.3
2023-03-27 18:41:47 +02:00
Arne Welzel
1b3e8a611e ftp/main: Skip get_pending_command() for intermediate reply lines
Intermediate lines of multiline replies usually do not contain valid status
codes (even if servers may opt to include them). Their content may be anything
and likely unrelated to the original command. There's little reason for us
trying to match them with a corresponding command.

OSS-Fuzz generated a large command reply with very many intermediate lines
which caused long processing times due to matching every line with all
currently pending commands.
This is a DoS vector against Zeek. The new ipv6-multiline-reply.trace and
ipv6-retr-samba.trace files have been extracted from the external ipv6.trace.
2023-03-23 13:50:36 +01:00
Arne Welzel
cf2da5160b dns: Remove AD and CD flags from log
There was a misunderstanding whether to include them by default in
the dns.log, so remove them again.

There had also been a discussion and quirk that AD of a request would
always be overwritten by reply in the dns.log unless the reply is
missing. For now, let users extend dns.log themselves for what best
fits their requirements, rather than adding these flags by default.

Add a btest to print AD and CD flags for smoke testing still.
2023-03-16 10:09:27 +01:00
Arne Welzel
33090d7a27 Merge branch 'dnssec-flag-parse' of github.com:micrictor/zeek-codespace
* 'dnssec-flag-parse' of github.com:micrictor/zeek-codespace:
  Update external testing commit hash for DNS flag changes
  Parse DNSSEC AD and CD bits

Updated dump-events baseline which seemed unrelated.
2023-03-14 10:35:50 +01:00
Michael R. Torres
fe8390c646 Parse DNSSEC AD and CD bits
Parse authentic data (AD) and checking disabled (CD) bits according to
RFC 2535. Leaves the Z field as-is, in case users are already handling
this elsewhere and depend on the value being the integer for all 3 bits.

https://www.rfc-editor.org/rfc/rfc2535#section-6.1

Fixes #2672
2023-03-13 14:35:06 -07:00
Arne Welzel
f56785740c ftp: Limit user, password, arg and reply_msg column sizes in log
The user and password fields are replicated to each of the ftp.log
entries. Using a very large username (100s of KBs) allows to bloat
the log without actually sending much traffic. Further, limit the
arg and reply_msg columns to large, but not unbounded values.
2023-02-21 12:28:07 -07:00
Robin Sommer
bc252c63dc
Add BIF have_spicy_analyzers().
We previously used the Spicy plugin's `Spicy::available` to test for
Spicy support. However, having Spicy support does not necessarily mean that we
have built Zeek with its in-tree Spicy analyzers: the Spicy plugin
could have been pulled in from external. The new BIF now reliably
tells us whether the Spicy analyzers are available; its result
corresponds to what `zeek-config --have-spicy-analyzers` returns as
well.

We also move the two current checks over to use this BIF.

(Note: I refrained from renaming the CMake-side `USE_SPICY_ANALYERS`
to `HAVE_SPICY_ANALYZERS`. We should do this eventually for
consistency, but I didn't want to make more changes than necessary
right now.)
2023-02-03 13:47:26 +01:00
Robin Sommer
04a1ead978
Provide infrastructure to migrate legacy analyzers to Spicy.
As initial examples, this branch ports the Syslog and Finger analyzers
over. We leave the old analyzers in place for now and activate them
iff we compile without any Spicy.

Needs `zeek-spicy-infra` branches in `spicy/`, `spicy-plugin/`,
`CMake/`, and `zeek/zeek-testing-private`.

Note that the analyzer events remain associated with the Spicy plugin
for now: that's where they will show up with `-NN`, and also inside
the Zeekygen documentation.

We switch CMake over to linking the runtime library into the plugin,
vs. at the top-level through object libraries.
2023-02-01 11:33:48 +01:00
Arne Welzel
8be8c22b3e smb1: Prevent accessing uninitialized referenced_tree
The added pcap was created from an OSS Fuzz test case and is borderline
valid SMB traffic, but it triggered a scripting error.

Closes #2726
2023-01-27 19:22:13 +01:00
Arne Welzel
672602dae7 MySQL: Fix endianness, introduce mysql_eof() event
We were parsing MySQL using bigendian even though the protocol is
specified as with "least significant byte first" [1]. This is most
problematic when parsing length encoded strings with 2 byte length
fields...

Further, I think, the EOF_Packet parsing was borked, either due to
testing the CLIENT_DEPRECATE_EOF with the wrong endianness, or due to
the workaround in Resultset processing raising mysql_ok(). Introduce a
new mysql_eof() that triggers for EOF_Packet's and remove the fake
mysql_ok() Resultset invocation to fix. Adapt the mysql script and tests
to account for the new event.

This is a quite backwards incompatible change on the event level, but
due to being quite buggy in general, doubt this matters to many.

I think there is more buried, but this fixes the violation of the simple
"SHOW ENGINE INNODB STATUS" and the existing tests continue to
succeed...

[1] https://dev.mysql.com/doc/dev/mysql-server/latest/page_protocol_basic_dt_integers.html
2023-01-27 10:59:23 +01:00
Tim Wojtulewicz
6cfb45d24f Merge remote-tracking branch 'jeff-bb/patch-2'
* jeff-bb/patch-2:
  Log raw keyboard value on best guess
  Avoid excessive fmt calls, return default behavior on unknown
  "Best Guess" unknown keyboard / language variants
2023-01-23 12:50:23 -07:00
jeff-bb
7085104c33
Log raw keyboard value on best guess 2023-01-23 09:12:48 -06:00
jeff-bb
04113b13d5
Avoid excessive fmt calls, return default behavior on unknown
Using "in" to query the language const. This also handles the case of not having a best guess and continue using the existing behavior.

Given
keyboard_layout = 1033 (0x0409), "keyboard-English - United States"
keyboard_layout = 66569 (0x00010409), "keyboard-English - United States (Best Guess)"
keyboard_layout = 12345 (0x3039), "keyboard-12345"
2023-01-20 08:29:55 -06:00
jeff-bb
3012e0417a
Remove Duplicate 4122 Croatian 2023-01-19 17:04:42 -06:00
jeff-bb
dd2cdb064b
"Best Guess" unknown keyboard / language variants
If the lookup table does not have an entry, it will just log as the raw decimal language/keyboard code. With this change, if we do not have an entry in the lookup table, we'll look at the low order / 4 least significant bits to see if we have a match. The high order / 4 most significant bits are flags/modifiers to the base language/keyboard code. We'll append that it is a "Best Guess"

(This is my first attempt at Zeek scripting, apologies upfront if I'm missing obvious language features. I feel like the const language lookup should return a success/fail return code that we would key off of, but unsure how to accomplish that so instead went for string matching on value in == value out).
2023-01-19 16:55:23 -06:00
jeff-bb
ee21b30a18
Revert Sort 2023-01-19 16:05:43 -06:00
jeff-bb
498aaef428
Update RDP Keyboard Languages
Using additional sources to add more languages / locales. Changed sort order to match other sections.
2023-01-19 15:44:57 -06:00
Christian Kreibich
1c381b5531 Merge branch 'topic/christian/gh-846-tcp-duration-docs'
* topic/christian/gh-846-tcp-duration-docs:
  Expand Conn::Info$duration comment to clarify TCP end-of-connection handling
2022-11-30 09:42:18 -08:00
Christian Kreibich
b0f96fa22c Expand Conn::Info$duration comment to clarify TCP end-of-connection handling
From Vern in GH-846: This is a conscious decision in the TCP analysis to
consider a connection's "duration" to run up through the end of its
productive (= data can be delivered) lifetime, not extending beyond that. So
once it's closed, packets seen subsequently (until the state-holding for the
connection times out) get processed in terms of updating the associated
history, but not the duration. This can include (unnecessarily) retransmitted
data packets, like in one of the examples above. An advantage of this definition
of "duration" is it allows more accurate computation of connection data rates.
2022-11-30 09:39:57 -08:00
Johanna Amann
3253168a53 Merge remote-tracking branch 'origin/topic/awelzel/2583-mqtt-to-base'
* origin/topic/awelzel/2583-mqtt-to-base:
  mqtt: Move from policy/ into base/
2022-11-30 13:44:27 +00:00
Arne Welzel
eb3bea4e4a mqtt: Move from policy/ into base/
Register dpd signatures and the analyzer when running in default mode.

Closes #2583
2022-11-30 10:14:20 +01:00
Arne Welzel
8698a00f03 smb: Drop references to uid_map in state.
This isn't ever written to and probably was meant to be removed during
the following commit: 5b5589e167
2022-11-23 18:19:53 +01:00
Arne Welzel
b04f378f0f smb: Drop AUTH_LOG
This is never used and probably should've been removed
with 143eee5d8d
2022-11-23 18:18:20 +01:00
Tim Wojtulewicz
6055a85b3c Merge remote-tracking branch 'origin/topic/awelzel/smb1-avoid-dialect-index-error'
* origin/topic/awelzel/smb1-avoid-dialect-index-error:
  smb1: Ensure existence of dialect_index in offered dialects
2022-11-16 14:51:56 -07:00
Arne Welzel
e9fa853048 smb1: Ensure existence of dialect_index in offered dialects
When a negotiate request offers no dialects, but the response contains
an ntlm record which selects a dialect, a script error is triggered.

    $ zeek -C -r ./f2b0e.pcap 'DPD::ignore_violations+={ Analyzer::ANALYZER_SMB }'
    1668615340.837882 expression error in /home/awelzel/corelight-oss/zeek/scripts/base/protocols/smb/./smb1-main.zeek, line 96: no such index (SMB1::c$smb_state$current_cmd$smb1_offered_dialects[SMB1::response$ntlm$dialect_index])

Script error triggered by fuzzing when testing Tim's all-the-fuzzing branch.
2022-11-16 17:49:55 +01:00
Arne Welzel
187096d4a4 ssh: Test for c$ssh$analyzer_id existence
While unusual, analyzer_confirmation() may never be called for the
SSH analyzer, but still ssh_auth_attempted is invoked later indicating
successful authentication. I haven't checked how that is actually possible,
but seems prudent to check for the existence of c$ssh$analyzer_id before
referencing it (also in light of runtime enable/disabling of events).

This was found testing Tim's all-the-fuzzing branch on large system,
merging this should avoid oss-fuzz telling us about it.

    $ zeek -C -r ./e83db.pcap 'DPD::ignore_violations+={ Analyzer::ANALYZER_SSH }'
    1668610572.429058 expression error in scripts/base/protocols/ssh/./main.zeek, line 260: field value missing (SSH::c$ssh$analyzer_id)
2022-11-16 16:35:57 +01:00
Arne Welzel
8b04868de3 {http,smtp}/entities: Align header regexes with extract_filename_from_content_disposition() 2022-11-08 16:45:25 -07:00
Arne Welzel
c132d140ae ftp: Limit pending commands to FTP::max_pending_commands (default 20) 2022-11-08 16:44:17 -07:00
Tim Wojtulewicz
68450eac47 Merge remote-tracking branch 'origin/topic/timw/update-dns-types'
* origin/topic/timw/update-dns-types:
  Update external test hashes
  Update DNS type strings to match correct mappings
2022-11-03 08:57:19 -07:00
Josh Soref
21e0d777b3 Spelling fixes: scripts
* accessing
* across
* adding
* additional
* addresses
* afterwards
* analyzer
* ancillary
* answer
* associated
* attempts
* because
* belonging
* buffer
* cleanup
* committed
* connects
* database
* destination
* destroy
* distinguished
* encoded
* entries
* entry
* hopefully
* image
* include
* incorrect
* information
* initial
* initiate
* interval
* into
* java
* negotiation
* nodes
* nonexistent
* ntlm
* occasional
* omitted
* otherwise
* ourselves
* paragraphs
* particular
* perform
* received
* receiver
* referring
* release
* repetitions
* request
* responded
* retrieval
* running
* search
* separate
* separator
* should
* synchronization
* target
* that
* the
* threshold
* timeout
* transaction
* transferred
* transmission
* triggered
* vetoes
* virtual

Signed-off-by: Josh Soref <2119212+jsoref@users.noreply.github.com>
2022-11-02 17:36:39 -04:00
Tim Wojtulewicz
c9610ec45b Update DNS type strings to match correct mappings 2022-11-02 14:22:46 -07:00
Arne Welzel
8c5896a74d scripts: Migrate table iteration to blank identifiers
No obvious hot-cases. Maybe the describe_file() ones or the intel ones
if/when there are hot intel hits.
2022-10-24 10:36:09 +02:00
Tim Wojtulewicz
a91d363e56 smtp: Prevent script errors when smtp$entity is not set
This is the same issue presented in 38e226bf75 but
for SMTP instead of HTTP.
2022-10-10 11:26:08 -07:00
Johanna Amann
3d9a1157f9 Update CT log list.
This uses the v3 json as a source for the first time. The test needed
some updating because Google removed a couple more logs - in the future
this should hopefully not be neccessary anymore because I think v3
should retain all logs.

In theory this might be neat in 5.1.
2022-10-06 15:10:59 +01:00
Tim Wojtulewicz
37d4a28816 Merge remote-tracking branch 'origin/topic/awelzel/http-no-current-entity'
* origin/topic/awelzel/http-no-current-entity:
  http: Prevent script errors when http$current_entity is not set
2022-10-03 09:44:46 -07:00
Arne Welzel
bc8fd5a4c6 Introduce generic analyzer_confirmation_info and analyzer_violation_info
Introduce two new events for analyzer confirmation and analyzer violation
reporting. The current analyzer_confirmation and analyzer_violation
events assume connection objects and analyzer ids are available which
is not always the case. We're already passing aid=0 for packet analyzers
and there's not currently a way to report violations from file analyzers
using analyzer_violation, for example.

These new events use an extensible Info record approach so that additional
(optional) information can be added later without changing the signature.
It would allow for per analyzer extensions to the info records to pass
analyzer specific info to script land. It's not clear that this would be
a good idea, however.

The previous analyzer_confirmation and analyzer_violation events
continue to exist, but are deprecated and will be removed with Zeek 6.1.
2022-09-27 17:49:51 +02:00
Arne Welzel
38e226bf75 http: Prevent script errors when http$current_entity is not set
The current_entity tracking in HTTP assumes that client/server never
send HTTP entities at the same time. The attached pcap (generated
artificially) violates this and triggers:

    1663698249.307259 expression error in <...>base/protocols/http/./entities.zeek, line 89: field value missing (HTTP::c$http$current_entity)

For the http-no-crlf test, include weird.log as baseline. Now that weird is
@load'ed from http, it is actually created and seems to make sense
to btest-diff it, too.
2022-09-26 10:18:24 +02:00
Arne Welzel
660172013b scripts/conn: Open-code determine_service()
...and avoid doing it as suggested by Justin to avoid the extra over-head
in scan scenarios where c$service is empty.
2022-09-20 23:07:26 +02:00
Arne Welzel
31aeb58e10 dpd: Replace negated service fmt() magic with dedicated field
...the only known cases where the `-` for `connection$service` was
handled is to skip/ignore these analyzers.

Slight suspicion that join_string_set() should maybe become a bif
now determine_service() runs once for each connection.

Closes #2388
2022-09-20 23:07:26 +02:00
Arne Welzel
b60a4e3a1f scripts/dce-rpc,ntlm: Do not load base/frameworks/dpd
DPD will work without loading this explicitly and these are the only
scripts that do load it explicitly.
2022-08-31 16:50:37 +02:00
Tim Wojtulewicz
7fe6290974 Merge remote-tracking branch 'micrictor/master'
* micrictor/master:
  Add a field to Modbus/TCP log to indicate the Modbus PDU type
  Add modbus transaction and unit ids to logs
  Enable modbus logging for requests
2022-08-11 11:57:10 -07:00
Arne Welzel
02985b9966 ssl: Only delete c$ssl$analyzer_id when disabling the analyzer was successful
The next patch will have a test script rely on c$ssl$analyzer_id staying
around when disable_analyzer() wasn't successful.

I was tempted to remove the `delete` completely as neither RDP nor SSH
have that and not sure why SSL is special here.
2022-08-11 09:40:34 +02:00
Peter Cullen
fb4858d42b Prevent large dhcp log entries
A flood of DHCP traffic can result if very large log entries consisting
of many uids and/or msg_types. Such large log entries can disrupt a SIEM
ingestion pipeline. This change forcing a log entry to be written when
the number of uids or the number of msg_Types exceed a certain value.
The values are treated as options for easy configuration.
2022-07-28 11:34:18 -07:00
Michael Torres
b85801aa7e Add a field to Modbus/TCP log to indicate the Modbus PDU type
Add the `pdu_type` field to Modbus over TCP logs to indicate whether the Modbus
message was a request or a response. Due to the client/server nature of Modbus
over TCP/IP, all messages from the TCP session originator are requests, while
all messages from the TCP session responder are responses.

Adding this information to the default log surfaces protocol metadata in a way
that doesn't require users to understand the Modbus over TCP protocol.
2022-07-24 02:41:26 +00:00
Michael Torres
bab2036aa4 Add modbus transaction and unit ids to logs
Add transaction IDs and unit IDs to default modbus over TCP/IP logs.
Update the relevant testing baselines to account for the extra fields.
2022-07-17 21:02:37 +00:00
Michael Torres
7c24b53b4f Enable modbus logging for requests 2022-07-17 21:02:37 +00:00
Arne Welzel
3dae8ab086 smb2: Raise smb2_file_delete for CREATE with FILE_DELETE_ON_CLOSE
When a CREATE request contains the FILE_DELETE_ON_CLOSE option and
the subsequent CREATE response indicates success, we now raise the
smb2_file_delete event to log a delete action in smb_files.log and
also give users a way to handle this scenario.

The provided pcap was generated locally by recording a smbtorture run
of the smb2.delete-on-close-perms test case.

Placed the create_options into the CmdInfo record for potential
exposure in smb_cmd.log (wasn't sure how that would look so left it
for the future).

Fixes #2276.
2022-07-16 17:14:13 +02:00
Johanna Amann
e14eddeb97 SSL Analyzer: track connection direction by messages
This PR changes the way in which the SSL analyzer tracks the direction
of connections. So far, the SSL analyzer assumed that the originator of
a connection would send the client hello (and other associated
client-side events), and that the responder would be the SSL servers.

In some circumstances this is not true, and the initiator of a
connection is the server, with the responder being the client. So far
this confused some of the internal statekeeping logic and could lead to
mis-parsing of extensions.

This reversal of roles can happen in DTLS, if a connection uses STUN -
and potentially in some StartTLS protocols.

This PR tracks the direction of a TLS connection using the hello
request, client hello and server hello handshake messages. Furthermore,
it changes the SSL events from providing is_orig to providing is_client,
where is_client is true for the client_side of a connection. Since the
argument positioning in the event has not changed, old scripts will
continue to work seamlessly - the new semantics are what everyone
writing SSL scripts will have expected in any case.

There is a new event that is raised when a connection is flipped. A
weird is raised if a flip happens repeatedly.

Addresses GH-2198.
2022-06-24 18:35:44 +01:00