Commit graph

120 commits

Author SHA1 Message Date
Arne Welzel
31b548babc ftp: Reset fuid after logging
A user reported being confused about the fuid association of subsequent
FTP commands when a data transfer has completed. It seems reasonable to
unset fuid upon logging a FTP command which had a fuid.

The current behavior results in the PORT or PASV commands after a RETR or STOR
to have the fuid of the prior file transfer. Similarly, any CWD or DEL commands
following a file transfer will unnecessarily be logged with the fuid of the
prior file transfer.

This tickles the baselines for the private testing PCAP a lot, primarily
because there data connections in that pcap are never established properly.
E.g, the fuids FzDzid1Dxm9srVKHXf and FEfYX73q5C6GEQZXX9 have been re-used
for multiple commands.

This may look like we're losing information, but the fuids vanishing
in the normal btests belong to a LIST command that isn't logged by
default into ftp.log. If it was, the fuid would be attached to it.
2024-02-21 12:41:32 +01:00
Arne Welzel
ce4cbac1ef ftp: Do not base seq on number of pending commands
Previously, seq was computed as the result of |pending_commands|+1. This
opened the possibility to override queued commands, as well as logging
the same pending ftp reply multiple times.

For example, when commands 1, 2, 3 are pending, command 1 may be dequeued,
but the incoming command then receives seq 3 and overrides the already
pending command 3. The second scenario happens when ftp_reply() selected
command 3 as pending for logging, but is then followed by many ftp_request()
events. This resulted in command 3's response being logged for every
following ftp_request() over and over again.

Avoid both scenarios by tracking the command sequence as an absolute counter.
2023-10-24 19:10:07 +02:00
Arne Welzel
b2c40a22cb ftp: Do not log non-pending commands
OSS Fuzz generated a CWD request and reply followed by very many EPRT
requests. This caused Zeek to re-log the CWD request and invoke `build_url_ftp()`
over and over again resulting in long processing times.

Avoid this scenario by not logging commands that aren't pending anymore.

(cherry picked from commit b05dd31667ff634ec7d017f09d122f05878fdf65)
2023-09-12 12:00:36 -07:00
Tim Wojtulewicz
7a867d52e2 Remove script functions marked as unused (6.1 deprecations) 2023-06-14 10:07:22 -07:00
Tim Wojtulewicz
5a3abbe364 Revert "Merge remote-tracking branch 'origin/topic/vern/at-if-analyze'"
This reverts commit 4e797ddbbc, reversing
changes made to 3ac28ba5a2.
2023-05-31 09:20:33 +02:00
Vern Paxson
e441ba394a updates reflecting review comments 2023-05-25 18:00:13 -07:00
Vern Paxson
890010915a change base scripts to use run-time if's or @if ... &analyze 2023-05-19 13:26:27 -07:00
Arne Welzel
64f84aba34 ftp: No unbounded directory command re-use
OSS-Fuzz generated traffic containing a CWD command with a single very large
path argument (427kb) starting with ".___/` \x00\x00...", This is followed
by a large number of ftp replies with code 250. The directory logic in
ftp_reply() would match every incoming reply with the one pending CWD command,
triggering path buildup ending with something 120MB in size.

Protect from re-using a directory command by setting a flag in the
CmdArg record when it was consumed for the path traversal logic.

This doesn't prevent unbounded path build-up generally, but does prevent the
amplification of a single large command with very many small ftp_replies.
Re-using a pending path command seems like a bug as well.
2023-05-19 09:37:12 -07:00
Arne Welzel
7665e808a2 ftp/main: Special case for intermediate reply lines
The medium.trace in the private external test suite contains one
session/server that violates the multi-line reply protocol and
happened to work out fairly well regardless due to how we looked
up the pending commands unconditionally before.

Continue to match up reply lines that "look like they contain status codes"
even if cont_resp = T. This still improves runtime for the OSS-Fuzz
generated test case and keeps the external baselines valid.

The affected session can be extracted as follows:

    zcat Traces/medium.trace.gz | tcpdump -r  - 'port 1491 and port 21'

We could push this into the analyzer, too, minimally the RFC says:

    > If an intermediary line begins with a 3-digit number, the Server
    > must pad the front  to avoid confusion.
2023-04-03 14:05:13 +02:00
Arne Welzel
1b3e8a611e ftp/main: Skip get_pending_command() for intermediate reply lines
Intermediate lines of multiline replies usually do not contain valid status
codes (even if servers may opt to include them). Their content may be anything
and likely unrelated to the original command. There's little reason for us
trying to match them with a corresponding command.

OSS-Fuzz generated a large command reply with very many intermediate lines
which caused long processing times due to matching every line with all
currently pending commands.
This is a DoS vector against Zeek. The new ipv6-multiline-reply.trace and
ipv6-retr-samba.trace files have been extracted from the external ipv6.trace.
2023-03-23 13:50:36 +01:00
Arne Welzel
f56785740c ftp: Limit user, password, arg and reply_msg column sizes in log
The user and password fields are replicated to each of the ftp.log
entries. Using a very large username (100s of KBs) allows to bloat
the log without actually sending much traffic. Further, limit the
arg and reply_msg columns to large, but not unbounded values.
2023-02-21 12:28:07 -07:00
Arne Welzel
c132d140ae ftp: Limit pending commands to FTP::max_pending_commands (default 20) 2022-11-08 16:44:17 -07:00
Arne Welzel
8c5896a74d scripts: Migrate table iteration to blank identifiers
No obvious hot-cases. Maybe the describe_file() ones or the intel ones
if/when there are hot intel hits.
2022-10-24 10:36:09 +02:00
Vern Paxson
07cf5cb089 deprecation messages for unused base script functions 2022-05-27 14:36:30 -07:00
Vern Paxson
6dc711c39e annotate orphan base script components with &deprecated 2022-05-26 17:39:17 -07:00
Vern Paxson
9b8ac44169 annotate base scripts with &is_used as needed 2022-05-26 17:39:17 -07:00
Tim Wojtulewicz
a6378531db Remove trailing whitespace from script files 2021-10-20 09:57:09 -07:00
Johanna Amann
93d7778f97 Small bugfix and updates for external test hashes (SSL/X509) 2021-06-29 15:25:08 +01:00
Johanna Amann
b02f22a667 Change SSL and X.509 logging format
This commit changes the SSL and X.509 logging formats to something that,
hopefully, slowly approaches what they will look like in the future.

X.509 log is not yet deduplicated; this will come in the future.

This commit introduces two new options, which determine if certificate
issuers and subjects are still logged in ssl.log. The default is to have
the host subject/issuer logged, but to remove client-certificate
information. Client-certificates are not a typically used feature
nowadays.
2021-06-29 09:26:43 +01:00
Vern Paxson
c991c54690 &is_set => &is_assigned 2021-02-04 12:18:46 -08:00
Vern Paxson
0d77b474e6 adding &is_set attributes to base scripts so -u output isn't cluttered 2021-01-23 10:55:27 -08:00
Christian Kreibich
1bd658da8f Support for log filter policy hooks
This adds a "policy" hook into the logging framework's streams and
filters to replace the existing log filter predicates. The hook
signature is as follows:

    hook(rec: any, id: Log::ID, filter: Log::Filter);

The logging manager invokes hooks on each log record. Hooks can veto
log records via a break, and modify them if necessary. Log filters
inherit the stream-level hook, but can override or remove the hook as
needed.

The distribution's existing log streams now come with pre-defined
hooks that users can add handlers to. Their name is standardized as
"log_policy" by convention, with additional suffixes when a module
provides multiple streams. The following adds a handler to the Conn
module's default log policy hook:

    hook Conn::log_policy(rec: Conn::Info, id: Log::ID, filter: Log::Filter)
            {
            if ( some_veto_reason(rec) )
                break;
            }

By default, this handler will get invoked for any log filter
associated with the Conn::LOG stream.

The existing predicates are deprecated for removal in 4.1 but continue
to work.
2020-09-30 12:32:45 -07:00
Jon Siwek
05cf511f18 GH-1119: add base/protcols/conn/removal-hooks.zeek
This adds two new functions: `Conn::register_removal_hook()` and
`Conn::unregister_removal_hook()` for registering a hook function to be
called back during `connection_state_remove`.  The benefit of using hook
callback approach is better scalability: the overhead of unrelated
protocols having to dispatch no-op `connection_state_remove` handlers is
avoided.
2020-09-11 12:12:10 -07:00
Jon Siwek
5f435c2644 Remove connection_successful and successful_connection_remove events
Related to https://github.com/zeek/zeek/issues/1119
2020-09-10 12:06:50 -07:00
Johanna Amann
b948180247 Fix minimize_info in ftp/main not returning a value.
Fixes GH-1120
2020-08-12 19:53:53 +00:00
Jon Siwek
7e9a3e1e00 Minimize data published for expected FTP data channel analysis
Previously, more data than could effectively be utilized by any remote
Zeek was published (e.g. full list of pending commands or other
transient state that may add up to non-trivial amount of bytes).
2020-06-17 12:45:21 -07:00
Jon Siwek
31f60853c9 GH-646: add new "successful_connection_remove" event
And switch Zeek's base scripts over to using it in place of
"connection_state_remove".  The difference between the two is
that "connection_state_remove" is raised for all events while
"successful_connection_remove" excludes TCP connections that were never
established (just SYN packets).  There can be performance benefits
to this change for some use-cases.

There's also a new event called ``connection_successful`` and a new
``connection`` record field named "successful" to help indicate this new
property of connections.
2019-11-11 19:52:59 -08:00
Jon Siwek
872adda5b1 Merge branch 'topic/jsbarber/ftp-cluster-fix-patch' of https://github.com/jsbarber/zeek
Minor cleanup in merge: remove print statements and unnecessary @if
directive.

* 'topic/jsbarber/ftp-cluster-fix-patch' of https://github.com/jsbarber/zeek:
  Publish ftp_data_expected updates to other workers for synchronization
2019-11-04 17:31:59 -08:00
Jeff Barber
d698bddc7d Publish ftp_data_expected updates to other workers for synchronization 2019-10-30 15:50:22 -06:00
Jon Siwek
aebcb1415d GH-234: rename Broxygen to Zeexygen along with roles/directives
* All "Broxygen" usages have been replaced in
  code, documentation, filenames, etc.

* Sphinx roles/directives like ":bro:see" are now ":zeek:see"

* The "--broxygen" command-line option is now "--zeexygen"
2019-04-22 19:45:50 -07:00
Jon Siwek
a994be9eeb Merge remote-tracking branch 'origin/topic/seth/zeek_init'
* origin/topic/seth/zeek_init:
  Some more testing fixes.
  Update docs and tests for bro_(init|done) -> zeek_(init|done)
  Implement the zeek_init handler.
2019-04-19 11:24:29 -07:00
Seth Hall
8cefb9be42 Implement the zeek_init handler.
Implements the change and a test.
2019-04-14 08:37:35 -04:00
Daniel Thayer
18bd74454b Rename all scripts to have ".zeek" file extension 2019-04-11 21:12:40 -05:00
Jon Siwek
01d303b480 Migrate table-based for-loops to key-value iteration 2019-03-15 19:54:44 -07:00
Vlad Grigorescu
93c094fff2 Switch GridFTP options from redef to option 2018-11-05 13:41:05 -06:00
Jon Siwek
6a059a1cf7 Fix minor documentation mistakes 2018-10-25 18:56:38 -05:00
Daniel Thayer
9bfc01b705 Convert more redef-able constants to runtime options 2018-08-27 19:38:47 -05:00
Daniel Thayer
01a899255e Convert more redef-able constants to runtime options 2018-08-24 16:05:44 -05:00
Daniel Thayer
dc0904a7f3 Convert some redef-able constants to runtime options 2018-08-15 10:17:14 -05:00
Johanna Amann
39a026c88d Merge remote-tracking branch 'origin/topic/jazoff/fix-gridftp'
* origin/topic/jazoff/fix-gridftp:
  problem: gridftp threshold is being applied to all connections
2017-09-21 09:15:57 -07:00
Justin Azoff
6b864d5dd2 problem: gridftp threshold is being applied to all connections
The bytes_threshold_crossed event in the gridftp analyzer is not first
checking to see if the connection passed the initial criteria.  This
causes the script to add the gridftp-data service to any connection that
crosses a threshold that is the same as or greater than the gridftp
size_threshold.
2017-09-21 10:50:26 -04:00
Robin Sommer
d195f1b047 Fixing FTP cwd getting overlue long.
Now storing them compressed.
2016-05-29 08:52:47 -07:00
Seth Hall
08399da6cb Files transferred over FTP were showing incorrect sizes.
The server-reported file size was being collected poorly and if
a file name had a number in it, that was reported as the file
size instead of the actual size.

A new test is included to avoid reintroducing the problem.
2016-03-11 12:56:28 -05:00
Robin Sommer
582da62d04 Fix reporter errors with GridFTP traffic. 2015-06-08 09:42:06 -07:00
Robin Sommer
ed91732e09 Merge remote-tracking branch 'origin/topic/seth/more-file-type-ident-fixes'
* origin/topic/seth/more-file-type-ident-fixes:
  File API updates complete.
  Fixes for file type identification.
  API changes to file analysis mime type detection.
  Make HTTP 206 reassembly require ETags by default.
  More file type identification improvements
  Fix an issue with files having gaps before the bof_buffer is filled.
  Fix an issue with packet loss in http file reporting.
  Adding WOFF fonts to file type identification.
  Extended JSON matching and added OCSP responses.
  Another large signature update.
  More signature updates.
  Even more file type ident clean up.
  Lots of fixes for file type identification.

BIT-1368 #merged
2015-04-20 13:31:00 -07:00
Seth Hall
ed375167c8 File API updates complete.
Addresses BIT-1368.
2015-04-20 10:46:48 -04:00
Robin Sommer
1e010fbb76 Merge remote-tracking branch 'origin/topic/johanna/conn-threshold'
* origin/topic/johanna/conn-threshold:
  Wrap threshold stuff up - fix two small bugs and update baselines.
  update GridFTP analyzer to use connection thresholding instead of polling
  Add high level api for thresholding that holds lists of thresholds and raises an event for each threshold exactly once.
  Allow setting packet and byte thresholds for connections.

BIT-1377 #merged
2015-04-17 13:02:31 -07:00
Johanna Amann
b44b725d59 Wrap threshold stuff up - fix two small bugs and update baselines. 2015-04-17 09:59:34 -07:00
Johanna Amann
024bb7206e update GridFTP analyzer to use connection thresholding instead
of polling
2015-04-17 07:15:53 -07:00
Jon Siwek
a55ce01ef3 API changes to file analysis mime type detection.
Removed "file_mime_type" and "file_mime_types" event, replacing them
with a new event called "file_metadata_inferred".  It has a record
argument of type "inferred_file_metadata", which contains the mime type
information that the earlier events used to supply.  The idea here is
that future extensions to the record with new metadata will be less
likely to break user code than the alternatives (adding new events or
new event parameters).

Addresses BIT-1368.
2015-04-10 16:31:29 -05:00