Commit graph

8 commits

Author SHA1 Message Date
Johanna Amann
0c875220e9 Default canonifier change to only remove first timestamp in line
In the past, we used a default canonifier, which removes everything that
looks like a timestamp from log files. The goal of this is to prevent
logs from changing, e.g., due to local system times ending up in log
files.

This, however, also has the side-effect of removing information that is
parsed from protocols which probably should be part of our tests.
There is at least one test (1999 certificates) where the entire test
output was essentially removed by the canonifier.

GH-4521 was similarly masked by this.

This commit changes the default canonifier, so that only the first
timestamp in a line is removed. This should skip timestamps that are
likely to change while keeping timestamps that are parsed
from protocol information.

A pass has been made over the tests, with some additional adjustments
for cases which require the old canonifier.

There are some cases in which we probably could go further and not
remove timestamps at all - that, however, seems like a follow-up
project.
2025-06-18 15:41:48 +01:00
Vern Paxson
c11c2830b1 performance speed-up for SMB base scripts 2024-04-25 09:15:12 -07:00
Christian Kreibich
1843e2daae Update btest baselines to reflect the use of local address ranges. 2023-03-15 17:11:04 -07:00
Arne Welzel
d2314d2666 files.log: Unroll and introduce uid and id fields
This is a script-only change that unrolls File::Info records into
multiple files.log entries if the same file was seen over different
connections by single worker. Consequently, the File::Info record
gets the commonly used uid and id fields added. These fields are
optional for File::Info - a file may be analyzed without relation
to a network connection (e.g by using Input::add_analysis()).

The existing tx_hosts, rx_hosts and conn_uids fields of Files::Info
are not meaningful after this change and removed by default. Therefore,
files.log will have them removed, too.

The tx_hosts, rx_hosts and conn_uids fields can be revived by using the
policy script frameworks/files/deprecated-txhosts-rxhosts-connuids.zeek
included in the distribution. However, with v6.1 this script will be
removed.
2022-08-16 17:22:20 +02:00
Robin Sommer
369e42a6e4 Fix SMB tests on Apple M1.
Due to different double precision on M1, file IDs for SMB could end up
changing on M1 because the access time of a file goes into their
computation. The real solution for this would be changing Zeek's
internal "time" representation to uint64; that's planned, but requires
major surgery. For now, this PR changes the SMB code to also pass SMB's
original time representation (which is a uint64) into script-land, and
then use that for computing the file ID.

Closes #1406
2021-06-29 20:17:02 +02:00
Christian Kreibich
0b674eb851 Baseline refresh to reflect btest 0.64 2020-12-06 20:19:49 -08:00
Johanna Amann
3bce313b12 Switch file UID hashing from md5 to highwayhash.
This commit switches UID hashing from md5 to a highway hash. It also
moves the salt value out of the file plugin - and makes it
installation-specific instead - it is moved to the global namespace.

There now are digest hash functions to make "static"
installation-specific hashes that are stable over workers available to
everyone; hashes can be 64, 128 or 256 bits in size.

Due to the fact that we switch the file hashing algorithm, all file
hashes change.

The underlyigng algorithm that is used for hashing is highwayhash-128,
which is significantly faster than md5.
2020-04-30 10:20:09 -07:00
Jon Siwek
2d8acab664 Merge branch 'smb2-fix' of https://github.com/mauropalumbo75/zeek
* 'smb2-fix' of https://github.com/mauropalumbo75/zeek:
  added test and pcap files for smb_files.log fix
  fixing some missing log lines in smb_files.log
2019-03-20 18:01:35 -07:00