This commit revamps the handling of analyzer violations that happen
before an analyzer confirms the protocol.
The current state is that an analyzer is disabled after 5 violations, if
it has not been confirmed. If it has been confirmed, it is disabled
after a single violation.
The reason for this is a historic mistake. In Zeek up to versions 1.5,
analyzers were unconditionally removed when they raised the first
protocol violation.
When this script was ported to the new layout for Zeek 2.0 in
b4b990cfb5, a logic error was introduced
that caused analyzers to no longer be disabled if they were not
confirmed.
This was the state for ~8 years, until the DPD::max_violations option
was added, which introduced the current approach of disabling unconfirmed
analyzers after 5 violations. Sadly, there is not much discussion about
this change - from my hazy memory, I think this was discovered during
performance tests and the new behavior was added without checking into
the history of previous changes.
This commit reinstates the originally intended behavior of DPD. When an
analyzer that has not been confirmed raises a protocol violation, it is
immediately removed from the connection. This also makes a lot of sense
- this allows the analyzer to be in a "tasting" phase at the beginning
of the connection, and to error out quickly once it realizes that it was
attached to a connection not containing the desired protocol.
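The difference between the two policies can be sketched in a few lines of
Python. This is only an illustrative model, not the DPD script itself: the
`AnalyzerState` class and both function names are hypothetical, and
treating the fifth violation as the cutoff is an assumption based on the
description above.

```python
from dataclasses import dataclass

# Threshold described above for unconfirmed analyzers (assumed to trigger
# on the fifth violation).
OLD_MAX_VIOLATIONS = 5


@dataclass
class AnalyzerState:
    """Hypothetical per-connection bookkeeping for one analyzer."""
    confirmed: bool = False
    violations: int = 0
    disabled: bool = False


def on_violation_old(state: AnalyzerState) -> bool:
    """Previous policy: confirmed analyzers are disabled on the first
    violation, unconfirmed ones only after OLD_MAX_VIOLATIONS."""
    state.violations += 1
    if state.confirmed or state.violations >= OLD_MAX_VIOLATIONS:
        state.disabled = True
    return state.disabled


def on_violation_new(state: AnalyzerState) -> bool:
    """New policy: any violation disables the analyzer, confirmed or not."""
    state.violations += 1
    state.disabled = True
    return state.disabled
```

Under the old policy an unconfirmed analyzer keeps running through four
violations; under the new one it is gone after the first.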
This change also removes the DPD::max_violations option, as it no longer
serves any purpose after this change. (In practice, the option remains
with an &deprecated warning, but it is no longer used for anything).
There are relatively minimal test-baseline changes due to this; they are
mostly triggered by the removal of the data structure and by fewer
analyzer errors being thrown, as unconfirmed analyzers are disabled
after the first error.
While this seems like interesting functionality, it hasn't been documented,
maintained or knowingly leveraged for many years.
There are various other approaches today, too:
* We track the number of event handler invocations regardless of
profiling. It's possible to approximate a load_sample event by
comparing the result of two get_event_stats() calls. Or, visualize
the corresponding counters in a Prometheus setup to get an idea of
event/s broken down by event names.
* HookCallFunction() allows intercepting script execution, including
measuring the time execution takes.
* The global call_stack and g_frame_stack can be used from plugins
(and even external processes) to walk the Zeek script stack at certain
points to implement a sampling profiler.
* USDT probes or more plugin hooks will likely be preferred over Zeek
builtin functionality in the future.
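The first approach, comparing two event-count snapshots, could look
roughly like the Python sketch below. It assumes each snapshot has already
been exported as a plain mapping from event name to cumulative invocation
count; that shape and the `events_per_second` helper are assumptions made
for illustration, not the return type of get_event_stats().

```python
def events_per_second(before, after, elapsed_seconds):
    """Approximate per-event rates from two cumulative count snapshots.

    `before` and `after` map event names to invocation counts (an assumed
    export format). Events whose count did not grow are omitted.
    """
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    rates = {}
    for name, count in after.items():
        delta = count - before.get(name, 0)
        if delta > 0:
            rates[name] = delta / elapsed_seconds
    return rates
```

For example, snapshots taken ten seconds apart in which `new_connection`
grew from 100 to 150 yield a rate of 5.0 events per second for it.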
Relates to #3458
The dump-events baseline changes are pure noise and have spurred confusion
for internal and external contributors. For example, adding new
analyzers has perturbed the ordering of sets holding analyzer tags.
When running in non-bare mode, the baselines change almost whenever any of
the record types attached to connections change in the default scripts.
This causes continuous baseline updates that seem to be of little use.
This change switches the test to run in bare mode and explicitly loads
just base/protocols/conn and base/protocols/smtp. The primary intention
of the test should be testing the functionality of the misc/dump-events
script, not the raised events of all loaded default scripts (for that the
used PCAP is too narrow).
Protocol specific scripts that do want to leverage misc/dump-events for
baseline creation of their or their analyzer's events can add additional
specific tests with suitable PCAP files.
Using pcaps from https://interop.seemann.io/ as samples for QUIC protocol
data didn't produce a conn.log for the contained data. `tcpdump -r`
and Wireshark do show the contained IP/UDP packets. Teach Zeek how
to handle link type DLT_PPP 0x09 using a new PPP analyzer based on the
PPPSerial analyzer code.
Usual update to files/x509 baseline after adding new analyzer due
to enum values changing.
If DataIn() was called and a cur_entity_id (file_id) has been produced
previously, re-use it for calls to EndOfFile(). This avoids a costly
event_mgr.Drain() when we already have that information. It should be safer,
too, as `get_file_handle()` in script-land may generate a different ID and
thereby de-synchronize.
For low-level packet analysis use-cases, these fields are currently
not accessible from script-land via raw_packet() or similar. They are accessible
on the icmp_context record, but not on the actual ip4_hdr record, so
add them.
An invalid mail transaction is determined as
* RCPT TO command without a preceding MAIL FROM
* a DATA command without a preceding RCPT TO
and logged as a weird.
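As an illustration of the two conditions, here is a minimal Python sketch
of the check over a sequence of SMTP command lines. It is not the analyzer
script: the function name and the weird names are hypothetical, and
resetting the transaction state on DATA, RSET, HELO and EHLO is an
assumption. The input is expected to contain commands only, not message
bodies.

```python
def find_invalid_transactions(commands):
    """Return a list of (hypothetical) weird names for out-of-order
    RCPT TO and DATA commands in `commands`."""
    seen_mail_from = False
    seen_rcpt_to = False
    weirds = []
    for raw in commands:
        cmd = raw.strip().upper()
        if cmd.startswith("MAIL FROM"):
            # A new transaction starts; recipients must be given again.
            seen_mail_from = True
            seen_rcpt_to = False
        elif cmd.startswith("RCPT TO"):
            if not seen_mail_from:
                weirds.append("rcpt_to_without_mail_from")
            seen_rcpt_to = True
        elif cmd == "DATA":
            if not seen_rcpt_to:
                weirds.append("data_without_rcpt_to")
            # Assumption: DATA ends the current transaction.
            seen_mail_from = False
            seen_rcpt_to = False
        elif cmd.startswith(("RSET", "HELO", "EHLO")):
            # Assumption: these commands reset any transaction in progress.
            seen_mail_from = False
            seen_rcpt_to = False
    return weirds
```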
The testing pcap for invalid mail transactions was produced with a Python
script against a local exim4 configured to accept more errors and unknown
commands than the default of 3:
# exim4.conf.template
smtp_max_synprot_errors = 100
smtp_max_unknown_commands = 100
See also: https://www.rfc-editor.org/rfc/rfc5321#section-3.3
This would generally happen on the next loop iteration anyway, but it
seems nice to ensure a zero-timeout source is processed at the same
time as sources with ready FDs.
There was a misunderstanding whether to include them by default in
the dns.log, so remove them again.
There had also been a discussion about a quirk: the AD flag of a request
would always be overwritten by that of the reply in the dns.log unless the
reply is missing. For now, let users extend dns.log themselves for what best
fits their requirements, rather than adding these flags by default.
Still add a btest that prints the AD and CD flags for smoke testing.
* 'dnssec-flag-parse' of github.com:micrictor/zeek-codespace:
Update external testing commit hash for DNS flag changes
Parse DNSSEC AD and CD bits
Updated dump-events baseline which seemed unrelated.
Parse authentic data (AD) and checking disabled (CD) bits according to
RFC 2535. Leaves the Z field as-is, in case users are already handling
this elsewhere and depend on the value being the integer for all 3 bits.
https://www.rfc-editor.org/rfc/rfc2535#section-6.1
Fixes #2672
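For reference, the bit layout involved can be sketched in Python as
follows. This is a standalone illustration of extracting the fields from
the 16-bit DNS header flags word, not the analyzer code; the function name
and the returned dictionary are hypothetical. Z is kept as the integer
over all three bits, as described above.

```python
# Positions within the 16-bit DNS header flags word:
# QR | Opcode(4) | AA | TC | RD | RA | Z | AD | CD | RCODE(4)
AD_MASK = 0x0020  # authentic data
CD_MASK = 0x0010  # checking disabled
Z_SHIFT = 4       # the three bits directly above RCODE
Z_MASK = 0x7


def parse_dnssec_flags(flags):
    """Extract AD and CD while leaving Z as the 3-bit integer."""
    return {
        "Z": (flags >> Z_SHIFT) & Z_MASK,
        "AD": bool(flags & AD_MASK),
        "CD": bool(flags & CD_MASK),
    }
```

A response with flags 0x81a0 (QR, RD, RA and AD set) gives AD true, CD
false and a Z value of 2.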
This change exposes the signature type inside the signed portion of an
X.509 certificate. In the past, we only exposed the signature type that
is contained inside the signature, which is outside the signed portion
of the X.509 certificate.
In theory, both signature fields should have the same value; it is,
however, possible to encode differing values in both fields. The new
field is not logged by default.
- Remove tag types for each component type (analyzer, etc)
- Add deprecated versions of the old types
- Remove unnecessary tag element from templates for TaggedComponent and ComponentManager
- Enable TaggedComponent to pass an EnumType when initializing Tag objects
- Update some tests that are affected by the tag enum values changing order
By default, each certificate is now output only once per hour. This also
should work in cluster mode, where we use the new broker-table-syncing
feature to distribute the information about already seen certificates
across the entire cluster.
Log caching is also highly configurable and can be changed using a
range of configuration options and hooks.
Note that this is currently completely separate from X509 events
caching, which prevents duplicate parsing of X509 certificates.
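The once-per-hour behavior amounts to a cache of already logged
certificates with an expiry time. A minimal single-process Python sketch
is shown below; the class name, keying on a certificate fingerprint, and
the explicit `now` argument are all assumptions for illustration, and the
cluster-wide synchronization described above is not modeled.

```python
class SeenCertificates:
    """Remember which certificates were logged recently."""

    def __init__(self, ttl_seconds=3600.0):
        self.ttl = ttl_seconds
        self._expires = {}  # fingerprint -> time at which the entry expires

    def should_log(self, fingerprint, now):
        """Return True if the certificate should be written to the log,
        and record it as seen until `now + ttl`."""
        expiry = self._expires.get(fingerprint)
        if expiry is not None and now < expiry:
            return False
        self._expires[fingerprint] = now + self.ttl
        return True
```

With the default TTL, a certificate first seen at time 0 is suppressed at
1800 seconds and logged again at 3600 seconds.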
This commit changes the SSL and X.509 logging formats to something that,
hopefully, slowly approaches what they will look like in the future.
X.509 log is not yet deduplicated; this will come in the future.
This commit introduces two new options, which determine if certificate
issuers and subjects are still logged in ssl.log. The default is to have
the host subject/issuer logged, but to remove client-certificate
information. Client certificates are not a commonly used feature
nowadays.
* topic/johanna/GH-169:
Make event ordering deterministic
dump-events: try to make baseline work on all systems
Introduce generate_all_events bif and add option to misc/dump-events
Fixes GH-169
generate_all_events causes all events to be raised internally; this
makes it possible for dump_events to really capture all events (and not
just those that were handled).
Addresses GH-169
This adds two new functions: `Conn::register_removal_hook()` and
`Conn::unregister_removal_hook()` for registering a hook function to be
called back during `connection_state_remove`. The benefit of the hook
callback approach is better scalability: the overhead of unrelated
protocols having to dispatch no-op `connection_state_remove` handlers is
avoided.
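The scalability argument can be illustrated with a small Python model in
which hooks are stored per connection, so removing a connection only runs
the callbacks that were registered for it. This is a conceptual sketch,
not the Zeek script API: the `Connection` class is hypothetical, and the
boolean return values are an assumption about how registration reports
duplicates.

```python
class Connection:
    """Hypothetical stand-in for a connection record."""

    def __init__(self, uid):
        self.uid = uid
        self.removal_hooks = []


def register_removal_hook(conn, hook):
    """Attach `hook` to this connection; False if already registered."""
    if hook in conn.removal_hooks:
        return False
    conn.removal_hooks.append(hook)
    return True


def unregister_removal_hook(conn, hook):
    """Detach `hook` from this connection; False if it was not registered."""
    if hook not in conn.removal_hooks:
        return False
    conn.removal_hooks.remove(hook)
    return True


def connection_state_remove(conn):
    """Run only the hooks registered on this particular connection."""
    for hook in list(conn.removal_hooks):
        hook(conn)
```

A protocol script would register its hook when it first sees its protocol
on a connection, so connections carrying other protocols never invoke it.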
This also updates all usages of the deprecated Val ctor to use
either IntervalVal, TimeVal, or DoubleVal ctors. The reason for
doing away with the old constructor is that using it with TYPE_INTERVAL
isn't strictly correct since there exists a more specific subclass,
IntervalVal, with an overridden ValDescribe() method that ought to be used
to print such values in a more descriptive way.
This commit switches UID hashing from md5 to a highway hash. It also
moves the salt value out of the file plugin and into the global
namespace, making it installation-specific instead.
There are now digest hash functions, available to everyone, that produce
"static" installation-specific hashes that are stable across workers;
hashes can be 64, 128 or 256 bits in size.
Because we switch the file hashing algorithm, all file hashes change.
The underlying algorithm that is used for hashing is highwayhash-128,
which is significantly faster than md5.
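The idea of a salted digest that is stable across workers can be sketched
in Python as below. Note the substitution: HighwayHash is not in the
Python standard library, so this example uses keyed BLAKE2b from `hashlib`
as a stand-in. The function name and the way the salt is passed are
assumptions for illustration only.

```python
import hashlib


def salted_digest(data: bytes, salt: bytes, bits: int = 128) -> bytes:
    """Keyed digest of `data` in one of the three sizes mentioned above.

    Uses BLAKE2b as a stand-in for HighwayHash. `salt` plays the role of
    the installation-specific value and must be at most 64 bytes long.
    """
    if bits not in (64, 128, 256):
        raise ValueError("bits must be 64, 128 or 256")
    return hashlib.blake2b(data, key=salt, digest_size=bits // 8).digest()
```

Every worker that shares the same salt computes the same digest for the
same input, while a different installation salt yields different values.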