Zeek's analyzer API makes it hard to determine during analyzer
shutdown whether a regular end-of-data has been reached, or if we're
aborting in the middle of a session (e.g., because Zeek missed the
remaining packets): the corresponding analyzer method, `EndOfData()`
gets called in both cases.
In an earlier change, we had stopped signaling Spicy analyzers a
regular finish when that `EndOfData()` method executes, because doing
so could trigger a parse error if it wasn't a regular shutdown—-which
isn't desired, a user request was to just silently stop processing in
this case.
However, that behavior now seems unfortunate in the case that one
deliberately calls `zeek::protocol_handle_close()` to terminate an
analyzer: this feels like a regular shutdown that should just
immediately happen. We achieve this now in this function by
additionally signaling the shutdown at the TCP layer as an "end of
file", which, for Spicy analyzers, happens to run the final, orderly
tear-down.
Not exactly great, but ti seems to thread the needle to achieve the
desired semantics in both cases.
pre-commit ignores Cargo.lock files for Rust projects, so any movement
in a Rust project's dependencies can break a hook, even if no code in
the hook changed. I have tried to work with upstream on a fix, but they
basically told me they weren't interested and to get lost.
This bumps the `spicy-format` pre-commit hook to a new version which
explicitly deals with bumps of its dependencies. Having to do this
semi-regularly is not fun, and ideally somebody interested in using this
hook would help set up infrastructure in the hook so it just pulls
pre-built binaries. This is not directly supported by pre-commit, but
many projects work around this by declaring a Python module which then
pulls pre-build binaries which already exist for spicy-format.
This changes the PPPoE parser so that it doesn't forward extra bytes
that might be appended after the payload. Instead, it raises a weird if
the payload size doesn't match the size indicated by the header.
This is in line with what other protocol parsers (like UDP) are doing.
Two tests needed to be updated - with this change, the traffic in
pppoe-over-qinq.pcap is now valid TLS. A new trace was introduced for
the confirmation-violation-info test.
Addresses GH-4602
* origin/topic/vern/line-number-ordering:
Bump ZeekJS to work with new Location constructor
remove non-functional column information from Location objects
isolate Location specifics to private class variables to enforce correct line number ordering
The .rst generation doesn't escape the trailing `_` and the docs build
gets upset due to using `type` as a reference target then.
For the better or worse, revert to using tpe. Though I acknowledge this
means we need to be careful with trailing underscores because our docs
build is so fragile.
Partly reverts b9eabbabba.
* origin/topic/awelzel/4605-conn-id-context:
NEWS: Adapt for conn_id$ctx introduction
conn_key/fivetuple: Drop support for non conn_id records
Conn: Move conn_id init and flip to IPBasedConnKey
IPBasedConnKey: Add GetTransportProto() helper
input/Manager: Ignore empty record types
external: Bump commit hashes for external suites
ip/vlan_fivetuple: Populate nested conn_id_context, not conn_id
ConnKey: Extend DoPopulateConnIdVal() with ctx
btest: Update tests and baselines after adding ctx to conn_id
init-bare: Add conn_id_ctx to conn_id
Previously, we supported any records that happened to have orig_h,
resp_h, etc. fields, but it's not exactly clear why we ever did. Users
that relied on this can instantiate an explicit conn_id instance, too.
This loosens the coupling of the script-layer conn_id record and
the code in Conn a bit, moving more into the IPBasedConnKey class.
I'm not quite sure whether moving the flipping logic is worth it,
but assuming Conn could become non-IP in the future, it might.
Somewhere record types with zero fields get the optional attribute
apparently. The input/sqlite/basic test failed due to complaining
that ctx is optional. It isn't optional and when it has zero fields
we can just ignore it, too.
Also adds a input framework test with an explicit empty record type
get_file_handle() may include c$id and perturbs their values when adding new
fields. I think that's reasonable, as files transferred in one VLAN should
be treated separate from files transferred in a different VLAN.
This prepares the move where ConnKey implementations should fill out
ctx rather than filling conn_id directly. The API continues to receive
both, conn_id and ctx, as adding fields to `conn_id` is reasonable
use-case even if it's just for logging purposes.
This nested record can be used to discriminate orig_h or resp_h being
observed in different "contexts". A context can be based on VLAN tags,
but any custom ConnKey implementation should populate the ctx field,
allowing to write context-aware Zeek scripts without needing to know
what the context really is.
Changes in endpoint stats are a side-effect caused by the ConnSize
analyzer updating the conn record triggering the threshold event. The
phenomenon is described in https://github.com/zeek/zeek/issues/4214.