This adds a new PacketAnalyzer::PPPoE::session_id bif, which extracts
the PPPoE session ID from the current packet.
Furthermore, a new policy script is added which adds the pppoe session
id to the connection log.
Related to GH-4602
* origin/topic/robin/gh-4481-test-analyzer:
Spicy: Fix missing include.
Bump Spicy.
Spicy: Add functions to check if Zeek provides an analyzer of a given name.
Specifically, set a MIME part's parent_id to the rfc822_msg_fuid if it
is set and take into account the current rfc822_msg_fuid for describe_file()
to avoid fuid collisions of the top-level RFC822 message and the first
MIME part.
The backend does not serve expired but still present entries so to a
user they do not exist. When they put new data over such an entry their
expecation is that the value is overwritten, even if not explicitly
requested.
The SQLite storage backend implements expiration by hand and garbage
collection is done in `DoExpire`. This previously relied exclusively on
gets not running within `Storage::expire_interval` of the put, otherwise
we would potentially serve expired entries.
With this patch we explictly check that entries are not expired before
serving them so that the SQLite backend should never serve expired
entries.
```
## Checks if there is a Zeek analyzer of a given name.
##
## analyzer: the Zeek-side name of the analyzer to check for
## if_enabled: if true, only checks for analyzers that are enabled
##
## Returns the type of the analyzer if it exists, or ``Undef`` if it does not.
public function has_analyzer(analyzer: string, if_enabled: bool = True): bool &cxxname="zeek::spicy::rt::has_analyzer";
## Differentiates between the types of analyzers Zeek provides.
public type AnalyzerType = enum { Protocol, File, Packet, };
## Returns the type of a Zeek analyzer of a given name.
##
## analyzer: the Zeek-side name of the analyzer to check
## if_enabled: if true, only checks for analyzers that are enabled
##
## Returns the type of the analyzer if it exists, or ``Undef`` if it does not.
public function analyzer_type(analyzer: string, if_enabled: bool = True): AnalyzerType &cxxname="zeek::spicy::rt::analyzer_type";
```
Closes#4481.
Zeek's analyzer API makes it hard to determine during analyzer
shutdown whether a regular end-of-data has been reached, or if we're
aborting in the middle of a session (e.g., because Zeek missed the
remaining packets): the corresponding analyzer method, `EndOfData()`
gets called in both cases.
In an earlier change, we had stopped signaling Spicy analyzers a
regular finish when that `EndOfData()` method executes, because doing
so could trigger a parse error if it wasn't a regular shutdown—-which
isn't desired, a user request was to just silently stop processing in
this case.
However, that behavior now seems unfortunate in the case that one
deliberately calls `zeek::protocol_handle_close()` to terminate an
analyzer: this feels like a regular shutdown that should just
immediately happen. We achieve this now in this function by
additionally signaling the shutdown at the TCP layer as an "end of
file", which, for Spicy analyzers, happens to run the final, orderly
tear-down.
Not exactly great, but ti seems to thread the needle to achieve the
desired semantics in both cases.
This changes the PPPoE parser so that it doesn't forward extra bytes
that might be appended after the payload. Instead, it raises a weird if
the payload size doesn't match the size indicated by the header.
This is in line with what other protocol parsers (like UDP) are doing.
Two tests needed to be updated - with this change, the traffic in
pppoe-over-qinq.pcap is now valid TLS. A new trace was introduced for
the confirmation-violation-info test.
Addresses GH-4602
* origin/topic/vern/line-number-ordering:
Bump ZeekJS to work with new Location constructor
remove non-functional column information from Location objects
isolate Location specifics to private class variables to enforce correct line number ordering
* origin/topic/awelzel/4605-conn-id-context:
NEWS: Adapt for conn_id$ctx introduction
conn_key/fivetuple: Drop support for non conn_id records
Conn: Move conn_id init and flip to IPBasedConnKey
IPBasedConnKey: Add GetTransportProto() helper
input/Manager: Ignore empty record types
external: Bump commit hashes for external suites
ip/vlan_fivetuple: Populate nested conn_id_context, not conn_id
ConnKey: Extend DoPopulateConnIdVal() with ctx
btest: Update tests and baselines after adding ctx to conn_id
init-bare: Add conn_id_ctx to conn_id
Somewhere record types with zero fields get the optional attribute
apparently. The input/sqlite/basic test failed due to complaining
that ctx is optional. It isn't optional and when it has zero fields
we can just ignore it, too.
Also adds a input framework test with an explicit empty record type
get_file_handle() may include c$id and perturbs their values when adding new
fields. I think that's reasonable, as files transferred in one VLAN should
be treated separate from files transferred in a different VLAN.
This prepares the move where ConnKey implementations should fill out
ctx rather than filling conn_id directly. The API continues to receive
both, conn_id and ctx, as adding fields to `conn_id` is reasonable
use-case even if it's just for logging purposes.
This nested record can be used to discriminate orig_h or resp_h being
observed in different "contexts". A context can be based on VLAN tags,
but any custom ConnKey implementation should populate the ctx field,
allowing to write context-aware Zeek scripts without needing to know
what the context really is.
Changes in endpoint stats are a side-effect caused by the ConnSize
analyzer updating the conn record triggering the threshold event. The
phenomenon is described in https://github.com/zeek/zeek/issues/4214.
Closes#4504
Messages are not typical responses, so they need special handling. This
is different between RESP2 and 3, so this is the first instance where
the script layer needs to tell the difference.
A bit ad-hoc formatting for the log, but that's mostly because cluster.log
only has message field and I don't think having a dedicated application_name
column is worth it. That could also be added by custom scripts if it's really
wanted for a given deployment.
* origin/topic/awelzel/cluster-telemetry-follow-up:
Bump cluster test suite
cluster/Telemetry: Cache CallExpr locations
cluster/Telemetry: Avoid unneeded StringVal() construction
Val: Switch TablePatternMatcher to std::string_view
RE: Add MatchAll() and MatchSet() for std::string_view
cluster/websocket: Fix and test for invalid X-Application-Name
cluster/telemetry: Move topic_normalization redef to zeromq