```
## Checks if there is a Zeek analyzer of a given name.
##
## analyzer: the Zeek-side name of the analyzer to check for
## if_enabled: if true, only checks for analyzers that are enabled
##
## Returns the type of the analyzer if it exists, or ``Undef`` if it does not.
public function has_analyzer(analyzer: string, if_enabled: bool = True): bool &cxxname="zeek::spicy::rt::has_analyzer";
## Differentiates between the types of analyzers Zeek provides.
public type AnalyzerType = enum { Protocol, File, Packet, };
## Returns the type of a Zeek analyzer of a given name.
##
## analyzer: the Zeek-side name of the analyzer to check
## if_enabled: if true, only checks for analyzers that are enabled
##
## Returns the type of the analyzer if it exists, or ``Undef`` if it does not.
public function analyzer_type(analyzer: string, if_enabled: bool = True): AnalyzerType &cxxname="zeek::spicy::rt::analyzer_type";
```
Closes#4481.
Zeek's analyzer API makes it hard to determine during analyzer
shutdown whether a regular end-of-data has been reached, or if we're
aborting in the middle of a session (e.g., because Zeek missed the
remaining packets): the corresponding analyzer method, `EndOfData()`
gets called in both cases.
In an earlier change, we had stopped signaling Spicy analyzers a
regular finish when that `EndOfData()` method executes, because doing
so could trigger a parse error if it wasn't a regular shutdown—-which
isn't desired, a user request was to just silently stop processing in
this case.
However, that behavior now seems unfortunate in the case that one
deliberately calls `zeek::protocol_handle_close()` to terminate an
analyzer: this feels like a regular shutdown that should just
immediately happen. We achieve this now in this function by
additionally signaling the shutdown at the TCP layer as an "end of
file", which, for Spicy analyzers, happens to run the final, orderly
tear-down.
Not exactly great, but ti seems to thread the needle to achieve the
desired semantics in both cases.
To address review feedback in GH-4362: rename analyzer-failed-log.zeek
to loggig.zeek, analyzer-debug-log.zeek to debug-logging.zeek and
dpd-log.zeek to deprecated-dpd-log.zeek.
Includes respective test, NEWS, etc updates.
The main part of this commit are changes in tests. A lot of the tests
that previously relied on analyzer.log or dpd.log now use the new
analyzer-failed.log.
I verified all the changes and, as far as I can tell, everything
behaves as it should. This includes the external test baselines.
This change also enables logging of file and packet analyzer to
analyzer_failed.log and fixes some small behavior issues.
The analyzer_failed event is no longer raised when the removal of an
analyzer is vetoed.
If an analyzer is no longer active when an analyzer violation is raised,
currently the analyzer_failed event is raised. This can, e.g., happen
when an analyzer error happens at the very end of the connection. This
makes the behavior more similar to what happened in the past, and also
intuitively seems to make sense.
A bug introduced in the failed service logging was fixed.
So far, when Zeek didn't see a connection's regular tear-down (e.g.,
because its state timed-out before we got to the end), we'd still
signal a regular end-of-data to Spicy parsers. As a result, they would
then typically raise a parse error because they were probably still
expecting data and would now declare it missing. That's not very
useful because semantically it's not really a protocol issue if the
data just doesn't make it over to us; it's a transport-layer issue
that Zeek already handles elsewhere. So we now switch to signaling
end-of-data to Spicy analyzers only if the connection indeed shuts
down regularly. This is also matches how BinPAC handles it.
This also comes with a test exercising various combinations of
end-of-data behavior so that we ensure consistent/desired behavior.
Closes#4007.
This avoids the earlier problem of not tracking ports correctly in
scriptland, while still supporting `port` in EVT files and `%port` in
Spicy files.
As it turns out we are already following the same approach for file
analyzers' MIME types, so I'm applying the same pattern: it's one
event per port, without further customization points. That leaves the
patch pretty small after all while fixing the original issue.
The current test attempts to instantiate two spicy::SSH_1 protocol
analyzers in the .evt file. The intention likely was to use two
distinct protocol analyzer both trying to replace the builtin SSH
analyzer.
Coincidentally, fixing this happens to workaround TSAN errors tickled
by the FatalError() call while loading the .hlto with two identically
named analyzers.
$ cat .tmp/spicy.replaces-conflicts/output
error: redefinition of protocol analyzer spicy::SSH_1
ThreadSanitizer: main thread finished with ignores enabled
One of the following ignores was not ended (in order of probability)
Ignore was enabled at:
#0 __llvm_gcov_init __linker___d192e45c25d5ee23-484d3e0fc2caf5b4.cc (ssh.hlto+0x34036) (BuildId: 091934ca4da885e7)
#1 __llvm_gcov_init __linker___d192e45c25d5ee23-484d3e0fc2caf5b4.cc (ssh.hlto+0x34036) (BuildId: 091934ca4da885e7)
...
I was tempted to replace FatalError() with Error() and rely on
zeek-setup.cc's early exiting on any reporter errors, but this
seems easier for now.
Relates to #3865.
This allows to read Zeek global variables from inside Spicy code. The
main challenge here is supporting all of Zeek's data type in a
type-safe manner.
The most straight-forward API is a set of functions
`get_<type>(<id>)`, where `<type>` is the Zeek-side type
name (e.g., `count`, `string`, `bool`) and `<id>` is the fully scoped
name of the Zeek-side global (e.g., `MyModule::Boolean`). These
functions then return the corresponding Zeek value, converted in an
appropriate Spicy type. Example:
Zeek:
module Foo;
const x: count = 42;
const y: string = "xxx";
Spicy:
import zeek;
assert zeek::get_count("Foo::x") == 42;
assert zeek::get_string("Foo::y") == b"xxx"; # returns bytes(!)
For container types, the `get_*` function returns an opaque types that
can be used to access the containers' values. An additional set of
functions `as_<type>` allows converting opaque values of atomic
types to Spicy equivalents. Example:
Zeek:
module Foo;
const s: set[count] = { 1, 2 };
const t: table[count] of string = { [1] = "One", [2] = "Two" }
Spicy:
# Check set membership.
local set_ = zeek::get_set("Foo::s");
assert zeek::set_contains(set_, 1) == True
# Look up table element.
local table_ = zeek::get_table("Foo::t");
local value = zeek::table_lookup(t, 1);
assert zeek::as_string(value) == b"One"
There are also functions for accessing elements of Zeek-side vectors
and records.
If any of these `zeek::*` conversion functions fails (e.g., due to a
global of that name not existing), it will throw an exception.
Design considerations:
- We support only reading Zeek variables, not writing. This is
both to simplify the API, and also conceptually to avoid
offering backdoors into Zeek state that could end up with a very
tight coupling of Spicy and Zeek code.
- We accept that a single access might be relatively slow due to
name lookup and data conversion. This is primarily meant for
configuration-style data, not for transferring lots of dynamic
state over.
- In that spirit, we don't support deep-copying complex data types
from Zeek over to Spicy. This is (1) to avoid performance
problems when accidentally copying large containers over,
potentially even at every access; and (2) to avoid the two sides
getting out of sync if one ends up modifying a container without
the other being able to see it.
We now reject EVT files that attempt to replace the same built-in
analyzer multiple times as doing so would be ill-defined and not very
intuitive in what exactly it means.
Closes#3783.
* origin/topic/robin/gh-3561-forward-to-udp:
Update docs.
Add explicit children life-cycle management method to analyzers.
Spicy: Support UDP in Spicy's `protocol_*` runtime functions.
Add method to analyzer to retrieve direct child by name.
Extend PIA's `FirstPacket` API.
Spicy: Prepare for supporting forwarding to protocols other than TCP.
This extends the ability to feed new payload back into Zeek's analyzer
pipeline from TCP to now also UDP.
Note: We don't extend this further to ICMP because the ICMP analyzer
cannot be dynamically instantiated (Zeek aborts when trying so). As
ICMP isn't very interesting from use-case perspective anyways, that
seems fine.
Closes#3561.
* origin/topic/robin/gh-3573-replaces-cleanup:
Fix packet analyzer replacement.
Spicy: Wenn replacing an analyzer add a component mapping.
Add component API to transparently remap one component to another one.
Move enabled/disabled functionality from analyzers into `Component` base class API.
This uses the new API to replace components internally.
With these changes in place, replacing protocol analyzers now don't
need to register their ports anymore if they match what the original
analyzer was using (because the old one's registrations will map
over).
Packet analyzer replacement doesn't quite work yet but will be fixed
in next commit.
Closes#3573.
Like traditional file analyzers, we now query Zeek's
`get_file_handle()` event for handles when a connection begins
analyzing an embedded file. That means that Spicy-side protocol
analyzers that are forwarding data into file analysis now need to call
Zeek's `Files::register_protocol()` and provide a callback for
computing file handles. If that's missing, Zeek will now issue a
warning. This aligns with the requirements Zeek's traditional protocol
analyzers. (If the EVT file defines a protocol analyzer to `replace`
an existing one, that one's `register_protocol()` will be consulted.)
Because Zeek's `get_file_handle()` event requires a current
connection, if a Spicy file analyzer isn't directly part of a
connection context (e.g., with nested files), we continue to use
hardcoded, built-in file handle. Scriptland won't be consulted in
that case, just like before.
Closes#3440.
While in Spicy code a hook priority is spelled `priority=4711` the
attribute is still called `&priority` (like in HILTI) and we rely on
exactly that name when e.g., extracting hook priorities for scheduling.
This change was introduced as part of
db98dc4193 and caused the default hook
priority for hooks defined in EVT files (intended to be -1000 to likely
schedule after e.g., hooks in the Spicy grammars) to be ignored. This
could then e.g., introduce issue when a `%done` hook would mutate state
exposed in an EVT hook (which now might not have seen the updated state
due to different scheduling).
Allow spicy parsers to generate their own file IDs and provide them to
Zeek. This duplicates functionality that is currently possible (and
used) by some binpac-based analyzers. One example for an analyzer
creating its own file IDs is the SSL analyzer.
```
## Tells Zeek to skip sending any further input data to the current analyzer.
## This is supported for protocol and file analyzers.
public function skip_input() : void;
```
Closes#3443.
Without this, Zeekygen won't generate documentation about exported
enum types as it can not resolve the identifier. Also, only register a
type as item with the Spicy plugin if there's no _module_info currently
active.
So far we had trouble documenting Spicy analyzers through Zeekygen
because they would show up as components belonging to the
`Zeek::Spicy` plugin; whereas traditional analyzers would be their own
plugins and hence documented individually on their own. This commit
teaches Zeekygen to track Spicy analyzers separately inside their own
`Info` instances. This information isn't further used in this commit
yet, but will be merged with the plugin output in a subsequent change
to get the expected joint output.
To pass additional information to Zeekygen, EVT files now also support
two new tags for Zeekygen purposes:
- `%doc-id = ID;` defines the global ID under which everything inside
the EVT file will be documented by Zeekygen, conceptually comparable
to plugin names (e.g., `Zeek::Syslog`).
- `%doc-description = "text" provides additional text to go into the
documentation (comparable to plugin descriptions).
This information is carried through into the HLTO runtime
initialization code, from where it's registered with Zeekygen.
This commit also removes a couple of previous hacks of how Spicy
integrated with Zeekygen which (1) ended up generating broken doc output
for Spicy components, and (2) don't seem to be necessary anymore
anyways.