Copy docs into Zeek repo directly

This is based on commit 99e6942efec5feff50523f6b2a1f5868f19ab638 from the
zeek-docs repo.
Tim Wojtulewicz 2025-09-15 15:52:18 -07:00
parent 979a98c73c
commit adce4e604a
1075 changed files with 169492 additions and 1 deletions

@ -0,0 +1,120 @@
.. _cluster_backend_zeromq:
======================
ZeroMQ Cluster Backend
======================
.. versionadded:: 7.1
*Experimental*
Quickstart
==========
To switch a Zeek cluster with a static cluster layout over to ZeroMQ
as its cluster backend, add the following snippet to ``local.zeek``:
.. code-block:: zeek
@load frameworks/cluster/backend/zeromq/connect
Note that the function :zeek:see:`Broker::publish` is non-functional with this
backend and emits a warning when used; use :zeek:see:`Cluster::publish` instead.
By default, a configuration based on hard-coded endpoints and cluster layout
information is created. For more customization, refer to the module documentation
at :doc:`cluster/backend/zeromq/main.zeek </scripts/policy/frameworks/cluster/backend/zeromq/main.zeek>`.
Architecture
============
Publish-Subscribe of Zeek Events
--------------------------------
The `ZeroMQ <https://zeromq.org/>`_ based cluster backend uses a central
XPUB/XSUB broker for publish-subscribe functionality. Zeek events published
via :zeek:see:`Cluster::publish` are distributed by this central broker to
interested nodes.
.. figure:: /images/cluster/zeromq-pubsub.png
As depicted in the figure above, each cluster node connects to the central
broker twice, once via its XPUB socket and once via its XSUB socket. This
results in two TCP connections from every cluster node to the central broker.
This setup allows every node in the cluster to see messages from all other
nodes, avoiding the need for cluster topology awareness.
.. note::
The central broker may limit scalability in production setups, but for small
clusters on a single node it may be fast enough.
On a cluster node, the XPUB socket provides notifications about subscriptions
created by other nodes: For every subscription created by any node in
the cluster, the :zeek:see:`Cluster::Backend::ZeroMQ::subscription` event is
raised locally on every other node (unless another node had created the same
subscription previously).
This mechanism is used to discover the existence of other cluster nodes by
matching the topics with the prefix for node specific subscriptions as produced
by :zeek:see:`Cluster::nodeid_topic`.
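
The discovery step described above boils down to a prefix match on topic
names. The following is a minimal, standalone C++ sketch of that idea. It uses
neither ZeroMQ nor Zeek, and both the prefix value and the helper name
``node_id_from_topic`` are assumptions chosen for illustration, not part of
Zeek's API.

.. code-block:: cpp

   #include <cassert>
   #include <optional>
   #include <string>
   #include <string_view>

   // Assumed prefix for node-specific topics; the real value is whatever
   // Cluster::nodeid_topic produces in a given deployment.
   constexpr std::string_view kNodeTopicPrefix = "zeek/cluster/nodeid/";

   // Return the node ID embedded in a subscription topic, or std::nullopt
   // if the topic is not a node-specific subscription.
   std::optional<std::string> node_id_from_topic(
       std::string_view topic, std::string_view prefix = kNodeTopicPrefix) {
       assert(!prefix.empty());
       if (topic.substr(0, prefix.size()) != prefix)
           return std::nullopt;

       std::string_view rest = topic.substr(prefix.size());
       // Drop a single trailing '/' if present.
       if (!rest.empty() && rest.back() == '/')
           rest.remove_suffix(1);
       if (rest.empty())
           return std::nullopt;
       return std::string(rest);
   }

With the assumed prefix, a topic such as ``zeek/cluster/nodeid/worker-1/``
yields ``worker-1``, while topics outside the prefix are ignored.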
As of now, the implementation of the central broker calls ZeroMQ's
``zmq::proxy()`` function to forward messages between the XPUB and
XSUB sockets.
While the diagram above indicates the central broker being deployed separately
from Zeek cluster nodes, by default the manager node will start and run this
broker using a separate thread. Nothing prevents running a long-running
central broker independently of the Zeek cluster nodes, however.
The serialization of Zeek events is done by the selected
:zeek:see:`Cluster::event_serializer` and is independent of ZeroMQ.
The central broker needs no knowledge about the chosen format; it
only shuffles messages between nodes.
Logging
-------
While remote events always pass through the central broker, nodes connect and
send log writes directly to logger nodes in a cluster. The ZeroMQ cluster backend
leverages ZeroMQ's pipeline pattern for this functionality. That is, logger nodes
(including the manager if configured using :zeek:see:`Cluster::manager_is_logger`)
open a ZeroMQ PULL socket to receive log writes. All other nodes connect their
PUSH socket to all available PULL sockets. These connections are separate from
the publish-subscribe setup outlined above.
When sending log writes over a PUSH socket, load balancing is done by ZeroMQ.
Individual cluster nodes have no control over which logger node receives
log writes at any given time.
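
ZeroMQ's documentation describes the outgoing routing strategy of PUSH sockets
as round-robin across connected peers. The toy model below illustrates what
that means for a sending node. It is a sketch under that assumption and does
not use ZeroMQ; the class name ``RoundRobinPush`` and the logger names are
hypothetical.

.. code-block:: cpp

   #include <cassert>
   #include <cstddef>
   #include <string>
   #include <utility>
   #include <vector>

   // Toy model of a PUSH socket connected to several PULL peers. It only
   // illustrates round-robin distribution; it is not ZeroMQ code.
   class RoundRobinPush {
   public:
       explicit RoundRobinPush(std::vector<std::string> loggers)
           : loggers_(std::move(loggers)) {
           assert(!loggers_.empty());
       }

       // Pick the logger for the next batch. The caller cannot influence
       // the choice, mirroring the lack of control described above.
       const std::string& next_logger() {
           const std::string& chosen = loggers_[next_];
           next_ = (next_ + 1) % loggers_.size();
           return chosen;
       }

   private:
       std::vector<std::string> loggers_;
       std::size_t next_ = 0;
   };

With two loggers, successive calls to ``next_logger()`` alternate between
them, regardless of which log stream a batch belongs to.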
.. figure:: /images/cluster/zeromq-logging.png
While the previous paragraph used "log writes", a single message to a logger
node actually contains a batch of log writes. The options :zeek:see:`Log::flush_interval`
and :zeek:see:`Log::write_buffer_size` control the frequency and maximum size
of these batches.
The serialization format used to encode such batches is controlled by the
selected :zeek:see:`Cluster::log_serializer` and is independent of ZeroMQ.
With the default serializer (:zeek:see:`Cluster::LOG_SERIALIZER_ZEEK_BIN_V1`),
every log batch on the wire has a header prepended that describes it. This allows
non-Zeek processes to interpret log writes and opens the possibility of
implementing non-Zeek logger processes, as long as the receiving process
understands the chosen serializer format. In the future, a JSON lines serialization
may be provided, allowing easier interpretation than a proprietary binary format.
Summary
-------
Combining the diagrams above, the connections between the different socket
types in a Zeek cluster look something like the following.
.. figure:: /images/cluster/zeromq-cluster.png

doc/devel/contributors.rst Normal file
@ -0,0 +1,111 @@
===================
Contributor's Guide
===================
See below for a selection of the more common contribution guidelines
maintained directly in the `Zeek wiki
<https://github.com/zeek/zeek/wiki#contributors>`_.
General Contribution Process
============================
See https://github.com/zeek/zeek/wiki/Contribution-Guide
Coding Style and Conventions
============================
See https://github.com/zeek/zeek/wiki/Coding-Style-and-Conventions
General Documentation Structure/Process
=======================================
See the :doc:`README </README>` file of https://github.com/zeek/zeek-docs
Documentation Style and Conventions
===================================
See https://github.com/zeek/zeek/wiki/Documentation-Style-and-Conventions
Checking for Memory Errors and Leaks
====================================
See https://github.com/zeek/zeek/wiki/Checking-for-Memory-Errors-and-Leaks
Maintaining long-lived forks of Zeek
====================================
Consistent formatting of the Zeek codebase is enforced automatically by
configurations tracked in the repository. Upstream updates to these
configurations can lead to formatting changes which could cause merge conflicts
for long-lived forks.
Currently the following configuration files in the root directory are used:
- ``.pre-commit-config.yaml``: Configuration for `pre-commit <https://pre-commit.com/>`_.
We use pre-commit to manage and orchestrate formatters and linters.
- ``.clang-format``: Configuration for `clang-format
<https://clang.llvm.org/docs/ClangFormat.html>`_ for formatting C++ files.
- ``.style.yapf``: Configuration for `YAPF <https://github.com/google/yapf>`_
for formatting Python files.
- ``.cmake-format.json``: Configuration for `cmake-format
<https://github.com/cheshirekow/cmake_format>`_ for formatting CMake files.
With these configuration files present, ``pre-commit run --all-files`` will
install all needed formatters and reformat all files in the repository
according to the current configuration.
.. rubric:: Workflow: Zeek ``master`` branch regularly merged into fork
If Zeek's master branch is regularly merged into the fork, merge conflicts can
be resolved once and their resolution is tracked in the repository. Similarly,
we can explicitly reformat the fork once and then merge the upstream branch.
.. code-block:: sh
# Get and stage latest versions of configuration files from master.
git checkout master -- .pre-commit-config.yaml .clang-format .style.yapf .cmake-format.json
# Reformat fork according to new configuration.
pre-commit run -a
# Record reformatted state of fork.
git add -u && git commit -m 'Reformat'
# Merge in master, resolve merge conflicts as usual.
git merge master
.. rubric:: Workflow: Fork regularly rebased onto Zeek ``master`` branch
If the target for a rebase has been reformatted, individual diff hunks might not
apply cleanly anymore. There are different approaches to work around that. The
approach with the fewest conflicts is likely to first reformat the fork
according to upstream style without pulling in changes, and only after that
rebase on upstream and resolve potential semantic conflicts.
.. code-block:: sh
# Create a commit updating the configuration files.
git checkout master -- .pre-commit-config.yaml .clang-format .style.yapf .cmake-format.json
git commit -m 'Bump formatter configurations'
# With a fork branched from upstream at commit FORK_COMMIT, rebase the
# config update commit 'Bump formatter configurations' to the start of the
# fork, but do not yet rebase on master (interactively move the last patch
# to the start of the list of patches).
git rebase -i FORK_COMMIT
# Reformat all commits according to configs at the base. We use the '--exec'
# flag of 'git rebase' to execute pre-commit after applying each patch. If
# 'git rebase' detects uncommitted changes it stops automatic progress so
# one can inspect and apply the changes.
git rebase -i FORK_COMMIT --exec 'pre-commit run --all-files'
# When this stops, inspect changes and stage them.
git add -u
# Continue rebasing. This prompts for a commit message and amends the last
# patch.
git rebase --continue
# The fork is now formatted according to upstream style. Rebase on master,
# and drop the 'Bump formatter configurations' patch from the list of patches.
git rebase -i master

doc/devel/index.rst Normal file
@ -0,0 +1,21 @@
================
Developer Guides
================
In addition to documentation found or mentioned below, some developer-oriented
content is maintained directly in the `Zeek wiki
<https://github.com/zeek/zeek/wiki#development-guides>`_ due to the nature of
the content (e.g. the author finds it to be more dynamic, informal, meta,
transient, etc. compared to other documentation).
.. toctree::
:maxdepth: 2
plugins
spicy/index
websocket-api
Documentation Guide </README.rst>
contributors
maintainers
cluster-backend-zeromq

doc/devel/maintainers.rst Normal file
@ -0,0 +1,13 @@
==================
Maintainer's Guide
==================
Some notable guidelines for maintainers are linked below for convenience, but
they are generally maintained directly in the `Zeek wiki
<https://github.com/zeek/zeek/wiki#maintainers>`_.
Release Process
===============
See https://github.com/zeek/zeek/wiki/Release-Process

doc/devel/plugins.rst Normal file
@ -0,0 +1,505 @@
.. _zkg package manager: https://docs.zeek.org/projects/package-manager/en/stable/
.. _writing-plugins:
===============
Writing Plugins
===============
Zeek provides a plugin API that enables extending
the system dynamically, without modifying the core code base. That way,
custom code remains self-contained and can be maintained, compiled,
and installed independently. Currently, plugins can add the following
functionality to Zeek:
- Zeek scripts.
- Builtin functions/events/types for the scripting language.
- Protocol analyzers.
- File analyzers.
- Packet sources and packet dumpers.
- Logging framework backends.
- Input framework readers.
A plugin's functionality is available to the user just as if Zeek had
the corresponding code built-in. Indeed, internally many of Zeek's
pieces are structured as plugins as well; they are just statically
compiled into the binary rather than loaded dynamically at runtime.
.. note::
Plugins and Zeek packages are related but separate concepts. Both extend
Zeek's functionality without modifying Zeek's source code. A plugin achieves
this via compiled, native code that Zeek links into its core at runtime. A Zeek
package, on the other hand, is a modular addition to Zeek, managed via the
`zkg package manager`_, that may or may not include a plugin. More commonly,
packages consist of script-layer additions to Zeek's functionality. Packages
also feature more elaborate metadata, enabling dependencies on other packages,
Zeek versions, etc.
Quick Start
===========
Writing a basic plugin is quite straightforward as long as one
follows a few conventions. In the following, we create a simple example
plugin that adds a new Built-In Function (BIF) to Zeek: we'll add
``rot13(s: string) : string``, a function that rotates every letter
in a string by 13 places.
Generally, a plugin comes in the form of a directory following a
certain structure. To get started, Zeek's distribution provides a
helper script ``auxil/zeek-aux/plugin-support/init-plugin`` that creates
a skeleton plugin that can then be customized. Let's use that::
# init-plugin ./rot13-plugin Demo Rot13
As you can see, the script takes three arguments. The first is a
directory inside which the plugin skeleton will be created. The second
is the namespace the plugin will live in, and the third is a descriptive
name for the plugin itself relative to the namespace. Zeek uses the
combination of namespace and name to identify a plugin. The namespace
serves to avoid naming conflicts between plugins written by independent
developers; pick, e.g., the name of your organisation. The namespaces
``Bro`` (legacy) and ``Zeek`` are reserved for functionality distributed
by the Zeek Project. In
our example, the plugin will be called ``Demo::Rot13``.
The ``init-plugin`` script puts a number of files in place. The full
layout is described later. For now, all we need is
``src/rot13.bif``. It's initially empty, but we'll add our new BIF
there as follows::
# cat src/rot13.bif
%%{
#include <cstring>
#include <cctype>
#include "zeek/util.h"
#include "zeek/ZeekString.h"
#include "zeek/Val.h"
%%}
module Demo;
function rot13%(s: string%) : string
%{
char* rot13 = util::copy_string(s->CheckString());
for ( char* p = rot13; *p; p++ )
{
char b = islower(*p) ? 'a' : 'A';
char d = *p - b + 13;
if ( d >= 13 && d <= 38 )
*p = d % 26 + b;
}
zeek::String* zs = new zeek::String(1, reinterpret_cast<byte_vec>(rot13),
strlen(rot13));
return make_intrusive<StringVal>(zs);
%}
The syntax of this file is just like any other ``*.bif`` file; we
won't go into it here.
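
To check the rotation arithmetic without building a plugin, the same logic can
be exercised in a standalone C++ program. This is a sketch that mirrors the
loop in the BIF above using only the standard library; it is not the plugin
code itself, and the helper ``check_rot13_roundtrip`` is added purely for
illustration.

.. code-block:: cpp

   #include <cassert>
   #include <cctype>
   #include <string>

   // Rotate ASCII letters by 13 places using the same arithmetic as the
   // BIF; all other characters are left untouched.
   std::string rot13(std::string s) {
       for (char& c : s) {
           // <cctype> functions expect a value representable as unsigned char.
           char base = std::islower(static_cast<unsigned char>(c)) ? 'a' : 'A';
           int d = c - base + 13;
           // Letters map to 13..38; anything outside that range is not a letter.
           if (d >= 13 && d <= 38)
               c = static_cast<char>(d % 26 + base);
       }
       return s;
   }

   // rot13 is its own inverse, which makes for a cheap self-check.
   void check_rot13_roundtrip(const std::string& s) {
       assert(rot13(rot13(s)) == s);
   }

Calling ``rot13("Hello")`` returns ``Uryyb``, matching the output of the
plugin shown later in this section.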
Now we are ready to compile our plugin. The configure script will just
need to be able to find the location of either a Zeek installation-tree or
a Zeek source-tree.
When building a plugin against a Zeek installation-tree, simply have the
installation's associated ``zeek-config`` in your :envvar:`PATH` and the
configure script will detect it and use it to obtain all the information
it needs::
# which zeek-config
/usr/local/zeek/bin/zeek-config
# cd rot13-plugin
# ./configure && make
[... cmake output ...]
When building a plugin against a Zeek source-tree (which itself needs
to have first been built), the configure script has to explicitly be
told its location::
# cd rot13-plugin
# ./configure --zeek-dist=/path/to/zeek/dist && make
[... cmake output ...]
This builds the plugin in a subdirectory ``build/``. In fact, that
subdirectory *becomes* the plugin: when ``make`` finishes, ``build/``
has everything it needs for Zeek to recognize it as a dynamic plugin.
Let's try that. Once we point Zeek to the ``build/`` directory, it will
pull in our new plugin automatically, as we can check with the ``-N``
option::
# export ZEEK_PLUGIN_PATH=/path/to/rot13-plugin/build
# zeek -N
[...]
Demo::Rot13 - <Insert description> (dynamic, version 0.1.0)
[...]
That looks quite good, except for the dummy description that we should
replace with something nicer so that users will know what our plugin
is about. We do this by editing the ``config.description`` line in
``src/Plugin.cc``, like this::
[...]
plugin::Configuration Plugin::Configure()
{
plugin::Configuration config;
config.name = "Demo::Rot13";
config.description = "Caesar cipher rotating a string's letters by 13 places.";
config.version.major = 0;
config.version.minor = 1;
config.version.patch = 0;
return config;
}
[...]
Now rebuild and verify that the description is visible::
# make
[...]
# zeek -N | grep Rot13
Demo::Rot13 - Caesar cipher rotating a string's letters by 13 places. (dynamic, version 0.1.0)
Zeek can also show us what exactly the plugin provides with the
more verbose option ``-NN``::
# zeek -NN
[...]
Demo::Rot13 - Caesar cipher rotating a string's letters by 13 places. (dynamic, version 0.1.0)
[Function] Demo::rot13
[...]
There's our function. Now let's use it::
# zeek -e 'print Demo::rot13("Hello")'
Uryyb
It works. Next, we install the plugin alongside Zeek itself, so that Zeek
finds it directly without needing the ``ZEEK_PLUGIN_PATH``
environment variable. If we first unset the variable, the function
will no longer be available::
# unset ZEEK_PLUGIN_PATH
# zeek -e 'print Demo::rot13("Hello")'
error in <command line>, line 1: unknown identifier Demo::rot13, at or near "Demo::rot13"
Once we install it, it works again::
# make install
# zeek -e 'print Demo::rot13("Hello")'
Uryyb
The installed version went into
``<zeek-install-prefix>/lib/zeek/plugins/Demo_Rot13``.
One can distribute the plugin independently of Zeek for others to use.
To distribute in source form, just remove the ``build/`` directory
(``make distclean`` does that) and then tar up the whole ``rot13-plugin/``
directory. Others then follow the same process as above after
unpacking.
To distribute the plugin in binary form, the build process
conveniently creates a corresponding tarball in ``build/dist/``. In
this case, it's called ``Demo_Rot13-0.1.0.tar.gz``, with the version
number coming out of the ``VERSION`` file that ``init-plugin`` put
into place. The binary tarball has everything needed to run the
plugin, but no further source files. Optionally, one can include
further files by specifying them in the plugin's ``CMakeLists.txt``
through the ``zeek_plugin_dist_files`` macro; the skeleton does that
for ``README``, ``VERSION``, ``CHANGES``, and ``COPYING``. To use the
plugin through the binary tarball, just unpack it into
``<zeek-install-prefix>/lib/zeek/plugins/``. Alternatively, if you unpack
it in another location, then you need to point ``ZEEK_PLUGIN_PATH`` there.
Before distributing your plugin, you should edit some of the meta
files that ``init-plugin`` puts in place. Edit ``README`` and
``VERSION``, and update ``CHANGES`` when you make changes. Also put a
license file in place as ``COPYING``; if BSD is fine, you will find a
template in ``COPYING.edit-me``.
Plugin Directory Layout
=======================
A plugin's directory needs to follow a set of conventions so that Zeek
(1) recognizes it as a plugin, and (2) knows what to load. While
``init-plugin`` takes care of most of this, the following is the full
story. We'll use ``<base>`` to represent a plugin's top-level
directory. With the skeleton, ``<base>`` corresponds to ``build/``.
``<base>/__zeek_plugin__``
A file that marks a directory as containing a Zeek plugin. The file
must exist, and its content must consist of a single line with the
qualified name of the plugin (e.g., "Demo::Rot13").
``<base>/lib/<plugin-name>.<os>-<arch>.so``
The shared library containing the plugin's compiled code. Zeek will
load this in dynamically at run-time if OS and architecture match
the current platform.
``scripts/``
A directory with the plugin's custom Zeek scripts. When the plugin
gets activated, this directory will be automatically added to
``ZEEKPATH``, so that any scripts/modules inside can be
``@load``\ ed.
``scripts/__load__.zeek``
A Zeek script that will be loaded when the plugin gets activated.
When this script executes, any BIF elements that the plugin
defines will already be available. See below for more information
on activating plugins.
``scripts/__preload__.zeek``
A Zeek script that will be loaded when the plugin gets activated,
but before any BIF elements become available. See below for more
information on activating plugins.
``lib/bif/``
Directory with auto-generated Zeek scripts that declare the plugin's
BIF elements. The files here are produced by ``bifcl``.
Any other files in ``<base>`` are ignored by Zeek.
By convention, a plugin should put its custom scripts into subfolders
of ``scripts/``, i.e., ``scripts/<plugin-namespace>/<plugin-name>/<script>.zeek``
to avoid conflicts. As usual, you can then put a ``__load__.zeek`` in
there as well so that, e.g., ``@load Demo/Rot13`` could load a whole
module in the form of multiple individual scripts.
Note that in addition to the paths above, the ``init-plugin`` helper
puts some more files and directories in place that help with
development and installation (e.g., ``CMakeLists.txt``, ``Makefile``,
and source code in ``src/``). However, none of these have a special
meaning for Zeek at runtime, and they aren't necessary for a plugin to
function.
``init-plugin``
===============
``init-plugin`` puts a basic plugin structure in place that follows
the above layout and augments it with a CMake build and installation
system. Plugins with this structure can be used both directly out of
their source directory (after ``make`` and setting Zeek's
``ZEEK_PLUGIN_PATH``), and when installed alongside Zeek (after ``make
install``).
Upon completion, ``init-plugin`` initializes a git repository and stages its
produced files for committing, but does not yet commit the files. This allows
you to tweak the new plugin as needed prior to the initial commit.
``make install`` copies over the ``lib`` and ``scripts`` directories,
as well as the ``__zeek_plugin__`` magic file and any further
distribution files specified in ``CMakeLists.txt`` (e.g., README,
VERSION). You can find a full list of files installed in
``build/MANIFEST``. Behind the scenes, ``make install`` really just
unpacks the binary tarball from ``build/dist`` into the destination
directory.
``init-plugin`` will never overwrite existing files. If its target
directory already exists, it will by default decline to do anything.
You can run it with ``-u`` instead to update an existing plugin;
even then, it only puts in place files it doesn't find yet. To revert
a file back to what ``init-plugin`` created originally, delete it first
and then rerun with ``-u``.
``init-plugin`` puts a ``configure`` script in place that wraps
``cmake`` with a more familiar configure-style configuration. By
default, the script provides two options for specifying paths to the
Zeek source (``--zeek-dist``) and to the plugin's installation directory
(``--install-root``). To extend ``configure`` with plugin-specific
options (such as search paths for its dependencies), don't edit the
script directly but instead extend ``configure.plugin``, which
``configure`` includes. That way you will be able to more easily
update ``configure`` in the future when the distribution version
changes. In ``configure.plugin`` you can use the predefined shell
function ``append_cache_entry`` to seed values into the CMake cache;
see the installed skeleton version and existing plugins for examples.
.. note::
In the past ``init-plugin`` also generated a ``zkg.meta`` file, automatically
creating a Zeek package containing a plugin. ``init-plugin`` now focuses
purely on plugins, as its name suggests. To bootstrap new Zeek packages
(possibly containing plugins), use the more featureful templating
functionality provided by the ``zkg create`` command, explained `here
<https://docs.zeek.org/projects/package-manager/en/stable/package.html>`_.
Activating a Plugin
===================
A plugin needs to be *activated* to make it available to the user.
Activating a plugin will:
1. Load the dynamic module
2. Make any BIF items available
3. Add the ``scripts/`` directory to ``ZEEKPATH``
4. Load ``scripts/__preload__.zeek``
5. Make BIF elements available to scripts
6. Load ``scripts/__load__.zeek``
By default, Zeek will automatically activate all dynamic plugins found
in its search path ``ZEEK_PLUGIN_PATH``. However, in bare mode (``zeek
-b``), no dynamic plugins will be activated by default; instead the
user can selectively enable individual plugins in scriptland using the
``@load-plugin <qualified-plugin-name>`` directive (e.g.,
``@load-plugin Demo::Rot13``). Alternatively, one can activate a
plugin from the command line by specifying its full name
(``Demo::Rot13``), or set the environment variable
``ZEEK_PLUGIN_ACTIVATE`` to a comma-separated list of names of
plugins to unconditionally activate, even in bare mode.
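
The value of ``ZEEK_PLUGIN_ACTIVATE`` is a plain comma-separated list of
qualified plugin names. As a rough illustration of that format, the standalone
C++ sketch below splits such a value into individual names. The function name
``split_plugin_names`` is hypothetical, and the sketch makes no claim about how
Zeek itself parses the variable (for example, regarding whitespace handling).

.. code-block:: cpp

   #include <cstddef>
   #include <string>
   #include <vector>

   // Split a comma-separated list of qualified plugin names, skipping
   // empty entries. Illustrates the format only, not Zeek's own parser.
   std::vector<std::string> split_plugin_names(const std::string& value) {
       std::vector<std::string> names;
       std::size_t start = 0;
       while (start <= value.size()) {
           std::size_t end = value.find(',', start);
           if (end == std::string::npos)
               end = value.size();
           if (end > start)
               names.push_back(value.substr(start, end - start));
           start = end + 1;
       }
       return names;
   }

For a value such as ``Demo::Rot13,Demo::Other`` (the second name is made up),
the function returns the two names as separate entries.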
``zeek -N`` shows activated plugins separately from found but not yet
activated plugins. Note that plugins compiled statically into Zeek are
always activated, and hence show up as such even in bare mode.
Plugin Components
=================
It's easy for a plugin to provide custom scripts: just put them into
``scripts/``, as described above. The CMake infrastructure will automatically
install them, as well as include them in the source and binary plugin
distributions.
Any number or combination of other components can be provided by a single
plugin. For example a plugin can provide multiple different protocol
analyzers, or both a log writer and input reader.
The best place to look for examples or templates for a specific type of plugin
component is the source code of Zeek itself, since every one of its components
uses the same API as any external plugin.
Each component type also has a simple integration test, found
in the Zeek source-tree's ``testing/btest/plugins/`` directory,
that can prove useful for creating basic plugin skeletons.
Testing Plugins
===============
A plugin should come with a test suite to exercise its functionality.
The ``init-plugin`` script puts in place a basic
`BTest <https://github.com/zeek/btest>`_ setup
to start with. Initially, it comes with a single test that just checks
that Zeek loads the plugin correctly::
# cd tests
# btest -A
[ 0%] rot13.show-plugin ... ok
all 1 tests successful
You can also run this via the Makefile::
# cd ..
# make test
make -C tests
make[1]: Entering directory `tests'
all 1 tests successful
make[1]: Leaving directory `tests'
Now let's add a custom test that ensures that our BIF works correctly::
# cd tests
# cat >rot13/bif-rot13.zeek
# @TEST-EXEC: zeek %INPUT >output
# @TEST-EXEC: btest-diff output
event zeek_init()
{
print Demo::rot13("Hello");
}
Check the output::
# btest -d rot13/bif-rot13.zeek
[ 0%] rot13.bif-rot13 ... failed
% 'btest-diff output' failed unexpectedly (exit code 100)
% cat .diag
== File ===============================
Uryyb
== Error ===============================
test-diff: no baseline found.
=======================================
% cat .stderr
1 of 1 test failed
Install the baseline::
# btest -U rot13/bif-rot13.zeek
all 1 tests successful
Run the test-suite::
# btest
all 2 tests successful
Debugging Plugins
=================
If your plugin isn't loading as expected, Zeek's debugging facilities
can help illuminate what's going on. To enable them, recompile Zeek
with debugging support (``./configure --enable-debug``), and
afterwards rebuild your plugin as well. If you then run Zeek with ``-B
plugins``, it will produce a file :file:`debug.log` that records details
about the process for searching, loading, and activating plugins.
To generate your own debugging output from inside your plugin, you can
add a custom debug stream by using the ``PLUGIN_DBG_LOG(<plugin>,
<args>)`` macro (defined in ``DebugLogger.h``), where ``<plugin>`` is
the ``Plugin`` instance and ``<args>`` are printf-style arguments,
just as with Zeek's standard debugging macros (grep for ``DBG_LOG`` in
Zeek's ``src/`` to see examples). At runtime, you can then activate
your plugin's debugging output with ``-B plugin-<name>``, where
``<name>`` is the name of the plugin as returned by its
``Configure()`` method, but with the namespace separator ``::``
replaced with a simple dash. Example: If the plugin is called
``Demo::Rot13``, use ``-B plugin-Demo-Rot13``. As usual, the debugging
output will be recorded to :file:`debug.log` if Zeek is compiled in debug
mode.
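
The naming rule for a plugin's debug stream can be written down as a small
string transformation. The standalone C++ sketch below derives the ``-B``
argument from a qualified plugin name as described above; the function name
``debug_stream_name`` is hypothetical and not part of Zeek's API.

.. code-block:: cpp

   #include <cassert>
   #include <cstddef>
   #include <string>

   // Derive the argument for Zeek's -B option from a qualified plugin
   // name: prepend "plugin-" and replace every "::" with "-".
   std::string debug_stream_name(const std::string& qualified_name) {
       assert(!qualified_name.empty());
       std::string stream = "plugin-";
       std::size_t pos = 0;
       while (pos < qualified_name.size()) {
           if (qualified_name.compare(pos, 2, "::") == 0) {
               stream += '-';
               pos += 2;
           } else {
               stream += qualified_name[pos];
               ++pos;
           }
       }
       return stream;
   }

For ``Demo::Rot13`` this yields ``plugin-Demo-Rot13``, the value used in the
example above.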
.. _building-plugins-statically:
Building Plugins Statically into Zeek
=====================================
Plugins can be built statically into a Zeek binary using the
``--include-plugins`` option passed to Zeek's ``configure``. This argument
takes a semicolon-separated list of absolute paths to plugin sources. Each
path needs to contain a ``CMakeLists.txt`` file, as is commonly the case at the
top level of plugin source trees, and usually also in Zeek packages. Building
plugins in this manner includes them directly into the Zeek binary
and installation. They are loaded automatically by Zeek at startup
without needing to install them separately.
Building plugins into Zeek is a handy way to build them consistently with
sanitizers, since Zeek's existing ``./configure --sanitizers=...``
infrastructure applies transparently to built-in plugins.
The configure run lists built-in plugins at the end, so you can verify
successful inclusion of your plugin there. Your plugin should also
show up in the resulting build's ``zeek -NN`` output.
Headers for built-in plugins are installed into a subdirectory of
``<zeek-install-prefix>/include/zeek/builtin-plugins`` specific to
each plugin. Scripts are installed into a subdirectory of
``<zeek-install-prefix>/share/zeek/builtin-plugins`` specific to
each plugin. The scripts directory is also automatically added to
the default ``ZEEKPATH``.
Plugin Tutorials
================
.. toctree::
:maxdepth: 1
plugins/connkey-plugin
plugins/event-metadata-plugin

@ -0,0 +1,205 @@
.. _connkey-plugin:
===============================
Writing a Connection Key Plugin
===============================
.. versionadded:: 8.0
By default, Zeek looks up internal connection state using the classic five-tuple
of originator and responder IP addresses, ports, and the numeric protocol
identifier (for TCP, UDP, etc). Zeek's data structure driving this is called a
connection key, or ``ConnKey``.
In certain environments the classic five-tuple does not sufficiently distinguish
connections. Consider traffic mirrored from multiple VLANs with overlapping IP
address ranges. Concretely, a connection between 10.0.0.1 and 10.0.0.2 in one
VLAN is distinct from a connection between the same IPs in another VLAN. Here,
Zeek should include the VLAN identifier in the connection key, and you can
instruct Zeek to do so by loading the
:doc:`/scripts/policy/frameworks/conn_key/vlan_fivetuple.zeek` policy script.
Zeek's plugin API allows adding support for additional custom connection keys.
This section provides a tutorial on how to do so, using the example of VXLAN-enabled
flow tuples. If you're not familiar with plugin development, head over to the
:ref:`Writing Plugins <writing-plugins>` section.
Our goal is to implement a custom connection key to scope connections
transported within a `VXLAN <https://datatracker.ietf.org/doc/html/rfc7348/index.html>`_
tunnel by the VXLAN Network Identifier (VNI).
As a test case, we have encapsulated the `HTTP GET trace <https://github.com/zeek/zeek/raw/refs/heads/master/testing/btest/Traces/http/get.trace>`_
from the Zeek repository twice with VXLAN using VNIs 4711 and 4242, respectively,
and merged the resulting two PCAP files with the original PCAP.
The :download:`resulting PCAP <connkey-vxlan-fivetuple-plugin-src/Traces/vxlan-overlapping-http-get.pcap>`
contains three HTTP connections, two of which are VXLAN-encapsulated.
By default, Zeek will create the same connection key for the original and
encapsulated HTTP connections, since they have identical inner five-tuples.
Therefore, Zeek creates only a single ``http.log`` entry, and two entries
in ``conn.log``.
.. code-block:: shell
$ zeek -C -r Traces/vxlan-overlapping-http-get.pcap
$ zeek-cut -m uid method host uri < http.log
uid method host uri
CpWF5etn1l2rpaLu3 GET bro.org /download/CHANGES.bro-aux.txt
$ zeek-cut -m uid service history orig_pkts resp_pkts < conn.log
uid service history orig_pkts resp_pkts
Cq2CY245oGGbibJ8k9 http ShADTadtFf 21 21
CMleDu4xANIMzePYd7 vxlan D 28 0
Note that just two of the HTTP connections are encapsulated.
That is why the VXLAN connection shows only 28 packets.
Each HTTP connection has 14 packets total, 7 in each direction. Zeek aggregates
all packets into the single HTTP connection, but only 28 of them were
transported within the VXLAN tunnel connection. Note also the ``t`` and ``T``
flags in the :zeek:field:`Conn::Info$history` field. These stand for retransmissions,
caused by Zeek not discriminating between the different HTTP connections.
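The effect can be reproduced outside of Zeek with a toy connection table. The following is a minimal sketch, not Zeek code: ``Flow``, ``count_connections()``, ``example_flows()`` and the addresses and ports are hypothetical names and values chosen for illustration. It only shows that three flows with identical inner tuples collapse into one table entry unless the VNI is part of the key.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <set>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical, simplified flow description (not a Zeek type).
struct Flow {
    std::string orig_h;
    uint16_t orig_p;
    std::string resp_h;
    uint16_t resp_p;
    std::optional<uint32_t> vni; // set only for VXLAN-encapsulated flows
};

// Counts distinct connections, optionally including the VNI in the key.
std::size_t count_connections(const std::vector<Flow>& flows, bool use_vni) {
    using Key = std::tuple<std::string, uint16_t, std::string, uint16_t, std::optional<uint32_t>>;
    std::set<Key> table;

    for ( const auto& f : flows ) {
        std::optional<uint32_t> scope; // stays empty when the VNI is ignored
        if ( use_vni )
            scope = f.vni;

        table.emplace(f.orig_h, f.orig_p, f.resp_h, f.resp_p, scope);
    }

    return table.size();
}

// Three flows with identical inner tuples, mirroring the example trace.
std::vector<Flow> example_flows() {
    return {
        {"10.0.0.1", 49152, "10.0.0.2", 80, std::nullopt},
        {"10.0.0.1", 49152, "10.0.0.2", 80, 4711},
        {"10.0.0.1", 49152, "10.0.0.2", 80, 4242},
    };
}

// Usage: one connection without the VNI in the key, three with it.
void demo() {
    assert(count_connections(example_flows(), false) == 1);
    assert(count_connections(example_flows(), true) == 3);
}
```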
The plugin we'll develop below adds the VXLAN VNI to the connection key.
As a result, Zeek will correctly report three HTTP connections, tracked
and logged separately. We'll add the VNI as
:zeek:field:`vxlan_vni` to the :zeek:see:`conn_id_ctx` record, making it available
in ``http.log`` and ``conn.log`` via the ``id.ctx.vxlan_vni`` column.
After activating the plugin Zeek tracks each HTTP connection individually and
the logs will look as follows:
.. code-block:: shell
$ zeek-cut -m uid method host uri id.ctx.vxlan_vni < http.log
uid method host uri id.ctx.vxlan_vni
CBifsS2vqGEg8Fa5ac GET bro.org /download/CHANGES.bro-aux.txt 4711
CEllEz13txeSrbGqBe GET bro.org /download/CHANGES.bro-aux.txt 4242
CRfbJw1kBBvHDQQBta GET bro.org /download/CHANGES.bro-aux.txt -
$ zeek-cut -m uid service history orig_pkts resp_pkts id.ctx.vxlan_vni < conn.log
uid service history orig_pkts resp_pkts id.ctx.vxlan_vni
CRfbJw1kBBvHDQQBta http ShADadFf 7 7 -
CEllEz13txeSrbGqBe http ShADadFf 7 7 4242
CBifsS2vqGEg8Fa5ac http ShADadFf 7 7 4711
CC6Ald2LejCS1qcDy4 vxlan D 28 0 -
Implementation
==============
Adding alternative connection keys involves implementing two classes.
First, a factory class producing ``zeek::ConnKey`` instances. This
is the class created through the added ``zeek::conn_key::Component``.
Second, a custom connection key class derived from ``zeek::ConnKey``.
Instances of this class are created by the factory. This is a typical
abstract factory pattern.
Our plugin's ``Configure()`` method follows the standard pattern of setting up
basic information about the plugin and registering our own ``ConnKey`` component.
.. literalinclude:: connkey-vxlan-fivetuple-plugin-src/src/Plugin.cc
:caption: Plugin.cc
:language: cpp
:lines: 16-
:linenos:
:tab-width: 4
Next, in the ``Factory.cc`` file, we're implementing a custom ``zeek::ConnKey`` class.
This class is named ``VxlanVniConnKey`` and inherits from ``zeek::IPBasedConnKey``.
While ``zeek::ConnKey`` is technically the base class, in this tutorial we'll
derive from ``zeek::IPBasedConnKey``.
Currently, Zeek only supports IP-based connection tracking via the
``IPBasedAnalyzer`` analyzer. This analyzer requires ``zeek::IPBasedConnKey``
instances.
.. literalinclude:: connkey-vxlan-fivetuple-plugin-src/src/Factory.cc
:caption: VxlanVniConnKey class in Factory.cc
:language: cpp
:linenos:
:lines: 18-78
:tab-width: 4
The current pattern for custom connection keys is to embed the bytes used for
the ``zeek::session::detail::Key`` as a packed struct within a ``ConnKey`` instance.
We override ``DoPopulateConnIdVal()`` to set the :zeek:field:`vxlan_vni` field
of the :zeek:see:`conn_id_ctx` record value to the extracted VXLAN VNI. A small trick
employed is that we default the most significant byte of ``key.vxlan_vni`` to 0xFF.
As a VNI has only 24 bits, this allows us to determine if a VNI was actually
extracted, or whether it remained unset.
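The packed-struct pattern and the ``0xFF`` marker can be sketched in isolation. The example below is an illustration under assumptions, not the plugin's code: ``ToyTuple``, ``ToyKey``, ``make_key()``, ``vni_of()`` and ``same_key()`` are hypothetical, the tuple layout does not match Zeek's ``PackedConnTuple``, and ``__attribute__((packed))`` assumes GCC or Clang.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <optional>

// Hypothetical stand-in for a packed five-tuple; the real layout differs.
struct ToyTuple {
    uint8_t orig_h[16];
    uint8_t resp_h[16];
    uint16_t orig_p;
    uint16_t resp_p;
    uint16_t proto;
} __attribute__((packed));

// The key bytes: tuple plus VNI, without padding holes.
struct ToyKey {
    ToyTuple tuple;
    uint32_t vni;
} __attribute__((packed));

constexpr uint32_t VNI_UNSET = 0xFF000000; // outside the 24-bit VNI range

// Returns a zeroed key that is marked as "no VNI seen".
ToyKey make_key() {
    ToyKey k;
    std::memset(&k, 0, sizeof(k));
    k.vni = VNI_UNSET;
    return k;
}

// A valid VNI fits into 24 bits, so any set high bit means "unset".
std::optional<uint32_t> vni_of(const ToyKey& k) {
    uint32_t v = k.vni; // copy first: references to packed fields are problematic
    if ( (v & 0xFF000000) != 0 )
        return std::nullopt;

    return v;
}

// Keys are compared (and would be hashed) as raw bytes.
bool same_key(const ToyKey& a, const ToyKey& b) { return std::memcmp(&a, &b, sizeof(ToyKey)) == 0; }

void demo() {
    ToyKey a = make_key();
    ToyKey b = make_key();
    assert(! vni_of(a).has_value());

    a.vni = 4711;
    assert(vni_of(a).has_value());
    assert(! same_key(a, b)); // same tuple, different VNI scope
}
```

Because the whole struct serves as the key, two connections with the same five-tuple but different VNIs end up as different keys, while the marker value keeps unencapsulated traffic apart from any real VNI.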
The ``DoInit()`` implementation is the actual place for connection key customization.
This is where we extract the VXLAN VNI from packet data. To do so, we're using the relatively
new ``GetAnalyzerData()`` API of the packet analysis manager.
This API allows generic access to the raw data layers analyzed by a given packet analyzer.
For our use case, we take the outermost VXLAN layer, if any, and extract the VNI
into ``key.vxlan_vni``.
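The byte arithmetic behind the VNI extraction can be tried standalone. The sketch below assumes the 8-byte VXLAN header layout from RFC 7348 (one flags byte, three reserved bytes, three VNI bytes, one reserved byte). ``parse_vxlan_vni()`` is a hypothetical helper, and its check of the ``I`` flag is an addition for illustration that the plugin code does not perform.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>

// Parses the 24-bit VNI from an 8-byte VXLAN header (RFC 7348):
// byte 0 flags, bytes 1-3 reserved, bytes 4-6 VNI, byte 7 reserved.
std::optional<uint32_t> parse_vxlan_vni(const uint8_t* data, std::size_t len) {
    if ( data == nullptr || len < 8 )
        return std::nullopt;

    // The I flag (0x08) signals that the VNI field is valid.
    if ( (data[0] & 0x08) == 0 )
        return std::nullopt;

    // The VNI is stored in network byte order.
    uint32_t vni = (static_cast<uint32_t>(data[4]) << 16) | (static_cast<uint32_t>(data[5]) << 8) |
                   static_cast<uint32_t>(data[6]);
    return vni;
}

void demo() {
    // 4711 == 0x001267
    const uint8_t hdr[8] = {0x08, 0x00, 0x00, 0x00, 0x00, 0x12, 0x67, 0x00};
    auto vni = parse_vxlan_vni(hdr, sizeof(hdr));
    assert(vni.has_value() && *vni == 4711);

    // Truncated headers yield no VNI.
    assert(! parse_vxlan_vni(hdr, 7).has_value());
}
```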
There's no requirement to use the ``GetAnalyzerData()`` API. If the ``zeek::Packet``
instance passed to ``DoInit()`` contains the needed information, e.g. VLAN identifiers
or information from the packet's raw bytes, you can use them directly.
Specifically, ``GetAnalyzerData()`` may introduce additional overhead into the
packet path that you can avoid if the information is readily available
elsewhere.
Using other Zeek APIs to determine connection key information is of course
also possible.
The next part shown concerns the ``Factory`` class itself. The
``DoConnKeyFromVal()`` method contains logic to produce a ``VxlanVniConnKey``
instance from an existing :zeek:see:`conn_id` record.
This is needed in order for the :zeek:see:`lookup_connection` builtin function to work properly.
The implementation re-uses the ``DoConnKeyFromVal()`` implementation of the
default ``fivetuple::Factory`` that our factory inherits from to extract the
classic five-tuple information.
.. literalinclude:: connkey-vxlan-fivetuple-plugin-src/src/Factory.cc
:caption: Factory class in Factory.cc
:language: cpp
:linenos:
:lines: 80-103
:tab-width: 4
Calling the ``fivetuple::Factory::DoConnKeyFromVal()`` in turn calls our
own factory's ``DoNewConnKey()`` method through virtual dispatch. Since our
factory overrides this method to always return a ``VxlanVniConnKey`` instance,
the static cast later is safe.
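Why the static cast is safe becomes clearer in a stripped-down model. The classes below (``Key``, ``VniKey``, ``BaseFactory``, ``VniFactory``) are hypothetical stand-ins, not Zeek's API. They only mirror the relationship described above: the base factory allocates keys through a virtual method that the derived factory overrides.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>

// Hypothetical, heavily simplified stand-ins for the Zeek classes.
struct Key {
    virtual ~Key() = default;
    uint16_t orig_p = 0;
};

struct VniKey : Key {
    uint32_t vni = 0xFF000000;
};

class BaseFactory {
public:
    virtual ~BaseFactory() = default;

    // Fills the generic part; allocation goes through the virtual NewKey().
    std::unique_ptr<Key> FromPort(uint16_t port) const {
        auto k = NewKey();
        k->orig_p = port;
        return k;
    }

protected:
    virtual std::unique_ptr<Key> NewKey() const { return std::make_unique<Key>(); }
};

class VniFactory : public BaseFactory {
public:
    std::unique_ptr<Key> FromPortAndVni(uint16_t port, uint32_t vni) const {
        // The base implementation calls our NewKey() through virtual dispatch.
        auto k = BaseFactory::FromPort(port);

        // Safe: NewKey() below always produces a VniKey.
        static_cast<VniKey*>(k.get())->vni = vni;
        return k;
    }

protected:
    std::unique_ptr<Key> NewKey() const override { return std::make_unique<VniKey>(); }
};

void demo() {
    VniFactory f;
    auto k = f.FromPortAndVni(80, 4711);

    // dynamic_cast confirms the dynamic type that the static_cast relied on.
    auto* vk = dynamic_cast<VniKey*>(k.get());
    assert(vk != nullptr);
    assert(vk->orig_p == 80 && vk->vni == 4711);
}
```

The cast would stop being safe if a further subclass overrode the allocation method to return a different key type, so the override and the cast need to be kept in sync.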
Last, the plugin's ``__load__.zeek`` file is shown. It includes the extension
of the :zeek:see:`conn_id_ctx` identifier by the :zeek:field:`vxlan_vni` field.
.. literalinclude:: connkey-vxlan-fivetuple-plugin-src/scripts/__load__.zeek
:caption: The conn_id_ctx redefinition in __load__.zeek
:language: zeek
:linenos:
:tab-width: 4
Using the custom Connection Key
===============================
After installing the plugin, the new connection key implementation can be
selected by redefining the script-level :zeek:see:`ConnKey::factory` variable.
This can be done in a separate script, but we do it directly on the
command line for simplicity. The ``ConnKey::CONNKEY_VXLAN_VNI_FIVETUPLE`` enum value is
registered in Zeek by the plugin's ``AddComponent()`` call in
``Configure()``, where the component has the name ``VXLAN_VNI_FIVETUPLE``.
.. code-block:: shell
$ zeek -C -r Traces/vxlan-overlapping-http-get.pcap ConnKey::factory=ConnKey::CONNKEY_VXLAN_VNI_FIVETUPLE
Viewing the ``conn.log`` now shows three separate HTTP connections,
two of which have a ``vxlan_vni`` value set in their logs.
.. code-block:: shell
$ zeek-cut -m uid service history orig_pkts resp_pkts id.ctx.vxlan_vni < conn.log
uid service history orig_pkts resp_pkts id.ctx.vxlan_vni
CRfbJw1kBBvHDQQBta http ShADadFf 7 7 -
CEllEz13txeSrbGqBe http ShADadFf 7 7 4242
CBifsS2vqGEg8Fa5ac http ShADadFf 7 7 4711
CC6Ald2LejCS1qcDy4 vxlan D 28 0 -
Pretty cool, isn't it?

View file

@ -0,0 +1,14 @@
cmake_minimum_required(VERSION 3.15 FATAL_ERROR)
project(ZeekPluginConnKeyVxlanVniFivetuple)
include(ZeekPlugin)
zeek_add_plugin(
Zeek
ConnKey_Vxlan_Vni_Fivetuple
SOURCES
src/Factory.cc
src/Plugin.cc
SCRIPT_FILES scripts/__load__.zeek
)

View file

@ -0,0 +1,26 @@
Copyright (c) 2025 by the Zeek Project. All rights reserved.
Redistribution and use in source and binary forms, with or without modification,
are permitted provided that the following conditions are met:
1. Redistributions of source code must retain the above copyright notice, this
list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright notice,
this list of conditions and the following disclaimer in the documentation
and/or other materials provided with the distribution.
3. Neither the name of the copyright holder nor the names of its contributors
may be used to endorse or promote products derived from this software
without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR
ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON
ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

View file

@ -0,0 +1,23 @@
#
# Convenience Makefile providing a few common top-level targets.
#
cmake_build_dir=build
arch=`uname -s | tr A-Z a-z`-`uname -m`
all: build-it
build-it:
( cd $(cmake_build_dir) && make )
install:
( cd $(cmake_build_dir) && make install )
clean:
( cd $(cmake_build_dir) && make clean )
distclean:
rm -rf $(cmake_build_dir)
test:
make -C tests

View file

@ -0,0 +1 @@
0.1.0

View file

@ -0,0 +1,193 @@
#!/bin/sh
#
# Wrapper for viewing/setting options that the plugin's CMake
# scripts will recognize.
#
# Don't edit this. Edit configure.plugin to add plugin-specific options.
#
set -e
command="$0 $*"
if [ -e $(dirname $0)/configure.plugin ]; then
# Include custom additions.
. $(dirname $0)/configure.plugin
fi
usage() {
cat 1>&2 <<EOF
Usage: $0 [OPTIONS]
Plugin Options:
--cmake=PATH Path to CMake binary
--zeek-dist=DIR Path to Zeek source tree
--install-root=DIR Path where to install plugin into
--with-binpac=DIR Path to BinPAC installation root
--with-broker=DIR Path to Broker installation root
--with-bifcl=PATH Path to bifcl executable
--enable-debug Compile in debugging mode
--disable-cpp-tests Don't build C++ unit tests
EOF
if type plugin_usage >/dev/null 2>&1; then
plugin_usage 1>&2
fi
echo
exit 1
}
# Function to append a CMake cache entry definition to the
# CMakeCacheEntries variable
# $1 is the cache entry variable name
# $2 is the cache entry variable type
# $3 is the cache entry variable value
append_cache_entry() {
CMakeCacheEntries="$CMakeCacheEntries -D $1:$2=$3"
}
# set defaults
builddir=build
zeekdist=""
installroot="default"
zeek_plugin_begin_opts=""
CMakeCacheEntries=""
while [ $# -ne 0 ]; do
case "$1" in
-*=*) optarg=$(echo "$1" | sed 's/[-_a-zA-Z0-9]*=//') ;;
*) optarg= ;;
esac
case "$1" in
--help | -h)
usage
;;
--cmake=*)
CMakeCommand=$optarg
;;
--zeek-dist=*)
zeekdist=$(cd $optarg && pwd)
;;
--install-root=*)
installroot=$optarg
;;
--with-binpac=*)
append_cache_entry BinPAC_ROOT_DIR PATH $optarg
binpac_root=$optarg
;;
--with-broker=*)
append_cache_entry BROKER_ROOT_DIR PATH $optarg
broker_root=$optarg
;;
--with-bifcl=*)
append_cache_entry BifCl_EXE PATH $optarg
;;
--enable-debug)
append_cache_entry BRO_PLUGIN_ENABLE_DEBUG BOOL true
;;
--disable-cpp-tests)
zeek_plugin_begin_opts="DISABLE_CPP_TESTS;$zeek_plugin_begin_opts"
;;
*)
if type plugin_option >/dev/null 2>&1; then
plugin_option $1 && shift && continue
fi
echo "Invalid option '$1'. Try $0 --help to see available options."
exit 1
;;
esac
shift
done
if [ -z "$CMakeCommand" ]; then
# prefer cmake3 over "regular" cmake (cmake == cmake2 on RHEL)
if command -v cmake3 >/dev/null 2>&1; then
CMakeCommand="cmake3"
elif command -v cmake >/dev/null 2>&1; then
CMakeCommand="cmake"
else
echo "This plugin requires CMake, please install it first."
echo "Then you may use this script to configure the CMake build."
echo "Note: pass --cmake=PATH to use cmake in non-standard locations."
exit 1
fi
fi
if [ -z "$zeekdist" ]; then
if type zeek-config >/dev/null 2>&1; then
zeek_config="zeek-config"
else
echo "Either 'zeek-config' must be in PATH or '--zeek-dist=<path>' used"
exit 1
fi
append_cache_entry BRO_CONFIG_PREFIX PATH $(${zeek_config} --prefix)
append_cache_entry BRO_CONFIG_INCLUDE_DIR PATH $(${zeek_config} --include_dir)
append_cache_entry BRO_CONFIG_PLUGIN_DIR PATH $(${zeek_config} --plugin_dir)
append_cache_entry BRO_CONFIG_LIB_DIR PATH $(${zeek_config} --lib_dir)
append_cache_entry BRO_CONFIG_CMAKE_DIR PATH $(${zeek_config} --cmake_dir)
append_cache_entry CMAKE_MODULE_PATH PATH $(${zeek_config} --cmake_dir)
build_type=$(${zeek_config} --build_type)
if [ "$build_type" = "debug" ]; then
append_cache_entry BRO_PLUGIN_ENABLE_DEBUG BOOL true
fi
if [ -z "$binpac_root" ]; then
append_cache_entry BinPAC_ROOT_DIR PATH $(${zeek_config} --binpac_root)
fi
if [ -z "$broker_root" ]; then
append_cache_entry BROKER_ROOT_DIR PATH $(${zeek_config} --broker_root)
fi
else
if [ ! -e "$zeekdist/zeek-path-dev.in" ]; then
echo "$zeekdist does not appear to be a valid Zeek source tree."
exit 1
fi
# BRO_DIST is the canonical/historical name used by plugin CMake scripts
# ZEEK_DIST doesn't serve a function at the moment, but set/provided anyway
append_cache_entry BRO_DIST PATH $zeekdist
append_cache_entry ZEEK_DIST PATH $zeekdist
append_cache_entry CMAKE_MODULE_PATH PATH $zeekdist/cmake
fi
if [ "$installroot" != "default" ]; then
mkdir -p $installroot
append_cache_entry BRO_PLUGIN_INSTALL_ROOT PATH $installroot
fi
if [ -n "$zeek_plugin_begin_opts" ]; then
append_cache_entry ZEEK_PLUGIN_BEGIN_OPTS STRING "$zeek_plugin_begin_opts"
fi
if type plugin_addl >/dev/null 2>&1; then
plugin_addl
fi
echo "Build Directory : $builddir"
echo "Zeek Source Directory : $zeekdist"
mkdir -p $builddir
cd $builddir
"$CMakeCommand" $CMakeCacheEntries ..
echo "# This is the command used to configure this build" >config.status
echo $command >>config.status
chmod u+x config.status

View file

@ -0,0 +1,3 @@
redef record conn_id_ctx += {
vxlan_vni: count &log &optional;
};

View file

@ -0,0 +1,105 @@
// See the file "COPYING" in the main distribution directory for copyright.
#include "Factory.h"
#include <memory>
#include "zeek/ID.h"
#include "zeek/Val.h"
#include "zeek/iosource/Packet.h"
#include "zeek/packet_analysis/Analyzer.h"
#include "zeek/packet_analysis/Manager.h"
#include "zeek/packet_analysis/protocol/ip/conn_key/IPBasedConnKey.h"
#include "zeek/packet_analysis/protocol/ip/conn_key/fivetuple/Factory.h"
#include "zeek/util-types.h"
namespace zeek::conn_key::vxlan_vni_fivetuple {
class VxlanVniConnKey : public zeek::IPBasedConnKey {
public:
VxlanVniConnKey() {
// Ensure padding holes in the key struct are filled with zeroes.
memset(static_cast<void*>(&key), 0, sizeof(key));
}
detail::PackedConnTuple& PackedTuple() override { return key.tuple; }
const detail::PackedConnTuple& PackedTuple() const override { return key.tuple; }
protected:
zeek::session::detail::Key DoSessionKey() const override {
return {reinterpret_cast<const void*>(&key), sizeof(key), session::detail::Key::CONNECTION_KEY_TYPE};
}
void DoPopulateConnIdVal(zeek::RecordVal& conn_id, zeek::RecordVal& ctx) override {
// Base class populates conn_id fields (orig_h, orig_p, resp_h, resp_p)
zeek::IPBasedConnKey::DoPopulateConnIdVal(conn_id, ctx);
if ( conn_id.GetType() != id::conn_id )
return;
if ( (key.vxlan_vni & 0xFF000000) == 0 ) // High-bits unset: Have VNI
ctx.Assign(GetVxlanVniOffset(), static_cast<zeek_uint_t>(key.vxlan_vni));
else
ctx.Remove(GetVxlanVniOffset());
}
// Extract VNI from the outermost VXLAN layer.
void DoInit(const Packet& pkt) override {
static const auto& analyzer = zeek::packet_mgr->GetAnalyzer("VXLAN");
// Set the high-bits: This is needed because keys can get reused.
key.vxlan_vni = 0xFF000000;
if ( ! analyzer || ! analyzer->IsEnabled() )
return;
auto spans = zeek::packet_mgr->GetAnalyzerData(analyzer);
if ( spans.empty() || spans[0].size() < 8 )
return;
key.vxlan_vni = spans[0][4] << 16 | spans[0][5] << 8 | spans[0][6];
}
static int GetVxlanVniOffset() {
static const auto& conn_id_ctx = zeek::id::find_type<zeek::RecordType>("conn_id_ctx");
static int vxlan_vni_offset = conn_id_ctx->FieldOffset("vxlan_vni");
return vxlan_vni_offset;
}
private:
friend class Factory;
struct {
struct detail::PackedConnTuple tuple;
uint32_t vxlan_vni;
} __attribute__((packed, aligned)) key; // packed and aligned due to usage for hashing
};
zeek::ConnKeyPtr Factory::DoNewConnKey() const { return std::make_unique<VxlanVniConnKey>(); }
zeek::expected<zeek::ConnKeyPtr, std::string> Factory::DoConnKeyFromVal(const zeek::Val& v) const {
if ( v.GetType() != id::conn_id )
return zeek::unexpected<std::string>{"unexpected value type"};
auto ck = zeek::conn_key::fivetuple::Factory::DoConnKeyFromVal(v);
if ( ! ck.has_value() )
return ck;
int vxlan_vni_offset = VxlanVniConnKey::GetVxlanVniOffset();
static int ctx_offset = id::conn_id->FieldOffset("ctx");
auto* k = static_cast<VxlanVniConnKey*>(ck.value().get());
auto* ctx = v.AsRecordVal()->GetFieldAs<zeek::RecordVal>(ctx_offset);
if ( vxlan_vni_offset < 0 )
return zeek::unexpected<std::string>{"missing vxlan_vni field"};
if ( ctx->HasField(vxlan_vni_offset) )
k->key.vxlan_vni = ctx->GetFieldAs<zeek::CountVal>(vxlan_vni_offset);
return ck;
}
} // namespace zeek::conn_key::vxlan_vni_fivetuple

View file

@ -0,0 +1,18 @@
#pragma once
#include "zeek/ConnKey.h"
#include "zeek/packet_analysis/protocol/ip/conn_key/fivetuple/Factory.h"
namespace zeek::conn_key::vxlan_vni_fivetuple {
class Factory : public zeek::conn_key::fivetuple::Factory {
public:
static zeek::conn_key::FactoryPtr Instantiate() { return std::make_unique<Factory>(); }
private:
// Returns a VxlanVniConnKey instance.
zeek::ConnKeyPtr DoNewConnKey() const override;
zeek::expected<zeek::ConnKeyPtr, std::string> DoConnKeyFromVal(const zeek::Val& v) const override;
};
} // namespace zeek::conn_key::vxlan_vni_fivetuple

View file

@ -0,0 +1,26 @@
#include "Plugin.h"
#include <zeek/conn_key/Component.h>
#include "Factory.h"
namespace plugin {
namespace Zeek_ConnKey_Vxlan_Vni_Fivetuple {
Plugin plugin;
}
} // namespace plugin
using namespace plugin::Zeek_ConnKey_Vxlan_Vni_Fivetuple;
zeek::plugin::Configuration Plugin::Configure() {
zeek::plugin::Configuration config;
config.name = "Zeek::ConnKey_Vxlan_Vni_Fivetuple";
config.description = "ConnKey implementation using the most outer VXLAN VNI";
config.version = {0, 1, 0};
AddComponent(new zeek::conn_key::Component("VXLAN_VNI_FIVETUPLE",
zeek::conn_key::vxlan_vni_fivetuple::Factory::Instantiate));
return config;
}

View file

@ -0,0 +1,17 @@
#pragma once
#include <zeek/plugin/Plugin.h>
namespace plugin {
namespace Zeek_ConnKey_Vxlan_Vni_Fivetuple {
class Plugin : public zeek::plugin::Plugin {
protected:
zeek::plugin::Configuration Configure() override;
};
extern Plugin plugin;
} // namespace Zeek_ConnKey_Vxlan_Vni_Fivetuple
} // namespace plugin

View file

@ -0,0 +1,3 @@
build
*.log
.state

View file

@ -0,0 +1,13 @@
cmake_minimum_required(VERSION 3.15 FATAL_ERROR)
project(ZeekPluginEventLatency)
include(ZeekPlugin)
zeek_add_plugin(
Zeek
EventLatency
SOURCES
src/Plugin.cc
SCRIPT_FILES scripts/__load__.zeek
)

View file

@ -0,0 +1,26 @@
Copyright (c) 2025 by the Zeek Project. All rights reserved.
Redistribution and use in source and binary forms, with or without modification,
are permitted provided that the following conditions are met:
1. Redistributions of source code must retain the above copyright notice, this
list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright notice,
this list of conditions and the following disclaimer in the documentation
and/or other materials provided with the distribution.
3. Neither the name of the copyright holder nor the names of its contributors
may be used to endorse or promote products derived from this software
without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR
ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON
ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

View file

@ -0,0 +1,23 @@
#
# Convenience Makefile providing a few common top-level targets.
#
cmake_build_dir=build
arch=`uname -s | tr A-Z a-z`-`uname -m`
all: build-it
build-it:
( cd $(cmake_build_dir) && make )
install:
( cd $(cmake_build_dir) && make install )
clean:
( cd $(cmake_build_dir) && make clean )
distclean:
rm -rf $(cmake_build_dir)
test:
make -C tests

View file

@ -0,0 +1 @@
0.1.0

View file

@ -0,0 +1,193 @@
#!/bin/sh
#
# Wrapper for viewing/setting options that the plugin's CMake
# scripts will recognize.
#
# Don't edit this. Edit configure.plugin to add plugin-specific options.
#
set -e
command="$0 $*"
if [ -e $(dirname $0)/configure.plugin ]; then
# Include custom additions.
. $(dirname $0)/configure.plugin
fi
usage() {
cat 1>&2 <<EOF
Usage: $0 [OPTIONS]
Plugin Options:
--cmake=PATH Path to CMake binary
--zeek-dist=DIR Path to Zeek source tree
--install-root=DIR Path where to install plugin into
--with-binpac=DIR Path to BinPAC installation root
--with-broker=DIR Path to Broker installation root
--with-bifcl=PATH Path to bifcl executable
--enable-debug Compile in debugging mode
--disable-cpp-tests Don't build C++ unit tests
EOF
if type plugin_usage >/dev/null 2>&1; then
plugin_usage 1>&2
fi
echo
exit 1
}
# Function to append a CMake cache entry definition to the
# CMakeCacheEntries variable
# $1 is the cache entry variable name
# $2 is the cache entry variable type
# $3 is the cache entry variable value
append_cache_entry() {
CMakeCacheEntries="$CMakeCacheEntries -D $1:$2=$3"
}
# set defaults
builddir=build
zeekdist=""
installroot="default"
zeek_plugin_begin_opts=""
CMakeCacheEntries=""
while [ $# -ne 0 ]; do
case "$1" in
-*=*) optarg=$(echo "$1" | sed 's/[-_a-zA-Z0-9]*=//') ;;
*) optarg= ;;
esac
case "$1" in
--help | -h)
usage
;;
--cmake=*)
CMakeCommand=$optarg
;;
--zeek-dist=*)
zeekdist=$(cd $optarg && pwd)
;;
--install-root=*)
installroot=$optarg
;;
--with-binpac=*)
append_cache_entry BinPAC_ROOT_DIR PATH $optarg
binpac_root=$optarg
;;
--with-broker=*)
append_cache_entry BROKER_ROOT_DIR PATH $optarg
broker_root=$optarg
;;
--with-bifcl=*)
append_cache_entry BifCl_EXE PATH $optarg
;;
--enable-debug)
append_cache_entry BRO_PLUGIN_ENABLE_DEBUG BOOL true
;;
--disable-cpp-tests)
zeek_plugin_begin_opts="DISABLE_CPP_TESTS;$zeek_plugin_begin_opts"
;;
*)
if type plugin_option >/dev/null 2>&1; then
plugin_option $1 && shift && continue
fi
echo "Invalid option '$1'. Try $0 --help to see available options."
exit 1
;;
esac
shift
done
if [ -z "$CMakeCommand" ]; then
# prefer cmake3 over "regular" cmake (cmake == cmake2 on RHEL)
if command -v cmake3 >/dev/null 2>&1; then
CMakeCommand="cmake3"
elif command -v cmake >/dev/null 2>&1; then
CMakeCommand="cmake"
else
echo "This plugin requires CMake, please install it first."
echo "Then you may use this script to configure the CMake build."
echo "Note: pass --cmake=PATH to use cmake in non-standard locations."
exit 1
fi
fi
if [ -z "$zeekdist" ]; then
if type zeek-config >/dev/null 2>&1; then
zeek_config="zeek-config"
else
echo "Either 'zeek-config' must be in PATH or '--zeek-dist=<path>' used"
exit 1
fi
append_cache_entry BRO_CONFIG_PREFIX PATH $(${zeek_config} --prefix)
append_cache_entry BRO_CONFIG_INCLUDE_DIR PATH $(${zeek_config} --include_dir)
append_cache_entry BRO_CONFIG_PLUGIN_DIR PATH $(${zeek_config} --plugin_dir)
append_cache_entry BRO_CONFIG_LIB_DIR PATH $(${zeek_config} --lib_dir)
append_cache_entry BRO_CONFIG_CMAKE_DIR PATH $(${zeek_config} --cmake_dir)
append_cache_entry CMAKE_MODULE_PATH PATH $(${zeek_config} --cmake_dir)
build_type=$(${zeek_config} --build_type)
if [ "$build_type" = "debug" ]; then
append_cache_entry BRO_PLUGIN_ENABLE_DEBUG BOOL true
fi
if [ -z "$binpac_root" ]; then
append_cache_entry BinPAC_ROOT_DIR PATH $(${zeek_config} --binpac_root)
fi
if [ -z "$broker_root" ]; then
append_cache_entry BROKER_ROOT_DIR PATH $(${zeek_config} --broker_root)
fi
else
if [ ! -e "$zeekdist/zeek-path-dev.in" ]; then
echo "$zeekdist does not appear to be a valid Zeek source tree."
exit 1
fi
# BRO_DIST is the canonical/historical name used by plugin CMake scripts
# ZEEK_DIST doesn't serve a function at the moment, but set/provided anyway
append_cache_entry BRO_DIST PATH $zeekdist
append_cache_entry ZEEK_DIST PATH $zeekdist
append_cache_entry CMAKE_MODULE_PATH PATH $zeekdist/cmake
fi
if [ "$installroot" != "default" ]; then
mkdir -p $installroot
append_cache_entry BRO_PLUGIN_INSTALL_ROOT PATH $installroot
fi
if [ -n "$zeek_plugin_begin_opts" ]; then
append_cache_entry ZEEK_PLUGIN_BEGIN_OPTS STRING "$zeek_plugin_begin_opts"
fi
if type plugin_addl >/dev/null 2>&1; then
plugin_addl
fi
echo "Build Directory : $builddir"
echo "Zeek Source Directory : $zeekdist"
mkdir -p $builddir
cd $builddir
"$CMakeCommand" $CMakeCacheEntries ..
echo "# This is the command used to configure this build" >config.status
echo $command >>config.status
chmod u+x config.status

View file

@ -0,0 +1,11 @@
module EventLatency;
redef enum EventMetadata::ID += {
## Identifier for the absolute time at which Zeek published this event.
WALLCLOCK_TIMESTAMP = 10001000,
};
event zeek_init()
{
assert EventMetadata::register(WALLCLOCK_TIMESTAMP, time);
}

View file

@ -0,0 +1 @@
# Empty

View file

@ -0,0 +1,65 @@
#include "Plugin.h"
#include <zeek/Event.h>
#include <zeek/Val.h>
#include <zeek/cluster/Backend.h>
#include <zeek/plugin/Plugin.h>
#include <zeek/telemetry/Manager.h>
namespace plugin {
namespace Zeek_EventLatency {
Plugin plugin;
}
} // namespace plugin
using namespace plugin::Zeek_EventLatency;
zeek::plugin::Configuration Plugin::Configure() {
zeek::plugin::Configuration config;
config.name = "Zeek::EventLatency";
config.description = "Track remote event latencies";
config.version = {0, 1, 0};
EnableHook(zeek::plugin::HOOK_PUBLISH_EVENT);
EnableHook(zeek::plugin::HOOK_QUEUE_EVENT);
return config;
}
void Plugin::InitPostScript() {
double bounds[] = {0.0002, 0.0004, 0.0006, 0.0008, 0.0010, 0.0012, 0.0014, 0.0016, 0.0018, 0.0020};
histogram =
zeek::telemetry_mgr->HistogramInstance("zeek", "cluster_event_latency_seconds", {}, bounds, "event latency");
}
bool Plugin::HookPublishEvent(zeek::cluster::Backend& backend, const std::string& topic,
zeek::cluster::detail::Event& event) {
static const auto& wallclock_id = zeek::id::find_val<zeek::EnumVal>("EventLatency::WALLCLOCK_TIMESTAMP");
auto now_val = zeek::make_intrusive<zeek::TimeVal>(zeek::util::current_time(/*real=*/true));
if ( ! event.AddMetadata(wallclock_id, now_val) )
zeek::reporter->FatalError("failed to add wallclock timestamp metadata");
return true;
}
bool Plugin::HookQueueEvent(zeek::Event* event) {
static const auto& wallclock_id = zeek::id::find_val<zeek::EnumVal>("EventLatency::WALLCLOCK_TIMESTAMP");
if ( event->Source() == zeek::util::detail::SOURCE_LOCAL )
return false;
auto timestamps = event->MetadataValues(wallclock_id);
if ( timestamps->Size() > 0 ) {
double remote_ts = timestamps->ValAt(0)->AsTime();
auto now = zeek::util::current_time(/*real=*/true);
auto latency = std::max(0.0, now - remote_ts);
histogram->Observe(latency);
}
else
zeek::reporter->Warning("missing wallclock timestamp metadata");
return false;
}

View file

@ -0,0 +1,29 @@
#pragma once
#include <zeek/plugin/Plugin.h>
#include <zeek/telemetry/Histogram.h>
namespace plugin {
namespace Zeek_EventLatency {
class Plugin : public zeek::plugin::Plugin {
protected:
// Overridden from zeek::plugin::Plugin.
zeek::plugin::Configuration Configure() override;
void InitPostScript() override;
bool HookPublishEvent(zeek::cluster::Backend& backend, const std::string& topic,
zeek::cluster::detail::Event& event) override;
bool HookQueueEvent(zeek::Event* event) override;
private:
zeek::telemetry::HistogramPtr histogram;
};
extern Plugin plugin;
} // namespace Zeek_EventLatency
} // namespace plugin

View file

@ -0,0 +1,103 @@
.. _event-metadata-plugin:
=====================
Event Metadata Plugin
=====================
.. versionadded:: 8.0
Zeek's plugin API allows adding metadata to Zeek events. In the Zeek-script
layer, the :zeek:see:`EventMetadata::current` and :zeek:see:`EventMetadata::current_all`
functions can be used to introspect metadata attached to events. In a Zeek cluster,
metadata is transported via remote events for consumption by other Zeek nodes.
This section describes the functionality in the form of a tutorial. We'll
be using custom event metadata to track the latency of Zeek events in a
cluster and expose it as a Prometheus histogram.
If you're unfamiliar with plugin development, head over to the
:ref:`Writing Plugins <writing-plugins>` section. For more information
about telemetry and Prometheus, see also the :ref:`Telemetry framework's <framework-telemetry>`
documentation.
Registering Metadata
====================
Initially, we make Zeek's core aware of the metadata to attach to events. This
requires two steps.
First, redefining the :zeek:see:`EventMetadata::ID` enumeration with our
custom enumeration value ``WALLCLOCK_TIMESTAMP``. This is our metadata identifier.
Its value represents the Unix timestamp at which an event was published.
Second, registering the metadata identifier with Zeek's :zeek:see:`time` type
by calling :zeek:see:`EventMetadata::register` in a :zeek:see:`zeek_init` handler.
This instructs Zeek to convert metadata items in received remote events with
identifier ``10001000`` to a :zeek:see:`time` value.
For simplicity, both steps are done in the plugin's ``scripts/__load__.zeek`` file
that's loaded automatically when Zeek loads the plugin.
.. literalinclude:: event-metadata-plugin-src/scripts/__load__.zeek
:caption: __load__.zeek
:language: zeek
:linenos:
:tab-width: 4
The ``10001000`` represents the metadata identifier for serialization purposes. It
needs to be unique and have a defined meaning and consistent type for a given Zeek
deployment. Metadata identifiers below ``200`` are reserved for Zeek's internal use.
Users are free to choose any other value. Zeek will fail to start or fail to
register the type in the case of conflicting identifiers in third-party packages.
Implementing the Plugin
=======================
Next, we implement the ``InitPostScript()``, ``HookPublishEvent()`` and
``HookQueueEvent()`` methods in our plugin.
In the ``InitPostScript()`` method, a histogram instance is initialized using
Zeek's telemetry manager with hard-coded bounds. These define buckets for latency
monitoring.
The ``HookPublishEvent()`` method adds ``WALLCLOCK_TIMESTAMP`` metadata with
the current time to the event, while the ``HookQueueEvent()`` method extracts
the sender's timestamp and computes the latency based on its own local time.
Finally, the latency is recorded with the histogram by calling ``Observe()``.
.. literalinclude:: event-metadata-plugin-src/src/Plugin.cc
:caption: Plugin.cc
:language: cpp
:linenos:
:lines: 28-
:tab-width: 4
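To make the bucket semantics concrete, here is a small standalone model of a histogram with upper bounds. It is a sketch under assumptions and does not use Zeek's telemetry API: ``ToyHistogram``, ``CumulativeCount()`` and ``latency()`` are hypothetical names. It mimics two aspects of the plugin: clamping negative latencies to zero and assigning each observation to the first bucket whose upper bound is not smaller than the value.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Toy histogram with sorted upper bounds; not Zeek's telemetry API.
class ToyHistogram {
public:
    explicit ToyHistogram(std::vector<double> upper_bounds)
        : bounds(std::move(upper_bounds)), counts(bounds.size() + 1, 0) {}

    void Observe(double v) {
        // First bucket whose upper bound is >= v; past-the-end is "+Inf".
        auto it = std::lower_bound(bounds.begin(), bounds.end(), v);
        ++counts[static_cast<std::size_t>(it - bounds.begin())];
    }

    // Observations <= bounds[i], i.e. what an "le" bucket reports.
    // Index bounds.size() corresponds to the "+Inf" bucket.
    uint64_t CumulativeCount(std::size_t i) const {
        uint64_t total = 0;
        for ( std::size_t j = 0; j <= i && j < counts.size(); ++j )
            total += counts[j];

        return total;
    }

private:
    std::vector<double> bounds;
    std::vector<uint64_t> counts;
};

// Clock differences between nodes can make the result negative; clamp to zero.
double latency(double sent_ts, double received_ts) { return std::max(0.0, received_ts - sent_ts); }

void demo() {
    ToyHistogram h({0.5, 1.0});

    h.Observe(latency(1.0, 1.25)); // 0.25 -> first bucket
    h.Observe(latency(2.0, 1.0));  // negative, clamped to 0.0 -> first bucket
    h.Observe(latency(1.0, 1.75)); // 0.75 -> second bucket
    h.Observe(latency(1.0, 4.0));  // 3.0 -> "+Inf" bucket

    assert(h.CumulativeCount(0) == 2);
    assert(h.CumulativeCount(1) == 3);
    assert(h.CumulativeCount(2) == 4);
}
```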
Resulting Prometheus Metrics
============================
Deploying the plugin outlined above in a cluster and querying the manager's
metrics endpoint presents the following result::
$ curl -s localhost:10001/metrics | grep '^zeek_cluster_event_latency'
zeek_cluster_event_latency_seconds_count{endpoint="manager"} 11281
zeek_cluster_event_latency_seconds_sum{endpoint="manager"} 7.960928916931152
zeek_cluster_event_latency_seconds_bucket{endpoint="manager",le="0.0002"} 37
zeek_cluster_event_latency_seconds_bucket{endpoint="manager",le="0.0004"} 583
zeek_cluster_event_latency_seconds_bucket{endpoint="manager",le="0.0005999999999999999"} 3858
zeek_cluster_event_latency_seconds_bucket{endpoint="manager",le="0.0008"} 7960
zeek_cluster_event_latency_seconds_bucket{endpoint="manager",le="0.001"} 10185
zeek_cluster_event_latency_seconds_bucket{endpoint="manager",le="0.0012"} 10957
zeek_cluster_event_latency_seconds_bucket{endpoint="manager",le="0.0014"} 11239
zeek_cluster_event_latency_seconds_bucket{endpoint="manager",le="0.0016"} 11269
zeek_cluster_event_latency_seconds_bucket{endpoint="manager",le="0.0018"} 11279
zeek_cluster_event_latency_seconds_bucket{endpoint="manager",le="0.002"} 11281
zeek_cluster_event_latency_seconds_bucket{endpoint="manager",le="+Inf"} 11281
This example shows that 11281 latencies were observed in total, summing to
around 8 seconds. Of these, 37 events had a latency of at most 0.2 milliseconds,
583 of at most 0.4 milliseconds, and none took more than 2 milliseconds.
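Since the bucket counts are cumulative, a little arithmetic helps when reading
them. The following Python sketch copies the numbers from the sample output
above (with the ``0.0005999999999999999`` bound rounded to ``0.0006``) and
derives per-bucket counts, the mean latency and a coarse quantile estimate. The
helper names are hypothetical and not part of Zeek or Prometheus.

.. code-block:: python

   # Cumulative counts from the sample output, keyed by upper bound ("le")
   # in seconds.
   CUMULATIVE = [
       (0.0002, 37), (0.0004, 583), (0.0006, 3858), (0.0008, 7960),
       (0.001, 10185), (0.0012, 10957), (0.0014, 11239), (0.0016, 11269),
       (0.0018, 11279), (0.002, 11281), (float("inf"), 11281),
   ]
   COUNT = 11281
   SUM = 7.960928916931152


   def per_bucket(cumulative):
       """Turn running totals into the number of events in each bucket."""
       out, previous = [], 0
       for le, count in cumulative:
           out.append((le, count - previous))
           previous = count
       return out


   def mean_latency(total, count):
       """Average latency in seconds: the _sum series over the _count series."""
       return total / count


   def bucket_for_quantile(cumulative, q):
       """Upper bound of the first bucket holding at least a fraction q of events."""
       target = q * cumulative[-1][1]
       for le, count in cumulative:
           if count >= target:
               return le

With these numbers, the mean latency comes out at roughly 0.7 milliseconds, and
the median falls into the bucket bounded by 0.8 milliseconds.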
This sort of data is usually scraped and ingested by a `Prometheus server <https://prometheus.io/>`_ and
then visualized using `Grafana <https://grafana.com/>`_.


@ -0,0 +1,46 @@
#!/bin/bash
#
# Copyright (c) 2020-2023 by the Zeek Project. See LICENSE for details.
#
# Tool to update autogenerated docs that require external files. Must be
# run manually and requires access to the Spicy TFTP analyzer.
set -e
if [ $# != 1 ]; then
echo "usage: $(basename "$0") <spicy-tftp-repo>"
exit 1
fi
TFTP=$1
if [ ! -d "${TFTP}"/analyzer ]; then
echo "${TFTP} does not seem to point to a spicy-tftp repository."
exit 1
fi
set -o errexit
set -o nounset
ZEEK="$(cd "$( dirname "${BASH_SOURCE[0]}")" >/dev/null 2>&1 && pwd)/../../.."
DOC="${ZEEK}/doc"
SPICY="${ZEEK}/auxil/spicy"
SPICYDOC="${ZEEK}/build/auxil/spicy/bin/spicy-doc"
AUTOGEN_FINAL="${ZEEK}/doc/devel/spicy/autogen"
if [ ! -x "${SPICYDOC}" ]; then
>&2 echo "Warning: Could not find spicy-doc in build directory, aborting"
exit 0
fi
"${SPICY}/doc/scripts/autogen-spicy-lib" functions zeek < "${ZEEK}/scripts/spicy/zeek.spicy" > "${AUTOGEN_FINAL}/zeek-functions.spicy" || exit 1
# Copy some static files over.
cp "${TFTP}"/scripts/main.zeek "${AUTOGEN_FINAL}"/tftp.zeek || exit 1
cp "${TFTP}"/analyzer/tftp.spicy "${AUTOGEN_FINAL}"/tftp.spicy || exit 1
cp "${TFTP}"/analyzer/tftp.evt "${AUTOGEN_FINAL}"/tftp.evt || exit 1
# Copy some files from the Zeek source tree so that zeek-docs remains standalone for CI.
cp "${ZEEK}/scripts/base/frameworks/spicy/init-bare.zeek" "${AUTOGEN_FINAL}/"
cp "${ZEEK}/scripts/base/frameworks/spicy/init-framework.zeek" "${AUTOGEN_FINAL}/"
cp "${ZEEK}/auxil/spicy/doc/scripts/spicy-pygments.py" "${DOC}/ext"


@ -0,0 +1,38 @@
module Spicy;
export {
# doc-options-start
## Constant for testing if Spicy is available.
const available = T;
## Show output of Spicy print statements.
const enable_print = F &redef;
# Record and display profiling information, if compiled into analyzer.
const enable_profiling = F &redef;
## abort() instead of throwing HILTI exceptions.
const abort_on_exceptions = F &redef;
## Include backtraces when reporting unhandled exceptions.
const show_backtraces = F &redef;
## Maximum depth of recursive file analysis (Spicy analyzers only)
const max_file_depth: count = 5 &redef;
# doc-options-end
# doc-types-start
## Result type for :zeek:see:`Spicy::resource_usage`. The values reflect resource
## usage as reported by the Spicy runtime system.
type ResourceUsage: record {
user_time : interval; ##< user CPU time of the Zeek process
system_time : interval; ##< system CPU time of the Zeek process
memory_heap : count; ##< memory allocated on the heap by the Zeek process
num_fibers : count; ##< number of fibers currently in use
max_fibers: count; ##< maximum number of fibers ever in use
max_fiber_stack_size: count; ##< maximum fiber stack size ever in use
cached_fibers: count; ##< number of fibers currently cached
};
# doc-types-end
}


@ -0,0 +1,85 @@
# doc-common-start
module Spicy;
export {
# doc-functions-start
## Enable a specific Spicy protocol analyzer if not already active. If this
## analyzer replaces a standard analyzer, that one will automatically be
## disabled.
##
## tag: analyzer to toggle
##
## Returns: true if the operation succeeded
global enable_protocol_analyzer: function(tag: Analyzer::Tag) : bool;
## Disable a specific Spicy protocol analyzer if not already inactive. If
## this analyzer replaces a standard analyzer, that one will automatically
## be re-enabled.
##
## tag: analyzer to toggle
##
## Returns: true if the operation succeeded
global disable_protocol_analyzer: function(tag: Analyzer::Tag) : bool;
## Enable a specific Spicy file analyzer if not already active. If this
## analyzer replaces a standard analyzer, that one will automatically be
## disabled.
##
## tag: analyzer to toggle
##
## Returns: true if the operation succeeded
global enable_file_analyzer: function(tag: Files::Tag) : bool;
## Disable a specific Spicy file analyzer if not already inactive. If
## this analyzer replaces a standard analyzer, that one will automatically
## be re-enabled.
##
## tag: analyzer to toggle
##
## Returns: true if the operation succeeded
global disable_file_analyzer: function(tag: Files::Tag) : bool;
## Returns current resource usage as reported by the Spicy runtime system.
global resource_usage: function() : ResourceUsage;
# doc-functions-end
}
# Marked with &is_used to suppress complaints when there aren't any
# Spicy file analyzers loaded, and hence this event can't be generated.
event spicy_analyzer_for_mime_type(a: Files::Tag, mt: string) &is_used
{
Files::register_for_mime_type(a, mt);
}
# Marked with &is_used to suppress complaints when there aren't any
# Spicy protocol analyzers loaded, and hence this event can't be generated.
event spicy_analyzer_for_port(a: Analyzer::Tag, p: port) &is_used
{
Analyzer::register_for_port(a, p);
}
function enable_protocol_analyzer(tag: Analyzer::Tag) : bool
{
return Spicy::__toggle_analyzer(tag, T);
}
function disable_protocol_analyzer(tag: Analyzer::Tag) : bool
{
return Spicy::__toggle_analyzer(tag, F);
}
function enable_file_analyzer(tag: Files::Tag) : bool
{
return Spicy::__toggle_analyzer(tag, T);
}
function disable_file_analyzer(tag: Files::Tag) : bool
{
return Spicy::__toggle_analyzer(tag, F);
}
function resource_usage() : ResourceUsage
{
return Spicy::__resource_usage();
}


@ -0,0 +1,91 @@
# Copyright (c) 2021 by the Zeek Project. See LICENSE for details.
#
# Trivial File Transfer Protocol
#
# Specs from https://tools.ietf.org/html/rfc1350
module TFTP;
import spicy;
# Common header for all messages:
#
# 2 bytes
# ---------------
# | TFTP Opcode |
# ---------------
public type Packet = unit { # public top-level entry point for parsing
op: uint16 &convert=Opcode($$);
switch ( self.op ) {
Opcode::RRQ -> rrq: Request(True);
Opcode::WRQ -> wrq: Request(False);
Opcode::DATA -> data: Data;
Opcode::ACK -> ack: Acknowledgement;
Opcode::ERROR -> error: Error;
};
};
# TFTP supports five types of packets [...]:
#
# opcode operation
# 1 Read request (RRQ)
# 2 Write request (WRQ)
# 3 Data (DATA)
# 4 Acknowledgment (ACK)
# 5 Error (ERROR)
type Opcode = enum {
RRQ = 0x01,
WRQ = 0x02,
DATA = 0x03,
ACK = 0x04,
ERROR = 0x05
};
# Figure 5-1: RRQ/WRQ packet
#
# 2 bytes string 1 byte string 1 byte
# ------------------------------------------------
# | Opcode | Filename | 0 | Mode | 0 |
# ------------------------------------------------
type Request = unit(is_read: bool) {
filename: bytes &until=b"\x00";
mode: bytes &until=b"\x00";
};
# Figure 5-2: DATA packet
#
# 2 bytes 2 bytes n bytes
# ----------------------------------
# | Opcode | Block # | Data |
# ----------------------------------
type Data = unit {
num: uint16;
data: bytes &eod;
};
# Figure 5-3: ACK packet
#
# 2 bytes 2 bytes
# ---------------------
# | Opcode | Block # |
# ---------------------
type Acknowledgement = unit {
num: uint16;
};
# Figure 5-4: ERROR packet
#
# 2 bytes 2 bytes string 1 byte
# -----------------------------------------
# | Opcode | ErrorCode | ErrMsg | 0 |
# -----------------------------------------
type Error = unit {
code: uint16;
msg: bytes &until=b"\x00";
};


@ -0,0 +1,16 @@
# Copyright (c) 2021 by the Zeek Project. See LICENSE for details.
#
# Note: When line numbers change in this file, update the documentation that pulls it in.
protocol analyzer spicy::TFTP over UDP:
parse with TFTP::Packet,
port 69/udp;
import TFTP;
on TFTP::Request if ( is_read ) -> event tftp::read_request($conn, $is_orig, self.filename, self.mode);
on TFTP::Request if ( ! is_read ) -> event tftp::write_request($conn, $is_orig, self.filename, self.mode);
on TFTP::Data -> event tftp::data($conn, $is_orig, self.num, self.data);
on TFTP::Acknowledgement -> event tftp::ack($conn, $is_orig, self.num);
on TFTP::Error -> event tftp::error($conn, $is_orig, self.code, self.msg);


@ -0,0 +1,95 @@
# Copyright (c) 2021 by the Zeek Project. See LICENSE for details.
#
# Trivial File Transfer Protocol
#
# Specs from https://tools.ietf.org/html/rfc1350
module TFTP;
import spicy;
# Common header for all messages:
#
# 2 bytes
# ---------------
# | TFTP Opcode |
# ---------------
public type Packet = unit {
# public top-level entry point for parsing
op: uint16 &convert=Opcode($$);
switch (self.op) {
Opcode::RRQ -> rrq: Request(True);
Opcode::WRQ -> wrq: Request(False);
Opcode::DATA -> data: Data;
Opcode::ACK -> ack: Acknowledgement;
Opcode::ERROR -> error: Error;
};
};
# TFTP supports five types of packets [...]:
#
# opcode operation
# 1 Read request (RRQ)
# 2 Write request (WRQ)
# 3 Data (DATA)
# 4 Acknowledgment (ACK)
# 5 Error (ERROR)
type Opcode = enum {
RRQ = 0x01,
WRQ = 0x02,
DATA = 0x03,
ACK = 0x04,
ERROR = 0x05,
};
# Figure 5-1: RRQ/WRQ packet
#
# 2 bytes string 1 byte string 1 byte
# ------------------------------------------------
# | Opcode | Filename | 0 | Mode | 0 |
# ------------------------------------------------
type Request = unit(is_read: bool) {
filename: bytes &until=b"\x00";
mode: bytes &until=b"\x00";
on %done {
spicy::accept_input();
}
};
# Figure 5-2: DATA packet
#
# 2 bytes 2 bytes n bytes
# ----------------------------------
# | Opcode | Block # | Data |
# ----------------------------------
type Data = unit {
num: uint16;
data: bytes &eod;
};
# Figure 5-3: ACK packet
#
# 2 bytes 2 bytes
# ---------------------
# | Opcode | Block # |
# ---------------------
type Acknowledgement = unit {
num: uint16;
};
# Figure 5-4: ERROR packet
#
# 2 bytes 2 bytes string 1 byte
# -----------------------------------------
# | Opcode | ErrorCode | ErrMsg | 0 |
# -----------------------------------------
type Error = unit {
code: uint16;
msg: bytes &until=b"\x00";
};


@ -0,0 +1,162 @@
# Copyright (c) 2021 by the Zeek Project. See LICENSE for details.
module TFTP;
export {
redef enum Log::ID += { LOG };
type Info: record {
## Timestamp for when the request happened.
ts: time &log;
## Unique ID for the connection.
uid: string &log;
## The connection's 4-tuple of endpoint addresses/ports.
id: conn_id &log;
## True for write requests, False for read requests.
wrq: bool &log;
## File name of request.
fname: string &log;
## Mode of request.
mode: string &log;
## UID of data connection
uid_data: string &optional &log;
## Number of bytes sent.
size: count &default=0 &log;
## Highest block number sent.
block_sent: count &default=0 &log;
## Highest block number acknowledged.
block_acked: count &default=0 &log;
## Any error code encountered.
error_code: count &optional &log;
## Any error message encountered.
error_msg: string &optional &log;
# Set to block number of final piece of data once received.
final_block: count &optional;
# Set to true once logged.
done: bool &default=F;
};
## Event that can be handled to access the TFTP logging record.
global log_tftp: event(rec: Info);
}
# Maps a partial data connection ID to the request's Info record.
global expected_data_conns: table[addr, port, addr] of Info;
redef record connection += {
tftp: Info &optional;
};
event zeek_init() &priority=5
{
Log::create_stream(TFTP::LOG, [$columns = Info, $ev = log_tftp, $path="tftp"]);
}
function log_pending(c: connection)
{
if ( ! c?$tftp || c$tftp$done )
return;
Log::write(TFTP::LOG, c$tftp);
c$tftp$done = T;
}
function init_request(c: connection, is_orig: bool, fname: string, mode: string, is_read: bool)
{
log_pending(c);
local info: Info;
info$ts = network_time();
info$uid = c$uid;
info$id = c$id;
info$fname = fname;
info$mode = mode;
info$wrq = (! is_read);
c$tftp = info;
# The data will come in from a different source port.
Analyzer::schedule_analyzer(c$id$resp_h, c$id$orig_h, c$id$orig_p, Analyzer::ANALYZER_SPICY_TFTP, 1min);
expected_data_conns[c$id$resp_h, c$id$orig_p, c$id$orig_h] = info;
}
event scheduled_analyzer_applied(c: connection, a: Analyzer::Tag) &priority=10
{
local id = c$id;
if ( [c$id$orig_h, c$id$resp_p, c$id$resp_h] in expected_data_conns )
{
c$tftp = expected_data_conns[c$id$orig_h, c$id$resp_p, c$id$resp_h];
c$tftp$uid_data = c$uid;
add c$service["spicy_tftp_data"];
}
}
event tftp::read_request(c: connection, is_orig: bool, fname: string, mode: string)
{
init_request(c, is_orig, fname, mode, T);
}
event tftp::write_request(c: connection, is_orig: bool, fname: string, mode: string)
{
init_request(c, is_orig, fname, mode, F);
}
event tftp::data(c: connection, is_orig: bool, block_num: count, data: string)
{
if ( ! c?$tftp || c$tftp$done )
return;
local info = c$tftp;
if ( block_num <= info$block_sent )
# Duplicate (or previous gap; we don't track that)
return;
info$size += |data|;
info$block_sent = block_num;
if ( |data| < 512 )
# Last block, per spec.
info$final_block = block_num;
}
event tftp::ack(c: connection, is_orig: bool, block_num: count)
{
if ( ! c?$tftp || c$tftp$done )
return;
local info = c$tftp;
if ( block_num <= info$block_acked )
# Duplicate (or previous gap, we don't track that)
return;
info$block_acked = block_num;
# If it's an ack for the last block, we're done.
if ( info?$final_block && info$final_block == block_num )
log_pending(c);
}
event tftp::error(c: connection, is_orig: bool, code: count, msg: string)
{
if ( ! c?$tftp || c$tftp$done )
return;
local info = c$tftp;
info$error_code = code;
info$error_msg = msg;
log_pending(c);
}
event connection_state_remove(c: connection)
{
if ( ! c?$tftp || c$tftp$done )
return;
log_pending(c);
}


@ -0,0 +1,736 @@
.. _spicy_confirm_protocol:
.. rubric:: ``function zeek::confirm_protocol()``
[Deprecated] Triggers a DPD protocol confirmation for the current connection.
This function has been deprecated and will be removed. Use ``spicy::accept_input``
instead, which will have the same effect with Zeek.
.. _spicy_reject_protocol:
.. rubric:: ``function zeek::reject_protocol(reason: string)``
[Deprecated] Triggers a DPD protocol violation for the current connection.
This function has been deprecated and will be removed. Use ``spicy::decline_input``
instead, which will have the same effect with Zeek.
.. _spicy_weird:
.. rubric:: ``function zeek::weird(id: string, addl: string = "")``
Reports a "weird" to Zeek. This should be used with similar semantics as in
Zeek: something quite unexpected happening at the protocol level, which however
does not prevent us from continuing to process the connection.
id: the name of the weird, which (just like in Zeek) should be a *static*
string identifying the situation reported (e.g., ``unexpected_command``).
addl: additional information to record along with the weird
.. _spicy_is_orig:
.. rubric:: ``function zeek::is_orig() : bool``
Returns true if we're currently parsing the originator side of a connection.
.. _spicy_uid:
.. rubric:: ``function zeek::uid() : string``
Returns the current connection's UID.
.. _spicy_conn_id:
.. rubric:: ``function zeek::conn_id() : tuple<orig_h: addr, orig_p: port, resp_h: addr, resp_p: port>``
Returns the current connection's 4-tuple ID to make IP address and port information available.
.. _spicy_flip_roles:
.. rubric:: ``function zeek::flip_roles()``
Instructs Zeek to flip the directionality of the current connection.
.. _spicy_number_packets:
.. rubric:: ``function zeek::number_packets() : uint64``
Returns the number of packets seen so far on the current side of the current connection.
.. _spicy_has_analyzer:
.. rubric:: ``function zeek::has_analyzer(analyzer: string, if_enabled: bool = True) : bool``
Checks if there is a Zeek analyzer of a given name.
analyzer: the Zeek-side name of the analyzer to check for
if_enabled: if true, only checks for analyzers that are enabled
Returns true if such an analyzer exists, false otherwise.
.. _spicy_analyzer_type:
.. rubric:: ``function zeek::analyzer_type(analyzer: string, if_enabled: bool = True) : AnalyzerType``
Returns the type of a Zeek analyzer of a given name.
analyzer: the Zeek-side name of the analyzer to check
if_enabled: if true, only checks for analyzers that are enabled
Returns the type of the analyzer if it exists, or ``Undef`` if it does not.
.. _spicy_protocol_begin:
.. rubric:: ``function zeek::protocol_begin(analyzer: optional<string>, protocol: spicy::Protocol = spicy::Protocol::TCP)``
Adds a Zeek-side child protocol analyzer to the current connection.
If the same analyzer was added previously with `protocol_handle_get_or_create` or
`protocol_begin` with same argument, and not closed with `protocol_handle_close`
or `protocol_end`, no new analyzer will be added.
See `protocol_handle_get_or_create` for lifetime and error semantics.
analyzer: type of analyzer to instantiate, specified through its Zeek-side
name (similar to what Zeek's signature action `enable` takes)
protocol: the transport-layer protocol that the analyzer uses; only TCP is
currently supported here
Note: For backwards compatibility, the analyzer argument can be left unset to add
a DPD analyzer. This use is deprecated, though; use the single-argument version of
`protocol_begin` for that instead.
.. _spicy_protocol_begin_2:
.. rubric:: ``function zeek::protocol_begin(protocol: spicy::Protocol = spicy::Protocol::TCP)``
Adds a Zeek-side DPD child protocol analyzer performing dynamic protocol detection
on subsequently provided data.
If the same DPD analyzer was added previously with `protocol_handle_get_or_create` or
`protocol_begin` with same argument, and not closed with `protocol_handle_close`
or `protocol_end`, no new analyzer will be added.
See `protocol_handle_get_or_create` for lifetime and error semantics.
protocol: the transport-layer protocol on which to perform protocol detection;
only TCP is currently supported here
.. _spicy_protocol_handle_get_or_create:
.. rubric:: ``function zeek::protocol_handle_get_or_create(analyzer: string, protocol: spicy::Protocol = spicy::Protocol::TCP) : ProtocolHandle``
Gets a handle to a Zeek-side child protocol analyzer for the current connection.
If no such child exists yet it will be added; otherwise a handle to the
existing child protocol analyzer will be returned.
This function will return an error if:
- not called from a protocol analyzer, or
- the requested child protocol analyzer is of unknown type or not supported by the requested transport protocol, or
- creation of a child analyzer of the requested type was prevented by a
previous call of `disable_analyzer` with `prevent=T`
By default, any newly created child protocol analyzer will remain alive
until Zeek expires the current connection's state. Alternatively, one
can call `protocol_handle_close` or `protocol_end` to delete the analyzer
earlier.
analyzer: type of analyzer to get or instantiate, specified through its Zeek-side
name (similar to what Zeek's signature action `enable` takes).
protocol: the transport-layer protocol that the analyzer uses; only TCP is
currently supported here
.. _spicy_protocol_data_in:
.. rubric:: ``function zeek::protocol_data_in(is_orig: bool, data: bytes, protocol: spicy::Protocol = spicy::Protocol::TCP)``
Forwards protocol data to all previously instantiated Zeek-side child protocol analyzers of a given transport-layer.
is_orig: true to feed the data to the child's originator side, false for the responder
data: chunk of data to forward to child analyzer
protocol: the transport-layer protocol of the children to forward to; only TCP is currently supported here
.. _spicy_protocol_data_in_2:
.. rubric:: ``function zeek::protocol_data_in(is_orig: bool, data: bytes, h: ProtocolHandle)``
Forwards protocol data to a specific previously instantiated Zeek-side child analyzer.
is_orig: true to feed the data to the child's originator side, false for the responder
data: chunk of data to forward to child analyzer
h: handle to the child analyzer to forward data into
.. _spicy_protocol_gap:
.. rubric:: ``function zeek::protocol_gap(is_orig: bool, offset: uint64, len: uint64, h: optional<ProtocolHandle> = Null)``
Signals a gap in input data to all previously instantiated Zeek-side child protocol analyzers.
is_orig: true to signal gap to the child's originator side, false for the responder
offset: start offset of gap in input stream
len: size of gap
h: optional handle to the child analyzer to signal a gap to; if not given, the gap is signaled to all child analyzers
.. _spicy_protocol_end:
.. rubric:: ``function zeek::protocol_end()``
Signals end-of-data to all previously instantiated Zeek-side child protocol
analyzers and removes them.
.. _spicy_protocol_handle_close:
.. rubric:: ``function zeek::protocol_handle_close(handle: ProtocolHandle)``
Signals end-of-data to the given child analyzer and removes it.
The given handle must be live, i.e., it must not have been used in a
previous protocol_handle_close call, and must not have been live when
protocol_end was called. If the handle is not live a runtime error will
be triggered.
handle: handle to the child analyzer to remove
.. _spicy_file_begin:
.. rubric:: ``function zeek::file_begin(mime_type: optional<string> = Null, fuid: optional<string> = Null) : string``
Signals the beginning of a file to Zeek's file analysis, associating it with the current connection.
Optionally, a mime type can be provided. It will be passed on to Zeek's file analysis framework.
Optionally, a file ID can be provided. It will be passed on to Zeek's file analysis framework.
Returns the Zeek-side file ID of the new file.
This function creates a new Zeek file analyzer that will remain alive until
either `file_end` gets called, or Zeek eventually expires the analyzer
through a timeout. (As Zeek does not tie a file analyzer's lifetime to any
connection, it may survive the termination of the current connection.)
.. _spicy_fuid:
.. rubric:: ``function zeek::fuid() : string``
Returns the current file's FUID.
.. _spicy_terminate_session:
.. rubric:: ``function zeek::terminate_session()``
Terminates the currently active Zeek-side session, flushing all state. Any
subsequent activity will start a new session from scratch. This can only be
called from inside a protocol analyzer.
.. _spicy_skip_input:
.. rubric:: ``function zeek::skip_input()``
Tells Zeek to skip sending any further input data to the current analyzer.
This is supported for protocol and file analyzers.
.. _spicy_file_set_size:
.. rubric:: ``function zeek::file_set_size(size: uint64, fid: optional<string> = Null)``
Signals the expected size of a file to Zeek's file analysis.
size: expected size of file
fid: Zeek-side ID of the file to operate on; if not given, the file started by the most recent file_begin() will be used
.. _spicy_file_data_in:
.. rubric:: ``function zeek::file_data_in(data: bytes, fid: optional<string> = Null)``
Passes file content on to Zeek's file analysis.
data: chunk of raw data to pass into analysis
fid: Zeek-side ID of the file to operate on; if not given, the file started by the most recent file_begin() will be used
.. _spicy_file_data_in_at_offset:
.. rubric:: ``function zeek::file_data_in_at_offset(data: bytes, offset: uint64, fid: optional<string> = Null)``
Passes file content at a specific offset on to Zeek's file analysis.
data: chunk of raw data to pass into analysis
offset: position in file where data starts
fid: Zeek-side ID of the file to operate on; if not given, the file started by the most recent file_begin() will be used
.. _spicy_file_gap:
.. rubric:: ``function zeek::file_gap(offset: uint64, len: uint64, fid: optional<string> = Null)``
Signals a gap in a file to Zeek's file analysis.
offset: position in file where gap starts
len: size of gap
fid: Zeek-side ID of the file to operate on; if not given, the file started by the most recent file_begin() will be used
.. _spicy_file_end:
.. rubric:: ``function zeek::file_end(fid: optional<string> = Null)``
Signals the end of a file to Zeek's file analysis.
fid: Zeek-side ID of the file to operate on; if not given, the file started by the most recent file_begin() will be used
.. _spicy_forward_packet:
.. rubric:: ``function zeek::forward_packet(identifier: uint32)``
Inside a packet analyzer, forwards what data remains after parsing the top-level unit
on to another analyzer. The identifier specifies the target, per the current dispatcher table.
.. _spicy_network_time:
.. rubric:: ``function zeek::network_time() : time``
Gets the network time from Zeek.
.. _spicy_get_address:
.. rubric:: ``function zeek::get_address(id: string) : addr``
Returns the value of a global Zeek script variable of Zeek type ``addr``.
Throws an exception if there's no Zeek variable of that name, or if it's not of
the expected type.
id: fully-qualified name of the global Zeek variable to retrieve
.. _spicy_get_bool:
.. rubric:: ``function zeek::get_bool(id: string) : bool``
Returns the value of a global Zeek script variable of Zeek type ``bool``.
Throws an exception if there's no Zeek variable of that name, or if it's not of
the expected type.
id: fully-qualified name of the global Zeek variable to retrieve
.. _spicy_get_count:
.. rubric:: ``function zeek::get_count(id: string) : uint64``
Returns the value of a global Zeek script variable of Zeek type ``count``.
Throws an exception if there's no Zeek variable of that name, or if it's not of
the expected type.
id: fully-qualified name of the global Zeek variable to retrieve
.. _spicy_get_double:
.. rubric:: ``function zeek::get_double(id: string) : real``
Returns the value of a global Zeek script variable of Zeek type ``double``.
Throws an exception if there's no Zeek variable of that name, or if it's not of
the expected type.
id: fully-qualified name of the global Zeek variable to retrieve
.. _spicy_get_enum:
.. rubric:: ``function zeek::get_enum(id: string) : string``
Returns the value of a global Zeek script variable of Zeek type ``enum``.
The value is returned as a string containing the enum's label name, without
any scope. Throws an exception if there's no Zeek variable of that name, or if
it's not of the expected type.
id: fully-qualified name of the global Zeek variable to retrieve
.. _spicy_get_int:
.. rubric:: ``function zeek::get_int(id: string) : int64``
Returns the value of a global Zeek script variable of Zeek type ``int``.
Throws an exception if there's no Zeek variable of that name, or if it's not of
the expected type.
id: fully-qualified name of the global Zeek variable to retrieve
.. _spicy_get_interval:
.. rubric:: ``function zeek::get_interval(id: string) : interval``
Returns the value of a global Zeek script variable of Zeek type
``interval``. Throws an exception if there's no Zeek variable of that name, or
if it's not of the expected type.
id: fully-qualified name of the global Zeek variable to retrieve
.. _spicy_get_port:
.. rubric:: ``function zeek::get_port(id: string) : port``
Returns the value of a global Zeek script variable of Zeek type ``port``.
Throws an exception if there's no Zeek variable of that name, or if it's not of
the expected type.
id: fully-qualified name of the global Zeek variable to retrieve
.. _spicy_get_record:
.. rubric:: ``function zeek::get_record(id: string) : ZeekRecord``
Returns the value of a global Zeek script variable of Zeek type ``record``.
The value is returned as an opaque handle to the record, which can be used
with the ``zeek::record_*()`` functions to access the record's fields.
Throws an exception if there's no Zeek variable of that name, or if it's not of
the expected type.
id: fully-qualified name of the global Zeek variable to retrieve
.. _spicy_get_set:
.. rubric:: ``function zeek::get_set(id: string) : ZeekSet``
Returns the value of a global Zeek script variable of Zeek type ``set``. The
value is returned as an opaque handle to the set, which can be used with the
``zeek::set_*()`` functions to access the set's content. Throws an exception
if there's no Zeek variable of that name, or if it's not of the expected type.
id: fully-qualified name of the global Zeek variable to retrieve
.. _spicy_get_string:
.. rubric:: ``function zeek::get_string(id: string) : bytes``
Returns the value of a global Zeek script variable of Zeek type ``string``.
The string's value is returned as a Spicy ``bytes`` value. Throws an
exception if there's no Zeek variable of that name, or if it's not of the
expected type.
id: fully-qualified name of the global Zeek variable to retrieve
.. _spicy_get_subnet:
.. rubric:: ``function zeek::get_subnet(id: string) : network``
Returns the value of a global Zeek script variable of Zeek type ``subnet``.
Throws an exception if there's no Zeek variable of that name, or if it's not of
the expected type.
id: fully-qualified name of the global Zeek variable to retrieve
.. _spicy_get_table:
.. rubric:: ``function zeek::get_table(id: string) : ZeekTable``
Returns the value of a global Zeek script variable of Zeek type ``table``.
The value is returned as an opaque handle to the table, which can be used with
the ``zeek::table_*()`` functions to access the table's content. Throws an
exception if there's no Zeek variable of that name, or if it's not of the
expected type.
id: fully-qualified name of the global Zeek variable to retrieve
.. _spicy_get_time:
.. rubric:: ``function zeek::get_time(id: string) : time``
Returns the value of a global Zeek script variable of Zeek type ``time``.
Throws an exception if there's no Zeek variable of that name, or if it's not of
the expected type.
id: fully-qualified name of the global Zeek variable to retrieve
.. _spicy_get_vector:
.. rubric:: ``function zeek::get_vector(id: string) : ZeekVector``
Returns the value of a global Zeek script variable of Zeek type ``vector``.
The value is returned as an opaque handle to the vector, which can be used
with the ``zeek::vector_*()`` functions to access the vector's content.
Throws an exception if there's no Zeek variable of that name, or if it's not of
the expected type.
id: fully-qualified name of the global Zeek variable to retrieve
.. _spicy_get_value:
.. rubric:: ``function zeek::get_value(id: string) : ZeekVal``
Returns an opaque handle to a global Zeek script variable. The handle can be
used with the ``zeek::as_*()`` functions to access the variable's value.
Throws an exception if there's no Zeek variable of that name.
.. _spicy_as_address:
.. rubric:: ``function zeek::as_address(v: ZeekVal) : addr``
Returns a Zeek ``addr`` value referenced by an opaque handle. Throws an
exception if the referenced value is not of the expected type.
.. _spicy_as_bool:
.. rubric:: ``function zeek::as_bool(v: ZeekVal) : bool``
Returns a Zeek ``bool`` value referenced by an opaque handle. Throws an
exception if the referenced value is not of the expected type.
.. _spicy_as_count:
.. rubric:: ``function zeek::as_count(v: ZeekVal) : uint64``
Returns a Zeek ``count`` value referenced by an opaque handle. Throws an
exception if the referenced value is not of the expected type.
.. _spicy_as_double:
.. rubric:: ``function zeek::as_double(v: ZeekVal) : real``
Returns a Zeek ``double`` value referenced by an opaque handle. Throws an
exception if the referenced value is not of the expected type.
.. _spicy_as_enum:
.. rubric:: ``function zeek::as_enum(v: ZeekVal) : string``
Returns a Zeek ``enum`` value referenced by an opaque handle. Throws an
exception if the referenced value is not of the expected type.
.. _spicy_as_int:
.. rubric:: ``function zeek::as_int(v: ZeekVal) : int64``
Returns a Zeek ``int`` value referenced by an opaque handle. Throws an
exception if the referenced value is not of the expected type.
.. _spicy_as_interval:
.. rubric:: ``function zeek::as_interval(v: ZeekVal) : interval``
Returns a Zeek ``interval`` value referenced by an opaque handle. Throws an
exception if the referenced value is not of the expected type.
.. _spicy_as_port:
.. rubric:: ``function zeek::as_port(v: ZeekVal) : port``
Returns a Zeek ``port`` value referenced by an opaque handle. Throws an
exception if the referenced value is not of the expected type.
.. _spicy_as_record:
.. rubric:: ``function zeek::as_record(v: ZeekVal) : ZeekRecord``
Returns a Zeek ``record`` value referenced by an opaque handle. Throws an
exception if the referenced value is not of the expected type.
.. _spicy_as_set:
.. rubric:: ``function zeek::as_set(v: ZeekVal) : ZeekSet``
Returns a Zeek ``set`` value referenced by an opaque handle. Throws an
exception if the referenced value is not of the expected type.
.. _spicy_as_string:
.. rubric:: ``function zeek::as_string(v: ZeekVal) : bytes``
Returns a Zeek ``string`` value referenced by an opaque handle. The string's
value is returned as a Spicy ``bytes`` value. Throws an exception if the
referenced value is not of the expected type.
.. _spicy_as_subnet:
.. rubric:: ``function zeek::as_subnet(v: ZeekVal) : network``
Returns a Zeek ``subnet`` value referenced by an opaque handle. Throws an
exception if the referenced value is not of the expected type.
.. _spicy_as_table:
.. rubric:: ``function zeek::as_table(v: ZeekVal) : ZeekTable``
Returns a Zeek ``table`` value referenced by an opaque handle. Throws an
exception if the referenced value is not of the expected type.
.. _spicy_as_time:
.. rubric:: ``function zeek::as_time(v: ZeekVal) : time``
Returns a Zeek ``time`` value referenced by an opaque handle. Throws an
exception if the referenced value is not of the expected type.
.. _spicy_as_vector:
.. rubric:: ``function zeek::as_vector(v: ZeekVal) : ZeekVector``
Returns a Zeek ``vector`` value referenced by an opaque handle. Throws an
exception if the referenced value is not of the expected type.
.. _spicy_set_contains:
.. rubric:: ``function zeek::set_contains(id: string, v: any) : bool``
Returns true if a Zeek set contains a given value. Throws an exception if
the given ID does not exist, or does not have the expected type.
id: fully-qualified name of the global Zeek set to check
v: value to check for, which must be of the Spicy-side equivalent of the set's key type
.. _spicy_set_contains_2:
.. rubric:: ``function zeek::set_contains(s: ZeekSet, v: any) : bool``
Returns true if a Zeek set contains a given value. Throws an exception if
the set does not have the expected type.
s: opaque handle to the Zeek set, as returned by other functions
v: value to check for, which must be of the Spicy-side equivalent of the set's key type
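
A membership test against a global Zeek set might look like the sketch below. The identifier ``Example::watched_methods`` is hypothetical and assumed to be a Zeek ``set[string]``; the key is passed as Spicy ``bytes`` on the assumption that this is the Spicy-side equivalent of a Zeek ``string``.

.. code-block:: spicy

    module Example;

    import zeek;

    # Hypothetical helper. Assumes a Zeek script declares
    # `Example::watched_methods` as a `set[string]`.
    function is_watched(method: bytes) : bool {
        # Throws if the ID does not exist or is not of the expected type.
        return zeek::set_contains("Example::watched_methods", method);
    }
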
.. _spicy_table_contains:
.. rubric:: ``function zeek::table_contains(id: string, v: any) : bool``
Returns true if a Zeek table contains a given value. Throws an exception if
the given ID does not exist, or does not have the expected type.
id: fully-qualified name of the global Zeek table to check
v: value to check for, which must be of the Spicy-side equivalent of the table's key type
.. _spicy_table_contains_2:
.. rubric:: ``function zeek::table_contains(t: ZeekTable, v: any) : bool``
Returns true if a Zeek table contains a given value. Throws an exception if
the table does not have the expected type.
t: opaque handle to the Zeek table, as returned by other functions
v: value to check for, which must be of the Spicy-side equivalent of the table's key type
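
The handle-based variants avoid naming the global on every call. As a sketch, assuming a hypothetical Zeek table ``Example::known_hosts`` keyed by ``addr``, the handle is obtained once through ``zeek::get_value`` and ``zeek::as_table``:

.. code-block:: spicy

    module Example;

    import zeek;

    # Hypothetical helper. Assumes a Zeek script declares
    # `Example::known_hosts` as a table indexed by `addr`.
    function is_known(host: addr) : bool {
        # Resolve the global once and convert it to a table handle.
        local t = zeek::as_table(zeek::get_value("Example::known_hosts"));

        # Query the table through the opaque handle.
        return zeek::table_contains(t, host);
    }
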
.. _spicy_table_lookup:
.. rubric:: ``function zeek::table_lookup(id: string, v: any) : optional<ZeekVal>``
Returns the value associated with a key in a Zeek table. Returns an error
result if the key does not exist in the table. Throws an exception if the
given table ID does not exist, or does not have the expected type.
id: fully-qualified name of the global Zeek table to check
v: value to lookup, which must be of the Spicy-side equivalent of the table's key type
.. _spicy_table_lookup_2:
.. rubric:: ``function zeek::table_lookup(t: ZeekTable, v: any) : optional<ZeekVal>``
Returns the value associated with a key in a Zeek table. Returns an error
result if the key does not exist in the table. Throws an exception if the
table does not have the expected type.
t: opaque handle to the Zeek table, as returned by other functions
v: value to lookup, which must be of the Spicy-side equivalent of the table's key type
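
Since ``zeek::table_lookup`` returns an ``optional<ZeekVal>``, the result needs to be checked before use. The sketch below assumes that a missing key yields an unset optional, and that a Zeek script declares a hypothetical ``Example::limits`` of type ``table[string] of count``; the fallback value ``0`` is an arbitrary choice for illustration.

.. code-block:: spicy

    module Example;

    import zeek;

    # Hypothetical helper. Assumes `Example::limits` is a Zeek
    # `table[string] of count`.
    function limit_for(name: bytes) : uint64 {
        local v = zeek::table_lookup("Example::limits", name);

        # Assumption: an unset optional signals a missing key.
        if ( v )
            return zeek::as_count(*v);

        # Arbitrary fallback for unknown keys.
        return 0;
    }
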
.. _spicy_record_has_value:
.. rubric:: ``function zeek::record_has_value(id: string, field: string) : bool``
Returns true if a Zeek record provides a value for a given field. This
includes fields with ``&default`` values. Throws an exception if the given ID
does not exist, or does not have the expected type.
id: fully-qualified name of the global Zeek record to check
field: name of the field to check
.. _spicy_record_has_value_2:
.. rubric:: ``function zeek::record_has_value(r: ZeekRecord, field: string) : bool``
Returns true if a Zeek record provides a value for a given field.
This includes fields with ``&default`` values.
r: opaque handle to the Zeek record, as returned by other functions
field: name of the field to check
.. _spicy_record_has_field:
.. rubric:: ``function zeek::record_has_field(id: string, field: string) : bool``
Returns true if the type of a Zeek record has a field of a given name.
Throws an exception if the given ID does not exist, or does not have the
expected type.
id: fully-qualified name of the global Zeek record to check
field: name of the field to check
.. _spicy_record_has_field_2:
.. rubric:: ``function zeek::record_has_field(r: ZeekRecord, field: string) : bool``
Returns true if the type of a Zeek record has a field of a given name.
r: opaque handle to the Zeek record, as returned by other functions
field: name of the field to check
.. _spicy_record_field:
.. rubric:: ``function zeek::record_field(id: string, field: string) : ZeekVal``
Returns a field's value from a Zeek record. Throws an exception if the given
ID does not exist, or does not have the expected type; or if there's no such
field in the record type, or if the field does not have a value.
id: fully-qualified name of the global Zeek record to check
field: name of the field to retrieve
.. _spicy_record_field_2:
.. rubric:: ``function zeek::record_field(r: ZeekRecord, field: string) : ZeekVal``
Returns a field's value from a Zeek record. Throws an exception if the given
record does not have such a field, or if the field does not have a value.
r: opaque handle to the Zeek record, as returned by other functions
field: name of the field to retrieve
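
Because ``zeek::record_field`` throws when a field has no value, it can be guarded with ``zeek::record_has_value``. The following sketch assumes a hypothetical global Zeek record ``Example::config`` with an ``&optional`` field ``listen_port`` of type ``port``; the fallback port is likewise made up.

.. code-block:: spicy

    module Example;

    import zeek;

    # Hypothetical helper. Assumes `Example::config` is a global Zeek
    # record with a `listen_port: port &optional` field.
    function listen_port() : port {
        # Avoid the exception that record_field() raises for unset fields.
        if ( ! zeek::record_has_value("Example::config", "listen_port") )
            return 12345/tcp;

        return zeek::as_port(zeek::record_field("Example::config", "listen_port"));
    }
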
.. _spicy_vector_index:
.. rubric:: ``function zeek::vector_index(id: string, index: uint64) : ZeekVal``
Returns the value of an index in a Zeek vector. Throws an exception if the
given ID does not exist, or does not have the expected type; or if the index
is out of bounds.
id: fully-qualified name of the global Zeek vector to check
index: index of the element to retrieve
.. _spicy_vector_index_2:
.. rubric:: ``function zeek::vector_index(v: ZeekVector, index: uint64) : ZeekVal``
Returns the value of an index in a Zeek vector. Throws an exception if the
index is out of bounds.
v: opaque handle to the Zeek vector, as returned by other functions
index: index of the element to retrieve
.. _spicy_vector_size:
.. rubric:: ``function zeek::vector_size(id: string) : uint64``
Returns the size of a Zeek vector. Throws an exception if the given ID does
not exist, or does not have the expected type.
id: fully-qualified name of the global Zeek vector to check
.. _spicy_vector_size_2:
.. rubric:: ``function zeek::vector_size(v: ZeekVector) : uint64``
Returns the size of a Zeek vector.
v: opaque handle to the Zeek vector, as returned by other functions
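
Iterating over a Zeek vector from Spicy can be sketched by combining ``zeek::vector_size`` and ``zeek::vector_index``. The identifier ``Example::weights`` is hypothetical and assumed to be a Zeek ``vector of count``.

.. code-block:: spicy

    module Example;

    import zeek;

    # Hypothetical helper. Assumes `Example::weights` is a Zeek
    # `vector of count`.
    function sum_weights() : uint64 {
        local total: uint64 = 0;
        local i: uint64 = 0;
        local n = zeek::vector_size("Example::weights");

        # Staying below `n` avoids the out-of-bounds exception.
        while ( i < n ) {
            total = total + zeek::as_count(zeek::vector_index("Example::weights", i));
            i = i + 1;
        }

        return total;
    }
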

@ -0,0 +1,5 @@
protocol analyzer spicy::MyHTTP over TCP:
parse originator with MyHTTP::RequestLine,
port 12345/tcp;
on MyHTTP::RequestLine -> event MyHTTP::request_line($conn, self.method, self.uri, self.version.number);

@ -0,0 +1,26 @@
# @TEST-EXEC: echo "GET /index.html HTTP/1.0" | spicy-driver %INPUT >output
# @TEST-EXEC: btest-diff output
module MyHTTP;
const Token = /[^ \t\r\n]+/;
const WhiteSpace = /[ \t]+/;
const NewLine = /\r?\n/;
type Version = unit {
: /HTTP\//;
number: /[0-9]+\.[0-9]+/;
};
public type RequestLine = unit {
method: Token;
: WhiteSpace;
uri: Token;
: WhiteSpace;
version: Version;
: NewLine;
on %done {
print self.method, self.uri, self.version.number;
}
};

@ -0,0 +1,4 @@
event MyHTTP::request_line(c: connection, method: string, uri: string, version: string)
{
print fmt("Zeek saw from %s: %s %s %s", c$id$orig_h, method, uri, version);
}

Binary file not shown.

@ -0,0 +1,37 @@
function schedule_tftp_analyzer(id: conn_id)
{
# Schedule the TFTP analyzer for the expected next packet coming in on different
# ports. We know that it will be exchanged between same IPs and reuse the
# originator's port. "Spicy_TFTP" is the Zeek-side name of the TFTP analyzer
# (generated from "Spicy::TFTP" in tftp.evt).
Analyzer::schedule_analyzer(id$resp_h, id$orig_h, id$orig_p, Analyzer::ANALYZER_SPICY_TFTP, 1min);
}
event tftp::read_request(c: connection, is_orig: bool, filename: string, mode: string)
{
print "TFTP read request", c$id, filename, mode;
schedule_tftp_analyzer(c$id);
}
event tftp::write_request(c: connection, is_orig: bool, filename: string, mode: string)
{
print "TFTP write request", c$id, filename, mode;
schedule_tftp_analyzer(c$id);
}
# Add handlers for other packet types so that we see their events being generated.
event tftp::data(c: connection, is_orig: bool, block_num: count, data: string)
{
print "TFTP data", block_num, data;
}
event tftp::ack(c: connection, is_orig: bool, block_num: count)
{
print "TFTP ack", block_num;
}
event tftp::error(c: connection, is_orig: bool, code: count, msg: string)
{
print "TFTP error", code, msg;
}

@ -0,0 +1,7 @@
protocol analyzer spicy::TFTP over UDP:
parse with TFTP::Packet,
port 69/udp;
import TFTP;
on TFTP::Request -> event tftp::request($conn, $is_orig, self.filename, self.mode);

@ -0,0 +1,4 @@
event tftp::request(c: connection, is_orig: bool, filename: string, mode: string)
{
print "TFTP request", c$id, is_orig, filename, mode;
}

@ -0,0 +1,7 @@
protocol analyzer spicy::TFTP over UDP:
parse with TFTP::Packet,
port 69/udp;
import TFTP;
on TFTP::Request -> event tftp::request($conn);

@ -0,0 +1,4 @@
event tftp::request(c: connection)
{
print "TFTP request", c$id;
}

@ -0,0 +1,9 @@
event tftp::read_request(c: connection, is_orig: bool, filename: string, mode: string)
{
print "TFTP read request", c$id, is_orig, filename, mode;
}
event tftp::write_request(c: connection, is_orig: bool, filename: string, mode: string)
{
print "TFTP write request", c$id, is_orig, filename, mode;
}

doc/devel/spicy/faq.rst Normal file

@ -0,0 +1,88 @@
===
FAQ
===
.. _faq_zeek_install_spicy_and_plugin_to_use_parsers:
.. rubric:: Do I need to install Spicy and/or a Zeek plugin to use Spicy parsers in Zeek?
If you're using Zeek >= 5.0 with a default build configuration,
there's nothing else you need to install. After installing Zeek, the
same folder containing the ``zeek`` binary will also have the relevant
Spicy tools, such as ``spicyc`` (provided by Spicy) and ``spicyz``
(provided by Zeek). To double check that the Spicy support is indeed
available, look for ``Zeek::Spicy`` in the output of ``zeek -N``::
# zeek -N
<...>
Zeek::Spicy - Support for Spicy parsers (``*.spicy``, ``*.evt``, ``*.hlto``) (built-in)
Note that it remains possible to build Zeek against an external Spicy
installation, or even without any Spicy support at all. Look at Zeek's
``configure`` for corresponding options.
.. note::
For some historic background: Zeek 5.0 started bundling Spicy, as well
as the former Zeek plugin for Spicy, so that now nothing else needs to
be installed separately anymore to use Spicy parsers. Since Zeek 6.0,
the code for that former plugin has further moved into Zeek itself,
and is now maintained directly by the Zeek developers.
.. _faq_zeek_spicy_dpd_support:
.. rubric:: Does Spicy support *Dynamic Protocol Detection (DPD)*?
Yes, see the :ref:`corresponding section <spicy_dpd>` on how to add it
to your analyzers.
.. _faq_zeek_layer2_analyzer:
.. rubric:: Can I write a Layer 2 protocol analyzer with Spicy?
Yes, you can. In Zeek terminology a layer 2 protocol analyzer is a packet
analyzer, see the :ref:`corresponding section <spicy_packet_analyzer>` on how
to declare such an analyzer.
.. _faq_zeek_print_statements_no_effect:
.. rubric:: I have ``print`` statements in my Spicy grammar, why do I not see any output when running Zeek?
Zeek by default disables the output of Spicy-side ``print``
statements. To enable them, add ``Spicy::enable_print=T`` to the Zeek
command line (or ``redef Spicy::enable_print=T;`` to a Zeek script
that you are loading).
.. _faq_zeek_tcp_analyzer_not_all_messages_recognized:
.. rubric:: My analyzer recognizes only one or two TCP packets even though there are more in the input.
In Zeek, a Spicy analyzer parses the sending and receiving sides of a TCP
connection each according to the given Spicy grammar. This means that
if more than one message can be sent per side the grammar needs to
allow for that. For example, if the grammar parses messages of the
protocol as ``Message``, the top-level parsing unit given in the EVT
file needs to be able to parse a list of messages ``Message[]``.
One way to express this is to introduce a parser which wraps messages
of the protocol in an :spicylink:`anonymous field
<programming/parsing.html#anonymous-fields>`.
.. warning:: Since in general the number of messages exchanged over a TCP
connection is unbounded, an anonymous field should be used. If a named field
were used instead, the parser would need to store all messages seen over the
connection, which would lead to unbounded memory growth.
.. code-block:: spicy
type Message = unit {
# Fields for messages of the protocol.
};
# Parser used e.g., in EVT file.
public type Messages = unit {
: Message[];
};

@ -0,0 +1,118 @@
===============
Getting Started
===============
Spicy's own :spicylink:`Getting Started <getting-started.html>` guide
uses the following Spicy code to parse a simple HTTP request line:
.. literalinclude:: examples/my-http.spicy
:lines: 4-
:caption: my-http.spicy
:language: spicy
While the Spicy documentation goes on to show :spicylink:`how to use
this to parse corresponding data from the command line
<getting-started.html#a-simple-parser>`, here we will instead leverage
the ``RequestLine`` parser to build a proof-of-concept protocol
analyzer for Zeek. While this all remains simplified here, the
following, more in-depth :ref:`spicy_tutorial` demonstrates how
to build a complete analyzer for a real protocol.
.. rubric:: Preparations
Because Zeek works from network packets, we first need a packet trace
with the payload we want to parse. We can't just use a normal HTTP
session as our simple parser wouldn't go further than just the first
line of the protocol exchange and then bail out with an error. So
instead, for our example we create a custom packet trace with a TCP
connection that carries just a single HTTP request line as its
payload::
# tcpdump -i lo0 -w request-line.pcap port 12345 &
# nc -l 12345 &
# echo "GET /index.html HTTP/1.0" | nc localhost 12345
# killall tcpdump nc
This gets us :download:`this trace file <examples/request-line.pcap>`.
.. _example_spicy_my_http_adding_analyzer:
.. rubric:: Adding a Protocol Analyzer
Now we can go ahead and add a new protocol analyzer to Zeek. We
already have the Spicy grammar to parse our connection's payload; it's
in ``my-http.spicy``. In order to use this with Zeek, we have two
additional things to do: (1) We need to let Zeek know about our new
protocol analyzer, including when to use it; and (2) we need to define
at least one Zeek event that we want our parser to generate, so that
we can then write a Zeek script working with the information that it
extracts.
We do both of these by creating an additional control file for Zeek:
.. literalinclude:: examples/my-http.evt
:caption: my-http.evt
:linenos:
:language: spicy-evt
The first block (lines 1-3) tells Zeek that we have a new protocol
analyzer to provide. The analyzer's Zeek-side name is
``spicy::MyHTTP``, and it's meant to run on top of TCP connections
(line 1). Lines 2-3 then provide Zeek with more specifics: The entry
point for originator-side payload is the ``MyHTTP::RequestLine`` unit
type that our Spicy grammar defines (line 2); and we want Zeek to
activate our analyzer for all connections with a responder port of
12345 (which, of course, matches the packet trace we created).
The second block (line 5) tells Zeek that we want to
define one event. On the left-hand side of that line we give the unit
that is to trigger the event. The right-hand side defines its name and
arguments. What we are saying here is that every time a ``RequestLine``
line has been fully parsed, we'd like a ``MyHTTP::request_line`` event
to go to Zeek. Each event instance will come with four parameters:
Three of them are the values of corresponding unit fields, accessed
just through normal Spicy expressions (inside an event argument
expression, ``self`` refers to the unit instance that has led to the
generation of the current event). The first parameter, ``$conn``, is a
"magic" keyword that passes the Zeek-side ``connection`` record
(which includes the connection ID, ``conn_id``) to the event.
Now we have everything in place that we need for our new protocol
analyzer---except for a Zeek script actually doing something with the
information we are parsing. Let's use this:
.. literalinclude:: examples/my-http.zeek
:caption: my-http.zeek
:language: zeek
You see a Zeek event handler for the event that we just defined,
having the expected signature of four parameters matching the types of
the parameter expressions that the ``*.evt`` file specifies. The
handler's body then just prints out what it gets.
.. _example_spicy_my_http:
Finally we can put together our pieces by compiling the Spicy grammar and the
EVT file into an HLTO file with ``spicyz``, and by pointing Zeek at the produced
file and the analyzer-specific Zeek scripts::
# spicyz my-http.spicy my-http.evt -o my-http.hlto
# zeek -Cr request-line.pcap my-http.hlto my-http.zeek
Zeek saw from 127.0.0.1: GET /index.html 1.0
When Zeek starts up here the Spicy integration registers a protocol analyzer to
the entry point of our Spicy grammar as specified in the EVT file. It then
begins processing the packet trace as usual, now activating our new analyzer
whenever it sees a TCP connection on port 12345. Accordingly, the
``MyHTTP::request_line`` event gets generated once the parser gets to process
the session's payload. The Zeek event handler then executes and prints the
output we would expect.
.. note::
By default, Zeek suppresses any output from Spicy-side
``print`` statements. You can add ``Spicy::enable_print=T`` to the
command line to see it. In the example above, you would then get
an additional line of output: ``GET, /index.html, 1.0``.

doc/devel/spicy/index.rst Normal file

@ -0,0 +1,73 @@
============================
Writing Analyzers with Spicy
============================
:spicylink:`Spicy <index.html>` is a parser generator that makes it
easy to create robust C++ parsers for network protocols, file formats,
and more. Zeek supports integrating Spicy analyzers so that one can
create Zeek protocol, packet and file analyzers. This section digs
into how that integration works. We begin with a short "Getting
Started" guide showing you the basics of using Spicy with Zeek,
followed by an in-depth tutorial on adding a complete protocol
analyzer to Zeek. The final part consists of a reference section
documenting everything the Spicy integration supports.
While this documentation walks through all the bits and pieces that an
analyzer consists of, there's an easy way to get started when writing
a new analyzer from scratch: the `Zeek package manager
<https://docs.zeek.org/projects/package-manager>`_ can create analyzer
scaffolding for you that includes an initial Spicy grammar
(``*.spicy``), Zeek integration glue code (``*.evt``; see below) and a
corresponding CMake build setup. To create that scaffolding, use the
package manager's ``create`` command and pass one of
``--features=spicy-protocol-analyzer``,
``--features=spicy-packet-analyzer``, or
``--features=spicy-file-analyzer`` to create a Zeek protocol, packet,
or file analyzer, respectively. See :ref:`the tutorial
<zkg_create_package>` for more on this.
Note that Zeek itself installs the grammars of its builtin Spicy
analyzers for potential reuse. For example, the `Finger grammar
<https://github.com/zeek/zeek/blob/master/src/analyzer/protocol/finger/finger.spicy>`_
gets installed to ``<PREFIX>/share/spicy/finger/finger.spicy``. It can
be used in custom code by importing it with ``import Finger from
finger;``.
.. toctree::
:maxdepth: 2
:caption: Table of Contents
installation
getting-started
tutorial
reference
faq
.. note::
This documentation focuses on writing *external* Spicy analyzers
that you can load into Zeek at startup. Zeek also comes with the
infrastructure to build Spicy analyzers directly into the
executable itself, just like traditional built-in analyzers. We
will document this more as we're converting more of Zeek's built-in
analyzers over to Spicy. For now, we recommend looking at one of
the existing built-in Spicy analyzers (Syslog, Finger) as examples.
.. _spicy_terminology:
Terminology
===========
A word on terminology: In Zeek, the term "analyzer" generally refers
to a component that processes a particular protocol ("protocol
analyzer"), file format ("file analyzer"), or low-level packet
structure ("packet analyzer"). "Processing" here means more than just
parsing content: An analyzer controls when it wants to be used (e.g.,
with connections on specific ports, or with files of a specific MIME
type); what events to generate for Zeek's scripting layer; and how to
handle any errors occurring during parsing. While Spicy itself focuses
just on the parsing part, Spicy makes it possible to provide the
remaining pieces to Zeek, turning a Spicy parser into a full Zeek
analyzer. That's what we refer to as a "Spicy (protocol/file/packet)
analyzer" for Zeek.

@ -0,0 +1,18 @@
.. _spicy_installation:
Installation
============
Since Zeek version 5.0, support for Spicy is built right into Zeek by
default. To confirm that Spicy is indeed available, you can inspect
the output of ``zeek -N``::
# zeek -N Zeek::Spicy
Zeek::Spicy - Support for Spicy parsers (*.hlto) (built-in)
It remains possible to build Zeek against an external Spicy
installation through Zeek's ``configure`` option
``--with-spicy=PATH``, where ``PATH`` points to the Spicy installation
directory. In that case, you also need to ensure that the Spicy tools
(e.g., ``spicyc``, ``spicy-config``) are available in ``PATH``.

File diff suppressed because it is too large

@ -0,0 +1,441 @@
.. _spicy_tutorial:
Tutorial
========
This tutorial walks through the integration of a simple TFTP analyzer
into Zeek. This discussion continues the example from
:spicylink:`Spicy's own tutorial <tutorial/index.html>` that develops
the TFTP grammar, now focusing on how to use it with Zeek. Please go
through that Spicy tutorial first before continuing here.
To turn a Spicy-side grammar into a Zeek analyzer, we need to provide
Zeek with a description of how to employ it. There are two parts to
that: Telling Zeek when to activate the analyzer, and defining events
to generate. In addition, we will need a Zeek-side script to do
something with our new TFTP events. We will walk through this in the
following, starting with the mechanics of compiling the Spicy analyzer
for Zeek. While we will build up the files involved individually
first, see the :ref:`final section <zkg_create_package>` for how the
Zeek package manager, *zkg*, can be used to bootstrap a new Zeek
package with a skeleton of everything needed for an analyzer.
Before proceeding, make sure that your Zeek comes with Spicy support
built-in---which is the default since Zeek version 5.0::
# zeek -N Zeek::Spicy
Zeek::Spicy - Support for Spicy parsers (*.hlto) (built-in)
You should also have ``spicyz`` in your ``PATH``::
# which spicyz
/usr/local/zeek/bin/spicyz
.. note::
There are a number of pieces involved in creating a full Zeek
analyzer, in particular if you want to distribute it as a Zeek
package. To help you get started with that, Zeek's package manager
can create a skeleton Spicy package by running::
# zkg create --features=spicy-protocol-analyzer --packagedir <packagedir>
The generated files mark places that will need manual editing with
``TODO``. See the :ref:`tutorial <zkg_create_package>` for more on
this.
Compiling the Analyzer
----------------------
Zeek comes with a tool :ref:`spicyz <spicyz>` that compiles Spicy
analyzers into binary code that Zeek can load through a Spicy plugin.
The following command line produces a binary object file ``tftp.hlto``
containing the executable analyzer code:
.. code::
# spicyz -o tftp.hlto tftp.spicy
Below, we will prepare an additional interface definition file
``tftp.evt`` that describes the analyzer's integration into Zeek. We
will need to give that to ``spicyz`` as well, and our full
compilation command hence becomes:
.. code::
# spicyz -o tftp.hlto tftp.spicy tftp.evt
When starting Zeek, we add ``tftp.hlto`` to its command line:
.. code::
# zeek -r tftp_rrq.pcap tftp.hlto
Activating the Analyzer
-----------------------
In *Getting Started*, :ref:`we already saw
<example_spicy_my_http_adding_analyzer>` how to inform Zeek about a new
protocol analyzer. We follow the same scheme here and put the
following into ``tftp.evt``, the analyzer definition file:
.. literalinclude:: autogen/tftp.evt
:lines: 5-7
:language: spicy-evt
The first line provides our analyzer with a Zeek-side name
(``spicy::TFTP``) and also tells Zeek that we are adding an
application analyzer on top of UDP (``over UDP``). ``TFTP::Packet``
provides the top-level entry point for parsing both sides of a TFTP
connection. Furthermore, we want Zeek to automatically activate our
analyzer for all sessions on UDP port 69 (i.e., TFTP's well known
port). See :ref:`spicy_evt_analyzer_setup` for more details on defining
such a ``protocol analyzer`` section.
.. note::
We use the ``port`` attribute in the ``protocol analyzer`` section
mainly for convenience; it's not the only way to define the
well-known ports. For a production analyzer, it's more idiomatic
to use a Zeek script instead; see :ref:`this note
<zeek_init_instead_of_port>` for more information.
With this in place, we can already employ the analyzer inside Zeek. It
will not generate any events yet, but we can at least see the output of
the ``on %done { print self; }`` hook that still remains part of the
grammar from earlier:
.. code::
# zeek -r tftp_rrq.pcap tftp.hlto Spicy::enable_print=T
[$opcode=Opcode::RRQ, $rrq=[$filename=b"rfc1350.txt", $mode=b"octet"], $wrq=(not set), $data=(not set), $ack=(not set), $error=(not set)]
Because, by default, the Zeek plugin does not show the output of Spicy-side
``print`` statements, we added ``Spicy::enable_print=T`` to the
command line to turn that on. We see that Zeek took care of the
lower network layers, extracted the UDP payload from the Read Request,
and passed that into our Spicy parser. (If you want to view more about
the internals of what is happening here, there are a couple of kinds of
:ref:`debug output available <spicy_debugging>`.)
You might be wondering why there is only one line of output, even
though there are multiple TFTP packets in our pcap trace. Shouldn't
the ``print`` execute multiple times? Yes, it should, but it does not
currently: Due to some intricacies of the TFTP protocol, our analyzer
gets to see only the first packet for now. We will fix this later. For
now, we focus on the Read Request packet that the output above shows.
Defining Events
---------------
The core task of any Zeek analyzer is to generate events for Zeek
scripts to process. For binary protocols, events will often correspond
pretty directly to data units specified by their specifications---and
TFTP is no exception. We start with an event for Read/Write Requests
by adding this definition to ``tftp.evt``:
.. literalinclude:: examples/tftp-single-request.evt
:lines: 5-7
:language: spicy-evt
The first line makes our Spicy TFTP grammar available to the rest of
the file. The line ``on ...`` defines one event: Every time a
``Request`` unit has been parsed, we want to receive an event
``tftp::request`` with one parameter: the connection it belongs to.
Here, ``$conn`` is a reserved identifier that will turn into the
standard `connection record
<https://docs.zeek.org/en/current/scripts/base/init-bare.zeek.html#type-connection>`_
on the Zeek side.
Now we need a Zeek event handler for our new event. Let's put this
into ``tftp.zeek``:
.. literalinclude:: examples/tftp-single-request.zeek
:language: zeek
Running Zeek then gives us:
.. code::
# spicyz -o tftp.hlto tftp.spicy tftp.evt
# zeek -r tftp_rrq.pcap tftp.hlto tftp.zeek
TFTP request, [orig_h=192.168.0.253, orig_p=50618/udp, resp_h=192.168.0.10, resp_p=69/udp]
Let's extend the event signature a bit by passing further arguments:
.. literalinclude:: examples/tftp-single-request-more-args.evt
:lines: 5-7
:language: spicy-evt
This shows how each parameter gets specified as a Spicy expression:
``self`` refers to the unit instance currently being parsed, and
``self.filename`` retrieves the value of its ``filename`` field.
``$is_orig`` is another reserved ID that turns into a boolean that
will be true if the event has been triggered by originator-side
traffic. On the Zeek side, our event now has the following signature:
.. literalinclude:: examples/tftp-single-request-more-args.zeek
:language: zeek
.. code::
# spicyz -o tftp.hlto tftp.spicy tftp.evt
# zeek -r tftp_rrq.pcap tftp.hlto tftp.zeek
TFTP request, [orig_h=192.168.0.253, orig_p=50618/udp, resp_h=192.168.0.10, resp_p=69/udp], T, rfc1350.txt, octet
Going back to our earlier discussion of Read vs Write Requests, we do
not yet make that distinction with the ``request`` event that we are
sending to Zeek-land. However, since we had introduced the ``is_read``
unit parameter, we can easily separate the two by gating event
generation through an additional ``if`` condition:
.. literalinclude:: autogen/tftp.evt
:lines: 11-12
:language: spicy-evt
This now defines two separate events, each being generated only for
the corresponding value of ``is_read``. Let's try it with a new
``tftp.zeek``:
.. literalinclude:: examples/tftp-two-requests.zeek
:language: zeek
.. code::
# spicyz -o tftp.hlto tftp.spicy tftp.evt
# zeek -r tftp_rrq.pcap tftp.hlto tftp.zeek
TFTP read request, [orig_h=192.168.0.253, orig_p=50618/udp, resp_h=192.168.0.10, resp_p=69/udp], T, rfc1350.txt, octet
If we look at the :file:`conn.log` that Zeek produces during this run, we
will see that the ``service`` field is not filled in yet. That's
because our analyzer does not yet confirm to Zeek that it has been
successful in parsing the content. To do that, we can call a library
function that Spicy makes available once we have successfully parsed a
request: :spicylink:`spicy::accept_input
<programming/library.html#spicy-accept-input>`. That function signals
the host application---i.e., Zeek in our case---that the parser is
processing the expected protocol.
First, we need to make sure the Spicy standard library is imported
in ``tftp.spicy``, so that we will have its functions available:
.. code::
import spicy;
With that, our request looks like this now:
.. code-block:: spicy
type Request = unit(is_read: bool) {
filename: bytes &until=b"\x00";
mode: bytes &until=b"\x00";
on %done { spicy::accept_input(); }
};
Let's try it again:
.. code::
# spicyz -o tftp.hlto tftp.spicy tftp.evt
# zeek -r tftp_rrq.pcap tftp.hlto tftp.zeek
TFTP read request, [orig_h=192.168.0.253, orig_p=50618/udp, resp_h=192.168.0.10, resp_p=69/udp], T, rfc1350.txt, octet
# cat conn.log
[...]
1367411051.972852 C1f7uj4uuv6zu2aKti 192.168.0.253 50618 192.168.0.10 69 udp spicy_tftp - - - S0 - - 0 D 1 48 0 0 -
[...]
Now the ``service`` field says TFTP! (There will be a second connection in
the log that we are not showing here; see the next section on that.)
Turning to the other TFTP packet types, it is straightforward to add
events for them as well. The following is our complete ``tftp.evt``
file:
.. literalinclude:: autogen/tftp.evt
:lines: 5-
:language: spicy-evt
Detour: Zeek vs. TFTP
---------------------
We noticed above that Zeek seems to be seeing only a single TFTP
packet from our input trace, even though ``tcpdump`` shows that the
pcap file contains multiple different types of packets. The reason
becomes clear once we look more closely at the UDP ports that are in
use:
.. code::
# tcpdump -ttnr tftp_rrq.pcap
1367411051.972852 IP 192.168.0.253.50618 > 192.168.0.10.69: 20 RRQ "rfc1350.txtoctet" [tftp]
1367411052.077243 IP 192.168.0.10.3445 > 192.168.0.253.50618: UDP, length 516
1367411052.081790 IP 192.168.0.253.50618 > 192.168.0.10.3445: UDP, length 4
1367411052.086300 IP 192.168.0.10.3445 > 192.168.0.253.50618: UDP, length 516
1367411052.088961 IP 192.168.0.253.50618 > 192.168.0.10.3445: UDP, length 4
1367411052.088995 IP 192.168.0.10.3445 > 192.168.0.253.50618: UDP, length 516
[...]
Turns out that only the first packet is using the well-known TFTP port
69/udp, whereas all the subsequent packets use ephemeral ports. Due to
the port difference, Zeek believes it is seeing two independent
network connections, and it does not associate TFTP with the second
one at all due to its lack of the well-known port (neither does
``tcpdump``!). Zeek's connection log confirms this by showing two
separate entries:
.. code::
# cat conn.log
1367411051.972852 CH3xFz3U1nYI1Dp1Dk 192.168.0.253 50618 192.168.0.10 69 udp spicy_tftp - - - S0 - - 0 D 1 48 0 0 -
1367411052.077243 CfwsLw2TaTIeo3gE9g 192.168.0.10 3445 192.168.0.253 50618 udp - 0.181558 24795 196 SF - - 0 Dd 49 26167 49 1568 -
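To make the port effect concrete, here is a small JavaScript sketch that groups the first few packets from the ``tcpdump`` output above by a direction-independent address/port key. The ``flowKey`` helper is purely illustrative and much simpler than Zeek's actual connection tracking; it only shows why the port change yields two separate flows.

```javascript
// Build a direction-independent key for a UDP flow from its two endpoints.
// Illustrative only: this is not how Zeek identifies connections internally.
function flowKey(srcAddr, srcPort, dstAddr, dstPort) {
  const a = `${srcAddr}:${srcPort}`;
  const b = `${dstAddr}:${dstPort}`;
  return a < b ? `${a}|${b}` : `${b}|${a}`;
}

// The first four packets of the trace shown above.
const packets = [
  ['192.168.0.253', 50618, '192.168.0.10', 69],
  ['192.168.0.10', 3445, '192.168.0.253', 50618],
  ['192.168.0.253', 50618, '192.168.0.10', 3445],
  ['192.168.0.10', 3445, '192.168.0.253', 50618],
];

// Count packets per flow.
const flows = new Map();
for (const packet of packets) {
  const key = flowKey(...packet);
  flows.set(key, (flows.get(key) || 0) + 1);
}

for (const [key, count] of flows)
  console.log(`${key}: ${count} packet(s)`);
```

Only the initial request lands in the flow involving port 69; everything else ends up in a second flow, which matches the two ``conn.log`` entries.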
Switching the ports for subsequent packets is a quirk in TFTP that
resembles similar behaviour in standard FTP, where data connections
get set up separately as well. Fortunately, Zeek provides a built-in
function to designate a specific analyzer for an anticipated future
connection. We can call that function when we see the initial request:
.. literalinclude:: examples/tftp-schedule-analyzer.zeek
:language: zeek
.. code::
# spicyz -o tftp.hlto tftp.spicy tftp.evt
# zeek -r tftp_rrq.pcap tftp.hlto tftp.zeek
TFTP read request, [orig_h=192.168.0.253, orig_p=50618/udp, resp_h=192.168.0.10, resp_p=69/udp], rfc1350.txt, octet
TFTP data, 1, \x0a\x0a\x0a\x0a\x0a\x0aNetwork Working Group [...]
TFTP ack, 1
TFTP data, 2, B Official Protocol\x0a Standards" for the [...]
TFTP ack, 2
TFTP data, 3, protocol was originally designed by Noel Chia [...]
TFTP ack, 3
TFTP data, 4, r mechanism was suggested by\x0a PARC's EFT [...]
TFTP ack, 4
[...]
Now we are seeing all the packets as we would expect.
Zeek Script
-----------
Analyzers normally come along with a Zeek-side script that implements
a set of standard base functionality, such as recording activity into
a protocol-specific log file. These scripts provide handlers for the
analyzers' events, and collect and correlate their activity as
desired. We have created such :download:`a script for TFTP
<autogen/tftp.zeek>`, based on the events that our Spicy analyzer
generates. Once we add that to the Zeek command line, we will see a
new :file:`tftp.log`:
.. code::
# spicyz -o tftp.hlto tftp.spicy tftp.evt
# zeek -r tftp_rrq.pcap tftp.hlto tftp.zeek
# cat tftp.log
#fields ts uid id.orig_h id.orig_p id.resp_h id.resp_p wrq fname mode uid_data size block_sent block_acked error_code error_msg
1367411051.972852 CKWH8L3AIekSHYzBU 192.168.0.253 50618 192.168.0.10 69 F rfc1350.txt octet ClAr3P158Ei77Fql8h 24599 49 49 - -
The TFTP script also labels the second session as TFTP data by
adding a corresponding entry to the ``service`` field inside the
Zeek-side connection record. With that, we are now seeing this in
:file:`conn.log`:
.. code::
1367411051.972852 ChbSfq3QWKuNirt9Uh 192.168.0.253 50618 192.168.0.10 69 udp spicy_tftp - - - S0 - - 0 D 1 48 0 0 -
1367411052.077243 CowFQj20FHHduhHSYk 192.168.0.10 3445 192.168.0.253 50618 udp spicy_tftp_data 0.181558 24795 196 SF - - 0 Dd 49 26167 49 1568 -
The TFTP script ends up being a bit more complex than one would expect
for such a simple protocol. That's because it tracks the two related
connections (initial request and follow-up traffic on a different
port), and combines them into a single TFTP transaction for logging.
Since there is nothing Spicy-specific in that Zeek script, we skip
discussing it here in more detail.
.. _zkg_create_package:
Creating a Zeek Package
-----------------------
We have now assembled all the parts needed for providing a new
analyzer to Zeek. By adding a few further pieces, we can wrap that
analyzer into a full *Zeek package* for others to install easily
through *zkg*. To help create that wrapping, *zkg* provides a template
for instantiating a skeleton analyzer package as a starting point. The
skeleton comes in three different flavors, depending on which kind of
analyzer you want to create: protocol, file, or packet analyzer.
In each case, it creates all the necessary files along with the
appropriate directory layout, and even includes a couple of
standard test cases.
To create the scaffolding for our TFTP analyzer, execute the following
command and provide the requested information::
# zkg create --features spicy-protocol-analyzer --packagedir spicy-tftp
"package-template" requires a "name" value (the name of the package, e.g. "FooBar" or "spicy-http"):
name: spicy-tftp
"package-template" requires a "analyzer" value (name of the Spicy analyzer, which typically corresponds to the protocol/format being parsed (e.g. "HTTP", "PNG")):
analyzer: TFTP
"package-template" requires a "protocol" value (transport protocol for the analyzer to use: TCP or UDP):
protocol: UDP
"package-template" requires a "unit_orig" value (name of the top-level Spicy parsing unit for the originator side of the connection (e.g. "Request")):
unit_orig: Packet
"package-template" requires a "unit_resp" value (name of the top-level Spicy parsing unit for the responder side of the connection (e.g. "Reply"); may be the same as originator side):
unit_resp: Packet
The above creates the following files (skipping anything related to
``.git``)::
spicy-tftp/CMakeLists.txt
spicy-tftp/COPYING
spicy-tftp/README
spicy-tftp/analyzer/CMakeLists.txt
spicy-tftp/analyzer/tftp.evt
spicy-tftp/analyzer/tftp.spicy
spicy-tftp/cmake/FindSpicyPlugin.cmake
spicy-tftp/scripts/__load__.zeek
spicy-tftp/scripts/dpd.sig
spicy-tftp/scripts/main.zeek
spicy-tftp/testing/Baseline/tests.run-pcap/conn.log
spicy-tftp/testing/Baseline/tests.run-pcap/output
spicy-tftp/testing/Baseline/tests.standalone/
spicy-tftp/testing/Baseline/tests.standalone/output
spicy-tftp/testing/Baseline/tests.trace/output
spicy-tftp/testing/Baseline/tests.trace/tftp.log
spicy-tftp/testing/Files/random.seed
spicy-tftp/testing/Makefile
spicy-tftp/testing/Scripts/README
spicy-tftp/testing/Scripts/diff-remove-timestamps
spicy-tftp/testing/Scripts/get-zeek-env
spicy-tftp/testing/Traces/tcp-port-12345.pcap
spicy-tftp/testing/Traces/udp-port-12345.pcap
spicy-tftp/testing/btest.cfg
spicy-tftp/testing/tests/availability.zeek
spicy-tftp/testing/tests/standalone.spicy
spicy-tftp/testing/tests/trace.zeek
spicy-tftp/zkg.meta
Note the ``*.evt``, ``*.spicy``, ``*.zeek`` files: they correspond to
the files we created for TFTP in the preceding sections; we can just
move our versions in there. Furthermore, the generated scaffolding
marks places with ``TODO`` that need manual editing: use ``git grep
TODO`` inside the ``spicy-tftp`` directory to find them. We won't go
through all the specific customizations for TFTP here, but for
reference you can find the full TFTP package as created from the *zkg*
template on `GitHub <https://github.com/zeek/spicy-tftp>`_.
If instead of a protocol analyzer, you'd like to create a file or
packet analyzer, run zkg with ``--features spicy-file-analyzer`` or
``--features spicy-packet-analyzer``, respectively. The generated
skeleton will be suitably adjusted then.

doc/devel/websocket-api.rst Normal file

@ -0,0 +1,317 @@
.. _websocket-api:
.. _websocat: https://github.com/vi/websocat
======================================
Interacting with Zeek using WebSockets
======================================
Introduction
============
Usually, Zeek produces protocol logs consumed by external applications. These
external applications might be SIEMs, real-time streaming analysis platforms
or basic archival processes compressing logs for long term storage.
Certain use-cases require interacting with and influencing Zeek's runtime behavior
outside of static configuration via ``local.zeek``.
The classic :ref:`framework-input` and :ref:`framework-configuration` can be
leveraged for runtime configuration of Zeek as well as triggering arbitrary
events or script execution via option handlers. These frameworks are mostly
file- or process-based and may feel a bit unusual in environments where creation
of files is uncommon or even impossible due to separation of concerns. In many
of today's environments, interacting using HTTP-based APIs or other remote
interfaces is more common.
.. note::
As an aside, if you need more flexibility than the WebSocket API offers today,
an alternative could be to use :ref:`javascript` within Zeek. This opens the
possibility to run a separate HTTP or a totally different Node.js based server
within a Zeek process for quick experimentation and evaluation of other
approaches.
Background and Setup
====================
Since Zeek 5.0, Zeek allows connections from external clients over WebSocket.
This allows these clients to interact with Zeek's publish-subscribe layer and
exchange Zeek events with other Zeek nodes.
Initially, this implementation resided in the Broker subsystem.
With Zeek 8.0, most of the implementation has been moved into core Zeek
itself with the v1 serialization format remaining in Broker.
WebSocket clients may subscribe to a fixed set of topics and will receive
Zeek events published to matching topics, whether by Zeek cluster nodes or by
other WebSocket clients.
With Zeek 8.0, Zeekctl gained support for interacting with Zeek cluster nodes
using the WebSocket protocol. If you're running a Zeekctl based cluster and
want to experiment with WebSocket functionality, add ``UseWebSocket = 1`` to
your ``zeekctl.cfg``:
.. code-block:: ini
# zeekctl.cfg
...
UseWebSocket = 1
This will essentially add the following snippet, enabling a WebSocket server
on the Zeek manager:
.. code-block:: zeek
:caption: websocket.zeek
event zeek_init()
{
if ( Cluster::local_node_type() == Cluster::MANAGER )
{
Cluster::listen_websocket([
$listen_addr=127.0.0.1,
$listen_port=27759/tcp,
]);
}
}
To verify that the WebSocket API is functional in your deployment, you can use a
tool such as `websocat`_ for a quick check:
.. code-block:: shell
$ echo '[]' | websocat ws://127.0.0.1:27759/v1/messages/json
{"type":"ack","endpoint":"3eece35d-9f94-568d-861c-6a16c433e090-websocket-2","version":"8.0.0-dev.684"}
Zeek's ``cluster.log`` file will also have an entry for the WebSocket client connection.
The empty array in the command specifies the client's subscriptions, in this case none.
Version 1
=========
The currently implemented protocol is accessible at ``/v1/messages/json``.
The `data representation <https://docs.zeek.org/projects/broker/en/current/web-socket.html#data-representation>`_
is documented in detail within the Broker project. Note that this format is a
direct translation of Broker's binary format into JSON, resulting in a fairly
tight coupling between WebSocket clients and the corresponding Zeek scripts.
The most prominent consequence is that record values are represented as vectors
instead of objects, which makes the protocol sensitive to the reordering of record
fields and to the introduction of optional fields.
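The following JavaScript sketch illustrates that coupling. It assumes a hypothetical Zeek record with the fields ``host`` and ``hits``; neither the record nor the ``recordFromVector`` helper is part of Zeek or Broker. Because the client has to hard-code the field order, any change to the record on the Zeek side breaks the mapping.

```javascript
// Field order of a hypothetical Zeek record: record { host: string; hits: count; }.
// The client has to know this order because the wire format carries no field names.
const FIELDS = ['host', 'hits'];

// Turn a Broker-JSON vector back into an object with named fields.
function recordFromVector(fields, value) {
  if (value['@data-type'] !== 'vector' || value.data.length !== fields.length)
    throw new Error('record layout mismatch');

  const record = {};
  fields.forEach((field, i) => { record[field] = value.data[i].data; });
  return record;
}

// A record value as it would appear on the wire, given the assumed layout.
const wireValue = {
  '@data-type': 'vector',
  data: [
    { '@data-type': 'string', data: 'example.org' },
    { '@data-type': 'count', data: 7 },
  ],
};

console.log(recordFromVector(FIELDS, wireValue));
```

If the Zeek script later adds a field to the record, the vector grows by one element and the length check above fails. Without such a check, a reordering would silently assign values to the wrong names.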
.. note::
We're looking into an iteration of the format. If you have feedback or
would like to contribute, please reach out on the usual community channels.
Handshake and Acknowledgement
-----------------------------
The first message after a WebSocket connection has been established originates
from the client. This message is a JSON array of strings that represent the
topics the WebSocket client wishes to subscribe to.
Zeek replies with an acknowledgement message that's a JSON object or an error.
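As a rough sketch, a client could wrap both sides of this handshake in two small JavaScript helpers. The function names are made up for illustration. Anything other than an ``ack`` is simply treated as a failure, since the exact shape of the error reply is not shown here.

```javascript
// Serialize the topics into the first message a client sends after connecting.
function subscriptionMessage(topics) {
  if (!Array.isArray(topics) || !topics.every((t) => typeof t === 'string'))
    throw new TypeError('topics must be an array of strings');
  return JSON.stringify(topics);
}

// Interpret Zeek's reply to the subscription message.
function parseHandshakeReply(text) {
  const reply = JSON.parse(text);
  if (reply.type === 'ack')
    return { ok: true, endpoint: reply.endpoint, version: reply.version };
  // Not an acknowledgement: hand the raw reply back to the caller.
  return { ok: false, reply };
}

console.log(subscriptionMessage(['zeek.test']));
```

An empty array is a valid subscription message as well. It subscribes the client to no topics, as in the ``websocat`` check earlier.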
Events
------
After the acknowledgement, WebSocket clients receive all events arriving on
topics they have subscribed to.
.. code-block:: shell
$ websocat ws://127.0.0.1:27759/v1/messages/json
["zeek.test"]
{"type":"ack","endpoint":"d955d990-ad8a-5ed4-8bc5-bee252d4a2e6-websocket-0","version":"8.0.0-dev.684"}
{"type":"data-message","topic":"zeek.test","@data-type":"vector","data":[{"@data-type":"count","data":1},{"@data-type":"count","data":1},{"@data-type":"vector","data":[{"@data-type":"string","data":"hello"},{"@data-type":"vector","data":[{"@data-type":"count","data":3}]},{"@data-type":"vector","data":[]}]}]}
The received messages, again, are encoded in Broker's JSON format. The ``data-message``
above represents an event received on topic ``zeek.test``. The event's name is ``hello``.
This event has a single argument of type :zeek:type:`count`. In the example above
its value is ``3``.
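A minimal JavaScript sketch of pulling the event name and arguments out of such a message might look as follows. The ``decodeEvent`` helper is hypothetical. It relies only on the layout visible in the example above (two leading counts, followed by a vector holding the name, the arguments and a trailing vector) and does not interpret the remaining elements.

```javascript
// Extract topic, event name and arguments from a Broker-JSON data-message.
// Returns null for anything that does not look like the example layout.
function decodeEvent(text) {
  const msg = JSON.parse(text);
  if (msg.type !== 'data-message' || msg['@data-type'] !== 'vector')
    return null;
  if (!Array.isArray(msg.data) || msg.data.length < 3)
    return null;

  // Layout as seen in the example: [count, count, [name, args, ...]].
  const event = msg.data[2];
  const [name, args] = event.data;
  return {
    topic: msg.topic,
    name: name.data,
    args: args.data.map((a) => ({ type: a['@data-type'], value: a.data })),
  };
}

// The data-message from the websocat session above.
const sample = '{"type":"data-message","topic":"zeek.test","@data-type":"vector","data":[{"@data-type":"count","data":1},{"@data-type":"count","data":1},{"@data-type":"vector","data":[{"@data-type":"string","data":"hello"},{"@data-type":"vector","data":[{"@data-type":"count","data":3}]},{"@data-type":"vector","data":[]}]}]}';

console.log(decodeEvent(sample));
```

For the sample message this yields the name ``hello`` and a single ``count`` argument with value ``3``.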
To send events, WebSocket clients similarly encode their event representation
to Broker's JSON format and send them as `text data frames <https://datatracker.ietf.org/doc/html/rfc6455#section-5.6>`_.
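The sketch below builds such a message in JavaScript by mirroring the structure of the received ``data-message`` shown earlier. The ``encodeEvent`` and ``count`` helpers are illustrative, not an existing API. The two leading counts and the trailing empty vector are copied verbatim from that example rather than derived from the format specification, so consult the Broker documentation before relying on them.

```javascript
// Wrap a number as a Broker-JSON count value.
const count = (n) => ({ '@data-type': 'count', data: n });

// Build the text frame for publishing an event to a topic.
// args must already be Broker-JSON values, e.g. created with count().
function encodeEvent(topic, name, args) {
  return JSON.stringify({
    type: 'data-message',
    topic: topic,
    '@data-type': 'vector',
    data: [
      count(1), // copied from the received example
      count(1), // copied from the received example
      {
        '@data-type': 'vector',
        data: [
          { '@data-type': 'string', data: name },
          { '@data-type': 'vector', data: args },
          { '@data-type': 'vector', data: [] }, // copied from the received example
        ],
      },
    ],
  });
}

console.log(encodeEvent('zeek.test', 'hello', [count(3)]));
```

The resulting string can be passed to ``socket.send()`` once the handshake has completed.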
X-Application-Name Header
-------------------------
When a WebSocket client includes an ``X-Application-Name`` HTTP header in
the initial WebSocket Handshake's GET request, that header's value is available
in the :zeek:see:`Cluster::websocket_client_added` event's ``endpoint`` argument (see :zeek:see:`Cluster::EndpointInfo`).
The header's value will also be included in ``cluster.log`` messages.
Additionally, if the cluster telemetry for WebSocket clients is set to
:zeek:see:`Cluster::Telemetry::VERBOSE` or :zeek:see:`Cluster::Telemetry::DEBUG`
via :zeek:see:`Cluster::Telemetry::websocket_metrics`, the header's value is
included as ``app`` label in metrics exposed by the :ref:`framework-telemetry`.
As of Zeek 8.0, a WebSocket client will be rejected if the header is set, but
its value doesn't match ``[-/_.=:*@a-zA-Z0-9]+``.
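A client can check its application name up front and so avoid a rejected connection. The JavaScript sketch below assumes that the pattern has to match the entire header value, which is why it is anchored with ``^`` and ``$``. The helper name is made up for illustration.

```javascript
// Pattern from the text above, anchored on the assumption that the
// whole header value has to match.
const APP_NAME_PATTERN = new RegExp('^[-/_.=:*@a-zA-Z0-9]+$');

// Check whether a value is usable as X-Application-Name.
function isValidApplicationName(value) {
  return typeof value === 'string' && APP_NAME_PATTERN.test(value);
}

for (const name of ['client1', 'zeek/ctl-1.0@host', 'my app', ''])
  console.log(JSON.stringify(name), isValidApplicationName(name));
```

Names containing spaces, as well as empty values, fail this check.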
Language Bindings
-----------------
Note that it's possible to use any language that offers WebSocket bindings.
The ones listed below mostly add a few convenience features around the
initial Handshake message, error handling and serializing Zeek events and
values into the Broker-specific serialization format.
For example, using the Node.js `builtin WebSocket functionality <https://nodejs.org/en/learn/getting-started/websocket>`_,
the ``websocat`` example from above can be reproduced as follows:
.. code-block:: javascript
:caption: client.js
// client.js
const socket = new WebSocket('ws://192.168.122.107:27759/v1/messages/json');
socket.addEventListener('open', event => {
socket.send('["zeek.test"]');
});
socket.addEventListener('message', event => {
console.log('Message from server: ', event.data);
});
.. code-block:: shell
$ node ./client.js
Message from server: {"type":"ack","endpoint":"2e951b0c-3ca4-504c-ae8a-5d3750fec588-websocket-10","version":"8.0.0-dev.684"}
Message from server: {"type":"data-message","topic":"zeek.test","@data-type":"vector","data":[{"@data-type":"count","data":1},{"@data-type":"count","data":1},{"@data-type":"vector","data":[{"@data-type":"string","data":"hello"},{"@data-type":"vector","data":[{"@data-type":"count","data":374}]},{"@data-type":"vector","data":[]}]}]}
Golang
^^^^^^
* `Zeek Broker websocket interface library for Golang <https://github.com/corelight/go-zeek-broker-ws>`_ (not an official Zeek project)
Rust
^^^^
* `Rust types for interacting with Zeek over WebSocket <https://github.com/bbannier/zeek-websocket-rs>`_ (not an official Zeek project)
Python
^^^^^^
There are no ready-to-use Python libraries available, but the third-party
`websockets <https://github.com/python-websockets/websockets>`_ package
makes it easy to get started.
You may take inspiration from `zeek-client's implementation <https://github.com/zeek/zeek-client>`_
or the `small helper library <https://raw.githubusercontent.com/zeek/zeek/refs/heads/master/testing/btest/Files/ws/wstest.py>`_ used by several of Zeek's own tests for the
WebSocket API.
Zeekctl similarly ships a `light implementation <https://github.com/zeek/zeekctl/blob/93459b37c3deab4bec9e886211672024fa3e4759/ZeekControl/events.py#L159>`_
using the ``websockets`` library to implement its ``netstats`` and ``print`` commands.
Outgoing Connections
====================
For some deployment scenarios, Zeek only offering a WebSocket server can be cumbersome.
This is the case, for example, when multiple independent Zeek clusters interact with
a single instance of a remote API, such as one used for
configuring a central firewall.
In such scenarios, it is more natural for Zeek to connect out to the
remote API, rather than the remote API connecting to the Zeek cluster.
For these use-cases, the current suggestion is to run a WebSocket bridge between
a Zeek cluster and the remote API. One concrete tool that can be used
for this purpose is `websocat`_.
.. note::
This topic has previously been discussed elsewhere. The following
`GitHub issue <https://github.com/zeek/zeek/issues/3597>`_ and
`discussion <https://github.com/zeek/zeek/discussions/4768>`_
provide more background and details.
Example Architecture
--------------------
.. figure:: ../images/websocket-api/one-api-many-zeek.svg
:width: 300
Multiple Zeek instances and a single remote API
The following proposal decouples the components using a WebSocket
bridge for every Zeek cluster. This ensures that the depicted remote API
does not need knowledge about an arbitrary number of Zeek clusters.
.. figure:: ../images/websocket-api/one-api-many-zeek-ws-bridge.svg
:width: 300
Multiple Zeek instances and a single remote API with WebSocket bridges.
Example Implementation
----------------------
Assuming the depicted remote API provides a WebSocket server as well,
it is possible to use ``websocat`` as the bridge directly.
The crux for the remote API is that upon a new WebSocket client connection,
the first message is the topic array that the remote API wishes to subscribe
to on a Zeek cluster.
Putting these pieces together, the following JavaScript script presents the
remote API, implemented using the `ws library <https://github.com/websockets/ws?tab=readme-ov-file>`_.
It accepts WebSocket clients on port 8080 and sends the topic array, containing just
``zeek.bridge.test``, as the first message. Thereafter, it simply logs all incoming
WebSocket messages to the console.
.. literalinclude:: websocket-api/server.js
:caption: server.js
:language: javascript
The Zeek side starts a WebSocket server on port 8000 and regularly publishes
a ``hello`` event to the ``zeek.bridge.test`` topic.
.. literalinclude:: websocket-api/server.zeek
:caption: server.zeek
:language: zeek
These two servers can now be connected by running ``websocat`` as follows:
.. code-block:: shell
# In terminal 1 (use node if your Zeek has no JavaScript support)
$ zeek server.js
# In terminal 2
$ zeek server.zeek
# In terminal 3
$ while true; do websocat --text -H='X-Application-Name: client1' ws://localhost:8000/v1/messages/json ws://localhost:8080 || sleep 0.1 ; done
The first few lines of output in terminal 1 should then look as follows:
.. code-block:: shell
# zeek server.js
client1: connected, sending topics array ["zeek.bridge.test"]
client1: received: {"type":"ack","endpoint":"9089e06b-8d33-5585-ad79-4f7f6348754e-websocket-135","version":"8.1.0-dev.91"}
client1: received: {"type":"data-message","topic":"zeek.bridge.test","@data-type":"vector","data":[{"@data-type":"count","data":1},{"@data-type":"count","data":1},{"@data-type":"vector","data":[{"@data-type":"string","data":"hello"},{"@data-type":"vector","data":[{"@data-type":"count","data":1792}]},{"@data-type":"vector","data":[]}]}]}
...
If you require synchronization between the Zeek instance and the remote API, this
is best achieved with events once the connection between the remote API and the
Zeek cluster is established.
Alternative Approaches
----------------------
Since v21, Node.js contains a built-in `WebSocket client <https://nodejs.org/en/learn/getting-started/websocket>`_,
making it possible to use vanilla :ref:`javascript` within
Zeek to establish outgoing WebSocket connections, too.
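As a rough sketch of what such a JavaScript-based bridge would have to do, the following code forwards messages in both directions between two WebSocket-like objects. The ``pipe`` and ``bridge`` names are hypothetical. Messages that arrive before the other side is open are queued, so that the remote API's topic array is not lost.

```javascript
const OPEN = 1; // value of WebSocket.OPEN

// Forward every message received on `from` to `to`, queueing messages
// until `to` has finished connecting.
function pipe(from, to) {
  const pending = [];
  const flush = () => {
    while (pending.length > 0 && to.readyState === OPEN)
      to.send(pending.shift());
  };
  to.addEventListener('open', flush);
  from.addEventListener('message', (event) => {
    pending.push(event.data);
    flush();
  });
}

// Connect a Zeek WebSocket endpoint and a remote API endpoint.
function bridge(zeek, api) {
  pipe(api, zeek); // topic array and events from the remote API
  pipe(zeek, api); // acknowledgement and data-messages from Zeek
}

// Possible usage with the endpoints from the example above:
//   bridge(new WebSocket('ws://localhost:8000/v1/messages/json'),
//          new WebSocket('ws://localhost:8080'));
```

The sketch leaves out reconnection and error handling, which the ``while`` loop around ``websocat`` provides in the shell example.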
The ``websocat`` tool provides more flexibility, potentially allowing
WebSocket messages to be forwarded to external commands, which in turn could
send HTTP POST requests to an external API.


@ -0,0 +1,23 @@
// server.js
import WebSocket, { WebSocketServer } from 'ws';
const wss = new WebSocketServer({ port: 8080 });
wss.on('connection', (ws, req) => {
ws.on('error', console.error);
ws.on('close', () => { console.log('%s: gone', ws.zeek.app); });
ws.on('message', function message(data) {
console.log('%s: received: %s', ws.zeek.app, data);
});
let topics = ['zeek.bridge.test'];
let app = req.headers['x-application-name'] || '<unknown application>';
ws.zeek = {
app: app,
topics: topics,
};
console.log(`${app}: connected, sending topics array ${JSON.stringify(topics)}`);
ws.send(JSON.stringify(topics));
});


@ -0,0 +1,15 @@
global hello: event(c : count);
global c = 0;
event tick()
{
Cluster::publish("zeek.bridge.test", hello, ++c);
schedule 1.0sec { tick() };
}
event zeek_init()
{
Cluster::listen_websocket([$listen_addr=127.0.0.1, $listen_port=8000/tcp]);
event tick();
}