Copy docs into Zeek repo directly

This is based on commit 2731def9159247e6da8a3191783c89683363689c from the
zeek-docs repo.
Tim Wojtulewicz 2025-09-15 15:52:18 -07:00
parent 83f1e74643
commit ded98cd373
1074 changed files with 169319 additions and 0 deletions

1817
doc/scripting/basics.rst Normal file

File diff suppressed because it is too large

@ -0,0 +1,101 @@
.. _script-conn-id-ctx:
==============================
Use of :zeek:see:`conn_id_ctx`
==============================
.. versionadded:: 8.0
.. note::
We're still iterating on patterns for working with the new pluggable
connection keys and :zeek:see:`conn_id_ctx` instances.
If you have feedback or run into limitations for your use-cases, please reach out!
In some deployments, Zeek receives traffic from different network
segments that share overlapping IP ranges.
Such settings usually provide some additional means of separating
the ranges, such as VLAN numbers.
For example, host 10.0.0.37 in VLAN 1 and host 10.0.0.37 in VLAN 2 may share
the same IP address, but represent different systems.
In Zeek's terminology, such IP addresses (or their connections) occur in different
*contexts*. In this case the context is the VLAN ID; in other settings,
the context could be, say, Virtual Network Identifiers (VNIs) as used with
UDP-based tunnels like VXLAN or Geneve.
From Zeek's perspective, the context can be any kind of value that
it can derive from packet data.
Since version 8.0, Zeek can extract these contexts through
:ref:`plugin-provided connection key implementations <connkey-plugin>`
and include them in its core connection tracking. Such plugins will normally also
:zeek:keyword:`redefine <redef>` :zeek:see:`conn_id_ctx` with additional
fields to expose this context to the Zeek scripting layer.
For example, loading :doc:`/scripts/policy/frameworks/conn_key/vlan_fivetuple.zeek`
adds :zeek:field:`vlan` and :zeek:field:`inner_vlan` fields to :zeek:see:`conn_id_ctx`.
Script writers can use the :zeek:field:`conn_id$ctx <conn_id$ctx>` field to
distinguish :zeek:type:`addr` values observed in different contexts.
For example, to count the number of connections per originator address in
a context-aware manner, add the :zeek:see:`conn_id_ctx` to the table index:
.. code-block:: zeek
global connection_counts: table[conn_id_ctx, addr] of count &default=0;
event new_connection(c: connection)
{
++connection_counts[c$id$ctx, c$id$orig_h];
}
If, for example, :zeek:field:`ctx` is populated with fields for VLAN tags,
that table will create individual entries per ``(VLAN, addr)`` pair.
This will also work correctly if no context has been defined: ``c$id$ctx`` will
be an empty record with no fields.
Alternatively, users can define their own record type that includes both :zeek:see:`conn_id_ctx` and :zeek:type:`addr`,
and use instances of such records to index into tables:
.. literalinclude:: conn_id_ctx_my_endpoint.zeek
:caption: conn_id_ctx_my_endpoint.zeek
:language: zeek
:linenos:
:tab-width: 4
This example tracks services that an originator IP address has been observed to interact with.
When loading the :doc:`/scripts/policy/frameworks/conn_key/vlan_fivetuple.zeek`
script, IP addresses in different VLANs are tracked separately:
.. code-block:: shell
$ zeek -r vlan-collisions.pcap frameworks/conn_key/vlan_fivetuple conn_id_ctx_my_endpoint.zeek
[ctx=[vlan=42, inner_vlan=<uninitialized>], a=141.142.228.5], HTTP
[ctx=[vlan=10, inner_vlan=20], a=141.142.228.5], HTTP
[ctx=[vlan=<uninitialized>, inner_vlan=<uninitialized>], a=141.142.228.5], HTTP
Note that while this script isn't VLAN-specific, it is VLAN-aware. When
using a different connection key plugin like the one discussed in the
:ref:`connection key tutorial <connkey-plugin>`, the result becomes the following,
discriminating entries in the ``talks_with_service`` table by the value of
``c$id$ctx$vxlan_vni``:
.. code-block:: shell
$ zeek -C -r vxlan-overlapping-http-get.pcap ConnKey::factory=ConnKey::CONNKEY_VXLAN_VNI_FIVETUPLE conn_id_ctx_my_endpoint.zeek
[ctx=[vxlan_vni=<uninitialized>], a=141.142.228.5], HTTP
[ctx=[vxlan_vni=<uninitialized>], a=127.0.0.1], VXLAN
[ctx=[vxlan_vni=4711], a=141.142.228.5], HTTP
[ctx=[vxlan_vni=4242], a=141.142.228.5], HTTP
When using Zeek's default five-tuple connection key, the :zeek:see:`conn_id_ctx`
record is empty and originator address 141.142.228.5 appears as a single entry
in the table instead:
.. code-block:: shell
$ zeek -C -r vxlan-overlapping-http-get.pcap conn_id_ctx_my_endpoint.zeek
[ctx=[], a=141.142.228.5], HTTP
[ctx=[], a=127.0.0.1], VXLAN
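
Scripts that need a specific context field can also read it directly. The
following is a minimal sketch, not part of the shipped scripts. It assumes the
``vlan`` field added by ``vlan_fivetuple.zeek`` is optional, as the
``<uninitialized>`` values in the output above suggest:

.. code-block:: zeek

    @load frameworks/conn_key/vlan_fivetuple

    event new_connection(c: connection)
        {
        # Assumption: vlan is only set for VLAN-tagged traffic, so test first.
        if ( c$id$ctx?$vlan )
            print fmt("%s seen in VLAN %s", c$id$orig_h, c$id$ctx$vlan);
        else
            print fmt("%s seen without a VLAN tag", c$id$orig_h);
        }

Unlike the examples above, such a script is tied to the VLAN connection key.
Indexing by the whole ``c$id$ctx`` record keeps scripts independent of the
connection key in use.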

@ -0,0 +1,21 @@
type MyEndpoint: record {
ctx: conn_id_ctx;
a: addr;
};
global talks_with_service: table[MyEndpoint] of set[string] &default_insert=set();
event connection_state_remove(c: connection)
{
local endp = MyEndpoint($ctx=c$id$ctx, $a=c$id$orig_h);
for ( s in c$service )
add talks_with_service[endp][s];
}
event zeek_done()
{
for ( e, es in talks_with_service )
print e, join_string_set(es, ", ");
}

@ -0,0 +1,6 @@
@load base/protocols/conn
event connection_state_remove(c: connection)
{
print c;
}

@ -0,0 +1,7 @@
@load base/protocols/conn
@load base/protocols/http
event connection_state_remove(c: connection)
{
print c;
}

@ -0,0 +1,22 @@
type Service: record {
name: string;
ports: set[port];
rfc: count;
};
function print_service(serv: Service)
{
print fmt("Service: %s(RFC%d)",serv$name, serv$rfc);
for ( p in serv$ports )
print fmt(" port: %s", p);
}
event zeek_init()
{
local dns: Service = [$name="dns", $ports=set(53/udp, 53/tcp), $rfc=1035];
local http: Service = [$name="http", $ports=set(80/tcp, 8080/tcp), $rfc=2616];
print_service(dns);
print_service(http);
}

@ -0,0 +1,41 @@
type Service: record {
name: string;
ports: set[port];
rfc: count;
};
type System: record {
name: string;
services: set[Service];
};
function print_service(serv: Service)
{
print fmt(" Service: %s(RFC%d)",serv$name, serv$rfc);
for ( p in serv$ports )
print fmt(" port: %s", p);
}
function print_system(sys: System)
{
print fmt("System: %s", sys$name);
for ( s in sys$services )
print_service(s);
}
event zeek_init()
{
local server01: System;
server01$name = "morlock";
add server01$services[[ $name="dns", $ports=set(53/udp, 53/tcp), $rfc=1035]];
add server01$services[[ $name="http", $ports=set(80/tcp, 8080/tcp), $rfc=2616]];
print_system(server01);
# local dns: Service = [ $name="dns", $ports=set(53/udp, 53/tcp), $rfc=1035];
# local http: Service = [ $name="http", $ports=set(80/tcp, 8080/tcp), $rfc=2616];
# print_service(dns);
# print_service(http);
}

@ -0,0 +1,22 @@
event zeek_init()
{
local ssl_ports: set[port];
local non_ssl_ports = set( 23/tcp, 80/tcp, 143/tcp, 25/tcp );
# SSH
add ssl_ports[22/tcp];
# HTTPS
add ssl_ports[443/tcp];
# IMAPS
add ssl_ports[993/tcp];
# Check for SMTPS
if ( 587/tcp !in ssl_ports )
add ssl_ports[587/tcp];
for ( i in ssl_ports )
print fmt("SSL Port: %s", i);
for ( i in non_ssl_ports )
print fmt("Non-SSL Port: %s", i);
}

@ -0,0 +1,13 @@
event zeek_init()
{
local samurai_flicks: table[string, string, count, string] of string;
samurai_flicks["Kihachi Okamoto", "Toho", 1968, "Tatsuya Nakadai"] = "Kiru";
samurai_flicks["Hideo Gosha", "Fuji", 1969, "Tatsuya Nakadai"] = "Goyokin";
samurai_flicks["Masaki Kobayashi", "Shochiku Eiga", 1962, "Tatsuya Nakadai" ] = "Harakiri";
samurai_flicks["Yoji Yamada", "Eisei Gekijo", 2002, "Hiroyuki Sanada" ] = "Tasogare Seibei";
for ( [d, s, y, a] in samurai_flicks )
print fmt("%s was released in %d by %s studios, directed by %s and starring %s", samurai_flicks[d, s, y, a], y, s, d, a);
}

@ -0,0 +1,11 @@
event zeek_init()
{
# local samurai_flicks: ...
for ( [d, _, _, _], name in samurai_flicks )
print fmt("%s was directed by %s", name, d);
for ( _, name in samurai_flicks )
print fmt("%s is a movie", name);
}

@ -0,0 +1,19 @@
event zeek_init()
{
# Declaration of the table.
local ssl_services: table[string] of port;
# Initialize the table.
ssl_services = table(["SSH"] = 22/tcp, ["HTTPS"] = 443/tcp);
# Insert one key-value pair into the table.
ssl_services["IMAPS"] = 993/tcp;
# Check if the key "SMTPS" is not in the table.
if ( "SMTPS" !in ssl_services )
ssl_services["SMTPS"] = 587/tcp;
# Iterate over each key in the table.
for ( k in ssl_services )
print fmt("Service Name: %s - Common Port: %s", k, ssl_services[k]);
}

@ -0,0 +1,7 @@
event zeek_init()
{
local v: vector of count = vector(1, 2, 3, 4);
local w = vector(1, 2, 3, 4);
print v;
print w;
}

@ -0,0 +1,15 @@
event zeek_init()
{
local v1: vector of count;
local v2 = vector(1, 2, 3, 4);
v1 += 1;
v1 += 2;
v1 += 3;
v1 += 4;
print fmt("contents of v1: %s", v1);
print fmt("length of v1: %d", |v1|);
print fmt("contents of v2: %s", v2);
print fmt("length of v2: %d", |v2|);
}

@ -0,0 +1,7 @@
event zeek_init()
{
local addr_vector: vector of addr = vector(1.2.3.4, 2.3.4.5, 3.4.5.6);
for ( i in addr_vector )
print mask_addr(addr_vector[i], 18);
}

@ -0,0 +1,7 @@
event zeek_init()
{
local addr_vector: vector of addr = vector(1.2.3.4, 2.3.4.5, 3.4.5.6);
for ( _, a in addr_vector )
print mask_addr(a, 18);
}

@ -0,0 +1,9 @@
const port_list: table[port] of string &redef;
redef port_list += { [6666/tcp] = "IRC"};
redef port_list += { [80/tcp] = "WWW" };
event zeek_init()
{
print port_list;
}

@ -0,0 +1,4 @@
@load base/protocols/http
redef HTTP::default_capture_password = T;

@ -0,0 +1,9 @@
event zeek_init()
{
local a: int;
a = 10;
local b = 10;
if ( a == b )
print fmt("A: %d, B: %d", a, b);
}

@ -0,0 +1,18 @@
# Store the time the previous connection was established.
global last_connection_time: time;
# boolean value to indicate whether we have seen a previous connection.
global connection_seen: bool = F;
event connection_established(c: connection)
{
local net_time: time = network_time();
print fmt("%s: New connection established from %s to %s", strftime("%Y/%m/%d %H:%M:%S", net_time), c$id$orig_h, c$id$resp_h);
if ( connection_seen )
print fmt(" Time since last connection: %s", net_time - last_connection_time);
last_connection_time = net_time;
connection_seen = T;
}

@ -0,0 +1,11 @@
function add_two(i: count): count
{
local added_two = i+2;
print fmt("i + 2 = %d", added_two);
return added_two;
}
event zeek_init()
{
local test = add_two(10);
}

@ -0,0 +1,13 @@
event zeek_init()
{
local test_string = "The quick brown fox jumps over the lazy dog.";
local test_pattern = /quick|lazy/;
if ( test_pattern in test_string )
{
local results = split_string(test_string, test_pattern);
print results[0];
print results[1];
print results[2];
}
}

@ -0,0 +1,10 @@
event zeek_init()
{
local test_string = "equality";
local test_pattern = /equal/;
print fmt("%s and %s %s equal", test_string, test_pattern, test_pattern == test_string ? "are" : "are not");
test_pattern = /equality/;
print fmt("%s and %s %s equal", test_string, test_pattern, test_pattern == test_string ? "are" : "are not");
}

@ -0,0 +1,25 @@
module Conn;
export {
## The record type which contains column fields of the connection log.
type Info: record {
ts: time &log;
uid: string &log;
id: conn_id &log;
proto: transport_proto &log;
service: string &log &optional;
duration: interval &log &optional;
orig_bytes: count &log &optional;
resp_bytes: count &log &optional;
conn_state: string &log &optional;
local_orig: bool &log &optional;
local_resp: bool &log &optional;
missed_bytes: count &log &default=0;
history: string &log &optional;
orig_pkts: count &log &optional;
orig_ip_bytes: count &log &optional;
resp_pkts: count &log &optional;
resp_ip_bytes: count &log &optional;
tunnel_parents: set[string] &log;
};
}

@ -0,0 +1,15 @@
event zeek_init()
{
local subnets = vector(172.16.0.0/20, 172.16.16.0/20, 172.16.32.0/20, [2001:db8:b120::]/64);
local addresses = vector(172.16.4.56, 172.16.47.254, 172.16.1.1, [2001:db8:b120::1]);
for ( a in addresses )
{
for ( s in subnets )
{
if ( addresses[a] in subnets[s] )
print fmt("%s belongs to subnet %s", addresses[a], subnets[s]);
}
}
}

@ -0,0 +1,4 @@
event connection_established(c: connection)
{
print fmt("%s: New connection established from %s to %s\n", strftime("%Y/%m/%d %H:%M:%S", network_time()), c$id$orig_h, c$id$resp_h);
}

@ -0,0 +1,115 @@
.. _script-event-groups:
============
Event Groups
============
Zeek supports enabling and disabling event and hook handlers at runtime
through event groups. Although the feature is named event groups, it also covers
hook handlers because of their structural similarity to event handlers.
Event and hook handlers can be part of multiple event groups. An event or
hook handler is disabled if any of the groups it's part of is disabled.
Conversely, event and hook handlers are enabled when all groups they
are part of are enabled. When Zeek starts, all event groups are implicitly
enabled. An event or hook handler that is not part of any event group is
always enabled.
Currently, two types of event groups exist: attribute based and module based.
Attribute Based Event Group
===========================
Attribute based event groups come into existence when an event or hook
handler has a :zeek:attr:`&group` attribute. The value of the group
attribute is a string identifying the group. There's a single global namespace
for attribute based event groups. Two event handlers in different files
or modules, but with the same group attribute value, are part of the same group.
Event and hook handlers can have more than one group attribute.
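
As a minimal sketch of these rules, the following script attaches two group
attributes to one handler and disables only one of the groups. The group names
``group-a`` and ``group-b`` are made up for illustration:

.. code-block:: zeek

    event zeek_init()
        {
        # Disabling a single group is enough to disable every handler in it.
        disable_event_group("group-b");
        }

    # Member of both groups: disabled, because group-b is disabled.
    event zeek_done() &group="group-a" &group="group-b"
        {
        print "handler in group-a and group-b";
        }

    # Member of group-a only: still enabled.
    event zeek_done() &group="group-a"
        {
        print "handler in group-a only";
        }

Going by the rules described above, only the second ``zeek_done`` handler
should run when Zeek terminates.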
.. literalinclude:: event_groups_attr_01.zeek
:caption:
:language: zeek
:linenos:
:tab-width: 4
This example shows ``http_request``, ``http_header`` and ``http_reply`` event
handlers, all with a group attribute of ``http-print-debugging``.
When running Zeek against a pcap containing a single HTTP transaction,
the output is as follows.
.. code-block:: console
$ zeek -r traces/get.trace ./event_groups_attr_01.zeek
HTTP request: GET /download/CHANGES.bro-aux.txt (141.142.228.5->192.150.187.43)
HTTP header : User-Agent=Wget/1.14 (darwin12.2.0) (141.142.228.5->192.150.187.43)
HTTP reply: 200/OK version 1.1 (192.150.187.43->141.142.228.5)
HTTP header : Server=Apache/2.4.3 (Fedora) (192.150.187.43->141.142.228.5)
Such debugging functionality would generally only be enabled on demand. Extending
the above script, we introduce an option and a change handler function from the
:ref:`configuration framework <framework-configuration>`
to enable and disable the ``http-print-debugging`` event group at runtime.
.. literalinclude:: event_groups_attr_02.zeek
:caption:
:language: zeek
:linenos:
:tab-width: 4
Whenever the option ``Debug::http_print_debugging`` is set to ``T``,
:zeek:see:`enable_event_group` is invoked to ensure the ``http-print-debugging``
group is enabled. Conversely, when the option is set to ``F``,
:zeek:see:`disable_event_group` disables all event handlers in the group
``http-print-debugging``.
The very same behavior can be achieved by testing the ``Debug::http_print_debugging``
option within the respective event handlers using an ``if`` statement and
early return. In contrast, event groups work in a more declarative way.
Further, when event handlers are disabled via event groups, their implementation
is never invoked, which makes this a more performant way to short-circuit
execution.
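
For comparison, the ``if``-based alternative could look like the following
sketch. It reuses the ``Debug::http_print_debugging`` option from above and
shows only the ``http_request`` handler; the other handlers would need the same
check:

.. code-block:: zeek

    module Debug;

    export {
        option http_print_debugging = F;
    }

    event http_request(c: connection, method: string, original_URI: string, unescaped_URI: string, version: string)
        {
        # The handler is still invoked for every request and returns early
        # while debugging is switched off.
        if ( ! http_print_debugging )
            return;

        print fmt("HTTP request: %s %s (%s->%s)", method, original_URI, c$id$orig_h, c$id$resp_h);
        }

No change handler is needed in this variant, at the cost of invoking every
handler and repeating the check in each of them.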
Module Based Event Group
========================
Besides attribute based event groups, Zeek supports implicit module based
event groups. Event and hook handlers are part of an event group that
represents the module in which they were implemented. The builtin functions
:zeek:see:`disable_module_events` and :zeek:see:`enable_module_events` can
be used to disable and enable all event and hook handlers within modules.
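
A minimal sketch of this mechanism follows. The module name ``Demo`` is
hypothetical, and the sketch assumes that a handler is assigned to the group of
the module that is active where the handler is defined:

.. code-block:: zeek

    module Demo;

    # Implemented inside module Demo, so part of the "Demo" module group.
    event zeek_done()
        {
        print "zeek_done handler implemented in module Demo";
        }

    module GLOBAL;

    event zeek_init()
        {
        # Disables all event and hook handlers implemented in module Demo.
        disable_module_events("Demo");
        }

Under that assumption, the ``zeek_done`` handler no longer runs unless
``enable_module_events("Demo")`` is called before Zeek terminates.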
An interesting idea here is to implement enabling and disabling of Zeek packages
at runtime. For example, the `CommunityID <https://github.com/corelight/zeek-community-id>`_
package implements its functionality in the ``CommunityID`` and
``CommunityID::Notice`` modules. The `JA3 <https://github.com/salesforce/ja3>`_
package implements its event handlers in the ``JA3`` and ``JA3_Server`` modules.
.. literalinclude:: event_groups_module_01.zeek
:caption:
:language: zeek
:linenos:
:tab-width: 4
The above script implements toggling of Zeek package functionality at
runtime via the options ``Packages::ja3_enabled`` and ``Packages::community_id_enabled``.
While for most packages and deployments a Zeek restart is an acceptable
way to disable or enable a package - generally this isn't a regular operation -
module based event groups provide a powerful primitive to support runtime
toggling of scripting functionality.
.. note::
A caveat around the above example: The JA3 package builds up state based
on the :zeek:see:`ssl_extension` events from SSL ClientHello and ServerHello
messages. When the JA3 event handlers are enabled right during processing
of these events, the resulting JA3 hash might be based on a partial list
of extensions only.
While all :zeek:see:`ssl_extension` handlers are processed jointly
for each instance of the event, generally state build up and
dynamic enabling and disabling may need careful consideration.

@ -0,0 +1,19 @@
event http_request(c: connection, method: string, original_URI: string, unescaped_URI: string, version: string) &group="http-print-debugging"
{
print fmt("HTTP request: %s %s (%s->%s)", method, original_URI, c$id$orig_h, c$id$resp_h);
}
event http_header(c: connection, is_orig: bool, original_name: string, name: string, value: string) &group="http-print-debugging"
{
if ( name != "USER-AGENT" && name != "SERVER" )
return;
local snd = is_orig ? c$id$orig_h : c$id$resp_h;
local rcv = is_orig ? c$id$resp_h : c$id$orig_h;
print fmt("HTTP header : %s=%s (%s->%s)", original_name, value, snd, rcv);
}
event http_reply(c: connection, version: string, code: count, reason: string) &group="http-print-debugging"
{
print fmt("HTTP reply: %s/%s version %s (%s->%s)", code, reason, version, c$id$resp_h, c$id$orig_h);
}

@ -0,0 +1,46 @@
@load base/frameworks/config
redef Config::config_files += { "./myconfig.dat" };
module Debug;
export {
option http_print_debugging = F;
}
event http_request(c: connection, method: string, original_URI: string, unescaped_URI: string, version: string) &group="http-print-debugging"
{
print fmt("HTTP request: %s %s (%s->%s)", method, original_URI, c$id$orig_h, c$id$resp_h);
}
event http_header(c: connection, is_orig: bool, original_name: string, name: string, value: string) &group="http-print-debugging"
{
if ( name != "USER-AGENT" && name != "SERVER" )
return;
local snd = is_orig ? c$id$orig_h : c$id$resp_h;
local rcv = is_orig ? c$id$resp_h : c$id$orig_h;
print fmt("HTTP header : %s=%s (%s->%s)", original_name, value, snd, rcv);
}
event http_reply(c: connection, version: string, code: count, reason: string) &group="http-print-debugging"
{
print fmt("HTTP reply : %s/%s version %s (%s->%s)", code, reason, version, c$id$resp_h, c$id$orig_h);
}
event zeek_init()
{
Option::set_change_handler("Debug::http_print_debugging", function(id: string, new_value: bool): bool {
print id, new_value;
if ( new_value )
enable_event_group("http-print-debugging");
else
disable_event_group("http-print-debugging");
return new_value;
});
# Trigger the change handler, once.
Config::set_value("Debug::http_print_debugging", http_print_debugging);
}

@ -0,0 +1,47 @@
@load base/frameworks/config
@load ja3
@load zeek-community-id
@load zeek-community-id/notice
redef Config::config_files += { "./myconfig.dat" };
module Packages;
export {
# All packages off by default.
option community_id_enabled = F;
option ja3_enabled = F;
}
event zeek_init()
{
local package_change_handler = function(id: string, new_value: bool): bool {
local modules: set[string];
if ( id == "Packages::community_id_enabled" )
modules = ["CommunityID", "CommunityID::Notice"];
else if ( id == "Packages::ja3_enabled" )
modules = ["JA3", "JA3_Server"];
else
{
Reporter::error(fmt("Unknown option: %s", id));
return new_value;
}
# Toggle the modules.
for ( m in modules )
if ( new_value )
enable_module_events(m);
else
disable_module_events(m);
return new_value;
};
Option::set_change_handler("Packages::community_id_enabled", package_change_handler);
Option::set_change_handler("Packages::ja3_enabled", package_change_handler);
Config::set_value("Packages::community_id_enabled", community_id_enabled);
Config::set_value("Packages::ja3_enabled", ja3_enabled);
}

@ -0,0 +1,19 @@
module Factor;
function factorial(n: count): count
{
if ( n == 0 )
return 1;
else
return ( n * factorial(n - 1) );
}
event zeek_init()
{
local numbers: vector of count = vector(1, 2, 3, 4, 5, 6, 7, 8, 9, 10);
for ( n in numbers )
print fmt("%d", factorial(numbers[n]));
}

@ -0,0 +1,35 @@
module Factor;
export {
# Append the value LOG to the Log::ID enumerable.
redef enum Log::ID += { LOG };
# Define a new type called Factor::Info.
type Info: record {
num: count &log;
factorial_num: count &log;
};
}
function factorial(n: count): count
{
if ( n == 0 )
return 1;
else
return ( n * factorial(n - 1) );
}
event zeek_init()
{
# Create the logging stream.
Log::create_stream(LOG, [$columns=Info, $path="factor"]);
}
event zeek_done()
{
local numbers: vector of count = vector(1, 2, 3, 4, 5, 6, 7, 8, 9, 10);
for ( n in numbers )
Log::write( Factor::LOG, [$num=numbers[n],
$factorial_num=factorial(numbers[n])]);
}

@ -0,0 +1,45 @@
module Factor;
export {
redef enum Log::ID += { LOG };
type Info: record {
num: count &log;
factorial_num: count &log;
};
}
function factorial(n: count): count
{
if ( n == 0 )
return 1;
else
return (n * factorial(n - 1));
}
event zeek_done()
{
local numbers: vector of count = vector(1, 2, 3, 4, 5, 6, 7, 8, 9, 10);
for ( n in numbers )
Log::write( Factor::LOG, [$num=numbers[n],
$factorial_num=factorial(numbers[n])]);
}
function mod5(id: Log::ID, path: string, rec: Factor::Info) : string
{
if ( rec$factorial_num % 5 == 0 )
return "factor-mod5";
else
return "factor-non5";
}
event zeek_init()
{
Log::create_stream(LOG, [$columns=Info, $path="factor"]);
local filter: Log::Filter = [$name="split-mod5s", $path_func=mod5];
Log::add_filter(Factor::LOG, filter);
Log::remove_filter(Factor::LOG, "default");
}

@ -0,0 +1,50 @@
module Factor;
export {
redef enum Log::ID += { LOG };
type Info: record {
num: count &log;
factorial_num: count &log;
};
global log_factor: event(rec: Info);
}
function factorial(n: count): count
{
if ( n == 0 )
return 1;
else
return (n * factorial(n - 1));
}
event zeek_init()
{
Log::create_stream(LOG, [$columns=Info, $ev=log_factor, $path="factor"]);
}
event zeek_done()
{
local numbers: vector of count = vector(1, 2, 3, 4, 5, 6, 7, 8, 9, 10);
for ( n in numbers )
Log::write( Factor::LOG, [$num=numbers[n],
$factorial_num=factorial(numbers[n])]);
}
function mod5(id: Log::ID, path: string, rec: Factor::Info) : string
{
if ( rec$factorial_num % 5 == 0 )
return "factor-mod5";
else
return "factor-non5";
}
event zeek_init()
{
local filter: Log::Filter = [$name="split-mod5s", $path_func=mod5];
Log::add_filter(Factor::LOG, filter);
Log::remove_filter(Factor::LOG, "default");
}

@ -0,0 +1,7 @@
@load policy/protocols/ssh/interesting-hostnames.zeek
hook Notice::policy(n: Notice::Info)
{
if ( n$note == SSH::Interesting_Hostname_Login )
add n$actions[Notice::ACTION_EMAIL];
}

@ -0,0 +1,7 @@
@load policy/protocols/ssl/expiring-certs.zeek
hook Notice::policy(n: Notice::Info)
{
if ( n$note == SSL::Certificate_Expires_Soon )
n$suppress_for = 12hrs;
}

@ -0,0 +1,7 @@
@load policy/protocols/ssh/interesting-hostnames.zeek
@load base/protocols/ssh/
redef Notice::emailed_types += {
SSH::Interesting_Hostname_Login
};

@ -0,0 +1,6 @@
@load policy/protocols/ssh/interesting-hostnames.zeek
@load base/protocols/ssh/
redef Notice::type_suppression_intervals += {
[SSH::Interesting_Hostname_Login] = 1day,
};

@ -0,0 +1,7 @@
module HTTP;
export {
## This setting changes if passwords used in Basic-Auth are captured or
## not.
const default_capture_password = F &redef;
}

15
doc/scripting/index.rst Normal file
@ -0,0 +1,15 @@
=========================
Introduction to Scripting
=========================
.. toctree::
:maxdepth: 2
basics
usage
event-groups
conn-id-ctx
tracing-events
optimization
javascript

@ -0,0 +1,509 @@
.. _javascript:
==========
JavaScript
==========
.. versionadded:: 6.0
.. note::
Link to external `ZeekJS documentation`_.
.. note::
The JavaScript integration does not provide Zeek's typical backwards
compatibility guarantees at this point. The plugin itself is at semantic
version 0.9.1 at the time of writing, meaning the API is not stable.
That said, we'll avoid unnecessary breakage.
Preamble
========
In the scope of integrating with external systems, Zeek can be extended by
:ref:`implementing C++ plugins <writing-plugins>` or using the :zeek:see:`system`
function to call external programs from Zeek scripts. The :ref:`framework-input`
can be leveraged for data ingestion (with the :ref:`raw reader <input-raw-reader>`
providing flexibility to consume input from external programs as events).
The :ref:`broker-framework` is popular for exchanging events between
Zeek and an external program using WebSockets.
The external program sometimes solely acts as a proxy between Zeek and another
external system.
JavaScript integration adds to the above by enabling Zeek to load JavaScript
code directly, thereby allowing developers to use its rich ecosystem of
built-in and third-party libraries directly within Zeek.
If you previously wanted to start an `HTTP server`_ within Zeek or record Zeek
event data on-the-fly to a `Redis`_ database, or got scared looking at
:zeek:see:`ActiveHTTP`'s implementation (or annoyed that it eats all newlines
in HTTP responses), you may want to give JavaScript a go!
Built-in Plugin
===============
The external `ZeekJS`_ plugin is included with Zeek as an optional built-in plugin.
When `Node.js`_ development headers and libraries are found when building Zeek
from source, the plugin is automatically included.
If Node.js is installed in a non-standard location, ``-D NODEJS_ROOT_DIR`` has
to be provided to ``./configure``.
Assuming an installation of Node.js in ``/opt/node-19.8``, the command to
use is as follows. Discovered headers and libraries will be reported in the
output.
On Linux distributions providing Node.js development packages
(Ubuntu 22.10, Fedora, Debian bookworm) the extra ``-D NODEJS_ROOT_DIR``
is not required.
.. code-block:: console
$ ./configure -D NODEJS_ROOT_DIR:string=/opt/node-19.8
...
-- Looking for __system_property_get
-- Looking for __system_property_get - not found
-- Found Nodejs: /opt/node-19.8/include (found version "19.8.1")
-- version: 19.8.1
-- libraries: /opt/node-19.8/lib/libnode.so.111
-- uv.h: /opt/node-19.8/include/node
-- v8config.h: /opt/node-19.8/include/node
-- Building in plugin: zeekjs (/home/user/zeek/auxil/zeekjs)
...
$ make -j
...
$ sudo make install
To test if the plugin is available on a given Zeek installation, run ``zeek -N Zeek::JavaScript``.
The ``zeek`` executable will also be dynamically linked against ``libnode.so``.
.. code-block:: console
$ zeek -NN Zeek::JavaScript
Zeek::JavaScript - Experimental JavaScript support for Zeek (built-in)
Implements LoadFile (priority 0)
$ ldd $(which zeek) | grep libnode
libnode.so.111 => /opt/node-19.8/lib/libnode.so.111 (0x00007f281aa25000)
The main hooking mechanism used by the plugin is loading files with ``.js`` and ``.cjs`` suffixes.
If no such files are provided on the command-line or via ``@load``, neither
the Node.js environment nor the V8 JavaScript engine will be initialized and there
will be no runtime overhead of having the plugin available. When JavaScript
code is loaded, additional overhead may come from processing JavaScript's IO
loop or running garbage collection.
Hello World
===========
When JavaScript is executed by Zeek, a ``zeek`` object is added to
the JavaScript's global namespace.
This object can be used to register event or hook handlers, raise new Zeek
events, invoke Zeek-side functions, etc. This is similar to the global
``document`` object in a browser, but for Zeek functionality.
The API documentation for the global ``zeek`` object created is available
in the `ZeekJS documentation`_.
.. note: External due to requiring npm/jsdoc during building.
The following script calls the :zeek:see:`zeek_version` built-in
function and uses JavaScript's ``console.log()`` for printing a Hello message
within a ``zeek_init`` handler:
.. literalinclude:: js/hello.js
:caption: hello.js
:language: javascript
.. code-block:: console
$ zeek js/hello.js
Hello, Zeek 6.0.0!
Execution Model
===============
There are two ways in which Zeek executes JavaScript code.
First, JavaScript event or hook handlers are added as additional ``Func::Body``
instances to the respective ``Func`` objects. These extra bodies
point to instances of a custom ``Stmt`` subclass with tag ``STMT_EXTERN``.
The ``Stmt::Exec()`` implementation of this class calls the listener function,
a ``v8::Function``, registered through ``zeek.on()``.
When Zeek executes all bodies of an event or hook handler during ``Func::Invoke()``,
some bodies execute JavaScript functions instead of Zeek script statements.
This approach allows registering JavaScript listener functions using Zeek's priority
mechanism. Further, changes done by JavaScript code to global Zeek variables or
record fields are visible to Zeek script and vice versa. In summary, execution
of Zeek and JavaScript code is interleaved when executing event or hook handlers.
Second, the Node.js IO loop (`libuv`_) is registered as an ``IOSource`` with
Zeek's main loop. When there's any IO activity in Node.js, libuv's backend
file descriptor becomes ready, waking up the Zeek main loop. Zeek then transfers
control through the registered ``IOSource`` to the JavaScript plugin which
runs the libuv IO loop until there's no more work to be done. At this point,
the plugin yields control back to Zeek's main loop, draining any queued events,
processing timers, or simply waiting for the next network packet to arrive.
From the above it follows that there is no parallel JavaScript code execution
happening in a separate thread. Zeek script and JavaScript execute interleaved
on Zeek's main thread, driven by the main loop's logic. This also implies that
long running JavaScript code will block Zeek's main loop and Zeek script
execution. This is no different than what would happen in a web browser or an
asynchronous Node.js network server, however, and the same applies to a long
running Zeek script event handler.
Types
=====
JavaScript's types are not as rich as Zeek's, and JavaScript is dynamically
typed. As of now, most atomic types like :zeek:see:`addr` or :zeek:see:`subnet` are created as JavaScript strings or another primitive type.
For example, values of type :zeek:see:`count` become JavaScript `BigInt`_ values.
:zeek:see:`time` and :zeek:see:`interval` are converted to numbers representing
seconds with :zeek:see:`time` representing the Unix timestamp.
A list of type conversions implemented is presented in the following table.
.. list-table:: Type Conversions
* - Zeek
- JavaScript
* - bool
- boolean (true, false)
* - count
- `BigInt`_
* - int
- `Number`_
* - double
- `Number`_
* - interval
- `Number`_ as seconds
* - time
- `Number`_ as unix timestamp in seconds
* - string
- string (latin1 encoding assumed)
* - enum
- string
* - addr
- string
* - subnet
- string
* - port
- `Object`_ with ``port`` and ``proto`` properties and a custom ``toJSON()`` method only returning the port
* - vector
- Copied as `Array`_, see :ref:`below <js-set-and-vector>`
* - set
- Copied as `Array`_, see :ref:`below <js-set-and-vector>`
* - table
- `Object`_ holding a reference to a Zeek table value
* - record
- `Object`_ holding a reference to a Zeek record value
Some type conversions are not implemented. Values of such types cause an error
message and appear as ``null`` in JavaScript. :zeek:see:`pattern` values are one
such example.
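The practical consequences of these conversions can be sketched with plain JavaScript values shaped like the ones in the table. This is a stand-alone illustration that runs under Node.js without Zeek; the variable names and sample values are made up for the example.

```javascript
// A Zeek count arrives as a BigInt. Mixing BigInt and Number in arithmetic
// throws a TypeError, so convert explicitly before calculating.
const origBytes = 1500n;
let mixingFails = false;
try {
  origBytes / 1024; // throws: BigInt and Number cannot be mixed
} catch (e) {
  mixingFails = e instanceof TypeError;
}
const kib = Number(origBytes) / 1024;

// A Zeek time arrives as a Number holding seconds since the Unix epoch,
// while the Date constructor expects milliseconds.
const ts = 1685121096.892404;
const iso = new Date(ts * 1000).toISOString();

// A Zeek interval is a Number of seconds.
const timeoutSecs = 3600;
const timeoutMinutes = timeoutSecs / 60;

console.log(mixingFails, kib, iso, timeoutMinutes);
```

Converting with ``Number()`` is only lossless for values up to ``Number.MAX_SAFE_INTEGER``; for larger counts, staying in BigInt arithmetic avoids silent rounding.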
.. note::
These type conversions may change in the future or become configurable via
callbacks.
Record values
-------------
Record values are passed by reference from Zeek to JavaScript. That is,
JavaScript objects keep a pointer to the Zeek record they represent.
Holding a JavaScript object referencing a Zeek record value
will keep it alive within Zeek even if Zeek itself does not reference
it anymore. Updates to fields in Zeek become visible within JavaScript.
Updates to properties of such objects in JavaScript become visible in Zeek.
On the other hand, normal JavaScript objects (``{}`` or ``Object()``) are passed
from JavaScript to Zeek as new Zeek record values. Changes
to the original JavaScript object will not be reflected within Zeek.
In the example below, the ``intel_item`` JavaScript object will be converted to
a new :zeek:see:`Intel::Item` Zeek record which is then
passed to the :zeek:see:`Intel::insert` function. Modifying properties of
``intel_item`` after it has been inserted to the Intel data store has
no impact.
.. literalinclude:: js/intel-insert.js
:caption: intel-insert.js
:language: javascript
.. note::
The background to this is that Zeek's base has no knowledge of anything
JavaScript related, while the ZeekJS plugin does have intimate knowledge
about Zeek values and internals.
Table values
------------
Table values are treated very similarly to records. JavaScript objects representing
table values keep a reference to the Zeek value. Accessing multi-index Zeek tables
from JavaScript is not supported, however, as there's no easy way to translate
Zeek's multi-value keys to properties or map keys in JavaScript.
Global tables can be modified from JavaScript directly through the ``zeek.global_vars`` object.
The following script shows how to change the content
of :zeek:see:`Conn::analyzer_inactivity_timeouts` in JavaScript.
The update to the table becomes visible on the Zeek side and will be
in effect for future connections.
.. literalinclude:: js/global-vars.js
:caption: global-vars.js
:language: javascript
.. code-block:: console
$ zeek global-vars.js -e 'event zeek_init() &priority=-5 { print "zeek", Conn::analyzer_inactivity_timeouts; }'
js {
[AllAnalyzers::ANALYZER_ANALYZER_SSH]: 42,
[AllAnalyzers::ANALYZER_ANALYZER_FTP]: 3600
}
zeek, {
[AllAnalyzers::ANALYZER_ANALYZER_SSH] = 42.0 secs,
[AllAnalyzers::ANALYZER_ANALYZER_FTP] = 1.0 hr
}
.. _js-set-and-vector:
Set and vector values
---------------------
The :zeek:see:`set` and :zeek:see:`vector` types are currently copied from
Zeek to JavaScript as `Array`_ objects. These objects don't reference the
original set or vector on the Zeek side. This means that mutating the
JavaScript-side objects via ``Array`` methods does not modify the
Zeek-side value. However, objects referencing Zeek record values within
these arrays are mutable.
This mainly becomes relevant if you want to modify state attached to
a connection from within JavaScript. Re-assigning ``c.service`` below works
as expected. The ``c.service.push()`` approach, on the other hand, would
not change the set on the Zeek side.
.. literalinclude:: js/connection-service.js
:caption: connection-service.js
:language: javascript
.. code-block:: console
$ zeek -r ../../traces/get.trace ./connection-service.js
service-from-js,http
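The difference between mutating a copy and re-assigning the field can be reproduced without Zeek. The sketch below uses a hypothetical stand-in, ``makeRecordModel``, that models the documented copy semantics with a getter and setter. Whether ZeekJS creates a fresh copy on every property access is an assumption of this model, not something the plugin documentation states.

```javascript
// Hypothetical stand-in for a Zeek record with a set or vector field.
// Reads hand out a copy as an Array; only assignment writes back to the
// simulated "Zeek side".
function makeRecordModel(initial) {
  let zeekSide = [...initial];
  return {
    get service() { return [...zeekSide]; },
    set service(values) { zeekSide = [...values]; },
  };
}

const c = makeRecordModel(['http']);

// push() mutates a temporary copy that is discarded right away.
c.service.push('service-from-js');
const afterPush = c.service;

// Re-assignment replaces the value on the simulated "Zeek side".
c.service = c.service.concat('service-from-js');
const afterAssign = c.service;

console.log(afterPush, afterAssign);
```

In this model only the re-assignment is observable afterwards, which mirrors why ``connection-service.js`` uses ``concat()`` followed by an assignment.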
.. note::
The current approach was mostly chosen for implementation simplicity
and the assumption that modifying Zeek side vectors or sets from JavaScript
is an edge case. This may change in the future.
Any and zeek.as()
-----------------
Some of Zeek's functions take a value of type :zeek:see:`any`. This makes it
impossible to implicitly convert from a JavaScript type to the appropriate
Zeek type.
The function ``zeek.as()`` can be used within JavaScript to create an
object from a Zeek type name and a JavaScript value. That object
references a Zeek value. When it is passed to a function taking an
:zeek:see:`any` parameter, the plugin passes the referenced Zeek value
through directly and the call succeeds.
.. literalinclude:: js/zeek-as.js
:caption: zeek-as.js
:language: javascript
The first call to ``zeek.invoke()`` throws an exception due to the failing
type conversion; the second one succeeds.
.. code-block:: console
$ zeek -B plugin-Zeek-JavaScript zeek-as.js
error: Unable to convert JS value '192.168.0.0/16' of type string to Zeek type any
good: type_name is subnet
Debugging
---------
There might be limitations, surprises and bugs with the type conversions.
If Zeek was built with debugging enabled, the ``plugin-Zeek-JavaScript``
debug stream may provide some helpful clues.
.. code-block:: console
$ ZEEK_DEBUG_LOG_STDERR=1 zeek -B plugin-Zeek-JavaScript hello.js
0.000000/1685018723.447965 [plugin Zeek::JavaScript] Hooked .js file=hello.js (./hello.js)
0.000000/1685018723.457376 [plugin Zeek::JavaScript] Hooked 1 .js files: Initializing!
0.000000/1685018723.457639 [plugin Zeek::JavaScript] Init: Node initialized. Compiled with v19.8.1
0.000000/1685018723.458774 [plugin Zeek::JavaScript] Init: V8 initialized. Version 10.8.168.25-node.12
0.000000/1685018723.539618 [plugin Zeek::JavaScript] ExecuteAndWaitForInit: init() result=object 1
0.000000/1685018723.539644 [plugin Zeek::JavaScript] ExecuteAndWaitForInit: zeek_javascript_init returned promise, state=0 - running JS loop
0.000000/1685018723.551058 [plugin Zeek::JavaScript] Registering zeek_init priority=0, js_eh=0x603001cac710
0.000000/1685018723.551120 [plugin Zeek::JavaScript] Registered zeek_init
1685018723.601898/1685018723.621106 [plugin Zeek::JavaScript] ZeekInvoke: invoke for zeek_version
1685018723.601898/1685018723.621177 [plugin Zeek::JavaScript] Invoke zeek_version with 0 args
1685018723.601898/1685018723.621212 [plugin Zeek::JavaScript] ZeekInvoke: invoke for zeek_version returned: Hello, Zeek 6.0.0-dev.636-debug!
1685018723.644485/1685018723.644726 [plugin Zeek::JavaScript] Done...
1685018723.644485/1685018723.644754 [plugin Zeek::JavaScript] Done: uv_loop not alive anymore on iteration 0
Examples
========
HTTP API
--------
The following JavaScript file provides an HTTP API for generically invoking
Zeek functions and Zeek events using ``curl``. It is only 60 lines of vanilla
Node.js JavaScript with limited error handling, yet it allows for experiments
and runtime reconfiguration of a Zeek process that are hard to achieve with
Zeek-provided functionality alone. Essentially, it uses only ``zeek.event``
and ``zeek.invoke`` and relies on implicit type conversion to mostly do
the right thing.
The two supported endpoints are ``/events/<event_name>``
and ``/functions/<function_name>``. Arguments are passed in an ``args`` array
as JSON in the POST request's body.
.. literalinclude:: js/api.zeek
:caption: api.zeek
:language: zeek
.. literalinclude:: js/api.js
:caption: api.js
:language: javascript
.. code-block:: console
$ zeek -C -i lo ./api.zeek
Listening on 127.0.0.1:8080...
listening on lo
As a first example, the :zeek:see:`get_net_stats` built-in function is
invoked and returns the current monitoring statistics in response.
.. code-block:: console
$ curl -XPOST http://localhost:8080/functions/get_net_stats
{
"result": {
"pkts_recvd": 3558,
"pkts_dropped": 0,
"pkts_link": 7126,
"bytes_recvd": 27982155
}
}
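The fields in this response are Zeek :zeek:see:`count` values, which arrive in JavaScript as `BigInt`_. ``JSON.stringify()`` throws on BigInt by default, which is why ``api.js`` patches ``BigInt.prototype.toJSON``. Its ``parseInt()`` based conversion loses precision for values beyond ``Number.MAX_SAFE_INTEGER``. The following is a sketch of one possible alternative, not what ``api.js`` does: a replacer function, here named ``bigintReplacer`` for illustration, that falls back to a decimal string for large values.

```javascript
// JSON.stringify() throws on BigInt values unless told how to serialize them.
let stringifyFails = false;
try {
  JSON.stringify({ pkts_recvd: 3558n });
} catch (e) {
  stringifyFails = e instanceof TypeError;
}

// Illustrative replacer: emit a Number when the value fits without loss of
// precision, otherwise fall back to a decimal string.
const bigintReplacer = (key, value) => {
  if (typeof value !== 'bigint') return value;
  const asNumber = Number(value);
  return Number.isSafeInteger(asNumber) ? asNumber : value.toString();
};

// Made-up sample data: one small count and the largest 64-bit count.
const stats = { pkts_recvd: 3558n, huge: 2n ** 64n - 1n };
const json = JSON.stringify(stats, bigintReplacer);
console.log(json);
```

The trade-off is that clients then see either a JSON number or a JSON string for the same field, depending on its magnitude, so a consumer has to accept both.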
Posting to ``/events/MyAPI::print_msg`` raises the ``MyAPI::print_msg`` event
implemented in the ``api.zeek`` file.
.. code-block:: console
$ curl -4 --data-raw '{"args": ["Hello Zeek!"]}' http://localhost:8080/events/MyAPI::print_msg
{}
# The Zeek process will output:
ZEEK, print_msg, 1685121096.892404, Hello Zeek!
Analyzers can also be disabled (and enabled) at runtime, here by
calling :zeek:see:`Analyzer::disable_analyzer` for the SSL analyzer.
.. code-block:: console
$ curl -XPOST --data '{"args": ["AllAnalyzers::ANALYZER_ANALYZER_SSL"]}' localhost:8080/functions/Analyzer::disable_analyzer
{
"result": true
}
.. todo::
Using ``Analyzer::ANALYZER_SSL`` is currently not possible due to
:zeek:see:`Analyzer::disable_analyzer` taking an :zeek:see:`AllAnalyzers::Tag`
and the enum names are different.
As a fairly advanced example, creating a new :zeek:see:`Log::Filter` instance
for the :zeek:see:`Conn::LOG` stream at runtime using :zeek:see:`Log::add_filter`
is possible. Removal works, too.
.. code-block:: console
$ curl -XPOST --data '{"args": ["Conn::LOG", {"name": "my-conn-rotate", "path": "my-conn-rotate", "include": ["ts", "id.orig_h", "id.resp_h", "history"], "interv": 10}]}' \
localhost:8080/functions/Log::add_filter
{
"result": true
}
$ curl -XPOST --data '{"args": ["Conn::LOG", "my-conn-rotate"]}' localhost:8080/functions/Log::remove_filter
{
"result": true
}
This API can also be used to invoke :zeek:see:`terminate`, so you want to be
careful deploying this in an actual production environment:
.. code-block:: console
$ curl -XPOST --data '{"args": []}' localhost:8080/functions/terminate
{
"result": true
}
# Zeek is now stopping with:
1685121663.854714 <params>, line 1: received termination signal
1685121663.854714 <params>, line 1: 53 packets received on interface lo, 0 (0.00%) dropped, 0 (0.00%) not processed
More
----
More examples can be found in the `ZeekJS documentation`_
and `repository <https://github.com/corelight/zeekjs/tree/main/examples>`_.
TypeScript
==========
`TypeScript`_ adds typing to JavaScript. While ZeekJS has no TypeScript awareness,
there's nothing preventing you from using it. Use ``tsc`` for type checking and
provide the produced ``.js`` files to Zeek.
You may need a ``zeek.d.ts`` file for the ``zeek`` object. A bare
`zeek.d.ts <https://github.com/corelight/zeekjs/pull/20/>`_ file has been
tested, but not integrated with ZeekJS at this point.
.. _ZeekJS documentation: https://zeekjs.readthedocs.io/en/latest/
.. _Node.js: https://nodejs.org/en
.. _ZeekJS: https://github.com/corelight/zeekjs
.. _BigInt: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/BigInt
.. _Number: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Number
.. _Array: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array
.. _Object: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Object
.. _HTTP server: https://nodejs.org/api/http.html#httpcreateserveroptions-requestlistener
.. _Redis: https://redis.io/
.. _TypeScript: https://www.typescriptlang.org/
.. _libuv: https://github.com/libuv/libuv

60
doc/scripting/js/api.js Normal file

@ -0,0 +1,60 @@
// api.js
//
// HTTP API allowing to invoke any Zeek events and functions using a simple JSON payload.
//
// Triggering an intel match (this will log to intel.log)
//
// $ curl --data-raw '{"args": [{"indicator": "50.3.2.1", "indicator_type": "Intel::ADDR", "where":"Intel::IN_ANYWHERE"}, []]}' \
// http://localhost:8080/events/Intel::match
//
// Calling a Zeek function:
//
// $ curl -XPOST --data '{"args": [1000]}' localhost:8080/functions/rand
// {
// "result": 730
// }
//
const http = require('node:http');
// Light-weight safe-json-stringify replacement.
BigInt.prototype.toJSON = function () { return parseInt(this.toString()); };
const handleCall = (cb, req, res) => {
const name = req.url.split('/').at(-1);
const body = [];
req.on('data', (chunk) => {
body.push(chunk);
}).on('end', () => {
try {
const parsed = JSON.parse(Buffer.concat(body).toString() || '{}');
const args = parsed.args || [];
const result = cb(name, args);
res.writeHead(202);
return res.end(`${JSON.stringify({ result: result }, null, 2)}\n`);
} catch (err) {
console.error(`error: ${err}`);
res.writeHead(400);
return res.end(`${JSON.stringify({ error: err.toString() })}\n`);
}
});
};
const server = http.createServer((req, res) => {
if (req.method === 'POST') {
if (req.url.startsWith('/events/')) {
return handleCall(zeek.event, req, res);
} else if (req.url.startsWith('/functions/')) {
return handleCall(zeek.invoke, req, res);
}
}
res.writeHead(404);
return res.end();
});
const host = process.env.API_HOST || '127.0.0.1';
const port = parseInt(process.env.API_PORT || 8080, 10);
server.listen(port, host, () => {
console.log(`Listening on ${host}:${port}...`);
});

14
doc/scripting/js/api.zeek Normal file

@ -0,0 +1,14 @@
## api.zeek
##
## Sample events to be invoked by api.js
module MyAPI;
export {
global print_msg: event(msg: string, ts: time &default=network_time());
}
event MyAPI::print_msg(msg: string, ts: time) {
print "ZEEK", "print_msg", ts, msg;
}
@load ./api.js


@ -0,0 +1,9 @@
// connection-service.js
zeek.on('connection_state_remove', { priority: 10 }, (c) => {
// c.service.push('service-from-js'); only modifies JavaScript array
c.service = c.service.concat('service-from-js');
});
zeek.hook('Conn::log_policy', (rec, id, filter) => {
console.log(rec.service);
});


@ -0,0 +1,9 @@
// global-vars.js
const timeouts = zeek.global_vars['Conn::analyzer_inactivity_timeouts'];
// Similar to redef.
timeouts['AllAnalyzers::ANALYZER_ANALYZER_SSH'] = 42.0;
zeek.on('zeek_init', () => {
console.log('js', timeouts);
});


@ -0,0 +1,5 @@
// hello.js
zeek.on('zeek_init', () => {
let version = zeek.invoke('zeek_version');
console.log(`Hello, Zeek ${version}!`);
});


@ -0,0 +1,10 @@
// intel-insert.js
zeek.on('zeek_init', () => {
let intel_item = {
indicator: '192.168.0.1',
indicator_type: 'Intel::ADDR',
meta: { source: 'some intel source' },
};
zeek.invoke('Intel::insert', [intel_item]);
});


@ -0,0 +1,13 @@
// zeek-as.js
zeek.on('zeek_init', () => {
try {
// This throws because type_name takes an any parameter
zeek.invoke('type_name', ['192.168.0.0/16']);
} catch (e) {
console.error(`error: ${e}`);
}
// Explicit conversion of string to subnet type.
let type_string = zeek.invoke('type_name', [zeek.as('subnet', '192.168.0.0/16')]);
console.log(`good: type_name is ${type_string}`);
});


@ -0,0 +1,110 @@
.. _zam:
===================
Script Optimization
===================
.. versionadded:: 7.0
.. note::
ZAM has been available in Zeek for a number of releases, but as of Zeek 7
it has matured to a point where we encourage regular users to explore it.
Introduction
============
The *Zeek Abstract Machine* (ZAM) is an optional script optimization engine
built into Zeek. Using ZAM changes the basic execution model for Zeek scripts in
an effort to gain higher performance. Normally, Zeek parses scripts into
abstract syntax trees that it then executes by recursively interpreting each
node in a given tree. With ZAM's script optimization, Zeek first compiles the
trees into a low-level form that it can then generally execute more efficiently.
To enable this feature, include ``-O ZAM`` on the command line.
How much faster will your scripts run? There's no simple answer to that. It
depends heavily on several factors:
* What proportion of the processing during execution is spent in the Zeek core's
event engine, rather than executing scripts. ZAM optimization doesn't help
with event engine execution.
* What proportion of the script's processing is spent executing built-in
functions (BiFs), i.e., functions callable from the script layer but
implemented in native code. ZAM optimization improves execution for some
select, simple BiFs, like :zeek:id:`network_time`, but it doesn't help for
complex ones. It might well be that most of your script processing actually
occurs in the underpinnings of the :ref:`logging framework
<framework-logging>`, for example, and thus you won't see much improvement.
* Taken together, these two factors mean that gains are very often on the
order of only 10-15%, rather than something far more dramatic.
.. note::
At startup, ZAM takes a few seconds to generate the low-level code for the
loaded set of scripts, unless you're using Zeek's *bare mode* (via the
``-b`` command-line option), which loads only a minimal set of scripts. Keep
this in mind when comparing Zeek runtimes, to ensure you're comparing only
actual script execution time.
To isolate ZAM's code generation overhead when running Zeek on a pcap, simply
leave out the traffic. That is, turn this ...
.. code-block:: sh
$ zcat 2009-M57-day11-18.trace.gz | zeek -O ZAM -r - <args>
into
.. code-block:: sh
$ time zeek -O ZAM <args>
and, since Zeek drops into interactive mode when run without arguments,
.. code-block:: sh
$ time zeek -O ZAM /dev/null
when there are none.
To determine the runtime after ZAM's code generation, you can measure the time
between :zeek:id:`zeek_init` and :zeek:id:`zeek_done` event handlers:
.. code-block:: zeek
:caption: runtime.zeek
global t0: time;
event zeek_init()
{
t0 = current_time();
}
event zeek_done()
{
print current_time() - t0;
}
Here's a quick example of ZAM's effect on Zeek's typical processing of a larger
packet capture, from one of our test suites:
.. code-block:: sh
$ zcat 2009-M57-day11-18.trace.gz | zeek -r - runtime.zeek
14.0 secs 252.0 msecs 107.858658 usecs
$ zcat 2009-M57-day11-18.trace.gz | zeek -O ZAM -r - runtime.zeek
12.0 secs 345.0 msecs 857.990265 usecs
A roughly 13% improvement in runtime.
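The figure follows directly from the two printed runtimes. As a small worked example, the arithmetic can be written out in a few lines; the variable names are chosen for illustration and the values are the ones shown in the output above.

```javascript
// Runtimes printed by runtime.zeek, converted to seconds.
const baseline = 14 + 252e-3 + 107.858658e-6; // without ZAM
const withZam = 12 + 345e-3 + 857.990265e-6;  // with -O ZAM

// Relative reduction in runtime compared to the unoptimized run.
const improvement = (baseline - withZam) / baseline;
const percent = (improvement * 100).toFixed(1);
console.log(`${percent}% less runtime`);
```

Note that this compares only the time between :zeek:id:`zeek_init` and :zeek:id:`zeek_done`, so ZAM's code generation at startup is not part of either number.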
Other Optimization Features
===========================
You can tune various features of ZAM via additional options to ``-O``, see the
output of ``zeek -O help`` for details. For example, you can study the script
transformations ZAM applies, and use ZAM selectively in certain files (via
``--optimize-files``) or functions (via ``--optimize-funcs``). Most users
won't need to use these.


@ -0,0 +1,87 @@
.. _tracing_events:
==============
Tracing Events
==============
Zeek provides a mechanism for recording the events that occur during
an execution run (on live traffic, or from a pcap) in a manner that you
can then later replay to get the same effect but without the traffic source.
You can also edit the recording to introduce differences from the original,
such as introducing corner-cases to aid in testing, or anonymizing sensitive
information.
You create a trace using:
.. code-block:: console
zeek --event-trace=mytrace.zeek <traffic-option> <other-options> <scripts...>
or, equivalently:
.. code-block:: console
zeek -E mytrace.zeek <traffic-option> <other-options> <scripts...>
Here, the *traffic-option* would be ``-i`` or ``-r`` to arrange for
a source of network traffic. The trace will be written to the file
``mytrace.zeek`` which, as the extension suggests, is itself a Zeek script.
You can then replay the events using:
.. code-block:: console
zeek <other-options> <scripts...> mytrace.zeek
One use case for event-tracing is to turn a sensitive PCAP that can't
be shared into a reflection of that same activity that - with some editing, for
example to change IP addresses - is safe to share. To facilitate such
editing, the generated script includes at the end a summary of all of
the constants present in the script that might be sensitive and require
editing (such as addresses and strings), to make it easier to know what
to search for and edit in the script. The generated script also includes
a global ``__base_time`` that's used to make it easy to alter (most of)
the times in the trace without altering their relative offsets.
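Such editing can be scripted. The following Node.js sketch rewrites IPv4 addresses consistently, so that equal inputs map to equal outputs. It is an illustration under stated assumptions only: the function name ``anonymizeIPv4`` and the sample lines are made up, the sample does not reflect the actual layout of a generated trace, and the simple pattern also matches other dotted quads such as version strings.

```javascript
// Rewrites every IPv4-looking token in `text` to an address in 10.0.0.0/8,
// keeping the mapping consistent across the whole input.
function anonymizeIPv4(text, mapping = new Map()) {
  const ipv4 = /\b(?:\d{1,3}\.){3}\d{1,3}\b/g;
  return text.replace(ipv4, (addr) => {
    if (!mapping.has(addr)) {
      const n = mapping.size + 1; // running index of distinct addresses
      mapping.set(addr, `10.${(n >> 16) & 255}.${(n >> 8) & 255}.${n & 255}`);
    }
    return mapping.get(addr);
  });
}

// Made-up input lines; a trace generated by Zeek looks different.
const sample = [
  'orig_h = 192.168.1.10, resp_h = 93.184.216.34',
  'resp_h = 93.184.216.34, orig_h = 192.168.1.10',
].join('\n');

const anonymized = anonymizeIPv4(sample);
console.log(anonymized);
```

After such a rewrite, the summary of constants at the end of the generated script remains a useful checklist for confirming that no sensitive address or string was missed.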
The generated script aims to ensure that event values that were related
during the original run stay related when replayed; re-execution should
proceed in a manner identical to how it did originally. There are however
several considerations:
* Zeek is unable to accurately trace events that include values that cannot
be faithfully recreated in a Zeek script, namely those having types of
``opaque``, ``file``, or ``any``. Upon encountering these, it generates
variables reflecting their unsupported nature, such as ``global
__UNSUPPORTED21: opaque of x509;``, and initializes them with code like
``__UNSUPPORTED21 = UNSUPPORTED opaque of x509;``. The generated script
is meant to produce syntax errors if run directly, and the names make
it easy to search for the elements that need to somehow be addressed.
* Zeek only traces events that reflect traffic processing, i.e., those
occurring after :zeek:id:`network_time` is set. Even if you don't include
a network traffic source, it skips the :zeek:id:`zeek_init` event
(since it is always automatically generated).
* The trace does *not* include events generated by scripts, only those
generated by the "event engine".
* The trace is generated upon Zeek cleanly exiting, so if Zeek crashes,
no trace will be produced. Stopping Zeek via *ctrl-c* does trigger a
clean exit.
* A subtle issue arises regarding any changes that the scripts in the
original execution made to values present in subsequent events. If
you re-run using the event trace script as well as those scripts,
the changes the scripts make during the re-run will be discarded and
instead replaced with the changes made during the original execution.
This generally won't matter if you're using the exact same scripts for
replay as originally, but if you've made changes to those scripts, then
it could. If you need the replay script to "respond" to changes made
during the re-execution, you can delete from the replay script every
line marked with the comment ``# from script``.
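Deleting the marked lines mentioned in the last item can be automated. A minimal Node.js sketch follows; the function name ``dropFromScriptLines`` and the sample lines are assumptions for illustration and do not show real trace content. The filter is purely textual, so it would also drop a line that merely contains the marker inside a string literal.

```javascript
// Removes every line carrying the `# from script` marker comment.
function dropFromScriptLines(text) {
  return text
    .split('\n')
    .filter((line) => !line.includes('# from script'))
    .join('\n');
}

// Made-up lines standing in for a generated trace.
const trace = [
  'first_statement();',
  'second_statement(); # from script',
  'third_statement();',
].join('\n');

const cleaned = dropFromScriptLines(trace);
console.log(cleaned);
```

To apply this to a file, the text could be read with ``fs.readFileSync(path, 'utf8')`` and the result written to a new file with ``fs.writeFileSync()``, which keeps the original trace intact.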
.. note::
It's possible that some timers will behave differently upon replay
than originally. If you encounter this and it creates a problem, we
would be interested to hear about it so we can consider whether the
problem can be remedied.

26
doc/scripting/usage.rst Normal file

@ -0,0 +1,26 @@
.. _script-usage-errors:
==============================
Finding Potential Usage Errors
==============================
Usage errors concern variables used-but-not-guaranteed-set or
set-but-not-ever-used. Zeek generates reports for these if you specify
the ``-u`` flag. It exits after producing the report, so if it simply exits
with no output, then it did not find any usage errors.
Variables reported as "used without definition" appear to have a code path
to them that could access their value even though it has not been initialized.
If upon inspection you determine that there is no actual hazard, you can
mark the definition with an :zeek:attr:`&is_assigned` attribute to assure the optimizer
that the value will be set.
Variables reported as "assignment unused" have a value assigned to them
that is meaningless since prior to any use of that value, another value
is assigned to the same variable. Such assignments are worth inspecting
as they sometimes reflect logic errors. Similar logic applies to unused
event handlers, hooks, and functions. You can suppress these reports by
adding an :zeek:attr:`&is_used` attribute to the original definition. If the
determination is indeed incorrect, that represents a bug in Zeek's analysis,
so something to report via the Issue Tracker.