mirror of
https://github.com/zeek/zeek.git
synced 2025-10-02 06:38:20 +00:00

.. _framework-logging:

=================
Logging Framework
=================

.. rst-class:: opening

    Bro comes with a flexible key-value based logging interface that
    allows fine-grained control of what gets logged and how it is
    logged. This document describes how logging can be customized and
    extended.

.. contents::

Terminology
===========

Bro's logging interface is built around three main abstractions:

Log streams
    A stream corresponds to a single log. It defines the set of
    fields that a log consists of, with their names and types.
    Examples are the ``conn`` stream for recording connection summaries
    and the ``http`` stream for recording HTTP activity.

Filters
    Each stream has a set of filters attached to it that determine
    what information gets written out. By default, each stream has
    one default filter that just logs everything directly to disk
    with an automatically generated file name. However, further
    filters can be added to record only a subset, split a stream
    into different outputs, or even duplicate the log to multiple
    outputs. If all filters are removed from a stream, all output
    is disabled.

Writers
    A writer defines the actual output format for the information
    being logged. At the moment, Bro comes with only one type of
    writer, which produces tab-separated ASCII files. In the
    future we will add further writers, like for binary output and
    direct logging into a database.

Basics
======

The data fields that a stream records are defined by a record type
specified when it is created. Let's look at the script generating Bro's
connection summaries as an example,
:doc:`/scripts/base/protocols/conn/main.bro`. It defines a record
:bro:type:`Conn::Info` that lists all the fields that go into
``conn.log``, each marked with a ``&log`` attribute indicating that it
is part of the information written out. To write a log record, the
script then passes an instance of :bro:type:`Conn::Info` to the logging
framework's :bro:id:`Log::write` function.

By default, each stream automatically gets a filter named ``default``
that generates the normal output by recording all record fields into a
single output file.

In the following, we summarize ways in which the logging can be
customized, continuing to use the connection summaries as our
working example.

Filtering
---------

To create a new output file for an existing stream, you can add a
new filter. A filter can, e.g., restrict the set of fields being
logged:

.. code:: bro

    event bro_init()
        {
        # Add a new filter to the Conn::LOG stream that logs only
        # timestamp and originator address.
        local filter: Log::Filter = [$name="orig-only", $path="origs", $include=set("ts", "id.orig_h")];
        Log::add_filter(Conn::LOG, filter);
        }

Note the fields that are set for the filter:

``name``
    A mandatory name for the filter that can later be used
    to manipulate it further.

``path``
    The filename for the output file, without any extension (which
    may be automatically added by the writer). Default path values
    are generated by taking the stream's ID and munging it slightly:
    :bro:enum:`Conn::LOG` is converted into ``conn``,
    :bro:enum:`PacketFilter::LOG` is converted into
    ``packet_filter``, and :bro:enum:`Known::CERTS_LOG` is
    converted into ``known_certs``.

``include``
    A set limiting the fields to the ones given. The names
    correspond to those in the :bro:type:`Conn::Info` record, with
    sub-records unrolled by concatenating fields (separated with
    dots).

Using the code above, you will now get a new log file ``origs.log``
that looks like this::

    #separator \x09
    #path   origs
    #fields ts      id.orig_h
    #types  time    addr
    1128727430.350788       141.42.64.125
    1128727435.450898       141.42.64.125

If you want to make this the only log file for the stream, you can
remove the default filter (which, conveniently, has the name
``default``):

.. code:: bro

    event bro_init()
        {
        # Remove the filter called "default".
        Log::remove_filter(Conn::LOG, "default");
        }

An alternate approach to "turning off" a log is to completely disable
the stream:

.. code:: bro

    event bro_init()
        {
        Log::disable_stream(Conn::LOG);
        }

If you want to skip only some fields but keep the rest, there is a
corresponding ``exclude`` filter attribute that you can use instead of
``include`` to list only the ones you are not interested in.

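For instance, the following sketch (the filter name and path are illustrative) keeps every ``Conn::Info`` field except the connection ``history`` string:

.. code:: bro

    event bro_init()
        {
        # Log all Conn::Info fields except the history string.
        local filter: Log::Filter = [$name="no-history", $path="conn-nohist", $exclude=set("history")];
        Log::add_filter(Conn::LOG, filter);
        }
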
A filter can also determine output paths *dynamically* based on the
record being logged. That allows, e.g., recording local and remote
connections in separate files. To do this, you define a function
that returns the desired path:

.. code:: bro

    function split_log(id: Log::ID, path: string, rec: Conn::Info) : string
        {
        # Return "conn-local" if the originator is a local IP, otherwise "conn-remote".
        local lr = Site::is_local_addr(rec$id$orig_h) ? "local" : "remote";
        return fmt("%s-%s", path, lr);
        }

    event bro_init()
        {
        local filter: Log::Filter = [$name="conn-split", $path_func=split_log, $include=set("ts", "id.orig_h")];
        Log::add_filter(Conn::LOG, filter);
        }

Running this will now produce two files, ``conn-local.log`` and
``conn-remote.log``, with the corresponding entries. One could extend
this further, for example, to log information by subnet or even by IP
address. Be careful, however, as it is easy to create many files very
quickly.

.. sidebar:: A More Generic Path Function

    The ``split_log`` function above has one drawback: it can be used
    only with the :bro:enum:`Conn::LOG` stream, as the record type is
    hardcoded into its argument list. However, Bro allows a more
    generic variant:

    .. code:: bro

        function split_log(id: Log::ID, path: string, rec: record { id: conn_id; } ) : string
            {
            return Site::is_local_addr(rec$id$orig_h) ? "local" : "remote";
            }

    This function can be used with all log streams that have records
    containing an ``id: conn_id`` field.

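For example, the generic version could be attached to another stream whose record carries such a field; this sketch assumes the :bro:type:`HTTP::Info` record qualifies, since it contains an ``id: conn_id`` field:

.. code:: bro

    event bro_init()
        {
        # Reuse the generic path function for the HTTP stream.
        local filter: Log::Filter = [$name="http-split", $path_func=split_log];
        Log::add_filter(HTTP::LOG, filter);
        }
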
While so far we have seen how to customize the columns being logged,
you can also control which records are written out by providing a
predicate that will be called for each log record:

.. code:: bro

    function http_only(rec: Conn::Info) : bool
        {
        # Record only connections with successfully analyzed HTTP traffic.
        # The service field is &optional, so test for presence first.
        return rec?$service && rec$service == "http";
        }

    event bro_init()
        {
        local filter: Log::Filter = [$name="http-only", $path="conn-http", $pred=http_only];
        Log::add_filter(Conn::LOG, filter);
        }

This will result in a log file ``conn-http.log`` that contains only
traffic detected and analyzed as HTTP traffic.

Extending
---------

You can add further fields to a log stream by extending the record
type that defines its content. Let's say we want to add a boolean
field ``is_private`` to :bro:type:`Conn::Info` that indicates whether the
originator IP address is part of the :rfc:`1918` space:

.. code:: bro

    # Add a field to the connection log record.
    redef record Conn::Info += {
        ## Indicate if the originator of the connection is part of the
        ## "private" address space defined in RFC 1918.
        is_private: bool &default=F &log;
    };

Now we need to set the field. A connection's summary is generated at
the time its state is removed from memory. We can add another handler
at that time that sets our field correctly:

.. code:: bro

    event connection_state_remove(c: connection)
        {
        if ( c$id$orig_h in Site::private_address_space )
            c$conn$is_private = T;
        }

Now ``conn.log`` will show a new field ``is_private`` of type
``bool``.

Notes:

- For extending logs this way, one needs a bit of knowledge about how
  the script that creates the log stream organizes its state keeping.
  Most of the standard Bro scripts attach their log state to the
  :bro:type:`connection` record, where it can then be accessed, just
  as with ``c$conn`` above. For example, the HTTP analysis adds a field
  ``http`` of type :bro:type:`HTTP::Info` to the :bro:type:`connection`
  record. See the script reference for more information.

- When extending records as shown above, the new fields must always be
  declared either with a ``&default`` value or as ``&optional``.
  Furthermore, you need to add the ``&log`` attribute, or else the
  field won't appear in the output.

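As an illustrative sketch (the field name here is hypothetical), the same extension could instead use ``&optional``, in which case no default value is needed and unset rows are rendered with the writer's unset-field placeholder:

.. code:: bro

    redef record Conn::Info += {
        ## Hypothetical label for the connection, set only by handlers
        ## that actually have one to assign.
        label: string &optional &log;
    };

Because the field is ``&optional``, any code reading it back must first check for presence with ``c$conn?$label``.
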
Hooking into the Logging
------------------------

Sometimes it is helpful to do additional analysis of the information
being logged. For these cases, a stream can specify an event that will
be generated every time a log record is written to it. All of Bro's
default log streams define such an event. For example, the connection
log stream raises the event :bro:id:`Conn::log_conn`. You could use
that, for example, to flag when a connection to a specific destination
exceeds a certain duration:

.. code:: bro

    redef enum Notice::Type += {
        ## Indicates that a connection remained established longer
        ## than 5 minutes.
        Long_Conn_Found
    };

    event Conn::log_conn(rec: Conn::Info)
        {
        # The duration field is &optional, so test for presence first.
        if ( rec?$duration && rec$duration > 5mins )
            NOTICE([$note=Long_Conn_Found,
                    $msg=fmt("unusually long conn to %s", rec$id$resp_h),
                    $id=rec$id]);
        }

Often, these events can be an alternative to post-processing Bro logs
externally with Perl scripts. Much of what such an external script
would do later offline, one can instead do directly inside of Bro in
real-time.

Rotation
--------

By default, no log rotation occurs, but it's globally controllable for
all filters by redefining the :bro:id:`Log::default_rotation_interval`
option:

.. code:: bro

    redef Log::default_rotation_interval = 1 hr;

Or specifically for certain :bro:type:`Log::Filter` instances by setting
their ``interv`` field. Here's an example of changing just the
:bro:enum:`Conn::LOG` stream's default filter rotation:

.. code:: bro

    event bro_init()
        {
        local f = Log::get_filter(Conn::LOG, "default");
        f$interv = 1 min;
        Log::remove_filter(Conn::LOG, "default");
        Log::add_filter(Conn::LOG, f);
        }

ASCII Writer Configuration
--------------------------

The ASCII writer has a number of options for customizing the format of
its output; see :doc:`/scripts/base/frameworks/logging/writers/ascii.bro`.

Adding Streams
==============

It's easy to create a new log stream for custom scripts. Here's an
example for the ``Foo`` module:

.. code:: bro

    module Foo;

    export {
        # Create an ID for our new stream. By convention, this is
        # called "LOG".
        redef enum Log::ID += { LOG };

        # Define the fields. By convention, the type is called "Info".
        type Info: record {
            ts: time &log;
            id: conn_id &log;
        };

        # Define a hook event. By convention, this is called
        # "log_<stream>".
        global log_foo: event(rec: Info);
    }

    # This event should be handled at a higher priority so that when
    # users modify your stream later at priority 0, their code runs
    # after this.
    event bro_init() &priority=5
        {
        # Create the stream. This also adds a default filter automatically.
        Log::create_stream(Foo::LOG, [$columns=Info, $ev=log_foo]);
        }

You can also add the state to the :bro:type:`connection` record to make
it easily accessible across event handlers:

.. code:: bro

    redef record connection += {
        foo: Info &optional;
    };

Now you can use the :bro:id:`Log::write` method to output log records and
save the logged ``Foo::Info`` record into the connection record:

.. code:: bro

    event connection_established(c: connection)
        {
        local rec: Foo::Info = [$ts=network_time(), $id=c$id];
        c$foo = rec;
        Log::write(Foo::LOG, rec);
        }

See the existing scripts for how to work with such a new connection
field. A simple example is :doc:`/scripts/base/protocols/syslog/main.bro`.

When you are developing scripts that add data to the :bro:type:`connection`
record, care must be given to when and for how long data is stored.
Normally, data saved to the connection record will remain there for the
duration of the connection, and from a practical perspective it's not
uncommon to need to delete that data before the end of the connection.

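As a sketch, building on the ``Foo`` module above, a low-priority handler could drop the state as soon as it is no longer needed:

.. code:: bro

    event connection_established(c: connection) &priority=-5
        {
        # Runs after the priority-0 handler that wrote the log record,
        # so the per-connection state can be removed without losing data.
        if ( c?$foo )
            delete c$foo;
        }
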
Other Writers
-------------

Bro supports the following output formats other than ASCII:

.. toctree::
   :maxdepth: 1

   logging-dataseries
   logging-elasticsearch
   logging-input-sqlite