diff --git a/doc/frameworks/logging-input-sqlite.rst b/doc/frameworks/logging-input-sqlite.rst index 15df91b8c6..6f5e867686 100644 --- a/doc/frameworks/logging-input-sqlite.rst +++ b/doc/frameworks/logging-input-sqlite.rst @@ -67,8 +67,8 @@ that are present in the ASCII log files:: 'id.orig_p' integer, ... -Note that the ASCII ``conn.log`` will still be created. To disable the ASCII writer for a -log stream, you can remove the default filter: +Note that the ASCII ``conn.log`` will still be created. To prevent this file +from being created, you can remove the default filter: .. code:: bro diff --git a/doc/frameworks/logging.rst b/doc/frameworks/logging.rst index 765d7bed23..7cb21da36b 100644 --- a/doc/frameworks/logging.rst +++ b/doc/frameworks/logging.rst @@ -19,195 +19,128 @@ Terminology Bro's logging interface is built around three main abstractions: - Log streams - A stream corresponds to a single log. It defines the set of - fields that a log consists of with their names and fields. - Examples are the ``conn`` for recording connection summaries, + Streams + A log stream corresponds to a single log. It defines the set of + fields that a log consists of with their names and types. + Examples are the ``conn`` stream for recording connection summaries, and the ``http`` stream for recording HTTP activity. Filters Each stream has a set of filters attached to it that determine what information gets written out. By default, each stream has - one default filter that just logs everything directly to disk - with an automatically generated file name. However, further - filters can be added to record only a subset, split a stream - into different outputs, or to even duplicate the log to - multiple outputs. If all filters are removed from a stream, - all output is disabled. + one default filter that just logs everything directly to disk. + However, additional filters can be added to record only a subset + of the log records, write to different outputs, or set a custom + rotation interval. If all filters are removed from a stream, + then output is disabled for that stream. Writers - A writer defines the actual output format for the information - being logged. At the moment, Bro comes with only one type of - writer, which produces tab separated ASCII files. In the - future we will add further writers, like for binary output and - direct logging into a database. + Each filter has a writer. A writer defines the actual output + format for the information being logged. The default writer is + the ASCII writer, which produces tab-separated ASCII files. Other + writers are available, like for binary output or direct logging + into a database. -Basics -====== +There are several different ways to customize Bro's logging: you can create +a new log stream, you can extend an existing log with new fields, you +can apply filters to an existing log stream, or you can customize the output +format by setting log writer options. All of these approaches are +described in this document. -The data fields that a stream records are defined by a record type -specified when it is created. Let's look at the script generating Bro's -connection summaries as an example, -:doc:`/scripts/base/protocols/conn/main.bro`. It defines a record -:bro:type:`Conn::Info` that lists all the fields that go into -``conn.log``, each marked with a ``&log`` attribute indicating that it -is part of the information written out. To write a log record, the -script then passes an instance of :bro:type:`Conn::Info` to the logging -framework's :bro:id:`Log::write` function. +Streams +======= -By default, each stream automatically gets a filter named ``default`` -that generates the normal output by recording all record fields into a -single output file. +In order to log data to a new log stream, all of the following needs to be +done: -In the following, we summarize ways in which the logging can be -customized. We continue using the connection summaries as our example -to work with. +- A :bro:type:`record` type must be defined which consists of all the + fields that will be logged (by convention, the name of this record type is + usually "Info"). +- A log stream ID (an :bro:type:`enum` with type name "Log::ID") must be + defined that uniquely identifies the new log stream. +- A log stream must be created using the :bro:id:`Log::create_stream` function. +- When the data to be logged becomes available, the :bro:id:`Log::write` + function must be called. -Filtering ---------- - -To create a new output file for an existing stream, you can add a -new filter. A filter can, e.g., restrict the set of fields being -logged: +In the following example, we create a new module "Foo" which creates +a new log stream. .. code:: bro - event bro_init() + module Foo; + + export { + # Create an ID for our new stream. By convention, this is + # called "LOG". + redef enum Log::ID += { LOG }; + + # Define the record type that will contain the data to log. + # By convention, the type is called "Info". + # All fields to log must have the &log attribute. + # If a field might not contain any data, use the + # &optional attribute. + type Info: record { + ts: time &log; + id: conn_id &log; + service: string &log &optional; + }; + } + + # Optionally, we can add a new field to the connection record so that + # the data we are logging (our "Info" record) will be easily + # accessible in a variety of event handlers. + # By convention, the name of this new field is the lowercase name of + # the module. + redef record connection += { + foo: Info &optional; + }; + + # This event is handled at a priority higher than zero so that if + # users modify this stream in another script, they can do so at the + # default priority of zero. + event bro_init() &priority=5 { - # Add a new filter to the Conn::LOG stream that logs only - # timestamp and originator address. - local filter: Log::Filter = [$name="orig-only", $path="origs", $include=set("ts", "id.orig_h")]; - Log::add_filter(Conn::LOG, filter); + # Create the stream. This adds a default filter automatically. + Log::create_stream(Foo::LOG, [$columns=Info, $path="foo"]); } -Note the fields that are set for the filter: - - ``name`` - A mandatory name for the filter that can later be used - to manipulate it further. - - ``path`` - The filename for the output file, without any extension (which - may be automatically added by the writer). Default path values - are generated by taking the stream's ID and munging it slightly. - :bro:enum:`Conn::LOG` is converted into ``conn``, - :bro:enum:`PacketFilter::LOG` is converted into - ``packet_filter``, and :bro:enum:`Known::CERTS_LOG` is - converted into ``known_certs``. - - ``include`` - A set limiting the fields to the ones given. The names - correspond to those in the :bro:type:`Conn::Info` record, with - sub-records unrolled by concatenating fields (separated with - dots). - -Using the code above, you will now get a new log file ``origs.log`` -that looks like this:: - - #separator \x09 - #path origs - #fields ts id.orig_h - #types time addr - 1128727430.350788 141.42.64.125 - 1128727435.450898 141.42.64.125 - -If you want to make this the only log file for the stream, you can -remove the default filter (which, conveniently, has the name -``default``): +At this point, the only thing missing is a call to the :bro:id:`Log::write` +function to send data to the logging framework. The actual event handler +where this should take place will depend on where your data becomes available. +In this example, the connection_established event provides our data, and we +also store a copy of the data being logged into the :bro:type:`connection` +record: .. code:: bro - event bro_init() + event connection_established(c: connection) { - # Remove the filter called "default". - Log::remove_filter(Conn::LOG, "default"); + local rec: Foo::Info = [$ts=network_time(), $id=c$id]; + + # Store a copy of the data in the connection record so other + # event handlers can access it. + c$foo = rec; + + Log::write(Foo::LOG, rec); } -An alternate approach to "turning off" a log is to completely disable -the stream: +When you are developing scripts that add data to the :bro:type:`connection` +record, care must be given to when and how long data is stored. +Normally data saved to the connection record will remain there for the +duration of the connection and from a practical perspective it's not +uncommon to need to delete that data before the end of the connection. -.. code:: bro - event bro_init() - { - Log::disable_stream(Conn::LOG); - } +Add Fields to a Log +------------------- -If you want to skip only some fields but keep the rest, there is a -corresponding ``exclude`` filter attribute that you can use instead of -``include`` to list only the ones you are not interested in. +You can add additional fields to a log by extending the record +type that defines its content, and setting a value for the new fields +before each log record is written. -A filter can also determine output paths *dynamically* based on the -record being logged. That allows, e.g., to record local and remote -connections into separate files. To do this, you define a function -that returns the desired path: - -.. code:: bro - - function split_log(id: Log::ID, path: string, rec: Conn::Info) : string - { - # Return "conn-local" if originator is a local IP, otherwise "conn-remote". - local lr = Site::is_local_addr(rec$id$orig_h) ? "local" : "remote"; - return fmt("%s-%s", path, lr); - } - - event bro_init() - { - local filter: Log::Filter = [$name="conn-split", $path_func=split_log, $include=set("ts", "id.orig_h")]; - Log::add_filter(Conn::LOG, filter); - } - -Running this will now produce two files, ``local.log`` and -``remote.log``, with the corresponding entries. One could extend this -further for example to log information by subnets or even by IP -address. Be careful, however, as it is easy to create many files very -quickly ... - -.. sidebar:: A More Generic Path Function - - The ``split_log`` method has one draw-back: it can be used - only with the :bro:enum:`Conn::LOG` stream as the record type is hardcoded - into its argument list. However, Bro allows to do a more generic - variant: - - .. code:: bro - - function split_log(id: Log::ID, path: string, rec: record { id: conn_id; } ) : string - { - return Site::is_local_addr(rec$id$orig_h) ? "local" : "remote"; - } - - This function can be used with all log streams that have records - containing an ``id: conn_id`` field. - -While so far we have seen how to customize the columns being logged, -you can also control which records are written out by providing a -predicate that will be called for each log record: - -.. code:: bro - - function http_only(rec: Conn::Info) : bool - { - # Record only connections with successfully analyzed HTTP traffic - return rec$service == "http"; - } - - event bro_init() - { - local filter: Log::Filter = [$name="http-only", $path="conn-http", $pred=http_only]; - Log::add_filter(Conn::LOG, filter); - } - -This will result in a log file ``conn-http.log`` that contains only -traffic detected and analyzed as HTTP traffic. - -Extending ---------- - -You can add further fields to a log stream by extending the record -type that defines its content. Let's say we want to add a boolean -field ``is_private`` to :bro:type:`Conn::Info` that indicates whether the -originator IP address is part of the :rfc:`1918` space: +Let's say we want to add a boolean field ``is_private`` to +:bro:type:`Conn::Info` that indicates whether the originator IP address +is part of the :rfc:`1918` space: .. code:: bro @@ -218,9 +151,21 @@ originator IP address is part of the :rfc:`1918` space: is_private: bool &default=F &log; }; +As this example shows, when extending a log stream's "Info" record, each +new field must always be declared either with a ``&default`` value or +as ``&optional``. Furthermore, you need to add the ``&log`` attribute +or otherwise the field won't appear in the log file. -Now we need to set the field. A connection's summary is generated at -the time its state is removed from memory. We can add another handler +Now we need to set the field. Although the details vary depending on which +log is being extended, in general it is important to choose a suitable event +in which to set the additional fields because we need to make sure that +the fields are set before the log record is written. Sometimes the right +choice is the same event which writes the log record, but at a higher +priority (in order to ensure that the event handler that sets the additional +fields is executed before the event handler that writes the log record). + +In this example, since a connection's summary is generated at +the time its state is removed from memory, we can add another handler at that time that sets our field correctly: .. code:: bro @@ -232,31 +177,56 @@ at that time that sets our field correctly: } Now ``conn.log`` will show a new field ``is_private`` of type -``bool``. +``bool``. If you look at the Bro script which defines the connection +log stream :doc:`/scripts/base/protocols/conn/main.bro`, you will see +that the connection log record is written in an event handler for the +same event as used in this example to set the additional fields, but at a +lower priority than the one used in this example. -Notes: +For extending logs this way, one needs a bit of knowledge about how +the script that creates the log stream is organizing its state +keeping. Most of the standard Bro scripts attach their log state to +the :bro:type:`connection` record where it can then be accessed, just +like ``c$conn`` above. For example, the HTTP analysis adds a field +``http`` of type :bro:type:`HTTP::Info` to the :bro:type:`connection` +record. -- For extending logs this way, one needs a bit of knowledge about how - the script that creates the log stream is organizing its state - keeping. Most of the standard Bro scripts attach their log state to - the :bro:type:`connection` record where it can then be accessed, just - as the ``c$conn`` above. For example, the HTTP analysis adds a field - ``http`` of type :bro:type:`HTTP::Info` to the :bro:type:`connection` - record. See the script reference for more information. -- When extending records as shown above, the new fields must always be - declared either with a ``&default`` value or as ``&optional``. - Furthermore, you need to add the ``&log`` attribute or otherwise the - field won't appear in the output. - -Hooking into the Logging ------------------------- +Define a Logging Event +---------------------- Sometimes it is helpful to do additional analysis of the information being logged. For these cases, a stream can specify an event that will -be generated every time a log record is written to it. All of Bro's -default log streams define such an event. For example, the connection -log stream raises the event :bro:id:`Conn::log_conn`. You +be generated every time a log record is written to it. To do this, we +need to modify the example module shown above to look something like this: + +.. code:: bro + + module Foo; + + export { + redef enum Log::ID += { LOG }; + + type Info: record { + ts: time &log; + id: conn_id &log; + }; + + # Define a logging event. By convention, this is called + # "log_". + global log_foo: event(rec: Info); + } + + event bro_init() &priority=5 + { + # Here we specify both the "Info" record type and the "log_foo" + # event. + Log::create_stream(Foo::LOG, [$columns=Info, $ev=log_foo, + $path="foo"]); + } + +All of Bro's default log streams define such an event. For example, the +connection log stream raises the event :bro:id:`Conn::log_conn`. You could use that for example for flagging when a connection to a specific destination exceeds a certain duration: @@ -281,15 +251,193 @@ externally with Perl scripts. Much of what such an external script would do later offline, one may instead do directly inside of Bro in real-time. -Rotation --------- +Disable a Stream +---------------- -By default, no log rotation occurs, but it's globally controllable for all -filters by redefining the :bro:id:`Log::default_rotation_interval` option: +One way to "turn off" a log is to completely disable the stream. For +example, the following example will prevent the conn.log from being written: .. code:: bro - redef Log::default_rotation_interval = 1 hr; + event bro_init() + { + Log::disable_stream(Conn::LOG); + } + +Note that this must run after the stream is created, so the priority +of this event handler must be lower than the priority of the event handler +where the stream was created. + + +Filters +======= + +A stream has one or more filters attached to it (a stream without any filters +will not produce any log output). When a stream is created, it automatically +gets a default filter attached to it. This default filter can be removed +or replaced, or other filters can be added to the stream. This is accomplished +by using either the :bro:id:`Log::add_filter` or :bro:id:`Log::remove_filter` +function. This section shows how to use filters to do such tasks as +rename a log file, split the output into multiple files, control which +records are written, and set a custom rotation interval. + +Rename Log File +--------------- + +Normally, the log filename for a given log stream is determined when the +stream is created, unless you explicitly specify a different one by adding +a filter. + +The easiest way to change a log filename is to simply replace the +default log filter with a new filter that specifies a value for the "path" +field. In this example, "conn.log" will be changed to "myconn.log": + +.. code:: bro + + event bro_init() + { + # Replace default filter for the Conn::LOG stream in order to + # change the log filename. + + local f = Log::get_filter(Conn::LOG, "default"); + f$path = "myconn"; + Log::add_filter(Conn::LOG, f); + } + +Keep in mind that the "path" field of a log filter never contains the +filename extension. The extension will be determined later by the log writer. + +Add a New Log File +------------------ + +Normally, a log stream writes to only one log file. However, you can +add filters so that the stream writes to multiple files. This is useful +if you want to restrict the set of fields being logged to the new file. + +In this example, a new filter is added to the Conn::LOG stream that writes +two fields to a new log file: + +.. code:: bro + + event bro_init() + { + # Add a new filter to the Conn::LOG stream that logs only + # timestamp and originator address. + + local filter: Log::Filter = [$name="orig-only", $path="origs", + $include=set("ts", "id.orig_h")]; + Log::add_filter(Conn::LOG, filter); + } + + +Notice how the "include" filter attribute specifies a set that limits the +fields to the ones given. The names correspond to those in the +:bro:type:`Conn::Info` record (however, because the "id" field is itself a +record, we can specify an individual field of "id" by the dot notation +shown in the example). + +Using the code above, in addition to the regular ``conn.log``, you will +now also get a new log file ``origs.log`` that looks like the regular +``conn.log``, but will have only the fields specified in the "include" +filter attribute. + +If you want to skip only some fields but keep the rest, there is a +corresponding ``exclude`` filter attribute that you can use instead of +``include`` to list only the ones you are not interested in. + +If you want to make this the only log file for the stream, you can +remove the default filter: + +.. code:: bro + + event bro_init() + { + # Remove the filter called "default". + Log::remove_filter(Conn::LOG, "default"); + } + +Determine Log Path Dynamically +------------------------------ + +Instead of using the "path" filter attribute, a filter can determine +output paths *dynamically* based on the record being logged. That +allows, e.g., to record local and remote connections into separate +files. To do this, you define a function that returns the desired path, +and use the "path_func" filter attribute: + +.. code:: bro + + function myfunc(id: Log::ID, path: string, rec: Conn::Info) : string + { + # Return "conn-local" if originator is a local IP, otherwise + # return "conn-remote". + + local r = Site::is_local_addr(rec$id$orig_h) ? "local" : "remote"; + return fmt("%s-%s", path, r); + } + + event bro_init() + { + local filter: Log::Filter = [$name="conn-split", + $path_func=myfunc, $include=set("ts", "id.orig_h")]; + Log::add_filter(Conn::LOG, filter); + } + +Running this will now produce two new files, ``conn-local.log`` and +``conn-remote.log``, with the corresponding entries. One could extend this +further for example to log information by subnets or even by IP +address. Be careful, however, as it is easy to create many files very +quickly. + +The ``myfunc`` function has one drawback: it can be used +only with the :bro:enum:`Conn::LOG` stream as the record type is hardcoded +into its argument list. However, Bro allows to do a more generic +variant: + +.. code:: bro + + function myfunc(id: Log::ID, path: string, + rec: record { id: conn_id; } ) : string + { + local r = Site::is_local_addr(rec$id$orig_h) ? "local" : "remote"; + return fmt("%s-%s", path, r); + } + +This function can be used with all log streams that have records +containing an ``id: conn_id`` field. + +Filter Log Records +------------------ + +We have seen how to customize the columns being logged, but +you can also control which records are written out by providing a +predicate that will be called for each log record: + +.. code:: bro + + function http_only(rec: Conn::Info) : bool + { + # Record only connections with successfully analyzed HTTP traffic + return rec$service == "http"; + } + + event bro_init() + { + local filter: Log::Filter = [$name="http-only", $path="conn-http", + $pred=http_only]; + Log::add_filter(Conn::LOG, filter); + } + +This will result in a new log file ``conn-http.log`` that contains only +traffic detected and analyzed as HTTP traffic. + +Rotation +-------- + +The log rotation interval is globally controllable for all +filters by redefining the :bro:id:`Log::default_rotation_interval` option +(note that when using BroControl, this option is set automatically via +the BroControl configuration). Or specifically for certain :bro:type:`Log::Filter` instances by setting their ``interv`` field. Here's an example of changing just the @@ -301,90 +449,62 @@ their ``interv`` field. Here's an example of changing just the { local f = Log::get_filter(Conn::LOG, "default"); f$interv = 1 min; - Log::remove_filter(Conn::LOG, "default"); Log::add_filter(Conn::LOG, f); } -ASCII Writer Configuration --------------------------- +Writers +======= -The ASCII writer has a number of options for customizing the format of -its output, see :doc:`/scripts/base/frameworks/logging/writers/ascii.bro`. +Each filter has a writer. If you do not specify a writer when adding a +filter to a stream, then the ASCII writer is the default. -Adding Streams -============== +There are two ways to specify a non-default writer. To change the default +writer for all log filters, just redefine the :bro:id:`Log::default_writer` +option. Alternatively, you can specify the writer to use on a per-filter +basis by setting a value for the filter's "writer" field. Consult the +documentation of the writer to use to see if there are other options that are +needed. -It's easy to create a new log stream for custom scripts. Here's an -example for the ``Foo`` module: +ASCII Writer +------------ + +The ASCII writer has a number of options for customizing the format of its +output, see :doc:`/scripts/base/frameworks/logging/writers/ascii.bro`. +If you change the output format options, then be careful to check whether +your postprocessing scripts can still recognize your log files. + +Some writer options are global (i.e., they affect all log filters using +that log writer). For example, to change the output format of all ASCII +logs to JSON format: .. code:: bro - module Foo; + redef LogAscii::use_json = T; - export { - # Create an ID for our new stream. By convention, this is - # called "LOG". - redef enum Log::ID += { LOG }; +Some writer options are filter-specific (i.e., they affect only the filters +that explicitly specify the option). For example, to change the output +format of the ``conn.log`` only: - # Define the fields. By convention, the type is called "Info". - type Info: record { - ts: time &log; - id: conn_id &log; - }; +.. code:: bro - # Define a hook event. By convention, this is called - # "log_". - global log_foo: event(rec: Info); - - } - - # This event should be handled at a higher priority so that when - # users modify your stream later and they do it at priority 0, - # their code runs after this. - event bro_init() &priority=5 + event bro_init() { - # Create the stream. This also adds a default filter automatically. - Log::create_stream(Foo::LOG, [$columns=Info, $ev=log_foo, $path="foo"]); + local f = Log::get_filter(Conn::LOG, "default"); + # Use tab-separated-value mode + f$config = table(["tsv"] = "T"); + Log::add_filter(Conn::LOG, f); } -You can also add the state to the :bro:type:`connection` record to make -it easily accessible across event handlers: - -.. code:: bro - - redef record connection += { - foo: Info &optional; - } - -Now you can use the :bro:id:`Log::write` method to output log records and -save the logged ``Foo::Info`` record into the connection record: - -.. code:: bro - - event connection_established(c: connection) - { - local rec: Foo::Info = [$ts=network_time(), $id=c$id]; - c$foo = rec; - Log::write(Foo::LOG, rec); - } - -See the existing scripts for how to work with such a new connection -field. A simple example is :doc:`/scripts/base/protocols/syslog/main.bro`. - -When you are developing scripts that add data to the :bro:type:`connection` -record, care must be given to when and how long data is stored. -Normally data saved to the connection record will remain there for the -duration of the connection and from a practical perspective it's not -uncommon to need to delete that data before the end of the connection. Other Writers ------------- -Bro supports the following built-in output formats other than ASCII: +Bro supports the following additional built-in output formats: .. toctree:: :maxdepth: 1 logging-input-sqlite -Further formats are available as external plugins. +Additional writers are available as external plugins. +