=====================
Loading Data into Bro
=====================

.. rst-class:: opening

    Bro comes with a flexible input interface that allows reading
    previously stored data. Data is either read into Bro tables or
    sent to scripts using events. This document describes how the
    input framework can be used.

.. contents::

Terminology
===========

Bro's input framework is built around three main abstractions, which
are very similar to the abstractions used in the logging framework:

    Input Streams
        An input stream corresponds to a single input source
        (usually a text file). It defines the information necessary
        to find the source (e.g., the filename).

    Filters
        Each input stream has a set of filters attached to it that
        determine exactly what kind of information is read.
        There are two kinds of filters: event filters and table
        filters. By default, event filters generate an event for
        each line read from the input source. Table filters, on the
        other hand, read the input source into a Bro table for easy
        later access.

    Readers
        A reader defines the input format for the specific input
        stream. At the moment, Bro comes with two types of reader.
        The default reader is ``READER_ASCII``, which can read the
        tab-separated ASCII log files that were generated by the
        logging framework. ``READER_RAW`` can read files containing
        records separated by a character (e.g., newline) and sends
        one event per record.

Basics
======

For examples, please look at the unit tests in
``testing/btest/scripts/base/frameworks/input/``.

A very basic example to open an input stream is:

.. code:: bro

    module Foo;

    export {
        # Create an ID for our new stream
        redef enum Input::ID += { INPUT };
    }

    event bro_init()
        {
        # Open the stream; the ID is qualified with the module name Foo
        Input::create_stream(Foo::INPUT, [$source="input.log"]);
        }

The fields that can be set when creating a stream are:

``source``
    A mandatory string identifying the source of the data.
    For the ASCII reader this is the filename.

``reader``
    The reader used for this stream. Default is ``READER_ASCII``.

``mode``
    The mode in which the stream is opened. Possible values are
    ``MANUAL``, ``REREAD`` and ``STREAM``. Default is ``MANUAL``.

    ``MANUAL`` means that the file is not read again after it has
    been read once. Later changes to the file will not be reflected
    in the data Bro knows.

    ``REREAD`` means that the whole file is read again each time a
    change is found. This should be used for files that are mapped
    to a table where individual lines can change.

    ``STREAM`` means that the data from the file is streamed.
    Events or table entries will be generated as new data is added
    to the file.

``autostart``
    If set to true, the first update operation is triggered
    automatically after the first filter has been added to the
    stream. This has to be set to false if several filters are
    added to the input source. In this case ``Input::force_update``
    has to be called manually once after all filters have been
    added.

Filters
=======

Each filter defines the data fields that it wants to receive from the
respective input file. Depending on the type of filter, events or a
table are created from the data in the source file.

Event Filters
-------------

Event filters are filters that generate an event for each line of the
input source.

For example, a simple filter retrieving the fields ``i`` and ``b``
from an input source could be defined as follows:

.. code:: bro

    type Val: record {
        i: int;
        b: bool;
    };

    event line(tpe: Input::Event, i: int, b: bool)
        {
        # work with event data
        }

    event bro_init()
        {
        # Input stream definition, etc. ...

        Input::add_eventfilter(Foo::INPUT, [$name="input", $fields=Val, $ev=line]);
        }

The fields that can be set for an event filter are:

``name``
    A mandatory name for the filter that can later be used to
    manipulate it further.

``fields``
    Name of a record type containing the fields that should be
    retrieved from the input stream.

``ev``
    The event that is raised after a line has been read from the
    input source. The first argument passed to the event is an
    ``Input::Event`` structure, followed by the data, either inside
    a record (if ``want_record`` is set) or as individual fields.
    The ``Input::Event`` structure indicates whether the received
    line is ``NEW``, has been ``CHANGED`` or has been ``DELETED``.
    Since the ASCII reader cannot track this information for event
    filters, the value is always ``NEW`` at the moment.

``want_record``
    Boolean value that defines whether the event receives the
    fields inside a single record value, or individually (default).

Table Filters
-------------

Table filters are the second, more complex type of filter.

Table filters store the information they read from an input source in
a Bro table. For example, when reading a file that contains IP
addresses and connection attempt information, one could use an
approach similar to this:

.. code:: bro

    type Idx: record {
        a: addr;
    };

    type Val: record {
        tries: count;
    };

    global conn_attempts: table[addr] of count = table();

    event bro_init()
        {
        # Input stream definitions, etc. ...

        # The table holds plain counts rather than Val records,
        # so want_record is turned off here.
        Input::add_tablefilter(Foo::INPUT, [$name="ssh", $idx=Idx, $val=Val,
            $destination=conn_attempts, $want_record=F]);

        # read the file after all filters have been set
        # (only needed if autostart is set to false)
        Input::force_update(Foo::INPUT);
        }

The table ``conn_attempts`` will then contain the information about
connection attempts.

The fields that can be set for a table filter are:

``name``
    A mandatory name for the filter that can later be used to
    manipulate it further.

``idx``
    Record type that defines the index of the table.

``val``
    Record type that defines the values of the table.

``want_record``
    Defines whether the values of the table are stored as a record
    (default) or as a simple value. Has to be true if ``Val``
    contains more than one field.

``destination``
    The destination table.

``ev``
    Optional event that is raised when values are added to, changed
    in or deleted from the table. Events are passed an
    ``Input::Event`` description as the first argument, the index
    record as the second argument and the values as the third
    argument.

``pred``
    Optional predicate that can prevent entries from being added to
    the table and events from being sent.
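
As a rough sketch of how ``ev`` and ``pred`` might be combined on a
table filter, the example below keeps the default ``want_record``
setting, so the destination table stores ``Val`` records. The event
signature follows the argument order described for ``ev``. The
predicate signature (the same three arguments, returning ``bool``,
with ``T`` meaning the entry is accepted) is an assumption that this
document does not confirm. The handler name ``conn_attempt_changed``,
the file name ``conn_attempts.log`` and the threshold of three tries
are hypothetical and chosen only for illustration.

.. code:: bro

    module Foo;

    export {
        redef enum Input::ID += { INPUT };
    }

    type Idx: record {
        a: addr;
    };

    type Val: record {
        tries: count;
    };

    # Values are stored as Val records (want_record left at its default).
    global conn_attempts: table[addr] of Val = table();

    # Hypothetical handler: Input::Event first, then index record, then values.
    event conn_attempt_changed(tpe: Input::Event, idx: Idx, val: Val)
        {
        print fmt("%s: %d tries", idx$a, val$tries);
        }

    event bro_init()
        {
        # autostart is off, so the read is triggered manually below.
        Input::create_stream(Foo::INPUT, [$source="conn_attempts.log", $autostart=F]);

        Input::add_tablefilter(Foo::INPUT, [$name="ssh", $idx=Idx, $val=Val,
            $destination=conn_attempts,
            $ev=conn_attempt_changed,
            # Assumed predicate signature: only accept hosts with 3 or more tries.
            $pred(tpe: Input::Event, idx: Idx, val: Val) = { return val$tries >= 3; }]);

        Input::force_update(Foo::INPUT);
        }

If the predicate behaves as assumed, hosts with fewer than three tries
would neither appear in ``conn_attempts`` nor trigger
``conn_attempt_changed``.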