=====================
Loading Data into Bro
=====================

.. rst-class:: opening

    Bro comes with a flexible input interface for reading previously
    stored data. Data is either read into Bro tables or sent to scripts
    as events. This document describes how to use the input framework.

.. contents::

Terminology
===========

Bro's input framework is built around two main abstractions, which are
very similar to the abstractions used in the logging framework:

    Input Streams
        An input stream corresponds to a single input source (usually a
        text file). It defines the information necessary to find the
        source (e.g. the filename) and the reader that is used to get
        data from it (see below). It also defines exactly what data is
        read from the input source.

        There are two kinds of streams: event streams and table
        streams. By default, event streams generate an event for each
        line read from the input source. Table streams, on the other
        hand, read the input source into a Bro table for easy later
        access.

    Readers
        A reader defines the input format for a specific input stream.
        At the moment, Bro comes with two types of readers. The default
        reader is ``READER_ASCII``, which can read the tab-separated
        ASCII log files generated by the logging framework.
        ``READER_RAW`` can read files containing records separated by a
        character (e.g. a newline) and sends one event per line.

Event Streams
=============

For examples, please look at the unit tests in
``testing/btest/scripts/base/frameworks/input/``.

Event streams are streams that generate an event for each line of the
input source.

For example, a simple stream retrieving the fields ``i`` and ``b`` from
an input source could be defined as follows:

.. code:: bro

    type Val: record {
        i: int;
        b: bool;
    };

    event line(description: Input::EventDescription, tpe: Input::Event, i: int, b: bool) {
        # work with event data
    }

    event bro_init() {
        Input::add_event([$source="input.log", $name="input", $fields=Val, $ev=line]);
    }

The fields that can be set for an event stream are:

    ``want_record``
        Boolean value that defines whether the event receives the
        fields inside a single record value, or individually (default).

    ``source``
        A mandatory string identifying the source of the data.
        For the ASCII reader this is the filename.

    ``reader``
        The reader used for this stream. Default is ``READER_ASCII``.

    ``mode``
        The mode in which the stream is opened. Possible values are
        ``MANUAL``, ``REREAD`` and ``STREAM``. Default is ``MANUAL``.

        ``MANUAL`` means that the data is not updated after the file
        has been read. Changes to the file will not be reflected in the
        data Bro knows.

        ``REREAD`` means that the whole file is read again each time a
        change is found. This should be used for files that are mapped
        to a table where individual lines can change.

        ``STREAM`` means that the data from the file is streamed.
        Events / table entries will be generated as new data is added
        to the file.

    ``name``
        A mandatory name for the stream that can later be used
        to remove it.

    ``fields``
        Name of a record type containing the fields that should be
        retrieved from the input stream.

    ``ev``
        The event that is raised after a line has been read from the
        input source. As in the example above, the event receives the
        stream's ``Input::EventDescription`` and an ``Input::Event``
        value, followed by the data, either inside a record (if
        ``want_record`` is set) or as individual fields.

        The ``Input::Event`` value can indicate whether the received
        line is ``NEW``, has been ``CHANGED`` or has been ``DELETED``.
        Since the ASCII reader cannot track this information for event
        streams, the value is always ``NEW`` at the moment.

Table Streams
=============

Table streams are the second, more complex type of input streams.
Table streams store the information they read from an input source in a
Bro table. For example, when reading a file that contains IP addresses
and connection attempt information, one could use an approach similar
to this:

.. code:: bro

    type Idx: record {
        a: addr;
    };

    type Val: record {
        tries: count;
    };

    global conn_attempts: table[addr] of count = table();

    event bro_init() {
        # want_record=F stores the single Val field as a plain count
        # instead of a record, matching the table's declared type.
        Input::add_table([$source="input.txt", $name="input", $idx=Idx, $val=Val,
                          $destination=conn_attempts, $want_record=F]);
    }

The table ``conn_attempts`` will then contain the information about
connection attempts.

The possible fields that can be set for a table stream are:

    ``source``
        A mandatory string identifying the source of the data.
        For the ASCII reader this is the filename.

    ``reader``
        The reader used for this stream. Default is ``READER_ASCII``.

    ``mode``
        The mode in which the stream is opened. Possible values are
        ``MANUAL``, ``REREAD`` and ``STREAM``. Default is ``MANUAL``.

        ``MANUAL`` means that the data is not updated after the file
        has been read. Changes to the file will not be reflected in the
        data Bro knows.

        ``REREAD`` means that the whole file is read again each time a
        change is found. This should be used for files that are mapped
        to a table where individual lines can change.

        ``STREAM`` means that the data from the file is streamed.
        Events / table entries will be generated as new data is added
        to the file.

    ``name``
        A mandatory name for the stream that can later be used
        to manipulate it further.

    ``idx``
        Record type that defines the index of the table.

    ``val``
        Record type that defines the values of the table.

    ``want_record``
        Defines whether the values of the table are stored as a record
        (default) or as a simple value. Has to be set if ``Val``
        contains more than one element.

    ``destination``
        The destination table.

    ``ev``
        Optional event that is raised when values are added to, changed
        in or deleted from the table. Events are passed an
        ``Input::Event`` description as the first argument, the index
        record as the second argument and the values as the third
        argument.

    ``pred``
        Optional predicate that can prevent entries from being added to
        the table and events from being sent.
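
As a rough illustration of what the ASCII reader consumes, the file
``input.txt`` from the table stream example might look like the sketch
below. It assumes the log-style layout written by the logging
framework, in which a ``#fields`` header line names the columns and
columns are separated by a single tab character (shown here as wide
gaps). The column names match the ``Idx`` and ``Val`` records of that
example; the addresses and counts are made-up sample values::

    #fields	a	tries
    192.168.17.1	3
    10.0.0.5	12

With a file like this, each data line would be expected to become one
entry of ``conn_attempts``, indexed by the address in the first column.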