Moving docs from web server into distribution.

Note that while these aren't removed yet from www, they will be soon,
and these are now the authoritative files.
This commit is contained in:
Robin Sommer 2011-10-10 18:54:13 -07:00
parent de999fb0dd
commit 4f8a7c95aa
6 changed files with 1792 additions and 0 deletions

doc/cluster.rst
Bro Cluster
===========
Intro
------
Bro is not multithreaded, so once the limitations of a single processor core are reached, the only option currently is to spread the workload across many cores, or even many physical computers. The cluster deployment scenario for Bro is the current solution to build these larger systems. The accompanying tools and scripts provide the structure to easily manage many Bro processes examining packets and doing correlation activities while acting as a singular, cohesive entity.
Architecture
---------------
There are four main components to a Bro cluster, which are described in detail below.
Frontend
********
This is a discrete hardware device or on-host technique that will split your traffic into many streams or flows. The Bro binary does not do this job. There are numerous ways to accomplish this task, some of which are described below in the Frontend <link to that section> section.
Manager
*******
This is a Bro process which has two primary jobs. It receives log messages and notices from the rest of the nodes in the cluster using the Bro communications protocol. The result is that you end up with a single log for each log type instead of many discrete logs that you have to combine later with post-processing. The manager also de-duplicates notices, which it is able to do because it acts as the choke point for notices and for how notices are processed into actions such as emailing, paging, or blocking.
The manager process is started first by BroControl and only opens its designated port and waits for connections; it doesn't initiate any connections to the rest of the cluster. Once the workers are started and connect to the manager, logs and notices will start arriving at the manager process from the workers.
Proxy
*****
This is a Bro process which manages synchronized state. Variables can be synchronized across connected Bro processes automatically in Bro and proxies will help the workers by alleviating the need for all of the workers to connect directly to each other.
Examples of synchronized state from the scripts that ship with Bro include the full list of “known” hosts and services, i.e., hosts that have been detected performing full TCP handshakes and services for which an analyzed protocol has been found on a connection. If worker A detects host 1.2.3.4 as an active host, it would be beneficial for worker B to know that as well, so worker A shares that information as an insertion to a set <link to set documentation would be good here> which travels to the cluster's proxy, and the proxy then sends that same set insertion to worker B. The result is that worker A and worker B have shared knowledge about the hosts and services that are active on the network being monitored.
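As a sketch of what such synchronized state looks like in a script (the variable name here is illustrative, not taken from Bro's shipped scripts):

.. code:: bro

    # Illustrative only: a variable marked &synchronized is kept in
    # sync across connected peers (via the proxies in a cluster).
    global active_hosts: set[addr] &synchronized;

    event connection_established(c: connection)
        {
        # An insertion on one worker travels to its proxy, which
        # relays it on to the other workers.
        add active_hosts[c$id$orig_h];
        }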
The proxy model extends to multiple proxies as well if necessary for performance reasons; it only adds one additional step for the Bro processes. Each proxy connects to another proxy in a ring, and the workers are shared between them as evenly as possible. When a proxy receives some new bit of state, it shares it with its neighboring proxy, which then passes it around the ring of proxies and down to all of the workers. From a practical standpoint, there are no established rules of thumb yet for the number of proxies necessary for the number of workers they are serving. It is best to start with a single proxy and add more if communication performance problems are found.
Bro processes acting as proxies don't tend to be very demanding of CPU or memory, and users frequently run proxy processes on the same physical host as the manager.
Worker
******
This is the Bro process that sniffs network traffic and does protocol analysis on the reassembled traffic streams. Most of the work of an active cluster takes place on the workers, and as such the workers typically represent the bulk of the Bro processes running in a cluster. The fastest memory and CPU core speed you can afford is best here, since all of the protocol parsing and most analysis takes place on the workers. There are no particular requirements for the disks in workers, since almost all logging is done remotely to the manager and very little is normally written to disk.
The rule of thumb we have followed recently is to allocate approximately 1 core for every 80Mbps of traffic being analyzed; however, this estimate could be extremely specific to the traffic mix. It has generally worked for mixed traffic with many users and servers. For example, if your traffic peaks around 2Gbps (combined) and you want to handle traffic at peak load, you may want to have 26 cores available (2048 / 80 == 25.6). If the 80Mbps estimate works for your traffic, this could be handled by 3 physical hosts dedicated to being workers, with each one containing dual 6-core processors.
Once a flow-based load balancer is put into place, this model is extremely easy to scale, so it's recommended that you make a best guess at the amount of hardware you will need to fully analyze your traffic. If it turns out that you need more, it's relatively easy to increase the size of the cluster in most cases.
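In BroControl, this architecture maps onto entries in ``node.cfg``. A minimal sketch of a cluster layout (the hostnames and interface names are placeholders, not recommendations)::

    [manager]
    type=manager
    host=10.0.0.10

    [proxy-1]
    type=proxy
    host=10.0.0.10

    [worker-1]
    type=worker
    host=10.0.0.11
    interface=eth0

    [worker-2]
    type=worker
    host=10.0.0.12
    interface=eth0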
Frontend Options
----------------
There are many options for setting up a frontend flow distributor and in many cases it may even be beneficial to do multiple stages of flow distribution on the network and on the host.
Discrete hardware flow balancers
********************************
cPacket
^^^^^^^
If you are monitoring one or more 10G physical interfaces, the recommended solution is to use either a cFlow or cVu device from cPacket (is this too much? I don't want to recommend Gigamon with all of the problems we've had in various places, and I don't know enough about VSS Monitoring's offerings to make a judgement. I don't want to recommend the Netoptics Director hardware either; from what I understand it doesn't fit the use case very well). These devices perform layer-2 load balancing by rewriting the destination Ethernet MAC address to cause each packet associated with a particular flow to have the same destination MAC. The packets can then be passed directly to a monitoring host, or onward to a commodity switch to split the traffic out to multiple 1G interfaces for the workers. This can ultimately greatly reduce costs, since workers can use relatively inexpensive 1G interfaces.
OpenFlow Switches
^^^^^^^^^^^^^^^^^
We are currently exploring the use of OpenFlow based switches to do flow based load balancing directly on the switch which can greatly reduce frontend costs for many users. This document will be updated when we have more information.
On host flow balancing
**********************
PF_RING
^^^^^^^
The PF_RING software for Linux has a “clustering” feature which will do flow-based load balancing across a number of processes that are sniffing the same interface. This allows you to easily take advantage of multiple cores in a single physical host, because Bro's main event loop is single threaded and can't natively utilize all of the cores. More information about Bro with PF_RING can be found here: (someone want to write a quick Bro/PF_RING tutorial to link to here? document installing the kernel module, the libpcap wrapper, and building Bro with the --with-pcap configure option)
Netmap
^^^^^^
FreeBSD has an in-progress project named Netmap which will enable flow-based load balancing as well. When it becomes viable for real-world use, this document will be updated.
Click! Software Router
^^^^^^^^^^^^^^^^^^^^^^
Click! can be used for flow-based load balancing with a simple configuration. (link to an example for the config). This solution is not recommended on Linux because of Bro's PF_RING support, and only as a last resort on other operating systems, since it causes a lot of overhead by context switching back and forth between kernel and userland several times per packet.

doc/logging.rst
==========================
Customizing Bro's Logging
==========================
.. class:: opening
Bro comes with a flexible key-value based logging interface that
allows fine-grained control of what gets logged and how it is
logged. This document describes how logging can be customized and
extended.
.. contents::
Terminology
===========
Bro's logging interface is built around three main abstractions:
Log streams
A stream corresponds to a single log. It defines the set of
fields that a log consists of, with their names and types.
Examples are the ``conn`` stream for recording connection summaries,
and the ``http`` stream for recording HTTP activity.
Filters
Each stream has a set of filters attached to it that determine
what information gets written out. By default, each stream has
one default filter that just logs everything directly to disk
with an automatically generated file name. However, further
filters can be added to record only a subset, split a stream
into different outputs, or to even duplicate the log to
multiple outputs. If all filters are removed from a stream,
all output is disabled.
Writers
A writer defines the actual output format for the information
being logged. At the moment, Bro comes with only one type of
writer, which produces tab separated ASCII files. In the
future we will add further writers, like for binary output and
direct logging into a database.
Basics
======
The data fields that a stream records are defined by a record type
specified when it is created. Let's look at the script generating
Bro's connection summaries as an example,
``base/protocols/conn/main.bro``. It defines a record ``Conn::Info``
that lists all the fields that go into ``conn.log``, each marked with
a ``&log`` attribute indicating that it is part of the information
written out. To write a log record, the script then passes an instance
of ``Conn::Info`` to the logging framework's ``Log::write`` function.
By default, each stream automatically gets a filter named ``default``
that generates the normal output by recording all record fields into a
single output file.
In the following, we summarize ways in which the logging can be
customized. We continue using the connection summaries as our example
to work with.
Filtering
---------
To create a new output file for an existing stream, you can add a
new filter. A filter can, e.g., restrict the set of fields being
logged:
.. code:: bro
event bro_init()
{
# Add a new filter to the Conn::LOG stream that logs only
# timestamp and originator address.
local filter: Log::Filter = [$name="orig-only", $path="origs", $include=set("ts", "id.orig_h")];
Log::add_filter(Conn::LOG, filter);
}
Note the fields that are set for the filter:
``name``
A mandatory name for the filter that can later be used
to manipulate it further.
``path``
The filename for the output file, without any extension (which
may be automatically added by the writer). Default path values
are generated by taking the stream's ID and munging it
slightly. ``Conn::LOG`` is converted into ``conn``,
``PacketFilter::LOG`` is converted into ``packet_filter``, and
``Notice::POLICY_LOG`` is converted into ``notice_policy``.
``include``
A set limiting the fields to the ones given. The names
correspond to those in the ``Conn::LOG`` record, with
sub-records unrolled by concatenating fields (separated with
dots).
Using the code above, you will now get a new log file ``origs.log``
that looks like this::
#separator \x09
#path origs
#fields ts id.orig_h
#types time addr
1128727430.350788 141.42.64.125
1128727435.450898 141.42.64.125
If you want to make this the only log file for the stream, you can
remove the default filter (which, conveniently, has the name
``default``):
.. code:: bro
event bro_init()
{
# Remove the filter called "default".
Log::remove_filter(Conn::LOG, "default");
}
An alternate approach to "turning off" a log is to completely disable
the stream:
.. code:: bro
event bro_init()
{
Log::disable_stream(Conn::LOG);
}
If you want to skip only some fields but keep the rest, there is a
corresponding ``exclude`` filter attribute that you can use instead of
``include`` to list only the ones you are not interested in.
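For example, the following sketch drops the port fields from the output (the excluded field names are just an illustration):

.. code:: bro

    event bro_init()
        {
        # Log all Conn::Info fields except the originator and
        # responder ports.
        local f: Log::Filter = [$name="no-ports", $path="conn-noports",
                                $exclude=set("id.orig_p", "id.resp_p")];
        Log::add_filter(Conn::LOG, f);
        }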
A filter can also determine output paths *dynamically* based on the
record being logged. That allows, e.g., to record local and remote
connections into separate files. To do this, you define a function
that returns the desired path:
.. code:: bro
function split_log(id: Log::ID, path: string, rec: Conn::Info) : string
{
# Return "conn-local" if originator is a local IP, otherwise "conn-remote".
local lr = Site::is_local_addr(rec$id$orig_h) ? "local" : "remote";
return fmt("%s-%s", path, lr);
}
event bro_init()
{
local filter: Log::Filter = [$name="conn-split", $path_func=split_log, $include=set("ts", "id.orig_h")];
Log::add_filter(Conn::LOG, filter);
}
Running this will now produce two files, ``conn-local.log`` and
``conn-remote.log``, with the corresponding entries. One could extend this
further for example to log information by subnets or even by IP
address. Be careful, however, as it is easy to create many files very
quickly ...
.. sidebar:: A generic variant
The ``split_log`` function shown above has one drawback: it can be used
only with the ``Conn::LOG`` stream, as the record type is hardcoded
into its argument list. However, Bro allows a more generic
variant:
.. code:: bro
function split_log(id: Log::ID, path: string, rec: record { id: conn_id; } ) : string
{
return Site::is_local_addr(rec$id$orig_h) ? "local" : "remote";
}
This function can be used with all log streams that have records
containing an ``id: conn_id`` field.
While so far we have seen how to customize the columns being logged,
you can also control which records are written out by providing a
predicate that will be called for each log record:
.. code:: bro
function http_only(rec: Conn::Info) : bool
{
# Record only connections with successfully analyzed HTTP traffic
return rec$service == "http";
}
event bro_init()
{
local filter: Log::Filter = [$name="http-only", $path="conn-http", $pred=http_only];
Log::add_filter(Conn::LOG, filter);
}
This will result in a log file ``conn-http.log`` that contains only
traffic detected and analyzed as HTTP traffic.
Extending
---------
You can add further fields to a log stream by extending the record
type that defines its content. Let's say we want to add a boolean
field ``is_private`` to ``Conn::Info`` that indicates whether the
originator IP address is part of the RFC1918 space:
.. code:: bro
# Add a field to the connection log record.
redef record Conn::Info += {
## Indicate if the originator of the connection is part of the
## "private" address space defined in RFC1918.
is_private: bool &default=F &log;
};
Now we need to set the field. A connection's summary is generated at
the time its state is removed from memory. We can add another handler
at that time that sets our field correctly:
.. code:: bro
event connection_state_remove(c: connection)
{
if ( c$id$orig_h in Site::private_address_space )
c$conn$is_private = T;
}
Now ``conn.log`` will show a new field ``is_private`` of type
``bool``.
Notes:
- For extending logs this way, one needs a bit of knowledge about how
the script that creates the log stream is organizing its state
keeping. Most of the standard Bro scripts attach their log state to
the ``connection`` record where it can then be accessed, just as the
``c$conn`` above. For example, the HTTP analysis adds a field ``http
: HTTP::Info`` to the ``connection`` record. See the script
reference for more information.
- When extending records as shown above, the new fields must always be
declared either with a ``&default`` value or as ``&optional``.
Furthermore, you need to add the ``&log`` attribute or otherwise the
field won't appear in the output.
Hooking into the Logging
------------------------
Sometimes it is helpful to do additional analysis of the information
being logged. For these cases, a stream can specify an event that will
be generated every time a log record is written to it. All of Bro's
default log streams define such an event. For example, the connection
log stream raises the event ``Conn::log_conn(rec: Conn::Info)``. You
could use that, for example, to flag when a connection to a
specific destination exceeds a certain duration:
.. code:: bro
redef enum Notice::Type += {
## Indicates that a connection remained established longer
## than 5 minutes.
Long_Conn_Found
};
event Conn::log_conn(rec: Conn::Info)
{
if ( rec$duration > 5mins )
NOTICE([$note=Long_Conn_Found,
$msg=fmt("unusually long conn to %s", rec$id$resp_h),
$id=rec$id]);
}
Often, these events can be an alternative to post-processing Bro logs
externally with Perl scripts. Much of what such an external script
would do later offline, one may instead do directly inside of Bro in
real-time.
Rotation
--------
ASCII Writer Configuration
--------------------------
The ASCII writer has a number of options for customizing the format of
its output, see XXX.bro.
Adding Streams
==============
It's easy to create a new log stream for custom scripts. Here's an
example for the ``Foo`` module:
.. code:: bro
module Foo;
export {
# Create an ID for our new stream. By convention, this is
# called "LOG".
redef enum Log::ID += { LOG };
# Define the fields. By convention, the type is called "Info".
type Info: record {
ts: time &log;
id: conn_id &log;
};
# Define a hook event. By convention, this is called
# "log_<stream>".
global log_foo: event(rec: Info);
}
# This event should be handled at a higher priority so that when
# users modify your stream later and they do it at priority 0,
# their code runs after this.
event bro_init() &priority=5
{
# Create the stream. This also adds a default filter automatically.
Log::create_stream(Foo::LOG, [$columns=Info, $ev=log_foo]);
}
You can also add the state to the ``connection`` record to make it easily
accessible across event handlers:
.. code:: bro
redef record connection += {
foo: Info &optional;
};
Now you can use the ``Log::write`` method to output log records and
save the logged ``Foo::Info`` record into the connection record:
.. code:: bro
event connection_established(c: connection)
{
local rec: Foo::Info = [$ts=network_time(), $id=c$id];
c$foo = rec;
Log::write(Foo::LOG, rec);
}
See the existing scripts for how to work with such a new connection
field. A simple example is ``base/protocols/syslog/main.bro``.
When you are developing scripts that add data to the ``connection``
record, care must be given to when and how long data is stored.
Normally data saved to the connection record will remain there for the
duration of the connection and from a practical perspective it's not
uncommon to need to delete that data before the end of the connection.
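For example, if the ``Foo::Info`` record from above is only needed until it has been written out, a handler along these lines could drop it again (a sketch, assuming the ``foo`` field added earlier):

.. code:: bro

    event connection_established(c: connection) &priority=-5
        {
        # The record was logged at default priority 0; if nothing
        # else needs it, there is no reason to carry it around for
        # the rest of the connection's lifetime.
        if ( c?$foo )
            delete c$foo;
        }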

doc/notices.rst
Telling Bro What's Important
============================
.. class:: opening
One of the easiest ways to customize Bro is writing a local
*notice policy*. Bro can detect a large number of potentially
interesting situations, and the notice policy specifies which of them
the user wants to be escalated into *alarms*. The notice policy
can also specify further actions to be taken, such as sending an
email. This page gives an introduction to writing such a notice
policy.
.. contents::
Overview
--------
Let us start with a little bit of background on Bro's philosophy on
reporting things. Bro ships with a large number of policy scripts
which perform a wide variety of analyses. Most of these scripts
monitor for activity which might be of interest for the administrator.
However, none of these scripts itself determines the importance of
what it finds. Instead, the scripts only flag situations as
*potentially* interesting, leaving it to the local configuration to
define which of them are in fact alarm-worthy. This decoupling of
detection and reporting allows Bro to address the different needs that
sites have: definitions of what constitutes an attack differ quite a
bit between environments, and activity deemed malicious at one place
might be fully acceptable at another.
Whenever one of Bro's analysis scripts sees something potentially
interesting, it flags the situation by raising a *Notice*. A Notice
has a *type*, which reflects the kind of activity that has been seen,
and it is usually also augmented with further *context* about the
situation. For example, whenever the HTTP analyzer sees a suspicious
URL being requested (such as ``/etc/passwd``), it raises a Notice of
the type HTTP_SensitiveURI and augments it with the requested URL
itself as well as the involved hosts.
In terms of script code, "raising a Notice" is just a call to a
predefined function called ``NOTICE``. For example, to raise an
``HTTP_SensitiveURI`` Notice, such a call could look like this:
.. code:: bro
NOTICE([$note=HTTP_SensitiveURI, $conn=connection, $URL=url, ...])
If one wants to know which types of Notices a Bro script can raise,
one can just grep the script for calls to the ``NOTICE`` function.
Once raised, all Notices are processed centrally. By default, all
Notices *are* in fact automatically turned into alarms and will
therefore show up in ``alarm.log``. The local site policy can however
change this default behavior, as we describe in the following.
In general, each raised Notice gets mapped to one out of a set of
predefined *actions*. Such an action can, e.g., be to send a mail to
the administrator or to simply ignore the Notice. Currently, the
following actions are defined:
.. list-table::
:widths: 20 80
:header-rows: 1
* - Action
- Description
* - ``NOTICE_IGNORE``
- Ignore Notice completely.
* - ``NOTICE_FILE``
- File Notice only to ``notice.log``; do not write an entry into
``alarm.log``.
* - ``NOTICE_ALARM_ALWAYS``
- Report in ``alarm.log``.
* - ``NOTICE_EMAIL``
- Send out a mail and report in ``alarm.log``
* - ``NOTICE_PAGE``
- Page security officer and report in ``alarm.log``.
* - ``NOTICE_DROP``
- Block connectivity for offending IP and report in ``alarm.log``.
``NOTICE_ALARM_ALWAYS`` reflects the default behavior if no other
action is defined for a Notice. All notice actions except
``NOTICE_IGNORE`` also log to ``notice.log``.
We can define which action is taken for a Notice in two ways. The
first is to generally assign an action to all instances of a
particular Notice type; the second provides the flexibility to filter
individual Notice instances independent of their type. We discuss both
in turn.
Notice Action Filters
---------------------
To generally apply the same action to all instances of a specific
type, we assign a *notice action filter* to the type. In the most
simple case, such a filter corresponds directly to the intended
action, per the following table:
.. list-table::
:widths: 20 20
:header-rows: 1
* - Filter Name
- Action
* - ``ignore_notice``
- ``NOTICE_IGNORE``
* - ``file_notice``
- ``NOTICE_FILE``
* - ``send_email_notice``
- ``NOTICE_EMAIL``
* - ``send_page_notice``
- ``NOTICE_PAGE``
* - ``drop_source``
- ``NOTICE_DROP``
(As ``NOTICE_ALARM_ALWAYS`` is the default action, there is no
corresponding filter).
We map a Notice type to such a filter by adding an entry to Bro's
predefined ``notice_action_filters`` table. For example, to just file
all sensitive URIs into ``notice.log`` rather than turning them into
alarms, we define:
.. code:: bro
@load notice-action-filters
redef notice_action_filters += {
[HTTP_SensitiveURI] = file_notice
};
Notice action filters are more powerful than just directly defining an
action. Each filter is in fact a script function which gets the Notice
instance as a parameter and returns the action Bro should take. In
general, these functions can implement arbitrary schemes to settle on
an action, which is why they are called "filters". In addition to the
filters mentioned above (which just return the corresponding action
without further ado), Bro's default script
``notice-action-filters.bro`` also defines the following ones (and
more):
.. list-table::
:widths: 20 80
:header-rows: 1
* - Filter
- Description
* - ``tally_notice_type``
- Count how often each Notice type occurred. The totals are
reported when Bro terminates as new Notices of the type
``NoticeTally``. The original Notices are just filed into
``notice.log``.
* - ``tally_notice_type_and_ignore``
- Similar to ``tally_notice_type`` but discards original
Notices.
* - ``file_if_remote``
- Do not alarm if Notice was triggered by a remote address.
* - ``notice_alarm_per_orig``
- Alarm only the first time we see the Notice type for each
source address.
* - ``notice_alarm_per_orig_tally``
- Count Notice types per source address. Totals are reported, by
default, every 5 hours as new ``NoticeTally`` Notices. The
original Notices are just filed into ``notice.log``.
Notice Policy
-------------
The predefined set ``notice_policy`` provides the second way to define
an action to be taken for a Notice. While ``notice_action_filters``
maps all instances of a particular Notice type to the same filter,
``notice_policy`` works on individual Notice instances. Each entry of
``notice_policy`` defines (1) a condition to be matched against all
raised Notices, and (2) an action to be taken if the condition matches.
Here's a simple example which tells Bro to ignore all Notices of type
``HTTP_SensitiveURI`` if the requested URL indicates that an image was
requested (simplified example taken from
``policy/notice-policy.bro``):
.. code:: bro
redef notice_policy += {
[$pred(n: notice_info) = {
return n$note == HTTP::HTTP_SensitiveURI &&
n$URL == /.*\.(gif|jpg|png)/;
},
$result = NOTICE_IGNORE]
};
While the syntax might look a bit convoluted at first, it provides a
lot of flexibility by leveraging Bro's match-statement. ``$pred``
defines the entry's condition in the form of a predicate written as a
Bro function. The function gets passed the raised Notice and it
returns a boolean indicating whether the entry applies. If the
predicate evaluates to true, Bro takes the action specified by
``$result``. (If ``$result`` is omitted, the default action for a
matching entry is ``NOTICE_FILE``).
The ``notice_policy`` set can hold an arbitrary number of such
entries. For each Notice, Bro evaluates the predicates of all of them.
If multiple predicates evaluate to true, it is undefined which of the
matching results is taken. One can however associate a *priority* with
an entry by adding a field ``$priority=<int>`` to its definition; see
``policy/notice-policy.bro`` for examples. In the case of multiple
matches with different priorities, Bro picks the one with the highest priority.
If ``$priority`` is omitted, as it is in the example above, the
default priority is 1.
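For example, to give the image exception shown earlier an explicit priority (a sketch based on that entry):

.. code:: bro

    redef notice_policy += {
        [$pred(n: notice_info) = {
            return n$note == HTTP::HTTP_SensitiveURI &&
                   n$URL == /.*\.(gif|jpg|png)/;
         },
         $result = NOTICE_IGNORE,
         $priority = 2]
    };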

doc/quickstart.rst
.. _CMake: http://www.cmake.org
.. _SWIG: http://www.swig.org
.. _MacPorts: http://www.macports.org
.. _Fink: http://www.finkproject.org
.. _Homebrew: http://mxcl.github.com/homebrew
=================
Quick Start Guide
=================
.. class:: opening
The short story for getting Bro up and running in a simple configuration
for analysis of either live traffic from a network interface or a packet
capture trace file.
.. contents::
Installation
============
Bro works on most modern, Unix-based systems and requires no custom
hardware. It can be downloaded in either pre-built binary package or
source code forms.
Pre-Built Binary Release Packages
---------------------------------
See the `downloads page <http://www.bro-ids.org/download/index.html>`_ for currently
supported/targeted platforms.
The primary install prefix for binary packages is ``/opt/bro``.
Non-MacOS packages that include BroControl also put variable/runtime data
(e.g. Bro logs) in ``/var/opt/bro``.
* RPM
.. console::
> sudo yum localinstall Bro-all*.rpm
* DEB
.. console::
> sudo gdebi Bro-all-*.deb
* MacOS Disk Image with Installer
Just open the ``Bro-all-*.dmg`` and then run the ``.pkg`` installer.
Everything installed by the package will go into ``/opt/bro``.
* FreeBSD
TODO: ports will eventually be available.
Building From Source
--------------------
Required Dependencies
~~~~~~~~~~~~~~~~~~~~~
* RPM/RedHat-based Linux:
.. console::
> sudo yum install cmake make gcc gcc-c++ flex bison libpcap-devel openssl-devel python-devel swig
* DEB/Debian-based Linux:
.. console::
> sudo apt-get install cmake make gcc g++ flex bison libpcap-dev libssl-dev python-dev swig
* FreeBSD
Most required dependencies should come with a minimal FreeBSD install
except for the following.
.. console::
> sudo pkg_add -r cmake swig bison python
* Mac OS X
Snow Leopard (10.6) comes with all required dependencies except for CMake_.
Lion (10.7) comes with all required dependencies except for CMake_ and SWIG_.
Distributions of these dependencies can be obtained from the project websites
linked above, but they're also likely available from your preferred Mac OS X
package management system (e.g. MacPorts_, Fink_, or Homebrew_).
Optional Dependencies
~~~~~~~~~~~~~~~~~~~~~
Bro can use libmagic for identifying file types, libGeoIP for geo-locating
IP addresses, libz for (de)compression during analysis and communication,
and sendmail for sending emails.
* RPM/RedHat-based Linux:
.. console::
> sudo yum install zlib-devel file-devel GeoIP-devel sendmail
* DEB/Debian-based Linux:
.. console::
> sudo apt-get install zlib1g-dev libmagic-dev libgeoip-dev sendmail
* Ports-based FreeBSD
(libz, libmagic, and sendmail are typically already available)
.. console::
> sudo pkg_add -r GeoIP
* Mac OS X
Vanilla OS X installations don't ship with libmagic or libGeoIP, but
if installed from your preferred package management system (e.g. MacPorts,
Fink, or Homebrew), they should be automatically detected and Bro will compile
against them.
Additional steps may be needed to `get the right GeoIP database
<http://www.bro-ids.org/documentation/geoip.html>`_.
Compiling Bro Source Code
~~~~~~~~~~~~~~~~~~~~~~~~~
Bro releases are bundled into source packages for convenience and
available from the `downloads page <http://www.bro-ids.org/download/index.html>`_.
The latest Bro development versions are obtainable through git repositories
hosted at `{{cfg_git_url}} <{{cfg_git_url}}>`_. See our `git development
documentation <http://www.bro-ids.org/development/process.html>`_ for comprehensive
information on Bro's use of git revision control, but the short story for
downloading the full source code experience for Bro via git is:
.. console::
> git clone --recursive git://git.bro-ids.org/bro
.. note:: If you choose to clone the ``bro`` repository non-recursively for
a "minimal Bro experience", be aware that compiling it depends on
BinPAC, which has its own ``binpac`` repository. Either install it
first or initialize/update the cloned ``bro`` repository's
``aux/binpac`` submodule.
See the ``INSTALL`` file included with the source code for more information
on compiling, but this is the typical way to build and install from source
(of course, changing the value of the ``--prefix`` option to point to the
desired root install path):
.. console::
> ./configure --prefix=/desired/install/path
> make
> make install
The default installation prefix is ``/usr/local/bro``, which would typically
require root privileges when doing the ``make install``.
Configure the Run-Time Environment
----------------------------------
Just remember that you may need to adjust your ``PATH`` environment variable
according to the platform/shell/package you're using. For example:
Bourne-Shell Syntax:
.. console::
> export PATH=/usr/local/bro/bin:$PATH
C-Shell Syntax:
.. console::
> setenv PATH /usr/local/bro/bin:$PATH
Or substitute ``/opt/bro/bin`` instead if you installed from a binary package.
Using BroControl
================
BroControl is an interactive shell for easily operating/managing Bro
installations on a single system or even across multiple systems in a
traffic-monitoring cluster.
.. note:: Below, ``$PREFIX``, is used to reference the Bro installation
root directory.
Minimal Starting Config
-----------------------
The basic configuration changes to make for a minimal BroControl installation
that will manage a single Bro instance on the ``localhost``:
1) In ``$PREFIX/etc/node.cfg``, set the right interface to monitor.
2) In ``$PREFIX/etc/networks.cfg``, comment out the default settings and add
   the networks that Bro will consider local to the monitored environment.
3) In ``$PREFIX/etc/broctl.cfg``, change the ``MailTo`` email address to a
desired recipient and the ``LogRotationInterval`` to a desired log
archival frequency.
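For illustration, a minimal standalone configuration might look like this (the interface name and example subnets are placeholders to adapt to your environment)::

    # $PREFIX/etc/node.cfg -- monitor one interface on this host
    [bro]
    type=standalone
    host=localhost
    interface=eth0

    # $PREFIX/etc/networks.cfg -- subnets considered local
    10.0.0.0/8          Private address space
    192.168.0.0/16      Private address space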
Now start the BroControl shell like:
.. console::
> broctl
Since this is the first-time use of the shell, perform an initial installation
of the BroControl configuration:
.. console::
[BroControl] > install
Then start up a Bro instance:
.. console::
[BroControl] > start
If there are errors while trying to start the Bro instance, you can
view the details with the ``diag`` command. If started successfully,
the Bro instance will begin analyzing traffic according to a default
policy and output the results in ``$PREFIX/logs``.
.. note:: The `FAQ <http://www.bro-ids.org/documentation/faq.html>`_ entries about
capturing as an unprivileged user and checksum offloading are particularly
relevant at this point.
You can leave it running for now, but to stop this Bro instance you would do:
.. console::
[BroControl] > stop
Browsing Log Files
------------------
By default, logs are written in human-readable (ASCII) format and data
is organized into columns (tab-delimited). Logs that are part of the
current rotation interval are accumulated in ``$PREFIX/logs/current/``
(if Bro is not running, then there will not be any log files in this
directory). For example, the ``http.log`` contains the results of
analysis performed by scripts in ``$PREFIX/share/bro/``\ **base**\
``/protocols/http/`` or ``$PREFIX/share/bro/``\ **policy**\
``/protocols/http/`` (both contain code that may contribute to what ends
up in the log).
Here's the first few columns of ``http.log``::
# ts uid orig_h orig_p resp_h resp_p
1311627961.8 HSH4uV8KVJg 192.168.1.100 52303 192.150.187.43 80
Logs that deal with analysis of a network protocol will often start like this:
a timestamp, a connection identifier (UID), and a connection 4-tuple
(originator host/port and responder host/port). The UID can be used to
identify all logged activity (possibly across multiple log files) associated
with a given connection 4-tuple over its lifetime.
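For example, using the UID from the illustrative excerpt above, all logged activity for that connection can be pulled from the current logs with a simple grep:

.. console::

   > grep HSH4uV8KVJg $PREFIX/logs/current/*.log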
The remaining columns of protocol-specific logs then detail the
protocol-dependent activity that's occurring. E.g. ``http.log``'s next few
columns (shortened for brevity) show a request to the root of Bro website::
# method host uri referrer user_agent
GET bro-ids.org / - <...>Chrome/12.0.742.122<...>
Some logs are worth explicit mention:
``weird.log`` contains unusual/exceptional activity that can indicate
malformed connections, traffic that doesn't conform to a particular
protocol, malfunctioning/misconfigured hardware, or even an attacker
attempting to avoid/confuse a sensor. Without context, it's hard to judge
whether this category of activity is interesting, so that judgment is
left up to the user to configure.
``notice.log`` identifies specific activity that Bro recognizes as
potentially interesting, odd, or bad.
``alarm.log`` is just a filtered version of ``notice.log``, containing
only the notices that the user has taught Bro to treat as
interesting/bad.
By default, ``BroControl`` regularly takes all the logs from
``$PREFIX/logs/current``, and archives/compresses them to a directory
named by date, e.g. ``$PREFIX/logs/2011-10-06``. The frequency
at which this is done can be configured via the ``LogRotationInterval``
option in ``$PREFIX/etc/broctl.cfg``.
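For example, to archive logs hourly, the option could be set like this in ``$PREFIX/etc/broctl.cfg`` (the value is in seconds)::

    # Rotate and archive logs every hour.
    LogRotationInterval = 3600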
Deployment Customization
------------------------
The goal of most Bro *deployments* may be to send email alarms when a network
event requires human intervention/investigation, but sometimes that conflicts
with Bro's goal as a *distribution* to remain policy and site neutral -- the
events on one network may be less noteworthy than the same events on another.
As a result, deploying Bro can be an iterative process of
updating its policy to take different actions for events that are noticed, and
using its scripting language to programmatically extend traffic analysis
in a precise way.
One of the first steps to take in customizing Bro might be to get familiar
with the notices it can generate by default and either tone down or escalate
the action that's taken when specific ones occur.
Let's say that we've been looking at the ``notice.log`` for a bit and see two
changes we want to make:
1) ``SSL::Invalid_Server_Cert`` (found in the ``note`` column) is one type of
notice that means an SSL connection was established and the server's
certificate couldn't be validated using Bro's default trust roots, but
we want to ignore it.
2) ``SSH::Login`` is a notice type that is triggered when an SSH connection
attempt looks like it may have been successful, and we want email when
that happens, but only for certain servers.
So we've defined *what* we want to do, but need to know *where* to do it.
The answer is to use a script written in the Bro programming language, so
let's do a quick intro to Bro scripting.
Bro Scripts
~~~~~~~~~~~
Bro ships with many pre-written scripts that are highly customizable to
support traffic analysis for your specific environment. By default,
these will be installed into ``$PREFIX/share/bro`` and can be identified
by the use of a ``.bro`` file name extension. These files should
**never** be edited directly as changes will be lost when upgrading to
newer versions of Bro. The exception to this rule is that any ``.bro``
file in ``$PREFIX/share/bro/site`` can be modified without fear of being
clobbered later. If desired, the ``site`` directory can also be used to
store new scripts. The other main script directories under
``$PREFIX/share/bro`` are ``base`` and ``policy``. By default, Bro
automatically loads all scripts under ``base`` (unless the ``-b``
command line option is supplied), which deal either with collecting
basic/useful state about network activities or providing
frameworks/utilities that extend Bro's functionality without any
performance cost. Scripts under the ``policy`` directory may be more
situational or costly, and so users must explicitly choose if
they want to load them.
The main entry point for the default analysis configuration of a standalone
Bro instance managed by BroControl is the ``$PREFIX/share/bro/site/local.bro``
script. So we'll be adding to that in the following sections, but first
we have to figure out what to add.
Redefining Script Option Variables
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Many simple customizations just require you to redefine (using the ``redef``
operator) a variable from a standard Bro script with your own value.
The typical way a standard Bro script advertises tweak-able options to users
is by defining variables with the ``&redef`` attribute and ``const`` qualifier.
A redefinable constant might seem strange, but what it really means is
that the variable's value may not change at run-time, while its initial
value can still be modified via the ``redef`` operator at parse-time.
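As a minimal sketch (the variable ``greeting`` is hypothetical, not a real Bro option), a standard script might advertise an option and ``local.bro`` override it:

.. code:: bro

   # In a standard script (the option's author writes this):
   const greeting = "hello" &redef;

   # In local.bro (the user overrides the initial value at parse-time):
   redef greeting = "howdy";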
So let's continue on our path to modify the behavior for the two SSL and SSH
notices. Looking at
`$PREFIX/share/bro/base/frameworks/notice/main.bro <{{ git('base.frameworks.notice.main.bro.txt', 'master:bro/scripts/base/frameworks/notice/main.bro') }}>`_,
we see that it advertises:
.. code:: bro
module Notice;
export {
...
## Ignored notice types.
const ignored_types: set[Notice::Type] = {} &redef;
}
That's exactly what we want to do for the SSL notice. So add to ``local.bro``:
.. code:: bro
redef Notice::ignored_types += { SSL::Invalid_Server_Cert };
.. note:: The ``Notice`` namespace scoping is necessary here because the
variable was declared and exported inside the ``Notice`` module, but is
being referenced from outside of it. Variables declared and exported
inside a module do not have to be scoped if referring to them while still
inside the module.
Then go into the BroControl shell to check whether the configuration change
is valid before installing it and then restarting the Bro instance:
.. console::
[BroControl] > check
bro is ok.
[BroControl] > install
removing old policies in /usr/local/bro/spool/policy/site ... done.
removing old policies in /usr/local/bro/spool/policy/auto ... done.
creating policy directories ... done.
installing site policies ... done.
generating standalone-layout.bro ... done.
generating local-networks.bro ... done.
generating broctl-config.bro ... done.
updating nodes ... done.
[BroControl] > restart
stopping bro ...
starting bro ...
Now that the SSL notice is ignored, let's look at how to send an email on
the SSH notice. The notice framework has a similar option called
``emailed_types``, but that can't differentiate between SSH servers and we
only want email for logins to certain ones. Then we come to the ``PolicyItem``
record and ``policy`` set and realize that those are actually what get used
to implement the simple functionality of ``ignored_types`` and
``emailed_types``, but it's extensible such that the condition and action taken
on notices can be user-defined.
In ``local.bro``, let's add a new ``PolicyItem`` record to the ``policy`` set
that only takes the email action for SSH logins to a defined set of servers:
.. code:: bro
const watched_servers: set[addr] = {
192.168.1.100,
192.168.1.101,
192.168.1.102,
} &redef;
redef Notice::policy += {
[$result = Notice::ACTION_EMAIL,
$pred(n: Notice::Info) =
{
return n$note == SSH::Login && n$id$resp_h in watched_servers;
}
]
};
You'll just have to trust the syntax for now, but what we've done is
first declare our own variable to hold a set of watched addresses,
``watched_servers``, then added a record to the policy that will generate
an email on the condition that the predicate function evaluates to true, which
is whenever the notice type is an SSH login and the responding host stored
inside the ``Info`` record's connection field is in the set of watched servers.
.. note:: Record field member access is done with the '$' character instead
   of a '.' as might be expected, in order to avoid ambiguity with the
   builtin address type's use of '.' in IPv4 dotted decimal representations.
Remember, to finalize that configuration change, perform the ``check``,
``install``, and ``restart`` commands in that order inside the BroControl shell.
Next Steps
----------
By this point, we've learned how to set up the most basic Bro instance and
tweak the most basic options. Here's some suggestions on what to explore next:
* We only looked at how to change options declared in the notice framework,
there's many more options to look at in other script packages.
* Reading the code of scripts that ship with Bro is also a great way to gain
understanding of the language and how you can start writing your own custom
analysis.
* Review the `FAQ <http://www.bro-ids.org/documentation/faq.html>`_.
* Check out more `documentation <http://www.bro-ids.org/documentation/index.html>`_.
* Continue reading below for another mini-tutorial on using Bro as a standalone
command-line utility.
Bro, the Command-Line Utility
=============================
If you prefer not to use BroControl (e.g. don't need its automation and
management features), here's how to directly control Bro for your analysis
activities.
Monitoring Live Traffic
-----------------------
Analyzing live traffic from an interface is simple:
.. console::
> bro -i en0 <list of scripts to load>
``en0`` can be replaced by the interface of your choice and for the list of
scripts, you can just use "all" for now to perform all the default analysis
that's available.
Bro will output log files into the working directory.
.. note:: The `FAQ <http://www.bro-ids.org/documentation/faq.html>`_ entries about
capturing as an unprivileged user and checksum offloading are particularly
relevant at this point.
Reading Packet Capture (pcap) Files
-----------------------------------
Capturing packets from an interface and writing them to a file can be done
like this:
.. console::
> sudo tcpdump -i en0 -s 0 -w mypackets.trace
Where ``en0`` can be replaced by the correct interface for your system as
shown by e.g. ``ifconfig``. (The ``-s 0`` argument tells it to capture
whole packets; in cases where it's not supported use ``-s 65535`` instead).
After a while of capturing traffic, kill the ``tcpdump`` (with ctrl-c),
and tell Bro to perform all the default analysis on the capture:
.. console::
> bro -r mypackets.trace
Bro will output log files into the working directory.
If you are interested in more detection, you can load the ``local``
script that we include as a suggested configuration:
.. console::
> bro -r mypackets.trace local
This will cause Bro to print a warning about the ``Site::local_nets``
variable not being configured. You can supply this
information at the command line like this (supply your "local" subnets
in place of the example subnets):
.. console::
> bro -r mypackets.trace local "Site::local_nets += { 1.2.3.0/24, 5.6.7.0/24 }"
Telling Bro Which Scripts to Load
---------------------------------
A command-line invocation of Bro typically looks like:
.. console::
> bro <options> <policies...>
Where the last arguments are the specific policy scripts that this Bro
instance will load. These arguments don't have to include the ``.bro``
file extension, and if the corresponding script resides under the default
installation path, ``$PREFIX/share/bro``, then it requires no path
qualification. Further, a directory of scripts can be specified as
an argument to be loaded as a "package" if it contains a ``__load__.bro``
script that defines the scripts that are part of the package.
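As a sketch, a hypothetical package directory could be made loadable as a unit with a ``__load__.bro`` like this (the directory and script names are placeholders):

.. code:: bro

   # myscripts/__load__.bro -- loading "myscripts" pulls in both scripts
   @load ./detect-things
   @load ./tuning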
This example does all of the base analysis (primarily protocol
logging) and adds SSL certificate validation.
.. console::
> bro -r mypackets.trace protocols/ssl/validate-certs
You might notice that a script you load from the command line uses the
``@load`` directive in the Bro language to declare dependence on other scripts.
This directive is similar to the ``#include`` of C/C++, except the semantics
are "load this script if it hasn't already been loaded".
.. note:: If one wants Bro to be able to load scripts that live outside the
default directories in Bro's installation root, the ``BROPATH`` environment
variable will need to be extended to include all the directories that need
to be searched for scripts. See the default search path by doing
``bro --help``.
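For example, a directory of personal scripts could be prepended to the search path like this (Bourne-shell syntax; the path is a placeholder, and if ``BROPATH`` is not already set in your environment you will need to list Bro's default directories, as shown by ``bro --help``, as well):

.. console::

   > export BROPATH=/home/me/bro-scripts:$BROPATH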
==========
Signatures
==========
.. class:: opening
Bro relies primarily on its extensive scripting language for
defining and analyzing detection policies. In addition, however,
Bro also provides an independent *signature language* for doing
low-level, Snort-style pattern matching. While signatures are
*not* Bro's preferred detection tool, they sometimes come in handy
and are closer to what many people are familiar with from using
other NIDS. This page gives a brief overview on Bro's signatures
and covers some of their technical subtleties.
.. contents::
:depth: 2
Basics
======
Let's look at an example signature first:
.. code:: bro-sig
signature my-first-sig {
ip-proto == tcp
dst-port == 80
payload /.*root/
event "Found root!"
}
This signature asks Bro to match the regular expression ``.*root`` on
all TCP connections going to port 80. When the signature triggers, Bro
will raise an event ``signature_match`` of the form:
.. code:: bro
event signature_match(state: signature_state, msg: string, data: string)
Here, ``state`` contains more information on the connection that
triggered the match, ``msg`` is the string specified by the
signature's event statement (``Found root!``), and ``data`` is the last
piece of payload which triggered the pattern match.
To turn such ``signature_match`` events into actual alarms, you can
load Bro's ``signature.bro`` script. This script contains a default
event handler that raises ``SensitiveSignature`` `Notices
<notices.html>`_ (as well as others; see the beginning of the script).
As signatures are independent of Bro's policy scripts, they are put
into their own file(s). There are two ways to specify which files
contain signatures: By using the ``-s`` flag when you invoke Bro, or
by extending the Bro variable ``signatures_files`` using the ``+=``
operator. If a signature file is given without a path, it is searched
along the normal ``BROPATH``. The default extension of the file name
is ``.sig``, and Bro appends that automatically when necessary.
Signature language
==================
Let's look at the format of a signature more closely. Each individual
signature has the format ``signature <id> { <attributes> }``. ``<id>``
is a unique label for the signature. There are two types of
attributes: *conditions* and *actions*. The conditions define when the
signature matches, while the actions declare what to do in the case of
a match. Conditions can be further divided into four types: *header*,
*content*, *dependency*, and *context*. We discuss these all in more
detail in the following.
Conditions
----------
Header Conditions
~~~~~~~~~~~~~~~~~
Header conditions limit the applicability of the signature to a subset
of traffic that contains matching packet headers. For TCP, this match
is performed only for the first packet of a connection. For other
protocols, it is done on each individual packet.
There are pre-defined header conditions for some of the most used
header fields. All of them generally have the format ``<keyword> <cmp>
<value-list>``, where ``<keyword>`` names the header field; ``cmp`` is
one of ``==``, ``!=``, ``<``, ``<=``, ``>``, ``>=``; and
``<value-list>`` is a list of comma-separated values to compare
against. The following keywords are defined:
``src-ip``/``dst-ip <cmp> <address-list>``
    Source and destination address, respectively. Addresses can be
    given as IP addresses or CIDR masks.
``src-port``/``dst-port`` ``<int-list>``
    Source and destination port, respectively.
``ip-proto tcp|udp|icmp``
IP protocol.
For lists of multiple values, they are sequentially compared against
the corresponding header field. If at least one of the comparisons
evaluates to true, the whole header condition matches (exception: with
``!=``, the header condition only matches if all values differ).
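For instance, the following illustrative signature uses a value list so that it matches connections to any of several ports:

.. code:: bro-sig

   signature example-ports {
       ip-proto == tcp
       # Matches if the destination port is any one of these values.
       dst-port == 80,8000,8080
       event "connection to a common HTTP port"
   }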
In addition to these pre-defined header keywords, a general header
condition can be defined as
.. code:: bro-sig
header <proto>[<offset>:<size>] [& <integer>] <cmp> <value-list>
This compares the value found at the given position of the packet
header with a list of values. ``offset`` defines the position of the
value within the header of the protocol defined by ``proto`` (which
can be ``ip``, ``tcp``, ``udp`` or ``icmp``). ``size`` is either 1, 2,
or 4 and specifies the size of the value in bytes. If the
optional ``& <integer>`` is given, the packet's value is first masked
with the integer before it is compared to the value-list. ``cmp`` is
one of ``==``, ``!=``, ``<``, ``<=``, ``>``, ``>=``. ``value-list`` is
a list of comma-separated integers similar to those described above.
The integers within the list may be followed by an additional ``/
mask`` where ``mask`` is a value from 0 to 32. This corresponds to the
CIDR notation for netmasks and is translated into a corresponding
bitmask applied to the packet's value prior to the comparison (similar
to the optional ``& integer``).
Putting it all together, this is an example condition that is
equivalent to ``dst-ip == 1.2.3.4/16, 5.6.7.8/24``:
.. code:: bro-sig
header ip[16:4] == 1.2.3.4/16, 5.6.7.8/24
Internally, the predefined header conditions are in fact just
shortcuts and are mapped into a generic condition.
Content Conditions
~~~~~~~~~~~~~~~~~~
Content conditions are defined by regular expressions. We
differentiate two kinds of content conditions: first, the expression
may be declared with the ``payload`` statement, in which case it is
matched against the raw payload of a connection (for reassembled TCP
streams) or of each packet (for ICMP, UDP, and non-reassembled TCP).
Second, it may be prefixed with an analyzer-specific label, in which
case the expression is matched against the data as extracted by the
corresponding analyzer.
A ``payload`` condition has the form:
.. code:: bro-sig
payload /<regular expression>/
Currently, the following analyzer-specific content conditions are
defined (note that the corresponding analyzer has to be activated by
loading its policy script):
``http-request /<regular expression>/``
The regular expression is matched against decoded URIs of HTTP
requests. Obsolete alias: ``http``.
``http-request-header /<regular expression>/``
The regular expression is matched against client-side HTTP headers.
``http-request-body /<regular expression>/``
    The regular expression is matched against client-side bodies of
    HTTP requests.
``http-reply-header /<regular expression>/``
The regular expression is matched against server-side HTTP headers.
``http-reply-body /<regular expression>/``
    The regular expression is matched against server-side bodies of
    HTTP replies.
``ftp /<regular expression>/``
The regular expression is matched against the command line input
of FTP sessions.
``finger /<regular expression>/``
The regular expression is matched against finger requests.
For example, ``http-request /.*(etc/(passwd|shadow))/`` matches any URI
containing either ``etc/passwd`` or ``etc/shadow``. To filter on request
types, e.g. ``GET``, use ``payload /GET /``.
Note that HTTP pipelining (that is, multiple HTTP transactions in a
single TCP connection) has some side effects on signature matches. If
multiple conditions are specified within a single signature, this
signature matches if all conditions are met by any HTTP transaction
(not necessarily always the same!) in a pipelined connection.
Dependency Conditions
~~~~~~~~~~~~~~~~~~~~~
To define dependencies between signatures, there are two conditions:
``requires-signature [!] <id>``
Defines the current signature to match only if the signature given
by ``id`` matches for the same connection. Using ``!`` negates the
condition: The current signature only matches if ``id`` does not
match for the same connection (using this defers the match
decision until the connection terminates).
``requires-reverse-signature [!] <id>``
    Similar to ``requires-signature``, but ``id`` has to match for the
    opposite direction of the same connection, compared to the current
    signature. This allows modeling the notion of requests and
    replies.
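As an illustrative sketch of modeling a request/reply pair (the signature names and FTP reply code are examples, not signatures shipped with Bro):

.. code:: bro-sig

   signature ftp-user {
       ip-proto == tcp
       dst-port == 21
       payload /.*USER /
   }

   signature ftp-login-failed {
       ip-proto == tcp
       src-port == 21
       # 530 is the FTP "not logged in" reply code.
       payload /.*530 /
       # Only match if ftp-user matched in the other direction
       # of the same connection.
       requires-reverse-signature ftp-user
       event "FTP login failed"
   }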
Context Conditions
~~~~~~~~~~~~~~~~~~
Context conditions pass the match decision on to other components of
Bro. They are only evaluated if all other conditions have already
matched. The following context conditions are defined:
``eval <policy-function>``
    The given policy function is called and has to return a boolean
    confirming the match. If false is returned, no signature match is
    going to be triggered. The function has to be of type ``function
    cond(state: signature_state, data: string): bool``. Here, ``data``
    may contain the most recent content chunk available at the time
    the signature was matched. If no such chunk is available, ``data``
    will be the empty string. ``signature_state`` is defined as
    follows:
.. code:: bro
type signature_state: record {
id: string; # ID of the signature
conn: connection; # Current connection
is_orig: bool; # True if current endpoint is originator
payload_size: count; # Payload size of the first packet
};
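A minimal sketch of using ``eval`` (the function name is hypothetical; it must have the type described above):

.. code:: bro

   function is_big_enough(state: signature_state, data: string): bool
       {
       # Only confirm the match for payloads larger than 512 bytes.
       return state$payload_size > 512;
       }

A signature would then include the condition ``eval is_big_enough``.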
``payload-size <cmp> <integer>``
Compares the integer to the size of the payload of a packet. For
reassembled TCP streams, the integer is compared to the size of
the first in-order payload chunk. Note that the latter is not very
well defined.
``same-ip``
Evaluates to true if the source address of the IP packets equals
its destination address.
``tcp-state <state-list>``
    Imposes restrictions on the current TCP state of the connection.
    ``state-list`` is a comma-separated list of the keywords
    ``established`` (the three-way handshake has already been
    performed), ``originator`` (the current data is sent by the
    originator of the connection), and ``responder`` (the current data
    is sent by the responder of the connection).
Actions
-------
Actions define what to do if a signature matches. Currently, there are
two actions defined:
``event <string>``
    Raises a ``signature_match`` event. The event handler has the
    following type:

    .. code:: bro

        event signature_match(state: signature_state, msg: string, data: string)

    The given string is passed in as ``msg``, and ``data`` is the current
    part of the payload that has eventually led to the signature
    match (this may be empty for signatures without content
    conditions).
``enable <string>``
Enables the protocol analyzer ``<string>`` for the matching
connection (``"http"``, ``"ftp"``, etc.). This is used by Bro's
dynamic protocol detection to activate analyzers on the fly.
Things to keep in mind when writing signatures
==============================================
* Each signature is reported at most once for every connection;
  further matches of the same signature are ignored.
* The content conditions perform pattern matching on elements
extracted from an application protocol dialogue. For example, ``http
/.*passwd/`` scans URLs requested within HTTP sessions. The thing to
keep in mind here is that these conditions only perform any matching
when the corresponding application analyzer is actually *active* for
a connection. Note that by default, analyzers are not enabled if the
corresponding Bro script has not been loaded. A good way to
double-check whether an analyzer "sees" a connection is checking its
log file for corresponding entries. If you cannot find the
connection in the analyzer's log, very likely the signature engine
has also not seen any application data.
* As the name indicates, the ``payload`` keyword matches on packet
*payload* only. You cannot use it to match on packet headers; use
the header conditions for that.
* For TCP connections, header conditions are only evaluated for the
*first packet from each endpoint*. If a header condition does not
match the initial packets, the signature will not trigger. Bro
optimizes for the most common application here, which is header
conditions selecting the connections to be examined more closely
with payload statements.
* For UDP and ICMP flows, the payload matching is done on a per-packet
basis; i.e., any content crossing packet boundaries will not be
found. For TCP connections, the matching semantics depend on whether
Bro is *reassembling* the connection (i.e., putting all of a
connection's packets in sequence). By default, Bro is reassembling
the first 1K of every TCP connection, which means that within this
window, matches will be found without regards to packet order or
boundaries (i.e., *stream-wise matching*).
* For performance reasons, by default Bro *stops matching* on a
connection after seeing 1K of payload; see the section on options
below for how to change this behaviour. The default was chosen with
Bro's main user of signatures in mind: dynamic protocol detection
works well even when examining just connection heads.
* Regular expressions are implicitly anchored, i.e., they work as if
prefixed with the ``^`` operator. For reassembled TCP connections,
they are anchored at the first byte of the payload *stream*. For all
other connections, they are anchored at the first payload byte of
each packet. To match at arbitrary positions, you can prefix the
regular expression with ``.*``, as done in the examples above.
* To match on non-ASCII characters, Bro's regular expressions support
the ``\x<hex>`` operator. CRs/LFs are not treated specially by the
signature engine and can be matched with ``\r`` and ``\n``,
respectively. Generally, Bro follows `flex's regular expression
syntax
<http://www.gnu.org/software/flex/manual/html_chapter/flex_7.html>`_.
See the DPD signatures in ``policy/sigs/dpd.bro`` for some examples
of fairly complex payload patterns.
* The data argument of the ``signature_match`` handler might not carry
the full text matched by the regular expression. Bro performs the
matching incrementally as packets come in; when the signature
eventually fires, it can only pass on the most recent chunk of data.
Options
=======
The following options control details of Bro's matching process:
``dpd_reassemble_first_packets: bool`` (default: ``T``)
If true, Bro reassembles the beginning of every TCP connection (of
up to ``dpd_buffer_size`` bytes, see below), to facilitate
reliable matching across packet boundaries. If false, only
connections are reassembled for which an application-layer
analyzer gets activated (e.g., by Bro's dynamic protocol
detection).
``dpd_match_only_beginning : bool`` (default: ``T``)
If true, Bro performs packet matching only within the initial
payload window of ``dpd_buffer_size``. If false, it keeps matching
on subsequent payload as well.
``dpd_buffer_size: count`` (default: ``1024``)
Defines the buffer size for the two preceding options. In
addition, this value determines the amount of bytes Bro buffers
for each connection in order to activate application analyzers
even after parts of the payload have already passed through. This
is needed by the dynamic protocol detection capability to defer
the decision which analyzers to use.
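These options can be redefined from a loaded script like any other option; for example (the values below are illustrative):

.. code:: bro

   # Keep matching on payload past the initial buffer, and enlarge it.
   redef dpd_match_only_beginning = F;
   redef dpd_buffer_size = 4096;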
So, how about using Snort signatures with Bro?
==============================================
There was once a script, ``snort2bro``, that converted Snort
signatures automatically into Bro's signature syntax. However, in our
experience this didn't turn out to be a very useful thing to do
because by simply using Snort signatures, one can't benefit from the
additional capabilities that Bro provides; the approaches of the two
systems are just too different. We therefore stopped maintaining the
``snort2bro`` script, and there are now many newer Snort options which
it doesn't support. The script is now no longer part of the Bro
distribution.
=============================
Upgrading From Bro 1.5 to 2.0
=============================
.. class:: opening
This guide details differences between Bro version 1.5 and 2.0 that
may be important for users to know as they work on updating their
Bro deployment/configuration to the later version.
.. contents::
New Development Process
=======================
Bro development has moved from using SVN to Git for revision control.
Users that like to use the latest Bro developments by checking it out
from the source repositories should see the `development process
<http://www.bro-ids.org/development/process.html>`_
Bro now uses `CMake <http://www.cmake.org>`_ for its build system so
that is a new required dependency when building from source.
New Script Organization/Hierarchy
=================================
In versions before 2.0, Bro scripts were all maintained in a flat
directory called ``policy/`` in the source tree. This directory is now
renamed to ``scripts/`` and contains major subdirectories ``base/``,
``policy/``, and ``site/``, each of which may also be subdivided further.

The contents of the new ``scripts/`` directory, like the old/flat
``policy/``, still get installed under the ``share/bro``
subdirectory of the installation prefix path just like previous
versions. For example, if Bro was compiled like ``./configure
--prefix=/usr/local/bro && make && make install``, then the script
hierarchy can be found in ``/usr/local/bro/share/bro``. The main
subdirectories of that hierarchy are as follows:

- ``base/`` contains all scripts that are loaded by Bro by default
  (unless the ``-b`` command line option is used to run Bro in a
  minimal configuration). Scripts under this directory generally either
  provide extra Bro scripting-layer functionality that has no
  performance cost, configure a default/recommended mode of operation,
  or accumulate/log useful state/protocol information for monitored
  traffic.

- ``policy/`` contains all scripts that a user needs to explicitly
  tell Bro to load. These are scripts that implement
  functionality/analysis that not all users may want to use and that
  may have more significant performance costs.

- ``site/`` remains a directory that can be used to store locally
  developed scripts. It now also contains some extra scripts that
  provide recommended default configurations; e.g., ``local.bro``
  loads extra scripts from ``policy/`` and does extra tuning. These
  files can be customized in place without being overwritten by
  upgrades/reinstalls, unlike scripts in other directories.

Now, with version 2.0, the default/builtin ``BROPATH`` automatically
searches for scripts only in ``policy/``, ``site/``, and their parent
directory, but **not** in ``base/``. Everything under ``base/`` is
normally loaded automatically, but users of the ``-b`` option should
know that loading a script from that directory requires the extra
``base/`` path qualification. For example, the following two scripts:

* ``$PREFIX/share/bro/base/protocols/ssl/main.bro``
* ``$PREFIX/share/bro/policy/protocols/ssl/validate-certs.bro``

are referenced from another Bro script like:

.. code:: bro

    @load base/protocols/ssl/main
    @load protocols/ssl/validate-certs

Notice how ``policy/`` can be omitted as a convenience in the second
case.

Scripting-Layer API Changes
===========================

- The ``@prefixes`` directive works differently now.

  Any added prefixes are now searched for and loaded *after* all input
  files have been parsed. After all input files are parsed, Bro
  searches ``BROPATH`` for prefixed, flattened versions of all of the
  parsed input files. For example, if ``lcl`` is in ``@prefixes``, and
  ``site.bro`` is loaded, then a file named ``lcl.site.bro`` that's in
  ``BROPATH`` would end up being automatically loaded as well. Packages
  work similarly, e.g. loading ``protocols/http`` means a file named
  ``lcl.protocols.http.bro`` in ``BROPATH`` gets loaded automatically.

- The ``make_addr`` BIF now returns a ``subnet`` instead of an
  ``addr``.

- The ``net`` type has been removed.
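As a small sketch of the ``@prefixes`` behavior described above
(using the hypothetical prefix ``lcl`` from the example):

.. code:: bro

    # Register the prefix. After all input files are parsed, Bro will
    # additionally search BROPATH for prefixed, flattened versions of
    # every loaded script.
    @prefixes += lcl

    # If site.bro has been loaded anywhere, a file named lcl.site.bro
    # found in BROPATH is then loaded automatically as well.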

New Default Settings
====================

- Dynamic Protocol Detection (DPD) is now enabled/loaded by default.

- The default packet filter now examines all packets instead of
  dynamically building a filter based on which protocol analysis
  scripts are loaded. See ``PacketFilter::all_packets`` for how to
  revert to the old behavior.
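As a sketch, reverting to the 1.5-style dynamically built filter might
look like this (assuming the ``PacketFilter::all_packets`` option
mentioned above):

.. code:: bro

    # Build the capture filter from the loaded protocol analyzers
    # again, rather than capturing all packets.
    redef PacketFilter::all_packets = F;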

Script Overhaul/Modernization
=============================

Variable Naming
---------------

- Modules are more widely used for namespacing. E.g. the new
  ``site.bro`` exports the ``local_nets`` identifier (among other
  things) into the ``Site`` module.

- Identifiers may have been renamed to conform to the `scripting
  conventions
  <http://www.bro-ids.org/development/script-conventions.html>`_.
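For example, with the ``Site`` namespacing, local networks are now
configured along these lines (the networks shown are placeholders):

.. code:: bro

    # Declare which networks are considered "local" to this site.
    redef Site::local_nets += { 10.0.0.0/8, 192.168.0.0/16 };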

Logging Framework
-----------------

- The logs generated by the scripts that ship with Bro have been
  entirely redone to use a standardized format via the new logging
  framework, and the content has generally changed to make the logs
  even more useful.

  * A particular format change worth noting is that the ``conn.log``
    ``service`` field is now derived from DPD instead of well-known
    ports.

- A common pattern found in the new scripts is to store logging
  stream records for protocols inside ``connection`` records so that
  state can be collected until enough is seen to log a coherent unit
  of information about the activity of that connection. This state is
  now frequently accessible in event handlers as ``c$<protocol>``,
  where ``<protocol>`` is replaced by the name of the protocol. This
  field is added to the ``connection`` record by ``redef``'ing it in a
  ``base/protocols/<protocol>/main.bro`` script.

- The new logging framework also makes it possible to extend and
  filter logs. See `<logging.rst>`_.
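As a sketch of the ``c$<protocol>`` pattern, HTTP state might be
accessed like this (the exact record fields assume the stock
``base/protocols/http`` scripts are loaded):

.. code:: bro

    event http_reply(c: connection, version: string, code: count,
                     reason: string)
        {
        # The HTTP logging-stream record lives in c$http while the
        # request/reply pair is being assembled; check it exists
        # before use.
        if ( c?$http )
            print fmt("%s %d %s", c$id$orig_h, code, reason);
        }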

Communication Framework
-----------------------

- The ``remote.bro`` script has evolved into the communication
  framework:

  * The ``Remote`` module was renamed to ``Communication``.

  * ``Remote::destinations`` was renamed to ``Communication::nodes``
    (the table of peers).

  * ``Remote::Destination`` was renamed to ``Communication::Node``
    (the type that defines a remote peer).
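A peer entry in the renamed table might look like this (the name,
address, and field choices are illustrative; see the
``Communication::Node`` record for the full set of fields):

.. code:: bro

    redef Communication::nodes += {
        # Wait for a peer at this address to connect to us.
        ["worker-1"] = [$host = 10.0.0.2, $connect = F],
    };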

Notice Framework
----------------

The way users interact with "notices" has changed significantly in
order to make it easier to define a site policy and more extensible
for adding customized actions.

TODO: we need new notice documentation with examples to link from
here. The `old notice documentation <notices.html>`_ can be used as a
starting point.