diff --git a/doc/CMakeLists.txt b/doc/CMakeLists.txt index 427d2b86b6..373b4643ba 100644 --- a/doc/CMakeLists.txt +++ b/doc/CMakeLists.txt @@ -31,7 +31,7 @@ add_custom_target(broxygen ${DOC_SOURCE_WORKDIR}/scripts # append to the master index of all policy scripts COMMAND cat ${MASTER_POLICY_INDEX} >> - ${DOC_SOURCE_WORKDIR}/scripts/index.rst + ${DOC_SOURCE_WORKDIR}/scripts/scripts.rst # append to the master index of all policy packages COMMAND cat ${MASTER_PACKAGE_INDEX} >> ${DOC_SOURCE_WORKDIR}/scripts/packages.rst diff --git a/doc/cluster/index.rst b/doc/cluster/index.rst new file mode 100644 index 0000000000..6de70d38cc --- /dev/null +++ b/doc/cluster/index.rst @@ -0,0 +1,86 @@ + +======================== +Setting up a Bro Cluster +======================== + +Intro +------ + +Bro is not multithreaded, so once the limitations of a single processor core are reached, the only option currently is to spread the workload across many cores or even many physical computers. The cluster deployment scenario for Bro is the current solution for building these larger systems. The accompanying tools and scripts provide the structure to easily manage many Bro processes that examine packets and perform correlation activities while acting as a singular, cohesive entity. + +Architecture +--------------- + +The figure below illustrates the main components of a Bro cluster. + +.. image:: /images/deployment.png + +Tap +*** +This is a mechanism that splits the packet stream in order to make a copy +available for inspection. Examples include the monitoring port on a switch and +an optical splitter for fiber networks. + +Frontend +******** +This is a discrete hardware device or on-host technique that splits your traffic into many streams or flows. The Bro binary does not do this job. There are numerous ways to accomplish this task, some of which are described below in `Frontend Options`_. + +Manager +******* +This is a Bro process which has two primary jobs.
It receives log messages and notices from the rest of the nodes in the cluster using the Bro communications protocol. The result is that you end up with a single log for each log type instead of many discrete logs that you have to combine later in some manner with post processing. The manager also de-duplicates notices; it is able to do so because it acts as the choke point for notices and for how notices are processed into actions such as emailing, paging, or blocking. + +The manager process is started first by BroControl and it only opens its designated port and waits for connections; it doesn’t initiate any connections to the rest of the cluster. Once the workers are started and connect to the manager, logs and notices will start arriving at the manager process from the workers. + +Proxy +***** +This is a Bro process which manages synchronized state. Bro can automatically synchronize variables across connected Bro processes, and proxies help the workers by alleviating the need for all of the workers to connect directly to each other. + +Examples of synchronized state from the scripts that ship with Bro include the full list of “known” hosts and services, i.e., hosts or services that have been detected performing full TCP handshakes or on whose connections an analyzed protocol has been found. If worker A detects host 1.2.3.4 as an active host, it would be beneficial for worker B to know that as well, so worker A shares that information as an insertion to a set; the insertion travels to the cluster’s proxy, and the proxy then sends that same set insertion to worker B. The result is that worker A and worker B have shared knowledge about the hosts and services that are active on the network being monitored. + +The proxy model extends to having multiple proxies as well if necessary for performance reasons; it only adds one additional step for the Bro processes.
Each proxy connects to another proxy in a ring and the workers are shared between them as evenly as possible. When a proxy receives some new bit of state, it shares it with its neighboring proxy, and the state is then shared around the ring of proxies and down to all of the workers. From a practical standpoint, there are no rules of thumb established yet for the number of proxies necessary for the number of workers they are serving. It is best to start with a single proxy and add more if communication performance problems are found. + +Bro processes acting as proxies don’t tend to be particularly intensive in their use of CPU or memory, and users frequently run proxy processes on the same physical host as the manager. + +Worker +****** +This is the Bro process that sniffs network traffic and does protocol analysis on the reassembled traffic streams. Most of the work of an active cluster takes place on the workers and, as such, the workers typically represent the bulk of the Bro processes that are running in a cluster. The fastest memory and CPU core speed you can afford is best here since all of the protocol parsing and most analysis will take place here. There are no particular requirements for the disks in workers since almost all logging is done remotely to the manager and very little is normally written to disk. + +The rule of thumb we have followed recently is to allocate approximately 1 core for every 80Mbps of traffic that is being analyzed; however, this estimate can be extremely specific to the traffic mix. It has generally worked for mixed traffic with many users and servers. For example, if your traffic peaks around 2Gbps (combined) and you want to handle traffic at peak load, you may want to have 26 cores available (2048 / 80 == 25.6). If the 80Mbps estimate works for your traffic, this could be handled by 3 physical hosts dedicated to being workers, with each one containing dual 6-core processors.
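The sizing arithmetic above can be sketched as a quick calculation. The 80Mbps-per-core figure and the 12-core host size are the assumptions from the example; adjust both for your own traffic mix:

```python
import math

MBPS_PER_CORE = 80  # rule-of-thumb from the text; highly traffic-mix specific

def size_cluster(peak_mbps, cores_per_host):
    """Estimate worker cores and hosts for a given peak traffic rate."""
    cores = math.ceil(peak_mbps / MBPS_PER_CORE)
    hosts = math.ceil(cores / cores_per_host)
    return cores, hosts

# 2 Gbps peak (2048 Mbps as in the text), dual 6-core worker machines:
print(size_cluster(2048, 12))  # (26, 3): 26 cores spread over 3 hosts
```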
+ +Once a flow-based load balancer is put into place, this model is extremely easy to scale, so it is reasonable to start with an educated guess at the amount of hardware you will need to fully analyze your traffic. If it turns out that you need more, it is relatively easy to increase the size of the cluster in most cases. + +Frontend Options +---------------- + +There are many options for setting up a frontend flow distributor, and in many cases it may even be beneficial to do multiple stages of flow distribution on the network and on the host. + +Discrete hardware flow balancers +******************************** + +cPacket +^^^^^^^ + +If you are monitoring one or more 10G physical interfaces, the recommended solution is to use either a cFlow or cVu device from cPacket because they are currently being used very successfully at a number of sites. These devices perform layer-2 load balancing by rewriting the destination Ethernet MAC address to cause each packet associated with a particular flow to have the same destination MAC. The packets can then be passed directly to a monitoring host, where each worker has a BPF filter to limit its visibility to only that stream of flows, or onward to a commodity switch to split the traffic out to multiple 1G interfaces for the workers. This can ultimately reduce costs greatly since workers can use relatively inexpensive 1G interfaces. + +OpenFlow Switches +^^^^^^^^^^^^^^^^^ + +We are currently exploring the use of OpenFlow based switches to do flow based load balancing directly on the switch, which can greatly reduce frontend costs for many users. This document will be updated when we have more information. + +On host flow balancing +********************** + +PF_RING +^^^^^^^ + +The PF_RING software for Linux has a “clustering” feature which will do flow based load balancing across a number of processes that are sniffing the same interface.
This will allow you to easily take advantage of multiple cores in a single physical host because Bro’s main event loop is single-threaded and can’t natively utilize all of the cores. More information about Bro with PF_RING can be found here: (someone want to write a quick Bro/PF_RING tutorial to link to here? document installing kernel module, libpcap wrapper, building Bro with the --with-pcap configure option) + +Netmap +^^^^^^ + +FreeBSD has an in-progress project named Netmap which will enable flow based load balancing as well. When it becomes viable for real world use, this document will be updated. + +Click! Software Router +^^^^^^^^^^^^^^^^^^^^^^ + +Click! can be used for flow based load balancing with a simple configuration. (link to an example for the config). This solution is not recommended on Linux due to Bro’s PF_RING support, and is recommended only as a last resort on other operating systems, since it causes a lot of overhead from context switching back and forth between kernel and userland several times per packet. diff --git a/doc/components/binpac/README.rst b/doc/components/binpac/README.rst new file mode 100644 index 0000000000..683b5455f4 --- /dev/null +++ b/doc/components/binpac/README.rst @@ -0,0 +1,68 @@ +.. -*- mode: rst-mode -*- +.. +.. Version number is filled in automatically. +.. |version| replace:: 0.34-3 + +====== +BinPAC +====== + +.. rst-class:: opening + + BinPAC is a high-level language for describing protocol parsers that + generates C++ code. It is currently maintained and distributed with the + Bro Network Security Monitor distribution; however, the generated parsers + may be used with other programs besides Bro. + +Download +-------- + +You can find the latest BinPAC release for download at +http://www.bro.org/download. + +BinPAC's git repository is located at `git://git.bro.org/binpac.git +`__. You can browse the repository +`here `__. + +This document describes BinPAC |version|. See the ``CHANGES`` +file for version history.
+ +Prerequisites +------------- + +BinPAC relies on the following libraries and tools, which need to be +installed before you begin: + + * Flex (Fast Lexical Analyzer) + Flex is already installed on most systems, so with luck you can + skip having to install it yourself. + + * Bison (GNU Parser Generator) + Bison is also already installed on many systems. + + * CMake 2.6.3 or greater + CMake is a cross-platform, open-source build system, typically + not installed by default. See http://www.cmake.org for more + information regarding CMake and the installation steps below for + how to use it to build this distribution. CMake generates native + Makefiles that depend on GNU Make by default. + +Installation +------------ + +To build and install into ``/usr/local``:: + + ./configure + cd build + make + make install + +This will perform an out-of-source build into the build directory using +the default build options and then install the binpac binary into +``/usr/local/bin``. + +You can specify a different installation directory with:: + + ./configure --prefix=<prefix> + +Run ``./configure --help`` for more options. diff --git a/doc/components/bro-aux/README.rst b/doc/components/bro-aux/README.rst new file mode 100644 index 0000000000..822afea358 --- /dev/null +++ b/doc/components/bro-aux/README.rst @@ -0,0 +1,70 @@ +.. -*- mode: rst; -*- +.. +.. Version number is filled in automatically. +.. |version| replace:: 0.26-5 + +====================== +Bro Auxiliary Programs +====================== + +.. contents:: + +:Version: |version| + +Handy auxiliary programs related to the use of the Bro Network Security +Monitor (http://www.bro.org). + +Note that some files that were formerly distributed with Bro as part +of the aux/ tree are now maintained separately. See +http://www.bro.org/download for their download locations. + +adtrace +======= + +Makefile and source for the adtrace utility.
This program is used +in conjunction with the localnetMAC.pl perl script to compute the +network addresses that compose the internal and external nets that Bro +is monitoring. When run by itself, this program just reads a pcap +(tcpdump) file and writes out the src MAC, dst MAC, src IP, and dst +IP for each packet seen in the file. This output is processed by +the localnetMAC.pl script during 'make install'. + + +devel-tools +=========== + +A set of scripts commonly used for Bro development. + +extract-conn-by-uid + Extracts a connection from a trace file based + on its UID found in Bro's conn.log. + +gen-mozilla-ca-list.rb + Generates a list of Mozilla SSL root certificates in + a format readable by Bro. + +update-changes + A script to maintain the CHANGES and VERSION files. + +git-show-fastpath + Shows commits to the fastpath branch not yet merged into master. + +cpu-bench-with-trace + Runs a number of Bro benchmarks on a trace file. + + +nftools +======= + +Utilities for dealing with Bro's custom file format for storing +NetFlow records. nfcollector reads NetFlow data from a socket and +writes it in Bro's format. ftwire2bro reads NetFlow "wire" format +(e.g., as generated by a 'flow-export' directive) and writes it in +Bro's format. + +rst +=== + +Makefile and source for the rst utility. "rst" can be invoked by +a Bro script to terminate an established TCP connection by forging +RST tear-down packets. See terminate_connection() in conn.bro. diff --git a/doc/components/broccoli-python/README.rst b/doc/components/broccoli-python/README.rst new file mode 100644 index 0000000000..b2203a906a --- /dev/null +++ b/doc/components/broccoli-python/README.rst @@ -0,0 +1,231 @@ +.. -*- mode: rst-mode -*- +.. +.. Version number is filled in automatically. +.. |version| replace:: 0.54 + +============================ +Python Bindings for Broccoli +============================ + +..
rst-class:: opening + + This Python module provides bindings for Broccoli, Bro's client + communication library. In general, the bindings provide the same + functionality as Broccoli's C API. + +.. contents:: + + +Download +-------- + +You can find the latest Broccoli-Python release for download at +http://www.bro.org/download. + +Broccoli-Python's git repository is located at `git://git.bro.org/broccoli-python.git +`__. You can browse the repository +`here `__. + +This document describes Broccoli-Python |version|. See the ``CHANGES`` +file for version history. + + +Installation +------------ + +Installation of the Python module is pretty straightforward. After +Broccoli itself has been installed, it follows the standard installation +process for Python modules:: + + python setup.py install + +Try the following to test the installation. If you do not see any +error message, everything should be fine:: + + python -c "import broccoli" + +Usage +----- + +The following examples demonstrate how to send and receive Bro +events in Python. + +The main challenge when using Broccoli from Python is dealing with +the data types of Bro event parameters, as there is no one-to-one +mapping between Bro's types and Python's types. The Python module +automatically maps between the types that both systems provide +(such as strings) and provides a set of wrapper classes for Bro +types which do not have a direct Python equivalent (such as IP +addresses). + +Connecting to Bro +~~~~~~~~~~~~~~~~~ + +The following code sets up a connection from Python to a remote Bro +instance (or another Broccoli) and provides a connection handle for +further communication:: + + from broccoli import * + bc = Connection("127.0.0.1:47758") + +An ``IOError`` will be raised if the connection cannot be established.
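Since a failed connect raises ``IOError``, callers may want to retry before giving up. The helper below is a hypothetical convenience wrapper, not part of the bindings; it accepts any zero-argument factory such as ``lambda: Connection("127.0.0.1:47758")``, so the sketch itself does not depend on Broccoli being installed:

```python
import time

def connect_with_retry(factory, attempts=3, delay=1.0):
    """Call `factory` until it returns a connection, retrying on IOError.

    `factory` is any zero-argument callable, e.g.
    lambda: Connection("127.0.0.1:47758").
    """
    for i in range(attempts):
        try:
            return factory()
        except IOError:
            if i == attempts - 1:
                raise              # out of attempts: propagate the error
            time.sleep(delay)      # back off before the next try
```

With the real bindings this would be called as ``bc = connect_with_retry(lambda: Connection("127.0.0.1:47758"))``.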
+ +Sending Events +~~~~~~~~~~~~~~ + +Once you have a connection handle ``bc`` set up as shown above, you can +start sending events:: + + bc.send("foo", 5, "attack!") + +This sends an event called ``foo`` with two parameters, ``5`` and +``attack!``. Broccoli operates asynchronously, i.e., events scheduled +with ``send()`` are not always sent out immediately but might be +queued for later transmission. To ensure that all events get out +(and incoming events are processed, see below), you need to call +``bc.processInput()`` regularly. + +Data Types +~~~~~~~~~~ + +In the example above, the types of the event parameters are +automatically derived from the corresponding Python types: the first +parameter (``5``) has the Bro type ``int`` and the second one +(``attack!``) has Bro type ``string``. + +For types which do not have a Python equivalent, the ``broccoli`` +module provides wrapper classes which have the same names as the +corresponding Bro types. For example, to send an event called ``bar`` +with one ``addr`` argument and one ``count`` argument, you can write:: + + bc.send("bar", addr("192.168.1.1"), count(42)) + +The following table summarizes the available atomic types and their +usage. + +======== =========== =========================== +Bro Type Python Type Example +======== =========== =========================== +addr ``addr("192.168.1.1")`` +bool bool ``True`` +count ``count(42)`` +double float ``3.14`` +enum Type currently not supported +int int ``5`` +interval ``interval(60)`` +net Type currently not supported +port ``port("80/tcp")`` +string string ``"attack!"`` +subnet ``subnet("192.168.1.0/24")`` +time ``time(1111111111.0)`` +======== =========== =========================== + +The ``broccoli`` module also supports sending Bro records as event +parameters. To send a record, you first define a record type. 
For +example, a Bro record type:: + + type my_record: record { + a: int; + b: addr; + c: subnet; + }; + +turns into Python as:: + + my_record = record_type("a", "b", "c") + +As the example shows, Python only needs to know the attribute names +but not their types. The types are derived automatically in the same +way as discussed above for atomic event parameters. + +Now you can instantiate a record instance of the newly defined type +and send it out:: + + rec = record(my_record) + rec.a = 5 + rec.b = addr("192.168.1.1") + rec.c = subnet("192.168.1.0/24") + bc.send("my_event", rec) + +.. note:: The Python module does not support nested records at this time. + +Receiving Events +~~~~~~~~~~~~~~~~ + +To receive events, you define a callback function having the same +name as the event and mark it with the ``event`` decorator:: + + @event + def foo(arg1, arg2): + print arg1, arg2 + +Once you start calling ``bc.processInput()`` regularly (see above), +each received ``foo`` event will trigger the callback function. + +By default, the event's arguments are always passed in with built-in +Python types. For Bro types which do not have a direct Python +equivalent (see table above), a substitute built-in type is used +which corresponds to the type the wrapper class's constructor expects +(see the examples in the table). For example, Bro type ``addr`` is +passed in as a string and Bro type ``time`` is passed in as a float. + +Alternatively, you can define a *typed* prototype for the event. If you +do so, arguments will first be type-checked and then passed to the +callback with the specified type (which means instances of the +wrapper classes for non-Python types). Example:: + + @event(count, addr) + def bar(arg1, arg2): + print arg1, arg2 + +Here, ``arg1`` will be an instance of the ``count`` wrapper class and +``arg2`` will be an instance of the ``addr`` wrapper class.
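The type-checking flow just described can be illustrated with a toy, pure-Python stand-in for the typed decorator. This is only a sketch of the semantics (convert each argument to the declared type, raise ``TypeError`` on failure); it is not Broccoli's actual implementation, and ``typed_event`` is a made-up name to avoid clashing with the real ``event``:

```python
def typed_event(*types):
    """Toy model of a typed @event decorator."""
    def wrap(handler):
        def dispatch(*args):
            if len(args) != len(types):
                raise TypeError("expected %d arguments" % len(types))
            converted = []
            for value, typ in zip(args, types):
                try:
                    converted.append(typ(value))   # e.g. int("80") -> 80
                except (TypeError, ValueError):
                    raise TypeError("cannot convert %r to %s"
                                    % (value, typ.__name__))
            return handler(*converted)
        return dispatch
    return wrap

@typed_event(int, str)
def bar(arg1, arg2):
    return arg1, arg2

print(bar("5", 42))  # (5, '42'): both arguments arrive with the declared types
```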
+ +Prototyping works similarly with built-in Python types:: + + @event(int, string) + def foo(arg1, arg2): + print arg1, arg2 + +In general, the prototype specifies the types in which the callback +wants to receive the arguments. This actually provides support for +simple type casts, as some types support conversion into something +different. If, for instance, the event source sends an event with a +single port argument, ``@event(port)`` will pass the port as an +instance of the ``port`` wrapper class; ``@event(string)`` will pass it +as a string (e.g., ``"80/tcp"``); and ``@event(int)`` will pass it as an +integer without protocol information (e.g., just ``80``). If an +argument cannot be converted into the specified type, a ``TypeError`` +will be raised. + +To receive an event with a record parameter, the record type first +needs to be defined, as described above. Then the type can be used +with the ``@event`` decorator in the same way as atomic types:: + + my_record = record_type("a", "b", "c") + @event(my_record) + def my_event(rec): + print rec.a, rec.b, rec.c + +Helper Functions +---------------- + +The ``broccoli`` module provides one helper function: ``current_time()`` +returns the current time as a float which, if necessary, can be +wrapped into a ``time`` parameter (i.e., ``time(current_time())``). + +Examples +-------- + +There are some example scripts in the ``tests/`` subdirectory of the +``broccoli-python`` repository +`here `_: + + - ``broping.py`` is a (simplified) Python version of Broccoli's test program + ``broping``. Start Bro with ``broping.bro``. + + - ``broping-record.py`` is a Python version of Broccoli's ``broping`` + for records. Start Bro with ``broping-record.bro``. + + - ``test.py`` is a very ugly but comprehensive regression test and part of + the communication test-suite. Start Bro with ``test.bro``.
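The conversion semantics described under Receiving Events (the same ``port`` argument arriving as a wrapper instance, a string, or a bare integer, depending on the prototype) can be made concrete with a toy model of a port-like wrapper. ``PortValue`` is a hypothetical illustration of the behavior, not the actual wrapper class:

```python
class PortValue:
    """Toy port wrapper: "80/tcp" as a string, 80 as an int."""

    def __init__(self, spec):          # e.g. PortValue("80/tcp")
        num, proto = spec.split("/")
        self.number, self.proto = int(num), proto

    def __str__(self):                 # roughly what @event(string) would see
        return "%d/%s" % (self.number, self.proto)

    def __int__(self):                 # roughly what @event(int) would see
        return self.number

p = PortValue("80/tcp")
print(str(p), int(p))  # 80/tcp 80
```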
diff --git a/doc/components/broccoli-ruby/README.rst b/doc/components/broccoli-ruby/README.rst new file mode 100644 index 0000000000..647a568cd8 --- /dev/null +++ b/doc/components/broccoli-ruby/README.rst @@ -0,0 +1,67 @@ +.. -*- mode: rst-mode -*- +.. +.. Version number is filled in automatically. +.. |version| replace:: 1.54 + +=============================================== +Ruby Bindings for Broccoli +=============================================== + +.. rst-class:: opening + + This is the broccoli-ruby extension for Ruby, which provides access + to the Broccoli API. Broccoli is a library for + communicating with the Bro Intrusion Detection System. + + +Download +======== + +You can find the latest Broccoli-Ruby release for download at +http://www.bro.org/download. + +Broccoli-Ruby's git repository is located at `git://git.bro.org/broccoli-ruby.git +`__. You can browse the repository +`here `__. + +This document describes Broccoli-Ruby |version|. See the ``CHANGES`` +file for version history. + + +Installation +============ + +To install the extension: + +1. Make sure that the ``broccoli-config`` binary is in your path. + (``export PATH=/usr/local/bro/bin:$PATH``) + +2. Run ``sudo ruby setup.rb``. + +To install the extension as a gem (suggested): + +1. Install `rubygems `_. + +2. Make sure that the ``broccoli-config`` binary is in your path. + (``export PATH=/usr/local/bro/bin:$PATH``) + +3. Run ``sudo gem install rbroccoli``. + +Usage +===== + +There aren't really any useful docs yet. Your best bet currently is +to read through the examples. + +One thing I should mention, however, is that I haven't done any optimization +yet. You may find that code which sends or receives extremely large numbers of +events won't run fast enough and will begin to fall behind the Bro server. The +dns_requests.rb example is a good performance test if your Bro server is +sitting on a network with many DNS lookups.
+ +Contact +======= + +If you have a question/comment/patch, see the Bro `contact page +`_. diff --git a/doc/components/broccoli/README.rst b/doc/components/broccoli/README.rst new file mode 100644 index 0000000000..4860324292 --- /dev/null +++ b/doc/components/broccoli/README.rst @@ -0,0 +1,141 @@ +.. -*- mode: rst-mode -*- +.. +.. Version number is filled in automatically. +.. |version| replace:: 1.92-9 + +=============================================== +Broccoli: The Bro Client Communications Library +=============================================== + +.. rst-class:: opening + + Broccoli is the "Bro client communications library". It allows you + to create client sensors for the Bro intrusion detection system. + Broccoli can speak a good subset of the Bro communication protocol, + in particular, it can receive Bro IDs, send and receive Bro events, + and send and receive event requests to/from peering Bros. You can + currently create and receive values of pure types like integers, + counters, timestamps, IP addresses, port numbers, booleans, and + strings. + + +Download +-------- + +You can find the latest Broccoli release for download at +http://www.bro.org/download. + +Broccoli's git repository is located at +`git://git.bro.org/broccoli `_. You +can browse the repository `here `_. + +This document describes Broccoli |version|. See the ``CHANGES`` +file for version history. + + +Installation +------------ + +The Broccoli library has been tested on Linux, the BSDs, and Solaris. +A Windows build has not currently been tried but is part of our future +plans. If you succeed in building Broccoli on other platforms, let us +know! + + +Prerequisites +------------- + +Broccoli relies on the following libraries and tools, which need to be +installed before you begin: + + Flex (Fast Lexical Analyzer) + Flex is already installed on most systems, so with luck you + can skip having to install it yourself. 
+ + Bison (GNU Parser Generator) + This comes with many systems, but if you get errors compiling + parse.y, you will need to install it. + + OpenSSL headers and libraries + For encrypted communication. These are likely installed, + though some platforms may require installation of a 'devel' + package for the headers. + + CMake 2.6.3 or greater + CMake is a cross-platform, open-source build system, typically + not installed by default. See http://www.cmake.org for more + information regarding CMake and the installation steps below + for how to use it to build this distribution. CMake generates + native Makefiles that depend on GNU Make by default. + +Broccoli can also make use of some optional libraries if they are found at +installation time: + +Libpcap headers and libraries + Network traffic capture library + + +Installation +------------ + +To build and install into ``/usr/local``:: + + ./configure + make + make install + +This will perform an out-of-source build into the build directory using the +default build options and then install libraries into ``/usr/local/lib``. + +You can specify a different installation directory with:: + + ./configure --prefix=<prefix> + +Or control the Python bindings install destination more precisely with:: + + ./configure --python-install-dir=<dir> + +Run ``./configure --help`` for more options. + + +Further notable configure options: + + ``--enable-debug`` + This one enables lots of debugging output. Be sure to disable + this when using the library in a production environment! The + output could easily end up in undesired places when the stdout + of the program you've instrumented is used in other ways. + + ``--with-configfile=FILE`` + Broccoli can read key/value pairs from a config file. By default + it is located in the etc directory of the installation root + (exception: when using ``--prefix=/usr``, ``/etc`` is used + instead of ``/usr/etc``). The default config file name is + ``broccoli.conf``.
Using ``--with-configfile``, you can override the + location and name of the config file. + +To use the library in other programs & configure scripts, use the +``broccoli-config`` script. It gives you the necessary configuration flags +and linker flags for your system, see ``--cflags`` and ``--libs``. + +The API is contained in broccoli.h and pretty well documented. A few +usage examples can be found in the test directory, in particular, the +``broping`` tool can be used to test event transmission and reception. Have +a look at the policy file ``broping.bro`` for the events that need to be +defined at the peering Bro. Try ``broping -h`` for a look at the available +options. + +Broccoli knows two kinds of version numbers: the release version number +(as in "broccoli-x.y.tar.gz", or as shipped with Bro) and the shared +library API version number (as in libbroccoli.so.3.0.0). The former +relates to changes in the tree, the latter to compatibility changes in +the API. + +Comments, feedback and patches are appreciated; please check the `Bro +website `_. + +Documentation +------------- + +Please see the `Broccoli User Manual <./broccoli-manual.html>`_ and +the `Broccoli API Reference <../../broccoli-api/index.html>`_. diff --git a/doc/components/broccoli/broccoli-manual.rst b/doc/components/broccoli/broccoli-manual.rst new file mode 100644 index 0000000000..4d3c8ec79f --- /dev/null +++ b/doc/components/broccoli/broccoli-manual.rst @@ -0,0 +1,1355 @@ +=============================================== +Broccoli: The Bro Client Communications Library +=============================================== + +This page documents Broccoli, the Bro client communications library. +It allows you to create client sensors for the Bro intrusion detection +system. Broccoli can speak a good subset of the Bro communication +protocol, in particular, it can receive Bro IDs, send and receive Bro +events, and send and receive event requests to/from peering Bros. + +.. 
contents:: + +Introduction +############ + +What is Broccoli? +================= + +Broccoli is the BRO Client COmmunications LIbrary. It allows you to +write applications that speak the communication protocol of the `Bro +intrusion detection system `_. + +Broccoli is free software under terms of the BSD license as given in the +COPYING file distributed with its source code. + +In this document, we assume that you are familiar with the basic +concepts of Bro, so please first review the documentation/publications +available from the Bro website if necessary. + +Feedback, patches and bug reports are all welcome, please see +http://www.bro.org/community for instructions on how to participate +in the Bro community. + +Why do I care? +============== + +Having a single IDS on your network is good, but things become a lot +more interesting when you can communicate information among multiple +vantage points in your network. Bro agents can communicate with other +Bro agents, sending and receiving events and other state information. In +the Bro context this is particularly interesting because it means that +you can build sophisticated policy-controlled distributed event +management systems. + +Broccoli enters the picture when it comes to integrating components that +are not Bro agents themselves. Broccoli lets you create applications +that can speak the Bro communication protocol. You can compose, send, +request, and receive events. You can register your own event handlers. +You can talk to other Broccoli applications or Bro agents -- Bro agents +cannot tell whether they are talking to another Bro or a Broccoli +application. Broccoli allows you to integrate applications of your +choosing into a distributed policy-controlled event management system. +Broccoli is intended to be portable: it should build on Linux, the BSDs, +Solaris, and Windows (in the `MinGW `_ +environment). 
+ +Unlike other distributed IDSs, Bro does not assume a strict +sensor-manager hierarchy in the information flow. Instead, Bro agents +can request delivery of arbitrary *events* from other instances. When +an event is triggered in a Bro agent, it checks whether any connected +agents have requested notification of this event, and sends a *copy* of +the event, including the *event arguments*. Recall that in Bro, an +event handler is essentially a function defined in the Bro language, +and an event materializes through invocation of an event handler. Each +remote agent can define its own event handlers. + +Broccoli applications will typically do one or more of the following: + +- *Configuration/Management Tasks:* the Broccoli application + is used to configure remotely running Bros without the need for a + restart. + +- *Interfacing with other Systems:* the Broccoli application + is used to convert Bro events to other alert/notice formats, or into + syslogd entries. + +- *Host-based Sensor Feeds into Bro:* the Broccoli + application reports events based on host-based activity generated in + kernel space or user space applications. + +Installing Broccoli +################### + +The installation process will hopefully be painless: Broccoli is +installed from source using the usual ``./configure && make && +make install`` routine after extraction of the tarball. + +Some relevant configuration options to pass to configure are: + +- ``--prefix=DIR``: sets the installation root to DIR. + The default is to install below ``/usr/local``. + +- ``--enable-debug``: enables debugging output. + Please refer to the `Configuring Debugging Output`_ section for + details on configuring and using debugging output. + +- ``--with-configfile=FILE``: use FILE as the location of the configuration + file. See the section on `Configuration Files`_ for more on this. + +- ``--with-openssl=DIR``: use the OpenSSL installation below DIR.
+ +After installation, you'll find the library in shared and static +versions in ``<prefix>/lib``, the header file for compilation in +``<prefix>/include``. + +Using Broccoli +############## + +Obtaining information about your build using ``broccoli-config`` +================================================================ + +Similarly to many other software packages, the Broccoli distribution +provides a script that you can use to obtain details about your Broccoli +setup. The script currently provides the following flags: + +- ``--build`` prints the name of the machine the build was + made on, when, and whether debugging support was enabled or not. + +- ``--prefix`` prints the directory in the filesystem + below which Broccoli was installed. + +- ``--version`` prints the version of the distribution + you have installed. + +- ``--libs`` prints the flags to pass to the + linker in order to link in the Broccoli library. + +- ``--cflags`` prints the flags to pass to the + compiler in order to properly include Broccoli's header file. + +- ``--config`` prints the location of the system-wide + config file your installation will use. + +The ``--cflags`` and ``--libs`` flags are the suggested way of obtaining +the necessary information for integrating Broccoli into your build +environment. It is generally recommended to use ``broccoli-config`` for +this purpose, rather than, say, develop new **autoconf** tests. If you +use the **autoconf/automake** tools, we recommend something along the +following lines for your ``configure`` script:: + + dnl ################################################## + dnl # Check for Broccoli + dnl ################################################## + AC_ARG_WITH(broccoli-config, + AC_HELP_STRING([--with-broccoli-config=FILE], [Use given broccoli-config]), + [ brocfg="$withval" ], + [ AC_PATH_GENERIC(broccoli,, + brocfg="broccoli-config", + AC_MSG_ERROR(Cannot find Broccoli: Is broccoli-config in path? 
Use more fertilizer?)) ]) + + broccoli_libs=`$brocfg --libs` + broccoli_cflags=`$brocfg --cflags` + AC_SUBST(broccoli_libs) + AC_SUBST(broccoli_cflags) + +You can then use the compiler/linker flags in your Makefile.in/Makefile.am files by +substituting in the values accordingly, which might look as follows:: + + CFLAGS = -W -Wall -g -DFOOBAR @broccoli_cflags@ + LDFLAGS = -L/usr/lib/foobar @broccoli_libs@ + +Suggestions for instrumenting applications +========================================== + +Often you will want to make existing applications Bro-aware, that is, +*instrument* them so that they can send and receive Bro events at +appropriate moments in the execution flow. This will involve modifying +an existing code tree, so care needs to be taken to avoid unwanted side +effects. By protecting the instrumented code with ``#ifdef``/``#endif`` +statements you can still build the original application, using the +instrumented source tree. The ``broccoli-config`` script helps you in +doing so because it already adds ``-DBROCCOLI`` to the compiler flags +reported when run with the ``--cflags`` option: + +.. console:: + + > broccoli-config --cflags + -I/usr/local/include -I/usr/local/include -DBROCCOLI + +So simply surround all inserted code with a preprocessor check for +``BROCCOLI`` and you will be able to build the original application as +soon as ``BROCCOLI`` is not defined. + +The Broccoli API +================ + +Time for some code. In the code snippets below we will introduce variables +whenever context requires them and not necessarily when C requires them. +The library requires calling a global initialization function before most +of its other functions can be used; this is described in the next section. +In order to make the API known, include ``broccoli.h``: + +.. code:: c + + #ifdef BROCCOLI + #include <broccoli.h> + #endif + +.. note:: + *Broccoli's memory management philosophy:* + + Broccoli generally does not release objects you allocate. + The approach taken is "you clean up what you allocate." 
+ +Initialization +-------------- + +Broccoli requires global initialization before most of its other +functions can be used. Generally, the way to initialize Broccoli is as +follows: + +.. code:: c + + bro_init(NULL); + +The argument to ``bro_init()`` provides optional initialization context, +and may be kept ``NULL`` for normal use. If required, you may allocate a +``BroCtx`` structure locally, initialize it using ``bro_ctx_init()``, +fill in additional values as required and subsequently pass it to +``bro_init()``: + +.. code:: c + + BroCtx ctx; + bro_ctx_init(&ctx); + /* Make adjustments to the context structure as required...*/ + bro_init(&ctx); + +.. note:: The ``BroCtx`` structure currently contains a set of five + different callback function pointers. These are *required* for + thread-safe operation of OpenSSL (Broccoli itself is thread-safe). + If you intend to use Broccoli in a multithreaded environment, you + need to implement functions and register them via the ``BroCtx`` + structure. The O'Reilly book "Network Security with OpenSSL" by + Viega et al. shows how to implement these callbacks. + +.. warning:: You *must* call ``bro_init()`` at the start of your + application. Undefined behavior may result if you don't. + +Data types in Broccoli +---------------------- + +Broccoli declares a number of data types in ``broccoli.h`` that you +should know about. The more complex ones are kept opaque, while you do +get access to the fields in the simpler ones. The full list is as +follows: + +- Simple signed and unsigned types: int, uint, uint16, uint32, uint64 + and uchar. + +- Connection handles: BroConn, kept opaque. + +- Bro events: BroEvent, kept opaque. + +- Buffer objects: BroBuf, kept opaque. See also `Using Dynamic + Buffers`_. + +- Ports: BroPort for network ports, defined as follows: + + .. 
code:: c + + typedef struct bro_port { + uint16 port_num; /* port number in host byte order */ + int port_proto; /* IPPROTO_xxx */ + } BroPort; + +- Records: BroRecord, kept opaque. See also `Handling Records`_. + +- Strings (character and binary): BroString, defined as follows: + + .. code:: c + + typedef struct bro_string { + int str_len; + uchar *str_val; + } BroString; + +- BroStrings are mostly kept transparent for convenience; please have a + look at the `Broccoli API Reference`_. + +- Tables: BroTable, kept opaque. See also `Handling Tables`_. + +- Sets: BroSet, kept opaque. See also `Handling Sets`_. + +- IP Address: BroAddr, defined as follows: + + .. code:: c + + typedef struct bro_addr { + uint32 addr[4]; /* IP address in network byte order */ + int size; /* Number of 4-byte words occupied in addr */ + } BroAddr; + + Both IPv4 and IPv6 addresses are supported, with the former occupying + only the first 4 bytes of the ``addr`` array. + +- Subnets: BroSubnet, defined as follows: + + .. code:: c + + typedef struct bro_subnet { + BroAddr sn_net; /* IP address in network byte order */ + uint32 sn_width; /* Length of prefix to consider. */ + } BroSubnet; + +Managing Connections +-------------------- + +You can use Broccoli to establish a connection to a remote Bro, or to +create a Broccoli-enabled server application that other Bros will +connect to (this means that in principle, you can also use Broccoli +purely as middleware and have multiple Broccoli applications communicate +directly). + +In order to establish a connection to a remote Bro, you first obtain a +connection handle. You then use this connection handle to request +events, connect to the remote Bro, send events, etc. Connection handles +are pointers to ``BroConn`` structures, which are kept opaque. 
Use +``bro_conn_new()`` or ``bro_conn_new_str()`` to obtain a handle, +depending on what parameters are more convenient for you: the former +accepts the IP address and port number as separate numerical arguments, +the latter uses a single string to encode both, in "hostname:port" +format. + +To write a Broccoli-enabled server, you first need to implement the +usual ``socket()`` / ``bind()`` / ``listen()`` / ``accept()`` routine. +Once you have obtained a file descriptor for the new connection from +``accept()``, you pass it to the third function that returns a +``BroConn`` handle, ``bro_conn_new_socket()``. The rest of the +connection handling then proceeds as in the client scenario. + +All three calls accept additional flags for fine-tuning connection +behaviour. These flags are: + +- ``BRO_CFLAG_NONE``: no functionality. Use when no flags are desired. + +- ``BRO_CFLAG_RECONNECT``: + When using this option, Broccoli will attempt to reconnect to the peer + transparently after losing connectivity. Essentially, whenever you try to + read from or write to the peer and its connection has broken down, a full + reconnect including complete handshaking is attempted. You can check + whether the connection to a peer is alive at any time using + ``bro_conn_alive()``. + +- ``BRO_CFLAG_ALWAYS_QUEUE``: + When using this option, Broccoli will queue any events you send for + later transmission when a connection is currently down. Without using + this flag, any events you attempt to send while a connection is down + get dropped on the floor. Note that Broccoli maintains a maximum queue + size per connection, so if you attempt to send lots of events while the + connection is down, the oldest events may start to get dropped + nonetheless. Again, you can check whether the connection is currently + okay by using ``bro_conn_alive()``. + +- ``BRO_CFLAG_DONTCACHE``: + When using this option, Broccoli will ask the peer not to use caching + on the objects it sends to us. 
This is the default, and the flag need + not normally be used. It is kept to maintain backward compatibility. + +- ``BRO_CFLAG_CACHE``: + When using this option, Broccoli will ask the peer to use caching on + the objects it sends to us. Caching is normally disabled. + +- ``BRO_CFLAG_YIELD``: + When using this option, ``bro_conn_process_input()`` processes at most + one event at a time and then returns. + +Obtaining a connection handle does not establish a connection +right away. This is done using ``bro_conn_connect()``. The main reason +for this is to allow you to subscribe to events (using +``bro_event_registry_add()``, see `Receiving Events`_) before +establishing the connection. Upon returning from ``bro_conn_connect()`` +you are guaranteed to receive all instances of the event types you have +requested, while later on during the connection some time may elapse +between the issuing of a request for events and the processing of that +request at the remote end. Connections are established via TCP, +optionally using SSL encryption. See "`Configuring Encrypted +Communication`_" for more information on setting up encryption. The +port numbers Bro agents and Broccoli applications listen on can vary +from peer to peer. + +Finally, ``bro_conn_delete()`` terminates a connection and releases all +resources associated with it. You can create as many connections as you +like, to one or more peers. You can obtain the file descriptor of a +connection using ``bro_conn_get_fd()``: + +.. code:: c + + char *host_str = "bro.yourorganization.com"; + int port = 1234; + struct hostent *host; + BroConn *bc; + + if (! (host = gethostbyname(host_str)) || ! + (host->h_addr_list[0])) + { + /* Error handling -- could not resolve host */ + } + + /* In this example, we obtain a connection handle, then register + event handlers, and finally connect to the remote Bro. */ + /* First obtain a connection handle: */ + if (! 
(bc = bro_conn_new((struct in_addr*) host->h_addr_list[0], + htons(port), BRO_CFLAG_NONE))) + { + /* Error handling - could not get connection handle */ + } + + /* Register event handlers: */ + bro_event_registry_add(bc, "foo", bro_foo_handler, NULL); + /* ... */ + + /* Now connect to the peer: */ + if (! bro_conn_connect(bc)) + { + /* Error handling - could not connect to remote Bro. */ + } + + /* Send and receive events ... */ + + /* Disconnect from Bro and clean up connection */ + bro_conn_delete(bc); + +Or simply use the string-based version: + +.. code:: c + + char *host_str = "bro.yourcompany.com:1234"; + BroConn *bc; + + /* In this example we don't request any events from the peer, + but we ask it not to use the serialization cache. */ + /* Again, first obtain a connection handle: */ + if (! (bc = bro_conn_new_str(host_str, BRO_CFLAG_DONTCACHE))) + { + /* Error handling - could not get connection handle */ + } + + /* Now connect to the peer: */ + if (! bro_conn_connect(bc)) + { + /* Error handling - could not connect to remote Bro. */ + } + + /* ... */ + +Connection Classes +------------------ + +When you want to establish connections from multiple Broccoli +applications with different purposes, the peer needs a means to +understand what kind of application each connection belongs to. The real +meaning of "kind of application" here is "sets of event types to +request", because depending on the class of an application, the peer +will likely want to receive different types of events. + +Broccoli lets you set the class of a connection using +``bro_conn_set_class()``. When using this feature, you need to call that +function before issuing a ``bro_conn_connect()`` since the class of a +connection is determined at connection startup: + +.. code:: c + + if (! (bc = bro_conn_new_str(host_str, BRO_CFLAG_DONTCACHE))) + { + /* Error handling - could not get connection handle */ + } + + /* Set class of this connection: */ + bro_conn_set_class(bc, "syslog"); + + if (! 
bro_conn_connect(bc)) + { + /* Error handling - could not connect to remote Bro. */ + } + +If your peer is a Bro node, you need to match the chosen connection +class in the remote Bro's ``Communication::nodes`` configuration. See +`Configuring event reception in Bro scripts`_ for how to do +this. Finally, in order to obtain the class of a connection as +indicated by the remote side, use ``bro_conn_get_peer_class()``. + +Composing and sending events +---------------------------- + +In order to send an event to the remote Bro agent, you first create an +empty event structure with the name of the event, then add parameters to +pass to the event handler at the remote agent, and then send off the +event. + +.. note:: + *Bro peers ignore unrequested events.* + + You need to make sure that the remote Bro agent is interested in + receiving the events you send. This interest is expressed in policy + configuration. We'll explain this in more detail in `Configuring + event reception in Bro scripts`_, and for now assume that our + remote peer is configured to receive the events we send. + +Let's assume we want to request a report of all connections for which a +remote Bro currently keeps state that match a given destination port and +host name and that have amassed more than a certain number of bytes. The +idea is to send an event to the remote Bro that contains the query, +identifiable through a request ID, and have the remote Bro answer us +with ``remote_conn`` events containing the information we asked for. The +definition of our requesting event could look as follows in the Bro +policy: + +.. code:: bro + + event report_conns(req_id: int, dest_host: string, + dest_port: port, min_size: count); + +First, create a new event: + +.. code:: c + + BroEvent *ev; + + if (! (ev = bro_event_new("report_conns"))) + { + /* Error handling - could not allocate new event. */ + } + +Now we need to add parameters to the event. 
The sequence and types must +match the event handler declaration -- check the Bro policy to make sure +they match. The function to use for adding parameter values is +``bro_event_add_val()``. All values are passed as *pointer arguments* +and are copied internally, so the object you're pointing to stays +unmodified at all times. You clean up what you allocate. In order to +indicate the type of the value passed into the function, you need to +pass a numerical type identifier along as well. Table-1_ lists the +value types that Broccoli supports along with the type identifier and +data structures to point to. + +.. _Table-1: + +Types, type tags, and data structures for event parameters in Broccoli +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +============================== ===================== ==================== +Type Type tag Data type pointed to +============================== ===================== ==================== +Boolean ``BRO_TYPE_BOOL`` ``int`` +Integer value ``BRO_TYPE_INT`` ``uint64`` +Counter (nonnegative integers) ``BRO_TYPE_COUNT`` ``uint64`` +Enums (enumerated values) ``BRO_TYPE_ENUM`` ``uint64`` (see also description of ``bro_event_add_val()``'s ``type_name`` argument) +Floating-point number ``BRO_TYPE_DOUBLE`` ``double`` +Timestamp ``BRO_TYPE_TIME`` ``double`` (see also ``bro_util_timeval_to_double()`` and ``bro_util_current_time()``) +Time interval ``BRO_TYPE_INTERVAL`` ``double`` +Strings (text and binary) ``BRO_TYPE_STRING`` ``BroString`` (see also family of ``bro_string_xxx()`` functions) +Network ports ``BRO_TYPE_PORT`` ``BroPort``, with the port number in host byte order +IPv4/IPv6 address ``BRO_TYPE_IPADDR`` ``BroAddr``, with the ``addr`` member in network byte order and ``size`` member indicating the address family and number of 4-byte words of ``addr`` that are occupied (1 for IPv4 and 4 for IPv6) +IPv4/IPv6 subnet ``BRO_TYPE_SUBNET`` ``BroSubnet``, with the ``sn_net`` member in network byte order +Record 
``BRO_TYPE_RECORD`` ``BroRecord`` (see also the family of ``bro_record_xxx()`` functions and their explanation below) +Table ``BRO_TYPE_TABLE`` ``BroTable`` (see also the family of ``bro_table_xxx()`` functions and their explanation below) +Set ``BRO_TYPE_SET`` ``BroSet`` (see also the family of ``bro_set_xxx()`` functions and their explanation below) +============================== ===================== ==================== + +Knowing these, we can now compose a ``report_conns`` event: + +.. code:: c + + BroString dest_host; + BroPort dest_port; + uint64 min_size; + int req_id = 0; + + bro_event_add_val(ev, BRO_TYPE_INT, NULL, &req_id); + req_id++; + + bro_string_set(&dest_host, "desthost.destdomain.com"); + bro_event_add_val(ev, BRO_TYPE_STRING, NULL, &dest_host); + bro_string_cleanup(&dest_host); + + dest_port.port_num = 80; + dest_port.port_proto = IPPROTO_TCP; + bro_event_add_val(ev, BRO_TYPE_PORT, NULL, &dest_port); + + min_size = 1000; + bro_event_add_val(ev, BRO_TYPE_COUNT, NULL, &min_size); + +The third argument to ``bro_event_add_val()`` lets you specify a +specialization of the types listed in Table-1_. This is generally not +necessary except for one situation: when using ``BRO_TYPE_ENUM``. You +currently cannot define a Bro-level enum type in Broccoli, and thus when +sending an enum value, you have to specify the type of the enum along +with the value. For example, in order to add an instance of enum +``transport_proto`` defined in Bro's ``bro.init``, you would use: + +.. code:: c + + int transport_proto = 2; + /* ... */ + bro_event_add_val(ev, BRO_TYPE_ENUM, "transport_proto", &transport_proto); + +to get the equivalent of "udp" on the remote side. The same system is +used to point out type names when calling ``bro_event_set_val()``, +``bro_record_add_val()``, ``bro_record_set_nth_val()``, and +``bro_record_set_named_val()``. + +All that's left to do now is to send off the event. 
For this, use +``bro_event_send()`` and pass it the connection handle and the event. +The function returns ``TRUE`` when the event could be sent right away or +if it was queued for later delivery. ``FALSE`` is returned on error. If +the event gets queued, this does not indicate an error -- likely the +connection was just not ready to send the event at this point. Whenever +you call ``bro_event_send()``, Broccoli attempts to send as much of an +existing event queue as possible. Again, the event is copied internally +to make it easier for you to send the same event repeatedly. You clean +up what you allocate: + +.. code:: c + + bro_event_send(bc, ev); + bro_event_free(ev); + +Two other functions may be useful to you: ``bro_event_queue_length()`` +tells you how many events are currently queued, and +``bro_event_queue_flush()`` attempts to flush the current event queue +and returns the number of events that remain in the queue after the +flush. + +.. note:: You do not normally need to call this function; queue + flushing is attempted every time you send an event. + +Receiving Events +---------------- + +Receiving events is a little more work because you need to + +1. tell Broccoli what to do when requested events arrive, + +#. let the remote Bro agent know that you would like to receive those + events, + +#. find a spot in the code path suitable for extracting and processing + arriving events. + +Each of these steps is explained in the following sections. + +Implementing event callbacks +~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +When Broccoli receives an event, it tries to dispatch the event to +callbacks registered for that event type. The place where callbacks get +registered is called the callback registry. Any callbacks registered for +the arriving event's name are invoked with the parameters shipped with +the event. There are two styles of argument passing to the event +callbacks. Which one is better suited depends on your application. 
+ +Expanded Argument Passing +^^^^^^^^^^^^^^^^^^^^^^^^^ + +Each event argument is passed via a pointer to the callback. This makes +best sense when you know the type of the event and of its arguments, +because it provides you immediate access to arguments as when using a +normal C function. + +In order to register a callback with expanded argument passing, use +``bro_event_registry_add()`` and pass it the connection handle, the name +of the event for which you register the callback, the callback itself +that matches the signature of the ``BroEventFunc`` type, and any user +data (or ``NULL``) you want to see passed to the callback on each +invocation. The callback's type is defined rather generically as +follows: + +.. code:: c + + typedef void (*BroEventFunc) (BroConn *bc, void *user_data, ...); + +It requires a connection handle as its first argument and a pointer to +user-provided callback data as the second argument. Broccoli will pass +the connection handle of the connection on which the event arrived +through to the callback. ``BroEventFunc``'s are variadic, because each +callback you provide is directly invoked with pointers to the parameters +of the event, in a format directly usable in C. All you need to know is +what type to point to in order to receive the parameters in the right +layout. Refer to Table-1_ again for a summary of those types. Record +types are more involved and are addressed in more detail in `Handling +Records`_. + +.. note:: Note that *all* parameters are passed to the + callback as pointers, even elementary types such as ``int`` that + would normally be passed directly. Also note that Broccoli manages + the lifecycle of event parameters and therefore you do *not* have + to clean them up inside the event handler. + +Continuing our example, we will want to process the connection reports +that contain the responses to our ``report_conns`` event. Let's assume +those look as follows: + +.. 
code:: bro + + event remote_conn(req_id: int, conn: connection); + +The reply events contain the request ID so we can associate requests +with replies, and a connection record (defined in ``bro.init`` in Bro). +(It'd be nicer to report all replies in a single event but we'll +ignore that for now.) For this event, our callback would look like +this: + +.. code:: c + + void remote_conn_cb(BroConn *bc, void *user_data, int *req_id, + BroRecord *conn); + +Once more, you clean up what you allocate, and since you never allocated +the space these arguments point to, you also don't clean them up. +Finally, we register the callback using ``bro_event_registry_add()``: + +.. code:: c + + bro_event_registry_add(bc, "remote_conn", remote_conn_cb, NULL); + +In this case we have no additional data to be passed into the callback, +so we use ``NULL`` for the last argument. If you have multiple events +you are interested in, register each one in this fashion. + +Compact Argument Passing +^^^^^^^^^^^^^^^^^^^^^^^^ + +This is designed for situations when you have to determine how to handle +different types of events at runtime, for example when writing language +bindings or when implementing generic event handlers for multiple event +types. The callback is passed a connection handle and the user data as +above but is only passed one additional pointer, to a BroEvMeta +structure. This structure contains all metadata about the event, +including its name, timestamp (in UTC) of creation, number of arguments, +the arguments' types (via type tags as listed in Table-1_), and the +arguments themselves. + +In order to register a callback with compact argument passing, use +``bro_event_registry_add_compact()`` and pass it similar arguments as +you'd use with ``bro_event_registry_add()``. The callback's type is +defined as follows: + +.. code:: c + + typedef void (*BroCompactEventFunc) (BroConn *bc, void *user_data, + BroEvMeta *meta); + + +.. 
note:: As before, Broccoli manages the lifecycle of event parameters. + You do not have to clean up the BroEvMeta structure or any of its + contents. + +Below is sample code for extracting the arguments from the BroEvMeta +structure, using our running example. This is still written with the +assumption that we know the types of the arguments, but note that this +is not a requirement for this style of callback: + +.. code:: c + + void remote_conn_cb(BroConn *bc, void *user_data, + BroEvMeta *meta) { + int *req_id; BroRecord *rec; + + /* For demonstration, print out the event's name: */ + + printf("Handling a %s event.\n", meta->ev_name); + + /* Sanity-check the number of arguments: */ + + if (meta->ev_numargs != 2) + { /* error */ } + + /* Sanity-check the argument types: */ + + if (meta->ev_args[0].arg_type != BRO_TYPE_INT) + { /* error */ } + + if (meta->ev_args[1].arg_type != BRO_TYPE_RECORD) + { /* error */ } + + req_id = (int *) meta->ev_args[0].arg_data; + rec = (BroRecord *) meta->ev_args[1].arg_data; + + /* ... */ + } + +Finally, register the callback using +``bro_event_registry_add_compact()``: + +.. code:: c + + bro_event_registry_add_compact(bc, "remote_conn", remote_conn_cb, NULL); + +Requesting event delivery +~~~~~~~~~~~~~~~~~~~~~~~~~ + +At this point, Broccoli knows what to do with the requested events upon +arrival. What's left to do is to let the remote Bro know that you would +like to receive the events for which you registered. If you haven't yet +called ``bro_conn_connect()``, then there is nothing to do, since that +function will request the registered events anyway. Once connected, you +can still request events. To do so, call +``bro_event_registry_request()``: + +.. code:: c + + bro_event_registry_request(bc); + +This mechanism also implies that no unrequested events will be delivered +to us (and if that happened for whatever reason, the event would simply +be dropped on the floor). + +.. 
note:: At the moment you cannot unrequest events, nor can you request +events based on predicates on the values of the events' arguments. + +Reading events from the connection handle +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +At this point the remote Bro will start sending you the requested events +once they are triggered. What is left to do is to read the arriving +events from the connection and trigger their dispatch to the +registered callbacks. + +If you are writing a new Bro-enabled application, this is easy, and you +can choose between two approaches: polling explicitly via Broccoli's API, +or using ``select()`` on the file handle associated with a BroConn. The +former case is particularly straightforward; all you need to do is call +``bro_conn_process_input()``, which will go off and check if any events +have arrived and if so, dispatch them accordingly. This function does +not block -- if no events have arrived, then the call will return +immediately. For more fine-grained control over your I/O handling, you +will probably want to use ``bro_conn_get_fd()`` to obtain the file +descriptor of your connection and then incorporate that in your standard +``FD_SET``/``select()`` code. Once you have determined that data in fact +are ready to be read from the obtained file descriptor, you can then try +another ``bro_conn_process_input()``, this time knowing that it'll find +something to dispatch. + +As a side note, if you don't process arriving events frequently enough, +then TCP's flow control will start to slow down the sender until +eventually events will queue up and be dropped at the sending end. + +Handling Records +---------------- + +Broccoli supports record structures, i.e., types that pack a set of +values together, placing each value into its own field. In Broccoli, the +way you handle records is somewhat similar to events: after creating an +empty record (of opaque type ``BroRecord``), you can iteratively add +fields and values to it. 
The main difference is that you must specify a +field name with the value; each value in a record can be identified both +by position (a numerical index starting from zero), and by field name. +You can retrieve values in a record by field index or field name. You can +also reassign values. There is no explicit, IDL-style definition of +record types. You define the type of a record implicitly by the sequence +of field names and the sequence of the types of the values you put into +the record. + +Note that all fields in a record must be assigned before it can be +shipped. + +The API for record composition consists of ``bro_record_new()``, +``bro_record_free()``, ``bro_record_add_val()``, +``bro_record_set_nth_val()``, and ``bro_record_set_named_val()``. + +On records that use field names, the names of individual fields can be +extracted using ``bro_record_get_nth_name()``. Extracting values from a +record is done using ``bro_record_get_nth_val()`` and +``bro_record_get_named_val()``. The former allows numerical indexing of +the fields in the record, the latter provides name-based lookups. Both +need to be passed the record you want to extract a value from, the index +or name of the field, and either a pointer to an int holding a +BRO_TYPE_xxx value (see again Table-1_ for a summary of those types) or +``NULL``. The pointer, if not ``NULL``, serves two purposes: type +checking and type retrieval. Type checking is performed if the value of +the int upon calling the functions is not BRO_TYPE_UNKNOWN. The type tag +of the requested record field then has to match the type tag stored in +the int, otherwise ``NULL`` is returned. If the int stores +BRO_TYPE_UNKNOWN upon calling, no type-checking is performed. In *both* +cases, the *actual* type of the requested record field is returned in +the int pointed to upon return from the function. 
Since you have no +guarantee about the type of the returned value if you pass ``NULL`` as +the int pointer, doing so is a bad idea; always use either BRO_TYPE_UNKNOWN or +a concrete type value. + +For example, you could extract the value of the record field "label", +which we assume should be a string, in the following ways: + +.. code:: c + + BroRecord *rec; /* obtained somehow */ + BroString *string; + int type; + + /* --- Example 1 --- */ + + type = BRO_TYPE_STRING; + /* Use type-checking, will not accept other types */ + + if (! (string = bro_record_get_named_val(rec, "label", &type))) + { + /* Error handling, either there's no field of that name or + the value is not of BRO_TYPE_STRING. The actual type is now + stored in "type". */ + } + + /* --- Example 2 --- */ + + type = BRO_TYPE_UNKNOWN; + /* No type checking, just report the existing type */ + + if (! (string = bro_record_get_named_val(rec, "label", &type))) + { + /* Error handling, no field of that name exists. */ + } + + printf("The type of the value in field 'label' is %i\n", type); + + /* --- Example 3 --- */ + + if (! (string = bro_record_get_named_val(rec, "label", NULL))) + { + /* Error handling, no field of that name exists. */ + } + + /* We now have a value, but we can't really be sure of its type */ + +Record fields can be records, for example in the case of Bro's standard +connection record type. In this case, in order to get to a nested +record, you use ``BRO_TYPE_RECORD``: + +.. code:: c + + void remote_conn_cb(BroConn *bc, void *user_data, int *req_id, + BroRecord *conn) + { + BroRecord *conn_id; + int type = BRO_TYPE_RECORD; + if ( ! (conn_id = bro_record_get_named_val(conn, "id", &type))) + { + /* Error handling */ + } + } + +Handling Tables +--------------- + +Broccoli supports Bro-style tables, i.e., associative containers that +map instances of a key type to an instance of a value type. A given key +can only ever point to a single value. 
The key type can be *composite*,
+i.e., it may consist of an ordered sequence of different types, or it
+can be *direct*, i.e., consisting of a single type (such as an integer,
+a string, or a record).
+
+The API for table manipulation consists of ``bro_table_new()``,
+``bro_table_free()``, ``bro_table_insert()``, ``bro_table_find()``,
+``bro_table_get_size()``, ``bro_table_get_types()``, and
+``bro_table_foreach()``.
+
+Tables are handled similarly to records in that typing is determined
+dynamically by the initial key/value pair inserted. The resulting types
+can be obtained via ``bro_table_get_types()``. Should the types not
+have been determined yet, ``BRO_TYPE_UNKNOWN`` will result. Also, as
+with records, values inserted into the table are copied internally, and
+the ones passed to the insertion functions remain unaffected.
+
+In contrast to records, table entries can be iterated over. By passing a
+function of signature ``BroTableCallback()`` and a pointer to data of
+your choosing, ``bro_table_foreach()`` will invoke the given function
+for each key/value pair stored in the table. Have the callback return
+``TRUE`` to keep the iteration going, or ``FALSE`` to stop it.
+
+.. note::
+   The main thing to know about Broccoli's tables is how to use
+   composite key types. To avoid additional API calls, you may treat
+   composite key types exactly as records, though you do not need to use
+   field names when assigning elements to individual fields. So in order
+   to insert a key/value pair, you create a record with the needed items
+   assigned to its slots, and use this record as the key object. In
+   order to differentiate composite index types from direct ones
+   consisting of a single record, use ``BRO_TYPE_LIST`` as the type of
+   the record, as opposed to ``BRO_TYPE_RECORD``. Broccoli will then
+   know to interpret the record as an ordered sequence of items making
+   up a composite element, not a regular record.
+ +``brotable.c`` in the ``test/`` subdirectory of the Broccoli tree +contains an extensive example of using tables with composite as well as +direct indexing types. + +Handling Sets +------------- + +Sets are essentially tables with void value types. The API for set +manipulation consists of ``bro_set_new()``, ``bro_set_free()``, +``bro_set_insert()``, ``bro_set_find()``, ``bro_set_get_size()``, +``bro_set_get_type()``, and ``bro_set_foreach()``. + +Associating data with connections +--------------------------------- + +You will often find that you would like to connect data with a +``BroConn``. Broccoli provides an API that lets you associate data items +with a connection handle through a string-based key-value registry. The +functions of interest are ``bro_conn_data_set()``, +``bro_conn_data_get()``, and ``bro_conn_data_del()``. You need to +provide a string identifier for a data item and can then use that string +to register, look up, and remove the associated data item. Note that +there is currently no mechanism to trigger a destructor function for +registered data items when the Bro connection is terminated. You +therefore need to make sure that all data items that you do not have +pointers to via some other means are properly released before calling +``bro_disconnect()``. + +Configuration Files +------------------- + +Imagine you have instrumented the mother of all server applications. +Building it takes forever, and every now and then you need to change +some of the parameters that your Broccoli code uses, such as the host +names of the Bro agents to talk to. To allow you to do this quickly, +Broccoli comes with support for configuration files. All you need to do +is change the settings in the file and restart the application (we're +considering adding support for volatile configuration items that are +read from the file every time they are requested). + +A configuration is read from a single configuration file. 
This file can
+be read from different locations. Broccoli searches in this order
+for the config file:
+
+- The location specified by the ``BROCCOLI_CONFIG_FILE`` environment
+  variable.
+
+- A per-user configuration file stored in ``~/.broccoli.conf``.
+
+- The system-wide configuration file. You can obtain the location
+  of this config file by running ``broccoli-config --config``.
+
+.. note:: ``BROCCOLI_CONFIG_FILE`` or ``~/.broccoli.conf`` will only be
+   used if it is a regular file, not executable, and neither group nor
+   others have any permissions on the file. That is, the file's
+   permissions must look like ``-rw-------`` *or* ``-r--------``.
+
+In the configuration file, a ``#`` anywhere starts a comment that runs to
+the end of the line. Configuration items are specified as key-value
+pairs::
+
+   # This is the Broccoli system-wide configuration file.
+   #
+   # Entries are of the form , where the
+   # identifier is a sequence of letters, and the value can be a string
+   # (which may include whitespace), a floating point number, or an
+   # integer. Comments start with a "#" and go to the end of the line.
+   # For boolean values, you may also use "yes", "on", "true", "no",
+   # "off", or "false". Strings may contain whitespace, but need
+   # to be surrounded by double quotes '"'.
+   #
+   # Examples:
+   #
+   Foo/PeerName      mybro.securesite.com
+   Foo/PortNum       123
+   Bar/SomeFloat     1.23443543
+   Bar/SomeLongStr   "Hello World"
+
+You can also have multiple sections in your configuration. Your
+application can select a section as the current one, and queries for
+configuration settings will then only be answered with values specified
+in that section. A section is started by putting its name (no whitespace
+please) between square brackets. Configuration items positioned before
+the first section title are in the default domain and will be used by
+default::
+
+   # This section contains all settings for myapp.
+ [ myapp ] + +You can name identifiers any way you like, but to keep things organized +it is recommended to keep a namespace hierarchy similar to the file +system. In the code, you can query configuration items using +``bro_conf_get_str()``, ``bro_conf_get_int()``, and +``bro_conf_get_dbl()``. You can switch between sections using +``bro_conf_set_domain()``. + +Using Dynamic Buffers +--------------------- + +Broccoli provides an API for dynamically allocatable, growable, +shrinkable, and consumable buffers with ``BroBuf``. You may or may not +find this useful -- Broccoli mainly provides this feature in +``broccoli.h`` because these buffers are used internally anyway and +because they are a typical case of something that people implement +themselves over and over again, for example to collect a set of data +before sending it through a file descriptor, etc. + +The buffers work as follows. The structure implementing a buffer is +called ``BroBuf``, and is initialized to a default size when +created via ``bro_buf_new()`` and released using ``bro_buf_free()``. +Each ``BroBuf`` has a content pointer that points to an arbitrary +location between the start of the buffer and the first byte after the +last byte currently used in the buffer (see ``buf_off`` in the +illustration below). The content pointer can seek to arbitrary +locations, and data can be copied from and into the buffer, adjusting +the content pointer accordingly. You can repeatedly append data to the end +of the buffer's used contents using ``bro_buf_append()``. 
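To make these semantics concrete, here is a small, self-contained C sketch of such a consumable buffer. This is only an illustration of the concept, not Broccoli's implementation; the ``Buf`` type and the ``buf_*`` helpers are invented for this example and merely mirror the ``BroBuf`` field names loosely.

.. code:: c

   #include <assert.h>
   #include <stdlib.h>
   #include <string.h>

   /* A growable buffer with a movable read pointer, loosely modeled
    * on BroBuf: buf_len is the allocated size, buf_off the used size,
    * and buf_ptr the current read position (0 <= buf_ptr <= buf_off). */
   typedef struct {
       unsigned char *buf;
       size_t buf_len;
       size_t buf_off;
       size_t buf_ptr;
   } Buf;

   static Buf *buf_new(void)
   {
       Buf *b = calloc(1, sizeof(Buf));
       b->buf_len = 16;
       b->buf = malloc(b->buf_len);
       return b;
   }

   static void buf_free(Buf *b)
   {
       free(b->buf);
       free(b);
   }

   /* Append data to the end of the used contents, growing if needed. */
   static void buf_append(Buf *b, const void *data, size_t n)
   {
       while (b->buf_off + n > b->buf_len) {
           b->buf_len *= 2;
           b->buf = realloc(b->buf, b->buf_len);
       }
       memcpy(b->buf + b->buf_off, data, n);
       b->buf_off += n;
   }

   /* Copy out up to n bytes at the read position, advancing it. */
   static size_t buf_read(Buf *b, void *out, size_t n)
   {
       size_t avail = b->buf_off - b->buf_ptr;
       if (n > avail)
           n = avail;
       memcpy(out, b->buf + b->buf_ptr, n);
       b->buf_ptr += n;
       return n;
   }

   int main(void)
   {
       Buf *b = buf_new();
       char out[16] = {0};

       buf_append(b, "hello ", 6);
       buf_append(b, "world", 5);

       /* Reading consumes from the front; the two appends come back
        * as one contiguous chunk. */
       assert(buf_read(b, out, sizeof(out)) == 11);
       assert(strcmp(out, "hello world") == 0);

       buf_free(b);
       return 0;
   }

The real ``BroBuf`` additionally lets you seek and reset the read pointer; the function list below covers those operations.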
+::
+
+   <---------------- allocated buffer space ---------------->
+   <========= used buffer space =========>                  ^
+   ^                ^                    ^                  |
+   |                |                    |                  |
+   buf           buf_ptr             buf_off            buf_len
+
+Have a look at the following functions for the details:
+``bro_buf_new()``, ``bro_buf_free()``, ``bro_buf_append()``,
+``bro_buf_consume()``, ``bro_buf_reset()``, ``bro_buf_get()``,
+``bro_buf_get_end()``, ``bro_buf_get_size()``,
+``bro_buf_get_used_size()``, ``bro_buf_ptr_get()``,
+``bro_buf_ptr_tell()``, ``bro_buf_ptr_seek()``, ``bro_buf_ptr_check()``,
+and ``bro_buf_ptr_read()``.
+
+Configuring Encrypted Communication
+===================================
+
+Encrypted communication between Bro peers takes place over an SSL
+connection in which both endpoints of the connection are authenticated.
+This requires at least some PKI in the form of a certificate authority
+(CA) which you use to issue and sign certificates for your Bro peers. To
+facilitate the SSL setup, each peer requires three documents: a
+certificate signed by the CA and containing the public key, the
+corresponding private key, and a copy of the CA's certificate.
+
+The OpenSSL command line tool ``openssl`` can be used to create all
+files necessary, but its unstructured arguments and poor documentation
+make it a pain to use and waste lots of people a lot of time [#]_.
+For an alternative tool to create SSL certificates for secure Bro/Broccoli
+communication, see the ``create-cert`` tool available at
+ftp://ee.lbl.gov/create-cert.tar.gz.
+
+In order to enable encrypted communication for your Broccoli
+application, you need to put the CA certificate and the peer certificate
+in the ``/broccoli/ca_cert`` and ``/broccoli/host_cert`` keys,
+respectively, in the configuration file. Optionally, you can store the
+private key in a separate file specified by ``/broccoli/host_key``. To
+quickly enable/disable a certificate configuration, the
+``/broccoli/use_ssl`` key can be used.
+
+.. note::
+   *This is where you configure whether to use encrypted or unencrypted
+   connections.*
+
+   If the ``/broccoli/use_ssl`` key is present and set to one of "yes",
+   "true", "on", or 1, then SSL will be used and an incorrect or missing
+   certificate configuration will cause connection attempts to fail. If
+   the key's value is one of "no", "false", "off", or 0, then in no case
+   will SSL be used and connections will always be cleartext.
+
+   If the ``/broccoli/use_ssl`` key is *not* present, then SSL will be
+   used if a certificate configuration is found, and invalid
+   certificates will cause the connection to fail. If no certificates
+   are configured, cleartext connections will be used.
+
+   In no case does an SSL-enabled setup ever fall back to a cleartext
+   one.
+
+::
+
+   /broccoli/use_ssl     yes
+   /broccoli/ca_cert     /ca_cert.pem
+   /broccoli/host_cert   /bro_cert.pem
+   /broccoli/host_key    /bro_cert.key
+
+In a Bro policy, you need to load the ``frameworks/communication/listen.bro``
+script and redef ``Communication::listen_ssl=T``,
+``ssl_ca_certificate``, and ``ssl_private_key``, defined in ``bro.init``:
+
+.. code:: bro
+
+   @load frameworks/communication/listen
+
+   redef Communication::listen_ssl=T;
+   redef ssl_ca_certificate = "/ca_cert.pem";
+   redef ssl_private_key = "/bro.pem";
+
+By default, you will be prompted for the passphrase for the private key
+matching the public key in your agent's certificate. Depending on your
+application's user interface and deployment, this may be inappropriate.
+You can store the passphrase in the config file as well, using the
+following identifier::
+
+   /broccoli/host_pass   foobar
+
+.. warning:: *Make sure that access to your configuration is restricted.*
+
+   If you provide the passphrase this way, it is obviously essential to
+   have restrictive permissions on the configuration file. Broccoli
+   partially enforces this. Please refer to the section on
+   `Configuration Files`_ for details.
+
+Configuring event reception in Bro scripts
+==========================================
+
+Before a remote Bro will accept your connection and your events, it
+needs to have its policy configured accordingly:
+
+1. Load ``frameworks/communication/listen``, and redef the boolean variable
+   ``Communication::listen_ssl`` depending on whether you want to have
+   encrypted or cleartext communication. Obviously, encrypting the event
+   exchange is recommended and cleartext should only be used for early
+   experimental setups. See below for details on how to set up encrypted
+   communication via SSL.
+
+#. You need to find a port for the Bro agents and Broccoli applications
+   that will listen for connections. Every such agent can use a
+   different port, though default ports are provided in the Bro
+   policies. To change the port the Bro agent will be listening on from
+   its default, redefine ``Communication::listen_port``. Have a
+   look at these policies as well as
+   ``base/frameworks/communication/main.bro`` for the default values.
+   Here is the policy for the unencrypted case:
+
+   .. code:: bro
+
+      @load frameworks/communication/listen
+      redef Communication::listen_port = 12345/tcp;
+
+   ..
+
+   Including the settings for the cryptographic files introduced in the
+   previous section, here is the encrypted one:
+
+   .. code:: bro
+
+      @load frameworks/communication/listen
+      redef Communication::listen_ssl = T;
+      redef Communication::listen_port = 12345/tcp;
+      redef ssl_ca_certificate = "/ca_cert.pem";
+      redef ssl_private_key = "/bro.pem";
+
+   ..
+
+#. The policy controlling which peers a Bro agent will communicate with,
+   and how this communication will happen, is defined in the
+   ``Communication::nodes`` table in
+   ``base/frameworks/communication/main.bro``. This table contains
+   entries of type ``Node``, whose members mostly provide default values
+   so you do not need to define everything.
You need to come up with a
+   tag for the connection under which it can be found in the table (a
+   creative one would be "broping"), the IP address of the peer, the
+   pattern of names of the events Bro will accept from you, whether
+   you want Bro to connect to your machine on startup or not and, if so,
+   a port to connect to (the default is ``Communication::default_port``,
+   also defined in ``base/frameworks/communication/main.bro``), a retry
+   timeout, whether to use SSL, and the class of a connection as set on
+   the Broccoli side via ``bro_conn_set_class()``.
+
+   An example could look as follows:
+
+   .. code:: bro
+
+      redef Communication::nodes += {
+          ["broping"] = [$host = 127.0.0.1, $class="broping",
+                         $events = /ping/, $connect=F, $ssl=F]
+      };
+
+   ..
+
+   This example is taken from ``broping.bro``, the policy the remote Bro
+   must run when you want to use the ``broping`` tool explained in the
+   section on `test programs`_ below. It will allow an agent on the
+   local host to connect and send "ping" events. Our Bro will not
+   attempt to connect, and incoming connections will be expected in
+   cleartext.
+
+Configuring Debugging Output
+============================
+
+If your Broccoli installation was configured with ``--enable-debug``,
+Broccoli will report two kinds of debugging information:
+
+1. function call traces and
+#. individual debugging messages.
+
+Both kinds of output can be adjusted in two ways.
+
+- In the configuration file: in the appropriate section of the
+  configuration file, you can set the keys ``/broccoli/debug_messages``
+  and ``/broccoli/debug_calltrace`` to ``on``/``off`` to enable/disable
+  the corresponding output.
+
+- In code: you can set the variables
+  ``bro_debug_calltrace`` and ``bro_debug_messages`` to 1/0 at any time
+  to enable/disable the corresponding output.
+
+By default, debugging output is inactive (even with debug support
+compiled in).
You need to enable it explicitly either in your code by
+assigning 1 to ``bro_debug_calltrace`` and ``bro_debug_messages`` or by
+enabling it in the configuration file.
+
+Test programs
+=============
+
+The Broccoli distribution comes with a few small test programs, located
+in the ``test/`` directory of the tree. The most notable one is
+``broping`` [#]_, a mini-version of ping. It sends "ping" events to a
+remote Bro agent, expecting "pong" events in return. It operates in two
+flavours: one uses atomic types for sending information across, and the
+other one uses records. The Bro agent you want to ping needs to run
+either the ``broping.bro`` or ``broping-record.bro`` policy. You can
+find these in the ``test/`` directory of the source tree, and in
+``/share/broccoli`` in the installed version. ``broping.bro`` is
+shown below. By default, it is configured for pinging a Bro on the same
+machine. If you want your Bro to be pinged from another machine, you
+need to update the ``Communication::nodes`` variable accordingly:
+
+.. code:: bro
+
+   @load frameworks/communication/listen
+
+   global ping_log = open_log_file("ping");
+
+   redef Communication::nodes += {
+       ["broping"] = [$host = 127.0.0.1, $events = /ping/,
+                      $connect=F, $retry = 60 secs, $ssl=F]
+   };
+
+   event ping(src_time: time, seq: count) {
+       event pong(src_time, current_time(), seq);
+   }
+
+   event pong(src_time: time, dst_time: time, seq: count) {
+       print ping_log,
+           fmt("ping received, seq %d, %f at src, %f at dest, one-way: %f",
+               seq, src_time, dst_time, dst_time-src_time);
+   }
+
+``broping`` sends ping events to Bro. Bro accepts those because they are
+configured accordingly in the nodes table. As shown in the
+policy, ping events trigger pong events, and ``broping`` requests
+delivery of all pong events back to it. When running ``broping``,
+you'll see something like this:
+
+.. console::
+
+   > ./test/broping
+   pong event from 127.0.0.1: seq=1, time=0.004700/1.010303 s
+   pong event from 127.0.0.1: seq=2, time=0.053777/1.010266 s
+   pong event from 127.0.0.1: seq=3, time=0.006435/1.010284 s
+   pong event from 127.0.0.1: seq=4, time=0.020278/1.010319 s
+   pong event from 127.0.0.1: seq=5, time=0.004563/1.010187 s
+   pong event from 127.0.0.1: seq=6, time=0.005685/1.010393 s
+
+Notes
+=====
+
+.. [#] In other documents and books on OpenSSL you will find this
+   expressed more politely, using terms such as "daunting to the
+   uninitiated", "challenging", "complex", "intimidating".
+
+.. [#] Pronunciation is said to be somewhere on the continuum between
+   "brooping" and "burping".
+
+Broccoli API Reference
+######################
+
+The `API documentation <../../broccoli-api/index.html>`_
+describes Broccoli's public C interface.
diff --git a/doc/components/broctl/README.rst b/doc/components/broctl/README.rst
new file mode 100644
index 0000000000..1fd728c7cf
--- /dev/null
+++ b/doc/components/broctl/README.rst
@@ -0,0 +1,1913 @@
+.. Autogenerated. Do not edit.
+
+.. -*- mode: rst-mode -*-
+..
+.. Note: This file includes further autogenerated ones.
+..
+.. Version number is filled in automatically.
+.. |version| replace:: 1.1-3
+
+==========
+BroControl
+==========
+
+.. rst-class:: opening
+
+   This document summarizes installation and use of *BroControl*,
+   Bro's interactive shell for operating Bro installations. *BroControl*
+   has two modes of operation: a *stand-alone* mode for
+   managing a traditional, single-system Bro setup; and a *cluster*
+   mode for maintaining a multi-system setup of coordinated Bro
+   instances load-balancing the work across a set of independent
+   machines. Below, we describe the installation process separately
+   for the two modes.
Once installed, the operation is pretty similar
+   for both types; just keep in mind that if this document refers to
+   "nodes" and you're in a stand-alone setup, there is only a single
+   node and no separate workers or proxies.
+
+.. contents::
+
+Download
+--------
+
+You can find the latest BroControl release for download at
+http://www.bro.org/download.
+
+BroControl's git repository is located at
+``git://git.bro.org/broctl``. You can browse the repository online.
+
+This document describes BroControl |version|. See the ``CHANGES``
+file for version history.
+
+Prerequisites
+-------------
+
+Running *BroControl* requires the following prerequisites:
+
+  - A Unix system. FreeBSD, Linux, and MacOS are supported and
+    should work out of the box. Other Unix systems will quite likely
+    require some tweaking. Note that in a cluster setup, all systems
+    must be running exactly the *same* operating system.
+
+  - A version of *Python* >= 2.6.
+
+  - *bash* (note in particular that on FreeBSD, *bash* is not
+    installed by default).
+
+Installation
+------------
+
+Stand-alone Bro
+~~~~~~~~~~~~~~~
+
+For installing a stand-alone Bro setup, just follow the
+Bro :doc:`Quick Start Guide<../../quickstart>`.
+
+Bro Cluster
+~~~~~~~~~~~
+
+A *Bro Cluster* is a set of systems jointly analyzing the traffic of
+a network link in a coordinated fashion. *BroControl* is able to
+operate such a setup from a central manager system pretty much
+transparently, hiding much of the complexity of the multi-machine
+installation.
+
+A cluster consists of four types of components:
+
+   One or more frontends.
+      Frontends load-balance the traffic across a set of worker
+      machines.
+
+   Worker nodes.
+      Workers do the actual analysis, with each seeing a
+      slice of the overall traffic as split by the frontend(s).
+
+   One or more proxies.
+      Proxies relay the communication between worker nodes.
+
+   One manager.
+      The manager provides the cluster's user-interface for
+      controlling and logging.
During operation, the user only
+      interacts with the manager; this is where *BroControl* is
+      running.
+
+For more information about the cluster architecture, including options
+for the frontend, see the :doc:`Bro Cluster<../../cluster>` documentation.
+
+This document focuses on the installation of the manager, the
+workers, and the proxies. If not otherwise
+stated, in the following we use the terms "manager", "worker", and
+"proxy" to refer to Bro instances, not to physical machines; rather,
+we use the term "node" to refer to physical machines. There may be
+multiple Bro instances running on the same node. For example, it's
+possible to run a proxy on the same node as the manager.
+
+In the following, as an example setup, we will assume that our
+cluster consists of four nodes (not counting the frontend). The host
+names of the systems will be ``host1``, ``host2``, ``host3``, and
+``host4``. We will configure the cluster so that ``host1`` runs the
+manager and the (only) proxy, and ``host{2,3,4}`` are each running
+one worker. This is a typical setup, which will work well for many
+sites.
+
+When installing a cluster, in addition to the prerequisites
+mentioned above, you need to
+
+  - have the same user account set up on all nodes. On the worker
+    nodes, this user must have access to the target network interface
+    in promiscuous mode. ``ssh`` access from the manager node to this
+    user account must be set up on all machines, and must work
+    without asking for a password/passphrase.
+
+  - have some storage available on all nodes under the same path,
+    which we will call the cluster's *prefix* path. In the
+    following, we will use ``/usr/local/bro`` as an example. The Bro
+    user must be able to either create this directory or, where it
+    already exists, must have write permission inside this directory
+    on all nodes.
+
+  - have ``ssh`` and ``rsync`` installed.
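For the example scenario just described (manager and proxy on ``host1``, one worker each on the other hosts), the node configuration could look roughly like the following sketch. The section names and the ``eth0`` interface are illustrative; check the example ``node.cfg`` shipped with BroControl for the authoritative option names::

    [manager]
    type=manager
    host=host1

    [proxy-1]
    type=proxy
    host=host1

    [worker-1]
    type=worker
    host=host2
    interface=eth0

    [worker-2]
    type=worker
    host=host3
    interface=eth0

    [worker-3]
    type=worker
    host=host4
    interface=eth0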
+
+
+With all prerequisites in place, perform the following steps to
+install a Bro cluster (as the Bro user) if you install from the Bro source
+code (which includes BroControl):
+
+- Configure and compile the Bro distribution using the cluster's
+  prefix path as ``--prefix``::
+
+      > cd /path/to/bro/source/distribution
+      > ./configure --prefix=/usr/local/bro && make && make install
+
+- Add ``/bin`` to your ``PATH``.
+
+- Create a cluster configuration file. There is an example provided,
+  which you can edit according to the instructions in the file::
+
+      > cd /usr/local/bro
+      > vi etc/broctl.cfg
+
+- Create a node configuration file to define where manager, workers,
+  and proxies are to run. There is again an example, which defines
+  the example scenario described above and can be edited as needed::
+
+      > cd /usr/local/bro
+      > vi etc/node.cfg
+
+- Create a network configuration file that lists all of the networks
+  which the cluster should consider as local to the monitored
+  environment. Once again, the installation installs a template for
+  editing::
+
+      > cd /usr/local/bro
+      > vi etc/networks.cfg
+
+- Install workers and proxies using *BroControl*::
+
+      > broctl install
+
+  This installation process uses ``ssh`` and ``rsync`` to copy the
+  configuration over to the remote machines so, as described above,
+  you need to ensure that logging in via SSH works before the install will
+  succeed.
+
+- Some tasks need to be run on a regular basis. On the manager node,
+  insert a line like this into the crontab of the user running the
+  cluster::
+
+      0-59/5 * * * * /bin/broctl cron
+
+- Finally, you can start the cluster::
+
+      > broctl start
+
+Getting Started
+---------------
+
+*BroControl* is an interactive interface to the cluster which allows
+you to, e.g., start/stop the monitoring or update its configuration.
+It is started with the ``broctl`` script and then expects commands +on its command-line (alternatively, ``broctl`` can also be started +with a single command directly on the shell's command line):: + + > broctl + Welcome to BroControl x.y + + Type "help" for help. + + [BroControl] > + +As the message says, type help_ to see a list of +all commands. We will now briefly summarize the most important +commands. A full reference follows `Command Reference`_. + +Once ``broctl.cfg`` and ``node.cfg`` are set up as described above, +the monitoring can be started with the start_ command. In the cluster +setup, this will successively start manager, proxies, and workers. The +status_ command should then show all nodes as operating. To stop the +monitoring, issue the stop_ command. exit_ leaves the shell. + +On the manager system (and on the stand-alone system), you find the +current set of (aggregated) logs in ``logs/current`` (which is a +symlink to the corresponding spool directory). The proxies and workers +log into ``spool/proxy/`` and ``spool//``, respectively. +The manager/stand-alone logs are archived in ``logs/``, by default +once a day. Log files of workers and proxies are discarded at the +same rotation interval. + +Whenever the *BroControl* configuration is modified in any way +(including changes to configuration files and site-specific policy +scripts), install_ installs the new version. *No changes will take +effect until* install_ *is run*. Before you run install_, check_ can be +used to check for any potential errors in the new configuration, e.g., +typos in scripts. If check_ does not report any problems, doing +install_ will pretty likely not break anything. + +Note that generally configuration changes only take effect after a +restart of the affected nodes. The restart_ command triggers this. +Some changes however can be put into effect on-the-fly without +restarting any of the nodes by using the update_ command (again only +after doing install_ first). 
Such dynamic updates generally work for
+changes that only modify const variables declared as
+*redefinable* (i.e., with Bro's *&redef* attribute).
+
+Generally, site-specific tuning needs to be done with local policy
+scripts, as in any Bro setup. This is described in
+`Site-specific Customization`_.
+
+*BroControl* provides various options to control the behavior of
+the setup. These options can be set by editing ``broctl.cfg``.
+The config_ command gives a list of all options
+with their current values. A list of the most important options
+follows in `Option Reference`_.
+
+Site-specific Customization
+---------------------------
+
+You'll most likely want to adapt the Bro policy to the local
+environment, and much of the more specific tuning requires writing
+local policy files.
+
+During the initial install, sample local policy scripts (which you can edit)
+are installed in ``share/bro/site``. In the stand-alone setup, a single
+file called ``local.bro`` gets loaded automatically. In the cluster
+setup, the same ``local.bro`` gets loaded, followed by one of three
+other files: ``local-manager.bro``, ``local-worker.bro``, and
+``local-proxy.bro`` are loaded by the manager, workers, and proxy,
+respectively.
+
+In the cluster setup, the main exception to putting everything into
+``local.bro`` is notice filtering, which should be done only on the
+manager.
+
+The next scripts that are loaded are the ones that are automatically
+generated by BroControl. These scripts are created from the
+``networks.cfg`` and ``broctl.cfg`` files.
+
+The last scripts loaded are any node-specific scripts specified with the
+option ``aux_scripts`` in ``node.cfg``. This option can be used to
+load additional scripts on individual nodes only. For example, one could
+add a script ``experimental.bro`` to a single worker for trying out new
+experimental code.
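For instance, assuming the ``key=value`` syntax used in ``node.cfg``, attaching such a script to a single worker might look like the following sketch (the section name, host, interface, and script name are illustrative)::

    [worker-1]
    type=worker
    host=host2
    interface=eth0
    aux_scripts=experimental.bro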
+
+The scripts_ command shows precisely which policy scripts get loaded (and
+in what order) by a node; that can be very helpful.
+
+If you want to change which local policy scripts are loaded by the nodes,
+you can set SitePolicyStandalone_ for all Bro instances,
+SitePolicyManager_ for the manager, and SitePolicyWorker_ for the
+workers. To change the directory where local policy scripts are
+located, set the option SitePolicyPath_ to a different path. These
+options can be changed in the ``broctl.cfg`` file.
+
+Command Reference
+-----------------
+
+The following summary lists all commands supported by *BroControl*.
+All commands may be either entered interactively or specified on the
+shell's command line. If not specified otherwise, commands taking
+*[]* as arguments apply their action either to the given set of
+nodes, or to all nodes if none is given.
+
+.. Automatically generated. Do not edit.
+
+
+.. _attachgdb:
+
+*attachgdb* *[]*
+    Primarily for debugging, the command attaches a *gdb* to the main Bro
+    process on the given nodes.
+
+
+.. _capstats:
+
+*capstats* *[] []*
+    Determines the current load on the network interfaces monitored by
+    each of the given worker nodes. The load is measured over the
+    specified interval (in seconds), or by default over 10 seconds. This
+    command uses the ``capstats`` tool,
+    which is installed along with ``broctl``.
+
+    (Note: When using a CFlow and the CFlow command line utility is
+    installed as well, the ``capstats`` command can also query the device
+    for port statistics. *TODO*: document how to set this up.)
+
+
+.. _check:
+
+*check* *[]*
+    Verifies a modified configuration in terms of syntactical correctness
+    (most importantly correct syntax in policy scripts). This command
+    should be executed for each configuration change *before*
+    install_ is used to put the change into place.
Note + that ``check`` is the only command which operates correctly without a + former install_ command; ``check`` uses the policy + files as found in SitePolicyPath_ to make + sure they compile correctly. If they do, install_ + will then copy them over to an internal place from where the nodes + will read them at the next start_. This approach + ensures that new errors in a policy script will not affect currently + running nodes, even when one or more of them needs to be restarted. + + +.. _cleanup: + +*cleanup* *[--all] []* + Clears the nodes' spool directories (if they are not running + currently). This implies that their persistent state is flushed. Nodes + that were crashed are reset into *stopped* state. If ``--all`` is + specified, this command also removes the content of the node's + TmpDir_, in particular deleteing any data + potentially saved there for reference from previous crashes. + Generally, if you want to reset the installation back into a clean + state, you can first stop_ all nodes, then execute + ``cleanup --all``, and finally start_ all nodes + again. + + +.. _config: + +*config* + Prints all configuration options with their current values. + + +.. _cron: + +*cron* *[enable|disable|?] | [--no-watch]* + This command has two modes of operation. Without arguments (or just + ``--no-watch``), it performs a set of maintenance tasks, including + the logging of various statistical information, expiring old log + files, checking for dead hosts, and restarting nodes which terminated + unexpectedly. The latter can be suppressed with the ``--no-watch`` + option if no auto-restart is desired. This mode is intended to be + executed regularly via *cron*, as described in the installation + instructions. While not intended for interactive use, no harm will be + caused by executing the command manually: all the maintenance tasks + will then just be performed one more time. 
+
+    The second mode is for interactive usage and determines if the regular
+    tasks are indeed performed when ``broctl cron`` is executed. In other
+    words, even with ``broctl cron`` in your crontab, you can still
+    temporarily disable its execution by running ``cron disable``, and
+    then later reenable it with ``cron enable``. This can be helpful
+    while working on, e.g., the BroControl configuration, when ``cron``
+    would otherwise interfere. ``cron ?`` can be used to query the
+    current state.
+
+
+.. _df:
+
+*df* *[<nodes>]*
+    Reports the amount of disk space available on the nodes. Shows only
+    paths relevant to the broctl installation.
+
+
+.. _diag:
+
+*diag* *[<nodes>]*
+    If a node has terminated unexpectedly, this command prints a (somewhat
+    cryptic) summary of its final state, including excerpts of any
+    stdout/stderr output, resource usage, and also a stack backtrace if a
+    core dump is found. The same information is sent out via mail when a
+    node is found to have crashed (the "crash report"). While the
+    information is mainly intended for debugging, it can also help to find
+    misconfigurations (which are usually, but not always, caught by the
+    check_ command).
+
+
+.. _exec:
+
+*exec* *<command line>*
+    Executes the given Unix shell command line on all hosts configured to
+    run at least one Bro instance. This is handy to quickly perform an
+    action across all systems.
+
+
+.. _exit:
+
+*exit*
+    Terminates the shell.
+
+
+.. _help:
+
+*help*
+    Prints a brief summary of all commands understood by the shell.
+
+
+.. _install:
+
+*install*
+    Reinstalls the given nodes, including all configuration files and
+    local policy scripts. This command must be executed after *all*
+    changes to any part of the broctl configuration, otherwise the
+    modifications will not take effect. Usually all nodes should be
+    reinstalled at the same time, as any inconsistencies between them can
+    lead to strange effects. Before executing ``install``, it is
+    recommended to verify the configuration with check_.
+
+
+.. _netstats:
+
+*netstats* *[<nodes>]*
+    Queries each of the nodes for their current counts of captured and
+    dropped packets.
+
+
+.. _nodes:
+
+*nodes*
+    Prints a list of all configured nodes.
+
+
+.. _peerstatus:
+
+*peerstatus* *[<nodes>]*
+    Primarily for debugging, ``peerstatus`` reports statistics about the
+    network connections cluster nodes are using to communicate with other
+    nodes.
+
+
+.. _print:
+
+*print* *<id> [<nodes>]*
+    Reports the *current* live value of the given Bro script ID on all of
+    the specified nodes (which obviously must be running). This can for
+    example be useful to (1) check that policy scripts are working as
+    expected, or (2) confirm that configuration changes have in fact been
+    applied. Note that IDs defined inside a Bro namespace must be
+    prefixed with ``<namespace>::`` (e.g., ``print SSH::did_ssh_version``
+    to print the corresponding table from ``ssh.bro``).
+
+
+.. _process:
+
+*process* *<trace> [options] [-- <scripts>]*
+    Runs Bro offline on a given trace file using the same configuration as
+    when running live. It does, however, use the potentially
+    not-yet-installed policy files in SitePolicyPath_ and disables log
+    rotation. Additional Bro command line flags and scripts can
+    be given (each argument after a ``--`` argument is interpreted as
+    a script).
+
+    Upon completion, the command prints a path where the log files can be
+    found. Subsequent runs of this command may delete these logs.
+
+    In cluster mode, Bro is run with *both* manager and worker scripts
+    loaded into a single instance. While that doesn't fully reproduce the
+    live setup, it is often sufficient for debugging analysis scripts.
+
+
+.. _quit:
+
+*quit*
+    Terminates the shell.
+
+
+.. _restart:
+
+*restart* *[--clean] [<nodes>]*
+    Restarts the given nodes, or all nodes if none are specified. The
+    effect is the same as first executing stop_ followed
+    by a start_, giving the same nodes in both cases.
+    This command is most useful to activate any changes made to Bro policy
+    scripts (after running install_ first). Note that a
+    subset of policy changes can also be installed on the fly via the
+    update_ command, without requiring a restart.
+
+    If ``--clean`` is given, the installation is reset into a clean state
+    before restarting. More precisely, a ``restart --clean`` turns into
+    the command sequence stop_, cleanup_ ``--all``, check_, install_, and
+    start_.
+
+
+.. _scripts:
+
+*scripts* *[-c] [<nodes>]*
+    Primarily for debugging Bro configurations, the ``scripts``
+    command lists all the Bro scripts loaded by each of the nodes in the
+    order they will be parsed by the node at startup.
+    If ``-c`` is given, the command operates as check_ does: it reads
+    the policy files from their *original* location, not the copies
+    installed by install_. The latter option is useful to check a
+    not-yet-installed configuration.
+
+
+.. _start:
+
+*start* *[<nodes>]*
+    Starts the given nodes, or all nodes if none are specified. Nodes
+    already running are left untouched.
+
+
+.. _status:
+
+*status* *[<nodes>]*
+    Prints the current status of the given nodes.
+
+
+.. _stop:
+
+*stop* *[<nodes>]*
+    Stops the given nodes, or all nodes if none are specified. Nodes not
+    running are left untouched.
+
+
+.. _top:
+
+*top* *[<nodes>]*
+    For each of the nodes, prints the status of the two Bro
+    processes (parent process and child process) in a *top*-like
+    format, including CPU usage and memory consumption. If
+    executed interactively, the display is updated frequently
+    until key ``q`` is pressed. If invoked non-interactively, the
+    status is printed only once.
+
+
+.. _update:
+
+*update* *[<nodes>]*
+    After a change to Bro policy scripts, this command updates the Bro
+    processes on the given nodes *while they are running* (i.e., without
+    requiring a restart_). However, such dynamic
+    updates work only for a *subset* of Bro's full configuration.
The + following changes can be applied on the fly: The value of all + const variables defined with the ``&redef`` attribute can be changed. + More extensive script changes are not possible during runtime and + always require a restart; if you change more than just the values of + ``&redef``-able consts and still issue ``update``, the results are + undefined and can lead to crashes. Also note that before running + ``update``, you still need to do an install_ (preferably after + check_), as otherwise ``update`` will not see the changes and it will + resend the old configuration. + + +Option Reference +---------------- + +This section summarizes the options that can be set in ``broctl.cfg`` +for customizing the behavior of *BroControl*. Usually, one only needs +to change the "user options", which are listed first. The "internal +options" are, as the name suggests, primarily used internally and set +automatically. They are documented here only for reference. + +.. Automatically generated. Do not edit. + +User Options +~~~~~~~~~~~~ +.. _BroArgs: + +*BroArgs* (string, default _empty_) + Additional arguments to pass to Bro on the command-line. + +.. _CFlowAddress: + +*CFlowAddress* (string, default _empty_) + If a cFlow load-balancer is used, the address of the device (format: :). + +.. _CFlowPassword: + +*CFlowPassword* (string, default _empty_) + If a cFlow load-balancer is used, the password for accessing its configuration interface. + +.. _CFlowUser: + +*CFlowUser* (string, default _empty_) + If a cFlow load-balancer is used, the user name for accessing its configuration interface. + +.. _CommTimeout: + +*CommTimeout* (int, default 10) + The number of seconds to wait before assuming Broccoli communication events have timed out. + +.. _CompressCmd: + +*CompressCmd* (string, default "gzip -9") + If archived logs will be compressed, the command to use for that. The specified command must compress its standard input to standard output. + +.. 
_CompressExtension:
+
+*CompressExtension* (string, default "gz")
+    If archived logs will be compressed, the file extension to use on compressed log files. When specifying a file extension, don't include the period character (e.g., specify 'gz' instead of '.gz').
+
+.. _CompressLogs:
+
+*CompressLogs* (bool, default 1)
+    True to compress archived log files.
+
+.. _CronCmd:
+
+*CronCmd* (string, default _empty_)
+    A custom command to run every time the cron command has finished.
+
+.. _Debug:
+
+*Debug* (bool, default 0)
+    Enable extensive debugging output in spool/debug.log.
+
+.. _HaveNFS:
+
+*HaveNFS* (bool, default 0)
+    True if shared files are mounted across all nodes via NFS (see FAQ).
+
+.. _IPv6Comm:
+
+*IPv6Comm* (bool, default 1)
+    Enable IPv6 communication between cluster nodes (and also between them and BroControl).
+
+.. _LogDir:
+
+*LogDir* (string, default "$\{BroBase}/logs")
+    Directory for archived log files.
+
+.. _LogExpireInterval:
+
+*LogExpireInterval* (int, default 0)
+    Number of days log files are kept (zero means disabled).
+
+.. _LogRotationInterval:
+
+*LogRotationInterval* (int, default 3600)
+    The frequency of log rotation in seconds for the manager/standalone node.
+
+.. _MailAlarmsTo:
+
+*MailAlarmsTo* (string, default "$\{MailTo}")
+    Destination address for alarm summary mails. Default is to use the same address as MailTo.
+
+.. _MailFrom:
+
+*MailFrom* (string, default "Big Brother <bro@localhost>")
+    Originator address for broctl-generated mails.
+
+.. _MailReplyTo:
+
+*MailReplyTo* (string, default _empty_)
+    Reply-to address for broctl-generated mails.
+
+.. _MailSubjectPrefix:
+
+*MailSubjectPrefix* (string, default "[Bro]")
+    General Subject prefix for mails.
+
+.. _MailTo:
+
+*MailTo* (string, default "")
+    Destination address for non-alarm mails.
+
+.. _MakeArchiveName:
+
+*MakeArchiveName* (string, default "$\{BroBase}/share/broctl/scripts/make-archive-name")
+    Script to generate filenames for archived log files.
+
+.. 
_MemLimit: + +*MemLimit* (string, default "unlimited") + Maximum amount of memory for Bro processes to use (in KB, or the string 'unlimited'). + +.. _MinDiskSpace: + +*MinDiskSpace* (int, default 5) + Percentage of minimum disk space available before warning is mailed. + +.. _PFRINGClusterID: + +*PFRINGClusterID* (int, default @PF_RING_CLUSTER_ID@) + If PF_RING flow-based load balancing is desired, this is where the PF_RING cluster id is defined. The default value is configuration-dependent and determined automatically by CMake at configure-time based upon whether PF_RING's enhanced libpcap is available. Bro must be linked with PF_RING's libpcap wrapper for this option to work. + +.. _Prefixes: + +*Prefixes* (string, default "local") + Additional script prefixes for Bro, separated by colons. Use this instead of @prefix. + +.. _SaveTraces: + +*SaveTraces* (bool, default 0) + True to let backends capture short-term traces via '-w'. These are not archived but might be helpful for debugging. + +.. _SendMail: + +*SendMail* (string, default "@SENDMAIL@") + Location of the sendmail binary. Make this string blank to prevent email from being sent. The default value is configuration-dependent and determined automatically by CMake at configure-time. + +.. _SitePluginPath: + +*SitePluginPath* (string, default _empty_) + Directories to search for custom plugins, separated by colons. + +.. _SitePolicyManager: + +*SitePolicyManager* (string, default "local-manager.bro") + Space-separated list of local policy files for manager. + +.. _SitePolicyPath: + +*SitePolicyPath* (string, default "$\{PolicyDir}/site") + Directories to search for local policy files, separated by colons. + +.. _SitePolicyStandalone: + +*SitePolicyStandalone* (string, default "local.bro") + Space-separated list of local policy files for all Bro instances. + +.. _SitePolicyWorker: + +*SitePolicyWorker* (string, default "local-worker.bro") + Space-separated list of local policy files for workers. + +.. 
_StopTimeout: + +*StopTimeout* (int, default 60) + The number of seconds to wait before sending a SIGKILL to a node which was previously issued the 'stop' command but did not terminate gracefully. + +.. _TimeFmt: + +*TimeFmt* (string, default "%d %b %H:%M:%S") + Format string to print date/time specifications (see 'man strftime'). + +.. _TimeMachineHost: + +*TimeMachineHost* (string, default _empty_) + If the manager should connect to a Time Machine, the address of the host it is running on. + +.. _TimeMachinePort: + +*TimeMachinePort* (string, default "47757/tcp") + If the manager should connect to a Time Machine, the port it is running on (in Bro syntax, e.g., 47757/tcp). + +.. _ZoneID: + +*ZoneID* (string, default _empty_) + If the host running BroControl is managing a cluster comprised of nodes with non-global IPv6 addresses, this option indicates what RFC 4007 zone_id to append to node addresses when communicating with them. + + +Internal Options +~~~~~~~~~~~~~~~~ + +.. _BinDir: + +*BinDir* (string, default "$\{BroBase}/bin") + Directory for executable files. + +.. _BroBase: + +*BroBase* (string, default _empty_) + Base path of broctl installation on all nodes. + +.. _CapstatsPath: + +*CapstatsPath* (string, default "$\{bindir}/capstats") + Path to capstats binary; empty if not available. + +.. _CfgDir: + +*CfgDir* (string, default "$\{BroBase}/etc") + Directory for configuration files. + +.. _DebugLog: + +*DebugLog* (string, default "$\{SpoolDir}/debug.log") + Log file for debugging information. + +.. _HelperDir: + +*HelperDir* (string, default "$\{BroBase}/share/broctl/scripts/helpers") + Directory for broctl helper scripts. + +.. _LibDir: + +*LibDir* (string, default "$\{BroBase}/lib") + Directory for library files. + +.. _LibDirInternal: + +*LibDirInternal* (string, default "$\{BroBase}/lib/broctl") + Directory for broctl-specific library files. + +.. 
_LocalNetsCfg: + +*LocalNetsCfg* (string, default "$\{CfgDir}/networks.cfg") + File defining the local networks. + +.. _LockFile: + +*LockFile* (string, default "$\{SpoolDir}/lock") + Lock file preventing concurrent shell operations. + +.. _NodeCfg: + +*NodeCfg* (string, default "$\{CfgDir}/node.cfg") + Node configuration file. + +.. _OS: + +*OS* (string, default _empty_) + Name of operating system as reported by uname. + +.. _PluginDir: + +*PluginDir* (string, default "$\{LibDirInternal}/plugins") + Directory where standard plugins are located. + +.. _PolicyDir: + +*PolicyDir* (string, default "$\{BroBase}/share/bro") + Directory for standard policy files. + +.. _PolicyDirSiteInstall: + +*PolicyDirSiteInstall* (string, default "$\{SpoolDir}/installed-scripts-do-not-touch/site") + Directory where the shell copies local policy scripts when installing. + +.. _PolicyDirSiteInstallAuto: + +*PolicyDirSiteInstallAuto* (string, default "$\{SpoolDir}/installed-scripts-do-not-touch/auto") + Directory where the shell copies auto-generated local policy scripts when installing. + +.. _PostProcDir: + +*PostProcDir* (string, default "$\{BroBase}/share/broctl/scripts/postprocessors") + Directory for log postprocessors. + +.. _ScriptsDir: + +*ScriptsDir* (string, default "$\{BroBase}/share/broctl/scripts") + Directory for executable scripts shipping as part of broctl. + +.. _SpoolDir: + +*SpoolDir* (string, default "$\{BroBase}/spool") + Directory for run-time data. + +.. _StandAlone: + +*StandAlone* (bool, default 0) + True if running in stand-alone mode (see elsewhere). + +.. _StateFile: + +*StateFile* (string, default "$\{SpoolDir}/broctl.dat") + File storing the current broctl state. + +.. _StaticDir: + +*StaticDir* (string, default "$\{BroBase}/share/broctl") + Directory for static, arch-independent files. + +.. _StatsDir: + +*StatsDir* (string, default "$\{LogDir}/stats") + Directory where statistics are kept. + +.. 
_StatsLog:
+
+*StatsLog* (string, default "$\{SpoolDir}/stats.log")
+    Log file for statistics.
+
+.. _Time:
+
+*Time* (string, default _empty_)
+    Path to time binary.
+
+.. _TmpDir:
+
+*TmpDir* (string, default "$\{SpoolDir}/tmp")
+    Directory for temporary data.
+
+.. _TmpExecDir:
+
+*TmpExecDir* (string, default "$\{SpoolDir}/tmp")
+    Directory where binaries are copied before execution.
+
+.. _TraceSummary:
+
+*TraceSummary* (string, default "$\{bindir}/trace-summary")
+    Path to trace-summary script (empty if not available). Make this string blank to disable the connection summary emails.
+
+.. _Version:
+
+*Version* (string, default _empty_)
+    Version of broctl.
+
+
+Writing Plugins
+---------------
+
+BroControl provides a plugin interface to extend its functionality. A
+plugin is written in Python and can do any, or all, of the following:
+
+ * Perform actions before or after any of the standard BroControl
+   commands is executed. When running before the actual command, it
+   can filter which nodes to operate on or stop the execution
+   altogether. When running after the command, it gets access to
+   the command's success on a per-node basis (where applicable).
+
+ * Add custom commands to BroControl.
+
+ * Add custom options to BroControl defined in ``broctl.cfg``.
+
+ * Add custom keys to nodes defined in ``node.cfg``.
+
+A plugin is written by deriving a new class from the BroControl class
+`Plugin`_. The Python script with the new plugin is then copied into a
+plugin directory searched by BroControl at startup. By default,
+BroControl searches ``<prefix>/lib/broctl/plugins``; further directories
+may be configured by setting the SitePluginPath_ option. Note that any
+plugin script must end in ``*.py`` to be found. BroControl comes with
+some example plugins that can be used as a starting point; see
+the ``<prefix>/lib/broctl/plugins`` directory.
+
+In the following, we document the API that is available to plugins.
A
+plugin must be derived from the `Plugin`_ class, and can use its
+methods as well as those of the `Node`_ class.
+
+.. _Plugin:
+
+Class ``Plugin``
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+class **Plugin**
+    The class ``Plugin`` is the base class for all BroControl plugins.
+
+    The class has a number of methods for plugins to override, and every
+    plugin must at least override ``name()`` and ``pluginVersion()``.
+
+    For each BroControl command ``foo``, there are two methods,
+    ``cmd_foo_pre`` and ``cmd_foo_post``, that are called just before the
+    command is executed and just after it has finished, respectively. The
+    arguments these methods receive correspond to their command-line
+    parameters, and are further documented below.
+
+    The ``cmd_<XXX>_pre`` methods have the ability to prevent the command's
+    execution, either completely or partially for those commands that take
+    nodes as parameters. In the latter case, the method receives a list of
+    nodes that the command is to be run on, and it can filter that list and
+    return a modified version of the nodes to actually use. The standard
+    case would be returning simply the unmodified ``nodes`` parameter. To
+    completely block the command's execution, return an empty list. To just
+    not execute the command for a subset, remove the affected ones. For
+    commands that do not receive nodes as arguments, the return value is
+    interpreted as a boolean indicating whether command execution should
+    proceed (True) or not (False).
+
+    The ``cmd_<XXX>_post`` methods likewise receive the command's arguments
+    as their parameter, as documented below. For commands taking nodes, the
+    list corresponds to those nodes for which the command was actually
+    executed (i.e., after any ``cmd_<XXX>_pre`` filtering). Each node is
+    given as a tuple ``(node, bool)`` with *node* being the actual `Node`_,
+    and the boolean indicating whether the command was successful for it.
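The pre-hook filtering contract described above can be sketched with a plain function (the node names here are hypothetical, and a real hook would be a method on a plugin class receiving `Node`_ objects rather than strings):

```python
# Sketch of the pre-hook filtering contract: receive the node list and
# return the subset that should proceed. "worker-9" is a made-up name.
def cmd_start_pre(nodes):
    # Drop one node; returning the filtered list tells BroControl to run
    # the command only on the remaining nodes. Return [] to block entirely.
    return [n for n in nodes if n != "worker-9"]

print(cmd_start_pre(["manager", "proxy-1", "worker-9"]))
# prints ['manager', 'proxy-1']
```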
+
+    Note that if a plugin prevents a command from executing either
+    completely or partially, it should report its reason via the
+    ``message()`` or ``error()`` methods.
+
+    If multiple plugins hook into the same command, all their
+    ``cmd_<XXX>_{pre,post}`` methods are executed in undefined order. The
+    command is executed on the intersection of all ``cmd_<XXX>_pre``
+    results.
+
+    Finally, note that the ``restart`` command doesn't have its own method
+    as it's just a combination of other commands and thus their callbacks
+    are run.
+
+    .. _Plugin.debug:
+
+    **debug** (self, msg)
+
+        Logs a debug message in BroControl's debug log if enabled.
+
+    .. _Plugin.error:
+
+    **error** (self, msg)
+
+        Reports an error to the user.
+
+    .. _Plugin.execute:
+
+    **execute** (self, node, cmd)
+
+        Executes a command on the host for the given *node* of type
+        `Node`_. Returns a tuple ``(success, output)`` in which ``success``
+        is True if the command ran successfully and ``output`` is the
+        combined stdout/stderr output.
+
+    .. _Plugin.executeParallel:
+
+    **executeParallel** (self, cmds)
+
+        Executes a set of commands in parallel on multiple hosts. ``cmds``
+        is a list of tuples ``(node, cmd)``, in which the *node* is a
+        `Node`_ instance and *cmd* is a string with the command to execute
+        for it. The method returns a list of tuples ``(node, success,
+        output)``, in which ``success`` is True if the command ran
+        successfully and ``output`` is the combined stdout/stderr output
+        for the corresponding ``node``.
+
+    .. _Plugin.getGlobalOption:
+
+    **getGlobalOption** (self, name)
+
+        Returns the value of the global BroControl option or state
+        attribute *name*. If the user has not set the option, its default
+        value is returned. See the output of ``broctl config`` for a
+        complete list.
+
+    .. _Plugin.getOption:
+
+    **getOption** (self, name)
+
+        Returns the value of one of the plugin's options, *name*. The
+        returned value will always be a string.
+
+        An option has a default value (see *options()*), which can be
+        overridden by a user in ``broctl.cfg``. An option's value cannot be
+        changed by the plugin.
+
+    .. _Plugin.getState:
+
+    **getState** (self, name)
+
+        Returns the current value of one of the plugin's state variables,
+        *name*. The returned value will always be a string. If it has not
+        yet been set, an empty string will be returned.
+
+        Unlike options, state variables can be set by the plugin and
+        are persistent across restarts. They are not visible to the user.
+
+        Note that a plugin cannot query any global BroControl state
+        variables.
+
+    .. _Plugin.hosts:
+
+    **hosts** (self, nodes)
+
+        Returns a list of all hosts running at least one node from the list
+        of `Node`_ objects in *nodes*, or all configured hosts if *nodes*
+        is empty.
+
+    .. _Plugin.message:
+
+    **message** (self, msg)
+
+        Reports a message to the user.
+
+    .. _Plugin.nodes:
+
+    **nodes** (self)
+
+        Returns a list of all configured `Node`_ objects.
+
+    .. _Plugin.parseNodes:
+
+    **parseNodes** (self, names)
+
+        Returns `Node`_ objects for a string of space-separated node names.
+        If a name does not correspond to a known node, an error message is
+        printed and the node is skipped from the returned list. If no names
+        are known, an empty list is returned.
+
+    .. _Plugin.setState:
+
+    **setState** (self, name, value)
+
+        Sets one of the plugin's state variables, *name*, to *value*.
+        *value* must be a string. The change is permanent and will be
+        recorded to disk.
+
+        Note that a plugin cannot change any global BroControl state
+        variables.
+
+    .. _Plugin.broProcessDied:
+
+    **broProcessDied** (self, node)
+
+        Called when BroControl finds the Bro process for `Node`_ *node*
+        to have terminated unexpectedly. This method will be called just
+        before BroControl prepares the node's "crash report" and before it
+        cleans up the node's spool directory.
+
+        This method can be overridden by derived classes.
The default
+        implementation does nothing.
+
+    .. _Plugin.cmd_attachgdb_post:
+
+    **cmd_attachgdb_post** (self, nodes)
+
+        Called just after the ``attachgdb`` command has finished. Arguments
+        are as with the ``pre`` method.
+
+        This method can be overridden by derived classes. The default
+        implementation does nothing.
+
+    .. _Plugin.cmd_attachgdb_pre:
+
+    **cmd_attachgdb_pre** (self, nodes)
+
+        Called just before the ``attachgdb`` command is run. It receives
+        the list of nodes, and returns the list of nodes that should
+        proceed with the command.
+
+        This method can be overridden by derived classes. The default
+        implementation does nothing.
+
+    .. _Plugin.cmd_capstats_post:
+
+    **cmd_capstats_post** (self, nodes, interval)
+
+        Called just after the ``capstats`` command has finished. Arguments
+        are as with the ``pre`` method.
+
+        This method can be overridden by derived classes. The default
+        implementation does nothing.
+
+    .. _Plugin.cmd_capstats_pre:
+
+    **cmd_capstats_pre** (self, nodes, interval)
+
+        Called just before the ``capstats`` command is run. It receives the
+        list of nodes, and returns the list of nodes that should proceed
+        with the command. *interval* is an integer with the measurement
+        interval in seconds.
+
+        This method can be overridden by derived classes. The default
+        implementation does nothing.
+
+    .. _Plugin.cmd_check_post:
+
+    **cmd_check_post** (self, results)
+
+        Called just after the ``check`` command has finished. It receives
+        the list of 2-tuples ``(node, bool)`` indicating the nodes the
+        command was executed for, along with their success status.
+
+        This method can be overridden by derived classes. The default
+        implementation does nothing.
+
+    .. _Plugin.cmd_check_pre:
+
+    **cmd_check_pre** (self, nodes)
+
+        Called just before the ``check`` command is run. It receives the
+        list of nodes, and returns the list of nodes that should proceed
+        with the command.
+
+        This method can be overridden by derived classes.
The default + implementation does nothing. + + .. _Plugin.cmd_cleanup_post: + + **cmd_cleanup_post** (self, nodes, all) + + Called just after the ``cleanup`` command has finished. Arguments + are as with the ``pre`` method. + + This method can be overridden by derived classes. The default + implementation does nothing. + + .. _Plugin.cmd_cleanup_pre: + + **cmd_cleanup_pre** (self, nodes, all) + + Called just before the ``cleanup`` command is run. It receives the + list of nodes, and returns the list of nodes that should proceed with + the command. *all* is boolean indicating whether the ``--all`` + argument has been given. + + This method can be overridden by derived classes. The default + implementation does nothing. + + .. _Plugin.cmd_config_post: + + **cmd_config_post** (self) + + Called just after the ``config`` command has finished. + + This method can be overridden by derived classes. The default + implementation does nothing. + + .. _Plugin.cmd_config_pre: + + **cmd_config_pre** (self) + + Called just before the ``config`` command is run. + + This method can be overridden by derived classes. The default + implementation does nothing. + + .. _Plugin.cmd_cron_post: + + **cmd_cron_post** (self, arg, watch) + + Called just after the ``cron`` command has finished. Arguments are + as with the ``pre`` method. + + This method can be overridden by derived classes. The default + implementation does nothing. + + .. _Plugin.cmd_cron_pre: + + **cmd_cron_pre** (self, arg, watch) + + Called just before the ``cron`` command is run. *arg* is None if + the cron is executed without arguments. Otherwise, it is one of the + strings: ``enable``, ``disable``, ``?``. *watch* is a boolean + indicating whether ``cron`` should restart abnormally terminated Bro + processes; it's only valid if arg is empty. + + This method can be overridden by derived classes. The default + implementation does nothing. + + .. 
_Plugin.cmd_custom: + + **cmd_custom** (self, cmd, args) + + Called when command defined by the ``commands`` method is executed. + ``cmd`` is the command (with the plugin's prefix), and ``args`` is a + single *string* with all arguments. + + If the arguments are actually node names, ``parseNodes`` can + be used to get the `Node`_ objects. + + This method can be overridden by derived classes. The default + implementation does nothing. + + .. _Plugin.cmd_df_post: + + **cmd_df_post** (self, nodes) + + Called just after the ``df`` command has finished. Arguments are as + with the ``pre`` method. + + This method can be overridden by derived classes. The default + implementation does nothing. + + .. _Plugin.cmd_df_pre: + + **cmd_df_pre** (self, nodes) + + Called just before the ``df`` command is run. It receives the + list of nodes, and returns the list of nodes that should proceed with + the command. + + This method can be overridden by derived classes. The default + implementation does nothing. + + .. _Plugin.cmd_diag_post: + + **cmd_diag_post** (self, nodes) + + Called just after the ``diag`` command has finished. Arguments are + as with the ``pre`` method. + + This method can be overridden by derived classes. The default + implementation does nothing. + + .. _Plugin.cmd_diag_pre: + + **cmd_diag_pre** (self, nodes) + + Called just before the ``diag`` command is run. It receives the + list of nodes, and returns the list of nodes that should proceed with + the command. + + This method can be overridden by derived classes. The default + implementation does nothing. + + .. _Plugin.cmd_exec_post: + + **cmd_exec_post** (self, cmdline) + + Called just after the ``exec`` command has finished. Arguments are + as with the ``pre`` method. + + This method can be overridden by derived classes. The default + implementation does nothing. + + .. _Plugin.cmd_exec_pre: + + **cmd_exec_pre** (self, cmdline) + + Called just before the ``exec`` command is run. 
*cmdline* is a + string with the command line to execute. + + This method can be overridden by derived classes. The default + implementation does nothing. + + .. _Plugin.cmd_install_post: + + **cmd_install_post** (self) + + Called just after the ``install`` command has finished. + + This method can be overridden by derived classes. The default + implementation does nothing. + + .. _Plugin.cmd_install_pre: + + **cmd_install_pre** (self) + + Called just before the ``install`` command is run. + + This method can be overridden by derived classes. The default + implementation does nothing. + + .. _Plugin.cmd_netstats_post: + + **cmd_netstats_post** (self, nodes) + + Called just after the ``netstats`` command has finished. Arguments + are as with the ``pre`` method. + + This method can be overridden by derived classes. The default + implementation does nothing. + + .. _Plugin.cmd_netstats_pre: + + **cmd_netstats_pre** (self, nodes) + + Called just before the ``netstats`` command is run. It receives the + list of nodes, and returns the list of nodes that should proceed with + the command. + + This method can be overridden by derived classes. The default + implementation does nothing. + + .. _Plugin.cmd_nodes_post: + + **cmd_nodes_post** (self) + + Called just after the ``nodes`` command has finished. + + This method can be overridden by derived classes. The default + implementation does nothing. + + .. _Plugin.cmd_nodes_pre: + + **cmd_nodes_pre** (self) + + Called just before the ``nodes`` command is run. + + This method can be overridden by derived classes. The default + implementation does nothing. + + .. _Plugin.cmd_peerstatus_post: + + **cmd_peerstatus_post** (self, nodes) + + Called just after the ``peerstatus`` command has finished. + Arguments are as with the ``pre`` method. + + This method can be overridden by derived classes. The default + implementation does nothing. + + .. 
_Plugin.cmd_peerstatus_pre: + + **cmd_peerstatus_pre** (self, nodes) + + Called just before the ``peerstatus`` command is run. It receives the + list of nodes, and returns the list of nodes that should proceed with + the command. + + This method can be overridden by derived classes. The default + implementation does nothing. + + .. _Plugin.cmd_print_post: + + **cmd_print_post** (self, nodes, id) + + Called just after the ``print`` command has finished. Arguments are + as with the ``pre`` method. + + This method can be overridden by derived classes. The default + implementation does nothing. + + .. _Plugin.cmd_print_pre: + + **cmd_print_pre** (self, nodes, id) + + Called just before the ``print`` command is run. It receives the + list of nodes, and returns the list of nodes that should proceed with + the command. *id* is a string with the name of the ID to be printed. + + This method can be overridden by derived classes. The default + implementation does nothing. + + .. _Plugin.cmd_process_post: + + **cmd_process_post** (self, trace, options, scripts, success) + + Called just after the ``process`` command has finished. Arguments + are as with the ``pre`` method, plus an additional boolean *success* + indicating whether Bro terminated normally. + + This method can be overridden by derived classes. The default + implementation does nothing. + + .. _Plugin.cmd_process_pre: + + **cmd_process_pre** (self, trace, options, scripts) + + Called just before the ``process`` command is run. It receives the + *trace* to read from as a string, a list of additional Bro *options*, + and a list of additional Bro scripts. + + This method can be overridden by derived classes. The default + implementation does nothing. + + .. _Plugin.cmd_restart_post: + + **cmd_restart_post** (self, results) + + Called just after the ``restart`` command has finished. It receives + the list of 2-tuples ``(node, bool)`` indicating the nodes the command + was executed for, along with their success status. 
The remaining + arguments are as with the ``pre`` method. + + This method can be overridden by derived classes. The default + implementation does nothing. + + .. _Plugin.cmd_restart_pre: + + **cmd_restart_pre** (self, nodes, clean) + + Called just before the ``restart`` command is run. It receives the + list of nodes, and returns the list of nodes that should proceed with + the command. *clean* is boolean indicating whether the ``--clean`` + argument has been given. + + This method can be overridden by derived classes. The default + implementation does nothing. + + .. _Plugin.cmd_scripts_post: + + **cmd_scripts_post** (self, nodes, full_path, check) + + Called just after the ``scripts`` command has finished. Arguments + are as with the ``pre`` method. + + This method can be overridden by derived classes. The default + implementation does nothing. + + .. _Plugin.cmd_scripts_pre: + + **cmd_scripts_pre** (self, nodes, full_path, check) + + Called just before the ``scripts`` command is run. It receives the + list of nodes, and returns the list of nodes that should proceed with + the command. ``full_path`` and ``check`` are boolean indicating + whether the ``-p`` and ``-c`` options were given, respectively. + + This method can be overridden by derived classes. The default + implementation does nothing. + + .. _Plugin.cmd_start_post: + + **cmd_start_post** (self, results) + + Called just after the ``start`` command has finished. It receives + the list of 2-tuples ``(node, bool)`` indicating the nodes the command + was executed for, along with their success status. + + This method can be overridden by derived classes. The default + implementation does nothing. + + .. _Plugin.cmd_start_pre: + + **cmd_start_pre** (self, nodes) + + Called just before the ``start`` command is run. It receives the + list of nodes, and returns the list of nodes that should proceed with + the command. + + This method can be overridden by derived classes. 
The default + implementation does nothing. + + .. _Plugin.cmd_status_post: + + **cmd_status_post** (self, nodes) + + Called just after the ``status`` command has finished. Arguments + are as with the ``pre`` method. + + This method can be overridden by derived classes. The default + implementation does nothing. + + .. _Plugin.cmd_status_pre: + + **cmd_status_pre** (self, nodes) + + Called just before the ``status`` command is run. It receives the + list of nodes, and returns the list of nodes that should proceed with + the command. + + This method can be overridden by derived classes. The default + implementation does nothing. + + .. _Plugin.cmd_stop_post: + + **cmd_stop_post** (self, results) + + Called just after the ``stop`` command has finished. It receives + the list of 2-tuples ``(node, bool)`` indicating the nodes the command + was executed for, along with their success status. + + This method can be overridden by derived classes. The default + implementation does nothing. + + .. _Plugin.cmd_stop_pre: + + **cmd_stop_pre** (self, nodes) + + Called just before the ``stop`` command is run. It receives the + list of nodes, and returns the list of nodes that should proceed with + the command. + + This method can be overridden by derived classes. The default + implementation does nothing. + + .. _Plugin.cmd_top_post: + + **cmd_top_post** (self, nodes) + + Called just after the ``top`` command has finished. Arguments are + as with the ``pre`` method. Note that when ``top`` is run + interactively to auto-refresh continuously, this method will be called + once after each update. + + This method can be overridden by derived classes. The default + implementation does nothing. + + .. _Plugin.cmd_top_pre: + + **cmd_top_pre** (self, nodes) + + Called just before the ``top`` command is run. It receives the list + of nodes, and returns the list of nodes that should proceed with the + command. 
Note that when ``top`` is run interactively to auto-refresh + continuously, this method will be called once before each update. + + This method can be overridden by derived classes. The default + implementation does nothing. + + .. _Plugin.cmd_update_post: + + **cmd_update_post** (self, results) + + Called just after the ``update`` command has finished. It receives + the list of 2-tuples ``(node, bool)`` indicating the nodes the command + was executed for, along with their success status. + + This method can be overridden by derived classes. The default + implementation does nothing. + + .. _Plugin.cmd_update_pre: + + **cmd_update_pre** (self, nodes) + + Called just before the ``update`` command is run. It receives the + list of nodes, and returns the list of nodes that should proceed with + the command. + + This method can be overridden by derived classes. The default + implementation does nothing. + + .. _Plugin.commands: + + **commands** (self) + + Returns a set of custom commands provided by the + plugin. + + The return value is a list of 3-tuples each having the following + elements: + + ``command`` + A string with the command's name. Note that the command name + exposed to the user will be prefixed with the plugin's prefix + as returned by *prefix()* (e.g., ``myplugin.mycommand``). + + ``arguments`` + A string describing the command's arguments in a textual form + suitable for use in the ``help`` command summary (e.g., + ``[<nodes>]`` for a command taking an optional list of nodes). + Empty if no arguments are expected. + + ``description`` + A string with a description of the command's semantics. + + + This method can be overridden by derived classes. The implementation + must not call the parent class' implementation. The default + implementation returns an empty list. + + .. _Plugin.done: + + **done** (self) + + Called once just before BroControl terminates. This method can do + any cleanup the plugin may require. + + This method can be overridden by derived classes. 
The default + implementation does nothing. + + .. _Plugin.hostStatusChanged: + + **hostStatusChanged** (self, host, status) + + Called when BroControl's ``cron`` command finds the availability of + a cluster system to have changed. Initially, all systems are assumed + to be up and running. Once BroControl notices that a system isn't + responding (defined as either it doesn't ping at all, or does not + accept SSH sessions), it calls this method, passing in a string with + the name of the *host* and a boolean *status* set to False. Once the + host becomes available again, the method will be called again for the + same host with *status* now set to True. + + Note that BroControl's ``cron`` tracks a host's availability across + execution, so if the next time it's run the host is still down, this + method will not be called again. + + This method can be overridden by derived classes. The default + implementation does nothing. + + .. _Plugin.init: + + **init** (self) + + Called once just before BroControl starts executing any commands. + This method can do any initialization that the plugin may require. + + Note that when this method executes, BroControl guarantees that all + internals are fully set up (e.g., user-defined options are available). + This may not be the case when the class ``__init__`` method runs. + + Returns a boolean, indicating whether the plugin should be used. If it + returns ``False``, the plugin will be removed and no other methods + called. + + This method can be overridden by derived classes. The default + implementation always returns True. + + .. _Plugin.name: + + **name** (self) + + Returns a string with a descriptive *name* for the plugin (e.g., + ``"TestPlugin"``). The name must not contain any whitespace. + + This method must be overridden by derived classes. The implementation + must not call the parent class' implementation. + + .. _Plugin.nodeKeys: + + **nodeKeys** (self) + + Returns a list of custom keys for ``node.cfg``. 
The value for a + key will be available from the `Node`_ object as attribute + ``<prefix>_<key>`` (e.g., ``node.test_mykw``). If not set, the + attribute will be set to None. + + This method can be overridden by derived classes. The implementation + must not call the parent class' implementation. The default + implementation returns an empty list. + + .. _Plugin.options: + + **options** (self) + + Returns a set of local configuration options provided by the + plugin. + + The return value is a list of 4-tuples each having the following + elements: + + ``name`` + A string with the name of the option (e.g., ``Path``). Option + names are case-insensitive. Note that the option name exposed + to the user will be prefixed with your plugin's prefix as + returned by *prefix()* (e.g., ``myplugin.Path``). + + ``type`` + A string with the type of the option, which must be one of + ``"bool"``, ``"string"``, or ``"int"``. + + ``default`` + A string with the option's default value. Note that this must + always be a string, even for non-string types. For booleans, + use ``"0"`` for False and ``"1"`` for True. For integers, give + the value as a string ``"42"``. + + ``description`` + A string with a description of the option semantics. + + This method can be overridden by derived classes. The implementation + must not call the parent class' implementation. The default + implementation returns an empty list. + + .. _Plugin.pluginVersion: + + **pluginVersion** (self) + + Returns an integer with a version number for the plugin. Plugins + should increase their version number with any significant change. + + This method must be overridden by derived classes. The implementation + must not call the parent class' implementation. + + .. _Plugin.prefix: + + **prefix** (self) + + Returns a string with a prefix for the plugin's option and + command names (e.g., "myplugin"). + + This method can be overridden by derived classes. The implementation + must not call the parent class' implementation. 
The default + implementation returns a lower-cased version of *name()*. + +.. _Node: + +Class ``Node`` +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +class **Node** + Class representing one node of the BroControl-maintained setup. In + standalone mode, there's always exactly one node of type ``standalone``. In + a cluster setup, there is exactly one of type ``manager``, one or + more of type ``proxy``, and zero or more of type ``worker``. + + In addition to the methods described below, a ``Node`` object has a number + of keys with values that are set via ``node.cfg`` and can be accessed + directly via corresponding Python attributes (e.g., ``node.name``): + + ``name`` (string) + The name of the node, which corresponds to the ``[<name>]`` + section in ``node.cfg``. + + ``type`` (string) + The type of the node, which will be one of ``standalone``, + ``manager``, ``proxy``, and ``worker``. + + ``host`` (string) + The hostname of the system the node is running on. + + ``interface`` (string) + The network interface for Bro to use; empty if not set. + + ``lb_procs`` (integer) + The number of clustered Bro workers you'd like to start up. + + ``lb_method`` (string) + The load balancing method to distribute packets to all of the + processes (must be one of: ``pf_ring``, ``myricom``, or + ``interfaces``). + + ``lb_interfaces`` (string) + If the load balancing method is ``interfaces``, then this is + a comma-separated list of network interface names to use. + + ``aux_scripts`` (string) + Any node-specific Bro script configured for this node. + + ``zone_id`` (string) + If BroControl is managing a cluster comprised of nodes + using non-global IPv6 addresses, then this configures the + RFC 4007 ``zone_id`` string that the node associates with + the common zone that all cluster nodes are a part of. This + identifier may differ between nodes. + + Any attribute that is not defined in ``node.cfg`` will be empty. 
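The mapping from ``node.cfg`` entries to Python attributes described above can be sketched as follows. This is an illustrative mock-up, not BroControl's actual implementation; the ``test`` plugin prefix and ``mykw`` key are hypothetical examples.

```python
# Illustrative sketch of how node.cfg keys surface as Node attributes.
# Not BroControl's actual code; names below are examples only.

class Node(object):
    def __init__(self, name, config, plugin_keys=()):
        self.name = name
        # Standard keys from the node's [<name>] section; anything
        # not defined in node.cfg stays empty.
        for key in ("type", "host", "interface", "aux_scripts", "zone_id"):
            setattr(self, key, config.get(key, ""))
        # Plugin keys are exposed as <prefix>_<key>, e.g. a "test"
        # plugin's "mykw" key becomes node.test_mykw (None if unset).
        for prefix, key in plugin_keys:
            setattr(self, "%s_%s" % (prefix, key),
                    config.get("%s.%s" % (prefix, key)))

worker = Node("worker-1",
              {"type": "worker", "host": "10.0.0.1", "test.mykw": "42"},
              plugin_keys=[("test", "mykw")])
```

With this sketch, ``worker.type`` is ``"worker"``, ``worker.interface`` is empty (unset in the config), and the plugin-defined key is reachable as ``worker.test_mykw``.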
+ + In addition, plugins can override `Plugin.nodeKeys`_ to define their own + node keys, which can then be likewise set in ``node.cfg``. The key names + will be prepended with the plugin's `Plugin.prefix`_ (e.g., for the plugin + ``test``, the node key ``foo`` is set by adding ``test.foo=value`` to + ``node.cfg``). + + .. _Node.cwd: + + **cwd** (self) + + Returns a string with the node's working directory. + + .. _Node.describe: + + **describe** (self) + + Returns an extended string representation of the node including all + its keys with values. + + .. _Node.getPID: + + **getPID** (self) + + Returns the process ID of the node's Bro process if running, and + None otherwise. + + .. _Node.getPort: + + **getPort** (self) + + Returns an integer with the port that this node's communication + system is listening on for incoming connections, or -1 if no such port + has been set yet. + + .. _Node.hasCrashed: + + **hasCrashed** (self) + + Returns True if the node's Bro process has exited abnormally. + + +Miscellaneous +------------- + +Mails +~~~~~ + +*BroControl* sends four types of mails to the address given in +``MailTo``: + +1. When logs are rotated (by default once a day), a list of all + alarms during the last rotation interval is sent. This can be + disabled by setting ``MailAlarms=0``. + +2. When the ``cron`` command notices that a node has crashed, it + restarts it and sends a notification. It may also send a more + detailed crash report containing information about the crash. + +3. NOTICEs with a notice action of ``EMAIL``. + +4. If `trace-summary `_ + is installed, a traffic summary is sent each rotation interval. + +Performance Analysis +~~~~~~~~~~~~~~~~~~~~ + +*TODO*: ``broctl cron`` logs a number of statistics, which can be +analyzed/plotted for understanding the cluster's run-time behavior. + +Questions and Answers +--------------------- + +*Can I use an NFS-mounted partition as the cluster's base directory to avoid the ``rsync``'ing?* + Yes. 
BroBase_ can be on an NFS partition. + Configure and install the shell as usual with + ``--prefix=<base>``. Then add ``HaveNFS=1`` and + ``SpoolDir=<spool>`` to ``broctl.cfg``, where ``<spool>`` is a + path on the local disks of the nodes; ``<spool>`` will be used for + all non-shared data (make sure that the parent directory exists + and is writable on all nodes!). Then run ``make install`` again. + Finally, you can remove ``<base>/spool`` (or link it to ``<spool>``). + In addition, you might want to keep the log files locally on the nodes + as well by setting LogDir_ to a non-NFS directory. (Only + the manager's logs will be kept permanently; the logs of + workers/proxies are discarded upon rotation.) + +*When I'm using the stand-alone mode, do I still need to have ``ssh`` and ``rsync`` installed and configured?* + No. In stand-alone mode all operations are performed directly on the local + file system. + +*What do I need to do when something in the Bro distribution changes?* + After pulling from the main Bro git repository, just re-run ``make + install`` inside your build directory. It will reinstall all the + files from the distribution that are not up-to-date. Then do + ``broctl install`` to make sure everything gets pushed out. + +*Can I change the naming scheme that BroControl uses for archived log files?* + Yes, set MakeArchiveName_ to a + script that outputs the desired destination file name for an + archived log file. The default script for that task is + ``<prefix>/share/broctl/scripts/make-archive-name``, which you + can use as a template for creating your own version. See + the beginning of that script for instructions. + +*Can BroControl manage a cluster of nodes over non-global IPv6 scope (e.g. 
link-local)?* + Yes, set ``ZoneID`` in ``broctl.cfg`` to the zone identifier + that the BroControl node uses to identify the scope zone + (the ``ifconfig`` command output is usually helpful; if it doesn't + show the zone identifier appended to the address with a '%' + character, then it may just be the interface name). Then in + ``node.cfg``, add a ``zone_id`` key to each node section + representing that particular node's zone identifier and set + the ``host`` key to the IPv6 address assigned to the node within + the scope zone. Most nodes probably have the same ``zone_id``, but + may not if their interface configuration differs. See RFC 4007 for + more information on IPv6 scoped addresses and zones. diff --git a/doc/components/btest/README.rst b/doc/components/btest/README.rst new file mode 100644 index 0000000000..03a859e735 --- /dev/null +++ b/doc/components/btest/README.rst @@ -0,0 +1,843 @@ +.. -*- mode: rst-mode -*- +.. +.. Version number is filled in automatically. +.. |version| replace:: 0.4-14 + +============================================ +BTest - A Simple Driver for Basic Unit Tests +============================================ + +.. rst-class:: opening + + ``btest`` is a simple framework for writing unit tests. Freely + borrowing some ideas from other packages, its main objective is to + provide an easy-to-use, straightforward driver for a suite of + shell-based tests. Each test consists of a set of command lines that + will be executed, and success is determined based on their exit + codes. ``btest`` comes with some additional tools that can be used + within such tests to compare output against a previously established + baseline. + +.. contents:: + +Download +======== + +You can find the latest BTest release for download at +http://www.bro.org/download. + +BTest's git repository is located at `git://git.bro.org/btest.git +`__. You can browse the repository +`here `__. + +This document describes BTest |version|. 
See the ``CHANGES`` +file for version history. + + +Installation +============ + +Installation is simple and standard:: + + tar xzvf btest-*.tar.gz + cd btest-* + python setup.py install + +This will install a few scripts: ``btest`` is the main driver program, +and there are a number of further helper scripts that we discuss below +(including ``btest-diff``, which is a tool for comparing output to a +previously established baseline). + +Writing a Simple Test +===================== + +In the most simple case, ``btest`` simply executes a set of command +lines, each of which must be prefixed with ``@TEST-EXEC:`` +:: + + > cat examples/t1 + @TEST-EXEC: echo "Foo" | grep -q Foo + @TEST-EXEC: test -d . + > btest examples/t1 + examples.t1 ... ok + +The test passes as both command lines return success. If one of them +didn't, that would be reported:: + + > cat examples/t2 + @TEST-EXEC: echo "Foo" | grep -q Foo + @TEST-EXEC: test -d DOESNOTEXIST + > btest examples/t2 + examples.t2 ... failed + +Usually you will just run all tests found in a directory:: + + > btest examples + examples.t1 ... ok + examples.t2 ... failed + 1 test failed + +Why do we need the ``@TEST-EXEC:`` prefixes? Because the file +containing the test can simultaneously act as *its input*. Let's +say we want to verify a shell script:: + + > cat examples/t3.sh + # @TEST-EXEC: sh %INPUT + ls /etc | grep -q passwd + > btest examples/t3.sh + examples.t3 ... ok + +Here, ``btest`` is executing (something similar to) ``sh +examples/t3.sh``, and then checks the return value as usual. The +example also shows that the ``@TEST-EXEC`` prefix can appear +anywhere, in particular inside the comment section of another +language. + +Now, let's say we want to check the output of a program, making sure +that it matches what we expect. For that, we first add a command +line to the test that produces the output we want to check, and then +run ``btest-diff`` to make sure it matches a previously recorded +baseline. 
``btest-diff`` is itself just a script that returns +success if the output is as expected, and failure otherwise. In the +following example, we use an awk script as a fancy way to print all +file names starting with a dot in the user's home directory. We +write that list into a file called ``dots`` and then check whether +its content matches what we know from last time:: + + > cat examples/t4.awk + # @TEST-EXEC: ls -a $HOME | awk -f %INPUT >dots + # @TEST-EXEC: btest-diff dots + /^\.+/ { print $1 } + +Note that each test gets its own little sandbox directory when run, +so by creating a file like ``dots``, you aren't cluttering up +anything. + +The first time we run this test, we need to record a baseline:: + + > btest -U examples/t4.awk + +Now, ``btest-diff`` has remembered what the ``dots`` file should +look like:: + + > btest examples/t4.awk + examples.t4 ... ok + > touch ~/.NEWDOTFILE + > btest examples/t4.awk + examples.t4 ... failed + 1 test failed + +If we want to see what exactly the unexpected change is that was +introduced to ``dots``, there's a *diff* mode for that:: + + > btest -d examples/t4.awk + examples.t4 ... failed + % 'btest-diff dots' failed unexpectedly (exit code 1) + % cat .diag + == File =============================== + [... current dots file ...] + == Diff =============================== + --- /Users/robin/work/binpacpp/btest/Baseline/examples.t4/dots + 2010-10-28 20:11:11.000000000 -0700 + +++ dots 2010-10-28 20:12:30.000000000 -0700 + @@ -4,6 +4,7 @@ + .CFUserTextEncoding + .DS_Store + .MacOSX + +.NEWDOTFILE + .Rhistory + .Trash + .Xauthority + ======================================= + + % cat .stderr + [... if any of the commands had printed something to stderr, that would follow here ...] + +Once we delete the new file, we are fine again:: + + > rm ~/.NEWDOTFILE + > btest -d examples/t4.awk + examples.t4 ... ok + +That's already the main functionality that the ``btest`` package +provides. 
In the following, we describe a number of further options +extending/modifying this basic approach. + +Reference +========= + +Command Line Usage +------------------ + +``btest`` must be started with a list of tests and/or directories +given on the command line. In the latter case, the default is to +recursively scan the directories and assume all files found to be +tests to perform. It is however possible to exclude certain files by +specifying a suitable `configuration file`_. + +``btest`` returns exit code 0 if all tests have successfully passed, +and 1 otherwise. + +``btest`` accepts the following options: + + -a ALTERNATIVE, --alternative=ALTERNATIVE + Activates an alternative_ configuration defined in the + configuration file. This option can be given multiple times to + run tests with several alternatives. If ``ALTERNATIVE`` is ``-``, + that refers to running with the standard setup, which can be used + to run tests both with and without alternatives by giving both. + + -b, --brief + Does not output *anything* for tests which pass. If all tests + pass, there will not be any output at all. + + -c CONFIG, --config=CONFIG + Specifies an alternative `configuration file`_ to use. If not + specified, the default is to use a file called ``btest.cfg`` + if found in the current directory. + + -d, --diagnostics + Reports diagnostics for all failed tests. The diagnostics + include the command line that failed, its output to standard + error, and potential additional information recorded by the + command line for diagnostic purposes (see `@TEST-EXEC`_ + below). In the case of ``btest-diff``, the latter is the + ``diff`` between baseline and actual output. + + -D, --diagnostics-all + Reports diagnostics for all tests, including those which pass. + + -f DIAGFILE, --file-diagnostics=DIAGFILE + Writes diagnostics for all failed tests into the given file. + If the file already exists, it will be overwritten. 
+ + -g GROUPS, --group=GROUPS + Runs only tests assigned to the given test groups, see + `@TEST-GROUP`_. Multiple groups can be given as a + comma-separated list. Specifying ``-`` as a group name selects + all tests that do not belong to any group. + + -j [THREADS], --jobs[=THREADS] + Runs up to the given number of tests in parallel. If no number + is given, BTest substitutes the number of available CPU cores + as reported by the OS. + + By default, BTest assumes that all tests can be executed + concurrently without further constraints. One can however + ensure serialization of subsets by assigning them to the same + serialization set, see `@TEST-SERIALIZE`_. + + -q, --quiet + Suppress information output other than about failed tests. + If all tests pass, there will not be any output at all. + + -r, --rerun + Runs only tests that failed last time. After each execution + (except when updating baselines), BTest generates a state file + that records the tests that have failed. Using this option on + the next run then reads that file back in and limits execution + to those tests found in there. + + -t, --tmp-keep + Does not delete any temporary files created for running the + tests (including their outputs). By default, the temporary + files for a test will be located in ``.tmp/<test>/``, where + ``<test>`` is the relative path of the test file with all slashes + replaced with dots and the file extension removed (e.g., the files + for ``example/t3.sh`` will be in ``.tmp/example.t3``). + + -U, --update-baseline + Records a new baseline for all ``btest-diff`` commands found + in any of the specified tests. To do this, all tests are run + as normal except that when ``btest-diff`` is executed, it + does not compute a diff but instead considers the given file + to be authoritative and records it as the version to compare + with in future runs. 
+ + -u, --update-interactive + Each time a ``btest-diff`` command fails in any tests that are + run, btest will stop and ask whether or not the user wants to + record a new baseline. + + -v, --verbose + Shows all test command lines as they are executed. + + -w, --wait + Interactively waits for ``<enter>`` after showing diagnostics + for a test. + + -x FILE, --xml=FILE + Records test results in JUnit XML format to the given file. + If the file exists already, it is overwritten. + +.. _configuration file: + +Configuration +------------- + +Specifics of ``btest``'s execution can be tuned with a configuration +file, which by default is ``btest.cfg`` if that's found in the +current directory. It can alternatively be specified with the +``--config`` command line option. The configuration file is +"INI-style", and an example comes with the distribution, see +``btest.cfg.example``. A configuration file has one main section, +``btest``, that defines most options; as well as an optional section +for defining `environment variables`_ and further optional sections +for defining alternatives_. + +Note that all paths specified in the configuration file are relative +to ``btest``'s *base directory*. The base directory is either the +one where the configuration file is located if such is given/found, +or the current working directory if not. When setting values for +configuration options, the absolute path to the base directory is +available by using the macro ``%(testbase)s`` (the weird syntax is +due to Python's ``ConfigParser`` module). + +Furthermore, all values can use standard "backtick-syntax" to +include the output of external commands (e.g., xyz=`\echo test\`). +Note that the backtick expansion is performed after any ``%(..)`` +have already been replaced (including within the backticks). + +Options +~~~~~~~ + +The following options can be set in the ``btest`` section of the +configuration file: + +``TestDirs`` + A space-separated list of directories to search for tests. 
If + defined, one doesn't need to specify any tests on the command + line. + +``TmpDir`` + A directory where to create temporary files when running tests. + By default, this is set to ``%(testbase)s/.tmp``. + +``BaselineDir`` + A directory where to store the baseline files for ``btest-diff``. + By default, this is set to ``%(testbase)s/Baseline``. + +``IgnoreDirs`` + A space-separated list of relative directory names to ignore + when scanning test directories recursively. Default is empty. + +``IgnoreFiles`` + A space-separated list of filename globs matching files to + ignore when scanning given test directories recursively. + Default is empty. + +``StateFile`` + The name of the state file to record the names of failing tests. Default is + ``.btest.failed.dat``. + +``Finalizer`` + An executable that will be executed each time any test has + successfully run. It runs in the same directory as the test itself + and receives the name of the test as its parameter. The return + value indicates whether the test should indeed be considered + successful. By default, there's no finalizer set. + +.. _environment variables: + +Environment Variables +~~~~~~~~~~~~~~~~~~~~~ + +A special section ``environment`` defines environment variables that +will be propagated to all tests:: + + [environment] + CFLAGS=-O3 + PATH=%(testbase)s/bin:%(default_path)s + +Note how ``PATH`` can be adjusted to include local scripts: the +example above prefixes it with a local ``bin/`` directory inside the +base directory, using the predefined ``default_path`` macro to refer +to the ``PATH`` as it is set by default. + +Furthermore, by setting ``PATH`` to include the ``btest`` +distribution directory, one could skip the installation of the +``btest`` package. + +.. _alternative: + +Alternatives +~~~~~~~~~~~~ + +BTest can run a set of tests with different settings than it would +normally use by specifying an *alternative* configuration. 
Currently, +three things can be adjusted: + + - Further environment variables can be set that will then be + available to all the commands that a test executes. + + - *Filters* can modify an input file before a test uses it. + + - *Substitutions* can modify command lines executed as part of a + test. + +We discuss the three separately in the following. All of them are +defined by adding sections ``[<type>-<name>]`` where ``<type>`` +corresponds to the type of adjustment being made and ``<name>`` is the +name of the alternative. Once at least one section is defined for a +name, that alternative can be enabled by BTest's ``--alternative`` +flag. + +Environment Variables +^^^^^^^^^^^^^^^^^^^^^ + +An alternative can add further environment variables by defining an +``[environment-<name>]`` section:: + + [environment-myalternative] + CFLAGS=-O3 + +Running ``btest`` with ``--alternative=myalternative`` will now make +the ``CFLAGS`` environment variable available to all commands +executed. + +.. _filters: + +Filters +^^^^^^^ + +Filters are a transparent way to adapt the input to a specific test +command before it is executed. A filter is defined by adding a section +``[filter-<name>]`` to the configuration file. This section must have +exactly one entry, and the name of that entry is interpreted as the +name of a command whose input is to be filtered. The value of that +entry is the name of a filter script that will be run with two +arguments representing input and output files, respectively. Example:: + + [filter-myalternative] + cat=%(testbase)s/bin/filter-cat + +Once the filter is activated by running ``btest`` with +``--alternative=myalternative``, every time a ``@TEST-EXEC: cat +%INPUT`` is found, ``btest`` will first execute (something similar to) +``%(testbase)s/bin/filter-cat %INPUT out.tmp``, and then subsequently +``cat out.tmp`` (i.e., the original command but with the filtered +output). In the simplest case, the filter could be a no-op in the +form ``cp $1 $2``. + +.. 
note:: + There are a few limitations to the filter concept currently: + + * Filters are *always* fed with ``%INPUT`` as their first + argument. We should add a way to filter other files as well. + + * Filtered commands are only recognized if they are directly + starting the command line. For example, ``@TEST-EXEC: ls | cat + >output`` would not trigger the example filter above. + + * Filters are only executed for ``@TEST-EXEC``, not for + ``@TEST-EXEC-FAIL``. + +.. _substitution: + +Substitutions +^^^^^^^^^^^^^^ + +Substitutions are similar to filters, yet they do not adapt the input +but the command line being executed. A substitution is defined by +adding a section ``[substitution-<name>]`` to the configuration file. +For each entry in this section, the entry's name specifies the +command that is to be replaced with something else given as its value. +Example:: + + [substitution-myalternative] + gcc=gcc -O2 + +Once the substitution is activated by running ``btest`` with +``--alternative=myalternative``, every time a ``@TEST-EXEC`` executes +``gcc``, that is replaced with ``gcc -O2``. The replacement is simple +string substitution so it works not only with commands but anything +found on the command line; it however only replaces full words, not +subparts of words. + +Writing Tests +------------- + +``btest`` scans a test file for lines containing keywords that +trigger certain functionality. Currently, the following keywords are +supported: + +.. _@TEST-EXEC: + +``@TEST-EXEC: <cmdline>`` + Executes the given command line and aborts the test if it + returns an error code other than zero. The ``<cmdline>`` is + passed to the shell and thus can be a pipeline, use redirection, + and any environment variables specified in ``<cmdline>`` will be + expanded, etc. + + When running a test, the current working directory for all + command lines will be set to a temporary sandbox (and will be + deleted later). 
+
+    There are two macros that can be used in ``<cmdline>``:
+    ``%INPUT`` will be replaced with the full pathname of the file defining
+    the test; and ``%DIR`` will be replaced with the directory where
+    the test file is located. The latter can be used to reference
+    further files also located there.
+
+    In addition to environment variables defined in the
+    configuration file, there are further ones that are passed into
+    the commands:
+
+    ``TEST_DIAGNOSTICS``
+        A file where further diagnostic information can be saved
+        in case a command fails. ``--diagnostics`` will show
+        this file. (This is also where ``btest-diff`` stores its
+        diff.)
+
+    ``TEST_MODE``
+        This is normally set to ``TEST``, but will be ``UPDATE``
+        if ``btest`` is run with ``--update-baseline``, or
+        ``UPDATE_INTERACTIVE`` if run with ``--update-interactive``.
+
+    ``TEST_BASELINE``
+        The name of a directory where the command can save permanent
+        information across ``btest`` runs. (This is where
+        ``btest-diff`` stores its baseline in ``UPDATE`` mode.)
+
+    ``TEST_NAME``
+        The name of the currently executing test.
+
+    ``TEST_VERBOSE``
+        The path of a file where the test can record further
+        information about its execution that will be included with
+        btest's ``--verbose`` output. This is for further tracking
+        the execution of commands and should generally generate
+        output that follows a line-based structure.
+
+    .. note::
+
+        If a command returns the special exit code 100, the test is
+        considered failed; however, subsequent test commands are still
+        run. ``btest-diff`` uses this special exit code to indicate that
+        no baseline has yet been established.
+
+        If a command returns the special exit code 200, the test is
+        considered failed and all further test executions are aborted.
+
+
+``@TEST-EXEC-FAIL: <cmdline>``
+    Like ``@TEST-EXEC``, except that this expects the command to
+    *fail*, i.e., the test is aborted when the return code is zero.
+
+``@TEST-REQUIRES: <cmdline>``
+    Defines a condition that must be met for the test to be executed.
+    The given command line will be run before any of the actual test
+    commands, and it must return success for the test to continue. If
+    it does not return success, the rest of the test will be skipped,
+    but doing so will not be considered a failure of the test. This makes
+    it possible to write conditional tests that may not always make sense
+    to run, depending on whether external constraints are satisfied or not
+    (say, whether a particular library is available). Multiple requirements
+    may be specified and then all must be met for the test to continue.
+
+``@TEST-ALTERNATIVE: <alternative>``
+    Runs this test only for the given
+    alternative (see alternative_). If ``<alternative>`` is
+    ``default``, the test executes when BTest runs with no alternative
+    given (which is the default anyway).
+
+``@TEST-NOT-ALTERNATIVE: <alternative>``
+    Ignores this test for the
+    given alternative (see alternative_). If ``<alternative>`` is
+    ``default``, the test is ignored if BTest runs with no alternative
+    given.
+
+``@TEST-COPY-FILE: <file>``
+    Copy the given file into the test's directory before the test is
+    run. If ``<file>`` is a relative path, it's interpreted relative
+    to the BTest's base directory. Environment variables in ``<file>``
+    will be replaced if enclosed in ``${..}``. This command can be
+    given multiple times.
+
+``@TEST-START-NEXT``
+    This is a short-cut for defining multiple test inputs in the
+    same file, all executing with the same command lines. When
+    ``@TEST-START-NEXT`` is encountered, the test file is initially
+    considered to end at that point, and all ``@TEST-EXEC-*`` are
+    run with an ``%INPUT`` truncated accordingly. Afterwards, a
+    *new* ``%INPUT`` is created with everything *following* the
+    ``@TEST-START-NEXT`` marker, and the *same* commands are run
+    again (further ``@TEST-EXEC-*`` will be ignored).
The effect is
+    that a single file can actually define two tests, and the
+    ``btest`` output will enumerate them::
+
+        > cat examples/t5.sh
+        # @TEST-EXEC: cat %INPUT | wc -c >output
+        # @TEST-EXEC: btest-diff output
+
+        This is the first test input in this file.
+
+        # @TEST-START-NEXT
+
+        ... and the second.
+
+        > ./btest -D examples/t5.sh
+        examples.t5 ... ok
+          % cat .diag
+          == File ===============================
+          119
+          [...]
+
+        examples.t5-2 ... ok
+          % cat .diag
+          == File ===============================
+          22
+          [...]
+
+    Multiple ``@TEST-START-NEXT`` can be used to create more than
+    two tests per file.
+
+``@TEST-START-FILE <file>``
+    This is used to include an additional input file for a test
+    right inside the test file. All lines following the keyword will
+    be written into the given file (and removed from the test's
+    ``%INPUT``) until a terminating ``@TEST-END-FILE`` is found.
+    Example::
+
+        > cat examples/t6.sh
+        # @TEST-EXEC: awk -f %INPUT <foo.dat >output
+        # @TEST-EXEC: btest-diff output
+
+        { lines += 1; }
+        END { print lines; }
+
+        @TEST-START-FILE foo.dat
+        1
+        2
+        3
+        @TEST-END-FILE
+
+        > btest -D examples/t6.sh
+        examples.t6 ... ok
+          % cat .diag
+          == File ===============================
+          3
+
+    Multiple such files can be defined within a single test.
+
+    Note that this is only one way to use further input files.
+    Another is to store a file in the same directory as the test
+    itself, making sure it's ignored via ``IgnoreFiles``, and then
+    refer to it via ``%DIR/<name>``.
+
+.. _@TEST-GROUP:
+
+``@TEST-GROUP: <group>``
+    Assigns the test to a group of name ``<group>``. By using option
+    ``-g`` one can limit execution to all tests that belong to a given
+    group (or a set of groups).
+
+.. _@TEST-SERIALIZE:
+
+``@TEST-SERIALIZE: <set>``
+    When using option ``-j`` to parallelize execution, all tests that
+    specify the same serialization set are guaranteed to run
+    sequentially. ``<set>`` is an arbitrary user-chosen string.
+
+
+Canonifying Diffs
+=================
+
+``btest-diff`` has the capability to filter its input through an
+additional script before it compares the current version with the
+baseline. This can be useful if certain elements in an output are
+*expected* to change (e.g., timestamps). The filter can then
+remove/replace these with something consistent. To enable such
+canonification, set the environment variable
+``TEST_DIFF_CANONIFIER`` to a script reading the original version
+from stdin and writing the canonified version to stdout. Note that
+both baseline and current output are passed through the filter
+before their differences are computed.
+
+Running Processes in the Background
+===================================
+
+Sometimes processes need to be spawned in the background for a test,
+in particular if multiple processes need to cooperate in some fashion.
+``btest`` comes with two helper scripts to make life easier in such a
+situation:
+
+``btest-bg-run <tag> <cmdline>``
+    This is a script that runs ``<cmdline>`` in the background, i.e.,
+    it's like using ``cmdline &`` in a shell script. Test execution
+    continues immediately with the next command. Note that the spawned
+    command is *not* run in the current directory, but instead in a
+    newly created sub-directory called ``<tag>``. This allows
+    spawning multiple instances of the same process without needing to
+    worry about conflicting outputs. If you want to access a command's
+    output later, like with ``btest-diff``, use ``<tag>/foo.log`` to
+    access it.
+
+``btest-bg-wait [-k] <timeout>``
+    This script waits for all processes previously spawned via
+    ``btest-bg-run`` to finish. If any of them exits with a non-zero
+    return code, ``btest-bg-wait`` does so as well, indicating a
+    failed test. ``<timeout>`` is mandatory and gives the maximum
+    number of seconds to wait for any of the processes to terminate.
+    If any process hasn't done so when the timeout expires, it will be
+    killed and the test is considered failed as long as ``-k``
+    is not given.
If ``-k`` is given, pending processes are still
+    killed but the test continues normally, i.e., non-termination is
+    not considered a failure in this case. This script also collects
+    the processes' stdout and stderr outputs for diagnostics output.
+
+Integration with Sphinx
+=======================
+
+``btest`` comes with a new directive for the documentation framework
+`Sphinx <http://sphinx-doc.org>`_. The directive allows writing a
+test directly inside a Sphinx document and then including output
+from the test's commands in the generated documentation. The same
+tests can also be run externally, which will catch any changes to the
+included content. The following walks through setting this up.
+
+Configuration
+-------------
+
+First, you need to tell Sphinx a base directory for the ``btest``
+configuration as well as a directory in there where to store tests
+it extracts from the Sphinx documentation. Typically, you'd just
+create a new subdirectory ``tests`` in the Sphinx project for the
+``btest`` setup and then store the tests in there in, e.g.,
+``doc/``::
+
+    cd <sphinx-root>
+    mkdir tests
+    mkdir tests/doc
+
+Then add the following to your Sphinx ``conf.py``::
+
+    extensions += ["btest-sphinx"]
+    btest_base="tests"  # Relative to Sphinx-root.
+    btest_tests="doc"   # Relative to btest_base.
+
+Next, add a finalizer to ``btest.cfg``::
+
+    [btest]
+    ...
+    Finalizer=btest-diff-rst
+
+Finally, create a ``btest.cfg`` in ``tests/`` as usual and add
+``doc/`` to the ``TestDirs`` option.
+
+Including a Test into a Sphinx Document
+---------------------------------------
+
+The ``btest`` extension provides a new directive to include a test
+inside a Sphinx document::
+
+    .. btest:: <name>
+
+        <test content>
+
+Here, ``<name>`` is a custom name for the test; it will be
+stored in ``btest_tests`` under that name. ``<test content>`` is just
+a standard test as you would normally put into one of the
+``TestDirs``. Example::
+
+    ..
btest:: just-a-test
+
+        @TEST-EXEC: expr 2 + 2
+
+When you now run Sphinx, it will (1) store the test content into
+``tests/doc/just-a-test`` (assuming the above path layout), and (2)
+execute the test by running ``btest`` on it. You can then run
+``btest`` manually in ``tests/`` as well and it will execute the test
+just as it would in a standard setup. If a test fails when Sphinx runs
+it, there will be a corresponding error, and the diagnostic output will
+be included in the document.
+
+By default, nothing else will be included in the generated
+documentation, i.e., the above test will just turn into an empty text
+block. However, ``btest`` comes with a set of scripts that you can use
+to specify content to be included. As a simple example,
+``btest-rst-cmd <cmdline>`` will execute a command and (if it
+succeeds) include both the command line and the standard output in
+the documentation. Example::
+
+    .. btest:: another-test
+
+        @TEST-EXEC: btest-rst-cmd echo Hello, world!
+
+When running Sphinx, this will render as:
+
+.. code::
+
+    # echo Hello, world!
+    Hello, world!
+
+
+When running ``btest`` manually in ``tests/``, the ``Finalizer`` we
+added to ``btest.cfg`` (see above) compares the generated reST code
+with a previously established baseline, just like ``btest-diff`` does
+with files. To establish the initial baseline, run ``btest -u``, like
+you would with ``btest-diff``.
+
+Scripts
+-------
+
+The following Sphinx support scripts come with ``btest``:
+
+``btest-rst-cmd [options] <cmdline>``
+
+    By default, this executes ``<cmdline>`` and includes both the
+    command line itself and its standard output in the generated
+    documentation. See above for an example.
+
+    This script provides the following options:
+
+    -c ALTERNATIVE_CMDLINE
+        Show ``ALTERNATIVE_CMDLINE`` in the generated
+        documentation instead of the one actually executed. (It
+        still runs the ``<cmdline>`` given outside the option.)
+
+    -d
+        Do not actually execute ``<cmdline>``; just format it for
+        the generated documentation and include no further output.
+
+    -f FILTER_CMD
+        Pipe the command line's output through ``FILTER_CMD``
+        before including. If ``-r`` is given, it filters the
+        file's content instead of stdout.
+
+    -o
+        Do not include the executed command in the generated
+        documentation, just its output.
+
+    -r FILE
+        Insert ``FILE`` into the output instead of stdout.
+
+
+``btest-rst-include <file>``
+
+    Includes ``<file>`` inside a code block.
+
+``btest-rst-pipe <cmdline>``
+
+    Executes ``<cmdline>`` and includes its standard output inside a
+    code block. Note that this script does not include the command
+    line itself in the code block, just the output.
+
+.. note::
+
+    All these scripts can be run directly from the command line to show
+    the reST code they generate.
+
+.. note::
+
+    ``btest-rst-cmd`` can do everything the other scripts provide if
+    you give it the right options. In fact, the other scripts are
+    provided just for convenience and leverage ``btest-rst-cmd``
+    internally.
+
+License
+=======
+
+btest is open-source under a BSD license.
+
diff --git a/doc/components/capstats/README.rst b/doc/components/capstats/README.rst
new file mode 100644
index 0000000000..1ddc07c51c
--- /dev/null
+++ b/doc/components/capstats/README.rst
@@ -0,0 +1,107 @@
+.. -*- mode: rst-mode -*-
+..
+.. Version number is filled in automatically.
+.. |version| replace:: 0.18
+
+===============================================
+capstats - A tool to get some NIC statistics.
+===============================================
+
+.. rst-class:: opening
+
+    capstats is a small tool to collect statistics on the
+    current load of a network interface, using either `libpcap
+    <http://www.tcpdump.org>`_ or the native interface for `Endace's
+    <http://www.endace.com>`_ DAG cards. It reports statistics per time interval
+    and/or for the tool's total run-time.
+
+Download
+--------
+
+You can find the latest capstats release for download at
+http://www.bro.org/download.
+ +Capstats's git repository is located at `git://git.bro.org/capstats.git +`__. You can browse the repository +`here `__. + +This document describes capstats |version|. See the ``CHANGES`` +file for version history. + + +Output +------ + +Here's an example output with output in one-second intervals until +``CTRL-C`` is hit: + +.. console:: + + >capstats -i nve0 -I 1 + 1186620936.890567 pkts=12747 kpps=12.6 kbytes=10807 mbps=87.5 nic_pkts=12822 nic_drops=0 u=960 t=11705 i=58 o=24 nonip=0 + 1186620937.901490 pkts=13558 kpps=13.4 kbytes=11329 mbps=91.8 nic_pkts=13613 nic_drops=0 u=1795 t=24339 i=119 o=52 nonip=0 + 1186620938.912399 pkts=14771 kpps=14.6 kbytes=13659 mbps=110.7 nic_pkts=14781 nic_drops=0 u=2626 t=38154 i=185 o=111 nonip=0 + 1186620939.012446 pkts=1332 kpps=13.3 kbytes=1129 mbps=92.6 nic_pkts=1367 nic_drops=0 u=2715 t=39387 i=194 o=112 nonip=0 + === Total + 1186620939.012483 pkts=42408 kpps=13.5 kbytes=36925 mbps=96.5 nic_pkts=1 nic_drops=0 u=2715 t=39387 i=194 o=112 nonip=0 + +Each line starts with a timestamp and the other fields are: + + :pkts: + Absolute number of packets seen by ``capstats`` during interval. + + :kpps: + Number of packets per second. + + :kbytes: + Absolute number of KBytes during interval. + + :mbps: + Mbits/sec. + + :nic_pkts: + Number of packets as reported by ``libpcap``'s ``pcap_stats()`` (may not match _pkts_) + + :nic_drops: + Number of packet drops as reported by ``libpcap``'s ``pcap_stats()``. + + :u: + Number of UDP packets. + + :t: + Number of TCP packets. + + :i: + Number of ICMP packets. + + :nonip: + Number of non-IP packets. 
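The per-interval rates follow directly from the raw counters. The following Python sketch is illustrative only; the function name and rounding are assumptions for the example, not capstats source code:

```python
def interval_stats(pkts: int, nbytes: int, interval: float) -> dict:
    """Derive capstats-style rates from one interval's raw counters.

    pkts     -- packets seen during the interval
    nbytes   -- bytes seen during the interval
    interval -- interval length in seconds
    """
    return {
        "pkts": pkts,
        # kpps: thousands of packets per second
        "kpps": round(pkts / interval / 1e3, 1),
        # kbytes: total KBytes during the interval
        "kbytes": nbytes // 1024,
        # mbps: megabits per second (8 bits per byte)
        "mbps": round(nbytes * 8 / interval / 1e6, 1),
    }

# Roughly reproduces the shape of a capstats output line:
print(interval_stats(pkts=12747, nbytes=10807 * 1024, interval=1.0))
```

Note that the sample output above uses the *actual* elapsed interval (slightly over one second), which is why the printed rates differ slightly from a naive per-second division.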
+ +Options +------- + +A list of all options:: + + capstats [Options] -i interface + + -i| --interface Listen on interface + -d| --dag Use native DAG API + -f| --filter BPF filter + -I| --interval Stats logging interval + -l| --syslog Use syslog rather than print to stderr + -n| --number Stop after outputting intervals + -N| --select Use select() for live pcap (for testing only) + -p| --payload Verifies that packets' payloads consist + entirely of bytes of the given value. + -q| --quiet Suppress output, exit code indicates >= count + packets received. + -S| --size Verify packets to have given + -s| --snaplen Use pcap snaplen + -v| --version Print version and exit + -w| --write Write packets to file + +Installation +------------ + +``capstats`` has been tested on Linux, FreeBSD, and MacOS. Please see +the ``INSTALL`` file for installation instructions. diff --git a/doc/components/index.rst b/doc/components/index.rst new file mode 100644 index 0000000000..8ffbdea1f7 --- /dev/null +++ b/doc/components/index.rst @@ -0,0 +1,28 @@ + +===================== +Additional Components +===================== + +The following are snapshots of documentation for components that come +with this version of Bro (|version|). Since they can also be used +independently, see the `download page +`_ for documentation of any +current, independent component releases. + +.. toctree:: + :maxdepth: 1 + + BinPAC - A protocol parser generator + Broccoli - The Bro Client Communication Library (README) + Broccoli - User Manual + Broccoli Python Bindings + Broccoli Ruby Bindings + BroControl - Interactive Bro management shell + Bro-Aux - Small auxiliary tools for Bro + BTest - A unit testing framework + Capstats - Command-line packet statistic tool + PySubnetTree - Python module for CIDR lookups + trace-summary - Script for generating break-downs of network traffic + +The `Broccoli API Reference `_ may also be of +interest. 
diff --git a/doc/components/pysubnettree/README.rst b/doc/components/pysubnettree/README.rst
new file mode 100644
index 0000000000..be97eef9e3
--- /dev/null
+++ b/doc/components/pysubnettree/README.rst
@@ -0,0 +1,98 @@
+.. -*- mode: rst-mode -*-
+..
+.. Version number is filled in automatically.
+.. |version| replace:: 0.19-9
+
+===============================================
+PySubnetTree - A Python Module for CIDR Lookups
+===============================================
+
+.. rst-class:: opening
+
+    The PySubnetTree package provides a Python data structure
+    ``SubnetTree`` which maps subnets given in `CIDR
+    <http://en.wikipedia.org/wiki/Classless_Inter-Domain_Routing>`_ notation
+    (incl. corresponding IPv6 versions) to Python objects. Lookups are
+    performed by longest-prefix matching.
+
+
+Download
+--------
+
+You can find the latest PySubnetTree release for download at
+http://www.bro.org/download.
+
+PySubnetTree's git repository is located at `git://git.bro.org/pysubnettree.git
+<git://git.bro.org/pysubnettree.git>`__. You can browse the repository
+`here `__.
+
+This document describes PySubnetTree |version|. See the ``CHANGES``
+file for version history.
+
+
+Example
+-------
+
+A simple example which associates CIDR prefixes with strings::
+
+    >>> import SubnetTree
+    >>> t = SubnetTree.SubnetTree()
+    >>> t["10.1.0.0/16"] = "Network 1"
+    >>> t["10.1.42.0/24"] = "Network 1, Subnet 42"
+    >>> t["10.2.0.0/16"] = "Network 2"
+    >>> print t["10.1.42.1"]
+    Network 1, Subnet 42
+    >>> print t["10.1.43.1"]
+    Network 1
+    >>> print "10.1.42.1" in t
+    True
+    >>> print "10.1.43.1" in t
+    True
+    >>> print "10.20.1.1" in t
+    False
+    >>> print t["10.20.1.1"]
+    Traceback (most recent call last):
+      File "<stdin>", line 1, in <module>
+      File "SubnetTree.py", line 67, in __getitem__
+        def __getitem__(*args): return _SubnetTree.SubnetTree___getitem__(*args)
+    KeyError: '10.20.1.1'
+
+By default, CIDR prefixes and IP addresses are given as strings.
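The longest-prefix semantics shown above can be illustrated without the C extension, using only the Python standard library. This sketch is not part of PySubnetTree (it does a linear scan rather than using a Patricia tree, so it is illustrative only, not efficient):

```python
import ipaddress

# Toy mapping mimicking the SubnetTree example above.
prefixes = {
    ipaddress.ip_network("10.1.0.0/16"): "Network 1",
    ipaddress.ip_network("10.1.42.0/24"): "Network 1, Subnet 42",
    ipaddress.ip_network("10.2.0.0/16"): "Network 2",
}

def lookup(ip: str) -> str:
    """Return the value of the longest matching prefix, or raise KeyError."""
    addr = ipaddress.ip_address(ip)
    matches = [net for net in prefixes if addr in net]
    if not matches:
        raise KeyError(ip)
    # Longest prefix = the matching network with the largest prefix length.
    return prefixes[max(matches, key=lambda net: net.prefixlen)]

print(lookup("10.1.42.1"))  # the /24 wins over the /16: "Network 1, Subnet 42"
print(lookup("10.1.43.1"))  # only the /16 matches: "Network 1"
```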
+Alternatively, a ``SubnetTree`` object can be switched into *binary
+mode*, in which single addresses are passed in the form of packed
+binary strings as, e.g., returned by `socket.inet_aton
+<http://docs.python.org/library/socket.html#socket.inet_aton>`_::
+
+    >>> t.get_binary_lookup_mode()
+    False
+    >>> t.set_binary_lookup_mode(True)
+    >>> t.get_binary_lookup_mode()
+    True
+    >>> import socket
+    >>> print t[socket.inet_aton("10.1.42.1")]
+    Network 1, Subnet 42
+
+A SubnetTree also provides methods ``insert(prefix, object=None)`` for
+insertion of prefixes (``object`` can be skipped to use the tree like a set),
+and ``remove(prefix)`` for removing entries (``remove`` performs an *exact*
+match rather than longest-prefix).
+
+Internally, the CIDR prefixes of a ``SubnetTree`` are managed by a
+Patricia tree data structure, and lookups are therefore efficient
+even with a large number of prefixes.
+
+PySubnetTree comes with a BSD license.
+
+
+Prerequisites
+-------------
+
+This package requires Python 2.4 or newer.
+
+Installation
+------------
+
+Installation is pretty simple::
+
+    > python setup.py install
diff --git a/doc/components/trace-summary/README.rst b/doc/components/trace-summary/README.rst
new file mode 100644
index 0000000000..a47381fc37
--- /dev/null
+++ b/doc/components/trace-summary/README.rst
@@ -0,0 +1,154 @@
+.. -*- mode: rst-mode -*-
+..
+.. Version number is filled in automatically.
+.. |version| replace:: 0.8
+
+====================================================
+trace-summary - Generating network traffic summaries
+====================================================
+
+.. rst-class:: opening
+
+    ``trace-summary`` is a Python script that generates break-downs of
+    network traffic, including lists of the top hosts, protocols,
+    ports, etc. Optionally, it can generate output separately for
+    incoming vs. outgoing traffic, per subnet, and per time-interval.
+
+Download
+--------
+
+You can find the latest trace-summary release for download at
+http://www.bro.org/download.
+ +trace-summary's git repository is located at `git://git.bro.org/trace-summary.git +`__. You can browse the repository +`here `__. + +This document describes trace-summary |version|. See the ``CHANGES`` +file for version history. + + +Overview +-------- + +The ``trace-summary`` script reads both packet traces in `libpcap +`_ format and connection logs produced by the +`Bro `_ network intrusion detection system +(for the latter, it supports both 1.x and 2.x output formats). + +Here are two example outputs in the most basic form (note that IP +addresses are 'anonymized'). The first is from a packet trace and the +second from a Bro connection log:: + + + >== Total === 2005-01-06-14-23-33 - 2005-01-06-15-23-43 + - Bytes 918.3m - Payload 846.3m - Pkts 1.8m - Frags 0.9% - MBit/s 1.9 - + Ports | Sources | Destinations | Protocols | + 80 33.8% | 131.243.89.214 8.5% | 131.243.89.214 7.7% | 6 76.0% | + 22 16.7% | 128.3.2.102 6.2% | 128.3.2.102 5.4% | 17 23.3% | + 11001 12.4% | 204.116.120.26 4.8% | 131.243.89.4 4.8% | 1 0.5% | + 2049 10.7% | 128.3.161.32 3.6% | 131.243.88.227 3.6% | | + 1023 10.6% | 131.243.89.4 3.5% | 204.116.120.26 3.4% | | + 993 8.2% | 128.3.164.194 2.7% | 131.243.89.64 3.1% | | + 1049 8.1% | 128.3.164.15 2.4% | 128.3.164.229 2.9% | | + 524 6.6% | 128.55.82.146 2.4% | 131.243.89.155 2.5% | | + 33305 4.5% | 131.243.88.227 2.3% | 128.3.161.32 2.3% | | + 1085 3.7% | 131.243.89.155 2.3% | 128.55.82.146 2.1% | | + + + >== Total === 2005-01-06-14-23-33 - 2005-01-06-15-23-42 + - Connections 43.4k - Payload 398.4m - + Ports | Sources | Destinations | Services | Protocols | States | + 80 21.7% | 207.240.215.71 3.0% | 239.255.255.253 8.0% | other 51.0% | 17 55.8% | S0 46.2% | + 427 13.0% | 131.243.91.71 2.2% | 131.243.91.255 4.0% | http 21.7% | 6 36.4% | SF 30.1% | + 443 3.8% | 128.3.161.76 1.7% | 131.243.89.138 2.1% | i-echo 7.3% | 1 7.7% | OTH 7.8% | + 138 3.7% | 131.243.90.138 1.6% | 255.255.255.255 1.7% | https 3.8% | | RSTO 5.8% | + 515 2.4% | 131.243.88.159 
1.6% | 128.3.97.204 1.5% | nb-dgm 3.7% | | SHR 4.4% |
+    11001 2.3% | 131.243.88.202 1.4% | 131.243.88.107 1.1% | printer 2.4% | | REJ 3.0% |
+    53 1.9% | 131.243.89.250 1.4% | 117.72.94.10 1.1% | dns 1.9% | | S1 1.0% |
+    161 1.6% | 131.243.89.80 1.3% | 131.243.88.64 1.1% | snmp 1.6% | | RSTR 0.9% |
+    137 1.4% | 131.243.90.52 1.3% | 131.243.88.159 1.1% | nb-ns 1.4% | | SH 0.3% |
+    2222 1.1% | 128.3.161.252 1.2% | 131.243.91.92 1.1% | ntp 1.0% | | RSTRH 0.2% |
+
+
+Prerequisites
+-------------
+
+* This script requires Python 2.4 or newer.
+
+* The `pysubnettree
+  `_ Python
+  module.
+
+* Eddie Kohler's `ipsumdump `_
+  if using ``trace-summary`` with packet traces (versus Bro connection logs).
+
+Installation
+------------
+
+Simply copy the script into some directory which is in your ``PATH``.
+
+Usage
+-----
+
+The general usage is::
+
+    trace-summary [options] [input-file]
+
+By default, it assumes the ``input-file`` to be a ``libpcap`` trace
+file. If it is a Bro connection log, use ``-c``. If ``input-file`` is
+not given, the script reads from stdin. It writes its output to
+stdout.
+
+Options
+~~~~~~~
+
+The most important options are summarized
+below. Run ``trace-summary --help`` to see the full list including
+some more esoteric ones.
+
+:-c:
+    Input is a Bro connection log instead of a ``libpcap`` trace
+    file.
+
+:-b:
+    Counts all percentages in bytes rather than number of
+    packets/connections.
+
+:-E <file>:
+    Gives a file which contains a list of networks to ignore for the
+    analysis. The file must contain one network per line, where each
+    network is of the CIDR form ``a.b.c.d/mask`` (including the
+    corresponding syntax for IPv6 prefixes, e.g., ``1:2:3:4::/64``).
+    Empty lines and lines starting with a "#" are ignored.
+
+:-i <duration>:
+    Creates totals for each time interval of the given length
+    (default is seconds; add "``m``" for minutes and "``h``" for
+    hours). Use ``-v`` if you also want to see the breakdowns for
+    each interval.
+
+:-l <file>:
+    Generates separate summaries for incoming and outgoing traffic.
+    ``<file>`` is a file which contains a list of networks to be
+    considered local. Format as for ``-E``.
+
+:-n <n>:
+    Show the top ``<n>`` entries in each break-down. Default is 10.
+
+:-r:
+    Resolves hostnames in the output.
+
+:-s <factor>:
+    Gives the sample factor if the input has been sampled.
+
+:-S <factor>:
+    Samples the input with the given factor; less accurate but faster
+    and saves memory.
+
+:-m:
+    Skips memory-expensive statistics.
+
+:-v:
+    Generates full break-downs for each time interval. Requires
+    ``-i``.
diff --git a/doc/frameworks/index.rst b/doc/frameworks/index.rst
new file mode 100644
index 0000000000..63e87a6732
--- /dev/null
+++ b/doc/frameworks/index.rst
@@ -0,0 +1,16 @@
+
+==========
+Frameworks
+==========
+
+.. toctree::
+   :maxdepth: 1
+
+   notice
+   logging
+   input
+   intel
+   cluster
+   signatures
+   geoip
+
diff --git a/doc/frameworks/input.rst b/doc/frameworks/input.rst
new file mode 100644
index 0000000000..aca5091972
--- /dev/null
+++ b/doc/frameworks/input.rst
@@ -0,0 +1,408 @@
+
+===============
+Input Framework
+===============
+
+.. rst-class:: opening
+
+    Bro now features a flexible input framework that allows users
+    to import data into Bro. Data is either read into Bro tables or
+    converted to events which can then be handled by scripts.
+    This document gives an overview of how to use the input framework
+    with some examples. For more complex scenarios it is
+    worthwhile to take a look at the unit tests in
+    ``testing/btest/scripts/base/frameworks/input/``.
+
+.. contents::
+
+Reading Data into Tables
+========================
+
+Probably the most interesting use case of the input framework is
+reading data into a Bro table.
+
+By default, the input framework reads the data in the same format
+as it is written by the logging framework in Bro - a tab-separated
+ASCII file.
+
+We will show the ways to read files into Bro with a simple example.
+For this example we assume that we want to import data from a blacklist +that contains server IP addresses as well as the timestamp and the reason +for the block. + +An example input file could look like this: + +:: + + #fields ip timestamp reason + 192.168.17.1 1333252748 Malware host + 192.168.27.2 1330235733 Botnet server + 192.168.250.3 1333145108 Virus detected + +To read a file into a Bro table, two record types have to be defined. +One contains the types and names of the columns that should constitute the +table keys and the second contains the types and names of the columns that +should constitute the table values. + +In our case, we want to be able to lookup IPs. Hence, our key record +only contains the server IP. All other elements should be stored as +the table content. + +The two records are defined as: + +.. code:: bro + + type Idx: record { + ip: addr; + }; + + type Val: record { + timestamp: time; + reason: string; + }; + +Note that the names of the fields in the record definitions have to correspond +to the column names listed in the '#fields' line of the log file, in this +case 'ip', 'timestamp', and 'reason'. + +The log file is read into the table with a simple call of the ``add_table`` +function: + +.. code:: bro + + global blacklist: table[addr] of Val = table(); + + Input::add_table([$source="blacklist.file", $name="blacklist", $idx=Idx, $val=Val, $destination=blacklist]); + Input::remove("blacklist"); + +With these three lines we first create an empty table that should contain the +blacklist data and then instruct the input framework to open an input stream +named ``blacklist`` to read the data into the table. The third line removes the +input stream again, because we do not need it any more after the data has been +read. + +Because some data files can - potentially - be rather big, the input framework +works asynchronously. A new thread is created for each new input stream. 
+This thread opens the input data file, converts the data into a Bro format and +sends it back to the main Bro thread. + +Because of this, the data is not immediately accessible. Depending on the +size of the data source it might take from a few milliseconds up to a few +seconds until all data is present in the table. Please note that this means +that when Bro is running without an input source or on very short captured +files, it might terminate before the data is present in the system (because +Bro already handled all packets before the import thread finished). + +Subsequent calls to an input source are queued until the previous action has +been completed. Because of this, it is, for example, possible to call +``add_table`` and ``remove`` in two subsequent lines: the ``remove`` action +will remain queued until the first read has been completed. + +Once the input framework finishes reading from a data source, it fires +the ``end_of_data`` event. Once this event has been received all data +from the input file is available in the table. + +.. code:: bro + + event Input::end_of_data(name: string, source: string) { + # now all data is in the table + print blacklist; + } + +The table can also already be used while the data is still being read - it +just might not contain all lines in the input file when the event has not +yet fired. After it has been populated it can be used like any other Bro +table and blacklist entries can easily be tested: + +.. code:: bro + + if ( 192.168.18.12 in blacklist ) + # take action + + +Re-reading and streaming data +----------------------------- + +For many data sources, like for many blacklists, the source data is continually +changing. For these cases, the Bro input framework supports several ways to +deal with changing data files. + +The first, very basic method is an explicit refresh of an input stream. When +an input stream is open, the function ``force_update`` can be called. 
This
+will trigger a complete refresh of the table; any changed elements from the
+file will be updated. After the update is finished the ``end_of_data``
+event will be raised.
+
+In our example the call would look like:
+
+.. code:: bro
+
+    Input::force_update("blacklist");
+
+The input framework also supports two automatic refresh modes. The first mode
+continually checks if a file has been changed. If the file has been changed, it
+is re-read and the data in the Bro table is updated to reflect the current
+state. Each time a change has been detected and all the new data has been
+read into the table, the ``end_of_data`` event is raised.
+
+The second mode is a streaming mode. This mode assumes that the source data
+file is an append-only file to which new data is continually appended. Bro
+continually checks for new data at the end of the file and will add the new
+data to the table. If newer lines in the file have the same index as previous
+lines, they will overwrite the values in the output table. Because of the
+nature of streaming reads (data is continually added to the table),
+the ``end_of_data`` event is never raised when using streaming reads.
+
+The reading mode can be selected by setting the ``mode`` option of the
+``add_table`` call. Valid values are ``MANUAL`` (the default), ``REREAD``
+and ``STREAM``.
+
+Hence, when adding ``$mode=Input::REREAD`` to the previous example, the
+blacklist table will always reflect the state of the blacklist input file.
+
+.. code:: bro
+
+    Input::add_table([$source="blacklist.file", $name="blacklist", $idx=Idx, $val=Val, $destination=blacklist, $mode=Input::REREAD]);
+
+Receiving change events
+-----------------------
+
+When re-reading files, it might be interesting to know exactly which lines in
+the source files have changed.
+
+For this reason, the input framework can raise an event each time a data
+item is added to, removed from, or changed in a table.
+
+The event definition looks like this:
+
+.. code:: bro
+
+    event entry(description: Input::TableDescription, tpe: Input::Event, left: Idx, right: Val) {
+        # act on values
+    }
+
+The event has to be specified in ``$ev`` in the ``add_table`` call:
+
+.. code:: bro
+
+    Input::add_table([$source="blacklist.file", $name="blacklist", $idx=Idx, $val=Val, $destination=blacklist, $mode=Input::REREAD, $ev=entry]);
+
+The ``description`` field of the event contains the arguments that were
+originally supplied to the ``add_table`` call. Hence, the name of the stream
+can, for example, be accessed with ``description$name``. ``tpe`` is an enum
+containing the type of the change that occurred.
+
+If a line that was not previously present in the table has been added,
+then ``tpe`` will contain ``Input::EVENT_NEW``. In this case ``left`` contains
+the index of the added table entry and ``right`` contains the values of the
+added entry.
+
+If a table entry that was already present is altered during the re-reading or
+streaming read of a file, ``tpe`` will contain ``Input::EVENT_CHANGED``. In
+this case ``left`` contains the index of the changed table entry and ``right``
+contains the values of the entry before the change. The reason for this is
+that the table has already been updated when the event is raised, so the
+current value can be obtained simply by looking it up in the table. Hence
+it is possible to compare the new and the old values.
+
+If a table element is removed because it was no longer present during a
+re-read, then ``tpe`` will contain ``Input::EVENT_REMOVED``. In this case
+``left`` contains the index and ``right`` the values of the removed element.
+
+
+Filtering data during import
+----------------------------
+
+The input framework also allows a user to filter the data during the import.
+To this end, predicate functions are used. A predicate function is called
+before a new element is added, changed, or removed in a table.
The predicate
+can either accept or veto the change by returning true for an accepted
+change and false for a rejected change. Furthermore, it can alter the data
+before it is written to the table.
+
+The following example predicate will refuse to add entries to the table when
+they were generated more than a month ago. It will accept all changes and all
+removals of values that are already present in the table.
+
+.. code:: bro
+
+    Input::add_table([$source="blacklist.file", $name="blacklist", $idx=Idx, $val=Val, $destination=blacklist, $mode=Input::REREAD,
+        $pred(typ: Input::Event, left: Idx, right: Val) = {
+            if ( typ != Input::EVENT_NEW ) {
+                return T;
+            }
+            return ( ( current_time() - right$timestamp ) < (30 day) );
+        }]);
+
+To change elements while they are being imported, the predicate function can
+manipulate ``left`` and ``right``. Note that predicate functions are called
+before the change is committed to the table. Hence, when a table element is
+changed (``typ`` is ``Input::EVENT_CHANGED``), ``left`` and ``right``
+contain the new values, but the destination (``blacklist`` in our example)
+still contains the old values. This allows predicate functions to examine
+the changes between the old and the new version before deciding if they
+should be allowed.
+
+Different readers
+-----------------
+
+The input framework supports different kinds of readers for different kinds
+of source data files. The default reader reads ASCII files formatted in the
+Bro log file format (tab-separated values). In addition, Bro comes with two
+other readers. The ``RAW`` reader reads a file that is split by a specified
+record separator (usually newline). The contents are returned line-by-line as
+strings; it can, for example, be used to read configuration files and the
+like, and is probably only useful in event mode, not for reading data into
+tables.
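+
+As an illustration, a minimal sketch of using the ``RAW`` reader in event
+mode might look like the following (the file name, record type, and event
+name here are made up for the example):
+
+.. code:: bro
+
+    type ConfigLine: record {
+        s: string;
+    };
+
+    event config_line(description: Input::EventDescription, tpe: Input::Event, s: string) {
+        # each line of the file arrives as a single string
+        print s;
+    }
+
+    event bro_init() {
+        Input::add_event([$source="myconfig.file", $name="config",
+                          $reader=Input::READER_RAW, $fields=ConfigLine,
+                          $ev=config_line]);
+    }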
+
+Another included reader is the ``BENCHMARK`` reader, which is used
+to optimize the speed of the input framework. It can generate arbitrary
+amounts of semi-random data in all Bro data types supported by the input
+framework.
+
+In the future, the input framework will get support for new data sources
+such as, for example, different databases.
+
+Add_table options
+-----------------
+
+This section lists all possible options that can be used for the
+``add_table`` function and gives a short explanation of their use. Most of
+the options have already been discussed in the previous sections.
+
+The possible fields that can be set for a table stream are:
+
+    ``source``
+        A mandatory string identifying the source of the data.
+        For the ASCII reader this is the filename.
+
+    ``name``
+        A mandatory name for the filter that can later be used
+        to manipulate it further.
+
+    ``idx``
+        Record type that defines the index of the table.
+
+    ``val``
+        Record type that defines the values of the table.
+
+    ``reader``
+        The reader used for this stream. Default is ``READER_ASCII``.
+
+    ``mode``
+        The mode in which the stream is opened. Possible values are
+        ``MANUAL``, ``REREAD`` and ``STREAM``. Default is ``MANUAL``.
+        ``MANUAL`` means that the file is not updated after it has
+        been read. Changes to the file will not be reflected in the
+        data Bro knows. ``REREAD`` means that the whole file is read
+        again each time a change is found. This should be used for
+        files that are mapped to a table where individual lines can
+        change. ``STREAM`` means that the data from the file is
+        streamed. Events / table entries will be generated as new
+        data is appended to the file.
+
+    ``destination``
+        The destination table.
+
+    ``ev``
+        Optional event that is raised when values are added to,
+        changed in, or deleted from the table. Events are passed the
+        ``Input::TableDescription`` as the first argument, an
+        ``Input::Event`` enum as the second argument, the index record
+        as the third argument and the values as the fourth argument.
+
+    ``pred``
+        Optional predicate that can prevent entries from being added
+        to the table and events from being sent.
+
+    ``want_record``
+        Boolean value that defines whether the event wants to receive
+        the fields inside of a single record value, or individually
+        (default). This can be used if ``val`` is a record
+        containing only one type. In this case, if ``want_record`` is
+        set to false, the table will contain elements of the type
+        contained in ``val``.
+
+Reading Data to Events
+======================
+
+The second supported mode of the input framework is reading data into Bro
+events instead of into a table, using event streams.
+
+Event streams work very similarly to the table streams that were already
+discussed in much detail. To read the blacklist of the previous example
+into an event stream, the following Bro code could be used:
+
+.. code:: bro
+
+    type Val: record {
+        ip: addr;
+        timestamp: time;
+        reason: string;
+    };
+
+    event blacklistentry(description: Input::EventDescription, tpe: Input::Event, ip: addr, timestamp: time, reason: string) {
+        # work with event data
+    }
+
+    event bro_init() {
+        Input::add_event([$source="blacklist.file", $name="blacklist", $fields=Val, $ev=blacklistentry]);
+    }
+
+
+The main difference in the declaration of the event stream is that an event
+stream needs no separate index and value declarations -- instead, all source
+data types are provided in a single record definition.
+
+Apart from this, event streams work exactly the same as table streams and
+support most of the options that are also supported for table streams.
+
+The options that can be set when creating an event stream with
+``add_event`` are:
+
+    ``source``
+        A mandatory string identifying the source of the data.
+        For the ASCII reader this is the filename.
+
+    ``name``
+        A mandatory name for the stream that can later be used
+        to remove it.
+
+    ``fields``
+        Name of a record type containing the fields which should be
+        retrieved from the input stream.
+
+    ``ev``
+        The event which is fired after a line has been read from the
+        input source. The first argument that is passed to the event
+        is an ``Input::EventDescription`` structure, followed by an
+        ``Input::Event`` enum and then the data, either inside of a
+        record (if ``want_record`` is set) or as individual fields.
+        The ``Input::Event`` value indicates if the received line is
+        ``NEW``, has been ``CHANGED`` or ``DELETED``. Since the ASCII
+        reader cannot track this information for event streams, the
+        value is always ``NEW`` at the moment.
+
+    ``mode``
+        The mode in which the stream is opened. Possible values are
+        ``MANUAL``, ``REREAD`` and ``STREAM``. Default is ``MANUAL``.
+        ``MANUAL`` means that the file is not updated after it has
+        been read. Changes to the file will not be reflected in the
+        data Bro knows. ``REREAD`` means that the whole file is read
+        again each time a change is found. This should be used for
+        files that are mapped to a table where individual lines can
+        change. ``STREAM`` means that the data from the file is
+        streamed. Events / table entries will be generated as new
+        data is appended to the file.
+
+    ``reader``
+        The reader used for this stream. Default is ``READER_ASCII``.
+
+    ``want_record``
+        Boolean value that defines whether the event wants to receive
+        the fields inside of a single record value, or individually
+        (default). If this is set to true, the event will receive a
+        single record of the type provided in ``fields``.
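+
+As a sketch of the effect of ``want_record``, the blacklist event stream
+above could instead be declared to receive one record argument (the handler
+signature is the part that changes; the names reuse the earlier example):
+
+.. code:: bro
+
+    event blacklistentry(description: Input::EventDescription, tpe: Input::Event, r: Val) {
+        # all fields arrive inside a single record
+        print r$ip, r$reason;
+    }
+
+    event bro_init() {
+        Input::add_event([$source="blacklist.file", $name="blacklist",
+                          $fields=Val, $want_record=T, $ev=blacklistentry]);
+    }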
+
+
+
diff --git a/doc/intel.rst b/doc/frameworks/intel.rst
similarity index 98%
rename from doc/intel.rst
rename to doc/frameworks/intel.rst
index 390313461a..a8107905a9 100644
--- a/doc/intel.rst
+++ b/doc/frameworks/intel.rst
@@ -1,5 +1,7 @@
-Intel Framework
-===============
+
+======================
+Intelligence Framework
+======================
 
 Intro
 -----
diff --git a/doc/frameworks/logging-dataseries.rst b/doc/frameworks/logging-dataseries.rst
new file mode 100644
index 0000000000..139a13f813
--- /dev/null
+++ b/doc/frameworks/logging-dataseries.rst
@@ -0,0 +1,186 @@
+
+=============================
+Binary Output with DataSeries
+=============================
+
+.. rst-class:: opening
+
+    Bro's default ASCII log format is not exactly the most efficient
+    way for storing and searching large volumes of data. As an
+    alternative, Bro comes with experimental support for `DataSeries
+    `_
+    output, an efficient binary format for recording structured bulk
+    data. DataSeries is developed and maintained at HP Labs.
+
+.. contents::
+
+Installing DataSeries
+---------------------
+
+To use DataSeries, its libraries must be available at compile-time,
+along with the supporting *Lintel* package. Generally, both are
+distributed on `HP Labs' web site
+`_. Currently, however, you need
+to use recent development versions for both packages, which you can
+download from GitHub like this::
+
+    git clone http://github.com/dataseries/Lintel
+    git clone http://github.com/dataseries/DataSeries
+
+To build and install the two into ````, do::
+
+    ( cd Lintel && mkdir build && cd build && cmake -DCMAKE_INSTALL_PREFIX= .. && make && make install )
+    ( cd DataSeries && mkdir build && cd build && cmake -DCMAKE_INSTALL_PREFIX= .. && make && make install )
+
+Please refer to the packages' documentation for more information about
+the installation process.
In particular, there's more information on
+required and optional `dependencies for Lintel `_
+and `dependencies for DataSeries `_.
+For users on RedHat-style systems, you'll need the following::
+
+    yum install libxml2-devel boost-devel
+
+Compiling Bro with DataSeries Support
+-------------------------------------
+
+Once you have installed DataSeries, Bro's ``configure`` should pick it
+up automatically as long as it finds it in a standard system location.
+Alternatively, you can specify the DataSeries installation prefix
+manually with ``--with-dataseries=``. Keep an eye on
+``configure``'s summary output; if it looks like the following, Bro
+found DataSeries and will compile in the support::
+
+    # ./configure --with-dataseries=/usr/local
+    [...]
+    ====================| Bro Build Summary |=====================
+    [...]
+    DataSeries: true
+    [...]
+    ================================================================
+
+Activating DataSeries
+---------------------
+
+The most direct way to use DataSeries is to switch *all* log files over to
+the binary format. To do that, just add ``redef
+Log::default_writer=Log::WRITER_DATASERIES;`` to your ``local.bro``.
+For testing, you can also just pass that on the command line::
+
+    bro -r trace.pcap Log::default_writer=Log::WRITER_DATASERIES
+
+With that, Bro will now write all its output into DataSeries files
+``*.ds``. You can inspect these using DataSeries's set of command line
+tools, which its installation process installs into ``/bin``.
+For example, to convert a file back into an ASCII representation::
+
+    $ ds2txt conn.ds
+    [... We skip a bunch of metadata here ...]
+ ts uid id.orig_h id.orig_p id.resp_h id.resp_p proto service duration orig_bytes resp_bytes conn_state local_orig missed_bytes history orig_pkts orig_ip_bytes resp_pkts resp_ip_bytes + 1300475167.096535 CRCC5OdDlXe 141.142.220.202 5353 224.0.0.251 5353 udp dns 0.000000 0 0 S0 F 0 D 1 73 0 0 + 1300475167.097012 o7XBsfvo3U1 fe80::217:f2ff:fed7:cf65 5353 ff02::fb 5353 udp 0.000000 0 0 S0 F 0 D 1 199 0 0 + 1300475167.099816 pXPi1kPMgxb 141.142.220.50 5353 224.0.0.251 5353 udp 0.000000 0 0 S0 F 0 D 1 179 0 0 + 1300475168.853899 R7sOc16woCj 141.142.220.118 43927 141.142.2.2 53 udp dns 0.000435 38 89 SF F 0 Dd 1 66 1 117 + 1300475168.854378 Z6dfHVmt0X7 141.142.220.118 37676 141.142.2.2 53 udp dns 0.000420 52 99 SF F 0 Dd 1 80 1 127 + 1300475168.854837 k6T92WxgNAh 141.142.220.118 40526 141.142.2.2 53 udp dns 0.000392 38 183 SF F 0 Dd 1 66 1 211 + [...] + +(``--skip-all`` suppresses the metadata.) + +Note that the ASCII conversion is *not* equivalent to Bro's default +output format. + +You can also switch only individual files over to DataSeries by adding +code like this to your ``local.bro``: + +.. code:: bro + + event bro_init() + { + local f = Log::get_filter(Conn::LOG, "default"); # Get default filter for connection log. + f$writer = Log::WRITER_DATASERIES; # Change writer type. + Log::add_filter(Conn::LOG, f); # Replace filter with adapted version. + } + +Bro's DataSeries writer comes with a few tuning options, see +:doc:`scripts/base/frameworks/logging/writers/dataseries`. + +Working with DataSeries +======================= + +Here are a few examples of using DataSeries command line tools to work +with the output files. 
+
+* Printing CSV::
+
+    $ ds2txt --csv conn.ds
+    ts,uid,id.orig_h,id.orig_p,id.resp_h,id.resp_p,proto,service,duration,orig_bytes,resp_bytes,conn_state,local_orig,missed_bytes,history,orig_pkts,orig_ip_bytes,resp_pkts,resp_ip_bytes
+    1258790493.773208,ZTtgbHvf4s3,192.168.1.104,137,192.168.1.255,137,udp,dns,3.748891,350,0,S0,F,0,D,7,546,0,0
+    1258790451.402091,pOY6Rw7lhUd,192.168.1.106,138,192.168.1.255,138,udp,,0.000000,0,0,S0,F,0,D,1,229,0,0
+    1258790493.787448,pn5IiEslca9,192.168.1.104,138,192.168.1.255,138,udp,,2.243339,348,0,S0,F,0,D,2,404,0,0
+    1258790615.268111,D9slyIu3hFj,192.168.1.106,137,192.168.1.255,137,udp,dns,3.764626,350,0,S0,F,0,D,7,546,0,0
+    [...]
+
+  Add ``--separator=X`` to set a different separator.
+
+* Extracting a subset of columns::
+
+    $ ds2txt --select '*' ts,id.resp_h,id.resp_p --skip-all conn.ds
+    1258790493.773208 192.168.1.255 137
+    1258790451.402091 192.168.1.255 138
+    1258790493.787448 192.168.1.255 138
+    1258790615.268111 192.168.1.255 137
+    1258790615.289842 192.168.1.255 138
+    [...]
+
+* Filtering rows::
+
+    $ ds2txt --where '*' 'duration > 5 && id.resp_p > 1024' --skip-all conn.ds
+    1258790631.532888 V8mV5WLITu5 192.168.1.105 55890 239.255.255.250 1900 udp 15.004568 798 0 S0 F 0 D 6 966 0 0
+    1258792413.439596 tMcWVWQptvd 192.168.1.105 55890 239.255.255.250 1900 udp 15.004581 798 0 S0 F 0 D 6 966 0 0
+    1258794195.346127 cQwQMRdBrKa 192.168.1.105 55890 239.255.255.250 1900 udp 15.005071 798 0 S0 F 0 D 6 966 0 0
+    1258795977.253200 i8TEjhWd2W8 192.168.1.105 55890 239.255.255.250 1900 udp 15.004824 798 0 S0 F 0 D 6 966 0 0
+    1258797759.160217 MsLsBA8Ia49 192.168.1.105 55890 239.255.255.250 1900 udp 15.005078 798 0 S0 F 0 D 6 966 0 0
+    1258799541.068452 TsOxRWJRGwf 192.168.1.105 55890 239.255.255.250 1900 udp 15.004082 798 0 S0 F 0 D 6 966 0 0
+    [...]
+ +* Calculate some statistics: + + Mean/stddev/min/max over a column:: + + $ dsstatgroupby '*' basic duration from conn.ds + # Begin DSStatGroupByModule + # processed 2159 rows, where clause eliminated 0 rows + # count(*), mean(duration), stddev, min, max + 2159, 42.7938, 1858.34, 0, 86370 + [...] + + Quantiles of total connection volume:: + + $ dsstatgroupby '*' quantile 'orig_bytes + resp_bytes' from conn.ds + [...] + 2159 data points, mean 24616 +- 343295 [0,1.26615e+07] + quantiles about every 216 data points: + 10%: 0, 124, 317, 348, 350, 350, 601, 798, 1469 + tails: 90%: 1469, 95%: 7302, 99%: 242629, 99.5%: 1226262 + [...] + +The ``man`` pages for these tools show further options, and their +``-h`` option gives some more information (either can be a bit cryptic +unfortunately though). + +Deficiencies +------------ + +Due to limitations of the DataSeries format, one cannot inspect its +files before they have been fully written. In other words, when using +DataSeries, it's currently not possible to inspect the live log +files inside the spool directory before they are rotated to their +final location. It seems that this could be fixed with some effort, +and we will work with DataSeries development team on that if the +format gains traction among Bro users. + +Likewise, we're considering writing custom command line tools for +interacting with DataSeries files, making that a bit more convenient +than what the standard utilities provide. diff --git a/doc/frameworks/logging-elasticsearch.rst b/doc/frameworks/logging-elasticsearch.rst new file mode 100644 index 0000000000..7571c68219 --- /dev/null +++ b/doc/frameworks/logging-elasticsearch.rst @@ -0,0 +1,89 @@ + +========================================= +Indexed Logging Output with ElasticSearch +========================================= + +.. rst-class:: opening + + Bro's default ASCII log format is not exactly the most efficient + way for searching large volumes of data. 
ElasticSearch
+    is a new data storage technology for dealing with tons of data.
+    It's also a search engine built on top of Apache's Lucene
+    project. It scales very well, both for distributed indexing and
+    distributed searching.
+
+.. contents::
+
+Warning
+-------
+
+This writer plugin is still in testing and is not yet recommended for
+production use! The approach to how logs are handled in the plugin is "fire
+and forget" at this time; there is no error handling if the server fails to
+respond successfully to the insertion request.
+
+Installing ElasticSearch
+------------------------
+
+Download the latest version from: .
+Once extracted, start ElasticSearch with::
+
+    # ./bin/elasticsearch
+
+For more detailed information, refer to the ElasticSearch installation
+documentation: http://www.elasticsearch.org/guide/reference/setup/installation.html
+
+Compiling Bro with ElasticSearch Support
+----------------------------------------
+
+First, ensure that you have libcurl installed, then run configure::
+
+    # ./configure
+    [...]
+    ====================| Bro Build Summary |=====================
+    [...]
+    cURL: true
+    [...]
+    ElasticSearch: true
+    [...]
+    ================================================================
+
+Activating ElasticSearch
+------------------------
+
+The easiest way to enable ElasticSearch output is to load the
+tuning/logs-to-elasticsearch.bro script. If you are using BroControl, the
+following line in local.bro will enable it.
+
+.. console::
+
+    @load tuning/logs-to-elasticsearch
+
+With that, Bro will now write most of its logs into ElasticSearch in addition
+to maintaining the ASCII logs like it would do by default. That script has
+some tunable options for choosing which logs to send to ElasticSearch; refer
+to the autogenerated script documentation for those options.
+
+There is an interface being written specifically to integrate with the data
+that Bro outputs into ElasticSearch named Brownian.
It can be found here::
+
+    https://github.com/grigorescu/Brownian
+
+Tuning
+------
+
+A common problem encountered with ElasticSearch is too many files being held
+open. The ElasticSearch website has some suggestions on how to increase the
+open file limit.
+
+ - http://www.elasticsearch.org/tutorials/2011/04/06/too-many-open-files.html
+
+TODO
+----
+
+Lots.
+
+- Perform multicast discovery for server.
+- Better error detection.
+- Better defaults (don't index loaded-plugins, for instance).
+-
diff --git a/doc/frameworks/logging.rst b/doc/frameworks/logging.rst
new file mode 100644
index 0000000000..06935647d3
--- /dev/null
+++ b/doc/frameworks/logging.rst
@@ -0,0 +1,387 @@
+
+=================
+Logging Framework
+=================
+
+.. rst-class:: opening
+
+    Bro comes with a flexible key-value based logging interface that
+    allows fine-grained control of what gets logged and how it is
+    logged. This document describes how logging can be customized and
+    extended.
+
+.. contents::
+
+Terminology
+===========
+
+Bro's logging interface is built around three main abstractions:
+
+    Log streams
+        A stream corresponds to a single log. It defines the set of
+        fields that a log consists of with their names and types.
+        Examples are the ``conn`` stream for recording connection
+        summaries, and the ``http`` stream for recording HTTP activity.
+
+    Filters
+        Each stream has a set of filters attached to it that determine
+        what information gets written out. By default, each stream has
+        one default filter that just logs everything directly to disk
+        with an automatically generated file name. However, further
+        filters can be added to record only a subset, split a stream
+        into different outputs, or even to duplicate the log to
+        multiple outputs. If all filters are removed from a stream,
+        all output is disabled.
+
+    Writers
+        A writer defines the actual output format for the information
+        being logged.
At the moment, Bro comes with only one type of + writer, which produces tab separated ASCII files. In the + future we will add further writers, like for binary output and + direct logging into a database. + +Basics +====== + +The data fields that a stream records are defined by a record type +specified when it is created. Let's look at the script generating Bro's +connection summaries as an example, +:doc:`scripts/base/protocols/conn/main`. It defines a record +:bro:type:`Conn::Info` that lists all the fields that go into +``conn.log``, each marked with a ``&log`` attribute indicating that it +is part of the information written out. To write a log record, the +script then passes an instance of :bro:type:`Conn::Info` to the logging +framework's :bro:id:`Log::write` function. + +By default, each stream automatically gets a filter named ``default`` +that generates the normal output by recording all record fields into a +single output file. + +In the following, we summarize ways in which the logging can be +customized. We continue using the connection summaries as our example +to work with. + +Filtering +--------- + +To create a new output file for an existing stream, you can add a +new filter. A filter can, e.g., restrict the set of fields being +logged: + +.. code:: bro + + event bro_init() + { + # Add a new filter to the Conn::LOG stream that logs only + # timestamp and originator address. + local filter: Log::Filter = [$name="orig-only", $path="origs", $include=set("ts", "id.orig_h")]; + Log::add_filter(Conn::LOG, filter); + } + +Note the fields that are set for the filter: + + ``name`` + A mandatory name for the filter that can later be used + to manipulate it further. + + ``path`` + The filename for the output file, without any extension (which + may be automatically added by the writer). Default path values + are generated by taking the stream's ID and munging it slightly. 
+        :bro:enum:`Conn::LOG` is converted into ``conn``,
+        :bro:enum:`PacketFilter::LOG` is converted into
+        ``packet_filter``, and :bro:enum:`Notice::POLICY_LOG` is
+        converted into ``notice_policy``.
+
+    ``include``
+        A set limiting the fields to the ones given. The names
+        correspond to those in the :bro:type:`Conn::Info` record, with
+        sub-records unrolled by concatenating fields (separated with
+        dots).
+
+Using the code above, you will now get a new log file ``origs.log``
+that looks like this::
+
+    #separator \x09
+    #path   origs
+    #fields ts      id.orig_h
+    #types  time    addr
+    1128727430.350788       141.42.64.125
+    1128727435.450898       141.42.64.125
+
+If you want to make this the only log file for the stream, you can
+remove the default filter (which, conveniently, has the name
+``default``):
+
+.. code:: bro
+
+    event bro_init()
+        {
+        # Remove the filter called "default".
+        Log::remove_filter(Conn::LOG, "default");
+        }
+
+An alternate approach to "turning off" a log is to completely disable
+the stream:
+
+.. code:: bro
+
+    event bro_init()
+        {
+        Log::disable_stream(Conn::LOG);
+        }
+
+If you want to skip only some fields but keep the rest, there is a
+corresponding ``exclude`` filter attribute that you can use instead of
+``include`` to list only the ones you are not interested in.
+
+A filter can also determine output paths *dynamically* based on the
+record being logged. That allows, e.g., recording local and remote
+connections into separate files. To do this, you define a function
+that returns the desired path:
+
+.. code:: bro
+
+    function split_log(id: Log::ID, path: string, rec: Conn::Info) : string
+        {
+        # Return "conn-local" if originator is a local IP, otherwise "conn-remote".
+        local lr = Site::is_local_addr(rec$id$orig_h) ?
"local" : "remote";
+        return fmt("%s-%s", path, lr);
+        }
+
+    event bro_init()
+        {
+        local filter: Log::Filter = [$name="conn-split", $path_func=split_log, $include=set("ts", "id.orig_h")];
+        Log::add_filter(Conn::LOG, filter);
+        }
+
+Running this will now produce two files, ``conn-local.log`` and
+``conn-remote.log``, with the corresponding entries. One could extend this
+further for example to log information by subnets or even by IP
+address. Be careful, however, as it is easy to create many files very
+quickly ...
+
+.. sidebar:: A More Generic Path Function
+
+    The ``split_log`` method has one drawback: it can be used
+    only with the :bro:enum:`Conn::LOG` stream as the record type is hardcoded
+    into its argument list. However, Bro allows writing a more generic
+    variant:
+
+    .. code:: bro
+
+        function split_log(id: Log::ID, path: string, rec: record { id: conn_id; } ) : string
+            {
+            return Site::is_local_addr(rec$id$orig_h) ? "local" : "remote";
+            }
+
+    This function can be used with all log streams that have records
+    containing an ``id: conn_id`` field.
+
+While so far we have seen how to customize the columns being logged,
+you can also control which records are written out by providing a
+predicate that will be called for each log record:
+
+.. code:: bro
+
+    function http_only(rec: Conn::Info) : bool
+        {
+        # Record only connections with successfully analyzed HTTP traffic
+        return rec$service == "http";
+        }
+
+    event bro_init()
+        {
+        local filter: Log::Filter = [$name="http-only", $path="conn-http", $pred=http_only];
+        Log::add_filter(Conn::LOG, filter);
+        }
+
+This will result in a log file ``conn-http.log`` that contains only
+traffic detected and analyzed as HTTP traffic.
+
+Extending
+---------
+
+You can add further fields to a log stream by extending the record
+type that defines its content.
Let's say we want to add a boolean +field ``is_private`` to :bro:type:`Conn::Info` that indicates whether the +originator IP address is part of the :rfc:`1918` space: + +.. code:: bro + + # Add a field to the connection log record. + redef record Conn::Info += { + ## Indicate if the originator of the connection is part of the + ## "private" address space defined in RFC1918. + is_private: bool &default=F &log; + }; + + +Now we need to set the field. A connection's summary is generated at +the time its state is removed from memory. We can add another handler +at that time that sets our field correctly: + +.. code:: bro + + event connection_state_remove(c: connection) + { + if ( c$id$orig_h in Site::private_address_space ) + c$conn$is_private = T; + } + +Now ``conn.log`` will show a new field ``is_private`` of type +``bool``. + +Notes: + +- For extending logs this way, one needs a bit of knowledge about how + the script that creates the log stream is organizing its state + keeping. Most of the standard Bro scripts attach their log state to + the :bro:type:`connection` record where it can then be accessed, just + as the ``c$conn`` above. For example, the HTTP analysis adds a field + ``http`` of type :bro:type:`HTTP::Info` to the :bro:type:`connection` + record. See the script reference for more information. + +- When extending records as shown above, the new fields must always be + declared either with a ``&default`` value or as ``&optional``. + Furthermore, you need to add the ``&log`` attribute or otherwise the + field won't appear in the output. + +Hooking into the Logging +------------------------ + +Sometimes it is helpful to do additional analysis of the information +being logged. For these cases, a stream can specify an event that will +be generated every time a log record is written to it. All of Bro's +default log streams define such an event. For example, the connection +log stream raises the event :bro:id:`Conn::log_conn`. 
You +could use that for example for flagging when a connection to a +specific destination exceeds a certain duration: + +.. code:: bro + + redef enum Notice::Type += { + ## Indicates that a connection remained established longer + ## than 5 minutes. + Long_Conn_Found + }; + + event Conn::log_conn(rec: Conn::Info) + { + if ( rec$duration > 5mins ) + NOTICE([$note=Long_Conn_Found, + $msg=fmt("unusually long conn to %s", rec$id$resp_h), + $id=rec$id]); + } + +Often, these events can be an alternative to post-processing Bro logs +externally with Perl scripts. Much of what such an external script +would do later offline, one may instead do directly inside of Bro in +real-time. + +Rotation +-------- + +By default, no log rotation occurs, but it's globally controllable for all +filters by redefining the :bro:id:`Log::default_rotation_interval` option: + +.. code:: bro + + redef Log::default_rotation_interval = 1 hr; + +Or specifically for certain :bro:type:`Log::Filter` instances by setting +their ``interv`` field. Here's an example of changing just the +:bro:enum:`Conn::LOG` stream's default filter rotation. + +.. code:: bro + + event bro_init() + { + local f = Log::get_filter(Conn::LOG, "default"); + f$interv = 1 min; + Log::remove_filter(Conn::LOG, "default"); + Log::add_filter(Conn::LOG, f); + } + +ASCII Writer Configuration +-------------------------- + +The ASCII writer has a number of options for customizing the format of +its output, see :doc:`scripts/base/frameworks/logging/writers/ascii`. + +Adding Streams +============== + +It's easy to create a new log stream for custom scripts. Here's an +example for the ``Foo`` module: + +.. code:: bro + + module Foo; + + export { + # Create an ID for our new stream. By convention, this is + # called "LOG". + redef enum Log::ID += { LOG }; + + # Define the fields. By convention, the type is called "Info". + type Info: record { + ts: time &log; + id: conn_id &log; + }; + + # Define a hook event. 
By convention, this is called
+        # "log_".
+        global log_foo: event(rec: Info);
+
+    }
+
+    # This event should be handled at a higher priority so that when
+    # users modify your stream later and they do it at priority 0,
+    # their code runs after this.
+    event bro_init() &priority=5
+        {
+        # Create the stream. This also adds a default filter automatically.
+        Log::create_stream(Foo::LOG, [$columns=Info, $ev=log_foo]);
+        }
+
+You can also add the state to the :bro:type:`connection` record to make
+it easily accessible across event handlers:
+
+.. code:: bro
+
+    redef record connection += {
+        foo: Info &optional;
+    };
+
+Now you can use the :bro:id:`Log::write` method to output log records and
+save the logged ``Foo::Info`` record into the connection record:
+
+.. code:: bro
+
+    event connection_established(c: connection)
+        {
+        local rec: Foo::Info = [$ts=network_time(), $id=c$id];
+        c$foo = rec;
+        Log::write(Foo::LOG, rec);
+        }
+
+See the existing scripts for how to work with such a new connection
+field. A simple example is :doc:`scripts/base/protocols/syslog/main`.
+
+When you are developing scripts that add data to the :bro:type:`connection`
+record, care must be given to when and how long data is stored.
+Normally data saved to the connection record will remain there for the
+duration of the connection, and from a practical perspective it's not
+uncommon to need to delete that data before the end of the connection.
+
+Other Writers
+-------------
+
+Bro supports the following output formats other than ASCII:
+
+.. toctree::
+    :maxdepth: 1
+
+    logging-dataseries
+    logging-elasticsearch
diff --git a/doc/frameworks/notice.rst b/doc/frameworks/notice.rst
new file mode 100644
index 0000000000..dd0be42f02
--- /dev/null
+++ b/doc/frameworks/notice.rst
@@ -0,0 +1,357 @@
+
+================
+Notice Framework
+================
+
+.. rst-class:: opening
+
+    One of the easiest ways to customize Bro is writing a local notice
+    policy.
Bro can detect a large number of potentially interesting + situations, and the notice policy hook determines which of them the user wants to be + acted upon in some manner. In particular, the notice policy can specify + actions to be taken, such as sending an email or compiling regular + alarm emails. This page gives an introduction to writing such a notice + policy. + +.. contents:: + +Overview +-------- + +Let's start with a little bit of background on Bro's philosophy on reporting +things. Bro ships with a large number of policy scripts which perform a wide +variety of analyses. Most of these scripts monitor for activity which might be +of interest for the user. However, none of these scripts itself determines the +importance of what it finds. Instead, the scripts only flag situations +as *potentially* interesting, leaving it to the local configuration to define +which of them are in fact actionable. This decoupling of detection and +reporting allows Bro to address the different needs that sites have. +Definitions of what constitutes an attack or even a compromise differ quite a +bit between environments, and activity deemed malicious at one site might be +fully acceptable at another. + +Whenever one of Bro's analysis scripts sees something potentially +interesting it flags the situation by calling the :bro:see:`NOTICE` +function and giving it a single :bro:see:`Notice::Info` record. A Notice +has a :bro:see:`Notice::Type`, which reflects the kind of activity that +has been seen, and it is usually also augmented with further context +about the situation. + +More information about raising notices can be found in the `Raising Notices`_ +section. + +Once a notice is raised, it can have any number of actions applied to it by +writing :bro:see:`Notice::policy` hooks, which are described in the `Notice Policy`_ +section below. Such actions can be to send a mail to the configured +address(es) or to simply ignore the notice. Currently, the following actions +are defined: + +.. 
list-table:: + :widths: 20 80 + :header-rows: 1 + + * - Action + - Description + + * - Notice::ACTION_LOG + - Write the notice to the :bro:see:`Notice::LOG` logging stream. + + * - Notice::ACTION_ALARM + - Log into the :bro:see:`Notice::ALARM_LOG` stream which will rotate + hourly and email the contents to the email address or addresses + defined in the :bro:see:`Notice::mail_dest` variable. + + * - Notice::ACTION_EMAIL + - Send the notice in an email to the email address or addresses given in + the :bro:see:`Notice::mail_dest` variable. + + * - Notice::ACTION_PAGE + - Send an email to the email address or addresses given in the + :bro:see:`Notice::mail_page_dest` variable. + +How these notice actions are applied to notices is discussed in the +`Notice Policy`_ and `Notice Policy Shortcuts`_ sections. + +Processing Notices +------------------ + +Notice Policy +************* + +The hook :bro:see:`Notice::policy` provides the mechanism for applying +actions and generally modifying the notice before it's sent onward to +the action plugins. Hooks can be thought of as multi-bodied functions +and using them looks very similar to handling events. The difference +is that they don't go through the event queue like events. Users should +directly make modifications to the :bro:see:`Notice::Info` record +given as the argument to the hook. + +Here's a simple example which tells Bro to send an email for all notices of +type :bro:see:`SSH::Login` if the server is 10.0.0.1: + +.. code:: bro + + hook Notice::policy(n: Notice::Info) + { + if ( n$note == SSH::Login && n$id$resp_h == 10.0.0.1 ) + add n$actions[Notice::ACTION_EMAIL]; + } + +.. note:: + + Keep in mind that the semantics of the SSH::Login notice are + such that it is only raised when Bro heuristically detects a successful + login. No apparently failed logins will raise this notice. + +Hooks can also have priorities applied to order their execution like events +with a default priority of 0. 
Greater values are executed first. Setting +a hook body to run before default hook bodies might look like this: + +.. code:: bro + + hook Notice::policy(n: Notice::Info) &priority=5 + { + if ( n$note == SSH::Login && n$id$resp_h == 10.0.0.1 ) + add n$actions[Notice::ACTION_EMAIL]; + } + +Hooks can also abort later hook bodies with the ``break`` keyword. This +is primarily useful if one wants to completely preempt processing by +lower priority :bro:see:`Notice::policy` hooks. + +Notice Policy Shortcuts +*********************** + +Although the notice framework provides a great deal of flexibility and +configurability, there are many times when the full expressiveness isn't needed +and actually becomes a hindrance to achieving results. The framework provides +a default :bro:see:`Notice::policy` hook body as a way of giving users +shortcuts to easily apply many common actions to notices. + +These are implemented as sets and tables indexed with a +:bro:see:`Notice::Type` enum value. The following table shows and describes +all of the variables available for shortcut configuration of the notice +framework. + +.. list-table:: + :widths: 32 40 + :header-rows: 1 + + * - Variable name + - Description + + * - :bro:see:`Notice::ignored_types` + - Adding a :bro:see:`Notice::Type` to this set results in the notice + being ignored. It won't have any other action applied to it, not even + :bro:see:`Notice::ACTION_LOG`. + + * - :bro:see:`Notice::emailed_types` + - Adding a :bro:see:`Notice::Type` to this set results in + :bro:see:`Notice::ACTION_EMAIL` being applied to the notices of + that type. + + * - :bro:see:`Notice::alarmed_types` + - Adding a :bro:see:`Notice::Type` to this set results in + :bro:see:`Notice::ACTION_ALARM` being applied to the notices of + that type. + + * - :bro:see:`Notice::not_suppressed_types` + - Adding a :bro:see:`Notice::Type` to this set results in that notice + no longer undergoing the normal notice suppression that would + take place. 
Be careful when using this in production; it could + result in a dramatic increase in the number of notices being + processed. + + * - :bro:see:`Notice::type_suppression_intervals` + - This is a table indexed on :bro:see:`Notice::Type` and yielding an + interval. It can be used as an easy way to extend the default + suppression interval for an entire :bro:see:`Notice::Type` + without having to create a whole :bro:see:`Notice::policy` entry + and set the ``$suppress_for`` field. + +Raising Notices +--------------- + +A script should raise a notice for any occurrence that a user may want +to be notified about or take action on. For example, whenever the base +SSH analysis script sees an SSH session that it heuristically +guesses to be a successful login, it raises a Notice of the type +:bro:see:`SSH::Login`. The code in the base SSH analysis script looks +like this: + +.. code:: bro + + NOTICE([$note=SSH::Login, + $msg="Heuristically detected successful SSH login.", + $conn=c]); + +:bro:see:`NOTICE` is a normal function in the global namespace which +wraps a function within the ``Notice`` namespace. It takes a single +argument of the :bro:see:`Notice::Info` record type. The most common +fields used when raising notices are described in the following table: + +.. list-table:: + :widths: 32 40 + :header-rows: 1 + + * - Field name + - Description + + * - ``$note`` + - This field is required and is an enum value which represents the + notice type. + + * - ``$msg`` + - This is a human readable message which is meant to provide more + information about this particular instance of the notice type. + + * - ``$sub`` + - This is a sub-message meant for human readability but will + frequently also be used to contain data meant to be matched with the + ``Notice::policy``. + + * - ``$conn`` + - If a connection record is available when the notice is being raised + and the notice represents some attribute of the connection, then the + connection record can be given here. 
Other fields such as ``$id`` and + ``$src`` will automatically be populated from this value. + + * - ``$id`` + - If a conn_id record is available when the notice is being raised and + the notice represents some attribute of the connection, then the + connection can be given here. Other fields such as ``$src`` will + automatically be populated from this value. + + * - ``$src`` + - If the notice represents an attribute of a single host then it's + possible that only this field should be filled out to represent the + host that is being "noticed". + + * - ``$n`` + - This normally represents a number if the notice has to do with some + number. It's most frequently used for numeric tests in the + ``Notice::policy`` for making policy decisions. + + * - ``$identifier`` + - This represents a unique identifier for this notice. This field is + described in more detail in the `Automated Suppression`_ section. + + * - ``$suppress_for`` + - This field can be set if there is a natural suppression interval for + the notice that may be different than the default value. The + value set to this field can also be modified by a user's + :bro:see:`Notice::policy` so the value is not set permanently + and unchangeably. + +When writing Bro scripts which raise notices, some thought should be given to +what the notice represents and what data should be provided to give a consumer +of the notice the best information about the notice. If the notice is +representative of many connections and is an attribute of a host (e.g. a +scanning host) it probably makes most sense to fill out the ``$src`` field and +not give a connection or conn_id. If a notice is representative of a +connection attribute (e.g. an apparent SSH login) then it makes sense to fill +out either ``$conn`` or ``$id`` based on the data that is available when the +notice is raised. 
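+
+As an illustration of that guidance, a notice that is an attribute of a
+single host, such as a scanning host, might be raised with just the
+``$src`` field filled out. This is only a sketch; the ``Foo::Addr_Scan``
+notice type and the ``scanner`` variable are hypothetical names used for
+illustration:
+
+.. code:: bro
+
+    # Hypothetical notice type and variable, for illustration only.
+    NOTICE([$note=Foo::Addr_Scan,
+            $msg=fmt("%s appears to be scanning.", scanner),
+            $src=scanner]);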
Using care when inserting data into a notice, and including only the data +needed to fully represent the occurrence that raised it, will make later +analysis easier. For example, if complete connection information is +attached to a notice about an expiring SSL server certificate, the logs will be very +confusing because the connection that the certificate was detected on is a +side topic to the fact that an expired certificate was detected. It's possible +in many cases that two or more separate notices may need to be generated. As +an example, one could be for the detection of the expired SSL certificate and +another for the client deciding to go ahead with the connection +despite the expired certificate. + +Automated Suppression +--------------------- + +The notice framework supports suppression for notices if the author of the +script that is generating the notice has indicated to the notice framework how +to identify notices that are intrinsically the same. Identification of these +"intrinsically duplicate" notices is implemented with an optional field in +:bro:see:`Notice::Info` records named ``$identifier`` which is a simple string. +If the ``$identifier`` and ``$type`` fields are the same for two notices, the +notice framework actually considers them to be the same thing and can use that +information to suppress duplicates for a configurable period of time. + +.. note:: + + If the ``$identifier`` is left out of a notice, no notice suppression + takes place due to the framework's inability to identify duplicates. This + could be completely legitimate usage if no notices could ever be + considered to be duplicates. + +The ``$identifier`` field is typically built from several pieces of +data related to the notice that, when combined, represent a unique +instance of that notice. 
Here is an example of the script +:doc:`scripts/policy/protocols/ssl/validate-certs` raising a notice +for session negotiations where the certificate or certificate chain did +not validate successfully against the available certificate authority +certificates. + +.. code:: bro + + NOTICE([$note=SSL::Invalid_Server_Cert, + $msg=fmt("SSL certificate validation failed with (%s)", c$ssl$validation_status), + $sub=c$ssl$subject, + $conn=c, + $identifier=cat(c$id$resp_h,c$id$resp_p,c$ssl$validation_status,c$ssl$cert_hash)]); + +In the above example you can see that the ``$identifier`` field contains a +string that is built from the responder IP address and port, the validation +status message, and the MD5 sum of the server certificate. Those fields in +particular are chosen because different SSL certificates could be seen on any +port of a host, certificates could fail validation for different reasons, and +multiple server certificates could be used on that combination of IP address +and port with the ``server_name`` SSL extension (explaining the addition of +the MD5 sum of the certificate). The result is that if a certificate fails +validation and all four pieces of data match (IP address, port, validation +status, and certificate hash) that particular notice won't be raised again for +the default suppression period. + +Setting the ``$identifier`` field is left to those raising notices because +it's assumed that the script author who is raising the notice understands the +full problem set and edge cases of the notice which may not be readily +apparent to users. If users don't want the suppression to take place or simply +want a different interval, they can set a notice's suppression +interval to ``0secs`` or delete the value from the ``$identifier`` field in +a :bro:see:`Notice::policy` hook. + + +Extending Notice Framework +-------------------------- + +There are currently a couple of mechanisms for extending the notice framework +and adding new capabilities. 
+ +Extending Notice Emails +*********************** + +If there is extra information that you would like to add to emails, you can +add it by writing :bro:see:`Notice::policy` hooks. + +There is a field in the :bro:see:`Notice::Info` record named +``$email_body_sections`` which will be included verbatim when email is being +sent. An example of including some information from an HTTP request is +included below. + +.. code:: bro + + hook Notice::policy(n: Notice::Info) + { + if ( n?$conn && n$conn?$http && n$conn$http?$host ) + n$email_body_sections[|n$email_body_sections|] = fmt("HTTP host header: %s", n$conn$http$host); + } + + +Cluster Considerations +---------------------- + +As a user/developer of Bro, the main cluster concern with the notice framework +is understanding what runs where. When a notice is generated on a worker, the +worker checks to see if the notice should be suppressed based on information +locally maintained in the worker process. If it's not being +suppressed, the worker forwards the notice directly to the manager and does no more +local processing. The manager then runs the :bro:see:`Notice::policy` hook and +executes all of the actions determined to be run. + diff --git a/doc/frameworks/signatures.rst b/doc/frameworks/signatures.rst new file mode 100644 index 0000000000..915133e178 --- /dev/null +++ b/doc/frameworks/signatures.rst @@ -0,0 +1,394 @@ + +=================== +Signature Framework +=================== + +.. rst-class:: opening + + Bro relies primarily on its extensive scripting language for + defining and analyzing detection policies. In addition, however, + Bro also provides an independent *signature language* for doing + low-level, Snort-style pattern matching. While signatures are + *not* Bro's preferred detection tool, they sometimes come in handy + and are closer to what many people are familiar with from using + other NIDS. 
This page gives a brief overview on Bro's signatures + and covers some of their technical subtleties. + +.. contents:: + :depth: 2 + +Basics +====== + +Let's look at an example signature first: + +.. code:: bro-sig + + signature my-first-sig { + ip-proto == tcp + dst-port == 80 + payload /.*root/ + event "Found root!" + } + + +This signature asks Bro to match the regular expression ``.*root`` on +all TCP connections going to port 80. When the signature triggers, Bro +will raise an event :bro:id:`signature_match` of the form: + +.. code:: bro + + event signature_match(state: signature_state, msg: string, data: string) + +Here, ``state`` contains more information on the connection that +triggered the match, ``msg`` is the string specified by the +signature's event statement (``Found root!``), and ``data`` is the last +piece of payload which triggered the pattern match. + +To turn such :bro:id:`signature_match` events into actual alarms, you can +load Bro's :doc:`/scripts/base/frameworks/signatures/main` script. +This script contains a default event handler that raises +:bro:enum:`Signatures::Sensitive_Signature` :doc:`Notices <notice>` +(as well as others; see the beginning of the script). + +As signatures are independent of Bro's policy scripts, they are put into +their own file(s). There are three ways to specify which files contain +signatures: by using the ``-s`` flag when you invoke Bro, by +extending the Bro variable :bro:id:`signature_files` using the ``+=`` +operator, or by using the ``@load-sigs`` directive inside a Bro script. +If a signature file is given without a full path, it is searched for +along the normal ``BROPATH``. Additionally, the ``@load-sigs`` +directive can be used to load signature files in a path relative to the +Bro script in which it's placed, e.g. ``@load-sigs ./mysigs.sig`` will +expect that signature file in the same directory as the Bro script. 
The +default extension of the file name is ``.sig``, and Bro appends that +automatically when necessary. + +Signature language +================== + +Let's look at the format of a signature more closely. Each individual +signature has the format ``signature <id> { <attribute-set> }``. ``<id>`` +is a unique label for the signature. There are two types of +attributes: *conditions* and *actions*. The conditions define when the +signature matches, while the actions declare what to do in the case of +a match. Conditions can be further divided into four types: *header*, +*content*, *dependency*, and *context*. We discuss all of these in more +detail below. + +Conditions +---------- + +Header Conditions +~~~~~~~~~~~~~~~~~ + +Header conditions limit the applicability of the signature to a subset +of traffic that contains matching packet headers. This type of matching +is performed only for the first packet of a connection. + +There are pre-defined header conditions for some of the most used +header fields. All of them generally have the format ``<keyword> <cmp> +<value-list>``, where ``<keyword>`` names the header field; ``<cmp>`` is +one of ``==``, ``!=``, ``<``, ``<=``, ``>``, ``>=``; and +``<value-list>`` is a list of comma-separated values to compare +against. The following keywords are defined: + +``src-ip``/``dst-ip <cmp> <address-list>`` + Source and destination address, respectively. Addresses can be given + as IPv4 or IPv6 addresses or CIDR masks. For IPv6 addresses/masks + the colon-hexadecimal representation of the address must be enclosed + in square brackets (e.g. ``[fe80::1]`` or ``[fe80::0]/16``). + +``src-port``/``dst-port <cmp> <port-list>`` + Source and destination port, respectively. + +``ip-proto tcp|udp|icmp|icmp6|ip|ip6`` + IPv4 header's Protocol field or the Next Header field of the final + IPv6 header (i.e. either Next Header field in the fixed IPv6 header + if no extension headers are present or that field from the last + extension header in the chain). 
Note that the IP-in-IP forms of + tunneling are automatically decapsulated by default and signatures + apply to only the inner-most packet, so specifying ``ip`` or ``ip6`` + is a no-op. + +For lists of multiple values, they are sequentially compared against +the corresponding header field. If at least one of the comparisons +evaluates to true, the whole header condition matches (exception: with +``!=``, the header condition only matches if all values differ). + +In addition to these pre-defined header keywords, a general header +condition can be defined as + +.. code:: bro-sig + + header <proto>[<offset>:<size>] [& <integer>] <cmp> <value-list> + +This compares the value found at the given position of the packet header +with a list of values. ``<offset>`` defines the position of the value +within the header of the protocol defined by ``<proto>`` (which can be +``ip``, ``ip6``, ``tcp``, ``udp``, ``icmp`` or ``icmp6``). ``<size>`` is +either 1, 2, or 4 and specifies the value to have a size of this many +bytes. If the optional ``& <integer>`` is given, the packet's value is +first masked with the integer before it is compared to the value-list. +``<cmp>`` is one of ``==``, ``!=``, ``<``, ``<=``, ``>``, ``>=``. +``<value-list>`` is a list of comma-separated integers similar to those +described above. The integers within the list may be followed by an +additional ``/<mask>`` where ``<mask>`` is a value from 0 to 32. This +corresponds to the CIDR notation for netmasks and is translated into a +corresponding bitmask applied to the packet's value prior to the +comparison (similar to the optional ``& <integer>``). IPv6 address values +are not allowed in the value-list, though you can still inspect any 1, +2, or 4 byte section of an IPv6 header using this keyword. + +Putting it all together, this is an example condition that is +equivalent to ``dst-ip == 1.2.3.4/16, 5.6.7.8/24``: + +.. 
code:: bro-sig + + header ip[16:4] == 1.2.3.4/16, 5.6.7.8/24 + +Note that the analogous example for IPv6 isn't currently possible since +4 bytes is the max width of a value that can be compared. + +Content Conditions +~~~~~~~~~~~~~~~~~~ + +Content conditions are defined by regular expressions. We +differentiate two kinds of content conditions: first, the expression +may be declared with the ``payload`` statement, in which case it is +matched against the raw payload of a connection (for reassembled TCP +streams) or of each packet (for ICMP, UDP, and non-reassembled TCP). +Second, it may be prefixed with an analyzer-specific label, in which +case the expression is matched against the data as extracted by the +corresponding analyzer. + +A ``payload`` condition has the form: + +.. code:: bro-sig + + payload /<regular expression>/ + +Currently, the following analyzer-specific content conditions are +defined (note that the corresponding analyzer has to be activated by +loading its policy script): + +``http-request /<regular expression>/`` + The regular expression is matched against decoded URIs of HTTP + requests. Obsolete alias: ``http``. + +``http-request-header /<regular expression>/`` + The regular expression is matched against client-side HTTP headers. + +``http-request-body /<regular expression>/`` + The regular expression is matched against client-side bodies of + HTTP requests. + +``http-reply-header /<regular expression>/`` + The regular expression is matched against server-side HTTP headers. + +``http-reply-body /<regular expression>/`` + The regular expression is matched against server-side bodies of + HTTP replies. + +``ftp /<regular expression>/`` + The regular expression is matched against the command line input + of FTP sessions. + +``finger /<regular expression>/`` + The regular expression is matched against finger requests. + +For example, ``http-request /.*(etc\/(passwd|shadow))/`` matches any URI +containing either ``etc/passwd`` or ``etc/shadow``. To filter on request +types, e.g. ``GET``, use ``payload /GET /``. 
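+
+Putting header and content conditions together, a complete signature
+using ``http-request`` might look like the following sketch (the
+signature name and event message are made up for illustration):
+
+.. code:: bro-sig
+
+    signature http-passwd-uri {
+        ip-proto == tcp
+        dst-port == 80
+        http-request /.*etc\/passwd/
+        event "Possible /etc/passwd fetch!"
+    }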
Note that HTTP pipelining (that is, multiple HTTP transactions in a +single TCP connection) has some side effects on signature matches. If +multiple conditions are specified within a single signature, this +signature matches if all conditions are met by any HTTP transaction +(not necessarily always the same!) in a pipelined connection. + +Dependency Conditions +~~~~~~~~~~~~~~~~~~~~~ + +To define dependencies between signatures, there are two conditions: + + +``requires-signature [!] <id>`` + Defines the current signature to match only if the signature given + by ``<id>`` matches for the same connection. Using ``!`` negates the + condition: the current signature only matches if ``<id>`` does not + match for the same connection (using this defers the match + decision until the connection terminates). + +``requires-reverse-signature [!] <id>`` + Similar to ``requires-signature``, but ``<id>`` has to match for the + opposite direction of the same connection, compared to the current + signature. This allows modeling the notion of requests and + replies. + +Context Conditions +~~~~~~~~~~~~~~~~~~ + +Context conditions pass the match decision on to other components of +Bro. They are only evaluated if all other conditions have already +matched. The following context conditions are defined: + +``eval <policy-function>`` + The given policy function is called and has to return a boolean + confirming the match. If false is returned, no signature match is + going to be triggered. The function has to be of type ``function + cond(state: signature_state, data: string): bool``. Here, + ``data`` may contain the most recent content chunk available at + the time the signature was matched. If no such chunk is available, + ``data`` will be the empty string. See :bro:type:`signature_state` + for its definition. + +``payload-size <cmp> <integer>`` + Compares the integer to the size of the payload of a packet. For + reassembled TCP streams, the integer is compared to the size of + the first in-order payload chunk. 
Note that the latter is not very + well defined. + +``same-ip`` + Evaluates to true if the source address of the IP packets equals + its destination address. + +``tcp-state <state-list>`` + Imposes restrictions on the current TCP state of the connection. + ``<state-list>`` is a comma-separated list of the keywords + ``established`` (the three-way handshake has already been + performed), ``originator`` (the current data is sent by the + originator of the connection), and ``responder`` (the current data + is sent by the responder of the connection). + + +Actions +------- + +Actions define what to do if a signature matches. Currently, there are +two actions defined: + +``event <string>`` + Raises a :bro:id:`signature_match` event. The event handler has the + following type: + + .. code:: bro + + event signature_match(state: signature_state, msg: string, data: string) + + The given string is passed in as ``msg``, and ``data`` is the current + part of the payload that has eventually led to the signature + match (this may be empty for signatures without content + conditions). + +``enable <string>`` + Enables the protocol analyzer ``<string>`` for the matching + connection (``"http"``, ``"ftp"``, etc.). This is used by Bro's + dynamic protocol detection to activate analyzers on the fly. + +Things to keep in mind when writing signatures +============================================== + +* Each signature is reported at most once for every connection; + further matches of the same signature are ignored. + +* The content conditions perform pattern matching on elements + extracted from an application protocol dialogue. For example, ``http + /.*passwd/`` scans URLs requested within HTTP sessions. The thing to + keep in mind here is that these conditions only perform any matching + when the corresponding application analyzer is actually *active* for + a connection. Note that by default, analyzers are not enabled if the + corresponding Bro script has not been loaded. 
A good way to + double-check whether an analyzer "sees" a connection is checking its + log file for corresponding entries. If you cannot find the + connection in the analyzer's log, very likely the signature engine + has also not seen any application data. + +* As the name indicates, the ``payload`` keyword matches on packet + *payload* only. You cannot use it to match on packet headers; use + the header conditions for that. + +* For TCP connections, header conditions are only evaluated for the + *first packet from each endpoint*. If a header condition does not + match the initial packets, the signature will not trigger. Bro + optimizes for the most common application here, which is header + conditions selecting the connections to be examined more closely + with payload statements. + +* For UDP and ICMP flows, the payload matching is done on a per-packet + basis; i.e., any content crossing packet boundaries will not be + found. For TCP connections, the matching semantics depend on whether + Bro is *reassembling* the connection (i.e., putting all of a + connection's packets in sequence). By default, Bro is reassembling + the first 1K of every TCP connection, which means that within this + window, matches will be found without regard to packet order or + boundaries (i.e., *stream-wise matching*). + +* For performance reasons, by default Bro *stops matching* on a + connection after seeing 1K of payload; see the section on options + below for how to change this behaviour. The default was chosen with + Bro's main user of signatures in mind: dynamic protocol detection + works well even when examining just connection heads. + +* Regular expressions are implicitly anchored, i.e., they work as if + prefixed with the ``^`` operator. For reassembled TCP connections, + they are anchored at the first byte of the payload *stream*. For all + other connections, they are anchored at the first payload byte of + each packet. 
To match at arbitrary positions, you can prefix the + regular expression with ``.*``, as done in the examples above. + +* To match on non-ASCII characters, Bro's regular expressions support + the ``\x`` operator. CRs/LFs are not treated specially by the + signature engine and can be matched with ``\r`` and ``\n``, + respectively. Generally, Bro follows `flex's regular expression + syntax + `_. + See the DPD signatures in ``base/frameworks/dpd/dpd.sig`` for some examples + of fairly complex payload patterns. + +* The data argument of the :bro:id:`signature_match` handler might not carry + the full text matched by the regular expression. Bro performs the + matching incrementally as packets come in; when the signature + eventually fires, it can only pass on the most recent chunk of data. + + +Options +======= + +The following options control details of Bro's matching process: + +``dpd_reassemble_first_packets: bool`` (default: ``T``) + If true, Bro reassembles the beginning of every TCP connection (of + up to ``dpd_buffer_size`` bytes, see below), to facilitate + reliable matching across packet boundaries. If false, only those + connections are reassembled for which an application-layer + analyzer gets activated (e.g., by Bro's dynamic protocol + detection). + +``dpd_match_only_beginning: bool`` (default: ``T``) + If true, Bro performs packet matching only within the initial + payload window of ``dpd_buffer_size``. If false, it keeps matching + on subsequent payload as well. + +``dpd_buffer_size: count`` (default: ``1024``) + Defines the buffer size for the two preceding options. In + addition, this value determines the number of bytes Bro buffers + for each connection in order to activate application analyzers + even after parts of the payload have already passed through. This + is needed by the dynamic protocol detection capability to defer + the decision which analyzers to use. + + +So, how about using Snort signatures with Bro? 
+============================================== + +There was once a script, ``snort2bro``, that converted Snort +signatures automatically into Bro's signature syntax. However, in our +experience this didn't turn out to be a very useful thing to do +because by simply using Snort signatures, one can't benefit from the +additional capabilities that Bro provides; the approaches of the two +systems are just too different. We therefore stopped maintaining the +``snort2bro`` script, and there are now many newer Snort options which +it doesn't support. The script is now no longer part of the Bro +distribution. + diff --git a/doc/index.rst b/doc/index.rst index e966661115..32c01a6e7d 100644 --- a/doc/index.rst +++ b/doc/index.rst @@ -5,50 +5,15 @@ Bro Documentation ================= -Guides ------- - .. toctree:: - :maxdepth: 1 - - INSTALL - upgrade - quickstart - faq - reporting-problems - -xFrameworks ----------- - -.. toctree:: - :maxdepth: 1 - - notice - logging - input - cluster - signatures - -How-Tos -------- - -.. toctree:: - :maxdepth: 2 - :numbered: - - user-manual/index - reference/index - -Just Testing -============ - -.. code:: bro - - print "Hey Bro!" - -.. 
btest:: test - - @TEST-COPY-FILE: ${TRACES}/wikipedia.trace - @TEST-EXEC: btest-rst-cmd bro -r wikipedia.trace - @TEST-EXEC: btest-rst-cmd "cat http.log | bro-cut ts id.orig_h | head -5" + :maxdepth: 2 + intro/index.rst + using/index.rst + scripting/index.rst + frameworks/index.rst + cluster/index.rst + scripts/index.rst + misc/index.rst + components/index.rst + indices/index.rst diff --git a/doc/indices/index.rst b/doc/indices/index.rst new file mode 100644 index 0000000000..1d0411c7c0 --- /dev/null +++ b/doc/indices/index.rst @@ -0,0 +1,7 @@ + +======= +Indices +======= + +* :ref:`General Index ` +* :ref:`search` diff --git a/doc/intro/index.rst b/doc/intro/index.rst new file mode 100644 index 0000000000..58a5a03bd0 --- /dev/null +++ b/doc/intro/index.rst @@ -0,0 +1,13 @@ + +============ +Introduction +============ + +.. toctree:: + :maxdepth: 2 + + overview + quickstart + upgrade + reporting-problems + diff --git a/doc/reference/language.rst b/doc/intro/overview.rst similarity index 65% rename from doc/reference/language.rst rename to doc/intro/overview.rst index dd50997672..acb9c970d5 100644 --- a/doc/reference/language.rst +++ b/doc/intro/overview.rst @@ -1,7 +1,5 @@ ================== -Language (Missing) +Overview (Missing) ================== - - diff --git a/doc/user-manual/quickstart.rst b/doc/intro/quickstart.rst similarity index 100% rename from doc/user-manual/quickstart.rst rename to doc/intro/quickstart.rst diff --git a/doc/intro/reporting-problems.rst b/doc/intro/reporting-problems.rst new file mode 100644 index 0000000000..903df76257 --- /dev/null +++ b/doc/intro/reporting-problems.rst @@ -0,0 +1,194 @@ + +Reporting Problems +================== + +.. rst-class:: opening + + Here we summarize some steps to follow when you see Bro doing + something it shouldn't. To provide help, it is often crucial for + us to have a way of reliably reproducing the effect you're seeing. 
+   Unfortunately, reproducing problems can be rather tricky with Bro
+   because more often than not, they occur only in rare situations or
+   only after Bro has been running for some time. In particular,
+   getting a small trace showing a specific effect can be a real
+   problem. In the following, we'll summarize some strategies to this
+   end.
+
+Reporting Problems
+------------------
+
+Generally, when you encounter a problem with Bro, the best thing to do
+is to open a new ticket in `Bro's issue tracker
+`__ and include information on how to
+reproduce the issue. Ideally, your ticket should come with the
+following:
+
+* The Bro version you're using (if working directly from the git
+  repository, the branch and revision number).
+
+* The output you're seeing, along with a description of what you'd
+  expect Bro to do instead.
+
+* A *small* trace in `libpcap format `__
+  demonstrating the effect (assuming the problem doesn't happen right
+  at startup already).
+
+* The exact command line you're using to run Bro with that trace. If
+  you can, please try to run the Bro binary directly from the command
+  line rather than using BroControl.
+
+* Any non-standard scripts you're using (but please include only those
+  that are really necessary; just a small code snippet triggering the
+  problem would be perfect).
+
+* If you encounter a crash, information from the core dump, such as
+  the stack backtrace, can be very helpful. See below for more on
+  this.
+
+
+How Do I Get a Trace File?
+--------------------------
+
+As Bro is usually running live, coming up with a small trace file that
+reproduces a problem can turn out to be quite a challenge. Often it
+works best to start with a large trace that triggers the problem, and
+then successively thin it out as much as possible.
+
+To get to the initial large trace, here are a few things you can try:
+
+* Capture a trace with `tcpdump `__, either
+  on the same interface Bro is running on, or on another host where
+  you can generate traffic of the kind likely triggering the problem
+  (e.g., if you're seeing problems with the HTTP analyzer, record some
+  of your Web browsing on your desktop). When using tcpdump, don't
+  forget to record *complete* packets (``tcpdump -s 0 ...``). You can
+  reduce the amount of traffic captured by using a suitable BPF filter
+  (e.g., for HTTP only, try ``port 80``).
+
+* Bro's command-line option ``-w <filename>`` records all packets it
+  processes into the given file. You can then later run Bro offline on
+  this trace and it will process the packets in the same way as it did
+  live. This is particularly helpful with problems that only occur
+  after Bro has already been running for some time. For example,
+  sometimes a crash may be triggered by a particular kind of traffic
+  occurring only rarely. Running Bro live with ``-w`` and then, after
+  the crash, offline on the recorded trace might, with a little bit of
+  luck, reproduce the problem reliably. However, be careful with
+  ``-w``: it can result in huge trace files, quickly filling up your
+  disk. (One way to mitigate the space issues is to periodically
+  delete the trace file by configuring ``rotate-logs.bro``
+  accordingly. BroControl does that for you if you set its
+  ``SaveTraces`` option.)
+
+* Finally, you can try running Bro on a publicly available trace
+  file, such as `anonymized FTP traffic `__,
+  `headers-only enterprise traffic `__, or
+  `Defcon traffic `__. Some of these
+  particularly stress certain components of Bro (e.g., the Defcon
+  traces contain tons of scans).
+
+Once you have a trace that demonstrates the effect, you will often
+notice that it's pretty big, in particular if recorded from the link
+you're monitoring. Therefore, the next step is to shrink its size as
+much as possible.
Here are a few things you can try to this end:
+
+* Very often, a single connection is able to demonstrate the problem.
+  If you can identify which one it is (e.g., from one of Bro's
+  ``*.log`` files), you can extract the connection's packets from the
+  trace using tcpdump by filtering for the corresponding 4-tuple of
+  addresses and ports:
+
+  .. console::
+
+      > tcpdump -r large.trace -w small.trace host <ip1> and port <port1> and host <ip2> and port <port2>
+
+* If you can't reduce the problem to a connection, try to identify
+  either a host pair or a single host triggering it, and filter down
+  the trace accordingly.
+
+* You can try to extract a smaller time slice from the trace using
+  `TCPslice `__. For example, to
+  extract the first 100 seconds from the trace:
+
+  .. console::
+
+      > tcpslice +100 large.trace >small.trace
+
+Alternatively, tcpdump can extract the first ``n`` packets with its
+option ``-c <n>``.
+
+
+Getting More Information After a Crash
+--------------------------------------
+
+If Bro crashes, a *core dump* can be very helpful to nail down the
+problem. Examining a core is not for the faint of heart but can reveal
+extremely useful information.
+
+First, you should configure Bro with the option ``--enable-debug`` and
+recompile; this will disable all compiler optimizations and thus make
+the core dump more useful (don't expect great performance with this
+version though; compiling Bro without optimization has a noticeable
+impact on its CPU usage). Then enable core dumps if you haven't
+already (e.g., ``ulimit -c unlimited`` if you're using bash).
+
+Once Bro has crashed, start gdb with the Bro binary and the file
+containing the core dump. (Alternatively, you can also run Bro
+directly inside gdb instead of working from a core file.) The first
+helpful information to include with your tracker ticket is a stack
+backtrace, which you get with gdb's ``bt`` command:
+
+.. console::
+
+    > gdb bro core
+    [...]
+    > bt
+
+
+If the crash occurs inside Bro's script interpreter, the next thing to
+do is to identify the line of script code processed just before the
+abnormal termination. Look for methods in the stack backtrace which
+belong to any of the script interpreter's classes. Roughly speaking,
+these are all classes with names ending in ``Expr``, ``Stmt``, or
+``Val``. Then climb up the stack with ``up`` until you reach the first
+of these methods. The object to which ``this`` is pointing will have a
+``Location`` object, which in turn contains the file name and line
+number of the corresponding piece of script code. Continuing the
+example from above, here's how to get that information:
+
+.. console::
+
+    [in gdb]
+    > up
+    > ...
+    > up
+    > print this->location->filename
+    > print this->location->first_line
+
+
+If the crash occurs while processing input packets but you cannot
+directly tell which connection is responsible (and thus cannot extract
+its packets from the trace as suggested above), try getting the
+4-tuple of the connection currently being processed from the core dump
+by again examining the stack backtrace, this time looking for methods
+belonging to the ``Connection`` class. That class has members
+``orig_addr``/``resp_addr`` and ``orig_port``/``resp_port`` storing
+(pointers to) the IP addresses and ports, respectively:
+
+.. console::
+
+    [in gdb]
+    > up
+    > ...
+    > up
+    > printf "%08x:%04x %08x:%04x\n", *this->orig_addr, this->orig_port, *this->resp_addr, this->resp_port
+
+
+Note that these values are stored in `network byte order
+`__,
+so you will need to flip the bytes around if you are on a
+little-endian machine (which is why the above example prints them in
+hex). For example, if an IP address prints as ``0100007f``, that's
+``127.0.0.1``.
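To sanity-check such a value outside of gdb, the byte reversal can be sketched in a few lines of shell (a hedged illustration; the ``hex`` value below is the example from the text, not output from any particular core dump):

```shell
# Hex word as printed by gdb on a little-endian machine.
hex=0100007f

# Reverse the four bytes (undoing the little-endian display of a
# value stored in network byte order) and print them as a dotted quad.
printf '%d.%d.%d.%d\n' "0x${hex:6:2}" "0x${hex:4:2}" "0x${hex:2:2}" "0x${hex:0:2}"
# prints 127.0.0.1
```

This relies on bash's substring expansion; on a big-endian machine, the value would already print in the expected byte order.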
+ diff --git a/doc/intro/upgrade.rst b/doc/intro/upgrade.rst new file mode 100644 index 0000000000..539757537d --- /dev/null +++ b/doc/intro/upgrade.rst @@ -0,0 +1,308 @@ + +========================================== +Upgrading From the Previous Version of Bro +========================================== + +.. rst-class:: opening + + This guide details specific differences between Bro versions + that may be important for users to know as they work on updating + their Bro deployment/configuration to the later version. + +.. contents:: + + +Upgrading From Bro 2.0 to 2.1 +============================= + +In Bro 2.1, IPv6 is enabled by default. Therefore, when building Bro from +source, the "--enable-brov6" configure option has been removed because it +is no longer relevant. + +Other configure changes include renaming the "--enable-perftools" option +to "--enable-perftools-debug" to indicate that the option is only relevant +for debugging the heap. One other change involves what happens when +tcmalloc (part of Google perftools) is found at configure time. On Linux, +it will automatically be linked with Bro, but on other platforms you +need to use the "--enable-perftools" option to enable linking to tcmalloc. + +There are a couple of changes to the Bro scripting language to better +support IPv6. First, IPv6 literals appearing in a Bro script must now be +enclosed in square brackets (for example, ``[fe80::db15]``). For subnet +literals, the slash "/" appears after the closing square bracket (for +example, ``[fe80:1234::]/32``). Second, when an IP address variable or IP +address literal is enclosed in pipes (for example, ``|[fe80::db15]|``) the +result is now the size of the address in bits (32 for IPv4 and 128 for IPv6). + +In the Bro scripting language, "match" and "using" are no longer reserved +keywords. 
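To illustrate the IPv6-related syntax changes described above, here is a small hedged sketch (not taken from the original release notes):

```bro
event bro_init()
    {
    # IPv6 literals must now be enclosed in square brackets.
    local a: addr = [fe80::db15];

    # For subnet literals, the "/" follows the closing square bracket.
    local s: subnet = [fe80:1234::]/32;

    # Enclosing an address in pipes now yields its width in bits:
    # 128 for IPv6, 32 for IPv4.
    print |a|;
    print s;
    }
```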
+
+Some built-in functions have been removed: "addr_to_count" (use
+"addr_to_counts" instead), "bro_has_ipv6" (this is no longer relevant
+because Bro now always supports IPv6), "active_connection" (use
+"connection_exists" instead), and "connection_record" (use
+"lookup_connection" instead).
+
+The "NFS3::mode2string" built-in function has been renamed to
+"file_mode".
+
+Some built-in functions have been changed: "exit" (now takes the exit
+code as a parameter), "to_port" (now takes a string as parameter
+instead of a count and transport protocol, but "count_to_port" is
+still available), "connect" (now takes an additional string parameter
+specifying the zone of a non-global IPv6 address), and "listen" (now
+takes three additional parameters to enable listening on IPv6
+addresses).
+
+Some Bro script variables have been renamed: "LogAscii::header_prefix"
+has been renamed to "LogAscii::meta_prefix", and
+"LogAscii::include_header" has been renamed to
+"LogAscii::include_meta".
+
+Some Bro script variables have been removed: "tunnel_port",
+"parse_udp_tunnels", "use_connection_compressor", "cc_handle_resets",
+"cc_handle_only_syns", and "cc_instantiate_on_data".
+
+A couple of events have changed: the "icmp_redirect" event now
+includes the target and destination addresses and any Neighbor
+Discovery options in the message, and the last parameter of the
+"dns_AAAA_reply" event has been removed because it was unused.
+
+The format of the ASCII log files has changed very slightly. Two new
+lines are automatically added, one to record the time when the log was
+opened, and the other to record the time when the log was closed.
+
+In BroControl, the option (in broctl.cfg) "CFlowAddr" was renamed
+to "CFlowAddress".
+
+
+Upgrading From Bro 1.5 to 2.0
+=============================
+
+As the version number jump suggests, Bro 2.0 is a major upgrade and
+lots of things have changed.
Most importantly, we have rewritten
+almost all of Bro's default scripts from scratch, using a quite
+different structure and focusing more on operational deployment. The
+result is a system that works much better "out of the box", even
+without much initial site-specific configuration. The downside is
+that 1.x configurations will need to be adapted to work with the new
+version. The two rules of thumb are:
+
+    (1) If you have written your own Bro scripts
+        that do not depend on any of the standard scripts formerly
+        found in ``policy/``, they will most likely just keep working
+        (although you might want to adapt them to use some of the new
+        features, like the new logging framework; see below).
+
+    (2) If you have custom code that depends on specifics of 1.x
+        default scripts (including most configuration tuning), that is
+        unlikely to work with 2.x. We recommend starting with just the
+        new scripts, and then porting over any customizations
+        incrementally as necessary (they may be much easier to do now,
+        or even unnecessary). Send mail to the Bro user mailing list
+        if you need help.
+
+Below we summarize changes from 1.x to 2.x in more detail. This list
+isn't complete; see the :download:`CHANGES ` file in the
+distribution for the full story.
+
+Default Scripts
+===============
+
+Organization
+------------
+
+In versions before 2.0, Bro scripts were all maintained in a flat
+directory called ``policy/`` in the source tree. This directory is now
+renamed to ``scripts/`` and contains major subdirectories ``base/``,
+``policy/``, and ``site/``, each of which may also be subdivided
+further.
+
+The contents of the new ``scripts/`` directory, like those of the old,
+flat ``policy/``, still get installed under the ``share/bro``
+subdirectory of the installation prefix path, just like in previous
+versions.
For example, if Bro was compiled like ``./configure
+--prefix=/usr/local/bro && make && make install``, then the script
+hierarchy can be found in ``/usr/local/bro/share/bro``.
+
+The main subdirectories of that hierarchy are as follows:
+
+- ``base/`` contains all scripts that are loaded by Bro by default
+  (unless the ``-b`` command line option is used to run Bro in a
+  minimal configuration). Note that this is a major conceptual change:
+  rather than not loading anything by default, Bro now uses an
+  extensive set of default scripts out of the box.
+
+  The scripts under this directory generally either accumulate/log
+  useful state/protocol information for monitored traffic, configure a
+  default/recommended mode of operation, or provide extra Bro
+  scripting-layer functionality that has no significant performance
+  cost.
+
+- ``policy/`` contains all scripts that a user will need to explicitly
+  tell Bro to load. These are scripts that implement
+  functionality/analysis that not all users may want to use and that
+  may have more significant performance costs. For a new installation,
+  you should go through these and see what appears useful to load.
+
+- ``site/`` remains a directory that can be used to store locally
+  developed scripts. It now comes with some preinstalled example
+  scripts that contain recommended default configurations going beyond
+  the ``base/`` setup. E.g., ``local.bro`` loads extra scripts from
+  ``policy/`` and does extra tuning. These files can be customized in
+  place without being overwritten by upgrades/reinstalls, unlike
+  scripts in other directories.
+
+With version 2.0, the default ``BROPATH`` is set to automatically
+search for scripts in ``policy/``, ``site/``, and their parent
+directory, but **not** ``base/``. Generally, everything under
+``base/`` is loaded automatically, but for users of the ``-b`` option,
+it's important to know that loading a script in that directory
+requires the extra ``base/`` path qualification.
For example, the
+following two scripts:
+
+* ``$PREFIX/share/bro/base/protocols/ssl/main.bro``
+* ``$PREFIX/share/bro/policy/protocols/ssl/validate-certs.bro``
+
+are referenced from another Bro script like:
+
+.. code:: bro
+
+    @load base/protocols/ssl/main
+    @load protocols/ssl/validate-certs
+
+Notice how ``policy/`` can be omitted as a convenience in the second
+case. ``@load`` can now also use relative paths, e.g., ``@load
+../main``.
+
+
+Logging Framework
+-----------------
+
+- The logs generated by scripts that ship with Bro have been entirely
+  redone to use a standardized, machine-parsable format via the new
+  logging framework. Generally, the log content has been restructured
+  towards making it more directly useful to operations. Also, several
+  analyzers have been significantly extended and thus now log more
+  information. Take a look at ``ssl.log``.
+
+  * A particular format change that may be useful to note is that the
+    ``conn.log`` ``service`` field is derived from DPD instead of
+    well-known ports (while that was already possible in 1.5, it was
+    not the default).
+
+  * Also, ``conn.log`` now reports the raw number of packets/bytes per
+    endpoint.
+
+- The new logging framework makes it possible to extend, customize,
+  and filter logs very easily. See the :doc:`logging framework `
+  for more information on usage.
+
+- A common pattern found in the new scripts is to store logging stream
+  records for protocols inside the ``connection`` records so that
+  state can be collected until enough is seen to log a coherent unit
+  of information regarding the activity of that connection. This
+  state is now frequently seen/accessible in event handlers, for
+  example, like ``c$<protocol>`` where ``<protocol>`` is replaced by
+  the name of the protocol. This field is added to the ``connection``
+  record by ``redef``'ing it in a
+  ``base/protocols/<protocol>/main.bro`` script.
+
+- The logging code has been rewritten internally, with the
+  script-level interface and output backend now clearly separated.
While ASCII
+  logging is still the default, we will add further output types in
+  the future (binary format, direct database logging).
+
+
+Notice Framework
+----------------
+
+The way users interact with "notices" has changed significantly in
+order to make it easier to define a site policy and to make it more
+extensible for adding customized actions. See the :doc:`notice
+framework `.
+
+
+New Default Settings
+--------------------
+
+- Dynamic Protocol Detection (DPD) is now enabled/loaded by default.
+
+- The default packet filter now examines all packets instead of
+  dynamically building a filter based on which protocol analysis
+  scripts are loaded. See ``PacketFilter::all_packets`` for how to
+  revert to the old behavior.
+
+API Changes
+-----------
+
+- The ``@prefixes`` directive works differently now.
+  Any added prefixes are now searched for and loaded *after* all input
+  files have been parsed. After all input files are parsed, Bro
+  searches ``BROPATH`` for prefixed, flattened versions of all of the
+  parsed input files. For example, if ``lcl`` is in ``@prefixes``, and
+  ``site.bro`` is loaded, then a file named ``lcl.site.bro`` that's in
+  ``BROPATH`` would end up being automatically loaded as well.
+  Packages work similarly, e.g., loading ``protocols/http`` means a
+  file named ``lcl.protocols.http.bro`` in ``BROPATH`` gets loaded
+  automatically.
+
+- The ``make_addr`` BIF now returns a ``subnet`` instead of an
+  ``addr``.
+
+
+Variable Naming
+---------------
+
+- ``Module`` is more widely used for namespacing. E.g., the new
+  ``site.bro`` exports the ``local_nets`` identifier (among other
+  things) into the ``Site`` module.
+
+- Identifiers may have been renamed to conform to the new `scripting
+  conventions
+  `_.
+
+
+BroControl
+==========
+
+BroControl looks much the same as the version that came with Bro 1.x,
+but it has been cleaned up and streamlined significantly internally.
+
+BroControl has a new ``process`` command to process a trace on disk
+offline, using a configuration similar to what BroControl installs for
+live analysis.
+
+BroControl now has an extensive plugin interface for adding new
+commands and options. Note that this is still considered experimental.
+
+We have removed the ``analysis`` command, and BroControl currently
+does not send daily alarm summaries anymore (this may be restored
+later).
+
+Removed Functionality
+=====================
+
+We have removed a bunch of functionality that was rarely used and/or
+had not been maintained for a while:
+
+ - The ``net`` script data type.
+ - The ``alarm`` statement; use the notice framework instead.
+ - Trace rewriting.
+ - DFA state expiration in the regexp engine.
+ - Active mapping.
+ - Native DAG support (may come back eventually).
+ - ClamAV support.
+ - The connection compressor is now disabled by default, and will
+   be removed in the future.
+
+Development Infrastructure
+==========================
+
+Bro development has moved from using SVN to Git for revision control.
+Users who want to use the latest Bro development snapshot by checking
+it out from the source repositories should see the `development
+process `_. Note that all the various
+sub-components now reside in their own repositories. However, the
+top-level Bro repository includes them as git submodules so it's easy
+to check them all out simultaneously.
+
+Bro now uses `CMake `_ for its build system, so CMake
+is now a required dependency when building from source.
+
+Bro now comes with a growing suite of regression tests in
+``testing/``.
diff --git a/doc/misc/geoip.rst b/doc/misc/geoip.rst
new file mode 100644
index 0000000000..bd9ae0c08d
--- /dev/null
+++ b/doc/misc/geoip.rst
@@ -0,0 +1,102 @@
+
+===========
+GeoLocation
+===========
+
+.. rst-class:: opening
+
+   When writing policy scripts, the need may arise to find the
+   geographic location of an IP address.
Bro has support
+   for the `GeoIP library `__ at the
+   policy-script level beginning with release 1.3 to address this
+   need.
+
+.. contents::
+
+GeoIPLite Database Installation
+-------------------------------
+
+A country database is included when you install the GeoIP C API, but
+for Bro we use the city database, which includes cities and regions
+in addition to countries.
+
+`Download `__ the GeoLiteCity binary database and
+follow the directions to install it.
+
+FreeBSD Quick Install
+---------------------
+
+.. console::
+
+    pkg_add -r GeoIP
+    wget http://geolite.maxmind.com/download/geoip/database/GeoLiteCity.dat.gz
+    gunzip GeoLiteCity.dat.gz
+    mv GeoLiteCity.dat /usr/local/share/GeoIP/GeoIPCity.dat
+
+    # Set your environment correctly before running Bro's configure script
+    export CFLAGS=-I/usr/local/include
+    export LDFLAGS=-L/usr/local/lib
+
+
+CentOS Quick Install
+--------------------
+
+.. console::
+
+    yum install GeoIP-devel
+
+    wget http://geolite.maxmind.com/download/geoip/database/GeoLiteCity.dat.gz
+    gunzip GeoLiteCity.dat.gz
+    mkdir -p /var/lib/GeoIP/
+    mv GeoLiteCity.dat /var/lib/GeoIP/GeoIPCity.dat
+
+    # Set your environment correctly before running Bro's configure script
+    export CFLAGS=-I/usr/local/include
+    export LDFLAGS=-L/usr/local/lib
+
+
+Usage
+-----
+
+There is a single built-in function that provides the GeoIP
+functionality:
+
+.. code:: bro
+
+    function lookup_location(a: addr): geo_location
+
+There is also the ``geo_location`` data structure that is returned
+from the ``lookup_location`` function:
+
+.. code:: bro
+
+    type geo_location: record {
+        country_code: string;
+        region: string;
+        city: string;
+        latitude: double;
+        longitude: double;
+    };
+
+
+Example
+-------
+
+To write a line in a log file for every FTP connection from hosts in
+Ohio, this is now very easy:
+
+.. code:: bro
+
+    global ftp_location_log: file = open_log_file("ftp-location");
+
+    event ftp_reply(c: connection, code: count, msg: string, cont_resp: bool)
+        {
+        local client = c$id$orig_h;
+        local loc = lookup_location(client);
+        if (loc$region == "OH" && loc$country_code == "US")
+            {
+            print ftp_location_log, fmt("FTP connection from: %s (%s, %s, %s)",
+                                        client, loc$city, loc$region, loc$country_code);
+            }
+        }
+
diff --git a/doc/misc/index.rst b/doc/misc/index.rst
new file mode 100644
index 0000000000..edf82e8fd2
--- /dev/null
+++ b/doc/misc/index.rst
@@ -0,0 +1,9 @@
+
+====================
+Miscellaneous Topics
+====================
+
+.. toctree::
+   :maxdepth: 2
+
+   geoip
diff --git a/doc/reference/events.rst b/doc/reference/events.rst
deleted file mode 100644
index bcb3adae42..0000000000
--- a/doc/reference/events.rst
+++ /dev/null
@@ -1,5 +0,0 @@
-
-================
-Events (Missing)
-================
-
diff --git a/doc/reference/frameworks.rst b/doc/reference/frameworks.rst
deleted file mode 100644
index 20824b03bc..0000000000
--- a/doc/reference/frameworks.rst
+++ /dev/null
@@ -1,5 +0,0 @@
-
-====================
-Frameworks (Missing)
-====================
-
diff --git a/doc/reference/index.rst b/doc/reference/index.rst
deleted file mode 100644
index cba512cd1c..0000000000
--- a/doc/reference/index.rst
+++ /dev/null
@@ -1,13 +0,0 @@
-
-=========
-Reference
-=========
-
-.. toctree::
-   :maxdepth: 2
-   :numbered:
-
-   frameworks.rst
-   events.rst
-   language.rst
-   subsystems.rst
diff --git a/doc/reference/subsystems.rst b/doc/reference/subsystems.rst
deleted file mode 100644
index 9caafba8b3..0000000000
--- a/doc/reference/subsystems.rst
+++ /dev/null
@@ -1,4 +0,0 @@
-
-====================
-Subsystems (Missing)
-====================
diff --git a/doc/user-manual/scripting.rst b/doc/scripting/index.rst
similarity index 99%
rename from doc/user-manual/scripting.rst
rename to doc/scripting/index.rst
index adc35b127e..9ef8e8f8f3 100644
--- a/doc/user-manual/scripting.rst
+++ b/doc/scripting/index.rst
@@ -1,7 +1,7 @@
-=========
-Scripting
-=========
+===================
+Writing Bro Scripts
+===================
 .. toctree::
    :maxdepth: 2
diff --git a/doc/scripts/index.rst b/doc/scripts/index.rst
index bf0fa25f10..d8fe2e57b1 100644
--- a/doc/scripts/index.rst
+++ b/doc/scripts/index.rst
@@ -1,8 +1,21 @@
 .. This is a stub doc to which broxygen appends during the build process
-Index of All Individual Bro Scripts
-===================================
+================
+Script Reference
+================
 .. toctree::
    :maxdepth: 1
+   builtins
+   bifs
+   scripts
+   packages
+   internal
+
+Indices
+=======
+
+ * `Notice Index `_
+
diff --git a/doc/scripts/scripts.rst b/doc/scripts/scripts.rst
new file mode 100644
index 0000000000..d454063002
--- /dev/null
+++ b/doc/scripts/scripts.rst
@@ -0,0 +1,8 @@
+.. This is a stub doc to which broxygen appends during the build process
+
+========================
+Index of All Bro Scripts
+========================
+
+.. toctree::
+   :maxdepth: 1
diff --git a/doc/user-manual/index.rst b/doc/user-manual/index.rst
deleted file mode 100644
index ad3e516a78..0000000000
--- a/doc/user-manual/index.rst
+++ /dev/null
@@ -1,13 +0,0 @@
-
-===========
-User Manual
-===========
-
-.. toctree::
-   :maxdepth: 2
-   :numbered:
-
-   intro.rst
-   quickstart.rst
-   scripting.rst
-
diff --git a/doc/user-manual/intro.rst b/doc/user-manual/intro.rst
deleted file mode 100644
index c7a210747d..0000000000
--- a/doc/user-manual/intro.rst
+++ /dev/null
@@ -1,4 +0,0 @@
-
-======================
-Introduction (Missing)
-======================
diff --git a/doc/using/index.rst b/doc/using/index.rst
new file mode 100644
index 0000000000..ab7834d904
--- /dev/null
+++ b/doc/using/index.rst
@@ -0,0 +1,6 @@
+
+===================
+Using Bro (Missing)
+===================
+
+TODO.