Restructuring the main documentation index.

I'm merging in the remaining pieces from the former doc directory and
restructuring things into sub-directories.
This commit is contained in:
Robin Sommer 2013-04-01 17:30:12 -07:00
parent 12e4dd8066
commit 25bf563e1c
41 changed files with 7679 additions and 100 deletions


@@ -31,7 +31,7 @@ add_custom_target(broxygen
     ${DOC_SOURCE_WORKDIR}/scripts
     # append to the master index of all policy scripts
     COMMAND cat ${MASTER_POLICY_INDEX} >>
-            ${DOC_SOURCE_WORKDIR}/scripts/index.rst
+            ${DOC_SOURCE_WORKDIR}/scripts/scripts.rst
     # append to the master index of all policy packages
     COMMAND cat ${MASTER_PACKAGE_INDEX} >>
             ${DOC_SOURCE_WORKDIR}/scripts/packages.rst

doc/cluster/index.rst Normal file

@ -0,0 +1,86 @@
========================
Setting up a Bro Cluster
========================
Intro
------
Bro is not multithreaded, so once the limitations of a single processor core are reached, the only option currently is to spread the workload across many cores, or even many physical computers. The cluster deployment scenario for Bro is the current solution for building these larger systems. The accompanying tools and scripts provide the structure to easily manage many Bro processes that examine packets and do correlation activities but act as a singular, cohesive entity.
Architecture
---------------
The figure below illustrates the main components of a Bro cluster.
.. image:: /images/deployment.png
Tap
***
This is a mechanism that splits the packet stream in order to make a copy
available for inspection. Examples include the monitoring port on a switch and
an optical splitter for fiber networks.
Frontend
********
This is a discrete hardware device or on-host technique that will split your traffic into many streams or flows. The Bro binary does not do this job. There are numerous ways to accomplish this task, some of which are described below in `Frontend Options`_.
Manager
*******
This is a Bro process which has two primary jobs. It receives log messages and notices from the rest of the nodes in the cluster using the Bro communications protocol. The result is that you end up with a single set of logs instead of many discrete logs that you would have to combine later with post-processing. The manager also de-duplicates notices; it is able to do so because it acts as the choke point for notices and for how notices are processed into actions such as emailing, paging, or blocking.
The manager process is started first by BroControl; it only opens its designated port and waits for connections, and it doesn't initiate any connections to the rest of the cluster. Once the workers are started and connect to the manager, logs and notices start arriving at the manager process from the workers.
Proxy
*****
This is a Bro process which manages synchronized state. Variables can be synchronized across connected Bro processes automatically, and proxies help the workers by alleviating the need for all of the workers to connect directly to each other.
Examples of synchronized state from the scripts that ship with Bro are things such as the full list of “known” hosts and services, i.e., hosts or services which have been detected as performing full TCP handshakes, or for which an analyzed protocol has been found on a connection. If worker A detects host 1.2.3.4 as an active host, it would be beneficial for worker B to know that as well, so worker A shares that information as an insertion to a set <link to set documentation would be good here> which travels to the cluster's proxy, and the proxy then sends that same set insertion to worker B. The result is that worker A and worker B have shared knowledge about the hosts and services that are active on the network being monitored.
The proxy model extends to multiple proxies as well if necessary for performance reasons; this only adds one additional step for the Bro processes. Each proxy connects to another proxy in a ring, and the workers are shared between them as evenly as possible. When a proxy receives some new bit of state, it shares it with its neighboring proxy, which passes it around the ring of proxies and down to all of the workers. From a practical standpoint, there are no established rules of thumb yet for the number of proxies needed for a given number of workers. It is best to start with a single proxy and add more if communication performance problems appear.
Bro processes acting as proxies don't tend to be very demanding of CPU or memory, and users frequently run proxy processes on the same physical host as the manager.
Worker
******
This is the Bro process that sniffs network traffic and does protocol analysis on the reassembled traffic streams. Most of the work of an active cluster takes place on the workers, and as such, the workers typically represent the bulk of the Bro processes running in a cluster. The fastest memory and CPU core speed you can afford are best here, since all of the protocol parsing and most of the analysis takes place on the workers. There are no particular requirements for the disks in workers, since almost all logging is done remotely to the manager and very little is normally written to disk.
The rule of thumb we have followed recently is to allocate approximately 1 core for every 80 Mbps of traffic being analyzed; however, this estimate is highly specific to the traffic mix. It has generally worked for mixed traffic with many users and servers. For example, if your traffic peaks around 2 Gbps (combined) and you want to handle traffic at peak load, you may want to have 26 cores available (2048 / 80 == 25.6). If the 80 Mbps estimate works for your traffic, this could be handled by 3 physical hosts dedicated to being workers, each one containing dual 6-core processors.
Once a flow-based load balancer is in place, this model is also very easy to scale, so it's reasonable to start with an estimate of the hardware you will need to fully analyze your traffic. If it turns out that you need more, it's relatively easy to increase the size of the cluster in most cases.
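As a back-of-the-envelope illustration of this sizing rule, here is a small sketch in Python (the 80 Mbps-per-core figure and the 12-cores-per-host assumption are taken from the example above; adjust both to your own traffic mix and hardware)::

import math

MBPS_PER_CORE = 80    # rule-of-thumb capacity per worker core (traffic-mix specific)
CORES_PER_HOST = 12   # e.g., a worker host with dual 6-core processors

def size_cluster(peak_mbps):
    """Estimate the worker cores and hosts needed for a peak traffic rate."""
    cores = int(math.ceil(float(peak_mbps) / MBPS_PER_CORE))
    hosts = int(math.ceil(cores / float(CORES_PER_HOST)))
    return cores, hosts

print(size_cluster(2048))   # -> (26, 3), matching the example above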
Frontend Options
----------------
There are many options for setting up a frontend flow distributor and in many cases it may even be beneficial to do multiple stages of flow distribution on the network and on the host.
Discrete hardware flow balancers
********************************
cPacket
^^^^^^^
If you are monitoring one or more 10G physical interfaces, the recommended solution is to use either a cFlow or cVu device from cPacket because they are currently being used very successfully at a number of sites. These devices will perform layer-2 load balancing by rewriting the destination ethernet MAC address to cause each packet associated with a particular flow to have the same destination MAC. The packets can then be passed directly to a monitoring host where each worker has a BPF filter to limit its visibility to only that stream of flows or onward to a commodity switch to split the traffic out to multiple 1G interfaces for the workers. This can ultimately greatly reduce costs since workers can use relatively inexpensive 1G interfaces.
OpenFlow Switches
^^^^^^^^^^^^^^^^^
We are currently exploring the use of OpenFlow based switches to do flow based load balancing directly on the switch which can greatly reduce frontend costs for many users. This document will be updated when we have more information.
On host flow balancing
**********************
PF_RING
^^^^^^^
The PF_RING software for Linux has a “clustering” feature which will do flow-based load balancing across a number of processes that are sniffing the same interface. This allows you to easily take advantage of multiple cores in a single physical host because Bro's main event loop is single-threaded and can't natively utilize all of the cores. More information about Bro with PF_RING can be found here: (someone want to write a quick Bro/PF_RING tutorial to link to here? document installing kernel module, libpcap wrapper, building Bro with the --with-pcap configure option)
Netmap
^^^^^^
FreeBSD has an in-progress project named Netmap which will enable flow based load balancing as well. When it becomes viable for real world use, this document will be updated.
Click! Software Router
^^^^^^^^^^^^^^^^^^^^^^
Click! can be used for flow-based load balancing with a simple configuration. (link to an example for the config). This solution is not recommended on Linux due to Bro's PF_RING support, and only as a last resort on other operating systems, since it causes a lot of overhead due to context switching back and forth between kernel and userland several times per packet.


@ -0,0 +1,68 @@
.. -*- mode: rst-mode -*-
..
.. Version number is filled in automatically.
.. |version| replace:: 0.34-3
======
BinPAC
======
.. rst-class:: opening
BinPAC is a high-level language for describing protocol parsers that
generates C++ code. It is currently maintained and distributed with the
Bro Network Security Monitor distribution; however, the generated parsers
may be used with other programs besides Bro.
Download
--------
You can find the latest BinPAC release for download at
http://www.bro.org/download.
BinPAC's git repository is located at `git://git.bro.org/binpac.git
<git://git.bro.org/binpac.git>`__. You can browse the repository
`here <http://git.bro.org/binpac.git>`__.
This document describes BinPAC |version|. See the ``CHANGES``
file for version history.
Prerequisites
-------------
BinPAC relies on the following libraries and tools, which need to be
installed before you begin:
* Flex (Fast Lexical Analyzer)
Flex is already installed on most systems, so with luck you can
skip having to install it yourself.
* Bison (GNU Parser Generator)
Bison is also already installed on many systems.
* CMake 2.6.3 or greater
CMake is a cross-platform, open-source build system, typically
not installed by default. See http://www.cmake.org for more
information regarding CMake and the installation steps below for
how to use it to build this distribution. CMake generates native
Makefiles that depend on GNU Make by default.
Installation
------------
To build and install into ``/usr/local``::
./configure
cd build
make
make install
This will perform an out-of-source build into the build directory using
the default build options and then install the binpac binary into
``/usr/local/bin``.
You can specify a different installation directory with::
./configure --prefix=<dir>
Run ``./configure --help`` for more options.


@ -0,0 +1,70 @@
.. -*- mode: rst; -*-
..
.. Version number is filled in automatically.
.. |version| replace:: 0.26-5
======================
Bro Auxiliary Programs
======================
.. contents::
:Version: |version|
Handy auxiliary programs related to the use of the Bro Network Security
Monitor (http://www.bro.org).
Note that some files that were formerly distributed with Bro as part
of the aux/ tree are now maintained separately. See
http://www.bro.org/download for their download locations.
adtrace
=======
Makefile and source for the adtrace utility. This program is used
in conjunction with the localnetMAC.pl perl script to compute the
network addresses that compose the internal and external nets that Bro
is monitoring. When run by itself, this program just reads a pcap
(tcpdump) file and writes out the src MAC, dst MAC, src IP, and dst
IP for each packet seen in the file. This output is processed by
the localnetMAC.pl script during 'make install'.
devel-tools
===========
A set of scripts used commonly for Bro development.
extract-conn-by-uid
Extracts a connection from a trace file based
on its UID found in Bro's conn.log
gen-mozilla-ca-list.rb
Generates a list of Mozilla SSL root certificates in
a format readable by Bro.
update-changes
A script to maintain the CHANGES and VERSION files.
git-show-fastpath
Show commits to the fastpath branch not yet merged into master.
cpu-bench-with-trace
Run a number of Bro benchmarks on a trace file.
nftools
=======
Utilities for dealing with Bro's custom file format for storing
NetFlow records. nfcollector reads NetFlow data from a socket and
writes it in Bro's format. ftwire2bro reads NetFlow "wire" format
(e.g., as generated by a 'flow-export' directive) and writes it in
Bro's format.
rst
===
Makefile and source for the rst utility. "rst" can be invoked by
a Bro script to terminate an established TCP connection by forging
RST tear-down packets. See terminate_connection() in conn.bro.


@ -0,0 +1,231 @@
.. -*- mode: rst-mode -*-
..
.. Version number is filled in automatically.
.. |version| replace:: 0.54
============================
Python Bindings for Broccoli
============================
.. rst-class:: opening
This Python module provides bindings for Broccoli, Bro's client
communication library. In general, the bindings provide the same
functionality as Broccoli's C API.
.. contents::
Download
--------
You can find the latest Broccoli-Python release for download at
http://www.bro.org/download.
Broccoli-Python's git repository is located at `git://git.bro.org/broccoli-python.git
<git://git.bro.org/broccoli-python.git>`__. You can browse the repository
`here <http://git.bro.org/broccoli-python.git>`__.
This document describes Broccoli-Python |version|. See the ``CHANGES``
file for version history.
Installation
------------
Installation of the Python module is pretty straightforward. After
Broccoli itself has been installed, it follows the standard installation
process for Python modules::
python setup.py install
Try the following to test the installation. If you do not see any
error message, everything should be fine::
python -c "import broccoli"
Usage
-----
The following examples demonstrate how to send and receive Bro
events in Python.
The main challenge when using Broccoli from Python is dealing with
the data types of Bro event parameters as there is no one-to-one
mapping between Bro's types and Python's types. The Python module
automatically maps between those types which both systems provide
(such as strings) and provides a set of wrapper classes for Bro
types which do not have a direct Python equivalent (such as IP
addresses).
Connecting to Bro
~~~~~~~~~~~~~~~~~
The following code sets up a connection from Python to a remote Bro
instance (or another Broccoli) and provides a connection handle for
further communication::
from broccoli import *
bc = Connection("127.0.0.1:47758")
An ``IOError`` will be raised if the connection cannot be established.
Sending Events
~~~~~~~~~~~~~~
Once you have a connection handle ``bc`` set up as shown above, you can
start sending events::
bc.send("foo", 5, "attack!")
This sends an event called ``foo`` with two parameters, ``5`` and
``attack!``. Broccoli operates asynchronously, i.e., events scheduled
with ``send()`` are not always sent out immediately but might be
queued for later transmission. To ensure that all events get out
(and incoming events are processed, see below), you need to call
``bc.processInput()`` regularly.
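As an illustration, a minimal send loop could look like this (the one-second interval and the event parameters are arbitrary choices for this sketch, not part of the API)::

# Note: ``from broccoli import *`` exports a ``time`` wrapper class,
# so we import only ``sleep`` from the standard library to avoid shadowing.
from time import sleep
from broccoli import *

bc = Connection("127.0.0.1:47758")

while True:
    bc.send("foo", 5, "attack!")   # queue an event for transmission
    bc.processInput()              # flush queued events, dispatch received ones
    sleep(1)                       # arbitrary polling interval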
Data Types
~~~~~~~~~~
In the example above, the types of the event parameters are
automatically derived from the corresponding Python types: the first
parameter (``5``) has the Bro type ``int`` and the second one
(``attack!``) has Bro type ``string``.
For types which do not have a Python equivalent, the ``broccoli``
module provides wrapper classes which have the same names as the
corresponding Bro types. For example, to send an event called ``bar``
with one ``addr`` argument and one ``count`` argument, you can write::
bc.send("bar", addr("192.168.1.1"), count(42))
The following table summarizes the available atomic types and their
usage.
======== =========== =============================
Bro Type Python Type Example
======== =========== =============================
addr                 ``addr("192.168.1.1")``
bool     bool        ``True``
count                ``count(42)``
double   float       ``3.14``
enum                 Type currently not supported
int      int         ``5``
interval             ``interval(60)``
net                  Type currently not supported
port                 ``port("80/tcp")``
string   string      ``"attack!"``
subnet               ``subnet("192.168.1.0/24")``
time                 ``time(1111111111.0)``
======== =========== =============================
The ``broccoli`` module also supports sending Bro records as event
parameters. To send a record, you first define a record type. For
example, a Bro record type::
type my_record: record {
a: int;
b: addr;
c: subnet;
};
turns into Python as::
my_record = record_type("a", "b", "c")
As the example shows, Python only needs to know the attribute names
but not their types. The types are derived automatically in the same
way as discussed above for atomic event parameters.
Now you can instantiate a record instance of the newly defined type
and send it out::
rec = record(my_record)
rec.a = 5
rec.b = addr("192.168.1.1")
rec.c = subnet("192.168.1.0/24")
bc.send("my_event", rec)
.. note:: The Python module does not support nested records at this time.
Receiving Events
~~~~~~~~~~~~~~~~
To receive events, you define a callback function having the same
name as the event and mark it with the ``event`` decorator::
@event
def foo(arg1, arg2):
print arg1, arg2
Once you start calling ``bc.processInput()`` regularly (see above),
each received ``foo`` event will trigger the callback function.
By default, the event's arguments are always passed in with built-in
Python types. For Bro types which do not have a direct Python
equivalent (see table above), a substitute built-in type is used
which corresponds to the type the wrapper class' constructor expects
(see the examples in the table). For example, Bro type ``addr`` is
passed in as a string and Bro type ``time`` is passed in as a float.
Alternatively, you can define a *typed* prototype for the event. If you
do so, arguments will first be type-checked and then passed to the
callback with the specified type (which means instances of the
wrapper classes for non-Python types). Example::
@event(count, addr)
def bar(arg1, arg2):
print arg1, arg2
Here, ``arg1`` will be an instance of the ``count`` wrapper class and
``arg2`` will be an instance of the ``addr`` wrapper class.
Prototyping works similarly with built-in Python types::
@event(int, string)
def foo(arg1, arg2):
print arg1, arg2
In general, the prototype specifies the types in which the callback
wants to receive the arguments. This actually provides support for
simple type casts, as some types support conversion into something
different. If for instance the event source sends an event with a
single port argument, ``@event(port)`` will pass the port as an
instance of the ``port`` wrapper class; ``@event(string)`` will pass it
as a string (e.g., ``"80/tcp"``); and ``@event(int)`` will pass it as an
integer without protocol information (e.g., just ``80``). If an
argument cannot be converted into the specified type, a ``TypeError``
will be raised.
To receive an event with a record parameter, the record type first
needs to be defined, as described above. Then the type can be used
with the ``@event`` decorator in the same way as atomic types::
my_record = record_type("a", "b", "c")
@event(my_record)
def my_event(rec):
print rec.a, rec.b, rec.c
Helper Functions
----------------
The ``broccoli`` module provides one helper function: ``current_time()``
returns the current time as a float which, if necessary, can be
wrapped into a ``time`` parameter (i.e., ``time(current_time())``).
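For example, to send an event carrying the current timestamp (the event name ``heartbeat`` is made up for this sketch)::

from broccoli import *

bc = Connection("127.0.0.1:47758")
bc.send("heartbeat", time(current_time()))  # wrap the float in a Bro ``time``
bc.processInput()                           # make sure the event goes out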
Examples
--------
There are some example scripts in the ``tests/`` subdirectory of the
``broccoli-python`` repository
`here <http://git.bro.org/broccoli-python.git/tree/HEAD:/tests>`_:
- ``broping.py`` is a (simplified) Python version of Broccoli's test program
``broping``. Start Bro with ``broping.bro``.
- ``broping-record.py`` is a Python version of Broccoli's ``broping``
for records. Start Bro with ``broping-record.bro``.
- ``test.py`` is a very ugly but comprehensive regression test and part of
the communication test-suite. Start Bro with ``test.bro``.


@ -0,0 +1,67 @@
.. -*- mode: rst-mode -*-
..
.. Version number is filled in automatically.
.. |version| replace:: 1.54
===============================================
Ruby Bindings for Broccoli
===============================================
.. rst-class:: opening
This is the broccoli-ruby extension for Ruby which provides access
to the Broccoli API. Broccoli is a library for
communicating with the Bro Intrusion Detection System.
Download
========
You can find the latest Broccoli-Ruby release for download at
http://www.bro.org/download.
Broccoli-Ruby's git repository is located at `git://git.bro.org/broccoli-ruby.git
<git://git.bro.org/broccoli-ruby.git>`__. You can browse the repository
`here <http://git.bro.org/broccoli-ruby.git>`__.
This document describes Broccoli-Ruby |version|. See the ``CHANGES``
file for version history.
Installation
============
To install the extension:
1. Make sure that the ``broccoli-config`` binary is in your path.
(``export PATH=/usr/local/bro/bin:$PATH``)
2. Run ``sudo ruby setup.rb``.
To install the extension as a gem (suggested):
1. Install `rubygems <http://rubygems.org>`_.
2. Make sure that the ``broccoli-config`` binary is in your path.
(``export PATH=/usr/local/bro/bin:$PATH``)
3. Run ``sudo gem install rbroccoli``.
Usage
=====
There aren't really any useful docs yet. Your best bet currently is
to read through the examples.
One thing I should mention, however, is that I haven't done any
optimization yet. You may find that code that sends or receives
extremely large numbers of events won't run fast enough and will
begin to fall behind the Bro server. The dns_requests.rb example is
a good performance test if your Bro server is sitting on a network with many
DNS lookups.
Contact
=======
If you have a question/comment/patch, see the Bro `contact page
<http://www.bro.org/contact/index.html>`_.


@ -0,0 +1,141 @@
.. -*- mode: rst-mode -*-
..
.. Version number is filled in automatically.
.. |version| replace:: 1.92-9
===============================================
Broccoli: The Bro Client Communications Library
===============================================
.. rst-class:: opening
Broccoli is the "Bro client communications library". It allows you
to create client sensors for the Bro intrusion detection system.
Broccoli can speak a good subset of the Bro communication protocol,
in particular, it can receive Bro IDs, send and receive Bro events,
and send and receive event requests to/from peering Bros. You can
currently create and receive values of pure types like integers,
counters, timestamps, IP addresses, port numbers, booleans, and
strings.
Download
--------
You can find the latest Broccoli release for download at
http://www.bro.org/download.
Broccoli's git repository is located at
`git://git.bro.org/broccoli <git://git.bro.org/broccoli>`_. You
can browse the repository `here <http://git.bro.org/broccoli>`_.
This document describes Broccoli |version|. See the ``CHANGES``
file for version history.
Installation
------------
The Broccoli library has been tested on Linux, the BSDs, and Solaris.
A Windows build has not currently been tried but is part of our future
plans. If you succeed in building Broccoli on other platforms, let us
know!
Prerequisites
-------------
Broccoli relies on the following libraries and tools, which need to be
installed before you begin:
Flex (Fast Lexical Analyzer)
Flex is already installed on most systems, so with luck you
can skip having to install it yourself.
Bison (GNU Parser Generator)
This comes with many systems, but if you get errors compiling
parse.y, you will need to install it.
OpenSSL headers and libraries
For encrypted communication. These are likely installed,
though some platforms may require installation of a 'devel'
package for the headers.
CMake 2.6.3 or greater
CMake is a cross-platform, open-source build system, typically
not installed by default. See http://www.cmake.org for more
information regarding CMake and the installation steps below
for how to use it to build this distribution. CMake generates
native Makefiles that depend on GNU Make by default.
Broccoli can also make use of some optional libraries if they are found at
installation time:
Libpcap headers and libraries
Network traffic capture library
Installation
------------
To build and install into ``/usr/local``::
./configure
make
make install
This will perform an out-of-source build into the build directory using the
default build options and then install libraries into ``/usr/local/lib``.
You can specify a different installation directory with::
./configure --prefix=<dir>
Or control the python bindings install destination more precisely with::
./configure --python-install-dir=<dir>
Run ``./configure --help`` for more options.
Further notable configure options:
``--enable-debug``
This one enables lots of debugging output. Be sure to disable
this when using the library in a production environment! The
output could easily end up in undesired places when the stdout
of the program you've instrumented is used in other ways.
``--with-configfile=FILE``
Broccoli can read key/value pairs from a config file. By default
it is located in the ``etc`` directory of the installation root
(exception: when using ``--prefix=/usr``, ``/etc`` is used
instead of ``/usr/etc``). The default config file name is
``broccoli.conf``. Using ``--with-configfile``, you can override the
location and name of the config file.
To use the library in other programs & configure scripts, use the
``broccoli-config`` script. It gives you the necessary configuration flags
and linker flags for your system, see ``--cflags`` and ``--libs``.
The API is contained in ``broccoli.h`` and pretty well documented. A few
usage examples can be found in the ``test`` directory; in particular, the
``broping`` tool can be used to test event transmission and reception. Have
a look at the policy file ``broping.bro`` for the events that need to be
defined at the peering Bro. Try ``broping -h`` for a look at the available
options.
Broccoli knows two kinds of version numbers: the release version number
(as in "broccoli-x.y.tar.gz", or as shipped with Bro) and the shared
library API version number (as in libbroccoli.so.3.0.0). The former
relates to changes in the tree, the latter to compatibility changes in
the API.
Comments, feedback and patches are appreciated; please check the `Bro
website <http://www.bro.org/community>`_.
Documentation
-------------
Please see the `Broccoli User Manual <./broccoli-manual.html>`_ and
the `Broccoli API Reference <../../broccoli-api/index.html>`_.

File diff suppressed because it is too large

File diff suppressed because it is too large


@ -0,0 +1,843 @@
.. -*- mode: rst-mode -*-
..
.. Version number is filled in automatically.
.. |version| replace:: 0.4-14
============================================
BTest - A Simple Driver for Basic Unit Tests
============================================
.. rst-class:: opening
``btest`` is a simple framework for writing unit tests. Freely
borrowing some ideas from other packages, its main objective is to
provide an easy-to-use, straightforward driver for a suite of
shell-based tests. Each test consists of a set of command lines that
will be executed, and success is determined based on their exit
codes. ``btest`` comes with some additional tools that can be used
within such tests to compare output against a previously established
baseline.
.. contents::
Download
========
You can find the latest BTest release for download at
http://www.bro.org/download.
BTest's git repository is located at `git://git.bro.org/btest.git
<git://git.bro.org/btest.git>`__. You can browse the repository
`here <http://git.bro.org/btest.git>`__.
This document describes BTest |version|. See the ``CHANGES``
file for version history.
Installation
============
Installation is simple and standard::
tar xzvf btest-*.tar.gz
cd btest-*
python setup.py install
This will install a few scripts: ``btest`` is the main driver program,
and there are a number of further helper scripts that we discuss below
(including ``btest-diff``, which is a tool for comparing output to a
previously established baseline).
Writing a Simple Test
=====================
In the most simple case, ``btest`` simply executes a set of command
lines, each of which must be prefixed with ``@TEST-EXEC:``
::
> cat examples/t1
@TEST-EXEC: echo "Foo" | grep -q Foo
@TEST-EXEC: test -d .
> btest examples/t1
examples.t1 ... ok
The test passes as both command lines return success. If one of them
didn't, that would be reported::
> cat examples/t2
@TEST-EXEC: echo "Foo" | grep -q Foo
@TEST-EXEC: test -d DOESNOTEXIST
> btest examples/t2
examples.t2 ... failed
Usually you will just run all tests found in a directory::
> btest examples
examples.t1 ... ok
examples.t2 ... failed
1 test failed
Why do we need the ``@TEST-EXEC:`` prefixes? Because the file
containing the test can simultaneously act as *its input*. Let's
say we want to verify a shell script::
> cat examples/t3.sh
# @TEST-EXEC: sh %INPUT
ls /etc | grep -q passwd
> btest examples/t3.sh
examples.t3 ... ok
Here, ``btest`` is executing (something similar to) ``sh
examples/t3.sh``, and then checks the return value as usual. The
example also shows that the ``@TEST-EXEC`` prefix can appear
anywhere, in particular inside the comment section of another
language.
Now, let's say we want to check the output of a program, making sure
that it matches what we expect. For that, we first add a command
line to the test that produces the output we want to check, and then
run ``btest-diff`` to make sure it matches a previously recorded
baseline. ``btest-diff`` is itself just a script that returns
success if the output is as expected, and failure otherwise. In the
following example, we use an awk script as a fancy way to print all
file names starting with a dot in the user's home directory. We
write that list into a file called ``dots`` and then check whether
its content matches what we know from last time::
> cat examples/t4.awk
# @TEST-EXEC: ls -a $HOME | awk -f %INPUT >dots
# @TEST-EXEC: btest-diff dots
/^\.+/ { print $1 }
Note that each test gets its own little sandbox directory when run,
so by creating a file like ``dots``, you aren't cluttering up
anything.
The first time we run this test, we need to record a baseline::
> btest -U examples/t4.awk
Now, ``btest-diff`` has remembered what the ``dots`` file should
look like::
> btest examples/t4.awk
examples.t4 ... ok
> touch ~/.NEWDOTFILE
> btest examples/t4.awk
examples.t4 ... failed
1 test failed
If we want to see what exactly the unexpected change is that was
introduced to ``dots``, there's a *diff* mode for that::
> btest -d examples/t4.awk
examples.t4 ... failed
% 'btest-diff dots' failed unexpectedly (exit code 1)
% cat .diag
== File ===============================
[... current dots file ...]
== Diff ===============================
--- /Users/robin/work/binpacpp/btest/Baseline/examples.t4/dots
2010-10-28 20:11:11.000000000 -0700
+++ dots 2010-10-28 20:12:30.000000000 -0700
@@ -4,6 +4,7 @@
.CFUserTextEncoding
.DS_Store
.MacOSX
+.NEWDOTFILE
.Rhistory
.Trash
.Xauthority
=======================================
% cat .stderr
[... if any of the commands had printed something to stderr, that would follow here ...]
Once we delete the new file, we are fine again::
> rm ~/.NEWDOTFILE
> btest -d examples/t4.awk
examples.t4 ... ok
That's already the main functionality that the ``btest`` package
provides. In the following, we describe a number of further options
extending/modifying this basic approach.
Reference
=========
Command Line Usage
------------------
``btest`` must be started with a list of tests and/or directories
given on the command line. In the latter case, the default is to
recursively scan the directories and assume all files found to be
tests to perform. It is however possible to exclude certain files by
specifying a suitable `configuration file`_.
``btest`` returns exit code 0 if all tests have successfully passed,
and 1 otherwise.
``btest`` accepts the following options:
-a ALTERNATIVE, --alternative=ALTERNATIVE
Activates an alternative_ configuration defined in the
configuration file. This option can be given multiple times to
run tests with several alternatives. If ``ALTERNATIVE`` is ``-``,
that refers to running with the standard setup; this can be used
to run tests both with and without alternatives by giving both.
-b, --brief
Does not output *anything* for tests which pass. If all tests
pass, there will not be any output at all.
-c CONFIG, --config=CONFIG
Specifies an alternative `configuration file`_ to use. If not
specified, the default is to use a file called ``btest.cfg``
if found in the current directory.
-d, --diagnostics
Reports diagnostics for all failed tests. The diagnostics
include the command line that failed, its output to standard
error, and potential additional information recorded by the
command line for diagnostic purposes (see `@TEST-EXEC`_
below). In the case of ``btest-diff``, the latter is the
``diff`` between baseline and actual output.
-D, --diagnostics-all
Reports diagnostics for all tests, including those which pass.
-f DIAGFILE, --file-diagnostics=DIAGFILE
Writes diagnostics for all failed tests into the given file.
If the file already exists, it will be overwritten.
-g GROUPS, --group=GROUPS
Runs only tests assigned to the given test groups, see
`@TEST-GROUP`_. Multiple groups can be given as a
comma-separated list. Specifying ``-`` as a group name selects
all tests that do not belong to any group.
-j [THREADS], --jobs[=THREADS]
Runs up to the given number of tests in parallel. If no number
is given, BTest substitutes the number of available CPU cores
as reported by the OS.
By default, BTest assumes that all tests can be executed
concurrently without further constraints. One can however
ensure serialization of subsets by assigning them to the same
serialization set, see `@TEST-SERIALIZE`_.
-q, --quiet
Suppress information output other than about failed tests.
If all tests pass, there will not be any output at all.
-r, --rerun
Runs only tests that failed last time. After each execution
(except when updating baselines), BTest generates a state file
that records the tests that have failed. Using this option on
the next run then reads that file back in and limits execution
to those tests found in there.
-t, --tmp-keep
Does not delete any temporary files created for running the
tests (including their outputs). By default, the temporary
files for a test will be located in ``.tmp/<test>/``, where
``<test>`` is the relative path of the test file with all slashes
replaced with dots and the file extension removed (e.g., the files
for ``example/t3.sh`` will be in ``.tmp/example.t3``).
-U, --update-baseline
Records a new baseline for all ``btest-diff`` commands found
in any of the specified tests. To do this, all tests are run
as normal except that when ``btest-diff`` is executed, it
does not compute a diff but instead considers the given file
to be authoritative and records it as the version to compare
with in future runs.
-u, --update-interactive
Each time a ``btest-diff`` command fails in any tests that are
run, btest will stop and ask whether or not the user wants to
record a new baseline.
-v, --verbose
Shows all test command lines as they are executed.
-w, --wait
Interactively waits for ``<enter>`` after showing diagnostics
for a test.
-x FILE, --xml=FILE
Records test results in JUnit XML format to the given file.
If the file exists already, it is overwritten.
.. _configuration file:
Configuration
-------------
Specifics of ``btest``'s execution can be tuned with a configuration
file, which by default is ``btest.cfg`` if that's found in the
current directory. It can alternatively be specified with the
``--config`` command line option. The configuration file is
"INI-style", and an example comes with the distribution, see
``btest.cfg.example``. A configuration file has one main section,
``btest``, that defines most options; as well as an optional section
for defining `environment variables`_ and further optional sections
for defining alternatives_.
Note that all paths specified in the configuration file are relative
to ``btest``'s *base directory*. The base directory is either the
one where the configuration file is located if such is given/found,
or the current working directory if not. When setting values for
configuration options, the absolute path to the base directory is
available by using the macro ``%(testbase)s`` (the weird syntax is
due to Python's ``ConfigParser`` module).
Furthermore, all values can use standard "backtick-syntax" to
include the output of external commands (e.g., ``xyz=`echo test` ``).
Note that the backtick expansion is performed after any ``%(..)``
have already been replaced (including within the backticks).
Options
~~~~~~~
The following options can be set in the ``btest`` section of the
configuration file:
``TestDirs``
A space-separated list of directories to search for tests. If
defined, one doesn't need to specify any tests on the command
line.
``TmpDir``
A directory where to create temporary files when running tests.
By default, this is set to ``%(testbase)s/.tmp``.
``BaselineDir``
A directory where to store the baseline files for ``btest-diff``.
By default, this is set to ``%(testbase)s/Baseline``.
``IgnoreDirs``
A space-separated list of relative directory names to ignore
when scanning test directories recursively. Default is empty.
``IgnoreFiles``
A space-separated list of filename globs matching files to
ignore when scanning given test directories recursively.
Default is empty.
``StateFile``
The name of the state file to record the names of failing tests. Default is
``.btest.failed.dat``.
``Finalizer``
An executable that will be executed each time any test has
successfully run. It runs in the same directory as the test itself
and receives the name of the test as its parameter. The return
value indicates whether the test should indeed be considered
successful. By default, there's no finalizer set.
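Putting several of these options together, a minimal ``btest.cfg`` might look like the following sketch (the directory and glob values are illustrative, not defaults beyond those stated above)::

[btest]
TestDirs    = doc examples
TmpDir      = %(testbase)s/.tmp
BaselineDir = %(testbase)s/Baseline
IgnoreDirs  = .svn CVS
IgnoreFiles = *.tmp *~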
.. _environment variables:
Environment Variables
~~~~~~~~~~~~~~~~~~~~~
A special section ``environment`` defines environment variables that
will be propagated to all tests::
[environment]
CFLAGS=-O3
PATH=%(testbase)s/bin:%(default_path)s
Note how ``PATH`` can be adjusted to include local scripts: the
example above prefixes it with a local ``bin/`` directory inside the
base directory, using the predefined ``default_path`` macro to refer
to the ``PATH`` as it is set by default.
Furthermore, by setting ``PATH`` to include the ``btest``
distribution directory, one could skip the installation of the
``btest`` package.
.. _alternative:
Alternatives
~~~~~~~~~~~~
BTest can run a set of tests with different settings than it would
normally use by specifying an *alternative* configuration. Currently,
three things can be adjusted:
- Further environment variables can be set that will then be
available to all the commands that a test executes.
- *Filters* can modify an input file before a test uses it.
- *Substitutions* can modify command lines executed as part of a
test.
We discuss the three separately in the following. All of them are
defined by adding sections ``[<type>-<name>]`` where ``<type>``
corresponds to the type of adjustment being made and ``<name>`` is the
name of the alternative. Once at least one section is defined for a
name, that alternative can be enabled by BTest's ``--alternative``
flag.
Environment Variables
^^^^^^^^^^^^^^^^^^^^^
An alternative can add further environment variables by defining an
``[environment-<name>]`` section::
[environment-myalternative]
CFLAGS=-O3
Running ``btest`` with ``--alternative=myalternative`` will now make
the ``CFLAGS`` environment variable available to all commands
executed.
.. _filters:
Filters
^^^^^^^
Filters are a transparent way to adapt the input to a specific test
command before it is executed. A filter is defined by adding a section
``[filter-<name>]`` to the configuration file. This section must have
exactly one entry, and the name of that entry is interpreted as the
name of a command whose input is to be filtered. The value of that
entry is the name of a filter script that will be run with two
arguments representing input and output files, respectively. Example::
[filter-myalternative]
cat=%(testbase)s/bin/filter-cat
Once the filter is activated by running ``btest`` with
``--alternative=myalternative``, every time a ``@TEST-EXEC: cat
%INPUT`` is found, ``btest`` will first execute (something similar to)
``%(testbase)s/bin/filter-cat %INPUT out.tmp``, and then subsequently
``cat out.tmp`` (i.e., the original command but with the filtered
output). In the simplest case, the filter could be a no-op in the
form ``cp $1 $2``.
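As another sketch, here is a hypothetical filter script in Python that removes comment lines from the input before the test command sees it (``btest`` invokes the filter with the input and output file names as its two arguments)::

#!/usr/bin/env python
# A hypothetical filter script: copy <input> to <output>, dropping
# comment lines so the filtered command never sees them.
import sys

with open(sys.argv[1]) as inp, open(sys.argv[2], "w") as out:
    for line in inp:
        if not line.lstrip().startswith("#"):
            out.write(line)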
.. note::
There are a few limitations to the filter concept currently:
* Filters are *always* fed with ``%INPUT`` as their first
argument. We should add a way to filter other files as well.
* Filtered commands are only recognized if they are directly
starting the command line. For example, ``@TEST-EXEC: ls | cat
>outout`` would not trigger the example filter above.
* Filters are only executed for ``@TEST-EXEC``, not for
``@TEST-EXEC-FAIL``.
.. _substitution:
Substitutions
^^^^^^^^^^^^^^
Substitutions are similar to filters, yet they do not adapt the input
but the command line being executed. A substitution is defined by
adding a section ``[substitution-<name>]`` to the configuration file.
For each entry in this section, the entry's name specifies the
command that is to be replaced with something else given as its value.
Example::
[substitution-myalternative]
gcc=gcc -O2
Once the substitution is activated by running ``btest`` with
``--alternative=myalternative``, every time a ``@TEST-EXEC`` executes
``gcc``, that is replaced with ``gcc -O2``. The replacement is simple
string substitution so it works not only with commands but anything
found on the command line; it however only replaces full words, not
subparts of words.
Writing Tests
-------------
``btest`` scans a test file for lines containing keywords that
trigger certain functionality. Currently, the following keywords are
supported:
.. _@TEST-EXEC:
``@TEST-EXEC: <cmdline>``
Executes the given command line and aborts the test if it
returns an error code other than zero. The ``<cmdline>`` is
passed to the shell, so it can be a pipeline, use redirection,
and reference environment variables, which will be expanded.
When running a test, the current working directory for all
command lines will be set to a temporary sandbox (and will be
deleted later).
There are two macros that can be used in ``<cmdline>``:
``%INPUT`` will be replaced with the full pathname of the file defining
the test; and ``%DIR`` will be replaced with the directory where
the test file is located. The latter can be used to reference
further files also located there.
In addition to environment variables defined in the
configuration file, there are further ones that are passed into
the commands:
``TEST_DIAGNOSTICS``
A file where further diagnostic information can be saved
in case a command fails. ``--diagnostics`` will show
this file. (This is also where ``btest-diff`` stores its
diff.)
``TEST_MODE``
This is normally set to ``TEST``, but will be ``UPDATE``
if ``btest`` is run with ``--update-baseline``, or
``UPDATE_INTERACTIVE`` if run with ``--update-interactive``.
``TEST_BASELINE``
The name of a directory where the command can save permanent
information across ``btest`` runs. (This is where
``btest-diff`` stores its baseline in ``UPDATE`` mode.)
``TEST_NAME``
The name of the currently executing test.
``TEST_VERBOSE``
The path of a file where the test can record further
information about its execution that will be included with
btest's ``--verbose`` output. This is for further tracking
the execution of commands and should generally generate
output that follows a line-based structure.
.. note::
If a command returns the special exit code 100, the test is
considered failed, however subsequent test commands are still
run. ``btest-diff`` uses this special exit code to indicate that
no baseline has yet been established.
If a command returns the special exit code 200, the test is
considered failed and all further test executions are aborted.
``@TEST-EXEC-FAIL: <cmdline>``
Like ``@TEST-EXEC``, except that this expects the command to
*fail*, i.e., the test is aborted when the return code is zero.
``@TEST-REQUIRES: <cmdline>``
Defines a condition that must be met for the test to be executed.
The given command line will be run before any of the actual test
commands, and it must return success for the test to continue. If
it does not return success, the rest of the test will be skipped
but doing so will not be considered a failure of the test. This allows
writing conditional tests that may not always make sense to run, depending
on whether external constraints are satisfied or not (say, whether
a particular library is available). Multiple requirements may be
specified and then all must be met for the test to continue.
``@TEST-ALTERNATIVE: <alternative>``
Runs this test only for the given
alternative (see alternative_). If ``<alternative>`` is
``default``, the test executes when BTest runs with no alternative
given (which however is the default anyways).
``@TEST-NOT-ALTERNATIVE: <alternative>``
Ignores this test for the
given alternative (see alternative_). If ``<alternative>`` is
``default``, the test is ignored if BTest runs with no alternative
given.
``@TEST-COPY-FILE: <file>``
Copy the given file into the test's directory before the test is
run. If ``<file>`` is a relative path, it's interpreted relative
to the BTest's base directory. Environment variables in ``<file>``
will be replaced if enclosed in ``${..}``. This command can be
given multiple times.
``@TEST-START-NEXT``
This is a short-cut for defining multiple test inputs in the
same file, all executing with the same command lines. When
``@TEST-START-NEXT`` is encountered, the test file is initially
considered to end at that point, and all ``@TEST-EXEC-*`` are
run with an ``%INPUT`` truncated accordingly. Afterwards, a
*new* ``%INPUT`` is created with everything *following* the
``@TEST-START-NEXT`` marker, and the *same* commands are run
again (further ``@TEST-EXEC-*`` will be ignored). The effect is
that a single file can actually define two tests, and the
``btest`` output will enumerate them::
> cat examples/t5.sh
# @TEST-EXEC: cat %INPUT | wc -c >output
# @TEST-EXEC: btest-diff output
This is the first test input in this file.
# @TEST-START-NEXT
... and the second.
> ./btest -D examples/t5.sh
examples.t5 ... ok
% cat .diag
== File ===============================
119
[...]
examples.t5-2 ... ok
% cat .diag
== File ===============================
22
[...]
Multiple ``@TEST-START-NEXT`` can be used to create more than
two tests per file.
``@TEST-START-FILE <file>``
This is used to include an additional input file for a test
right inside the test file. All lines following the keyword will
be written into the given file (and removed from the test's
``%INPUT``) until a terminating ``@TEST-END-FILE`` is found.
Example::
> cat examples/t6.sh
# @TEST-EXEC: awk -f %INPUT <foo.dat >output
# @TEST-EXEC: btest-diff output
{ lines += 1; }
END { print lines; }
@TEST-START-FILE foo.dat
1
2
3
@TEST-END-FILE
> btest -D examples/t6.sh
examples.t6 ... ok
% cat .diag
== File ===============================
3
Multiple such files can be defined within a single test.
Note that this is only one way to use further input files.
Another is to store a file in the same directory as the test
itself, making sure it's ignored via ``IgnoreFiles``, and then
refer to it via ``%DIR/<name>``.
.. _@TEST-GROUP:
``@TEST-GROUP: <group>``
Assigns the test to a group of name ``<group>``. By using option
``-g`` one can limit execution to all tests that belong to a given
group (or a set of groups).
.. _@TEST-SERIALIZE:
``@TEST-SERIALIZE: <set>``
When using option ``-j`` to parallelize execution, all tests that
specify the same serialization set are guaranteed to run
sequentially. ``<set>`` is an arbitrary user-chosen string.
Canonifying Diffs
=================
``btest-diff`` has the capability to filter its input through an
additional script before it compares the current version with the
baseline. This can be useful if certain elements in an output are
*expected* to change (e.g., timestamps). The filter can then
remove/replace these with something consistent. To enable such
canonification, set the environment variable
``TEST_DIFF_CANONIFIER`` to a script reading the original version
from stdin and writing the canonified version to stdout. Note that
both baseline and current output are passed through the filter
before their differences are computed.
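Such a canonifier is just a stdin-to-stdout filter. Here is a sketch in Python that masks anything looking like a UNIX timestamp (the regular expression is illustrative; adapt it to your output format)::

#!/usr/bin/env python
# A hypothetical TEST_DIFF_CANONIFIER: read the original output on
# stdin, write the canonified version to stdout.
import re
import sys

for line in sys.stdin:
    sys.stdout.write(re.sub(r"\d{9,10}\.\d+", "<timestamp>", line))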
Running Processes in the Background
===================================
Sometimes processes need to be spawned in the background for a test,
in particular if multiple processes need to cooperate in some fashion.
``btest`` comes with two helper scripts to make life easier in such a
situation:
``btest-bg-run <tag> <cmdline>``
This is a script that runs ``<cmdline>`` in the background, i.e.,
it's like using ``cmdline &`` in a shell script. Test execution
continues immediately with the next command. Note that the spawned
command is *not* run in the current directory, but instead in a
newly created sub-directory called ``<tag>``. This allows
spawning multiple instances of the same process without needing to
worry about conflicting outputs. If you want to access a command's
output later, like with ``btest-diff``, use ``<tag>/foo.log`` to
access it.
``btest-bg-wait [-k] <timeout>``
This script waits for all processes previously spawned via
``btest-bg-run`` to finish. If any of them exits with a non-zero
return code, ``btest-bg-wait`` does so as well, indicating a
failed test. ``<timeout>`` is mandatory and gives the maximum
number of seconds to wait for any of the processes to terminate.
If any process hasn't done so when the timeout expires, it will be
killed and the test is considered to be failed as long as ``-k``
is not given. If ``-k`` is given, pending processes are still
killed but the test continues normally, i.e., non-termination is
not considered a failure in this case. This script also collects
the processes' stdout and stderr outputs for diagnostics output.
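As a sketch of how the two scripts work together, a test file might spawn a background process and later diff its output (the ``sleeper`` tag and the Python body are made up for illustration)::

# @TEST-EXEC: btest-bg-run sleeper python %INPUT
# @TEST-EXEC: btest-bg-wait 10
# @TEST-EXEC: btest-diff sleeper/out.log

# This file doubles as the spawned program. It runs inside the newly
# created sub-directory "sleeper", so the file it writes is accessible
# to later commands as sleeper/out.log.
with open("out.log", "w") as f:
    f.write("background work done\n")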
Integration with Sphinx
=======================
``btest`` comes with a new directive for the documentation framework
`Sphinx <http://sphinx.pocoo.org>`_. The directive allows you to write a
test directly inside a Sphinx document and then to include output
from the test's commands in the generated documentation. The same
tests can also run externally and will catch if any changes to the
included content occur. The following walks through setting this up.
Configuration
-------------
First, you need to tell Sphinx a base directory for the ``btest``
configuration as well as a directory inside it where the tests extracted
from the Sphinx documentation will be stored. Typically, you'd just
create a new subdirectory ``tests`` in the Sphinx project for the
``btest`` setup and then store the tests in there in, e.g.,
``doc/``::
cd <sphinx-root>
mkdir tests
mkdir tests/doc
Then add the following to your Sphinx ``conf.py``::
extensions += ["btest-sphinx"]
btest_base="tests" # Relative to Sphinx-root.
btest_tests="doc" # Relative to btest_base.
Next, add a finalizer to ``btest.cfg``::
[btest]
...
Finalizer=btest-diff-rst
Finally, create a ``btest.cfg`` in ``tests/`` as usual and add
``doc/`` to the ``TestDirs`` option.
Including a Test into a Sphinx Document
---------------------------------------
The ``btest`` extension provides a new directive to include a test
inside a Sphinx document::
.. btest:: <test-name>
<test content>
Here, ``<test-name>`` is a custom name for the test; it will be
stored in ``btest_tests`` under that name. ``<test content>`` is just
a standard test as you would normally put into one of the
``TestDirs``. Example::
.. btest:: just-a-test
@TEST-EXEC: expr 2 + 2
When you now run Sphinx, it will (1) store the test content into
``tests/doc/just-a-test`` (assuming the above path layout), and (2)
execute the test by running ``btest`` on it. You can then run
``btest`` manually in ``tests/`` as well and it will execute the test
just as it would in a standard setup. If a test fails when Sphinx runs
it, there will be a corresponding error, and the diagnostic output will
be included in the document.
By default, nothing else will be included into the generated
documentation, i.e., the above test will just turn into an empty text
block. However, ``btest`` comes with a set of scripts that you can use
to specify content to be included. As a simple example,
``btest-rst-cmd <cmdline>`` will execute a command and (if it
succeeds) include both the command line and the standard output into
the documentation. Example::
.. btest:: another-test
@TEST-EXEC: btest-rst-cmd echo Hello, world!
When running Sphinx, this will render as:
.. code::
# echo Hello, world!
Hello, world!
When running ``btest`` manually in ``tests/``, the ``Finalizer`` we
added to ``btest.cfg`` (see above) compares the generated reST code
with a previously established baseline, just like ``btest-diff`` does
with files. To establish the initial baseline, run ``btest -U``, like
you would with ``btest-diff``.
Scripts
-------
The following Sphinx support scripts come with ``btest``:
``btest-rst-cmd [options] <cmdline>``
By default, this executes ``<cmdline>`` and includes both the
command line itself and its standard output into the generated
documentation. See above for an example.
This script provides the following options:
-c ALTERNATIVE_CMDLINE
Show ``ALTERNATIVE_CMDLINE`` in the generated
documentation instead of the one actually executed. (It
still runs the ``<cmdline>`` given outside the option.)
-d
Do not actually execute ``<cmdline>``; just format it for
the generated documentation and include no further output.
-f FILTER_CMD
Pipe the command line's output through ``FILTER_CMD``
before including. If ``-r`` is given, it filters the
file's content instead of stdout.
-o
Do not include the executed command into the generated
documentation, just its output.
-r FILE
Insert ``FILE`` into output instead of stdout.
``btest-rst-include <file>``
Includes ``<file>`` inside a code block.
``btest-rst-pipe <cmdline>``
Executes ``<cmdline>``, includes its standard output inside a code
block. Note that this script does not include the command line
itself into the code block, just the output.
.. note::
All these scripts can be run directly from the command line to show
the reST code they generate.
.. note::
``btest-rst-cmd`` can do everything the other scripts provide if
you give it the right options. In fact, the other scripts are
provided just for convenience and leverage ``btest-rst-cmd``
internally.
License
=======
btest is open-source under a BSD license.


@ -0,0 +1,107 @@
.. -*- mode: rst-mode -*-
..
.. Version number is filled in automatically.
.. |version| replace:: 0.18
===============================================
capstats - A tool to get some NIC statistics.
===============================================
.. rst-class:: opening
capstats is a small tool to collect statistics on the
current load of a network interface, using either `libpcap
<http://www.tcpdump.org>`_ or the native DAG API for `Endace
<http://www.endace.com>`_ cards. It reports statistics per time interval
and/or for the tool's total run-time.
Download
--------
You can find the latest capstats release for download at
http://www.bro.org/download.
Capstats's git repository is located at `git://git.bro.org/capstats.git
<git://git.bro.org/capstats.git>`__. You can browse the repository
`here <http://git.bro.org/capstats.git>`__.
This document describes capstats |version|. See the ``CHANGES``
file for version history.
Output
------
Here's an example of output in one-second intervals until
``CTRL-C`` is hit:
.. console::
>capstats -i nve0 -I 1
1186620936.890567 pkts=12747 kpps=12.6 kbytes=10807 mbps=87.5 nic_pkts=12822 nic_drops=0 u=960 t=11705 i=58 o=24 nonip=0
1186620937.901490 pkts=13558 kpps=13.4 kbytes=11329 mbps=91.8 nic_pkts=13613 nic_drops=0 u=1795 t=24339 i=119 o=52 nonip=0
1186620938.912399 pkts=14771 kpps=14.6 kbytes=13659 mbps=110.7 nic_pkts=14781 nic_drops=0 u=2626 t=38154 i=185 o=111 nonip=0
1186620939.012446 pkts=1332 kpps=13.3 kbytes=1129 mbps=92.6 nic_pkts=1367 nic_drops=0 u=2715 t=39387 i=194 o=112 nonip=0
=== Total
1186620939.012483 pkts=42408 kpps=13.5 kbytes=36925 mbps=96.5 nic_pkts=1 nic_drops=0 u=2715 t=39387 i=194 o=112 nonip=0
Each line starts with a timestamp and the other fields are:
:pkts:
Absolute number of packets seen by ``capstats`` during interval.
:kpps:
Number of packets per second, in thousands.
:kbytes:
Absolute number of KBytes during interval.
:mbps:
Mbits/sec.
:nic_pkts:
Number of packets as reported by ``libpcap``'s ``pcap_stats()`` (may not match ``pkts``).
:nic_drops:
Number of packet drops as reported by ``libpcap``'s ``pcap_stats()``.
:u:
Number of UDP packets.
:t:
Number of TCP packets.
:i:
Number of ICMP packets.
:o:
Number of other IP packets.
:nonip:
Number of non-IP packets.
Options
-------
A list of all options::
capstats [Options] -i interface
-i| --interface <interface> Listen on interface
-d| --dag Use native DAG API
-f| --filter <filter> BPF filter
-I| --interval <secs> Stats logging interval
-l| --syslog Use syslog rather than print to stderr
-n| --number <count> Stop after outputting <number> intervals
-N| --select Use select() for live pcap (for testing only)
-p| --payload <n> Verifies that packets' payloads consist
entirely of bytes of the given value.
-q| --quiet <count> Suppress output, exit code indicates >= count
packets received.
-S| --size <size> Verify packets to have given <size>
-s| --snaplen <size> Use pcap snaplen <size>
-v| --version Print version and exit
-w| --write <filename> Write packets to file
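As a sketch of how the ``-q`` option described above might be used, a
monitoring script could check whether an interface is seeing traffic at
all (this assumes a zero exit status signals that at least ``<count>``
packets arrived):

.. console::

    > capstats -i eth0 -I 5 -n 1 -q 100 && echo "interface is receiving traffic"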
Installation
------------
``capstats`` has been tested on Linux, FreeBSD, and MacOS. Please see
the ``INSTALL`` file for installation instructions.
28 doc/components/index.rst Normal file
@@ -0,0 +1,28 @@
=====================
Additional Components
=====================
The following are snapshots of documentation for components that come
with this version of Bro (|version|). Since they can also be used
independently, see the `download page
<http://bro-ids.org/download/index.html>`_ for documentation of any
current, independent component releases.
.. toctree::
:maxdepth: 1
BinPAC - A protocol parser generator <binpac/README>
Broccoli - The Bro Client Communication Library (README) <broccoli/README>
Broccoli - User Manual <broccoli/broccoli-manual>
Broccoli Python Bindings <broccoli-python/README>
Broccoli Ruby Bindings <broccoli-ruby/README>
BroControl - Interactive Bro management shell <broctl/README>
Bro-Aux - Small auxiliary tools for Bro <bro-aux/README>
BTest - A unit testing framework <btest/README>
Capstats - Command-line packet statistic tool <capstats/README>
PySubnetTree - Python module for CIDR lookups <pysubnettree/README>
trace-summary - Script for generating break-downs of network traffic <trace-summary/README>
The `Broccoli API Reference <broccoli-api/index.html>`_ may also be of
interest.
@@ -0,0 +1,98 @@
.. -*- mode: rst-mode -*-
..
.. Version number is filled in automatically.
.. |version| replace:: 0.19-9
===============================================
PySubnetTree - A Python Module for CIDR Lookups
===============================================
.. rst-class:: opening
The PySubnetTree package provides a Python data structure
``SubnetTree`` which maps subnets given in `CIDR
<http://tools.ietf.org/html/rfc4632>`_ notation (incl.
corresponding IPv6 versions) to Python objects. Lookups are
performed by longest-prefix matching.
Download
--------
You can find the latest PySubnetTree release for download at
http://www.bro.org/download.
PySubnetTree's git repository is located at `git://git.bro.org/pysubnettree.git
<git://git.bro.org/pysubnettree.git>`__. You can browse the repository
`here <http://git.bro.org/pysubnettree.git>`__.
This document describes PySubnetTree |version|. See the ``CHANGES``
file for version history.
Example
-------
A simple example which associates CIDR prefixes with strings::
>>> import SubnetTree
>>> t = SubnetTree.SubnetTree()
>>> t["10.1.0.0/16"] = "Network 1"
>>> t["10.1.42.0/24"] = "Network 1, Subnet 42"
>>> t["10.2.0.0/16"] = "Network 2"
>>> print t["10.1.42.1"]
Network 1, Subnet 42
>>> print t["10.1.43.1"]
Network 1
>>> print "10.1.42.1" in t
True
>>> print "10.1.43.1" in t
True
>>> print "10.20.1.1" in t
False
>>> print t["10.20.1.1"]
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "SubnetTree.py", line 67, in __getitem__
def __getitem__(*args): return _SubnetTree.SubnetTree___getitem__(*args)
KeyError: '10.20.1.1'
By default, CIDR prefixes and IP addresses are given as strings.
Alternatively, a ``SubnetTree`` object can be switched into *binary
mode*, in which single addresses are passed in the form of packed
binary strings as, e.g., returned by `socket.inet_aton
<http://docs.python.org/lib/module-socket.html#l2h-3657>`_::
>>> t.get_binary_lookup_mode()
False
>>> t.set_binary_lookup_mode(True)
>>> t.get_binary_lookup_mode()
True
>>> import socket
>>> print t[socket.inet_aton("10.1.42.1")]
Network 1, Subnet 42
A ``SubnetTree`` also provides the methods ``insert(prefix, object=None)`` for
insertion of prefixes (``object`` can be skipped to use the tree like a set),
and ``remove(prefix)`` for removing entries (``remove`` performs an *exact*
match rather than longest-prefix).
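A short sketch of this set-style usage on a fresh tree in the default
string mode::

    >>> t2 = SubnetTree.SubnetTree()
    >>> t2.insert("10.3.0.0/16")
    >>> print "10.3.1.1" in t2
    True
    >>> t2.remove("10.3.0.0/16")
    >>> print "10.3.1.1" in t2
    False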
Internally, the CIDR prefixes of a ``SubnetTree`` are managed by a
Patricia tree data structure and lookups are therefore efficient
even with a large number of prefixes.
PySubnetTree comes with a BSD license.
Prerequisites
-------------
This package requires Python 2.4 or newer.
Installation
------------
Installation is pretty simple::
> python setup.py install
@@ -0,0 +1,154 @@
.. -*- mode: rst-mode -*-
..
.. Version number is filled in automatically.
.. |version| replace:: 0.8
====================================================
trace-summary - Generating network traffic summaries
====================================================
.. rst-class:: opening
``trace-summary`` is a Python script that generates break-downs of
network traffic, including lists of the top hosts, protocols,
ports, etc. Optionally, it can generate output separately for
incoming vs. outgoing traffic, per subnet, and per time-interval.
Download
--------
You can find the latest trace-summary release for download at
http://www.bro.org/download.
trace-summary's git repository is located at `git://git.bro.org/trace-summary.git
<git://git.bro.org/trace-summary.git>`__. You can browse the repository
`here <http://git.bro.org/trace-summary.git>`__.
This document describes trace-summary |version|. See the ``CHANGES``
file for version history.
Overview
--------
The ``trace-summary`` script reads both packet traces in `libpcap
<http://www.tcpdump.org>`_ format and connection logs produced by the
`Bro <http://www.bro.org>`_ network intrusion detection system
(for the latter, it supports both 1.x and 2.x output formats).
Here are two example outputs in the most basic form (note that IP
addresses are 'anonymized'). The first is from a packet trace and the
second from a Bro connection log::
=== Total === 2005-01-06-14-23-33 - 2005-01-06-15-23-43
- Bytes 918.3m - Payload 846.3m - Pkts 1.8m - Frags 0.9% - MBit/s 1.9 -
Ports | Sources | Destinations | Protocols |
80 33.8% | 131.243.89.214 8.5% | 131.243.89.214 7.7% | 6 76.0% |
22 16.7% | 128.3.2.102 6.2% | 128.3.2.102 5.4% | 17 23.3% |
11001 12.4% | 204.116.120.26 4.8% | 131.243.89.4 4.8% | 1 0.5% |
2049 10.7% | 128.3.161.32 3.6% | 131.243.88.227 3.6% | |
1023 10.6% | 131.243.89.4 3.5% | 204.116.120.26 3.4% | |
993 8.2% | 128.3.164.194 2.7% | 131.243.89.64 3.1% | |
1049 8.1% | 128.3.164.15 2.4% | 128.3.164.229 2.9% | |
524 6.6% | 128.55.82.146 2.4% | 131.243.89.155 2.5% | |
33305 4.5% | 131.243.88.227 2.3% | 128.3.161.32 2.3% | |
1085 3.7% | 131.243.89.155 2.3% | 128.55.82.146 2.1% | |
=== Total === 2005-01-06-14-23-33 - 2005-01-06-15-23-42
- Connections 43.4k - Payload 398.4m -
Ports | Sources | Destinations | Services | Protocols | States |
80 21.7% | 207.240.215.71 3.0% | 239.255.255.253 8.0% | other 51.0% | 17 55.8% | S0 46.2% |
427 13.0% | 131.243.91.71 2.2% | 131.243.91.255 4.0% | http 21.7% | 6 36.4% | SF 30.1% |
443 3.8% | 128.3.161.76 1.7% | 131.243.89.138 2.1% | i-echo 7.3% | 1 7.7% | OTH 7.8% |
138 3.7% | 131.243.90.138 1.6% | 255.255.255.255 1.7% | https 3.8% | | RSTO 5.8% |
515 2.4% | 131.243.88.159 1.6% | 128.3.97.204 1.5% | nb-dgm 3.7% | | SHR 4.4% |
11001 2.3% | 131.243.88.202 1.4% | 131.243.88.107 1.1% | printer 2.4% | | REJ 3.0% |
53 1.9% | 131.243.89.250 1.4% | 117.72.94.10 1.1% | dns 1.9% | | S1 1.0% |
161 1.6% | 131.243.89.80 1.3% | 131.243.88.64 1.1% | snmp 1.6% | | RSTR 0.9% |
137 1.4% | 131.243.90.52 1.3% | 131.243.88.159 1.1% | nb-ns 1.4% | | SH 0.3% |
2222 1.1% | 128.3.161.252 1.2% | 131.243.91.92 1.1% | ntp 1.0% | | RSTRH 0.2% |
Prerequisites
-------------
* This script requires Python 2.4 or newer.
* The `pysubnettree
<http://www.bro.org/documentation/pysubnettree.html>`_ Python
module.
* Eddie Kohler's `ipsumdump <http://www.cs.ucla.edu/~kohler/ipsumdump>`_
if using ``trace-summary`` with packet traces (versus Bro connection logs).
Installation
------------
Simply copy the script into some directory which is in your ``PATH``.
Usage
-----
The general usage is::
trace-summary [options] [input-file]
By default, it assumes the ``input-file`` to be a ``libpcap`` trace
file. If it is a Bro connection log, use ``-c``. If ``input-file`` is
not given, the script reads from stdin. It writes its output to
stdout.
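For example, to summarize a Bro connection log with percentages counted
in bytes, one would combine the ``-c`` and ``-b`` options described
below::

    trace-summary -c -b conn.log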
Options
~~~~~~~
The most important options are summarized
below. Run ``trace-summary --help`` to see the full list including
some more esoteric ones.
:-c:
Input is a Bro connection log instead of a ``libpcap`` trace
file.
:-b:
Counts all percentages in bytes rather than number of
packets/connections.
:-E <file>:
Gives a file which contains a list of networks to ignore for the
analysis. The file must contain one network per line, where each
network is of the CIDR form ``a.b.c.d/mask`` (including the
corresponding syntax for IPv6 prefixes, e.g., ``1:2:3:4::/64``).
Empty lines and lines starting with a "#" are ignored.
:-i <duration>:
Creates totals for each time interval of the given length
(default is seconds; add "``m``" for minutes and "``h``" for
hours). Use ``-v`` if you also want to see the breakdowns for
each interval.
:-l <file>:
Generates separate summaries for incoming and outgoing traffic.
``<file>`` is a file which contains a list of networks to be
considered local. Format as for ``-E``.
:-n <n>:
Shows the top <n> entries in each break-down. The default is 10.
:-r:
Resolves hostnames in the output.
:-s <n>:
Gives the sample factor if the input has been sampled.
:-S <n>:
Samples input with the given factor; less accurate, but faster and
saves memory.
:-m:
Skips memory-expensive statistics.
:-v:
Generates full break-downs for each time interval. Requires
``-i``.
16 doc/frameworks/index.rst Normal file
@@ -0,0 +1,16 @@
==========
Frameworks
==========
.. toctree::
:maxdepth: 1
notice
logging
input
intel
cluster
signatures
geoip
408 doc/frameworks/input.rst Normal file
@@ -0,0 +1,408 @@
===============
Input Framework
===============
.. rst-class:: opening
Bro now features a flexible input framework that allows users
to import data into Bro. Data is either read into Bro tables or
converted to events which can then be handled by scripts.
This document gives an overview of how to use the input framework
with some examples. For more complex scenarios it is
worthwhile to take a look at the unit tests in
``testing/btest/scripts/base/frameworks/input/``.
.. contents::
Reading Data into Tables
========================
Probably the most interesting use-case of the input framework is to
read data into a Bro table.
By default, the input framework reads the data in the same format
as it is written by the logging framework in Bro - a tab-separated
ASCII file.
We will show the ways to read files into Bro with a simple example.
For this example we assume that we want to import data from a blacklist
that contains server IP addresses as well as the timestamp and the reason
for the block.
An example input file could look like this:
::
#fields ip timestamp reason
192.168.17.1 1333252748 Malware host
192.168.27.2 1330235733 Botnet server
192.168.250.3 1333145108 Virus detected
To read a file into a Bro table, two record types have to be defined.
One contains the types and names of the columns that should constitute the
table keys and the second contains the types and names of the columns that
should constitute the table values.
In our case, we want to be able to look up IPs. Hence, our key record
only contains the server IP. All other elements should be stored as
the table content.
The two records are defined as:
.. code:: bro
type Idx: record {
ip: addr;
};
type Val: record {
timestamp: time;
reason: string;
};
Note that the names of the fields in the record definitions have to correspond
to the column names listed in the '#fields' line of the log file, in this
case 'ip', 'timestamp', and 'reason'.
The log file is read into the table with a simple call of the ``add_table``
function:
.. code:: bro
global blacklist: table[addr] of Val = table();
Input::add_table([$source="blacklist.file", $name="blacklist", $idx=Idx, $val=Val, $destination=blacklist]);
Input::remove("blacklist");
With these three lines we first create an empty table that should contain the
blacklist data and then instruct the input framework to open an input stream
named ``blacklist`` to read the data into the table. The third line removes the
input stream again, because we do not need it any more after the data has been
read.
Because some data files can potentially be rather big, the input framework
works asynchronously. A new thread is created for each new input stream.
This thread opens the input data file, converts the data into a Bro format and
sends it back to the main Bro thread.
Because of this, the data is not immediately accessible. Depending on the
size of the data source it might take from a few milliseconds up to a few
seconds until all data is present in the table. Please note that this means
that when Bro is running without an input source or on very short captured
files, it might terminate before the data is present in the system (because
Bro already handled all packets before the import thread finished).
Subsequent calls to an input source are queued until the previous action has
been completed. Because of this, it is, for example, possible to call
``add_table`` and ``remove`` in two subsequent lines: the ``remove`` action
will remain queued until the first read has been completed.
Once the input framework finishes reading from a data source, it fires
the ``end_of_data`` event. When this event has been received, all data
from the input file is available in the table.
.. code:: bro
event Input::end_of_data(name: string, source: string) {
# now all data is in the table
print blacklist;
}
The table can also be used while the data is still being read; it
just might not contain all lines of the input file before the event has
fired. After it has been populated it can be used like any other Bro
table and blacklist entries can easily be tested:
.. code:: bro
if ( 192.168.18.12 in blacklist )
# take action
Re-reading and streaming data
-----------------------------
For many data sources, like for many blacklists, the source data is continually
changing. For these cases, the Bro input framework supports several ways to
deal with changing data files.
The first, very basic method is an explicit refresh of an input stream. When
an input stream is open, the function ``force_update`` can be called. This
will trigger a complete refresh of the table; any changed elements from the
file will be updated. After the update is finished the ``end_of_data``
event will be raised.
In our example the call would look like:
.. code:: bro
Input::force_update("blacklist");
The input framework also supports two automatic refresh modes. The first mode
continually checks if a file has been changed. If the file has been changed, it
is re-read and the data in the Bro table is updated to reflect the current
state. Each time a change has been detected and all the new data has been
read into the table, the ``end_of_data`` event is raised.
The second mode is a streaming mode. This mode assumes that the source data
file is an append-only file to which new data is continually appended. Bro
continually checks for new data at the end of the file and will add the new
data to the table. If newer lines in the file have the same index as previous
lines, they will overwrite the values in the output table. Because of the
nature of streaming reads (data is continually added to the table),
the ``end_of_data`` event is never raised when using streaming reads.
The reading mode can be selected by setting the ``mode`` option of the
add_table call. Valid values are ``MANUAL`` (the default), ``REREAD``
and ``STREAM``.
Hence, when adding ``$mode=Input::REREAD`` to the previous example, the
blacklist table will always reflect the state of the blacklist input file.
.. code:: bro
Input::add_table([$source="blacklist.file", $name="blacklist", $idx=Idx, $val=Val, $destination=blacklist, $mode=Input::REREAD]);
Receiving change events
-----------------------
When re-reading files, it might be interesting to know exactly which lines in
the source files have changed.
For this reason, the input framework can raise an event each time a data
item is added to, removed from or changed in a table.
The event definition looks like this:
.. code:: bro
event entry(description: Input::TableDescription, tpe: Input::Event, left: Idx, right: Val) {
# act on values
}
The event has to be specified in ``$ev`` in the ``add_table`` call:
.. code:: bro
Input::add_table([$source="blacklist.file", $name="blacklist", $idx=Idx, $val=Val, $destination=blacklist, $mode=Input::REREAD, $ev=entry]);
The ``description`` field of the event contains the arguments that were
originally supplied to the add_table call. Hence, the name of the stream can,
for example, be accessed with ``description$name``. ``tpe`` is an enum
containing the type of the change that occurred.
If a line that was not previously present in the table has been added,
then ``tpe`` will contain ``Input::EVENT_NEW``. In this case ``left`` contains
the index of the added table entry and ``right`` contains the values of the
added entry.
If a table entry that already was present is altered during the re-reading or
streaming read of a file, ``tpe`` will contain ``Input::EVENT_CHANGED``. In
this case ``left`` contains the index of the changed table entry and ``right``
contains the values of the entry before the change. The reason for this is
that the table has already been updated when the event is raised, so the
new value can be obtained by looking it up in the table. Hence it is
possible to compare the old and the new values.
If a table element is removed because it was no longer present during a
re-read, then ``tpe`` will contain ``Input::EVENT_REMOVED``. In this case ``left``
contains the index and ``right`` the values of the removed element.
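Putting the pieces together, a minimal change handler could look like
the following sketch (the messages printed are just illustrative):

.. code:: bro

    event entry(description: Input::TableDescription, tpe: Input::Event, left: Idx, right: Val) {
        if ( tpe == Input::EVENT_NEW )
            print fmt("new blacklist entry %s (%s)", left$ip, right$reason);
        else if ( tpe == Input::EVENT_CHANGED )
            print fmt("blacklist entry %s changed", left$ip);
        else if ( tpe == Input::EVENT_REMOVED )
            print fmt("blacklist entry %s removed", left$ip);
    }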
Filtering data during import
----------------------------
The input framework also allows a user to filter the data during the import.
To this end, predicate functions are used. A predicate function is called
before a new element is added/changed/removed from a table. The predicate
can either accept or veto the change by returning true for an accepted
change and false for a rejected change. Furthermore, it can alter the data
before it is written to the table.
The following example predicate will reject adding entries to the table
if they were generated more than a month ago. It will accept all changes
and all removals of values that are already present in the table.
.. code:: bro
Input::add_table([$source="blacklist.file", $name="blacklist", $idx=Idx, $val=Val, $destination=blacklist, $mode=Input::REREAD,
$pred(typ: Input::Event, left: Idx, right: Val) = {
if ( typ != Input::EVENT_NEW ) {
return T;
}
return ( ( current_time() - right$timestamp ) < (30 day) );
}]);
To change elements while they are being imported, the predicate function can
manipulate ``left`` and ``right``. Note that predicate functions are called
before the change is committed to the table. Hence, when a table element is
changed (``tpe`` is ``Input::EVENT_CHANGED``), ``left`` and ``right``
contain the new values, but the destination (``blacklist`` in our example)
still contains the old values. This allows predicate functions to examine
the changes between the old and the new version before deciding if they
should be allowed.
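For example, the following sketch normalizes the free-text ``reason``
field to lowercase before each entry is committed to the table
(``to_lower`` is a standard Bro string function):

.. code:: bro

    Input::add_table([$source="blacklist.file", $name="blacklist", $idx=Idx, $val=Val, $destination=blacklist, $mode=Input::REREAD,
        $pred(typ: Input::Event, left: Idx, right: Val) = {
            # Alter the value in place; the modified version is what
            # ends up in the table.
            right$reason = to_lower(right$reason);
            return T;
        }]);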
Different readers
-----------------
The input framework supports different kinds of readers for different kinds
of source data files. The default reader reads ASCII files
formatted in the Bro log file format (tab-separated values). At the moment,
Bro comes with two other readers. The ``RAW`` reader reads a file that is
split by a specified record separator (usually newline). The contents are
returned line-by-line as strings; it can, for example, be used to read
configuration files and the like and is probably
only useful in the event mode and not for reading data to tables.
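As a sketch of the ``RAW`` reader in event mode, assuming it is selected
via ``Input::READER_RAW`` and that each line arrives in a single string
field:

.. code:: bro

    type OneLine: record {
        s: string;
    };

    event config_line(description: Input::EventDescription, tpe: Input::Event, s: string) {
        print fmt("read configuration line: %s", s);
    }

    event bro_init() {
        Input::add_event([$source="config.file", $name="config", $reader=Input::READER_RAW, $fields=OneLine, $ev=config_line]);
    }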
Another included reader is the ``BENCHMARK`` reader, which is being used
to optimize the speed of the input framework. It can generate arbitrary
amounts of semi-random data in all Bro data types supported by the input
framework.
In the future, the input framework will get support for new data sources
like, for example, different databases.
Add_table options
-----------------
This section lists all possible options that can be used for the add_table
function and gives a short explanation of their use. Most of the options
have already been discussed in the previous sections.
The possible fields that can be set for a table stream are:
``source``
A mandatory string identifying the source of the data.
For the ASCII reader this is the filename.
``name``
A mandatory name for the filter that can later be used
to manipulate it further.
``idx``
Record type that defines the index of the table.
``val``
Record type that defines the values of the table.
``reader``
The reader used for this stream. Default is ``READER_ASCII``.
``mode``
The mode in which the stream is opened. Possible values are
``MANUAL``, ``REREAD`` and ``STREAM``. Default is ``MANUAL``.
``MANUAL`` means that the file is not updated after it has
been read. Changes to the file will not be reflected in the
data Bro knows. ``REREAD`` means that the whole file is read
again each time a change is found. This should be used for
files that are mapped to a table where individual lines can
change. ``STREAM`` means that the data from the file is
streamed. Events / table entries will be generated as new
data is appended to the file.
``destination``
The destination table.
``ev``
Optional event that is raised when values are added to,
changed in, or deleted from the table. Events are passed the
stream description record as the first argument, the type of
change (an ``Input::Event`` value) as the second argument, the
index record as the third argument, and the values as the
fourth argument.
``pred``
Optional predicate that can prevent entries from being added
to the table and events from being sent.
``want_record``
Boolean value that defines whether the event wants to receive the
fields inside of a single record value or individually
(default). This can be used if ``val`` is a record
containing only one type. In this case, if ``want_record`` is
set to false, the table will contain elements of the type
contained in ``val``.
Reading Data to Events
======================
The second supported mode of the input framework is reading data into Bro
events, instead of into a table, using event streams.
Event streams work very similarly to table streams that were already
discussed in much detail. To read the blacklist of the previous example
into an event stream, the following Bro code could be used:
.. code:: bro
type Val: record {
ip: addr;
timestamp: time;
reason: string;
};
event blacklistentry(description: Input::EventDescription, tpe: Input::Event, ip: addr, timestamp: time, reason: string) {
# work with event data
}
event bro_init() {
Input::add_event([$source="blacklist.file", $name="blacklist", $fields=Val, $ev=blacklistentry]);
}
The main difference in the declaration of the event stream is that an event
stream needs no separate index and value declarations; instead, all source
data types are provided in a single record definition.
Apart from this, event streams work exactly the same as table streams and
support most of the options that are also supported for table streams.
The options that can be set when creating an event stream with
``add_event`` are:
``source``
A mandatory string identifying the source of the data.
For the ASCII reader this is the filename.
``name``
A mandatory name for the stream that can later be used
to remove it.
``fields``
Name of a record type containing the fields, which should be
retrieved from the input stream.
``ev``
The event which is fired after a line has been read from the
input source. The event is passed an ``Input::EventDescription``
record as the first argument and an ``Input::Event`` value as
the second, followed by the data, either inside of a record
(if ``want_record`` is set) or as individual fields. The
``Input::Event`` value indicates whether the received line is
``NEW``, has been ``CHANGED`` or ``DELETED``. Since the ASCII
reader cannot track this information for event streams, the
value is always ``NEW`` at the moment.
``mode``
The mode in which the stream is opened. Possible values are
``MANUAL``, ``REREAD`` and ``STREAM``. Default is ``MANUAL``.
``MANUAL`` means that the file is not updated after it has
been read. Changes to the file will not be reflected in the
data Bro knows. ``REREAD`` means that the whole file is read
again each time a change is found. This should be used for
files that are mapped to a table where individual lines can
change. ``STREAM`` means that the data from the file is
streamed. Events / table entries will be generated as new
data is appended to the file.
``reader``
The reader used for this stream. Default is ``READER_ASCII``.
``want_record``
Boolean value that defines whether the event wants to receive the
fields inside of a single record value or individually
(default). If this is set to true, the event will receive a
single record of the type provided in ``fields``.
@@ -1,5 +1,7 @@
======================
Intelligence Framework
======================

Intro
-----
@@ -0,0 +1,186 @@
=============================
Binary Output with DataSeries
=============================
.. rst-class:: opening
Bro's default ASCII log format is not exactly the most efficient
way for storing and searching large volumes of data. As an
alternative, Bro comes with experimental support for `DataSeries
<http://www.hpl.hp.com/techreports/2009/HPL-2009-323.html>`_
output, an efficient binary format for recording structured bulk
data. DataSeries is developed and maintained at HP Labs.
.. contents::
Installing DataSeries
---------------------
To use DataSeries, its libraries must be available at compile-time,
along with the supporting *Lintel* package. Generally, both are
distributed on `HP Labs' web site
<http://tesla.hpl.hp.com/opensource/>`_. Currently, however, you need
to use recent development versions for both packages, which you can
download from github like this::
git clone http://github.com/dataseries/Lintel
git clone http://github.com/dataseries/DataSeries
To build and install the two into ``<prefix>``, do::
( cd Lintel && mkdir build && cd build && cmake -DCMAKE_INSTALL_PREFIX=<prefix> .. && make && make install )
( cd DataSeries && mkdir build && cd build && cmake -DCMAKE_INSTALL_PREFIX=<prefix> .. && make && make install )
Please refer to the packages' documentation for more information about
the installation process. In particular, there's more information on
required and optional `dependencies for Lintel
<https://raw.github.com/dataseries/Lintel/master/doc/dependencies.txt>`_
and `dependencies for DataSeries
<https://raw.github.com/dataseries/DataSeries/master/doc/dependencies.txt>`_.
For users on RedHat-style systems, you'll need the following::
yum install libxml2-devel boost-devel
Compiling Bro with DataSeries Support
-------------------------------------
Once you have installed DataSeries, Bro's ``configure`` should pick it
up automatically as long as it finds it in a standard system location.
Alternatively, you can specify the DataSeries installation prefix
manually with ``--with-dataseries=<prefix>``. Keep an eye on
``configure``'s summary output: if it looks like the following, Bro
found DataSeries and will compile in the support::
# ./configure --with-dataseries=/usr/local
[...]
====================| Bro Build Summary |=====================
[...]
DataSeries: true
[...]
================================================================
Activating DataSeries
---------------------
The direct way to use DataSeries is to switch *all* log files over to
the binary format. To do that, just add ``redef
Log::default_writer=Log::WRITER_DATASERIES;`` to your ``local.bro``.
For testing, you can also just pass that on the command line::
bro -r trace.pcap Log::default_writer=Log::WRITER_DATASERIES
With that, Bro will now write all its output into DataSeries files
``*.ds``. You can inspect these using DataSeries's set of command line
tools, which its installation process installs into ``<prefix>/bin``.
For example, to convert a file back into an ASCII representation::
$ ds2txt conn.ds
[... We skip a bunch of metadata here ...]
ts uid id.orig_h id.orig_p id.resp_h id.resp_p proto service duration orig_bytes resp_bytes conn_state local_orig missed_bytes history orig_pkts orig_ip_bytes resp_pkts resp_ip_bytes
1300475167.096535 CRCC5OdDlXe 141.142.220.202 5353 224.0.0.251 5353 udp dns 0.000000 0 0 S0 F 0 D 1 73 0 0
1300475167.097012 o7XBsfvo3U1 fe80::217:f2ff:fed7:cf65 5353 ff02::fb 5353 udp 0.000000 0 0 S0 F 0 D 1 199 0 0
1300475167.099816 pXPi1kPMgxb 141.142.220.50 5353 224.0.0.251 5353 udp 0.000000 0 0 S0 F 0 D 1 179 0 0
1300475168.853899 R7sOc16woCj 141.142.220.118 43927 141.142.2.2 53 udp dns 0.000435 38 89 SF F 0 Dd 1 66 1 117
1300475168.854378 Z6dfHVmt0X7 141.142.220.118 37676 141.142.2.2 53 udp dns 0.000420 52 99 SF F 0 Dd 1 80 1 127
1300475168.854837 k6T92WxgNAh 141.142.220.118 40526 141.142.2.2 53 udp dns 0.000392 38 183 SF F 0 Dd 1 66 1 211
[...]
(``--skip-all`` suppresses the metadata.)
Note that the ASCII conversion is *not* equivalent to Bro's default
output format.
You can also switch only individual files over to DataSeries by adding
code like this to your ``local.bro``:
.. code:: bro
event bro_init()
{
local f = Log::get_filter(Conn::LOG, "default"); # Get default filter for connection log.
f$writer = Log::WRITER_DATASERIES; # Change writer type.
Log::add_filter(Conn::LOG, f); # Replace filter with adapted version.
}
Bro's DataSeries writer comes with a few tuning options, see
:doc:`scripts/base/frameworks/logging/writers/dataseries`.
Working with DataSeries
=======================
Here are a few examples of using DataSeries command line tools to work
with the output files.
* Printing CSV::
$ ds2txt --csv conn.ds
ts,uid,id.orig_h,id.orig_p,id.resp_h,id.resp_p,proto,service,duration,orig_bytes,resp_bytes,conn_state,local_orig,missed_bytes,history,orig_pkts,orig_ip_bytes,resp_pkts,resp_ip_bytes
1258790493.773208,ZTtgbHvf4s3,192.168.1.104,137,192.168.1.255,137,udp,dns,3.748891,350,0,S0,F,0,D,7,546,0,0
1258790451.402091,pOY6Rw7lhUd,192.168.1.106,138,192.168.1.255,138,udp,,0.000000,0,0,S0,F,0,D,1,229,0,0
1258790493.787448,pn5IiEslca9,192.168.1.104,138,192.168.1.255,138,udp,,2.243339,348,0,S0,F,0,D,2,404,0,0
1258790615.268111,D9slyIu3hFj,192.168.1.106,137,192.168.1.255,137,udp,dns,3.764626,350,0,S0,F,0,D,7,546,0,0
[...]
Add ``--separator=X`` to set a different separator.
* Extracting a subset of columns::
$ ds2txt --select '*' ts,id.resp_h,id.resp_p --skip-all conn.ds
1258790493.773208 192.168.1.255 137
1258790451.402091 192.168.1.255 138
1258790493.787448 192.168.1.255 138
1258790615.268111 192.168.1.255 137
1258790615.289842 192.168.1.255 138
[...]
* Filtering rows::
$ ds2txt --where '*' 'duration > 5 && id.resp_p > 1024' --skip-all conn.ds
1258790631.532888 V8mV5WLITu5 192.168.1.105 55890 239.255.255.250 1900 udp 15.004568 798 0 S0 F 0 D 6 966 0 0
1258792413.439596 tMcWVWQptvd 192.168.1.105 55890 239.255.255.250 1900 udp 15.004581 798 0 S0 F 0 D 6 966 0 0
1258794195.346127 cQwQMRdBrKa 192.168.1.105 55890 239.255.255.250 1900 udp 15.005071 798 0 S0 F 0 D 6 966 0 0
1258795977.253200 i8TEjhWd2W8 192.168.1.105 55890 239.255.255.250 1900 udp 15.004824 798 0 S0 F 0 D 6 966 0 0
1258797759.160217 MsLsBA8Ia49 192.168.1.105 55890 239.255.255.250 1900 udp 15.005078 798 0 S0 F 0 D 6 966 0 0
1258799541.068452 TsOxRWJRGwf 192.168.1.105 55890 239.255.255.250 1900 udp 15.004082 798 0 S0 F 0 D 6 966 0 0
[...]
* Calculating some statistics:
Mean/stddev/min/max over a column::
$ dsstatgroupby '*' basic duration from conn.ds
# Begin DSStatGroupByModule
# processed 2159 rows, where clause eliminated 0 rows
# count(*), mean(duration), stddev, min, max
2159, 42.7938, 1858.34, 0, 86370
[...]
Quantiles of total connection volume::
$ dsstatgroupby '*' quantile 'orig_bytes + resp_bytes' from conn.ds
[...]
2159 data points, mean 24616 +- 343295 [0,1.26615e+07]
quantiles about every 216 data points:
10%: 0, 124, 317, 348, 350, 350, 601, 798, 1469
tails: 90%: 1469, 95%: 7302, 99%: 242629, 99.5%: 1226262
[...]
The ``man`` pages for these tools show further options, and their
``-h`` option gives some more information (though either can
unfortunately be a bit cryptic).
Deficiencies
------------
Due to limitations of the DataSeries format, one cannot inspect its
files before they have been fully written. In other words, when using
DataSeries, it's currently not possible to inspect the live log
files inside the spool directory before they are rotated to their
final location. It seems that this could be fixed with some effort,
and we will work with the DataSeries development team on that if the
format gains traction among Bro users.
Likewise, we're considering writing custom command line tools for
interacting with DataSeries files, making that a bit more convenient
than what the standard utilities provide.
@@ -0,0 +1,89 @@
=========================================
Indexed Logging Output with ElasticSearch
=========================================
.. rst-class:: opening
Bro's default ASCII log format is not exactly the most efficient
way for searching large volumes of data. ElasticSearch
is a data storage technology designed to handle large volumes of data.
It's also a search engine built on top of Apache's Lucene
project. It scales very well, both for distributed indexing and
distributed searching.
.. contents::
Warning
-------
This writer plugin is still in testing and is not yet recommended for
production use! The approach to how logs are handled in the plugin is "fire
and forget" at this time, there is no error handling if the server fails to
respond successfully to the insertion request.
Installing ElasticSearch
------------------------
Download the latest version from: <http://www.elasticsearch.org/download/>.
Once extracted, start ElasticSearch with::
# ./bin/elasticsearch
For more detailed information, refer to the ElasticSearch installation
documentation: http://www.elasticsearch.org/guide/reference/setup/installation.html
Compiling Bro with ElasticSearch Support
----------------------------------------
First, ensure that you have libcurl installed, then run ``configure``::
# ./configure
[...]
====================| Bro Build Summary |=====================
[...]
cURL: true
[...]
ElasticSearch: true
[...]
================================================================
Activating ElasticSearch
------------------------
The easiest way to enable ElasticSearch output is to load the
``tuning/logs-to-elasticsearch`` script. If you are using BroControl,
the following line in ``local.bro`` will enable it:
.. code:: bro
@load tuning/logs-to-elasticsearch
With that, Bro will now write most of its logs into ElasticSearch in addition
to maintaining the ASCII logs as it would do by default. That script has
some tunable options for choosing which logs to send to ElasticSearch; refer
to the autogenerated script documentation for those options.
There is an interface named Brownian being written specifically to
integrate with the data that Bro outputs into ElasticSearch. It can be
found here::
https://github.com/grigorescu/Brownian
Tuning
------
A common problem encountered with ElasticSearch is too many files being held
open. The ElasticSearch website has some suggestions on how to increase the
open file limit.
- http://www.elasticsearch.org/tutorials/2011/04/06/too-many-open-files.html
TODO
----
Lots.
- Perform multicast discovery for server.
- Better error detection.
- Better defaults (don't index loaded-plugins, for instance).
387 doc/frameworks/logging.rst Normal file
@@ -0,0 +1,387 @@
=================
Logging Framework
=================
.. rst-class:: opening
Bro comes with a flexible key-value based logging interface that
allows fine-grained control of what gets logged and how it is
logged. This document describes how logging can be customized and
extended.
.. contents::
Terminology
===========
Bro's logging interface is built around three main abstractions:
Log streams
A stream corresponds to a single log. It defines the set of
fields that a log consists of, with their names and types.
Examples are the ``conn`` stream for recording connection summaries,
and the ``http`` stream for recording HTTP activity.
Filters
Each stream has a set of filters attached to it that determine
what information gets written out. By default, each stream has
one default filter that just logs everything directly to disk
with an automatically generated file name. However, further
filters can be added to record only a subset, split a stream
into different outputs, or even to duplicate the log to
multiple outputs. If all filters are removed from a stream,
all output is disabled.
Writers
A writer defines the actual output format for the information
being logged. The default writer produces tab-separated ASCII
files; additional writers, for binary output and database-style
indexing, are described under `Other Writers`_ below.
Basics
======
The data fields that a stream records are defined by a record type
specified when it is created. Let's look at the script generating Bro's
connection summaries as an example,
:doc:`scripts/base/protocols/conn/main`. It defines a record
:bro:type:`Conn::Info` that lists all the fields that go into
``conn.log``, each marked with a ``&log`` attribute indicating that it
is part of the information written out. To write a log record, the
script then passes an instance of :bro:type:`Conn::Info` to the logging
framework's :bro:id:`Log::write` function.
By default, each stream automatically gets a filter named ``default``
that generates the normal output by recording all record fields into a
single output file.
In the following, we summarize ways in which the logging can be
customized. We continue using the connection summaries as our example
to work with.
Filtering
---------
To create a new output file for an existing stream, you can add a
new filter. A filter can, e.g., restrict the set of fields being
logged:
.. code:: bro
event bro_init()
{
# Add a new filter to the Conn::LOG stream that logs only
# timestamp and originator address.
local filter: Log::Filter = [$name="orig-only", $path="origs", $include=set("ts", "id.orig_h")];
Log::add_filter(Conn::LOG, filter);
}
Note the fields that are set for the filter:
``name``
A mandatory name for the filter that can later be used
to manipulate it further.
``path``
The filename for the output file, without any extension (which
may be automatically added by the writer). Default path values
are generated by taking the stream's ID and munging it slightly.
:bro:enum:`Conn::LOG` is converted into ``conn``,
:bro:enum:`PacketFilter::LOG` is converted into
``packet_filter``, and :bro:enum:`Notice::POLICY_LOG` is
converted into ``notice_policy``.
``include``
A set limiting the fields to the ones given. The names
correspond to those in the :bro:type:`Conn::Info` record, with
sub-records unrolled by concatenating fields (separated with
dots).
Using the code above, you will now get a new log file ``origs.log``
that looks like this::
#separator \x09
#path origs
#fields ts id.orig_h
#types time addr
1128727430.350788 141.42.64.125
1128727435.450898 141.42.64.125
If you want to make this the only log file for the stream, you can
remove the default filter (which, conveniently, has the name
``default``):
.. code:: bro
event bro_init()
{
# Remove the filter called "default".
Log::remove_filter(Conn::LOG, "default");
}
An alternate approach to "turning off" a log is to completely disable
the stream:
.. code:: bro
event bro_init()
{
Log::disable_stream(Conn::LOG);
}
If you want to skip only some fields but keep the rest, there is a
corresponding ``exclude`` filter attribute that you can use instead of
``include`` to list only the ones you are not interested in.
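For example, the following sketch adjusts the default filter so that the
``history`` column no longer appears in ``conn.log``:

.. code:: bro

    event bro_init()
        {
        local f = Log::get_filter(Conn::LOG, "default");
        f$exclude = set("history");
        Log::remove_filter(Conn::LOG, "default");
        Log::add_filter(Conn::LOG, f);
        }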
A filter can also determine output paths *dynamically* based on the
record being logged. That allows, e.g., recording local and remote
connections in separate files. To do this, you define a function
that returns the desired path:
.. code:: bro
function split_log(id: Log::ID, path: string, rec: Conn::Info) : string
{
# Return "conn-local" if originator is a local IP, otherwise "conn-remote".
local lr = Site::is_local_addr(rec$id$orig_h) ? "local" : "remote";
return fmt("%s-%s", path, lr);
}
event bro_init()
{
local filter: Log::Filter = [$name="conn-split", $path_func=split_log, $include=set("ts", "id.orig_h")];
Log::add_filter(Conn::LOG, filter);
}
Running this will now produce two files, ``conn-local.log`` and
``conn-remote.log``, with the corresponding entries (the path function
receives the stream's default path, here ``conn``, as its ``path``
argument). One could extend this further, for example to log information
by subnets or even by IP address. Be careful, however, as it is easy to
create many files very quickly.
.. sidebar:: A More Generic Path Function
The ``split_log`` method has one drawback: it can be used
only with the :bro:enum:`Conn::LOG` stream as the record type is hardcoded
into its argument list. However, Bro allows a more generic
variant:
.. code:: bro
function split_log(id: Log::ID, path: string, rec: record { id: conn_id; } ) : string
{
return Site::is_local_addr(rec$id$orig_h) ? "local" : "remote";
}
This function can be used with all log streams that have records
containing an ``id: conn_id`` field.
While so far we have seen how to customize the columns being logged,
you can also control which records are written out by providing a
predicate that will be called for each log record:
.. code:: bro
function http_only(rec: Conn::Info) : bool
{
# Record only connections with successfully analyzed HTTP traffic
return rec$service == "http";
}
event bro_init()
{
local filter: Log::Filter = [$name="http-only", $path="conn-http", $pred=http_only];
Log::add_filter(Conn::LOG, filter);
}
This will result in a log file ``conn-http.log`` that contains only
traffic detected and analyzed as HTTP traffic.
Extending
---------
You can add further fields to a log stream by extending the record
type that defines its content. Let's say we want to add a boolean
field ``is_private`` to :bro:type:`Conn::Info` that indicates whether the
originator IP address is part of the :rfc:`1918` space:
.. code:: bro
# Add a field to the connection log record.
redef record Conn::Info += {
## Indicate if the originator of the connection is part of the
## "private" address space defined in RFC1918.
is_private: bool &default=F &log;
};
Now we need to set the field. A connection's summary is generated at
the time its state is removed from memory. We can add another handler
at that time that sets our field correctly:
.. code:: bro
event connection_state_remove(c: connection)
{
if ( c$id$orig_h in Site::private_address_space )
c$conn$is_private = T;
}
Now ``conn.log`` will show a new field ``is_private`` of type
``bool``.
Notes:
- For extending logs this way, one needs a bit of knowledge about how
the script that creates the log stream is organizing its state
keeping. Most of the standard Bro scripts attach their log state to
the :bro:type:`connection` record where it can then be accessed, just
as the ``c$conn`` above. For example, the HTTP analysis adds a field
``http`` of type :bro:type:`HTTP::Info` to the :bro:type:`connection`
record. See the script reference for more information.
- When extending records as shown above, the new fields must always be
declared either with a ``&default`` value or as ``&optional``.
Furthermore, you need to add the ``&log`` attribute or otherwise the
field won't appear in the output.
Hooking into the Logging
------------------------
Sometimes it is helpful to do additional analysis of the information
being logged. For these cases, a stream can specify an event that will
be generated every time a log record is written to it. All of Bro's
default log streams define such an event. For example, the connection
log stream raises the event :bro:id:`Conn::log_conn`. You
could use that for example for flagging when a connection to a
specific destination exceeds a certain duration:
.. code:: bro
redef enum Notice::Type += {
## Indicates that a connection remained established longer
## than 5 minutes.
Long_Conn_Found
};
event Conn::log_conn(rec: Conn::Info)
{
if ( rec$duration > 5mins )
NOTICE([$note=Long_Conn_Found,
$msg=fmt("unusually long conn to %s", rec$id$resp_h),
$id=rec$id]);
}
Often, these events can be an alternative to post-processing Bro logs
externally with Perl scripts. Much of what such an external script
would do later offline, one may instead do directly inside of Bro in
real-time.
Rotation
--------
By default, no log rotation occurs, but it's globally controllable for all
filters by redefining the :bro:id:`Log::default_rotation_interval` option:
.. code:: bro
redef Log::default_rotation_interval = 1 hr;
Or specifically for certain :bro:type:`Log::Filter` instances by setting
their ``interv`` field. Here's an example of changing just the
:bro:enum:`Conn::LOG` stream's default filter rotation interval:
.. code:: bro
event bro_init()
{
local f = Log::get_filter(Conn::LOG, "default");
f$interv = 1 min;
Log::remove_filter(Conn::LOG, "default");
Log::add_filter(Conn::LOG, f);
}
ASCII Writer Configuration
--------------------------
The ASCII writer has a number of options for customizing the format of
its output, see :doc:`scripts/base/frameworks/logging/writers/ascii`.
Adding Streams
==============
It's easy to create a new log stream for custom scripts. Here's an
example for the ``Foo`` module:
.. code:: bro
module Foo;
export {
# Create an ID for our new stream. By convention, this is
# called "LOG".
redef enum Log::ID += { LOG };
# Define the fields. By convention, the type is called "Info".
type Info: record {
ts: time &log;
id: conn_id &log;
};
# Define a hook event. By convention, this is called
# "log_<stream>".
global log_foo: event(rec: Info);
}
# This event should be handled at a higher priority so that when
# users modify your stream later and they do it at priority 0,
# their code runs after this.
event bro_init() &priority=5
{
# Create the stream. This also adds a default filter automatically.
Log::create_stream(Foo::LOG, [$columns=Info, $ev=log_foo]);
}
You can also add the state to the :bro:type:`connection` record to make
it easily accessible across event handlers:
.. code:: bro
redef record connection += {
foo: Info &optional;
};
Now you can use the :bro:id:`Log::write` method to output log records and
save the logged ``Foo::Info`` record into the connection record:
.. code:: bro
event connection_established(c: connection)
{
local rec: Foo::Info = [$ts=network_time(), $id=c$id];
c$foo = rec;
Log::write(Foo::LOG, rec);
}
See the existing scripts for how to work with such a new connection
field. A simple example is :doc:`scripts/base/protocols/syslog/main`.
When you are developing scripts that add data to the :bro:type:`connection`
record, care must be given to when and how long data is stored.
Normally data saved to the connection record will remain there for the
duration of the connection and from a practical perspective it's not
uncommon to need to delete that data before the end of the connection.
Other Writers
-------------
Bro supports the following output formats other than ASCII:
.. toctree::
:maxdepth: 1
logging-dataseries
logging-elasticsearch
357 doc/frameworks/notice.rst Normal file
@@ -0,0 +1,357 @@
================
Notice Framework
================
.. rst-class:: opening
One of the easiest ways to customize Bro is writing a local notice
policy. Bro can detect a large number of potentially interesting
situations, and the notice policy specifies which of them the user wants to be
acted upon in some manner. In particular, the notice policy can specify
actions to be taken, such as sending an email or compiling regular
alarm emails. This page gives an introduction to writing such a notice
policy.
.. contents::
Overview
--------
Let's start with a little bit of background on Bro's philosophy on reporting
things. Bro ships with a large number of policy scripts which perform a wide
variety of analyses. Most of these scripts monitor for activity which might be
of interest for the user. However, none of these scripts determines the
importance of what it finds itself. Instead, the scripts only flag situations
as *potentially* interesting, leaving it to the local configuration to define
which of them are in fact actionable. This decoupling of detection and
reporting allows Bro to address the different needs that sites have.
Definitions of what constitutes an attack or even a compromise differ quite a
bit between environments, and activity deemed malicious at one site might be
fully acceptable at another.
Whenever one of Bro's analysis scripts sees something potentially
interesting it flags the situation by calling the :bro:see:`NOTICE`
function and giving it a single :bro:see:`Notice::Info` record. A Notice
has a :bro:see:`Notice::Type`, which reflects the kind of activity that
has been seen, and it is usually also augmented with further context
about the situation.
More information about raising notices can be found in the `Raising Notices`_
section.
Once a notice is raised, it can have any number of actions applied to it by
writing :bro:see:`Notice::policy` hooks, which are described in the `Notice Policy`_
section below. Such actions can be to send a mail to the configured
address(es) or to simply ignore the notice. Currently, the following actions
are defined:
.. list-table::
:widths: 20 80
:header-rows: 1
* - Action
- Description
* - Notice::ACTION_LOG
- Write the notice to the :bro:see:`Notice::LOG` logging stream.
* - Notice::ACTION_ALARM
- Log into the :bro:see:`Notice::ALARM_LOG` stream which will rotate
hourly and email the contents to the email address or addresses
defined in the :bro:see:`Notice::mail_dest` variable.
* - Notice::ACTION_EMAIL
- Send the notice in an email to the email address or addresses given in
the :bro:see:`Notice::mail_dest` variable.
* - Notice::ACTION_PAGE
- Send an email to the email address or addresses given in the
:bro:see:`Notice::mail_page_dest` variable.
How these notice actions are applied to notices is discussed in the
`Notice Policy`_ and `Notice Policy Shortcuts`_ sections.
Processing Notices
------------------
Notice Policy
*************
The hook :bro:see:`Notice::policy` provides the mechanism for applying
actions and generally modifying the notice before it's sent onward to
the action plugins. Hooks can be thought of as multi-bodied functions
and using them looks very similar to handling events. The difference
is that they don't go through the event queue like events. Users should
directly make modifications to the :bro:see:`Notice::Info` record
given as the argument to the hook.
Here's a simple example which tells Bro to send an email for all notices of
type :bro:see:`SSH::Login` if the server is 10.0.0.1:
.. code:: bro
hook Notice::policy(n: Notice::Info)
{
if ( n$note == SSH::Login && n$id$resp_h == 10.0.0.1 )
add n$actions[Notice::ACTION_EMAIL];
}
.. note::
Keep in mind that the semantics of the SSH::Login notice are
such that it is only raised when Bro heuristically detects a successful
login. No apparently failed logins will raise this notice.
Hooks can also have priorities applied to order their execution like events
with a default priority of 0. Greater values are executed first. Setting
a hook body to run before default hook bodies might look like this:
.. code:: bro
hook Notice::policy(n: Notice::Info) &priority=5
{
if ( n$note == SSH::Login && n$id$resp_h == 10.0.0.1 )
add n$actions[Notice::ACTION_EMAIL];
}
Hooks can also abort later hook bodies with the ``break`` keyword. This
is primarily useful if one wants to completely preempt processing by
lower priority :bro:see:`Notice::policy` hooks.
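For example, a high-priority hook body can use ``break`` to veto all
further policy processing, and hence any actions that later bodies would
add, for a particular notice type (a sketch):

.. code:: bro

    hook Notice::policy(n: Notice::Info) &priority=10
        {
        if ( n$note == SSH::Login )
            break;
        }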
Notice Policy Shortcuts
***********************
Although the notice framework provides a great deal of flexibility and
configurability, there are many times when the full expressiveness isn't
needed and actually becomes a hindrance to achieving results. The framework provides
a default :bro:see:`Notice::policy` hook body as a way of giving users the
shortcuts to easily apply many common actions to notices.
These are implemented as sets and tables indexed with a
:bro:see:`Notice::Type` enum value. The following table shows and describes
all of the variables available for shortcut configuration of the notice
framework.
.. list-table::
:widths: 32 40
:header-rows: 1
* - Variable name
- Description
* - :bro:see:`Notice::ignored_types`
- Adding a :bro:see:`Notice::Type` to this set results in the notice
being ignored. It won't have any other action applied to it, not even
:bro:see:`Notice::ACTION_LOG`.
* - :bro:see:`Notice::emailed_types`
- Adding a :bro:see:`Notice::Type` to this set results in
:bro:see:`Notice::ACTION_EMAIL` being applied to the notices of
that type.
* - :bro:see:`Notice::alarmed_types`
- Adding a :bro:see:`Notice::Type` to this set results in
:bro:see:`Notice::ACTION_ALARM` being applied to the notices of
that type.
* - :bro:see:`Notice::not_suppressed_types`
- Adding a :bro:see:`Notice::Type` to this set results in that notice
no longer undergoing the normal notice suppression that would
take place. Be careful when using this in production; it could
result in a dramatic increase in the number of notices being
processed.
* - :bro:see:`Notice::type_suppression_intervals`
- This is a table indexed on :bro:see:`Notice::Type` and yielding an
interval. It can be used as an easy way to extend the default
suppression interval for an entire :bro:see:`Notice::Type`
without having to create a whole :bro:see:`Notice::policy` entry
and setting the ``$suppress_for`` field.
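As a sketch of how these shortcuts are used (the notice types here are
just examples):

.. code:: bro

    # Never act on this notice type, not even logging it.
    redef Notice::ignored_types += { SSL::Invalid_Server_Cert };

    # Email all heuristically detected successful SSH logins.
    redef Notice::emailed_types += { SSH::Login };

    # Suppress duplicate SSH::Login notices for a full day.
    redef Notice::type_suppression_intervals += { [SSH::Login] = 1 day };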
Raising Notices
---------------
A script should raise a notice for any occurrence that a user may want
to be notified about or take action on. For example, whenever the base
SSH analysis scripts sees an SSH session where it is heuristically
guessed to be a successful login, it raises a Notice of the type
:bro:see:`SSH::Login`. The code in the base SSH analysis script looks
like this:
.. code:: bro
NOTICE([$note=SSH::Login,
$msg="Heuristically detected successful SSH login.",
$conn=c]);
:bro:see:`NOTICE` is a normal function in the global namespace which
wraps a function within the ``Notice`` namespace. It takes a single
argument of the :bro:see:`Notice::Info` record type. The most common
fields used when raising notices are described in the following table:
.. list-table::
:widths: 32 40
:header-rows: 1
* - Field name
- Description
* - ``$note``
- This field is required and is an enum value which represents the
notice type.
* - ``$msg``
- This is a human readable message which is meant to provide more
information about this particular instance of the notice type.
* - ``$sub``
- This is a sub-message meant for human readability but will
frequently also be used to contain data meant to be matched with the
``Notice::policy``.
* - ``$conn``
- If a connection record is available when the notice is being raised
and the notice represents some attribute of the connection, then the
connection record can be given here. Other fields such as ``$id`` and
``$src`` will automatically be populated from this value.
* - ``$id``
- If a conn_id record is available when the notice is being raised and
the notice represents some attribute of the connection, then the
conn_id record can be given here. Other fields such as ``$src`` will
automatically be populated from this value.
* - ``$src``
- If the notice represents an attribute of a single host then it's
possible that only this field should be filled out to represent the
host that is being "noticed".
* - ``$n``
- This carries a numeric value associated with the notice, if the
notice has one. It's most frequently used for numeric tests in the
``Notice::policy`` for making policy decisions.
* - ``$identifier``
- This represents a unique identifier for this notice. This field is
described in more detail in the `Automated Suppression`_ section.
* - ``$suppress_for``
- This field can be set if there is a natural suppression interval for
the notice that may be different than the default value. The
value set to this field can also be modified by a user's
:bro:see:`Notice::policy` so the value is not set permanently
and unchangeably.
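For instance, a hypothetical scan detector might raise a host-centric
notice along these lines. The ``Scan::Address_Scan`` type and the
``scanner`` and ``num_targets`` variables are illustrative, not shipped
with Bro:
.. code:: bro
    NOTICE([$note=Scan::Address_Scan,
            $src=scanner,
            $n=num_targets,
            $msg=fmt("%s scanned %d hosts", scanner, num_targets),
            $identifier=cat(scanner)]);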
When writing Bro scripts which raise notices, some thought should be given to
what the notice represents and what data should be provided to give a consumer
of the notice the best information about it. If the notice is
representative of many connections and is an attribute of a host (e.g. a
scanning host) it probably makes most sense to fill out the ``$src`` field and
not give a connection or conn_id. If a notice is representative of a
connection attribute (e.g. an apparent SSH login) then it makes sense to fill
out either ``$conn`` or ``$id`` based on the data that is available when the
notice is raised. Taking care to insert only the data that fully represents
the occurrence will make later analysis easier. For example, if complete
connection information is attached to a notice about an expiring SSL server
certificate, the logs become confusing because the connection that the
certificate was detected on is a side topic to the fact that an expired
certificate was detected. It's possible in many cases that two or more
separate notices may need to be generated. As an example, one could be for
the detection of the expired SSL certificate and another could be raised if
the client decided to go ahead with the connection despite the expired
certificate.
Automated Suppression
---------------------
The notice framework supports suppression for notices if the author of the
script that is generating the notice has indicated to the notice framework how
to identify notices that are intrinsically the same. Identification of these
"intrinsically duplicate" notices is implemented with an optional field in
:bro:see:`Notice::Info` records named ``$identifier`` which is a simple string.
If the ``$identifier`` and ``$type`` fields are the same for two notices, the
notice framework actually considers them to be the same thing and can use that
information to suppress duplicates for a configurable period of time.
.. note::
If the ``$identifier`` is left out of a notice, no notice suppression
takes place due to the framework's inability to identify duplicates. This
can be completely legitimate usage if no notices of that type could ever
be considered duplicates.
The ``$identifier`` field is typically built from several pieces of
data related to the notice that, when combined, represent a unique
instance of that notice. Here is an example of the script
:doc:`scripts/policy/protocols/ssl/validate-certs` raising a notice
for session negotiations where the certificate or certificate chain did
not validate successfully against the available certificate authority
certificates.
.. code:: bro
NOTICE([$note=SSL::Invalid_Server_Cert,
$msg=fmt("SSL certificate validation failed with (%s)", c$ssl$validation_status),
$sub=c$ssl$subject,
$conn=c,
$identifier=cat(c$id$resp_h,c$id$resp_p,c$ssl$validation_status,c$ssl$cert_hash)]);
In the above example you can see that the ``$identifier`` field contains a
string that is built from the responder IP address and port, the validation
status message, and the MD5 sum of the server certificate. Those fields in
particular are chosen because different SSL certificates could be seen on any
port of a host, certificates could fail validation for different reasons, and
multiple server certificates could be used on that combination of IP address
and port with the ``server_name`` SSL extension (explaining the addition of
the MD5 sum of the certificate). The result is that if a certificate fails
validation and all four pieces of data match (IP address, port, validation
status, and certificate hash) that particular notice won't be raised again for
the default suppression period.
Setting the ``$identifier`` field is left to those raising notices because
it's assumed that the script author who is raising the notice understands the
full problem set and edge cases of the notice which may not be readily
apparent to users. If users don't want the suppression to take place or simply
want a different interval, they can set a notice's suppression
interval to ``0secs`` or delete the value from the ``$identifier`` field in
a :bro:see:`Notice::policy` hook.
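For example, here is a minimal sketch of a hook that disables suppression
for one notice type by zeroing its suppression interval:
.. code:: bro
    hook Notice::policy(n: Notice::Info)
        {
        if ( n$note == SSL::Invalid_Server_Cert )
            n$suppress_for = 0secs;
        }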
Extending Notice Framework
--------------------------
There are currently a couple of mechanisms for extending the notice
framework and adding new capabilities.
Extending Notice Emails
***********************
If there is extra information that you would like to add to emails, you
can do so by writing :bro:see:`Notice::policy` hooks.
The :bro:see:`Notice::Info` record contains a field named
``$email_body_sections`` whose contents will be included verbatim when
email is sent. An example of including some information from an HTTP
request is included below.
.. code:: bro
hook Notice::policy(n: Notice::Info)
{
if ( n?$conn && n$conn?$http && n$conn$http?$host )
n$email_body_sections[|n$email_body_sections|] = fmt("HTTP host header: %s", n$conn$http$host);
}
Cluster Considerations
----------------------
As a user/developer of Bro, the main cluster concern with the notice framework
is understanding what runs where. When a notice is generated on a worker, the
worker checks whether the notice should be suppressed based on information
maintained locally in the worker process. If it's not being
suppressed, the worker forwards the notice directly to the manager and does no
more local processing. The manager then runs the :bro:see:`Notice::policy`
hook and executes all of the actions determined to be run.
@ -0,0 +1,394 @@
===================
Signature Framework
===================
.. rst-class:: opening
Bro relies primarily on its extensive scripting language for
defining and analyzing detection policies. In addition, however,
Bro also provides an independent *signature language* for doing
low-level, Snort-style pattern matching. While signatures are
*not* Bro's preferred detection tool, they sometimes come in handy
and are closer to what many people are familiar with from using
other NIDS. This page gives a brief overview on Bro's signatures
and covers some of their technical subtleties.
.. contents::
:depth: 2
Basics
======
Let's look at an example signature first:
.. code:: bro-sig
signature my-first-sig {
ip-proto == tcp
dst-port == 80
payload /.*root/
event "Found root!"
}
This signature asks Bro to match the regular expression ``.*root`` on
all TCP connections going to port 80. When the signature triggers, Bro
will raise an event :bro:id:`signature_match` of the form:
.. code:: bro
event signature_match(state: signature_state, msg: string, data: string)
Here, ``state`` contains more information on the connection that
triggered the match, ``msg`` is the string specified by the
signature's event statement (``Found root!``), and data is the last
piece of payload which triggered the pattern match.
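If you want to react to matches directly, you can handle that event
yourself. Here is a minimal sketch, assuming the ``sig_id`` and ``conn``
fields of :bro:type:`signature_state`:
.. code:: bro
    event signature_match(state: signature_state, msg: string, data: string)
        {
        print fmt("signature %s matched for %s: %s",
                  state$sig_id, state$conn$id$orig_h, msg);
        }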
To turn such :bro:id:`signature_match` events into actual alarms, you can
load Bro's :doc:`/scripts/base/frameworks/signatures/main` script.
This script contains a default event handler that raises
:bro:enum:`Signatures::Sensitive_Signature` :doc:`Notices <notice>`
(as well as others; see the beginning of the script).
As signatures are independent of Bro's policy scripts, they are put into
their own file(s). There are three ways to specify which files contain
signatures: by using the ``-s`` flag when you invoke Bro, by
extending the Bro variable :bro:id:`signature_files` using the ``+=``
operator, or by using the ``@load-sigs`` directive inside a Bro script.
If a signature file is given without a full path, it is searched for
along the normal ``BROPATH``. Additionally, the ``@load-sigs``
directive can be used to load signature files in a path relative to the
Bro script in which it's placed, e.g. ``@load-sigs ./mysigs.sig`` will
expect that signature file in the same directory as the Bro script. The
default extension of the file name is ``.sig``, and Bro appends that
automatically when necessary.
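For example, assuming a hypothetical signature file ``mysigs.sig``, either
of the following makes Bro load it:
.. code:: bro
    # Relative to the directory of the current script; the .sig
    # extension is appended automatically.
    @load-sigs ./mysigs

    # Alternatively, search for the file along BROPATH.
    redef signature_files += "mysigs.sig";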
Signature language
==================
Let's look at the format of a signature more closely. Each individual
signature has the format ``signature <id> { <attributes> }``. ``<id>``
is a unique label for the signature. There are two types of
attributes: *conditions* and *actions*. The conditions define when the
signature matches, while the actions declare what to do in the case of
a match. Conditions can be further divided into four types: *header*,
*content*, *dependency*, and *context*. We discuss these all in more
detail in the following.
Conditions
----------
Header Conditions
~~~~~~~~~~~~~~~~~
Header conditions limit the applicability of the signature to a subset
of traffic that contains matching packet headers. This type of matching
is performed only for the first packet of a connection.
There are pre-defined header conditions for some of the most used
header fields. All of them generally have the format ``<keyword> <cmp>
<value-list>``, where ``<keyword>`` names the header field; ``cmp`` is
one of ``==``, ``!=``, ``<``, ``<=``, ``>``, ``>=``; and
``<value-list>`` is a list of comma-separated values to compare
against. The following keywords are defined:
``src-ip``/``dst-ip <cmp> <address-list>``
Source and destination address, respectively. Addresses can be given
as IPv4 or IPv6 addresses or CIDR masks. For IPv6 addresses/masks
the colon-hexadecimal representation of the address must be enclosed
in square brackets (e.g. ``[fe80::1]`` or ``[fe80::0]/16``).
``src-port``/``dst-port <cmp> <int-list>``
Source and destination port, respectively.
``ip-proto <cmp> tcp|udp|icmp|icmp6|ip|ip6``
IPv4 header's Protocol field or the Next Header field of the final
IPv6 header (i.e. either Next Header field in the fixed IPv6 header
if no extension headers are present or that field from the last
extension header in the chain). Note that the IP-in-IP forms of
tunneling are automatically decapsulated by default and signatures
apply to only the inner-most packet, so specifying ``ip`` or ``ip6``
is a no-op.
For lists of multiple values, they are sequentially compared against
the corresponding header field. If at least one of the comparisons
evaluates to true, the whole header condition matches (exception: with
``!=``, the header condition only matches if all values differ).
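As a sketch, here is a made-up signature combining several of these
pre-defined header conditions, each with a value-list:
.. code:: bro-sig
    signature sketch-header-conds {
        ip-proto == tcp
        dst-ip == 192.168.0.0/16, 10.0.0.0/8
        dst-port == 80, 8080
        payload /.*login/
        event "Login string seen on a watched port"
    }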
In addition to these pre-defined header keywords, a general header
condition can be defined as
.. code:: bro-sig
header <proto>[<offset>:<size>] [& <integer>] <cmp> <value-list>
This compares the value found at the given position of the packet header
with a list of values. ``offset`` defines the position of the value
within the header of the protocol defined by ``proto`` (which can be
``ip``, ``ip6``, ``tcp``, ``udp``, ``icmp`` or ``icmp6``). ``size`` is
either 1, 2, or 4 and specifies the width of the value in bytes.
If the optional ``& <integer>`` is given, the packet's value is
first masked with the integer before it is compared to the value-list.
``cmp`` is one of ``==``, ``!=``, ``<``, ``<=``, ``>``, ``>=``.
``value-list`` is a list of comma-separated integers similar to those
described above. The integers within the list may be followed by an
additional ``/ mask`` where ``mask`` is a value from 0 to 32. This
corresponds to the CIDR notation for netmasks and is translated into a
corresponding bitmask applied to the packet's value prior to the
comparison (similar to the optional ``& integer``). IPv6 address values
are not allowed in the value-list, though you can still inspect any 1,
2, or 4 byte section of an IPv6 header using this keyword.
Putting it all together, this is an example condition that is
equivalent to ``dst-ip == 1.2.3.4/16, 5.6.7.8/24``:
.. code:: bro-sig
header ip[16:4] == 1.2.3.4/16, 5.6.7.8/24
Note that the analogous example for IPv6 isn't currently possible since
4 bytes is the max width of a value that can be compared.
Content Conditions
~~~~~~~~~~~~~~~~~~
Content conditions are defined by regular expressions. We
differentiate two kinds of content conditions: first, the expression
may be declared with the ``payload`` statement, in which case it is
matched against the raw payload of a connection (for reassembled TCP
streams) or of each packet (for ICMP, UDP, and non-reassembled TCP).
Second, it may be prefixed with an analyzer-specific label, in which
case the expression is matched against the data as extracted by the
corresponding analyzer.
A ``payload`` condition has the form:
.. code:: bro-sig
payload /<regular expression>/
Currently, the following analyzer-specific content conditions are
defined (note that the corresponding analyzer has to be activated by
loading its policy script):
``http-request /<regular expression>/``
The regular expression is matched against decoded URIs of HTTP
requests. Obsolete alias: ``http``.
``http-request-header /<regular expression>/``
The regular expression is matched against client-side HTTP headers.
``http-request-body /<regular expression>/``
The regular expression is matched against client-side bodies of
HTTP requests.
``http-reply-header /<regular expression>/``
The regular expression is matched against server-side HTTP headers.
``http-reply-body /<regular expression>/``
The regular expression is matched against server-side bodies of
HTTP replies.
``ftp /<regular expression>/``
The regular expression is matched against the command line input
of FTP sessions.
``finger /<regular expression>/``
The regular expression is matched against finger requests.
For example, ``http-request /.*(etc/(passwd|shadow))/`` matches any URI
containing either ``etc/passwd`` or ``etc/shadow``. To filter on request
types, e.g. ``GET``, use ``payload /GET /``.
Note that HTTP pipelining (that is, multiple HTTP transactions in a
single TCP connection) has some side effects on signature matches. If
multiple conditions are specified within a single signature, this
signature matches if all conditions are met by any HTTP transaction
(not necessarily always the same!) in a pipelined connection.
Dependency Conditions
~~~~~~~~~~~~~~~~~~~~~
To define dependencies between signatures, there are two conditions:
``requires-signature [!] <id>``
Defines the current signature to match only if the signature given
by ``id`` matches for the same connection. Using ``!`` negates the
condition: The current signature only matches if ``id`` does not
match for the same connection (using this defers the match
decision until the connection terminates).
``requires-reverse-signature [!] <id>``
Similar to ``requires-signature``, but ``id`` has to match for the
opposite direction of the same connection, compared to the current
signature. This allows modeling the notion of requests and
replies.
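For illustration, the following made-up pair models such a request/reply
relationship: the second signature only matches if the first has matched
in the opposite direction of the same connection:
.. code:: bro-sig
    signature ftp-root-attempt {
        ip-proto == tcp
        dst-port == 21
        payload /.*USER root/
        event "FTP root login attempted"
    }

    signature ftp-root-maybe-ok {
        ip-proto == tcp
        src-port == 21
        payload /.*230/
        requires-reverse-signature ftp-root-attempt
        event "FTP root login possibly succeeded"
    }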
Context Conditions
~~~~~~~~~~~~~~~~~~
Context conditions pass the match decision on to other components of
Bro. They are only evaluated if all other conditions have already
matched. The following context conditions are defined:
``eval <policy-function>``
The given policy function is called and has to return a boolean
confirming the match. If false is returned, no signature match is
going to be triggered. The function has to be of type ``function
cond(state: signature_state, data: string): bool``. Here,
``data`` may contain the most recent content chunk available at
the time the signature was matched. If no such chunk is available,
``data`` will be the empty string. See :bro:type:`signature_state`
for its definition.
``payload-size <cmp> <integer>``
Compares the integer to the size of the payload of a packet. For
reassembled TCP streams, the integer is compared to the size of
the first in-order payload chunk. Note that the latter is not very
well defined.
``same-ip``
Evaluates to true if the source address of the IP packets equals
its destination address.
``tcp-state <state-list>``
Imposes restrictions on the current TCP state of the connection.
``state-list`` is a comma-separated list of the keywords
``established`` (the three-way handshake has already been
performed), ``originator`` (the current data is sent by the
originator of the connection), and ``responder`` (the current data
is sent by the responder of the connection).
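Combining a few of these, here is a sketch that fires only on data sent by
the originator after the handshake, and only for small payloads:
.. code:: bro-sig
    signature sketch-context-conds {
        ip-proto == tcp
        dst-port == 23
        tcp-state established,originator
        payload-size < 512
        payload /.*root/
        event "root seen from originator after handshake"
    }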
Actions
-------
Actions define what to do if a signature matches. Currently, there are
two actions defined:
``event <string>``
Raises a :bro:id:`signature_match` event. The event handler has the
following type:
.. code:: bro
event signature_match(state: signature_state, msg: string, data: string)
The given string is passed in as ``msg``, and ``data`` is the current
part of the payload that has eventually led to the signature
match (this may be empty for signatures without content
conditions).
``enable <string>``
Enables the protocol analyzer ``<string>`` for the matching
connection (``"http"``, ``"ftp"``, etc.). This is used by Bro's
dynamic protocol detection to activate analyzers on the fly.
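As a sketch modeled on the DPD signatures in
``base/frameworks/dpd/dpd.sig``, an ``enable`` action typically pairs a
protocol-recognizing payload pattern with the analyzer to activate:
.. code:: bro-sig
    signature sketch-dpd-http {
        ip-proto == tcp
        payload /^[[:space:]]*(GET|HEAD|POST)[[:space:]]*/
        tcp-state originator
        enable "http"
    }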
Things to keep in mind when writing signatures
==============================================
* Each signature is reported at most once for every connection;
further matches of the same signature are ignored.
* The content conditions perform pattern matching on elements
extracted from an application protocol dialogue. For example, ``http
/.*passwd/`` scans URLs requested within HTTP sessions. The thing to
keep in mind here is that these conditions only perform any matching
when the corresponding application analyzer is actually *active* for
a connection. Note that by default, analyzers are not enabled if the
corresponding Bro script has not been loaded. A good way to
double-check whether an analyzer "sees" a connection is checking its
log file for corresponding entries. If you cannot find the
connection in the analyzer's log, very likely the signature engine
has also not seen any application data.
* As the name indicates, the ``payload`` keyword matches on packet
*payload* only. You cannot use it to match on packet headers; use
the header conditions for that.
* For TCP connections, header conditions are only evaluated for the
*first packet from each endpoint*. If a header condition does not
match the initial packets, the signature will not trigger. Bro
optimizes for the most common application here, which is header
conditions selecting the connections to be examined more closely
with payload statements.
* For UDP and ICMP flows, the payload matching is done on a per-packet
basis; i.e., any content crossing packet boundaries will not be
found. For TCP connections, the matching semantics depend on whether
Bro is *reassembling* the connection (i.e., putting all of a
connection's packets in sequence). By default, Bro is reassembling
the first 1K of every TCP connection, which means that within this
window, matches will be found without regard to packet order or
boundaries (i.e., *stream-wise matching*).
* For performance reasons, by default Bro *stops matching* on a
connection after seeing 1K of payload; see the section on options
below for how to change this behaviour. The default was chosen with
Bro's main user of signatures in mind: dynamic protocol detection
works well even when examining just connection heads.
* Regular expressions are implicitly anchored, i.e., they work as if
prefixed with the ``^`` operator. For reassembled TCP connections,
they are anchored at the first byte of the payload *stream*. For all
other connections, they are anchored at the first payload byte of
each packet. To match at arbitrary positions, you can prefix the
regular expression with ``.*``, as done in the examples above.
* To match on non-ASCII characters, Bro's regular expressions support
the ``\x<hex>`` operator. CRs/LFs are not treated specially by the
signature engine and can be matched with ``\r`` and ``\n``,
respectively. Generally, Bro follows `flex's regular expression
syntax
<http://flex.sourceforge.net/manual/Patterns.html>`_.
See the DPD signatures in ``base/frameworks/dpd/dpd.sig`` for some examples
of fairly complex payload patterns.
* The data argument of the :bro:id:`signature_match` handler might not carry
the full text matched by the regular expression. Bro performs the
matching incrementally as packets come in; when the signature
eventually fires, it can only pass on the most recent chunk of data.
Options
=======
The following options control details of Bro's matching process:
``dpd_reassemble_first_packets: bool`` (default: ``T``)
If true, Bro reassembles the beginning of every TCP connection (of
up to ``dpd_buffer_size`` bytes, see below), to facilitate
reliable matching across packet boundaries. If false, only
connections are reassembled for which an application-layer
analyzer gets activated (e.g., by Bro's dynamic protocol
detection).
``dpd_match_only_beginning: bool`` (default: ``T``)
If true, Bro performs packet matching only within the initial
payload window of ``dpd_buffer_size``. If false, it keeps matching
on subsequent payload as well.
``dpd_buffer_size: count`` (default: ``1024``)
Defines the buffer size for the two preceding options. In
addition, this value determines the amount of bytes Bro buffers
for each connection in order to activate application analyzers
even after parts of the payload have already passed through. This
is needed by the dynamic protocol detection capability to defer
the decision of which analyzers to use.
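For example, to keep matching past the default 1K window (at some extra
memory and CPU cost), one might redefine these options like so:
.. code:: bro
    redef dpd_match_only_beginning = F;
    redef dpd_buffer_size = 4096;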
So, how about using Snort signatures with Bro?
==============================================
There was once a script, ``snort2bro``, that converted Snort
signatures automatically into Bro's signature syntax. However, in our
experience this didn't turn out to be a very useful thing to do
because by simply using Snort signatures, one can't benefit from the
additional capabilities that Bro provides; the approaches of the two
systems are just too different. We therefore stopped maintaining the
``snort2bro`` script, and there are now many newer Snort options which
it doesn't support. The script is now no longer part of the Bro
distribution.
@ -5,50 +5,15 @@
Bro Documentation
=================
Guides
------
.. toctree::
:maxdepth: 1
INSTALL
upgrade
quickstart
faq
reporting-problems
Frameworks
----------
.. toctree::
:maxdepth: 1
notice
logging
input
cluster
signatures
How-Tos
-------
.. toctree::
:maxdepth: 2
:numbered:
user-manual/index
reference/index
Just Testing
============
.. code:: bro
print "Hey Bro!"
.. btest:: test
@TEST-COPY-FILE: ${TRACES}/wikipedia.trace
@TEST-EXEC: btest-rst-cmd bro -r wikipedia.trace
@TEST-EXEC: btest-rst-cmd "cat http.log | bro-cut ts id.orig_h | head -5"
intro/index.rst
using/index.rst
scripting/index.rst
frameworks/index.rst
cluster/index.rst
scripts/index.rst
misc/index.rst
components/index.rst
indices/index.rst
doc/indices/index.rst Normal file
@ -0,0 +1,7 @@
=======
Indices
=======
* :ref:`General Index <genindex>`
* :ref:`search`
doc/intro/index.rst Normal file
@ -0,0 +1,13 @@
============
Introduction
============
.. toctree::
:maxdepth: 2
overview
quickstart
upgrade
reporting-problems
@ -1,7 +1,5 @@
==================
Overview (Missing)
==================
@ -0,0 +1,194 @@
Reporting Problems
==================
.. rst-class:: opening
Here we summarize some steps to follow when you see Bro doing
something it shouldn't. To provide help, it is often crucial for
us to have a way of reliably reproducing the effect you're seeing.
Unfortunately, reproducing problems can be rather tricky with Bro
because more often than not, they occur only in either very rare
situations or only after Bro has been running for some time. In
particular, getting a small trace showing a specific effect can be
a real problem. In the following, we'll summarize some strategies
to this end.
Reporting Problems
------------------
Generally, when you encounter a problem with Bro, the best thing to do
is to open a new ticket in `Bro's issue tracker
<http://tracker.bro.org/>`__ and include information on how to
reproduce the issue. Ideally, your ticket should come with the
following:
* The Bro version you're using (if working directly from the git
repository, the branch and revision number.)
* The output you're seeing along with a description of what you'd expect
Bro to do instead.
* A *small* trace in `libpcap format <http://www.tcpdump.org>`__
demonstrating the effect (assuming the problem doesn't happen right
at startup already).
* The exact command-line you're using to run Bro with that trace. If
you can, please try to run the Bro binary directly from the command
line rather than using BroControl.
* Any non-standard scripts you're using (but please only those really
necessary; just a small code snippet triggering the problem would
be perfect).
* If you encounter a crash, information from the core dump, such as
the stack backtrace, can be very helpful. See below for more on
this.
How Do I Get a Trace File?
--------------------------
As Bro is usually running live, coming up with a small trace file that
reproduces a problem can turn out to be quite a challenge. Often it
works best to start with a large trace that triggers the problem,
and then successively thin it out as much as possible.
To get to the initial large trace, here are a few things you can try:
* Capture a trace with `tcpdump <http://www.tcpdump.org/>`__, either
on the same interface Bro is running on, or on another host where
you can generate traffic of the kind likely triggering the problem
(e.g., if you're seeing problems with the HTTP analyzer, record some
of your Web browsing on your desktop.) When using tcpdump, don't
forget to record *complete* packets (``tcpdump -s 0 ...``). You can
reduce the amount of traffic captured by using a suitable BPF filter
(e.g., for HTTP only, try ``port 80``).
* Bro's command-line option ``-w <trace>`` records all packets it
processes into the given file. You can then later run Bro
offline on this trace and it will process the packets in the same
way as it did live. This is particularly helpful with problems that
only occur after Bro has already been running for some time. For
example, sometimes a crash may be triggered by a particular kind of
traffic only occurring rarely. Running Bro live with ``-w`` and
then, after the crash, offline on the recorded trace might, with a
little bit of luck, reproduce the problem reliably. However, be
careful with ``-w``: it can result in huge trace files, quickly
filling up your disk. (One way to mitigate the space issues is to
periodically delete the trace file by configuring
``rotate-logs.bro`` accordingly. BroControl does that for you if you
set its ``SaveTraces`` option.)
* Finally, you can try running Bro on a publicly available trace
file, such as `anonymized FTP traffic <http://www-nrg.ee.lbl.gov
/anonymized-traces.html>`__, `headers-only enterprise traffic
<http://www.icir.org/enterprise-tracing/Overview.html>`__, or
`Defcon traffic <http://cctf.shmoo.com/>`__. Some of these
particularly stress certain components of Bro (e.g., the Defcon
traces contain tons of scans).
Once you have a trace that demonstrates the effect, you will often
notice that it's pretty big, in particular if recorded from the link
you're monitoring. Therefore, the next step is to shrink its size as
much as possible. Here are a few things you can try to this end:
* Very often, a single connection is able to demonstrate the problem.
If you can identify which one it is (e.g., from one of Bro's
``*.log`` files) you can extract the connection's packets from the
trace using tcpdump by filtering for the corresponding 4-tuple of
addresses and ports:
.. console::
> tcpdump -r large.trace -w small.trace host <ip1> and port <port1> and host <ip2> and port <port2>
* If you can't reduce the problem to a connection, try to identify
either a host pair or a single host triggering it, and filter down
the trace accordingly.
* You can try to extract a smaller time slice from the trace using
`TCPslice <http://www.tcpdump.org/related.html>`__. For example, to
extract the first 100 seconds from the trace:
.. console::
# Test comment
> tcpslice +100 <in >out
Alternatively, tcpdump extracts the first ``n`` packets with its
option ``-c <n>``.
Getting More Information After a Crash
--------------------------------------
If Bro crashes, a *core dump* can be very helpful to nail down the
problem. Examining a core is not for the faint of heart but can reveal
extremely useful information.
First, you should configure Bro with the option ``--enable-debug`` and
recompile; this will disable all compiler optimizations and thus make
the core dump more useful (don't expect great performance with this
version though; compiling Bro without optimization has a noticeable
impact on its CPU usage). Then enable core dumps if you haven't
already (e.g., ``ulimit -c unlimited`` if you're using bash).
Once Bro has crashed, start gdb with the Bro binary and the file
containing the core dump. (Alternatively, you can also run Bro
directly inside gdb instead of working from a core file.) The first
helpful information to include with your tracker ticket is a stack
backtrace, which you get with gdb's ``bt`` command:
.. console::
> gdb bro core
[...]
> bt
If the crash occurs inside Bro's script interpreter, the next thing to
do is identifying the line of script code processed just before the
abnormal termination. Look for methods in the stack backtrace which
belong to any of the script interpreter's classes. Roughly speaking,
these are all classes with names ending in ``Expr``, ``Stmt``, or
``Val``. Then climb up the stack with ``up`` until you reach the first
of these methods. The object to which ``this`` is pointing will have a
``Location`` object, which in turn contains the file name and line
number of the corresponding piece of script code. Continuing the
example from above, here's how to get that information:
.. console::
[in gdb]
> up
> ...
> up
> print this->location->filename
> print this->location->first_line
If the crash occurs while processing input packets but you cannot
directly tell which connection is responsible (and thus not extract
its packets from the trace as suggested above), try getting the
4-tuple of the connection currently being processed from the core dump
by again examining the stack backtrace, this time looking for methods
belonging to the ``Connection`` class. That class has members
``orig_addr``/``resp_addr`` and ``orig_port``/``resp_port`` storing
(pointers to) the IP addresses and ports respectively:
.. console::
[in gdb]
> up
> ...
> up
> printf "%08x:%04x %08x:%04x\n", *this->orig_addr, this->orig_port, *this->resp_addr, this->resp_port
Note that these values are stored in `network byte order
<http://en.wikipedia.org/wiki/Endianness#Endianness_in_networking>`__
so you will need to flip the bytes around if you are on a little-endian
machine (which is why the above example prints them in hex). For
example, if an IP address prints as ``0100007f``, that's 127.0.0.1.
doc/intro/upgrade.rst Normal file
@ -0,0 +1,308 @@
==========================================
Upgrading From the Previous Version of Bro
==========================================
.. rst-class:: opening
This guide details specific differences between Bro versions
that may be important for users to know as they work on updating
their Bro deployment/configuration to the later version.
.. contents::
Upgrading From Bro 2.0 to 2.1
=============================
In Bro 2.1, IPv6 is enabled by default. Therefore, when building Bro from
source, the "--enable-brov6" configure option has been removed because it
is no longer relevant.
Other configure changes include renaming the "--enable-perftools" option
to "--enable-perftools-debug" to indicate that the option is only relevant
for debugging the heap. One other change involves what happens when
tcmalloc (part of Google perftools) is found at configure time. On Linux,
it will automatically be linked with Bro, but on other platforms you
need to use the "--enable-perftools" option to enable linking to tcmalloc.
There are a couple of changes to the Bro scripting language to better
support IPv6. First, IPv6 literals appearing in a Bro script must now be
enclosed in square brackets (for example, ``[fe80::db15]``). For subnet
literals, the slash "/" appears after the closing square bracket (for
example, ``[fe80:1234::]/32``). Second, when an IP address variable or IP
address literal is enclosed in pipes (for example, ``|[fe80::db15]|``) the
result is now the size of the address in bits (32 for IPv4 and 128 for IPv6).
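As a small sketch of the new syntax:
.. code:: bro
    event bro_init()
        {
        local h: addr = [fe80::db15];
        local s: subnet = [fe80:1234::]/32;
        print h, s;
        print |h|;    # prints 128, the address size in bits
        }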
In the Bro scripting language, "match" and "using" are no longer reserved
keywords.
Some built-in functions have been removed: "addr_to_count" (use
"addr_to_counts" instead), "bro_has_ipv6" (this is no longer relevant
because Bro now always supports IPv6), "active_connection" (use
"connection_exists" instead), and "connection_record" (use "lookup_connection"
instead).
The "NFS3::mode2string" built-in function has been renamed to "file_mode".
Some built-in functions have been changed: "exit" (now takes the exit code
as a parameter), "to_port" (now takes a string as parameter instead
of a count and transport protocol, but "count_to_port" is still available),
"connect" (now takes an additional string parameter specifying the zone of
a non-global IPv6 address), and "listen" (now takes three additional
parameters to enable listening on IPv6 addresses).
Some Bro script variables have been renamed: "LogAscii::header_prefix"
has been renamed to "LogAscii::meta_prefix", "LogAscii::include_header"
has been renamed to "LogAscii::include_meta".
Some Bro script variables have been removed: "tunnel_port",
"parse_udp_tunnels", "use_connection_compressor", "cc_handle_resets",
"cc_handle_only_syns", and "cc_instantiate_on_data".
A couple of events have changed: the "icmp_redirect" event now includes
the target and destination addresses and any Neighbor Discovery options
in the message, and the last parameter of the "dns_AAAA_reply" event has
been removed because it was unused.
The format of the ASCII log files has changed very slightly. Two new lines
are automatically added, one to record the time when the log was opened,
and the other to record the time when the log was closed.
In BroControl, the option (in broctl.cfg) "CFlowAddr" was renamed
to "CFlowAddress".
Upgrading From Bro 1.5 to 2.0
=============================
As the version number jump suggests, Bro 2.0 is a major upgrade and
lots of things have changed. Most importantly, we have rewritten
almost all of Bro's default scripts from scratch, using quite
different structure now and focusing more on operational deployment.
The result is a system that works much better "out of the box", even
without much initial site-specific configuration. The down-side is
that 1.x configurations will need to be adapted to work with the new
version. The two rules of thumb are:
(1) If you have written your own Bro scripts
that do not depend on any of the standard scripts formerly
found in ``policy/``, they will most likely just keep working
(although you might want to adapt them to use some of the new
features, like the new logging framework; see below).
(2) If you have custom code that depends on specifics of 1.x
default scripts (including most configuration tuning), that is
unlikely to work with 2.x. We recommend starting with just the new
scripts, and then porting over any customizations
incrementally as necessary (they may be much easier to do now,
or even unnecessary). Send mail to the Bro user mailing list
if you need help.
Below we summarize changes from 1.x to 2.x in more detail. This list
isn't complete, see the :download:`CHANGES <CHANGES>` file in the
distribution for the full story.
Default Scripts
===============
Organization
------------
In versions before 2.0, Bro scripts were all maintained in a flat
directory called ``policy/`` in the source tree. This directory is now
renamed to ``scripts/`` and contains major subdirectories ``base/``,
``policy/``, and ``site/``, each of which may also be subdivided
further.
The contents of the new ``scripts/`` directory, like the old/flat
``policy/``, still get installed under the ``share/bro``
subdirectory of the installation prefix path just like previous
versions. For example, if Bro was compiled like ``./configure
--prefix=/usr/local/bro && make && make install``, then the script
hierarchy can be found in ``/usr/local/bro/share/bro``.
The main
subdirectories of that hierarchy are as follows:
- ``base/`` contains all scripts that are loaded by Bro by default
(unless the ``-b`` command line option is used to run Bro in a
minimal configuration). Note that this is a major conceptual change:
rather than not loading anything by default, Bro now uses an
extensive set of default scripts out of the box.
The scripts under this directory generally either accumulate/log
useful state/protocol information for monitored traffic, configure a
default/recommended mode of operation, or provide extra Bro
scripting-layer functionality that has no significant performance cost.
- ``policy/`` contains all scripts that a user will need to explicitly
tell Bro to load. These are scripts that implement
functionality/analysis that not all users may want to use and may have
more significant performance costs. For a new installation, you
should go through these and see what appears useful to load.
- ``site/`` remains a directory that can be used to store locally
developed scripts. It now comes with some preinstalled example
scripts that contain recommended default configurations going beyond
the ``base/`` setup. E.g. ``local.bro`` loads extra scripts from
``policy/`` and does extra tuning. These files can be customized in
place without being overwritten by upgrades/reinstalls, unlike
scripts in other directories.
With version 2.0, the default ``BROPATH`` is set to automatically
search for scripts in ``policy/``, ``site/`` and their parent
directory, but **not** ``base/``. Generally, everything under
``base/`` is loaded automatically, but for users of the ``-b`` option,
it's important to know that loading a script in that directory
requires the extra ``base/`` path qualification. For example, the
following two scripts:
* ``$PREFIX/share/bro/base/protocols/ssl/main.bro``
* ``$PREFIX/share/bro/policy/protocols/ssl/validate-certs.bro``
are referenced from another Bro script like:
.. code:: bro
@load base/protocols/ssl/main
@load protocols/ssl/validate-certs
Notice how ``policy/`` can be omitted as a convenience in the second
case. ``@load`` can now also use relative paths, e.g., ``@load
../main``.
Logging Framework
-----------------
- The logs generated by scripts that ship with Bro are entirely redone
to use a standardized, machine parsable format via the new logging
framework. Generally, the log content has been restructured towards
making it more directly useful to operations. Also, several
analyzers have been significantly extended and thus now log more
information. Take a look at ``ssl.log``.
* A particular format change that may be useful to note is that the
``conn.log`` ``service`` field is derived from DPD instead of
well-known ports (while that was already possible in 1.5, it was
not the default).
* Also, ``conn.log`` now reports raw number of packets/bytes per
endpoint.
- The new logging framework makes it possible to extend, customize,
and filter logs very easily. See the :doc:`logging framework <logging>`
for more information on usage.
- A common pattern found in the new scripts is to store logging stream
records for protocols inside the ``connection`` records so that
state can be collected until enough is seen to log a coherent unit
of information regarding the activity of that connection. This
state is now frequently seen/accessible in event handlers, for
example, like ``c$<protocol>`` where ``<protocol>`` is replaced by
the name of the protocol. This field is added to the ``connection``
record by ``redef``'ing it in a
``base/protocols/<protocol>/main.bro`` script.
- The logging code has been rewritten internally, with script-level
interface and output backend now clearly separated. While ASCII
logging is still the default, we will add further output types in
the future (binary format, direct database logging).
Notice Framework
----------------
The way users interact with "notices" has changed significantly in
order to make it easier to define a site policy and more extensible
for adding customized actions. See the :doc:`notice framework <notice>`.
New Default Settings
--------------------
- Dynamic Protocol Detection (DPD) is now enabled/loaded by default.
- The default packet filter now examines all packets instead of
dynamically building a filter based on which protocol analysis scripts
are loaded. See ``PacketFilter::all_packets`` for how to revert to old
behavior.
API Changes
-----------
- The ``@prefixes`` directive works differently now.
Any added prefixes are now searched for and loaded *after* all input
files have been parsed. After all input files are parsed, Bro
searches ``BROPATH`` for prefixed, flattened versions of all of the
parsed input files. For example, if ``lcl`` is in ``@prefixes``, and
``site.bro`` is loaded, then a file named ``lcl.site.bro`` that's in
``BROPATH`` would end up being automatically loaded as well. Packages
work similarly, e.g. loading ``protocols/http`` means a file named
``lcl.protocols.http.bro`` in ``BROPATH`` gets loaded automatically.
- The ``make_addr`` BIF now returns a ``subnet`` versus an ``addr``.
Variable Naming
---------------
- ``Module`` is more widely used for namespacing. E.g. the new
``site.bro`` exports the ``local_nets`` identifier (among other
things) into the ``Site`` module.
- Identifiers may have been renamed to conform to new `scripting
conventions
<http://www.bro.org/development/script-conventions.html>`_
BroControl
==========
BroControl looks pretty much similar to the version coming with Bro 1.x,
but has been cleaned up and streamlined significantly internally.
BroControl has a new ``process`` command to process a trace on disk
offline using a similar configuration to what BroControl installs for
live analysis.
BroControl now has an extensive plugin interface for adding new
commands and options. Note that this is still considered experimental.
We have removed the ``analysis`` command, and BroControl currently
does not send daily alarm summaries anymore (this may be restored
later).
Removed Functionality
=====================
We have removed a bunch of functionality that was rarely used and/or
had not been maintained for a while:
- The ``net`` script data type.
- The ``alarm`` statement; use the notice framework instead.
- Trace rewriting.
- DFA state expiration in regexp engine.
- Active mapping.
- Native DAG support (may come back eventually).
- ClamAV support.
- The connection compressor is now disabled by default, and will
be removed in the future.
Development Infrastructure
==========================
Bro development has moved from using SVN to Git for revision control.
Users that want to use the latest Bro development snapshot by checking it out
from the source repositories should see the `development process
<http://www.bro.org/development/process.html>`_. Note that all the various
sub-components now reside in their own repositories. However, the
top-level Bro repository includes them as git submodules so it's easy
to check them all out simultaneously.
Bro now uses `CMake <http://www.cmake.org>`_ for its build system so
that is a new required dependency when building from source.
Bro now comes with a growing suite of regression tests in
``testing/``.
doc/misc/geoip.rst Normal file
@ -0,0 +1,102 @@
===========
GeoLocation
===========
.. rst-class:: opening
During the process of creating policy scripts the need may arise
to find the geographic location for an IP address. Bro has support
for the `GeoIP library <http://www.maxmind.com/app/c>`__ at the
policy script level beginning with release 1.3 to account for this
need.
.. contents::
GeoIPLite Database Installation
------------------------------------
A country database for GeoIPLite is included when you do the C API
install, but for Bro we are using the city database, which includes
cities and regions in addition to countries.
`Download <http://www.maxmind.com/app/geolitecity>`__ the geolitecity
binary database and follow the directions to install it.
FreeBSD Quick Install
---------------------
.. console::
pkg_add -r GeoIP
wget http://geolite.maxmind.com/download/geoip/database/GeoLiteCity.dat.gz
gunzip GeoLiteCity.dat.gz
mv GeoLiteCity.dat /usr/local/share/GeoIP/GeoIPCity.dat
# Set your environment correctly before running Bro's configure script
export CFLAGS=-I/usr/local/include
export LDFLAGS=-L/usr/local/lib
CentOS Quick Install
--------------------
.. console::
yum install GeoIP-devel
wget http://geolite.maxmind.com/download/geoip/database/GeoLiteCity.dat.gz
gunzip GeoLiteCity.dat.gz
mkdir -p /var/lib/GeoIP/
mv GeoLiteCity.dat /var/lib/GeoIP/GeoIPCity.dat
# Set your environment correctly before running Bro's configure script
export CFLAGS=-I/usr/local/include
export LDFLAGS=-L/usr/local/lib
Usage
-----
There is a single built in function that provides the GeoIP
functionality:
.. code:: bro
function lookup_location(a:addr): geo_location
There is also the ``geo_location`` data structure that is returned
from the ``lookup_location`` function:
.. code:: bro
type geo_location: record {
country_code: string;
region: string;
city: string;
latitude: double;
longitude: double;
};
Example
-------
To write a line in a log file for every ftp connection from hosts in
Ohio, this is now very easy:
.. code:: bro
global ftp_location_log: file = open_log_file("ftp-location");
event ftp_reply(c: connection, code: count, msg: string, cont_resp: bool)
{
local client = c$id$orig_h;
local loc = lookup_location(client);
if (loc$region == "OH" && loc$country_code == "US")
{
print ftp_location_log, fmt("FTP Connection from:%s (%s,%s,%s)", client, loc$city, loc$region, loc$country_code);
}
}
doc/misc/index.rst Normal file
@ -0,0 +1,9 @@
====================
Miscellaneous Topics
====================
.. toctree::
:maxdepth: 2
geoip
@ -1,5 +0,0 @@
================
Events (Missing)
================
@ -1,5 +0,0 @@
====================
Frameworks (Missing)
====================
@ -1,13 +0,0 @@
=========
Reference
=========
.. toctree::
:maxdepth: 2
:numbered:
frameworks.rst
events.rst
language.rst
subsystems.rst
@ -1,4 +0,0 @@
====================
Subsystems (Missing)
====================
@ -1,7 +1,7 @@
===================
Writing Bro Scripts
===================
.. toctree::
:maxdepth: 2
@ -1,8 +1,21 @@
.. This is a stub doc to which broxygen appends during the build process
================
Script Reference
================
.. toctree::
:maxdepth: 1
builtins
bifs
scripts
packages
internal
Indices
=======
* `Notice Index <bro-noticeindex.html>`_
doc/scripts/scripts.rst Normal file
@ -0,0 +1,8 @@
.. This is a stub doc to which broxygen appends during the build process
========================
Index of All Bro Scripts
========================
.. toctree::
:maxdepth: 1
@ -1,13 +0,0 @@
===========
User Manual
===========
.. toctree::
:maxdepth: 2
:numbered:
intro.rst
quickstart.rst
scripting.rst
@ -1,4 +0,0 @@
======================
Introduction (Missing)
======================
doc/using/index.rst Normal file
@ -0,0 +1,6 @@
===================
Using Bro (Missing)
===================
TODO.