..	-*- mode: rst-mode -*-

======
BinPAC
======

BinPAC is a high-level language for describing protocol parsers that
generates C++ code.  It is currently maintained and distributed with the
Zeek Network Security Monitor distribution; however, the generated parsers
may be used with other programs besides Zeek.

BinPAC originally existed as a project separate from the main Zeek
repository. You can see the archived repository at
https://github.com/zeek/binpac. That repository exists only for historical
reasons; all new work on BinPAC is done in the main Zeek repository.

.. contents::

Prerequisites
=============

BinPAC relies on the following libraries and tools, which need to be
installed before you begin:

    * Flex (Fast Lexical Analyzer)
       Flex is already installed on most systems, so with luck you can
       skip having to install it yourself.

    * Bison (GNU Parser Generator)
       Bison is also already installed on many systems.

    * CMake 3.15.0 or greater
       CMake is a cross-platform, open-source build system, typically
       not installed by default.  See http://www.cmake.org for more
       information regarding CMake and the installation steps below for
       how to use it to build this distribution.  CMake generates native
       Makefiles that depend on GNU Make by default.

Glossary and Convention
=======================

To make this document easier to read, the following glossary and
conventions are used.

    - PAC grammar - .pac file written by the user
    - PAC source - _pac.cc file generated by binpac
    - PAC header - _pac.h file generated by binpac
    - Analyzer - protocol decoder generated by compiling a PAC grammar
    - Field - a member of a record
    - Primary field - a member of a record that is a direct result of parsing
    - Derivative field - a member of a record evaluated through post-processing

BinPAC Language Reference
=========================

The BinPAC language consists of:

    - analyzer
    - type - a data-structure-like definition describing a parsing unit. Types
      can be built on each other to form more complex types, similar to yacc
      productions.
    - flow - defines how data is fed into the analyzer and names the top level
      parsing unit.
    - Keywords
    - Built-in macros

Defining an analyzer
--------------------

There are two components to an analyzer definition: the top level context
and the connection definition.


Context Definition
~~~~~~~~~~~~~~~~~~

Each analyzer requires a top level context defined by the following syntax:

.. code::

 analyzer <ContextName> withcontext {
 ... context members ...
 }

Typically, the top level context contains the connection (the top level
analyzer) and the flow definition, as below:

.. code::

 analyzer HTTP withcontext {
    connection : HTTP_analyzer;
    flow     : HTTP_flow;
 };


Connection Definition
~~~~~~~~~~~~~~~~~~~~~

A "connection" defines the entry point into the analyzer. It consists of
two "flow" definitions, an "upflow" and a "downflow".

.. code::

 connection <AnalyzerName>(optional parameter) {
  upflow = <UpflowConstructor>;
  downflow = <DownflowConstructor>;
 }

Example:

.. code::

 connection HTTP_analyzer {
    upflow = HTTP_flow (true);
    downflow = HTTP_flow (false);
 };

type
----

A "type" is the basic building block of a binpac-generated parser and
describes the structure of a byte segment. Each non-primitive "type"
generates a C++ class that can independently parse the structure it
describes.

Syntax:

.. code::

 type <typeName>{(<optional type parameter(s)>)} = <compositor or primitive class>{
   cases or members declaration.
 } <optional attribute(s)>;

Example:

PAC grammar::

 type myType = record {
    data:uint8;
 };

PAC header::

 class myType{
 public:
    myType();
    ~myType();
    int Parse(const_byteptr const t_begin_of_data, const_byteptr const t_end_of_data);
    uint8 data() const  { return data_; }
 protected:
    uint8 data_;
 };


Primitives
~~~~~~~~~~

Primitive types can be treated like #define in the C language. They are
embedded into the types which reference them, but do not generate any
parsing code of their own. Available primitive types are:

    - int8
    - int16
    - int32
    - uint8
    - uint16
    - uint32
    - Regular expression ( ``type HTTP_URI = RE/[[:alnum:][:punct:]]+/;`` )
    - bytestring

Examples:

.. code::

 type foo = record { x: number; };

is equivalent to:

.. code::

 type foo = record { x: uint8[3]; };

(Note: this behavior may change in future versions of binpac.)

record
~~~~~~

A "record" composes primitive type(s) and other record(s) to create a
new "type". This new "type" can in turn be used as part of a parent type
or directly for parsing.

Example:

.. code::

 type SMB_body = record {
    word_count  : uint8;
    parameter_words : uint16[word_count];
    byte_count  : uint16;
 };

case
~~~~

The "case" compositor allows switching between different parsing methods.

.. code::

 type SMB_string(unicode: bool, offset: int) = case unicode of {
    true  -> u: SMB_unicode_string(offset);
    false -> a: SMB_ascii_string;
 };

A "case" supports an optional "default" label to handle inputs where none
of the other labels match. If no field should follow a given label, the
user can specify an arbitrary field name with the "empty" type, as in
the following example.

.. code::

 type HTTP_Message(expect_body: ExpectBody) = record {
        headers:     HTTP_Headers;
        body_or_not: case expect_body of {
                BODY_NOT_EXPECTED -> none: empty;
                default           -> body: HTTP_Body(expect_body);
        };
 };

Note that only one field is allowed after a given label. If multiple fields
are to be specified, they should be packed into another "record" type first.
The other usages of `case`_ are described later.
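
As an illustration of the packing note above, here is a minimal sketch of
wrapping two fields in a helper record so that a single case branch can
carry both (the type and field names are hypothetical):

.. code::

 type Foo_Pair = record {
    code:   uint8;
    reason: uint16;
 };

 type Foo_Detail(has_pair: bool) = case has_pair of {
    true  -> pair: Foo_Pair;
    false -> none: empty;
 };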

array
~~~~~

A type can be defined as a sequence of "single-type elements". By default,
an array type keeps parsing array elements indefinitely. An array size
can be specified to control the number of matches, or &until can be used
to end parsing conditionally:

.. code::

 # This will match 10 elements only
 type HTTP_Headers = HTTP_Header [10];

 # This will match until the condition is met
 type HTTP_Headers = HTTP_Header [] &until(/*Some condition*/);

Arrays can also be used directly inside a "record". For example:

.. code::

 type DNS_message = record {
  header:      DNS_header;
  question:    DNS_question(this)[header.qdcount];
  answer:      DNS_rr(this, DNS_ANSWER)[header.ancount];
  authority:   DNS_rr(this, DNS_AUTHORITY)[header.nscount];
  additional:  DNS_rr(this, DNS_ADDITIONAL)[header.arcount];
 } &byteorder = bigendian, &exportsourcedata;

flow
----

A "flow" defines how data is fed into the analyzer. It also maintains
custom state information declared by `%member`_. A flow is configured by
specifying the type of its data unit.

Syntax:

.. code::

 flow <Flow name>(<optional attribute>) {
   <flowunit|datagram> = <top level data unit> withcontext (<context constructor parameter>);
 };

When a "flow" is added to the top level analyzer context, it enables use of
&oneline and &length in "record" types. The flow buffers data when there is
not enough to evaluate the record, and dispatches the data for evaluation
once the threshold is reached.

flowunit
~~~~~~~~

When flowunit is used, the analyzer uses the flow buffer to handle
incremental input and provides support for &oneline/&length. For further
detail on this, see `Buffering`_.

.. code::

 flowunit = HTTP_PDU(is_orig) withcontext (analyzer, this);

datagram
~~~~~~~~

In contrast to flowunit, declaring the data unit as a datagram opts out of
the flow buffer. This results in faster parsing, but without incremental
input or buffering support.

.. code::

 datagram = HTTP_PDU(is_orig) withcontext (analyzer, this);

Byte Ordering and Alignment
---------------------------

Byte Ordering
~~~~~~~~~~~~~
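
The &byteorder attribute selects the byte order used when parsing
multi-byte integers; it appears in several of the examples in this
document (e.g. "&byteorder = bigendian"). A minimal sketch, assuming a
big-endian protocol (the type and field names are hypothetical):

.. code::

 type Foo_Header = record {
    length: uint16;
    id:     uint32;
 } &byteorder = bigendian;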

Byte Alignment
~~~~~~~~~~~~~~

.. code::

 type RPC_Opaque = record {
    length: uint32;
    data:   uint8[length];
    pad:    padding align 4;    # pad to 4-byte boundary
 };

Functions
---------

Users can define functions in binpac.
A function can be declared in one of three ways:

PAC with embedded body
~~~~~~~~~~~~~~~~~~~~~~

Declare a PAC-style function prototype and embed the body using %{ %}::

 function print_stuff(value :const_bytestring):bool
 %{
    printf("Value [%s]\n", std_str(value).c_str());
 %}

PAC with PAC-case body
~~~~~~~~~~~~~~~~~~~~~~

A PAC-style function with a case body; this type of declaration is useful
for extending later via "refine casefunc"::

 function RPC_Service(prog: uint32, vers: uint32): EnumRPCService =
    case prog of {
        default -> RPC_SERVICE_UNKNOWN;
    };


Inlined by %code
~~~~~~~~~~~~~~~~

A function can be completely inlined by using %code::

 %code{
 EnumRPCService RPC_Service(const RPC_Call* call)
    {
    return call ? call->service() : RPC_SERVICE_UNKNOWN;
    }
 %}


Extending
---------

PAC code can be extended by using "refine". This is useful for code reuse
and for splitting functionality for parallel development.

Extending record
~~~~~~~~~~~~~~~~

A record can be extended with additional attribute(s) by using
"refine typeattr". One typical use is to add a &let block in order to
separate protocol parsing from protocol analysis.

.. code::

 refine typeattr HTTP_RequestLine += &let {
    process_request: bool =
        process_func(method, uri, version);
 };

Extending type case
~~~~~~~~~~~~~~~~~~~

.. code::

 refine casetype RPC_Params += {
    RPC_SERVICE_PORTMAP -> portmap: PortmapParams(call);
 };

Extending function case
~~~~~~~~~~~~~~~~~~~~~~~

A function declared as a PAC case can be extended by adding additional
cases to the switch.

.. code::

 refine casefunc RPC_BuildCallVal += {
    RPC_SERVICE_PORTMAP ->
        PortmapBuildCallVal(call, call.params.portmap);
 };

Extending connection
~~~~~~~~~~~~~~~~~~~~

A connection can be extended to add functions and members.  Example::

 refine connection RPC_Conn += {
    function ProcessPortmapReply(results: PortmapResults): bool
        %{
        %}
 };

State Management
----------------

State is maintained by extending a parsing class with derivative fields.
State lasts until the top level parsing unit (flowunit/datagram) is
destroyed.
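
A minimal sketch of carrying state this way, modeled on the &let and
"refine connection" examples elsewhere in this document (the function
name note_request and the field noted are hypothetical):

.. code::

 refine connection HTTP_analyzer += {
    function note_request(method: const_bytestring): bool
        %{
        // State kept on the connection object lives as long as the
        // connection; hypothetical bookkeeping would go here.
        return true;
        %}
 };

 refine typeattr HTTP_RequestLine += &let {
    # Derivative field whose evaluation records state on the connection.
    noted: bool = $context.connection.note_request(method);
 };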

Keywords
--------

Source code embedding
~~~~~~~~~~~~~~~~~~~~~

C++ code can be embedded within the .pac file using the following
directives. This code is copied into the final generated code. A combined
sketch follows the list.

- %header{...%}

  Code to be inserted in binpac generated header file.

- %code{...%}

  Code to be inserted at the beginning of binpac generated C++ file.

.. _%member:

- %member{...%}

  Add additional member(s) to the connection (?) and flow classes.

- %init{...%}

  Code to be inserted in flow constructor.

- %cleanup{...%}

  Code to be inserted in flow destructor.
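
A combined, minimal sketch of these directives (the member name
packet_count\_ is hypothetical, and the placement of the blocks in a .pac
file is an assumption based on the descriptions above):

.. code::

 %member{
    // Hypothetical extra member added to the generated class(es).
    int packet_count_;
 %}

 %init{
    // Inserted into the flow constructor.
    packet_count_ = 0;
 %}

 %cleanup{
    // Inserted into the flow destructor; nothing to release here.
 %}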

Embedded pac primitive
~~~~~~~~~~~~~~~~~~~~~~

- ${

- $set{

- $type{

- $typeof{

- $const_def{

Condition checking
~~~~~~~~~~~~~~~~~~

&until
......

"&until" is used in conjunction with an array declaration. It specifies the
exit condition for array parsing.

.. code::

 type HTTP_Headers = HTTP_Header[] &until($input.length() == 0);

&requires
.........

Process data dependencies before evaluating a field.

Typically, a derivative field is evaluated after the primary fields. In the
example below, "&requires" is used to force evaluation of "length" before
"msg_body".

.. code::

 type RPC_Message = record {
    xid:        uint32;
    msg_type:   uint32;
    msg_body:   case msg_type of {
        RPC_CALL    -> call:    RPC_Call(this);
        RPC_REPLY   -> reply:   RPC_Reply(this);
    } &requires(length);
 } &let {
    length = sourcedata.length();   # length of the RPC_Message
 } &byteorder = bigendian, &exportsourcedata, &refcount;

&if
...

Evaluate a field only if the condition is met.

.. code::

 type DNS_label(msg: DNS_message) = record {
    length:     uint8;
    data:       case label_type of {
        0 ->    label:  bytestring &length = length;
        3 ->    ptr_lo: uint8;
    };
 } &let {
    label_type: uint8   = length >> 6;
    last: bool      = (length == 0) || (label_type == 3);
    ptr: DNS_name(msg)
        withinput $context.flow.get_pointer(msg.sourcedata,
            ((length & 0x3f) << 8) | ptr_lo)
        &if(label_type == 3);
    clear_pointer_set: bool = $context.flow.reset_pointer_set()
        &if(last);
 };

.. _case:

case
....

There are two uses of the "case" keyword.

* As part of a record field. In this scenario, it allows alternative
  methods to parse a field.  Example::

    type RPC_Reply(msg: RPC_Message) = record {
      stat:       uint32;
      reply:      case stat of {
          MSG_ACCEPTED -> areply:  RPC_AcceptedReply(call);
          MSG_DENIED   -> rreply:  RPC_RejectedReply(call);
      };
    } &let {
      call: RPC_Call = context.connection.FindCall(msg.xid);
      success: bool = (stat == MSG_ACCEPTED && areply.stat == SUCCESS);
    };


* As a function definition.  Example::

    function RPC_Service(prog: uint32, vers: uint32): EnumRPCService =
        case prog of {
                default -> RPC_SERVICE_UNKNOWN;
        };


Note that one can "refine" both types of cases:

.. code::

 refine casefunc RPC_Service += {
        100000  -> RPC_SERVICE_PORTMAP;
 };

Built-in macros
~~~~~~~~~~~~~~~

$input
......

This macro refers to the data that was passed into the ParseBuffer
function. When $input is used, binpac generates a const_bytestring
which contains the start and end pointers of the input.

PAC grammar::

 &until($input.length()==0);

PAC source::

 const_bytestring t_val__elem_input(t_begin_of_data, t_end_of_data);
 if (  ( t_val__elem_input.length() == 0 )  )

$element
........

$element provides access to the current entry of an array type. The
following are the ways in which $element can be used.

* Current element.  Check on the value of the most recently parsed entry.
  This would get executed after each time an entry is parsed.  Example::

    type SMB_ascii_string       = uint8[] &until($element == 0);

* Current element's field.  Example::

    type DNS_label(msg: DNS_message) = record {
       length:     uint8;
       data:       case label_type of {
           0 ->    label:  bytestring &length = length;
           3 ->    ptr_lo: uint8;
       };
    } &let {
       label_type: uint8 = length >> 6;
       last:       bool  = (length == 0) || (label_type == 3);
    };
    type DNS_name(msg: DNS_message) = record {
       labels:     DNS_label(msg)[] &until($element.last);
    };

$context
........

This macro refers to the analyzer context class (a Context<Name> class is
generated from "analyzer <Name> withcontext {}"). Using this macro, users
can gain access to the "flow" object and the "analyzer" object.
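
A minimal sketch of reaching the flow through the context from inside a
&let block (the type name and the flow-level helper process_code are
hypothetical):

.. code::

 type Foo_PDU = record {
    code: uint8;
 } &let {
    # Call a hypothetical helper defined on the flow object.
    deliver: bool = $context.flow.process_code(code);
 };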

Other keywords
~~~~~~~~~~~~~~

&transient
..........

Do not create a copy of the bytestring.

.. code::

 type MIME_Line = record {
    line:   bytestring &restofdata &transient;
 } &oneline;

&let
....

Adds derivative fields to a record.

.. code::

 type ncp_request(length: uint32) = record {
    data        : uint8[length];
 } &let {
    function    = length > 0 ? data[0] : 0;
    subfunction = length > 1 ? data[1] : 0;
 };

let
...

Declares a global value. If the user does not specify a type,
the compiler assumes the "int" type.

PAC grammar::

 let myValue:uint8=10;

PAC source::

 uint8 const myValue = 10;

PAC header::

 extern uint8 const myValue;
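
If the type is omitted, as noted above, the compiler assumes "int" (the
name below is hypothetical)::

 let myDefault = 10;   # no type given; treated as "int"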

&restofdata
...........

Grab the rest of the data available in the FlowBuffer.

PAC grammar::

    onebyte: uint8;
    value: bytestring &restofdata &transient;

PAC source::

    // Parse "onebyte"
    onebyte_ = *((uint8 const *) (t_begin_of_data));
    // Parse "value"
    int t_value_string_length;
    t_value_string_length = (t_end_of_data) - ((t_begin_of_data + 1));
    int t_value__size;
    t_value__size = t_value_string_length;
    value_.init((t_begin_of_data + 1), t_value_string_length);

&length
.......

&length can appear in two different contexts: as a property of a field
or as a property of a record.

&length as a field property::

 protocol    : bytestring &length = 4;

translates into::

 const_byteptr t_end_of_data = t_begin_of_data + 4;
 int t_protocol_string_length;
 t_protocol_string_length = 4;
 int t_protocol__size;
 t_protocol__size = t_protocol_string_length;
 protocol_.init(t_begin_of_data, t_protocol_string_length);
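
When flowunit buffering is used (see `Buffering`_), &length as a record
property tells the flow buffer how many bytes to accumulate before the
record is evaluated. A minimal sketch, assuming a length-prefixed message
(the type and field names are hypothetical)::

 type Foo_PDU = record {
    len:  uint16;
    data: bytestring &restofdata;
 } &length = len + 2;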


&check
......

This was originally intended to implement the behavior now provided by the
superseding "&enforce" attribute.  It always has been, and always will be,
a no-op, to ensure that anything using it doesn't suddenly and
unintentionally break.

&enforce
........

Check a condition and raise an exception if it is not met.

&chunked and $chunk
...................

When parsing a long field with variable length, "&chunked" can be used to
improve performance. However, chunked fields are not buffered across
packets. Data for the chunk in the current packet can be accessed by
using "$chunk".

&exportsourcedata
.................

The data matched for a particular type can be retained by using
"&exportsourcedata".

.pac file

.. code::

 type myType = record {
    data:uint8;
 } &exportsourcedata;

_pac.h

.. code::

 class myType
 {
 public:
    myType();
    ~myType();
    int Parse(const_byteptr const t_begin_of_data, const_byteptr const t_end_of_data);
    uint8 data() const    { return data_; }
    const_bytestring const & sourcedata() const { return sourcedata_; }
 protected:
    uint8 data_;
    const_bytestring sourcedata_;
 };

_pac.cc

.. code::

 sourcedata_ = const_bytestring(t_begin_of_data, t_end_of_data);
 sourcedata_.set_end(t_begin_of_data + 1);

Source data can be used within the type that matches it or in the parent type.

.. code::

 type myParentType (child:myType) = record {
     somedata:uint8;
 } &let{
    do_something:bool = print_stuff(child.sourcedata);
 };

translates into

.. code::

 do_something_ = print_stuff(child()->sourcedata());

&refcount
.........


withinput
.........


Parsing Methodology
===================

.. _Buffering:

Buffering
---------

binpac supports incremental input to deal with packet fragmentation. This is
done via the FlowBuffer class and by maintaining buffering/parsing states.

FlowBuffer Class
~~~~~~~~~~~~~~~~

FlowBuffer provides two modes of buffering: line and frame. Line mode is
useful for parsing line-based protocols like HTTP. Frame mode is best for
fixed-length messages. The buffering mode can be switched during parsing,
transparently to the grammar writer.

At compile time, binpac calculates the number of bytes required to evaluate
each field. At run time, data is buffered up in the FlowBuffer until there
is enough to evaluate the "record". To optimize the buffering process, if
the FlowBuffer has enough data to evaluate on the first NewData, it only
marks the start and end pointers instead of copying.

- void **NewMessage**\();

  - Advances the orig_data_begin\_ pointer depending on the current mode\_.
    Moves by 1 or 2 characters in LINE_MODE, by frame_length\_ in FRAME_MODE,
    and not at all in UNKNOWN_MODE (the default mode).

  - Sets buffer_n\_ to 0

  - Resets message_complete\_

- void **NewLine**\();

  - Resets frame_length\_ and chunked\_, sets mode\_ to LINE_MODE

- void **NewFrame**\(int frame_length, bool chunked\_);

- void **GrowFrame**\(int new_frame_length);

- void **AppendToBuffer**\(const_byteptr data, int len);

  - Reallocates buffer\_ to fit the new data, then copies the data

- void **ExpandBuffer**\(int length);

  - Reallocates buffer\_ to the new size if the new size is bigger than the
    current size.

  - Sets the minimum size to 512 (optimization?)

- void **MarkOrCopyLine**\();

  - Seeks the current input for an end of line (CR/LF/CRLF depending on the
    line break mode). If one is found, the data is appended to the buffer if a
    buffer has already been created, or marked (by setting frame_length\_) if
    not, to minimize copying. If no end of line is found, the partial data up
    to the end of the input is appended to the buffer, creating one if it does
    not exist.

- const_byteptr **begin**\()/**end**\()

  - Returns buffer\_ and buffer_n\_ if a buffer exists, otherwise
    orig_data_begin\_ and orig_data_begin\_ + frame_length\_.

Parsing States
~~~~~~~~~~~~~~

* buffering_state\_ - each parsing class contains a flag indicating whether
  there is enough data buffered to evaluate the next block.

* parsing_state\_ - each parsing class that consists of multiple parsing
  data units (lines/frames) has this flag indicating the parsing stage. Each
  time new data comes in, it invokes the parsing function and switches on
  parsing_state\_ to determine which sub-parser to use next.

Regular Expression
------------------

Evaluation Order
----------------

Running Binpac-generated Analyzer Standalone
============================================

To run binpac-generated code independently of Zeek, the regex library must
be substituted. Below is one way of doing it, using the following three
header files.

RE.h
----

.. code::

 /*Dummy file to replace Zeek's file*/
 #include "binpac_pcre.h"
 #include "bro_dummy.h"

bro_dummy.h
-----------

.. code::

 #ifndef BRO_DUMMY
 #define BRO_DUMMY
 #define DEBUG_MSG(x...)  fprintf(stderr, x)
 /* Dummy to link; this function is supposed to be in Zeek */
 double network_time();
 #endif

binpac_pcre.h
-------------

.. code::

 #ifndef bro_pcre_h
 #define bro_pcre_h
 #include <stdio.h>
 #include <assert.h>
 #include <string>
 using namespace std;
 // TODO: use configure to figure out the location of pcre.h
 #include "pcre.h"
 class RE_Matcher {
 public:
    RE_Matcher(const char* pat){
        pattern_ = "^";
        pattern_ += "(";
        pattern_ += pat;
        pattern_ += ")";
        pcre_   = NULL;
        pextra_ = NULL;
    }
    ~RE_Matcher() {
        if (pcre_) {
            pcre_free(pcre_);
        }
    }
    int Compile() {
        const char *err = NULL;
        int erroffset = 0;
        pcre_ = pcre_compile(pattern_.c_str(),
                                     0,  // options,
                                     &err,
                                     &erroffset,
                                     NULL);
        if (pcre_ == NULL) {
            fprintf(stderr,
                    "Error in RE_Matcher::Compile(): %d:%s\n",
                    erroffset, err);
            return 0;
        }
        return 1;
    }

    int MatchPrefix (const char* s, int n){
        assert(pcre_);
        const int MAX_NUM_OFFSETS = 30;
        int offsets[MAX_NUM_OFFSETS];
        int ret = pcre_exec(pcre_,
                                    pextra_,  // pcre_extra
                                    //NULL,  // pcre_extra
                                    s, n,
                                    0,     // offset
                                    0,     // options
                                    offsets,
                                    MAX_NUM_OFFSETS);
        if (ret < 0) {
            return -1;
        }
        assert(offsets[0] == 0);
        return offsets[1];
    }
 protected:
    pcre *pcre_;
    pcre_extra *pextra_;  // used by pcre_exec() in MatchPrefix(); stays NULL
    string pattern_;
 };
 #endif

main.cc
-------

In your main source, add this dummy stub.

.. code::

 /* Dummy to link; this function is supposed to be in Zeek */
 double network_time(){
    return 0;
 }


Q & A
=====

* Does &oneline only work when "flow" is used?

  Yes. binpac uses the flowunit definition in "flow" to figure out which
  types require buffering. For those that do, the parse function is:

  .. code::

    bool ParseBuffer(flow_buffer_t t_flow_buffer, ContextHTTP * t_context);

  And the flow_buffer_t code provides the functionality of buffering up to
  one line. That's why &oneline is only active when "flow" is used and the
  type requires buffering.

  In certain cases we might want to use &oneline even if the type does
  not require buffering; binpac currently does not provide such functionality.

* How would incremental input work in the case of regex?

  A regex should not take incremental input. (The binpac compiler will
  complain when that happens.) It should always appear below some type
  that has either &length=... or &oneline.

* What is the role of Context_<Name> class (generated by analyzer <Name>
  withcontext)?

* What is the difference between using ''withcontext'' and not using it?

  withcontext should always be there. It's fine to have an empty context.

* Elaborate on $context and how it is related to "withcontext".

  A "context" parameter is passed to every type. It provides a vehicle to
  pass something to every type without adding a parameter to every type.
  In that sense, it's optional. It exists for convenience.

* Example usage of composite type array.

  Please see HTTP_Headers in http-protocol.pac in the Zeek source code.

* Clarification on "connection" keyword (binpac paper).

* Need a new way to hook additional code to each class besides &let.

* &transient: how is this different from declaring an anonymous field? And
  currently it doesn't seem to do much.

  .. code::

    type HTTP_Header = record {
        name:   HTTP_HEADER_NAME &transient;
        :       HTTP_WS;
        value:  bytestring &restofdata &transient;
    } &oneline;

  .. code::

    // Parse "name"
    int t_name_string_length;
    t_name_string_length =
        HTTP_HEADER_NAME_re_011.MatchPrefix(
            t_begin_of_data,
            t_end_of_data - t_begin_of_data);
    if ( t_name_string_length < 0 )
        {
        throw ExceptionStringMismatch( "./http-protocol.pac:96",
             "|([^: \\t]+:)",
             string((const char *) (t_begin_of_data), (const char *) t_end_of_data).c_str()
             );
        }
    int t_name__size;
    t_name__size = t_name_string_length;
    name_.init(t_begin_of_data, t_name_string_length);

* Detail on the globals ($context, $element, $input...etc)

* How does BinPAC work with dynamic protocol detection?

  Well, you can use the code in DNS-binpac.cc as a reference. First,
  create a pointer to the connection.  (See the example in DNS-binpac.cc)

  .. code::

    interp = new binpac::DNS::DNS_Conn(this);

  Pass the data received from "DeliverPacket" or "DeliverStream" to
  "interp->NewData()".  (Again, see the example in DNS-binpac.cc)

  .. code::

    void DNS_UDP_Analyzer_binpac::DeliverPacket(int len, const u_char* data, bool orig, int seq, const IP_Hdr* ip, int caplen)
        {
        Analyzer::DeliverPacket(len, data, orig, seq, ip, caplen);
        interp->NewData(orig, data, data + len);
        }

* Explanation of &withinput

* Difference between using flow and not using flow (binpac generates Parse
  method instead of ParseBuffer)

* &check currently working?

* Difference between flowunit and datagram, datagram and &oneline, &length?

* Go over TODO list in binpac release

* How does input get handled/buffered when the length is not known (chunked)?

* More features for multi-byte characters? UTF-16, UTF-32, etc.

TODO List
=========

New Features
------------

* Provide a method to match simple ASCII text.

* Allow use of fixed-length arrays in addition to vectors.

Bugs
----

Small clean-ups
~~~~~~~~~~~~~~~

* Remove anonymous field bytestring assignment.

* Redundant overflow checking/more efficient fixed length text copying.

Warning/Errors
~~~~~~~~~~~~~~

Things that the compiler should flag at code generation time:

* Give a warning when &transient is used on a non-bytestring.

* Give a warning when &oneline or &length is used and flowunit is not.

* Give a warning when more than one "connection" is defined.