

@node Analyzers and Events
@chapter Analyzers and Events

@cindex analyzers
@cindex scripts, standard
@cindex standard scripts
In this chapter we detail the different analyzers that Bro provides.
Some analyzers look at traffic in fairly generic terms, such as at
the level of TCP or UDP connections.  Others delve into the specifics
of a particular application that is carried on top of TCP or UDP.

As we use the term here, @emph{analyzer} primarily refers to Bro's event
engine.  We use the term @emph{script} to refer to a set of event handlers
(and related functions and variables) written in the Bro language;
@emph{module} to refer to a script that serves primarily to provide utility
(helper) functions and variables, rather than event handlers; and
@emph{handler} to denote an event handler written in the Bro language.
Furthermore, the @emph{standard script} is the
script that comes with the Bro distribution for handling the events
generated by a particular analyzer.

Note: We also sometimes use @emph{analyzer} to refer to the event
handler that processes events generated by the event engine.

We characterize the analyzers in terms of @emph{what} events they
generate, but don't here go into the details of @emph{how} they
generate the events (i.e., the nitty-gritty C++ implementations of
the analyzers).

@menu
* Activating an Analyzer::
* Module Facility::
* General Processing Events::
* Generic Connection Analysis::
* Site-specific information::
* hot Analyzer::
* scan Analyzer::
* port-name Analysis Script::
* brolite Analysis Script::
* alarm Analysis Script::
* active Analysis Script::
* demux Analysis Script::
* dns Analysis Script::
* finger Analyzer::
* frag Analysis Script::
* hot-ids Analysis Script::
* ftp Analyzer::
* http Analyzer::
* ident Analyzer::
* irc Analyzer::
* login Analyzer::
* pop3 Analyzer::
* portmapper Analyzer::
* analy Analyzer::
* signature Analysis Script::
* SSL Analyzer::
* weird Analysis Script::
* icmp Analyzer::
* stepping Analyzer::
* ssh-stepping Analysis Script::
* backdoor Analyzer::
* interconn Analyzer::
@end menu

@node Activating an Analyzer,
@section Activating an Analyzer

@cindex analyzers, activating
@cindex analyzers, instantiating
@cindex performance, analysis tradeoffs
In general, Bro will only do the work associated with a particular
analyzer if your policy script defines one or more event handlers
associated with the analyzer.  For example, Bro will
instantiate an FTP analyzer only if your script defines an
@code{ftp_request} or @code{ftp_reply} handler.  If it doesn't, then
when a new FTP connection begins, Bro will only instantiate a
generic TCP analyzer for it.  This is an important point, because
some analyzers can require Bro to capture a large volume of
traffic (see @ref{Filtering}) and perform a lot of computation;
therefore, you need to have a way to trade off between the type
of analysis you do and the performance requirements it entails,
so you can strike the best balance for your particular monitoring needs.

@emph{Deficiency: While Bro attempts to instantiate an analyzer if you define a handler for any of the events the analyzer generates, its method for doing so is incomplete: if you only define an analyzer's less mainstream handlers, Bro may fail to instantiate the analyzer.}
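
As a minimal sketch of this behavior, the following policy-script fragment
defines an @code{ftp_request} handler, which per the discussion above should
be enough for Bro to instantiate an FTP analyzer.  The handler's parameter
list is an assumption (this chapter has not yet given the event's signature),
so check it against the standard @code{ftp} script before relying on it.

@example
    # Assumed signature; defining the handler is what activates the analyzer.
    event ftp_request(c: connection, command: string, arg: string)
        @{
        print "FTP request", c$id$orig_h, command, arg;
        @}
@end example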

@menu
* Loading Analyzers::
* Filtering::
@end menu

@node Loading Analyzers,
@subsection Loading Analyzers

@cindex analyzers,loading
The simplest way to use an analyzer is to @code{@@load} the standard script
associated with the analyzer.  (See @ref{load directive} for a discussion
of @code{@@load}).  However, there's nothing magic about these scripts; you can
freely modify or write your own.  The only caveat is that some scripts
@code{@@load} other scripts, so the original version may wind up being loaded
even though you've also written your own version.
@emph{Deficiency: It would be useful to have a mechanism to fully override one script with another.}

In this chapter we discuss each of the standard scripts
as we discuss their associated analyzers.

@node Filtering,
@subsection Filtering

@cindex filters
@cindex analyzers, filtering
@cindex performance, filtering
Most analyzers require Bro to capture a particular type of network traffic.
These traffic flows can vary immensely in volume, so different analyzers
can cost greatly differing amounts in terms of performance.
Bro declares two redefinable tables in @code{pcap.bro}
that have special interpretations with regard to filtering:

@example
    global capture_filters: table[string] of string &redef;
    global restrict_filters: table[string] of string &redef;
@end example

The key strings serve as a user-definable identifier for the filter
strings they are associated with.  The entries of the @code{capture_filters}
table define what traffic Bro @emph{should} capture, while @code{restrict_filters}'
entries @emph{limit} what traffic Bro captures. Bro builds the following @code{tcpdump}
filter from both tables:
@quotation
(@emph{OR of capture_filters' entries}) and (@emph{AND of restrict_filters' entries})
@end quotation

@cindex restricting traffic
@cindex traffic, restricting
@cindex excluding hosts
@cindex hosts, excluding
Thus, repeated @ref{Refinement}s of @code{capture_filters} using the @code{+=} initializer
are combined using logical ``OR''s, whereas for @code{restrict_filters} ``AND''s are
used.  This follows from the tables' respective purposes---@code{capture_filters}
permits @emph{any} of its components, while @code{restrict_filters} rejects
everything that does not comply with @emph{all} of its components.

@cindex filtering, default
@cindex default, filtering
If you do not define @code{capture_filters}, then its value
is set to ``@code{tcp or udp}'';
if you do not define @code{restrict_filters}, then no restriction is
in effect.

Here is an example. If you specify:
@example
    redef capture_filters = @{ ["HTTP"] = "port http" @};
    redef restrict_filters = @{ ["mynet"] = "net 128.3" @};
@end example

then the corresponding @code{tcpdump} filter will be:
@example
    (port http) and (net 128.3)
@end example

which will capture only the TCP port 80 traffic that has either a source
or destination address belonging to the @code{128.3} network (i.e.,
@code{128.3/16}). A more complex example:

@example
    redef capture_filters += @{ ["DNS"] = "udp port 53" @};
    redef capture_filters += @{ ["FTP"] = "port ftp" @};

    redef restrict_filters += @{ ["foonet"] = "net 128.3" @};
    redef restrict_filters += @{ ["noflood"] = "not host syn-flood.magnet.com" @};
@end example

yields this @code{tcpdump} filter:

@example
    ((udp port 53) or (port ftp)) and ((net 128.3) and (not host syn-flood.magnet.com))
@end example

@cindex filters, displaying
As you add analyzers, the final @code{tcpdump} filter can become quite
complicated.  You can load the predefined @code{print-filter} script
to print out the resulting filter.
This script handles the @code{bro_init} event and exits Bro after printing
the filter.  Its intended use is that you can add it to the Bro command
line (``@code{bro @emph{my-own-script} print-filter}'') when you want to
see what filter the script @emph{my-own-script} winds up using.

@cindex debugging,filtering problems
@cindex filters,errors
@cindex tcpdump,bugs
@cindex bugs,tcpdump

There are two particular uses for @code{print-filter}.  The first is to debug
filtering problems.  Unfortunately, Bro sometimes
uses sufficiently complicated expressions that they tickle bugs in
@code{tcpdump}'s optimizer.  You can take the filter printed out for
your script and try running it through @code{tcpdump} by hand, and
then also try using @code{tcpdump}'s
@code{-O} option to see if turning
off the optimizer fixes the problem.
@cindex shadowing
The second use is to provide a @emph{shadow} backup to Bro: that is,
a version of @code{tcpdump}
running either on the same machine or a
separate machine that uses the same network filter as Bro.  While
@code{tcpdump} can't perform any analysis of the traffic, the shadow
guards against the possibility of Bro crashing, because if it does,
you will still have a record of the subsequent network traffic which
you can run through Bro for post-analysis.

@cindex filters
@cindex analyzers, filtering

@node Module Facility
@section Module Facility

The module facility implements namespaces. Everything is in some namespace
or other. The default namespace is called "GLOBAL" and is searched by
default when doing name resolution. The scoping operator is "::" as in
C++. You can only access things in the current namespace, things in the
GLOBAL namespace, or things that have been explicitly exported from a
different namespace. Exported variables and functions still require
fully-qualified names. The syntax is as follows:

@verbatim
module foo;  # Sets the current namespace to "foo"
  export {
    int i;
    int j;
  }
  int k;

module bar;
  int i;

  foo::i = 1;
  bar::i = 2;
  print i;    # bar::i (since we're currently in module bar)
  j = 3;      # ERROR: j is exported, but the fully qualified name
              #        foo::j is required
  foo::k = 4; # ERROR: k is not exported

@end verbatim

The same goes for calling functions.

One restriction currently in place is that variables not in the "GLOBAL"
namespace can't shadow those in GLOBAL, so you can't have:

@example
    module GLOBAL;
    global i: int;

    module other_module;
    global i: int;
@end example

It is a little confusing that the "global" declaration really only means
that the variable i is global to the current module, not that it is truly
global and thus visible everywhere (that would require that it be in
GLOBAL, or if using the full name is okay, that it be exported).  Perhaps
there will be a change to the syntax in the future to address this.

The "module" statement cuts across @@load commands, so that if you say:

@example
    module foo;
    @@load other_script;
@end example

then other_script will be in module foo. Likewise if other_script changes
to module bar, then the current module will be module bar even after
other_script is done.  However, this functionality may change in the future
if it proves problematic.

The policy scripts in the Bro distribution have not yet been updated to
use it, but there is a backward-compatibility feature so that existing
scripts should work without modification. In particular, everything is
put in GLOBAL by default.


@node General Processing Events,
@section General Processing Events

@cindex general Bro processing events
@cindex events, general Bro processing
Bro provides the following events relating to its overall processing:

@table @samp
@cindex initialization event
@cindex startup, event
@cindex events, initialization
@cindex events, startup
@item @code{bro_init ()}
is generated when Bro first starts up: specifically, after Bro has initialized
the network (or initialized to read from a save file) and executed any
initializations and global statements, and just before
Bro begins to read packets from the network input source(s).

@cindex termination event
@cindex finish event
@cindex network cleanup event
@cindex events, termination
@cindex events, finish

@item @code{net_done (t: time)}
generated when Bro has finished reading from the network,
due to either having exhausted reading the save file(s), or having
received a terminating signal (See @ref{General Processing Events}).  @emph{Deficiency: This event is generated on a terminating signal even if Bro is not reading network traffic. }  @code{t} gives the time at which network processing
finished.

This event is generated @emph{before} @code{bro_done}.  Note: If Bro
terminates due to an invocation of @code{exit}, then this event is
@emph{not} generated.

@cindex termination event
@cindex finish event
@cindex cleanup event
@cindex events, termination
@cindex events, finish

@item @code{bro_done ()}
generated when Bro is about to terminate, either due to having exhausted
reading the save file(s), receiving a terminating signal
(see @ref{General Processing Events}), or because Bro was run without the network input
source and has finished executing any global statements.

This event is generated @emph{after} @code{net_done}.  If you have cleanup
that only needs to be done when processing network traffic, it likely is
better done using @code{net_done}.  Note: If Bro terminates due to an
invocation of @code{exit}, then this event is @emph{not} generated.

@cindex signal handling
@cindex handling signals
@cindex SIGTERM

@cindex TERM
@cindex SIGINT

@cindex SIGHUP

@item @code{bro_signal (signal: count)}
generated when Bro receives a signal.  Currently, the signals Bro
handles are @emph{SIGTERM}, @emph{SIGINT}, and @emph{SIGHUP}.

Receiving either of the first two terminates Bro, though if Bro is in the
middle of processing a set of events, it first finishes with them before
shutting down.  The shutdown leads to invocations of @code{net_done}
and @code{bro_done}, in that order.  @emph{Deficiency: In this case, Bro fails to invoke @code{bro_signal}, clearly a bug. }

Upon receiving @emph{SIGHUP}, Bro invokes @code{flush_all}  (in addition
to your handler, if any).

@cindex network statistics
@cindex packets, drops
@item @code{net_stats_update (t: time, ns: net_stats)}
This event includes two arguments, @code{t}, the @code{time} at which
the event was generated, and @code{ns}, a @code{net_stats} record,
as defined in the example below.
Regarding this second parameter,
the @code{pkts_recvd} field gives the total number of packets accepted
by the packet filter so far during this execution of Bro; @code{pkts_dropped}
gives the total number of packets reported @emph{dropped} by the kernel;
and @code{interface_drops} gives the total number of packets reported
by the kernel as having been dropped by the network interface.

Note: An important consideration is that, as shown by experience, the
kernel's reporting of these statistics is not always accurate.
In particular, the @code{$pkts_dropped}
statistic is sometimes missing actual packet drops, and some operating
systems do not support the @code{interface_drops} statistic at all.
See the @code{ack_above_hole} event for an alternate
way to detect if packets are being dropped.

@end table

@example
type net_stats: record @{
    # All counts are cumulative.
    pkts_recvd: count;       # Number of packets received so far.
    pkts_dropped: count;     # Number of packets *reported* dropped.
    interface_drops: count;  # Number of drops reported by interface(s).
@};
@end example
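
The ordering of these events can be observed with a small sketch such as the
following.  It uses only the event signatures and @code{net_stats} fields
listed above; the printed messages are purely illustrative.

@example
    event bro_init()
        @{
        print "starting up";
        @}

    event net_stats_update(t: time, ns: net_stats)
        @{
        # Counts are cumulative, so these are totals since startup.
        print "stats", t, ns$pkts_recvd, ns$pkts_dropped, ns$interface_drops;
        @}

    event net_done(t: time)
        @{
        print "network processing finished", t;
        @}

    event bro_done()
        @{
        # Generated after net_done; not generated if Bro terminates via exit.
        print "terminating";
        @}
@end example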

@node Generic Connection Analysis,
@section Generic Connection Analysis

@cindex analyzers, generic
@cindex connection, analysis
@cindex connection, generic analysis
@cindex generic connection analysis
The @code{conn} analyzer performs generic connection analysis:
connection start time, duration, sizes, hosts, and the like.  You don't
in general load @code{conn} directly, but instead do so implicitly
by loading the @code{tcp}, @code{udp}, or @code{icmp}
analyzers.
Consequently, @code{conn} doesn't configure @code{capture_filters}
by itself, but instead uses whatever is set up by these more specific
analyzers.

@code{conn} analyzes a number of events related to connections beginning
or ending.  We first describe the @code{connection} record data type that
keeps track of the state associated with each connection (See @ref{connection record}),
and then we detail the events in @ref{Generic TCP connection events}.  The main output of its
analysis is a set of one-line connection summaries, which we describe in
@ref{Connection summaries}, and in @ref{Connection functions} we give an overview
of the different callable functions provided by @code{conn}.

@code{conn} also loads three other Bro modules: the @code{hot}
and @code{scan} analyzers, and the @code{port_name} utility
module.

@menu
* connection record::
* Definitions of connections::
* Generic TCP connection events::
* tcp analyzer::
* udp analyzer::
* Connection summaries::
* Connection functions::
@end menu

@node connection record,
@subsection The @code{connection} record

@cindex record, connection
@cindex connection record

@example
type conn_id: record @{
    orig_h: addr;  # Address of originating host.
    orig_p: port;  # Port used by originator.
    resp_h: addr;  # Address of responding host.
    resp_p: port;  # Port used by responder.
@};

type endpoint: record @{
    size: count;  # Bytes sent by this endpoint so far.
    state: count; # The endpoint's current state.
@};

type connection: record @{
    id: conn_id;        # Originator/responder addresses/ports.
    orig: endpoint;     # Endpoint info for originator.
    resp: endpoint;     # Endpoint info for responder.
    start_time: time;   # When the connection began.
    duration: interval; # How long it was active (or has been so far).
    service: string;    # The service we associate with it (e.g., "http").
    addl: string;       # Additional information associated with it.
    hot: count;         # How many times we've marked it as sensitive.
@};
@end example

@cindex connection, addresses
@cindex connection, initiator
@cindex connection, originator
A @code{connection} record holds the state associated with a connection,
as shown in the example above.
Its first field, @emph{id}, is
defined in terms of the conn_id record, which has the
following fields:

@table @samp

@item @code{orig_h}
The IP address of the host that originated (initiated) the connection.
In ``client/server'' terminology, this is the ``client.''

@cindex connection, ports
@item @code{orig_p}
The TCP or UDP port used by the connection originator (client).  For
ICMP ``connections'', it is set to 0 (see @ref{icmp Analyzer}).

@cindex connection, addresses
@cindex connection, initiator
@cindex connection, originator
@item @code{resp_h}
The IP address of the host that responded to (received) the connection.
In ``client/server'' terminology, this is the ``server.''

@cindex connection, ports
@item @code{resp_p}
The TCP or UDP port used by the connection responder (server).  For
ICMP ``connections'', it is set to 0 (see @ref{icmp Analyzer}).

@end table

The @code{orig} and @code{resp} fields of a @code{connection}
record both hold @code{endpoint} record values, which consist
of the following fields:

@table @samp
@cindex connection, bytes
@cindex connection, size
@item @code{size}
How many bytes the given endpoint has transmitted so far.  Note that
for some types of filtering, the size will be zero until the connection
terminates, because the nature of the filtering is to discard the
connection's intermediary packets and only capture its start/stop packets.

@cindex connection, state
@item @code{state}
The current state the endpoint is in with respect to the connection.
The table below defines the different possible states
for TCP and UDP connections.
@emph{Deficiency: The states are currently defined as @code{count}, but should instead be an enumerated type; but Bro does not yet support enumerated types.}

Note: UDP ``connections'' do not have a well-defined structure, so
the states for them are quite simplistic.  See @ref{Definitions of connections} for
further discussion.

@end table

The remaining fields in a @code{connection}
record are:

@table @samp
@cindex connection, start time
@cindex beginning time of a connection
@cindex start time of a connection
@item @code{start_time}
The time at which the first packet associated with this connection was seen.

@cindex connection, duration
@cindex duration of a connection
@item @code{duration}
How long the connection lasted, or, if it is still active, how long since
it began.

@cindex connection, service
@cindex service associated with a connection

@item @code{service}
The name of the service associated with the connection.  For example,
if @code{$id$resp_p} is @code{tcp/80}, then the service will be
@code{"http"}.  Usually, this mapping is provided by the global variable, perhaps via the @code{endpoint_id} function; but
the service does not always directly correspond to
@code{$id$resp_p}, which is why it's a separate field.  In particular,
an FTP data connection can have a @code{service} of @code{"ftp-data"}
even though its @code{$id$resp_p} is something other than @code{tcp/20}
(which is not consistently used by FTP servers).

If the name of the service has not yet been determined, then this field
is set to an empty string.

@cindex connection, additional information
@cindex information associated with a connection
@cindex additional information associated with a connection

@item @code{addl}
Additional information associated with the connection.  For example,
for a @emph{login} connection, this is the username associated with
the login.

@emph{Deficiency: A significant deficiency associated with the @code{addl} field is that it is simply a @code{string} without any further structure. In practice, this has proven too restrictive.  For example, we may well want to associate an unambiguous username with a login session, and also keep track of the names associated with failed login attempts.  (See the @code{login} analyzer for an example of how this is implemented presently.)  What's needed is a notion of @code{union} types which can then take on a variety of values in a type-safe manner. }

If no additional information is yet associated with this connection,
then this field is set to an empty string.

@cindex connection, hot
@cindex connection, sensitivity
@cindex sensitivity associated with a connection

@item @code{hot}
How many times this connection has been marked as potentially sensitive
or reflecting a break-in.  The default value of 0 means that so far the
connection has not been regarded as ``hot''.

Note: Bro does not presently make fine-grained use of this field; the
standard scripts alarm on connections with a non-zero @code{hot} field,
and do not in general alarm on those that do not, though there are exceptions.
In particular, the @code{hot} field is @emph{not} rigorously maintained
as an indicator of trouble; it instead is used loosely as an indicator
of particular types of trouble (access to sensitive hosts or usernames).

@end table
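
To illustrate how these fields are accessed from a policy script, here is a
minimal sketch of a handler that prints a few of them using Bro's @code{$}
field-access operator.  It relies only on the record definitions shown above;
the choice of the @code{connection_established} event and the output layout
are just for illustration.

@example
    event connection_established(c: connection)
        @{
        local id = c$id;
        print "established", c$start_time, id$orig_h, id$orig_p,
              id$resp_h, id$resp_p;

        # service is an empty string until it has been determined.
        if ( c$service != "" )
            print "service", c$service;

        # hot counts how many times the connection was marked sensitive.
        if ( c$hot > 0 )
            print "marked hot", c$hot;
        @}
@end example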

@node Definitions of connections,
@subsection Definitions of connections

@cindex connection, definitions

@cindex TCP, connections
@cindex connection, TCP
@cindex tcp_inactivity_timeout
Connections
for TCP are well-defined, because establishing and terminating a connection
are central parts of the TCP protocol. Beyond those, Bro enforces a hard
connection timeout after the period of time specified through the
@code{tcp_inactivity_timeout} variable, defined in bro.init.

@cindex UDP, connections

@cindex connection, UDP
@cindex UDP, timeout
@cindex udp_inactivity_timeout
For UDP, a connection begins when host @emph{A} sends
a packet to host @emph{B} for the first time, @emph{B} never having sent anything
to @emph{A}.  This transmission is termed a @emph{request}, even if in fact
the application protocol being used is not based on requests and replies.
If @emph{B} sends a packet back, then that packet is termed a @emph{reply}.
Each packet @emph{A} or @emph{B} sends is another request or reply.
UDP connection timeouts are specified through the @code{udp_inactivity_timeout}
variable, defined in bro.init.

@cindex ICMP, connections
@cindex connection, ICMP
@cindex ICMP, timeout
@cindex icmp_inactivity_timeout
For ICMP, Bro likewise creates a connection the first time it sees
an ICMP packet from @emph{A} to @emph{B}, even if @emph{B} previously sent a packet
to @emph{A}, because that earlier packet would have been for a different
@emph{transport} connection than the ICMP itself---the ICMP will likely
@emph{refer} to that connection, but it itself is not part of
the connection.  For simplicity, this holds even for ICMP ECHOs and
ECHO_REPLYs; if you want to pair them up, you need to do so explicitly
in the policy script.
ICMP connection timeouts are specified through the @code{icmp_inactivity_timeout}
variable, defined in bro.init.
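
If these timeout variables are declared redefinable in bro.init (an
assumption; this chapter does not say so explicitly), a site policy could
adjust them along the following lines.  The interval values are arbitrary
examples, not recommendations.

@example
    # Assumes each variable carries the &redef attribute in bro.init.
    redef tcp_inactivity_timeout = 30 min;
    redef udp_inactivity_timeout = 1 min;
    redef icmp_inactivity_timeout = 1 min;
@end example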

@node Generic TCP connection events,
@subsection Generic TCP connection events

@cindex events, generic TCP connection
@cindex connection, events

@cindex TCP, events
@cindex TCP-specific connection events
@cindex connection events, TCP-specific
There are a number of generic events associated with TCP connections,
all of which have a single @code{connection} record as their argument:

@table @samp
@cindex connection, new
@cindex new connection
@item @code{new_connection}
Generated whenever state for a new (TCP) connection is instantiated.

Note: Handling this event is potentially expensive.  For example,
during a SYN flooding attack, every spoofed SYN packet will lead to
a new @code{new_connection} event.

@cindex connection, establishment
@cindex established connections
@item @code{connection_established}
Generated when a connection has become established, i.e., both participating
endpoints have agreed to open the connection.

@cindex connection, attempt
@cindex attempted connections
@item @code{connection_attempt}
Generated when the originator (client) has unsuccessfully attempted to
establish a connection.  ``Unsuccessful'' is defined as at least
@code{ATTEMPT_INTERVAL} seconds having elapsed since the client first
sent a connection establishment packet to the responder (server),
where @code{ATTEMPT_INTERVAL} is an internal Bro variable which is
presently set to 300 seconds.
@emph{Deficiency: This variable should be user-settable.}  If you want to
@emph{immediately} detect that a client is attempting to connect to
a server, regardless of whether it may soon succeed, then you want
to handle the @code{new_connection} event instead.

Note: Handling this event is potentially expensive.  For example,
during a SYN flooding attack, every spoofed SYN packet will lead to
a new @code{connection_attempt} event, albeit delayed by
@code{ATTEMPT_INTERVAL}.

@cindex crud
@cindex scanning, stealth
@cindex stealth scans
@cindex connection, partial
@cindex partial connections

@item @code{partial_connection}
Generated when both connection endpoints enter the @code{TCP_PARTIAL} state.
This means that we have seen traffic generated by each endpoint, but
the activity did not begin with the usual connection establishment.
@emph{Deficiency: For completeness, Bro's event engine should generate another form of @code{partial_connection} event when a single endpoint becomes active (see @code{new_connection} above).  This hasn't been implemented because our experience is that network traffic often contains a great deal of ``crud'', which would lead to a large number of these really-partial events.  However, by not providing the event, we miss an opportunity to detect certain forms of stealth scans until they begin to elicit some form of reply.}


@float Table, Connection States
@multitable  @columnfractions .25 .6
@item @strong{State} @tab @strong{Meaning}
@item @code{TCP_INACTIVE}
@tab The endpoint has not sent any traffic.
@item @code{TCP_SYN_SENT}
@tab It has sent a SYN to initiate a connection.
@item @code{TCP_SYN_ACK_SENT}
@tab  It has sent a SYN ACK to respond to a connection request.
@item @code{TCP_PARTIAL}
@tab The endpoint has been active, but we did not see the beginning of the connection.
@item @code{TCP_ESTABLISHED}
@tab The two endpoints have established a connection.
@item @code{TCP_CLOSED}
@tab The endpoint has sent a FIN in order to close its end of the connection.
@item @code{TCP_RESET}
@tab The endpoint has sent a RST to abruptly terminate the connection.
@item @code{UDP_INACTIVE}
@tab The endpoint has not sent any traffic.
@item @code{UDP_ACTIVE}
@tab The endpoint has sent some traffic.
@end multitable
@caption{TCP and UDP connection states, as stored in an @code{endpoint} record}
@end float

@*


@cindex connection, finished
@cindex completed connections

@item @code{connection_finished}
Generated when a connection has gracefully closed.

@cindex connection, rejected
@cindex rejected connections

@item @code{connection_rejected}
Generated when a server rejects a connection attempt by a client.

@cindex TCP Wrappers, reset vs. rejected connections
Note: This event is only generated as the client attempts to establish
a connection.  If the server instead accepts the connection and then
later aborts it, a @code{connection_reset} event is generated (see below).
This can happen, for example, due to use of TCP Wrappers.

Note: Per the discussion above, a client attempting to connect to a server
will result in @emph{one} of @code{connection_attempt},
@code{connection_established}, or @code{connection_rejected}; they are
mutually exclusive.

@cindex connection, completion
@cindex connection, half finished
@cindex half-finished connections
@item @code{connection_half_finished}
Generated when Bro sees one endpoint of a connection attempt to gracefully
close the connection, but the other endpoint is in the @code{TCP_INACTIVE}
state.  This can happen due to @emph{split routing},
in which Bro only sees one side of a connection.

@cindex connection, reset
@cindex reset connections
@item @code{connection_reset}
Generated when one endpoint of an established
connection terminates the connection
abruptly by sending a TCP RST packet.

@cindex connection, partial close
@cindex partially closed connections
@item @code{connection_partial_close}
Generated when a previously inactive endpoint attempts to close a connection
via a normal FIN handshake or an abort RST sequence.  When it sends one
of these packets, Bro waits @code{PARTIAL_CLOSE_INTERVAL} (an
internal Bro variable set to 10 seconds) prior to generating the event,
to give the other endpoint a chance to close the connection normally.

@cindex connection, pending
@cindex pending connections
@item @code{connection_pending}
Generated for each still-open connection when Bro terminates.

@end table
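
As a sketch of how these events might be used together, the following
fragment counts unsuccessful connection attempts per originating host.
Because @code{connection_attempt} and @code{connection_rejected} are mutually
exclusive for a given connection, no connection is counted twice.  The table
name @code{failed_attempts} is hypothetical, and a production script would
also need to bound the table's size.

@example
    # Hypothetical table: number of failed attempts seen per originator.
    global failed_attempts: table[addr] of count &default = 0;

    event connection_attempt(c: connection)
        @{
        ++failed_attempts[c$id$orig_h];
        @}

    event connection_rejected(c: connection)
        @{
        ++failed_attempts[c$id$orig_h];
        @}

    event bro_done()
        @{
        for ( h in failed_attempts )
            print h, failed_attempts[h];
        @}
@end example

Keep in mind the notes above: during a SYN flood each spoofed SYN eventually
produces a @code{connection_attempt} event, so a handler like this can be
expensive.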

@node tcp analyzer,
@subsection The @code{tcp} analyzer

@cindex TCP, analysis
@cindex SYN control packet
@cindex FIN control packet
@cindex RST control packet
@cindex TCP control packets (SYN/FIN/RST)
@cindex control packets (SYN/FIN/RST)
@cindex packets, control (SYN/FIN/RST)
The general @code{tcp} analyzer lets you specify that you're interested in
generic connection analysis for TCP.  It
simply @code{@@load}'s @code{conn} and adds the following
to @code{capture_filters}:
@example
    tcp[13] & 0x7 != 0
@end example

which instructs Bro to capture all TCP SYN, FIN and RST packets;
that is, the control packets that delineate the beginning (SYN)
and end (FIN) or abnormal termination (RST) of a connection.
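
Written as a refinement in a policy script, a filter of this kind might look
like the sketch below.  The table key @code{"tcp-control"} is a hypothetical
name chosen for illustration; the key used by the standard script may differ.
In the filter expression, @code{tcp[13]} is the TCP flags byte, and the mask
@code{0x7} selects the FIN (@code{0x1}), SYN (@code{0x2}), and RST
(@code{0x4}) bits.

@example
    # Capture only TCP packets with at least one of FIN, SYN, or RST set.
    redef capture_filters += @{ ["tcp-control"] = "tcp[13] & 0x7 != 0" @};
@end example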

@node udp analyzer,
@subsection The @code{udp} analyzer

@cindex UDP, analysis
The general @code{udp} analyzer lets you specify that you're interested in
generic connection analysis for UDP.  It
@code{@@load}'s both @code{hot} and @code{conn}, and defines two event handlers:

@table @samp
@item @code{udp_request (u: connection)}
Invoked whenever a UDP packet is seen on the forward (request) direction of
a UDP connection.  See @ref{Definitions of connections} for a discussion of how Bro
defines UDP connections.

The analyzer invokes @code{check_hot} with a mode of @code{CONN_ATTEMPTED}
and then @code{record_connection} to generate a connection summary
(necessary because Bro does not time out UDP connections, and hence
cannot generate a connection-attempt-failed event).

@item @code{udp_reply (u: connection)}
Invoked whenever a UDP packet is seen on the reverse (reply) direction of
a UDP connection.  See @ref{Definitions of connections} for a discussion of how Bro
defines UDP connections.

The analyzer invokes @code{check_hot} with a mode of @code{CONN_ESTABLISHED}
and then again with a mode of @code{CONN_FINISHED} to cover the general
case that the reply reflects that the connection was both established and
is now complete.  Finally, it invokes @code{record_connection} to
generate a connection summary.

@end table

Note: The standard script does @emph{not} update @code{capture_filters}
to capture UDP traffic.  Unlike for TCP, where there is a natural generic
filter that captures only a subset of the traffic, the only natural UDP
filter would be simply to capture all UDP traffic, and that can often be
a huge load.

@node Connection summaries,
@subsection Connection summaries
@cindex connection, summaries

The main output of @code{conn} is a one-line ASCII summary
of each connection.  By tradition, these summaries are written to
a file with the name @code{conn.tag.log}, where @emph{tag} uniquely
identifies the Bro session generating the logs.

The summaries are produced by the @code{record_connection} function,
and have the following format:

@quotation
@code{ <@emph{start}> <@emph{duration}> <@emph{local IP}> <@emph{remote IP}>
<@emph{service}> <@emph{local port}> <@emph{remote port}>
<@emph{protocol}> <@emph{org bytes sent}> <@emph{res bytes sent}>
<@emph{state}> <@emph{flags}> <@emph{tag}>}
@end quotation

@cindex connection, start time
@cindex beginning time of a connection
@cindex start time of a connection
@table @samp
@item @emph{start}
corresponds to the connection's start time, as defined by @code{start_time}.
@cindex connection, duration
@cindex duration of a connection
@item @emph{duration}
gives the connection's duration, as defined by @code{duration}.
@cindex connection, hosts
@cindex hosts, in a connection
@cindex connection, addresses
@cindex addresses, in a connection
@item @emph{local IP}, @emph{remote IP}
correspond to the @emph{local} and @emph{remote} addresses
that participated in the connection, respectively.  The notion of which
addresses are local is controlled by the
global variable @code{local_nets}, which has a default value of empty.  If
@code{local_nets} has @emph{not} been redefined, then @emph{local IP} is the
connection @emph{responder} and @emph{remote IP} is the connection @emph{originator}.
@cindex service associated with a connection
@item @emph{service}
is the connection's service, as defined by @code{service}.
@cindex ports associated with a connection
@item @emph{local port}, @emph{remote port}
are the ports used by the connection.
@cindex connection, size
@cindex connection, bytes
@cindex size of connection
@cindex bytes in connection
@item @emph{org bytes sent}, @emph{res bytes sent}
give the number of bytes sent by the @emph{originator}
and @emph{responder}, respectively.  These correspond to the @code{size}
fields of the corresponding @code{endpoint} records.
@cindex connection, state
@cindex state of connection
@item @emph{state}
reflects the state of the connection at the time
the summary was written (which is usually either when the connection
terminated, or when Bro terminated).  The different states are summarized
in the table below.
@quotation
@float Table, Connection State Summaries
@multitable  @columnfractions .15 .6
@item @strong{Name} @tab @strong{Meaning}
@item @code{S0}
@tab Connection attempt seen, no reply.
@item @code{S1}
@tab Connection established, not terminated.
@item @code{SF}
@tab Normal establishment and termination. Note that this is the
same symbol as for state S1. You can tell the two apart because
for S1 there will not be any byte counts in the summary, while
for SF there will be.
@item @code{REJ}
@tab Connection attempt rejected.
@item @code{S2}
@tab Connection established and close attempt by originator seen
(but no reply from responder).
@item @code{S3}
@tab Connection established and close attempt by responder seen
(but no reply from originator).
@item @code{RSTO}
@tab Connection established, originator aborted (sent a RST).
@item @code{RSTR}
@tab Established, responder aborted.
@item @code{RSTOS0}
@tab Originator sent a SYN followed by a RST, we never saw a SYN
ACK from the responder.
@item @code{RSTRH}
@tab Responder sent a SYN ACK followed by a RST, we never saw
a SYN from the (purported) originator.
@item @code{SH}
@tab Originator sent a SYN followed by a FIN, we never saw a
SYN ACK from the responder (hence the connection was "half" open).
@item @code{SHR}
@tab Responder sent a SYN ACK followed by a FIN, we never saw
a SYN from the originator.
@item @code{OTH}
@tab No SYN seen, just midstream traffic (a "partial connection" that
was not later closed).
@end multitable
@caption{Summaries of connection states, as reported in @code{conn.log} files}
@end float
@end quotation

The ASCII @code{Name} given in the Table is
what appears in the @code{conn.tag.log} log file; it is returned by the @code{conn_state}
function.  The @code{Symbol} is used when generating human-readable versions
of the file---see @ref{hot-report script}.

For UDP connections, the analyzer reports connections for which both
endpoints have been active as @code{SF}; those for which just the originator
was active as @code{S0}; those for which just the responder was active
as @code{SHR}; and those for which neither was active as @code{OTH} (the
latter should not happen).
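
The UDP rule just described can be summarized by the following Python sketch.
It is only an illustration of the mapping stated above, not the analyzer's
actual code; the function name @code{udp_conn_state} and its two boolean
parameters are assumptions made for the example.

@example
```python
def udp_conn_state(orig_active, resp_active):
    """Map endpoint activity of a UDP connection to a summary state name."""
    if orig_active and resp_active:
        return "SF"   # both endpoints sent traffic
    if orig_active:
        return "S0"   # only the originator sent traffic
    if resp_active:
        return "SHR"  # only the responder sent traffic
    return "OTH"      # neither endpoint active; not expected in practice
```
@end example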
@cindex connection, flags
@cindex flags of connection
@item @emph{flags}
reports a set of additional binary flags associated with the connection:
@table @samp
@item @code{L}
indicates that the connection was initiated @emph{locally},
i.e., the host corresponding to @emph{local IP} initiated the connection.  If @code{L}
is missing, then the host corresponding to @emph{remote IP} initiated the connection.
@item @code{U}
indicates the connection involved one of the networks
listed in the @code{neighbor_nets} variable.  The use
of ``@code{U}'' for this indication (rather than ``@code{N}'', say) is
historical, as for the most part is the whole notion of ``neighbor network.''
Note that a connection can have both @code{L} and @code{U} set (see next item).
@item @code{X}
is used to indicate that @emph{neither} the ``@code{L}''
nor the ``@code{U}'' flag is associated with this connection.
@end table
@cindex connection, additional information
@cindex information associated with a connection
@cindex additional information associated with a connection
@item @emph{tag}
A reference tag pointing to log lines in other log files (e.g., @code{http.log})
that contain additional information associated with the connection.

@end table

@*
Putting all of this together, here is an example of a @code{conn.log} connection
summary:
@example
931803523.006848 54.3776 http 7320 38891 206.132.179.35
	128.32.162.134 RSTO X %103
@end example

The connection began at timestamp 931803523.006848 (18:18:43 hours GMT
on July 12, 1999; see the @code{cf} utility for how to determine this)
and lasted 54.3776 seconds.  The service was HTTP (presumably; this conclusion
is based just on the responder's use of port @code{80/tcp}).
The originator sent 7,320 bytes, and the responder sent 38,891 bytes.
Because the ``@code{L}'' flag is absent, the connection was initiated by
host 128.32.162.134, and the responding host was 206.132.179.35.  When
the summary was written, the connection was in the ``@code{RSTO}'' state,
i.e., after establishing the connection and transferring data, the originator
had terminated it with a RST (this is unfortunately common for Web clients).
The connection had neither the @code{L} nor the @code{U} flag associated
with it, and there was additional
information, summarized by the string ``@code{%103}'' (see the
@code{http} analyzer for an explanation of this information).
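
If the @code{cf} utility is not at hand, the timestamp can also be decoded
with a few lines of Python.  The sketch below is not part of Bro.  It assumes
the field order of the example line above, and the function name
@code{parse_example_summary} and the dictionary keys are hypothetical.

@example
```python
from datetime import datetime, timezone

EXAMPLE = ("931803523.006848 54.3776 http 7320 38891 206.132.179.35 "
           "128.32.162.134 RSTO X %103")


def parse_example_summary(line):
    """Split a summary laid out like the example line (assumed field order)."""
    (start, duration, service, orig_bytes, resp_bytes,
     local_ip, remote_ip, state, flags, tag) = line.split()
    return dict(
        # Timestamps are seconds since the Unix epoch, interpreted here as UTC.
        start=datetime.fromtimestamp(float(start), tz=timezone.utc),
        duration=float(duration),
        service=service,
        orig_bytes=int(orig_bytes),
        resp_bytes=int(resp_bytes),
        local_ip=local_ip,
        remote_ip=remote_ip,
        state=state,
        flags=flags,
        tag=tag,
    )
```
@end example

Calling @code{parse_example_summary(EXAMPLE)} yields a start time of
18:18:43 UTC on July 12, 1999, matching the description above.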


@node Connection functions,
@subsection Connection functions
@cindex connection, functions
We finish our discussion of generic connection analysis with a brief
summary of the different Bro functions provided by the @code{conn} analyzer:

@table @samp
@cindex connection, size
@cindex connection, bytes
@cindex size of connection
@cindex bytes in connection

@item @code{conn_size (e: endpoint, is_tcp: bool): string}
returns a string giving either the number of bytes the endpoint sent
during the given connection, or @code{"?"} if from the connection state
this can't be determined.  The @code{is_tcp} parameter is needed
so that the function can inspect the endpoint's state to determine
whether the connection was closed.

@cindex connection, state
@cindex state of connection
@item @code{conn_state (c: connection, is_tcp: bool): string}
returns the name associated with the connection's state, as
given in the above table.

@cindex connection, service
@cindex service associated with a connection
@item @code{determine_service (c: connection): bool}
sets the @code{service} field of the given connection,
using @code{port_names}.
If you are using the @code{ftp} analyzer, then it knows about FTP
data connections and maps them to @code{port_names[20/tcp]}, i.e.,
@code{"ftp-data"}.

@cindex connection ID
@cindex ID of connection

@item @code{full_id_string (c: connection): string}
returns a string identifying the connection in one of the two
following forms.  If the connection is in state @code{S0}, @code{S1},
or @code{REJ}, then no data has been transferred,
and the format is:
@quotation
@emph{@math{A_o} <state> @math{A_r}/<service> <addl>}
@end quotation

where @emph{@math{A_o}} is the IP address of the originator (@code{$id$orig_h}),
@emph{state} is as
given in the @strong{Symbol} column of the above table,
@emph{@math{A_r}} is the
IP address of the responder (@code{$id$resp_h}), @emph{service} gives
the application service (@code{$service}) as set by @code{determine_service},
and @emph{addl} is the contents of the @code{$addl} field (which may be
an empty string).

@cindex port, ephemeral
@cindex ephemeral port
Note that the ephemeral port used
by the originator is not reported.  If you want to display it, use
@code{id_string}.

So, for example:
@example
    128.3.6.55 > 131.243.88.10/telnet "luser"
@end example

identifies a connection originated by @code{128.3.6.55} to @code{131.243.88.10}'s
Telnet server, for which the additional associated information is @code{"luser"},
the username successfully used during the authentication dialog as determined
by the @code{login} analyzer.  From the table above we see that
the connection must be in state @code{S1}, as that's the only state of
@code{S0}, @code{S1}, or @code{REJ} that has a @code{>} symbol.  (We can tell
it's @emph{not} in state @code{SF} because the format used for that state
differs---see below.)

For connections in other states, Bro has size and duration information
available, and the format returned by @code{full_id_string} is:
@quotation
@emph{@math{A_o} @math{S_o}b <state> @math{A_r}/<service> @math{S_r}b @math{D}s <addl>}
@end quotation

where @emph{@math{A_o}}, @emph{@math{A_r}}, @emph{state}, @emph{service}, and @emph{addl} are
as before, @emph{@math{S_o}} and @emph{@math{S_r}} give the number of bytes transmitted so far
by the originator to the responder and vice versa, and @emph{D} gives the
duration of the connection in seconds (reported with one decimal place)
so far.

An example of this second format is:
@example
    128.3.6.55 63b > 131.243.88.10/telnet 391b 39.1s "luser"
@end example

which reflects the same connection as before, but now @code{128.3.6.55} has
transmitted 63 bytes to @code{131.243.88.10}, which has transmitted 391 bytes
in response, and the connection has been active for 39.1 seconds.  The
``@code{>}'' indicates that the connection is in state @code{SF}.

@cindex connection, ID
@cindex ID of connection
@item @code{id_string (id: conn_id): string}
returns a string identifying the connection by its address/port quadruple.
Regardless of the connection's state, the format is:
@quotation
@emph{@math{A_o}@code{/}@math{P_o} @code{>} @math{A_r}@code{/}@math{P_r}}
@end quotation
where @emph{@math{A_o}} and @emph{@math{A_r}} are the originator and responder addresses,
respectively, and @emph{@math{P_o}} and @emph{@math{P_r}} are representations of the originator
and responder ports as returned by the @code{port-name} module,
i.e., either the port number
or a string like ``@code{http}'' for a well-known port such as @code{80/tcp}.

An example:
@example
    128.3.6.55/2244 > 131.243.88.10/telnet
@end example

Note: @code{id_string} is implemented using a pair of calls to @code{endpoint_id}.
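
The following Python sketch mimics that structure to show how the format
above is assembled.  It is an illustration only: the signatures shown here
for @code{endpoint_id} and @code{id_string}, the tuple layout of an endpoint,
and the @code{PORT_NAMES} table are assumptions made for the example and do
not reproduce the Bro script.

@example
```python
# Hypothetical stand-in for the port-name table: (port, protocol) -> name.
PORT_NAMES = dict([((23, "tcp"), "telnet"), ((80, "tcp"), "http")])


def endpoint_id(addr, port, proto, port_names):
    """Format one endpoint as address/port, using a service name when known."""
    name = port_names.get((port, proto), str(port))
    return "%s/%s" % (addr, name)


def id_string(orig, resp, port_names):
    """Join the originator and responder endpoints with the '>' separator."""
    orig_text = endpoint_id(orig[0], orig[1], orig[2], port_names)
    resp_text = endpoint_id(resp[0], resp[1], resp[2], port_names)
    return "%s > %s" % (orig_text, resp_text)
```
@end example

With the originator @code{("128.3.6.55", 2244, "tcp")} and the responder
@code{("131.243.88.10", 23, "tcp")}, the sketch reproduces the example
string shown above.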

@emph{Deficiency: It would be convenient to have a form of @code{id_string} that can incorporate a notion of directionality, for example @code{128.3.6.55/2244 < 131.243.88.10/telnet} to indicate the same connection as before, but referring specifically to the flow from responder to originator in that connection (indicated by using ``@code{<}'' instead of ``@code{>}'').}

@cindex connection, hot
@cindex connection, logging
@cindex logging, connection
@item @code{log_hot_conn (c: connection)}
logs a real-time SensitiveConnection alarm of the form:
@quotation
hot: @code{<}@emph{connection-id}@code{>}
@end quotation
where @emph{connection-id} is the format returned by @code{full_id_string}.
@code{log_hot_conn} keeps track of which connections it has logged and
will not log the same connection more than once.

@cindex log file, connection summary (red)
@cindex connection, recording
@cindex recording connections

@item @code{record_connection (c: connection, disposition: string)}
Generates a connection summary to the @file{conn} file
in the format described in @ref{Connection summaries}.
If the connection's @code{hot} field is positive, then also logs
the connection using @code{log_hot_conn}.  The @code{disposition} is a text
description of the connection's state, such as @code{"attempt"} or
@code{"half_finished"}; it is not presently used.

@cindex connection, service
@cindex service associated with a connection

@item  @code{service_name (c: connection): string}
returns a string describing the service associated with the connection,
computed as follows.  If the responder port (@code{$id$resp_p}), @emph{p}, is
well-known, that is, in the @code{port_names} table,
then @emph{p}'s entry in the table is returned (such as @code{"http"} for TCP
port 80).  Otherwise, for TCP connections, if the responder port
is less than 1024, then @code{priv-@emph{p}} is returned, otherwise
@code{other-@emph{p}}.  For UDP connections, the corresponding service
names are @code{upriv-@emph{p}} and @code{uother-@emph{p}}.
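
The naming rule can be sketched in Python as follows.  This is an
illustration of the logic described above rather than the Bro
implementation; the parameter layout and the @code{PORT_NAMES} table,
keyed by port number and protocol, are assumptions made for the example.

@example
```python
# Hypothetical stand-in for the port_names table: (port, protocol) -> name.
PORT_NAMES = dict([((80, "tcp"), "http"), ((53, "udp"), "domain")])


def service_name(resp_port, proto, port_names):
    """Return the service label for a responder port, as described above."""
    key = (resp_port, proto)
    if key in port_names:
        return port_names[key]          # well-known port
    privileged = resp_port < 1024
    if proto == "tcp":
        prefix = "priv" if privileged else "other"
    else:
        prefix = "upriv" if privileged else "uother"
    return "%s-%d" % (prefix, resp_port)
```
@end example

For instance, an unlisted TCP responder port 999 yields @code{priv-999},
and an unlisted UDP responder port 5000 yields @code{uother-5000}.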

@cindex connection, terminating with extreme prejudice
@cindex terminating connections forcibly

@item @code{terminate_connection (c: connection)}
Attempts to terminate the given connection using the @code{rst} utility
in the current directory.  It does not check to see whether the utility
is actually present, so an unaesthetic shell error will appear if the utility
is not available.

@code{rst} terminates connections by forging RST packets.  It is not
presently distributed with Bro, due to its potential for disruptive use.

@cindex analysis, on-line
@cindex on-line analysis
@cindex analysis, off-line
@cindex off-line analysis
If Bro is reading a trace file rather than live network traffic,
then @code{terminate_connection} logs the @code{rst} invocation
but does not actually invoke the utility.  In either case, it finishes
by logging that the connection is being terminated.

@end table

@cindex analyzers, generic

@node Site-specific information,
@section Site-specific information

@cindex analyzers, site-specific information
@cindex site-specific, information
The @code{site} analyzer is not actually an analyzer but
simply a set of global variables (and @emph{Updateme: one function}) used
to define a site's basic topological information.

@menu
* Site variables::
* Site-specific functions::
@end menu

@node Site variables,
@subsection Site variables

@cindex site-specific, variables

The @code{site} module defines the following variables, all redefinable:

@cindex addresses, local
@cindex local addresses
@cindex subnets
@cindex prefixes, network
@cindex network prefixes
@table @samp
@item @code{local_nets set[net]}
Defines which @code{net}'s Bro should consider as reflecting a local address.

Default: empty.

@cindex CIDR
@cindex addresses, local
@cindex local addresses
@cindex subnets
@cindex prefixes, network
@cindex network prefixes
@item @code{local_16_nets set[net]}
Defines which /16 prefixes Bro should consider as reflecting a local address.
@emph{Deficiency: Bro currently is inconsistent regarding when it consults @code{local_nets} versus @code{local_16_nets}, so you should ensure that this variable and the previous one are always consistent.}

Default: empty.

@cindex addresses, local
@cindex local addresses
@item @code{local_24_nets set[net]}
The same, but for /24 addresses.

Default: empty.

@cindex addresses, neighbor
@cindex neighbor addresses
@item @code{neighbor_nets set[net]}
Defines which @code{net}'s Bro should consider as reflecting a ``neighbor.''
Neighbor networks can be treated specially in some policies, distinct
from other non-local addresses.  In particular, connectivity to an address
belonging to a neighbor will not be dropped.

The notion is somewhat historical, as
is the use of ``@code{U}'' to mark neighbors in connection summaries
(see @ref{Connection summaries}).

Default: empty.

@cindex addresses, neighbor
@cindex neighbor addresses
@item @code{neighbor_16_nets set[addr]}
Defines which /16 addresses Bro should consider as reflecting a neighbor;
the only use of this variable in the standard scripts is that a scan
originating from an address with one of these prefixes will not be dropped.
@emph{Deficiency: The name is poorly chosen and should be changed to better reflect this use.}  @emph{Deficiency: In addition, this variable should be kept consistent with @code{neighbor_nets}, until the fine day when the processing is rectified to only use one variable.}

Default: empty.

@cindex addresses, neighbor
@cindex neighbor addresses
@item @code{neighbor_24_nets set[net]}
The same, but for /24 addresses.

Default: empty.

@end table

@cindex site-specific, variables

@node Site-specific functions,
@subsection Site-specific functions

@cindex functions, site-specific
@cindex site-specific, functions

Currently, the @code{site} module only defines one function:

@table @samp
@cindex addresses, local
@cindex local addresses
@cindex site addresses
@item @code{is_local_addr (a: addr): bool}
returns true if the given address belongs to one of the ``local'' networks,
false otherwise.  @emph{Updateme: Currently, the test is made by masking the address to /16 and /24 and comparing it to @code{local_16_nets} and @code{local_24_nets}.}
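
The masking test mentioned above can be pictured with the following Python
sketch.  It is not the Bro implementation: representing the two sets as
Python sets of dotted-quad strings, and the helper @code{mask_addr}, are
assumptions made for the example.

@example
```python
import ipaddress


def mask_addr(addr, prefix_len):
    """Zero the host bits of an IPv4 address, keeping the first prefix_len bits."""
    net = ipaddress.ip_network("%s/%d" % (addr, prefix_len), strict=False)
    return str(net.network_address)


def is_local_addr(addr, local_16_nets, local_24_nets):
    """Return True if the address falls in a listed /16 or /24 prefix."""
    return (mask_addr(addr, 16) in local_16_nets
            or mask_addr(addr, 24) in local_24_nets)
```
@end example

With @code{128.3.0.0} listed as a /16 prefix, the address @code{128.3.6.55}
is masked to @code{128.3.0.0} and therefore counts as local.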

@end table

@cindex site-specific, functions

@cindex analyzers, site-specific information

@node hot Analyzer,
@section The @code{hot} Analyzer

@cindex connection, analysis
@cindex connection, hot analysis
@cindex hot connection, analysis

The standard @code{hot} script defines policy relating to fairly
generic notions of allowed and prohibited connections.  It defines
a number of variables that you will need to refine to customize your
site's policies.  It also provides two functions for checking
connections against the policies, which can be used by other standard
scripts.

@menu
* hot variables::
* hot functions::
@end menu

@node hot variables,
@subsection @code{hot} variables

@cindex analyzers, hot, variables

The standard @code{hot} script defines the following variables, all redefinable:

@table @samp
@cindex spoofing, detection
@cindex local addresses, spoofing
@item @code{same_local_net_is_spoof : bool}
If true, then a connection with a local originator address and a local
responder address is considered by @code{check_spoof}
to have been spoofed.  @emph{Deficiency: The name is poorly chosen (and may be changed in the future) to something more accurate like @code{both_local_nets_is_spoof}.}

@cindex DMZ, spoof detection
@cindex internal networks, spoof detection
In general, you want to use true for a Bro that is monitoring Internet access
links (DMZs) and false for internal monitors.

Default: @code{F}.

@cindex spoofing, allowable services
@cindex local addresses, spoofing
@item @code{allow_spoof_services : set[port]}
Defines a set of services (responder ports) for which Bro should not
generate notices if it sees apparent spoofed traffic.

Default: @code{110/tcp} (POP version 3; RFC-1939).
This default was chosen because
in our experience one common form of benign spoof is an off-site laptop
attempting to read mail while still configured to use its on-site address.

@cindex access, allowable address pairs
@cindex allowable address pairs
@item @code{allow_pairs : set[addr, addr]}
Defines pairs of source and destination addresses for which the
source is allowed to connect to the destination.  The intent with
this variable is that the source or destination address will be a sensitive
host (such as defined with @code{hot_srcs} or
@code{hot_dsts}), for which this particular access should
be allowed.

Default: empty.

@cindex access, allowable /16 network pairs
@cindex allowable /16 network pairs
@item @code{allow_16_net_pairs : set[addr, addr]}
Defines pairs of source and destination /16 networks for which the
source is allowed to connect to the destination, similar to @code{allow_pairs}.
@emph{Note: The set is defined in terms of @code{addr}'s and not @code{net}'s.
So, for example, rather than specifying @code{128.32.}, which is a @code{net}
constant, you'd use @code{128.32.0.0} (an @code{addr} constant).  }

Default: empty.

@cindex access, sensitive source addresses
@cindex sensitive source addresses
@cindex hot source addresses
@cindex addresses, hot sources
@item @code{hot_srcs : table[addr] of string}
Defines source addresses that should be considered ``hot''.
A successfully established connection from such a source address
generates an alarm, unless one of the access exception variables such as
@code{allow_pairs} also matches the connection.  The value of the
table gives an explanatory message as to why the source is
hot; for example, @code{"known attacker site"}.
Note: This value
is not currently used, though it aids in documenting the policy script.

Default: empty.

Example: redefining @code{hot_srcs} using
@example
redef hot_srcs: table[addr] of string = @{
    [ph33r.the.eleet.com] = "script kideez",
@};
@end example

would result in Bro noticing any traffic coming from @code{ph33r.the.eleet.com}.
@cindex kiddies, script
@cindex script kiddies

@cindex access, sensitive destination addresses
@cindex sensitive destination addresses
@cindex hot destination addresses
@cindex addresses, hot destinations
@item @code{hot_dsts : table[addr] of string}
Same as @code{hot_srcs}, except for destination addresses.

Default: empty.

@cindex access, sensitive /24 source networks
@cindex sensitive /24 source networks
@cindex hot /24 source networks
@cindex networks, hot sources
@item @code{hot_src_24nets : table[addr] of string}
Defines /24 source networks that should be considered ``hot,''
similar to @code{hot_srcs}.  @emph{Deficiency: Other network masks, particularly /16, should be provided.}

Default: empty.

@cindex CIA detection
@cindex Central Intelligence Agency, detection
Example: redefining @code{hot_src_24nets} using
@example
redef hot_src_24nets: table[addr] of string = @{
    [198.81.129.0] = "CIA incoming!",
@};
@end example

would result in Bro noticing any traffic coming from the @code{198.81.129/24}
network.

@cindex access, sensitive /24 destination networks
@cindex sensitive /24 destination networks
@cindex hot /24 destination networks
@cindex networks, hot destinations
@item @code{hot_dst_24nets : table[addr] of string}
Same as @code{hot_src_24nets}, except for destination networks.

Default: empty.

@cindex access, allowable services
@cindex services, allowable
@item @code{allow_services : set[port]}
Defines a set of services that are always allowed, regardless of
whether the source or destination address is ``hot.''

Default: @code{ssh}, @code{http}, @code{gopher}, @code{ident}, @code{smtp},
@code{20/tcp} (FTP data).

Note: The defaults are a bit unusual.  They are intended for a quite open
site with many services.

@cindex access, service allowed to a particular host
@cindex services, allowed to a particular host
@item @code{allow_services_to : set[addr, port]}
Defines a set of services that are always allowed if the server is the
given host, regardless of whether the source or destination address is
``hot.''

Default: empty.

Example: redefining @code{allow_services_to} using
@example
redef allow_services_to: set[addr, port] += @{
    [ns.mydomain.com, [domain, 123/tcp]],
@} &redef;
@end example

would result in Bro not noticing any TCP DNS or NTP traffic heading
to @code{ns.mydomain.com}.  You might add this if @code{ns.mydomain.com}
is also in @code{hot_dsts}, because in general you want to consider
any access (other than DNS or NTP) as sensitive.

@cindex access, service allowed to particular host pairs
@cindex services, allowed to particular host pairs
@item @code{allow_services_pairs : set[addr, addr, port]}
Defines a set of services that are always allowed if the connection
originator is the first address and the responder (server) the second
address.

Default: empty.

Example: redefining @code{allow_services_pairs} using
@example
redef allow_services_pairs: set[addr, addr, port] += @{
    [ns2.mydomain.com, ns.mydomain.com, [domain, 123/tcp]],
@} &redef;
@end example

would result in Bro not noticing any TCP DNS or NTP traffic initiated
from @code{ns2.mydomain.com} to @code{ns.mydomain.com}.

@cindex access, forbidden services
@cindex services, forbidden
@item @code{flag_successful_service : table[port] of string}
The opposite of @code{allow_services}.
Defines a set of services that should always be flagged as sensitive,
even if neither the source nor the destination address is ``hot.''
The @code{string} value in the table gives the reason why
the service is considered hot.
Note: Bro currently does not use these explanatory messages.

Default: @code{31337/tcp} (a popular backdoor because in stylized lettering
it spells @code{ELEET}) and @code{2766/tcp} (the Solaris @code{listen} service,
in our experience rarely used legitimately in wide-area traffic).

@cindex ephemeral ports, confused with sensitive services
@cindex sensitive services, confused with ephemeral ports
@cindex FTP, ephemeral ports confused with sensitive services

@emph{Note: Bro can flag these services erroneously when a server happens to
run a different service on the same port.  For example, if you're not
running the FTP analyzer, then Bro won't know that FTP data connections
using ephemeral ports in fact belong to legitimate FTP traffic, and will
flag any that coincide with these services.  A related problem arises
when a user has configured their SSH access to tunnel FTP control channels
through the SSH connection, but not the corresponding data connections (so
they don't pay the expense of encrypting the data transfers), so again
Bro can't recognize that the ephemeral ports used for the data connections
do not reflect the presumed sensitive service.}

Example: redefining @code{flag_successful_service} using
@example
redef flag_successful_service: table[port] of string += @{
        [1524/tcp] = "popular backdoor",
@};
@end example

would result in Bro also noticing any successful connection to
a server running on TCP port 1524.

@cindex access, forbidden inbound services
@cindex services, forbidden if inbound
@cindex inbound services, forbidden

@item @code{flag_successful_inbound_service : table[port] of string}
The same as @code{flag_successful_service}, except only applies to
connections with a remote initiator and a local responder (determined
by finding the responder address in @code{local_nets}).

@cindex /etc/inetd.conf
@cindex inetd.conf
Default: @code{1524/tcp} (@code{ingreslock}, a popular backdoor because an
attacker can place an entry for the backdoor in @emph{/etc/inetd.conf} using
a service name rather than a raw port number, and hence more likely to
appear legitimate to casual inspection).  Note: There's no compelling
reason why @code{ingreslock} is in this table rather than the more
general @code{flag_successful_service}, though it does tend to result
in a few more false hits than the others, presumably because it's a lower
port number, and hence more likely on some systems to be chosen for
an ephemeral port.

Note: Symmetry would call for @code{flag_successful_outbound_service}.
This hasn't been implemented in Bro yet simply because the
Bro development site has a threat model structured primarily around
external threats.

@cindex access, fatal inbound services
@cindex services, fatal if inbound
@cindex inbound services, fatal
@item @code{terminate_successful_inbound_service : table[port] of string}
The same as @code{flag_successful_inbound_service}, except invokes
@code{terminate_connection} in an attempt to terminate the connection.

Default: empty.

Note: As for @code{flag_successful_inbound_service}, it would be symmetric
to have @code{terminate_successful_outbound_service}, and also to have
a more general @code{terminate_successful_service}.

@cindex access, forbidden attempted services
@cindex services, forbidden if attempted
@cindex attempted services, forbidden
@item @code{flag_rejected_service : table[port] of string}
Similar to @code{flag_successful_service}, except applies to connections
that a server rejects.  For example, you could detect a particular, failed
Linux @emph{mountd} attack by adding @code{10752/tcp} to this table, since
that happens to be the port used by the commonly
available version of the exploit
for its backdoor if the attack succeeds.  Note: You would of course
likely also want to put @code{10752/tcp} in @code{flag_successful_service};
or put the entire @code{flag_rejected_service} table
into @code{flag_successful_service}, as discussed in @ref{Inserting tables into tables}.

Default: none.

@emph{Deficiency: It might make sense to have @code{flag_attempted_service}, which doesn't require that a server actively reject the connection, but Bro doesn't currently have this.}

@end table

@cindex analyzers, hot, variables

@node hot functions,
@subsection @code{hot} functions

@cindex analyzers, hot, functions

The @code{hot} module defines two functions for external use:

@table @samp
@cindex spoofing, detection
@cindex local addresses, spoofing
@item @code{check_spoof (c: connection): bool}
checks the originator and responder addresses of the given connection to
determine if they are both local (and the connection is not explicitly
allowed in @code{allow_spoof_services}).  If so, and if @code{same_local_net_is_spoof} is true, then marks the connection as ``hot''.

@cindex Land attack
@cindex attack, Land
The function also checks for a specific denial of service attack, the
``Land'' attack, in which the addresses are the same and so are the ports.
If so, then it generates an event with a name of
@code{"Land_attack"}.  It makes this check even if @code{same_local_net_is_spoof} is false.

Returns: true if the connection is now hot (or was upon entry), false
otherwise.
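
The two tests described for @code{check_spoof} can be sketched in Python as
follows.  This is a simplified illustration under stated assumptions, not
the Bro function: the helper names @code{is_land_attack} and
@code{looks_spoofed}, and passing the locality of each address as a
boolean, are choices made only for the example.

@example
```python
def is_land_attack(orig_h, orig_p, resp_h, resp_p):
    """Land attack: identical addresses and identical ports at both ends."""
    return orig_h == resp_h and orig_p == resp_p


def looks_spoofed(orig_is_local, resp_is_local, resp_port,
                  allow_spoof_services, same_local_net_is_spoof):
    """Both endpoints local, service not exempted, and the policy enabled."""
    if not same_local_net_is_spoof:
        return False
    if resp_port in allow_spoof_services:
        return False
    return orig_is_local and resp_is_local
```
@end example

Keeping the Land test separate from @code{looks_spoofed} mirrors the
description above: the Land test does not depend on the spoofing policy
switch.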

@cindex hot detection
@cindex detecting sensitive connections
@cindex connection, detecting sensitive
@item @code{check_hot (c: connection, state: count): bool}
checks the given connection against the various policy variables discussed
above, and bumps the connection's @code{hot} field if it matches
the policies for being sensitive, and does not match the various exceptions.
It also uses @code{check_spoof} to see if the connection reflects a possible
spoofing attack; and terminates the connection if
@code{terminate_successful_inbound_service} indicates so.

The caller indicates the connection's state in the second parameter to the
function, using one of the values given in the Table below.
As noted in the Table, the processing differs depending on the state.


@float Table, Hot connection states
@multitable  @columnfractions .2 .4 .3
@item @strong{State} @tab @strong{Meaning} @tab @strong{Tests}
@item @code{CONN_ATTEMPTED}
@tab Connection attempted, no reply seen.  Note that you should also use this value
for scans with undetermined state, such as possible stealth scans. For example,
@code{connection_half_finished} does this.
@tab @code{check_spoof}
@item @code{CONN_ESTABLISHED}
@tab Connection established. Also used for connections apparently established, per @code{partial_connection}.
@tab @code{check_spoof, flag_successful_service,
flag_successful_inbound_service, allow_services_to,
terminate_successful_inbound_service}
@item @code{APPL_ESTABLISHED}
@tab The connection has reached application-layer establishment. For
example, for Telnet or Rlogin, this is after the user has authenticated.
@tab @code{allow_services_to, allow_service_pairs,
allow_pairs, allow_16_net_pairs, hot_srcs,
hot_dsts, hot_src_24nets, hot_dst_24nets}
@item @code{CONN_FINISHED}
@tab The connection has finished, either cleanly or abnormally (for example, @code{connection_reset}).
@tab Same as @code{APPL_ESTABLISHED}, if the connection exchanged non-zero
amounts of data in both directions, and if the service wasn't one of the ones that
generate @code{APPL_ESTABLISHED}.
@item @code{CONN_REJECTED}
@tab The connection attempt was rejected by the server.
@tab @code{check_spoof, flag_rejected_service}
@end multitable
@caption{Different connection states to use when calling @code{check_hot}}
@end float
@*


In general, the pattern is to make one call when the connection is first
seen, either @code{CONN_ATTEMPTED}, @code{CONN_ESTABLISHED},
or @code{CONN_REJECTED}.  If the application is one for which connections
should only be considered ``established'' after a successful pre-exchange
between originator and responder, then a subsequent call is made
with a state of @code{APPL_ESTABLISHED}.   The idea here is to provide a
way to filter out what are in fact not really successful connections
so that they are not analyzed in terms of successful service.
Finally, for services that don't use @code{APPL_ESTABLISHED}, a
call is made instead when the connection finishes for some reason,
using state @code{CONN_FINISHED}.  Note: This approach delays
noticing until the connection is over, which might be later than
you want, in which case you may need to edit @code{check_hot} to
provide the desired functionality.

Returns: true if the connection is now hot (or was upon entry), false
otherwise.

@end table
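
As a rough sketch of the calling pattern described above for
@code{check_hot} (this is not the standard script itself), handlers for
the generic connection events might invoke the function as follows.  The
pairing of events with states is an assumption made for illustration, and
the return value is simply ignored here.

@example
event connection_attempt(c: connection)
    @{
    # First sighting, no reply seen.
    check_hot(c, CONN_ATTEMPTED);
    @}

event connection_established(c: connection)
    @{
    # First sighting of a connection that completed its handshake.
    check_hot(c, CONN_ESTABLISHED);
    @}

event connection_rejected(c: connection)
    @{
    # First sighting of an attempt the server refused.
    check_hot(c, CONN_REJECTED);
    @}

event connection_finished(c: connection)
    @{
    # Final call for services that never signal APPL_ESTABLISHED.
    check_hot(c, CONN_FINISHED);
    @}
@end example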

@cindex analyzers, hot, functions

@node scan Analyzer,
@section The @code{scan} Analyzer

@cindex scan detection
@cindex detecting scans
@cindex scanning, address
@cindex scanning, port
@cindex address scanning
@cindex port scanning
@cindex passwords, guessing
The @code{scan} analyzer detects connection attempts to numerous machines
(address scanning), connection attempts to many different services
on the same machine (port scanning), and attempts to access many different
accounts (password guessing).  The basic methodology is to use tables to
keep track of the distinct addresses and ports to which a given host
attempts to connect, and to trigger notices when either of these reaches
a specified size.  @emph{Deficiency: As currently written, the analyzer will not detect distributed scans, i.e., scans in which many sites each probe just a few ports or addresses, but together probe a large number.}

A powerful technique that Bro potentially provides is dropping
border connectivity with remote scanning sites, though you must
supply the magic script to talk with your router and effect the
block.  See @code{drop_address} below for a discussion of the
interface provided.  Note: Naturally, providing this capability means
you might become vulnerable to denial-of-service attacks in which spoofed
packets are used in an attempt to trigger a block of a site to which
you want to have access.

@menu
* scan variables::
* scan functions::
* scan event handlers::
@end menu

@node scan variables,
@subsection @code{scan} variables

@cindex analyzers, scan, variables

In addition to internal variables for its bookkeeping, the analyzer
provides the following redefinable variables:

@table @samp
@item @code{report_peer_scan : set[count]}
Generate an alarm whenever a remote host (as determined by
@code{is_local_address}) has attempted to connect to the given
number of distinct hosts.

Default: @code{@{ 100, 1000, 10000, @}}.  So, for example, if
a remote host attempts to connect to 3,500 different local hosts,
a report will be generated when it makes the 100th attempt, and
another when it makes the 1,000th attempt.

@item @code{report_outbound_peer_scan : set[count]}
The same as @code{report_peer_scan}, except for connections
initiated locally.

Default: @code{@{ 1000, 10000, @}}.

@item @code{possible_port_scan_thresh : count}
Initially, port scan detection is done based on how many different
ports a given host connects to, regardless of on which hosts.  Once
this threshold is reached, however, then the analyzer begins tracking
ports accessed per-server, which is important for reducing false
positives.  Note: This variable exists because it
is very expensive to track per-server ports accessed for every
active host; this variable limits such tracking to only active hosts
contacting a significant number of different ports.

Default: @code{25}.

@item @code{report_accounts_tried : set[count]}
Whenever a remote host has attempted to access a number of local
accounts present in this set, generate an alarm.  Each distinct
username/password pair is considered a different access.

Default: @code{@{ 25, 100, 500, @}}.

@item @code{report_remote_accounts_tried : set[count]}
The same, except for access to remote accounts rather than local ones.

Default: @code{@{ 100, 500, @}}.

@item @code{skip_accounts_tried : set[addr]}
Do not do bookkeeping for account attempts for the given hosts.

Default: empty.

@item @code{skip_outbound_services : set[port]}
Do not do outbound-scanning bookkeeping for connections involving
the given services.

Default: @code{allow_services}, @code{ftp}, @code{addl_web} (see next item).

@item @code{addl_web : set[port]}
Additional ports that should be considered as Web traffic (and hence skipped
for outbound-scan bookkeeping).

Default: @code{@{ 81/tcp, 443/tcp, 8000/tcp, 8001/tcp, 8080/tcp, @}}.

@item @code{skip_scan_sources : set[addr]}
Hosts that are allowed to address-scan without complaint.

Default: @code{scooter.pa-x.dec.com}, @code{scooter2.av.pa-x.dec.com}
(AltaVista crawlers; you get the idea.)

@item @code{skip_scan_nets_24 : set[addr, port]}
/24 networks that are allowed to address scan for the given port
without complaint.

Default: empty.

@cindex connectivity, dropping
@cindex dropping connectivity
@cindex firewall, reactive
@cindex reactive firewall
@item @code{can_drop_connectivity : bool}
True if Bro has the capability of dropping connectivity,
per @code{drop_address}.

Default: false.

@cindex scanning, shutting down
@cindex shutting down scans
@item @code{shut_down_scans : set[port]}
Scans of these ports trigger connectivity-dropping (if Bro
is capable of dropping connectivity), unless @code{shut_down_all_scans}
is true (next item), in which case this set is ignored.

Default: empty.

@item @code{shut_down_all_scans : bool}
Ignore @code{shut_down_scans} and simply drop all scans regardless of
service.

Default: false.

@item @code{shut_down_thresh : count}
Shut down connectivity after a host has scanned this many addresses.

Default: @code{100}.

@item @code{never_shut_down : set[addr]}
Purported scans from these addresses are never shut down.

Default: the root name servers (@code{a.root-servers.net} through
@code{m.root-servers.net}).

@end table
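
The following is a minimal sketch of how a site might tune some of the
variables above in its own policy script.  The thresholds, the address
@code{192.0.2.10}, and the choice of Telnet are illustrative assumptions
only, not recommended values.

@example
# Add an earlier reporting threshold for inbound address scans.
redef report_peer_scan += @{ 25, @};

# Start per-server port tracking only after 50 distinct ports.
redef possible_port_scan_thresh = 50;

# Hypothetical host that is allowed to address-scan without complaint.
redef skip_scan_sources += @{ 192.0.2.10, @};

# Drop connectivity for scanners of the Telnet port once they have
# scanned 200 addresses.  This still requires that you supply the
# drop-connectivity script.
redef can_drop_connectivity = T;
redef shut_down_scans += @{ 23/tcp, @};
redef shut_down_thresh = 200;
@end example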

@cindex analyzers, scan, variables

@node scan functions,
@subsection @code{scan} functions

@cindex analyzers, scan, functions
The standard @code{scan} script provides the following functions:

@table @samp
@cindex connectivity, dropping
@cindex dropping connectivity
@cindex firewall, reactive
@cindex reactive firewall
@cindex scanning, shutting down
@cindex shutting down scans
@item @code{drop_address (a: addr, msg: string)}
Drops external connectivity to the given address and generates a notification
using the given message.

@cindex drop-connectivity shell script
@cindex shell scripts, drop-connectivity
Dropping connectivity requires all of the following to be true:
@itemize @bullet
@item  @code{can_drop_connectivity} is true.

@item
The address is neither local
nor a neighbor (See @ref{Site variables}).

@item
The address is not in @code{never_shut_down}.
@end itemize

If these checks succeed, then the script simply attempts to invoke
a shell script @emph{drop-connectivity} with a single argument, the IP
address to block.  It is up to you to provide the script, using whatever
interface to your router/firewall you have available.

The function does not return a value.

@cindex scanning, stealth
@cindex stealth scans
@item @code{check_scan (c: connection, established: bool, reverse: bool): bool}
Updates the analyzer's internal bookkeeping on the basis of the
new connection @code{c}.  If @code{established} is true, then the connection
was successfully established, otherwise not.  If @code{reverse} is true,
then the function should consider the originator/responder fields in
the connection's record as reversed.  Note: This last is needed
for some unusual new connections that may reflect stealth scanning.
For example, when the event engine sees a SYN-ack without a corresponding
SYN, it instantiates a new connection with an assumption that the SYN-ack
came from the responder (and it missed the initial SYN either due to
split routing (See @ref{Split routing}), a packet drop (See @ref{Packet drops}),
or Bro having started running after the initial SYN was sent).

If the originating host's activity matches the policy defined by the
variables above, then the analyzer logs this fact, and possibly
attempts to drop connectivity to the originating host.  The function
also schedules an event for 24 hours in the future (or when Bro terminates)
to generate a summary of the scanning activity (so if the host continues
scanning, you get a report on how many hosts it wound up scanning).
@emph{Deficiency: This time interval should be selectable.}

Note: Purported scans of the FTP data port (@code{20/tcp}) or the @code{ident}
service (@code{113/tcp}) are never reported or dropped, as experience
has shown they yield too many false hits.

The function does not return a value.

@end table
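
A hedged sketch of how connection handlers might feed @code{check_scan},
assuming the generic connection events are the call sites (the standard
scripts may arrange this differently).  Both calls pass @code{F} for
@code{reverse}, since neither case involves a connection whose endpoints
are suspected of being swapped.

@example
event connection_attempt(c: connection)
    @{
    # Unanswered attempt: not established, endpoints as recorded.
    check_scan(c, F, F);
    @}

event connection_established(c: connection)
    @{
    # Successful connection: established, endpoints as recorded.
    check_scan(c, T, F);
    @}
@end example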

@cindex analyzers, scan, functions

@node scan event handlers,
@subsection @code{scan} event handlers

@cindex analyzers, scan, event handlers

The standard @code{scan} script defines one event handler:

@table @samp
@item @code{account_tried (c: connection, user: string, passwd: string)}
The given connection made an attempt to access the given username and
password.  Each distinct username/password pair is considered a new access.
The event handler generates an alarm if the access matches the logging
policy outlined above.

Note: @code{account_tried} events are generated by the @code{login}
and @code{ftp} analyzers.

@end table

@cindex analyzers, scan, event handlers

@cindex scan detection

@node port-name Analysis Script,
@section The @code{port-name} Analysis Script

The @code{port-name} utility module provides one redefinable variable
and one callable function:

@table @samp
@item @code{port_names : table[port] of string}
Maps TCP/UDP ports to names for the services associated with those ports.
For example, @code{80/tcp} maps to @code{"http"}.  These names are used by
the @code{conn} analyzer when generating connection logs
(See @ref{Generic Connection Analysis}).

@item @code{endpoint_id (h: addr, p: port): string }
Returns a printable form of the given address/port connection
endpoint.  The format is either
@code{<}@emph{address}@code{>/<}@emph{service-name}@code{>}
or
@code{<}@emph{address}@code{>/<}@emph{port-number}@code{>}
depending on whether the port appears in @code{port_names}.
@end table
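
For illustration, the sketch below adds one entry to @code{port_names}
and then prints an endpoint.  The service name @code{"http-alt"} and the
address @code{192.0.2.1} are hypothetical values chosen for the example.

@example
# Hypothetical site-specific name for a nonstandard Web port.
redef port_names += @{ [8080/tcp] = "http-alt", @};

event bro_init()
    @{
    # Because 8080/tcp now appears in port_names, endpoint_id
    # should use the service-name form rather than the port number.
    print endpoint_id(192.0.2.1, 8080/tcp);
    @}
@end example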

@node brolite Analysis Script,
@section The @code{brolite} Analysis Script

The @code{brolite} module is intended to provide a convenient way
to run (almost) all of the analyzers.  It @code{@@load}s the following
other modules and analyzers:
@code{alarm, dns, hot, port-name, frag, tcp, scan, weird, finger, ident, ftp,
login} and @code{portmapper}.
So you can run Bro using @emph{bro -i in0 brolite} to have it analyze
traffic on interface @emph{in0} using the above analyzers;
or you can @code{@@load brolite} to load in the above
analyzers.

Note: The @code{brolite} module doesn't load @code{http} (because
it can impose a very high load at many sites)
nor experimental analyzers such as @code{stepping}
or @code{backdoor}.
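
A site that wants the @code{brolite} set plus HTTP analysis could
use a small policy script along the following lines.  This is only a
sketch; whether the extra load of @code{http} is acceptable depends on
your traffic.

@example
# Load the standard collection of analyzers ...
@@load brolite

# ... and additionally the HTTP analyzer, which brolite omits.
@@load http
@end example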

@node alarm Analysis Script,
@section The @code{alarm} Analysis Script

The @code{alarm} utility module redefines a single variable:

@table @samp
@cindex alarm file
@item @code{bro_alarm_file : file}
A special Bro variable used internally to specify a file where Bro should
record messages logged by @code{alarm} statements (as well
as generating real-time notifications via @emph{syslog}).

Default: if the @code{$BRO_LOG_SUFFIX} environment variable is defined,
then @code{alarm.@code{<}@emph{$BRO_LOG_SUFFIX}@code{>}}, otherwise @code{alarm.log}.

See @code{bro_alarm_file} for further discussion.

@end table

If you do not include this module, then Bro records alarm messages
to @emph{stderr}.
@cindex stderr

Here is a sample definition of @code{alarm_hook}:
@example
global msg_count: table[string] of count &default = 0;

event alarm_summary(msg: string)
    @{
    alarm fmt("(%s) %d times", msg, msg_count[msg]);
    @}

function alarm_hook(msg: string): bool
    @{
    if ( ++msg_count[msg] == 1 )
        # First time we've seen this message - log it.
        return T;

    if ( msg_count[msg] == 5 )
        # We've seen it five times, enough to be worth
        # summarizing.  Do so five minutes from now,
        # for whatever total we've seen by then.
        schedule +5 min @{ alarm_summary(msg) @};

    return F;
    @}

@end example

You can also control Bro's alarm processing by defining the
special function @code{alarm_hook}.  It takes a single
argument, @code{msg: string}, the message in a just-executed
@code{alarm} statement, and returns a boolean value: true if Bro
should indeed log the message, false if not.  The above example
shows a definition of @code{alarm_hook} that
checks each alarm message to see whether the same text has
been logged before.  It only logs the first instance of a message.
When a message appears for the fifth time, it schedules an
@code{alarm_summary} event for 5 minutes in the future;
the purpose of this event is to summarize the total number of
times the message has appeared at that point in time.

@node active Analysis Script,
@section The @code{active} Analysis Script

The @emph{active} utility module provides a single, non-redefinable
variable that holds information about active connections:

@table @samp
@item @code{active_conn : table[conn_id] of connection}
Indexed by a @code{conn_id} giving the
originator/responder addresses/ports, returns the connection's
@code{connection} record.  As usual, accessing
the table with a non-existing index results in a run-time error,
so you should first test for the presence of the index using
the @code{in} operator.

Default: empty.
@end table

This functionality is quite similar to that of the @code{active_connection}
function, and @emph{Deficiency: arguably this module should be removed in favor of the function}.  It does, however, provide a useful example of maintaining
bookkeeping by defining additional handlers for events that already have
handlers elsewhere.
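
The presence test recommended for @code{active_conn} can be sketched as
follows.  The helper name @code{report_if_active} is hypothetical and
exists only for this example.

@example
function report_if_active(id: conn_id)
    @{
    # Test membership first; indexing a missing entry is a
    # run-time error.
    if ( id in active_conn )
        @{
        local c = active_conn[id];
        print fmt("active connection from %s", c$id$orig_h);
        @}
    else
        print "connection is not active";
    @}
@end example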

@node demux Analysis Script,
@section The @code{demux} Analysis Script

The @emph{demux} utility module provides a single function:

@table @samp
@item @code{demux_conn (id: conn_id, tag: string, otag: string, rtag: string): bool }
Instructs Bro to write (``demultiplex'') the contents of the connection
with the given @code{id} to a pair of files whose names are constructed
out of @code{tag}, @code{otag}, and @code{rtag}, as follows.

The originator-to-responder direction of the connection goes into a file named:
@quotation
@code{<}@emph{otag}@code{>.<}@emph{tag}@code{>.<}@emph{orig-addr}@code{>.<}@emph{orig-port}@code{>-<}@emph{resp-addr}@code{>.<}@emph{resp-port}@code{>}
@end quotation
and the other direction in:
@quotation
@code{<}@emph{rtag}@code{>.<}@emph{tag}@code{>.<}@emph{resp-addr}@code{>.<}@emph{resp-port}@code{>-<}@emph{orig-addr}@code{>.<}@emph{orig-port}@code{>}
@end quotation
Accordingly, @emph{tag} can be used to associate a unique label with
the pair of files, while @emph{otag} and @emph{rtag} provide distinct
labels for the two directions.

If Bro is already demuxing the connection, or if the connection is
not active, then nothing happens, and the function returns false.
Otherwise, it returns true.

@end table

Bro places demuxed streams in a directory defined by the redefinable
global @code{demux_dir}, which defaults in the usual fashion to
@code{open_log_file("xscript")}.
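
As an illustrative sketch, a handler could request demultiplexing of
Telnet connections as shown below.  The tag strings are arbitrary
choices for this example.

@example
event connection_established(c: connection)
    @{
    # Only demultiplex connections to the Telnet port.
    if ( c$id$resp_p == 23/tcp )
        demux_conn(c$id, "telnet", "orig", "resp");
    @}
@end example

With these tags, the originator-to-responder file name begins with
@code{orig.telnet.} and the other direction with @code{resp.telnet.},
each followed by the address and port fields described above.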

@emph{Deficiency: Experience has shown that it would be highly convenient if Bro would demultiplex the @emph{entire} connection contents into the files, instead of just the part of the connection seen subsequently after the call to @code{demux_conn}.  One way to do this would be for @code{demux_conn} to offset the contents in the file by the current stream position, and then to invoke a utility tool that goes through the Bro output trace file and copies the contents up to the current stream position to the front of the file.  This utility tool might even be another instance of Bro running with suitable arguments.}

@node dns Analysis Script,
@section The @code{dns} Analysis Script

The @code{dns} module deals with Bro's internal mapping of hostnames
to/from IP addresses.
@emph{Deficiency: There is no DNS protocol analyzer available at present.}
Furthermore, @emph{Deficiency: the lookup mechanisms discussed here are not available to the Bro script writer, other than implicitly by using hostnames in lieu of addresses in variable initializations (see @ref{Hostnames vs addresses}).}

@cindex .bro-dns-cache
@cindex DNS, Bro's private cache

The module's function is to handle different events that can
occur when Bro resolves hostnames upon startup.  Bro maintains its
own cache of DNS information which persists across invocations of Bro
on the same machine and by the same user.
The role of the cache is to allow Bro to resolve
hostnames even in the face of DNS outages; the philosophy is that it's
better to use old addresses than none at all, and this helps harden Bro
against attacks in which the attacker causes DNS outages in order to
prevent Bro from resolving particular sensitive hostnames (e.g., those
listed in @code{hot_srcs}).  The cache is stored in the file ``@code{.bro-dns-cache}''
in the user's home directory.  You can delete this file whenever you want,
for example to purge out old entries no longer needed, and Bro will recreate
it next time it's invoked using @code{-P}.

Currently, all of the event handlers are invoked upon @emph{comparing}
the results of a new attempt to look up a name or an address versus the
results obtained the @emph{last time} Bro did the lookup.  When Bro looks
up a name for the first time, no events are generated.

Also, Bro currently only looks up hostnames to map them to addresses.
It does not perform inverse lookups.

@menu
* dns_mapping record::
* dns variables::
* dns event handlers::
@end menu

@node dns_mapping record,
@subsection The @code{dns_mapping} record

@cindex DNS, mappings

All of the events handled by the module include at least one
record of DNS mapping information, defined by the @code{dns_mapping}
type shown in the example below.
The corresponding fields are:

@table @samp
@item @code{creation_time}
When the mapping was created.

@item @code{req_host}
The hostname looked up, or an empty string if this was not a hostname
lookup.

@item @code{req_addr}
The address looked up (reverse lookup), or @code{0.0.0.0} if this was not
an address lookup.

@item @code{valid}
True if an answer was received for a lookup (even if the answer was that
the request name or address does not exist in the DNS).

@item @code{hostname}
The hostname answer in response to an address lookup, or the
string @code{"<none>"} if an answer was received but
it indicated there was no PTR record for the given address.

@item @code{addrs}
A set of addresses in response to a hostname lookup.  Empty
if an answer was received but it indicated that there was no A record
for the given hostname.

@end table

@example
type dns_mapping: record @{
    creation_time: time;  # When the mapping was created.

    req_host: string;     # The hostname in the request, if any.
    req_addr: addr;       # The address in the request, if any.

    valid: bool;          # Whether we received an answer.
    hostname: string;     # The hostname in the answer, or "<none>".
    addrs: set[addr];     # The addresses in the answer, if any.
@};
@end example

@node dns variables,
@subsection @code{dns} variables

@cindex modules, dns, variables

The module provides one redefinable variable:

@table @samp
@item @code{dns_interesting_changes : set[string]}
The different DNS events have names associated with them.  If the
name is present in this set, then the event will generate a notice, otherwise
not.

One exception to this list is that DNS changes involving the
loopback address @code{127.0.0.1} are always considered notice-worthy,
since they may reflect DNS corruption.

Default: @code{@{ "unverified", "old name", "new name", "mapping", @}}.

@end table
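
For example, a site that only wants notices for a subset of the default
event names might narrow the set as sketched below.  Only names taken
from the default listed above are used; the particular selection is an
assumption for illustration.

@example
# Generate notices only for unverified lookups and changed
# hostname-to-address mappings.
redef dns_interesting_changes = @{ "unverified", "mapping", @};
@end example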

@cindex modules, dns, variables

@node dns event handlers,
@subsection @code{dns} event handlers

@cindex modules, dns, event handlers

The DNS module supplies the following event handlers:

@table @samp
@item @code{dns_mapping_valid (dm: dns_mapping)}
The given request was looked up and it was identical to its previous
mapping.

@item @code{dns_mapping_unverified (dm: dns_mapping)}
The given request was looked up but no answer came back.

@item @code{dns_mapping_new_name (dm: dns_mapping)}
In the past, the given address did not resolve to a hostname;
this time, it did.

@item @code{dns_mapping_lost_name (dm: dns_mapping)}
In the past, the given address resolved to a hostname; now,
that name has gone away.  (An answer was received, but it stated
that there is no hostname corresponding to the given address.)

@item @code{dns_mapping_name_changed (old_dm: dns_mapping, new_dm: dns_mapping)}
The name returned this time for the given address differs from the
name returned in the past.

@item @code{dns_mapping_altered (dm: dns_mapping, old_addrs: set[addr], new_addrs: set[addr])}
The addresses associated with the given hostname have changed.  Those
in @code{old_addrs} used to be part of the set returned for the name, but
aren't any more; while those in @code{new_addrs} weren't before, but
now are.  There may also be some unchanged addresses, which are those in
@code{dm$addrs} but not in @code{new_addrs}.

@end table
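
Since Bro allows several handlers for the same event, you can add your
own alongside those of the module.  The following sketch simply prints
each address change; it assumes nothing beyond the event signature and
the @code{dns_mapping} fields given above.

@example
event dns_mapping_altered(dm: dns_mapping, old_addrs: set[addr],
                          new_addrs: set[addr])
    @{
    # Addresses that no longer appear in the answer.
    for ( a in old_addrs )
        print fmt("%s no longer maps to %s", dm$req_host, a);

    # Addresses that appear for the first time.
    for ( b in new_addrs )
        print fmt("%s now also maps to %s", dm$req_host, b);
    @}
@end example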

@cindex modules, dns, event handlers

@node finger Analyzer,
@section The @code{finger} Analyzer

@cindex analyzers, application-specific
@cindex Finger, analysis
The @code{finger} analyzer processes traffic associated with
the Finger service (RFC 1288).  Bro instantiates a @code{finger}
analyzer for any connection with service port @code{79/tcp} (if you
@code{@@load} the finger analyzer in your script, or define your own
@code{finger_request} or @code{finger_reply} handlers, of course).

The analyzer uses a capture filter of ``@code{port finger}''
(See: @ref{Filtering}).

In the past, attackers often used Finger requests to obtain information
about a site's users, and sometimes to launch attacks of various forms
(buffer overflows, in particular).  In our experience, exploitation
of the service has
greatly diminished over the past years (no doubt due in part to the service
being increasingly turned off, or prohibited by firewalls).  Now it is only
rarely associated with an attack.

@menu
* finger variables::
* finger event handlers::
@end menu

@node finger variables,
@subsection @code{finger} variables

@cindex analyzers, finger, variables

The standard script defines two redefinable variables:

@table @samp
@item @code{hot_names : set[string]}
A list of usernames that should be considered sensitive (notice-worthy)
if included in a Finger request.

Default: @code{@{ "root", "lp", "uucp", "nuucp", "demos", "operator", "sync", "guest", "visitor", @}}.

@item @code{max_request_length : count}
The largest reasonable request size (used to flag possible buffer
overflow attacks).  Bro marks a connection as ``hot'' if its request
exceeds this length, and truncates its logging of the request to
this many bytes, followed by @code{"..."}.

Default: @code{80}.

@end table
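
A brief sketch of adjusting these two variables; the added username and
the new length are hypothetical values picked for the example.

@example
# Also treat Finger requests for "admin" as sensitive.
redef hot_names += @{ "admin", @};

# Allow somewhat longer requests before flagging the connection.
redef max_request_length = 120;
@end example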

@cindex analyzers, finger, variables

@node finger event handlers,
@subsection @code{finger} event handlers

@cindex analyzers, finger, event handlers

The standard script defines one event handler:

@table @samp
@item @code{finger_request (c: connection, request: string, full: bool)}
Invoked upon connection @code{c} having made the request @code{request}.
The @code{full} flag is true if the request included the ``long format''
option (which the event engine will have removed from the request).

The standard script flags long requests and truncates them as noted above,
and then checks whether the request is for a name in @code{hot_names}.
It then formats the request either by placing double quotation marks
around it, or, if the request was empty---indicating a request for
information on all users---the request is changed to the string @code{ALL}
with no quotes around it.

If the originator already made a request, then this additional request
is placed in parentheses (though multiple requests violate the Finger
protocol).  If the request was for the @code{full} format, then the
text ``@code{(/W)}'' is appended to the request.  Finally, the request
is appended to the connection's @code{addl} field.

@end table

The event engine generates an additional event that the predefined
@code{finger} script does not handle:

@table @samp
@item @code{finger_reply (c: connection, reply_line: string)}
Generated for each line of text sent in response to the originator's
request.

@end table
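
If you want to examine Finger replies, you can supply your own handler.
The sketch below just prints each reply line together with the
responder's address; what you actually do with the line is up to your
policy.

@example
event finger_reply(c: connection, reply_line: string)
    @{
    print fmt("finger reply from %s: %s", c$id$resp_h, reply_line);
    @}
@end example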

@cindex analyzers, finger

@node frag Analysis Script,
@section The @code{frag} Analysis Script

The @code{frag} utility module simply refines the capture filter
(See: @ref{Filtering}) so that Bro will capture and reassemble IP fragments.
Bro reassembles any fragments it receives; but normally it doesn't receive
any, except the beginnings of TCP fragments (see the @code{tcp}
module), and UDP port 111 (per the @code{portmapper} module).

@cindex fragment reassembly
@cindex fragments, TCP vs. UDP
@cindex TCP, fragments
@cindex UDP, fragments
So, to make Bro do fragment reassembly, you simply use ``@code{@@load frag}''.
It effects this by adding:
@example
    (ip[6:2] & 0x3fff != 0) and tcp
@end example

to the filter.  The first part of this expression matches all IP fragments,
while the second restricts those matched to TCP traffic.  We would @emph{like}
to use:
@example
    (ip[6:2] & 0x3fff != 0) and (tcp or udp port 111)
@end example

to also include portmapper fragments, but that won't work---the port
numbers will only be present in the first fragment, so the packet filter
won't recognize the subsequent fragments as belonging to a UDP port 111
packet, and will fail to capture them.

@cindex NFS traffic, high volume fragments
@emph{Note: Alternatively, we might be tempted to use ``@code{(tcp or udp)}''
and so capture @emph{all} UDP fragments, including port 111.  This would
work in principle, but in practice can capture very high volumes of
traffic due to NFS traffic, which can send all of its file data in
UDP fragments.}

@node hot-ids Analysis Script,
@section The @code{hot-ids} Analysis Script

@cindex usernames, sensitive
@cindex sensitive usernames
@cindex hot usernames

The @code{hot-ids} module defines a number of redefinable variables
that specify usernames Bro should consider sensitive:

@table @samp
@item @code{forbidden_ids : set[string]}
lists usernames that should never be used.  If Bro detects use of one,
it will attempt to terminate the corresponding connection.

Default: @code{@{ "uucp", "daemon", "rewt", "nuucp", "EZsetup", "OutOfBox", "4Dgifts", "ezsetup", "outofbox", "4dgifts", "sgiweb", @}}.
All of these
correspond to accounts that some systems have enabled by default
(with well-known passwords), except for @code{"rewt"}, which corresponds
to a username often used by (weenie) attackers.
@cindex attackers, weenie

@emph{Deficiency: The repeated definitions such as @code{"EZsetup"} and @code{"ezsetup"} reflect that this variable is a @code{set} and not a @code{pattern}.  Consequently, the exact username must appear in it (with a pattern, we could use character classes to match both upper and lower case).}

@item @code{forbidden_ids_if_no_password : set[string]}
Same as @code{forbidden_ids} except only considered forbidden if
the login succeeded with an empty password.

Default: @code{"lp"}, a default passwordless IRIX account.

@item @code{forbidden_id_patterns : pattern}
A pattern giving user ids that should be considered forbidden.
@emph{Deficiency: This pattern is currently only used to check Telnet/Rlogin user ids, not ids seen in other contexts, such as FTP sessions.}

Default: @code{/(y[o0]u)(r|ar[e3])([o0]wn.*)/}, a particularly
egregious style of username of which we've observed variants
in different break-ins.

@item @code{always_hot_ids : set[string]}
A list of usernames that should always be considered sensitive,
though not necessarily so sensitive that they should be terminated
whenever used.

Default: @code{@{ "lp", "warez", "demos", forbidden_ids, @}}.  The
@code{"lp"} and @code{"demos"} accounts are specified here rather
than @code{forbidden_ids} because it's possible that they might be
used for legitimate accounts.  @code{"warez"} (for ``wares'', i.e.,
bootlegged software) is listed because its use likely constitutes
a policy violation, not a security violation.

Note: @code{forbidden_ids} is incorporated into @code{always_hot_ids}
to avoid replicating the list of particularly sensitive ids by listing
it twice and risking inconsistencies.

@item @code{hot_ids : set[string]}
User ids that generate notices if the user logs in successfully.

Default: @code{@{ "root", "system", always_hot_ids, @}}.  The
ones included in addition to @code{always_hot_ids} are only considered
sensitive if the user logs in successfully.

@end table
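
As a sketch of site-specific tuning, the set-valued variables above can
be extended with @code{redef}.  The usernames @code{"toor"} and
@code{"admin"} are hypothetical additions chosen for illustration.
Because the sets are matched exactly, each spelling you care about must
be listed separately.

@example
# Never allow these usernames, in either capitalization.
redef forbidden_ids += @{ "toor", "Toor", @};

# Generate a notice when this user logs in successfully.
redef hot_ids += @{ "admin", @};
@end example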

@node ftp Analyzer,
@section The @code{ftp} Analyzer

@cindex FTP, analysis
The @code{ftp} analyzer processes traffic associated with
the FTP file transfer service (RFC 959).  Bro instantiates an
@code{ftp} analyzer for any connection with service port @code{21/tcp},
provided you have loaded the @code{ftp} analyzer, or defined a handler
for @code{ftp_request} or @code{ftp_reply}.

The analyzer uses a capture filter of ``@code{port ftp}'' (See: @ref{Filtering}).
It generates summaries of FTP sessions;
looks for sensitive usernames, access to sensitive files, and possible
FTP ``bounce'' attacks, in which the host specified in a ``@code{PORT}'' or
``@code{PASV}'' directive does not correspond to the host sending
the directive; or in which a different host than the server (client) connects
to the endpoint specified in a @code{PORT} (@code{PASV}) directive.

@menu
* ftp_session_info record::
* ftp variables::
* ftp functions::
* ftp event handlers::
* ftp notices::
@end menu

@node ftp_session_info record,
@subsection The @code{ftp_session_info} record

@cindex FTP, session information

The main data structure managed by the @code{ftp} analyzer is
a collection of @code{ftp_session_info} records, where the
record type is shown below:

@example
type ftp_session_info: record @{
    id: count;              # unique number associated w/ session
    user: string;           # username, if determined
    request: string;        # pending request or requests
    num_requests: count;    # count of pending requests
    request_t: time;        # time of request
    log_if_not_denied: bool;        # unless code 530 on reply, log it
    log_if_not_unavail: bool;       # unless code 550 on reply, log it
    log_it: bool;           # if true, log the request(s)
@};
@end example

The corresponding fields are:

@table @samp
@item @code{id}
The unique session identifier assigned to this session.  Sessions
are numbered starting at @code{1} and incrementing with each new session.

@item @code{user}
The username associated with this session (from the initial FTP
authentication dialog), or an empty string if not yet determined.

@item @code{request}
The pending request, if the client has issued any.  Ordinarily there
would be at most one pending request, but a client can in fact send
multiple requests to the server all at once,
and an attacker could do so attempting
to confuse the analyzer into mismatching responses with requests,
or simply forgetting about previous requests.

@item @code{num_requests}
A count of how many requests are currently pending.

@item @code{request_t}
The time at which the pending request was issued.

@item @code{log_if_not_denied}
If true, then when the reply to the current request comes in,
Bro should log it, unless the reply code is @code{530} (``@code{denied}'').

@item @code{log_if_not_unavail}
If true, then when the reply to the current request comes in,
Bro should log it, unless the reply code is @code{550} (``@code{unavail}'').

@item @code{log_it}
If true, then when the reply to the current request comes in,
Bro should log it.

@end table
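
The way the three logging flags combine with the reply code can be
sketched as a small helper.  This is only an illustration of the field
descriptions above, not the standard script's actual code; the function
name @code{should_log} is hypothetical.

@example
# Hypothetical helper: decide whether the reply with the given
# status code should cause the pending request(s) to be logged.
function should_log(s: ftp_session_info, code: count): bool
    @{
    if ( s$log_it )
        return T;

    # Log unless the server denied the request (530).
    if ( s$log_if_not_denied && code != 530 )
        return T;

    # Log unless the requested item was unavailable (550).
    if ( s$log_if_not_unavail && code != 550 )
        return T;

    return F;
    @}
@end example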

@node ftp variables,
@subsection @code{ftp} variables

@cindex analyzers, ftp, variables

The standard script defines the following redefinable variables:

@table @samp
@item @code{ftp_guest_ids : set[string]}
A set of usernames associated with publicly accessible ``guest''
services.  Bro interprets guest usernames as indicating Bro should
use the authentication @emph{password} as the effective username.

Default: @code{@{ "anonymous", "ftp", "guest", @}}.

@item @code{ftp_skip_hot : set[addr, addr, string]}
Entries indicate that a connection from the first given address to the
second given address, using the given string username, should not
be treated as hot even if the username is sensitive.

Default: empty.

Example: redefining @code{ftp_skip_hot} using
@example
redef ftp_skip_hot: set[addr, addr, string] += @{
    [[bob1.dsl.home.net, bob2.dsl.home.net],
      bob.work.com, "root"], @};
@end example

would result in Bro not noticing FTP connections as user @code{"root"}
from either @code{bob1.dsl.home.net} or @code{bob2.dsl.home.net} to the
server running on @code{bob.work.com}.

@item @code{ftp_hot_files : pattern}
Bro matches the argument given in each FTP file manipulation
request (RETR, STOR, etc.)
against this pattern to see if the file is sensitive. If so,
and if the request succeeds, then the access is logged.

@cindex eggdrop
@cindex filenames, sensitive
Default: a pattern that matches various flavors of password files, plus
any string with @code{eggdrop} in it.  @emph{Note: Eggdrop is an IRC management
tool often installed by certain attackers upon a successful break-in.}

@item @code{ftp_not_actually_hot_files : pattern}
A pattern giving exceptions to @code{ftp_hot_files}.  It turns out
that a pattern like @code{/passwd/} generates a lot of false hits,
such as from @code{passwd.c} (source for the @emph{passwd} utility;
this can turn up in FTP sessions that fetch entire sets of utility sources
using @code{MGET}) or @code{passwd.html} (a Web page explaining how to enter
a password for accessing a particular page).

Default: @code{/(passwd|shadow).*\.(c|gif|htm|pl|rpm|tar|zip)/}.

@item @code{ftp_hot_guest_files : pattern}
Files that guests should not attempt to access.

@cindex rhosts
@cindex forward
Default: @code{.rhosts} and @code{.forward}.

@item @code{skip_unexpected : set[addr]}
If a new host (address) unexpectedly connects to the endpoint specified in a
@code{PORT} or @code{PASV} directive, then if either the original host
or the new host is in this set, no message is
generated.  The idea is that you can specify multi-homed hosts that
frequently show up in your FTP traffic, as these can generate innocuous
warnings about connections from unexpected hosts.

Default: some @code{hp.com} hosts, as an example.  Most are specified
as raw IP addresses rather than hostnames, since the hostnames
don't always consistently resolve.

@item @code{skip_unexpected_net : set[addr]}
The same as @code{skip_unexpected}, except that addresses are masked to /24
and /16 before being looked up in this set.

Default: empty.

@end table
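
As a further illustration, a site policy script might extend the
set-valued variables with @code{+=}, which adds to the defaults rather
than replacing them.  The following is only a sketch: the username and
addresses are made up, and depending on the version of the standard
script the variable names may need a module prefix.

@example
# Hypothetical site-specific tuning; all values are illustrative.

# Treat one more username as a guest account.
redef ftp_guest_ids += @{ "visitor", @};

# Multi-homed hosts (made-up addresses) that should not trigger
# warnings about data connections from unexpected hosts.
redef skip_unexpected += @{ 10.1.2.3, 10.1.2.4, @};
@end example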

@cindex FTP, log file
@cindex log file, FTP
@cindex ftp session summary file
In addition, @code{ftp_log} holds the name of the FTP log file to
which Bro writes FTP session summaries.  It defaults to
@code{open_log_file("ftp")}.

Here is an example of what entries in this file look like:

@example
972499885.784104 #26 131.243.70.68/1899 > 64.55.26.206/ftp start
972499886.685046 #26 response (220 tuvok.ooc.com FTP server
    (Version wu-2.6.0(1) Fri Jun 23 09:17:44 EDT 2000) ready.)
972499886.686025 #26 USER anonymous/IEUser@@ (logged in)
972499887.850621 #26 TYPE I (ok)
972499888.421741 #26 PASV (227 64.55.26.206/2427)
972499889.493020 #26 SIZE /pub/OB/4.0/JOB-4.0.3.zip (213 1675597)
972499890.135706 #26 *RETR /pub/OB/4.0/JOB-4.0.3.zip, ABOR (complete)
972500055.491045 #26 response (225 ABOR command successful.)
@end example

Here we see a transcript of
the 26th FTP session seen since Bro started running.  The first line
gives its start time and the participating hosts and ports.  The
next line (split across two lines above for clarity) gives the server's
welcome banner.  The client then logged in as user ``@code{anonymous}'',
and because this is one of the guest usernames, Bro recorded their
password too, which in this case was ``@code{IEUser@@}'' (a useless
string supplied by their Web browser).  The server accepted this
authentication, so the status on the line is ``@code{(logged in)}''.

The client then issued a request for the Image file type, to which the
server agreed.  Next they issued a @code{PASV} directive, and received a
response instructing them to connect to the server on port @code{2427/tcp}
for the next transfer.  At this point, after issuing a @code{SIZE} directive
(to which the server returned 1,675,597 bytes), they sent @code{RETR} to
fetch the file @emph{/pub/OB/4.0/JOB-4.0.3.zip}.  Before the
transfer completed, they issued @code{ABOR}, but the transfer finished
before the server processed the abort, so the log shows a status of
@code{(complete)}.  Furthermore, because the client issued two commands without
waiting for an intervening response, these are shown together in the log
file, and the line is marked with a ``@code{*}'' so it draws the eye.  Finally,
because Bro paired up the @code{(complete)} with the multi-request line, it
then treats the response to the @code{ABOR} command as a reply by itself,
showing in the last line that the server reported it successfully carried
out the abort.

The corresponding lines in the @file{conn} file look like:
@example
    972499885.784104 565.836 ftp 118 427 131.243.70.68 64.55.26.206
        RSTO L #26 anonymous/IEUser@@
    972499888.984116 165.098 ftp-data ? 1675597 131.243.70.68
	64.55.26.206 RSTO L
@end example

The first line summarizes the FTP control session (over which the client
sends its requests and receives the server's responses).  It includes
an @code{addl} annotation of ``@code{#26 anonymous/IEUser@@}'',
summarizing the session number (so you can find the corresponding records
in the @code{ftp} log file) and the authentication information.

@cindex connection size, undetermined for RST termination
@cindex RST termination, causing undetermined connection size
The second line summarizes the single FTP data transfer, of 1,675,597 bytes.
The amount of data sent by the client for this connection is shown as
unknown because the client aborted the connection with a RST (hence
the state @code{RSTO}).   For connections that Bro does not look inside
(such as FTP data transfers), it learns the amount of data transferred from
the sequence numbers of the SYN and FIN connection control packets, and
can't (reliably) learn them for the sender of a RST.  (It can for the
receiver of the RST.)

They also aborted the control session (again, state @code{RSTO}), but
in this case, Bro captured all of the packets of the session, so it
could still assign sizes to both directions.

@cindex analyzers, ftp, variables

@node ftp functions,
@subsection @code{ftp} functions

@cindex analyzers, ftp, functions

The standard @code{ftp} script provides one function for external use:

@table @samp
@item @code{is_ftp_data_conn (c: connection): bool }
Returns true if the given connection matches one we're expecting
as the data connection half of an FTP session.  @emph{Note: This function is
not idempotent: if the connection matches an expected one, then
Bro updates its state such that that connection is no longer expected.
It also logs a discrepancy if the connection appears to be usurping
another one that generated either a ``@code{PORT}'' or a ``@code{PASV}''
directive.}

Also returns true if the source port is @code{20/tcp} and there's currently
an FTP session active between the originator and responder, in case for
some reason Bro's bookkeeping is inconsistent.

@end table
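
Because the function is not idempotent, a script that consults it should
call it at most once per connection and keep the result.  A minimal
sketch, assuming the standard @code{connection_established} event; note
that calling the function from your own handler may consume the
expectation before other scripts get to see it.

@example
# Hypothetical handler: report connections recognized as FTP data
# connections, calling is_ftp_data_conn exactly once per connection.
event connection_established(c: connection)
    @{
    local is_data = is_ftp_data_conn(c);

    if ( is_data )
        print fmt("FTP data connection: %s -> %s",
                  c$id$orig_h, c$id$resp_h);
    @}
@end example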

@cindex analyzers, ftp, functions

@node ftp event handlers,
@subsection @code{ftp} event handlers
@cindex analyzers, ftp, event handlers

The standard script handles the following events:

@table @samp
@item @code{ftp_request (c: connection, command: string, arg: string)}
Invoked upon the client side of connection @code{c} having made the request
@code{command} with the argument @code{arg}.

The processing depends on the particular command:
@table @samp

@item @code{USER}
Specifies the username that the client wishes to
use for authentication. If it is sensitive---in @code{hot_ids} (which
the @code{ftp} analyzer accesses via a @code{@@load} of @code{hot-ids})---then
the analyzer flags the FTP session as notice-worthy.  In addition, if
the username is in @code{forbidden_ids}, then the analyzer terminates
the session.

The analyzer also updates the connection's @code{addl} field
with the username.

@item @code{PASS}
Specifies the password to use for authentication.

If the password is empty and the username appears in
@code{forbidden_ids_if_no_password} (also from the @code{hot-ids} analyzer),
then the analyzer terminates the connection.

If the username corresponds to a guest account (@code{ftp_guest_ids}),
then the analyzer updates the connection's @code{addl}  field
with the password as additional account information.  Otherwise,
it generates an @code{account_tried} event to
facilitate detection of password guessing.

@item @code{PORT}
Instructs the FTP server to connect to the given
IP address and port for delivery of the next FTP data item.  The analyzer
first checks the address/port specifier for validity.  If valid, it
will generate a notice if either the address specified in the directive
does not match that of the client, or if the port corresponds to a
``privileged'' port, i.e., one in the range 0--1023.  Finally, it
establishes state so that @code{is_ftp_data_conn} can identify a
subsequent connection corresponding to this directive as belonging to
this FTP session.

@item @code{ACCT}
Specifies additional accounting information associated with a session,
which the analyzer simply adds to the connection's @code{addl}
field.

@item @code{APPE}, @code{CWD}, @code{DELE}, @code{MKD}, @code{RETR}, @code{RMD}, @code{RNFR}, @code{RNTO}, @code{STOR}, @code{STOU}
All of these manipulate files (and directories).  The analyzer checks
the filename against the policies to see if it is sensitive in the
context of the given username (i.e., guest or non-guest), and, if so,
marks the connection to generate a notice unless the operation fails.
The analyzer also checks for an excessively long filename, currently
by checking its length against a @emph{Deficiency: hardwired maximum of 250 bytes}.
@end table

@item @code{ftp_reply (c: connection, code: count, msg: string, cont_resp: bool)}
Invoked upon the server side of connection @code{c} having replied to
a request using the given status code and text message.  @code{cont_resp}
is true if the reply line is tagged as being continued to the next line.
The analyzer only processes requests when the last line of a continued
reply is received.

The analyzer checks the reply against any expectations recorded for the session
(for example, ``@code{log_if_not_denied}'') and generates notices accordingly.
If the reply corresponds to a @code{PASV} directive, then it parses the
address/port specification in the reply and generates notices in an analogous
fashion as done by the @code{ftp_request} handler for @code{PORT} directives.

Finally, if the reply is not one that the analyzer is hardwired to skip
(code @code{150}, used at the beginning of a data transfer, and code
@code{331}, used to prompt for a password),
then it writes a summary of the request and reply to the FTP log file
(See: @ref{ftp variables}).  Also, if the reply is an ``orphan'' (there was
no corresponding request, perhaps because Bro started up after the
request was made), then the reply is summarized in the log file by
itself.

@end table
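
A site can define additional handlers for these events alongside the
standard ones.  The sketch below counts @code{STOR} requests per client
address.  The table name is hypothetical, and the code assumes that
@code{command} arrives in upper case.

@example
# Hypothetical add-on: number of STOR requests seen per client.
global stor_count: table[addr] of count &default = 0;

event ftp_request(c: connection, command: string, arg: string)
    @{
    if ( command == "STOR" )
        ++stor_count[c$id$orig_h];
    @}
@end example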

The standard @code{ftp} script defines one other handler, which flushes
FTP session information in case the session terminates abnormally and no
reply is seen to the pending request(s).

@cindex analyzers, ftp, event handlers

@node ftp notices,
@subsection @code{ftp} notices
@cindex ftp, notices

The FTP analyzer can generate the following notices:

@itemize
@item @code{FTP::FTP_BadPort} - Bad format in PORT/PASV.
@item @code{FTP::FTP_ExcessiveFilename} - Very long filename seen.
@item @code{FTP::FTP_PrivPort} - Privileged port used in PORT/PASV.
@item @code{FTP::FTP_Sensitive} - Sensitive connection (as defined in @code{hot}).
@item @code{FTP::FTP_UnexpectedConn} - Data transfer from an unexpected source.
Suppose there's an FTP session between client A and server B, and either
A issues a PORT or B issues a PASV.  Then what's expected is that A will
rendezvous with B using the port specified in the PORT/PASV.  If instead
a new IP address C connects to (or accepts from) the negotiated port, that
generates @code{FTP_UnexpectedConn}.
@end itemize


@node http Analyzer,
@section The @code{http} Analyzer

@cindex HTTP, analysis

The @code{http} analyzer processes traffic associated with
the Hypertext Transfer Protocol (HTTP) [RFC-1945],
the main protocol used by the Web.  Bro instantiates an
@code{http} analyzer for any connection with service port @code{80/tcp},
providing you have loaded the @code{http} analyzer, or defined a handler
for @code{http_request}.  It also instantiates an analyzer for
service ports @code{8080/tcp} and @code{8000/tcp}, as these are
often also used for Web servers.

The analyzer uses a capture filter of
``@code{tcp dst port 80 or tcp dst port 8080 or tcp dst port 8000}''
(See: @ref{Filtering}).  Note: This filter excludes
traffic sent by an HTTP server (that would be matched by @code{tcp src port 80},
etc.), because @emph{Deficiency: Bro doesn't yet have an analyzer for HTTP replies.}

The analyzer generates summaries of HTTP sessions (connections between the
same client and server) and looks for access to sensitive URIs
(effectively, URLs).

@menu
* http variables::
* http event handlers::
@end menu

@node http variables,
@subsection @code{http} variables

@cindex analyzers, http, variables

@table @samp
@cindex HTTP methods
@item @code{sensitive_URIs : pattern}
Any HTTP method (e.g., @code{GET}, @code{HEAD},
@code{POST}) specifying
a URI that matches this pattern is flagged as sensitive.

@cindex etc/passwd
@cindex etc/shadow
@cindex Cold Fusion exploits
Default: URIs with @code{/etc/passwd} or @code{/etc/shadow} embedded
in them, or @code{/cfdocs/expeval} (used in some Cold Fusion exploits).
Note: This latter generates some false hits; it's mainly included
just to convey the notion of looking for direct attacks rather than
attacks used to exploit sensitive files like the first ones.

@emph{Deficiency: It would be very handy to have variables providing hooks for more context when considering whether a particular access is sensitive, such as whether the request was inbound or outbound. }

@item @code{sensitive_post_URIs : pattern}
Any @code{POST} method specifying a URI that matches this pattern is flagged as
sensitive.

Default: URIs with @code{wwwroot} embedded in them.

@end table
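
Both variables hold patterns, so a site can replace them with its own
alternation.  The following is a sketch only: assigning with @code{=}
discards the defaults, so the first two alternatives restate them as
described above (approximately, not verbatim), and the remaining
alternatives are made-up additions.  Whether the patterns need a leading
and trailing @code{.*} depends on how the script applies them, which this
sketch does not settle.

@example
# Hypothetical site-specific patterns; the last alternative in each
# redef is an illustrative addition.
redef sensitive_URIs =
      /\/etc\/(passwd|shadow)/
    | /\/cfdocs\/expeval/
    | /\/admin\/config\.php/;

redef sensitive_post_URIs =
      /wwwroot/
    | /\/cgi-bin\/upload/;
@end example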

@cindex frogs, dissecting
@cindex http session summary file

@cindex HTTP, log file
@cindex log file, HTTP
In addition, @code{http_log} holds the name of the HTTP log file to
which Bro writes HTTP session summaries.  It defaults to
@code{open_log_file("http")}.

Here we show an example of what entries in this file look like:

@example
972482763.371224 %1596 start 200.241.229.80 > 131.243.2.12
%1596 GET /ITG.hm.pg.docs/dissect/portuguese/dissect.html
%1596 GET /vfrog/bottom.icon.gif
%1596 GET /vfrog/top.icon.gif
%1596 GET /vfrog/movies/off.gif
%1596 GET /vfrog/new.frog.small.gif
@end example

Here we see a transcript of
the 1596th HTTP session seen since Bro started running.  The first line
gives its start time and the participating hosts.  The
next five lines all correspond to @code{GET} methods retrieving different
items from the Web server.  @emph{Deficiency: Bro can't log whether the retrievals succeeded or failed because it doesn't yet have an HTTP reply analyzer. }

The corresponding lines in the @code{conn} file look like:
@example
    972482762.872695 481.551 http 441 5040 131.243.2.12 200.241.229.80
        S3 X %10596
    972482764.686470 18.7611 http 596 7712 131.243.2.12 200.241.229.80
        S3 X %10596
    972482764.685047 ? http 603 2959 131.243.2.12 200.241.229.80
        S1 X %10596
@end example

That there are three connections rather than five reflects @emph{(i)} that the client
used persistent HTTP, and so didn't need one connection per item, but
also @emph{(ii)} that the client used three parallel connections (the maximum
the standard allows is only two) to fetch the items more quickly.  As with FTP
sessions, the @code{%10596} @code{addl} annotation lets you
correlate the @code{conn} entries with the log entries.

@emph{Note: All three of the connections wound up in unusual states.  The first
two are in state @code{S3}, which, as indicated by Table 7.3,
means that the responder (in this case, the Web server) attempted to close
the connection, but there was no reply from the originator.  The last is
in state @code{S1}, indicating that neither side attempted to close the
connection (which is why no duration is listed for the connection).}

@cindex analyzers, http, variables

@node http event handlers,
@subsection @code{http} event handlers

@cindex analyzers, http, event handlers

The standard HTTP script defines one event handler:

@table @samp
@item @code{http_request (c: connection, request: string, URI: string)}
Invoked whenever the client side of the given connection generates an
HTTP request.  @code{request} gives the HTTP method and @code{URI} the
associated resource.  The analyzer matches the URI against the ones
defined as sensitive, as given above.
@end table
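
As with other events, a site can supply its own handler in addition to
the standard one.  A minimal sketch, using the event signature as
documented here (other versions of Bro may pass different arguments) and
a made-up client address:

@example
# Hypothetical add-on handler: report POST requests issued by one
# particular client.
event http_request(c: connection, request: string, URI: string)
    @{
    if ( request == "POST" && c$id$orig_h == 10.0.0.5 )
        print fmt("%s POST %s", c$id$orig_h, URI);
    @}
@end example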

@emph{Deficiency: As mentioned above, the event engine does not currently generate an @code{http_reply} event.  This is for two reasons: first, the HTTP request stream is much lower volume than the HTTP reply stream, and I was interested in the degree to which Bro could get away without analyzing the higher volume stream.  (Of course, this argument is shallow, since one could control whether or not Bro should analyze HTTP replies by deciding whether or not to define an @code{http_reply} handler.)  Second, matching HTTP replies in their full generality involves a lot of work, because the HTTP standard allows replies to be delimited in a number of ways.  That said, most of the work for implementing @code{http_reply} is already done in the event engine, but it is missing testing and debugging.}

@cindex analyzers, http, event handlers

@node ident Analyzer,
@section The @code{ident} Analyzer

@cindex IDENT, analysis

The @code{ident} analyzer processes traffic associated with
the Identification Protocol [RFC-1413], which provides a simple
service whereby clients can query Ident servers to discover user information
associated with an existing connection between the server's host and
the client's host.  Bro instantiates an @code{ident} analyzer for
any connection with service port @code{113/tcp}, providing you have loaded
the @code{ident} analyzer, or defined a handler for @code{ident_request},
@code{ident_reply}, or @code{ident_error}.

The analyzer uses a capture filter of ``@code{tcp port 113}''
(See: @ref{Filtering}).
The @code{ident_reply} handler annotates the @code{addl}
field of the connection for which the Ident client made its query with the
user information returned in the reply.  It also checks the user information
against sensitive usernames, because a match indicates that the connection
in the Ident query was initiated by a possibly-compromised account.

@menu
* ident variables::
* ident event handlers::
@end menu

@node ident variables,
@subsection @code{ident} variables

@cindex analyzers, ident, variables

The standard script defines the following pair of redefinable variables:

@table @samp
@item @code{hot_ident_ids : set[string]}
Usernames to flag as sensitive if they appear in an Ident reply.

Default: @code{always_hot_ids} (See: @ref{hot-ids Analysis Script}).

@item @code{hot_ident_exceptions : set[string]}
Usernames not to consider sensitive even if they appear in
@code{hot_ident_ids}.

@cindex daemons, as innocuous user names
Default: @code{@{ "uucp", "nuucp", "daemon", @}}.  These usernames
are exceptions because daemons sometimes run with the given user ids
and their use is often innocuous.

@end table

@cindex analyzers, ident, variables

@node ident event handlers,
@subsection @code{ident} event handlers

@cindex analyzers, ident, event handlers

The standard script handles the following events:

@table @samp
@item @code{ident_request (c: connection, lport: port, rport: port)}
Invoked when a client request arrives on connection @code{c}, querying
about the connection from local port @code{lport} to remote port
@code{rport}, where local and remote are relative to the client.

@item @code{ident_reply (c: connection, lport: port, rport: port, user_id: string, system: string)}
Invoked when a server replies to an Ident request.  @code{lport} and
@code{rport} are again the local and remote ports (relative to the
client) of the connection being asked about.  @code{user_id} is the
user information returned in the Ident server's reply, and @code{system}
is information regarding the operating system (the Ident specification
does not further standardize this information).

The handler annotates the queried connection with the user information,
which it also checks against @code{hot_ident_ids} and @code{hot_ident_exceptions}
as discussed above.  At present, it does nothing with the @code{system}
information.

@item @code{ident_error (c: connection, lport: port, rport: port, line: string)}
Invoked when the given request yielded an error reply from the Ident
server.  The handler annotates the connection with
@code{ident/}@code{<}@emph{error}@code{>},
where @emph{error} is the text given in @code{line}.

@end table
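
A site-defined handler can run alongside the standard one.  The sketch
below tallies how often each username is reported in Ident replies,
skipping the usernames listed in @code{hot_ident_exceptions}.  The table
name is hypothetical.

@example
# Hypothetical add-on: number of Ident replies seen per username.
global ident_users: table[string] of count &default = 0;

event ident_reply(c: connection, lport: port, rport: port,
                  user_id: string, system: string)
    @{
    if ( user_id !in hot_ident_exceptions )
        ++ident_users[user_id];
    @}
@end example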

@cindex analyzers, ident, event handlers

@node irc Analyzer,
@section The @code{irc} Analyzer

@cindex IRC, analysis

The @code{IRC} analyzer processes traffic from chat sessions that
use the IRC (Internet Relay Chat) protocol.
It can analyze client-server connections and server-server connections.

Bro instantiates an @code{IRC} analyzer for any connection with service
ports @code{6666/tcp} or @code{6667/tcp},
providing you have loaded the @code{IRC} analyzer, or defined a handler
for one of the IRC events.
It is also possible to analyze server connections, but to do so you need
to recompile Bro to include the necessary ports if they are not the
usual ones.

Bro can analyze compressed connections if it sees the beginning of the
connection.

@menu
* irc records::
* irc variables::
* irc event handlers::
@end menu

@node irc records,
@subsection @code{irc} records

@cindex analyzers, irc, records

The standard script defines a record for users and one for channels.
This is the user record:
@example
type irc_user: record @{
	u_nick: string;			# nick name
	u_real: string;			# real name
	u_host: string;			# client host
	u_channels: set[string];	# channels user is a member of
	u_is_operator: bool;		# user is server operator
	u_conn: connection;		# underlying TCP connection
@};
@end example

This record represents a user inside the IRC network.
The corresponding fields are:

@table @samp

@item @code{u_nick}
The nick name of the user.

@item @code{u_real}
The real name of the user.

@item @code{u_host}
This is the client's host name.

@item @code{u_channels}
A list of channels the user has joined.

@item @code{u_is_operator}
True if the user has operator status in the IRC network.

@item @code{u_conn}
The TCP connection which this IRC connection is based on.

@end table

This is the channel record:
@example
type irc_channel: record @{
	c_name: string;		# channel name
	c_users: set[string];	# users in channel
	c_ops: set[string];	# channel operators
	c_type: string;		# channel type
	c_modes: string;	# channel modes
@};
@end example

This record represents a channel inside the IRC network.
The corresponding fields are:

@table @samp

@item @code{c_name}
The name of the channel.

@item @code{c_users}
A list of nick names of users in this channel.

@item @code{c_ops}
A list of nick names of users with operator status in this channel.

@item @code{c_type}
The channel type.

@item @code{c_modes}
The channel modes.
@end table

@cindex analyzers, irc, records

@node irc variables,
@subsection @code{irc} variables

@cindex analyzers, irc, variables

The standard script defines the following set of redefinable variables:

@table @samp

@item @code{IRC::hot_words}
A list of regular expressions that generate notice messages when matched.  The
analyzer searches for these patterns in user messages, notices, and all
unknown IRC commands.

@item @code{IRC::ignore_in_other_msgs: set[string]}
A list of IRC commands that are ignored in the events for unknown commands.

@item @code{IRC::ignore_in_other_responses: set[count]}
A list of IRC return codes that are ignored in the event for unknown return
codes.

@end table

The following variables contain information about the users and channels
that Bro has identified:

@table @samp

@item @code{IRC::users: table[string]}
Contains all identified IRC users as @code{irc_user} records.

@item @code{IRC::channels: table[string]}
Contains all identified IRC channels as @code{irc_channel} records.

@end table
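
For example, a site that finds keep-alive traffic and message-of-the-day
replies uninteresting might extend the two ignore sets as follows.  This
is a sketch; the chosen commands and numeric codes are illustrative and
may already be covered by the defaults.

@example
# Hypothetical additions to the ignore sets.
redef IRC::ignore_in_other_msgs += @{ "PING", "PONG", @};

# 375, 372, and 376 are the numeric replies that carry the
# message of the day (start, body, and end).
redef IRC::ignore_in_other_responses += @{ 375, 372, 376, @};
@end example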

@cindex analyzers, irc, variables

@node irc event handlers,
@subsection @code{irc} event handlers

@cindex analyzers, irc, event handlers

The standard script handles the following events:

@table @samp

@item @code{irc_privmsg_message (c: connection, source: string, target: string, message: string)}
A user sent a message to another user or channel.

@code{IRC} command: PRIVMSG

The source is the user who sent the message to the target user/channel.
Message contains the data sent to the target.

@item @code{irc_notice_message (c: connection, source: string, target: string, message: string)}
Very similar to @code{irc_privmsg_message}.  It is typically used by
services or client scripts to send status messages.

@code{IRC} command: NOTICE

The source is the user who sent the message to the target user/channel.
Message contains the data sent to the target.

@item @code{irc_squery_message (c: connection, source: string, target: string, message: string)}
This event is activated if somebody sends a message to an IRC service.

@code{IRC} command: SQUERY

The source is the user who sent the message to the target service.
Message contains the data sent to the target.

@item @code{irc_enter_message (c: connection, nick: string, realname: string)}
Every time a user enters the IRC network this event occurs.

@code{IRC} command: USER

Nick contains the selected nick name of the user and realname the user's
name in real life.

@item @code{irc_quit_message (c: connection, nick: string, message: string)}

Every time a user quits the IRC network this event occurs.

@code{IRC} command: QUIT

Nick contains the nick name of the sender. An optional quit message is included in message.

@item @code{irc_join_message (c: connection, infoList: irc_join_list)}

If a user joins one or more IRC channels this event occurs.

@code{IRC} command: JOIN

The infoList contains a list of joined channel names and - if provided by
user - the passwords for them.

@item @code{irc_part_message (c: connection, nick: string, channels: string_set, message: string)}

If a user exits one or more IRC channels this event occurs.

@code{IRC} command: PART

Nick contains the nick name of the user.
Channels is a set of channel names.
If the user supplies a parting message, it is included in message.

@item @code{irc_nick_message (c: connection, who: string, newnick: string)}

This event occurs when users change their nick names.

@code{IRC} command: NICK

Who contains the IRC message prefix which includes the user nick and host.
Newnick is the new nick name of this user.

@item @code{irc_invalid_nick (c: connection)}

This event occurs when a user tries to change nick names but the requested name is invalid.

@code{IRC} response to: NICK

@item @code{irc_network_info (c: connection, users: count, services: count, servers: count)}

This is a summary of the status of the whole IRC network.

@code{IRC} response to: LUSERS

Users, services and servers are the total number of users, services and IRC servers connected to the IRC network.

@item @code{irc_server_info (c: connection, users: count, services: count, servers: count)}

This is a summary of the status of a single IRC server.

@code{IRC} response to: LUSERS

Users, services and servers are the total number of users, services and IRC servers connected with this IRC server.

@item @code{irc_channel_info (c: connection, channels: count)}

Displays the total number of channels.

@code{IRC} response to: LUSERS

Channels is the number of IRC channels formed on this server (local + global).

@item @code{irc_who_message (c: connection, mask: string, oper: bool)}

This event occurs when an IRC user sends the WHO command to get information about an IRC user or channel.

@code{IRC} command: WHO

Mask is the target of the search.  This can be a channel or user name;
wildcards are allowed.  If oper is true, then the user is asking only for
results that are operators.

@item @code{irc_who_line (c: connection, target_nick: string, channel: string, user: string, host: string, server: string, nick: string, params: string, hops: count, realname: string)}

Provides several pieces of information about an IRC user.

@code{IRC} response to: WHO

@code{target_nick} is the nick name of the IRC user who sent the WHO request.
The username of the returned IRC user is given in @code{user}, the nick
name in @code{nick}, and the real name in @code{realname}.  The client's DNS
name or IP address is @code{host}.  @code{params} includes the channel
parameters for this user (e.g., ``@@'' for channel operators).  The user is
connected to IRC server @code{server}, and the number of servers between the
user and the requester is @code{hops}.  @code{channel} gives the name of the
channel that was the target of the request.

@item @code{irc_whois_message (c: connection, server: string, users: string)}

The event occurs if an IRC user sent the WHOIS command to get information
about one or more IRC users.

@code{IRC} command: WHOIS

If server is given then the user wants this specific server to answer.
Users is a comma-separated list of nick names for which information is
requested.

@item @code{irc_whois_user_line (c: connection, nick: string, user: string, host: string, realName: string)}

This event carries several pieces of information about an IRC user.

@code{IRC} response to: WHOIS

The user with nick name @code{nick} has the user name @code{user} and the real name @code{realName}.  The IRC client runs on @code{host}.

@item @code{irc_whois_operator_line (c: connection, nick: string)}

This response to a WHOIS command indicates whether an IRC user is an operator.

@code{IRC} response to: WHOIS

The IRC user with nick name nick has operator status.

@item @code{irc_whois_channel_line (c: connection, nick: string, channels: string_set)}

This response to a WHOIS command lists the channels of an
IRC user.

@code{IRC} response to: WHOIS

The IRC user with nick name nick is a member of all the IRC channels
listed in the variable channels.

@item @code{irc_oper_message (c: connection, user: string, password: string)}

This means that an IRC user requested operator status.

@code{IRC} command: OPER

The user and password parameters are used to authenticate the possible
operator.  They must match the settings of the IRC server (IRCD).

@item @code{irc_oper_response (c: connection, got_oper: bool)}

This is the answer to an operator request.

@code{IRC} response to: OPER

If the IRC user got operator status the got_oper variable is true.

@item @code{irc_kick_message (c: connection, prefix: string, channels: string, users: string, comment: string)}

A user requested to remove somebody from a channel.

@code{IRC} command: KICK

Prefix includes the requester's nick name and host. The user requested to
remove the users (comma separated list) from the channels (comma separated
list).  If the requester provided an optional kick message it is included
in comment.

@item @code{irc_error_message (c: connection, prefix: string, message: string)}

An IRC server sent an error message to one or more clients.

@code{IRC} command: ERROR

Prefix includes the server name and message contains the error message.

@item @code{irc_invite_message (c: connection, prefix: string, nickname: string, channel: string)}

An IRC user sent an invitation for a closed channel to another user.

@code{IRC} command: INVITE

Prefix includes the sender's nick and host. The IRC user with the nick name
nickname is invited to the channel with name channel.

@item @code{irc_mode_message (c: connection, prefix: string, params: string)}

An IRC user sent a user or channel mode message.

@code{IRC} command: MODE

@item @code{irc_squit_message (c: connection, prefix: string, server: string, message: string)}

This means that the disconnection of a server link was requested. This
command is only available to operators.

@code{IRC} command: SQUIT

Prefix includes the requester's nick and host. Server is the host name of
the server to disconnect and message contains an optional comment.

@item @code{irc_names_info (c: connection, c_type: string, channel: string, users: string_set)}

This reply to a NAMES command indicates which users are on which
channels.

@code{IRC} response to: NAMES

C_type is "@@" for secret, "*" for private and "=" for public channels.
Channel contains the channel name.  Users is a list of nick names that are
members of this channel.

@item @code{irc_dcc_message (c: connection, prefix: string, target: string, dcc_type: string, argument: string, address: addr, dest_port: count, size: count)}

A user sent a DCC request to another user to set up a direct connection
between these users.

@code{IRC} command: PRIVMSG DCC

Prefix contains the requester's nick and host. Target contains the target
user's nick name.  Dcc_type can be "CHAT" for chat connections or "SEND"
for file transfers.  Argument contains the file name for file transfers
or "chat" for chat connections.  Address and dest_port specify where the
target user should connect.  Size is only given for file transfers and
contains the file size in bytes.

@item @code{irc_request (c: connection, prefix: string, command: string, arguments: string)}

All client messages that do not fit any of the other events are handled here.

Prefix is usually formatted like this: <nickname>!<user>@@<hostname>.
Command contains the command string which was sent and arguments the
corresponding argument values.

@item @code{irc_message (c: connection, prefix: string, command: string, message: string)}

All server messages that do not fit any of the other events are handled here.

Prefix is usually the server name.  Command contains the command string
which was sent and message contains additional parameters.

@item @code{irc_response (c: connection, prefix: string, code: count, params: string)}

All server response messages that do not fit any of the other events are
handled here.

Prefix is usually the server name.  Code is the numeric reply code and
params contains any additional parameters.

@end table
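
As an illustration of how a script might use the events listed above, the
following sketch handles @code{irc_dcc_message} and prints one line for each
DCC file-transfer offer.  It is only an example, not part of the standard
script: the parameter list is copied from the event description above, while
the output format is our own choice.

@example
event irc_dcc_message(c: connection, prefix: string, target: string,
                      dcc_type: string, argument: string, address: addr,
                      dest_port: count, size: count)
    @{
    # Only file transfers carry a meaningful file name and size.
    if ( dcc_type == "SEND" )
        print fmt("DCC SEND %s -> %s: file %s (%s bytes), connect to %s port %s",
                  prefix, target, argument, size, address, dest_port);
    @}
@end example

The address and port are of particular interest, since they describe a
direct connection that will not otherwise pass through the IRC server.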

@node login Analyzer,
@section The @code{login} Analyzer
@cindex login session
@cindex Rlogin, sessions
@cindex Telnet, sessions
@cindex Unix analysis
@cindex keystrokes, analysis
@cindex input, analysis
@cindex user keystrokes, analysis
The @code{login} analyzer inspects interactive login sessions
to extract username and password information, and monitors user
keystrokes and the text returned by the login server.  It is one of
the most powerful Bro modules for detecting break-ins to Unix
systems because of the ability to look for particular commands that
attackers often execute once they have penetrated a Unix machine.

The analyzer is generic in the sense that it applies to more
than one protocol.  Currently, Bro instantiates a @code{login}
analyzer for both Telnet [RFC-854] and Rlogin [RFC-1282]
traffic.  In principle, it could do the same for other protocols such as
SSH [RFC-XXX] or perhaps X11 [RFC-1013], if one could write
the corresponding elements of the event engine to decrypt the
SSH session (naturally, this would require access to the encryption keys)
or extract authentication information and keystrokes from the
X11 event stream.  @emph{Note: The analyzer does an exceedingly limited
form of SSH analysis; see @code{hot_ssh_orig_ports} }.

@cindex authentication dialog
@cindex usernames, extracting
@cindex sniffing
@cindex passwords, sniffing
@cindex heuristics, extracting username information
@cindex Telnet, options
@cindex options, Telnet
For Telnet, the event engine knows how to remove in-band Telnet option
sequences [RFC-855]
from the text stream, and does not deliver these to
the event handlers, except for a few options
that the engine
analyzes in detail (such as attempts to negotiate authentication).
Unfortunately, the Telnet protocol does not include any explicit
marking of username or password information (unlike the FTP protocol,
as discussed in @ref{ftp Analyzer}).  Consequently, Bro employs a series
of heuristics that attempt to extract the username and password from the
authentication dialog the session is presumed to begin with.  The
analysis becomes quite complicated due to the possible use of
type-ahead and editing sequences by the user, plus the possibility
@cindex evasion, authentication dialog
that the user may be an attacker who attempts to mislead the heuristics
in order to disguise the username they are accessing.

@cindex rhosts
Analyzing Rlogin is nominally easier than analyzing Telnet because Rlogin
has a simpler in-band option scheme, and because the Rlogin protocol
explicitly indicates the username in the initial connection dialog.
However, this last is not actually a help to the analyzer, because
for most Rlogin servers, if the initial username fails authentication
(for example, is not present in the @code{.rhosts} file local to
the server), then the server falls back on the same authentication
dialog as with Telnet
(prompting for username and then password, or perhaps just
for a password to go with the transmitted username).
Consequently, the event engine employs the same set of heuristics
as for Telnet.

Each connection processed by the analyzer is in a distinct state:
user attempting to authenticate, user has successfully authenticated,
analyzer is skipping any further processing, or the analyzer is
confused (See: @ref{login analyzer confusion}).  You can find out the state of
a given connection using @code{get_login_state}.
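
The following minimal sketch shows how a script might turn the state into
a readable label.  It rests on several assumptions that you should check
against your version of Bro: that @code{get_login_state} takes the
connection's identifier (@code{c$id}) and returns a @code{count}, and that
the four states are named by the constants @code{LOGIN_STATE_AUTHENTICATE},
@code{LOGIN_STATE_LOGGED_IN}, @code{LOGIN_STATE_SKIP} and
@code{LOGIN_STATE_CONFUSED}.  The function name @code{login_state_label}
is hypothetical.

@example
function login_state_label(c: connection): string
    @{
    # Assumed signature: get_login_state(cid: conn_id): count
    local state = get_login_state(c$id);

    if ( state == LOGIN_STATE_AUTHENTICATE )
        return "authenticating";
    if ( state == LOGIN_STATE_LOGGED_IN )
        return "logged in";
    if ( state == LOGIN_STATE_SKIP )
        return "skipped";
    if ( state == LOGIN_STATE_CONFUSED )
        return "confused";

    return "unknown";
    @}
@end example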

The analyzer uses a capture filter of ``@code{tcp port 23 or tcp port 513}''
(see @ref{Filtering}).  It annotates each connection
with the username(s) present in the authentication dialog.  If
the username was authenticated successfully, then it encloses
the annotation in quotes. If the authentication failed, then
the name is marked as @code{fail/}@code{<}@emph{username}@code{>}.
So, for example, if user ``smith'' successfully authenticates,
then the connection's @code{addl} field will have
@code{"smith"} appended to it:
@example
931803523.006848 254.377 telnet 324 8891 1.2.3.4 5.6.7.8 SF L "smith"
@end example

while if ``smith'' failed to authenticate, the report will look like:
@example
931803523.006848 254.377 telnet 324 8891 1.2.3.4 5.6.7.8 SF L fail/smith
@end example

and if they first tried as ``smith'' and failed, and then succeeded
as ``jones'', the record would look like:
@example
931803523.006848 254.377 telnet 324 8891 1.2.3.4 5.6.7.8 SF L
	fail/smith "jones"
@end example

@cindex passwords, inadvertently exposed
@cindex sensitive information, inadvertently exposed
@emph{Note: The event engine's heuristics can sometimes get out of synch
such that it interprets a password as a username; in addition, users
sometimes type their password when they should instead enter
their username.  Consequently, the connection logs sometimes include
passwords in the annotations, and so should be treated as very sensitive
information (e.g., not readable by any user other than the one running
Bro). }

@menu
* login analyzer confusion::
* login variables::
* login functions::
* login event handlers::
@end menu

@node login analyzer confusion,
@subsection @code{login} analyzer confusion
@cindex login analysis, confusion
@cindex confused login analysis
@cindex heuristics, confusion
@cindex confusion of heuristics
@cindex failure of heuristics
@cindex authentication dialog
@cindex usernames, extracting
@cindex heuristics, extracting username information
@cindex rhosts
Because there is no well-defined protocol for Telnet authentication
(or Rlogin, if the initial
@code{.rhosts} authentication fails), the @code{login} analyzer employs a set
of heuristics to detect the username, password, and whether the authentication
attempt succeeded.  All in all, these heuristics work quite well, but
it is possible for them to become confused and reach incorrect conclusions.

Bro attempts to detect such confusion.  If it does, then it generates a
@code{login_confused} event, after which the event engine will no
longer attempt to follow the authentication dialog.  In particular, it will
@emph{not} generate subsequent @code{login_failure} or
@code{login_success} events.  The @code{login_confused} event includes
a string describing the type of confusion, using one of the values
given in the table below.


@float Table, Login analyzer confusion
@multitable  @columnfractions .35 .55
@item @strong{Type of confusion} @tab @strong{Meaning}
@item "excessive typeahead"
@tab The user has typed ahead 12 or more lines. Deficiency: The upper bound
should be adjustable.
@item "extra repeat text"
@tab The user has entered more than one VMS repeat sequence (an escape
followed by "[A") on the same line. Note: Bro determines
that a login session involves a VMS server if the server prompts with
"@code{Username:}". It then interprets VMS repeat sequences as indicating
it should replace the current line with the previous line.
@item "multiple USERs"
@tab The user has specified more than one username using the $USER environment
variable.
@item "multiple login prompts"
@tab The analyzer has seen several login prompts on the same line, and has
not seen a corresponding number of lines typed ahead previously by the
user.
@item "no login prompt"
@tab The analyzer has seen 50 lines sent by the server without any of them
matching login prompts. Deficiency: The value of 50 should be adjustable.
@item "no username"
@tab The analyzer is generating an event after having already seen a login
failure, but the user's input has not provided another username to include
with the event. Note: If the analyzer's heuristics indicate it's okay that
no new username has been given, such as when the event is generated
due to one connection endpoint closing the connection, then it instead
uses the username @code{<none>}.
@item "no username2"
@tab The analyzer saw an additional password prompt without seeing an intervening
username, and it has no previous username to reuse.
@item "non empty multi login"
@tab The analyzer saw multiple adjacent login prompts, with an apparently
ignored intervening username typed-ahead between them.
@item "possible login ploy"
@tab The client sent text that matches one of the patterns reflecting text usually
sent by the server. This form of confusion can reflect an attacker attempting
to evade the monitor. For example, the client may have sent the text
"@code{login:}" as a username so that when echoed back by the server, the
analyzer would misinterpret it as reflecting another login prompt from
the server.
@item "repeat without username"
@tab The user entered a VMS repeat sequence but there is no username to
repeat. (See "extra repeat text" for a discussion of the analyzer's
heuristics for dealing with VMS servers.)
@item "responder environment"
@tab The responder (login server) has signaled a set of environment variables
to the originator (login client). This is the opposite of the direction that
makes sense.
@item "username with embedded repeat"
@tab The line repeated by a VMS server in response to a repeat sequence itself
contains a repeat sequence.
@end multitable
@caption{Different types of confusion that the login analyzer can report}
@end float
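
A site that wants to track these reports can add its own handler.  The
sketch below is only an illustration: it assumes that @code{login_confused}
is delivered with the connection, the confusion string from the table above,
and the line of text that triggered it.  The exact parameter list is not
given in this section, so treat it as an assumption.

@example
event login_confused(c: connection, msg: string, line: string)
    @{
    # "possible login ploy" may reflect deliberate evasion, so single it out.
    if ( msg == "possible login ploy" )
        print fmt("%s -> %s: possible evasion attempt, line: %s",
                  c$id$orig_h, c$id$resp_h, line);
    else
        print fmt("%s -> %s: login analysis confused (%s)",
                  c$id$orig_h, c$id$resp_h, msg);
    @}
@end example

Note that the triggering line can contain a password, so any output that
includes it should be protected like the connection logs.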

@node login variables,
@subsection @code{login} variables

@cindex analyzers, login, variables

The standard script defines a large number of variables for refining the
analysis policy:

@table @samp
@item @code{input_trouble : pattern}
lists patterns that the analyzer
should flag if they appear in the user's input (keystroke) stream.

@cindex user keystrokes, editing
@cindex keystrokes, editing
@cindex input, editing
The analyzer searches for these patterns both in the raw text typed
by the user and the same lines after applying @emph{editing}
using the @code{edit} function twice: once interpreting
@emph{BS} (ctrl-H) as delete-one-character, and once with @emph{DEL}
as the edit character.  If any of these matches, then the analyzer
considers the pattern to have matched.

@cindex root, backdoors
@cindex Internet Relay Chat (IRC), attacker subpopulation
@cindex exploits, Unix
Default: a pattern matching occurrences of the strings
``@code{rewt}'',
``@code{eggdrop}'',
``@code{loadmodule}'', or
``@code{/bin/eject}''.  The first of these is a popular username attackers
use for root backdoor accounts.  The second reflects that one prevalent
class of attackers are devotees of Internet Relay Chat (IRC), who
frequently upon breaking into an account install the IRC @code{eggdrop}
utility.
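
To flag additional strings, a site can refine the pattern.  This is a
minimal sketch, assuming that the variable is declared with @code{&redef}
and that @code{+=} applied to a pattern adds a further alternative to the
existing ones.  The two added strings are purely illustrative.

@example
# Each redef adds one more alternative to the default pattern.
redef input_trouble += /psybnc/;
redef input_trouble += /\.bash_history/;
@end example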

@item @code{edited_input_trouble : pattern}
is the same as @code{input_trouble} except the analyzer only checks the edited
user input against the pattern, not the raw input (see above).

This variable is provided so you can specify patterns that can
occur innocuously as typos; whenever the user corrects the typo before
terminating the line, the pattern won't match, because it won't be present
in the edited version of the line.  In addition, for matches to
these patterns, the analyzer @emph{delays} reporting the match until
it sees the next line of output from the server.  It then includes
both the line that triggered the match and the corresponding response
from the server, which makes it easy for a human inspecting the logs
to tell if the occurrence of the pattern was in fact innocuous.

@cindex filenames, sensitive
@cindex directory names, sensitive
@cindex sensitive filenames
Here's an example of an innocuous report:
@example
936723303.760483 1.2.3.4/21550 > 5.6.7.8/telnet
    input "cd ..." yielded output "ksh: ...:  not found."
@end example

It was flagged because the user's input included
``@code{...}'', a name commonly used by attackers to surreptitiously
hide a directory containing their tools and the like.  However, we
see from the Telnet server's response that this was not actual access
to such a directory, but merely a typing mistake.

On the other hand:
@example
937528764.579039 1.2.3.4/3834 > 5.6.7.8/telnet
    input "cd ..." yielded output "maroon# ftp
	sunspot.sunspot.noao.edu "
@end example

shows a problem---the line returned by the server was a root
prompt (``@code{maroon#}''), to which the user issued a command to
access a remote FTP server.

@emph{Deficiency: The analyzer should decouple the notion of waiting to receive the server's reply from the notion of matching only the edited form of the line; there might be raw inputs for which it is useful to see the server's response, and edited inputs for which the server's response is unimportant in terms of knowing that the input spells trouble. }

Default: the pattern
@example
    /[ \t]*cd[ \t]+((['"]?\.\.\.)|(["'](\.[^"']*)[ \t]))/
@end example

which looks for a ``@code{cd}'' command to either a directory beginning
with ``@code{...}'' (optionally quoted by the user) or a directory
name beginning with ``@code{.}'' that is quoted and includes an
embedded blank or tab.

@item @code{output_trouble : pattern}
lists patterns that
the analyzer should flag if they occur in the output sent by the
login server back to the user.

@cindex buffer overflow tools
@cindex exploits, buffer overflow
@cindex sniffer logs
@cindex trojaning
@cindex Linux, super exploit
@cindex smurf attacks
@cindex attacks, smurf
@cindex log file, altering
@cindex altering log files
@cindex root, setuid
@cindex setuid root
@cindex command shell, setuid root
@cindex ls
@cindex utilities, ls
@cindex lynx utility
@cindex utilities, lynx
@cindex fetch utility
@cindex utilities, fetch
@cindex anticode.com
@cindex TFreak
Default: the pattern
@example
      /^-r.s.*root.*\/bin\/(sh|csh|tcsh)/
    | /Jumping to address/
    | /smashdu\.c/
    | /PATH_UTMP/
    | /Log started at =/
    | /www\.anticode\.com/
    | /smurf\.c by TFreak/
    | /Trojaning in progress/
    | /Super Linux Xploit/
@end example

The first of these triggers any time the user inspects with the
@emph{ls} utility an executable whose pathname ends in @code{/bin/} followed
by one of the popular command shells, and the @emph{ls} output shows
that the command shell has been altered to be setuid to root.
The remainder match either the output generated by some popular
exploit tools (for example, ``@code{Jumping to address}'', present
in many buffer overflow exploit tools), exploit tool names (``@code{smashdu.c}''),
text found within the tool source code (``@code{smurf.c by TFreak}''),
or URLs accessed (say via the @emph{lynx} or @emph{fetch} utilities)
to retrieve attack software (``@code{www.anticode.com}'').

@cindex backdoor, prompts
@item @code{backdoor_prompts : pattern}
lists patterns that
the analyzer should flag if they are seen as the first line sent by the
server to the user, because they often correspond with
backdoors that offer a remote user immediate command shell access
without having to first authenticate.

Default: the pattern ``@code{/^[!-~]*( ?)[#%$] /}'', which matches
a line that begins with a series of printable, non-blank characters and
ends with a likely prompt character, with a blank just after
the prompt character and perhaps before it.

@cindex backdoor, avoiding false positives
@item @code{non_backdoor_prompts : pattern}
lists patterns that,
if they also match a possible backdoor prompt, indicate that the analyzer
should not consider the server output to be a backdoor prompt.
Used to limit false positives for @code{backdoor_prompts}.

Default: the pattern ``@code{/^ *#.*#/}'', which catches lines with
more than one occurrence of a @code{#}.  Some servers generate such
lines as part of their welcome banner.
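
The interplay between @code{backdoor_prompts} and
@code{non_backdoor_prompts} can be sketched as a single test.  The helper
name @code{looks_like_backdoor_prompt} is hypothetical, and the standard
script's actual check may differ in detail (for example, in how it decides
which line is the first one sent by the server).

@example
function looks_like_backdoor_prompt(first_line: string): bool
    @{
    # Flag the line only if it resembles a prompt and is not excused
    # by one of the known false-positive patterns.
    return backdoor_prompts in first_line &&
           non_backdoor_prompts !in first_line;
    @}
@end example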

@cindex backdoor, triggered by terminal type
@cindex magic terminal types
@item @code{hot_terminal_types : pattern}
lists ``magic''
terminal types sometimes used by attackers to access backdoors.
Both Telnet and Rlogin have mechanisms for negotiating a terminal
type (name; e.g., ``@code{xterm}''); these backdoors trigger and skip
authentication if the name has a particular value.

Default: the name ``@code{VT666}'', one of the trigger terminal types
we've observed in practice.

@cindex backdoor, triggered by ephemeral port
@cindex ephemeral port, triggering a backdoor
@cindex client port, triggering a backdoor

@item @code{hot_telnet_orig_ports : set[port]}
Some Telnet backdoors trigger if the ephemeral port used by the client side of the connection
happens to be a particular value.  This variable is used to list the
port values whose use should be considered as possibly indicating
a backdoor.  @emph{Note: Clearly, this mechanism can generate false
positives when the client by chance happens to choose one of the
listed ports.}

Default: @code{53982/tcp}, one of the trigger ports we have observed
in practice.

@emph{Deficiency: There should be a corresponding variable for Rlogin backdoors triggered by a similar mechanism. }
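
A site that has observed a different trigger port can add it to the set.
This is a sketch, assuming the variable is declared with @code{&redef};
the port number shown is an arbitrary example, not one we have observed.

@example
redef hot_telnet_orig_ports += @{ 4444/tcp @};
@end example

Each port added this way also increases the chance of the false positives
mentioned above.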

@item @code{hot_ssh_orig_ports : set[port]}
Similar to @code{hot_telnet_orig_ports}, only for SSH.

Default: @code{31337/tcp}, a trigger port that we've observed in practice.

@item @code{skip_authentication : set[string]}
A set of strings
that, if present in the server's initial output (i.e., its welcome banner),
indicates the analyzer should not attempt to analyze the session for an
authentication dialog.  This is used for servers that provide public
access and don't bother authenticating the user.

Default: the string @code{"WELCOME TO THE BERKELEY PUBLIC LIBRARY"},
which corresponds to a frequently accessed public server in the
Berkeley area.  (Obviously, we include this default as an example,
and not because it will be appropriate for most Bro users!  But it
does little harm to include it.)

@emph{Deficiency: It would be more natural if this variable and a number of others listed below were of type @code{pattern} rather than @code{set[string]}.  They are actually converted internally by the event engine into regular expressions. }

@item @code{direct_login_prompts : set[string]}
A set of strings
that if seen during the authentication dialog mean that the user will
be logged in as soon as they answer the prompt.

Default: @code{"TERMINAL?"}, a prompt used by some terminal servers.

@item @code{login_prompts : set[string]}
A set of strings corresponding to login username prompts during an authentication
dialog.

Default: the strings
@example
    Login:
    login:
    Name:
    Username:
    User:
    Member Name
@end example

and the default contents of @code{direct_login_prompts}.

@item @code{login_failure_msgs : set[string]}
A set of strings
that if seen in text sent by the server during the authentication dialog
correspond to a failed login attempt.

Default: the strings
@example
    invalid
    Invalid
    incorrect
    Incorrect
    failure
    Failure
    User authorization failure
    Login failed
    INVALID
    Sorry,
    Sorry.
@end example

@item @code{login_non_failure_msgs : set[string]}
A set of strings
similar to @code{login_failure_msgs} that if present mean that the
server text does not actually correspond to an authentication failure
(i.e., if @code{login_failure_msgs} also matches, it's a false
positive).

Default: the strings
@example
    Failures
    failures
    failure since last successful login
    failures since last successful login
@end example

@item @code{router_prompts : set[string]}
A set of strings
corresponding to prompts returned by the local routers when a
user successfully authenticates to the router.
For the purpose of this variable, see the next variable.

Default: empty.

@item @code{login_success_msgs : set[string]}
A set of strings
that if seen in text sent by the server during the authentication dialog
correspond to a successful authentication attempt.

Default: the strings
@example
    Last login
    Last successful login
    Last   successful login
    checking for disk quotas
    unsuccessful login attempts
    failure since last successful login
    failures since last successful login
@end example

and the default contents of the @code{router_prompts} variable.

@emph{Deficiency: Since by default @code{router_prompts} is empty, this last inclusion does nothing.  In particular, if you redefine @code{router_prompts} then @code{login_success_msgs} will @emph{not}  pick up the change; you will need to redefine it to (again) include @code{router_prompts}, using: @w{redef login_success_msgs += router_prompts}. This is clearly a misfeature of Bro and will be fixed one fine day. }
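
Putting the two steps together, a site that wants router prompts treated
as evidence of a successful login would write something like the
following.  The prompt string is a made-up example; the second line is
the redefinition described in the note above.

@example
# Hypothetical prompt printed by a local router after login.
redef router_prompts += @{ "example-router>" @};

# Needed so that login_success_msgs picks up the change.
redef login_success_msgs += router_prompts;
@end example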

@item @code{login_timeouts : set[string]}
A set of strings that if seen in text sent by the server during the authentication dialog
correspond to the server having timed out the authentication attempt.

Default: the strings
@example
    timeout
    timed out
    Timeout
    Timed out
    Error reading command input
@end example

(This last is returned by the VMS operating system.)

@item @code{non_ASCII_hosts : set[addr]}
A set of addresses corresponding to hosts whose login servers do not (primarily) use
7-bit ASCII.  The analyzer will not attempt to analyze authentication
dialogs to such hosts, and will not complain about huge lines
generated by either the sender or receiver (per @code{excessive_line}).

Default: empty.

@item @code{skip_logins_to : set[addr]}
A set of addresses corresponding to hosts for which the analyzer should not attempt
to analyze authentication dialogs.

Default: the (empty) contents of @code{non_ASCII_hosts}.

@item @code{always_hot_login_ids : set[string]}
A set of usernames
that the analyzer should always flag as sensitive, even if they're seen in
a session for which the analyzer is @emph{confused} (See: @ref{login analyzer confusion}).

Default: the value of @code{always_hot_ids} defined by the
@code{hot-ids} analyzer.

@item @code{hot_login_ids : set[string]}
A set of usernames that the analyzer should flag as sensitive, unless it sees them
in a session for which the analyzer is @emph{confused}
(See: @ref{login analyzer confusion}).

Default: the value of @code{hot_ids} defined by the
@code{hot-ids} analyzer.

@cindex rhosts
@item @code{rlogin_id_okay_if_no_password_exposed : set[string]}
A set of username exceptions to @code{hot_login_ids} which the
analyzer should not flag as sensitive if the user authenticated without
exposing a password (so, for example, via @code{.rhosts}).

Default: the username @code{"root"}.

@end table

@cindex analyzers, login, variables

@node login functions,
@subsection @code{login} functions

@cindex analyzers, login, functions

The standard @code{login} script provides the following functions for external use:

@table @samp
@item @code{is_login_conn (c: connection): bool }
Returns true if the given connection is one analyzed by @code{login}
(currently, Telnet or Rlogin), false otherwise.

@item @code{hot_login (c: connection, msg: string, tag: string) }
Marks the given connection as hot, logs the given message, and
demultiplexes (see @code{demux}) the subsequent server-side contents of the
connection to a filename based on @code{tag} and the client-side
to a filename based on the name @code{"keys"}.  No return value.

@item @code{is_hot_id (id: string, successful: bool, confused: bool): bool}
Returns true if the username id should be considered sensitive,
given that the user either did or did not successfully authenticate,
and that the analyzer was or was not in a @emph{confused} state
(See: @ref{login analyzer confusion}).

@item @code{is_forbidden_id (id: string): bool }
Returns true if the username id is present in
@code{forbidden_ids} or @code{forbidden_id_patterns}.

@item @code{edit_and_check_line (c: connection, line: string, successful: bool): check_info}
Tests whether the given line of text seen on connection @code{c} includes
a sensitive username, after first applying @emph{BS} and @emph{DEL}
keystroke editing (see: @ref{login variables}).  @code{successful} should
be true if the user has successfully authenticated, false otherwise.

The return value is a @code{check_info} record, which contains four
@code{check_info}
fields:
@table @samp
@item @code{expanded_line}
All of the different editing interpretations of the line, separated
by commas.  For example, if the original line is
@quotation
@code{"@code{rob<}@emph{DEL}@code{><}@emph{BS}@code{><}@emph{BS}@code{>ot}"}
@end quotation
then the different editing interpretations are
@code{"@code{ro<}@emph{BS}@code{><}@emph{BS}@code{>ot}"}
and @code{"root"}, so the return value will be:
@quotation
@code{"@code{rob<}@emph{DEL}@code{><}@emph{BS}@code{><}@emph{BS}@code{>ot},@code{ro<}@emph{BS}@code{><}@emph{BS}@code{>ot},root"}
@end quotation

@emph{Deficiency: Ideally, these values would be returned in a list of some form, so that they can be accessed separately and unambiguously. The current form is really suitable only for display to a person, and even that can be quite confusing if @code{line} happens to contain commas already.  @emph{Or}, perhaps an algorithm of ``simply pick the shortest'' would find the correct editing every time anyway.}

@item @code{hot: bool}
True if any editing sequence resulted in a match against a sensitive username.

@item @code{hot_id: string}
The version of the input line (with or without editing) that was considered
hot, or an empty string if none.

@item @code{forbidden: bool}
True if any editing sequence resulted in a match against a username considered
``forbidden'', per @code{is_forbidden_id}.

@end table

@item @code{edit_and_check_user (c: connection, user: string, successful: bool, fmt_s: string): bool}
Tests whether the given username used for authentication on connection @code{c}
is sensitive, after first applying @emph{BS} and @emph{DEL}
keystroke editing (See: @ref{login variables}).  @code{successful} should be
true if the user has successfully authenticated, false otherwise.

@code{fmt_s} is a @code{fmt} format specifying how the username
information should be included in the connection's
@code{addl} field.  It takes two @code{string} parameters, the current value of the
field and the expanded version of the username as described in @code{expanded_line}.

If @code{edit_and_check_line} indicates that the username is sensitive,
then @code{edit_and_check_user} records the connection into its own
demultiplexing files.  If the username is @emph{forbidden},
then unless the analyzer is confused, we attempt to terminate the
connection using @code{terminate_connection}.

@cindex connection, hot
@cindex hot connections
Returns true if the connection is now considered ``hot,'' either
due to having a sensitive username, or because it was hot upon
entry to the function.

@cindex connection, hot
@cindex hot connections
@item @code{edit_and_check_password (c: connection, password: string)}
Checks the given password to see whether it contains a sensitive username.
If so, then marks the connection as hot and logs the sensitive password.
No return value.

@emph{Note: The purpose of this function is to catch instances in which the
event engine becomes out of synch with the authentication dialog and mistakes
what is, in fact, a username being entered, for a password being entered.
Such confusion can come about either due to a failure of the event
engine's heuristics, or due to deliberate manipulation of the event
engine by an attacker. }

@end table

@cindex analyzers, login, functions

@node login event handlers,
@subsection @code{login} event handlers

@cindex analyzers, login, event handlers

The standard @code{login} script handles the following events:

@cindex rhosts
@table @samp
@item @code{login_failure (c: connection, user: string, client_user: string, password: string, line: string)}
Invoked when the event engine has seen a failed attempt to authenticate
as @code{user} with @code{password} on the given connection @code{c}.
@code{client_user} is the user's username on the client side of the
connection.  For Telnet connections, this is an empty string, but
for Rlogin connections, it is the client name passed in the initial
authentication information (to check against
@code{.rhosts}).  @code{line} is the
line of text that led the analyzer to conclude that the authentication
had failed.

The analyzer first generates an @code{account_tried}
event to facilitate detection of password guessing, and then checks for
a sensitive username or password.  If the username was not sensitive
and the password is empty, then no further analysis is applied, since
clearly the attempt was half-hearted and aborted.  Otherwise, the
analyzer annotates the connection's @code{addl}
field with @code{fail/@code{<}@emph{username}@code{>}} to mark the
authentication failure, and also checks the @code{client_user} to
see if it is sensitive.  If we then find that the connection is
hot, the analyzer logs a message to that effect.

@item @code{login_success (c: connection, user: string, client_user: string, password: string, line: string)}
Invoked when the event engine has seen a successful attempt to authenticate.
The parameters are the same as for @code{login_failure}.

The analyzer invokes @code{check_hot} with mode @code{APPL_ESTABLISHED}
since the application session has now been established.  It generates
an @code{account_tried}
event to facilitate detection of password guessing, and then checks for
a sensitive username or password.  The event engine uses the special
password @code{"@code{<}none@code{>}"} to indicate that no password
was exposed, and this mitigates the sensitivity of logins using particular
usernames per @code{rlogin_id_okay_if_no_password_exposed}.

The analyzer annotates the connection's @code{addl}
field with @code{"@code{<}@emph{username}@code{>}"} to mark the
successful authentication.  Finally, if we then find that the connection
is hot, the analyzer logs a message to that effect.

@item @code{login_input_line (c: connection, line: string)}
Invoked for every line of text sent by the client side of the login
session to the server side.  The analyzer matches the text against
@code{input_trouble} and @code{edited_input_trouble} and invokes
@code{hot_login} with a tag of @code{"trb"} if it sees a match,
which will log a notice concerning the connection.  However, this
invocation is only done while the connection's @code{hot}
field count is <= 2, to avoid cascaded notices when an attacker gets
really busy and steps on a lot of sensitive patterns.

@item @code{login_output_line (c: connection, line: string)}
Invoked for every line of text sent by the server side of the login
session to the client side.  The analyzer checks @code{backdoor_prompts}
 and any pending input notices that
were waiting on the server output, per @code{edited_input_trouble}.
These last are then logged unless the output matched the pattern:
@example
    /No such file or directory/
@end example

@emph{Deficiency: Clearly, this pattern should not be hardwired but instead specified by a redefinable variable. }

Finally, if the line is not too long and the text matches @code{output_trouble}
and the connection's @code{hot}
field count is <= 2 (to avoid cascaded notices), the analyzer
invokes @code{hot_login}  with a tag of @code{"trb"}.
@emph{Deficiency: ``Too long'' is hardwired to be a length of >= 256 bytes. It, too, should be specifiable via a redefinable variable. }
Note: We might
wonder if not checking overly long lines presents an evasion threat: the
attacker can bury their access to a sensitive string in an excessive line
and thus avoid detection.  While this is true, it doesn't appear to cost
much.  First, some of the sensitive patterns are generated in server output
that will be hard to manipulate into being overly long.  Second, if the
attacker is trying to avoid detection, there are easier ways, such as
passing their output through a filter that alters it a good deal.

@item @code{login_confused (c: connection, msg: string, line: string)}
Invoked when the event engine's heuristics have concluded that they
have become confused and can no longer correctly track the authentication
dialog (See: @ref{login analyzer confusion}).
@code{msg} gives the particular problem the heuristics detected
(for example, @code{multiple_login_prompts} means that the engine saw
several login prompts in a row, without the type-ahead from the client side
presumed necessary to cause them) and @code{line} the line of text that
caused the heuristics to conclude they were confused.

Once declaring that it's confused, the event engine will no longer attempt
to follow the authentication dialog.  In particular, it will @emph{not}
generate subsequent @code{login_failure} or @code{login_success} events.

Upon this event, the standard
@code{login} script invokes @code{check_hot} with
mode @code{APPL_ESTABLISHED} since it could well be that the application
session is now established (it can't know for sure, of course, because
the event engine has given up).  It annotates the connection's
 @code{addl} field with
@code{confused}@code{<}@emph{line}@code{>} to mark the confused state,
and then logs to the @file{weird} file the particulars of the
connection and the type of confusion (@code{msg}).  @emph{Deficiency: This should be done by generating a @emph{weird}-related event instead. }

Finally, the analyzer invokes @code{set_record_packets} to specify
that all of the packets associated with this connection should be recorded
to the @file{trace} file.
@emph{Note: For the current @code{login} analyzer, this call is not needed---it
records every packet of every login session anyway, because the general
philosophy is that Bro should record whatever it analyzes, so that the
analysis may be repeated or examined in detail.  Since the current analyzer
looks at every input and output line via @code{login_input_line} and
@code{login_output_line}, it records all of the packets of every such analyzed session.
There is commented-out text in @code{login_success} to be used if
@code{login_input_line} and @code{login_output_line} are not being used; it turns
off recording of a session's packets after the user has successfully logged
in (assuming the connection is not considered hot).}

@item @code{login_confused_text (c: connection, line: string)}
Invoked for every line the user types after the event engine has
entered the @emph{confused} state.  If the connection is not already
considered hot, then the analyzer checks for the presence of sensitive
usernames in the line using @code{edit_and_check_line}, and, if
present, annotates the connection's @code{addl} field
with
@code{confused}@code{<}@emph{line}@code{>}, logs that the connection
has become hot, and invokes @code{set_record_packets} to record
to the @file{trace} file all of the packets associated with the connection.

@item @code{login_terminal (c: connection, terminal: string)}
Invoked when the client transmits a terminal type to the server.
The mechanism by which the client transmits the type depends on the
underlying protocol (Rlogin or Telnet).

The handler checks the terminal type against @code{hot_terminal_types}
and if it finds a match invokes @code{hot_login} with a tag of
@code{"trb"}.

@cindex tunneling
@cindex evasion, using tunneling
@cindex Napster, tunneled over Telnet or Rlogin
@cindex excessively long lines
@item @code{excessive_line (c: connection)}
Invoked when the event engine observes a very long line sent by either
the client or the server.  Such long lines are seen as potential attempts
by an attacker to evade the @code{login} analyzer; or, possibly, as
a login session carrying an unusual application.  @emph{Note: One example
we have observed occurs when a high-bandwidth binary payload protocol such
as Napster is sent over the Telnet or Rlogin well-known port in an
attempt to either evade detection or tunnel through a firewall.}

@cindex Network Virtual Terminal (NVT)
@cindex NVT (Network Virtual Terminal)
This event is actually generic to any TCP connection carrying
an application that uses the ``Network Virtual Terminal'' (NVT) abstraction,
which presently comprises Telnet and FTP.  But the only handler defined
in the demonstration Bro policy is for Telnet, hence we discuss it here.
For this reason, the handler first invokes @code{is_login_conn}
to check whether the connection is in fact a login session.  If so, then
if the connection is not hot, and if the analyzer finds the server
listed in @code{non_ASCII_hosts}, then it presumes the long line
is due to use of a non-ASCII character set; the analyzer invokes
@code{set_login_state} and @code{set_record_packets} to avoid
further analysis or recording of the connection.

Otherwise, if the connection is still in the authentication dialog, then
the handler generates a @code{login_confused} event with a
confusion-type of @code{"excessive_line"}, and changes the connection's
state to @emph{confused}.

@emph{Deficiency: The event engine is currently hardwired to consider a line of >= 1024 bytes as ``excessive''; clearly this should be user-redefinable. }

@cindex NVT options, inconsistent
@cindex Telnet, options, inconsistent
@item @code{inconsistent_option (c: connection)}
NVT options are specified by the client and server stating which options
they are willing to support vs. which they are not,
and then instructing one another
which in fact they should or should not use for the current connection.
If the event engine sees a peer violate either what the other peer has
instructed it to do, or what it itself offered in terms of options in
the past, then the engine generates an @code{inconsistent_option} event.

The handler for this event simply records an entry about it to the
@file{weird} file.  @emph{Deficiency: The event handler invocation does not include enough information to determine what option was inconsistently specified; in addition, it would be convenient to integrate the handling of problems like this within the general ``weird'' framework. }

Note: As for @code{excessive_line} above, this event is actually a
generic one applicable to any NVT-based protocol.  It is handled here
because the problem most often crops up for Telnet sessions.
Note: Also, the handler does not check to see whether the connection
is a login session (as it does for @code{excessive_line}); it serves
as the handler for any NVT session with an inconsistent option.

Note: Finally, this event can be generated if the session
contains a stream of binary data.  One way this can occur is when
the session is encrypted but Bro fails to recognize this fact.
@cindex encryption, leading to ``excessive lines''

@cindex NVT options, bad
@cindex Telnet, options, bad
@item @code{bad_option (c: connection)}
If an NVT option is either ill-formed (e.g., a bad length field) or
unrecognized, then the analyzer generates this event.

The processing of this event (recording information to the
@file{weird} file) and the various notes and deficiencies associated with it are
the same as those for @code{inconsistent_option} above.

@cindex NVT options, bad termination
@cindex Telnet, options, bad termination
@item @code{bad_option_termination (c: connection)}
If an NVT option fails to be terminated correctly (for example,
a character is seen within the option that is disallowed for use
in the option), then the analyzer generates this event.

The processing of this event (recording information to the
@file{weird} file) and the various notes and deficiencies associated with it are
the same as those for @code{inconsistent_option} above.

@cindex NVT options, authentication
@cindex Telnet, options, authentication
@cindex authentication, accepted
@item @code{authentication_accepted (name: string, c: connection)}
The NVT framework includes options for negotiating authentication.
When such an option is sent from client to server and the server
replies that it accepts the authentication, then the event engine
generates this event.

The handler annotates the connection's @code{addl} field
with
@code{auth}@code{<}@emph{name}@code{>},
unless that annotation is already present.

@cindex NVT options, authentication
@cindex Telnet, options, authentication
@cindex authentication, rejected
@item @code{authentication_rejected (name: string, c: connection)}
The same as @code{authentication_accepted}, except invoked when the
server replies that it rejects the attempted authentication.

The handler annotates the connection's @code{addl} field
with @code{auth-failed}@code{<}@emph{name}@code{>}.

@cindex authentication, skipped
@item @code{authentication_skipped (c: connection)}
Invoked when the event engine sees a line in the authentication dialog
that matches .

The handler annotates the connection's @code{addl} field
with `` @code{skipped}''
to mark that authentication was skipped,
and then invokes @code{skip_further_processing} and (unless the
connection is hot) @code{set_record_packets} to skip any further
analysis of the connection, and to stop recording its packets to
the @file{trace} file.

@item @code{connection_established (c: connection)}
@code{connection_established} is a generic
event generated for all TCP connections; however, the @code{login} analyzer
defines an additional handler for it.

The handler first checks (via @code{is_login_conn}) whether this is a Telnet
or Rlogin connection.  If so, it generates an @code{authentication_skipped}
 event if the server's address occurs
in @code{skip_logins_to}, and also (for Telnet) checks whether the
client's port occurs in @code{hot_telnet_orig_ports}, invoking @code{hot_login}
 with the tag @code{"orig"} if it does.

For SSH connections, it likewise checks the client's port, but
in @code{hot_ssh_orig_ports}, marking the connection as hot and
logging a real-time notice if it is.

@item @code{partial_connection (c: connection)}
As noted earlier, @code{partial_connection} is a generic
event generated for all TCP connections.  The @code{login} analyzer
also defines a handler for it, one which (if it's a Telnet/Rlogin
connection) sets the connection's state to @emph{confused} and
checks for @code{hot_telnet_orig_ports}.

@cindex NVT options, encryption
@cindex Telnet, options, encryption
@cindex encrypted login sessions
@item @code{activating_encryption (c: connection)}
The NVT framework includes options for negotiating encryption.  When such
a series of options is successfully negotiated, the event engine generates
this event.  @emph{Note: The negotiation sequence is complex and can fail at a number of points.  The event engine does not attempt to generate events for each possible failure, but instead only looks for the option sent after a successful negotiation sequence. }

The handler annotates the connection's @code{addl} field
with ``@code{(encrypted)}'' to mark that authentication was encrypted.
@emph{Note: The event engine itself marks the connection as requiring no further processing.  This is done by the event engine rather than the handler because the event engine cannot do its job (regardless of the policy the handler might desire) in the face of encryption. }

@end table
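
As a minimal sketch of how a site script might supplement the standard
handlers described above, the following defines an additional handler for
@code{login_failure} that prints a one-line summary of each failed attempt.
It assumes only the event signature given above and the @code{orig_h} field
of the connection's @code{id}; the message wording is purely illustrative,
and the standard @code{login} script's own handler still runs alongside it.

@example
# Illustrative only: an extra handler, not part of the standard script.
event login_failure(c: connection, user: string, client_user: string,
                    password: string, line: string)
    @{
    # Report who tried to log in and from where; the password is
    # deliberately left out of the output.
    print fmt("failed login as %s from %s", user, c$id$orig_h);
    @}
@end example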

@cindex analyzers, login, event handlers

@node pop3 Analyzer,
@section The @code{pop3} Analyzer
The @code{pop3} analyzer does a protocol analysis of the Post Office
Protocol - Version 3.

When Bro runs with the @code{pop3} analyzer, it processes all packets with
destination port 110/tcp and generates a log file, @file{pop3.log}. Each line
contains a timestamp, a connection ID, the originator and responder IP
addresses, and the message sent. The message consists of the command and
its arguments on the client side, and the status on the server side.

@menu
* pop3 pop3_session_info record::
* pop3 variables::
* pop3 event handlers::
@end menu

@node pop3 pop3_session_info record,
@subsection The @code{pop3_session_info} record

@cindex pop3, session information

The @code{pop3} analyzer maintains a @code{pop3_session_info} record per
@code{pop3} connection:

@example
type pop3_session_info: record @{
    id: count;              # Unique session ID.
    quit_sent: bool;        # Client issued a QUIT.
    last_command: string;   # Last command of client.
@};
@end example

The corresponding fields are:

@table @samp
@item @code{id}
The unique session identifier assigned to this session.  Sessions
are numbered starting at @code{1} and incremented with each new session.

@item @code{quit_sent}
True if the client has sent a QUIT command.

@item @code{last_command}
Last command issued by the client.

@end table

@node pop3 variables,
@subsection @code{pop3} variables
@table @samp
@item @code{pop_connections: table[conn_id] of pop3_session_info}
This table contains all active POP3 sessions, indexed by their connection IDs.
An entry is deleted as soon as its TCP connection terminates or expires.
@item @code{pop_connection_weirds: table[addr] of count &default=0 &create_expire = 5 mins}
This table contains all the POP3-session originators for which unexpected behavior was recorded.
@item @code{error_threshold: count = 3}
This variable contains a threshold for the maximum number of negative status
indicators per originator received from a server. It is used for recognizing
potential abuses, e.g., trial and error password guessing attacks.
@item @code{ignore_commands: set[string] }
Set of commands to ignore while generating the log file.
@end table

@node pop3 event handlers,
@subsection @code{pop3} event handlers
@table @samp
@item @code{pop3_request(c: connection, is_orig: bool, command: string, arg: string)}
Generated for each valid command sent from the client
to the server.
@item @code{pop3_reply(c: connection, is_orig: bool, cmd: string, msg: string) }
Generated for each server reply containing a valid status indicator.
@item @code{pop3_data(c: connection, is_orig: bool, data: string) }
Generated for every data line sent by the server as a reply to the client,
including commands that yield multi-line answers.
@item @code{pop3_unexpected(c: connection, is_orig: bool, msg: string, detail: string) }
Generated when something semantically unexpected has happened.
@item @code{pop3_login_success(c: connection, is_orig: bool, user: string, password: string)}
Generated when a user authenticates successfully.
The password may be empty if it has not been observed.
@item @code{pop3_login_failure(c: connection, is_orig: bool, user: string, password: string)}
Generated when a user fails to authenticate correctly.
@end table
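
The events above can be handled from a site script in the usual way.  The
following is a minimal sketch, assuming only the @code{pop3_login_failure}
signature listed above and the @code{orig_h} field of the connection's
@code{id}; the output format is hypothetical and is not what the standard
script writes to @file{pop3.log}.

@example
# Illustrative only: print a line for each failed POP3 authentication.
event pop3_login_failure(c: connection, is_orig: bool,
                         user: string, password: string)
    @{
    # The password is not printed, to avoid exposing it in the output.
    print fmt("POP3 login failure for user %s from %s",
              user, c$id$orig_h);
    @}
@end example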

@node portmapper Analyzer,
@section The @code{portmapper} Analyzer
@cindex remote procedure call (RPC)
@cindex RPC (Remote Procedure Call)
The @code{portmapper} analyzer monitors one particularly
important form of remote procedure call (RPC) [RFC-1831, RFC-1832]
traffic: the portmapper service, used to map between RPC program (and
version) numbers and the TCP or UDP port on which the service runs for a
particular host.  For example, @emph{rstatd} is an RPC service that provides
``remote host status monitoring'' so that a set of hosts can be informed
when any of them reboots.  @emph{rstatd} has been assigned a standard
RPC program number of 100002.  To find out the corresponding TCP or UDP
port on a given host, a remote host would usually first contact the
portmapper RPC service running on the host and request the port
corresponding to program 100002.

@float Table, Calls to RPC portmapper service
@multitable  @columnfractions .15 .55
@item @strong{Call} @tab @strong{Meaning}
@item NULL
@tab A do-nothing call typically provided by all RPC services.
@item GETPORT
@tab Look up the port associated with a given RPC program.
@item SET
@tab Add a new port mapping (or replace an existing mapping) for an RPC program.
@item UNSET
@tab Remove a port mapping.
@item DUMP
@tab Retrieve all of the RPC program mappings.
@item CALLIT
@tab Both look up a program and then directly call it.
@end multitable
@caption{Types of calls to the RPC portmapper service}
@end float

All in all, clients can make six different types of calls to the portmapper,
as summarized in the above table.
Attackers often use
GETPORT and DUMP to see whether a host may be running an RPC service
vulnerable to a known exploit.

The analyzer uses a capture filter of ``@code{port 111}'' (See: @ref{Filtering}),
equivalent to ``@code{tcp port 111 or udp port 111}'' (since the portmapper
service ordinarily accepts calls using either TCP or UDP, both on port 111).
It checks the different types of portmapper calls against policies
expressed using a number of different variables.

@emph{Note:  An important point not to overlook is that an attacker does @emph{not} have to first call the portmapper service in order to call an RPC program.  They might instead happen to know the port on which the service runs @emph{a priori}, since for example it may generally run on the same port for a particular operating system; or they might scan the host's different TCP or UDP ports directly looking for a reply from the service.  Thus, while portmapper monitoring proves very useful in detecting attacks, it does @emph{not} provide comprehensive monitoring of attempts to exploit RPC services. }

@menu
* portmapper variables::
* portmapper functions::
* portmapper event handlers::
@end menu

@node portmapper variables,
@subsection @code{portmapper} variables

@cindex analyzers, portmapper, variables

The standard script provides the following redefinable variables:

@table @samp
@item @code{rpc_programs : table[count] of string}
Maps RPC program numbers to a string used to name the service.
For example, the @code{[100002]} entry is mapped to @code{"rstatd"}.

Default: a large list of RPC services.

@cindex NFS (Network File System)
@cindex Network File System (NFS)
@item @code{NFS_services : set[string]}
Lists the names of those RPC services that correspond to
Network File System (NFS) [RFC-1094, RFC-1813] services.  This
variable is provided because it is convenient to express policies
specific to accessing NFS file systems.

Default: the services @emph{mountd}, @emph{nfs}, @emph{pcnfsd},
@emph{nlockmgr}, @emph{rquotad}, @emph{status}.

@emph{Deficiency: Bro's notion of NFS is currently confined to just knowledge of the existence of these services.  It does not analyze the particulars of different NFS operations. }

@item @code{RPC_okay : set[addr, addr, string]}
Indexed by the host providing a given service, then by the host
accessing the service, and finally by the name of the service.  If an entry
is present, it means that the given access is allowed.  For example, an entry of:
@example
    [1.2.3.4, 5.6.7.8, "rstatd"]
@end example

means that host @code{5.6.7.8} is allowed to access the @emph{rstatd}
service on host @code{1.2.3.4}.

Default: empty.

@item @code{RPC_okay_nets : set[net]}
A set of networks allowed to make GETPORT requests without complaint.
The notion behind providing this variable is that the listed
networks are trusted.  However, the trust doesn't extend beyond
GETPORT to other portmapper requests, because GETPORT is the only
portmapper operation used routinely by a set of hosts trusted by
another set of hosts (but that don't belong to the same group, and hence
are not issuing SET and UNSET calls).

Default: empty.

@cindex walld
@item @code{RPC_okay_services : set[string]}
A set of services for which GETPORT requests should not generate
complaints.  These might be services that are widely invoked and
believed exploit-free, such as @emph{walld}, though care should
be taken with blithely assuming that a given service is indeed
exploit-free.

Note that, as with @code{RPC_okay_nets}, the trust does not
extend beyond GETPORT, because it should be the only portmapper
operation routinely invoked.

Default: empty.

@item @code{NFS_world_servers : set[addr]}
A set of hosts that provide public access to an NFS file system,
and thus should not have any of their NFS traffic flagged as
possibly sensitive.  (The presumption here is that such public
servers have been carefully secured against any remote NFS operations.)
An example of such a server might be one providing read-only
access to a public database.

Default: empty.

@item @code{RPC_dump_okay : set[addr, addr]}
Indexed first by the host requesting a portmapper dump, and second
by the host from which it's requesting the dump.  If an entry is
present, then the dump operation is not flagged.

Default: empty.

@item @code{any_RPC_okay : set[addr, string]}
Pairs of hosts and services for which any GETPORT access to the given
service is allowed.

@cindex ypserv
@cindex sun-rpc.mcast.net
@cindex RPC (Remote Procedure Call), reserved multicast address
Default:
@example
    [NFS_world_servers, NFS_services],
    [sun-rpc.mcast.net, "ypserv"]
@end example

The first of these allows access to any NFS service of any of the
@code{NFS_world_servers}, using Bro's cross-product initialization
feature (See: @ref{Initializing Tables}).  The second allows @emph{ypserv}
requests to the multicast address reserved for RPC multicasts.@footnote{ I don't know how much this type of access is actually used in practice, but experience shows that requests for @emph{ypserv} directed to that address pop up not infrequently. }

@cindex walld
@item @code{suppress_pm_log : table[addr, string] of bool}
Do not generate real-time notices for access by the given address
for the given service.  Note that unlike most Bro policy variables,
this one is not @code{const} but is modified at run-time to add
to it any host that invokes the @emph{walld} RPC service, so that
such access is only reported once for each host.

Default: empty, but dynamic as discussed above.

@end table
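
As a sketch of how a site might tune these variables, the following
fragment permits one host to access @emph{rstatd} on another, suppresses
complaints about GETPORT requests for @emph{walld}, and allows one host to
dump another's portmapper mappings.  The addresses are placeholders, and
the fragment assumes that each variable can be extended with @code{+=} in a
@code{redef}; treat it as an illustration of the index order described
above rather than as a recommended policy.

@example
# Hypothetical site policy; all addresses are placeholders.

# Allow 5.6.7.8 to access rstatd on 1.2.3.4
# (index order: server, client, service).
redef RPC_okay += @{
    [1.2.3.4, 5.6.7.8, "rstatd"]
@};

# Don't complain about GETPORT requests for walld.
redef RPC_okay_services += @{ "walld" @};

# Allow 5.6.7.8 to request a portmapper dump from 1.2.3.4
# (index order: requester, then host being dumped).
redef RPC_dump_okay += @{
    [5.6.7.8, 1.2.3.4]
@};
@end example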

@cindex analyzers, portmapper, variables

@node portmapper functions,
@subsection @code{portmapper} functions

@cindex analyzers, portmapper, functions

The standard script provides the following externally accessible functions:

@table @samp
@item @code{rpc_prog (p: count): string }
Returns the name of the RPC program with the given number,
if it's present in @code{rpc_programs}; otherwise returns
the text @code{"unknown-@code{<}@emph{p}@code{>}"}.

@item @code{pm_check_getport (r: connection, prog: string): bool }
Checks a GETPORT request for the given program against the policy expressed
by @code{RPC_okay_services}, @code{any_RPC_okay},
@code{RPC_okay}, and @code{RPC_okay_nets},
returning true if the request violates policy, false if it's allowed.

@item @code{pm_activity (r: connection, log_it: bool) }
A bookkeeping function invoked when there's been portmapper activity
on the given connection.

The function records the connection,
unless it is a TCP connection (which will instead be recorded by
@code{connection_finished}).  If @code{log_it} is true then the
function generates a real-time notice of the form:
@quotation
rpc:
@code{<}@emph{connection-id}@code{>}
@code{<}@emph{RPC-service}@code{>}
@code{<}@emph{r$addl}@code{>}
@end quotation
For example:
@example
    972616255.679799 rpc: 65.174.102.21/832 >
	182.7.9.47/portmapper pm_getport: nfs -> 2049/udp
@end example

However, it does not generate the notice if the client host and
service are present in @code{suppress_pm_log}, or if it already generated
a notice in the past for the same client, server and service (to prevent
notice cascades).

@item @code{pm_request (r: connection, proc: string, addl: string, log_it: bool) }
Invoked when the given connection has made a portmapper request of some
sort for the given RPC procedure @code{proc}.  @code{addl} gives an
annotation to add to the connection's @code{addl} field.
If @code{log_it} is true, then the connection should be logged; it will also
be logged if the function determines that it is hot.

The function first invokes @code{check_scan} and @code{check_hot}
(with a mode of @code{CONN_ESTABLISHED}),
unless @code{r} is a TCP connection, in which case these checks have already
been made by @code{connection_established}.  The function then adds
@code{addl} to the connection's @code{addl} field, though if the field's
length already exceeds 80 bytes, then it just tacks on @code{"..."}
(unless already present).  This last is necessary because Bro will sometimes
see zillions of successive portmapper requests that all use the same
connection ID, and these will each add to @code{addl} until it
becomes unwieldy in size.  @emph{Deficiency: Clearly, the byte limit of 80 should be adjustable. }

Finally, the function invokes @code{check_hot} with a mode
of @code{CONN_FINISHED}, and @code{pm_activity} to finish up
bookkeeping for the connection.

No return value.

@item @code{pm_attempt (r: connection, proc: string, status: count, addl: string, log_it: bool) }
Invoked when the given connection attempted to make a portmapper request
of some sort, but the request failed or went unanswered.  The arguments
are the same as for @code{pm_request}, with the addition of
@code{status}, which gives the RPC status code corresponding to why the
attempt failed (see below).

The function first invokes @code{check_scan} and @code{check_hot}
(with a mode of @code{CONN_ATTEMPTED}),
unless @code{r} is a TCP connection, in which case these checks have already
been made by @code{connection_attempt}.

The function then adds
@code{addl} to the connection's @code{addl} field, along with
a text description of the RPC status code, as given in
the Table below.

No return value.

@float Table, RPC status codes
@multitable  @columnfractions .2 .7
@item @strong{Status description} @tab @strong{Meaning}
@item "ok"
@tab The call succeeded.
@item "prog unavail"
@tab The call was for an RPC program that has not registered with the portmapper.
@item "mismatch"
@tab The call was for a version of the RPC program that has not registered with the portmapper.
@item "garbage args"
@tab The parameters in the call did not decode correctly.
@item "system err"
@tab A system error (such as out-of-memory) occurred when processing the call.
@item "timeout"
@tab No reply was received within 24 seconds of the request.
@item "auth error"
@tab The caller failed to authenticate to the server, or was not authorized to make the call.
@item "unknown"
@tab An unknown error occurred.
@end multitable
@caption{Types of RPC status codes}
@end float

@end table

@cindex analyzers, portmapper, functions

@node portmapper event handlers,
@subsection @code{portmapper} event handlers

@cindex analyzers, portmapper, event handlers

The standard script handles the following events:

@table @samp
@item @code{pm_request_null (r: connection)}
Invoked upon a successful portmapper request for the ``null'' procedure.
The script invokes @code{pm_request} with @code{log_it=F}.

@item @code{pm_request_set (r: connection, m: pm_mapping, success: bool)}
Invoked upon a nominally successful portmapper request to set the portmapper
binding @code{m}.  The script invokes @code{pm_request} with @code{log_it=T}.
@code{success} is true if the server honored the request, false otherwise;
the script turns this into an annotation of @code{"ok"} or @code{"failed"}.

The @code{pm_mapping} type (for @code{m}) has three fields,
@code{program: count}, @code{version: count} and @code{p: port}, the
port for the mapping of the given program and version.

@item @code{pm_request_unset (r: connection, m: pm_mapping, success: bool)}
Invoked upon a nominally successful portmapper request to remove a portmapper
binding.  The script invokes @code{pm_request} with @code{log_it=T}.
@code{success} is true if the server honored the request, false otherwise;
the script turns this into an annotation of @code{"ok"} or @code{"failed"}.

@item @code{pm_request_getport (r: connection, pr: pm_port_request, p: port)}
Invoked upon a successful portmapper request to look up a portmapper
binding.  @code{pr}, of type
@code{pm_port_request}, has three fields:
@code{program: count}, @code{version: count}, and @code{is_tcp: bool},
this last indicating whether the caller is requesting the TCP or UDP
port, if the given program/version has mappings for both.
The script invokes @code{pm_request} with @code{log_it} set
according to the return value of @code{pm_check_get_port}
and an annotation of the mapping.

@item @code{pm_request_dump (r: connection, m: pm_mappings)}
Invoked upon a successful portmapper request to dump the portmapper
bindings.  The script invokes @code{pm_request} with @code{log_it=T}
unless @code{RPC_dump_okay} indicates that the dump call is allowed.
The script ignores @code{m}, which gives the mappings as a
@code{table[count] of pm_mapping}, where the table index simply reflects
the order in which the mappings were returned, starting with an index
of 1.  @emph{Deficiency: What the script @emph{should} do, instead, is keep track of the mappings so that Bro can identify the service associated with connections for otherwise unknown ports. }

@cindex walld
@item @code{pm_request_callit (r: connection, pm_callit_request, p: port)}
Invoked upon a successful portmapper request to look up and call
an RPC procedure.  The script invokes @code{pm_request} with @code{log_it=T}
unless the combination of the caller and the
program is in @code{suppress_pm_log}.  Finally, if the program
called is @emph{walld}, then the script adds the caller to @code{suppress_pm_log}.

The @code{pm_callit_request} type has four fields:
@code{program: count}, @code{version: count}, @code{proc: count}, and
@code{arg_size: count}.  The first three identify the procedure being looked up and
called, and the last gives the size of the arguments being passed to it.
@emph{Deficiency: Currently, the event engine does not do any analysis or refinement of the arguments passed to the procedure (such as making them available to the event handler) or the return value.}  @code{p} is
the port value returned by the call.

@item @code{pm_attempt_null (r: connection, status: count)}
Invoked upon a failed portmapper request for the ``null'' procedure.
@code{status} gives the reason for the failure.
The script invokes @code{pm_attempt} with @code{log_it=T}.

@item @code{pm_attempt_set (r: connection, status: count, m: pm_mapping)}
Invoked upon a failed portmapper request to set the portmapper
binding @code{m}.  The script invokes @code{pm_attempt} with @code{log_it=T}.

@item @code{pm_attempt_unset (r: connection, status: count, m: pm_mapping)}
Invoked upon a failed portmapper request to remove a portmapper
binding.  The script invokes @code{pm_attempt} with @code{log_it=T}.

@item @code{pm_attempt_getport (r: connection, status: count, pr: pm_port_request)}
Invoked upon a failed portmapper request to look up a portmapper
binding.  @code{pr}, of type @code{pm_port_request}, has three fields:
@code{program: count}, @code{version: count}, and @code{is_tcp: bool},
this last indicating whether the caller requested the TCP or UDP port.
The script invokes @code{pm_attempt} with @code{log_it} set
according to the return value of @code{pm_check_get_port}.

@item @code{pm_attempt_dump (r: connection, status: count)}
Invoked upon a failed portmapper request to dump the portmapper
bindings.  The script invokes @code{pm_attempt} with @code{log_it=T}
unless @code{RPC_dump_okay} indicates that the dump call is allowed.

@cindex walld
@item @code{pm_attempt_callit (r: connection, status: count, pm_callit_request)}
Invoked upon a failed portmapper request to look up and call
an RPC procedure.  The script invokes @code{pm_attempt} with @code{log_it=T}
unless the combination of the caller and the
program is in @code{suppress_pm_log}.  Finally, if the program
called is @emph{walld}, then the script adds the caller to
@code{suppress_pm_log}.

@item @code{pm_bad_port (r: connection, bad_p: count)}
Invoked when a portmapper request or response includes an invalid
port number.  Since ports are represented by unsigned 4-byte integers,
they can stray outside the allowed range of 0--65535 by being >= 65536.
The script invokes @code{conn_weird_log} with a @emph{weird tag}
of @code{"bad_pm_port"}.

@end table
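
As a rough illustration of the event signatures above, the following sketch
adds a handler for @code{pm_request_getport} that prints each successful
lookup.  It is not part of the standard script: it assumes that a policy
script may supply its own handler for the event alongside the standard one,
and the message format is invented for the example.

@example
# Illustrative only; not part of the standard portmapper script.
event pm_request_getport(r: connection, pr: pm_port_request, p: port)
    @{
    # is_tcp tells us which transport's port the caller asked for.
    local proto = pr$is_tcp ? "tcp" : "udp";

    print fmt("%s asked %s for program %d version %d (%s): got %s",
              r$id$orig_h, r$id$resp_h, pr$program, pr$version, proto, p);
    @}
@end example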

@cindex analyzers, portmapper, event handlers

@cindex analyzers, application-specific

@node analy Analyzer,
@section The @code{analy} Analyzer
@cindex statistical analysis
@cindex connection, analysis
The @code{analy} analyzer provides a limited mechanism to
use Bro to do statistical analysis on TCP connections.  Its primary
purpose is to demonstrate that Bro has applications to network
traffic analysis beyond intrusion detection.  It defines one
event handler:

@table @samp
@item @code{conn_stats (c: connection, os: endpoint_stats, rs: endpoint_stats)}
Invoked for each connection when it terminates (for whatever reason).
@code{os} and @code{rs} are the statistics for the originator endpoint and
the responder endpoint, respectively; the table below
gives the different record fields.

@end table

The @code{endpoint_stats} record has the following fields,
all of type @code{count}:

@float Table, endpoint_stats fields
@multitable  @columnfractions .2 .75
@item @strong{Field} @tab @strong{Meaning}
@item num_pkts
@tab The number of packets sent by the endpoint, as seen by the monitor. The endpoint may
have sent others that the network dropped upstream from the monitor.
@item num_rxmit
@tab The number of packets retransmitted by the endpoint, as seen by the monitor.
@item num_rxmit_bytes
@tab The number of bytes retransmitted by the endpoint.
@item num_in_order
@tab The number of packets sent by the endpoint that arrived at the monitor in order, where ``in
order'' means in the same order as sent by the endpoint, rather than in sequence-number order.
(Thus, a retransmission can arrive in order, by this definition.) Bro determines if the packet
arrived in order by applying heuristics to the IP identification (ID) field, which in general
will increase by a small amount between successive packets transmitted by an endpoint.
@item num_OO
@tab The number of packets sent by the endpoint that arrived at the monitor out of order. See the
previous entry for the definition of ``in order'', and hence ``out of order''.
@item num_repl
@tab The number of extra copies of packets sent by the endpoint that arrived at the monitor. Bro
considers a packet replicated if its IP ID field is the same as for the previous packet it saw
from the endpoint. Using this definition, a replication is most likely caused by a network
mechanism such as duplication of a packet by a router, rather than a transport mechanism
such as retransmission, though some TCPs fully reuse packets when retransmitting them,
including their IP ID field.
@item endian_type
@tab Whether the advance of the IP ID field as seen by the monitor was consistent with big-endian
(network order) addition, little-endian, or undetermined. The three values are represented
by the Bro constants ENDIAN_BIG, ENDIAN_LITTLE, and ENDIAN_UNKNOWN.
In addition, the value can be ENDIAN_CONFUSED, meaning that the monitor saw conflicting
evidence for little- and big-endian.
@end multitable
@caption{@code{endpoint_stats} fields for summarizing connection endpoint statistics, all of type @code{count}}
@end float
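
To show how the fields in the table might be used, here is a minimal sketch
of a @code{conn_stats} handler that reports connections whose originator
retransmitted a large share of its packets.  The one-in-ten threshold and
the message format are arbitrary choices made for this example, not
anything the @code{analy} script itself defines.

@example
# Illustrative only.  The one-in-ten threshold is an arbitrary choice.
event conn_stats(c: connection, os: endpoint_stats, rs: endpoint_stats)
    @{
    # Integer arithmetic: true when more than one in ten of the
    # originator's observed packets was a retransmission.
    if ( os$num_rxmit * 10 > os$num_pkts )
        print fmt("%s -> %s: %d of %d originator packets retransmitted",
                  c$id$orig_h, c$id$resp_h, os$num_rxmit, os$num_pkts);
    @}
@end example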


@node signature Analysis Script,
@section The @code{signature} Analysis Script

@cindex signature analysis
@cindex exploit scans
@cindex scans, exploit
@cindex horizontal exploit scans
@cindex vertical exploit scans
The @code{signature} module analyzes @emph{signature matches}
(see @ref{Signatures}).
For each signature, you can specify one of the actions
defined in the table of signature actions below.
In addition, the module identifies two types of @emph{exploit scans}:
@emph{horizontal} (a host triggers a signature for multiple destinations) and
@emph{vertical} (a host triggers multiple signatures for the same destination).

@cindex signatures, log file
@cindex log file, signatures

The module handles one event:

@table @samp
@item @code{signature_match (state: signature_state, msg: string, data: string)}
Invoked upon a match of a signature that contains an @code{event} action (see @ref{Actions}).
@end table

It provides the following redefinable variables:

@table @samp
@item @code{sig_actions : table[string] of count}
Maps signature IDs to actions as defined in the table below.

@float Table, signature actions
@multitable  @columnfractions .2 .6
@item @strong{Action} @tab @strong{Meaning}
@item SIG_IGNORE @tab Ignore the signature completely.
@item SIG_QUIET @tab Process for scan detection but don't report individually.
@item SIG_FILE @tab Write matches to signatures-log
@item SIG_FILE_BUT_NOT_SCAN @tab Same, but ignore for scan processing
@item SIG_ALARM @tab Alarm and write to signatures, notice, and alarm files
@item SIG_ALARM_ONCE @tab Same, but only for the first instance
@item SIG_ALARM_PER_ORIG @tab Same, but once per originator
@item SIG_ALARM_NO_WORM @tab Same, but ignore if generated by known worm-source
@item SIG_COUNT_PER_RESP @tab Count per destination and alarm if threshold reached
@item SIG_SUMMARY @tab Don't alarm, but generate per-originator summary
@end multitable
@caption{Possible actions to take for signature matches}
@end float

Default: @code{SIG_FILE}.

@item @code{horiz_scan_thresholds : set[count]}
Generate a notice whenever a remote host triggers a signature for
the given number of hosts.

Default: @code{@{ 5, 10, 50, 100, 500, 1000@} }

@item @code{vert_scan_thresholds : set[count]}
Generate a notice whenever a remote host triggers the
given number of signatures for the same destination.

Default: @code{@{ 5, 10, 50, 100, 500, 1000@} }

@end table
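
The variables above are adjusted with @code{redef} in the usual way.  The
fragment below is only a sketch: the signature IDs @code{my-noisy-sig} and
@code{my-critical-sig} are hypothetical, and the threshold values are
chosen purely for illustration.

@example
# Illustrative only; the signature IDs are made up.
redef sig_actions += @{
    # Still counts toward scan detection, but is not reported individually.
    ["my-noisy-sig"] = SIG_QUIET,
    # Alarm on the first match only.
    ["my-critical-sig"] = SIG_ALARM_ONCE,
@};

# Report horizontal scans less often than the default set does.
redef horiz_scan_thresholds = @{ 10, 100, 1000 @};
@end example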

The module defines one function for external use:

@table @samp
@item @code{has_signature_matched (id: string, orig: addr, resp: addr): bool}
Returns true if the given signature has already matched for the
(originator,responder) pair.
@end table
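
A script outside the module might call this function as sketched below.
The example assumes the generic @code{connection_established} event is
available to hook into, and @code{my-exploit-sig} is a hypothetical
signature ID.

@example
# Illustrative only; "my-exploit-sig" is a made-up signature ID.
event connection_established(c: connection)
    @{
    # Flag new connections between a pair of hosts for which the
    # signature has already matched (originator, responder order).
    if ( has_signature_matched("my-exploit-sig", c$id$orig_h, c$id$resp_h) )
        print fmt("%s -> %s: new connection after earlier signature match",
                  c$id$orig_h, c$id$resp_h);
    @}
@end example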

@node SSL Analyzer,
@section The @code{SSL} Analyzer

@cindex SSL, analysis
The @code{SSL} analyzer processes traffic associated with the SSL
(Secure Socket Layer) protocol versions 2.0, 3.0
and 3.1.  SSL version 3.1 is also known as TLS (Transport
Layer Security) version 1.0 since from that version onward the IETF has taken
responsibility for further development of SSL.

Bro instantiates an @code{SSL} analyzer for any connection with service
ports @code{443/tcp (https), 563/tcp (nntps), 585/tcp (imap4-ssl), 614/tcp (sshell), 636/tcp (ldaps), 989/tcp (ftps-data), 990/tcp (ftps), 992/tcp (telnets), 993/tcp (imaps), 994/tcp (ircs), 995/tcp (pop3s)}, provided
you have loaded the @code{SSL} analyzer or defined a handler for one of
the SSL events.

By default, the analyzer uses the above set of ports as a capture filter
(See: @ref{Filtering}).  It currently checks the SSL handshake process for
consistency, tries to verify seen certificates, generates several events,
does connection logging, tries to detect security weaknesses, and produces
simple statistics.  It is also able to store seen certificates on disk.
However, it does no decryption, so analysis is limited to clear text SSL
records. This means that analysis stops in the middle of the handshaking
phase for SSLv2 and at the end of it for SSLv3.0/SSLv3.1 (TLS).  For this
reason we have not implemented the SSL session caching mechanism (yet).

The analyzer consists of four files: @code{ssl.bro}, @code{ssl-ciphers.bro},
@code{ssl-errors.bro},
and @code{ssl-alerts.bro}, which are loaded via @code{@@load ssl}.
The analyzer writes to the @code{weird} and @code{ssl} log files.
The former receives all non-conformant and ``weird'' activity, while
the latter tracks the SSL handshaking phase.

@menu
* x509 record::
* ssl_connection_info record::
* SSL variables::
* SSL event handlers::
@end menu

@node x509 record,
@subsection The @code{x509} record

@cindex SSL, x509

This record is a very simplified structure for storing X.509
certificate information. It currently supports only the issuer and
subject names.

@example
type x509: record @{
    issuer:  string; # issuer name of the certificate
    subject: string; # subject name of the certificate
@};
@end example

@node ssl_connection_info record,
@subsection The @code{ssl_connection_info} record

@cindex SSL, connection information


The main data structure managed by the @code{SSL} analyzer is
a collection of @code{ssl_connection_info} records, where the
record type is shown below.

@example
type ssl_connection_info: record @{
id: count;                      # the log identifier number
connection_id: conn_id;         # IP connection information
version: count;                 # version associated with connection
client_cert: x509;
server_cert: x509;
id_index: string;               # index for associated sessionID
handshake_cipher: count;        # cipher suite client and server agreed upon
@};
@end example

The corresponding fields are @emph{Fixme: the description here is out of date}:

@table @samp
@item @code{id}
The unique connection identifier assigned to this connection.  Connections
are numbered starting at @code{1} and incrementing with each new connection.

@item @code{connection_id}
The TCP connection which this SSL connection is based on.

@item @code{version}
The SSL version number for this connection. Possible values are
@code{SSLv20}, for SSL version 2.0, @code{SSLv30} for version 3.0, and
@code{SSLv31} for version 3.1.

@item @code{client_cert}
The information from the client certificate, if available.

@item @code{server_cert}
The information from the server certificate, if available.

@item @code{id_index}
Index into associated @code{SSL_sessionID_record} table.

@item @code{handshake_cipher}
The cipher suite client and server agreed upon.
@emph{Note: For SSLv2 cached sessions, this is a placeholder (@code{0xABCD})}.

@end table

@node SSL variables,
@subsection @code{SSL} variables

@cindex analyzers, SSL, variables

The standard script defines the following redefinable variables:

@table @samp
@item @code{ssl_compare_cipherspecs : bool}
If true, remember the client and server cipher specs and perform additional
tests.  This costs an extra amount of memory (normally only for a short
time) but enables detection of non-intersecting cipher sets, for example.

Default: @code{T}.

@item @code{ssl_analyze_certificates : bool}
If true, analyze certificates seen in SSL connections, which
includes the following steps:
@itemize @bullet
@item
Generate a hash of the certificate and check whether we already
saw it earlier from the current host. If so, we do not
verify it again, because verification is a
computationally expensive process. If the certificate has
changed for the current host, generate a weird event.

@item
Verify the certificate.

@item
Store the certificate on disk in DER format.
@end itemize

Default: @code{T}.

@item @code{ssl_store_certificates : bool}
If certificates are analyzed, this variable determines whether they should be stored
on disk.

Default: @code{T}.

@item @code{ssl_store_cert_path : string}
Path where certificates are stored.
If empty, use the current directory.
@emph{Note: The path must not end with a slash!}

Default: @code{"../certs"}.

@item @code{ssl_verify_certificates : bool}
If certificates are analyzed, whether to verify them.

Default: @code{T}.

@item @code{x509_trusted_cert_path : string}
Path where OpenSSL looks for trusted certificates.
If empty, use the default OpenSSL path.

Default: @code{""}.

@item @code{ssl_max_cipherspec_size : count}
Maximum size in bytes for an SSL cipherspec.  If we see attempted use of
larger cipherspecs, warn and skip comparing it.

Default: @code{45}.

@item @code{ssl_store_key_material : bool}
If true, stores key material exchanged in the handshaking phase.
@emph{Note: This is mainly for decryption purposes and currently useless.}

Default: @code{T}.

@float Figure, SSL example
@example
1046778101.534846 #1 192.168.0.98/32988 >
		213.61.126.124/https start
1046778101.534846 #1 connection attempt version: 3.1
1046778101.534846 #1 cipher suites: SSLv3x_RSA_WITH_RC4_128_MD5 (0x4),
	SSLv3x_RSA_FIPS_WITH_3DES_EDE_CBC_SHA (0xFEFF),
	SSLv3x_RSA_WITH_3DES_EDE_CBC_SHA (0xA),
	SSLv3x_RSA_FIPS_WITH_DES_CBC_SHA (0xFEFE),
	SSLv3x_RSA_WITH_DES_CBC_SHA(0x9), SSLv3x_RSA_EXPORT1024_WITH_RC4_56_SHA (0x64),
	SSLv3x_RSA_EXPORT1024_WITH_DES_CBC_SHA (0x62),
	SSLv3x_RSA_EXPORT_WITH_RC4_40_MD5 (0x3),
	SSLv3x_RSA_EXPORT_WITH_RC2_CBC_40_MD5 (0x6),
1046778101.753356 #1 server reply, version: 3.1
1046778101.753356 #1 cipher suite: SSLv3x_RSA_WITH_RC4_128_MD5 (0x4),
1046778101.762601 #1 X.509 server issuer: /C=DE/ST=Hamburg/L=Hamburg/O=TC
	TrustCenter for Security in Data Networks GmbH/OU=TC
	TrustCenter Class 3 CA/Email=certificate@@trustcenter.de,
1046778101.762601 #1 X.509 server subject: /C=DE/ST=Berlin/O=Lehmanns
	Fachbuchhandlung GmbH/OU=Zentrale EDV/CN=www.jfl.de/Email=admin@@lehmanns.de
1046778101.894567 #1 handshake finished, version 3.1, cipher suite:
	SSLv3x_RSA_WITH_RC4_128_MD5 (0x4)
1046778104.877207 #1 finish
---

Used cipher-suites statistics:
SSLv3x_RSA_WITH_RC4_128_MD5 (0x4): 1

@end example
@caption{Example of SSL log file with a single SSL session.}
@end float

@cindex SSL, log file
@cindex log file, SSL
@cindex SSL session summary file
In addition, @code{ssl_log} holds the name of the SSL log file to
which Bro writes SSL connection summaries.  It defaults to
@code{open_log_file("ssl")}.
@end table
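
As a sketch of how the variables in this subsection might be tuned, the
fragment below loads the analyzer and changes three of the defaults.  The
directory name is a hypothetical choice; note that it has no trailing
slash, in keeping with the restriction on @code{ssl_store_cert_path}.

@example
# Illustrative only; the certificate directory is a made-up path.
@@load ssl

# Store seen certificates somewhere other than "../certs".
redef ssl_store_cert_path = "/usr/local/bro/certs";

# Key material is not used for decryption, so do not keep it.
redef ssl_store_key_material = F;

# Save memory at the cost of the extra cipher spec tests.
redef ssl_compare_cipherspecs = F;
@end example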

The above figure shows what entries in the SSL log file look like.
We see a transcript of the first SSL connection seen since Bro started
running.  The first line gives its start and the participating hosts and
ports.  Next, we see a client attempting an SSL (version 3.1)
connection and the cipher suites offered.  The server replies with an SSL
3.1 @code{SERVER-REPLY} and the desired cipher suite.
@emph{Note: In SSL v3.0/v3.1 this determines which cipher suite will be used for the connection}.
Following this is the certificate the server sends,
including the issuer and subject.  Finally, we see that the handshaking
phase for this SSL connection has finished, and that client and server
agreed on the cipher suite @code{RSA_WITH_RC4_128_MD5}.  Due to encryption,
the SSL analyzer skips all further data, so the only remaining entry is the end of the
connection.  When Bro finishes, we get some statistics about
the cipher suites used in all monitored SSL connections.

@cindex analyzers, SSL, variables

@node SSL event handlers,
@subsection @code{SSL} event handlers

@cindex analyzers, SSL, event handlers

The standard script handles the following events:

@table @samp
@item @code{ssl_conn_attempt (c: connection, version: count, cipherSuites: cipher_suites_list)}

Invoked for the client side of connection @code{c} when the analyzer sees a @code{CLIENT-HELLO}
of SSL version @code{version}; @code{cipherSuites} lists the cipher suites the client offers.

The version can be @code{0x0002}, @code{0x0300} or @code{0x0301}.
A new entry is generated inside the SSL connection table and the cipher suites
are listed. Ciphers known to be weak (according to a corresponding table of
weak ciphers) are logged to the @code{weak.log} file, as are
cipher suites that we do not know yet.
@emph{Note: See the file @code{ssl-ciphers.bro} for a list of known cipher suites.}

@item @code{ssl_conn_server_reply (c: connection, version: count, cipherSuites: cipher_suites_list)}

Invoked when the analyzer sees a @code{SERVER-HELLO} from the SSL server.
It contains the SSL version the server wishes to use (@emph{Note: This ultimately determines which SSL version will be used from then on}) and the cipher suites it offers. For SSL version 3.0 or 3.1, the server determines
within this @code{SERVER-HELLO} the cipher suite for the following connection (so there will be only one).
For an SSL version 2.0 connection, however, the server only announces the cipher suites it supports, and
it is up to the client to decide which one to use.

Again, the cipher suites are listed, and weak and unknown cipher suites are reported in
@code{weak.log}.

@item @code{ssl_certificate_seen (c: connection, isServer: int)}

Invoked whenever we see a certificate from client or server, but before
verification of the certificate takes place.
This may be useful if you want to do something before certificate verification
(e.g., skip verification of certificates from certain servers).

@item @code{ssl_certificate (c: connection, cert: x509, isServer: bool)}

Invoked after the certificate from server or client (@code{isServer}) has been verified.
@emph{Note: We only verify certificates once. If we see them again, we only check if they have changed!}
@code{cert} holds the issuer and subject of the certificate, which are stored
in this SSL connection's information record in the SSL connection table and
written to @code{ssl.log}.

@item @code{ssl_conn_reused (c: connection, session_id: string)}

Invoked whenever a former SSL session is reused. @code{session_id} holds
the session ID of the reused session as a string and is written to @code{ssl.log}.
Currently we don't do session tracking, because SSL version 2.0 doesn't
send the session ID in clear text when it's generated.

@item @code{ssl_conn_established (c: connection, version: count, cipher_suite: count)}

Invoked when the handshaking phase of an SSL connection is finished.  For
SSL version 3.0 or 3.1, we see the SSL version in use and the cipher suite that
will be used for the connection (written to @code{ssl.log}).
For SSL version 2.0, we can only determine the cipher suite for
new sessions, not for reused ones.  (@emph{Note: In SSL version 3.0 and 3.1 the
cipher suite to be used is already announced in the @code{SERVER-HELLO}.})
@item @code{ssl_conn_alert (c: connection, version: count, level: count, description: count)}

Invoked when the analyzer receives an SSL alert.  The @code{level} of the
alert (warning or fatal) and the @code{description} are written into
@code{ssl.log}. (@emph{Note: See @code{ssl-alerts.bro}}).

@item @code{ssl_conn_weak (name: string, c: connection)}

Invoked when the analyzer sees:
@itemize @bullet
@item weak ciphers (see @code{ssl_conn_attempt}, @code{ssl_conn_server_reply}, @code{ssl_conn_established}),
@item unknown ciphers (see @code{ssl_conn_attempt}, @code{ssl_conn_server_reply}, @code{ssl_conn_established}), or
@item a failed certificate verification.
@end itemize

See @code{weak.bro}.

@end table
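
As an illustration of the signatures above, the following sketch adds a
handler for @code{ssl_conn_established} that prints a line for every
completed SSL 2.0 handshake.  It is not part of the standard script; it
assumes a policy script may define its own handler next to the standard
one, and the output format is invented for the example.

@example
# Illustrative only; not part of ssl.bro.
event ssl_conn_established(c: connection, version: count, cipher_suite: count)
    @{
    # 0x0002 is the version value the analyzer reports for SSL 2.0.
    if ( version == 0x0002 )
        print fmt("%s -> %s: SSLv2 handshake finished, cipher suite %d",
                  c$id$orig_h, c$id$resp_h, cipher_suite);
    @}
@end example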

@cindex analyzers, SSL, event handlers

@node weird Analysis Script,
@section The @code{weird} Analysis Script

@cindex weird events
@cindex events, exceptional
@cindex exceptional events
@cindex unusual events

The @code{weird} module processes unusual or exceptional
events.  A number of these ``shouldn't'' or even ``can't'' happen,
yet they do.  The general design philosophy of Bro is to check
for such events whenever possible, because they can reflect incorrect
assumptions (either Bro's or the user's), attempts by attackers to
confuse the monitor and evade detection, broken hardware, misconfigured
networks, and so on.

Weird events are divided into three categories, namely those pertaining
to: connections; flows (a pair of hosts, but for which a specific connection
cannot be identified); and network behavior (cannot be associated with a
pair of hosts).  These categories have a total of four event handlers:
@code{conn_weird}, @code{conn_weird_addl}, @code{flow_weird}, and @code{net_weird},
and in the corresponding sections below we
catalog the events handled by each.  In addition, we separately catalog
the events generated by the standard scripts themselves
(See: @ref{Events generated by the standard scripts}).  Finally, two more weird events have their
own handlers, in order to associate detailed information with the event:
@code{rexmit_inconsistency} and @code{ack_above_hole}.

@cindex weird event summary file
@cindex log file, weird events
@code{weird_file} is the log file that
the module uses to record exceptional
events.  It defaults to @code{open_log_file("weird")}.

@cindex crud
@cindex unusual events, prevalence in actual network traffic
@cindex weird events, prevalence in actual network traffic
@cindex buggy implementations, causing ``weird'' events
@cindex diverse network use, causing ``weird'' events
@cindex Bro bugs/limitations, causing ``weird'' events
@cindex bugs, causing ``weird'' events

@emph{Note:  While these events ``shouldn't'' happen, in reality they often
do.  For example, of the 73 listed below, a search of 10 months' worth of
logs at LBNL shows that 42 were seen operationally.  While some of the
instances reflect attacks, the great majority are simply due to i) buggy
implementations, ii) diverse use of the network, or iii) Bro bugs or
limitations.  Accordingly, you may initially be inclined to log each
instance, but don't be surprised to find that you soon decide to only
record many of them in the @code{weird} file, or not record them at all.
(For further discussion, see the section on ``crud'' in [Pa99].) }

@menu
* Actions for weird events::
* weird variables::
* weird functions::
* Events handled by conn_weird::
* Events handled by conn_weird_addl::
* Events handled by flow_weird::
* Events handled by net_weird::
* Events generated by the standard scripts::
* Additional handlers for weird events::
@end menu

@node Actions for weird events,
@subsection Actions for ``weird'' events

@cindex weird events, actions

The general approach taken by the module is to categorize for each event
the action to take when the event engine generates the event.
The table below summarizes the different possible actions.

@float Table, Weird Event Actions
@multitable  @columnfractions .2 .75
@item @strong{Action} @tab @strong{Meaning}
@item WEIRD_UNSPECIFIED
@tab No action specified.
@item WEIRD_IGNORE
@tab Ignore the event.
@item WEIRD_FILE
@tab Record the event to the weird file, if it has not been seen for these hosts before. (But see
@code{weird_do_not_ignore_repeats}.)
@item WEIRD_NOTICE_ALWAYS
@tab Record the event to the weird file and generate a notice each time the event occurs.
@item WEIRD_NOTICE_ONCE
@tab Record the event to the weird file; generate a notice the first time the event occurs.
@item WEIRD_NOTICE_PER_CONN
@tab Record the event to the weird file; generate a notice the first time it occurs for a
given connection.
@item WEIRD_NOTICE_PER_ORIG
@tab Record the event to the weird file; generate a notice the first time it occurs for a
given originating host.
@end multitable
@caption{Different types of possible actions to take for ``weird'' events}
@end float

@node weird variables,
@subsection @code{weird} variables

The standard @code{weird} script provides the following redefinable variables:

@table @samp
@item @code{weird_action : table[string] of count}
Maps different weird events to actions as given in the table in @ref{Actions for weird events}.

Default: as specified in @code{conn_weird}, @code{conn_weird_addl}, @code{flow_weird}, @code{net_weird},
and @ref{Events generated by the standard scripts}.  As usual, you can change particular
values using refinement.  For example:
@example
redef weird_action: table[string] of count += @{
    [["bad_TCP_checksum", "bad_UDP_checksum"]] = WEIRD_IGNORE,
    ["fragment_overlap"] = WEIRD_NOTICE_PER_CONN,
@};
@end example

would specify to ignore TCP and UDP checksum errors (rather than the default
of @code{WEIRD_FILE}), and to notice fragment overlaps once per connection
in which they occur, rather than the default of @code{WEIRD_NOTICE_ALWAYS}.

@item @code{weird_action_filters : table[string] of function(c: connection): count}
Indexed by the name of a weird event, yields a function that when called
for a given connection exhibiting the event, returns an action from
the table in section @ref{Actions for weird events}.
A return value of @code{WEIRD_UNSPECIFIED}
means ``no special action, use the action you normally would.''
This variable thus allows arbitrary
customization of the handling of particular events.

Default: empty, for the @code{weird} analyzer itself.  The
@code{portmapper} analyzer redefines this variable as follows:
@example
redef weird_action_filters += @{
    [["bad_RPC", "excess_RPC", "multiple_RPCs",
      "partial_RPC"]] = RPC_weird_action_filter,
@};
@end example

where @code{RPC_weird_action_filter} is a function internal to the
analyzer that returns @code{WEIRD_FILE} if the originating host
is in , and @code{WEIRD_UNSPECIFIED} otherwise.

@item @code{weird_ignore_host : set[addr, string]}
Specifies that the analyzer should ignore the given weird event (named by
the second index) if it involves the given address (as either originator
or responder host).

Default: empty.

@item @code{weird_do_not_ignore_repeats : set[string]}
Gives a set of weird events that, if their action is @code{WEIRD_FILE},
should still be recorded to the @code{weird_file} each time they occur.

Default: the events relating to checksum errors, i.e.,
@code{"bad_IP_checksum"},
@code{"bad_TCP_checksum"},
@code{"bad_UDP_checksum"}, and
@code{"bad_ICMP_checksum"}.
These are recorded multiple times because it can prove handy to
be able to track clusters of checksum errors.

@end table
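
The fragment below sketches how the remaining variables in this subsection
might be customized together.  The address @code{10.0.0.5} and the function
@code{icmp_checksum_filter} are hypothetical, introduced only for this
example; the filter's return type follows the declaration of
@code{weird_action_filters} given above.

@example
# Illustrative only; 10.0.0.5 is a made-up address.

# Never report malformed HTTP requests involving this host.
redef weird_ignore_host += @{
    [10.0.0.5, "bad_HTTP_version"],
@};

# Record every bad SYN-ack, not just the first one for a pair of hosts.
redef weird_do_not_ignore_repeats += @{ "bad_SYN_ack" @};

# Hypothetical filter: ignore ICMP checksum errors from one host,
# and fall back to the weird_action entry for everyone else.
function icmp_checksum_filter(c: connection): count
    @{
    if ( c$id$orig_h == 10.0.0.5 )
        return WEIRD_IGNORE;

    return WEIRD_UNSPECIFIED;
    @}

redef weird_action_filters += @{
    ["bad_ICMP_checksum"] = icmp_checksum_filter,
@};
@end example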

@node weird functions,
@subsection @code{weird} functions

The @code{weird} analyzer includes the following functions:

@table @samp
@item @code{report_weird (t: time, name: string, id: string, action: WeirdAction, no_log: bool)}
Processes an occurrence of the weird event @code{name} associated with
the connection described by the string @code{id} (which may be empty
if no connection is associated with the event).  @code{action} is the
action associated with the event.  For @code{report_weird}, the only
distinctions made between the different actions are that @code{WEIRD_IGNORE}
causes the function to do nothing; any of @code{WEIRD_NOTICE_xxx}
cause the function to generate a notice, unless @code{no_log} is true; and @code{WEIRD_UNSPECIFIED}
causes the function to look up the action in @code{weird_action}.
If the function does @emph{not} find an action
for the event, then it uses @code{WEIRD_NOTICE_ALWAYS} and prefixes the log
message with a pair of asterisks (``@code{**}'') to flag that this event
does not have a specified action.

For @code{WEIRD_FILE}, @code{report_weird} only
records the event once to the file, unless the given event is present
in @code{weird_do_not_ignore_repeats}.  Events with notice-able actions
are always recorded to @code{weird_file}.

@item @code{report_weird_conn (t: time, name: string, id: string, c: connection)}
Processes an occurrence of the weird event @code{name} associated with
the connection @code{c}, which is described by the string @code{id}.

If @code{report_weird_conn} finds one of the hosts and the given event name
in @code{weird_ignore_host}, then it does nothing.  Otherwise, if the event
is in @code{weird_action}, it looks up the event in
@code{weird_action_filters} and invokes the corresponding function
if present, otherwise taking the action from @code{weird_action}.
It then implements the various flavors of @code{WEIRD_NOTICE_xxx}
by not generating notices more than once per connection, originator host,
etc., though the events are still written to @code{weird_file}.
Finally, the function invokes @code{report_weird} to do the
actual recording and/or writing to @code{weird_file}.

@item @code{report_weird_orig (t: time, name: string, id: string, orig: addr)}
Processes an occurrence of the weird event @code{name} associated with
the source address @code{orig}.  @code{id} textually describes the flow from
@code{orig} to the destination, for example using @code{endpoint_id}.

The function looks up the event name in @code{weird_action} and
passes it along to @code{report_weird}.

@end table
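
As a rough illustration of the action lookup described above, the following
Python sketch (not Bro code) models how an unspecified action might be
resolved.  The table contents, the name @code{resolve_action}, and the
space placed after the asterisks are assumptions made for the example; they
are not taken from the standard script.

@example
WEIRD_UNSPECIFIED = "WEIRD_UNSPECIFIED"
WEIRD_IGNORE = "WEIRD_IGNORE"
WEIRD_FILE = "WEIRD_FILE"
WEIRD_NOTICE_ALWAYS = "WEIRD_NOTICE_ALWAYS"

# Hypothetical stand-in for the weird_action table.
weird_action = dict(
    bad_TCP_checksum=WEIRD_FILE,
    fragment_overlap=WEIRD_NOTICE_ALWAYS,
)


def resolve_action(name, action=WEIRD_UNSPECIFIED):
    """Return (action, log_message) for the weird event called name."""
    message = name
    if action == WEIRD_UNSPECIFIED:
        if name in weird_action:
            action = weird_action[name]
        else:
            # No action specified anywhere: notice, and flag with "**".
            action = WEIRD_NOTICE_ALWAYS
            message = "** " + name
    return action, message
@end example

In this sketch, an event that is absent from the table resolves to
@code{WEIRD_NOTICE_ALWAYS} with a flagged message, while an explicitly
passed action such as @code{WEIRD_IGNORE} is returned unchanged.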

@node Events handled by conn_weird,
@subsection Events handled by @code{conn_weird}

@cindex weird events, handled by conn_weird
@cindex event handling, weird

@table @samp
@item @code{conn_weird (name: string, c: connection)}
Invoked for most ``weird'' events.
@code{name} is the name of the weird event, and @code{c} is the
connection with which it's associated.

@end table

@noindent @code{conn_weird} handles the following events, all of which have
a default action of @code{WEIRD_FILE}:

@table @samp
@item @code{active_connection_reuse}
A new connection attempt (initial SYN)
was seen for an already-established connection that has not
yet terminated.
@cindex HTTP, weird events
@item @code{bad_HTTP_reply}
The first line of a reply from an HTTP
server did not include @code{HTTP/}@emph{version}.
@item @code{bad_HTTP_version}
The first line of a request from an HTTP
client did not include @code{HTTP/}@emph{version}.
@cindex ICMP, weird events
@cindex packets, corrupted
@cindex corrupted packets
@cindex checksum error, ICMP
@cindex ICMP, checksum error
@item @code{bad_ICMP_checksum}
The checksum field in an
ICMP packet was invalid.
@cindex Rlogin, weird events
@item @code{bad_rlogin_prolog}
The beginning of an Rlogin connection had
a syntactical error.
@cindex RPC (Remote Procedure Call), weird events
@item @code{bad_RPC}
A Remote Procedure Call was ill-formed.
@item @code{bad_RPC_program}
A portmapper RPC call did not include the
correct portmapper program number.
@item @code{bad_SYN_ack}
A TCP SYN acknowledgment (SYN-ack) did not acknowledge
the sequence number sent in the initial SYN.
@cindex TCP, weird events
@cindex checksum error, TCP
@cindex TCP, checksum error
@item @code{bad_TCP_checksum}
A TCP packet had a bad checksum.
@cindex UDP, weird events
@cindex checksum error, UDP
@cindex UDP, checksum error
@item @code{bad_UDP_checksum}
A UDP packet had a bad checksum.
@item @code{baroque_SYN}
A TCP SYN was seen with an unlikely
combination of other flags (the URGent pointer).
@item @code{blank_in_HTTP_request}
The URL in an HTTP request includes
an embedded blank.
@item @code{connection_originator_SYN_ack}
A TCP endpoint that originated
a connection by sending a SYN followed this up by sending a SYN-ack.
@item @code{data_after_reset}
After a TCP endpoint sent a RST to terminate
a connection, it sent some data.
@item @code{data_before_established}
Before the connection was fully
established, a TCP endpoint sent some data.
@item @code{excessive_RPC_len}
An RPC record sent over a TCP connection
exceeded 8 KB.
@item @code{excess_RPC}
The sender of an RPC request or reply included
leftover data beyond what the RPC parameters or result value
themselves consumed.
@item @code{FIN_advanced_last_seq}
A TCP endpoint retransmitted a FIN with
a higher sequence number than previously.
@item @code{FIN_after_reset}
A TCP endpoint sent a FIN after sending a RST.
@cindex packets, storms
@cindex storms
@item @code{FIN_storm}
The monitor saw a flurry of FIN packets all sent on
the same connection.  A ``flurry'' is defined as 1,000 packets that
arrived with less than 1 sec between successive FINs.
@emph{Deficiency: Clearly, these numbers should be user-controllable. }
@item @code{HTTP_unknown_method}
The method in an HTTP request was
not GET, POST or HEAD.
@item @code{HTTP_version_mismatch}
A persistent HTTP connection sent a
different version number for a subsequent item than it
did initially.
@item @code{inappropriate_FIN}
A TCP endpoint sent a FIN before the
connection was fully established.
@item @code{multiple_HTTP_request_elements}
An HTTP request included multiple
methods.
@item @code{multiple_RPCs}
A TCP RPC stream included more than one
remote procedure call.
@cindex NULs
@item @code{NUL_in_line}
A NUL (ASCII 0) was seen in a text stream
that is expected to be free of NULs.  @emph{Updateme:  Currently, the only such stream is that associated with an FTP control connection. }
@item @code{originator_RPC_reply}
The originator (and hence presumed client)
of an RPC connection sent an RPC reply (either instead of a request,
or in addition to a request).
@cindex Finger, weird events
@item @code{partial_finger_request}
When a Finger connection terminated, it
included a final line of unanalyzed text because the text was
not newline-terminated.
@cindex FTP, weird events
@item @code{partial_ftp_request}
When an FTP connection terminated, it
included a final line of unanalyzed text because the text was
not newline-terminated.
@item @code{partial_ident_request}
When an IDENT connection terminated, it
included a final line of unanalyzed text because the text was
not newline-terminated.
@item @code{partial_portmapper_request}
A portmapper connection terminated with
an unanalyzed request because the data stream was incomplete.
@item @code{partial_RPC}
An RPC was missing some required header information
due to truncation.
@cindex data, unanalyzed
@cindex unanalyzed data
@item @code{pending_data_when_closed}
A TCP connection closed even though
not all of the data in it was analyzed due to a sequence hole.
@cindex split routing
@cindex routing, split
@cindex vantage point
@cindex bidirectional vs. unidirectional analysis
@cindex analysis, bidirectional vs. unidirectional
@cindex unidirectional analysis
@cindex scanning, stealth
@cindex stealth scans
@item @code{possible_split_routing}
Bro appears to be seeing only one
direction of some bi-directional connections.
This can also occur due to certain forms of stealth-scanning.
@cindex connection, reuse
@cindex Maximum Segment Lifetime (MSL)
@cindex MSL (Maximum Segment Lifetime)
@item @code{premature_connection_reuse}
A TCP connection tuple is being
reused less than 30 sec after its previous use.  (The standard
requires waiting 2 * @w{MSL} = 4 minutes [RFC-793, p. 27].)
@item @code{repeated_SYN_reply_wo_ack}
A TCP responder that replied to an
initial SYN with a SYN-ack has subsequently sent a SYN @emph{without}
an acknowledgment.
@item @code{repeated_SYN_with_ack}
A TCP originator that sent an
initial SYN has subsequently sent a SYN-ack.
@item @code{responder_RPC_call}
The responder (and hence presumed server)
of an RPC connection sent an RPC request (either instead of a reply,
or in addition to a reply).
@item @code{rlogin_text_after_rejected}
An Rlogin client sent additional text
to an Rlogin server after the server already presumably rejected
the client's service request.
@cindex retransmission, inconsistent
@cindex inconsistent retransmission
@cindex evasion, inconsistent RPC retransmission
@item @code{RPC_rexmit_inconsistency}
An RPC call was retransmitted, and
the retransmitted call differed from the original call.  This
could reflect an attempt by an attacker to evade the monitor.
@emph{Note:  This type of inconsistency checking is not available for RPC replies because the transmission of the reply in general marks the end of the RPC connection, and the monitor deletes the connection state shortly afterward. }
@item @code{RST_storm}
The monitor saw a flurry of RST packets all sent on
the same connection.  See @code{FIN_storm} for the definition of
``flurry.''
@item @code{RST_with_data}
A TCP RST packet included data.  This actually
is allowed by the specification [RFC-1122, Section 4.2.2.12].
@emph{Deficiency: This event should include the data. }
@cindex simultaneous open
@cindex connection, simultaneous open
@item @code{simultaneous_open}
The monitor saw a TCP simultaneous open,
i.e., both endpoints sent initial SYNs to one another at the same time.
While the specification allows this [RFC-793, p. 30], none of the
protocols analyzed by Bro should be using it.
@cindex transients, startup
@cindex startup, transients
@cindex scanning, stealth
@cindex stealth scans
@item @code{spontaneous_FIN}
A TCP endpoint sent a FIN packet without
sending any previous packets.  This event can reflect stealth-scanning,
but can also occur when Bro has recently
started up and has not seen other traffic on a connection and hence does
not know that the connection already exists.
@item @code{spontaneous_RST}
A TCP endpoint sent a RST packet without
sending any previous packets.  As with @code{spontaneous_FIN}, this
event can reflect either stealth scanning or a Bro start-up
transient.
@item @code{SYN_after_close}
A TCP endpoint sent a SYN (connection
initiation) after sending a FIN (connection termination),
but before the connection fully closed.
@item @code{SYN_after_partial}
A TCP endpoint in a ``partial'' connection
sent a SYN.
@item @code{SYN_after_reset}
A TCP endpoint sent a SYN after sending a
RST (reset connection).
@item @code{SYN_inside_connection}
A TCP endpoint sent a SYN during a
connection (or partial connection) on which it had already
sent data.
@item @code{SYN_seq_jump}
A TCP endpoint retransmitted a SYN or a
SYN-ack, but with a different sequence number.
@cindex T/TCP
@cindex TCP, transaction
@cindex transaction TCP
@item @code{SYN_with_data}
A TCP endpoint included data in a SYN packet
it sent.  Note, this can legitimately occur for T/TCP connections
[RFC-1644].
@cindex TCP, Christmas packet
@cindex Christmas packet
@item @code{TCP_christmas}
A TCP endpoint sent a SYN packet that
included the RST flag (a nonsensical combination).  The
term ``Christmas packet'' has been used in this context
(particularly if other flags are set, too) because the
packet's flags are ``lit up like a Christmas tree.''
@cindex length mismatch, UDP
@cindex UDP, length mismatch
@cindex evasion, length mismatch
@item @code{UDP_datagram_length_mismatch}
The length field in a UDP header
did not match the length field in the IP header.  This could
reflect an attempt by an attacker to evade the monitor.
@item @code{unpaired_RPC_response}
An RPC reply was seen for which no
request was seen.  This event could reflect a Bro start-up
transient (it started running after the request was sent).
@item @code{unsolicited_SYN_response}
A TCP endpoint sent a SYN-ack without
first receiving an initial SYN.  This event could reflect a
Bro start-up transient.

@end table
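
The ``flurry'' definition used by @code{FIN_storm} and @code{RST_storm}
can be pictured with a small Python sketch.  It is only a model of the
rule stated above (1,000 packets, each arriving less than 1 sec after the
previous one); the class name @code{StormDetector}, its parameters, and the
choice to report only once per run are assumptions, not a description of
the event engine's implementation.

@example
class StormDetector:
    """Count packets whose gap to the previous one is below max_gap."""

    def __init__(self, threshold=1000, max_gap=1.0):
        self.threshold = threshold
        self.max_gap = max_gap
        self.last_time = None
        self.count = 0

    def packet(self, t):
        """Record a FIN (or RST) seen at time t; return True on a storm."""
        if self.last_time is not None and t - self.last_time < self.max_gap:
            self.count += 1
        else:
            # First packet, or the gap was too long: start a new run.
            self.count = 1
        self.last_time = t
        # Report once, when the run first reaches the threshold.
        return self.count == self.threshold
@end example

Making @code{threshold} and @code{max_gap} constructor arguments reflects
the deficiency noted for @code{FIN_storm}: in this sketch, at least, the
numbers are adjustable.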

@node Events handled by conn_weird_addl,
@subsection Events handled by @code{conn_weird_addl}

@cindex weird events, handled by conn_weird_addl

@cindex polymorphic functions, need for
@table @samp
@item @code{conn_weird_addl (name: string, c: connection, addl: string)}
Invoked for a few ``weird'' events that require an extra (string)
argument to help clarify the event. @emph{Deficiency: It would likely be very handy if the general ``weird'' event handling were more flexible, with the ability to have various parameters associated with the events.  Doing so will likely have to wait on a general Bro mechanism for dealing with default parameters and/or polymorphic functions and event handlers. }

@end table

@code{conn_weird_addl} handles the following events, all of which
have a default action of @code{WEIRD_FILE}:

@table @samp
@cindex IDENT, weird events
@item @code{bad_ident_reply}
A reply from an IDENT server was
syntactically invalid.
@item @code{bad_ident_request}
A request to an IDENT server was
syntactically invalid.
@item @code{ident_request_addendum}
An IDENT request included additional
text beyond that forming the request itself.
@end table

@node Events handled by flow_weird,
@subsection Events handled by @code{flow_weird}

@cindex weird events, handled by flow_weird

@table @samp
@item @code{flow_weird (name: string, src: addr, dst: addr)}
Invoked for ``weird'' events that cannot be associated with a
particular connection, but only with a pair of hosts, corresponding
to a flow of packets from @code{src} to @code{dst}. Presently, all of
these events deal with fragments.

@end table

@code{flow_weird} handles the following events:

@table @samp
@cindex IP, fragments
@cindex fragments, excessively large
@cindex denial of service, excessively large fragments
@item @code{excessively_large_fragment}
A set of IP fragments reassembled
to a maximum size exceeding 64,000 bytes.  @emph{Note:  Sizes between 64,000 and 65,535 bytes are allowed, strictly speaking, but are highly unlikely in legitimate traffic.  Sizes above 65,535 bytes generally represent attempted denial-of-service attacks, due to IP implementations that crash upon receiving such impossibly-large fragment sets. }

Default: @code{WEIRD_NOTICE_ALWAYS}.

@cindex fragments, excessively small
@cindex evasion, excessively small fragments
@item @code{excessively_small_fragment}
A fragment other than the
last fragment in a set was less than 64 bytes in size.
@emph{Note:  The standard allows such small fragments, but their presence may reflect an attacker attempting to evade the monitor by splitting header information across multiple fragments. }

Default: @code{WEIRD_NOTICE_ALWAYS}.

@cindex fragments, inconsistent
@cindex evasion, inconsistent fragments
@item @code{fragment_inconsistency}
A fragment overlaps with a previously
sent fragment, and the two disagree on data they share in common.
This event could reflect an attacker attempting to evade the
monitor; it can also occur because Bro keeps previous fragments
indefinitely (@emph{Deficiency: it needs to provide a means for flushing old fragments, otherwise it becomes vulnerable to a state-holding attack}), and occasionally a fragment will
overlap with one sent much earlier and long-since forgotten
by the endpoints.

Default: @code{WEIRD_NOTICE_ALWAYS}.

@cindex fragments, overlapping
@item @code{fragment_overlap}
A fragment overlaps with a previously
sent fragment.  As for @code{fragment_inconsistency}, this
event can occur due to Bro keeping previous fragments
indefinitely.  This event does not in general reflect a
possible attempt at evasion.

Default: @code{WEIRD_NOTICE_ALWAYS}.

@cindex fragments, inconsistent protocols
@item @code{fragment_protocol_inconsistency}
Two fragments were seen
for the same flow and IP ID which differed in their transport protocol
(e.g., UDP, TCP).  According to the specification [RFC-791, p. 24], this is
allowed, but its use appears highly unlikely.

Default: @code{WEIRD_FILE}, because it is difficult to see how
an attacker can exploit this anomaly.

@cindex fragments, inconsistent sizes
@cindex evasion, inconsistent fragment size
@item @code{fragment_size_inconsistency}
A ``last fragment'' was
seen twice, and the two disagree on how large the reassembled datagram
should be.  This event could reflect an attacker attempting to evade
the monitor.

Default: @code{WEIRD_FILE}, since it is more likely that this
occurs due to a high volume flow of fragments wrapping the
IP ID space than due to an actual attack.

@item @code{fragment_with_DF}
A fragment was seen with the ``Don't Fragment''
bit set in its header.  While strictly speaking this is not illegal,
and not impossible (a router could have fragmented a packet and then
decided that the fragments should not be further fragmented), its
presence is highly unusual.

Default: @code{WEIRD_FILE}, because it's difficult to see how
this could reflect malicious activity.

@item @code{incompletely_captured_fragment}
A fragment was seen whose
length field is larger than the fragment datagram appearing on the
monitored link.

Default: @code{WEIRD_NOTICE_ALWAYS}.

@end table
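
The difference between @code{fragment_overlap} and
@code{fragment_inconsistency} can be sketched in Python as a comparison of
the bytes two fragments share.  The function name
@code{classify_fragment} and its arguments are hypothetical, and treating
the two events as mutually exclusive is an assumption of this sketch rather
than something the descriptions above state.

@example
def classify_fragment(prev_offset, prev_data, new_offset, new_data):
    """Compare a new fragment against an earlier one from the same set.

    Offsets are byte offsets into the reassembled datagram.  Returns the
    name of the weird event, or None if the fragments do not overlap.
    """
    start = max(prev_offset, new_offset)
    end = min(prev_offset + len(prev_data), new_offset + len(new_data))
    if start >= end:
        return None  # no bytes in common
    old = prev_data[start - prev_offset:end - prev_offset]
    new = new_data[start - new_offset:end - new_offset]
    if old == new:
        return "fragment_overlap"
    return "fragment_inconsistency"
@end example

For instance, a fragment carrying bytes 4 through 11 that repeats bytes 4
through 7 of an earlier fragment unchanged is classified as an overlap; if
any of those shared bytes differ, it is classified as an inconsistency.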

@node Events handled by net_weird,
@subsection Events handled by @code{net_weird}

@cindex weird events, handled by net_weird

@table @samp
@item @code{net_weird (name: string)}
Invoked for ``weird'' events that cannot be associated with
a particular connection or set of hosts.  Except as noted, the
default action for all such events is @code{WEIRD_FILE}.

@end table

@code{net_weird} handles the following events:

@cindex packets, corrupted
@cindex corrupted packets
@cindex IP, weird events
@cindex IP, checksum error
@cindex checksum error, IP
@table @samp
@item @code{bad_IP_checksum}
A packet had a bad IP header checksum.

@cindex TCP, corrupted header
@item @code{bad_TCP_header_len}
The length of the TCP header (which is
itself specified in the header) was smaller than the minimum
allowed size.

@cindex headers, truncated
@cindex truncated headers
@item @code{internally_truncated_header}
A captured packet with a valid
IP length field was smaller as actually recorded, such that the
captured version of the packet was illegally small.  This event
may reflect an error in Bro's packet capture hardware or software.

Default: @code{WEIRD_NOTICE_ALWAYS}, because this event can indicate
a basic problem with Bro's packet capture.

@item @code{truncated_IP}
A captured packet either was too small to
include a minimal IP header, or the full length as recorded by
the packet capture library was smaller than the length as indicated
by the IP header.

@item @code{truncated_header}
An IP datagram's header indicates a length
smaller than that required for the indicated transport type (TCP,
UDP, ICMP).

@end table
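
The two conditions listed for @code{truncated_IP} can be written down as a
short Python check.  This is a sketch under stated assumptions: the packet
is IPv4, the captured bytes begin at the IP header, and the name
@code{is_truncated_ip} is invented for the example.

@example
import struct

MIN_IP_HEADER_LEN = 20  # minimal IPv4 header, in bytes


def is_truncated_ip(captured):
    """Return True if the captured bytes fall short of the IP header."""
    if len(captured) < MIN_IP_HEADER_LEN:
        # Too small to hold even a minimal IP header.
        return True
    # Total Length is the 16-bit big-endian field at byte offset 2.
    (total_len,) = struct.unpack("!H", captured[2:4])
    # Captured length is smaller than the length the header indicates.
    return len(captured) < total_len
@end example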

@node Events generated by the standard scripts,
@subsection Events generated by the standard scripts

@cindex weird events, generated by standard scripts

The following events are generated by the standard scripts themselves:

@table @samp
@item @code{bad_pm_port}
See @code{pm_bad_port}.  Handled by @code{conn_weird_addl},
where the extra parameter is the text
@code{"port <}@emph{bad-port}@code{>"}.

@cindex denial of service, Land attack
@cindex Land attack
@item @code{Land_attack}
A TCP connection attempt was seen with identical
initiator and responder addresses and ports.  This event likely
reflects an attempted denial-of-service attack known as a
``Land'' attack.  See @code{check_spoof}.  Handled by @code{conn_weird}.

@end table

@node Additional handlers for weird events,
@subsection Additional handlers for ``weird'' events

@cindex weird events, additional handlers

In addition to the generalized events above, Bro includes two specific
events that are defined separately so that they can carry additional
parameters:

@cindex evasion, inconsistent TCP retransmission
@cindex retransmission, inconsistent
@cindex inconsistent retransmission
@table @samp
@item @code{rexmit_inconsistency (c: connection, t1: string, t2: string)}
Invoked when a retransmission associated with connection @code{c} differed
in its data from the contents transmitted previously.  @code{t1} gives
the original data and @code{t2} the different retransmitted data.

@cindex bugs, appalling
This event may reflect an attacker attempting to evade the monitor.
Unfortunately, however, experience has shown that
@emph{(i)} inconsistent retransmissions do in fact
happen due to (appalling) TCP implementation bugs, and
@emph{(ii)} once they occur, they tend to cascade, because often
the source of the bug is that the two endpoints have become
desynchronized.

The handler logs the message in the format
@code{"}@emph{id}@code{ rexmit inconsistency (<t1>) (<t2>)"}.  However,
the handler only logs the first instance of an inconsistency, due to
the cascade problem mentioned above.

@emph{Deficiency: The handler is not told which of the two connection endpoints was the faulty transmitter. }

@cindex packets, drops
@cindex acknowledgment holes
@cindex inconsistent acknowledgment
@cindex bugs, appalling
@item @code{ack_above_hole (c: connection)}
Invoked when Bro sees a TCP receiver acknowledge data above
a sequence hole.  In principle, this should never occur.  Its
presence generally means one of two things: @emph{(i)} a TCP
implementation with an appalling bug (these definitely exist),
or @emph{(ii)} a packet drop by Bro's packet capture facility,
such that it never saw the data now being acknowledged.

Because of the seriousness of this latter possibility, the
handler logs a message
@code{ack above a hole}.
@emph{Note:  You can often distinguish between a truly broken TCP acknowledgment and Bro dropping packets by the fact that in the latter case you generally see a cluster of ack-above-a-hole messages among otherwise unrelated connections. }

@emph{Deficiency: The handler is not told which of the two connection endpoints sent the acknowledgment. }

@end table

@cindex event handling, weird

@cindex unusual events
@cindex exceptional events
@cindex events, exceptional
@cindex weird events

@node icmp Analyzer,
@section The @code{icmp} Analyzer

not done.

@node stepping Analyzer,
@section The @code{stepping} Analyzer

not done.

@node ssh-stepping Analysis Script,
@section The @code{ssh-stepping} Analysis Script

not done.

@node backdoor Analyzer,
@section The @code{backdoor} Analyzer

not done.

@node interconn Analyzer,
@section The @code{interconn} Analyzer

not done.

@cindex standard scripts
@cindex scripts, standard
@cindex analyzers