mirror of
https://github.com/zeek/zeek.git
synced 2025-10-02 06:38:20 +00:00
Added a document for the SumStats framework.
parent
eff96bef37
commit
fab47cc749
5 changed files with 184 additions and 0 deletions
@ -13,4 +13,5 @@ Frameworks
   logging
   notice
   signatures
   sumstats
36
doc/frameworks/sumstats-countconns.bro
Normal file
@ -0,0 +1,36 @@
@load base/frameworks/sumstats

event connection_established(c: connection)
    {
    # Make an observation!
    # This observation is global so the key is empty.
    # Each established connection counts as one so the observation is always 1.
    SumStats::observe("conn established",
                      SumStats::Key(),
                      SumStats::Observation($num=1));
    }

event bro_init()
    {
    # Create the reducer.
    # The reducer attaches to the "conn established" observation stream
    # and uses the summing calculation on the observations.
    local r1 = SumStats::Reducer($stream="conn established",
                                 $apply=set(SumStats::SUM));

    # Create the final sumstat.
    # We give it an arbitrary name and make it collect data every minute.
    # The reducer is then attached and a $epoch_result callback is given
    # to finally do something with the data collected.
    SumStats::create([$name = "counting connections",
                      $epoch = 1min,
                      $reducers = set(r1),
                      $epoch_result(ts: time, key: SumStats::Key, result: SumStats::Result) =
                          {
                          # This is the body of the callback that is called when a single
                          # result has been collected. We are just printing the total number
                          # of connections that were seen. The $sum field is provided as a
                          # double type value so we need to use %f as the format specifier.
                          print fmt("Number of connections established: %.0f", result["conn established"]$sum);
                          }]);
    }
45
doc/frameworks/sumstats-toy-scan.bro
Normal file
@ -0,0 +1,45 @@
@load base/frameworks/sumstats

# We use the connection_attempt event to limit our observations to those
# which were attempted and not successful.
event connection_attempt(c: connection)
    {
    # Make an observation!
    # This observation is about the host attempting the connection.
    # Each attempted connection counts as one so the observation is always 1.
    SumStats::observe("conn attempted",
                      SumStats::Key($host=c$id$orig_h),
                      SumStats::Observation($num=1));
    }

event bro_init()
    {
    # Create the reducer.
    # The reducer attaches to the "conn attempted" observation stream
    # and uses the summing calculation on the observations. Keep
    # in mind that there will be one result per key (connection originator).
    local r1 = SumStats::Reducer($stream="conn attempted",
                                 $apply=set(SumStats::SUM));

    # Create the final sumstat.
    # This is slightly different from the last example since we're providing
    # a callback to calculate a value to check against the threshold with
    # $threshold_val. The actual threshold itself is provided with $threshold.
    # Another callback, $threshold_crossed, is called when a key crosses it.
    SumStats::create([$name = "finding scanners",
                      $epoch = 5min,
                      $reducers = set(r1),
                      # Provide a threshold.
                      $threshold = 5.0,
                      # Provide a callback to calculate a value from the result
                      # to check against the threshold field.
                      $threshold_val(key: SumStats::Key, result: SumStats::Result) =
                          {
                          return result["conn attempted"]$sum;
                          },
                      # Provide a callback for when a key crosses the threshold.
                      $threshold_crossed(key: SumStats::Key, result: SumStats::Result) =
                          {
                          print fmt("%s attempted %.0f or more connections", key$host, result["conn attempted"]$sum);
                          }]);
    }
102
doc/frameworks/sumstats.rst
Normal file
@ -0,0 +1,102 @@
==================
Summary Statistics
==================

.. rst-class:: opening

    Measuring aspects of network traffic is an extremely common task in Bro.
    Bro provides data structures which make this very easy in simplistic
    cases such as size-limited trace file processing. In real-world
    deployments though, there are difficulties that arise from
    clusterization (many processes sniffing traffic) and unbounded data sets
    (traffic never stops). The Summary Statistics (otherwise referred to as
    SumStats) framework aims to define a mechanism for consuming unbounded
    data sets and making them measurable in practice on large clustered and
    non-clustered Bro deployments.

.. contents::

Overview
========

The SumStats processing flow is broken into three pieces: observations, where
some aspect of an event is observed and fed into the SumStats framework;
reducers, where observations are collected and measured, typically by taking
some sort of summary statistic measurement like average or variance (among
others); and sumstats, where reducers have an epoch (time interval) that their
measurements are performed over, along with callbacks for monitoring thresholds
or viewing the collected and measured data.

Terminology
===========

Observation

    A single point of data. Each observation has three components: the
    arbitrarily named observation stream it belongs to, a key identifying
    what the observation is about, and the actual observed value
    itself.

Reducer

    Calculations are applied to an observation stream here to reduce the
    full unbounded set of observations down to a smaller representation.
    Results are collected within each reducer per key, so care must be
    taken to keep the total number of keys tracked down to a reasonable
    level.

Sumstat

    The final definition of a Sumstat, where one or more reducers are
    collected over an interval, also known as an epoch. Thresholding can
    be applied here along with a callback in the event that a threshold is
    crossed. Additionally, a callback can be provided to access each
    result (per key) at the end of each epoch.
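
To make the three terms concrete, here is a minimal sketch in the same style as the scripts in this commit. It is an illustration only: the stream name "orig bytes" and the sumstat name are made up, and it assumes the framework's bundled AVERAGE and MAX calculations populate the $average and $max fields of each result.

```bro
@load base/frameworks/sumstats

# Observation: one data point per finished connection, keyed by the
# originating host. The stream name "orig bytes" is arbitrary (hypothetical).
event connection_state_remove(c: connection)
    {
    SumStats::observe("orig bytes",
                      SumStats::Key($host=c$id$orig_h),
                      SumStats::Observation($num=c$orig$size));
    }

event bro_init()
    {
    # Reducer: collapse the stream to an average and a maximum per key.
    # Assumes the AVERAGE and MAX calculations are available.
    local r1 = SumStats::Reducer($stream="orig bytes",
                                 $apply=set(SumStats::AVERAGE, SumStats::MAX));

    # Sumstat: collect the reducer over a 5 minute epoch and print
    # each per-key result when the epoch ends.
    SumStats::create([$name = "orig bytes per host",
                      $epoch = 5min,
                      $reducers = set(r1),
                      $epoch_result(ts: time, key: SumStats::Key, result: SumStats::Result) =
                          {
                          local r = result["orig bytes"];
                          print fmt("%s sent %.1f bytes on average (max %.0f)", key$host, r$average, r$max);
                          }]);
    }
```

Because results are kept per key, a sketch like this tracks one result for every originating host seen during the epoch, which is the key-count concern raised under Reducer.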

Examples
========

These examples may seem very simple to an experienced Bro script developer, and
they're intended to look that way. Keep in mind that these scripts will work
on small single-process Bro instances as well as large many-worker clusters.
Developers writing scripts that use SumStats can ignore the complications of
flow-based load balancing because of the framework's built-in cluster
transparency.

Printing the number of connections
----------------------------------

SumStats provides a simple way of approaching the problem of trying to count
the number of connections over a given time interval. Here is a script with
inline documentation that does this with the SumStats framework:

.. btest-include:: ${DOC_ROOT}/frameworks/sumstats-countconns.bro

When run on a sample PCAP file from the Bro test suite, the following output
is created:

.. btest:: sumstats-countconns

    @TEST-EXEC: btest-rst-cmd bro -r ${TRACES}/workshop_2011_browse.trace ${DOC_ROOT}/frameworks/sumstats-countconns.bro


Toy Scan detection
------------------

Taking the previous example even further, we can implement a simple detection
to demonstrate the thresholding functionality. This example is a toy to
demonstrate how thresholding works in SumStats and is not meant to be a
real-world functional example; that is left to the scan.bro script that is
included with Bro.

.. btest-include:: ${DOC_ROOT}/frameworks/sumstats-toy-scan.bro

Let's see if there are any hosts that crossed the threshold in a PCAP file
containing a host running nmap:

.. btest:: sumstats-toy-scan

    @TEST-EXEC: btest-rst-cmd bro -r ${TRACES}/nmap-vsn.trace ${DOC_ROOT}/frameworks/sumstats-toy-scan.bro

It seems the host running nmap was detected!

BIN
testing/btest/Traces/nmap-vsn.trace
Normal file
Binary file not shown.