mirror of
https://github.com/zeek/zeek.git
synced 2025-10-02 14:48:21 +00:00
Merge branch 'master' into topic/jsiwek/faf-updates
Conflicts: testing/btest/Baseline/coverage.default-load-baseline/canonified_loaded_scripts.log
commit
9bd7a65071
91 changed files with 14058 additions and 402 deletions
21
NEWS
|
@ -80,7 +80,7 @@ New Functionality
|
||||||
with the following user-visible functionality (some of that was
|
with the following user-visible functionality (some of that was
|
||||||
already available before, but done differently):
|
already available before, but done differently):
|
||||||
|
|
||||||
[TODO: This will probably change with further script updates.]
|
[TODO: Update with changes from 984e9793db56.]
|
||||||
|
|
||||||
- A binary input reader interfaces the input framework with file
|
- A binary input reader interfaces the input framework with file
|
||||||
analysis, allowing to inject files on disk into Bro's
|
analysis, allowing to inject files on disk into Bro's
|
||||||
|
@ -108,6 +108,25 @@ New Functionality
|
||||||
shunting, and sampling; plus plugin support to customize filters
|
shunting, and sampling; plus plugin support to customize filters
|
||||||
dynamically.
|
dynamically.
|
||||||
|
|
||||||
|
- Bro now provides Bloom filters of two kinds: basic Bloom filters
|
||||||
|
supporting membership tests, and counting Bloom filters that track
|
||||||
|
the frequency of elements. The corresponding functions are:
|
||||||
|
|
||||||
|
bloomfilter_basic_init(fp: double, capacity: count, name: string &default=""): opaque of bloomfilter
|
||||||
|
bloomfilter_counting_init(k: count, cells: count, max: count, name: string &default=""): opaque of bloomfilter
|
||||||
|
bloomfilter_add(bf: opaque of bloomfilter, x: any)
|
||||||
|
bloomfilter_lookup(bf: opaque of bloomfilter, x: any): count
|
||||||
|
bloomfilter_merge(bf1: opaque of bloomfilter, bf2: opaque of bloomfilter): opaque of bloomfilter
|
||||||
|
bloomfilter_clear(bf: opaque of bloomfilter)
|
||||||
|
|
||||||
|
See <INSERT LINK> for full documentation.
|
||||||
|
|
||||||
|
- base/utils/exec.bro provides a module to start external processes
|
||||||
|
asynchronously and retrieve their output on termination.
|
||||||
|
base/utils/dir.bro uses it to monitor a directory for changes, and
|
||||||
|
base/utils/active-http.bro for providing an interface for querying
|
||||||
|
remote web servers.
|
||||||
|
|
||||||
Changed Functionality
|
Changed Functionality
|
||||||
~~~~~~~~~~~~~~~~~~~~~
|
~~~~~~~~~~~~~~~~~~~~~
|
||||||
|
|
||||||
|
|
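The Bloom filter functions listed in the NEWS entry above could be exercised roughly as follows. This is a minimal sketch that assumes the signatures exactly as given in that entry; the false-positive rate, the capacity, and the element values are illustrative choices, not values taken from the commit.

```bro
event bro_init()
	{
	# Basic Bloom filter sized for roughly 1000 elements at a target
	# false-positive rate of 1% (both numbers are arbitrary examples).
	local bf = bloomfilter_basic_init(0.01, 1000);

	bloomfilter_add(bf, "a.b.com");

	# bloomfilter_lookup returns a count. For a basic filter, a non-zero
	# result means "possibly present"; zero means "definitely absent".
	if ( bloomfilter_lookup(bf, "a.b.com") > 0 )
		print "a.b.com may have been added";

	if ( bloomfilter_lookup(bf, "x.y.org") == 0 )
		print "x.y.org was never added";
	}
```

A counting filter created with bloomfilter_counting_init would be used the same way, with the lookup result interpreted as an estimated frequency rather than a yes/no answer.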
2
VERSION
|
@ -1 +1 @@
|
||||||
2.1-824
|
2.1-945
|
||||||
|
|
|
@ -1 +1 @@
|
||||||
Subproject commit c39bd478b9d0ecd05b1b83aa9d09a7887893977c
|
Subproject commit 314fa8f65fc240e960c23c3bba98623436a72b98
|
|
@ -1 +1 @@
|
||||||
Subproject commit a9942558c7d3dfd80148b8aaded64c82ade3d117
|
Subproject commit 91d258cc8b2f74cd02fc93dfe61f73ec9f0dd489
|
|
@ -1 +1 @@
|
||||||
Subproject commit 889f9c65944ceac20ad9230efc39d33e6e1221c3
|
Subproject commit d59c73b6e0966ad63bbc63a35741b5f68263e7b1
|
|
@ -1 +1 @@
|
||||||
Subproject commit 0cd102805e73343cab3f9fd4a76552e13940dad9
|
Subproject commit 52fd91261f41fa1528f7b964837a364d7991889e
|
2
cmake
|
@ -1 +1 @@
|
||||||
Subproject commit 0187b33a29d5ec824f940feff60dc5d8c2fe314f
|
Subproject commit 026639f8368e56742c0cb5d9fb390ea64e60ec50
|
|
@ -27,10 +27,7 @@ Quick Start
|
||||||
Load the package of scripts that sends data into the Intelligence
|
Load the package of scripts that sends data into the Intelligence
|
||||||
Framework to be checked by loading this script in local.bro::
|
Framework to be checked by loading this script in local.bro::
|
||||||
|
|
||||||
@load policy/frameworks/intel
|
@load policy/frameworks/intel/seen
|
||||||
|
|
||||||
(TODO: find some good mechanism for getting setup with good data
|
|
||||||
quickly)
|
|
||||||
|
|
||||||
Refer to the "Loading Intelligence" section below to see the format
|
Refer to the "Loading Intelligence" section below to see the format
|
||||||
for Intelligence Framework text files, then load those text files with
|
for Intelligence Framework text files, then load those text files with
|
||||||
|
@ -61,16 +58,14 @@ data out to all of the nodes that need it.
|
||||||
|
|
||||||
Here is an example of the intelligence data format. Note that all
|
Here is an example of the intelligence data format. Note that all
|
||||||
whitespace separators are literal tabs and fields containing only a
|
whitespace separators are literal tabs and fields containing only a
|
||||||
hyphen a considered to be null values.::
|
hyphen are considered to be null values.::
|
||||||
|
|
||||||
#fields host net str str_type meta.source meta.desc meta.url
|
#fields indicator indicator_type meta.source meta.desc meta.url
|
||||||
1.2.3.4 - - - source1 Sending phishing email http://source1.com/badhosts/1.2.3.4
|
1.2.3.4 Intel::ADDR source1 Sending phishing email http://source1.com/badhosts/1.2.3.4
|
||||||
- 31.131.248.0/21 - - spamhaus-drop SBL154982 - -
|
a.b.com Intel::DOMAIN source2 Name used for data exfiltration -
|
||||||
- - a.b.com Intel::DOMAIN source2 Name used for data exfiltration -
|
|
||||||
|
|
||||||
For more examples of built in `str_type` values, please refer to the
|
For more examples of built in `indicator_type` values, please refer to the
|
||||||
autogenerated documentation for the intelligence framework (TODO:
|
autogenerated documentation for the intelligence framework.
|
||||||
figure out how to do this link).
|
|
||||||
|
|
||||||
To load the data once files are created, use the following example
|
To load the data once files are created, use the following example
|
||||||
code to define files to load with your own file names of course::
|
code to define files to load with your own file names of course::
|
||||||
|
@ -90,8 +85,7 @@ When some bit of data is extracted (such as an email address in the
|
||||||
"From" header in a message over SMTP), the Intelligence Framework
|
"From" header in a message over SMTP), the Intelligence Framework
|
||||||
needs to be informed that this data was discovered and its presence
|
needs to be informed that this data was discovered and its presence
|
||||||
should be checked within the intelligence data set. This is
|
should be checked within the intelligence data set. This is
|
||||||
accomplished through the Intel::seen (TODO: do a reference link)
|
accomplished through the Intel::seen function.
|
||||||
function.
|
|
||||||
|
|
||||||
Typically users won't need to work with this function due to built in
|
Typically users won't need to work with this function due to built in
|
||||||
hook scripts that Bro ships with that will "see" data and send it into
|
hook scripts that Bro ships with that will "see" data and send it into
|
||||||
|
@ -106,7 +100,7 @@ The full package of hook scripts that Bro ships with for sending this
|
||||||
"seen" data into the intelligence framework can be loaded by adding
|
"seen" data into the intelligence framework can be loaded by adding
|
||||||
this line to local.bro::
|
this line to local.bro::
|
||||||
|
|
||||||
@load policy/frameworks/intel
|
@load policy/frameworks/intel/seen
|
||||||
|
|
||||||
Intelligence Matches
|
Intelligence Matches
|
||||||
********************
|
********************
|
||||||
|
|
|
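For readers who do need to call Intel::seen directly, the sketch below shows the two shapes of observation that the revised documentation describes: an IP address and a typed string indicator. It assumes the field names introduced in this commit (indicator, indicator_type, host, where) and uses Intel::IN_ANYWHERE as a generic location. Calling it from bro_init is only for illustration; a real script would call it from a protocol event, and nothing matches unless intelligence data has already been loaded.

```bro
@load base/frameworks/intel

event bro_init()
	{
	# An address observation: only $host is set. Per this commit,
	# Intel::seen fills in $indicator and $indicator_type on a match.
	Intel::seen([$host=1.2.3.4, $where=Intel::IN_ANYWHERE]);

	# A string observation: the indicator plus its type.
	Intel::seen([$indicator="a.b.com",
	             $indicator_type=Intel::DOMAIN,
	             $where=Intel::IN_ANYWHERE]);
	}
```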
@ -17,6 +17,7 @@ rest_target(${psd} base/init-default.bro internal)
|
||||||
rest_target(${psd} base/init-bare.bro internal)
|
rest_target(${psd} base/init-bare.bro internal)
|
||||||
|
|
||||||
rest_target(${CMAKE_BINARY_DIR}/scripts base/bif/analyzer.bif.bro)
|
rest_target(${CMAKE_BINARY_DIR}/scripts base/bif/analyzer.bif.bro)
|
||||||
|
rest_target(${CMAKE_BINARY_DIR}/scripts base/bif/bloom-filter.bif.bro)
|
||||||
rest_target(${CMAKE_BINARY_DIR}/scripts base/bif/bro.bif.bro)
|
rest_target(${CMAKE_BINARY_DIR}/scripts base/bif/bro.bif.bro)
|
||||||
rest_target(${CMAKE_BINARY_DIR}/scripts base/bif/const.bif.bro)
|
rest_target(${CMAKE_BINARY_DIR}/scripts base/bif/const.bif.bro)
|
||||||
rest_target(${CMAKE_BINARY_DIR}/scripts base/bif/event.bif.bro)
|
rest_target(${CMAKE_BINARY_DIR}/scripts base/bif/event.bif.bro)
|
||||||
|
@ -164,9 +165,12 @@ rest_target(${psd} base/protocols/ssl/main.bro)
|
||||||
rest_target(${psd} base/protocols/ssl/mozilla-ca-list.bro)
|
rest_target(${psd} base/protocols/ssl/mozilla-ca-list.bro)
|
||||||
rest_target(${psd} base/protocols/syslog/consts.bro)
|
rest_target(${psd} base/protocols/syslog/consts.bro)
|
||||||
rest_target(${psd} base/protocols/syslog/main.bro)
|
rest_target(${psd} base/protocols/syslog/main.bro)
|
||||||
|
rest_target(${psd} base/utils/active-http.bro)
|
||||||
rest_target(${psd} base/utils/addrs.bro)
|
rest_target(${psd} base/utils/addrs.bro)
|
||||||
rest_target(${psd} base/utils/conn-ids.bro)
|
rest_target(${psd} base/utils/conn-ids.bro)
|
||||||
|
rest_target(${psd} base/utils/dir.bro)
|
||||||
rest_target(${psd} base/utils/directions-and-hosts.bro)
|
rest_target(${psd} base/utils/directions-and-hosts.bro)
|
||||||
|
rest_target(${psd} base/utils/exec.bro)
|
||||||
rest_target(${psd} base/utils/files.bro)
|
rest_target(${psd} base/utils/files.bro)
|
||||||
rest_target(${psd} base/utils/numbers.bro)
|
rest_target(${psd} base/utils/numbers.bro)
|
||||||
rest_target(${psd} base/utils/paths.bro)
|
rest_target(${psd} base/utils/paths.bro)
|
||||||
|
@ -184,15 +188,16 @@ rest_target(${psd} policy/frameworks/dpd/detect-protocols.bro)
|
||||||
rest_target(${psd} policy/frameworks/dpd/packet-segment-logging.bro)
|
rest_target(${psd} policy/frameworks/dpd/packet-segment-logging.bro)
|
||||||
rest_target(${psd} policy/frameworks/files/detect-MHR.bro)
|
rest_target(${psd} policy/frameworks/files/detect-MHR.bro)
|
||||||
rest_target(${psd} policy/frameworks/files/hash-all-files.bro)
|
rest_target(${psd} policy/frameworks/files/hash-all-files.bro)
|
||||||
rest_target(${psd} policy/frameworks/intel/conn-established.bro)
|
rest_target(${psd} policy/frameworks/intel/do_notice.bro)
|
||||||
rest_target(${psd} policy/frameworks/intel/dns.bro)
|
rest_target(${psd} policy/frameworks/intel/seen/conn-established.bro)
|
||||||
rest_target(${psd} policy/frameworks/intel/http-host-header.bro)
|
rest_target(${psd} policy/frameworks/intel/seen/dns.bro)
|
||||||
rest_target(${psd} policy/frameworks/intel/http-url.bro)
|
rest_target(${psd} policy/frameworks/intel/seen/http-host-header.bro)
|
||||||
rest_target(${psd} policy/frameworks/intel/http-user-agents.bro)
|
rest_target(${psd} policy/frameworks/intel/seen/http-url.bro)
|
||||||
rest_target(${psd} policy/frameworks/intel/smtp-url-extraction.bro)
|
rest_target(${psd} policy/frameworks/intel/seen/http-user-agents.bro)
|
||||||
rest_target(${psd} policy/frameworks/intel/smtp.bro)
|
rest_target(${psd} policy/frameworks/intel/seen/smtp-url-extraction.bro)
|
||||||
rest_target(${psd} policy/frameworks/intel/ssl.bro)
|
rest_target(${psd} policy/frameworks/intel/seen/smtp.bro)
|
||||||
rest_target(${psd} policy/frameworks/intel/where-locations.bro)
|
rest_target(${psd} policy/frameworks/intel/seen/ssl.bro)
|
||||||
|
rest_target(${psd} policy/frameworks/intel/seen/where-locations.bro)
|
||||||
rest_target(${psd} policy/frameworks/packet-filter/shunt.bro)
|
rest_target(${psd} policy/frameworks/packet-filter/shunt.bro)
|
||||||
rest_target(${psd} policy/frameworks/software/version-changes.bro)
|
rest_target(${psd} policy/frameworks/software/version-changes.bro)
|
||||||
rest_target(${psd} policy/frameworks/software/vulnerable.bro)
|
rest_target(${psd} policy/frameworks/software/vulnerable.bro)
|
||||||
|
|
|
@ -10,13 +10,14 @@ module Intel;
|
||||||
export {
|
export {
|
||||||
redef enum Log::ID += { LOG };
|
redef enum Log::ID += { LOG };
|
||||||
|
|
||||||
## String data needs to be further categoried since it could represent
|
## Enum type to represent various types of intelligence data.
|
||||||
## and number of types of data.
|
type Type: enum {
|
||||||
type StrType: enum {
|
## An IP address.
|
||||||
|
ADDR,
|
||||||
## A complete URL without the prefix "http://".
|
## A complete URL without the prefix "http://".
|
||||||
URL,
|
URL,
|
||||||
## User-Agent string, typically HTTP or mail message body.
|
## Software name.
|
||||||
USER_AGENT,
|
SOFTWARE,
|
||||||
## Email address.
|
## Email address.
|
||||||
EMAIL,
|
EMAIL,
|
||||||
## DNS domain name.
|
## DNS domain name.
|
||||||
|
@ -44,18 +45,15 @@ export {
|
||||||
|
|
||||||
## Represents a piece of intelligence.
|
## Represents a piece of intelligence.
|
||||||
type Item: record {
|
type Item: record {
|
||||||
## The IP address if the intelligence is about an IP address.
|
## The intelligence indicator.
|
||||||
host: addr &optional;
|
indicator: string;
|
||||||
## The network if the intelligence is about a CIDR block.
|
|
||||||
net: subnet &optional;
|
|
||||||
## The string if the intelligence is about a string.
|
|
||||||
str: string &optional;
|
|
||||||
## The type of data that is in the string if the $str field is set.
|
|
||||||
str_type: StrType &optional;
|
|
||||||
|
|
||||||
## Metadata for the item. Typically represents more deeply \
|
## The type of data that the indicator field represents.
|
||||||
|
indicator_type: Type;
|
||||||
|
|
||||||
|
## Metadata for the item. Typically represents more deeply
|
||||||
## descriptive data for a piece of intelligence.
|
## descriptive data for a piece of intelligence.
|
||||||
meta: MetaData;
|
meta: MetaData;
|
||||||
};
|
};
|
||||||
|
|
||||||
## Enum to represent where data came from when it was discovered.
|
## Enum to represent where data came from when it was discovered.
|
||||||
|
@ -65,23 +63,23 @@ export {
|
||||||
IN_ANYWHERE,
|
IN_ANYWHERE,
|
||||||
};
|
};
|
||||||
|
|
||||||
## The $host field and combination of $str and $str_type fields are mutually
|
|
||||||
## exclusive. These records *must* represent either an IP address being
|
|
||||||
## seen or a string being seen.
|
|
||||||
type Seen: record {
|
type Seen: record {
|
||||||
## The IP address if the data seen is an IP address.
|
|
||||||
host: addr &log &optional;
|
|
||||||
## The string if the data is about a string.
|
## The string if the data is about a string.
|
||||||
str: string &log &optional;
|
indicator: string &log &optional;
|
||||||
## The type of data that is in the string if the $str field is set.
|
|
||||||
str_type: StrType &log &optional;
|
## The type of data that the indicator represents.
|
||||||
|
indicator_type: Type &log &optional;
|
||||||
|
|
||||||
|
## If the indicator type was :bro:enum:`Intel::ADDR`, then this
|
||||||
|
## field will be present.
|
||||||
|
host: addr &optional;
|
||||||
|
|
||||||
## Where the data was discovered.
|
## Where the data was discovered.
|
||||||
where: Where &log;
|
where: Where &log;
|
||||||
|
|
||||||
## If the data was discovered within a connection, the
|
## If the data was discovered within a connection, the
|
||||||
## connection record should go into get to give context to the data.
|
## connection record should go into get to give context to the data.
|
||||||
conn: connection &optional;
|
conn: connection &optional;
|
||||||
};
|
};
|
||||||
|
|
||||||
## Record used for the logging framework representing a positive
|
## Record used for the logging framework representing a positive
|
||||||
|
@ -100,7 +98,7 @@ export {
|
||||||
## Where the data was seen.
|
## Where the data was seen.
|
||||||
seen: Seen &log;
|
seen: Seen &log;
|
||||||
## Sources which supplied data that resulted in this match.
|
## Sources which supplied data that resulted in this match.
|
||||||
sources: set[string] &log;
|
sources: set[string] &log &default=string_set();
|
||||||
};
|
};
|
||||||
|
|
||||||
## Intelligence data manipulation functions.
|
## Intelligence data manipulation functions.
|
||||||
|
@ -135,8 +133,8 @@ const have_full_data = T &redef;
|
||||||
|
|
||||||
# The in memory data structure for holding intelligence.
|
# The in memory data structure for holding intelligence.
|
||||||
type DataStore: record {
|
type DataStore: record {
|
||||||
net_data: table[subnet] of set[MetaData];
|
host_data: table[addr] of set[MetaData];
|
||||||
string_data: table[string, StrType] of set[MetaData];
|
string_data: table[string, Type] of set[MetaData];
|
||||||
};
|
};
|
||||||
global data_store: DataStore &redef;
|
global data_store: DataStore &redef;
|
||||||
|
|
||||||
|
@ -144,8 +142,8 @@ global data_store: DataStore &redef;
|
||||||
# This is primarily for workers to do the initial quick matches and store
|
# This is primarily for workers to do the initial quick matches and store
|
||||||
# a minimal amount of data for the full match to happen on the manager.
|
# a minimal amount of data for the full match to happen on the manager.
|
||||||
type MinDataStore: record {
|
type MinDataStore: record {
|
||||||
net_data: set[subnet];
|
host_data: set[addr];
|
||||||
string_data: set[string, StrType];
|
string_data: set[string, Type];
|
||||||
};
|
};
|
||||||
global min_data_store: MinDataStore &redef;
|
global min_data_store: MinDataStore &redef;
|
||||||
|
|
||||||
|
@ -157,15 +155,13 @@ event bro_init() &priority=5
|
||||||
|
|
||||||
function find(s: Seen): bool
|
function find(s: Seen): bool
|
||||||
{
|
{
|
||||||
if ( s?$host &&
|
if ( s?$host )
|
||||||
((have_full_data && s$host in data_store$net_data) ||
|
|
||||||
(s$host in min_data_store$net_data)))
|
|
||||||
{
|
{
|
||||||
return T;
|
return ((s$host in min_data_store$host_data) ||
|
||||||
|
(have_full_data && s$host in data_store$host_data));
|
||||||
}
|
}
|
||||||
else if ( s?$str && s?$str_type &&
|
else if ( ([to_lower(s$indicator), s$indicator_type] in min_data_store$string_data) ||
|
||||||
((have_full_data && [s$str, s$str_type] in data_store$string_data) ||
|
(have_full_data && [to_lower(s$indicator), s$indicator_type] in data_store$string_data) )
|
||||||
([s$str, s$str_type] in min_data_store$string_data)))
|
|
||||||
{
|
{
|
||||||
return T;
|
return T;
|
||||||
}
|
}
|
||||||
|
@ -177,8 +173,7 @@ function find(s: Seen): bool
|
||||||
|
|
||||||
function get_items(s: Seen): set[Item]
|
function get_items(s: Seen): set[Item]
|
||||||
{
|
{
|
||||||
local item: Item;
|
local return_data: set[Item];
|
||||||
local return_data: set[Item] = set();
|
|
||||||
|
|
||||||
if ( ! have_full_data )
|
if ( ! have_full_data )
|
||||||
{
|
{
|
||||||
|
@ -191,26 +186,23 @@ function get_items(s: Seen): set[Item]
|
||||||
if ( s?$host )
|
if ( s?$host )
|
||||||
{
|
{
|
||||||
# See if the host is known about and it has meta values
|
# See if the host is known about and it has meta values
|
||||||
if ( s$host in data_store$net_data )
|
if ( s$host in data_store$host_data )
|
||||||
{
|
{
|
||||||
for ( m in data_store$net_data[s$host] )
|
for ( m in data_store$host_data[s$host] )
|
||||||
{
|
{
|
||||||
# TODO: the lookup should be finding all and not just most specific
|
add return_data[Item($indicator=cat(s$host), $indicator_type=ADDR, $meta=m)];
|
||||||
# and $host/$net should have the correct value.
|
|
||||||
item = [$host=s$host, $meta=m];
|
|
||||||
add return_data[item];
|
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
else if ( s?$str && s?$str_type )
|
else
|
||||||
{
|
{
|
||||||
|
local lower_indicator = to_lower(s$indicator);
|
||||||
# See if the string is known about and it has meta values
|
# See if the string is known about and it has meta values
|
||||||
if ( [s$str, s$str_type] in data_store$string_data )
|
if ( [lower_indicator, s$indicator_type] in data_store$string_data )
|
||||||
{
|
{
|
||||||
for ( m in data_store$string_data[s$str, s$str_type] )
|
for ( m in data_store$string_data[lower_indicator, s$indicator_type] )
|
||||||
{
|
{
|
||||||
item = [$str=s$str, $str_type=s$str_type, $meta=m];
|
add return_data[Item($indicator=s$indicator, $indicator_type=s$indicator_type, $meta=m)];
|
||||||
add return_data[item];
|
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
@ -222,6 +214,12 @@ function Intel::seen(s: Seen)
|
||||||
{
|
{
|
||||||
if ( find(s) )
|
if ( find(s) )
|
||||||
{
|
{
|
||||||
|
if ( s?$host )
|
||||||
|
{
|
||||||
|
s$indicator = cat(s$host);
|
||||||
|
s$indicator_type = Intel::ADDR;
|
||||||
|
}
|
||||||
|
|
||||||
if ( have_full_data )
|
if ( have_full_data )
|
||||||
{
|
{
|
||||||
local items = get_items(s);
|
local items = get_items(s);
|
||||||
|
@ -250,8 +248,7 @@ function has_meta(check: MetaData, metas: set[MetaData]): bool
|
||||||
|
|
||||||
event Intel::match(s: Seen, items: set[Item]) &priority=5
|
event Intel::match(s: Seen, items: set[Item]) &priority=5
|
||||||
{
|
{
|
||||||
local empty_set: set[string] = set();
|
local info: Info = [$ts=network_time(), $seen=s];
|
||||||
local info: Info = [$ts=network_time(), $seen=s, $sources=empty_set];
|
|
||||||
|
|
||||||
if ( s?$conn )
|
if ( s?$conn )
|
||||||
{
|
{
|
||||||
|
@ -267,52 +264,37 @@ event Intel::match(s: Seen, items: set[Item]) &priority=5
|
||||||
|
|
||||||
function insert(item: Item)
|
function insert(item: Item)
|
||||||
{
|
{
|
||||||
if ( item?$str && !item?$str_type )
|
|
||||||
{
|
|
||||||
event reporter_warning(network_time(), fmt("You must provide a str_type for strings or this item doesn't make sense. Item: %s", item), "");
|
|
||||||
return;
|
|
||||||
}
|
|
||||||
|
|
||||||
# Create and fill out the meta data item.
|
# Create and fill out the meta data item.
|
||||||
local meta = item$meta;
|
local meta = item$meta;
|
||||||
local metas: set[MetaData];
|
local metas: set[MetaData];
|
||||||
|
|
||||||
if ( item?$host )
|
# All intelligence is case insensitive at the moment.
|
||||||
|
local lower_indicator = to_lower(item$indicator);
|
||||||
|
|
||||||
|
if ( item$indicator_type == ADDR )
|
||||||
{
|
{
|
||||||
local host = mask_addr(item$host, is_v4_addr(item$host) ? 32 : 128);
|
local host = to_addr(item$indicator);
|
||||||
if ( have_full_data )
|
if ( have_full_data )
|
||||||
{
|
{
|
||||||
if ( host !in data_store$net_data )
|
if ( host !in data_store$host_data )
|
||||||
data_store$net_data[host] = set();
|
data_store$host_data[host] = set();
|
||||||
|
|
||||||
metas = data_store$net_data[host];
|
metas = data_store$host_data[host];
|
||||||
}
|
}
|
||||||
|
|
||||||
add min_data_store$net_data[host];
|
add min_data_store$host_data[host];
|
||||||
}
|
}
|
||||||
else if ( item?$net )
|
else
|
||||||
{
|
{
|
||||||
if ( have_full_data )
|
if ( have_full_data )
|
||||||
{
|
{
|
||||||
if ( item$net !in data_store$net_data )
|
if ( [lower_indicator, item$indicator_type] !in data_store$string_data )
|
||||||
data_store$net_data[item$net] = set();
|
data_store$string_data[lower_indicator, item$indicator_type] = set();
|
||||||
|
|
||||||
metas = data_store$net_data[item$net];
|
metas = data_store$string_data[lower_indicator, item$indicator_type];
|
||||||
}
|
}
|
||||||
|
|
||||||
add min_data_store$net_data[item$net];
|
add min_data_store$string_data[lower_indicator, item$indicator_type];
|
||||||
}
|
|
||||||
else if ( item?$str )
|
|
||||||
{
|
|
||||||
if ( have_full_data )
|
|
||||||
{
|
|
||||||
if ( [item$str, item$str_type] !in data_store$string_data )
|
|
||||||
data_store$string_data[item$str, item$str_type] = set();
|
|
||||||
|
|
||||||
metas = data_store$string_data[item$str, item$str_type];
|
|
||||||
}
|
|
||||||
|
|
||||||
add min_data_store$string_data[item$str, item$str_type];
|
|
||||||
}
|
}
|
||||||
|
|
||||||
local updated = F;
|
local updated = F;
|
||||||
|
|
|
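The effect of the Item record change on callers of Intel::insert can be sketched as follows. The example assumes the new indicator and indicator_type fields shown in this diff, and it assumes that MetaData carries the source and desc fields implied by the meta.source and meta.desc columns of the documentation. The values mirror the documentation sample and are otherwise arbitrary.

```bro
@load base/frameworks/intel

event bro_init()
	{
	# Metadata is built separately so each record has an explicit type.
	local m1: Intel::MetaData = [$source="source1", $desc="Sending phishing email"];
	local m2: Intel::MetaData = [$source="source2", $desc="Name used for data exfiltration"];

	# The indicator is always a string now, including IP addresses;
	# indicator_type tells the framework how to interpret it.
	Intel::insert(Intel::Item($indicator="1.2.3.4",
	                          $indicator_type=Intel::ADDR,
	                          $meta=m1));

	# String indicators are lower-cased on insert, so matching is
	# case-insensitive.
	Intel::insert(Intel::Item($indicator="A.B.com",
	                          $indicator_type=Intel::DOMAIN,
	                          $meta=m2));
	}
```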
@ -702,6 +702,7 @@ type entropy_test_result: record {
|
||||||
@load base/bif/strings.bif
|
@load base/bif/strings.bif
|
||||||
@load base/bif/bro.bif
|
@load base/bif/bro.bif
|
||||||
@load base/bif/reporter.bif
|
@load base/bif/reporter.bif
|
||||||
|
@load base/bif/bloom-filter.bif
|
||||||
|
|
||||||
## Deprecated. This is superseded by the new logging framework.
|
## Deprecated. This is superseded by the new logging framework.
|
||||||
global log_file_name: function(tag: string): string &redef;
|
global log_file_name: function(tag: string): string &redef;
|
||||||
|
@ -3047,3 +3048,5 @@ const snaplen = 8192 &redef;
|
||||||
@load base/frameworks/input
|
@load base/frameworks/input
|
||||||
@load base/frameworks/analyzer
|
@load base/frameworks/analyzer
|
||||||
@load base/frameworks/files
|
@load base/frameworks/files
|
||||||
|
|
||||||
|
@load base/bif
|
||||||
|
|
|
@ -5,9 +5,12 @@
|
||||||
##! you actually want.
|
##! you actually want.
|
||||||
|
|
||||||
@load base/utils/site
|
@load base/utils/site
|
||||||
|
@load base/utils/active-http
|
||||||
@load base/utils/addrs
|
@load base/utils/addrs
|
||||||
@load base/utils/conn-ids
|
@load base/utils/conn-ids
|
||||||
|
@load base/utils/dir
|
||||||
@load base/utils/directions-and-hosts
|
@load base/utils/directions-and-hosts
|
||||||
|
@load base/utils/exec
|
||||||
@load base/utils/files
|
@load base/utils/files
|
||||||
@load base/utils/numbers
|
@load base/utils/numbers
|
||||||
@load base/utils/paths
|
@load base/utils/paths
|
||||||
|
|
|
@ -226,7 +226,10 @@ event mime_one_header(c: connection, h: mime_header_rec) &priority=5
|
||||||
{
|
{
|
||||||
if ( ! c$smtp?$to )
|
if ( ! c$smtp?$to )
|
||||||
c$smtp$to = set();
|
c$smtp$to = set();
|
||||||
add c$smtp$to[h$value];
|
|
||||||
|
local to_parts = split(h$value, /[[:blank:]]*,[[:blank:]]*/);
|
||||||
|
for ( i in to_parts )
|
||||||
|
add c$smtp$to[to_parts[i]];
|
||||||
}
|
}
|
||||||
|
|
||||||
else if ( h$name == "X-ORIGINATING-IP" )
|
else if ( h$name == "X-ORIGINATING-IP" )
|
||||||
|
|
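The behavior of the new "To" header handling can be tried in isolation. The sketch below reuses the same split call and pattern as the hunk above on a made-up header value; the addresses and the local variable names are hypothetical.

```bro
event bro_init()
	{
	local to_header = "alice@example.com, bob@example.com ,carol@example.com";

	# Split on commas, swallowing blanks on either side of each comma.
	local to_parts = split(to_header, /[[:blank:]]*,[[:blank:]]*/);

	# Collect the pieces into a set, as the SMTP script does with c$smtp$to.
	local recipients: set[string] = set();
	for ( i in to_parts )
		add recipients[to_parts[i]];

	print |recipients|;
	}
```

Before this change the whole header value was stored as a single set element, so a message with several recipients produced one combined entry instead of one entry per address.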
123
scripts/base/utils/active-http.bro
Normal file
|
@ -0,0 +1,123 @@
|
||||||
|
##! A module for performing active HTTP requests and
|
||||||
|
##! getting the reply at runtime.
|
||||||
|
|
||||||
|
@load ./exec
|
||||||
|
|
||||||
|
module ActiveHTTP;
|
||||||
|
|
||||||
|
export {
|
||||||
|
## The default timeout for HTTP requests.
|
||||||
|
const default_max_time = 1min &redef;
|
||||||
|
|
||||||
|
## The default HTTP method/verb to use for requests.
|
||||||
|
const default_method = "GET" &redef;
|
||||||
|
|
||||||
|
type Response: record {
|
||||||
|
## Numeric response code from the server.
|
||||||
|
code: count;
|
||||||
|
## String response message from the server.
|
||||||
|
msg: string;
|
||||||
|
## Full body of the response.
|
||||||
|
body: string &optional;
|
||||||
|
## All headers returned by the server.
|
||||||
|
headers: table[string] of string &optional;
|
||||||
|
};
|
||||||
|
|
||||||
|
type Request: record {
|
||||||
|
## The URL being requested.
|
||||||
|
url: string;
|
||||||
|
## The HTTP method/verb to use for the request.
|
||||||
|
method: string &default=default_method;
|
||||||
|
## Data to send to the server in the client body. Keep in
|
||||||
|
## mind that you will probably need to set the *method* field
|
||||||
|
## to "POST" or "PUT".
|
||||||
|
client_data: string &optional;
|
||||||
|
## Arbitrary headers to pass to the server. Some headers
|
||||||
|
## will be included by libCurl.
|
||||||
|
#custom_headers: table[string] of string &optional;
|
||||||
|
## Timeout for the request.
|
||||||
|
max_time: interval &default=default_max_time;
|
||||||
|
## Additional curl command line arguments. Be very careful
|
||||||
|
## with this option since shell injection could take place
|
||||||
|
## if careful handling of untrusted data is not applied.
|
||||||
|
addl_curl_args: string &optional;
|
||||||
|
};
|
||||||
|
|
||||||
|
## Perform an HTTP request according to the :bro:type:`Request` record.
|
||||||
|
## This is an asynchronous function and must be called within a "when"
|
||||||
|
## statement.
|
||||||
|
##
|
||||||
|
## req: A record instance representing all options for an HTTP request.
|
||||||
|
##
|
||||||
|
## Returns: A record with the full response message.
|
||||||
|
global request: function(req: ActiveHTTP::Request): ActiveHTTP::Response;
|
||||||
|
}
|
||||||
|
|
||||||
|
function request2curl(r: Request, bodyfile: string, headersfile: string): string
|
||||||
|
{
|
||||||
|
local cmd = fmt("curl -s -g -o \"%s\" -D \"%s\" -X \"%s\"",
|
||||||
|
str_shell_escape(bodyfile),
|
||||||
|
str_shell_escape(headersfile),
|
||||||
|
str_shell_escape(r$method));
|
||||||
|
|
||||||
|
cmd = fmt("%s -m %.0f", cmd, r$max_time);
|
||||||
|
|
||||||
|
if ( r?$client_data )
|
||||||
|
cmd = fmt("%s -d -", cmd);
|
||||||
|
|
||||||
|
if ( r?$addl_curl_args )
|
||||||
|
cmd = fmt("%s %s", cmd, r$addl_curl_args);
|
||||||
|
|
||||||
|
cmd = fmt("%s \"%s\"", cmd, str_shell_escape(r$url));
|
||||||
|
return cmd;
|
||||||
|
}
|
||||||
|
|
||||||
|
function request(req: Request): ActiveHTTP::Response
|
||||||
|
{
|
||||||
|
local tmpfile = "/tmp/bro-activehttp-" + unique_id("");
|
||||||
|
local bodyfile = fmt("%s_body", tmpfile);
|
||||||
|
local headersfile = fmt("%s_headers", tmpfile);
|
||||||
|
|
||||||
|
local cmd = request2curl(req, bodyfile, headersfile);
|
||||||
|
local stdin_data = req?$client_data ? req$client_data : "";
|
||||||
|
|
||||||
|
local resp: Response;
|
||||||
|
resp$code = 0;
|
||||||
|
resp$msg = "";
|
||||||
|
resp$body = "";
|
||||||
|
resp$headers = table();
|
||||||
|
return when ( local result = Exec::run([$cmd=cmd, $stdin=stdin_data, $read_files=set(bodyfile, headersfile)]) )
|
||||||
|
{
|
||||||
|
# If there is no response line then nothing else will work either.
|
||||||
|
if ( ! (result?$files && headersfile in result$files) )
|
||||||
|
{
|
||||||
|
Reporter::error(fmt("There was a failure when requesting \"%s\" with ActiveHTTP.", req$url));
|
||||||
|
return resp;
|
||||||
|
}
|
||||||
|
|
||||||
|
local headers = result$files[headersfile];
|
||||||
|
for ( i in headers )
|
||||||
|
{
|
||||||
|
# The reply is the first line.
|
||||||
|
if ( i == 0 )
|
||||||
|
{
|
||||||
|
local response_line = split_n(headers[0], /[[:blank:]]+/, F, 2);
|
||||||
|
if ( |response_line| != 3 )
|
||||||
|
return resp;
|
||||||
|
|
||||||
|
resp$code = to_count(response_line[2]);
|
||||||
|
resp$msg = response_line[3];
|
||||||
|
resp$body = join_string_vec(result$files[bodyfile], "");
|
||||||
|
}
|
||||||
|
else
|
||||||
|
{
|
||||||
|
local line = headers[i];
|
||||||
|
local h = split1(line, /:/);
|
||||||
|
if ( |h| != 2 )
|
||||||
|
next;
|
||||||
|
resp$headers[h[1]] = sub_bytes(h[2], 0, |h[2]|-1);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
return resp;
|
||||||
|
}
|
||||||
|
}
|
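A caller of the new module might look like the sketch below. It relies only on the Request and Response records and the request function defined in the file above; the URL is a placeholder. Because the function is asynchronous, the call sits inside a "when" statement, as the function's documentation comment requires.

```bro
@load base/utils/active-http

event bro_init()
	{
	# Only $url is required; method and max_time fall back to their defaults.
	local req = ActiveHTTP::Request($url="http://example.com/");

	when ( local resp = ActiveHTTP::request(req) )
		{
		# code stays 0 and msg stays empty if the request failed.
		print resp$code, resp$msg;
		}
	}
```

When Bro runs without a packet source, it may need to be kept alive until the asynchronous result arrives; how that is done is outside the scope of this file.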
66
scripts/base/utils/dir.bro
Normal file
|
@ -0,0 +1,66 @@
|
||||||
|
@load base/utils/exec
|
||||||
|
@load base/frameworks/reporter
|
||||||
|
@load base/utils/paths
|
||||||
|
|
||||||
|
module Dir;
|
||||||
|
|
||||||
|
export {
|
||||||
|
## The default interval this module checks for files in directories when
|
||||||
|
## using the :bro:see:`Dir::monitor` function.
|
||||||
|
const polling_interval = 30sec &redef;
|
||||||
|
|
||||||
|
## Register a directory to monitor with a callback that is called
|
||||||
|
## every time a previously unseen file is seen. If a file is deleted
|
||||||
|
## and seen to be gone, the file is available for being seen again in
|
||||||
|
## the future.
|
||||||
|
##
|
||||||
|
## dir: The directory to monitor for files.
|
||||||
|
##
|
||||||
|
## callback: Callback that gets executed with each file name
|
||||||
|
## that is found. Filenames are provided with the full path.
|
||||||
|
##
|
||||||
|
## poll_interval: An interval at which to check for new files.
|
||||||
|
global monitor: function(dir: string, callback: function(fname: string),
|
||||||
|
poll_interval: interval &default=polling_interval);
|
||||||
|
}
|
||||||
|
|
||||||
|
event Dir::monitor_ev(dir: string, last_files: set[string],
|
||||||
|
callback: function(fname: string),
|
||||||
|
poll_interval: interval)
|
||||||
|
{
|
||||||
|
when ( local result = Exec::run([$cmd=fmt("ls -i \"%s/\"", str_shell_escape(dir))]) )
|
||||||
|
{
|
||||||
|
if ( result$exit_code != 0 )
|
||||||
|
{
|
||||||
|
Reporter::warning(fmt("Requested monitoring of non-existent directory (%s).", dir));
|
||||||
|
return;
|
||||||
|
}
|
||||||
|
|
||||||
|
local current_files: set[string] = set();
|
||||||
|
local files: vector of string = vector();
|
||||||
|
|
||||||
|
if ( result?$stdout )
|
||||||
|
files = result$stdout;
|
||||||
|
|
||||||
|
for ( i in files )
|
||||||
|
{
|
||||||
|
local parts = split1(files[i], / /);
|
||||||
|
if ( parts[1] !in last_files )
|
||||||
|
callback(build_path_compressed(dir, parts[2]));
|
||||||
|
add current_files[parts[1]];
|
||||||
|
}
|
||||||
|
|
||||||
|
schedule poll_interval
|
||||||
|
{
|
||||||
|
Dir::monitor_ev(dir, current_files, callback, poll_interval)
|
||||||
|
};
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
function monitor(dir: string, callback: function(fname: string),
|
||||||
|
poll_interval: interval &default=polling_interval)
|
||||||
|
{
|
||||||
|
event Dir::monitor_ev(dir, set(), callback, poll_interval);
|
||||||
|
}
|
||||||
|
|
||||||
|
|
185
scripts/base/utils/exec.bro
Normal file
|
@ -0,0 +1,185 @@
|
||||||
|
##! A module for executing external command line programs.
|
||||||
|
|
||||||
|
@load base/frameworks/input
|
||||||
|
|
||||||
|
module Exec;
|
||||||
|
|
||||||
|
export {
|
||||||
|
type Command: record {
|
||||||
|
## The command line to execute. Use care to avoid injection attacks.
|
||||||
|
## I.e. if the command uses untrusted/variable data, sanitize
|
||||||
|
## it with str_shell_escape().
|
||||||
|
cmd: string;
|
||||||
|
## Provide standard in to the program as a string.
|
||||||
|
stdin: string &default="";
|
||||||
|
## If additional files are required to be read in as part of the output
|
||||||
|
## of the command they can be defined here.
|
||||||
|
read_files: set[string] &optional;
|
||||||
|
# The unique id for tracking executors.
|
||||||
|
uid: string &default=unique_id("");
|
||||||
|
};
|
||||||
|
|
||||||
|
type Result: record {
|
||||||
|
## Exit code from the program.
|
||||||
|
exit_code: count &default=0;
|
||||||
|
## True if the command was terminated with a signal.
|
||||||
|
signal_exit: bool &default=F;
|
||||||
|
## Each line of standard out.
|
||||||
|
stdout: vector of string &optional;
|
||||||
|
## Each line of standard error.
|
||||||
|
stderr: vector of string &optional;
|
||||||
|
## If additional files were requested to be read in
|
||||||
|
## the content of the files will be available here.
|
||||||
|
files: table[string] of string_vec &optional;
|
||||||
|
};
|
||||||
|
|
||||||
|
## Function for running command line programs and getting
|
||||||
|
## output. This is an asynchronous function which is meant
|
||||||
|
## to be run with the `when` statement.
|
||||||
|
##
|
||||||
|
## cmd: The command to run. Use care to avoid injection attacks!
|
||||||
|
##
|
||||||
|
## returns: A record representing the full results from the
|
||||||
|
## external program execution.
|
||||||
|
global run: function(cmd: Command): Result;
|
||||||
|
|
||||||
|
## The system directory for temp files.
|
||||||
|
const tmp_dir = "/tmp" &redef;
|
||||||
|
}
|
||||||
|
|
||||||
|
# Indexed by command uid.
|
||||||
|
global results: table[string] of Result;
|
||||||
|
global pending_commands: set[string];
|
||||||
|
global pending_files: table[string] of set[string];
|
||||||
|
|
||||||
|
type OneLine: record {
|
||||||
|
s: string;
|
||||||
|
is_stderr: bool;
|
||||||
|
};
|
||||||
|
|
||||||
|
type FileLine: record {
|
||||||
|
s: string;
|
||||||
|
};
|
||||||
|
|
||||||
|
event Exec::line(description: Input::EventDescription, tpe: Input::Event, s: string, is_stderr: bool)
|
||||||
|
{
|
||||||
|
local result = results[description$name];
|
||||||
|
if ( is_stderr )
|
||||||
|
{
|
||||||
|
if ( ! result?$stderr )
|
||||||
|
result$stderr = vector(s);
|
||||||
|
else
|
||||||
|
result$stderr[|result$stderr|] = s;
|
||||||
|
}
|
||||||
|
else
|
||||||
|
{
|
||||||
|
if ( ! result?$stdout )
|
||||||
|
result$stdout = vector(s);
|
||||||
|
else
|
||||||
|
result$stdout[|result$stdout|] = s;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
event Exec::file_line(description: Input::EventDescription, tpe: Input::Event, s: string)
|
||||||
|
{
|
||||||
|
local parts = split1(description$name, /_/);
|
||||||
|
local name = parts[1];
|
||||||
|
local track_file = parts[2];
|
||||||
|
|
||||||
|
local result = results[name];
|
||||||
|
if ( ! result?$files )
|
||||||
|
result$files = table();
|
||||||
|
|
||||||
|
if ( track_file !in result$files )
|
||||||
|
result$files[track_file] = vector(s);
|
||||||
|
else
|
||||||
|
result$files[track_file][|result$files[track_file]|] = s;
|
||||||
|
}
|
||||||
|
|
||||||
|
event Input::end_of_data(name: string, source:string)
|
||||||
|
{
|
||||||
|
local parts = split1(name, /_/);
|
||||||
|
name = parts[1];
|
||||||
|
|
||||||
|
if ( name !in pending_commands || |parts| < 2 )
|
||||||
|
return;
|
||||||
|
|
||||||
|
local track_file = parts[2];
|
||||||
|
|
||||||
|
Input::remove(name);
|
||||||
|
|
||||||
|
if ( name !in pending_files )
|
||||||
|
delete pending_commands[name];
|
||||||
|
else
|
||||||
|
{
|
||||||
|
delete pending_files[name][track_file];
|
||||||
|
if ( |pending_files[name]| == 0 )
|
||||||
|
delete pending_commands[name];
|
||||||
|
system(fmt("rm \"%s\"", str_shell_escape(track_file)));
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
event InputRaw::process_finished(name: string, source:string, exit_code:count, signal_exit:bool)
|
||||||
|
{
|
||||||
|
if ( name !in pending_commands )
|
||||||
|
return;
|
||||||
|
|
||||||
|
Input::remove(name);
|
||||||
|
results[name]$exit_code = exit_code;
|
||||||
|
results[name]$signal_exit = signal_exit;
|
||||||
|
|
||||||
|
if ( name !in pending_files || |pending_files[name]| == 0 )
|
||||||
|
# No extra files to read, command is done.
|
||||||
|
delete pending_commands[name];
|
||||||
|
else
|
||||||
|
for ( read_file in pending_files[name] )
|
||||||
|
Input::add_event([$source=fmt("%s", read_file),
|
||||||
|
$name=fmt("%s_%s", name, read_file),
|
||||||
|
$reader=Input::READER_RAW,
|
||||||
|
$want_record=F,
|
||||||
|
$fields=FileLine,
|
||||||
|
$ev=Exec::file_line]);
|
||||||
|
}
|
||||||
|
|
||||||
|
function run(cmd: Command): Result
|
||||||
|
{
|
||||||
|
add pending_commands[cmd$uid];
|
||||||
|
results[cmd$uid] = [];
|
||||||
|
|
||||||
|
if ( cmd?$read_files )
|
||||||
|
{
|
||||||
|
for ( read_file in cmd$read_files )
|
||||||
|
{
|
||||||
|
if ( cmd$uid !in pending_files )
|
||||||
|
pending_files[cmd$uid] = set();
|
||||||
|
add pending_files[cmd$uid][read_file];
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
local config_strings: table[string] of string = {
|
||||||
|
["stdin"] = cmd$stdin,
|
||||||
|
["read_stderr"] = "1",
|
||||||
|
};
|
||||||
|
Input::add_event([$name=cmd$uid,
|
||||||
|
$source=fmt("%s |", cmd$cmd),
|
||||||
|
$reader=Input::READER_RAW,
|
||||||
|
$fields=Exec::OneLine,
|
||||||
|
$ev=Exec::line,
|
||||||
|
$want_record=F,
|
||||||
|
$config=config_strings]);
|
||||||
|
|
||||||
|
return when ( cmd$uid !in pending_commands )
|
||||||
|
{
|
||||||
|
local result = results[cmd$uid];
|
||||||
|
delete results[cmd$uid];
|
||||||
|
return result;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
event bro_done()
|
||||||
|
{
|
||||||
|
# We are punting here and just deleting any unprocessed files.
|
||||||
|
for ( uid in pending_files )
|
||||||
|
for ( fname in pending_files[uid] )
|
||||||
|
system(fmt("rm \"%s\"", str_shell_escape(fname)));
|
||||||
|
}
|
|
@ -1,8 +0,0 @@
|
||||||
@load base/frameworks/intel
|
|
||||||
@load ./where-locations
|
|
||||||
|
|
||||||
event connection_established(c: connection)
|
|
||||||
{
|
|
||||||
Intel::seen([$host=c$id$orig_h, $conn=c, $where=Conn::IN_ORIG]);
|
|
||||||
Intel::seen([$host=c$id$resp_h, $conn=c, $where=Conn::IN_RESP]);
|
|
||||||
}
|
|
44
scripts/policy/frameworks/intel/do_notice.bro
Normal file
@ -0,0 +1,44 @@

@load base/frameworks/intel
@load base/frameworks/notice

module Intel;

export {
	redef enum Notice::Type += {
		## Intel::Notice is a notice that happens when an intelligence
		## indicator is denoted to be notice-worthy.
		Intel::Notice
	};

	redef record Intel::MetaData += {
		## A boolean value to allow the data itself to represent
		## if the indicator that this metadata is attached to
		## is notice worthy.
		do_notice: bool &default=F;

		## Restrictions on when notices are created to only create
		## them if the do_notice field is T and the notice was
		## seen in the indicated location.
		if_in: Intel::Where &optional;
	};
}

event Intel::match(s: Seen, items: set[Item])
	{
	for ( item in items )
		{
		if ( item$meta$do_notice &&
		     (! item$meta?$if_in || s$where == item$meta$if_in) )
			{
			local n = Notice::Info($note=Intel::Notice,
			                       $msg=fmt("Intel hit on %s at %s", s$indicator, s$where),
			                       $sub=s$indicator);

			if ( s?$conn )
				n$conn = s$conn;

			NOTICE(n);
			}
		}
	}
12
scripts/policy/frameworks/intel/seen/conn-established.bro
Normal file
@ -0,0 +1,12 @@
@load base/frameworks/intel
@load ./where-locations

event connection_established(c: connection)
	{
	if ( c$orig$state == TCP_ESTABLISHED &&
	     c$resp$state == TCP_ESTABLISHED )
		{
		Intel::seen([$host=c$id$orig_h, $conn=c, $where=Conn::IN_ORIG]);
		Intel::seen([$host=c$id$resp_h, $conn=c, $where=Conn::IN_RESP]);
		}
	}
@ -3,8 +3,8 @@
|
||||||
|
|
||||||
event dns_request(c: connection, msg: dns_msg, query: string, qtype: count, qclass: count)
|
event dns_request(c: connection, msg: dns_msg, query: string, qtype: count, qclass: count)
|
||||||
{
|
{
|
||||||
Intel::seen([$str=query,
|
Intel::seen([$indicator=query,
|
||||||
$str_type=Intel::DOMAIN,
|
$indicator_type=Intel::DOMAIN,
|
||||||
$conn=c,
|
$conn=c,
|
||||||
$where=DNS::IN_REQUEST]);
|
$where=DNS::IN_REQUEST]);
|
||||||
}
|
}
|
|
@ -4,8 +4,8 @@
|
||||||
event http_header(c: connection, is_orig: bool, name: string, value: string)
|
event http_header(c: connection, is_orig: bool, name: string, value: string)
|
||||||
{
|
{
|
||||||
if ( is_orig && name == "HOST" )
|
if ( is_orig && name == "HOST" )
|
||||||
Intel::seen([$str=value,
|
Intel::seen([$indicator=value,
|
||||||
$str_type=Intel::DOMAIN,
|
$indicator_type=Intel::DOMAIN,
|
||||||
$conn=c,
|
$conn=c,
|
||||||
$where=HTTP::IN_HOST_HEADER]);
|
$where=HTTP::IN_HOST_HEADER]);
|
||||||
}
|
}
|
|
@ -5,8 +5,8 @@
|
||||||
event http_message_done(c: connection, is_orig: bool, stat: http_message_stat)
|
event http_message_done(c: connection, is_orig: bool, stat: http_message_stat)
|
||||||
{
|
{
|
||||||
if ( is_orig && c?$http )
|
if ( is_orig && c?$http )
|
||||||
Intel::seen([$str=HTTP::build_url(c$http),
|
Intel::seen([$indicator=HTTP::build_url(c$http),
|
||||||
$str_type=Intel::URL,
|
$indicator_type=Intel::URL,
|
||||||
$conn=c,
|
$conn=c,
|
||||||
$where=HTTP::IN_URL]);
|
$where=HTTP::IN_URL]);
|
||||||
}
|
}
|
|
@ -4,8 +4,8 @@
|
||||||
event http_header(c: connection, is_orig: bool, name: string, value: string)
|
event http_header(c: connection, is_orig: bool, name: string, value: string)
|
||||||
{
|
{
|
||||||
if ( is_orig && name == "USER-AGENT" )
|
if ( is_orig && name == "USER-AGENT" )
|
||||||
Intel::seen([$str=value,
|
Intel::seen([$indicator=value,
|
||||||
$str_type=Intel::USER_AGENT,
|
$indicator_type=Intel::SOFTWARE,
|
||||||
$conn=c,
|
$conn=c,
|
||||||
$where=HTTP::IN_USER_AGENT_HEADER]);
|
$where=HTTP::IN_USER_AGENT_HEADER]);
|
||||||
}
|
}
|
|
@ -14,8 +14,8 @@ event intel_mime_data(f: fa_file, data: string)
|
||||||
local urls = find_all_urls_without_scheme(data);
|
local urls = find_all_urls_without_scheme(data);
|
||||||
for ( url in urls )
|
for ( url in urls )
|
||||||
{
|
{
|
||||||
Intel::seen([$str=url,
|
Intel::seen([$indicator=url,
|
||||||
$str_type=Intel::URL,
|
$indicator_type=Intel::URL,
|
||||||
$conn=c,
|
$conn=c,
|
||||||
$where=SMTP::IN_MESSAGE]);
|
$where=SMTP::IN_MESSAGE]);
|
||||||
}
|
}
|
97
scripts/policy/frameworks/intel/seen/smtp.bro
Normal file
|
@ -0,0 +1,97 @@
|
||||||
|
@load base/frameworks/intel
|
||||||
|
@load base/protocols/smtp
|
||||||
|
@load ./where-locations
|
||||||
|
|
||||||
|
event mime_end_entity(c: connection)
|
||||||
|
{
|
||||||
|
if ( c?$smtp )
|
||||||
|
{
|
||||||
|
if ( c$smtp?$path )
|
||||||
|
{
|
||||||
|
local path = c$smtp$path;
|
||||||
|
for ( i in path )
|
||||||
|
{
|
||||||
|
Intel::seen([$host=path[i],
|
||||||
|
$conn=c,
|
||||||
|
$where=SMTP::IN_RECEIVED_HEADER]);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
if ( c$smtp?$user_agent )
|
||||||
|
Intel::seen([$indicator=c$smtp$user_agent,
|
||||||
|
$indicator_type=Intel::SOFTWARE,
|
||||||
|
$conn=c,
|
||||||
|
$where=SMTP::IN_HEADER]);
|
||||||
|
|
||||||
|
if ( c$smtp?$x_originating_ip )
|
||||||
|
Intel::seen([$host=c$smtp$x_originating_ip,
|
||||||
|
$conn=c,
|
||||||
|
$where=SMTP::IN_X_ORIGINATING_IP_HEADER]);
|
||||||
|
|
||||||
|
if ( c$smtp?$mailfrom )
|
||||||
|
{
|
||||||
|
local mailfromparts = split_n(c$smtp$mailfrom, /<.+>/, T, 1);
|
||||||
|
if ( |mailfromparts| > 2 )
|
||||||
|
{
|
||||||
|
Intel::seen([$indicator=mailfromparts[2][1:-2],
|
||||||
|
$indicator_type=Intel::EMAIL,
|
||||||
|
$conn=c,
|
||||||
|
$where=SMTP::IN_MAIL_FROM]);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
if ( c$smtp?$rcptto )
|
||||||
|
{
|
||||||
|
for ( rcptto in c$smtp$rcptto )
|
||||||
|
{
|
||||||
|
local rcpttoparts = split_n(rcptto, /<.+>/, T, 1);
|
||||||
|
if ( |rcpttoparts| > 2 )
|
||||||
|
{
|
||||||
|
Intel::seen([$indicator=rcpttoparts[2][1:-2],
|
||||||
|
$indicator_type=Intel::EMAIL,
|
||||||
|
$conn=c,
|
||||||
|
$where=SMTP::IN_RCPT_TO]);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
if ( c$smtp?$from )
|
||||||
|
{
|
||||||
|
local fromparts = split_n(c$smtp$from, /<.+>/, T, 1);
|
||||||
|
if ( |fromparts| > 2 )
|
||||||
|
{
|
||||||
|
Intel::seen([$indicator=fromparts[2][1:-2],
|
||||||
|
$indicator_type=Intel::EMAIL,
|
||||||
|
$conn=c,
|
||||||
|
$where=SMTP::IN_FROM]);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
if ( c$smtp?$to )
|
||||||
|
{
|
||||||
|
for ( email_to in c$smtp$to )
|
||||||
|
{
|
||||||
|
local toparts = split_n(email_to, /<.+>/, T, 1);
|
||||||
|
if ( |toparts| > 2 )
|
||||||
|
{
|
||||||
|
Intel::seen([$indicator=toparts[2][1:-2],
|
||||||
|
$indicator_type=Intel::EMAIL,
|
||||||
|
$conn=c,
|
||||||
|
$where=SMTP::IN_TO]);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
if ( c$smtp?$reply_to )
|
||||||
|
{
|
||||||
|
local replytoparts = split_n(c$smtp$reply_to, /<.+>/, T, 1);
|
||||||
|
if ( |replytoparts| > 2 )
|
||||||
|
{
|
||||||
|
Intel::seen([$indicator=replytoparts[2][1:-2],
|
||||||
|
$indicator_type=Intel::EMAIL,
|
||||||
|
$conn=c,
|
||||||
|
$where=SMTP::IN_REPLY_TO]);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
|
@ -10,14 +10,14 @@ event x509_certificate(c: connection, is_orig: bool, cert: X509, chain_idx: coun
|
||||||
{
|
{
|
||||||
local email = sub(cert$subject, /^.*emailAddress=/, "");
|
local email = sub(cert$subject, /^.*emailAddress=/, "");
|
||||||
email = sub(email, /,.*$/, "");
|
email = sub(email, /,.*$/, "");
|
||||||
Intel::seen([$str=email,
|
Intel::seen([$indicator=email,
|
||||||
$str_type=Intel::EMAIL,
|
$indicator_type=Intel::EMAIL,
|
||||||
$conn=c,
|
$conn=c,
|
||||||
$where=(is_orig ? SSL::IN_CLIENT_CERT : SSL::IN_SERVER_CERT)]);
|
$where=(is_orig ? SSL::IN_CLIENT_CERT : SSL::IN_SERVER_CERT)]);
|
||||||
}
|
}
|
||||||
|
|
||||||
Intel::seen([$str=sha1_hash(der_cert),
|
Intel::seen([$indicator=sha1_hash(der_cert),
|
||||||
$str_type=Intel::CERT_HASH,
|
$indicator_type=Intel::CERT_HASH,
|
||||||
$conn=c,
|
$conn=c,
|
||||||
$where=(is_orig ? SSL::IN_CLIENT_CERT : SSL::IN_SERVER_CERT)]);
|
$where=(is_orig ? SSL::IN_CLIENT_CERT : SSL::IN_SERVER_CERT)]);
|
||||||
}
|
}
|
||||||
|
@ -27,8 +27,8 @@ event ssl_extension(c: connection, is_orig: bool, code: count, val: string)
|
||||||
{
|
{
|
||||||
if ( is_orig && SSL::extensions[code] == "server_name" &&
|
if ( is_orig && SSL::extensions[code] == "server_name" &&
|
||||||
c?$ssl && c$ssl?$server_name )
|
c?$ssl && c$ssl?$server_name )
|
||||||
Intel::seen([$str=c$ssl$server_name,
|
Intel::seen([$indicator=c$ssl$server_name,
|
||||||
$str_type=Intel::DOMAIN,
|
$indicator_type=Intel::DOMAIN,
|
||||||
$conn=c,
|
$conn=c,
|
||||||
$where=SSL::IN_SERVER_NAME]);
|
$where=SSL::IN_SERVER_NAME]);
|
||||||
}
|
}
|
|
@ -1,71 +0,0 @@
|
||||||
@load base/frameworks/intel
|
|
||||||
@load base/protocols/smtp
|
|
||||||
@load ./where-locations
|
|
||||||
|
|
||||||
event mime_end_entity(c: connection)
|
|
||||||
{
|
|
||||||
if ( c?$smtp )
|
|
||||||
{
|
|
||||||
if ( c$smtp?$path )
|
|
||||||
{
|
|
||||||
local path = c$smtp$path;
|
|
||||||
for ( i in path )
|
|
||||||
{
|
|
||||||
Intel::seen([$host=path[i],
|
|
||||||
$conn=c,
|
|
||||||
$where=SMTP::IN_RECEIVED_HEADER]);
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
if ( c$smtp?$user_agent )
|
|
||||||
Intel::seen([$str=c$smtp$user_agent,
|
|
||||||
$str_type=Intel::USER_AGENT,
|
|
||||||
$conn=c,
|
|
||||||
$where=SMTP::IN_HEADER]);
|
|
||||||
|
|
||||||
if ( c$smtp?$x_originating_ip )
|
|
||||||
Intel::seen([$host=c$smtp$x_originating_ip,
|
|
||||||
$conn=c,
|
|
||||||
$where=SMTP::IN_X_ORIGINATING_IP_HEADER]);
|
|
||||||
|
|
||||||
if ( c$smtp?$mailfrom )
|
|
||||||
Intel::seen([$str=c$smtp$mailfrom,
|
|
||||||
$str_type=Intel::EMAIL,
|
|
||||||
$conn=c,
|
|
||||||
$where=SMTP::IN_MAIL_FROM]);
|
|
||||||
|
|
||||||
if ( c$smtp?$rcptto )
|
|
||||||
{
|
|
||||||
for ( rcptto in c$smtp$rcptto )
|
|
||||||
{
|
|
||||||
Intel::seen([$str=rcptto,
|
|
||||||
$str_type=Intel::EMAIL,
|
|
||||||
$conn=c,
|
|
||||||
$where=SMTP::IN_RCPT_TO]);
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
if ( c$smtp?$from )
|
|
||||||
Intel::seen([$str=c$smtp$from,
|
|
||||||
$str_type=Intel::EMAIL,
|
|
||||||
$conn=c,
|
|
||||||
$where=SMTP::IN_FROM]);
|
|
||||||
|
|
||||||
if ( c$smtp?$to )
|
|
||||||
{
|
|
||||||
for ( email_to in c$smtp$to )
|
|
||||||
{
|
|
||||||
Intel::seen([$str=email_to,
|
|
||||||
$str_type=Intel::EMAIL,
|
|
||||||
$conn=c,
|
|
||||||
$where=SMTP::IN_TO]);
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
if ( c$smtp?$reply_to )
|
|
||||||
Intel::seen([$str=c$smtp$reply_to,
|
|
||||||
$str_type=Intel::EMAIL,
|
|
||||||
$conn=c,
|
|
||||||
$where=SMTP::IN_REPLY_TO]);
|
|
||||||
}
|
|
||||||
}
|
|
|
@ -58,10 +58,6 @@ event bro_init()
|
||||||
$msg=fmt("%s appears to be guessing SSH passwords (seen in %d connections).", key$host, r$num),
|
$msg=fmt("%s appears to be guessing SSH passwords (seen in %d connections).", key$host, r$num),
|
||||||
$src=key$host,
|
$src=key$host,
|
||||||
$identifier=cat(key$host)]);
|
$identifier=cat(key$host)]);
|
||||||
# Insert the guesser into the intel framework.
|
|
||||||
Intel::insert([$host=key$host,
|
|
||||||
$meta=[$source="local",
|
|
||||||
$desc=fmt("Bro observed %d apparently failed SSH connections.", r$num)]]);
|
|
||||||
}]);
|
}]);
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|
|
@ -14,18 +14,19 @@
|
||||||
# @load frameworks/control/controller.bro
|
# @load frameworks/control/controller.bro
|
||||||
@load frameworks/dpd/detect-protocols.bro
|
@load frameworks/dpd/detect-protocols.bro
|
||||||
@load frameworks/dpd/packet-segment-logging.bro
|
@load frameworks/dpd/packet-segment-logging.bro
|
||||||
|
@load frameworks/intel/do_notice.bro
|
||||||
|
@load frameworks/intel/seen/__load__.bro
|
||||||
|
@load frameworks/intel/seen/conn-established.bro
|
||||||
|
@load frameworks/intel/seen/dns.bro
|
||||||
|
@load frameworks/intel/seen/http-host-header.bro
|
||||||
|
@load frameworks/intel/seen/http-url.bro
|
||||||
|
@load frameworks/intel/seen/http-user-agents.bro
|
||||||
|
@load frameworks/intel/seen/smtp-url-extraction.bro
|
||||||
|
@load frameworks/intel/seen/smtp.bro
|
||||||
|
@load frameworks/intel/seen/ssl.bro
|
||||||
|
@load frameworks/intel/seen/where-locations.bro
|
||||||
@load frameworks/files/detect-MHR.bro
|
@load frameworks/files/detect-MHR.bro
|
||||||
@load frameworks/files/hash-all-files.bro
|
@load frameworks/files/hash-all-files.bro
|
||||||
@load frameworks/intel/__load__.bro
|
|
||||||
@load frameworks/intel/conn-established.bro
|
|
||||||
@load frameworks/intel/dns.bro
|
|
||||||
@load frameworks/intel/http-host-header.bro
|
|
||||||
@load frameworks/intel/http-url.bro
|
|
||||||
@load frameworks/intel/http-user-agents.bro
|
|
||||||
@load frameworks/intel/smtp-url-extraction.bro
|
|
||||||
@load frameworks/intel/smtp.bro
|
|
||||||
@load frameworks/intel/ssl.bro
|
|
||||||
@load frameworks/intel/where-locations.bro
|
|
||||||
@load frameworks/packet-filter/shunt.bro
|
@load frameworks/packet-filter/shunt.bro
|
||||||
@load frameworks/software/version-changes.bro
|
@load frameworks/software/version-changes.bro
|
||||||
@load frameworks/software/vulnerable.bro
|
@load frameworks/software/vulnerable.bro
|
||||||
|
|
|
@ -6,6 +6,9 @@ include_directories(BEFORE
|
||||||
# This collects generated bif and pac files from subdirectories.
|
# This collects generated bif and pac files from subdirectories.
|
||||||
set(bro_ALL_GENERATED_OUTPUTS CACHE INTERNAL "automatically generated files" FORCE)
|
set(bro_ALL_GENERATED_OUTPUTS CACHE INTERNAL "automatically generated files" FORCE)
|
||||||
|
|
||||||
|
# This collects bif inputs that we'll load automatically.
|
||||||
|
set(bro_AUTO_BIFS CACHE INTERNAL "BIFs for automatic inclusion" FORCE)
|
||||||
|
|
||||||
# If TRUE, use CMake's object libraries for sub-directories instead of
|
# If TRUE, use CMake's object libraries for sub-directories instead of
|
||||||
# static libraries. This requires CMake >= 2.8.8.
|
# static libraries. This requires CMake >= 2.8.8.
|
||||||
set(bro_HAVE_OBJECT_LIBRARIES FALSE)
|
set(bro_HAVE_OBJECT_LIBRARIES FALSE)
|
||||||
|
@ -150,6 +153,7 @@ set(bro_PLUGIN_LIBS CACHE INTERNAL "plugin libraries" FORCE)
|
||||||
|
|
||||||
add_subdirectory(analyzer)
|
add_subdirectory(analyzer)
|
||||||
add_subdirectory(file_analysis)
|
add_subdirectory(file_analysis)
|
||||||
|
add_subdirectory(probabilistic)
|
||||||
|
|
||||||
set(bro_SUBDIRS
|
set(bro_SUBDIRS
|
||||||
${bro_SUBDIR_LIBS}
|
${bro_SUBDIR_LIBS}
|
||||||
|
@ -383,8 +387,21 @@ set(BRO_EXE bro
|
||||||
CACHE STRING "Bro executable binary" FORCE)
|
CACHE STRING "Bro executable binary" FORCE)
|
||||||
|
|
||||||
# Target to create all the autogenerated files.
|
# Target to create all the autogenerated files.
|
||||||
|
add_custom_target(generate_outputs_stage1)
|
||||||
|
add_dependencies(generate_outputs_stage1 ${bro_ALL_GENERATED_OUTPUTS})
|
||||||
|
|
||||||
|
# Target to create the joint includes files that pull in the bif code.
|
||||||
|
bro_bif_create_includes(generate_outputs_stage2 ${CMAKE_CURRENT_BINARY_DIR} "${bro_AUTO_BIFS}")
|
||||||
|
add_dependencies(generate_outputs_stage2 generate_outputs_stage1)
|
||||||
|
|
||||||
|
# Global target to trigger creation of autogenerated code.
|
||||||
add_custom_target(generate_outputs)
|
add_custom_target(generate_outputs)
|
||||||
add_dependencies(generate_outputs ${bro_ALL_GENERATED_OUTPUTS})
|
add_dependencies(generate_outputs generate_outputs_stage2)
|
||||||
|
|
||||||
|
# Build __load__.bro files for standard *.bif.bro.
|
||||||
|
bro_bif_create_loader(bif_loader ${CMAKE_BINARY_DIR}/scripts/base/bif)
|
||||||
|
add_dependencies(bif_loader ${bro_SUBDIRS})
|
||||||
|
add_dependencies(bro bif_loader)
|
||||||
|
|
||||||
# Build __load__.bro files for plugins/*.bif.bro.
|
# Build __load__.bro files for plugins/*.bif.bro.
|
||||||
bro_bif_create_loader(bif_loader_plugins ${CMAKE_BINARY_DIR}/scripts/base/bif/plugins)
|
bro_bif_create_loader(bif_loader_plugins ${CMAKE_BINARY_DIR}/scripts/base/bif/plugins)
|
||||||
|
|
|
@ -560,6 +560,8 @@ void builtin_error(const char* msg, BroObj* arg)
|
||||||
#include "reporter.bif.func_def"
|
#include "reporter.bif.func_def"
|
||||||
#include "strings.bif.func_def"
|
#include "strings.bif.func_def"
|
||||||
|
|
||||||
|
#include "__all__.bif.cc" // Autogenerated for compiling in the bif_target() code.
|
||||||
|
|
||||||
void init_builtin_funcs()
|
void init_builtin_funcs()
|
||||||
{
|
{
|
||||||
bro_resources = internal_type("bro_resources")->AsRecordType();
|
bro_resources = internal_type("bro_resources")->AsRecordType();
|
||||||
|
@ -574,6 +576,8 @@ void init_builtin_funcs()
|
||||||
#include "reporter.bif.func_init"
|
#include "reporter.bif.func_init"
|
||||||
#include "strings.bif.func_init"
|
#include "strings.bif.func_init"
|
||||||
|
|
||||||
|
#include "__all__.bif.init.cc" // Autogenerated for compiling in the bif_target() code.
|
||||||
|
|
||||||
did_builtin_init = true;
|
did_builtin_init = true;
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|
126
src/H3.h
|
@ -49,69 +49,95 @@
|
||||||
// hash a substring of the data. Hashes of substrings can be bitwise-XOR'ed
|
// hash a substring of the data. Hashes of substrings can be bitwise-XOR'ed
|
||||||
// together to get the same result as hashing the full string.
|
// together to get the same result as hashing the full string.
|
||||||
// Any number of hash functions can be created by creating new instances of H3,
|
// Any number of hash functions can be created by creating new instances of H3,
|
||||||
// with the same or different template parameters. The hash function is
|
// with the same or different template parameters. The hash function
|
||||||
// randomly generated using bro_random(); you must call init_random_seed()
|
// constructor takes a seed as argument which defaults to a call to
|
||||||
// before the H3 constructor if you wish to seed it.
|
// bro_random().
|
||||||
|
|
||||||
|
|
||||||
#ifndef H3_H
|
#ifndef H3_H
|
||||||
#define H3_H
|
#define H3_H
|
||||||
|
|
||||||
#include <climits>
|
#include <climits>
|
||||||
|
#include <cstring>
|
||||||
|
|
||||||
// The number of values representable by a byte.
|
// The number of values representable by a byte.
|
||||||
#define H3_BYTE_RANGE (UCHAR_MAX+1)
|
#define H3_BYTE_RANGE (UCHAR_MAX+1)
|
||||||
|
|
||||||
template<class T, int N> class H3 {
|
template <typename T, int N>
|
||||||
T byte_lookup[N][H3_BYTE_RANGE];
|
class H3 {
|
||||||
public:
|
public:
|
||||||
H3();
|
H3()
|
||||||
T operator()(const void* data, size_t size, size_t offset = 0) const
|
{
|
||||||
{
|
Init(false, 0);
|
||||||
const unsigned char *p = static_cast<const unsigned char*>(data);
|
}
|
||||||
T result = 0;
|
|
||||||
|
|
||||||
// loop optimized with Duff's Device
|
H3(T seed)
|
||||||
register unsigned n = (size + 7) / 8;
|
{
|
||||||
switch (size % 8) {
|
Init(true, seed);
|
||||||
case 0: do { result ^= byte_lookup[offset++][*p++];
|
}
|
||||||
case 7: result ^= byte_lookup[offset++][*p++];
|
|
||||||
case 6: result ^= byte_lookup[offset++][*p++];
|
|
||||||
case 5: result ^= byte_lookup[offset++][*p++];
|
|
||||||
case 4: result ^= byte_lookup[offset++][*p++];
|
|
||||||
case 3: result ^= byte_lookup[offset++][*p++];
|
|
||||||
case 2: result ^= byte_lookup[offset++][*p++];
|
|
||||||
case 1: result ^= byte_lookup[offset++][*p++];
|
|
||||||
} while (--n > 0);
|
|
||||||
}
|
|
||||||
|
|
||||||
return result;
|
void Init(bool have_seed, T seed)
|
||||||
}
|
{
|
||||||
|
T bit_lookup[N * CHAR_BIT];
|
||||||
|
|
||||||
|
for ( size_t bit = 0; bit < N * CHAR_BIT; bit++ )
|
||||||
|
{
|
||||||
|
bit_lookup[bit] = 0;
|
||||||
|
for ( size_t i = 0; i < sizeof(T)/2; i++ )
|
||||||
|
{
|
||||||
|
seed = have_seed ? bro_prng(seed) : bro_random();
|
||||||
|
// assume random() returns at least 16 random bits
|
||||||
|
bit_lookup[bit] = (bit_lookup[bit] << 16) | (seed & 0xFFFF);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
for ( size_t byte = 0; byte < N; byte++ )
|
||||||
|
{
|
||||||
|
for ( unsigned val = 0; val < H3_BYTE_RANGE; val++ )
|
||||||
|
{
|
||||||
|
byte_lookup[byte][val] = 0;
|
||||||
|
for ( size_t bit = 0; bit < CHAR_BIT; bit++ )
|
||||||
|
// Does this mean byte_lookup[*][0] == 0? -RP
|
||||||
|
if (val & (1 << bit))
|
||||||
|
byte_lookup[byte][val] ^= bit_lookup[byte*CHAR_BIT+bit];
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
T operator()(const void* data, size_t size, size_t offset = 0) const
|
||||||
|
{
|
||||||
|
const unsigned char *p = static_cast<const unsigned char*>(data);
|
||||||
|
T result = 0;
|
||||||
|
|
||||||
|
// loop optimized with Duff's Device
|
||||||
|
register unsigned n = (size + 7) / 8;
|
||||||
|
switch ( size % 8 ) {
|
||||||
|
case 0: do { result ^= byte_lookup[offset++][*p++];
|
||||||
|
case 7: result ^= byte_lookup[offset++][*p++];
|
||||||
|
case 6: result ^= byte_lookup[offset++][*p++];
|
||||||
|
case 5: result ^= byte_lookup[offset++][*p++];
|
||||||
|
case 4: result ^= byte_lookup[offset++][*p++];
|
||||||
|
case 3: result ^= byte_lookup[offset++][*p++];
|
||||||
|
case 2: result ^= byte_lookup[offset++][*p++];
|
||||||
|
case 1: result ^= byte_lookup[offset++][*p++];
|
||||||
|
} while ( --n > 0 );
|
||||||
|
}
|
||||||
|
|
||||||
|
return result;
|
||||||
|
}
|
||||||
|
|
||||||
|
friend bool operator==(const H3& x, const H3& y)
|
||||||
|
{
|
||||||
|
return ! std::memcmp(x.byte_lookup, y.byte_lookup, sizeof(x.byte_lookup));
|
||||||
|
}
|
||||||
|
|
||||||
|
friend bool operator!=(const H3& x, const H3& y)
|
||||||
|
{
|
||||||
|
return ! (x == y);
|
||||||
|
}
|
||||||
|
|
||||||
|
private:
|
||||||
|
T byte_lookup[N][H3_BYTE_RANGE];
|
||||||
};
|
};
|
||||||
|
|
||||||
template<class T, int N>
|
|
||||||
H3<T,N>::H3()
|
|
||||||
{
|
|
||||||
T bit_lookup[N * CHAR_BIT];
|
|
||||||
|
|
||||||
for (size_t bit = 0; bit < N * CHAR_BIT; bit++) {
|
|
||||||
bit_lookup[bit] = 0;
|
|
||||||
for (size_t i = 0; i < sizeof(T)/2; i++) {
|
|
||||||
// assume random() returns at least 16 random bits
|
|
||||||
bit_lookup[bit] = (bit_lookup[bit] << 16) | (bro_random() & 0xFFFF);
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
for (size_t byte = 0; byte < N; byte++) {
|
|
||||||
for (unsigned val = 0; val < H3_BYTE_RANGE; val++) {
|
|
||||||
byte_lookup[byte][val] = 0;
|
|
||||||
for (size_t bit = 0; bit < CHAR_BIT; bit++) {
|
|
||||||
// Does this mean byte_lookup[*][0] == 0? -RP
|
|
||||||
if (val & (1 << bit))
|
|
||||||
byte_lookup[byte][val] ^= bit_lookup[byte*CHAR_BIT+bit];
|
|
||||||
}
|
|
||||||
}
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
#endif //H3_H
|
#endif //H3_H
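The header comment in H3.h says that hashes of substrings can be XOR'ed together to give the same result as hashing the full string. The sketch below is one way to illustrate that property; it is not the Bro implementation. `SketchH3`, its xorshift generator (standing in for `bro_random()`/`bro_prng()`), and the helper `SplitMatchesWhole` are hypothetical names introduced only for this example, and the Duff's Device loop is replaced by a plain loop for readability.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical, simplified stand-in for the H3 class shown in the diff.
// Assumption: a small xorshift PRNG replaces bro_random()/bro_prng().
template <typename T, int N>
class SketchH3 {
public:
    explicit SketchH3(std::uint32_t seed)
        {
        std::uint32_t state = seed ? seed : 1u; // xorshift must not start at 0

        // One random word per input bit position, built 16 bits at a time.
        T bit_lookup[N * CHAR_BIT];
        for ( int bit = 0; bit < N * CHAR_BIT; ++bit )
            {
            bit_lookup[bit] = 0;
            for ( std::size_t i = 0; i < sizeof(T) / 2; ++i )
                bit_lookup[bit] = static_cast<T>((bit_lookup[bit] << 16) |
                                                 (Next(state) & 0xFFFF));
            }

        // Per-byte table: XOR of the words belonging to the bits set in val.
        for ( int byte = 0; byte < N; ++byte )
            for ( unsigned val = 0; val <= UCHAR_MAX; ++val )
                {
                byte_lookup[byte][val] = 0;
                for ( int bit = 0; bit < CHAR_BIT; ++bit )
                    if ( val & (1u << bit) )
                        byte_lookup[byte][val] ^= bit_lookup[byte * CHAR_BIT + bit];
                }
        }

    // Hashes size bytes of data as if they started at byte position offset.
    T operator()(const void* data, std::size_t size, std::size_t offset = 0) const
        {
        assert(offset + size <= static_cast<std::size_t>(N));
        const unsigned char* p = static_cast<const unsigned char*>(data);
        T result = 0;
        for ( std::size_t i = 0; i < size; ++i )
            result ^= byte_lookup[offset + i][p[i]];
        return result;
        }

private:
    static std::uint32_t Next(std::uint32_t& s)
        {
        s ^= s << 13;
        s ^= s >> 17;
        s ^= s << 5;
        return s;
        }

    T byte_lookup[N][UCHAR_MAX + 1];
};

// Returns true if hashing s in two pieces (split at index split) and XOR-ing
// the partial hashes equals hashing s in a single call.
inline bool SplitMatchesWhole(const char* s, std::size_t split)
    {
    static const SketchH3<std::uint32_t, 64> h(42);
    const std::size_t len = std::strlen(s);
    if ( split > len || len > 64 )
        return false;

    const std::uint32_t whole = h(s, len);
    const std::uint32_t parts = h(s, split) ^ h(s + split, len - split, split);
    return whole == parts;
    }
```

Calling `SplitMatchesWhole("hello world", 5)` hashes "hello" at offset 0 and " world" at offset 5, then compares the XOR of the two against the hash of the whole string. The property follows from each byte position having its own lookup table, so the second piece must be hashed with the offset at which it sits in the full input.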
|
||||||
|
|
|
@ -242,6 +242,7 @@ OpaqueType* md5_type;
|
||||||
OpaqueType* sha1_type;
|
OpaqueType* sha1_type;
|
||||||
OpaqueType* sha256_type;
|
OpaqueType* sha256_type;
|
||||||
OpaqueType* entropy_type;
|
OpaqueType* entropy_type;
|
||||||
|
OpaqueType* bloomfilter_type;
|
||||||
|
|
||||||
#include "const.bif.netvar_def"
|
#include "const.bif.netvar_def"
|
||||||
#include "types.bif.netvar_def"
|
#include "types.bif.netvar_def"
|
||||||
|
@ -307,6 +308,7 @@ void init_general_global_var()
|
||||||
sha1_type = new OpaqueType("sha1");
|
sha1_type = new OpaqueType("sha1");
|
||||||
sha256_type = new OpaqueType("sha256");
|
sha256_type = new OpaqueType("sha256");
|
||||||
entropy_type = new OpaqueType("entropy");
|
entropy_type = new OpaqueType("entropy");
|
||||||
|
bloomfilter_type = new OpaqueType("bloomfilter");
|
||||||
}
|
}
|
||||||
|
|
||||||
void init_net_var()
|
void init_net_var()
|
||||||
|
|
|
@ -247,6 +247,7 @@ extern OpaqueType* md5_type;
|
||||||
extern OpaqueType* sha1_type;
|
extern OpaqueType* sha1_type;
|
||||||
extern OpaqueType* sha256_type;
|
extern OpaqueType* sha256_type;
|
||||||
extern OpaqueType* entropy_type;
|
extern OpaqueType* entropy_type;
|
||||||
|
extern OpaqueType* bloomfilter_type;
|
||||||
|
|
||||||
// Initializes globals that don't pertain to network/event analysis.
|
// Initializes globals that don't pertain to network/event analysis.
|
||||||
extern void init_general_global_var();
|
extern void init_general_global_var();
|
||||||
|
|
151
src/OpaqueVal.cc
|
@ -1,3 +1,5 @@
|
||||||
|
// See the file "COPYING" in the main distribution directory for copyright.
|
||||||
|
|
||||||
#include "OpaqueVal.h"
|
#include "OpaqueVal.h"
|
||||||
#include "NetVar.h"
|
#include "NetVar.h"
|
||||||
#include "Reporter.h"
|
#include "Reporter.h"
|
||||||
|
@ -515,3 +517,152 @@ bool EntropyVal::DoUnserialize(UnserialInfo* info)
|
||||||
|
|
||||||
return true;
|
return true;
|
||||||
}
|
}
|
||||||
|
|
||||||
|
BloomFilterVal::BloomFilterVal()
|
||||||
|
: OpaqueVal(bloomfilter_type)
|
||||||
|
{
|
||||||
|
type = 0;
|
||||||
|
hash = 0;
|
||||||
|
bloom_filter = 0;
|
||||||
|
}
|
||||||
|
|
||||||
|
BloomFilterVal::BloomFilterVal(OpaqueType* t)
|
||||||
|
: OpaqueVal(t)
|
||||||
|
{
|
||||||
|
type = 0;
|
||||||
|
hash = 0;
|
||||||
|
bloom_filter = 0;
|
||||||
|
}
|
||||||
|
|
||||||
|
BloomFilterVal::BloomFilterVal(probabilistic::BloomFilter* bf)
|
||||||
|
: OpaqueVal(bloomfilter_type)
|
||||||
|
{
|
||||||
|
type = 0;
|
||||||
|
hash = 0;
|
||||||
|
bloom_filter = bf;
|
||||||
|
}
|
||||||
|
|
||||||
|
bool BloomFilterVal::Typify(BroType* arg_type)
|
||||||
|
{
|
||||||
|
if ( type )
|
||||||
|
return false;
|
||||||
|
|
||||||
|
type = arg_type;
|
||||||
|
type->Ref();
|
||||||
|
|
||||||
|
TypeList* tl = new TypeList(type);
|
||||||
|
tl->Append(type);
|
||||||
|
hash = new CompositeHash(tl);
|
||||||
|
Unref(tl);
|
||||||
|
|
||||||
|
return true;
|
||||||
|
}
|
||||||
|
|
||||||
|
BroType* BloomFilterVal::Type() const
|
||||||
|
{
|
||||||
|
return type;
|
||||||
|
}
|
||||||
|
|
||||||
|
void BloomFilterVal::Add(const Val* val)
|
||||||
|
{
|
||||||
|
HashKey* key = hash->ComputeHash(val, 1);
|
||||||
|
bloom_filter->Add(key->Hash());
|
||||||
|
delete key;
|
||||||
|
}
|
||||||
|
|
||||||
|
size_t BloomFilterVal::Count(const Val* val) const
|
||||||
|
{
|
||||||
|
HashKey* key = hash->ComputeHash(val, 1);
|
||||||
|
size_t cnt = bloom_filter->Count(key->Hash());
|
||||||
|
delete key;
|
||||||
|
return cnt;
|
||||||
|
}
|
||||||
|
|
||||||
|
void BloomFilterVal::Clear()
|
||||||
|
{
|
||||||
|
bloom_filter->Clear();
|
||||||
|
}
|
||||||
|
|
||||||
|
bool BloomFilterVal::Empty() const
|
||||||
|
{
|
||||||
|
return bloom_filter->Empty();
|
||||||
|
}
|
||||||
|
|
||||||
|
BloomFilterVal* BloomFilterVal::Merge(const BloomFilterVal* x,
|
||||||
|
const BloomFilterVal* y)
|
||||||
|
{
|
||||||
|
if ( ! same_type(x->Type(), y->Type()) )
|
||||||
|
{
|
||||||
|
reporter->Error("cannot merge Bloom filters with different types");
|
||||||
|
return 0;
|
||||||
|
}
|
||||||
|
|
||||||
|
if ( typeid(*x->bloom_filter) != typeid(*y->bloom_filter) )
|
||||||
|
{
|
||||||
|
reporter->Error("cannot merge different Bloom filter types");
|
||||||
|
return 0;
|
||||||
|
}
|
||||||
|
|
||||||
|
probabilistic::BloomFilter* copy = x->bloom_filter->Clone();
|
||||||
|
|
||||||
|
if ( ! copy->Merge(y->bloom_filter) )
|
||||||
|
{
|
||||||
|
reporter->Error("failed to merge Bloom filter");
|
||||||
|
delete copy;
|
||||||
|
return 0;
|
||||||
|
}
|
||||||
|
|
||||||
|
BloomFilterVal* merged = new BloomFilterVal(copy);
|
||||||
|
|
||||||
|
if ( ! merged->Typify(x->Type()) )
|
||||||
|
{
|
||||||
|
reporter->Error("failed to set type on merged Bloom filter");
|
||||||
|
Unref(merged);
|
||||||
|
return 0;
|
||||||
|
}
|
||||||
|
|
||||||
|
return merged;
|
||||||
|
}
|
||||||
|
|
||||||
|
BloomFilterVal::~BloomFilterVal()
|
||||||
|
{
|
||||||
|
Unref(type);
|
||||||
|
delete hash;
|
||||||
|
delete bloom_filter;
|
||||||
|
}
|
||||||
|
|
||||||
|
IMPLEMENT_SERIAL(BloomFilterVal, SER_BLOOMFILTER_VAL);
|
||||||
|
|
||||||
|
bool BloomFilterVal::DoSerialize(SerialInfo* info) const
|
||||||
|
{
|
||||||
|
DO_SERIALIZE(SER_BLOOMFILTER_VAL, OpaqueVal);
|
||||||
|
|
||||||
|
bool is_typed = (type != 0);
|
||||||
|
|
||||||
|
if ( ! SERIALIZE(is_typed) )
|
||||||
|
return false;
|
||||||
|
|
||||||
|
if ( is_typed && ! type->Serialize(info) )
|
||||||
|
return false;
|
||||||
|
|
||||||
|
return bloom_filter->Serialize(info);
|
||||||
|
}
|
||||||
|
|
||||||
|
bool BloomFilterVal::DoUnserialize(UnserialInfo* info)
|
||||||
|
{
|
||||||
|
DO_UNSERIALIZE(OpaqueVal);
|
||||||
|
|
||||||
|
bool is_typed;
|
||||||
|
if ( ! UNSERIALIZE(&is_typed) )
|
||||||
|
return false;
|
||||||
|
|
||||||
|
if ( is_typed )
|
||||||
|
{
|
||||||
|
BroType* t = BroType::Unserialize(info);
|
||||||
|
if ( ! Typify(t) )
|
||||||
|
return false;
|
||||||
|
|
||||||
|
Unref(t);
|
||||||
|
}
|
||||||
|
|
||||||
|
bloom_filter = probabilistic::BloomFilter::Unserialize(info);
|
||||||
|
return bloom_filter != 0;
|
||||||
|
}
|
||||||
|
|
|
@ -3,10 +3,18 @@
|
||||||
#ifndef OPAQUEVAL_H
|
#ifndef OPAQUEVAL_H
|
||||||
#define OPAQUEVAL_H
|
#define OPAQUEVAL_H
|
||||||
|
|
||||||
|
#include <typeinfo>
|
||||||
|
|
||||||
#include "RandTest.h"
|
#include "RandTest.h"
|
||||||
#include "Val.h"
|
#include "Val.h"
|
||||||
#include "digest.h"
|
#include "digest.h"
|
||||||
|
|
||||||
|
#include "probabilistic/BloomFilter.h"
|
||||||
|
|
||||||
|
namespace probabilistic {
|
||||||
|
class BloomFilter;
|
||||||
|
}
|
||||||
|
|
||||||
class HashVal : public OpaqueVal {
|
class HashVal : public OpaqueVal {
|
||||||
public:
|
public:
|
||||||
virtual bool IsValid() const;
|
virtual bool IsValid() const;
|
||||||
|
@ -107,4 +115,37 @@ private:
|
||||||
RandTest state;
|
RandTest state;
|
||||||
};
|
};
|
||||||
|
|
||||||
|
class BloomFilterVal : public OpaqueVal {
|
||||||
|
public:
|
||||||
|
explicit BloomFilterVal(probabilistic::BloomFilter* bf);
|
||||||
|
virtual ~BloomFilterVal();
|
||||||
|
|
||||||
|
BroType* Type() const;
|
||||||
|
bool Typify(BroType* type);
|
||||||
|
|
||||||
|
void Add(const Val* val);
|
||||||
|
size_t Count(const Val* val) const;
|
||||||
|
void Clear();
|
||||||
|
bool Empty() const;
|
||||||
|
|
||||||
|
static BloomFilterVal* Merge(const BloomFilterVal* x,
|
||||||
|
const BloomFilterVal* y);
|
||||||
|
|
||||||
|
protected:
|
||||||
|
friend class Val;
|
||||||
|
BloomFilterVal();
|
||||||
|
BloomFilterVal(OpaqueType* t);
|
||||||
|
|
||||||
|
DECLARE_SERIAL(BloomFilterVal);
|
||||||
|
|
||||||
|
private:
|
||||||
|
// Disable copy construction and assignment.
|
||||||
|
BloomFilterVal(const BloomFilterVal&);
|
||||||
|
BloomFilterVal& operator=(const BloomFilterVal&);
|
||||||
|
|
||||||
|
BroType* type;
|
||||||
|
CompositeHash* hash;
|
||||||
|
probabilistic::BloomFilter* bloom_filter;
|
||||||
|
};
|
||||||
|
|
||||||
#endif
|
#endif
|
||||||
|
|
|
@ -77,6 +77,12 @@ int PktSrc::ExtractNextPacket()
|
||||||
|
|
||||||
data = last_data = pcap_next(pd, &hdr);
|
data = last_data = pcap_next(pd, &hdr);
|
||||||
|
|
||||||
|
if ( data && (hdr.len == 0 || hdr.caplen == 0) )
|
||||||
|
{
|
||||||
|
sessions->Weird("empty_pcap_header", &hdr, data);
|
||||||
|
return 0;
|
||||||
|
}
|
||||||
|
|
||||||
if ( data )
|
if ( data )
|
||||||
next_timestamp = hdr.ts.tv_sec + double(hdr.ts.tv_usec) / 1e6;
|
next_timestamp = hdr.ts.tv_sec + double(hdr.ts.tv_usec) / 1e6;
|
||||||
|
|
||||||
|
|
|
@ -49,6 +49,10 @@ SERIAL_IS(STATE_ACCESS, 0x1100)
|
||||||
SERIAL_IS_BO(CASE, 0x1200)
|
SERIAL_IS_BO(CASE, 0x1200)
|
||||||
SERIAL_IS(LOCATION, 0x1300)
|
SERIAL_IS(LOCATION, 0x1300)
|
||||||
SERIAL_IS(RE_MATCHER, 0x1400)
|
SERIAL_IS(RE_MATCHER, 0x1400)
|
||||||
|
SERIAL_IS(BITVECTOR, 0x1500)
|
||||||
|
SERIAL_IS(COUNTERVECTOR, 0x1600)
|
||||||
|
SERIAL_IS(BLOOMFILTER, 0x1700)
|
||||||
|
SERIAL_IS(HASHER, 0x1800)
|
||||||
|
|
||||||
// These are the externally visible types.
|
// These are the externally visible types.
|
||||||
const SerialType SER_NONE = 0;
|
const SerialType SER_NONE = 0;
|
||||||
|
@ -104,6 +108,7 @@ SERIAL_VAL(MD5_VAL, 16)
|
||||||
SERIAL_VAL(SHA1_VAL, 17)
|
SERIAL_VAL(SHA1_VAL, 17)
|
||||||
SERIAL_VAL(SHA256_VAL, 18)
|
SERIAL_VAL(SHA256_VAL, 18)
|
||||||
SERIAL_VAL(ENTROPY_VAL, 19)
|
SERIAL_VAL(ENTROPY_VAL, 19)
|
||||||
|
SERIAL_VAL(BLOOMFILTER_VAL, 20)
|
||||||
|
|
||||||
#define SERIAL_EXPR(name, val) SERIAL_CONST(name, val, EXPR)
|
#define SERIAL_EXPR(name, val) SERIAL_CONST(name, val, EXPR)
|
||||||
SERIAL_EXPR(EXPR, 1)
|
SERIAL_EXPR(EXPR, 1)
|
||||||
|
@ -197,10 +202,22 @@ SERIAL_FUNC(BRO_FUNC, 2)
|
||||||
SERIAL_FUNC(DEBUG_FUNC, 3)
|
SERIAL_FUNC(DEBUG_FUNC, 3)
|
||||||
SERIAL_FUNC(BUILTIN_FUNC, 4)
|
SERIAL_FUNC(BUILTIN_FUNC, 4)
|
||||||
|
|
||||||
|
#define SERIAL_BLOOMFILTER(name, val) SERIAL_CONST(name, val, BLOOMFILTER)
|
||||||
|
SERIAL_BLOOMFILTER(BLOOMFILTER, 1)
|
||||||
|
SERIAL_BLOOMFILTER(BASICBLOOMFILTER, 2)
|
||||||
|
SERIAL_BLOOMFILTER(COUNTINGBLOOMFILTER, 3)
|
||||||
|
|
||||||
|
#define SERIAL_HASHER(name, val) SERIAL_CONST(name, val, HASHER)
|
||||||
|
SERIAL_HASHER(HASHER, 1)
|
||||||
|
SERIAL_HASHER(DEFAULTHASHER, 2)
|
||||||
|
SERIAL_HASHER(DOUBLEHASHER, 3)
|
||||||
|
|
||||||
SERIAL_CONST2(ID)
|
SERIAL_CONST2(ID)
|
||||||
SERIAL_CONST2(STATE_ACCESS)
|
SERIAL_CONST2(STATE_ACCESS)
|
||||||
SERIAL_CONST2(CASE)
|
SERIAL_CONST2(CASE)
|
||||||
SERIAL_CONST2(LOCATION)
|
SERIAL_CONST2(LOCATION)
|
||||||
SERIAL_CONST2(RE_MATCHER)
|
SERIAL_CONST2(RE_MATCHER)
|
||||||
|
SERIAL_CONST2(BITVECTOR)
|
||||||
|
SERIAL_CONST2(COUNTERVECTOR)
|
||||||
|
|
||||||
#endif
|
#endif
|
||||||
|
|
|
@ -1311,19 +1311,20 @@ IMPLEMENT_SERIAL(OpaqueType, SER_OPAQUE_TYPE);
|
||||||
bool OpaqueType::DoSerialize(SerialInfo* info) const
|
bool OpaqueType::DoSerialize(SerialInfo* info) const
|
||||||
{
|
{
|
||||||
DO_SERIALIZE(SER_OPAQUE_TYPE, BroType);
|
DO_SERIALIZE(SER_OPAQUE_TYPE, BroType);
|
||||||
return SERIALIZE(name);
|
return SERIALIZE_STR(name.c_str(), name.size());
|
||||||
}
|
}
|
||||||
|
|
||||||
bool OpaqueType::DoUnserialize(UnserialInfo* info)
|
bool OpaqueType::DoUnserialize(UnserialInfo* info)
|
||||||
{
|
{
|
||||||
DO_UNSERIALIZE(BroType);
|
DO_UNSERIALIZE(BroType);
|
||||||
|
|
||||||
char const* n;
|
const char* n;
|
||||||
if ( ! UNSERIALIZE_STR(&n, 0) )
|
if ( ! UNSERIALIZE_STR(&n, 0) )
|
||||||
return false;
|
return false;
|
||||||
|
|
||||||
name = n;
|
name = n;
|
||||||
delete [] n;
|
delete [] n;
|
||||||
|
|
||||||
return true;
|
return true;
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|
|
@ -103,7 +103,6 @@ void Manager::InitPreScript()
|
||||||
|
|
||||||
void Manager::InitPostScript()
|
void Manager::InitPostScript()
|
||||||
{
|
{
|
||||||
#include "analyzer.bif.init.cc"
|
|
||||||
}
|
}
|
||||||
|
|
||||||
void Manager::DumpDebug()
|
void Manager::DumpDebug()
|
||||||
|
|
|
@ -4975,4 +4975,3 @@ function anonymize_addr%(a: addr, cl: IPAddrAnonymizationClass%): addr
|
||||||
(enum ip_addr_anonymization_class_t) anon_class));
|
(enum ip_addr_anonymization_class_t) anon_class));
|
||||||
}
|
}
|
||||||
%}
|
%}
|
||||||
|
|
||||||
|
|
|
@ -100,6 +100,7 @@ File::~File()
|
||||||
{
|
{
|
||||||
DBG_LOG(DBG_FILE_ANALYSIS, "Destroying File object %s", id.c_str());
|
DBG_LOG(DBG_FILE_ANALYSIS, "Destroying File object %s", id.c_str());
|
||||||
Unref(val);
|
Unref(val);
|
||||||
|
|
||||||
// Queue may not be empty in the case where only content gaps were seen.
|
// Queue may not be empty in the case where only content gaps were seen.
|
||||||
while ( ! fonc_queue.empty() )
|
while ( ! fonc_queue.empty() )
|
||||||
{
|
{
|
||||||
|
|
|
@ -60,7 +60,6 @@ void Manager::RegisterAnalyzerComponent(Component* component)
|
||||||
|
|
||||||
void Manager::InitPostScript()
|
void Manager::InitPostScript()
|
||||||
{
|
{
|
||||||
#include "file_analysis.bif.init.cc"
|
|
||||||
}
|
}
|
||||||
|
|
||||||
void Manager::Terminate()
|
void Manager::Terminate()
|
||||||
|
|
578
src/probabilistic/BitVector.cc
Normal file
|
@ -0,0 +1,578 @@
|
||||||
|
// See the file "COPYING" in the main distribution directory for copyright.
|
||||||
|
|
||||||
|
#include "BitVector.h"
|
||||||
|
|
||||||
|
#include <cassert>
|
||||||
|
#include <limits>
|
||||||
|
#include "Serializer.h"
|
||||||
|
|
||||||
|
using namespace probabilistic;
|
||||||
|
|
||||||
|
BitVector::size_type BitVector::npos = static_cast<BitVector::size_type>(-1);
|
||||||
|
BitVector::block_type BitVector::bits_per_block =
|
||||||
|
std::numeric_limits<BitVector::block_type>::digits;
|
||||||
|
|
||||||
|
namespace {
|
||||||
|
|
||||||
|
// count_table[b] holds the number of set bits in the byte value b.
|
||||||
|
uint8_t count_table[] = {
|
||||||
|
0, 1, 1, 2, 1, 2, 2, 3, 1, 2, 2, 3, 2, 3, 3, 4, 1, 2, 2, 3, 2, 3, 3, 4, 2,
|
||||||
|
3, 3, 4, 3, 4, 4, 5, 1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5, 2, 3,
|
||||||
|
3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3,
|
||||||
|
4, 3, 4, 4, 5, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 2, 3, 3, 4,
|
||||||
|
3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5,
|
||||||
|
6, 6, 7, 1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5, 2, 3, 3, 4, 3, 4,
|
||||||
|
4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5,
|
||||||
|
6, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 2, 3, 3, 4, 3, 4, 4, 5,
|
||||||
|
3, 4, 4, 5, 4, 5, 5, 6, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 3,
|
||||||
|
4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6,
|
||||||
|
6, 7, 6, 7, 7, 8
|
||||||
|
};
|
||||||
|
|
||||||
|
} // namespace <anonymous>
|
||||||
|
|
||||||
|
BitVector::Reference::Reference(block_type& block, block_type i)
|
||||||
|
: block(block), mask((block_type(1) << i))
|
||||||
|
{
|
||||||
|
assert(i < bits_per_block);
|
||||||
|
}
|
||||||
|
|
||||||
|
BitVector::Reference& BitVector::Reference::Flip()
|
||||||
|
{
|
||||||
|
block ^= mask;
|
||||||
|
return *this;
|
||||||
|
}
|
||||||
|
|
||||||
|
BitVector::Reference::operator bool() const
|
||||||
|
{
|
||||||
|
return (block & mask) != 0;
|
||||||
|
}
|
||||||
|
|
||||||
|
bool BitVector::Reference::operator~() const
|
||||||
|
{
|
||||||
|
return (block & mask) == 0;
|
||||||
|
}
|
||||||
|
|
||||||
|
BitVector::Reference& BitVector::Reference::operator=(bool x)
|
||||||
|
{
|
||||||
|
if ( x )
|
||||||
|
block |= mask;
|
||||||
|
else
|
||||||
|
block &= ~mask;
|
||||||
|
|
||||||
|
return *this;
|
||||||
|
}
|
||||||
|
|
||||||
|
BitVector::Reference& BitVector::Reference::operator=(const Reference& other)
|
||||||
|
{
|
||||||
|
if ( other )
|
||||||
|
block |= mask;
|
||||||
|
else
|
||||||
|
block &= ~mask;
|
||||||
|
|
||||||
|
return *this;
|
||||||
|
}
|
||||||
|
|
||||||
|
BitVector::Reference& BitVector::Reference::operator|=(bool x)
|
||||||
|
{
|
||||||
|
if ( x )
|
||||||
|
block |= mask;
|
||||||
|
|
||||||
|
return *this;
|
||||||
|
}
|
||||||
|
|
||||||
|
BitVector::Reference& BitVector::Reference::operator&=(bool x)
|
||||||
|
{
|
||||||
|
if ( ! x )
|
||||||
|
block &= ~mask;
|
||||||
|
|
||||||
|
return *this;
|
||||||
|
}
|
||||||
|
|
||||||
|
BitVector::Reference& BitVector::Reference::operator^=(bool x)
|
||||||
|
{
|
||||||
|
if ( x )
|
||||||
|
block ^= mask;
|
||||||
|
|
||||||
|
return *this;
|
||||||
|
}
|
||||||
|
|
||||||
|
BitVector::Reference& BitVector::Reference::operator-=(bool x)
|
||||||
|
{
|
||||||
|
if ( x )
|
||||||
|
block &= ~mask;
|
||||||
|
|
||||||
|
return *this;
|
||||||
|
}
|
||||||
|
|
||||||
|
BitVector::BitVector()
|
||||||
|
{
|
||||||
|
num_bits = 0;
|
||||||
|
}
|
||||||
|
|
||||||
|
BitVector::BitVector(size_type size, bool value)
|
||||||
|
: bits(bits_to_blocks(size), value ? ~block_type(0) : 0)
|
||||||
|
{
|
||||||
|
num_bits = size;
|
||||||
|
}
|
||||||
|
|
||||||
|
BitVector::BitVector(BitVector const& other)
|
||||||
|
: bits(other.bits)
|
||||||
|
{
|
||||||
|
num_bits = other.num_bits;
|
||||||
|
}
|
||||||
|
|
||||||
|
BitVector BitVector::operator~() const
|
||||||
|
{
|
||||||
|
BitVector b(*this);
|
||||||
|
b.Flip();
|
||||||
|
return b;
|
||||||
|
}
|
||||||
|
|
||||||
|
BitVector& BitVector::operator=(BitVector const& other)
|
||||||
|
{
|
||||||
|
bits = other.bits;
|
||||||
|
num_bits = other.num_bits;
|
||||||
|
return *this;
|
||||||
|
}
|
||||||
|
|
||||||
|
BitVector BitVector::operator<<(size_type n) const
|
||||||
|
{
|
||||||
|
BitVector b(*this);
|
||||||
|
return b <<= n;
|
||||||
|
}
|
||||||
|
|
||||||
|
BitVector BitVector::operator>>(size_type n) const
|
||||||
|
{
|
||||||
|
BitVector b(*this);
|
||||||
|
return b >>= n;
|
||||||
|
}
|
||||||
|
|
||||||
|
BitVector& BitVector::operator<<=(size_type n)
|
||||||
|
{
|
||||||
|
if ( n >= num_bits )
|
||||||
|
return Reset();
|
||||||
|
|
||||||
|
if ( n > 0 )
|
||||||
|
{
|
||||||
|
size_type last = Blocks() - 1;
|
||||||
|
size_type div = n / bits_per_block;
|
||||||
|
block_type r = bit_index(n);
|
||||||
|
block_type* b = &bits[0];
|
||||||
|
|
||||||
|
assert(Blocks() >= 1);
|
||||||
|
assert(div <= last);
|
||||||
|
|
||||||
|
if ( r != 0 )
|
||||||
|
{
|
||||||
|
for ( size_type i = last - div; i > 0; --i )
|
||||||
|
b[i + div] = (b[i] << r) | (b[i - 1] >> (bits_per_block - r));
|
||||||
|
|
||||||
|
b[div] = b[0] << r;
|
||||||
|
}
|
||||||
|
|
||||||
|
else
|
||||||
|
{
|
||||||
|
for (size_type i = last-div; i > 0; --i)
|
||||||
|
b[i + div] = b[i];
|
||||||
|
|
||||||
|
b[div] = b[0];
|
||||||
|
}
|
||||||
|
|
||||||
|
std::fill_n(b, div, block_type(0));
|
||||||
|
zero_unused_bits();
|
||||||
|
}
|
||||||
|
|
||||||
|
return *this;
|
||||||
|
}
|
||||||
|
|
||||||
|
BitVector& BitVector::operator>>=(size_type n)
|
||||||
|
{
|
||||||
|
if ( n >= num_bits )
|
||||||
|
return Reset();
|
||||||
|
|
||||||
|
if ( n > 0 )
|
||||||
|
{
|
||||||
|
size_type last = Blocks() - 1;
|
||||||
|
size_type div = n / bits_per_block;
|
||||||
|
block_type r = bit_index(n);
|
||||||
|
block_type* b = &bits[0];
|
||||||
|
|
||||||
|
assert(Blocks() >= 1);
|
||||||
|
assert(div <= last);
|
||||||
|
|
||||||
|
if ( r != 0 )
|
||||||
|
{
|
||||||
|
for ( size_type i = div; i != last; ++i )
|
||||||
|
b[i - div] = (b[i] >> r) | (b[i + 1] << (bits_per_block - r));
|
||||||
|
|
||||||
|
b[last - div] = b[last] >> r;
|
||||||
|
}
|
||||||
|
|
||||||
|
else
|
||||||
|
{
|
||||||
|
for (size_type i = div; i <= last; ++i)
|
||||||
|
b[i-div] = b[i];
|
||||||
|
}
|
||||||
|
|
||||||
|
std::fill_n(b + (Blocks() - div), div, block_type(0));
|
||||||
|
}
|
||||||
|
|
||||||
|
return *this;
|
||||||
|
}
|
||||||
|
|
||||||
|
BitVector& BitVector::operator&=(BitVector const& other)
|
||||||
|
{
|
||||||
|
assert(Size() >= other.Size());
|
||||||
|
|
||||||
|
for ( size_type i = 0; i < Blocks(); ++i )
|
||||||
|
bits[i] &= other.bits[i];
|
||||||
|
|
||||||
|
return *this;
|
||||||
|
}
|
||||||
|
|
||||||
|
BitVector& BitVector::operator|=(BitVector const& other)
|
||||||
|
{
|
||||||
|
assert(Size() >= other.Size());
|
||||||
|
|
||||||
|
for ( size_type i = 0; i < Blocks(); ++i )
|
||||||
|
bits[i] |= other.bits[i];
|
||||||
|
|
||||||
|
return *this;
|
||||||
|
}
|
||||||
|
|
||||||
|
BitVector& BitVector::operator^=(BitVector const& other)
|
||||||
|
{
|
||||||
|
assert(Size() >= other.Size());
|
||||||
|
|
||||||
|
for ( size_type i = 0; i < Blocks(); ++i )
|
||||||
|
bits[i] ^= other.bits[i];
|
||||||
|
|
||||||
|
return *this;
|
||||||
|
}
|
||||||
|
|
||||||
|
BitVector& BitVector::operator-=(BitVector const& other)
|
||||||
|
{
|
||||||
|
assert(Size() >= other.Size());
|
||||||
|
|
||||||
|
for ( size_type i = 0; i < Blocks(); ++i )
|
||||||
|
bits[i] &= ~other.bits[i];
|
||||||
|
|
||||||
|
return *this;
|
||||||
|
}
|
||||||
|
|
||||||
|
namespace probabilistic {
|
||||||
|
|
||||||
|
BitVector operator&(BitVector const& x, BitVector const& y)
|
||||||
|
{
|
||||||
|
BitVector b(x);
|
||||||
|
return b &= y;
|
||||||
|
}
|
||||||
|
|
||||||
|
BitVector operator|(BitVector const& x, BitVector const& y)
|
||||||
|
{
|
||||||
|
BitVector b(x);
|
||||||
|
return b |= y;
|
||||||
|
}
|
||||||
|
|
||||||
|
BitVector operator^(BitVector const& x, BitVector const& y)
|
||||||
|
{
|
||||||
|
BitVector b(x);
|
||||||
|
return b ^= y;
|
||||||
|
}
|
||||||
|
|
||||||
|
BitVector operator-(BitVector const& x, BitVector const& y)
|
||||||
|
{
|
||||||
|
BitVector b(x);
|
||||||
|
return b -= y;
|
||||||
|
}
|
||||||
|
|
||||||
|
bool operator==(BitVector const& x, BitVector const& y)
|
||||||
|
{
|
||||||
|
return x.num_bits == y.num_bits && x.bits == y.bits;
|
||||||
|
}
|
||||||
|
|
||||||
|
bool operator!=(BitVector const& x, BitVector const& y)
|
||||||
|
{
|
||||||
|
return ! (x == y);
|
||||||
|
}
|
||||||
|
|
||||||
|
bool operator<(BitVector const& x, BitVector const& y)
|
||||||
|
{
|
||||||
|
assert(x.Size() == y.Size());
|
||||||
|
|
||||||
|
for ( BitVector::size_type r = x.Blocks(); r > 0; --r )
|
||||||
|
{
|
||||||
|
BitVector::size_type i = r - 1;
|
||||||
|
|
||||||
|
if ( x.bits[i] < y.bits[i] )
|
||||||
|
return true;
|
||||||
|
|
||||||
|
else if ( x.bits[i] > y.bits[i] )
|
||||||
|
return false;
|
||||||
|
|
||||||
|
}
|
||||||
|
|
||||||
|
return false;
|
||||||
|
}
|
||||||
|
|
||||||
|
}
|
||||||
|
|
||||||
|
void BitVector::Resize(size_type n, bool value)
|
||||||
|
{
|
||||||
|
size_type old = Blocks();
|
||||||
|
size_type required = bits_to_blocks(n);
|
||||||
|
block_type block_value = value ? ~block_type(0) : block_type(0);
|
||||||
|
|
||||||
|
if ( required != old )
|
||||||
|
bits.resize(required, block_value);
|
||||||
|
|
||||||
|
if ( value && (n > num_bits) && extra_bits() )
|
||||||
|
bits[old - 1] |= (block_value << extra_bits());
|
||||||
|
|
||||||
|
num_bits = n;
|
||||||
|
zero_unused_bits();
|
||||||
|
}
|
||||||
|
|
||||||
|
void BitVector::Clear()
|
||||||
|
{
|
||||||
|
bits.clear();
|
||||||
|
num_bits = 0;
|
||||||
|
}
|
||||||
|
|
||||||
|
void BitVector::PushBack(bool bit)
|
||||||
|
{
|
||||||
|
size_type s = Size();
|
||||||
|
Resize(s + 1);
|
||||||
|
Set(s, bit);
|
||||||
|
}
|
||||||
|
|
||||||
|
void BitVector::Append(block_type block)
|
||||||
|
{
|
||||||
|
size_type excess = extra_bits();
|
||||||
|
|
||||||
|
if ( excess )
|
||||||
|
{
|
||||||
|
assert(! Empty());
|
||||||
|
bits.push_back(block >> (bits_per_block - excess));
|
||||||
|
bits[Blocks() - 2] |= (block << excess);
|
||||||
|
}
|
||||||
|
|
||||||
|
else
|
||||||
|
{
|
||||||
|
bits.push_back(block);
|
||||||
|
}
|
||||||
|
|
||||||
|
num_bits += bits_per_block;
|
||||||
|
}
|
||||||
|
|
||||||
|
BitVector& BitVector::Set(size_type i, bool bit)
|
||||||
|
{
|
||||||
|
assert(i < num_bits);
|
||||||
|
|
||||||
|
if ( bit )
|
||||||
|
bits[block_index(i)] |= bit_mask(i);
|
||||||
|
else
|
||||||
|
Reset(i);
|
||||||
|
|
||||||
|
return *this;
|
||||||
|
}
|
||||||
|
|
||||||
|
BitVector& BitVector::Set()
|
||||||
|
{
|
||||||
|
std::fill(bits.begin(), bits.end(), ~block_type(0));
|
||||||
|
zero_unused_bits();
|
||||||
|
return *this;
|
||||||
|
}
|
||||||
|
|
||||||
|
BitVector& BitVector::Reset(size_type i)
|
||||||
|
{
|
||||||
|
assert(i < num_bits);
|
||||||
|
bits[block_index(i)] &= ~bit_mask(i);
|
||||||
|
return *this;
|
||||||
|
}
|
||||||
|
|
||||||
|
BitVector& BitVector::Reset()
|
||||||
|
{
|
||||||
|
std::fill(bits.begin(), bits.end(), block_type(0));
|
||||||
|
return *this;
|
||||||
|
}
|
||||||
|
|
||||||
|
BitVector& BitVector::Flip(size_type i)
|
||||||
|
{
|
||||||
|
assert(i < num_bits);
|
||||||
|
bits[block_index(i)] ^= bit_mask(i);
|
||||||
|
return *this;
|
||||||
|
}
|
||||||
|
|
||||||
|
BitVector& BitVector::Flip()
|
||||||
|
{
|
||||||
|
for (size_type i = 0; i < Blocks(); ++i)
|
||||||
|
bits[i] = ~bits[i];
|
||||||
|
|
||||||
|
zero_unused_bits();
|
||||||
|
return *this;
|
||||||
|
}
|
||||||
|
|
||||||
|
bool BitVector::operator[](size_type i) const
|
||||||
|
{
|
||||||
|
assert(i < num_bits);
|
||||||
|
return (bits[block_index(i)] & bit_mask(i)) != 0;
|
||||||
|
}
|
||||||
|
|
||||||
|
BitVector::Reference BitVector::operator[](size_type i)
|
||||||
|
{
|
||||||
|
assert(i < num_bits);
|
||||||
|
return Reference(bits[block_index(i)], bit_index(i));
|
||||||
|
}
|
||||||
|
|
||||||
|
BitVector::size_type BitVector::Count() const
|
||||||
|
{
|
||||||
|
std::vector<block_type>::const_iterator first = bits.begin();
|
||||||
|
size_t n = 0;
|
||||||
|
size_type length = Blocks();
|
||||||
|
|
||||||
|
while ( length )
|
||||||
|
{
|
||||||
|
block_type block = *first;
|
||||||
|
|
||||||
|
while ( block )
|
||||||
|
{
|
||||||
|
// TODO: use _popcnt if available.
|
||||||
|
n += count_table[block & ((1u << 8) - 1)];
|
||||||
|
block >>= 8;
|
||||||
|
}
|
||||||
|
|
||||||
|
++first;
|
||||||
|
--length;
|
||||||
|
}
|
||||||
|
|
||||||
|
return n;
|
||||||
|
}
|
||||||
|
|
||||||
|
BitVector::size_type BitVector::Blocks() const
|
||||||
|
{
|
||||||
|
return bits.size();
|
||||||
|
}
|
||||||
|
|
||||||
|
BitVector::size_type BitVector::Size() const
|
||||||
|
{
|
||||||
|
return num_bits;
|
||||||
|
}
|
||||||
|
|
||||||
|
bool BitVector::Empty() const
|
||||||
|
{
|
||||||
|
return bits.empty();
|
||||||
|
}
|
||||||
|
|
||||||
|
bool BitVector::AllZero() const
|
||||||
|
{
|
||||||
|
for ( size_t i = 0; i < bits.size(); ++i )
|
||||||
|
{
|
||||||
|
if ( bits[i] )
|
||||||
|
return false;
|
||||||
|
}
|
||||||
|
|
||||||
|
return true;
|
||||||
|
}
|
||||||
|
|
||||||
|
BitVector::size_type BitVector::FindFirst() const
|
||||||
|
{
|
||||||
|
return find_from(0);
|
||||||
|
}
|
||||||
|
|
||||||
|
BitVector::size_type BitVector::FindNext(size_type i) const
|
||||||
|
{
|
||||||
|
if ( Size() == 0 || i >= Size() - 1 )
|
||||||
|
return npos;
|
||||||
|
|
||||||
|
++i;
|
||||||
|
size_type bi = block_index(i);
|
||||||
|
block_type block = bits[bi] & (~block_type(0) << bit_index(i));
|
||||||
|
return block ? bi * bits_per_block + lowest_bit(block) : find_from(bi + 1);
|
||||||
|
}
|
||||||
|
|
||||||
|
BitVector::size_type BitVector::lowest_bit(block_type block)
|
||||||
|
{
|
||||||
|
// Isolate the lowest set bit; the loop below computes its index.
|
||||||
|
block_type x = block - (block & (block - 1));
|
||||||
|
size_type log = 0;
|
||||||
|
|
||||||
|
while (x >>= 1)
|
||||||
|
++log;
|
||||||
|
|
||||||
|
return log;
|
||||||
|
}
|
||||||
|
|
||||||
|
BitVector::block_type BitVector::extra_bits() const
|
||||||
|
{
|
||||||
|
return bit_index(Size());
|
||||||
|
}
|
||||||
|
|
||||||
|
void BitVector::zero_unused_bits()
|
||||||
|
{
|
||||||
|
if ( extra_bits() )
|
||||||
|
bits.back() &= ~(~block_type(0) << extra_bits());
|
||||||
|
}
|
||||||
|
|
||||||
|
BitVector::size_type BitVector::find_from(size_type i) const
|
||||||
|
{
|
||||||
|
while (i < Blocks() && bits[i] == 0)
|
||||||
|
++i;
|
||||||
|
|
||||||
|
if ( i >= Blocks() )
|
||||||
|
return npos;
|
||||||
|
|
||||||
|
return i * bits_per_block + lowest_bit(bits[i]);
|
||||||
|
}
|
||||||
|
|
||||||
|
bool BitVector::Serialize(SerialInfo* info) const
|
||||||
|
{
|
||||||
|
return SerialObj::Serialize(info);
|
||||||
|
}
|
||||||
|
|
||||||
|
BitVector* BitVector::Unserialize(UnserialInfo* info)
|
||||||
|
{
|
||||||
|
return reinterpret_cast<BitVector*>(SerialObj::Unserialize(info, SER_BITVECTOR));
|
||||||
|
}
|
||||||
|
|
||||||
|
IMPLEMENT_SERIAL(BitVector, SER_BITVECTOR);
|
||||||
|
|
||||||
|
bool BitVector::DoSerialize(SerialInfo* info) const
|
||||||
|
{
|
||||||
|
DO_SERIALIZE(SER_BITVECTOR, SerialObj);
|
||||||
|
|
||||||
|
if ( ! SERIALIZE(static_cast<uint64>(bits.size())) )
|
||||||
|
return false;
|
||||||
|
|
||||||
|
for ( size_t i = 0; i < bits.size(); ++i )
|
||||||
|
if ( ! SERIALIZE(static_cast<uint64>(bits[i])) )
|
||||||
|
return false;
|
||||||
|
|
||||||
|
return SERIALIZE(static_cast<uint64>(num_bits));
|
||||||
|
}
|
||||||
|
|
||||||
|
bool BitVector::DoUnserialize(UnserialInfo* info)
|
||||||
|
{
|
||||||
|
DO_UNSERIALIZE(SerialObj);
|
||||||
|
|
||||||
|
uint64 size;
|
||||||
|
if ( ! UNSERIALIZE(&size) )
|
||||||
|
return false;
|
||||||
|
|
||||||
|
bits.resize(static_cast<size_t>(size));
|
||||||
|
|
||||||
|
for ( size_t i = 0; i < bits.size(); ++i )
|
||||||
|
{
|
||||||
|
uint64 block;
|
||||||
|
if ( ! UNSERIALIZE(&block) )
|
||||||
|
return false;
|
||||||
|
|
||||||
|
bits[i] = static_cast<block_type>(block);
|
||||||
|
}
|
||||||
|
|
||||||
|
uint64 n;
|
||||||
|
if ( ! UNSERIALIZE(&n) )
|
||||||
|
return false;
|
||||||
|
|
||||||
|
num_bits = static_cast<size_type>(n);
|
||||||
|
|
||||||
|
return true;
|
||||||
|
}
|
370
src/probabilistic/BitVector.h
Normal file
|
@ -0,0 +1,370 @@
|
||||||
|
// See the file "COPYING" in the main distribution directory for copyright.
|
||||||
|
|
||||||
|
#ifndef PROBABILISTIC_BITVECTOR_H
|
||||||
|
#define PROBABILISTIC_BITVECTOR_H
|
||||||
|
|
||||||
|
#include <iterator>
|
||||||
|
#include <vector>
|
||||||
|
|
||||||
|
#include "SerialObj.h"
|
||||||
|
|
||||||
|
namespace probabilistic {
|
||||||
|
|
||||||
|
/**
|
||||||
|
* A vector of bits.
|
||||||
|
*/
|
||||||
|
class BitVector : public SerialObj {
|
||||||
|
public:
|
||||||
|
typedef size_t block_type;
|
||||||
|
typedef size_t size_type;
|
||||||
|
typedef bool const_reference;
|
||||||
|
|
||||||
|
static size_type npos;
|
||||||
|
static block_type bits_per_block;
|
||||||
|
|
||||||
|
/**
|
||||||
|
* An lvalue proxy for individual bits.
|
||||||
|
*/
|
||||||
|
class Reference {
|
||||||
|
public:
|
||||||
|
/**
|
||||||
|
* Inverts the bit's value.
|
||||||
|
*/
|
||||||
|
Reference& Flip();
|
||||||
|
|
||||||
|
operator bool() const;
|
||||||
|
bool operator~() const;
|
||||||
|
Reference& operator=(bool x);
|
||||||
|
Reference& operator=(const Reference& other);
|
||||||
|
Reference& operator|=(bool x);
|
||||||
|
Reference& operator&=(bool x);
|
||||||
|
Reference& operator^=(bool x);
|
||||||
|
Reference& operator-=(bool x);
|
||||||
|
|
||||||
|
private:
|
||||||
|
friend class BitVector;
|
||||||
|
|
||||||
|
Reference(block_type& block, block_type i);
|
||||||
|
void operator&();
|
||||||
|
|
||||||
|
block_type& block;
|
||||||
|
const block_type mask;
|
||||||
|
};
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Default-constructs an empty bit vector.
|
||||||
|
*/
|
||||||
|
BitVector();
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Constructs a bit vector of a given size.
|
||||||
|
* @param size The number of bits.
|
||||||
|
* @param value The value for each bit.
|
||||||
|
*/
|
||||||
|
explicit BitVector(size_type size, bool value = false);
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Constructs a bit vector from a sequence of blocks.
|
||||||
|
*
|
||||||
|
* @param first Start of range.
|
||||||
|
* @param last End of range.
|
||||||
|
*
|
||||||
|
*/
|
||||||
|
template <typename InputIterator>
|
||||||
|
BitVector(InputIterator first, InputIterator last)
|
||||||
|
{
|
||||||
|
bits.insert(bits.end(), first, last);
|
||||||
|
num_bits = bits.size() * bits_per_block;
|
||||||
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Copy-constructs a bit vector.
|
||||||
|
* @param other The bit vector to copy.
|
||||||
|
*/
|
||||||
|
BitVector(const BitVector& other);
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Assigns another bit vector to this instance.
|
||||||
|
* @param other The RHS of the assignment.
|
||||||
|
*/
|
||||||
|
BitVector& operator=(const BitVector& other);
|
||||||
|
|
||||||
|
//
|
||||||
|
// Bitwise operations.
|
||||||
|
//
|
||||||
|
BitVector operator~() const;
|
||||||
|
BitVector operator<<(size_type n) const;
|
||||||
|
BitVector operator>>(size_type n) const;
|
||||||
|
BitVector& operator<<=(size_type n);
|
||||||
|
BitVector& operator>>=(size_type n);
|
||||||
|
BitVector& operator&=(BitVector const& other);
|
||||||
|
BitVector& operator|=(BitVector const& other);
|
||||||
|
BitVector& operator^=(BitVector const& other);
|
||||||
|
BitVector& operator-=(BitVector const& other);
|
||||||
|
friend BitVector operator&(BitVector const& x, BitVector const& y);
|
||||||
|
friend BitVector operator|(BitVector const& x, BitVector const& y);
|
||||||
|
friend BitVector operator^(BitVector const& x, BitVector const& y);
|
||||||
|
friend BitVector operator-(BitVector const& x, BitVector const& y);
|
||||||
|
|
||||||
|
//
|
||||||
|
// Relational operators
|
||||||
|
//
|
||||||
|
friend bool operator==(BitVector const& x, BitVector const& y);
|
||||||
|
friend bool operator!=(BitVector const& x, BitVector const& y);
|
||||||
|
friend bool operator<(BitVector const& x, BitVector const& y);
|
||||||
|
|
||||||
|
//
|
||||||
|
// Basic operations
|
||||||
|
//
|
||||||
|
|
||||||
|
/** Appends the bits in a sequence of values.
|
||||||
|
* @tparam ForwardIterator A forward iterator.
|
||||||
|
* @param first An iterator pointing to the first element of the sequence.
|
||||||
|
* @param last An iterator pointing to one past the last element of the
|
||||||
|
* sequence.
|
||||||
|
*/
|
||||||
|
template <typename ForwardIterator>
|
||||||
|
void Append(ForwardIterator first, ForwardIterator last)
|
||||||
|
{
|
||||||
|
if ( first == last )
|
||||||
|
return;
|
||||||
|
|
||||||
|
block_type excess = extra_bits();
|
||||||
|
typename std::iterator_traits<ForwardIterator>::difference_type delta =
|
||||||
|
std::distance(first, last);
|
||||||
|
|
||||||
|
bits.reserve(Blocks() + delta);
|
||||||
|
|
||||||
|
if ( excess != 0 )
|
||||||
|
{
|
||||||
|
bits.back() |= (*first << excess);
|
||||||
|
|
||||||
|
do {
|
||||||
|
block_type b = *first++ >> (bits_per_block - excess);
|
||||||
|
bits.push_back(b | (first == last ? 0 : *first << excess));
|
||||||
|
} while (first != last);
|
||||||
|
|
||||||
|
}
|
||||||
|
|
||||||
|
else
|
||||||
|
bits.insert(bits.end(), first, last);
|
||||||
|
|
||||||
|
num_bits += bits_per_block * delta;
|
||||||
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Appends the bits in a given block.
|
||||||
|
* @param block The block containing bits to append.
|
||||||
|
*/
|
||||||
|
void Append(block_type block);
|
||||||
|
|
||||||
|
/** Appends a single bit to the end of the bit vector.
|
||||||
|
* @param bit The value of the bit.
|
||||||
|
*/
|
||||||
|
void PushBack(bool bit);
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Clears all bits in the bitvector.
|
||||||
|
*/
|
||||||
|
void Clear();
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Resizes the bit vector to a new number of bits.
|
||||||
|
* @param n The new number of bits of the bit vector.
|
||||||
|
* @param value The bit value of new values, if the vector expands.
|
||||||
|
*/
|
||||||
|
void Resize(size_type n, bool value = false);
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Sets a bit at a specific position to a given value.
|
||||||
|
* @param i The bit position.
|
||||||
|
* @param bit The value assigned to position *i*.
|
||||||
|
* @return A reference to the bit vector instance.
|
||||||
|
*/
|
||||||
|
BitVector& Set(size_type i, bool bit = true);
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Sets all bits to 1.
|
||||||
|
* @return A reference to the bit vector instance.
|
||||||
|
*/
|
||||||
|
BitVector& Set();
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Resets a bit at a specific position, i.e., sets it to 0.
|
||||||
|
* @param i The bit position.
|
||||||
|
* @return A reference to the bit vector instance.
|
||||||
|
*/
|
||||||
|
BitVector& Reset(size_type i);
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Sets all bits to 0.
|
||||||
|
* @return A reference to the bit vector instance.
|
||||||
|
*/
|
||||||
|
BitVector& Reset();
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Toggles/flips a bit at a specific position.
|
||||||
|
* @param i The bit position.
|
||||||
|
* @return A reference to the bit vector instance.
|
||||||
|
*/
|
||||||
|
BitVector& Flip(size_type i);
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Computes the complement.
|
||||||
|
* @return A reference to the bit vector instance.
|
||||||
|
*/
|
||||||
|
BitVector& Flip();
|
||||||
|
|
||||||
|
/** Retrieves a single bit.
|
||||||
|
* @param i The bit position.
|
||||||
|
* @return A mutable reference to the bit at position *i*.
|
||||||
|
*/
|
||||||
|
Reference operator[](size_type i);
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Retrieves a single bit.
|
||||||
|
* @param i The bit position.
|
||||||
|
* @return A const-reference to the bit at position *i*.
|
||||||
|
*/
|
||||||
|
const_reference operator[](size_type i) const;
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Counts the number of 1-bits in the bit vector. Also known as *population
|
||||||
|
* count* or *Hamming weight*.
|
||||||
|
* @return The number of bits set to 1.
|
||||||
|
*/
|
||||||
|
size_type Count() const;
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Retrieves the number of blocks of the underlying storage.
|
||||||
|
* @return The number of blocks that represent `Size()` bits.
|
||||||
|
*/
|
||||||
|
size_type Blocks() const;
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Retrieves the number of bits the bitvector consists of.
|
||||||
|
* @return The length of the bit vector in bits.
|
||||||
|
*/
|
||||||
|
size_type Size() const;
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Checks whether the bit vector is empty.
|
||||||
|
* @return `true` iff the bitvector has zero length.
|
||||||
|
*/
|
||||||
|
bool Empty() const;
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Checks whether all bits are 0.
|
||||||
|
* @return `true` iff all bits in all blocks are 0.
|
||||||
|
*/
|
||||||
|
bool AllZero() const;
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Finds the bit position of the first 1-bit.
|
||||||
|
* @return The position of the first bit that equals one or `npos` if no
|
||||||
|
* such bit exists.
|
||||||
|
*/
|
||||||
|
size_type FindFirst() const;
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Finds the next 1-bit from a given starting position.
|
||||||
|
*
|
||||||
|
* @param i The index where to start looking.
|
||||||
|
*
|
||||||
|
* @return The position of the first bit that equals 1 after position
|
||||||
|
* *i* or `npos` if no such bit exists.
|
||||||
|
*/
|
||||||
|
size_type FindNext(size_type i) const;
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Serializes the bit vector.
|
||||||
|
*
|
||||||
|
* @param info The serialization information to use.
|
||||||
|
*
|
||||||
|
* @return True if successful.
|
||||||
|
*/
|
||||||
|
bool Serialize(SerialInfo* info) const;
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Unserializes the bit vector.
|
||||||
|
*
|
||||||
|
* @param info The serialization information to use.
|
||||||
|
*
|
||||||
|
* @return The unserialized bit vector, or null if an error occurred.
|
||||||
|
*/
|
||||||
|
static BitVector* Unserialize(UnserialInfo* info);
|
||||||
|
|
||||||
|
protected:
|
||||||
|
DECLARE_SERIAL(BitVector);
|
||||||
|
|
||||||
|
private:
|
||||||
|
/**
|
||||||
|
* Computes the number of excess/unused bits in the bit vector.
|
||||||
|
*/
|
||||||
|
block_type extra_bits() const;
|
||||||
|
|
||||||
|
/**
|
||||||
|
* If the number of bits in the vector is not a multiple of
|
||||||
|
* bitvector::bits_per_block, then the last block exhibits unused bits which
|
||||||
|
* this function resets.
|
||||||
|
*/
|
||||||
|
void zero_unused_bits();
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Looks for the first 1-bit starting at a given position.
|
||||||
|
* @param i The block index to start looking.
|
||||||
|
* @return The block index of the first 1-bit starting from *i* or
|
||||||
|
* `bitvector::npos` if no 1-bit exists.
|
||||||
|
*/
|
||||||
|
size_type find_from(size_type i) const;
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Computes the block index for a given bit position.
|
||||||
|
*/
|
||||||
|
static size_type block_index(size_type i)
|
||||||
|
{
|
||||||
|
return i / bits_per_block;
|
||||||
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Computes the bit index within a given block for a given bit position.
|
||||||
|
*/
|
||||||
|
static block_type bit_index(size_type i)
|
||||||
|
{
|
||||||
|
return i % bits_per_block;
|
||||||
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Computes the bitmask block to extract the bit at a given bit position.
|
||||||
|
*/
|
||||||
|
static block_type bit_mask(size_type i)
|
||||||
|
{
|
||||||
|
return block_type(1) << bit_index(i);
|
||||||
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Computes the number of blocks needed to represent a given number of
|
||||||
|
* bits.
|
||||||
|
* @param bits the number of bits.
|
||||||
|
* @return The number of blocks to represent *bits* number of bits.
|
||||||
|
*/
|
||||||
|
static size_type bits_to_blocks(size_type bits)
|
||||||
|
{
|
||||||
|
return bits / bits_per_block
|
||||||
|
+ static_cast<size_type>(bits % bits_per_block != 0);
|
||||||
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Computes the bit position of the first 1-bit in a given block.
|
||||||
|
* @param block The block to inspect.
|
||||||
|
* @return The bit position where *block* has its first bit set to 1.
|
||||||
|
*/
|
||||||
|
static size_type lowest_bit(block_type block);
|
||||||
|
|
||||||
|
std::vector<block_type> bits;
|
||||||
|
size_type num_bits;
|
||||||
|
};
|
||||||
|
|
||||||
|
}
|
||||||
|
|
||||||
|
#endif
|
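The index helpers at the end of BitVector.h (block_index, bit_index, bit_mask, bits_to_blocks) can be tried in isolation. The following is a minimal sketch, not the BitVector implementation itself: it assumes a 64-bit block_type, and the TinyBits struct is a hypothetical stand-in used only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Assumption: 64-bit blocks; the real block_type is defined in BitVector.h.
typedef std::uint64_t block_type;
static const std::size_t bits_per_block = std::numeric_limits<block_type>::digits;

// Index of the block that holds bit i.
inline std::size_t block_index(std::size_t i) { return i / bits_per_block; }

// Position of bit i inside its block.
inline std::size_t bit_index(std::size_t i) { return i % bits_per_block; }

// Mask that selects bit i inside its block.
inline block_type bit_mask(std::size_t i) { return block_type(1) << bit_index(i); }

// Number of blocks needed to hold n bits (rounds up).
inline std::size_t bits_to_blocks(std::size_t n)
	{
	return n / bits_per_block + (n % bits_per_block != 0 ? 1 : 0);
	}

// Hypothetical, stripped-down bit store built on the helpers above.
struct TinyBits {
	std::vector<block_type> blocks;
	std::size_t num_bits;

	explicit TinyBits(std::size_t n) : blocks(bits_to_blocks(n), 0), num_bits(n) {}

	void Set(std::size_t i)
		{
		assert(i < num_bits);
		blocks[block_index(i)] |= bit_mask(i);
		}

	bool Test(std::size_t i) const
		{
		assert(i < num_bits);
		return (blocks[block_index(i)] & bit_mask(i)) != 0;
		}
};
```

With 64-bit blocks, a 130-bit store occupies three blocks, and bit 70 lives in block 1 at offset 6.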
244
src/probabilistic/BloomFilter.cc
Normal file
|
@ -0,0 +1,244 @@
|
||||||
|
// See the file "COPYING" in the main distribution directory for copyright.
|
||||||
|
|
||||||
|
#include <typeinfo>
|
||||||
|
#include <cmath>
|
||||||
|
#include <limits>
|
||||||
|
|
||||||
|
#include "BloomFilter.h"
|
||||||
|
|
||||||
|
#include "CounterVector.h"
|
||||||
|
#include "Serializer.h"
|
||||||
|
|
||||||
|
using namespace probabilistic;
|
||||||
|
|
||||||
|
BloomFilter::BloomFilter()
|
||||||
|
{
|
||||||
|
hasher = 0;
|
||||||
|
}
|
||||||
|
|
||||||
|
BloomFilter::BloomFilter(const Hasher* arg_hasher)
|
||||||
|
{
|
||||||
|
hasher = arg_hasher;
|
||||||
|
}
|
||||||
|
|
||||||
|
BloomFilter::~BloomFilter()
|
||||||
|
{
|
||||||
|
delete hasher;
|
||||||
|
}
|
||||||
|
|
||||||
|
bool BloomFilter::Serialize(SerialInfo* info) const
|
||||||
|
{
|
||||||
|
return SerialObj::Serialize(info);
|
||||||
|
}
|
||||||
|
|
||||||
|
BloomFilter* BloomFilter::Unserialize(UnserialInfo* info)
|
||||||
|
{
|
||||||
|
return reinterpret_cast<BloomFilter*>(SerialObj::Unserialize(info, SER_BLOOMFILTER));
|
||||||
|
}
|
||||||
|
|
||||||
|
bool BloomFilter::DoSerialize(SerialInfo* info) const
|
||||||
|
{
|
||||||
|
DO_SERIALIZE(SER_BLOOMFILTER, SerialObj);
|
||||||
|
|
||||||
|
return hasher->Serialize(info);
|
||||||
|
}
|
||||||
|
|
||||||
|
bool BloomFilter::DoUnserialize(UnserialInfo* info)
|
||||||
|
{
|
||||||
|
DO_UNSERIALIZE(SerialObj);
|
||||||
|
|
||||||
|
hasher = Hasher::Unserialize(info);
|
||||||
|
return hasher != 0;
|
||||||
|
}
|
||||||
|
|
||||||
|
size_t BasicBloomFilter::M(double fp, size_t capacity)
|
||||||
|
{
|
||||||
|
double ln2 = std::log(2.0);
|
||||||
|
return std::ceil(-(capacity * std::log(fp) / ln2 / ln2));
|
||||||
|
}
|
||||||
|
|
||||||
|
size_t BasicBloomFilter::K(size_t cells, size_t capacity)
|
||||||
|
{
|
||||||
|
double frac = static_cast<double>(cells) / static_cast<double>(capacity);
|
||||||
|
return std::ceil(frac * std::log(2.0));
|
||||||
|
}
|
||||||
|
|
||||||
|
bool BasicBloomFilter::Empty() const
|
||||||
|
{
|
||||||
|
return bits->AllZero();
|
||||||
|
}
|
||||||
|
|
||||||
|
void BasicBloomFilter::Clear()
|
||||||
|
{
|
||||||
|
bits->Clear();
|
||||||
|
}
|
||||||
|
|
||||||
|
bool BasicBloomFilter::Merge(const BloomFilter* other)
|
||||||
|
{
|
||||||
|
if ( typeid(*this) != typeid(*other) )
|
||||||
|
return false;
|
||||||
|
|
||||||
|
const BasicBloomFilter* o = static_cast<const BasicBloomFilter*>(other);
|
||||||
|
|
||||||
|
if ( ! hasher->Equals(o->hasher) )
|
||||||
|
{
|
||||||
|
reporter->Error("incompatible hashers in BasicBloomFilter merge");
|
||||||
|
return false;
|
||||||
|
}
|
||||||
|
|
||||||
|
else if ( bits->Size() != o->bits->Size() )
|
||||||
|
{
|
||||||
|
reporter->Error("different bitvector size in BasicBloomFilter merge");
|
||||||
|
return false;
|
||||||
|
}
|
||||||
|
|
||||||
|
(*bits) |= *o->bits;
|
||||||
|
|
||||||
|
return true;
|
||||||
|
}
|
||||||
|
|
||||||
|
BasicBloomFilter* BasicBloomFilter::Clone() const
|
||||||
|
{
|
||||||
|
BasicBloomFilter* copy = new BasicBloomFilter();
|
||||||
|
|
||||||
|
copy->hasher = hasher->Clone();
|
||||||
|
copy->bits = new BitVector(*bits);
|
||||||
|
|
||||||
|
return copy;
|
||||||
|
}
|
||||||
|
|
||||||
|
BasicBloomFilter::BasicBloomFilter()
|
||||||
|
{
|
||||||
|
bits = 0;
|
||||||
|
}
|
||||||
|
|
||||||
|
BasicBloomFilter::BasicBloomFilter(const Hasher* hasher, size_t cells)
|
||||||
|
: BloomFilter(hasher)
|
||||||
|
{
|
||||||
|
bits = new BitVector(cells);
|
||||||
|
}
|
||||||
|
|
||||||
|
IMPLEMENT_SERIAL(BasicBloomFilter, SER_BASICBLOOMFILTER)
|
||||||
|
|
||||||
|
bool BasicBloomFilter::DoSerialize(SerialInfo* info) const
|
||||||
|
{
|
||||||
|
DO_SERIALIZE(SER_BASICBLOOMFILTER, BloomFilter);
|
||||||
|
return bits->Serialize(info);
|
||||||
|
}
|
||||||
|
|
||||||
|
bool BasicBloomFilter::DoUnserialize(UnserialInfo* info)
|
||||||
|
{
|
||||||
|
DO_UNSERIALIZE(BloomFilter);
|
||||||
|
bits = BitVector::Unserialize(info);
|
||||||
|
return (bits != 0);
|
||||||
|
}
|
||||||
|
|
||||||
|
void BasicBloomFilter::AddImpl(const Hasher::digest_vector& h)
|
||||||
|
{
|
||||||
|
for ( size_t i = 0; i < h.size(); ++i )
|
||||||
|
bits->Set(h[i] % bits->Size());
|
||||||
|
}
|
||||||
|
|
||||||
|
size_t BasicBloomFilter::CountImpl(const Hasher::digest_vector& h) const
|
||||||
|
{
|
||||||
|
for ( size_t i = 0; i < h.size(); ++i )
|
||||||
|
{
|
||||||
|
if ( ! (*bits)[h[i] % bits->Size()] )
|
||||||
|
return 0;
|
||||||
|
}
|
||||||
|
|
||||||
|
return 1;
|
||||||
|
}
|
||||||
|
|
||||||
|
CountingBloomFilter::CountingBloomFilter()
|
||||||
|
{
|
||||||
|
cells = 0;
|
||||||
|
}
|
||||||
|
|
||||||
|
CountingBloomFilter::CountingBloomFilter(const Hasher* hasher,
|
||||||
|
size_t arg_cells, size_t width)
|
||||||
|
: BloomFilter(hasher)
|
||||||
|
{
|
||||||
|
cells = new CounterVector(width, arg_cells);
|
||||||
|
}
|
||||||
|
|
||||||
|
bool CountingBloomFilter::Empty() const
|
||||||
|
{
|
||||||
|
return cells->AllZero();
|
||||||
|
}
|
||||||
|
|
||||||
|
void CountingBloomFilter::Clear()
|
||||||
|
{
|
||||||
|
cells->Clear();
|
||||||
|
}
|
||||||
|
|
||||||
|
bool CountingBloomFilter::Merge(const BloomFilter* other)
|
||||||
|
{
|
||||||
|
if ( typeid(*this) != typeid(*other) )
|
||||||
|
return false;
|
||||||
|
|
||||||
|
const CountingBloomFilter* o = static_cast<const CountingBloomFilter*>(other);
|
||||||
|
|
||||||
|
if ( ! hasher->Equals(o->hasher) )
|
||||||
|
{
|
||||||
|
reporter->Error("incompatible hashers in CountingBloomFilter merge");
|
||||||
|
return false;
|
||||||
|
}
|
||||||
|
|
||||||
|
else if ( cells->Size() != o->cells->Size() )
|
||||||
|
{
|
||||||
|
reporter->Error("different bitvector size in CountingBloomFilter merge");
|
||||||
|
return false;
|
||||||
|
}
|
||||||
|
|
||||||
|
(*cells) |= *o->cells;
|
||||||
|
|
||||||
|
return true;
|
||||||
|
}
|
||||||
|
|
||||||
|
CountingBloomFilter* CountingBloomFilter::Clone() const
|
||||||
|
{
|
||||||
|
CountingBloomFilter* copy = new CountingBloomFilter();
|
||||||
|
|
||||||
|
copy->hasher = hasher->Clone();
|
||||||
|
copy->cells = new CounterVector(*cells);
|
||||||
|
|
||||||
|
return copy;
|
||||||
|
}
|
||||||
|
|
||||||
|
IMPLEMENT_SERIAL(CountingBloomFilter, SER_COUNTINGBLOOMFILTER)
|
||||||
|
|
||||||
|
bool CountingBloomFilter::DoSerialize(SerialInfo* info) const
|
||||||
|
{
|
||||||
|
DO_SERIALIZE(SER_COUNTINGBLOOMFILTER, BloomFilter);
|
||||||
|
return cells->Serialize(info);
|
||||||
|
}
|
||||||
|
|
||||||
|
bool CountingBloomFilter::DoUnserialize(UnserialInfo* info)
|
||||||
|
{
|
||||||
|
DO_UNSERIALIZE(BloomFilter);
|
||||||
|
cells = CounterVector::Unserialize(info);
|
||||||
|
return (cells != 0);
|
||||||
|
}
|
||||||
|
|
||||||
|
// TODO: Use partitioning in add/count to allow for reusing CMS bounds.
|
||||||
|
void CountingBloomFilter::AddImpl(const Hasher::digest_vector& h)
|
||||||
|
{
|
||||||
|
for ( size_t i = 0; i < h.size(); ++i )
|
||||||
|
cells->Increment(h[i] % cells->Size());
|
||||||
|
}
|
||||||
|
|
||||||
|
size_t CountingBloomFilter::CountImpl(const Hasher::digest_vector& h) const
|
||||||
|
{
|
||||||
|
CounterVector::size_type min =
|
||||||
|
std::numeric_limits<CounterVector::size_type>::max();
|
||||||
|
|
||||||
|
for ( size_t i = 0; i < h.size(); ++i )
|
||||||
|
{
|
||||||
|
CounterVector::size_type cnt = cells->Count(h[i] % cells->Size());
|
||||||
|
if ( cnt < min )
|
||||||
|
min = cnt;
|
||||||
|
}
|
||||||
|
|
||||||
|
return min;
|
||||||
|
}
|
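BasicBloomFilter::M and BasicBloomFilter::K above implement the usual sizing formulas m = ceil(-n ln p / (ln 2)^2) and k = ceil(m/n ln 2). The sketch below reproduces them as free functions and wires them into a tiny filter. The names cells_for, hashes_for, seeded_hash and TinyBloom are hypothetical, and the seeded FNV-1a-style hash is only an assumed stand-in for the Hasher class, which is not shown in this part of the diff.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// m = ceil(-n * ln(p) / (ln 2)^2), as in BasicBloomFilter::M.
inline std::size_t cells_for(double fp, std::size_t capacity)
	{
	double ln2 = std::log(2.0);
	return static_cast<std::size_t>(
		std::ceil(-(capacity * std::log(fp)) / (ln2 * ln2)));
	}

// k = ceil(m / n * ln 2), as in BasicBloomFilter::K.
inline std::size_t hashes_for(std::size_t cells, std::size_t capacity)
	{
	double frac = static_cast<double>(cells) / static_cast<double>(capacity);
	return static_cast<std::size_t>(std::ceil(frac * std::log(2.0)));
	}

// Assumption: a seeded FNV-1a-style hash replaces the real Hasher.
inline std::uint64_t seeded_hash(const std::string& s, std::uint64_t seed)
	{
	std::uint64_t h = 14695981039346656037ULL ^ seed;

	for ( std::size_t i = 0; i < s.size(); ++i )
		{
		h ^= static_cast<unsigned char>(s[i]);
		h *= 1099511628211ULL;
		}

	return h;
	}

// Hypothetical miniature of a basic Bloom filter.
class TinyBloom {
public:
	TinyBloom(double fp, std::size_t capacity)
		{
		bits.assign(cells_for(fp, capacity), false);
		k = hashes_for(bits.size(), capacity);
		}

	// Sets the k cells selected by the k hash values.
	void Add(const std::string& x)
		{
		for ( std::size_t i = 0; i < k; ++i )
			bits[seeded_hash(x, i) % bits.size()] = true;
		}

	// Like BasicBloomFilter::CountImpl: 1 if all k cells are set, else 0.
	std::size_t Count(const std::string& x) const
		{
		for ( std::size_t i = 0; i < k; ++i )
			if ( ! bits[seeded_hash(x, i) % bits.size()] )
				return 0;

		return 1;
		}

private:
	std::vector<bool> bits;
	std::size_t k;
};
```

For a false positive rate of 0.01 and a capacity of 1000 elements, these formulas give 9586 cells and 7 hash functions.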
238
src/probabilistic/BloomFilter.h
Normal file
|
@ -0,0 +1,238 @@
|
||||||
|
// See the file "COPYING" in the main distribution directory for copyright.
|
||||||
|
|
||||||
|
#ifndef PROBABILISTIC_BLOOMFILTER_H
|
||||||
|
#define PROBABILISTIC_BLOOMFILTER_H
|
||||||
|
|
||||||
|
#include <vector>
|
||||||
|
#include "BitVector.h"
|
||||||
|
#include "Hasher.h"
|
||||||
|
|
||||||
|
namespace probabilistic {
|
||||||
|
|
||||||
|
class CounterVector;
|
||||||
|
|
||||||
|
/**
|
||||||
|
* The abstract base class for Bloom filters.
|
||||||
|
*/
|
||||||
|
class BloomFilter : public SerialObj {
|
||||||
|
public:
|
||||||
|
/**
|
||||||
|
* Destructor.
|
||||||
|
*/
|
||||||
|
virtual ~BloomFilter();
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Adds an element of type T to the Bloom filter.
|
||||||
|
* @param x The element to add
|
||||||
|
*/
|
||||||
|
template <typename T>
|
||||||
|
void Add(const T& x)
|
||||||
|
{
|
||||||
|
AddImpl((*hasher)(x));
|
||||||
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Retrieves the associated count of a given value.
|
||||||
|
*
|
||||||
|
* @param x The value of type `T` to check.
|
||||||
|
*
|
||||||
|
* @return The counter associated with *x*.
|
||||||
|
*/
|
||||||
|
template <typename T>
|
||||||
|
size_t Count(const T& x) const
|
||||||
|
{
|
||||||
|
return CountImpl((*hasher)(x));
|
||||||
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Checks whether the Bloom filter is empty.
|
||||||
|
*
|
||||||
|
* @return `true` if the Bloom filter contains no elements.
|
||||||
|
*/
|
||||||
|
virtual bool Empty() const = 0;
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Removes all elements, i.e., resets all bits in the underlying bit vector.
|
||||||
|
*/
|
||||||
|
virtual void Clear() = 0;
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Merges another Bloom filter into this one.
|
||||||
|
*
|
||||||
|
* @param other The other Bloom filter.
|
||||||
|
*
|
||||||
|
* @return `true` on success.
|
||||||
|
*/
|
||||||
|
virtual bool Merge(const BloomFilter* other) = 0;
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Constructs a copy of this Bloom filter.
|
||||||
|
*
|
||||||
|
* @return A copy of `*this`.
|
||||||
|
*/
|
||||||
|
virtual BloomFilter* Clone() const = 0;
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Serializes the Bloom filter.
|
||||||
|
*
|
||||||
|
* @param info The serialization information to use.
|
||||||
|
*
|
||||||
|
* @return True if successful.
|
||||||
|
*/
|
||||||
|
bool Serialize(SerialInfo* info) const;
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Unserializes a Bloom filter.
|
||||||
|
*
|
||||||
|
* @param info The serialization information to use.
|
||||||
|
*
|
||||||
|
* @return The unserialized Bloom filter, or null if an error
|
||||||
|
* occurred.
|
||||||
|
*/
|
||||||
|
static BloomFilter* Unserialize(UnserialInfo* info);
|
||||||
|
|
||||||
|
protected:
|
||||||
|
DECLARE_ABSTRACT_SERIAL(BloomFilter);
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Default constructor.
|
||||||
|
*/
|
||||||
|
BloomFilter();
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Constructs a Bloom filter.
|
||||||
|
*
|
||||||
|
* @param hasher The hasher to use for this Bloom filter.
|
||||||
|
*/
|
||||||
|
BloomFilter(const Hasher* hasher);
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Abstract method for implementing the *Add* operation.
|
||||||
|
*
|
||||||
|
* @param hashes A set of *k* hashes for the item to add, computed by
|
||||||
|
* the internal hasher object.
|
||||||
|
*
|
||||||
|
*/
|
||||||
|
virtual void AddImpl(const Hasher::digest_vector& hashes) = 0;
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Abstract method for implementing the *Count* operation.
|
||||||
|
*
|
||||||
|
* @param hashes A set of *k* hashes for the item to look up, computed by
|
||||||
|
* the internal hasher object.
|
||||||
|
*
|
||||||
|
* @return Returns the counter associated with the hashed element.
|
||||||
|
*/
|
||||||
|
virtual size_t CountImpl(const Hasher::digest_vector& hashes) const = 0;
|
||||||
|
|
||||||
|
const Hasher* hasher;
|
||||||
|
};
|
||||||
|
|
||||||
|
/**
|
||||||
|
* A basic Bloom filter.
|
||||||
|
*/
|
||||||
|
class BasicBloomFilter : public BloomFilter {
|
||||||
|
public:
|
||||||
|
/**
|
||||||
|
* Constructs a basic Bloom filter with a given number of cells. The
|
||||||
|
* ideal number of cells can be computed with *M*.
|
||||||
|
*
|
||||||
|
* @param hasher The hasher to use. The ideal number of hash
|
||||||
|
* functions can be computed with *K*.
|
||||||
|
*
|
||||||
|
* @param cells The number of cells.
|
||||||
|
*/
|
||||||
|
BasicBloomFilter(const Hasher* hasher, size_t cells);
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Computes the number of cells based on a given false positive rate
|
||||||
|
* and capacity. In the literature, this parameter often has the name
|
||||||
|
* *M*.
|
||||||
|
*
|
||||||
|
* @param fp The false positive rate.
|
||||||
|
*
|
||||||
|
* @param capacity The expected number of elements that will be
|
||||||
|
* stored.
|
||||||
|
*
|
||||||
|
* @return The number of cells needed to support a false positive rate
|
||||||
|
* of *fp* with at most *capacity* elements.
|
||||||
|
*/
|
||||||
|
static size_t M(double fp, size_t capacity);
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Computes the optimal number of hash functions based on the number of cells
|
||||||
|
* and expected number of elements.
|
||||||
|
*
|
||||||
|
* @param cells The number of cells (*m*).
|
||||||
|
*
|
||||||
|
* @param capacity The maximum number of elements.
|
||||||
|
*
|
||||||
|
* @return The optimal number of hash functions for *cells* cells and
|
||||||
|
* at most *capacity* elements.
|
||||||
|
*/
|
||||||
|
static size_t K(size_t cells, size_t capacity);
|
||||||
|
|
||||||
|
// Overridden from BloomFilter.
|
||||||
|
virtual bool Empty() const;
|
||||||
|
virtual void Clear();
|
||||||
|
virtual bool Merge(const BloomFilter* other);
|
||||||
|
virtual BasicBloomFilter* Clone() const;
|
||||||
|
|
||||||
|
protected:
|
||||||
|
DECLARE_SERIAL(BasicBloomFilter);
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Default constructor.
|
||||||
|
*/
|
||||||
|
BasicBloomFilter();
|
||||||
|
|
||||||
|
// Overridden from BloomFilter.
|
||||||
|
virtual void AddImpl(const Hasher::digest_vector& h);
|
||||||
|
virtual size_t CountImpl(const Hasher::digest_vector& h) const;
|
||||||
|
|
||||||
|
private:
|
||||||
|
BitVector* bits;
|
||||||
|
};
|
||||||
|
|
||||||
|
/**
|
||||||
|
* A counting Bloom filter.
|
||||||
|
*/
|
||||||
|
class CountingBloomFilter : public BloomFilter {
|
||||||
|
public:
|
||||||
|
/**
|
||||||
|
* Constructs a counting Bloom filter.
|
||||||
|
*
|
||||||
|
* @param hasher The hasher to use. The ideal number of hash
|
||||||
|
* functions can be computed with *K*.
|
||||||
|
*
|
||||||
|
* @param cells The number of cells to use.
|
||||||
|
*
|
||||||
|
* @param width The maximal bit-width of counter values.
|
||||||
|
*/
|
||||||
|
CountingBloomFilter(const Hasher* hasher, size_t cells, size_t width);
|
||||||
|
|
||||||
|
// Overridden from BloomFilter.
|
||||||
|
virtual bool Empty() const;
|
||||||
|
virtual void Clear();
|
||||||
|
virtual bool Merge(const BloomFilter* other);
|
||||||
|
virtual CountingBloomFilter* Clone() const;
|
||||||
|
|
||||||
|
protected:
|
||||||
|
DECLARE_SERIAL(CountingBloomFilter);
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Default constructor.
|
||||||
|
*/
|
||||||
|
CountingBloomFilter();
|
||||||
|
|
||||||
|
// Overridden from BloomFilter.
|
||||||
|
virtual void AddImpl(const Hasher::digest_vector& h);
|
||||||
|
virtual size_t CountImpl(const Hasher::digest_vector& h) const;
|
||||||
|
|
||||||
|
private:
|
||||||
|
CounterVector* cells;
|
||||||
|
};
|
||||||
|
|
||||||
|
}
|
||||||
|
|
||||||
|
#endif
|
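The counting variant declared above answers Count() with the minimum over the cells an element hashes to, as CountingBloomFilter::CountImpl shows. The following sketch illustrates that rule under simplifying assumptions: plain unsigned counters replace CounterVector, the digests are passed in precomputed, and cbf_add and cbf_count are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Assumption: one precomputed hash value per hash function.
typedef std::vector<std::size_t> digest_vector;

// Increments every cell selected by the digests.
inline void cbf_add(std::vector<unsigned>& cells, const digest_vector& h)
	{
	for ( std::size_t i = 0; i < h.size(); ++i )
		++cells[h[i] % cells.size()];
	}

// Returns the smallest counter among the selected cells. Collisions can
// only inflate a cell, so the minimum never undercounts the true frequency.
inline unsigned cbf_count(const std::vector<unsigned>& cells,
                          const digest_vector& h)
	{
	unsigned min = std::numeric_limits<unsigned>::max();

	for ( std::size_t i = 0; i < h.size(); ++i )
		min = std::min(min, cells[h[i] % cells.size()]);

	return min;
	}
```

If one element is added twice and a second element shares two of its three cells, the shared cells read 3 while the unshared cell still reads 2, so the estimate for the first element stays at 2.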
18
src/probabilistic/CMakeLists.txt
Normal file
|
@ -0,0 +1,18 @@
|
||||||
|
|
||||||
|
include(BroSubdir)
|
||||||
|
|
||||||
|
include_directories(BEFORE
|
||||||
|
${CMAKE_CURRENT_SOURCE_DIR}
|
||||||
|
${CMAKE_CURRENT_BINARY_DIR}
|
||||||
|
)
|
||||||
|
|
||||||
|
set(probabilistic_SRCS
|
||||||
|
BitVector.cc
|
||||||
|
BloomFilter.cc
|
||||||
|
CounterVector.cc
|
||||||
|
Hasher.cc)
|
||||||
|
|
||||||
|
bif_target(bloom-filter.bif)
|
||||||
|
bro_add_subdir_library(probabilistic ${probabilistic_SRCS})
|
||||||
|
|
||||||
|
add_dependencies(bro_probabilistic generate_outputs)
|
193
src/probabilistic/CounterVector.cc
Normal file
|
@ -0,0 +1,193 @@
|
||||||
|
// See the file "COPYING" in the main distribution directory for copyright.
|
||||||
|
|
||||||
|
#include "CounterVector.h"
|
||||||
|
|
||||||
|
#include <limits>
|
||||||
|
#include "BitVector.h"
|
||||||
|
#include "Serializer.h"
|
||||||
|
|
||||||
|
using namespace probabilistic;
|
||||||
|
|
||||||
|
CounterVector::CounterVector(size_t arg_width, size_t cells)
|
||||||
|
{
|
||||||
|
bits = new BitVector(arg_width * cells);
|
||||||
|
width = arg_width;
|
||||||
|
}
|
||||||
|
|
||||||
|
CounterVector::CounterVector(const CounterVector& other)
|
||||||
|
{
|
||||||
|
bits = new BitVector(*other.bits);
|
||||||
|
width = other.width;
|
||||||
|
}
|
||||||
|
|
||||||
|
CounterVector::~CounterVector()
|
||||||
|
{
|
||||||
|
delete bits;
|
||||||
|
}
|
||||||
|
|
||||||
|
bool CounterVector::Increment(size_type cell, count_type value)
|
||||||
|
{
|
||||||
|
assert(cell < Size());
|
||||||
|
assert(value != 0);
|
||||||
|
|
||||||
|
size_t lsb = cell * width;
|
||||||
|
bool carry = false;
|
||||||
|
|
||||||
|
for ( size_t i = 0; i < width; ++i )
|
||||||
|
{
|
||||||
|
bool b1 = (*bits)[lsb + i];
|
||||||
|
bool b2 = value & (count_type(1) << i);
|
||||||
|
(*bits)[lsb + i] = b1 ^ b2 ^ carry;
|
||||||
|
carry = ( b1 && b2 ) || ( carry && ( b1 != b2 ) );
|
||||||
|
}
|
||||||
|
|
||||||
|
if ( carry )
|
||||||
|
{
|
||||||
|
for ( size_t i = 0; i < width; ++i )
|
||||||
|
bits->Set(lsb + i);
|
||||||
|
}
|
||||||
|
|
||||||
|
return ! carry;
|
||||||
|
}
|
||||||
|
|
||||||
|
bool CounterVector::Decrement(size_type cell, count_type value)
|
||||||
|
{
|
||||||
|
assert(cell < Size());
|
||||||
|
assert(value != 0);
|
||||||
|
|
||||||
|
value = ~value + 1; // A - B := A + ~B + 1
|
||||||
|
bool carry = false;
|
||||||
|
size_t lsb = cell * width;
|
||||||
|
|
||||||
|
for ( size_t i = 0; i < width; ++i )
|
||||||
|
{
|
||||||
|
bool b1 = (*bits)[lsb + i];
|
||||||
|
bool b2 = value & (count_type(1) << i);
|
||||||
|
(*bits)[lsb + i] = b1 ^ b2 ^ carry;
|
||||||
|
carry = ( b1 && b2 ) || ( carry && ( b1 != b2 ) );
|
||||||
|
}
|
||||||
|
|
||||||
|
return carry;
|
||||||
|
}
|
||||||
|
|
||||||
|
bool CounterVector::AllZero() const
|
||||||
|
{
|
||||||
|
return bits->AllZero();
|
||||||
|
}
|
||||||
|
|
||||||
|
void CounterVector::Clear()
|
||||||
|
{
|
||||||
|
bits->Clear();
|
||||||
|
}
|
||||||
|
|
||||||
|
CounterVector::count_type CounterVector::Count(size_type cell) const
|
||||||
|
{
|
||||||
|
assert(cell < Size());
|
||||||
|
|
||||||
|
size_t cnt = 0, order = 1;
|
||||||
|
size_t lsb = cell * width;
|
||||||
|
|
||||||
|
for ( size_t i = lsb; i < lsb + width; ++i, order <<= 1 )
|
||||||
|
if ( (*bits)[i] )
|
||||||
|
cnt |= order;
|
||||||
|
|
||||||
|
return cnt;
|
||||||
|
}
|
||||||
|
|
||||||
|
CounterVector::size_type CounterVector::Size() const
|
||||||
|
{
|
||||||
|
return bits->Size() / width;
|
||||||
|
}
|
||||||
|
|
||||||
|
size_t CounterVector::Width() const
|
||||||
|
{
|
||||||
|
return width;
|
||||||
|
}
|
||||||
|
|
||||||
|
size_t CounterVector::Max() const
|
||||||
|
{
|
||||||
|
return std::numeric_limits<size_t>::max()
|
||||||
|
>> (std::numeric_limits<size_t>::digits - width);
|
||||||
|
}
|
||||||
|
|
||||||
|
CounterVector& CounterVector::Merge(const CounterVector& other)
|
||||||
|
{
|
||||||
|
assert(Size() == other.Size());
|
||||||
|
assert(Width() == other.Width());
|
||||||
|
|
||||||
|
for ( size_t cell = 0; cell < Size(); ++cell )
|
||||||
|
{
|
||||||
|
size_t lsb = cell * width;
|
||||||
|
bool carry = false;
|
||||||
|
|
||||||
|
for ( size_t i = 0; i < width; ++i )
|
||||||
|
{
|
||||||
|
bool b1 = (*bits)[lsb + i];
|
||||||
|
bool b2 = (*other.bits)[lsb + i];
|
||||||
|
(*bits)[lsb + i] = b1 ^ b2 ^ carry;
|
||||||
|
carry = ( b1 && b2 ) || ( carry && ( b1 != b2 ) );
|
||||||
|
}
|
||||||
|
|
||||||
|
if ( carry )
|
||||||
|
{
|
||||||
|
for ( size_t i = 0; i < width; ++i )
|
||||||
|
bits->Set(lsb + i);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
return *this;
|
||||||
|
}
|
||||||
|
|
||||||
|
namespace probabilistic {
|
||||||
|
|
||||||
|
CounterVector& CounterVector::operator|=(const CounterVector& other)
|
||||||
|
{
|
||||||
|
return Merge(other);
|
||||||
|
}
|
||||||
|
|
||||||
|
CounterVector operator|(const CounterVector& x, const CounterVector& y)
|
||||||
|
{
|
||||||
|
CounterVector cv(x);
|
||||||
|
return cv |= y;
|
||||||
|
}
|
||||||
|
|
||||||
|
}
|
||||||
|
|
||||||
|
bool CounterVector::Serialize(SerialInfo* info) const
|
||||||
|
{
|
||||||
|
return SerialObj::Serialize(info);
|
||||||
|
}
|
||||||
|
|
||||||
|
CounterVector* CounterVector::Unserialize(UnserialInfo* info)
|
||||||
|
{
|
||||||
|
return reinterpret_cast<CounterVector*>(SerialObj::Unserialize(info, SER_COUNTERVECTOR));
|
||||||
|
}
|
||||||
|
|
||||||
|
IMPLEMENT_SERIAL(CounterVector, SER_COUNTERVECTOR)
|
||||||
|
|
||||||
|
bool CounterVector::DoSerialize(SerialInfo* info) const
|
||||||
|
{
|
||||||
|
DO_SERIALIZE(SER_COUNTERVECTOR, SerialObj);
|
||||||
|
|
||||||
|
if ( ! bits->Serialize(info) )
|
||||||
|
return false;
|
||||||
|
|
||||||
|
return SERIALIZE(static_cast<uint64>(width));
|
||||||
|
}
|
||||||
|
|
||||||
|
bool CounterVector::DoUnserialize(UnserialInfo* info)
|
||||||
|
{
|
||||||
|
DO_UNSERIALIZE(SerialObj);
|
||||||
|
|
||||||
|
bits = BitVector::Unserialize(info);
|
||||||
|
if ( ! bits )
|
||||||
|
return false;
|
||||||
|
|
||||||
|
uint64 w;
|
||||||
|
if ( ! UNSERIALIZE(&w) )
|
||||||
|
return false;
|
||||||
|
|
||||||
|
width = static_cast<size_t>(w);
|
||||||
|
|
||||||
|
return true;
|
||||||
|
}
|
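CounterVector::Increment above is a ripple-carry adder over the bits of one cell that saturates the cell when the addition overflows. The sketch below shows the same idea in a self-contained form. It is a simplified, hypothetical TinyCounters class backed by std::vector<bool> rather than BitVector, and it assumes that the added value fits into the counter width.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, simplified stand-in for CounterVector.
class TinyCounters {
public:
	TinyCounters(std::size_t arg_width, std::size_t cells)
		: bits(arg_width * cells, false), width(arg_width)
		{
		assert(arg_width > 0 && arg_width <= 64 && cells > 0);
		}

	// Adds value to a cell bit by bit. On overflow the cell is set to its
	// maximum (all bits 1) and false is returned.
	bool Increment(std::size_t cell, std::uint64_t value = 1)
		{
		assert(cell < Size());

		std::size_t lsb = cell * width;
		bool carry = false;

		for ( std::size_t i = 0; i < width; ++i )
			{
			bool b1 = bits[lsb + i];
			bool b2 = (value >> i) & 1;
			bits[lsb + i] = b1 ^ b2 ^ carry;
			carry = (b1 && b2) || (carry && (b1 != b2));
			}

		if ( carry )
			for ( std::size_t i = 0; i < width; ++i )
				bits[lsb + i] = true;

		return ! carry;
		}

	// Reassembles the counter value from its bits, least significant first.
	std::uint64_t Count(std::size_t cell) const
		{
		assert(cell < Size());

		std::uint64_t cnt = 0;

		for ( std::size_t i = 0; i < width; ++i )
			if ( bits[cell * width + i] )
				cnt |= std::uint64_t(1) << i;

		return cnt;
		}

	std::size_t Size() const { return bits.size() / width; }

private:
	std::vector<bool> bits;
	std::size_t width;
};
```

With 4-bit cells, a counter that has reached 15 stays at 15 on the next increment, and that increment reports failure.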
165
src/probabilistic/CounterVector.h
Normal file
|
@ -0,0 +1,165 @@
|
||||||
|
// See the file "COPYING" in the main distribution directory for copyright.
|
||||||
|
|
||||||
|
#ifndef PROBABILISTIC_COUNTERVECTOR_H
|
||||||
|
#define PROBABILISTIC_COUNTERVECTOR_H
|
||||||
|
|
||||||
|
#include "SerialObj.h"
|
||||||
|
|
||||||
|
namespace probabilistic {
|
||||||
|
|
||||||
|
class BitVector;
|
||||||
|
|
||||||
|
/**
|
||||||
|
* A vector of counters, each of which has a fixed number of bits.
|
||||||
|
*/
|
||||||
|
class CounterVector : public SerialObj {
|
||||||
|
public:
|
||||||
|
typedef size_t size_type;
|
||||||
|
typedef uint64 count_type;
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Constructs a counter vector having cells of a given width.
|
||||||
|
*
|
||||||
|
* @param width The number of bits that each cell occupies.
|
||||||
|
*
|
||||||
|
* @param cells The number of cells in the bitvector.
|
||||||
|
*
|
||||||
|
* @pre `cells > 0 && width > 0`
|
||||||
|
*/
|
||||||
|
CounterVector(size_t width, size_t cells = 1024);
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Copy-constructs a counter vector.
|
||||||
|
*
|
||||||
|
* @param other The counter vector to copy.
|
||||||
|
*/
|
||||||
|
CounterVector(const CounterVector& other);
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Destructor.
|
||||||
|
*/
|
||||||
|
~CounterVector();
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Increments a given cell.
|
||||||
|
*
|
||||||
|
* @param cell The cell to increment.
|
||||||
|
*
|
||||||
|
* @param value The value to add to the current counter in *cell*.
|
||||||
|
*
|
||||||
|
* @return `true` if adding *value* to the counter in *cell* succeeded.
|
||||||
|
*
|
||||||
|
* @pre `cell < Size()`
|
||||||
|
*/
|
||||||
|
bool Increment(size_type cell, count_type value = 1);
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Decrements a given cell.
|
||||||
|
*
|
||||||
|
* @param cell The cell to decrement.
|
||||||
|
*
|
||||||
|
* @param value The value to subtract from the current counter in *cell*.
|
||||||
|
*
|
||||||
|
* @return `true` if subtracting *value* from the counter in *cell* succeeded.
|
||||||
|
*
|
||||||
|
* @pre `cell < Size()`
|
||||||
|
*/
|
||||||
|
bool Decrement(size_type cell, count_type value = 1);
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Retrieves the counter of a given cell.
|
||||||
|
*
|
||||||
|
* @param cell The cell index to retrieve the count for.
|
||||||
|
*
|
||||||
|
* @return The counter associated with *cell*.
|
||||||
|
*
|
||||||
|
* @pre `cell < Size()`
|
||||||
|
*/
|
||||||
|
count_type Count(size_type cell) const;
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Checks whether all counters are 0.
|
||||||
|
* @return `true` iff all counters have the value 0.
|
||||||
|
*/
|
||||||
|
bool AllZero() const;
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Sets all counters to 0.
|
||||||
|
*/
|
||||||
|
void Clear();
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Retrieves the number of cells in the storage.
|
||||||
|
*
|
||||||
|
* @return The number of cells.
|
||||||
|
*/
|
||||||
|
size_type Size() const;
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Retrieves the counter width.
|
||||||
|
*
|
||||||
|
* @return The number of bits per counter.
|
||||||
|
*/
|
||||||
|
size_t Width() const;
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Computes the maximum counter value.
|
||||||
|
*
|
||||||
|
* @return The maximum counter value based on the width.
|
||||||
|
*/
|
||||||
|
size_t Max() const;
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Merges another counter vector into this instance by *adding* the
|
||||||
|
* counters of each cell.
|
||||||
|
*
|
||||||
|
* @param other The counter vector to merge into this instance.
|
||||||
|
*
|
||||||
|
* @return A reference to `*this`.
|
||||||
|
*
|
||||||
|
* @pre `Size() == other.Size() && Width() == other.Width()`
|
||||||
|
*/
|
||||||
|
CounterVector& Merge(const CounterVector& other);
|
||||||
|
|
||||||
|
/**
|
||||||
|
* An alias for ::Merge.
|
||||||
|
*/
|
||||||
|
CounterVector& operator|=(const CounterVector& other);
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Serializes the counter vector.
|
||||||
|
*
|
||||||
|
* @param info The serialization information to use.
|
||||||
|
*
|
||||||
|
* @return True if successful.
|
||||||
|
*/
|
||||||
|
bool Serialize(SerialInfo* info) const;
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Unserializes the counter vector.
|
||||||
|
*
|
||||||
|
* @param info The serialization information to use.
|
||||||
|
*
|
||||||
|
* @return The unserialized counter vector, or null if an error
|
||||||
|
* occurred.
|
||||||
|
*/
|
||||||
|
static CounterVector* Unserialize(UnserialInfo* info);
|
||||||
|
|
||||||
|
protected:
|
||||||
|
friend CounterVector operator|(const CounterVector& x,
|
||||||
|
const CounterVector& y);
|
||||||
|
|
||||||
|
CounterVector() { }
|
||||||
|
|
||||||
|
DECLARE_SERIAL(CounterVector);
|
||||||
|
|
||||||
|
private:
|
||||||
|
CounterVector& operator=(const CounterVector&); // Disable.
|
||||||
|
|
||||||
|
BitVector* bits;
|
||||||
|
size_t width;
|
||||||
|
};
|
||||||
|
|
||||||
|
}
|
||||||
|
|
||||||
|
#endif
|
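The header above declares fixed-width counters stored in a bit vector, but the bit-level logic is only partly visible in this diff. The following is a minimal sketch of how such packing can work, assuming least-significant-bit-first storage and saturation on overflow. `PackedCounters` and its use of `std::vector<bool>` are hypothetical stand-ins, not the Bro `CounterVector` or `BitVector` classes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a fixed-width counter vector; not the Bro class.
class PackedCounters {
public:
    PackedCounters(std::size_t width, std::size_t cells)
        : bits_(width * cells, false), width_(width)
    {
        assert(width > 0 && width < 64 && cells > 0);
    }

    std::size_t Size() const { return bits_.size() / width_; }
    std::size_t Width() const { return width_; }

    // Largest value a cell can hold: 2^width - 1.
    std::uint64_t Max() const { return (std::uint64_t(1) << width_) - 1; }

    // Reads a cell by assembling its bits, least significant bit first.
    std::uint64_t Count(std::size_t cell) const
    {
        assert(cell < Size());
        std::uint64_t value = 0;
        for (std::size_t i = 0; i < width_; ++i)
            if (bits_[cell * width_ + i])
                value |= std::uint64_t(1) << i;
        return value;
    }

    // Adds value to a cell. Assumption: on overflow the cell saturates at
    // Max() and the call reports failure.
    bool Increment(std::size_t cell, std::uint64_t value = 1)
    {
        std::uint64_t current = Count(cell);
        bool fits = value <= Max() - current;
        Store(cell, fits ? current + value : Max());
        return fits;
    }

    // Subtracts value from a cell. Assumption: on underflow the cell is
    // clamped to 0 and the call reports failure.
    bool Decrement(std::size_t cell, std::uint64_t value = 1)
    {
        std::uint64_t current = Count(cell);
        bool fits = value <= current;
        Store(cell, fits ? current - value : 0);
        return fits;
    }

private:
    // Writes the low width_ bits of value into the cell.
    void Store(std::size_t cell, std::uint64_t value)
    {
        for (std::size_t i = 0; i < width_; ++i)
            bits_[cell * width_ + i] = ((value >> i) & 1) != 0;
    }

    std::vector<bool> bits_;
    std::size_t width_;
};
```

With a width of 2 bits, each cell holds values 0 through 3; incrementing a cell that already holds 2 by another 2 leaves it at 3 and returns `false` in this sketch.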
194
src/probabilistic/Hasher.cc
Normal file
|
@ -0,0 +1,194 @@
|
||||||
|
// See the file "COPYING" in the main distribution directory for copyright.
|
||||||
|
|
||||||
|
#include <typeinfo>
|
||||||
|
|
||||||
|
#include "Hasher.h"
|
||||||
|
#include "digest.h"
|
||||||
|
#include "Serializer.h"
|
||||||
|
|
||||||
|
using namespace probabilistic;
|
||||||
|
|
||||||
|
bool Hasher::Serialize(SerialInfo* info) const
|
||||||
|
{
|
||||||
|
return SerialObj::Serialize(info);
|
||||||
|
}
|
||||||
|
|
||||||
|
Hasher* Hasher::Unserialize(UnserialInfo* info)
|
||||||
|
{
|
||||||
|
return reinterpret_cast<Hasher*>(SerialObj::Unserialize(info, SER_HASHER));
|
||||||
|
}
|
||||||
|
|
||||||
|
bool Hasher::DoSerialize(SerialInfo* info) const
|
||||||
|
{
|
||||||
|
DO_SERIALIZE(SER_HASHER, SerialObj);
|
||||||
|
|
||||||
|
if ( ! SERIALIZE(static_cast<uint16>(k)) )
|
||||||
|
return false;
|
||||||
|
|
||||||
|
return SERIALIZE_STR(name.c_str(), name.size());
|
||||||
|
}
|
||||||
|
|
||||||
|
bool Hasher::DoUnserialize(UnserialInfo* info)
|
||||||
|
{
|
||||||
|
DO_UNSERIALIZE(SerialObj);
|
||||||
|
|
||||||
|
uint16 serial_k;
|
||||||
|
if ( ! UNSERIALIZE(&serial_k) )
|
||||||
|
return false;
|
||||||
|
|
||||||
|
k = serial_k;
|
||||||
|
assert(k > 0);
|
||||||
|
|
||||||
|
const char* serial_name;
|
||||||
|
if ( ! UNSERIALIZE_STR(&serial_name, 0) )
|
||||||
|
return false;
|
||||||
|
|
||||||
|
name = serial_name;
|
||||||
|
delete [] serial_name;
|
||||||
|
|
||||||
|
return true;
|
||||||
|
}
|
||||||
|
|
||||||
|
Hasher::Hasher(size_t k, const std::string& arg_name)
|
||||||
|
: k(k)
|
||||||
|
{
|
||||||
|
name = arg_name;
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
UHF::UHF(size_t seed, const std::string& extra)
|
||||||
|
: h(compute_seed(seed, extra))
|
||||||
|
{
|
||||||
|
}
|
||||||
|
|
||||||
|
Hasher::digest UHF::hash(const void* x, size_t n) const
|
||||||
|
{
|
||||||
|
assert(n <= UHASH_KEY_SIZE);
|
||||||
|
return n == 0 ? 0 : h(x, n);
|
||||||
|
}
|
||||||
|
|
||||||
|
size_t UHF::compute_seed(size_t seed, const std::string& extra)
|
||||||
|
{
|
||||||
|
u_char buf[SHA256_DIGEST_LENGTH];
|
||||||
|
SHA256_CTX ctx;
|
||||||
|
sha256_init(&ctx);
|
||||||
|
|
||||||
|
if ( extra.empty() )
|
||||||
|
{
|
||||||
|
unsigned int first_seed = initial_seed();
|
||||||
|
sha256_update(&ctx, &first_seed, sizeof(first_seed));
|
||||||
|
}
|
||||||
|
|
||||||
|
else
|
||||||
|
sha256_update(&ctx, extra.c_str(), extra.size());
|
||||||
|
|
||||||
|
sha256_update(&ctx, &seed, sizeof(seed));
|
||||||
|
sha256_final(&ctx, buf);
|
||||||
|
|
||||||
|
// Take the first sizeof(size_t) bytes as seed.
|
||||||
|
return *reinterpret_cast<size_t*>(buf);
|
||||||
|
}
|
||||||
|
|
||||||
|
DefaultHasher::DefaultHasher(size_t k, const std::string& name)
|
||||||
|
: Hasher(k, name)
|
||||||
|
{
|
||||||
|
for ( size_t i = 0; i < k; ++i )
|
||||||
|
hash_functions.push_back(UHF(i, name));
|
||||||
|
}
|
||||||
|
|
||||||
|
Hasher::digest_vector DefaultHasher::Hash(const void* x, size_t n) const
|
||||||
|
{
|
||||||
|
digest_vector h(K(), 0);
|
||||||
|
|
||||||
|
for ( size_t i = 0; i < h.size(); ++i )
|
||||||
|
h[i] = hash_functions[i](x, n);
|
||||||
|
|
||||||
|
return h;
|
||||||
|
}
|
||||||
|
|
||||||
|
DefaultHasher* DefaultHasher::Clone() const
|
||||||
|
{
|
||||||
|
return new DefaultHasher(*this);
|
||||||
|
}
|
||||||
|
|
||||||
|
bool DefaultHasher::Equals(const Hasher* other) const
|
||||||
|
{
|
||||||
|
if ( typeid(*this) != typeid(*other) )
|
||||||
|
return false;
|
||||||
|
|
||||||
|
const DefaultHasher* o = static_cast<const DefaultHasher*>(other);
|
||||||
|
return hash_functions == o->hash_functions;
|
||||||
|
}
|
||||||
|
|
||||||
|
IMPLEMENT_SERIAL(DefaultHasher, SER_DEFAULTHASHER)
|
||||||
|
|
||||||
|
bool DefaultHasher::DoSerialize(SerialInfo* info) const
|
||||||
|
{
|
||||||
|
DO_SERIALIZE(SER_DEFAULTHASHER, Hasher);
|
||||||
|
|
||||||
|
// Nothing to do here, the base class has all we need serialized already.
|
||||||
|
return true;
|
||||||
|
}
|
||||||
|
|
||||||
|
bool DefaultHasher::DoUnserialize(UnserialInfo* info)
|
||||||
|
{
|
||||||
|
DO_UNSERIALIZE(Hasher);
|
||||||
|
|
||||||
|
hash_functions.clear();
|
||||||
|
for ( size_t i = 0; i < K(); ++i )
|
||||||
|
hash_functions.push_back(UHF(i, Name()));
|
||||||
|
|
||||||
|
return true;
|
||||||
|
}
|
||||||
|
|
||||||
|
DoubleHasher::DoubleHasher(size_t k, const std::string& name)
|
||||||
|
: Hasher(k, name), h1(1, name), h2(2, name)
|
||||||
|
{
|
||||||
|
}
|
||||||
|
|
||||||
|
Hasher::digest_vector DoubleHasher::Hash(const void* x, size_t n) const
|
||||||
|
{
|
||||||
|
digest d1 = h1(x, n);
|
||||||
|
digest d2 = h2(x, n);
|
||||||
|
digest_vector h(K(), 0);
|
||||||
|
|
||||||
|
for ( size_t i = 0; i < h.size(); ++i )
|
||||||
|
h[i] = d1 + i * d2;
|
||||||
|
|
||||||
|
return h;
|
||||||
|
}
|
||||||
|
|
||||||
|
DoubleHasher* DoubleHasher::Clone() const
|
||||||
|
{
|
||||||
|
return new DoubleHasher(*this);
|
||||||
|
}
|
||||||
|
|
||||||
|
bool DoubleHasher::Equals(const Hasher* other) const
|
||||||
|
{
|
||||||
|
if ( typeid(*this) != typeid(*other) )
|
||||||
|
return false;
|
||||||
|
|
||||||
|
const DoubleHasher* o = static_cast<const DoubleHasher*>(other);
|
||||||
|
return h1 == o->h1 && h2 == o->h2;
|
||||||
|
}
|
||||||
|
|
||||||
|
IMPLEMENT_SERIAL(DoubleHasher, SER_DOUBLEHASHER)
|
||||||
|
|
||||||
|
bool DoubleHasher::DoSerialize(SerialInfo* info) const
|
||||||
|
{
|
||||||
|
DO_SERIALIZE(SER_DOUBLEHASHER, Hasher);
|
||||||
|
|
||||||
|
// Nothing to do here, the base class has all we need serialized already.
|
||||||
|
return true;
|
||||||
|
}
|
||||||
|
|
||||||
|
bool DoubleHasher::DoUnserialize(UnserialInfo* info)
|
||||||
|
{
|
||||||
|
DO_UNSERIALIZE(Hasher);
|
||||||
|
|
||||||
|
h1 = UHF(1, Name());
|
||||||
|
h2 = UHF(2, Name());
|
||||||
|
|
||||||
|
return true;
|
||||||
|
}
|
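`DoubleHasher::Hash` above combines two base digests as `d1 + i * d2` to obtain *k* values from only two hash evaluations. The sketch below isolates that step. The helper names are hypothetical, and reducing a digest to a cell index with a modulo is an assumption, since the code that consumes the digests is not part of this diff.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: derives k digests from two base digests using the
// linear combination d1 + i * d2. Unsigned arithmetic wraps modulo 2^64.
std::vector<std::uint64_t> double_hash(std::uint64_t d1, std::uint64_t d2,
                                       std::size_t k)
{
    std::vector<std::uint64_t> h(k, 0);
    for (std::size_t i = 0; i < h.size(); ++i)
        h[i] = d1 + i * d2;
    return h;
}

// Assumption: a filter with `cells` cells maps each digest to an index by
// taking it modulo the number of cells.
std::vector<std::size_t> to_cells(const std::vector<std::uint64_t>& digests,
                                  std::size_t cells)
{
    assert(cells > 0);
    std::vector<std::size_t> out;
    for (std::size_t i = 0; i < digests.size(); ++i)
        out.push_back(static_cast<std::size_t>(digests[i] % cells));
    return out;
}
```

For example, base digests 10 and 3 with *k* = 4 give 10, 13, 16, 19, which land in cells 2, 5, 0, 3 of an 8-cell filter. The trade-off relative to `DefaultHasher` is fewer hash computations per element at the cost of digests that are not independent.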
220
src/probabilistic/Hasher.h
Normal file
|
@ -0,0 +1,220 @@
|
||||||
|
// See the file "COPYING" in the main distribution directory for copyright.
|
||||||
|
|
||||||
|
#ifndef PROBABILISTIC_HASHER_H
|
||||||
|
#define PROBABILISTIC_HASHER_H
|
||||||
|
|
||||||
|
#include "Hash.h"
|
||||||
|
#include "H3.h"
|
||||||
|
#include "SerialObj.h"
|
||||||
|
|
||||||
|
namespace probabilistic {
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Abstract base class for hashers. A hasher creates a family of hash
|
||||||
|
* functions to hash an element *k* times.
|
||||||
|
*/
|
||||||
|
class Hasher : public SerialObj {
|
||||||
|
public:
|
||||||
|
typedef hash_t digest;
|
||||||
|
typedef std::vector<digest> digest_vector;
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Destructor.
|
||||||
|
*/
|
||||||
|
virtual ~Hasher() { }
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Computes hash values for an element.
|
||||||
|
*
|
||||||
|
* @param x The element to hash.
|
||||||
|
*
|
||||||
|
* @return Vector of *k* hash values.
|
||||||
|
*/
|
||||||
|
template <typename T>
|
||||||
|
digest_vector operator()(const T& x) const
|
||||||
|
{
|
||||||
|
return Hash(&x, sizeof(T));
|
||||||
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Computes the hashes for a set of bytes.
|
||||||
|
*
|
||||||
|
* @param x Pointer to first byte to hash.
|
||||||
|
*
|
||||||
|
* @param n Number of bytes to hash.
|
||||||
|
*
|
||||||
|
* @return Vector of *k* hash values.
|
||||||
|
*
|
||||||
|
*/
|
||||||
|
virtual digest_vector Hash(const void* x, size_t n) const = 0;
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Returns a deep copy of the hasher.
|
||||||
|
*/
|
||||||
|
virtual Hasher* Clone() const = 0;
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Returns true if two hashers are identical.
|
||||||
|
*/
|
||||||
|
virtual bool Equals(const Hasher* other) const = 0;
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Returns the number *k* of hash functions the hasher applies.
|
||||||
|
*/
|
||||||
|
size_t K() const { return k; }
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Returns the hasher's name. If not empty, the hasher uses this descriptor
|
||||||
|
* to seed its *k* hash functions. Otherwise the hasher mixes in the initial
|
||||||
|
* seed derived from the environment variable `$BRO_SEED`.
|
||||||
|
*/
|
||||||
|
const std::string& Name() const { return name; }
|
||||||
|
|
||||||
|
bool Serialize(SerialInfo* info) const;
|
||||||
|
static Hasher* Unserialize(UnserialInfo* info);
|
||||||
|
|
||||||
|
protected:
|
||||||
|
DECLARE_ABSTRACT_SERIAL(Hasher);
|
||||||
|
|
||||||
|
Hasher() { }
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Constructor.
|
||||||
|
*
|
||||||
|
* @param k The number of hash functions.
|
||||||
|
*
|
||||||
|
* @param name A name for the hasher. Hashers with the same name
|
||||||
|
* should provide consistent results.
|
||||||
|
*/
|
||||||
|
Hasher(size_t k, const std::string& name);
|
||||||
|
|
||||||
|
private:
|
||||||
|
size_t k;
|
||||||
|
std::string name;
|
||||||
|
};
|
||||||
|
|
||||||
|
/**
|
||||||
|
* A universal hash function family. This is a helper class that Hasher
|
||||||
|
* implementations can use internally.
|
||||||
|
*/
|
||||||
|
class UHF {
|
||||||
|
public:
|
||||||
|
/**
|
||||||
|
* Constructs an H3 hash function seeded with a given seed and an
|
||||||
|
* optional extra seed to replace the initial Bro seed.
|
||||||
|
*
|
||||||
|
* @param seed The seed to use for this instance.
|
||||||
|
*
|
||||||
|
* @param extra If not empty, this string replaces the initial Bro
* seed when computing the seed for this instance.
|
||||||
|
*/
|
||||||
|
UHF(size_t seed = 0, const std::string& extra = "");
|
||||||
|
|
||||||
|
template <typename T>
|
||||||
|
Hasher::digest operator()(const T& x) const
|
||||||
|
{
|
||||||
|
return hash(&x, sizeof(T));
|
||||||
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Computes hash values for an element.
|
||||||
|
*
|
||||||
|
* @param x The element to hash.
|
||||||
|
*
|
||||||
|
* @return The hash value.
|
||||||
|
*/
|
||||||
|
Hasher::digest operator()(const void* x, size_t n) const
|
||||||
|
{
|
||||||
|
return hash(x, n);
|
||||||
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Computes the hashes for a set of bytes.
|
||||||
|
*
|
||||||
|
* @param x Pointer to first byte to hash.
|
||||||
|
*
|
||||||
|
* @param n Number of bytes to hash.
|
||||||
|
*
|
||||||
|
* @return The hash value.
|
||||||
|
*
|
||||||
|
*/
|
||||||
|
Hasher::digest hash(const void* x, size_t n) const;
|
||||||
|
|
||||||
|
friend bool operator==(const UHF& x, const UHF& y)
|
||||||
|
{
|
||||||
|
return x.h == y.h;
|
||||||
|
}
|
||||||
|
|
||||||
|
friend bool operator!=(const UHF& x, const UHF& y)
|
||||||
|
{
|
||||||
|
return ! (x == y);
|
||||||
|
}
|
||||||
|
|
||||||
|
private:
|
||||||
|
static size_t compute_seed(size_t seed, const std::string& extra);
|
||||||
|
|
||||||
|
H3<Hasher::digest, UHASH_KEY_SIZE> h;
|
||||||
|
};
|
||||||
|
|
||||||
|
|
||||||
|
/**
|
||||||
|
* A hasher implementing the default hashing policy. Uses *k* separate hash
|
||||||
|
* functions internally.
|
||||||
|
*/
|
||||||
|
class DefaultHasher : public Hasher {
|
||||||
|
public:
|
||||||
|
/**
|
||||||
|
* Constructor for a hasher with *k* hash functions.
|
||||||
|
*
|
||||||
|
* @param k The number of hash functions to use.
|
||||||
|
*
|
||||||
|
* @param name The name of the hasher.
|
||||||
|
*/
|
||||||
|
DefaultHasher(size_t k, const std::string& name = "");
|
||||||
|
|
||||||
|
// Overridden from Hasher.
|
||||||
|
virtual digest_vector Hash(const void* x, size_t n) const /* final */;
|
||||||
|
virtual DefaultHasher* Clone() const /* final */;
|
||||||
|
virtual bool Equals(const Hasher* other) const /* final */;
|
||||||
|
|
||||||
|
DECLARE_SERIAL(DefaultHasher);
|
||||||
|
|
||||||
|
private:
|
||||||
|
DefaultHasher() { }
|
||||||
|
|
||||||
|
std::vector<UHF> hash_functions;
|
||||||
|
};
|
||||||
|
|
||||||
|
/**
|
||||||
|
* The *double-hashing* policy. Uses a linear combination of two hash
|
||||||
|
* functions.
|
||||||
|
*/
|
||||||
|
class DoubleHasher : public Hasher {
|
||||||
|
public:
|
||||||
|
/**
|
||||||
|
* Constructor for a double hasher with *k* hash functions.
|
||||||
|
*
|
||||||
|
* @param k The number of hash functions to use.
|
||||||
|
*
|
||||||
|
* @param name The name of the hasher.
|
||||||
|
*/
|
||||||
|
DoubleHasher(size_t k, const std::string& name = "");
|
||||||
|
|
||||||
|
// Overridden from Hasher.
|
||||||
|
virtual digest_vector Hash(const void* x, size_t n) const /* final */;
|
||||||
|
virtual DoubleHasher* Clone() const /* final */;
|
||||||
|
virtual bool Equals(const Hasher* other) const /* final */;
|
||||||
|
|
||||||
|
DECLARE_SERIAL(DoubleHasher);
|
||||||
|
|
||||||
|
private:
|
||||||
|
DoubleHasher() { }
|
||||||
|
|
||||||
|
UHF h1;
|
||||||
|
UHF h2;
|
||||||
|
};
|
||||||
|
|
||||||
|
}
|
||||||
|
|
||||||
|
#endif
|
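The `Name()` documentation above describes two seeding modes: a non-empty name seeds the hash functions so that equally named hashers agree, while an empty name mixes in the process-wide initial seed. The sketch below illustrates that branching. It substitutes 64-bit FNV-1a for the SHA-256 digest used in `UHF::compute_seed`, purely to stay self-contained; `fnv1a` and `derive_seed` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// 64-bit FNV-1a over a byte range, continuing from state h.
// Stand-in for SHA-256; chosen only because it needs no external library.
std::uint64_t fnv1a(std::uint64_t h, const void* data, std::size_t n)
{
    const unsigned char* p = static_cast<const unsigned char*>(data);
    for (std::size_t i = 0; i < n; ++i) {
        h ^= p[i];
        h *= 1099511628211ULL; // FNV prime (64-bit)
    }
    return h;
}

// Derives the seed for the index-th hash function. A non-empty name
// replaces the process seed, so the result is then independent of it.
std::uint64_t derive_seed(std::uint64_t index, const std::string& name,
                          std::uint32_t process_seed)
{
    std::uint64_t h = 14695981039346656037ULL; // FNV offset basis (64-bit)

    if (name.empty())
        h = fnv1a(h, &process_seed, sizeof(process_seed));
    else
        h = fnv1a(h, name.data(), name.size());

    return fnv1a(h, &index, sizeof(index));
}
```

Two hashers named `"conns"` obtain the same seeds regardless of the process seed, whereas unnamed hashers in processes with different initial seeds do not.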
196
src/probabilistic/bloom-filter.bif
Normal file
|
@ -0,0 +1,196 @@
|
||||||
|
# ===========================================================================
|
||||||
|
#
|
||||||
|
# Bloom Filter Functions
|
||||||
|
#
|
||||||
|
# ===========================================================================
|
||||||
|
|
||||||
|
%%{
|
||||||
|
|
||||||
|
// TODO: This is currently included from the top-level src directory, hence
|
||||||
|
// paths are relative to there. We need a better mechanism to pull in
|
||||||
|
// BiFs defined in sub directories.
|
||||||
|
#include "probabilistic/BloomFilter.h"
|
||||||
|
#include "OpaqueVal.h"
|
||||||
|
|
||||||
|
using namespace probabilistic;
|
||||||
|
|
||||||
|
%%}
|
||||||
|
|
||||||
|
module GLOBAL;
|
||||||
|
|
||||||
|
## Creates a basic Bloom filter.
|
||||||
|
##
|
||||||
|
## .. note:: A Bloom filter can have a name associated with it. In the future,
|
||||||
|
## Bloom filters with the same name will be compatible across independent Bro
|
||||||
|
## instances, i.e., it will be possible to merge them. Currently, however, that is
|
||||||
|
## not yet supported.
|
||||||
|
##
|
||||||
|
## fp: The desired false-positive rate.
|
||||||
|
##
|
||||||
|
## capacity: the maximum number of elements that guarantees a false-positive
|
||||||
|
## rate of *fp*.
|
||||||
|
##
|
||||||
|
## name: A name that uniquely identifies and seeds the Bloom filter. If empty,
|
||||||
|
## the filter will remain tied to the current Bro process.
|
||||||
|
##
|
||||||
|
## Returns: A Bloom filter handle.
|
||||||
|
##
|
||||||
|
## .. bro:see:: bloomfilter_counting_init bloomfilter_add bloomfilter_lookup
|
||||||
|
## bloomfilter_clear bloomfilter_merge
|
||||||
|
function bloomfilter_basic_init%(fp: double, capacity: count,
|
||||||
|
name: string &default=""%): opaque of bloomfilter
|
||||||
|
%{
|
||||||
|
if ( fp < 0.0 || fp > 1.0 )
|
||||||
|
{
|
||||||
|
reporter->Error("false-positive rate must take value between 0 and 1");
|
||||||
|
return 0;
|
||||||
|
}
|
||||||
|
|
||||||
|
size_t cells = BasicBloomFilter::M(fp, capacity);
|
||||||
|
size_t optimal_k = BasicBloomFilter::K(cells, capacity);
|
||||||
|
const Hasher* h = new DefaultHasher(optimal_k, name->CheckString());
|
||||||
|
|
||||||
|
return new BloomFilterVal(new BasicBloomFilter(h, cells));
|
||||||
|
%}
|
||||||
|
|
||||||
|
## Creates a counting Bloom filter.
|
||||||
|
##
|
||||||
|
## .. note:: A Bloom filter can have a name associated with it. In the future,
|
||||||
|
## Bloom filters with the same name will be compatible across independent Bro
|
||||||
|
## instances, i.e., it will be possible to merge them. Currently, however, that is
|
||||||
|
## not yet supported.
|
||||||
|
##
|
||||||
|
## k: The number of hash functions to use.
|
||||||
|
##
|
||||||
|
## cells: The number of cells of the underlying counter vector. As there's no
|
||||||
|
## single answer to what's the best parameterization for a counting Bloom filter,
|
||||||
|
## we refer to the Bloom filter literature here for choosing an appropriate value.
|
||||||
|
##
|
||||||
|
## max: The maximum counter value associated with each element. Each counter
## occupies *w = floor(log_2(max)) + 1* bits, so each cell of the underlying
## counter vector is *w* bits wide.
|
||||||
|
##
|
||||||
|
## name: A name that uniquely identifies and seeds the Bloom filter. If empty,
|
||||||
|
## the filter will remain tied to the current Bro process.
|
||||||
|
##
|
||||||
|
## Returns: A Bloom filter handle.
|
||||||
|
##
|
||||||
|
## .. bro:see:: bloomfilter_basic_init bloomfilter_add bloomfilter_lookup
|
||||||
|
## bloomfilter_clear bloomfilter_merge
|
||||||
|
function bloomfilter_counting_init%(k: count, cells: count, max: count,
|
||||||
|
name: string &default=""%): opaque of bloomfilter
|
||||||
|
%{
|
||||||
|
if ( max == 0 )
|
||||||
|
{
|
||||||
|
reporter->Error("max counter value must be greater than 0");
|
||||||
|
return 0;
|
||||||
|
}
|
||||||
|
|
||||||
|
const Hasher* h = new DefaultHasher(k, name->CheckString());
|
||||||
|
|
||||||
|
uint16 width = 1;
|
||||||
|
while ( max >>= 1 )
|
||||||
|
++width;
|
||||||
|
|
||||||
|
return new BloomFilterVal(new CountingBloomFilter(h, cells, width));
|
||||||
|
%}
|
||||||
|
|
||||||
|
## Adds an element to a Bloom filter.
|
||||||
|
##
|
||||||
|
## bf: The Bloom filter handle.
|
||||||
|
##
|
||||||
|
## x: The element to add.
|
||||||
|
##
|
||||||
|
## .. bro:see:: bloomfilter_counting_init bloomfilter_basic_init bloomfilter_lookup
|
||||||
|
## bloomfilter_clear bloomfilter_merge
|
||||||
|
function bloomfilter_add%(bf: opaque of bloomfilter, x: any%): any
|
||||||
|
%{
|
||||||
|
BloomFilterVal* bfv = static_cast<BloomFilterVal*>(bf);
|
||||||
|
|
||||||
|
if ( ! bfv->Type() && ! bfv->Typify(x->Type()) )
|
||||||
|
reporter->Error("failed to set Bloom filter type");
|
||||||
|
|
||||||
|
else if ( ! same_type(bfv->Type(), x->Type()) )
|
||||||
|
reporter->Error("incompatible Bloom filter types");
|
||||||
|
|
||||||
|
else
|
||||||
|
bfv->Add(x);
|
||||||
|
|
||||||
|
return 0;
|
||||||
|
%}
|
||||||
|
|
||||||
|
## Retrieves the counter for a given element in a Bloom filter.
|
||||||
|
##
|
||||||
|
## bf: The Bloom filter handle.
|
||||||
|
##
|
||||||
|
## x: The element to count.
|
||||||
|
##
|
||||||
|
## Returns: the counter associated with *x* in *bf*.
|
||||||
|
##
|
||||||
|
## .. bro:see:: bloomfilter_counting_init bloomfilter_basic_init
|
||||||
|
## bloomfilter_add bloomfilter_clear bloomfilter_merge
|
||||||
|
function bloomfilter_lookup%(bf: opaque of bloomfilter, x: any%): count
|
||||||
|
%{
|
||||||
|
const BloomFilterVal* bfv = static_cast<const BloomFilterVal*>(bf);
|
||||||
|
|
||||||
|
if ( bfv->Empty() )
|
||||||
|
return new Val(0, TYPE_COUNT);
|
||||||
|
|
||||||
|
if ( ! bfv->Type() )
|
||||||
|
reporter->Error("cannot perform lookup on untyped Bloom filter");
|
||||||
|
|
||||||
|
else if ( ! same_type(bfv->Type(), x->Type()) )
|
||||||
|
reporter->Error("incompatible Bloom filter types");
|
||||||
|
|
||||||
|
else
|
||||||
|
return new Val(static_cast<uint64>(bfv->Count(x)), TYPE_COUNT);
|
||||||
|
|
||||||
|
return new Val(0, TYPE_COUNT);
|
||||||
|
%}
|
||||||
|
|
||||||
|
## Removes all elements from a Bloom filter. This function resets all bits in the
|
||||||
|
## underlying bitvector back to 0 but does not change the parameterization of the
|
||||||
|
## Bloom filter, such as the element type and the hasher seed.
|
||||||
|
##
|
||||||
|
## bf: The Bloom filter handle.
|
||||||
|
##
|
||||||
|
## .. bro:see:: bloomfilter_counting_init bloomfilter_basic_init
|
||||||
|
## bloomfilter_add bloomfilter_lookup bloomfilter_merge
|
||||||
|
function bloomfilter_clear%(bf: opaque of bloomfilter%): any
|
||||||
|
%{
|
||||||
|
BloomFilterVal* bfv = static_cast<BloomFilterVal*>(bf);
|
||||||
|
|
||||||
|
if ( bfv->Type() ) // Untyped Bloom filters are already empty.
|
||||||
|
bfv->Clear();
|
||||||
|
|
||||||
|
return 0;
|
||||||
|
%}
|
||||||
|
|
||||||
|
## Merges two Bloom filters.
|
||||||
|
##
|
||||||
|
## .. note:: Currently Bloom filters created by different Bro instances cannot
|
||||||
|
## be merged. In the future, this will be supported as long as both filters
|
||||||
|
## are created with the same name.
|
||||||
|
##
|
||||||
|
## bf1: The first Bloom filter handle.
|
||||||
|
##
|
||||||
|
## bf2: The second Bloom filter handle.
|
||||||
|
##
|
||||||
|
## Returns: The union of *bf1* and *bf2*.
|
||||||
|
##
|
||||||
|
## .. bro:see:: bloomfilter_counting_init bloomfilter_basic_init
|
||||||
|
## bloomfilter_add bloomfilter_lookup bloomfilter_clear
|
||||||
|
function bloomfilter_merge%(bf1: opaque of bloomfilter,
|
||||||
|
bf2: opaque of bloomfilter%): opaque of bloomfilter
|
||||||
|
%{
|
||||||
|
const BloomFilterVal* bfv1 = static_cast<const BloomFilterVal*>(bf1);
|
||||||
|
const BloomFilterVal* bfv2 = static_cast<const BloomFilterVal*>(bf2);
|
||||||
|
|
||||||
|
if ( ! same_type(bfv1->Type(), bfv2->Type()) )
|
||||||
|
{
|
||||||
|
reporter->Error("incompatible Bloom filter types");
|
||||||
|
return 0;
|
||||||
|
}
|
||||||
|
|
||||||
|
return BloomFilterVal::Merge(bfv1, bfv2);
|
||||||
|
%}
|
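The two constructors above turn user-facing parameters into filter dimensions: `bloomfilter_basic_init` derives a cell count and a number of hash functions from *fp* and *capacity*, and `bloomfilter_counting_init` derives a counter width from *max*. The sketch below shows one way to compute these. The bodies of `BasicBloomFilter::M` and `BasicBloomFilter::K` are not part of this diff, so `optimal_cells` and `optimal_k` use the textbook Bloom filter formulas as an assumption; `counter_width` mirrors the shift loop shown in `bloomfilter_counting_init`. All three function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>

// Textbook cell count for a target false-positive rate:
// m = ceil(-n * ln(fp) / (ln 2)^2). Assumed, not taken from the diff.
std::size_t optimal_cells(double fp, std::size_t capacity)
{
    assert(fp > 0.0 && fp < 1.0 && capacity > 0);
    const double ln2 = std::log(2.0);
    return static_cast<std::size_t>(
        std::ceil(-(capacity * std::log(fp)) / (ln2 * ln2)));
}

// Textbook number of hash functions: k = ceil((m / n) * ln 2).
std::size_t optimal_k(std::size_t cells, std::size_t capacity)
{
    assert(capacity > 0);
    const double ratio =
        static_cast<double>(cells) / static_cast<double>(capacity);
    return static_cast<std::size_t>(std::ceil(ratio * std::log(2.0)));
}

// Bits needed to represent max, using the same loop as the BiF:
// start at 1 and add one bit per right shift that leaves a non-zero value.
unsigned counter_width(std::uint64_t max)
{
    assert(max > 0);
    unsigned width = 1;
    while (max >>= 1)
        ++width;
    return width;
}
```

Under these formulas, a 1% false-positive rate at a capacity of 1000 elements needs 9586 cells and 7 hash functions. The shift loop yields `floor(log2(max)) + 1` bits, so a maximum of 255 fits in 8 bits while 256 needs 9. The sketch asserts the open interval for *fp* because the logarithm has no finite value at 0.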
42
src/util.cc
|
@ -716,6 +716,8 @@ static bool write_random_seeds(const char* write_file, uint32 seed,
|
||||||
|
|
||||||
static bool bro_rand_determistic = false;
|
static bool bro_rand_determistic = false;
|
||||||
static unsigned int bro_rand_state = 0;
|
static unsigned int bro_rand_state = 0;
|
||||||
|
static bool first_seed_saved = false;
|
||||||
|
static unsigned int first_seed = 0;
|
||||||
|
|
||||||
static void bro_srandom(unsigned int seed, bool deterministic)
|
static void bro_srandom(unsigned int seed, bool deterministic)
|
||||||
{
|
{
|
||||||
|
@ -800,6 +802,12 @@ void init_random_seed(uint32 seed, const char* read_file, const char* write_file
|
||||||
|
|
||||||
bro_srandom(seed, seeds_done);
|
bro_srandom(seed, seeds_done);
|
||||||
|
|
||||||
|
if ( ! first_seed_saved )
|
||||||
|
{
|
||||||
|
first_seed = seed;
|
||||||
|
first_seed_saved = true;
|
||||||
|
}
|
||||||
|
|
||||||
if ( ! hmac_key_set )
|
if ( ! hmac_key_set )
|
||||||
{
|
{
|
||||||
MD5((const u_char*) buf, sizeof(buf), shared_hmac_md5_key);
|
MD5((const u_char*) buf, sizeof(buf), shared_hmac_md5_key);
|
||||||
|
@ -811,27 +819,39 @@ void init_random_seed(uint32 seed, const char* read_file, const char* write_file
|
||||||
write_file);
|
write_file);
|
||||||
}
|
}
|
||||||
|
|
||||||
|
unsigned int initial_seed()
|
||||||
|
{
|
||||||
|
return first_seed;
|
||||||
|
}
|
||||||
|
|
||||||
bool have_random_seed()
|
bool have_random_seed()
|
||||||
{
|
{
|
||||||
return bro_rand_determistic;
|
return bro_rand_determistic;
|
||||||
}
|
}
|
||||||
|
|
||||||
|
unsigned int bro_prng(unsigned int state)
|
||||||
|
{
|
||||||
|
// Use our own simple linear congruence PRNG to make sure we are
|
||||||
|
// predictable across platforms.
|
||||||
|
static const long int m = 2147483647;
|
||||||
|
static const long int a = 16807;
|
||||||
|
const long int q = m / a;
|
||||||
|
const long int r = m % a;
|
||||||
|
|
||||||
|
state = a * ( state % q ) - r * ( state / q );
|
||||||
|
|
||||||
|
if ( state <= 0 )
|
||||||
|
state += m;
|
||||||
|
|
||||||
|
return state;
|
||||||
|
}
|
||||||
|
|
||||||
long int bro_random()
|
long int bro_random()
|
||||||
{
|
{
|
||||||
if ( ! bro_rand_determistic )
|
if ( ! bro_rand_determistic )
|
||||||
return random(); // Use system PRNG.
|
return random(); // Use system PRNG.
|
||||||
|
|
||||||
// Use our own simple linear congruence PRNG to make sure we are
|
bro_rand_state = bro_prng(bro_rand_state);
|
||||||
// predictable across platforms.
|
|
||||||
const long int m = 2147483647;
|
|
||||||
const long int a = 16807;
|
|
||||||
const long int q = m / a;
|
|
||||||
const long int r = m % a;
|
|
||||||
|
|
||||||
bro_rand_state = a * ( bro_rand_state % q ) - r * ( bro_rand_state / q );
|
|
||||||
|
|
||||||
if ( bro_rand_state <= 0 )
|
|
||||||
bro_rand_state += m;
|
|
||||||
|
|
||||||
return bro_rand_state;
|
return bro_rand_state;
|
||||||
}
|
}
|
||||||
|
|
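The constants in `bro_prng` above (m = 2147483647, a = 16807) are those of the Park-Miller "minimal standard" generator, and the `q`/`r` split is Schrage's method for computing `a * state mod m` without overflowing a 32-bit intermediate. The sketch below restates the recurrence with signed 64-bit arithmetic and checks it against a direct modular multiplication. `lehmer_next` and `lehmer_reference` are hypothetical names, not part of Bro.

```cpp
#include <cassert>
#include <cstdint>

// One step of the Park-Miller generator using Schrage's method.
// Valid states are in the range 1 .. m - 1.
std::int64_t lehmer_next(std::int64_t state)
{
    const std::int64_t m = 2147483647; // 2^31 - 1
    const std::int64_t a = 16807;
    const std::int64_t q = m / a;      // 127773
    const std::int64_t r = m % a;      // 2836

    assert(state > 0 && state < m);

    // a * (state % q) - r * (state / q) is congruent to a * state mod m
    // and lies strictly between -m and m, so at most one correction is needed.
    state = a * (state % q) - r * (state / q);

    if (state <= 0)
        state += m;

    return state;
}

// Reference: the same recurrence computed directly, relying on the
// 64-bit product not overflowing.
std::int64_t lehmer_reference(std::int64_t state)
{
    return (16807 * state) % 2147483647;
}
```

Starting from state 1, the first two outputs are 16807 and 282475249. The sketch uses a signed type deliberately: the intermediate difference can be negative, and the `state <= 0` correction only takes effect if the variable can represent that.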
12
src/util.h
|
@ -165,12 +165,20 @@ extern void hmac_md5(size_t size, const unsigned char* bytes,
|
||||||
extern void init_random_seed(uint32 seed, const char* load_file,
|
extern void init_random_seed(uint32 seed, const char* load_file,
|
||||||
const char* write_file);
|
const char* write_file);
|
||||||
|
|
||||||
|
// Retrieves the initial seed computed after the very first call to
|
||||||
|
// init_random_seed(). Repeated calls to init_random_seed() will not affect
|
||||||
|
// the return value of this function.
|
||||||
|
unsigned int initial_seed();
|
||||||
|
|
||||||
// Returns true if the user explicitly set a seed via init_random_seed();
|
// Returns true if the user explicitly set a seed via init_random_seed();
|
||||||
extern bool have_random_seed();
|
extern bool have_random_seed();
|
||||||
|
|
||||||
|
// A simple linear congruence PRNG. It takes its state as argument and
|
||||||
|
// returns a new random value, which can serve as state for subsequent calls.
|
||||||
|
unsigned int bro_prng(unsigned int state);
|
||||||
|
|
||||||
// Replacement for the system random(), to which it normally falls back
|
// Replacement for the system random(), to which it normally falls back
|
||||||
// except when a seed has been given. In that case, we use our own
|
// except when a seed has been given. In that case, the function uses bro_prng.
|
||||||
// predictable PRNG.
|
|
||||||
long int bro_random();
|
long int bro_random();
|
||||||
|
|
||||||
// Calls the system srandom() function with the given seed if not running
|
// Calls the system srandom() function with the given seed if not running
|
||||||
|
|
27
testing/btest/Baseline/bifs.bloomfilter/output
Normal file
|
@ -0,0 +1,27 @@
|
||||||
|
error: incompatible Bloom filter types
|
||||||
|
error: incompatible Bloom filter types
|
||||||
|
error: incompatible Bloom filter types
|
||||||
|
error: incompatible Bloom filter types
|
||||||
|
error: false-positive rate must take value between 0 and 1
|
||||||
|
error: false-positive rate must take value between 0 and 1
|
||||||
|
0
|
||||||
|
1
|
||||||
|
1
|
||||||
|
0
|
||||||
|
1
|
||||||
|
1
|
||||||
|
1
|
||||||
|
1
|
||||||
|
1
|
||||||
|
1
|
||||||
|
1
|
||||||
|
1
|
||||||
|
1
|
||||||
|
2
|
||||||
|
3
|
||||||
|
3
|
||||||
|
2
|
||||||
|
3
|
||||||
|
3
|
||||||
|
3
|
||||||
|
2
|
|
@ -3,7 +3,7 @@
|
||||||
#empty_field (empty)
|
#empty_field (empty)
|
||||||
#unset_field -
|
#unset_field -
|
||||||
#path loaded_scripts
|
#path loaded_scripts
|
||||||
#open 2013-07-25-19-59-47
|
#open 2013-07-29-22-37-52
|
||||||
#fields name
|
#fields name
|
||||||
#types string
|
#types string
|
||||||
scripts/base/init-bare.bro
|
scripts/base/init-bare.bro
|
||||||
|
@ -12,6 +12,7 @@ scripts/base/init-bare.bro
|
||||||
build/scripts/base/bif/strings.bif.bro
|
build/scripts/base/bif/strings.bif.bro
|
||||||
build/scripts/base/bif/bro.bif.bro
|
build/scripts/base/bif/bro.bif.bro
|
||||||
build/scripts/base/bif/reporter.bif.bro
|
build/scripts/base/bif/reporter.bif.bro
|
||||||
|
build/scripts/base/bif/bloom-filter.bif.bro
|
||||||
build/scripts/base/bif/event.bif.bro
|
build/scripts/base/bif/event.bif.bro
|
||||||
build/scripts/base/bif/plugins/__load__.bro
|
build/scripts/base/bif/plugins/__load__.bro
|
||||||
build/scripts/base/bif/plugins/Bro_ARP.events.bif.bro
|
build/scripts/base/bif/plugins/Bro_ARP.events.bif.bro
|
||||||
|
@ -89,6 +90,7 @@ scripts/base/init-bare.bro
|
||||||
build/scripts/base/bif/file_analysis.bif.bro
|
build/scripts/base/bif/file_analysis.bif.bro
|
||||||
scripts/base/utils/site.bro
|
scripts/base/utils/site.bro
|
||||||
scripts/base/utils/patterns.bro
|
scripts/base/utils/patterns.bro
|
||||||
|
build/scripts/base/bif/__load__.bro
|
||||||
scripts/policy/misc/loaded-scripts.bro
|
scripts/policy/misc/loaded-scripts.bro
|
||||||
scripts/base/utils/paths.bro
|
scripts/base/utils/paths.bro
|
||||||
#close 2013-07-25-19-59-47
|
#close 2013-07-29-22-37-52
|
||||||
|
|
|
@ -3,7 +3,7 @@
|
||||||
#empty_field (empty)
|
#empty_field (empty)
|
||||||
#unset_field -
|
#unset_field -
|
||||||
#path loaded_scripts
|
#path loaded_scripts
|
||||||
#open 2013-07-29-20-08-38
|
#open 2013-07-29-22-37-53
|
||||||
#fields name
|
#fields name
|
||||||
#types string
|
#types string
|
||||||
scripts/base/init-bare.bro
|
scripts/base/init-bare.bro
|
||||||
|
@ -12,6 +12,7 @@ scripts/base/init-bare.bro
|
||||||
build/scripts/base/bif/strings.bif.bro
|
build/scripts/base/bif/strings.bif.bro
|
||||||
build/scripts/base/bif/bro.bif.bro
|
build/scripts/base/bif/bro.bif.bro
|
||||||
build/scripts/base/bif/reporter.bif.bro
|
build/scripts/base/bif/reporter.bif.bro
|
||||||
|
build/scripts/base/bif/bloom-filter.bif.bro
|
||||||
build/scripts/base/bif/event.bif.bro
|
build/scripts/base/bif/event.bif.bro
|
||||||
build/scripts/base/bif/plugins/__load__.bro
|
build/scripts/base/bif/plugins/__load__.bro
|
||||||
build/scripts/base/bif/plugins/Bro_ARP.events.bif.bro
|
build/scripts/base/bif/plugins/Bro_ARP.events.bif.bro
|
||||||
|
@ -89,13 +90,19 @@ scripts/base/init-bare.bro
|
||||||
build/scripts/base/bif/file_analysis.bif.bro
|
build/scripts/base/bif/file_analysis.bif.bro
|
||||||
scripts/base/utils/site.bro
|
scripts/base/utils/site.bro
|
||||||
scripts/base/utils/patterns.bro
|
scripts/base/utils/patterns.bro
|
||||||
|
build/scripts/base/bif/__load__.bro
|
||||||
scripts/base/init-default.bro
|
scripts/base/init-default.bro
|
||||||
|
scripts/base/utils/active-http.bro
|
||||||
|
scripts/base/utils/exec.bro
|
||||||
scripts/base/utils/addrs.bro
|
scripts/base/utils/addrs.bro
|
||||||
scripts/base/utils/conn-ids.bro
|
scripts/base/utils/conn-ids.bro
|
||||||
|
scripts/base/utils/dir.bro
|
||||||
|
scripts/base/frameworks/reporter/__load__.bro
|
||||||
|
scripts/base/frameworks/reporter/main.bro
|
||||||
|
scripts/base/utils/paths.bro
|
||||||
scripts/base/utils/directions-and-hosts.bro
|
scripts/base/utils/directions-and-hosts.bro
|
||||||
scripts/base/utils/files.bro
|
scripts/base/utils/files.bro
|
||||||
scripts/base/utils/numbers.bro
|
scripts/base/utils/numbers.bro
|
||||||
scripts/base/utils/paths.bro
|
|
||||||
scripts/base/utils/queue.bro
|
scripts/base/utils/queue.bro
|
||||||
scripts/base/utils/strings.bro
|
scripts/base/utils/strings.bro
|
||||||
scripts/base/utils/thresholds.bro
|
scripts/base/utils/thresholds.bro
|
||||||
|
@ -129,8 +136,6 @@ scripts/base/init-default.bro
|
||||||
scripts/base/frameworks/intel/__load__.bro
|
scripts/base/frameworks/intel/__load__.bro
|
||||||
scripts/base/frameworks/intel/main.bro
|
scripts/base/frameworks/intel/main.bro
|
||||||
scripts/base/frameworks/intel/input.bro
|
scripts/base/frameworks/intel/input.bro
|
||||||
scripts/base/frameworks/reporter/__load__.bro
|
|
||||||
scripts/base/frameworks/reporter/main.bro
|
|
||||||
scripts/base/frameworks/sumstats/__load__.bro
|
scripts/base/frameworks/sumstats/__load__.bro
|
||||||
scripts/base/frameworks/sumstats/main.bro
|
scripts/base/frameworks/sumstats/main.bro
|
||||||
scripts/base/frameworks/sumstats/plugins/__load__.bro
|
scripts/base/frameworks/sumstats/plugins/__load__.bro
|
||||||
|
@ -197,4 +202,4 @@ scripts/base/init-default.bro
|
||||||
scripts/base/files/extract/main.bro
|
scripts/base/files/extract/main.bro
|
||||||
scripts/base/misc/find-checksum-offloading.bro
|
scripts/base/misc/find-checksum-offloading.bro
|
||||||
scripts/policy/misc/loaded-scripts.bro
|
scripts/policy/misc/loaded-scripts.bro
|
||||||
#close 2013-07-29-20-08-38
|
#close 2013-07-29-22-37-53
|
||||||
|
|
|
@ -3,8 +3,8 @@
|
||||||
#empty_field (empty)
|
#empty_field (empty)
|
||||||
#unset_field -
|
#unset_field -
|
||||||
#path intel
|
#path intel
|
||||||
#open 2012-10-03-20-20-39
|
#open 2013-07-19-17-05-48
|
||||||
#fields ts uid id.orig_h id.orig_p id.resp_h id.resp_p seen.host seen.str seen.str_type seen.where sources
|
#fields ts uid id.orig_h id.orig_p id.resp_h id.resp_p seen.indicator seen.indicator_type seen.where sources
|
||||||
#types time string addr port addr port addr string enum enum table[string]
|
#types time string addr port addr port string enum enum table[string]
|
||||||
1349295639.424940 - - - - - 123.123.123.123 - - Intel::IN_ANYWHERE worker-1
|
1374253548.038580 - - - - - 123.123.123.123 Intel::ADDR Intel::IN_ANYWHERE worker-1
|
||||||
#close 2012-10-03-20-20-49
|
#close 2013-07-19-17-05-57
|
||||||
|
|
|
@ -3,9 +3,9 @@
|
||||||
#empty_field (empty)
|
#empty_field (empty)
|
||||||
#unset_field -
|
#unset_field -
|
||||||
#path intel
|
#path intel
|
||||||
#open 2012-10-03-20-18-05
|
#open 2013-07-19-17-04-26
|
||||||
#fields ts uid id.orig_h id.orig_p id.resp_h id.resp_p seen.host seen.str seen.str_type seen.where sources
|
#fields ts uid id.orig_h id.orig_p id.resp_h id.resp_p seen.indicator seen.indicator_type seen.where sources
|
||||||
#types time string addr port addr port addr string enum enum table[string]
|
#types time string addr port addr port string enum enum table[string]
|
||||||
1349295485.114156 - - - - - - e@mail.com Intel::EMAIL SOMEWHERE source1
|
1374253466.857185 - - - - - e@mail.com Intel::EMAIL SOMEWHERE source1
|
||||||
1349295485.114156 - - - - - 1.2.3.4 - - SOMEWHERE source1
|
1374253466.857185 - - - - - 1.2.3.4 Intel::ADDR SOMEWHERE source1
|
||||||
#close 2012-10-03-20-18-05
|
#close 2013-07-19-17-04-26
|
||||||
|
|
|
@ -3,11 +3,11 @@
|
||||||
#empty_field (empty)
|
#empty_field (empty)
|
||||||
#unset_field -
|
#unset_field -
|
||||||
#path intel
|
#path intel
|
||||||
#open 2012-10-10-15-05-23
|
#open 2013-07-19-17-06-57
|
||||||
#fields ts uid id.orig_h id.orig_p id.resp_h id.resp_p seen.host seen.str seen.str_type seen.where sources
|
#fields ts uid id.orig_h id.orig_p id.resp_h id.resp_p seen.indicator seen.indicator_type seen.where sources
|
||||||
#types time string addr port addr port addr string enum enum table[string]
|
#types time string addr port addr port string enum enum table[string]
|
||||||
1349881523.548946 - - - - - 1.2.3.4 - - Intel::IN_A_TEST source1
|
1374253617.312158 - - - - - 1.2.3.4 Intel::ADDR Intel::IN_A_TEST source1
|
||||||
1349881523.548946 - - - - - - e@mail.com Intel::EMAIL Intel::IN_A_TEST source1
|
1374253617.312158 - - - - - e@mail.com Intel::EMAIL Intel::IN_A_TEST source1
|
||||||
1349881524.567896 - - - - - 1.2.3.4 - - Intel::IN_A_TEST source1
|
1374253618.332565 - - - - - 1.2.3.4 Intel::ADDR Intel::IN_A_TEST source1
|
||||||
1349881524.567896 - - - - - - e@mail.com Intel::EMAIL Intel::IN_A_TEST source1
|
1374253618.332565 - - - - - e@mail.com Intel::EMAIL Intel::IN_A_TEST source1
|
||||||
#close 2012-10-10-15-05-24
|
#close 2013-07-19-17-07-06
|
||||||
|
|
|
@ -32,10 +32,10 @@
|
||||||
<field type="variable32" name="username" pack_unique="yes"/>
|
<field type="variable32" name="username" pack_unique="yes"/>
|
||||||
<field type="variable32" name="password" pack_unique="yes"/>
|
<field type="variable32" name="password" pack_unique="yes"/>
|
||||||
<field type="variable32" name="proxied" pack_unique="yes"/>
|
<field type="variable32" name="proxied" pack_unique="yes"/>
|
||||||
<field type="variable32" name="mime_type" pack_unique="yes"/>
|
<field type="variable32" name="orig_fuids" pack_unique="yes"/>
|
||||||
<field type="variable32" name="md5" pack_unique="yes"/>
|
<field type="variable32" name="orig_mime_types" pack_unique="yes"/>
|
||||||
<field type="variable32" name="extracted_request_files" pack_unique="yes"/>
|
<field type="variable32" name="resp_fuids" pack_unique="yes"/>
|
||||||
<field type="variable32" name="extracted_response_files" pack_unique="yes"/>
|
<field type="variable32" name="resp_mime_types" pack_unique="yes"/>
|
||||||
</ExtentType>
|
</ExtentType>
|
||||||
<!-- ts : time -->
|
<!-- ts : time -->
|
||||||
<!-- uid : string -->
|
<!-- uid : string -->
|
||||||
|
@ -60,13 +60,13 @@
|
||||||
<!-- username : string -->
|
<!-- username : string -->
|
||||||
<!-- password : string -->
|
<!-- password : string -->
|
||||||
<!-- proxied : table[string] -->
|
<!-- proxied : table[string] -->
|
||||||
<!-- mime_type : string -->
|
<!-- orig_fuids : vector[string] -->
|
||||||
<!-- md5 : string -->
|
<!-- orig_mime_types : vector[string] -->
|
||||||
<!-- extracted_request_files : vector[string] -->
|
<!-- resp_fuids : vector[string] -->
|
||||||
<!-- extracted_response_files : vector[string] -->
|
<!-- resp_mime_types : vector[string] -->
|
||||||
|
|
||||||
# Extent, type='http'
|
# Extent, type='http'
|
||||||
ts uid id.orig_h id.orig_p id.resp_h id.resp_p trans_depth method host uri referrer user_agent request_body_len response_body_len status_code status_msg info_code info_msg filename tags username password proxied mime_type md5 extracted_request_files extracted_response_files
|
ts uid id.orig_h id.orig_p id.resp_h id.resp_p trans_depth method host uri referrer user_agent request_body_len response_body_len status_code status_msg info_code info_msg filename tags username password proxied orig_fuids orig_mime_types resp_fuids resp_mime_types
|
||||||
1300475168.784020 j4u32Pc5bif 141.142.220.118 48649 208.80.152.118 80 1 GET bits.wikimedia.org /skins-1.5/monobook/main.css http://www.wikipedia.org/ Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.15) Gecko/20110303 Ubuntu/10.04 (lucid) Firefox/3.6.15 0 0 304 Not Modified 0
|
1300475168.784020 j4u32Pc5bif 141.142.220.118 48649 208.80.152.118 80 1 GET bits.wikimedia.org /skins-1.5/monobook/main.css http://www.wikipedia.org/ Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.15) Gecko/20110303 Ubuntu/10.04 (lucid) Firefox/3.6.15 0 0 304 Not Modified 0
|
||||||
1300475168.916018 VW0XPVINV8a 141.142.220.118 49997 208.80.152.3 80 1 GET upload.wikimedia.org /wikipedia/commons/6/63/Wikipedia-logo.png http://www.wikipedia.org/ Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.15) Gecko/20110303 Ubuntu/10.04 (lucid) Firefox/3.6.15 0 0 304 Not Modified 0
|
1300475168.916018 VW0XPVINV8a 141.142.220.118 49997 208.80.152.3 80 1 GET upload.wikimedia.org /wikipedia/commons/6/63/Wikipedia-logo.png http://www.wikipedia.org/ Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.15) Gecko/20110303 Ubuntu/10.04 (lucid) Firefox/3.6.15 0 0 304 Not Modified 0
|
||||||
1300475168.916183 3PKsZ2Uye21 141.142.220.118 49996 208.80.152.3 80 1 GET upload.wikimedia.org /wikipedia/commons/thumb/b/bb/Wikipedia_wordmark.svg/174px-Wikipedia_wordmark.svg.png http://www.wikipedia.org/ Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.15) Gecko/20110303 Ubuntu/10.04 (lucid) Firefox/3.6.15 0 0 304 Not Modified 0
|
1300475168.916183 3PKsZ2Uye21 141.142.220.118 49996 208.80.152.3 80 1 GET upload.wikimedia.org /wikipedia/commons/thumb/b/bb/Wikipedia_wordmark.svg/174px-Wikipedia_wordmark.svg.png http://www.wikipedia.org/ Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.15) Gecko/20110303 Ubuntu/10.04 (lucid) Firefox/3.6.15 0 0 304 Not Modified 0
|
||||||
|
|
|
@ -0,0 +1,10 @@
|
||||||
|
#separator \x09
|
||||||
|
#set_separator ,
|
||||||
|
#empty_field (empty)
|
||||||
|
#unset_field -
|
||||||
|
#path dns
|
||||||
|
#open 2013-07-25-20-29-44
|
||||||
|
#fields ts uid id.orig_h id.orig_p id.resp_h id.resp_p proto trans_id query qclass qclass_name qtype qtype_name rcode rcode_name AA TC RD RA Z answers TTLs rejected
|
||||||
|
#types time string addr port addr port enum count string count string count string count string bool bool bool bool count vector[string] vector[interval] bool
|
||||||
|
1359565680.761790 UWkUyAuUGXf 192.168.6.10 53209 192.168.129.36 53 udp 41477 paypal.com 1 C_INTERNET 48 DNSKEY 0 NOERROR F F T F 1 - - F
|
||||||
|
#close 2013-07-25-20-29-44
|
|
@ -0,0 +1,5 @@
|
||||||
|
[code=200, msg=OK^M, body=It works!, headers={
|
||||||
|
[Server] = 1.0,
|
||||||
|
[Content-type] = text/plain,
|
||||||
|
[Date] = July 22, 2013
|
||||||
|
}]
|
10
testing/btest/Baseline/scripts.base.utils.dir/bro..stdout
Normal file
|
@ -0,0 +1,10 @@
|
||||||
|
new_file1, ../testdir/bye
|
||||||
|
new_file1, ../testdir/hi
|
||||||
|
new_file1, ../testdir/howsitgoing
|
||||||
|
new_file2, ../testdir/bye
|
||||||
|
new_file2, ../testdir/hi
|
||||||
|
new_file2, ../testdir/howsitgoing
|
||||||
|
new_file1, ../testdir/bye
|
||||||
|
new_file1, ../testdir/newone
|
||||||
|
new_file2, ../testdir/bye
|
||||||
|
new_file2, ../testdir/newone
|
|
@ -0,0 +1,7 @@
|
||||||
|
test1, [exit_code=0, signal_exit=F, stdout=[done, exit, stop], stderr=<uninitialized>, files={
|
||||||
|
[out1] = [insert text here, and here],
|
||||||
|
[out2] = [insert more text here, and there]
|
||||||
|
}]
|
||||||
|
test2, [exit_code=1, signal_exit=F, stdout=[here's something on stdout, some more stdout, last stdout], stderr=[and some stderr, more stderr, last stderr], files=<uninitialized>]
|
||||||
|
test3, [exit_code=9, signal_exit=F, stdout=[FML], stderr=<uninitialized>, files=<uninitialized>]
|
||||||
|
test4, [exit_code=0, signal_exit=F, stdout=[hibye], stderr=<uninitialized>, files=<uninitialized>]
|
|
@ -24,4 +24,11 @@ cleanup:
|
||||||
update-doc-sources:
|
update-doc-sources:
|
||||||
../../doc/scripts/genDocSourcesList.sh ../../doc/scripts/DocSourcesList.cmake
|
../../doc/scripts/genDocSourcesList.sh ../../doc/scripts/DocSourcesList.cmake
|
||||||
|
|
||||||
|
# Updates the three coverage tests that usually need tweaking when
|
||||||
|
# scripts get added/removed.
|
||||||
|
update-coverage-tests: update-doc-sources
|
||||||
|
btest -qU coverage.bare-load-baseline
|
||||||
|
btest -qU coverage.default-load-baseline
|
||||||
|
@echo "Use 'git diff' to check updates look right."
|
||||||
|
|
||||||
.PHONY: all btest-verbose brief btest-brief coverage cleanup
|
.PHONY: all btest-verbose brief btest-brief coverage cleanup
|
||||||
|
|
BIN
testing/btest/Traces/dns-dnskey.trace
Normal file
Binary file not shown.
83
testing/btest/bifs/bloomfilter.bro
Normal file
|
@ -0,0 +1,83 @@
|
||||||
|
# @TEST-EXEC: bro -b %INPUT >output 2>&1
|
||||||
|
# @TEST-EXEC: btest-diff output
|
||||||
|
|
||||||
|
function test_basic_bloom_filter()
|
||||||
|
{
|
||||||
|
# Basic usage with counts.
|
||||||
|
local bf_cnt = bloomfilter_basic_init(0.1, 1000);
|
||||||
|
bloomfilter_add(bf_cnt, 42);
|
||||||
|
bloomfilter_add(bf_cnt, 84);
|
||||||
|
bloomfilter_add(bf_cnt, 168);
|
||||||
|
print bloomfilter_lookup(bf_cnt, 0);
|
||||||
|
print bloomfilter_lookup(bf_cnt, 42);
|
||||||
|
print bloomfilter_lookup(bf_cnt, 168);
|
||||||
|
print bloomfilter_lookup(bf_cnt, 336);
|
||||||
|
bloomfilter_add(bf_cnt, 0.5); # Type mismatch
|
||||||
|
bloomfilter_add(bf_cnt, "foo"); # Type mismatch
|
||||||
|
|
||||||
|
# Basic usage with strings.
|
||||||
|
local bf_str = bloomfilter_basic_init(0.9, 10);
|
||||||
|
bloomfilter_add(bf_str, "foo");
|
||||||
|
bloomfilter_add(bf_str, "bar");
|
||||||
|
print bloomfilter_lookup(bf_str, "foo");
|
||||||
|
print bloomfilter_lookup(bf_str, "bar");
|
||||||
|
print bloomfilter_lookup(bf_str, "b4z"); # FP
|
||||||
|
print bloomfilter_lookup(bf_str, "quux"); # FP
|
||||||
|
bloomfilter_add(bf_str, 0.5); # Type mismatch
|
||||||
|
bloomfilter_add(bf_str, 100); # Type mismatch
|
||||||
|
|
||||||
|
# Edge cases.
|
||||||
|
local bf_edge0 = bloomfilter_basic_init(0.000000000001, 1);
|
||||||
|
local bf_edge1 = bloomfilter_basic_init(0.00000001, 100000000);
|
||||||
|
local bf_edge2 = bloomfilter_basic_init(0.9999999, 1);
|
||||||
|
local bf_edge3 = bloomfilter_basic_init(0.9999999, 100000000000);
|
||||||
|
|
||||||
|
# Invalid parameters.
|
||||||
|
local bf_bug0 = bloomfilter_basic_init(-0.5, 42);
|
||||||
|
local bf_bug1 = bloomfilter_basic_init(1.1, 42);
|
||||||
|
|
||||||
|
# Merging
|
||||||
|
local bf_cnt2 = bloomfilter_basic_init(0.1, 1000);
|
||||||
|
bloomfilter_add(bf_cnt2, 42);
|
||||||
|
bloomfilter_add(bf_cnt, 100);
|
||||||
|
local bf_merged = bloomfilter_merge(bf_cnt, bf_cnt2);
|
||||||
|
print bloomfilter_lookup(bf_merged, 42);
|
||||||
|
print bloomfilter_lookup(bf_merged, 84);
|
||||||
|
print bloomfilter_lookup(bf_merged, 100);
|
||||||
|
print bloomfilter_lookup(bf_merged, 168);
|
||||||
|
}
|
||||||
|
|
||||||
|
function test_counting_bloom_filter()
|
||||||
|
{
|
||||||
|
local bf = bloomfilter_counting_init(3, 32, 3);
|
||||||
|
bloomfilter_add(bf, "foo");
|
||||||
|
print bloomfilter_lookup(bf, "foo"); # 1
|
||||||
|
bloomfilter_add(bf, "foo");
|
||||||
|
print bloomfilter_lookup(bf, "foo"); # 2
|
||||||
|
bloomfilter_add(bf, "foo");
|
||||||
|
print bloomfilter_lookup(bf, "foo"); # 3
|
||||||
|
bloomfilter_add(bf, "foo");
|
||||||
|
print bloomfilter_lookup(bf, "foo"); # still 3
|
||||||
|
|
||||||
|
|
||||||
|
bloomfilter_add(bf, "bar");
|
||||||
|
bloomfilter_add(bf, "bar");
|
||||||
|
print bloomfilter_lookup(bf, "bar"); # 2
|
||||||
|
print bloomfilter_lookup(bf, "foo"); # still 3
|
||||||
|
|
||||||
|
# Merging
|
||||||
|
local bf2 = bloomfilter_counting_init(3, 32, 3);
|
||||||
|
bloomfilter_add(bf2, "baz");
|
||||||
|
bloomfilter_add(bf2, "baz");
|
||||||
|
bloomfilter_add(bf2, "bar");
|
||||||
|
local bf_merged = bloomfilter_merge(bf, bf2);
|
||||||
|
print bloomfilter_lookup(bf_merged, "foo");
|
||||||
|
print bloomfilter_lookup(bf_merged, "bar");
|
||||||
|
print bloomfilter_lookup(bf_merged, "baz");
|
||||||
|
}
|
||||||
|
|
||||||
|
event bro_init()
|
||||||
|
{
|
||||||
|
test_basic_bloom_filter();
|
||||||
|
test_counting_bloom_filter();
|
||||||
|
}
|
|
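The counting Bloom filter test above relies on counters that saturate at `max` (the fourth `bloomfilter_add` of "foo" still yields 3). As a rough illustration of that behavior only, here is a minimal Python sketch. It is not Bro's implementation: the class name, the salted SHA-256 hashing, and the summing merge are all assumptions made for this example.

```python
import hashlib


class CountingBloomFilter:
    """Toy counting Bloom filter with saturating counters (illustrative only)."""

    def __init__(self, k, cells, max_count):
        self.k = k
        self.max_count = max_count
        self.counters = [0] * cells

    def _cells(self, item):
        # Hypothetical hashing scheme: k salted SHA-256 digests. Using a set
        # avoids incrementing the same cell twice for a single element.
        data = repr(item).encode("utf-8")
        return {
            int.from_bytes(hashlib.sha256(bytes([i]) + data).digest()[:8], "big")
            % len(self.counters)
            for i in range(self.k)
        }

    def add(self, item):
        for c in self._cells(item):
            # Saturate at max_count instead of overflowing.
            self.counters[c] = min(self.counters[c] + 1, self.max_count)

    def lookup(self, item):
        # The smallest counter may overestimate the true frequency because of
        # collisions, but it never underestimates it below the cap.
        return min(self.counters[c] for c in self._cells(item))


def merge(a, b):
    """Combine two filters by summing counters cell by cell (assumed semantics)."""
    if (a.k, len(a.counters)) != (b.k, len(b.counters)):
        raise ValueError("filters must share k and cell count")
    out = CountingBloomFilter(a.k, len(a.counters), a.max_count)
    out.counters = [min(x + y, a.max_count) for x, y in zip(a.counters, b.counters)]
    return out
```

With `CountingBloomFilter(3, 32, 3)`, adding "foo" four times to an otherwise empty filter gives lookups of 1, 2, 3 and 3, mirroring the comments in the test. Once other elements share the filter, collisions can inflate individual counts.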
@ -10,5 +10,8 @@
|
||||||
#
|
#
|
||||||
# @TEST-EXEC: test -d $DIST/scripts
|
# @TEST-EXEC: test -d $DIST/scripts
|
||||||
# @TEST-EXEC: for script in `find $DIST/scripts/ -name \*\.bro -not -path '*/site/*'`; do echo "=== $script" >>allerrors; if echo "$script" | egrep -q 'communication/listen|controllee'; then rm -rf load_attempt .bgprocs; btest-bg-run load_attempt bro -b $script; btest-bg-wait -k 2; cat load_attempt/.stderr >>allerrors; else bro -b $script 2>>allerrors; fi done || exit 0
|
# @TEST-EXEC: for script in `find $DIST/scripts/ -name \*\.bro -not -path '*/site/*'`; do echo "=== $script" >>allerrors; if echo "$script" | egrep -q 'communication/listen|controllee'; then rm -rf load_attempt .bgprocs; btest-bg-run load_attempt bro -b $script; btest-bg-wait -k 2; cat load_attempt/.stderr >>allerrors; else bro -b $script 2>>allerrors; fi done || exit 0
|
||||||
# @TEST-EXEC: cat allerrors | grep -v "received termination signal" | grep -v '===' | sort | uniq > unique_errors
|
# @TEST-EXEC: cat allerrors | grep -v "received termination signal" | fgrep -v -f %INPUT | grep -v '===' | sort | uniq > unique_errors
|
||||||
# @TEST-EXEC: btest-diff unique_errors
|
# @TEST-EXEC: btest-diff unique_errors
|
||||||
|
|
||||||
|
# White-list of tests to exclude because of cyclic load dependencies.
|
||||||
|
scripts/base/protocols/ftp/utils.bro
|
||||||
|
|
|
@ -12,6 +12,9 @@ global sha1_handle: opaque of sha1 &persistent &synchronized;
|
||||||
global sha256_handle: opaque of sha256 &persistent &synchronized;
|
global sha256_handle: opaque of sha256 &persistent &synchronized;
|
||||||
global entropy_handle: opaque of entropy &persistent &synchronized;
|
global entropy_handle: opaque of entropy &persistent &synchronized;
|
||||||
|
|
||||||
|
global bloomfilter_elements: set[string] &persistent &synchronized;
|
||||||
|
global bloomfilter_handle: opaque of bloomfilter &persistent &synchronized;
|
||||||
|
|
||||||
event bro_done()
|
event bro_done()
|
||||||
{
|
{
|
||||||
local out = open("output.log");
|
local out = open("output.log");
|
||||||
|
@ -36,6 +39,9 @@ event bro_done()
|
||||||
print out, entropy_test_finish(entropy_handle);
|
print out, entropy_test_finish(entropy_handle);
|
||||||
else
|
else
|
||||||
print out, "entropy_test_add() failed";
|
print out, "entropy_test_add() failed";
|
||||||
|
|
||||||
|
for ( e in bloomfilter_elements )
|
||||||
|
print bloomfilter_lookup(bloomfilter_handle, e);
|
||||||
}
|
}
|
||||||
|
|
||||||
@TEST-END-FILE
|
@TEST-END-FILE
|
||||||
|
@ -47,6 +53,9 @@ global sha1_handle: opaque of sha1 &persistent &synchronized;
|
||||||
global sha256_handle: opaque of sha256 &persistent &synchronized;
|
global sha256_handle: opaque of sha256 &persistent &synchronized;
|
||||||
global entropy_handle: opaque of entropy &persistent &synchronized;
|
global entropy_handle: opaque of entropy &persistent &synchronized;
|
||||||
|
|
||||||
|
global bloomfilter_elements = { "foo", "bar", "baz" } &persistent &synchronized;
|
||||||
|
global bloomfilter_handle: opaque of bloomfilter &persistent &synchronized;
|
||||||
|
|
||||||
event bro_init()
|
event bro_init()
|
||||||
{
|
{
|
||||||
local out = open("expected.log");
|
local out = open("expected.log");
|
||||||
|
@ -72,6 +81,10 @@ event bro_init()
|
||||||
entropy_handle = entropy_test_init();
|
entropy_handle = entropy_test_init();
|
||||||
if ( ! entropy_test_add(entropy_handle, "f") )
|
if ( ! entropy_test_add(entropy_handle, "f") )
|
||||||
print out, "entropy_test_add() failed";
|
print out, "entropy_test_add() failed";
|
||||||
|
|
||||||
|
bloomfilter_handle = bloomfilter_basic_init(0.1, 100);
|
||||||
|
for ( e in bloomfilter_elements )
|
||||||
|
bloomfilter_add(bloomfilter_handle, e);
|
||||||
}
|
}
|
||||||
|
|
||||||
@TEST-END-FILE
|
@TEST-END-FILE
|
||||||
|
|
|
@ -28,7 +28,7 @@ event remote_connection_handshake_done(p: event_peer)
|
||||||
# Insert the data once both workers are connected.
|
# Insert the data once both workers are connected.
|
||||||
if ( Cluster::local_node_type() == Cluster::MANAGER && Cluster::worker_count == 2 )
|
if ( Cluster::local_node_type() == Cluster::MANAGER && Cluster::worker_count == 2 )
|
||||||
{
|
{
|
||||||
Intel::insert([$host=1.2.3.4,$meta=[$source="manager"]]);
|
Intel::insert([$indicator="1.2.3.4", $indicator_type=Intel::ADDR, $meta=[$source="manager"]]);
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
|
@ -39,7 +39,7 @@ event Intel::cluster_new_item(item: Intel::Item)
|
||||||
if ( ! is_remote_event() )
|
if ( ! is_remote_event() )
|
||||||
return;
|
return;
|
||||||
|
|
||||||
print fmt("cluster_new_item: %s inserted by %s (from peer: %s)", item$host, item$meta$source, get_event_peer()$descr);
|
print fmt("cluster_new_item: %s inserted by %s (from peer: %s)", item$indicator, item$meta$source, get_event_peer()$descr);
|
||||||
|
|
||||||
if ( ! sent_data )
|
if ( ! sent_data )
|
||||||
{
|
{
|
||||||
|
@ -47,9 +47,9 @@ event Intel::cluster_new_item(item: Intel::Item)
|
||||||
# full cluster is constructed.
|
# full cluster is constructed.
|
||||||
sent_data = T;
|
sent_data = T;
|
||||||
if ( Cluster::node == "worker-1" )
|
if ( Cluster::node == "worker-1" )
|
||||||
Intel::insert([$host=123.123.123.123,$meta=[$source="worker-1"]]);
|
Intel::insert([$indicator="123.123.123.123", $indicator_type=Intel::ADDR, $meta=[$source="worker-1"]]);
|
||||||
if ( Cluster::node == "worker-2" )
|
if ( Cluster::node == "worker-2" )
|
||||||
Intel::insert([$host=4.3.2.1,$meta=[$source="worker-2"]]);
|
Intel::insert([$indicator="4.3.2.1", $indicator_type=Intel::ADDR, $meta=[$source="worker-2"]]);
|
||||||
}
|
}
|
||||||
|
|
||||||
# We're forcing worker-2 to do a lookup when it has three intelligence items
|
# We're forcing worker-2 to do a lookup when it has three intelligence items
|
||||||
|
|
|
@ -5,10 +5,10 @@
|
||||||
# @TEST-EXEC: btest-diff broproc/intel.log
|
# @TEST-EXEC: btest-diff broproc/intel.log
|
||||||
|
|
||||||
@TEST-START-FILE intel.dat
|
@TEST-START-FILE intel.dat
|
||||||
#fields host net str str_type meta.source meta.desc meta.url
|
#fields indicator indicator_type meta.source meta.desc meta.url
|
||||||
1.2.3.4 - - - source1 this host is just plain baaad http://some-data-distributor.com/1234
|
1.2.3.4 Intel::ADDR source1 this host is just plain baaad http://some-data-distributor.com/1234
|
||||||
1.2.3.4 - - - source1 this host is just plain baaad http://some-data-distributor.com/1234
|
1.2.3.4 Intel::ADDR source1 this host is just plain baaad http://some-data-distributor.com/1234
|
||||||
- - e@mail.com Intel::EMAIL source1 Phishing email source http://some-data-distributor.com/100000
|
e@mail.com Intel::EMAIL source1 Phishing email source http://some-data-distributor.com/100000
|
||||||
@TEST-END-FILE
|
@TEST-END-FILE
|
||||||
|
|
||||||
@load frameworks/communication/listen
|
@load frameworks/communication/listen
|
||||||
|
@ -18,8 +18,8 @@ redef enum Intel::Where += { SOMEWHERE };
|
||||||
|
|
||||||
event do_it()
|
event do_it()
|
||||||
{
|
{
|
||||||
Intel::seen([$str="e@mail.com",
|
Intel::seen([$indicator="e@mail.com",
|
||||||
$str_type=Intel::EMAIL,
|
$indicator_type=Intel::EMAIL,
|
||||||
$where=SOMEWHERE]);
|
$where=SOMEWHERE]);
|
||||||
|
|
||||||
Intel::seen([$host=1.2.3.4,
|
Intel::seen([$host=1.2.3.4,
|
||||||
|
|
|
@ -19,10 +19,10 @@ redef Cluster::nodes = {
|
||||||
@TEST-END-FILE
|
@TEST-END-FILE
|
||||||
|
|
||||||
@TEST-START-FILE intel.dat
|
@TEST-START-FILE intel.dat
|
||||||
#fields host net str str_type meta.source meta.desc meta.url
|
#fields indicator indicator_type meta.source meta.desc meta.url
|
||||||
1.2.3.4 - - - source1 this host is just plain baaad http://some-data-distributor.com/1234
|
1.2.3.4 Intel::ADDR source1 this host is just plain baaad http://some-data-distributor.com/1234
|
||||||
1.2.3.4 - - - source1 this host is just plain baaad http://some-data-distributor.com/1234
|
1.2.3.4 Intel::ADDR source1 this host is just plain baaad http://some-data-distributor.com/1234
|
||||||
- - e@mail.com Intel::EMAIL source1 Phishing email source http://some-data-distributor.com/100000
|
e@mail.com Intel::EMAIL source1 Phishing email source http://some-data-distributor.com/100000
|
||||||
@TEST-END-FILE
|
@TEST-END-FILE
|
||||||
|
|
||||||
@load base/frameworks/control
|
@load base/frameworks/control
|
||||||
|
@ -41,7 +41,7 @@ redef enum Intel::Where += {
|
||||||
event do_it()
|
event do_it()
|
||||||
{
|
{
|
||||||
Intel::seen([$host=1.2.3.4, $where=Intel::IN_A_TEST]);
|
Intel::seen([$host=1.2.3.4, $where=Intel::IN_A_TEST]);
|
||||||
Intel::seen([$str="e@mail.com", $str_type=Intel::EMAIL, $where=Intel::IN_A_TEST]);
|
Intel::seen([$indicator="e@mail.com", $indicator_type=Intel::EMAIL, $where=Intel::IN_A_TEST]);
|
||||||
}
|
}
|
||||||
|
|
||||||
event bro_init()
|
event bro_init()
|
||||||
|
|
4
testing/btest/scripts/base/protocols/dns/dns-key.bro
Normal file
|
@ -0,0 +1,4 @@
|
||||||
|
# Making sure DNSKEY gets logged as such.
|
||||||
|
#
|
||||||
|
# @TEST-EXEC: bro -r $TRACES/dns-dnskey.trace
|
||||||
|
# @TEST-EXEC: btest-diff dns.log
|
28
testing/btest/scripts/base/utils/active-http.test
Normal file
|
@ -0,0 +1,28 @@
|
||||||
|
# @TEST-REQUIRES: which httpd
|
||||||
|
# @TEST-REQUIRES: which python
|
||||||
|
#
|
||||||
|
# @TEST-EXEC: btest-bg-run httpd python $SCRIPTS/httpd.py --max 1
|
||||||
|
# @TEST-EXEC: sleep 3
|
||||||
|
# @TEST-EXEC: btest-bg-run bro bro -b %INPUT
|
||||||
|
# @TEST-EXEC: btest-bg-wait 15
|
||||||
|
# @TEST-EXEC: btest-diff bro/.stdout
|
||||||
|
|
||||||
|
@load base/utils/active-http
|
||||||
|
|
||||||
|
redef exit_only_after_terminate = T;
|
||||||
|
|
||||||
|
event bro_init()
|
||||||
|
{
|
||||||
|
local req = ActiveHTTP::Request($url="localhost:32123");
|
||||||
|
|
||||||
|
when ( local resp = ActiveHTTP::request(req) )
|
||||||
|
{
|
||||||
|
print resp;
|
||||||
|
terminate();
|
||||||
|
}
|
||||||
|
timeout 1min
|
||||||
|
{
|
||||||
|
print "HTTP request timeout";
|
||||||
|
terminate();
|
||||||
|
}
|
||||||
|
}
|
58
testing/btest/scripts/base/utils/dir.test
Normal file
|
@ -0,0 +1,58 @@
|
||||||
|
# @TEST-EXEC: btest-bg-run bro bro -b ../dirtest.bro
|
||||||
|
# @TEST-EXEC: btest-bg-wait 10
|
||||||
|
# @TEST-EXEC: TEST_DIFF_CANONIFIER=$SCRIPTS/diff-sort btest-diff bro/.stdout
|
||||||
|
|
||||||
|
@TEST-START-FILE dirtest.bro
|
||||||
|
|
||||||
|
@load base/utils/dir
|
||||||
|
|
||||||
|
redef exit_only_after_terminate = T;
|
||||||
|
|
||||||
|
global c: count = 0;
|
||||||
|
|
||||||
|
function check_terminate_condition()
|
||||||
|
{
|
||||||
|
c += 1;
|
||||||
|
|
||||||
|
if ( c == 10 )
|
||||||
|
terminate();
|
||||||
|
}
|
||||||
|
|
||||||
|
function new_file1(fname: string)
|
||||||
|
{
|
||||||
|
print "new_file1", fname;
|
||||||
|
check_terminate_condition();
|
||||||
|
}
|
||||||
|
|
||||||
|
function new_file2(fname: string)
|
||||||
|
{
|
||||||
|
print "new_file2", fname;
|
||||||
|
check_terminate_condition();
|
||||||
|
}
|
||||||
|
|
||||||
|
event change_things()
|
||||||
|
{
|
||||||
|
system("touch ../testdir/newone");
|
||||||
|
system("rm ../testdir/bye && touch ../testdir/bye");
|
||||||
|
}
|
||||||
|
|
||||||
|
event bro_init()
|
||||||
|
{
|
||||||
|
Dir::monitor("../testdir", new_file1, .5sec);
|
||||||
|
Dir::monitor("../testdir", new_file2, 1sec);
|
||||||
|
schedule 1sec { change_things() };
|
||||||
|
}
|
||||||
|
|
||||||
|
@TEST-END-FILE
|
||||||
|
|
||||||
|
@TEST-START-FILE testdir/hi
|
||||||
|
123
|
||||||
|
@TEST-END-FILE
|
||||||
|
|
||||||
|
@TEST-START-FILE testdir/howsitgoing
|
||||||
|
abc
|
||||||
|
@TEST-END-FILE
|
||||||
|
|
||||||
|
@TEST-START-FILE testdir/bye
|
||||||
|
!@#
|
||||||
|
@TEST-END-FILE
|
74
testing/btest/scripts/base/utils/exec.test
Normal file
|
@ -0,0 +1,74 @@
|
||||||
|
# @TEST-EXEC: btest-bg-run bro bro -b ../exectest.bro
|
||||||
|
# @TEST-EXEC: btest-bg-wait 10
|
||||||
|
# @TEST-EXEC: TEST_DIFF_CANONIFIER=$SCRIPTS/diff-sort btest-diff bro/.stdout
|
||||||
|
|
||||||
|
@TEST-START-FILE exectest.bro
|
||||||
|
|
||||||
|
@load base/utils/exec
|
||||||
|
|
||||||
|
redef exit_only_after_terminate = T;
|
||||||
|
|
||||||
|
global c: count = 0;
|
||||||
|
|
||||||
|
function check_exit_condition()
|
||||||
|
{
|
||||||
|
c += 1;
|
||||||
|
|
||||||
|
if ( c == 4 )
|
||||||
|
terminate();
|
||||||
|
}
|
||||||
|
|
||||||
|
function test_cmd(label: string, cmd: Exec::Command)
|
||||||
|
{
|
||||||
|
when ( local result = Exec::run(cmd) )
|
||||||
|
{
|
||||||
|
print label, result;
|
||||||
|
check_exit_condition();
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
event bro_init()
|
||||||
|
{
|
||||||
|
test_cmd("test1", [$cmd="bash ../somescript.sh",
|
||||||
|
$read_files=set("out1", "out2")]);
|
||||||
|
test_cmd("test2", [$cmd="bash ../nofiles.sh"]);
|
||||||
|
test_cmd("test3", [$cmd="bash ../suicide.sh"]);
|
||||||
|
test_cmd("test4", [$cmd="bash ../stdin.sh", $stdin="hibye"]);
|
||||||
|
}
|
||||||
|
|
||||||
|
@TEST-END-FILE
|
||||||
|
|
||||||
|
@TEST-START-FILE somescript.sh
|
||||||
|
#! /usr/bin/env bash
|
||||||
|
echo "insert text here" > out1
|
||||||
|
echo "and here" >> out1
|
||||||
|
echo "insert more text here" > out2
|
||||||
|
echo "and there" >> out2
|
||||||
|
echo "done"
|
||||||
|
echo "exit"
|
||||||
|
echo "stop"
|
||||||
|
@TEST-END-FILE
|
||||||
|
|
||||||
|
@TEST-START-FILE nofiles.sh
|
||||||
|
#! /usr/bin/env bash
|
||||||
|
echo "here's something on stdout"
|
||||||
|
echo "some more stdout"
|
||||||
|
echo "last stdout"
|
||||||
|
echo "and some stderr" 1>&2
|
||||||
|
echo "more stderr" 1>&2
|
||||||
|
echo "last stderr" 1>&2
|
||||||
|
exit 1
|
||||||
|
@TEST-END-FILE
|
||||||
|
|
||||||
|
@TEST-START-FILE suicide.sh
|
||||||
|
#! /usr/bin/env bash
|
||||||
|
echo "FML"
|
||||||
|
kill -9 $$
|
||||||
|
echo "nope"
|
||||||
|
@TEST-END-FILE
|
||||||
|
|
||||||
|
@TEST-START-FILE stdin.sh
|
||||||
|
#! /usr/bin/env bash
|
||||||
|
read -r line
|
||||||
|
echo "$line"
|
||||||
|
@TEST-END-FILE
|
40
testing/scripts/httpd.py
Executable file
|
@ -0,0 +1,40 @@
|
||||||
|
#! /usr/bin/env python
|
||||||
|
|
||||||
|
import BaseHTTPServer
|
||||||
|
|
||||||
|
class MyRequestHandler(BaseHTTPServer.BaseHTTPRequestHandler):
|
||||||
|
|
||||||
|
def do_GET(self):
|
||||||
|
self.send_response(200)
|
||||||
|
self.send_header("Content-type", "text/plain")
|
||||||
|
self.end_headers()
|
||||||
|
self.wfile.write("It works!")
|
||||||
|
|
||||||
|
def version_string(self):
|
||||||
|
return "1.0"
|
||||||
|
|
||||||
|
def date_time_string(self):
|
||||||
|
return "July 22, 2013"
|
||||||
|
|
||||||
|
|
||||||
|
if __name__ == "__main__":
|
||||||
|
from optparse import OptionParser
|
||||||
|
p = OptionParser()
|
||||||
|
p.add_option("-a", "--addr", type="string", default="localhost",
|
||||||
|
help=("listen on given address (numeric IP or host name), "
|
||||||
|
"an empty string means INADDR_ANY; the default is localhost"))
|
||||||
|
p.add_option("-p", "--port", type="int", default=32123,
|
||||||
|
help="listen on given TCP port number")
|
||||||
|
p.add_option("-m", "--max", type="int", default=-1,
|
||||||
|
help="max number of requests to respond to, -1 means no max")
|
||||||
|
options, args = p.parse_args()
|
||||||
|
|
||||||
|
httpd = BaseHTTPServer.HTTPServer((options.addr, options.port),
|
||||||
|
MyRequestHandler)
|
||||||
|
if options.max == -1:
|
||||||
|
httpd.serve_forever()
|
||||||
|
else:
|
||||||
|
served_count = 0
|
||||||
|
while served_count != options.max:
|
||||||
|
httpd.handle_request()
|
||||||
|
served_count += 1
|