Closes github's #77 and closes BIT-1606
* topic/seth/intel-update-merge:
Tiny scoping updates and test baseline updates for Intel framework.
Minor documentation cleanups.
Fixed insertion of nested subnets.
Refactored FAF integration of intel framework.
Added expiration for intelligence items.
Improved intel notices.
Added hook to allow extending the intel log.
Added remove function to intel-framework.
Added support for subnets to intel-framework.
Refactoring of meta data handling for intel.
Added testcase for intel updates.
* origin/topic/robin/bit-1641:
Fixing duplicate SSH authentication failure events.
I changed the test slightly; the output of uniq is not stable between
operating systems (on OS-X, it emits a space, on Linux it apparently
emits a tab). I removed the call to uniq - sort by itself is enough to
create a difference if there are duplicate entries.
Addresses BIT-1641
ninja said:
ninja: warning: multiple rules generate
scripts/base/bif/const.bif.bro. builds involving this target will
not be correct; continuing anyway [-w dupbuild=warn]
Looks like there's a larger problem here involving *.bif of the same
name at different locations of the source tree. For now, I'ved fixed
this one by merging src/iosource/pcap/{const,functions}.bif into
pcap.bif.
This adds an event that is raised once Catch & Release ceases the
block management for an IP address because the IP has not been seen in
traffic during the watch interval.
This allows users who use their own logic on the top of catch and
release know when they will have to start re-blocking the IP if it
occurs in traffic again.
Calling Error() in an input reader now automatically will disable the
reader and return a failure in the Update/Heartbeat calls.
Also adds more tests.
Addresses BIT-1181
This change introduces error events for Table and Event readers. Users
can now specify an event that is called when an info, warning, or error
is emitted by their input reader. This can, e.g., be used to raise
notices in case errors occur when reading an important input stream.
Example:
event error_event(desc: Input::TableDescription, msg: string, level: Reporter::Level)
{
...
}
event bro_init()
{
Input::add_table([$source="a", $error_ev=error_event, ...]);
}
For the moment, this converts all errors in the Asciiformatter into
warnings (to show that they are non-fatal) - the Reader itself also has
to throw an Error to show that a fatal error occurred and processing
will be abort.
It might be nicer to change this and require readers to mark fatal
errors as such when throwing them.
Addresses BIT-1181
This allows all threads accessing the same database to share sqlite
objects. This, for example, fixes the issue with several threads
simultaneously writing to the same database file.
See https://www.sqlite.org/sharedcache.html
Addresses BIT-1325
Addig a new random seed for external tests.
I added a wrapper around the siphash() function to make calling it a
little bit safer at least.
BIT-1612 #merged
* origin/topic/johanna/bit-1612:
HLL: Fix missing typecast in test case.
Remove the -K/-J options for setting keys.
Add test checking the quality of HLL by adding a lot of elements.
Fix serializing probabilistic hashers.
Baseline updates after hash function change.
Also switch BloomFilters from H3 to siphash.
Change Hashing from H3 to Siphash.
HLL: Remove unnecessary comparison.
Hyperloglog: change calculation of Rho
The options were never really used and do not seem especially useful;
initialization with a seed file still works.
This also fixes a bug with the initialization of the siphash key.
The test adds 170,000 IP addresses. After the recent hashing changes,
HLL estimates 171,250 entries (completely stable). Before, HLL estimated,
depending on the initial seeds, ~700 to 300,000 entries.
This commit mostly changes the hash function that is used for Internal
hashing of data < 36 bytes from H3 to Siphash. This change is motivated
by the fact that it turns out that H3 apparently does not deliver a very
good source of data uniqueness; running HLL with H3 as a hashing
function results in quite poor results (up to of 75% off in my tests).
In difference, running HLL with Siphash (or HMAC-MD5) changes this
factor to ~2%.
This also fixes a long-standing bug in Hash.h which truncated our hash
values to 32 bit on most machines.
Furthermore, it once again fixes a problem with the Rank function in
HLL.
non-partial connections.
Before, if we saw a responder-side SYN/ACK, but had not seen the
initial orginator-side SYN, Bro would treat the connection as partial,
meaning that most application-layer analyzers would refuse to inspect
the payload. That was unfortunate because all payload data was
actually there (and even passed to the analyzers). This change make
Bro consider these connections as complete, so that analyzers will
just normally process them.
The leads to couple more connections in the test-suite to now being
analyzed.
Addresses #1492. (I used an HTTP trace for debugging instead of the
HTTPS trace from the ticket, as the clear-text makes it easier to
track the data flow).
When inserting, existance of the given subnet is checked using exact
matching instead of longest prefix matching. Before, inserting a subnet
would have updated the subnet item, which is the longest prefix of the
inserted subnet, if present.
One tweak: I made ts optional and set it to network_time() if not given.
BIT-1578 #merged
* origin/topic/johanna/bit-1578:
Weird: fix potential small issue when ignoring duplicates
Rewrite weird logging.
- SMTP protocol headers now do some minimal parsing to clean up
email addresses.
- New function named split_mime_email_addresses to take MIME headers
and get addresses split apart but including the display name.
- Update tests.
In all versions so far, the identifier string that was used for
comparisons might have been different from the identifier string that
was added (when certain notices are used).
This commit rewrites the way that weirds are logged and fixes a number
of issues on the way. Most prominently, flow weirds now actually log
information about the flow that they occur in (before this change, they
only logged the name of the weird, which is only marginally helpful).
Besides restructuring how weird logging works internally, weirds can now
also be generated by calling Weird::weird with the info record directly,
allowing more fine-granular passing of information. This is e.g. used
for DNS weirds, which do not have the connection record available any
more when they are generated (before data like the connection ID was
just not logged in these instances).
Addresses BIT-1578
File Analysis Framework related code has been moved into a separate
script. Using redefinitions of the corresponding records causes the
file-related columns to appear last.