Commit graph

89 commits

Author SHA1 Message Date
Johanna Amann
6b9abe85a7 Add error events to input framework.
This change introduces error events for Table and Event readers. Users
can now specify an event that is called when an info, warning, or error
is emitted by their input reader. This can, e.g., be used to raise
notices in case errors occur when reading an important input stream.

Example:

event error_event(desc: Input::TableDescription, msg: string, level: Reporter::Level)
	{
	...
	}

event bro_init()
	{
	Input::add_table([$source="a", $error_ev=error_event, ...]);
	}

For the moment, this converts all errors in the Asciiformatter into
warnings (to show that they are non-fatal) - the Reader itself also has
to throw an Error to show that a fatal error occurred and processing
will be aborted.

It might be nicer to change this and require readers to mark fatal
errors as such when throwing them.

Addresses BIT-1181
2016-07-22 19:45:28 -07:00
Robin Sommer
4d84ee82da Merge remote-tracking branch 'origin/topic/johanna/bit-1612'
Adding a new random seed for external tests.

I added a wrapper around the siphash() function to make calling it a
little bit safer at least.

BIT-1612 #merged

* origin/topic/johanna/bit-1612:
  HLL: Fix missing typecast in test case.
  Remove the -K/-J options for setting keys.
  Add test checking the quality of HLL by adding a lot of elements.
  Fix serializing probabilistic hashers.
  Baseline updates after hash function change.
  Also switch BloomFilters from H3 to siphash.
  Change Hashing from H3 to Siphash.
  HLL: Remove unnecessary comparison.
  Hyperloglog: change calculation of Rho
2016-07-14 16:26:17 -07:00
Johanna Amann
499ed5b566 Remove the -K/-J options for setting keys.
The options were never really used and do not seem especially useful;
initialization with a seed file still works.

This also fixes a bug with the initialization of the siphash key.
2016-07-13 16:57:53 -07:00
Johanna Amann
e1218cc7fa Change Hashing from H3 to Siphash.
This commit mostly changes the hash function that is used for internal
hashing of data < 36 bytes from H3 to Siphash. This change is motivated
by the fact that H3 apparently does not deliver a very good source of
data uniqueness; running HLL with H3 as the hashing function gives
quite poor results (up to 75% off in my tests). In contrast, running
HLL with Siphash (or HMAC-MD5) brings this error down to ~2%.

This also fixes a long-standing bug in Hash.h which truncated our hash
values to 32 bit on most machines.

Furthermore, it once again fixes a problem with the Rank function in
HLL.
2016-07-13 06:44:51 -07:00
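In HyperLogLog generally, the rank (rho) of a hash value is the 1-based position of its leftmost 1-bit within the bits left over after the register index has been taken. The following is a minimal sketch of that idea only; the name `hll_rank` and its parameters are hypothetical and do not reflect Bro's actual code.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not Bro's implementation.
// Returns the 1-based position of the leftmost 1-bit among the lowest
// `bits` bits of `value`, or bits + 1 if all of those bits are zero.
inline int hll_rank(uint64_t value, int bits)
{
    assert(bits >= 1 && bits <= 64);

    for (int pos = 1; pos <= bits; ++pos) {
        // Test bits from the most significant end of the window downwards.
        uint64_t mask = uint64_t(1) << (bits - pos);

        if (value & mask)
            return pos;
    }

    return bits + 1;
}
```

With an 8-bit window, `hll_rank(0x80, 8)` yields 1 and `hll_rank(0, 8)` yields 9. A poorly distributed hash skews these ranks, which is one way an estimate can drift as far as the message describes.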
Seth Hall
d9d579c52c Merge remote-tracking branch 'origin/master' into topic/seth/stats-improvement 2016-05-02 14:34:29 -04:00
Johanna Amann
446a44787a Remove old string functions.
More specifically, this removes the functions:
strcasecmp_n
strchr_n
strrchr_n

and replaces the calls with the respective C-library calls that should
be part of just about all operating systems by now.
2016-03-04 12:02:19 -08:00
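The message does not name the C-library replacements. As an illustration only, the sketch below assumes `strncasecmp` (POSIX) and `memchr` for length-delimited buffers; the helper names are hypothetical and are not taken from the commit.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>    // strlen, memchr
#include <strings.h>  // strncasecmp (POSIX)

// Hypothetical helper: case-insensitive comparison of a length-delimited
// buffer against a NUL-terminated string.
inline bool equals_ignore_case(const char* buf, size_t len, const char* text)
{
    return std::strlen(text) == len && strncasecmp(buf, text, len) == 0;
}

// Hypothetical helper: first occurrence of `c` in a length-delimited
// buffer, or nullptr if it does not occur.
inline const char* find_char(const char* buf, size_t len, char c)
{
    return static_cast<const char*>(std::memchr(buf, c, len));
}
```

Both helpers take an explicit length, so they also work on buffers that are not NUL-terminated.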
Seth Hall
6d836b7956 More stats improvements
Broke out the stats collection into a bunch of new Bifs
in stats.bif.  Scripts that use stats collection functions
have also been updated.  More work to do.
2016-01-07 16:20:24 -05:00
Robin Sommer
3957091e1b Renaming config.h to bro-config.h.
A couple times now I had this conflicting with files of the same name
in other projects.
2015-07-28 11:57:04 -07:00
Robin Sommer
0f96d06252 Making plugin names case-insensitive for some internal comparisons.
Makes the plugin system a bit more tolerant against spelling
inconsistencies that would be hard to catch otherwise.
2015-02-16 20:26:23 -08:00
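A case-insensitive name comparison of this kind can be sketched as follows. The function names are hypothetical, and the sketch assumes plain ASCII plugin names; it is not the code from the commit.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>

// Returns a lower-cased copy of `s` (ASCII only).
inline std::string to_lower_copy(std::string s)
{
    std::transform(s.begin(), s.end(), s.begin(), [](unsigned char c) {
        return static_cast<char>(std::tolower(c));
    });
    return s;
}

// Hypothetical helper: true if two plugin names differ at most in case.
inline bool same_plugin_name(const std::string& a, const std::string& b)
{
    return to_lower_copy(a) == to_lower_copy(b);
}
```

With this, `same_plugin_name("Demo::Foo", "demo::foo")` is true, so a differently capitalized name still finds the plugin.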
Jon Siwek
88af106b6b Fix use of deprecated gperftools headers.
As of gperftools 2.0 (Feb. 2012), they've been moved into
gperftools/ instead of google/, and as of gperftools 2.2, including
the latter emits deprecation warnings.
2015-02-11 13:56:34 -06:00
Jon Siwek
69b1ba653d Minor adjustments to plugin code/docs.
Mostly whitespace/typos.
Moved some Plugin methods out from public access.
2014-07-30 16:48:23 -05:00
Robin Sommer
551950c438 Adding environment variable BRO_PLUGIN_ACTIVATE that unconditionally
activates plugins.

Plugins are specified with a comma-separated list of names.
2014-05-29 18:15:18 -07:00
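Reading such a comma-separated list from the environment could look roughly like the sketch below. Only the variable name BRO_PLUGIN_ACTIVATE comes from the commit message; the function names are hypothetical, and skipping empty entries without trimming whitespace is an assumption.

```cpp
#include <cassert>
#include <cstdlib>
#include <sstream>
#include <string>
#include <vector>

// Splits a comma-separated list into names, skipping empty entries.
// Whitespace is not trimmed (an assumption of this sketch).
inline std::vector<std::string> split_plugin_names(const std::string& list)
{
    std::vector<std::string> names;
    std::istringstream in(list);
    std::string name;

    while (std::getline(in, name, ','))
        if (!name.empty())
            names.push_back(name);

    return names;
}

// Returns the plugin names listed in BRO_PLUGIN_ACTIVATE, or an empty
// vector if the variable is not set.
inline std::vector<std::string> plugins_to_activate()
{
    const char* env = std::getenv("BRO_PLUGIN_ACTIVATE");
    return env ? split_plugin_names(env) : std::vector<std::string>();
}
```

For example, `split_plugin_names("Demo::Foo,Demo::Bar")` returns the two names in order.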
Robin Sommer
bbd409d274 Merge remote-tracking branch 'origin/master' into topic/robin/dynamic-plugins-2.3
(Never good to name a branch after the version anticipated to include it ...)
2014-05-14 16:23:04 -07:00
Jon Siwek
e8a5ea8844 Refactor various hex escaping code. 2014-04-18 13:19:50 -05:00
Jon Siwek
b22ca5d0a3 Replace libmagic w/ Bro signatures for file MIME type identification.
Notable changes:

- libmagic is no longer used at all.  All MIME type detection is
  done through new Bro signatures, and there's no longer a means to get
  verbose file type descriptions (e.g. "PNG image data, 1435 x 170").
  The majority of the default file magic signatures are derived
  from the default magic database of libmagic ~5.17.

- File magic signatures consist of two new constructs in the
  signature rule parsing grammar: "file-magic" gives a regular
  expression to match against, and "file-mime" gives the MIME type
  string of content that matches the magic and an optional strength
  value for the match.

- Modified signature/rule syntax for identifiers: they can no longer
  start with a '-', which made for ambiguous syntax when doing negative
  strength values in "file-mime".  Also brought syntax for Bro script
  identifiers in line with reality (they can't start with numbers or
  include '-' at all).

- A new Built-In Function, "file_magic", can be used to get all
  file magic matches and their corresponding strength against a given
  chunk of data.

- The second parameter of the "identify_data" Built-In Function
  can no longer be used to get verbose file type descriptions, though it
  can still be used to get the strongest matching file magic signature.

- The "file_transferred" event's "descr" parameter no longer
  contains verbose file type descriptions.

- The BROMAGIC environment variable no longer changes any behavior
  in Bro as magic databases are no longer used/installed.

- Reverted back to minimum requirement of CMake 2.6.3 from 2.8.0
  (it's back to being the same requirement as the Bro v2.2 release).
  The bump was to accommodate building libmagic as an external project,
  which is no longer needed.

Addresses BIT-1143.
2014-03-04 11:12:06 -06:00
Robin Sommer
3f47c5bc87 Merge remote-tracking branch 'origin/master' into topic/robin/dynamic-plugins-2.3 2014-01-24 20:26:00 -08:00
Robin Sommer
a80dd10215 Updates of the dynamic plugin code.
Includes:

    - Cleanup of the plugin API, in particular generally changing
      const char* to std::string

    - Renaming environment variable BRO_PLUGINS to BRO_PLUGIN_PATH,
      defaulting to <prefix>/lib/bro/plugins

    - Reworking how dynamic plugins are searched and activated. See
      doc/devel/plugins.rst for details.

    - New @load-plugin directive to explicitly activate a plugin

    - Support for Darwin. (Linux untested right now)

    - The init-plugin updates come with support for "make test", "make
      sdist", and "make bdist" (see how-to).

    - Test updates.

Notes: The new hook mechanism, which allows plugins to hook into Bro's
core at well-defined points, is still essentially untested.
2013-12-16 11:57:56 -08:00
Robin Sommer
987452beff Cleanup of plugin component API.
- Move more functionality into base class.
- Remove cctors and assignment operators (weren't actually needed anymore)
- Switch from const char* to std::string.
2013-12-16 10:07:20 -08:00
Jon Siwek
5a67135486 Fix uninitialized field in basename/dirname util wrapper.
Shouldn't cause a problem as it's always set in subclass ctors,
just silences a coverity warning.
2013-12-10 14:08:09 -06:00
Jon Siwek
21df25d429 Fix build on FreeBSD.
basename(3)/dirname(3) const-ness may vary w/ platform.
2013-12-05 11:01:44 -06:00
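One common way to stay independent of the platform's const-ness is to hand basename(3)/dirname(3) a private mutable copy of the path and copy the result out immediately. The sketch below shows that approach; the function names are hypothetical and this is not the wrapper from the commit.

```cpp
#include <cassert>
#include <libgen.h>  // basename(3), dirname(3)
#include <string>
#include <vector>

// Returns a mutable, NUL-terminated copy of `path`.
inline std::vector<char> mutable_copy(const std::string& path)
{
    std::vector<char> buf(path.begin(), path.end());
    buf.push_back('\0');
    return buf;
}

// Hypothetical wrapper: works whether basename() takes char* or const char*.
inline std::string basename_copy(const std::string& path)
{
    std::vector<char> buf = mutable_copy(path);
    // The result may point into buf or into static storage, so copy it now.
    const char* res = basename(buf.data());
    return res ? std::string(res) : std::string();
}

// Hypothetical wrapper: same idea for dirname().
inline std::string dirname_copy(const std::string& path)
{
    std::vector<char> buf = mutable_copy(path);
    const char* res = dirname(buf.data());
    return res ? std::string(res) : std::string();
}
```

Passing a `char*` compiles against both declarations, because it converts implicitly to `const char*` where the platform expects that.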
Robin Sommer
3abf626908 Merge remote-tracking branch 'origin/topic/jsiwek/broxygen'
BIT-1098

* origin/topic/jsiwek/broxygen:
  Fix Broxygen-related compile errors.
  Add a Broxygen coverage test.
  Internal Broxygen organization/documentation/polish.
  Add unit tests for Broxygen config file targets.
  Change Broxygen config file format.
  Broxygen doc-related test updates.  Fix two regressions.
  A couple documentation fixes.
  Integrate new Broxygen functionality into Sphinx.
  Implement majority of Broxygen features delegated to Bro.
  Broxygen can now read a config file specifying particular targets.
  Remove unneeded Broxygen comments in scan.bro.
  Replace safe_basename/safe_dirname w/ SafeBasename/SafeDirname.
  Add BIF interface for retrieving comments/docs.
  Quick optimization to Broxygen doc gathering.
  Flesh out Broxygen doc-gathering skeleton.
  Refactor search_for_file() util function.
  Initial skeleton of new Broxygen infrastructure.
2013-12-04 11:14:19 -08:00
Robin Sommer
bda0c29f66 Restructuring the plugin API to accommodate hooks.
I got rid of the earlier separate InterpreterPlugin class. Instead
Plugin now has a set of virtual methods HookSomething()... that
plugins can override. For efficiency, however, they first need to
register that they are interested in a hook; otherwise the
virtual method will never be called. The idea is to extend the set of
hooks over time as we figure out what's useful.

This is a checkpoint commit that's essentially untested and probably
broken. It compiles, though.
2013-11-26 14:04:29 -08:00
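The register-then-dispatch pattern described in this commit can be sketched as below. All class, enum, and method names here (`HookType`, `EnableHook`, `HookLoadFile`, `Manager`, `CountingPlugin`) are hypothetical stand-ins chosen for illustration and may not match the actual API.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

enum HookType { HOOK_LOAD_FILE, HOOK_CALL_FUNCTION };

class Plugin {
public:
    virtual ~Plugin() {}

    bool HookEnabled(HookType h) const { return enabled.count(h) != 0; }

    // Default implementation does nothing; plugins override it.
    virtual void HookLoadFile(const std::string& /* file */) {}

protected:
    // A plugin must call this to declare interest in a hook.
    void EnableHook(HookType h) { enabled.insert(h); }

private:
    std::set<HookType> enabled;
};

class Manager {
public:
    void Register(Plugin* p) { plugins.push_back(p); }

    // Only plugins that enabled the hook get their virtual method called.
    void HookLoadFile(const std::string& file)
    {
        for (Plugin* p : plugins)
            if (p->HookEnabled(HOOK_LOAD_FILE))
                p->HookLoadFile(file);
    }

private:
    std::vector<Plugin*> plugins;
};

// Example plugin that counts how often its hook runs.
class CountingPlugin : public Plugin {
public:
    explicit CountingPlugin(bool enable)
    {
        if (enable)
            EnableHook(HOOK_LOAD_FILE);
    }

    void HookLoadFile(const std::string&) override { ++calls; }

    int calls = 0;
};
```

A plugin that overrides `HookLoadFile` but never calls `EnableHook` is skipped by the manager, which matches the behavior the message describes. A real implementation would more likely keep one plugin list per hook so that dispatch does no lookup at all.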
Robin Sommer
555df1e7ea Checkpointing the dynamic plugin code.
This is essentially the code from the dynamic-plugin branch except for
some pieces that I have split out into separate, earlier commits.

I'm going to update things in this branch going forward.
2013-11-26 14:04:29 -08:00
Jon Siwek
4f6d01000a Implement majority of Broxygen features delegated to Bro.
Still have to update the Sphinx integration.
2013-11-14 14:00:51 -06:00
Jon Siwek
bdd359d58c Broxygen can now read a config file specifying particular targets.
Though nothing currently gets built as most dependency/outdated
checks and doc-generation methods are still skeleton code.
2013-11-05 16:40:24 -06:00
Jon Siwek
3046013d69 Replace safe_basename/safe_dirname w/ SafeBasename/SafeDirname.
So errors can be better handled.
2013-11-04 11:42:39 -06:00
Jon Siwek
f18436640e Flesh out Broxygen doc-gathering skeleton. 2013-10-22 14:45:47 -05:00
Jon Siwek
90477df973 Refactor search_for_file() util function.
It was getting too bloated and allocated memory in ways that made it
difficult to understand how to manage.  Separated out primarily into
new find_file() and open_file()/open_package() functions.

Also renamed other util functions for path-related things.
2013-10-07 15:01:03 -05:00
Jon Siwek
5a857a6dfc Initial skeleton of new Broxygen infrastructure.
Doesn't generate any docs, but it's hooked in to all places needed to
gather the necessary stuff w/ significantly less coupling than before.

The gathering now always occurs unconditionally to make documentation
available at runtime and a command line switch (-X) only toggles whether
to output docs to disk (reST format).

Should also improve the treatment of type name aliasing which wasn't a
big problem in practice before, but I think it's more correct now:
there's now a distinct BroType for each alias, but extensible types
(record/enum) will automatically update the types for aliases on redef.

Other misc refactoring of note:

    - Removed a redundant/unused way of declaring event types.

    - Changed type serialization format/process to preserve type name
      information and remove compatibility code (since broccoli will
      have to be updated anyway).
2013-10-03 10:42:04 -05:00
Jon Siwek
0b97343ff7 Fix various potential memory leaks.
Though I expect most not to be exercised in practice.
2013-09-12 15:23:52 -05:00
Robin Sommer
d8226169b8 Fixing random number generation so that it returns same numbers as
before.

That broke a lot of tests.
2013-07-24 16:34:52 -07:00
Robin Sommer
474107fe40 Broifying the code.
Also extending API documentation a bit more and fixing a memory leak.
2013-07-23 20:10:32 -07:00
Matthias Vallentin
69a7dd03bc Merge remote-tracking branch 'origin/master' into topic/matthias/bloom-filter 2013-07-22 22:26:15 +02:00
Matthias Vallentin
9f74064289 Expose Bro's linear congruence PRNG as utility function.
It was previously not possible to crank the wheel on the PRNG in a
deterministic way without affecting the globally unique seed. The new extra
utility function bro_prng takes a state in the form of a long int and returns
the new PRNG state, now allowing arbitrary code parts to use the random number
functionality.

This commit also fixes a problem in the H3 constructor, which requires use
of multiple seeds. The single seed passed in now serves as the seed to crank
out as many values as needed using bro_prng.
2013-06-17 14:02:14 -07:00
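A state-passing linear congruential generator of the kind described can be sketched as follows. The constants are the Park-Miller "minimal standard" parameters, chosen here as an assumption; the commit message does not state which constants bro_prng uses. The names `prng_next` and `derive_seeds` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Advances the generator by one step: takes the current state and returns
// the next one. No global state is touched. Expects 0 < state < 2^31 - 1.
inline long prng_next(long state)
{
    const long m = 2147483647;  // 2^31 - 1 (assumed modulus)
    const long a = 16807;       // assumed multiplier
    const long q = m / a;       // 127773
    const long r = m % a;       // 2836

    // Schrage's method: computes (a * state) mod m without overflow,
    // even where long is only 32 bits wide.
    state = a * (state % q) - r * (state / q);

    if (state <= 0)
        state += m;

    return state;
}

// Derives `n` seeds from a single seed by cranking the generator,
// as a constructor needing several seeds might do.
inline std::vector<long> derive_seeds(long seed, size_t n)
{
    std::vector<long> seeds;
    long state = seed;

    for (size_t i = 0; i < n; ++i) {
        state = prng_next(state);
        seeds.push_back(state);
    }

    return seeds;
}
```

Because the caller owns the state, two components can each run their own deterministic sequence without disturbing one another or the global seed.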
Matthias Vallentin
d2d8aff814 Add utility function to access first random seed. 2013-06-14 09:22:48 -07:00
Jon Siwek
7c7b6214a6 Move file analyzers to new plugin infrastructure. 2013-06-10 15:50:18 -05:00
Robin Sommer
eb637f9f3e Merge remote-tracking branch 'origin/master' into topic/robin/plugins
Thanks to git this merge was less troublesome than I was afraid it
would be. Not all tests pass yet though (and file hashes have changed
unfortunately).

Conflicts:
	cmake
	doc/scripts/DocSourcesList.cmake
	scripts/base/init-bare.bro
	scripts/base/protocols/ftp/main.bro
	scripts/base/protocols/irc/dcc-send.bro
	scripts/test-all-policy.bro
	src/AnalyzerTags.h
	src/CMakeLists.txt
	src/analyzer/Analyzer.cc
	src/analyzer/protocol/file/File.cc
	src/analyzer/protocol/file/File.h
	src/analyzer/protocol/http/HTTP.cc
	src/analyzer/protocol/http/HTTP.h
	src/analyzer/protocol/mime/MIME.cc
	src/event.bif
	src/main.cc
	src/util-config.h.in
	testing/btest/Baseline/coverage.bare-load-baseline/canonified_loaded_scripts.log
	testing/btest/Baseline/coverage.default-load-baseline/canonified_loaded_scripts.log
	testing/btest/Baseline/istate.events-ssl/receiver.http.log
	testing/btest/Baseline/istate.events-ssl/sender.http.log
	testing/btest/Baseline/istate.events/receiver.http.log
	testing/btest/Baseline/istate.events/sender.http.log
2013-05-16 17:58:48 -07:00
Jon Siwek
0141f51801 FileAnalysis: load custom mime magic database just once.
This works around a bug in libmagic since version 5.12 (current at
time of writing is 5.14) -- second call to magic_load() w/ non-default
database segfaults.
2013-04-29 12:49:22 -05:00
Jon Siwek
037d582b0e FileAnalysis: add custom libmagic database.
- It's derived from the magic database of libmagic 5.14, but with most
  everything not related to mime types removed.

- The custom database is always used by default for mime detection, but
  the more verbose file type detection will fall back on the default
  libmagic installation's database.  The result is: mime type strings
  are now guaranteed to be consistent across platforms, but the verbose
  file type descriptions are not.

- The custom database gets installed in $prefix/share/bro/magic, and
  should even be extensible if files with new patterns are added inside
  the directory.

- The search path for the mime magic database can be controlled via
  BROMAGIC environment variable.

- Remove mime_desc field from ftp.log.

- Stop using the mime/file type canonifier with unit tests.

- libmagic >= 5.04 is now a requirement.
2013-04-12 11:58:19 -05:00
Robin Sommer
2002787c6e A set of interface changes in preparation for merging into BinPAC++
branch.
2013-04-09 17:16:27 -07:00
Robin Sommer
af1809aaa3 First prototype of new analyzer framework.
This is a larger internal change that moves the analyzer
infrastructure to a more flexible model where the available analyzers
don't need to be hardcoded at compile time anymore. While currently
they actually still are, this will in the future enable external
analyzer plugins. For now, it does already add the capability to
dynamically enable/disable analyzers from script-land, replacing the
old Analyzer::Available() methods.

There are three major parts going into this:

    - A new plugin infrastructure in src/plugin. This is independent
      of analyzers and will eventually support plugins for other parts
      of Bro as well (think: readers and writers). The goal is that
      plugins can be alternatively compiled in statically or loaded
      dynamically at runtime from a shared library. While the latter
      isn't there yet, there'll be almost no code change for a plugin
      to make it dynamic later (hopefully :)

    - New analyzer infrastructure in src/analyzer. I've moved a number
      of analyzer-related classes here, including Analyzer and DPM;
      the latter now renamed to Analyzer::Manager. More will move here
      later. Currently, there's only one plugin here, which provides
      *all* existing analyzers. We can modularize this further in the
      future (or not).

    - A new script interface in base/framework/analyzer. I think that
      this will eventually replace the dpm framework, but for now
      that's still there as well, though some parts have moved over.

I've also removed the dpd_config table; ports are now configured via
the analyzer framework. For example, for SSH:

    const ports = { 22/tcp } &redef;

    event bro_init() &priority=5
        {
        ...
        Analyzer::register_for_ports(Analyzer::ANALYZER_SSH, ports);
        }

As you can see, the old ANALYZER_SSH constants have moved into an enum
in the Analyzer namespace.

This is all hardly tested right now, and not everything works yet.
There's also a lot more cleanup to do (moving more classes around;
removing no longer used functionality; documenting script and C++
interfaces; regression tests). But it seems to generally work with a
small trace at least.

The debug stream "dpm" shows more about the loaded/enabled analyzers.

A new option -N lists loaded plugins and what they provide (including
those compiled in statically; i.e., right now it outputs all the
analyzers).

This is all not cast-in-stone yet, for some things we need to see if
they make sense this way. Feedback welcome.
2013-03-26 11:05:38 -07:00
Jon Siwek
1f6cac9b6d Merge branch 'master' into topic/jsiwek/file-analysis 2013-03-11 13:20:45 -05:00
Robin Sommer
f830ed3edf s/bro-ids.org/bro.org/g 2013-03-07 19:33:04 -08:00
Jon Siwek
589952f4d9 Merge branch 'master' into topic/jsiwek/file-analysis
Conflicts:
	src/FileAnalyzer.cc
	testing/btest/Baseline/coverage.default-load-baseline/canonified_loaded_scripts.log
2013-03-07 11:06:00 -06:00
Jon Siwek
2481f9f837 Fix possible null pointer dereference in identify_data BIF.
There was no check/handling for if magic_buffer() returns null.
Also centralized libmagic calls for consistent error handling/output.
2013-02-27 16:04:36 -06:00
Jon Siwek
f04d189d3f More work on the interface to add/remove file analysis actions.
Added the file extraction action and did other misc. cleanup.  Most of
the minimal core features/support for file analysis should be working at
this point, just have to start fleshing things out.
2013-02-14 12:53:20 -06:00
Robin Sommer
05e6289719 Catching out-of-memory in patricia tree code.
Based on patch by Bill Parker.
2012-12-03 15:42:43 -08:00
Jon Siwek
46d225cc5b Add parsing rules for IPv4/IPv6 subnet literal constants, addresses #888
This fixes specifying IPv4 subnets in IPv4-mapped-IPv6 format with a
mask length relative to the 128 bits of the mapped IPv6 address.
2012-10-22 15:57:21 -05:00
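The relationship between the two mask lengths follows from the layout of an IPv4-mapped IPv6 address (::ffff:a.b.c.d): the first 96 bits are fixed and the IPv4 address occupies the last 32. A minimal sketch of the conversion, with a hypothetical function name and a hypothetical -1 error convention:

```cpp
#include <cassert>

// Hypothetical helper: converts a prefix length expressed relative to the
// 128 bits of an IPv4-mapped IPv6 address into the equivalent IPv4 prefix
// length. Returns -1 if the length does not cover the fixed 96-bit prefix
// or exceeds 128.
inline int mapped_to_ipv4_prefix_len(int len128)
{
    if (len128 < 96 || len128 > 128)
        return -1;

    return len128 - 96;
}
```

Under this reading, a /120 on the mapped form corresponds to a /24 on the IPv4 address.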
Jon Siwek
734e5f68d3 Add more error handling for close() calls. 2012-07-26 12:40:12 -05:00
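Checking the result of close(2) usually means testing for a -1 return and reporting errno. The sketch below shows that pattern with a hypothetical `checked_close` helper; it is not the code added by the commit.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdio>
#include <cstring>
#include <unistd.h>  // close(2)

// Hypothetical helper: closes `fd` and reports failures on stderr.
// Returns true on success, false if close() failed.
inline bool checked_close(int fd)
{
    if (close(fd) == 0)
        return true;

    std::fprintf(stderr, "close(%d) failed: %s\n", fd, std::strerror(errno));
    return false;
}
```

Calling `checked_close(-1)` returns false, because close() fails with EBADF for an invalid descriptor.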
Robin Sommer
a33e9a6941 Fixing FreeBSD compiler error. 2012-07-25 13:58:23 -07:00