Commit graph

339 commits

Author SHA1 Message Date
Jon Siwek
b22ca5d0a3 Replace libmagic w/ Bro signatures for file MIME type identification.
Notable changes:

- libmagic is no longer used at all.  All MIME type detection is
  done through new Bro signatures, and there's no longer a means to get
  verbose file type descriptions (e.g. "PNG image data, 1435 x 170").
  The majority of the default file magic signatures are derived
  from the default magic database of libmagic ~5.17.

- File magic signatures consist of two new constructs in the
  signature rule parsing grammar: "file-magic" gives a regular
  expression to match against, and "file-mime" gives the MIME type
  string of content that matches the magic and an optional strength
  value for the match.

- Modified signature/rule syntax for identifiers: they can no longer
  start with a '-', which made for ambiguous syntax when doing negative
  strength values in "file-mime".  Also brought syntax for Bro script
  identifiers in line with reality (they can't start with numbers or
  include '-' at all).

- A new Built-In Function, "file_magic", can be used to get all
  file magic matches and their corresponding strength against a given
  chunk of data

- The second parameter of the "identify_data" Built-In Function
  can no longer be used to get verbose file type descriptions, though it
  can still be used to get the strongest matching file magic signature.

- The "file_transferred" event's "descr" parameter no longer
  contains verbose file type descriptions.

- The BROMAGIC environment variable no longer changes any behavior
  in Bro as magic databases are no longer used/installed.

- Reverted back to minimum requirement of CMake 2.6.3 from 2.8.0
  (it's back to being the same requirement as the Bro v2.2 release).
  The bump was to accomodate building libmagic as an external project,
  which is no longer needed.

Addresses BIT-1143.
2014-03-04 11:12:06 -06:00
Bernhard Amann
f821a13cce Merge remote-tracking branch 'origin/master' into topic/bernhard/file-analysis-x509
Conflicts:
	src/analyzer/protocol/ssl/events.bif

Still broken.
2014-01-28 06:43:08 -08:00
Bernhard Amann
2c7e7f962e Make x509 certificates an opaque type 2014-01-28 06:39:50 -08:00
Robin Sommer
191b63e334 Merge branch 'topic/robin/dynamic-plugins-2.3' into topic/robin/pktsrc 2014-01-27 09:31:15 -08:00
Robin Sommer
3f47c5bc87 Merge remote-tracking branch 'origin/master' into topic/robin/dynamic-plugins-2.3 2014-01-24 20:26:00 -08:00
Robin Sommer
a80dd10215 Updates of the dynamic plugin code.
Includes:

    - Cleanup of the plugin API, in particular generally changing
      const char* to std::string

    - Renaming environment variable BRO_PLUGINS to BRO_PLUGIN_PATH,
      defaulting to <prefix>/lib/bro/plugins

    - Reworking how dynamic plugins are searched and activated. See
      doc/devel/plugins.rst for details.

    - New @load-plugin directive to explicitly activate a plugin

    - Support for Darwin. (Linux untested right now)

    - The init-plugin updates come with support for "make test", "make
      sdist", and "make bdist" (see how-to).

    - Test updates.

Notes: The new hook mechanism, which allows plugins to hook into Bro's
core a well-defined points, is still essentially untested.
2013-12-16 11:57:56 -08:00
Robin Sommer
93d9dde969 IOSource reorg.
A bunch of infrastructure work to move IOSource, IOSourceRegistry (now
iosource::Manager) and PktSrc/PktDumper code into iosource/, and over
to a plugin structure.

Other IOSources aren't touched yet, they are still in src/*.

It compiles and does something with a small trace, but that's all I've
tested so far. There are quite certainly a number of problems left, as
well as various TODOs and cleanup; and nothing's cast in stone yet.

Will continue to work on this.
2013-12-11 18:00:34 -08:00
Jon Siwek
dedc39d784 Minor Broxygen improvements, addresses BIT-1098.
- Internals: move type alias table to private static BroType member.

- Sphinx extension: now uses absolute path to bro binary.

- reST ouput formatting: remove "param" from function desriptions
  and change package overview docs so script link+summaries render
  consistently.
2013-12-06 09:35:35 -06:00
Robin Sommer
bda0c29f66 Restructuring the plugin API to accomodate hooks.
I got rid of the earlier separate InterpreterPlugin class. Instead
Plugin now has a set of virtual methods HookSomething()... that
plugins can override. For efficiency purposes, they however need to
register first that they are interested in a hook, otherwise the
virtual method will never be called. The idea is to extend the set of
hooks over time as we figure out what's useful.

This is a checkpoint commit that's essentially untested and probably
broken. It compiles, though.
2013-11-26 14:04:29 -08:00
Robin Sommer
555df1e7ea Checkpointing the dynamic plugin code.
This is essentially the code from the dynamic-plugin branch except for
some pieces that I have split out into separate, earlier commits.

I'm going to updatre things in this branch going forward.
2013-11-26 14:04:29 -08:00
Jon Siwek
e58865af22 Internal Broxygen organization/documentation/polish. 2013-11-25 14:36:05 -06:00
Jon Siwek
4f6d01000a Implement majority of Broxygen features delegated to Bro.
Still have to update the Sphinx integration.
2013-11-14 14:00:51 -06:00
Jon Siwek
52733b0501 Quick optimization to Broxygen doc gathering.
So script parsing is only ~2x slower rather than 20x.  Turns out cloning
Vals is particularly slow.  Changed to just get a string description of
the Val for initial value and redef tracking.
2013-10-22 16:29:58 -05:00
Jon Siwek
f18436640e Flesh out Broxygen doc-gathering skeleton. 2013-10-22 14:45:47 -05:00
Jon Siwek
68227f112d Merge branch 'master' into topic/jsiwek/broxygen 2013-10-03 13:06:23 -05:00
Jon Siwek
5a857a6dfc Initial skeleton of new Broxygen infrastructure.
Doesn't generate any docs, but it's hooked in to all places needed to
gather the necessary stuff w/ significantly less coupling than before.

The gathering now always occurs unconditionally to make documentation
available at runtime and a command line switch (-X) only toggles whether
to output docs to disk (reST format).

Should also improve the treatment of type name aliasing which wasn't a
big problem in practice before, but I think it's more correct now:
there's now a distinct BroType for each alias, but extensible types
(record/enum) will automatically update the types for aliases on redef.

Other misc refactoring of note:

    - Removed a redundant/unused way of declaring event types.

    - Changed type serialization format/process to preserve type name
      information and remove compatibility code (since broccoli will
      have be updated anyway).
2013-10-03 10:42:04 -05:00
Jon Siwek
daf5d0d098 Improve return value checking and error handling. 2013-09-24 17:38:22 -05:00
Jon Siwek
0b97343ff7 Fix various potential memory leaks.
Though I expect most not to be exercised in practice.
2013-09-12 15:23:52 -05:00
Bernhard Amann
2dd0d057e6 Merge remote-tracking branch 'origin/master' into topic/bernhard/hyperloglog
Conflicts:
	src/NetVar.cc
	src/NetVar.h
2013-08-30 08:43:47 -07:00
Jon Siwek
dc2e3d6e04 Fix global opaque val segfault, addresses BIT-1071
The opaque types need to be created before scripts are parsed.
2013-08-29 17:17:40 -05:00
Jon Siwek
dc370fdd8d Fix a deadlock w/ SQLite.
sqlite3_shutdown() was called a bit too early, when SQLite-using
threads may still have yet to fully shutdown.
2013-08-19 14:18:18 -05:00
Jon Siwek
d3dad31bdc Raw input reader command execution "fixes".
- Primarily working around an issue that occurs when threads
  concurrently create pipes and fork a child process.  See comment in
  code...

- Other minor cleanup of the code:  making sure the child process calls
  _exit() versus exit(), limits itself to few select system calls before
  the exec(), and closes more unused file descriptors.
2013-08-14 11:37:30 -05:00
Jon Siwek
d84f6e012c Fix various documentation, mostly related to file analysis.
- Fix examples/references in the file analysis how-to/usage doc.

- Add Broxygen-generated docs for file analyzer plugins.

- Break FTP::Info type declaration out in to its own file to get
  rid of some circular dependencies (between s/b/p/ftp/main and
  s/b/p/ftp/utils).
2013-07-29 16:15:37 -05:00
Seth Hall
5f8ee93ef0 Merge remote-tracking branch 'origin/master' into topic/seth/analyzer-framework
Conflicts:
	scripts/base/init-default.bro
	scripts/base/protocols/dns/main.bro
	scripts/base/protocols/ftp/main.bro
	scripts/base/protocols/http/main.bro
	scripts/base/protocols/irc/main.bro
	scripts/base/protocols/smtp/main.bro
	scripts/base/protocols/ssh/main.bro
	scripts/base/protocols/ssl/main.bro
	scripts/base/protocols/syslog/main.bro
	src/main.cc
	testing/btest/Baseline/coverage.default-load-baseline/canonified_loaded_scripts.log
2013-07-04 23:07:52 -04:00
Robin Sommer
96fe05633a Merge remote-tracking branch 'origin/topic/bernhard/input-update'
Closes #1021.

* origin/topic/bernhard/input-update:
  this event handler fails the unused-event-handlers test because it is a bit of a special case.
  ...and fix the event ordering issue. Dispatch != QueueEvent
  add Terminate to input framework to prevent potential shutdown race-conditions.
  fix warning.
  fix stderr test. ls behaves differently on errors on linux...
  small fixes.
  linux does not have strnstr
  and close only fds that are currently open (the logging framework really did not like that :) )
  A bunch of more changes for the raw reader
  make reading from stdout and stderr simultaneously work.
  allow sending data to stdin of child process
  Streaming reads from external commands work without blocking anything.
  replace popen with fork and exec.
  change raw reader to use basic c io instead of fdstream encapsulation class.
2013-07-03 16:52:28 -07:00
Robin Sommer
a329c3e7c3 Merge remote-tracking branch 'origin/topic/jsiwek/plugin-docs'
Closes #1019.

* origin/topic/jsiwek/plugin-docs:
  Teach broxygen to generate protocol analyzer plugin reference.
  const adjustments
2013-07-03 16:32:00 -07:00
Jon Siwek
7c7b6214a6 Move file analyzers to new plugin infrastructure. 2013-06-10 15:50:18 -05:00
Bernhard Amann
3517c0ba99 add Terminate to input framework to prevent potential shutdown race-conditions. 2013-06-09 08:27:08 -04:00
Jon Siwek
e56a17102e Teach broxygen to generate protocol analyzer plugin reference. 2013-06-07 13:21:18 -05:00
Robin Sommer
23463d064c Little fixes. 2013-05-30 19:13:08 -07:00
Robin Sommer
e3a7e0301b Cleanup and more API docs. 2013-05-30 16:45:14 -07:00
Robin Sommer
eb637f9f3e Merge remote-tracking branch 'origin/master' into topic/robin/plugins
Thanks to git this merge was less troublesome that I was afraid it
would be. Not all tests pass yet though (and file hashes have changed
unfortunately).

Conflicts:
	cmake
	doc/scripts/DocSourcesList.cmake
	scripts/base/init-bare.bro
	scripts/base/protocols/ftp/main.bro
	scripts/base/protocols/irc/dcc-send.bro
	scripts/test-all-policy.bro
	src/AnalyzerTags.h
	src/CMakeLists.txt
	src/analyzer/Analyzer.cc
	src/analyzer/protocol/file/File.cc
	src/analyzer/protocol/file/File.h
	src/analyzer/protocol/http/HTTP.cc
	src/analyzer/protocol/http/HTTP.h
	src/analyzer/protocol/mime/MIME.cc
	src/event.bif
	src/main.cc
	src/util-config.h.in
	testing/btest/Baseline/coverage.bare-load-baseline/canonified_loaded_scripts.log
	testing/btest/Baseline/coverage.default-load-baseline/canonified_loaded_scripts.log
	testing/btest/Baseline/istate.events-ssl/receiver.http.log
	testing/btest/Baseline/istate.events-ssl/sender.http.log
	testing/btest/Baseline/istate.events/receiver.http.log
	testing/btest/Baseline/istate.events/sender.http.log
2013-05-16 17:58:48 -07:00
Robin Sommer
4b86730ef2 Reintroducing the logging::Manager's Terminate() method.
It doesn't do anything else than simply forwarding to FlushBuffers().

This is just for consistency in terminate_bro() where components get
their Terminate() called so that the main code doesn't need to know
anything more specific about what particular action to take at
shutdown.
2013-05-15 17:19:52 -07:00
Robin Sommer
639a6410c6 Merge remote-tracking branch 'origin/topic/bernhard/thread-cleanup'
* origin/topic/bernhard/thread-cleanup:
  and just to be really sure - always make threads go through OnWaitForStop
  hopefully finally fix last interesting race-condition
  it is apparently getting a bit late for changes at important code...
  spoke to soon (forgot to comment in line again).
  Change thread shutdown again to also work with input framework.
  Changing semantics of thread stop methods.
  Support for cleaning up threads that have terminated.
2013-05-15 17:16:41 -07:00
Robin Sommer
358528732c Merge branch 'topic/robin/sqlite-merge'
Closes #997.

* topic/robin/sqlite-merge: (25 commits)
  Fix to make sqlite test consistent, and updating coverage baselines
  Avoid a CMake warning about 3rdparty looking like a number.
  Fixing linker error.
  and there is no has-reader.
  make sqlite3 executable required and add test-cases for errors
  Renaming src/external -> src/3rdparty
  fix a few small rough edges (mostly comments that do no longer apply)
  fix bug in input-manager regarding enums that a writer reads without 0-terminating the string
  actually make sqlite work again (tests passed because the writer was not actually defined because of the define.)
  add sqlite distribution.
  fix warnings, update baselines, handle rotation
  add sqlite tests and fix small vector/set escaping bugs
  fix small bug with vectors and sets.
  make work with newer AsciiFormatter.
  start adding a different text for empty records for the sqlite writer.
  no, you will never guess from where I copied this file...
  make sqlite support more or less work for logging and input
  make sqlite-writer more stable.
  make it compile with new version of AsciiInputOutput
  and adapt to AsciiInputOutput - seems to work...
  ...

Conflicts:
	scripts/base/frameworks/input/__load__.bro
	src/CMakeLists.txt
	src/input.bif
	src/input/Manager.cc
	src/main.cc
	src/types.bif
	testing/btest/Baseline/coverage.bare-load-baseline/canonified_loaded_scripts.log
	testing/btest/Baseline/coverage.default-load-baseline/canonified_loaded_scripts.log
2013-05-15 16:03:19 -07:00
Bernhard Amann
f389cafc3b Merge remote-tracking branch 'origin/master' into topic/bernhard/thread-cleanup
Conflicts:
	src/main.cc
2013-05-15 16:00:49 -07:00
Bernhard Amann
39f1b9e01f Change thread shutdown again to also work with input framework.
Seems to work, tests pass, but not really verified.

Major change 1:
finished flag in MsgThread was replaced by 2 flags:
child_finished and main_finished.

child_finished is set by child_thread and means that the processing
loop is stopped immediately (no longer needed, no new input messages
will be processed, if loop continues running there is an ugly delay
on shutdown). (This took me a while to realize...)

main_finished is set by a message that is sent back by the child
to the main thread when Finished() is called (and child_finished
is set). when main_finished is set, processing of output messages
stops. But all messages that the child thread pushed in the queue
before calling Finish() are still processed.

Change 2:
Logging terminate call was replaced by a smaller call that just
flushes out the cache held by the main thread. This call
has to be done before thread shutdown is called - otherwhise
the threads will be shut down before all messages are pushed
on them. (This also took me a while to realize...).

Change 3:
Input framework actually calls it stop methods correctly (everything
was prepared, function call was missing)
2013-05-14 23:45:55 -07:00
Robin Sommer
abb350b535 Renaming src/external -> src/3rdparty
We should eventually move more 3rdparty code in here.
2013-05-14 17:14:08 -07:00
Robin Sommer
de88645d05 Merge remote-tracking branch 'origin/topic/bernhard/sqlite'
* origin/topic/bernhard/sqlite:
  fix a few small rough edges (mostly comments that do no longer apply)
  fix bug in input-manager regarding enums that a writer reads without 0-terminating the string
  actually make sqlite work again (tests passed because the writer was not actually defined because of the define.)
  add sqlite distribution.
  fix warnings, update baselines, handle rotation
  add sqlite tests and fix small vector/set escaping bugs
  fix small bug with vectors and sets.
  make work with newer AsciiFormatter.
  start adding a different text for empty records for the sqlite writer.
  no, you will never guess from where I copied this file...
  make sqlite support more or less work for logging and input
  make sqlite-writer more stable.
  make it compile with new version of AsciiInputOutput
  and adapt to AsciiInputOutput - seems to work...
  make it compile
  add SQLite reader.
  ...adapt to new api...
  now the writer supports tables and vectors.
  basic sqlite writer seems to work.
2013-05-14 17:11:09 -07:00
Jon Siwek
0141f51801 FileAnalysis: load custom mime magic database just once.
This works around a bug in libmagic since version 5.12 (current at
time of writing is 5.14) -- second call to magic_load() w/ non-default
database segfaults.
2013-04-29 12:49:22 -05:00
Jon Siwek
037d582b0e FileAnalysis: add custom libmagic database.
- It's derived from the magic database of libmagic 5.14, but with most
  everything not related to mime types removed.

- The custom database is always used by default for mime detection, but
  the more verbose file type detection will fall back on the default
  libmagic installation's database.  The result is: mime type strings
  are now guaranteed to be consistent across platforms, but the verbose
  file type descriptions are not.

- The custom database gets installed in $prefix/share/bro/magic, and
  should even be extensible if files with new patterns are added inside
  the directory.

- The search path for the mime magic database can be controlled via
  BROMAGIC environment variable.

- Remove mime_desc field from ftp.log.

- Stop using the mime/file type canonifier with unit tests.

- libmagic >= 5.04 is now a requirement.
2013-04-12 11:58:19 -05:00
Robin Sommer
2002787c6e A set of interface changes in preparation for merging into BinPAC++
branch.
2013-04-09 17:16:27 -07:00
Robin Sommer
897be0e147 Giving analyzer/ its own CMakeLists.txt.
Also moving src/analyzer.bif to src/analyzer/analyzer.bif, along with
the infrastructure to build/incude bif code at other locations.

We should generally move to having per-directory CMakeLists.txt. I'll
convert the others over later.
2013-04-04 16:53:21 -07:00
Robin Sommer
40ca718e90 Removing the --use-binpac switch. 2013-04-03 13:40:49 -07:00
Robin Sommer
19c1816ebb Infrastructure for modularizing protocol analyzers.
There's now a new directory "src/protocols/", and the plan is for each
protocol analyzer to eventually have its own subdirectory in there
that contains everything it defines (C++/pac/bif). The infrastructure
to make that happen is in place, and two analyzers have been
converted to the new model, HTTP and SSL; there's no further
HTTP/SSL-specific code anywhere else in the core anymore (I believe :-)

Further changes:

    - -N lists available plugins, -NN lists more details on what these
      plugins provide (analyzers, bif elements). (The latter does not
      work for analyzers that haven't been converted yet).

    - *.bif.bro files now go into scripts/base/bif/; and
      scripts/base/bif/plugins/ for bif files provided by plugins.

    - I've factored out the bifcl/binpac CMake magic from
      src/CMakeLists.txt to cmake/{BifCl,Binpac}

    - There's a new cmake/BroPlugin that contains magic to allow
      plugins to have a simple CMakeLists.txt. The hope is that
      eventually the same CMakeLists.txt can be used for compiling a
      plugin either statically or dynamically.

    - bifcl has a new option -c that changes the code it generates so
      that it can be used with a plugin.

TODOs:
    - "make install" is probably broken.
    - Broxygen is probably broken for plugin-defined events.
    - event groups are broken (do we want to keep them?)
2013-03-29 19:59:31 -07:00
Robin Sommer
eef4858692 Fixes for non-OSX. 2013-03-26 13:08:03 -07:00
Robin Sommer
af1809aaa3 First prototype of new analyzer framework.
This is a larger internal change that moves the analyzer
infrastructure to a more flexible model where the available analyzers
don't need to be hardcoded at compile time anymore. While currently
they actually still are, this will in the future enable external
analyzer plugins. For now, it does already add the capability to
dynamically enable/disable analyzers from script-land, replacing the
old Analyzer::Available() methods.

There are three major parts going into this:

    - A new plugin infrastructure in src/plugin. This is independent
      of analyzers and will eventually support plugins for other parts
      of Bro as well (think: readers and writers). The goal is that
      plugins can be alternatively compiled in statically or loadead
      dynamically at runtime from a shared library. While the latter
      isn't there yet, there'll be almost no code change for a plugin
      to make it dynamic later (hopefully :)

    - New analyzer infrastructure in src/analyzer. I've moved a number
      of analyzer-related classes here, including Analyzer and DPM;
      the latter now renamed to Analyzer::Manager. More will move here
      later. Currently, there's only one plugin here, which provides
      *all* existing analyzers. We can modularize this further in the
      future (or not).

    - A new script interface in base/framework/analyzer. I think that
      this will eventually replace the dpm framework, but for now
      that's still there as well, though some parts have moved over.

I've also remove the dpd_config table; ports are now configured via
the analyzer framework. For exmaple, for SSH:

    const ports = { 22/tcp } &redef;

    event bro_init() &priority=5
        {
        ...
        Analyzer::register_for_ports(Analyzer::ANALYZER_SSH, ports);
        }

As you can see, the old ANALYZER_SSH constants have more into an enum
in the Analyzer namespace.

This is all hardly tested right now, and not everything works yet.
There's also a lot more cleanup to do (moving more classes around;
removing no longer used functionality; documenting script and C++
interfaces; regression tests). But it seems to generally work with a
small trace at least.

The debug stream "dpm" shows more about the loaded/enabled analyzers.

A new option -N lists loaded plugins and what they provide (including
those compiled in statically; i.e., right now it outputs all the
analyzers).

This is all not cast-in-stone yet, for some things we need to see if
they make sense this way. Feedback welcome.
2013-03-26 11:05:38 -07:00
Bernhard Amann
8cb91de93a Merge remote-tracking branch 'origin/master' into topic/bernhard/sqlite
Conflicts:
	src/threading/AsciiFormatter.cc
2013-03-11 11:47:10 -07:00
Jon Siwek
f8af42cf9a Reorganizing file analysis source code. 2013-02-14 16:07:42 -06:00
Jon Siwek
b9d204005d Merge branch 'master' into topic/jsiwek/file-analysis 2013-02-08 09:53:27 -06:00