Notable changes:
- libmagic is no longer used at all. All MIME type detection is
done through new Bro signatures, and there's no longer a means to get
verbose file type descriptions (e.g. "PNG image data, 1435 x 170").
The majority of the default file magic signatures are derived
from the default magic database of libmagic ~5.17.
- File magic signatures consist of two new constructs in the
signature rule parsing grammar: "file-magic" gives a regular
expression to match against, and "file-mime" gives the MIME type
string of content that matches the magic and an optional strength
value for the match.
- Modified signature/rule syntax for identifiers: they can no longer
start with a '-', which made for ambiguous syntax when doing negative
strength values in "file-mime". Also brought syntax for Bro script
identifiers in line with reality (they can't start with numbers or
include '-' at all).
- A new Built-In Function, "file_magic", can be used to get all
file magic matches and their corresponding strength against a given
chunk of data
- The second parameter of the "identify_data" Built-In Function
can no longer be used to get verbose file type descriptions, though it
can still be used to get the strongest matching file magic signature.
- The "file_transferred" event's "descr" parameter no longer
contains verbose file type descriptions.
- The BROMAGIC environment variable no longer changes any behavior
in Bro as magic databases are no longer used/installed.
- Reverted back to minimum requirement of CMake 2.6.3 from 2.8.0
(it's back to being the same requirement as the Bro v2.2 release).
The bump was to accomodate building libmagic as an external project,
which is no longer needed.
Addresses BIT-1143.
Includes:
- Cleanup of the plugin API, in particular generally changing
const char* to std::string
- Renaming environment variable BRO_PLUGINS to BRO_PLUGIN_PATH,
defaulting to <prefix>/lib/bro/plugins
- Reworking how dynamic plugins are searched and activated. See
doc/devel/plugins.rst for details.
- New @load-plugin directive to explicitly activate a plugin
- Support for Darwin. (Linux untested right now)
- The init-plugin updates come with support for "make test", "make
sdist", and "make bdist" (see how-to).
- Test updates.
Notes: The new hook mechanism, which allows plugins to hook into Bro's
core a well-defined points, is still essentially untested.
- Move more functionality into base class.
- Remove cctors and assignment operators (weren't actually needed anymore)
- Switch from const char* to std::string.
BIT-1098
* origin/topic/jsiwek/broxygen:
Fix Broxygen-related compile errors.
Add a Broxygen coverage test.
Internal Broxygen organization/documentation/polish.
Add unit tests for Broxygen config file targets.
Change Broxygen config file format.
Broxygen doc-related test updates. Fix two regressions.
A couple documentation fixes.
Integrate new Broxygen functionality into Sphinx.
Implement majority of Broxygen features delegated to Bro.
Broxygen can now read a config file specifying particular targets.
Remove unneeded Broxygen comments in scan.bro.
Replace safe_basename/safe_dirname w/ SafeBasename/SafeDirname.
Add BIF interface for retrieving comments/docs.
Quick optimization to Broxygen doc gathering.
Flesh out Broxygen doc-gathering skeleton.
Refactor search_for_file() util function.
Initial skeleton of new Broxygen infrastructure.
I got rid of the earlier separate InterpreterPlugin class. Instead
Plugin now has a set of virtual methods HookSomething()... that
plugins can override. For efficiency purposes, they however need to
register first that they are interested in a hook, otherwise the
virtual method will never be called. The idea is to extend the set of
hooks over time as we figure out what's useful.
This is a checkpoint commit that's essentially untested and probably
broken. It compiles, though.
This is essentially the code from the dynamic-plugin branch except for
some pieces that I have split out into separate, earlier commits.
I'm going to updatre things in this branch going forward.
It was getting too bloated and allocated memory in ways that were
difficult to understand how to manage. Separated out primarily in to
new find_file() and open_file()/open_package() functions.
Also renamed other util functions for path-related things.
Doesn't generate any docs, but it's hooked in to all places needed to
gather the necessary stuff w/ significantly less coupling than before.
The gathering now always occurs unconditionally to make documentation
available at runtime and a command line switch (-X) only toggles whether
to output docs to disk (reST format).
Should also improve the treatment of type name aliasing which wasn't a
big problem in practice before, but I think it's more correct now:
there's now a distinct BroType for each alias, but extensible types
(record/enum) will automatically update the types for aliases on redef.
Other misc refactoring of note:
- Removed a redundant/unused way of declaring event types.
- Changed type serialization format/process to preserve type name
information and remove compatibility code (since broccoli will
have be updated anyway).
It was previously not possible to crank the wheel on the PRNG in a
deterministic way without affecting the globally unique seed. The new extra
utility function bro_prng takes a state in the form of a long int and returns
the new PRNG state, now allowing arbitrary code parts to use the random number
functionality.
This commit also fixes a problem in the H3 constructor, which requires use
of multiple seeds. The single seed passed in now serves as seed to crank out as
many value needed using bro_prng.
Thanks to git this merge was less troublesome that I was afraid it
would be. Not all tests pass yet though (and file hashes have changed
unfortunately).
Conflicts:
cmake
doc/scripts/DocSourcesList.cmake
scripts/base/init-bare.bro
scripts/base/protocols/ftp/main.bro
scripts/base/protocols/irc/dcc-send.bro
scripts/test-all-policy.bro
src/AnalyzerTags.h
src/CMakeLists.txt
src/analyzer/Analyzer.cc
src/analyzer/protocol/file/File.cc
src/analyzer/protocol/file/File.h
src/analyzer/protocol/http/HTTP.cc
src/analyzer/protocol/http/HTTP.h
src/analyzer/protocol/mime/MIME.cc
src/event.bif
src/main.cc
src/util-config.h.in
testing/btest/Baseline/coverage.bare-load-baseline/canonified_loaded_scripts.log
testing/btest/Baseline/coverage.default-load-baseline/canonified_loaded_scripts.log
testing/btest/Baseline/istate.events-ssl/receiver.http.log
testing/btest/Baseline/istate.events-ssl/sender.http.log
testing/btest/Baseline/istate.events/receiver.http.log
testing/btest/Baseline/istate.events/sender.http.log
This works around a bug in libmagic since version 5.12 (current at
time of writing is 5.14) -- second call to magic_load() w/ non-default
database segfaults.
- It's derived from the magic database of libmagic 5.14, but with most
everything not related to mime types removed.
- The custom database is always used by default for mime detection, but
the more verbose file type detection will fall back on the default
libmagic installation's database. The result is: mime type strings
are now guaranteed to be consistent across platforms, but the verbose
file type descriptions are not.
- The custom database gets installed in $prefix/share/bro/magic, and
should even be extensible if files with new patterns are added inside
the directory.
- The search path for the mime magic database can be controlled via
BROMAGIC environment variable.
- Remove mime_desc field from ftp.log.
- Stop using the mime/file type canonifier with unit tests.
- libmagic >= 5.04 is now a requirement.
This is a larger internal change that moves the analyzer
infrastructure to a more flexible model where the available analyzers
don't need to be hardcoded at compile time anymore. While currently
they actually still are, this will in the future enable external
analyzer plugins. For now, it does already add the capability to
dynamically enable/disable analyzers from script-land, replacing the
old Analyzer::Available() methods.
There are three major parts going into this:
- A new plugin infrastructure in src/plugin. This is independent
of analyzers and will eventually support plugins for other parts
of Bro as well (think: readers and writers). The goal is that
plugins can be alternatively compiled in statically or loadead
dynamically at runtime from a shared library. While the latter
isn't there yet, there'll be almost no code change for a plugin
to make it dynamic later (hopefully :)
- New analyzer infrastructure in src/analyzer. I've moved a number
of analyzer-related classes here, including Analyzer and DPM;
the latter now renamed to Analyzer::Manager. More will move here
later. Currently, there's only one plugin here, which provides
*all* existing analyzers. We can modularize this further in the
future (or not).
- A new script interface in base/framework/analyzer. I think that
this will eventually replace the dpm framework, but for now
that's still there as well, though some parts have moved over.
I've also remove the dpd_config table; ports are now configured via
the analyzer framework. For exmaple, for SSH:
const ports = { 22/tcp } &redef;
event bro_init() &priority=5
{
...
Analyzer::register_for_ports(Analyzer::ANALYZER_SSH, ports);
}
As you can see, the old ANALYZER_SSH constants have more into an enum
in the Analyzer namespace.
This is all hardly tested right now, and not everything works yet.
There's also a lot more cleanup to do (moving more classes around;
removing no longer used functionality; documenting script and C++
interfaces; regression tests). But it seems to generally work with a
small trace at least.
The debug stream "dpm" shows more about the loaded/enabled analyzers.
A new option -N lists loaded plugins and what they provide (including
those compiled in statically; i.e., right now it outputs all the
analyzers).
This is all not cast-in-stone yet, for some things we need to see if
they make sense this way. Feedback welcome.
Added the file extraction action and did other misc. cleanup. Most of
the minimal core features/support for file analysis should be working at
this point, just have to start fleshing things out.
* origin/fastpath:
Fix memory leak when processing a thread's input message fails.
add comparator functor to the info maps of readerbackend and readerwriteend.
Fix initialization of WriterFrontend names.
This is required, because after the recent changes the info map containst a
char* as key. Without the comparator the map will compare the char addresses
for all operations - which is not really what we want.
I managed to completely forget to add unescaping to the input framework -
this should fix it. It now works with the exact same escaping that is
used by the writers (\x##).
Includes one testcase that seems to work - everything else still passes.
I've only tested that it compiles, not whether it still works. The
fact that we don't have any tests for this makes me uneasy ...
* remotes/origin/topic/seth/elasticsearch: (35 commits)
Some documentation updates for elasticsearch plugin.
Temporarily removing the ES timeout because it works with signals and is incompatible with Bro threads.
Changed ES index names to localtime and added a meta index.
New script for easily duplicating logs to ElasticSearch.
Some better elasticsearch reliability.
Fixed small elasticsearch problem in configure output.
Re-adding the needed call to FinishedRotation in the ES writer plugin.
Tiny updates.
Bringing elasticsearch branch up to date with master.
Adding a define to make the stdint C macros available.
Adding an extra header.
Fixed a bug with messed up time value passing to elasticsearch.
Small updates and a little standardization for config.h.in naming.
Bug fixes.
Bug fix and feature.
Forgot to call the parent method for DoHeartBeat.
Changed the escaping method.
Flush logs to ES daemon as Bro is shutting down.
Reduce the batch size to 1000 and add a maximum time interval for batches.
Reworked bulk operation string construction to use ODesc and added json escaping.
...
* robin/topic/writer-info:
Extending the log writer DoInit() API.
Reworking log writer API to make it easier to pass additional information to a writer's initialization method.
Conflicts:
src/logging/WriterBackend.cc
src/logging/WriterBackend.h
src/logging/WriterFrontend.cc
* origin/fastpath:
Fix inconsistencies in random number generation.
Updating input framework unit tests.
Add front-end name to InitMessage from WriterFrontend to Backend.
Small tweak to make test complete quicker.
Drain events before terminating log/thread managers.
Fix strict-aliasing warning in RemoteSerializer.cc (fixes#834).
Fix typos in event documentation
Fix typos in NEWS for Bro 2.1 beta
The srand()/rand() interface was being intermixed with the
srandom()/random() one. The later is now used throughout.
Changed the srand() and rand() BIFs to work deterministically if Bro
was given a seed file (addresses #825). They also now wrap the
system's srandom() and random() instead of srand() and rand() as per
the above.
We now pass in a Info struct that contains:
- the path name (as before)
- the rotation interval
- the log_rotate_base_time in seconds
- a table of key/value pairs with further configuration options.
To fill the table, log filters have a new field "config: table[string]
of strings". This gives a way to pass arbitrary values from
script-land to writers. Interpretation is left up to the writer.
Also splits calc_next_rotate() into two functions, one of which is
thread-safe and can be used with the log_rotate_base_time value from
DoInit().
Includes also updates to the None writer:
- It gets its own script writers/none.bro.
- New bool option LogNone::debug to enable debug output. It then
prints out all the values passed to DoInit(). That's used by a
btest test to ensure the new DoInit() values are right.
- Fixed a bug that prevented Bro from terminating..
(scripts.base.frameworks.logging.rotate-custom currently fails.
Haven't yet investigated why.)