It was previously not possible to crank the wheel on the PRNG in a
deterministic way without affecting the globally unique seed. The new extra
utility function bro_prng takes a state in the form of a long int and returns
the new PRNG state, now allowing arbitrary code parts to use the random number
functionality.
This commit also fixes a problem in the H3 constructor, which requires the
use of multiple seeds. The single seed passed in now serves as the starting
state from which bro_prng cranks out as many values as needed.
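
The idea of a state-in, state-out PRNG step can be sketched as follows. This
is only an illustration: sketch_prng and derive_seeds are hypothetical names,
and the Park-Miller constants are an assumption standing in for whatever
bro_prng actually computes.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for bro_prng(): one step of a Park-Miller
// ("minimal standard") generator. Takes a state, returns the next state.
std::int64_t sketch_prng(std::int64_t state)
{
    const std::int64_t m = 2147483647; // 2^31 - 1, prime
    const std::int64_t a = 16807;

    // Normalize the input into [1, m - 1]; 0 would be a fixed point.
    std::int64_t s = state % m;
    if (s <= 0)
        s += m;
    if (s == m)
        s = 1;

    return (a * s) % m; // the product fits easily in 64 bits
}

// Derive n seeds from a single seed by advancing a local state.
// No global generator is touched, so other users are unaffected.
std::vector<std::int64_t> derive_seeds(std::int64_t seed, std::size_t n)
{
    std::vector<std::int64_t> seeds;
    seeds.reserve(n);
    std::int64_t state = seed;
    for (std::size_t i = 0; i < n; ++i) {
        state = sketch_prng(state);
        seeds.push_back(state);
    }
    return seeds;
}
```

A constructor that needs several seeds could call derive_seeds(seed, k) once
and hand one value to each sub-hash, which is roughly what the H3 fix above
describes.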
Thanks to git this merge was less troublesome than I was afraid it
would be. Not all tests pass yet though (and file hashes have changed
unfortunately).
Conflicts:
cmake
doc/scripts/DocSourcesList.cmake
scripts/base/init-bare.bro
scripts/base/protocols/ftp/main.bro
scripts/base/protocols/irc/dcc-send.bro
scripts/test-all-policy.bro
src/AnalyzerTags.h
src/CMakeLists.txt
src/analyzer/Analyzer.cc
src/analyzer/protocol/file/File.cc
src/analyzer/protocol/file/File.h
src/analyzer/protocol/http/HTTP.cc
src/analyzer/protocol/http/HTTP.h
src/analyzer/protocol/mime/MIME.cc
src/event.bif
src/main.cc
src/util-config.h.in
testing/btest/Baseline/coverage.bare-load-baseline/canonified_loaded_scripts.log
testing/btest/Baseline/coverage.default-load-baseline/canonified_loaded_scripts.log
testing/btest/Baseline/istate.events-ssl/receiver.http.log
testing/btest/Baseline/istate.events-ssl/sender.http.log
testing/btest/Baseline/istate.events/receiver.http.log
testing/btest/Baseline/istate.events/sender.http.log
This works around a bug present in libmagic since version 5.12 (current at
time of writing is 5.14): a second call to magic_load() with a non-default
database segfaults.
- It's derived from the magic database of libmagic 5.14, but with most
everything not related to mime types removed.
- The custom database is always used by default for mime detection, but
the more verbose file type detection will fall back on the default
libmagic installation's database. The result is: mime type strings
are now guaranteed to be consistent across platforms, but the verbose
file type descriptions are not.
- The custom database gets installed in $prefix/share/bro/magic, and
should even be extensible if files with new patterns are added inside
the directory.
- The search path for the mime magic database can be controlled via
BROMAGIC environment variable.
- Remove mime_desc field from ftp.log.
- Stop using the mime/file type canonifier with unit tests.
- libmagic >= 5.04 is now a requirement.
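
How the database location might be chosen can be sketched like this.
magic_search_path is a hypothetical helper, and falling back to the installed
directory when BROMAGIC is unset or empty is an assumption, not something the
notes above spell out.

```cpp
#include <string>

// Hypothetical helper: pick the directory to search for the mime magic
// database. env_value is whatever getenv("BROMAGIC") returned (may be
// null); prefix is the installation prefix.
std::string magic_search_path(const char* env_value, const std::string& prefix)
{
    // A non-empty BROMAGIC overrides the default location.
    if (env_value && *env_value)
        return env_value;

    // Assumed fallback: the directory the custom database is installed in.
    return prefix + "/share/bro/magic";
}
```

A caller would pass std::getenv("BROMAGIC") as the first argument.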
This is a larger internal change that moves the analyzer
infrastructure to a more flexible model where the available analyzers
don't need to be hardcoded at compile time anymore. While currently
they actually still are, this will in the future enable external
analyzer plugins. For now, it does already add the capability to
dynamically enable/disable analyzers from script-land, replacing the
old Analyzer::Available() methods.
There are three major parts going into this:
- A new plugin infrastructure in src/plugin. This is independent
of analyzers and will eventually support plugins for other parts
of Bro as well (think: readers and writers). The goal is that
plugins can be alternatively compiled in statically or loaded
dynamically at runtime from a shared library. While the latter
isn't there yet, there'll be almost no code change for a plugin
to make it dynamic later (hopefully :)
- New analyzer infrastructure in src/analyzer. I've moved a number
of analyzer-related classes here, including Analyzer and DPM;
the latter now renamed to Analyzer::Manager. More will move here
later. Currently, there's only one plugin here, which provides
*all* existing analyzers. We can modularize this further in the
future (or not).
- A new script interface in base/framework/analyzer. I think that
this will eventually replace the dpm framework, but for now
that's still there as well, though some parts have moved over.
I've also removed the dpd_config table; ports are now configured via
the analyzer framework. For example, for SSH:
    const ports = { 22/tcp } &redef;
    event bro_init() &priority=5
        {
        ...
        Analyzer::register_for_ports(Analyzer::ANALYZER_SSH, ports);
        }
As you can see, the old ANALYZER_SSH constants have moved into an enum
in the Analyzer namespace.
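
To make the "register for ports, enable/disable at runtime" model concrete,
here is a small C++ sketch. AnalyzerRegistrySketch and all its members are
hypothetical and are not the classes in src/analyzer; analyzers are
identified by plain strings here instead of enum values.

```cpp
#include <map>
#include <set>
#include <string>
#include <utility>

// Hypothetical registry, for illustration only.
class AnalyzerRegistrySketch {
public:
    using Port = std::pair<int, std::string>; // e.g. {22, "tcp"}

    void RegisterForPort(const std::string& analyzer, const Port& port)
    {
        by_port_[port].insert(analyzer);
    }

    void Enable(const std::string& analyzer) { enabled_.insert(analyzer); }
    void Disable(const std::string& analyzer) { enabled_.erase(analyzer); }

    // Returns the analyzers that are registered for the port and are
    // currently enabled.
    std::set<std::string> Lookup(const Port& port) const
    {
        std::set<std::string> result;
        auto it = by_port_.find(port);
        if (it == by_port_.end())
            return result;
        for (const auto& name : it->second)
            if (enabled_.count(name))
                result.insert(name);
        return result;
    }

private:
    std::map<Port, std::set<std::string>> by_port_;
    std::set<std::string> enabled_;
};
```

Keeping registration and enablement separate means a port mapping can stay
configured while the analyzer itself is switched off.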
This is all hardly tested right now, and not everything works yet.
There's also a lot more cleanup to do (moving more classes around;
removing no longer used functionality; documenting script and C++
interfaces; regression tests). But it seems to generally work with a
small trace at least.
The debug stream "dpm" shows more about the loaded/enabled analyzers.
A new option -N lists loaded plugins and what they provide (including
those compiled in statically; i.e., right now it outputs all the
analyzers).
None of this is cast in stone yet; for some things we need to see if
they make sense this way. Feedback welcome.
Added the file extraction action and did other misc. cleanup. Most of
the minimal core features/support for file analysis should be working at
this point, just have to start fleshing things out.
* origin/fastpath:
Fix memory leak when processing a thread's input message fails.
add comparator functor to the info maps of readerbackend and readerwriteend.
Fix initialization of WriterFrontend names.
This is required because, after the recent changes, the info map contains a
char* as key. Without the comparator the map compares the char addresses
for all operations, which is not what we want.
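
A minimal sketch of such a comparator, assuming a std::map keyed by
const char*. CompareString, InfoMapSketch and has_key are illustrative names
and not necessarily the ones used in the reader/writer code.

```cpp
#include <cstring>
#include <map>

// Orders C strings by content instead of by pointer value.
struct CompareString {
    bool operator()(const char* a, const char* b) const
    {
        return std::strcmp(a, b) < 0;
    }
};

// Hypothetical info map: C-string keys compared with CompareString.
using InfoMapSketch = std::map<const char*, int, CompareString>;

// True if some key with the same characters is present, regardless of
// which buffer the lookup key lives in.
bool has_key(const InfoMapSketch& m, const char* key)
{
    return m.find(key) != m.end();
}
```

With the default std::less<const char*>, a lookup through a second buffer
holding the same characters would miss the entry.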
I managed to completely forget to add unescaping to the input framework -
this should fix it. It now works with the exact same escaping that is
used by the writers (\x##).
Includes one testcase that seems to work - everything else still passes.
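
For reference, unescaping the \x## form can be sketched as below.
unescape_hex and hex_value are hypothetical helpers, not the input
framework's code; leaving malformed sequences untouched is an assumption.

```cpp
#include <cstddef>
#include <string>

// Value of a single hex digit, or -1 if c is not a hex digit.
int hex_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Replace every well-formed \x## sequence with the byte it encodes.
// Anything that does not parse is copied through unchanged (assumption).
std::string unescape_hex(const std::string& in)
{
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        if (in[i] == '\\' && i + 3 < in.size() && in[i + 1] == 'x') {
            const int hi = hex_value(in[i + 2]);
            const int lo = hex_value(in[i + 3]);
            if (hi >= 0 && lo >= 0) {
                out.push_back(static_cast<char>(hi * 16 + lo));
                i += 3; // skip the rest of the escape sequence
                continue;
            }
        }
        out.push_back(in[i]);
    }
    return out;
}
```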
I've only tested that it compiles, not whether it still works. The
fact that we don't have any tests for this makes me uneasy ...
* remotes/origin/topic/seth/elasticsearch: (35 commits)
Some documentation updates for elasticsearch plugin.
Temporarily removing the ES timeout because it works with signals and is incompatible with Bro threads.
Changed ES index names to localtime and added a meta index.
New script for easily duplicating logs to ElasticSearch.
Some better elasticsearch reliability.
Fixed small elasticsearch problem in configure output.
Re-adding the needed call to FinishedRotation in the ES writer plugin.
Tiny updates.
Bringing elasticsearch branch up to date with master.
Adding a define to make the stdint C macros available.
Adding an extra header.
Fixed a bug with messed up time value passing to elasticsearch.
Small updates and a little standardization for config.h.in naming.
Bug fixes.
Bug fix and feature.
Forgot to call the parent method for DoHeartBeat.
Changed the escaping method.
Flush logs to ES daemon as Bro is shutting down.
Reduce the batch size to 1000 and add a maximum time interval for batches.
Reworked bulk operation string construction to use ODesc and added json escaping.
...
* robin/topic/writer-info:
Extending the log writer DoInit() API.
Reworking log writer API to make it easier to pass additional information to a writer's initialization method.
Conflicts:
src/logging/WriterBackend.cc
src/logging/WriterBackend.h
src/logging/WriterFrontend.cc
* origin/fastpath:
Fix inconsistencies in random number generation.
Updating input framework unit tests.
Add front-end name to InitMessage from WriterFrontend to Backend.
Small tweak to make test complete quicker.
Drain events before terminating log/thread managers.
Fix strict-aliasing warning in RemoteSerializer.cc (fixes #834).
Fix typos in event documentation
Fix typos in NEWS for Bro 2.1 beta
The srand()/rand() interface was being intermixed with the
srandom()/random() one. The latter is now used throughout.
Changed the srand() and rand() BIFs to work deterministically if Bro
was given a seed file (addresses #825). They also now wrap the
system's srandom() and random() instead of srand() and rand() as per
the above.
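
The point of sticking to a single interface is that re-seeding reproduces the
same sequence. A small sketch using the POSIX calls directly (draw_sequence
is a hypothetical helper; the BIFs themselves are not shown):

```cpp
#include <cstddef>
#include <stdlib.h> // POSIX srandom()/random()
#include <vector>

// Seed the random() generator and draw n values from it. Because nothing
// here touches srand()/rand(), the same seed gives the same sequence.
std::vector<long> draw_sequence(unsigned int seed, std::size_t n)
{
    srandom(seed);
    std::vector<long> values;
    values.reserve(n);
    for (std::size_t i = 0; i < n; ++i)
        values.push_back(random());
    return values;
}
```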
We now pass in a Info struct that contains:
- the path name (as before)
- the rotation interval
- the log_rotate_base_time in seconds
- a table of key/value pairs with further configuration options.
To fill the table, log filters have a new field "config: table[string]
of string". This gives a way to pass arbitrary values from
script-land to writers. Interpretation is left up to the writer.
Also splits calc_next_rotate() into two functions, one of which is
thread-safe and can be used with the log_rotate_base_time value from
DoInit().
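
Roughly, the data handed to DoInit() and a rotation calculation that works
only on its arguments might look like this. WriterInfoSketch and
next_rotate_delay are hypothetical names; passing the start of day in
explicitly (instead of calling localtime inside) is an assumption about what
makes one variant thread-safe.

```cpp
#include <cmath>
#include <map>
#include <string>

// Hypothetical mirror of the fields listed above.
struct WriterInfoSketch {
    std::string path;
    double rotation_interval = 0.0; // seconds
    double rotation_base = -1.0;    // seconds after midnight; < 0 if unset
    std::map<std::string, std::string> config;
};

// Seconds from `current` until the next rotation. Pure function: it only
// looks at its arguments, so it can be called from any thread.
double next_rotate_delay(double current, double start_of_day,
                         const WriterInfoSketch& info)
{
    const double interval = info.rotation_interval;

    // No base time: round up to the next multiple of the interval.
    if (info.rotation_base < 0)
        return std::floor(current / interval) * interval + interval - current;

    // With a base time: rotations happen at first + i * interval.
    const double first = start_of_day + info.rotation_base;
    return first + std::ceil((current - first) / interval) * interval - current;
}
```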
Also includes updates to the None writer:
- It gets its own script writers/none.bro.
- New bool option LogNone::debug to enable debug output. It then
prints out all the values passed to DoInit(). That's used by a
btest test to ensure the new DoInit() values are right.
- Fixed a bug that prevented Bro from terminating.
(scripts.base.frameworks.logging.rotate-custom currently fails.
Haven't yet investigated why.)
Also renaming --enable-perftools to --enable-perftool-debug to
indicate that the switch is only relevant for debugging the heap. It's
not needed to pick up tcmalloc for better performance.
--with-perftools can still (and always) be used to give a hint where
to find the libraries.
With the threading, using tcmalloc improves memory usage on FreeBSD
significantly when running on a trace. Whether it fixes the live
problems remains to be seen ...
* origin/topic/gilbert/rand-pool:
Updating tests.
Updated uid pools to use integer values instead of strings.
Updating tests.
Test no longer relevant. Need a way to generate and test collisions.
A few minor tweaks to make code less braindead. Fixed-length piece of pool name now only used to hash when determinism is not required; otherwise, whole pool name is used. Note that collisions between pool name hashes will lead to sensitivity to initialization order within the UID generator.
Testing long (>32 character) pool names.
Simple test to verify various pools are not affecting each other.
Some working code. Adds UID pools classified by string. Just compiles and runs; need to go back through and make sure this code is actually doing what I want it to do.
Note, I've removed the collision detection. Collisions seem unlikely to
occur and, even if they do, it's not really that bad.
Note: Added new function unique_id_from(pool: string, prefix: string)
that allows the user to explicitly specify a randomness pool to use when
generating unique IDs.
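
The pool idea can be sketched in C++ as one independent generator state per
pool name. UidPoolsSketch is hypothetical; the FNV-1a hash of the pool name
and the 64-bit LCG step are assumptions chosen to keep the example
deterministic, not what the UID generator actually uses.

```cpp
#include <cstdint>
#include <map>
#include <string>

// Hypothetical sketch: each named pool owns its own generator state.
class UidPoolsSketch {
public:
    std::string UniqueIdFrom(const std::string& pool, const std::string& prefix)
    {
        auto it = state_.find(pool);
        if (it == state_.end())
            it = state_.emplace(pool, HashName(pool)).first;

        // Advance only this pool's state (64-bit LCG, Knuth's MMIX constants).
        it->second = it->second * 6364136223846793005ULL + 1442695040888963407ULL;
        return prefix + std::to_string(it->second);
    }

private:
    // FNV-1a over the whole pool name gives each pool a deterministic start.
    static std::uint64_t HashName(const std::string& name)
    {
        std::uint64_t h = 14695981039346656037ULL; // FNV offset basis
        for (unsigned char c : name) {
            h ^= c;
            h *= 1099511628211ULL; // FNV prime
        }
        return h;
    }

    std::map<std::string, std::uint64_t> state_;
};
```

Drawing IDs from one pool leaves the sequence of every other pool unchanged,
which is the property the "pools are not affecting each other" test checks.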
- Fixing the parts of the `make restdoc` and `make doc` process that were
broken by the last Bro script re-organization
- Generated documentation for Bro scripts derived from BiFs now use the
original BiF source file as the "original source file" link
- Renaming of the internal POLICYDEST definition and other misc places that
refer to "policy" scripts; that terminology doesn't make total sense now
- Added a documentation blacklist reminder test that will fail if there are
scripts that are blacklisted from being documented because they're still
in progress
- Some minor Bro script changes to fix small @load dependency errors
Addresses #543
Any added prefixes are now used *after* all input files have been
parsed to look for a prefixed, flattened version of the input file
somewhere in BROPATH and, if found, load it.
For example, if "lcl" is in @prefixes, and site.bro is loaded, then
a file named "lcl.site.bro" that's in BROPATH would end up being
automatically loaded as well. Packages work similarly, e.g. loading
"protocols/http" means a file named "lcl.protocols.http.bro" in BROPATH
gets loaded automatically.
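
The name being searched for can be derived as in this sketch.
prefixed_flat_name is a hypothetical helper, and the BROPATH lookup itself is
left out.

```cpp
#include <algorithm>
#include <string>

// Build the prefixed, flattened file name for a loaded script or package,
// e.g. ("lcl", "protocols/http") -> "lcl.protocols.http.bro".
std::string prefixed_flat_name(const std::string& prefix,
                               const std::string& loaded)
{
    std::string flat = loaded;
    std::replace(flat.begin(), flat.end(), '/', '.');

    // Append the extension unless the name already carries it.
    const std::string ext = ".bro";
    const bool has_ext = flat.size() >= ext.size() &&
        flat.compare(flat.size() - ext.size(), ext.size(), ext) == 0;
    if (!has_ext)
        flat += ext;

    return prefix + "." + flat;
}
```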
* origin/topic/robin/reporting:
Syslog BiF now goes through the reporter as well.
Avoiding infinite loops when an error message handler triggers errors itself.
Renaming the Logger to Reporter.
Overhauling the internal reporting of messages to the user.
Updating a bunch of tests/baselines as well.
Conflicts:
aux/broccoli
policy.old/alarm.bro
policy/all.bro
policy/bro.init
policy/frameworks/notice/weird.bro
policy/notice.bro
src/SSL-binpac.cc
src/bro.bif
src/main.cc
Added an arg to the search_for_files() util function that can return
the subpath of BROPATH's policy/ dir in which the loaded file is found.
This subpath is then used in both the reST file's document title
(so that scripts named e.g. "base.bro" actually have some context) and
in figuring out how to interlink with other generated docs of other
scripts that are found in @load directives.
I still need to overhaul things so the loading of "packages" is
documented in a meaningful way and that the CMake targets are able
to generate indexes for packages.
The Logger class is now in charge of reporting all errors, warnings,
informational messages, weirds, and syslogs. All other components
route their messages through the global bro_logger singleton.
The Logger class comes with these reporting methods:
    void Message(const char* fmt, ...);
    void Warning(const char* fmt, ...);
    void Error(const char* fmt, ...);
    void FatalError(const char* fmt, ...); // Terminate Bro.
    void Weird(const char* name);
    [ .. some more Weird() variants ... ]
    void Syslog(const char* fmt, ...);
    void InternalWarning(const char* fmt, ...);
    void InternalError(const char* fmt, ...); // Terminates Bro.
See Logger.h for more information on these.
Generally, the reporting now works as follows:
- All non-fatal messages are reported in one of two ways:
    (1) At startup (i.e., before we start processing packets),
        they are logged to stderr.
    (2) During processing, they turn into events:
            event log_message%(msg: string, location: string%);
            event log_warning%(msg: string, location: string%);
            event log_error%(msg: string, location: string%);
        The script level can then handle them as desired.
        If we don't have an event handler, we fall back to
        reporting on stderr.
- All fatal errors are logged to stderr and Bro terminates
  immediately.
- Syslog(msg) directly syslogs, but doesn't do anything else.
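
The routing rules above boil down to a small decision function. The sketch
below is illustrative only: Severity, Route and route_message are
hypothetical names and do not appear in Logger.h.

```cpp
// Hypothetical types, for illustration only.
enum class Severity { Message, Warning, Error, Fatal };
enum class Route { Stderr, Event, StderrAndTerminate };

// Decide where a report goes, following the rules listed above.
Route route_message(Severity sev, bool processing_started, bool has_handler)
{
    // Fatal errors always go to stderr and end the process.
    if (sev == Severity::Fatal)
        return Route::StderrAndTerminate;

    // Before packet processing starts, everything goes to stderr.
    if (!processing_started)
        return Route::Stderr;

    // During processing: raise an event if a handler exists, else stderr.
    return has_handler ? Route::Event : Route::Stderr;
}
```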
The three main types of messages can also be generated on the
scripting layer via new Log::* bifs:
    Log::error(msg: string);
    Log::warning(msg: string);
    Log::message(msg: string);
These pass through the bro_logger as well and thus are handled in the
same way. Their output includes location information.
More changes:
- Removed the alarm statement and the alarm_hook event.
- Adapted lots of locations to use the bro_logger, including some
of the messages that were previously either just written to
stdout, or even funneled through the alarm mechanism.
- No distinction anymore between Error() and RunTime(). There's
now only one class of errors; the line was quite blurred already
anyway.
- util.h: all the error()/warn()/message()/run_time()/pinpoint()
functions are gone. Use the bro_logger instead now.
- Script errors are formatted a bit differently due to the
changes. What I've seen so far looks ok to me, but let me know
if there's something odd.
Notes:
- The default handlers for the new log_* events are just dummy
implementations for now since we need to integrate all this into
the new scripts anyway.
- I'm not too happy with the names of the Logger class and its
instance bro_logger. We now have a LogMgr as well, which makes
this all a bit confusing. But I didn't have a good idea for
better names so I stuck with them for now.
Perhaps we should merge Logger and LogMgr?
With a directory "foo" somewhere in BROPATH, "@load foo" now checks if
there's a file "foo/__load__.bro". If so, it reads that file in. (If
not, Bro reports the same error as before, complaining that it can't
read a directory).
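
In C++ terms the check amounts to something like the sketch below.
resolve_load is a hypothetical helper; the file system tests are passed in as
callbacks so the logic stays self-contained, and the BROPATH search is
omitted.

```cpp
#include <functional>
#include <string>

// Returns the file that "@load path" should read, or an empty string when
// path is a directory without a __load__.bro (the error case).
std::string resolve_load(const std::string& path,
                         const std::function<bool(const std::string&)>& is_dir,
                         const std::function<bool(const std::string&)>& is_file)
{
    // Plain files are loaded as before.
    if (!is_dir(path))
        return path;

    // Directories are loadable only through their __load__.bro.
    const std::string candidate = path + "/__load__.bro";
    return is_file(candidate) ? candidate : std::string();
}
```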
* origin/topic/robin/conn-ids:
Moving uid from conn_id to connection, and making output deterministic if a hash seed is given.
Extending conn_id with a globally unique identifier.
- When Bro is given a PRNG seed, it now uses its own internal random
number generator that produces consistent results across systems.
Note that this internal generator isn't very good, so it should only
be used for testing purposes.
- The BTest configuration now sets the environment variables TZ=UTC
and LANG=C to ensure consistent results.
- Fixing doc markup in logging.bro.
- Updating baselines.