This commit marks (hopefully) ever one-parameter constructor as explicit.
It also uses override in (hopefully) all circumstances where a virtual
method is overridden.
There are a very few other minor changes - most of them were necessary
to get everything to compile (like one additional constructor). In one
case I changed an implicit operation to an explicit string conversion -
I think the automatically chosen conversion was much more convoluted.
This took longer than I want to admit but not as long as I feared :)
This small change allows the empty field separator to be empty. This
means that we can represent an empty list by a empty input string, which
was not possible before.
Before, an empty empty field separator meant that there is no empty
field - to get back to this behavior one now has to set the empty field
separator to a string that is guaranteed to not be part of the input
data. Note that we did not use "empty" empty field separators anywhere
and I am not aware of this being used by anyone - the new behavior seems
like it is much more useful in practice.
This also changes the config framework to interpret empty lists as...
empty, instead of interpreting them as lists that have one zero-length
element; this seems like the saner default.
Currently the destructor would try to free unallocated memory. This
could e.g. be triggered by the input framework reading a set with an
invalid element.
The configuration framework consists of three mostly distinct parts:
* option variables
* the config reader
* the script level framework
I will describe the three elements in the following.
Internally, this commit also performs a range of changes to the Input
manager; it marks a lot of functions as const and introduces a new
ValueToVal method (which could in theory replace the already existing
one - it is a bit more powerful).
This also changes SerialTypes to have a subtype for Values, just as
Fields already have it; I think it was mostly an oversight that this was
not introduced from the beginning. This should not necessitate any code
changes for people already using SerialTypes.
option variable
===============
The option keyword allows variables to be specified as run-tine options.
Such variables cannot be changed using normal assignments. Instead, they
can be changed using Option::set. It is possible to "subscribe" to
options and be notified when an option value changes.
Change handlers can also change values before they are applied; this
gives them the opportunity to reject changes. Priorities can be
specified if there are several handlers for one option.
Example script:
option testbool: bool = T;
function option_changed(ID: string, new_value: bool): bool
{
print fmt("Value of %s changed from %s to %s", ID, testbool, new_value);
return new_value;
}
event bro_init()
{
print "Old value", testbool;
Option::set_change_handler("testbool", option_changed);
Option::set("testbool", F);
print "New value", testbool;
}
config reader
=============
The config reader provides a way to read configuration files back into
Bro. Most importantly it automatically converts values to the correct
types. This is important because it is at least inconvenient (and
sometimes near impossible) to perform the necessary type conversions in
Bro scripts themselves. This is especially true for sets/vectors.
Configuration generally look like this:
[option name][tab/spaces][new variable value]
so, for example:
testaddr 2607:f8b0:4005:801::200e
testinterval 60
testtime 1507321987
test_set a b c d erdbeerschnitzel
The reader uses the option name to look up the type that variable has in
the Bro core and automatically converts the value to the correct type.
Example script use:
type Idx: record {
option_name: string;
};
type Val: record {
option_val: string;
};
global currconfig: table[string] of string = table();
event InputConfig::new_value(name: string, source: string, id: string, value: any)
{
print id, value;
}
event bro_init()
{
Input::add_table([$reader=Input::READER_CONFIG, $source="../configfile", $name="configuration", $idx=Idx, $val=Val, $destination=currconfig, $want_record=F]);
}
Script-level config framework
=============================
The script-level framework ties these two features together and makes
them a bit more convenient to use. Configuration files can simply be
specified by placing them into Config::config_files. The framework also
creates a config.log that shows all value changes that took place.
Usage example:
redef Config::config_files += {configfile};
export {
option testbool : bool = F;
}
The file is now monitored for changes; when a change occurs the
respective option values are automatically updated and the value change
is written to config.log.
The two hooks being added are:
void HookLogInit(const std::string& writer, const std::string& instantiating_filter, bool local, bool remote, const logging::WriterBackend::WriterInfo& info, int num_fields, const threading::Field* const* fields);
which is called when a writer is being instantiated and contains
information about the fields being logged, as well as
bool HookLogWrite(const std::string& writer, const std::string& filter, const logging::WriterBackend::WriterInfo& info, int num_fields, const threading::Field* const* fields, threading::Value** vals);
which is called for each log line being written by each writer. It
contains all the data being written. The data can be changed in the
function call and lines can be prevented from being written.
This commit also fixes a few small problems with plugin hooks itself,
and extends the tests that were already there, besides introducing tests
for the added functionality.
This moves all threading code in Bro from pthreads to the c++11
primitives, which make for shorter, easier to use, and less error-prone
code.
pthreads is still used in 2 places in Bro currently. BasicThread uses
two bits of functionality that are not available using the c++ API
(setting thread names & setting signal masks). Since all c++
implementations that I am aware of still use an underlying pthreads
implementation, we just use native_handle to access the underlying
pthreads implementation for these cases. I do not expect this to lead to
problems in the forseable future. If we ever encounter a platform where
a different thread architecture is used, we might have to change that
around.
This code is guarded by static_asserts, so we will notice if a platform
uses a different implementation.
sqlite also uses pthreads directly.
This change introduces error events for Table and Event readers. Users
can now specify an event that is called when an info, warning, or error
is emitted by their input reader. This can, e.g., be used to raise
notices in case errors occur when reading an important input stream.
Example:
event error_event(desc: Input::TableDescription, msg: string, level: Reporter::Level)
{
...
}
event bro_init()
{
Input::add_table([$source="a", $error_ev=error_event, ...]);
}
For the moment, this converts all errors in the Asciiformatter into
warnings (to show that they are non-fatal) - the Reader itself also has
to throw an Error to show that a fatal error occurred and processing
will be abort.
It might be nicer to change this and require readers to mark fatal
errors as such when throwing them.
Addresses BIT-1181
This was causing occasional problems with the time on processes
running lots of threads. The use of gmtime in the json
formatter is the likely culprit due to the fact that the
json formatter runs in threads. More evidence for this is that
the problem only appears to exhibit when logs are being written
as JSON.
Cleaned up the surrounding code a bit and also added '[' as another
case (not sure that can happen, but doesn't hurt eihter).
* 'master' of https://github.com/aeppert/bro:
Whitespace
Remove
Remove.
Fix for JSON formatter
A fatal error, especially in DEBUG, should result in a core.
Seems to fix a case where an entry in the table may be null on insert.
In the event that the first entry in a record is optional AND not present, the serializer will incorrectly add a leading comma. This leading common is invalid JSON and will, more often than not, cause parser failures downstream.
Setting the thread name on every heartbeat uses a mild amount of
cycles and there's not much benefit to doing it there to get the
additional info regarding the number of processed messages since thread
names usually get truncated to 16 characters and omit that part anyway.
* origin/topic/seth/json-formatter:
Updating a couple of tests.
Expanded support for modifying the timestamp format in the JSON formatter.
Ascii input reader now supports all config options per-input stream.
Added an option to the JSON formatter to use ISO 8601 for timestamps.
Refactored formatters and updated the the writers a bit.
Includes some minor bugfixes and cleanup at various places, including
in old code.
- It's not *exactly* ISO 8601 which doesn't seem to support
subseconds, but subseconds are very important to us and
most things that support ISO8601 seem to also support subseconds
in the way I'm implemented it.
- Formatters have been abstracted similarly to readers and writers now.
- The Ascii writer has a new option for writing out logs as JSON.
- The Ascii writer now has all options availble as per-filter
options as well as global.
A bunch of infrastructure work to move IOSource, IOSourceRegistry (now
iosource::Manager) and PktSrc/PktDumper code into iosource/, and over
to a plugin structure.
Other IOSources aren't touched yet, they are still in src/*.
It compiles and does something with a small trace, but that's all I've
tested so far. There are quite certainly a number of problems left, as
well as various TODOs and cleanup; and nothing's cast in stone yet.
Will continue to work on this.
A thread that is done/killed should signify that the thread manager has
some processing to do -- it needs to process any messages in its out
queue, join the thread, and delete it. Otherwise the thread manager
may reach a state where it makes no progress in processing the last
remaining done/killed thread.
This fix also fixes the deadlock issue without putting any
new strain into the main packet processing path.
Instead of occasionally returning true in MaybeReady sometime,
we occasionally process threads if time_mgr time is not running.
If time_mgr time is running, we have heartbeat messages that will
trigger processing in any case -- processing always checks the
exact state of the Queues.
This fix probably also means that we can remove the communication
loads from all input framework tests and run them all simultaneously.
Bernhard and I tracked it down we believe: the thread queue could
deadlock in certain cases. As a fix we tuned the heuristic for telling
if a queue might have input to occasionaly err on the safe side by
flagging "yes", so that processing will proceed.
It's a bit unfortunate to apply this fix last minute before the
release as it could potentially impact performance if the heuristic
fails to often. We believe the chosen parmaterization should be fine ...
Replaced some with InternalWarning or InternalAnalyzerError, the later
being a new method which signals the analyzer to not process further
input. Some usages I just removed if they didn't make sense or clearly
couldn't happen. Also did some minor refactors of related code while
reviewing/exploring ways to get rid of InternalError usages.
Also, for TCP content file write failures there's a new event:
"contents_file_write_failure".
statistics, finalize prepared statement before exitting logger.
This might fix the deadlock issue, at least it did not happen for
me on my tried on the test system where it happened quite regularly
before.