Occasionally a few lines in the first part of the output file were
not in the expected order (this seems to be caused by each line in the
output being created by a process that is run in the background but
bro doesn't wait for it to finish). Fixed by sorting the output.
This test has failed numerous times on Travis CI. Fixes to make this
test more reliable: create the does-not-exist.dat file atomically, and
increase wait time after starting bro in order to give all input
streams a chance to try to read the input file.
Also added the input stream name to the test output, in order to make
output easier to understand if the test fails again.
This small change allows the empty field separator to be empty. This
means that we can represent an empty list by a empty input string, which
was not possible before.
Before, an empty empty field separator meant that there is no empty
field - to get back to this behavior one now has to set the empty field
separator to a string that is guaranteed to not be part of the input
data. Note that we did not use "empty" empty field separators anywhere
and I am not aware of this being used by anyone - the new behavior seems
like it is much more useful in practice.
This also changes the config framework to interpret empty lists as...
empty, instead of interpreting them as lists that have one zero-length
element; this seems like the saner default.
The configure reader had a small bug that caused the tracking of changed
variables to be incorrect after the second update. This resulted in
change-events for unchanged variables.
Currently the destructor would try to free unallocated memory. This
could e.g. be triggered by the input framework reading a set with an
invalid element.
get_filter_names(id: ID) : set[string] returns the names of the current
list of filters for a specified log stream.
Furthermore this commit makes a number of logging functions more robust
by checking existence of values before trying to modify them. This
commit also really implements (and tests) the enable_stream function.
There are two new script level functions to query and lookup files
from the core by their IDs. These are adding feature parity for
similarly named functions for files. The function prototypes are
as follows:
Files::file_exists(fuid: string): bool
Files::lookup_File(fuid: string): fa_file
The configuration framework consists of three mostly distinct parts:
* option variables
* the config reader
* the script level framework
I will describe the three elements in the following.
Internally, this commit also performs a range of changes to the Input
manager; it marks a lot of functions as const and introduces a new
ValueToVal method (which could in theory replace the already existing
one - it is a bit more powerful).
This also changes SerialTypes to have a subtype for Values, just as
Fields already have it; I think it was mostly an oversight that this was
not introduced from the beginning. This should not necessitate any code
changes for people already using SerialTypes.
option variable
===============
The option keyword allows variables to be specified as run-tine options.
Such variables cannot be changed using normal assignments. Instead, they
can be changed using Option::set. It is possible to "subscribe" to
options and be notified when an option value changes.
Change handlers can also change values before they are applied; this
gives them the opportunity to reject changes. Priorities can be
specified if there are several handlers for one option.
Example script:
option testbool: bool = T;
function option_changed(ID: string, new_value: bool): bool
{
print fmt("Value of %s changed from %s to %s", ID, testbool, new_value);
return new_value;
}
event bro_init()
{
print "Old value", testbool;
Option::set_change_handler("testbool", option_changed);
Option::set("testbool", F);
print "New value", testbool;
}
config reader
=============
The config reader provides a way to read configuration files back into
Bro. Most importantly it automatically converts values to the correct
types. This is important because it is at least inconvenient (and
sometimes near impossible) to perform the necessary type conversions in
Bro scripts themselves. This is especially true for sets/vectors.
Configuration generally look like this:
[option name][tab/spaces][new variable value]
so, for example:
testaddr 2607:f8b0:4005:801::200e
testinterval 60
testtime 1507321987
test_set a b c d erdbeerschnitzel
The reader uses the option name to look up the type that variable has in
the Bro core and automatically converts the value to the correct type.
Example script use:
type Idx: record {
option_name: string;
};
type Val: record {
option_val: string;
};
global currconfig: table[string] of string = table();
event InputConfig::new_value(name: string, source: string, id: string, value: any)
{
print id, value;
}
event bro_init()
{
Input::add_table([$reader=Input::READER_CONFIG, $source="../configfile", $name="configuration", $idx=Idx, $val=Val, $destination=currconfig, $want_record=F]);
}
Script-level config framework
=============================
The script-level framework ties these two features together and makes
them a bit more convenient to use. Configuration files can simply be
specified by placing them into Config::config_files. The framework also
creates a config.log that shows all value changes that took place.
Usage example:
redef Config::config_files += {configfile};
export {
option testbool : bool = F;
}
The file is now monitored for changes; when a change occurs the
respective option values are automatically updated and the value change
is written to config.log.
The catch-and-release.bro test was failing whenever three conditions
were all true: sorting the netcontrol.log before comparing to
the baseline, the presence of LC_ALL=C in btest.cfg changes the sort
order, and sometimes the timestamp increases slightly beginning
with one of the rule_id == 5 lines.
As a result of these three conditions, the sorted order of the lines
with rule_id of 5 were different than the baseline.
Fixed by not sorting netcontrol.log, as this doesn't seem necessary.
- Addresses Philip Romero's question from the Bro mailing list.
- Adds Microsoft Edge as a detected browser.
- We are now unescaping encoded characters in software names.
This feature can be enabled globally for all logs by setting
LogAscii::gzip_level to a value greater than 0.
This feature can be enabled on a per-log basis by setting gzip-level in
$confic to a value greater than 0.
The changes are now a bit more succinct with less code changes required.
Behavior is tested a little bit more thoroughly and a memory problem
when reading incomplete lines was fixed. ReadHeader also always directly
returns if header reading failed.
Error messages now are back to what they were before the change, if the
new behavior is not used.
I also tweaked the documentation text a bit.
By default, the ASCII reader does not fail on errors anymore.
If there is a problem parsing a line, a reporter warning is
written and parsing continues. If the file is missing or can't
be read, the input thread just tries again on the next heartbeat.
Options have been added to recreate the previous behavior...
const InputAscii::fail_on_invalid_lines: bool;
and
const InputAscii::fail_on_file_problem: bool;
They are both set to `F` by default which makes the input readers
resilient to failure.
number of fields required.
Addresses BIT-1683
I do not think this quite fixes the underlying issue of BIT-1683 - it
should not be possible to get to this state in normal operations.
Also fixes a small memory leak for disabled writers.
* origin/topic/seth/log-framework-ext:
Log extensions: series of small fixes and new tests.
Change the function for log extension to take a path only and update tests.
Final changes to log framework ext code.
Add logging framework metadata mechanism.
Add unrolling separator & field name map to logging framework.
The extensions now work with optional types, as well with complex types
(like subrecords). Not returning a record in the ext_func no longer
crashes bro.
The default_ext_func was switched to return void in
cases where no extension revord is defined (was bool).
I also got rid of the offsets in the indices - with the rest of the
implementation, that was not really necessary and made the code more
complex.
The "metadata" functionality has been renamed to "ext" to
represent that the logs are being extended. The function that
returns the record which is used to extend the log now receives
a log filter as it's single argument.
The field name "unrolling" is now renamed to "scope" so the variables
names now look like this: "Log::default_scope_sep"
This adds the capability for the user to attach a reason when removing
or destroying a rule. The message will both be logged in netcontrol.log
and forwarded to the responsible plugins.
Addresses BIT-1655
* origin/topic/dnthayer/ticket1627:
Add a test for starting a cluster with a logger node
Update broctl submodule
Update broctl submodule to branch topic/dnthayer/ticket1627
Change how logger node is detected in cluster framework
Update test baselines for the new logger node type
Update docs for the new logger node type
Add a new node type for logging
This adds an event that is raised once Catch & Release ceases the
block management for an IP address because the IP has not been seen in
traffic during the watch interval.
This allows users who use their own logic on the top of catch and
release know when they will have to start re-blocking the IP if it
occurs in traffic again.
Calling Error() in an input reader now automatically will disable the
reader and return a failure in the Update/Heartbeat calls.
Also adds more tests.
Addresses BIT-1181
This change introduces error events for Table and Event readers. Users
can now specify an event that is called when an info, warning, or error
is emitted by their input reader. This can, e.g., be used to raise
notices in case errors occur when reading an important input stream.
Example:
event error_event(desc: Input::TableDescription, msg: string, level: Reporter::Level)
{
...
}
event bro_init()
{
Input::add_table([$source="a", $error_ev=error_event, ...]);
}
For the moment, this converts all errors in the Asciiformatter into
warnings (to show that they are non-fatal) - the Reader itself also has
to throw an Error to show that a fatal error occurred and processing
will be abort.
It might be nicer to change this and require readers to mark fatal
errors as such when throwing them.
Addresses BIT-1181
This allows all threads accessing the same database to share sqlite
objects. This, for example, fixes the issue with several threads
simultaneously writing to the same database file.
See https://www.sqlite.org/sharedcache.html
Addresses BIT-1325