On FreeBSD, this test showed two problems: (1) reordering problems
based on writing the predicate, event, and end-of-data updates into a
single file, (2) a race condition based on printing the entirety of
the table description argument in update events. The description
contains the destination table, and its content at the time an update
event gets processed isn't deterministic: depending on the number
of updates the reader thread has sent, the table will contain a
varying number of entries.
The invalidset and invalidtext tests loaded an input file via table
and event reads, in parallel. On FreeBSD this triggers an occasional
reordering of messages coming out of the reader thread vs the input
managers. This commit makes the table and event reads sequential,
avoiding the race.
The framework so far populated data structures with missing fields
even when those fields are defined without the &optional
attribute. When using the attribute, such entries continue to get
populated.
Update tests to reflect focus on unset fields.
When a text with an (escaped) zero byte was passed to ParseValue, only
the part of the string up to the zero byte was copied, but the length of
the full string was passed to the input framework.
This leads to the input manager reading over the end of the buffer.
Fixeszeek/zeek#1398
Script-layer counts, when provided as negative integers in an input
file, got cast to unsigned values because strtoull() does not complain
about negative values. For example, input string "-1" would lead to
value 18446744073709551615 (an all-ones 64-bit int) on x86_64. This is
more likely to be an error than an intent to get very large,
platform-dependent values, so these input lines are now skipped with
according messaging in the reporter.log/stderr.
This also affected ports: -1/tcp got cast to unsigned and only thrown
out because PortVal rejects values > 65535, mapping them to 0. We now
skip such inputs as well.
Updates existing input framework tests to capture the new behavior.
By default all baslines are run through diff-remove-timestamp. On a BSD
sed implementation, this means that a newline is added to the end of the
file, if no newline was there originally. This behavior differs from GNU
sed, which does not add a newline.
In this commit we unify this behavior by always adding a newline, even
when using GNU sed. This commit also disables the canonifier for a bunch
of binary baselines, so we do not have to change them.
- Use `-b` most everywhere, it will save time.
- Start some intel tests upon the input file being fully read instead of
at an arbitrary time.
- Improve termination condition for some sumstats/cluster tests.
- Filter uninteresting output from some supervisor tests.
- Test for `notice_policy.log` is no longer needed.
It was not dealing with multiple spaces between the key and the value
with MUSL correctly. This change ensures that if a value exists, that it
begins and ends with a non-blank character.
A race condition could cause unstable output: if the thread reading the
file is fast, often you see both "pred" functions execute and then both
"line" events execute with both entries already in the table, but if the
thread reading the file is slow, you see pred, event, pred, event, with
only one entry available in the first event.
* Generally increase timeouts for tests that have recent transient
failures
* Change any test that relied on `btest-bg-wait -k` since that's never
going to play with with CI systems. Instead, we always need to have
a well-defined termination condition in the test itself (and most
already did, so didn't really need the `-k` flag anyway).
This change standardizes threading formatter error handling and moves
the remaining error calls to be warnings instead.
This is in line with already existing code - in most cases warnings were
raised, only a few cases raised errors. These cases do not differ
significantly from other cases in which warnings are raised.
This also fixes GH-692, in which misformatted lines prevent future file
parsing.
This commit also moves the FailWarn method that is used by both the
config and the ascii reader up to the ReaderBackend. Furthermore it
makes the Warning method of ReaderBackend respect the warning
suppression that is introduced by the FailWarn method.
* origin/topic/seth/624:
Support whitespace at end of line for config reader.
This merge fixes a failing test; it also sprinkles a few more spaces
into another test file.
The main change is that this now also works with configuration lines
that don't have a value.
* 'master' of https://github.com/ZekeMedley/zeek:
Use the right delete and improve the leak test. Increases the size of the table being loaded in the pattern leak test and uses the right delete method.
Fix formatting.
Fix memory leak and add test.
Add pattern support to input framework.
For backward compatibility when reading values, we first check
the ZEEK-prefixed value, and if not set, then check the corresponding
BRO-prefixed value.
This also installs symlinks from "zeek" and "bro-config" to a wrapper
script that prints a deprecation warning.
The btests pass, but this is still WIP. broctl renaming is still
missing.
#239
This introduces the following redefinable string constants, empty by
default:
- InputAscii::path_prefix
- InputBinary::path_prefix
- Intel::path_prefix
When using ASCII or binary reades in the Input/Intel Framework with an
input stream source that does not have an absolute path, these
constants cause Zeek to prefix the resulting paths accordingly. For
example, in the following the location on disk from which Zeek loads
the input becomes "/path/to/input/whitelist.data":
redef InputAscii::path_prefix = "/path/to/input";
event bro_init()
{
Input::add_table([$source="whitelist.data", ...]);
}
These path prefixes can be absolute or relative. When an input stream
source already uses an absolute path, this path is preserved and the
new variables have no effect (i.e., we do not affect configurations
already using absolute paths).
Since the Intel framework builds upon the Input framework, the first
two paths also affect Intel file locations. If this is undesirable,
the Intel::path_prefix variable allows specifying a separate path:
when its value is absolute, the resulting source seen by the Input
framework is absolute, therefore no further changes to the paths
happen.
This change ignores leading/trailing whitespaces for a couple of
data-types (bool, port, subnet, addr) and just parses them as if the
whitespace was not present.
The ascii formatter already was happy to read ports in the form
"42/tcp"; however it emitted a warning message for each line.
This patch fixes this and adds a bit more testing for the existing
behavior.
Mostly trying to standardize the way tests sleep for arbitrary amounts
of time to make it easier to tell at which particular point the
unit test actually may need the timeout interval increased (or else
debugged further).
This test has failed numerous times on Travis CI. Fixes to make this
test more reliable: create the does-not-exist.dat file atomically, and
increase wait time after starting bro in order to give all input
streams a chance to try to read the input file.
Also added the input stream name to the test output, in order to make
output easier to understand if the test fails again.
Currently the destructor would try to free unallocated memory. This
could e.g. be triggered by the input framework reading a set with an
invalid element.
The configuration framework consists of three mostly distinct parts:
* option variables
* the config reader
* the script level framework
I will describe the three elements in the following.
Internally, this commit also performs a range of changes to the Input
manager; it marks a lot of functions as const and introduces a new
ValueToVal method (which could in theory replace the already existing
one - it is a bit more powerful).
This also changes SerialTypes to have a subtype for Values, just as
Fields already have it; I think it was mostly an oversight that this was
not introduced from the beginning. This should not necessitate any code
changes for people already using SerialTypes.
option variable
===============
The option keyword allows variables to be specified as run-tine options.
Such variables cannot be changed using normal assignments. Instead, they
can be changed using Option::set. It is possible to "subscribe" to
options and be notified when an option value changes.
Change handlers can also change values before they are applied; this
gives them the opportunity to reject changes. Priorities can be
specified if there are several handlers for one option.
Example script:
option testbool: bool = T;
function option_changed(ID: string, new_value: bool): bool
{
print fmt("Value of %s changed from %s to %s", ID, testbool, new_value);
return new_value;
}
event bro_init()
{
print "Old value", testbool;
Option::set_change_handler("testbool", option_changed);
Option::set("testbool", F);
print "New value", testbool;
}
config reader
=============
The config reader provides a way to read configuration files back into
Bro. Most importantly it automatically converts values to the correct
types. This is important because it is at least inconvenient (and
sometimes near impossible) to perform the necessary type conversions in
Bro scripts themselves. This is especially true for sets/vectors.
Configuration generally look like this:
[option name][tab/spaces][new variable value]
so, for example:
testaddr 2607:f8b0:4005:801::200e
testinterval 60
testtime 1507321987
test_set a b c d erdbeerschnitzel
The reader uses the option name to look up the type that variable has in
the Bro core and automatically converts the value to the correct type.
Example script use:
type Idx: record {
option_name: string;
};
type Val: record {
option_val: string;
};
global currconfig: table[string] of string = table();
event InputConfig::new_value(name: string, source: string, id: string, value: any)
{
print id, value;
}
event bro_init()
{
Input::add_table([$reader=Input::READER_CONFIG, $source="../configfile", $name="configuration", $idx=Idx, $val=Val, $destination=currconfig, $want_record=F]);
}
Script-level config framework
=============================
The script-level framework ties these two features together and makes
them a bit more convenient to use. Configuration files can simply be
specified by placing them into Config::config_files. The framework also
creates a config.log that shows all value changes that took place.
Usage example:
redef Config::config_files += {configfile};
export {
option testbool : bool = F;
}
The file is now monitored for changes; when a change occurs the
respective option values are automatically updated and the value change
is written to config.log.
The changes are now a bit more succinct with less code changes required.
Behavior is tested a little bit more thoroughly and a memory problem
when reading incomplete lines was fixed. ReadHeader also always directly
returns if header reading failed.
Error messages now are back to what they were before the change, if the
new behavior is not used.
I also tweaked the documentation text a bit.
By default, the ASCII reader does not fail on errors anymore.
If there is a problem parsing a line, a reporter warning is
written and parsing continues. If the file is missing or can't
be read, the input thread just tries again on the next heartbeat.
Options have been added to recreate the previous behavior...
const InputAscii::fail_on_invalid_lines: bool;
and
const InputAscii::fail_on_file_problem: bool;
They are both set to `F` by default which makes the input readers
resilient to failure.
Calling Error() in an input reader now automatically will disable the
reader and return a failure in the Update/Heartbeat calls.
Also adds more tests.
Addresses BIT-1181
This change introduces error events for Table and Event readers. Users
can now specify an event that is called when an info, warning, or error
is emitted by their input reader. This can, e.g., be used to raise
notices in case errors occur when reading an important input stream.
Example:
event error_event(desc: Input::TableDescription, msg: string, level: Reporter::Level)
{
...
}
event bro_init()
{
Input::add_table([$source="a", $error_ev=error_event, ...]);
}
For the moment, this converts all errors in the Asciiformatter into
warnings (to show that they are non-fatal) - the Reader itself also has
to throw an Error to show that a fatal error occurred and processing
will be abort.
It might be nicer to change this and require readers to mark fatal
errors as such when throwing them.
Addresses BIT-1181
The ascii reader now accepts \r\n newlines without complaining.
Furthermore, the reader was slightly rewritten in a more c++11-y way,
removing all raw pointers from the class.
Addresses BIT-1198