This change standardizes threading formatter error handling and moves
the remaining error calls to be warnings instead.
This is in line with already existing code - in most cases warnings were
raised, only a few cases raised errors. These cases do not differ
significantly from other cases in which warnings are raised.
This also fixes GH-692, in which misformatted lines prevent future file
parsing.
This commit also moves the FailWarn method that is used by both the
config and the ascii reader up to the ReaderBackend. Furthermore it
makes the Warning method of ReaderBackend respect the warning
suppression that is introduced by the FailWarn method.
Warnings in the ASCII reader so far remained suppressed even when an
input file changed. It's helpful to learn about problems in the data
when putting in place new data files, so this isn't great. This change
maintains the existing warning suppression while processing a file,
but re-enables warnings after updates to a file.
Also includes minor comment clarifications, and maintains the
not-so-great code duplication between the ASCII and Config readers
until we refactor this properly.
* 'topic/christian/inputframework-reporter-filenames' of https://github.com/ckreibich/zeek:
Add input file name to additional ASCII reader warning messages
The ASCII reader had a few messages that did not indicate in which
file it notices a problem. With the input framework it simplifies
troubleshooting when that file is spelled out, because you may have
multiple such files on your system.
Includes test baseline updates.
This introduces the following redefinable string constants, empty by
default:
- InputAscii::path_prefix
- InputBinary::path_prefix
- Intel::path_prefix
When using ASCII or binary reades in the Input/Intel Framework with an
input stream source that does not have an absolute path, these
constants cause Zeek to prefix the resulting paths accordingly. For
example, in the following the location on disk from which Zeek loads
the input becomes "/path/to/input/whitelist.data":
redef InputAscii::path_prefix = "/path/to/input";
event bro_init()
{
Input::add_table([$source="whitelist.data", ...]);
}
These path prefixes can be absolute or relative. When an input stream
source already uses an absolute path, this path is preserved and the
new variables have no effect (i.e., we do not affect configurations
already using absolute paths).
Since the Intel framework builds upon the Input framework, the first
two paths also affect Intel file locations. If this is undesirable,
the Intel::path_prefix variable allows specifying a separate path:
when its value is absolute, the resulting source seen by the Input
framework is absolute, therefore no further changes to the paths
happen.
The changes are now a bit more succinct with less code changes required.
Behavior is tested a little bit more thoroughly and a memory problem
when reading incomplete lines was fixed. ReadHeader also always directly
returns if header reading failed.
Error messages now are back to what they were before the change, if the
new behavior is not used.
I also tweaked the documentation text a bit.
By default, the ASCII reader does not fail on errors anymore.
If there is a problem parsing a line, a reporter warning is
written and parsing continues. If the file is missing or can't
be read, the input thread just tries again on the next heartbeat.
Options have been added to recreate the previous behavior...
const InputAscii::fail_on_invalid_lines: bool;
and
const InputAscii::fail_on_file_problem: bool;
They are both set to `F` by default which makes the input readers
resilient to failure.
This works correctly now (as a prototype at least). If a file
disappears, the thread complains once and once the file reappears
the thread will once again begin watching it.
This change introduces error events for Table and Event readers. Users
can now specify an event that is called when an info, warning, or error
is emitted by their input reader. This can, e.g., be used to raise
notices in case errors occur when reading an important input stream.
Example:
event error_event(desc: Input::TableDescription, msg: string, level: Reporter::Level)
{
...
}
event bro_init()
{
Input::add_table([$source="a", $error_ev=error_event, ...]);
}
For the moment, this converts all errors in the Asciiformatter into
warnings (to show that they are non-fatal) - the Reader itself also has
to throw an Error to show that a fatal error occurred and processing
will be abort.
It might be nicer to change this and require readers to mark fatal
errors as such when throwing them.
Addresses BIT-1181
The ascii reader now accepts \r\n newlines without complaining.
Furthermore, the reader was slightly rewritten in a more c++11-y way,
removing all raw pointers from the class.
Addresses BIT-1198