Somewhere record types with zero fields get the optional attribute
apparently. The input/sqlite/basic test failed due to complaining
that ctx is optional. It isn't optional and when it has zero fields
we can just ignore it, too.
Also adds a input framework test with an explicit empty record type
This isn't a straightforward fix, unfortunately. The existing GetLine()
implementation didn't deal well with input that's incrementally produced
where individually read chunks wouldn't end with the separator.
The prior implementation increased the buffer each time it failed to find
a separator in the current buffer, but then also ended up not searching the
full new buffer size for the terminator, doing that endlessly.
This change reworks the Raw reader to rely only on bufpos for reading
and searching purposes and skip reallocation if the buffer size if it
wasn't actually exhausted.
Closes#3957
It seems like other similar tests get by because they have more "stuff"
before they call `terminate()` most likely. But, to be safe, just
removing the "received termination signal" line seems like the best
approach.
Invalid lines in a file was the one case that would not suppress future
warnings. Just make it suppress warnings too, but clear that suppression
if there is a field in between that doesn't error.
Fixes#3692
The input framework currently gives a rather opaque error message when
encountering a line in which a required value is not provided. This
change updates this behavior; the error message now provides the record
element (or the name or the index element) which was not set in the
input data, even though it is required to be set by the underlying Zeek
type.
This seems to have relied on the reading file twice behavior simply
testing that 16 lines are observed. Switch to using two separate
files and doing a system("mv ...") to trigger the REREAD logic, there's
not force_update() needed and it wouldn't do anything if the file
hadn't changed anyway.
Found while writing documentation and being confused why
all lines and end_of_data() arrive twice during startup.
The test is a bit fuzzy, but does fail reliably without
the changes to Raw.cc
Also fix not checking dev in the MODE_REREAD path.
Closes#3053
* amazing-pp/topic/fupeng/from_json_bif:
Implement from_json bif
Minor updates during merge: Moved ValFromJSON into zeek::detail for the
time being, removed gotos, normalized some error messages to lower case,
minimal test extension and added a raw reader input framework test reading
"json lines" as a demo, adding notes about the implicit type
conversions.
Now that it's loaded in bare mode, no need to load it explicitly.
The main thing that tests were relying on seems to be tracking of
c$service for conn.log baselines. Very few were actually checking
for dpd.log
The previous way of splitting strings would break if the last string in
the line was an empty string, and it would return one fewer fields than
it should have. This was breaking the last line in the
scripts.base.framework.input.ascii.setspecialcases once the bug fixed in
GH #1628 was fixed.
While writing a test for the new "tail -F semantics" I found that
the $want_record=F case was broken (errno 25). So instead of opening
/dev/null when the input file is missing change READER_RAW to avoid
I/O until it can be opened.
Add two tests, one for when the event handler is called with a
record and one for when it's called with a string.
On FreeBSD, this test showed two problems: (1) reordering problems
based on writing the predicate, event, and end-of-data updates into a
single file, (2) a race condition based on printing the entirety of
the table description argument in update events. The description
contains the destination table, and its content at the time an update
event gets processed isn't deterministic: depending on the number
of updates the reader thread has sent, the table will contain a
varying number of entries.
The invalidset and invalidtext tests loaded an input file via table
and event reads, in parallel. On FreeBSD this triggers an occasional
reordering of messages coming out of the reader thread vs the input
managers. This commit makes the table and event reads sequential,
avoiding the race.
The framework so far populated data structures with missing fields
even when those fields are defined without the &optional
attribute. When using the attribute, such entries continue to get
populated.
Update tests to reflect focus on unset fields.
When a text with an (escaped) zero byte was passed to ParseValue, only
the part of the string up to the zero byte was copied, but the length of
the full string was passed to the input framework.
This leads to the input manager reading over the end of the buffer.
Fixeszeek/zeek#1398
Script-layer counts, when provided as negative integers in an input
file, got cast to unsigned values because strtoull() does not complain
about negative values. For example, input string "-1" would lead to
value 18446744073709551615 (an all-ones 64-bit int) on x86_64. This is
more likely to be an error than an intent to get very large,
platform-dependent values, so these input lines are now skipped with
according messaging in the reporter.log/stderr.
This also affected ports: -1/tcp got cast to unsigned and only thrown
out because PortVal rejects values > 65535, mapping them to 0. We now
skip such inputs as well.
Updates existing input framework tests to capture the new behavior.
By default all baslines are run through diff-remove-timestamp. On a BSD
sed implementation, this means that a newline is added to the end of the
file, if no newline was there originally. This behavior differs from GNU
sed, which does not add a newline.
In this commit we unify this behavior by always adding a newline, even
when using GNU sed. This commit also disables the canonifier for a bunch
of binary baselines, so we do not have to change them.
- Use `-b` most everywhere, it will save time.
- Start some intel tests upon the input file being fully read instead of
at an arbitrary time.
- Improve termination condition for some sumstats/cluster tests.
- Filter uninteresting output from some supervisor tests.
- Test for `notice_policy.log` is no longer needed.
It was not dealing with multiple spaces between the key and the value
with MUSL correctly. This change ensures that if a value exists, that it
begins and ends with a non-blank character.
A race condition could cause unstable output: if the thread reading the
file is fast, often you see both "pred" functions execute and then both
"line" events execute with both entries already in the table, but if the
thread reading the file is slow, you see pred, event, pred, event, with
only one entry available in the first event.
* Generally increase timeouts for tests that have recent transient
failures
* Change any test that relied on `btest-bg-wait -k` since that's never
going to play with with CI systems. Instead, we always need to have
a well-defined termination condition in the test itself (and most
already did, so didn't really need the `-k` flag anyway).
This change standardizes threading formatter error handling and moves
the remaining error calls to be warnings instead.
This is in line with already existing code - in most cases warnings were
raised, only a few cases raised errors. These cases do not differ
significantly from other cases in which warnings are raised.
This also fixes GH-692, in which misformatted lines prevent future file
parsing.
This commit also moves the FailWarn method that is used by both the
config and the ascii reader up to the ReaderBackend. Furthermore it
makes the Warning method of ReaderBackend respect the warning
suppression that is introduced by the FailWarn method.
* origin/topic/seth/624:
Support whitespace at end of line for config reader.
This merge fixes a failing test; it also sprinkles a few more spaces
into another test file.
The main change is that this now also works with configuration lines
that don't have a value.
* 'master' of https://github.com/ZekeMedley/zeek:
Use the right delete and improve the leak test. Increases the size of the table being loaded in the pattern leak test and uses the right delete method.
Fix formatting.
Fix memory leak and add test.
Add pattern support to input framework.
For backward compatibility when reading values, we first check
the ZEEK-prefixed value, and if not set, then check the corresponding
BRO-prefixed value.
This also installs symlinks from "zeek" and "bro-config" to a wrapper
script that prints a deprecation warning.
The btests pass, but this is still WIP. broctl renaming is still
missing.
#239
This introduces the following redefinable string constants, empty by
default:
- InputAscii::path_prefix
- InputBinary::path_prefix
- Intel::path_prefix
When using ASCII or binary reades in the Input/Intel Framework with an
input stream source that does not have an absolute path, these
constants cause Zeek to prefix the resulting paths accordingly. For
example, in the following the location on disk from which Zeek loads
the input becomes "/path/to/input/whitelist.data":
redef InputAscii::path_prefix = "/path/to/input";
event bro_init()
{
Input::add_table([$source="whitelist.data", ...]);
}
These path prefixes can be absolute or relative. When an input stream
source already uses an absolute path, this path is preserved and the
new variables have no effect (i.e., we do not affect configurations
already using absolute paths).
Since the Intel framework builds upon the Input framework, the first
two paths also affect Intel file locations. If this is undesirable,
the Intel::path_prefix variable allows specifying a separate path:
when its value is absolute, the resulting source seen by the Input
framework is absolute, therefore no further changes to the paths
happen.