These may be redefined to customize log rotation path prefixes,
including use of a directory. File extensions are still up to
individual log writers to add themselves during the actual rotation.
These new also allow for some simplication to the default
ASCII postprocessor function: it eliminates the need for it doing an
extra/awkward rename() operation that only changes the timestamp format.
This also teaches the supervisor framework to use these new options
to rotate ascii logs into a log-queue/ directory with a specific
file name format (intended for an external archiver process to
monitor separately).
This commit switches UID hashing from md5 to a highway hash. It also
moves the salt value out of the file plugin - and makes it
installation-specific instead - it is moved to the global namespace.
There now are digest hash functions to make "static"
installation-specific hashes that are stable over workers available to
everyone; hashes can be 64, 128 or 256 bits in size.
Due to the fact that we switch the file hashing algorithm, all file
hashes change.
The underlyigng algorithm that is used for hashing is highwayhash-128,
which is significantly faster than md5.
Node-specific topic prefix subscriptions/publications now add a trailing
slash like "zeek/cluster/node/<name>/". Without the trailing slash,
messages attempting to target "proxy-10" may also be sent to "proxy-1"
since subscription matching is prefix-based.
These were previously reporting leaks due to various allocations not
getting cleaned up during the stack unwind, but at the current state of
the transition toward IntrusivePtr usage, theses tests no longer leak.
A race condition could cause unstable output: if the thread reading the
file is fast, often you see both "pred" functions execute and then both
"line" events execute with both entries already in the table, but if the
thread reading the file is slow, you see pred, event, pred, event, with
only one entry available in the first event.
* Generally increase timeouts for tests that have recent transient
failures
* Change any test that relied on `btest-bg-wait -k` since that's never
going to play with with CI systems. Instead, we always need to have
a well-defined termination condition in the test itself (and most
already did, so didn't really need the `-k` flag anyway).
This change standardizes threading formatter error handling and moves
the remaining error calls to be warnings instead.
This is in line with already existing code - in most cases warnings were
raised, only a few cases raised errors. These cases do not differ
significantly from other cases in which warnings are raised.
This also fixes GH-692, in which misformatted lines prevent future file
parsing.
This commit also moves the FailWarn method that is used by both the
config and the ascii reader up to the ReaderBackend. Furthermore it
makes the Warning method of ReaderBackend respect the warning
suppression that is introduced by the FailWarn method.
E.g. ones that throw interpreter exceptions, as those are currently
known to potentially cause leaks. Fixing the underlying leaks involves
the larger task of more IntrusivePtr usage.
Reference cycles may also cause leaks.
Adjustments during merge:
- kept the UNKNOWN Log::ID as placeholder value
- changed the coverage.find-bro-logs test to check for arbitrary $path
field values instead of just string literals
- don't force EnumVal to unsigned integer since the relevant union member
is the signed integer and added the relevant enum values/types to
.bif files for easier access
- compare FILE* versus file name to check for stdout equality (don't
think it matters much, just a bit more efficient)
- minor whitespace/style tweaks
* origin/topic/dev/print-to-log:
Added a non boolean configuration and other changes as suggested by Jon
Allow Print Statements to be redirected to a Log# This is a combination of 3 commits.
* origin/topic/seth/624:
Support whitespace at end of line for config reader.
This merge fixes a failing test; it also sprinkles a few more spaces
into another test file.
The main change is that this now also works with configuration lines
that don't have a value.
* origin/topic/dev/non-ascii-logging:
Removed Policy Script for UTF-8 Logs
Commented out UTF-8 Script in Test All Policy
Minor Style Tweak
Use getNumBytesForUTF8 method to determine number of bytes
Added Jon's test cases as unit tests
Prioritizes escaping predefined Escape Sequences over Unescaping UTF-8 Sequences
Added additional check to confirm anything unescaping is a multibyte UTF-8 sequence, addressing the test case Jon brought up
Added optional script and redef bool to enable utf-8 in ASCII logs
Initial Commit, removed std::isprint check to escape
Made minor code format and logic adjustments during merge.
* origin/topic/timw/cleaner-utf8:
GHI-486: Switch over to using LLVM utf8-checking code to better validate characters
I addressed a buffer over-read during the merge and added test-cases for
it.
This allows one to tune the number of protocol violations to tolerate
from any given analyzer type before just disabling a given instance
of it.
Also removes the "disabled_aids" field from the DPD::Info record
since it serves no purpose: in this case, calling disable_analyzer
multiple times for the same analyzer is a no-op.
* origin/topic/timw/150-to-json:
Update submodules for JSON work
Update unit tests for JSON logger to match new output
Modify JSON log writer to use the external JSON library
Update unit test output to match json.zeek being deprecated and slight format changes to JSON output
Add proper JSON serialization via C++, deprecate json.zeek
Add new method for escaping UTF8 strings for JSON output
Move do_sub method from zeek.bif to StringVal class method
Move record_fields method from zeek.bif to Val class method
Add ToStdString method for StringVal
In the past they were processed on the manager - which requires big
records to be sent around.
This has a potential of incompatibilities if someone relied on global
state for notice processing.
GH-214
By using a consistent timestamp. That avoids rare chances of sqlite
output from rounding the current time into such a form that happens
to bypass the timestamp canonifier script (whenever it happened to
land on a whole or tenth second).
These are no longer loaded by default due to the performance impact they
cause simply by being loaded (they have event handlers for commonly
generated events) and they aren't generally useful enough to justify it.
* 'master' of https://github.com/ZekeMedley/zeek:
Use the right delete and improve the leak test. Increases the size of the table being loaded in the pattern leak test and uses the right delete method.
Fix formatting.
Fix memory leak and add test.
Add pattern support to input framework.
All types (besides EntropyVal) now support a native copy operation,
which uses primitives of the underlying datatypes to perform a quick
copy, without serialization.
EntropyVal is the one exception - since that type is rather complex
(many members) and will probably not be copied a lot, if at all, it
makes sense to just use the serialization function.
This will have to be slightly re-written in the near-term-future to use
the new serialization function for that opaque type.
This change also introduces a new x509_from_der bif, which allows to
parse a der into an opaque of x509.
This change removes the d2i_X509_ wrapper function; this was a remnant
when d2i_X509 took non-const arguments. We directly use d2i_X509 at
several places assuming const-ness, so there does not seem to ba a
reason to keep the wrapper.
This change also exposed a problem in the File cache - cases in which an
object was brought back into the cache, and writing occurred in the
file_open event were never correctly handeled as far as I can tell.
For backward compatibility when reading values, we first check
the ZEEK-prefixed value, and if not set, then check the corresponding
BRO-prefixed value.