* origin/topic/timw/open-dict: (40 commits)
Move Dict constants to detail namespace
Add a few missing deprecation fixes
Adjust Dict whitespace/style
Adjust more btest timings
Improve termination reliability/speed for brokerstore btests
General btest cleanup
Update NEWS about change in Dictionary implementation
Improve Intel expire-item btest to be less time-sensitive
Improve btests with unstable table/set output ordering
Update doc submodule
Adjust a few btests that were unstable due to time-sensitivity
Fix DNS script deleting a table element while iterating
Improve a brokerstore btest to filter out Broker connection messages
Sort output of a few SumStats cluster tests
Fix extract_first_email_addr() to really return the first email
Add find_all_ordered() BIF
Extend external test suite canonifier with set-sorting logic
Update btests/baselines for OpenDict compat
Fix new/malloc/delete/free mismatches in Dictionary code
Add explanation for a Dict TODO item
...
- Use `-b` most everywhere, it will save time.
- Start some intel tests upon the input file being fully read instead of
at an arbitrary time.
- Improve termination condition for some sumstats/cluster tests.
- Filter uninteresting output from some supervisor tests.
- Test for `notice_policy.log` is no longer needed.
Particularly, the final output order of a table/set is sensitive to
order of input/insertions and some tests were converting
std::unordered_{set,map} to Zeek table/set and iteration over those
standard containers may not always loop through elements in the same
order across all platforms.
The use of find_all() in extract_email_addrs_vec() extracted occurrences
to an intermediate set and thus lost any sense of ordering.
This changes extract_email_addrs_vec() to use find_all_ordered() and
return all occurrences of email addresses found in the argument,
included duplicates, with their order of occurrence preserved.
Haven't checked different build configurations yet, but all except
a few SumStats tests are stable for me now. The external tests
are also completely failing, but haven't looked at those yet.
The body-lengths of sub-entities, like multipart messages, got counted
twice by mistake: once upon the end of the sub-entity and then again
upon the end of the top-level entity that contains all sub-entities.
The size of just the top-level entity is the correct one to use.
It was not dealing with multiple spaces between the key and the value
with MUSL correctly. This change ensures that if a value exists, that it
begins and ends with a non-blank character.
* origin/topic/vladg/gh-1084:
Add btest for GH-1084
Update baselines
MySQL: Fix parsing logic bug. We were correctly NOT expecting an EOF, but because we were parsing the header and then not parsing the rest, we would get out of sync
These may be redefined to customize log rotation path prefixes,
including use of a directory. File extensions are still up to
individual log writers to add themselves during the actual rotation.
These new also allow for some simplication to the default
ASCII postprocessor function: it eliminates the need for it doing an
extra/awkward rename() operation that only changes the timestamp format.
This also teaches the supervisor framework to use these new options
to rotate ascii logs into a log-queue/ directory with a specific
file name format (intended for an external archiver process to
monitor separately).
For `DHCP::ClientID$hwtype` fields equal to 0, the `hwaddr` field is
no longer misformatted as a MAC and instead just contains the raw bytes
seen in the DHCP Client ID Option.
This also updates all usages of the deprecated Val ctor to use
either IntervalVal, TimeVal, or DoubleVal ctors. The reason for
doing away with the old constructor is that using it with TYPE_INTERVAL
isn't strictly correct since there exists a more specific subclass,
IntervalVal, with overriden ValDescribe() method that ought to be used
to print such values in a more descriptive way.
* origin/topic/timw/906-find-all-urls-regex:
Restore previous url scheme capture group
GH-906: Fix the regex in url.zeek to better match for find_all_urls
This commit switches UID hashing from md5 to a highway hash. It also
moves the salt value out of the file plugin - and makes it
installation-specific instead - it is moved to the global namespace.
There now are digest hash functions to make "static"
installation-specific hashes that are stable over workers available to
everyone; hashes can be 64, 128 or 256 bits in size.
Due to the fact that we switch the file hashing algorithm, all file
hashes change.
The underlyigng algorithm that is used for hashing is highwayhash-128,
which is significantly faster than md5.
- Added test case and adjusted whitespace in merge
* 'stats-logging-fix' of https://github.com/brittanydonowho/zeek:
Fixed stats.zeek to log all data before zeek terminates rather than return too soon
Node-specific topic prefix subscriptions/publications now add a trailing
slash like "zeek/cluster/node/<name>/". Without the trailing slash,
messages attempting to target "proxy-10" may also be sent to "proxy-1"
since subscription matching is prefix-based.
- Squashed the original commit set
- Cleaned up formatting
- Fixed register_for_ports() for right RDPEUDP analyzer
* topic/ak/rdpeudp:
Add RDP over UDP analyzer
These were previously reporting leaks due to various allocations not
getting cleaned up during the stack unwind, but at the current state of
the transition toward IntrusivePtr usage, theses tests no longer leak.
* The compression capability was incorrectly set to 0x0004 instead of 0x0003
* The padding was 4-byte instead of 8-byte aligned and also the spec.
does not strictly require the padding for the last item in the list.
* Add a default case to handle parsing of unknown context types.
Changed some configuration defaults to potentially more same values.
The callback function is now a hook to allow costomization of the events
that are raised.
Tests now exist. Test baselines are updated.