This addresses PR feedback. The main component in this commit is to
disable &on_change notifications when &backend loads a table from sqlite
on startup.
* origin/topic/jsiwek/gh-1024-broker-store-handle-type-checks:
Improve Broker store API's handling of invalid arguments
Add builtin_exception() functions
GH-1024: fix crash on passing wrong types to Broker store API
This was a bit of a bigger merge since Zeek changed inbetween the time
of the PR and me actually merging it.
I put the new functions into the zeek::detail namespace -- since it
seems unlikely that those will be used by something external.
I also renamed them to fit better with the naming scheme of the new
error functions.
Fixes GH-1024
This commit adds script/c++ documentation and fixes a few loose ends.
It also adds tests for corner cases and massively improves error
messages.
This also actually introduces type-compatibility checking and introduces
a new attribute that lets a user override this if they really know what
they are doing. I am not quite sure if we should really let that stay in
- but it can be very convenient to have this functionality.
One test is continuing to fail - the expiry test is very flaky. This is,
I think, caused by delays of the broker store forwarding. I am unsure if
we can actually do anything about that.
Previously, a single `icmp_conn` record was built per ICMP "connection"
and re-used for all events generated from it. This may have been a
historical attempt at performance optimization, but:
* By default, Zeek does not load any scripts that handle ICMP events.
* The one script Zeek ships with that does handle ICMP events,
"detect-traceroute", is already noted as being disabled due to
potential performance problems of doing that kind of analysis.
* Re-use of the original `icmp_conn` record tends to misreport
TTL and length values since they come from original packet instead
of the current one.
* Even if we chose to still re-use `icmp_conn` records and just fill
in a new TTL and length value each packet, a user script could have
stored a reference to the record and not be expecting those values
to be changed out from underneath them.
Now, a new `icmp_info` record is created/populated in all ICMP events
and should be used instead of `icmp_conn`. It also removes the
orig_h/resp_h fields as those are redundant with what's already
available in the connection record.
Local frame offsets were being assigned based on number of the alternate
prototype's parameters, which may end up having less total parameters
than the canonical prototype, causing the local value to incorrectly
overwrite an event/hook argument value.
The location information now points out the place of the deprecated
prototype instead of the location where the ID was initially declared
(which may not itself be a deprecated prototype).
Particularly, this is meant for using &deprecated on canonical
event/hook prototype parameters to encourage users to create handlers
to another, non-deprecated prototype. i.e. for canonical prototypes,
we may not always want to put &deprecated directly on the prototype
itself since that signals deprecation of the ID entirely.
* origin/master: (47 commits)
scan.l: Remove "constant" did_module_restore logic
Fix FreeBSD CI script to install right SWIG package
Update submodule(s)
GH-928: use realpath() instead of inode to de-duplicate scripts
Update submodule(s)
GH-1040: Add zero-indexed version of str_split
Fix WhileStmt to call Stmt(Tag) ctor
GH-1041: Move compress_path to a bif that uses normalize_path
Update submodule(s)
Update submodule(s)
Update submodule(s)
Fix --enable-mobile-ipv6 build
Fix namespace of GetCurrentLocation() to zeek::detail
Add backtrace() and print_backtrace()
Rename BroString files to ZeekString
Update NEWS entry with note about class renames
Rename BroObj to Obj
Rename BroString to zeek::String
Move Func up to zeek namespace, rename BroFunc to ScriptFunc
Mark global val_mgr as deprecated and fix uses of it to use namespaced version
...
These may be redefined to customize log rotation path prefixes,
including use of a directory. File extensions are still up to
individual log writers to add themselves during the actual rotation.
These new also allow for some simplication to the default
ASCII postprocessor function: it eliminates the need for it doing an
extra/awkward rename() operation that only changes the timestamp format.
This also teaches the supervisor framework to use these new options
to rotate ascii logs into a log-queue/ directory with a specific
file name format (intended for an external archiver process to
monitor separately).
This helps prevent a node from being killed/crashing in the middle
of writing a log, restarting, and eventually clobbering that log
file that never underwent the rotation/archival process.
The old `archive-log` and `post-terminate` scripts as used by
ZeekControl previously implemented this behavior, but the new logic is
entirely in the ASCII writer. It uses ".shadow" log files stored
alongside the real log to help detect such scenarios and rotate them
correctly upon the next startup of the Zeek process.
The stdout/stderr of child processes is now redirected over a pipe back
to the supervisor process so that it can prefix the output with
the name of the emitting node.
Duplicate script `@load` directives are now detected by comparing
against canonical paths formed by realpath(). This fixes the previous,
unexpected behavior of treating scripts that hardlink to same
inode as duplicates: such links will now be loaded as distinct scripts
since their canonical path differs.
With this, the basic functionality of &backend seems to be working.
It is not yet integrated with zeekctl, one has to manually specify the
storage location for the sqlite files somewhere when using sqlite.
Usage for memory stores:
global table_to_share: table[string] of count &backend=Broker::MEMORY;
Usage for sqlite stores:
redef Broker::auto_store_db_directory = "[path]";
global table_to_share: table[string] of count &backend=Broker::SQLITE;
In both cases, the cluster should automatically sync to changes done by
any node. When using sqlite, data should also be saved to disk and
re-loaded on startup.
Currently this requires using this with a normal cluster - or sending
messages by yourself.
It, in principle, should also work with SQLITE - but that is a bit
nonsensical without being able to change the storage location.
The &backend attribute allows for a much more convenient way of
interacting with brokerstores. One does not need to create a broker
store anymore - instead all of this is done internally.
The current state of this partially works. This should work fine for
persistence - but clones are currently not yet correctly attached.
When a clone attaches to a master, it just gets the diffs sent as
events. Which is neat because it means that we pretty much don't need
any extra code to handle this.
This currently only handles the most basic case, and is not thoroughly
tested.
When initializing a master store, we now check if there already is data
in it. If yes, we load it directly into the zeek table when the store is
created. We assume that this is happening at Zeek startup - and are
supremely evil and just load it synchronously. Which could block
execution for a bit for larger stores.
That being said - this might sidestep other issues that would arise when
doing this async (like scripts already inserting data).
Next step: check if this approach also works for clones.
* origin/master:
Fix shadowed variable that breaks lookup_hostname()
GH-1025: allow copying/cloning of `opaque of Broker::Store`
Fix "possibly-truncated" compiler warning in BuildJSON snprintf()
Update submodule(s)
Fixed some places where tabs became spaces
Convert to using permissions to check for access to cirrus variables in benchmark script
Integrate review feedback: improve command-line option redef parsing
Fix several issues with command-line option redefs
Remove last_access_time from TableEntryVal.
Minimize data published for expected FTP data channel analysis
Stricter checking if we have a dns field on the connection being processed
Modified the DNS protocol analyzer to add a new parameter to the dns_request event which includes the DNS query in its original case. Added a policy script that will add the original_case to the dns.log file as well. Created new btests to test both.
Place build file in explicit location for benchmarking to work correctly
cmake: Make musl support more distro agnostic
Update highwayhash submodule to upstream.
GH-998: Fix Reporter::conn_weird() to handle expired connections
Changes during merge
- Changed the policy script to use an event handler that behaves
for like the base script: &priority=5, msg$opcode != early-out,
no record field existence checks
- Also extended dns_query_reply event with original_query param
- Removed ExtractName overload, and just use default param
* 'dns-original-query-case' of https://github.com/rvictory/zeek:
Fixed some places where tabs became spaces
Stricter checking if we have a dns field on the connection being processed
Modified the DNS protocol analyzer to add a new parameter to the dns_request event which includes the DNS query in its original case. Added a policy script that will add the original_case to the dns.log file as well. Created new btests to test both.
Makes some attributes conflict with each other. This also needed the
test to change.
The test is a bit flaky - but I can, for the heck of it, not figure out
why. I am punting that for the future after spending a few hours on it.
* Some methods mistakenly returned a bool instead of QueryResult
when passed an invalid `opaque of Broker::Store` handle.
* Now generates a runtime exception for store_name() and is_closed()
calls that pass an invalid `opaque of Broker::Store` handle as any
returned value can't be reasonably used in any subsequent logic.
* Descriptions of any invalid arguments are now given in the error
message.
* Variables of `string` type can now be set to an empty string
* Trying to set a variable with non-`string` type to an empty value
now emits an error instead of silently doing nothing
* Providing an invalid identifier now emits an "unknown identifier"
error instead of silently doing nothing