When a JSON document contains key names containing colons or other
special characters that are not valid in Zeek identifiers, from_json()
cannot be used to parse such input.
This change allows a customizable normalization function.
Closes#3142.
* topic/awelzel/3112-log-suffix-left-over-log-rotation:
cluster/logger: Fix leftover-log-rotation in multi-logger setups
cluster/logger: Fix global var reference
Using break in either of the hooks allows to suppress the default reporter
error message rather than suppressing solely based on the existence of an
assertion_failure() handler.
Populating log_metadata during zeek_init() is too late for the
leftover-log-rotation functionality, so do it at script parse time.
Also, prepend archiver_ to the log_metadata table and encoding function
due to being in the global namespace and to align with the
archiver_rotation_format_func. This hasn't been in a released
version yet, so fine to rename still.
Closes#3112
* origin/topic/awelzel/cluster-at-if-removal:
test-all-policy: Do not load nodes-experimental/manager.zeek
cluster/main: Remove extra @if ( Cluster::is_enabled() )
Turns out loading this script in non-cluster mode uses Cluster::log()
and creates cluster.log in the external baselines saying "cluster
started". Do not load it into test-all-policy.zeek and instead rely
on the new test-all-policy-cluster.test to load it transitively
when running as manager for basic checking.
These have been discussed in the context of "@if &analyze" [1] and
am much in favor for not disabling/removing ~100 lines (more than
fits on a single terminal) out from the middle of a file. There's no
performance impact for having these handlers enabled unconditionally.
Also, any future work on "@if &analyze" will look at them again which
we could also skip.
This also reverts back to the behavior where the Cluster::LOG stream
is created even in non cluster setups like in previous Zeek versions.
As long as no one writes to it there's essentially no difference. If
someone does write to Cluster::LOG, I'd argue not black holing these
messages is better. Schema generators using Log::active_streams will
continue to discover Cluster::LOG even if they run in non-cluster
mode.
https://github.com/zeek/zeek/pull/3062#discussion_r1200498905
We previously would reprent port ranges from EVT files element-wise.
This can potentially generate a lot of code (all on a single line
though) which some versions of GCC seem to have trouble with, and which
also causes JIT overhead.
With this patch we switch to directly representing ranges. Single ports
are represented as ranges `[start, start]`.
Closes#3094.
* origin/topic/vern/at-if-analyze:
updates reflecting review comments
change base scripts to use run-time if's or @if ... &analyze
a number of BTests updated with @if ... &analyze
update for scripting coverage BTest demonstrating utility of @if ... &analyze
BTests for new @if ... &analyze functionality
"if ( ... ) &analyze" language feature
classes for tracking "@if (...) &analyze" notion of code being/not being "activated"
RemoveGlobal() method for Scope class + simplifying interfaces
While writing documentation about troubleshooting and looking a bit
at the older stats.log, realized we don't have the packet lag metric
exposed as metric/telemetry. Add it.
This is a Zeek instance lagging behind in network time ~6second because
it's very overloaded:
zeek_net_packet_lag_seconds{endpoint=""} 6.169406 1684848998092
Repeating the message for every new call to get_file_handle() is not
very useful. It's pretty much an analyzer configuration issue so logging
it once should be enough.
OSS-Fuzz generated traffic containing a CWD command with a single very large
path argument (427kb) starting with ".___/` \x00\x00...", This is followed
by a large number of ftp replies with code 250. The directory logic in
ftp_reply() would match every incoming reply with the one pending CWD command,
triggering path buildup ending with something 120MB in size.
Protect from re-using a directory command by setting a flag in the
CmdArg record when it was consumed for the path traversal logic.
This doesn't prevent unbounded path build-up generally, but does prevent the
amplification of a single large command with very many small ftp_replies.
Re-using a pending path command seems like a bug as well.
* jgras/topic/jgras/cluster-active-node-count-fix:
Fix get_active_node_count for node types not present.
Changed over to explicit existence check instead to avoid the set()
creation upon missed lookups.
This reflects the `spicy-plugin` code as of `d8c296b81cc2a11`.
In addition to moving the code into Zeek's source tree, this comes
with a couple small functional changes:
- `spicyz` no longer tries to infer if it's running from the build
directory. Instead `ZEEK_SPICY_LIBRARY` can be set to a custom
location. `zeek-set-path.sh` does that now.
- ZEEK_CONFIG can be set to change what `spicyz -z` print out. This is
primarily for backwards compatibility.
Some further notes on specifics:
- We raise the minimum Spicy version to 1.8 (i.e., current `main`
branch).
- Renamed the `compiler/` subdirectory to `spicyz` to avoid
include-path conflicts with the Spicy headers.
- In `cmake/`, the corresponding PR brings a new/extended version of
`FindZeek`, which Spicy analyzer packages need. We also now install
some of the files that the Spicy plugin used to bring for testing,
so that existing packages keep working.
- For now, this all remains backwards compatible with the current
`zkg` analyzer templates so that they work with both external and
integrated Spicy support. Later, once we don't need to support any
external Spicy plugin versions anymore, we can clean up the
templates as well.
- All the plugin's tests have moved into the standard test suite. They
are skipped if configure with `--disable-spicy`.
This holds off on adapting the new code further to Zeek's coding
conventions, so that it remains easier to maintain it in parallel to
the (now legacy) external plugin. We'll make a pass over the
formatting for (presumable) Zeek 6.1.
Issue #3028 tracks how a flipped connections reset a connection's value
including any state set during new_connection(). For the time being,
update community-id functionality back to the original connection_state_remove()
approach to avoid missing community_ids on flipped connections.
* amazing-pp/topic/fupeng/from_json_bif:
Implement from_json bif
Minor updates during merge: Moved ValFromJSON into zeek::detail for the
time being, removed gotos, normalized some error messages to lower case,
minimal test extension and added a raw reader input framework test reading
"json lines" as a demo, adding notes about the implicit type
conversions.
When multiple loggers are configured in a Supervisor controlled cluster
configuration, encode extra information into the rotated filename to
identify which logger produced the log.
This is similar to the approach taken for ZeekControl, re-using the
log_suffix terminology, but as there's only a single zeek-archiver
process and no postprocessors and no other side-channel for additional
information, we encode extra metadata into the filename. zeek-archiver
is extended to recognize the special metadata part of the filename.
This also solves the issue that multiple loggers in a supervisor setup
overwrite each others log files within a single log-queue directory.
* origin/topic/awelzel/smb2-state-handling:
NEWS: Add entry about SMB::max_pending_messages and state discarding
scripts/smb2-main: Reset script-level state upon smb2_discarded_messages_state()
smb2: Limit per-connection read/ioctl/tree state
DTLSv1.3 changes the DTLS record format, introducing a completely new
header - which is a first for DTLS.
We don't currently completely parse this header, as this requires a bit
more statekeeping. This will be added in a future revision. This also
also has little practical implications.
* topic/johanna/no-error-message-durning-tls-or-dtls-protocol-violations:
SSL: failing analyzer handling - address review feedback
SSL: do not try to disable failed analyzer
Also folds in minor feedback from GH-3012