* origin/topic/awelzel/move-hilti-jit-parallelism-to-btest-cfg:
testing/btest: Default to HILTI_JIT_PARALLELISM=1
Revert "CI: Use ccache and a single CPU when building spicy analyzers for btests"
This is a rework of b59bed9d06 moving
HILTI_JIT_PARALLELISM=1 into btest.cfg to make it default applicable to
btest -j users (and CI).
The background for this change is that spicyz may spawn up to nproc compiler
instances by default. Combined with btest -j, this may be nproc x nproc
instances worst case. Particularly with gcc, this easily overloads CI or
local systems, putting them into hard-to-recover-from thrashing/OOM states.
Exporting HILTI_JIT_PARALLELISM in the shell allows overriding.
* cknill/topic/cknill/display_cmake_fix:
Fix for --display-cmake in configure Moved build directory creation further down in the script so that --display-cmake has a chance to happen before build tree setup.
* topic/christian/management-telemetry-additions:
Management framework: bump cluster testsuite to pull in telemetry tests
Management framework: bump zeek-client
Management framework: augment deployed configs with instance IP addresses
Management framework: add auto-enumeration of metrics ports
Management framework: propagate metrics port from agent
Management framework: add metrics port in management & Supervisor node records
Harden the telemetry manager against unset Telemetry::metrics_address
Comment-only tweaks for telemetry-related settings.
The controller learns IP addresses from agents that peer with it, but that
information has so far gotten lost when resulting configs get pushed out to the
agents. This makes these updates include that information.
This is quite redundant with the enumeration for Broker ports,
unfortunately. But the logic is subtly different: all nodes obtain a telemetry
port, while not all nodes require a Broker port, for example, and in the metrics
port assignment we also cross-check selected Broker ports. I found more unified
code actually harder to read in the end.
The logic for the two sets remains the same: from a start point, ports get
enumerated sequentially that aren't otherwise taken. These ports are assumed
available; there's nothing that checks their availability -- for now.
The default start port is 9000. I considered 9090, to align with the Prometheus
default, but counting upward from there is likely to hit trouble with the Broker
default ports (9999/9997), used by the Supervisor. Counting downward is a bit
unnatural, and shifting the Broker default ports brings subtle ordering issues.
This also changes the node ordering logic slightly since it seems more intuitive
to keep sequential ports on a given instance, instead of striping across them.
We populate that address from the ZEEK_DEFAULT_LISTEN_ADDRESS environment
variable, but weren't prepared for that not being set. We now fall back to
0.0.0.0. This may have the same IPv6 issues that we've encountered elsewhere
when doing so before (v6 interfaces need "::") -- but this is still more likely
to work than not having any string at all.
When Zeek flips roles of a HTTP connection subsequent to the HTTP analyzer
being attached, that analyzer would not update its own ContentLine analyzer
state, resulting in the wrong ContentLine analyzer being switched into
plain delivery mode.
In debug builds, this would result in assertion failures, in production
builds, the HTTP analyzer would receive HTTP bodies as individual header
lines, or conversely, individual header lines would be delivered as a
large chunk from the ContentLine analyzer.
PCAPs were generated locally using tcprewrite to select well-known-http ports
for both endpoints, then editcap to drop the first SYN packet.
Kudos to @JordanBarnartt for keeping at it.
Closes#3789
* topic/christian/supervisor-node-simplification:
Remove the Supervisor's internal ClusterEndpoint struct.
Provide a script-layer equivalent to Supervisor::__init_cluster().
This eliminates one place in which we currently need to mirror changes to the
script-land Cluster::Node record. Instead of keeping an exact in-core equivalent, the
Supervisor now treats the data structure as opaque, and stores the whole cluster
table as a JSON string.
We may replace the script-layer Supervisor::ClusterEndpoint in the future, using
Cluster::Node directly. But that's a more invasive change that will affect how
people invoke Supervisor::create() and similars.
Relying on JSON for serialization has the side-effect of removing the
Supervisor's earlier quirk of using 0/tcp, not 0/unknown, to indicate unused
ports in the Supervisor::ClusterEndpoint record.
If the script layer is able to access the current node's config via
Supervisor::node(), it can handle populating Cluster::nodes. That code
is much more straightforward than an equivalent in-core implementation
(especially with the upcoming change to the cluster table's implementation).
This introduces base/frameworks/cluster/supervisor.zeek and
Cluster::Supervisor::__init_cluster_nodes() for that purpose.
The @load of the Supervisor API in cluster/main.zeek isn't technically
necessary since we already load it explicitly even in init-bare.zeek,
but being explicit seems better.
* topic/christian/json-improvements:
Update NEWS file to cover JSON enhancements
Support JSON roundtripping via to_json()/from_json() for patterns
Support table deserialization in from_json()
Support map-based definition of ports in from_json()
Document the field_escape_pattern in the to_json() BiF
This needed a small tweak in the deserialization, since each roundtrip
would otherwise pad the prior pattern with an extra /^?(...)$?/.
This expands the language.set test to also verify serializing/unserializing for
sets, similarly to tables in the previous commit.
This allows additional data roundtripping through JSON since to_json() already
supports tables. There are some subtleties around the formatting of strings in
JSON object keys, for which this adds a bit of helper infrastructure.
This also expands the language.table test to verify the roundtrips, and adapts
bif.from_json to include a table in the test record.
The from_json() BiF and its underlying code in Val.cc currently expect ports
expressed as a string ('80/tcp' etc). Zeek's own serialization via ToJSON()
renders them as an object ('{"port":80, "proto":"tcp"}'). This adds support
for the latter format to from_json(), so serialized values can be read back.
* origin/topic/awelzel/3682-bad-pipe-op-3:
threading/Manager: Warn if threads are added after termination
iosource/Manager: Reap dry sources while computing timeout
threading/MsgThread: Decouple IO source and thread lifetimes
iosource/Manager: Do not manage lifetime of pkt_src
iosource/Manager: Honor manage_lifetime and dont_count for short-lived IO sources
The core.file-analyzer-violation test showed that it's possible to
create new threads (log writers) when Zeek is in the process of
terminating. This can result in the IO manager's deconstructor
deleting IO sources for threads that are still running.
This is sort of a scripting issue, so for now log a reporter warning
when it happens to have a bit of a bread-crumb what might be
going on. In the future it might make sense to plug APIs with
zeek_is_terminating().
MsgThread acting as an IO source can result in the situation where the
threading manager's heartbeat timer deletes a finished MsgThread instance,
but at the same time this thread is in the list of ready IO sources the
main loop is currently processing.
Fix this by decoupling the lifetime of the IO source part and properly
registering as lifetime managed IO sources with the IO manager.
Fixes#3682
Now that dry sources are properly reaped and freed, an offline packet
source would be deleted once dry, resulting in GetPktSrc() returning
a wild pointer. Don't manage the packet source lifetime and instead
free it during Manager destruction.