* origin/topic/johanna/new-style-analyzer-log:
NEWS entries for analyzer log changes
Move detect-protocol from frameworks/dpd to frameworks/analyzer
Introduce new c$failed_analyzers field
Settle on analyzer.log for the dpd.log replacement
dpd->analyzer.log change - rename files
Analyzer failure logging: tweaks and test fixes
Introduce analyzer-failed.log, as a replacement for dpd.log
Rename analyzer.log to analyzer.debug log; move to policy
Move dpd.log to policy script
detect-protocol.zeek was the last non-deprecated script left in
policy/frameworks/dpd. It was moved to policy/frameworks/analyzer. A
script that loads the script from the new location with a deprecation
warning was added.
This field is used internally to trace which analyzers already had a
violation. This is mostly used to prevent duplicate logging.
In the past, c$service_violation was used for a similar purpose -
however it has slightly different semantics. Where c$failed_analyzers
tracks analyzers that were removed due to a violation,
c$service_violation tracks violations - and doesn't care if an analyzer
was actually removed due to it.
To address review feedback in GH-4362: rename analyzer-failed-log.zeek
to loggig.zeek, analyzer-debug-log.zeek to debug-logging.zeek and
dpd-log.zeek to deprecated-dpd-log.zeek.
Includes respective test, NEWS, etc updates.
The main part of this commit are changes in tests. A lot of the tests
that previously relied on analyzer.log or dpd.log now use the new
analyzer-failed.log.
I verified all the changes and, as far as I can tell, everything
behaves as it should. This includes the external test baselines.
This change also enables logging of file and packet analyzer to
analyzer_failed.log and fixes some small behavior issues.
The analyzer_failed event is no longer raised when the removal of an
analyzer is vetoed.
If an analyzer is no longer active when an analyzer violation is raised,
currently the analyzer_failed event is raised. This can, e.g., happen
when an analyzer error happens at the very end of the connection. This
makes the behavior more similar to what happened in the past, and also
intuitively seems to make sense.
A bug introduced in the failed service logging was fixed.
Analyzer-failed.log is, essentially, the replacement for dpd.log. The
name should make more sense, as it does now log analyzer failures. For
protocol analyzers specifically, these are failures that lead to the
analyzer being disabled.
The current analyzer.log is more useful for debugging than for
operational purposes. Hence this is disabled by default, moved to a
policy script, and the log is renamed to analyzer-debug.log.
Furthermore, logging of analyzer confirmations and disabling analyzers
are now enabled by default.
This is the first phase of moving from the current dpd log to a more
modern logfile, without some of the weirdnesses that the current dpd log
contains.
Tests will not pass in the current state; this is just splitting out
functionality.
In GH-4422 it was pointed out that the protocols/conn/failed-service-logging.zeek
policy script only works when
`DPD::track_removed_services_in_connection=T` is set.
This was caused by a logic error in the script. This commit fixes this
logic error and introduces an additional test that checks that
failed-service-logging works even when the option is not set to true.
The ZeroMQ heuristic for "ready to publish" is to create an unique and
ephemeral subscription using the XSUB socket and observe it arrive on the
XPUB socket. At this point, visibility into other node's subscriptions
is provided.
Due to prefix matching, worker-1's node_topic() also matched worker-10,
worker-11, etc. Suffix the node topic with a `.`. The original implementation
came from NATS, where subjects are separated by `.`.
Adapt nodeid_topic() for consistency.
* origin/topic/johanna/dpd-changes:
DPD: failed services logging alignment
DPD: update test baselines; change options for external tests.
DPD: change policy script for service violation logging; add NEWS
DPD changes - small script fixes and renames.
Update public and private test suite for DPD changes.
Allow to track service violations in conn.log.
Make conn.log service field ordered
DPD: change handling of pre-confirmation violations, remove max_violations
DPD: log analyzers that have confirmed
IRC analyzer - make protocol confirmation more robust.
There were some special cases in which the failed-service-logging policy
script might log a service being removed that was not removed due to an
analyzer violation. This change should fix these cases.
This commit renames the `service_violation` column that can be added via
a policy script to `failed_service`. This expresses the intent of it
better - the column contains services that failed and were removed after
confirmation.
Furthermore, the script is fixed so it actually does this - before it
would sometimes add services to the list that were not actually removed.
In the course of this, the type of the column was changed from a vector
to an ordered set.
Due to the column rename, the policy script itself is also renamed.
Also adds a NEWS entry for the DPD changes.
It is not safe to use the same socket from different threads, but the
current code used the xsub socket directly from the main thread (to setup
subscriptions) and from the internal thread for polling and reading.
Leverage the PAIR socket already in use for forwarding publish operations
to the internal thread also for subscribe and unsubscribe.
The failure mode is/was a bit annoying. Essentially, closing of the
context would hang indefinitely in zmq_ctx_term().
This also includes some test baseline updates, due to recent QUIC
changes.
* origin/master: (39 commits)
Update doc submodule [nomail] [skip ci]
Bump cluster testsuite to pull in resilience to agent connection timing [skip ci]
IPv6 support for detect-external-names and testcase
Add `skip_resp_host_port_pairs` option.
util/init_random_seed: write_file implies deterministic
external/subdir-btest.cfg: Set OPENSSL_ENABLE_SHA1_SIGNATURES=1
btest/x509_verify: Drop OpenSSL 1.0 hack
testing/btest: Use OPENSSL_ENABLE_SHA1_SIGNATURES
Add ZAM baseline for new scripts.base.protocols.quic.analyzer-confirmations btest
QUIC/decrypt_crypto: Rename all_data to data
QUIC: Confirm before forwarding data to SSL
QUIC: Parse all QUIC packets in a UDP datagram
QUIC: Only slurp till packet end, not till &eod
Remove unused SupervisedNode::InitCluster declaration
Update doc submodule [nomail] [skip ci]
Bump cluster testsuite to pull in updated Prometheus tests
Make enc_part value from kerberos response available to scripts
Management framework: move up addition of agent IPs into deployable cluster configs
Support multiple instances per host addr in auto metrics generation
When auto-generating metrics ports for worker nodes, get them more uniform across instances.
...
This commit builds on top of GH-4183 and adds IPv6 support for
policy/protocols/dns/detect-external-names.
Additionally it adds a test-case for this file testing it with mDNS
queries.
Since the changes to port autoassignment in the preceding commits leverage agent
IP address information, we need to ensure that this information is available at
the time of autoassignment. The controller learns IP addresses from connecting
agents, and previously used that information at deploy time. This moves the
augmentation of the cluster config up to port autoassignment time.