This commit renames the `service_violation` column that can be added via
a policy script to `failed_service`. This expresses the intent of it
better - the column contains services that failed and were removed after
confirmation.
Furthermore, the script is fixed so it actually does this - before it
would sometimes add services to the list that were not actually removed.
In the course of this, the type of the column was changed from a vector
to an ordered set.
Due to the column rename, the policy script itself is also renamed.
Also adds a NEWS entry for the DPD changes.
This also includes some test baseline updates, due to recent QUIC
changes.
* origin/master: (39 commits)
Update doc submodule [nomail] [skip ci]
Bump cluster testsuite to pull in resilience to agent connection timing [skip ci]
IPv6 support for detect-external-names and testcase
Add `skip_resp_host_port_pairs` option.
util/init_random_seed: write_file implies deterministic
external/subdir-btest.cfg: Set OPENSSL_ENABLE_SHA1_SIGNATURES=1
btest/x509_verify: Drop OpenSSL 1.0 hack
testing/btest: Use OPENSSL_ENABLE_SHA1_SIGNATURES
Add ZAM baseline for new scripts.base.protocols.quic.analyzer-confirmations btest
QUIC/decrypt_crypto: Rename all_data to data
QUIC: Confirm before forwarding data to SSL
QUIC: Parse all QUIC packets in a UDP datagram
QUIC: Only slurp till packet end, not till &eod
Remove unused SupervisedNode::InitCluster declaration
Update doc submodule [nomail] [skip ci]
Bump cluster testsuite to pull in updated Prometheus tests
Make enc_part value from kerberos response available to scripts
Management framework: move up addition of agent IPs into deployable cluster configs
Support multiple instances per host addr in auto metrics generation
When auto-generating metrics ports for worker nodes, get them more uniform across instances.
...
This commit builds on top of GH-4183 and adds IPv6 support for
policy/protocols/dns/detect-external-names.
Additionally it adds a test-case for this file testing it with mDNS
queries.
Since the changes to port autoassignment in the preceding commits leverage agent
IP address information, we need to ensure that this information is available at
the time of autoassignment. The controller learns IP addresses from connecting
agents, and previously used that information at deploy time. This moves the
augmentation of the cluster config up to port autoassignment time.
This introduces ian options, DPD::track_removed_services_in_connection.
It adds failed services to the services column, prefixed with a
"-".
Alternatively, this commit also adds
policy/protocols/conn/failed-services.zeek, which provides the same
information in a new column in conn.log.
This changes service set in the connection record, and thus also the
conn.log service field to being ordered. Speficically, the order of the
entries in the service field will be the same order in which protocols
will be confirmed. This means that it now is possible to see which
protocols were layered over each other in which order by looking at the
respective conn.log entry.
This commit revamps the handling of analyzer violations that happen
before an analyzer confirms the protocol.
The current state is that an analyzer is disabled after 5 violations, if
it has not been confirmed. If it has been confirmed, it is disabled
after a single violation.
The reason for this is a historic mistake. In Zeek up to versions 1.5,
analyzers were unconditianally removed when they raised the first
protocol violation.
When this script was ported to the new layout for Zeek 2.0 in
b4b990cfb5, a logic error was introduced
that caused analyzers to no longer be disabled if they were not
confirmed.
This was the state for ~8 years, till the DPD::max_violations options
was added, which instates the current approach of disabling unconfirmed
analyzers after 5 violations. Sadly, there is not much discussion about
this change - from my hazy memory, I think this was discovered during
performance tests and the new behavior was added without checking into
the history of previous changes.
This commit reinstates the originally intended behavior of DPD. When an
analyzer that has not been confirmed raises a protocol violation, it is
immediately removed from the connection. This also makes a lot of sense
- this allows the analyzer to be in a "tasting" phase at the beginning
of the connection, and to error out quickly once it realizes that it was
attached to a connection not containing the desired protocol.
This change also removes the DPD::max_violations option, as it no longer
serves any purpose after this change. (In practice, the option remains
with an &deprecated warning, but it is no longer used for anything).
There are relatively minimal test-baseline changes due to this; they are
mostly triggered by the removal of the data structure and by less
analyzer errors being thrown, as unconfirmed analyzers are disabled
after the first error.
This switches the DPD logic to always log analyzers that raised a
protocol confirmation.
The logic is that, once a protocol has been confirmed - and thus there
probably is log output - it does not make sense to later remove it from
the log. It does make sense to somehow flag it as failed - but that
seems like a secondary step.
* origin/topic/johanna/gh-4061:
Update BiF-tracking, add is_event_handled
Address review comments and small updates for DNS warnings
Raise warnings when for DNS events that are not raised due to dns_skip_all_addl
224.0.0.0/24 (and 6to4 conversion 2002:e000::/40) from RFC5771 "Multicast Local Network Control Block" defined as non-routable.
239.0.0.0/8 (and 6to4 conversion 2002:ef00::/24) from RFC2365 "Administratively Scoped IP Multicast"
fec0::/10 from RFC3879 "Deprecated Site Local Addresses"
(cherry picked from commit 821ab2dbed)
By default, dns_skip_all_addl is set to false. This causes several
events to not be raised. This change emits warnings when a user defines
event handlers for events that will not be raised.
Furthermore, it adds notes about this behavior to the documentation. We
also introduce a new BIF, `is_event_handled`, which checks if an event
is handled.
Fixes GH-4061
This fixes instances where `zeek:see` was used incorrectly so it was not
rendered correctly. All these instances have been found by looking for
`zeek:see` in the generated HTML where it should not be visible anymore.
I also removed a doc reference to `paraglob_add` which never existed.
This caused confusion and I don't think it's very intuitive. If called
with a name that does not exist, this returns without a value, not even
an error value. Changing that seems like it could be more deprecation
work.
The nodes-experimental/manager.zeek file ends up calling Broker::publish()
unconditionally, resulting in a warning. Skip running that code when
generating documentation.
* origin/topic/awelzel/move-broker-to-cluster-publish:
netcontrol: Move to Cluster::publish()
openflow: Move to Cluster::publish()
netcontrol/catch-and-release: Move to Cluster::publish()
config: Move to Cluster::publish()
ssl/validate-certs: Move to Cluster::publish()
irc: Move to Cluster::publish()
ftp: Move to Cluster::publish()
dhcp: Move to cluster publish
notice: Move to Cluster::publish()
intel: Move to Cluster::publish()
sumstats: Move to Cluster::publish()
* origin/topic/awelzel/fix-cluster-publish-any:
cluster/Backend: Handle unspecified table/set
cluster: Fix Cluster::publish() of Broker::Data
cluster: Be noisy when attempting to connect to an unknown node
Mostly due to spending too much time wondering why nodes didn't connect
when there was a mismatch between "manager" and "manager-1" in the
cluster layout. Remove manager from test-all-policy-cluster test to
avoid connection attempts in this test.
...at the same time, add some `TEST-REQUIRES: have-zeromq` which
unfortunately means that developers will usually want libzmq
installed on their system.
This is a cluster backend implementation using a central XPUB/XSUB proxy
that by default runs on the manager node. Logging is implemented leveraging
PUSH/PULL sockets between logger and other nodes, rather than going
through XPUB/XSUB.
The test-all-policy-cluster baseline changed: Previously, Broker::peer()
would be called from setup-connections.zeek, causing the IO loop to be
alive. With the ZeroMQ backend, the IO loop is only alive when
Cluster::init() is called, but that doesn't happen anymore.