remove instance of plus sign to account for a real plus in SQL
account for spaces being encoded as plus signs in SQLi regex detection
add test cases for SQLi space-to-plus encoding
account for spaces being encoded as plus signs in SQLi regex detection
forgot semicolon
account for spaces being encoded as plus signs in SQLi regex detection
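A minimal sketch of the idea (a simplified, hypothetical pattern, not the
actual detect-sqli.zeek regex): treat a literal "+" as just another
encoding of a space when looking for SQL keywords in URIs.

    # Hypothetical, simplified patterns for illustration only: a space in
    # a URI may arrive as a blank, an encoded "+", or "%20".
    const sqli_space_sketch = /([[:blank:]]|\+|%20)/ &redef;
    # E.g., match "...'+OR+1=1" as well as "...' OR 1=1".
    const sqli_or_sketch = /['"]([[:blank:]]|\+|%20)*(OR|or)([[:blank:]]|\+|%20)+[0-9]+=[0-9]+/ &redef;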
* topic/christian/telemetry-make-bifs-primary:
Telemetry framework: move BIFs to the primary-bif stage
Minor comment tweaks for init-frameworks-and-bifs.zeek
Adding a metric for the network time value itself should make it
possible to observe it stopping, or growing slowly compared to real
time, when Zeek isn't able to keep up.
Also, modify the telemetry/log.zeek test to include misc/stats and
log at a higher frequency with a more interesting pcap.
This stops invoking Telemetry::sync() via a scheduled event and instead
only invokes it on demand. This makes metric collection independent of
network time and lazier, too.
With Prometheus scrape requests being processed on Zeek's main thread
now, we can safely invoke the script layer Telemetry::sync() hook.
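A minimal sketch of what a handler for that hook can look like (the
gauge and its names are hypothetical, purely for illustration):

    # Hypothetical gauge, illustration only.
    global pending_gf = Telemetry::register_gauge_family([
        $prefix="app",
        $name="pending_items",
        $help_text="Number of currently queued items (illustration)"]);

    global pending: set[string];

    hook Telemetry::sync()
        {
        # Runs only when metrics are actually collected (e.g., on a
        # Prometheus scrape), independent of network time.
        Telemetry::gauge_set(Telemetry::gauge_with(pending_gf), 1.0 * |pending|);
        }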
Closes #3947
This moves the Telemetry framework's BIF-defined functionality from the
secondary-BIFs stage to the primary one. That is, this functionality is now
available from the end of init-bare.zeek, not only after the end of
init-frameworks-and-bifs.zeek.
This allows us to use script-layer telemetry in Zeek's own code that gets
pulled in during init-frameworks-and-bifs.
This change splits up the BIF features into functions, constants, and types,
because that's the granularity most workable in Func.cc and NetVar. It also now
defines the Telemetry::MetricsType enum once, rather than redundantly in both
BIFs and the script layer.
Due to subtle load ordering issues between the telemetry and cluster frameworks,
this pushes the redef stage of Telemetry::metrics_port and address into
base/frameworks/telemetry/options.zeek, which is loaded sufficiently late in
init-frameworks-and-bifs.zeek to sidestep those issues. (When not doing this,
the effect is that the redef in telemetry/main.zeek doesn't yet find the
cluster-provided values, and Zeek does not end up listening on these ports.)
The need to add basic Zeek headers in script_opt/ZAM/ZBody.cc as a side-effect
of this is curious, but looks harmless.
Also includes baseline updates for the usual btests and adds a few doc strings.
Log flushing is currently triggered based on the threading heartbeat timer
of WriterBackends and the hard-coded WRITE_BUFFER_SIZE of 1000.
This change introduces a separate timer that is managed by the logger
manager instead of piggy-backing on the heartbeat timer, as well as a
const &redef for the buffer size.
This allows modifying the log flush frequency and batch size independently
of the threading heartbeat interval. Later, this will allow re-using the
buffering and flushing logic of writer frontends for non-Broker cluster
backends, too.
One change here is that even frontends that do not have a backend will
be flushed regularly. This is wanted for non-Broker backends and should be
very cheap. Possibly, Broker can piggyback on this timer down the road, too,
rather than using its own script-level timer (see Broker::log_flush()).
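A sketch of how the two knobs could then be tuned independently (the
option names here are assumed for illustration):

    # Hypothetical option names: flush buffered writes more frequently and
    # in smaller batches, without touching the threading heartbeat interval.
    redef Log::flush_interval = 250 msec;
    redef Log::write_buffer_size = 500;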
The cmds list may grow unbounded due to the POP3 analyzer being in
multiLine mode after seeing `AUTH` in a Redis connection, but never
a `.` terminator. This can easily be provoked by the Redis ping
command.
This adds two heuristics: 1) Forcefully process the oldest commands in
the cmds list and cap it at max_pending_commands. 2) Start raising
analyzer violations if the client has been using more than
max_unknown_client_commands commands (default 10).
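A sketch of adjusting the new limits (assuming they live in the POP3
module, with hypothetical values):

    # Assumed option names based on the description above: allow more
    # queued commands, but tolerate fewer unknown client commands before
    # raising a violation.
    redef POP3::max_pending_commands = 50;
    redef POP3::max_unknown_client_commands = 5;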
Closes #3936
Remove the overhead of unconditionally calling remove_teredo_connection()
for *every* connection by installing a connection removal hook only when
state was allocated.
This adds a protocol parser for the PostgreSQL protocol and a new
postgresql.log similar to the existing mysql.log.
This should be considered preliminary; hopefully, with feedback from the
community, we can improve on the events and logs during 7.1 and 7.2.
Even if most PostgreSQL communication is encrypted in the real world, this
will at a minimum allow monitoring of the SSLRequest and handing off further
analysis to the SSL analyzer.
This originates from github.com/awelzel/spicy-postgresql, with lots of
polishing happening in the past two days.
If the password contains a colon, the current implementation would only
log the part before the first colon (e.g., the password
`password:password` would be logged as `password`).
A test has been added to confirm the expected behaviour.
It turns out that, for probably a long time, we have reported an
incorrect version when parsing an SSLv2 client hello. We always reported
this as SSLv2, no matter which version the client hello actually
contained.
This bug probably went unnoticed for a long time, as SSLv2 is
essentially unused nowadays, and as this field does not show up in the
default logs.
This was found due to a baseline difference when writing the Spicy SSL
analyzer.
This avoids the earlier problem of not tracking ports correctly in
scriptland, while still supporting `port` in EVT files and `%port` in
Spicy files.
As it turns out, we are already following the same approach for file
analyzers' MIME types, so I'm applying the same pattern: it's one
event per port, without further customization points. That leaves the
patch pretty small after all while fixing the original issue.
* origin/topic/johanna/ssl-history-also-for-sslv2-not-only-for-things-that-use-the-more-modern-handshake:
Make ssl_history work for SSLv2 handshakes/connections
* origin/topic/vern/zam-regularization: (33 commits)
simpler and more robust identification of function parameters for AST profiling
fixes to limit AST traversal in the face of recursive types
address some script optimization compiler warnings under Linux
fix for -O C++ construction of variable names that use multiple module namespaces
fix for script optimization of "opaque" values that are run-time constants
fix for script optimization of nested switch statements
script optimization fix for complex "in" expressions in conditionals
updates to typos allow-list reflecting ZAM regularization changes
BTest updates for ZAM regularization changes
convert new ZAM operations to use typed operands
complete migration of ZAM to use only public ZVal methods
"-O validate-ZAM" option to validate generated ZAM instructions
internal option to suppress control-flow optimization
exposing some functionality for greater flexibility in structuring run-time execution
rework ZAM compilation of type switches to leverage value switches
add tracking of control flow information
factoring of ZAM operation specifications into separate files
updates to ZAM operations / gen-zam regularization, other than the operations themselves
type-checking fix for vector-of-string operations
ZVal constructor for booleans
...
This reworks the parser such that COM_CHANGE_USER switches the
connection back into the CONNECTION_PHASE so that we can remove the
EXPECT_AUTH_SWITCH special case in the COMMAND_PHASE. Adds two pcaps
produced with Python that actually do COM_CHANGE_USER, as it does not
seem possible from the MySQL CLI.
It turns out that the ssl_history field was never populated with C/S for
SSLv2 connections, or connections using the SSLv2 handshake. In our
test cases, the latter is especially common, with connections up to TLS1
using the old SSLv2 client hello for backwards compatibility.
This change resolves this issue. As the history is not enabled by default
in a lot of locations, the baseline impact is minor.
Initial fuzzing caused a bind response to arrive before a bind request,
resulting in an unset field expression error:
expression error in base/protocols/ldap/main.zeek, line 270: field value missing (LDAP::m$opcode)
Prevent this by ensuring m$opcode is set and raising instead.
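Roughly, the guard looks like this (a sketch, not the exact main.zeek
code; the reporting call is an assumption):

    # Bail out (and flag the oddity) instead of dereferencing the missing
    # optional field.
    if ( ! m?$opcode )
        {
        Reporter::conn_weird("LDAP_missing_opcode", c, "response before request");
        return;
        }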
This avoids the callbacks being processed on the worker thread spawned
by Civetweb. It fixes data race issues with lookups involving
global variables, amongst other threading issues.
PCAP was produced with a local OpenLDAP server configured to support StartTLS.
This puts the Zeek calls into a separate ldap_zeek.spicy file/module
to separate it from LDAP.
With Cluster::Node$metrics_port being optional, there's not really
a need for the extra script. New rule: if a metrics_port is set, the
node will attempt to listen on it.
Users can still redef Telemetry::metrics_port *after*
base/frameworks/telemetry was loaded to change the port defined
in cluster-layout.zeek.
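For example (the port value here is chosen arbitrarily):

    # Must come after the telemetry framework is loaded so the redef
    # overrides the value picked up from cluster-layout.zeek.
    @load base/frameworks/telemetry
    redef Telemetry::metrics_port = 9091/tcp;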
The controller learns IP addresses from agents that peer with it, but that
information has so far gotten lost when resulting configs get pushed out to the
agents. This makes these updates include that information.
This is quite redundant with the enumeration for Broker ports,
unfortunately. But the logic is subtly different: all nodes obtain a telemetry
port, while not all nodes require a Broker port, for example, and in the metrics
port assignment we also cross-check selected Broker ports. I found more unified
code actually harder to read in the end.
The logic for the two sets remains the same: starting from a given port,
ports that aren't otherwise taken get assigned sequentially. These ports are
assumed available; there's nothing that checks their availability -- for now.
The default start port is 9000. I considered 9090, to align with the Prometheus
default, but counting upward from there is likely to hit trouble with the Broker
default ports (9999/9997), used by the Supervisor. Counting downward is a bit
unnatural, and shifting the Broker default ports brings subtle ordering issues.
This also changes the node ordering logic slightly, since it seems more intuitive
to keep sequential ports on a given instance instead of striping across instances.
This eliminates one place in which we currently need to mirror changes to the
script-land Cluster::Node record. Instead of keeping an exact in-core equivalent, the
Supervisor now treats the data structure as opaque, and stores the whole cluster
table as a JSON string.
We may replace the script-layer Supervisor::ClusterEndpoint in the future, using
Cluster::Node directly. But that's a more invasive change that will affect how
people invoke Supervisor::create() and similar functions.
Relying on JSON for serialization has the side-effect of removing the
Supervisor's earlier quirk of using 0/tcp, not 0/unknown, to indicate unused
ports in the Supervisor::ClusterEndpoint record.
If the script layer is able to access the current node's config via
Supervisor::node(), it can handle populating Cluster::nodes. That code
is much more straightforward than an equivalent in-core implementation
(especially with the upcoming change to the cluster table's implementation).
This introduces base/frameworks/cluster/supervisor.zeek and
Cluster::Supervisor::__init_cluster_nodes() for that purpose.
The @load of the Supervisor API in cluster/main.zeek isn't technically
necessary since we already load it explicitly even in init-bare.zeek,
but being explicit seems better.