When checking exported Spicy types for collisions with existing Zeek
types, we previously also checked whether they collide with names in
global scope, i.e., we didn't provide a `no_global` arg to
`detail::lookup_ID`, which defaulted to false (since we also provided a
module name, I'd argue that the behavior of that function is confusing
and probably error-prone, as seen here).
This meant that, e.g., a Spicy enum `foo::Direction` (automatically in
implicit Spicy module scope) would be detected as colliding with the
existing Zeek `Direction` enum.
With this patch we use the `lookup_ID` API correctly and do not check
against potential collisions with globals anymore since it is not
needed.
Closes #3279.
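The effect of the `no_global` flag can be illustrated with a small model. This is a sketch in Python, not Zeek's C++ code: the function `lookup_id`, the set `zeek_ids`, and the way names are stored are all hypothetical and only mirror the behavior described above.

```python
# Hypothetical model of a module-scoped identifier lookup.
# Globals are stored as bare names, module members as "module::name".
def lookup_id(known, name, module, no_global=False):
    """Return the identifier that `name` resolves to, or None."""
    qualified = f"{module}::{name}"
    if qualified in known:
        return qualified
    # Fallback to global scope unless the caller opts out.
    if not no_global and name in known:
        return name
    return None


zeek_ids = {"Direction", "Conn::Info"}

# Old behavior: the global `Direction` is reported as a collision
# for the Spicy type `foo::Direction`.
old_result = lookup_id(zeek_ids, "Direction", "foo")

# New behavior: only the `foo` module scope is consulted.
new_result = lookup_id(zeek_ids, "Direction", "foo", no_global=True)
```

In this model `old_result` is the global name while `new_result` is `None`, which is the difference between the false collision and the corrected check.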
Setting this option to false does not count missing bytes in files towards the
extraction limits, and allows extracting data up to the desired limit,
even when partial files are written.
When missing bytes are encountered, files are now written as sparse
files.
Using this option requires the underlying storage and utilities to support
sparse files.
(cherry picked from commit afa6f3a0d3b8db1ec5b5e82d26225504c2891089)
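As an illustration of the sparse-file approach, the following Python sketch skips over a gap with a seek instead of writing filler bytes. The helper `write_with_gap` and the file name are hypothetical; whether the resulting file actually occupies less disk space depends on the filesystem.

```python
import os
import tempfile


def write_with_gap(path, first, gap, second):
    """Write `first`, skip `gap` missing bytes, then write `second`.

    Returns the number of payload bytes actually written, which is what
    would count towards a limit when missing bytes are not counted.
    """
    with open(path, "wb") as f:
        f.write(first)
        f.seek(gap, os.SEEK_CUR)  # leave a hole instead of writing zeros
        f.write(second)
    return len(first) + len(second)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "extracted.bin")
    payload_bytes = write_with_gap(path, b"HEAD", 4096, b"TAIL")
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        data = f.read()
```

The apparent file size includes the gap, while only the payload bytes were written; reading the hole back yields zero bytes.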
OSS Fuzz generated a CWD request and reply followed by very many EPRT
requests. This caused Zeek to re-log the CWD request and invoke `build_url_ftp()`
over and over again resulting in long processing times.
Avoid this scenario by not logging commands that aren't pending anymore.
(cherry picked from commit b05dd31667ff634ec7d017f09d122f05878fdf65)
* origin/topic/vern/script-opt-maint.Sep23:
fix for ZAM statement-level profiling (broken by GH-3199)
ZAM fixes for compatibility with GH-3249 changes
-O gen-C++ fixes for compatibility with GH-3249 changes
minor -O gen-C++ BTest updates
minor BTest reordering to diminish differences with script optimization
Currently, loop vars are added to a function scope's inits and
initialized upon entering a function with default values. This
applies to vector, record and table types.
This is unnecessary for variables used in for loops as they are
guaranteed to be initialized while iterating.
Initializing fields of recovered records ran the &default expressions
of those fields, only for the fields to be re-assigned in the next step
with the recovered values. The second test case still shows that the loop
var is initialized as well, even though that's not needed.
Add tests for iterating over records with &default attributes for both
tables and vectors.
Fixes #3267
* origin/topic/jazoff/gh-3268:
Fix check for emailed notices
Changes: Added a test-case printing email_delay_tokens to compare email vs
non-email notice types. Previously, both notice types would have email
delay tokens at that point in the flow.
The diffs produced by telemetry.log when introducing a weird or
removing/adding protocol-specific logs are overwhelming and distracting
without providing value. Exclude telemetry.log, similar to how we already
exclude stats.log.
Some more targeted telemetry.log tests exist in the normal testing/btest
suite and that appears more sensible.
...except for clang-format, because versions after v13.0.0 have
borked the Whitesmith formatting. Also moves yapf from
pre-commit/mirrors-yapf to google/yapf.
When http_reply events are received before http_request events, either
through faking traffic or possible re-ordering, it is possible to trigger
unbounded state growth due to later http_requests never being matched
again with responses.
Prevent this by synchronizing request/response counters when late
requests come in.
Also forcefully flush pending requests when http_replies are never
observed, either due to the analyzer having been disabled or because
of half-duplex traffic.
Fixes #1705
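The state growth and one possible form of counter synchronization can be modelled as follows. This Python sketch is an assumption-laden simplification, not the Zeek script change itself: `HttpState`, `on_request`, `on_reply` and the `sync` switch are hypothetical names, and the real scripts track more state.

```python
from dataclasses import dataclass, field


@dataclass
class HttpState:
    current_request: int = 0
    current_response: int = 0
    pending: dict = field(default_factory=dict)


def on_request(state, uri, sync=True):
    # Assumed fix: if replies already ran ahead, catch the request
    # counter up so the next reply can match this request.
    if sync and state.current_request < state.current_response:
        state.current_request = state.current_response
    state.current_request += 1
    state.pending[state.current_request] = uri


def on_reply(state):
    state.current_response += 1
    # A matched request is logged and removed from the pending table.
    state.pending.pop(state.current_response, None)


def leftover_after_early_replies(sync):
    state = HttpState()
    for _ in range(3):  # replies seen before any request
        on_reply(state)
    for i in range(100):  # late requests, each followed by a reply
        on_request(state, f"/page{i}", sync=sync)
        on_reply(state)
    return len(state.pending)
```

Without synchronization every late request stays in `pending` because its number is never reached by the response counter again; with synchronization the table drains.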
* topic/awelzel/3235-dont-flip-broadcasts:
testing: Bump external test suite
dhcp: Handle is_orig=T for connections from server to 255.255.255.255
IPBasedAnalyzer: Don't flip connections when destination is broadcast
This works around the new semantics of is_orig=T for "connections"
from DHCP servers to broadcast addresses. IMO, having the server address
as originator in the conn.log is still more intuitive.
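A rough model of the flipping decision, written in Python for illustration only. The port-based heuristic, the set `LIKELY_SERVER_PORTS`, and the function `should_flip` are assumptions; the sketch also only recognizes the limited broadcast address 255.255.255.255.

```python
import ipaddress

# Hypothetical stand-in for a "likely server ports" table.
LIKELY_SERVER_PORTS = {67}
LIMITED_BROADCAST = ipaddress.ip_address("255.255.255.255")


def should_flip(src_port, dst_ip, dst_port):
    """Decide whether originator and responder should be swapped."""
    if ipaddress.ip_address(dst_ip) == LIMITED_BROADCAST:
        # Never flip towards broadcast: the sender stays the originator.
        return False
    return (src_port in LIKELY_SERVER_PORTS
            and dst_port not in LIKELY_SERVER_PORTS)
```

Under these assumptions a DHCP server sending from port 67 to 255.255.255.255:68 remains the originator, while the same packet sent to a unicast address would still be flipped.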
* origin/topic/vern/script-opt-maint.Aug23:
updated notes regarding "-O gen-C++" maintenance
"-O gen-C++" support for "assert" statements
addressed some nits re "-O gen-C++" script optimization
fixes for compiling lambdas to C++
fixes to avoid ambiguities in analyzing captures for script optimization
disambiguate lambdas by adding scoping and consideration of captures
addressed performance and correctness issues flagged by Coverity
Using pcaps from https://interop.seemann.io/ as samples for QUIC protocol
data didn't produce a conn.log for the contained data. `tcpdump -r`
and Wireshark do show the contained IP/UDP packets. Teach Zeek how
to handle link type DLT_PPP 0x09 using a new PPP analyzer based on the
PPPSerial analyzer code.
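For orientation, a DLT_PPP frame can be decoded roughly like this. The Python sketch is not the analyzer's code: `parse_ppp` and `PPP_PROTOCOLS` are hypothetical names, only the IPv4 and IPv6 protocol numbers are mapped, and protocol-field compression is not handled.

```python
import struct

# PPP protocol numbers for the two network-layer payloads of interest.
PPP_PROTOCOLS = {0x0021: "IPv4", 0x0057: "IPv6"}


def parse_ppp(frame: bytes):
    """Return (protocol name, payload) for a DLT_PPP frame."""
    offset = 0
    # HDLC-like framing prefixes the PPP header with address/control bytes.
    if frame[:2] == b"\xff\x03":
        offset = 2
    if len(frame) < offset + 2:
        raise ValueError("frame too short for a PPP protocol field")
    (proto,) = struct.unpack_from("!H", frame, offset)
    name = PPP_PROTOCOLS.get(proto, f"unknown(0x{proto:04x})")
    return name, frame[offset + 2:]
```

The returned payload is what would be handed on to the IP layer.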
Usual update to files/x509 baseline after adding new analyzer due
to enum values changing.
This change makes the community-id script that adds the community id to
notice.log automatically load the main script if this was not already
loaded.
In the past, the script just did not perform any action if the main
script was not loaded.
This change also makes the notice script respect the seed/base64
settings that were set in the main script.
Fixes GH-3242
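To show why the seed and base64 settings must match between the main script and the notice script, here is a sketch of a Community ID v1 computation for TCP/UDP flows in Python. It follows the published Community ID specification as understood here and is not Zeek's implementation; the function name and parameters are chosen for illustration, and ICMP handling is omitted.

```python
import base64
import hashlib
import ipaddress
import struct


def community_id(src_ip, dst_ip, src_port, dst_port,
                 proto=6, seed=0, do_base64=True):
    src = ipaddress.ip_address(src_ip).packed
    dst = ipaddress.ip_address(dst_ip).packed
    # Order the endpoints so both directions hash to the same value.
    if (src, src_port) > (dst, dst_port):
        src, dst = dst, src
        src_port, dst_port = dst_port, src_port
    h = hashlib.sha1()
    h.update(struct.pack("!H", seed))
    h.update(src)
    h.update(dst)
    h.update(struct.pack("!BB", proto, 0))  # protocol plus one pad byte
    h.update(struct.pack("!HH", src_port, dst_port))
    if do_base64:
        digest = base64.b64encode(h.digest()).decode()
    else:
        digest = h.hexdigest()
    return "1:" + digest


a = community_id("10.0.0.1", "10.0.0.2", 34855, 80)
b = community_id("10.0.0.2", "10.0.0.1", 80, 34855)
seeded = community_id("10.0.0.1", "10.0.0.2", 34855, 80, seed=1)
as_hex = community_id("10.0.0.1", "10.0.0.2", 34855, 80, do_base64=False)
```

Both directions of a flow yield the same value, while a different seed or encoding yields a different string, so values in notice.log would not line up with other logs if the settings diverged.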
* origin/topic/robin/spicy-export-extensions:
[Spicy] Clean up representation of EVT record fields.
[Spicy] Extend functionality of `export` in EVT files.
[Spicy] Refactor parsing of `export` in EVT files.
We now support selecting which fields of a unit type get exported into
the automatically created Zeek record, as well as adding a `&log`
attribute automatically, either to all fields or only to selected
fields.
Syntax:
- To export only selected fields:
export Foo::X with { field1, field3 };
- To export all but selected fields:
export Foo::X without { field2, field3 };
- To `&log` all fields:
export Foo::X &log;
- To `&log` only selected fields:
export Foo::X with { field1 &log, field3 }; # exports (only) field1 and field3, and marks field1 for logging
Syntax is still subject to change.
Closes #3218.
Closes #3219.
* origin/topic/timw/3059-set-vector-conversion:
Fix conversion with record types
Add conversion between set and vector using 'as' keyword
Add std::move for a couple of variables passed by value
This is based on the discussion in zeek/zeek#2668. Using &default with tables
can be confusing as the default value is not inserted. The following example
prints an empty table at the end even though a new Service record was
instantiated.
type Service: record {
    occurrences: count &default=0;
    last_seen: time &default=network_time();
};
global services: table[string] of Service &default=Service();
event zeek_init()
    {
    services["http"]$occurrences += 1;
    services["http"]$last_seen = network_time();
    print services;
    }
Changing the above &default to &default_insert will insert the newly created
default value upon a missed lookup, which is less surprising.
Other examples that caused confusion previously revolved around table of sets
or table of vectors and `add` or `+=` not working as expected.
tbl_of_vector["http"] += 1;
add tbl_of_set["http"][1];
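The difference between the two attributes has a close analogue in Python, sketched here purely as an illustration. `DefaultNoInsert` is a hypothetical class mimicking the &default behavior described above, and `collections.defaultdict` stands in for &default_insert.

```python
from collections import defaultdict


class Service:
    def __init__(self):
        self.occurrences = 0


class DefaultNoInsert(dict):
    """A missed lookup returns a fresh default but does not store it."""

    def __missing__(self, key):
        return Service()


# Analogue of &default: the increment lands on a temporary value.
no_insert = DefaultNoInsert()
no_insert["http"].occurrences += 1

# Analogue of &default_insert: the default is stored on first access.
inserting = defaultdict(Service)
inserting["http"].occurrences += 1
```

After running this, `no_insert` is still empty, while `inserting` holds one `Service` whose counter was incremented.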
This is similar to GH-3206. There do not seem to be practical
consequences - but we should still fix it.
This also includes the udp-testcase that was forgotten in GH-3206.
This marks every identifier used within an attribute as a seed. The scenario
this avoids is functions referenced through attributes on unused tables or
record types (&default, &expire_func, ...) being dinged as unused, which
is rather confusing.
Also adds a test for the above and a light smoke test into language/ as it
doesn't appear we had coverage here.
Closes #3122
* origin/topic/vern/zam-memory-reduction:
Baseline "-a zam" update
increase BTest wait time to abide ZAM compilation times
avoid script coverage overhead (especially memory) when using ZAM
fixes for correctly tracking which functions have been fully inlined
support for discarding ASTs once compiled via ZAM script optimization
some code simplifications and streamlining
The input framework currently gives a rather opaque error message when
encountering a line in which a required value is not provided. This
change updates this behavior; the error message now provides the record
element (or the name or the index element) which was not set in the
input data, even though it is required to be set by the underlying Zeek
type.
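The kind of error message described here can be sketched as follows. This Python model is hypothetical: `parse_line`, `FIELDS` and the message wording are invented for illustration, and it assumes tab-separated input with `-` marking an unset value.

```python
# Hypothetical schema: (field name, required?)
FIELDS = [("host", True), ("port", True), ("comment", False)]


def parse_line(line, fields):
    """Parse one tab-separated line, naming any missing required field."""
    values = line.rstrip("\n").split("\t")
    # Treat absent trailing columns as unset.
    values += ["-"] * (len(fields) - len(values))
    record = {}
    for (name, required), value in zip(fields, values):
        if value == "-":
            if required:
                raise ValueError(
                    f"required field '{name}' is not set in line {line!r}")
            record[name] = None
        else:
            record[name] = value
    return record


ok = parse_line("10.0.0.1\t80\t-", FIELDS)

try:
    parse_line("10.0.0.1\t-\tmissing port", FIELDS)
    message = ""
except ValueError as err:
    message = str(err)
```

The point of the sketch is that `message` names the offending field (`port`) instead of reporting only that the line could not be read.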