This needed a small tweak in the deserialization, since each roundtrip
would otherwise pad the prior pattern with an extra /^?(...)$?/.
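A minimal sketch of the idea, assuming the serialized text carries one layer of the `^?(` ... `)$?` wrapper. The helper name `strip_pattern_wrapper` is hypothetical and not the actual Zeek code.

```cpp
#include <string>
#include <string_view>

// Hypothetical helper: remove one layer of the "^?(" ... ")$?" wrapper if
// present, so that re-wrapping on construction does not nest the pattern
// one level deeper on every roundtrip.
std::string strip_pattern_wrapper(std::string_view text) {
    constexpr std::string_view prefix = "^?(";
    constexpr std::string_view suffix = ")$?";

    if (text.size() >= prefix.size() + suffix.size() &&
        text.substr(0, prefix.size()) == prefix &&
        text.substr(text.size() - suffix.size()) == suffix)
        return std::string(
            text.substr(prefix.size(), text.size() - prefix.size() - suffix.size()));

    // No wrapper found: return the text unchanged.
    return std::string(text);
}
```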
This expands the language.set test to also verify serializing/unserializing for
sets, similarly to tables in the previous commit.
This allows additional data roundtripping through JSON since to_json() already
supports tables. There are some subtleties around the formatting of strings in
JSON object keys, for which this adds a bit of helper infrastructure.
This also expands the language.table test to verify the roundtrips, and adapts
bif.from_json to include a table in the test record.
The from_json() BiF and its underlying code in Val.cc currently expect ports
expressed as a string ('80/tcp' etc). Zeek's own serialization via ToJSON()
renders them as an object ('{"port":80, "proto":"tcp"}'). This adds support
for the latter format to from_json(), so serialized values can be read back.
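As a rough illustration of accepting both forms, the sketch below uses a `std::variant` in place of a real JSON value. The types `Port`, `PortObject`, and `JsonPort` and the function `port_from_json` are made up for this example and do not mirror the code in Val.cc.

```cpp
#include <cstddef>
#include <optional>
#include <stdexcept>
#include <string>
#include <variant>

struct Port {
    unsigned number;
    std::string proto;
};

// Stand-in for the object form {"port":80, "proto":"tcp"}.
struct PortObject {
    unsigned port;
    std::string proto;
};

// Stand-in for a JSON value that is either a string or an object.
using JsonPort = std::variant<std::string, PortObject>;

std::optional<Port> port_from_json(const JsonPort& v) {
    // Object form: take the fields as they are.
    if (const auto* obj = std::get_if<PortObject>(&v))
        return Port{obj->port, obj->proto};

    // String form: expect "<number>/<proto>".
    const auto& s = std::get<std::string>(v);
    const auto slash = s.find('/');
    if (slash == std::string::npos || slash == 0 || slash + 1 == s.size())
        return std::nullopt;

    try {
        std::size_t pos = 0;
        const unsigned long n = std::stoul(s.substr(0, slash), &pos);
        if (pos != slash || n > 65535)
            return std::nullopt;
        return Port{static_cast<unsigned>(n), s.substr(slash + 1)};
    } catch (const std::exception&) {
        // Not a number, or out of range for unsigned long.
        return std::nullopt;
    }
}
```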
By not using `broker::data` directly, we gain a degree of freedom
that allows us to swap out `broker::data` for something else (e.g.,
`broker::variant`) in the future. Furthermore, it also helps us to keep
Broker types "local" to the Broker manager and gives us a nicer
interface.
Also replaces uses of `broker::expected` with `std::optional`. While an
`expected` can carry additional information as to why a value is not
present, nothing in Zeek ever cared about that. Hence, using
`std::optional` removes an unnecessary dependency on a Broker detail
while also being more efficient (no extra heap allocation when no value
is present).
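A small sketch of the resulting calling convention, using a hypothetical `lookup` function over a plain `std::map` rather than any real Broker store API:

```cpp
#include <map>
#include <optional>
#include <string>

// Hypothetical lookup: callers only learn whether a value exists, not why
// it is missing, which is all the call sites described above needed.
std::optional<int> lookup(const std::map<std::string, int>& store,
                          const std::string& key) {
    auto it = store.find(key);
    if (it == store.end())
        return std::nullopt; // empty optional, no heap allocation involved
    return it->second;
}
```

Callers can then test the result directly, e.g. `if ( auto v = lookup(store, "x") ) use(*v);`, where `use` is likewise a placeholder.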
Provide a script-accessible way to introspect the DFA stats that can be
leveraged to gather runtime statistics of the underlying DFA. This
re-uses the existing MatcherStats used by ``get_matcher_stats()``.
Seems we can just open code the CompileSet() usage in the TablePatternMatcher
helper without indirecting through another class. Further, add the collection
of indices into MatchAll() rather than duplicating its code in
MatchDisjunction(). Doesn't seem like MatchAll() is used widely.
Anchors within patterns passed to sub() or gsub() were previously ignored,
replacing any occurrence of '<text>' even when '^<text>' was used as a
pattern.
This is a pretty user-visible change (and we even have anchored patterns
within the base scripts), but seems "the right thing to do".
Relates to #3455
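To illustrate the intended semantics only, here is a sketch using `std::regex` as a stand-in for Zeek's own pattern matcher. The names `sub_first` and `gsub_all` are invented for the example and are not the BiF implementations.

```cpp
#include <regex>
#include <string>

// Replace only the first match, roughly what sub() does.
std::string sub_first(const std::string& s, const std::string& re,
                      const std::string& repl) {
    return std::regex_replace(s, std::regex(re), repl,
                              std::regex_constants::format_first_only);
}

// Replace every match, roughly what gsub() does.
std::string gsub_all(const std::string& s, const std::string& re,
                     const std::string& repl) {
    return std::regex_replace(s, std::regex(re), repl);
}
```

With the anchor honored, `gsub_all("foo foo", "^foo", "bar")` only rewrites the leading occurrence, while the unanchored pattern `foo` rewrites both.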
Add a new overload to `copy_string` that takes the input characters plus
size. The new overload avoids inefficient scanning of the input for the
null terminator in cases where we know the size beforehand. Furthermore,
this overload *must* be used when dealing with input character sequences
that may have no null terminator, e.g., when the input is from a
`std::string_view` object.
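A sketch of what such a sized overload might look like. The name `copy_string_n` and the `std::unique_ptr<char[]>` return type are choices made for this example, not the actual signature in Zeek.

```cpp
#include <cstddef>
#include <cstring>
#include <memory>
#include <string_view>

// Hypothetical sized copy: never reads past `len` bytes of `str`, so the
// input does not need to be null-terminated. The result always is.
std::unique_ptr<char[]> copy_string_n(const char* str, std::size_t len) {
    auto out = std::make_unique<char[]>(len + 1); // zero-initialized
    if (len > 0)
        std::memcpy(out.get(), str, len);
    out[len] = '\0';
    return out;
}
```

For a `std::string_view sv`, the call would be `copy_string_n(sv.data(), sv.size())`, which stays within the view even when the view is a slice of a larger buffer.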
This largely copies over Spicy's `.clang-format` configuration file. The
one place where we deviate is header include order since Zeek depends on
headers being included in a certain order.
Similar motivation as for RecordVal, save an extra malloc/free
and pointer indirection.
This breaks the `auto& RawVec()` API, which previously returned
a reference to the std::vector*. It now returns a reference
to the vector instead. It's commented as intended for internal
and compiled code, even though it's public API.
The previous `std::vector<std::optional<ZVal>>*&` return type was also very
likely not intended (all consumers just dereference it anyhow). I'm certain
this API was never meant to modify the actual pointer value.
I've switched to explicit typing, too.
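The shape of the change can be sketched as follows. `VectorValSketch` is a made-up class and `int` stands in for ZVal, so this shows the ownership layout only and not the real VectorVal.

```cpp
#include <optional>
#include <vector>

class VectorValSketch {
public:
    using Elements = std::vector<std::optional<int>>; // int stands in for ZVal

    // Explicitly typed accessors returning a reference to the vector itself,
    // rather than a reference to a pointer to it.
    Elements& RawVec() { return vec; }
    const Elements& RawVec() const { return vec; }

private:
    // Stored by value: no separate allocation and no extra pointer hop.
    Elements vec;
};
```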
* origin/topic/timw/3059-set-vector-conversion:
Fix conversion with record types
Add conversion between set and vector using 'as' keyword
Add std::move for a couple of variables passed by value
This is based on the discussion in zeek/zeek#2668. Using &default with tables
can be confusing as the default value is not inserted. The following example
prints an empty table at the end even though new Service records were instantiated:
    type Service: record {
        occurrences: count &default=0;
        last_seen: time &default=network_time();
    };

    global services: table[string] of Service &default=Service();

    event zeek_init()
        {
        services["http"]$occurrences += 1;
        services["http"]$last_seen = network_time();
        print services;
        }
Changing the above &default to &default_insert inserts the newly created
default value upon a missed lookup, which is less surprising.
Other examples that caused confusion previously revolved around table of sets
or table of vectors and `add` or `+=` not working as expected.
    tbl_of_vector["http"] += 1;
    add tbl_of_set["http"][1];
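For readers who think in C++ terms, the difference is loosely analogous to a non-inserting lookup with a fallback versus an inserting one on a `std::map`. This is only an analogy with invented helper names, not how Zeek tables are implemented.

```cpp
#include <map>
#include <string>

struct Service {
    unsigned occurrences = 0;
};

// Like &default: a miss yields a fresh default that is NOT stored, so
// changes made to the returned value are lost.
Service lookup_default(const std::map<std::string, Service>& tbl,
                       const std::string& key) {
    auto it = tbl.find(key);
    return it == tbl.end() ? Service{} : it->second;
}

// Like &default_insert: a miss stores the default first and returns a
// reference to the stored element.
Service& lookup_default_insert(std::map<std::string, Service>& tbl,
                               const std::string& key) {
    return tbl.try_emplace(key).first->second;
}
```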
When a JSON document contains key names containing colons or other
special characters that are not valid in Zeek identifiers, from_json()
cannot be used to parse such input.
This change allows a customizable normalization function.
Closes #3142.
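One possible normalization, sketched under the assumption that keys should map onto identifier-like field names. `KeyNormalizer` and `replace_invalid_chars` are hypothetical names and the actual hook may look different.

```cpp
#include <cctype>
#include <functional>
#include <string>

// Hypothetical hook type: maps a JSON key to the field name to look up.
using KeyNormalizer = std::function<std::string(const std::string&)>;

// Example normalizer: every character that is neither alphanumeric nor an
// underscore becomes an underscore.
std::string replace_invalid_chars(const std::string& key) {
    std::string out = key;
    for (char& c : out)
        if (!std::isalnum(static_cast<unsigned char>(c)) && c != '_')
            c = '_';
    return out;
}
```

A key such as `http:status-code` would then be matched against a record field named `http_status_code`.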
threading/formatters/JSON.h currently includes rapidjson headers for declaring
the NullDoubleWriter. This appears mostly an internal detail, but
results in the situation that 1) we need to ship rapidjson headers with
the Zeek install tree and 2) we need to take care that external plugins
are able to find these headers should they include formatters/JSON.h.
There are currently no other Zeek headers that include rapidjson, so this
seems very unfortunate and self-inflicted given it's not actually required.
Attempt to hide this implementation detail with the goal to remove the
rapidjson includes with v7.1 and then also stop bundling and exposing
the include path to external plugins.
The NullDoubleWriter implementation moves into a new formatters/detail/json.h
header which is not installed.
Closes #3128
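The general technique can be sketched with a forward declaration plus an owning pointer, shown here in a single file for brevity. The class names are invented, and the stand-in writer that emits `null` for non-finite doubles is an assumption for illustration, not the real NullDoubleWriter.

```cpp
#include <cmath>
#include <memory>
#include <string>

// --- what would live in the installed public header ---
namespace detail {
class Writer; // forward declaration only, no third-party includes needed
}

class JsonFormatterSketch {
public:
    JsonFormatterSketch();
    ~JsonFormatterSketch(); // defined where detail::Writer is complete
    std::string Describe(double d) const;

private:
    std::unique_ptr<detail::Writer> writer;
};

// --- what would live in the private, non-installed header and the .cc ---
namespace detail {
class Writer {
public:
    // Assumed behavior for the example: non-finite values become "null".
    std::string Double(double d) const {
        return std::isfinite(d) ? std::to_string(d) : std::string("null");
    }
};
} // namespace detail

JsonFormatterSketch::JsonFormatterSketch()
    : writer(std::make_unique<detail::Writer>()) {}
JsonFormatterSketch::~JsonFormatterSketch() = default;

std::string JsonFormatterSketch::Describe(double d) const {
    return writer->Double(d);
}
```

Code including only the public part never sees the writer's definition, so the headers it depends on no longer need to be installed.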
* amazing-pp/topic/fupeng/from_json_bif:
Implement from_json bif
Minor updates during merge: moved ValFromJSON into zeek::detail for the
time being, removed gotos, normalized some error messages to lower case,
extended the tests minimally, added a raw reader input framework test
reading "json lines" as a demo, and added notes about the implicit type
conversions.
* origin/topic/vern/record-optimizations.Apr23B:
different fix for MSVC compiler issues
more general approach for addressing MSVC compiler issues with IntrusivePtr
restored RecordType::Create, now marked as deprecated
tidying of namespaces and private class members
simplification of flagging record field initializations that should be skipped
address peculiar MSVC compilation complaint for IntrusivePtr's
clarifications and tidying for record field initializations
optimize record construction by deferring initializations of aggregates
compile-scripts-to-C++ speedups by switching to raw record access
logging speedup by switching to raw record access
remove redundant record coercions
Removed the `#if 0` hunk during merging: Probably could have gone with a
doctest instead.