Commit graph

583 commits

Author SHA1 Message Date
Vern Paxson
f693f22192 porting of GH-4022 2024-11-12 15:41:20 -08:00
Christian Kreibich
0179a5e75c Support JSON roundtripping via to_json()/from_json() for patterns
This needed a small tweak in the deserialization, since each roundtrip
would otherwise pad the prior pattern with an extra /^?(...)$?/.

This expands the language.set test to also verify serializing/unserializing for
sets, similarly to tables in the previous commit.
2024-07-02 14:46:16 -07:00
Christian Kreibich
92c1098e97 Support table deserialization in from_json()
This allows additional data roundtripping through JSON since to_json() already
supports tables. There are some subtleties around the formatting of strings in
JSON object keys, for which this adds a bit of helper infrastructure.

This also expands the language.table test to verify the roundtrips, and adapts
bif.from_json to include a table in the test record.
2024-07-02 14:46:16 -07:00
Christian Kreibich
df645e9bb2 Support map-based definition of ports in from_json()
The from_json() BiF and its underlying code in Val.cc currently expect ports
expressed as a string ('80/tcp' etc). Zeek's own serialization via ToJSON()
renders them as an object ('{"port":80, "proto":"tcp"}'). This adds support
for the latter format to from_json(), so serialized values can be read back.
2024-07-02 14:46:16 -07:00
Christian Kreibich
a29f862f95 Document the field_escape_pattern in the to_json() BiF
This argument, and its corresponding use in Val.cc's BuildJSON(),
were never explained.
2024-07-02 14:46:16 -07:00
Vern Paxson
1f9fa4304d refine Val "footprint" to equate long strings with multiple objects 2024-04-29 12:39:36 -07:00
Tim Wojtulewicz
d745fbbca2 Avoid calling typecasts in Val when we have direct access to the underlying value object 2024-04-25 10:33:41 -07:00
Vern Paxson
5311904bb1 removing vestigial same_val() function 2024-04-25 09:15:12 -07:00
Vern Paxson
4cafacf90b fix ZAM "cat" of doubles/times to include trailing ".0" per normal BiF behavior 2024-03-28 16:43:06 -07:00
Vern Paxson
91cab9931d ZAM optimizations for record creation
includes reworking of managing "auxiliary" information for ZAM instructions
2024-01-25 20:49:12 +01:00
Vern Paxson
bae87fb606 script optimization fixes for "concretizing" vector-of-any's 2024-01-15 15:03:56 +01:00
Vern Paxson
a11ee9038b streamlining of constructing script-level tables 2023-12-16 17:33:46 +01:00
Dominik Charousset
647fdf7737 Add facade types to avoid using raw Broker types
By avoiding to use `broker::data` directly, we gain a degree of freedom
that allows us to swap out `broker::data` for something else (e.g.,
`broker::variant`) in the future. Furthermore, it also helps us to keep
Broker types "local" to the Broker manager and gives us a nicer
interface.

Also replaces uses of `broker::expected` with `std::optional`. While an
`expected `can carry additional information as to why a value is not
present, nothing in Zeek ever cared about that. Hence, using
`std::optional` removes an unnecessary dependency on a Broker detail
while also being more efficient (no extra heap allocation when no value
is present).
2023-12-04 15:23:28 +01:00
Arne Welzel
cf9afd7b77 TableVal: Replace raw subnets/pattern_matcher with unique_ptr 2023-11-21 11:16:17 +01:00
Arne Welzel
36c43d2aa3 TablePatternMatcher: Drop Insert()/Remove(), use Clear()
Also move Clear() when assigning into more generic Assign() function.
2023-11-21 11:16:16 +01:00
Arne Welzel
c113b9b297 Expr/Val: Add support for in set[pattern] 2023-11-21 10:34:17 +01:00
Arne Welzel
e39f280e3d zeek.bif: Implement table_pattern_matcher_stats() bif for introspection
Provide a script accessible way to introspect the DFA stats that can be
leveraged to gather runtime statistics of the underlying DFA. This
re-uses the existing MatcherStats used by ``get_matcher_stats()``.
2023-11-21 10:34:17 +01:00
Arne Welzel
9ae99cdc44 RE: Remove RE_DisjunctiveMatcher and re-use MatchAll()
Seems we can just open code the CompileSet() usage in the TablePatternMatcher
helper without indirecting through another class. Further, add the collection
of indices into MatchAll() rather than duplicating its code in
MatchDisjunction(). Doesn't seem like MatchAll() is used widely.
2023-11-21 10:34:16 +01:00
Arne Welzel
501b582bc7 TablePatternMatcher: Use const StringValPtr& instead of const StringVal* 2023-11-21 10:34:16 +01:00
Arne Welzel
c426304c27 Val: Move TablePatternMatcher into detail namespace
There's anyway only prototype in the headers, so detail seems better
than the public zeek namespace.
2023-11-21 10:34:16 +01:00
Arne Welzel
43a5473919 TablePatternMatcher: Use unique_ptr 2023-11-21 10:34:16 +01:00
Arne Welzel
c8bab6a0ec IndexType: Add IsPatternIndex(), like IsSubNetIndex() 2023-11-21 10:34:16 +01:00
Vern Paxson
699549eb45 support for indexing "table[pattern] of T" with strings to get multi-matches 2023-11-21 10:34:15 +01:00
Arne Welzel
e339e93e69 strings.bif/sub,gsub: Respect anchors in pattern
Anchors within pattern passed to sub() or gsub() were previously ignored,
replacing any occurrence of '<text>' even when '^<text>' was used as a
pattern.

This is a pretty user-visible change (and we even have anchored patterns
within the base scripts), but seems "the right thing to do".

Relates to #3455
2023-11-17 14:37:25 +01:00
Dominik Charousset
cebb85b1e8 Fix unsafe and inefficient uses of copy_string
Add a new overload to `copy_string` that takes the input characters plus
size. The new overload avoids inefficient scanning of the input for the
null terminator in cases where we know the size beforehand. Furthermore,
this overload *must* be used when dealing with input character sequences
that may have no null terminator, e.g., when the input is from a
`std::string_view` object.
2023-11-03 15:25:38 +01:00
Benjamin Bannier
f5a76c1aed Reformat Zeek in Spicy style
This largely copies over Spicy's `.clang-format` configuration file. The
one place where we deviate is header include order since Zeek depends on
headers being included in a certain order.
2023-10-30 09:40:55 +01:00
Vern Paxson
b53a025b1e fix bug in failing to concretize empty vectors 2023-09-26 14:39:05 -07:00
Arne Welzel
cbaf43e8ea VectorVal: Embed vector_val
Similar motivation as for RecordVal, save an extra malloc/free
and pointer indirection.

This breaks the `auto& RawVec()` API which previously returned
a reference to the std::vector*. It now returns a reference
to the vector instead. It's commented as intended for internal
and compiled code, so even though it's public API,

The previous `std::vector<std::optional<ZVal>>*&` return type was also very
likely not intended (all consumers just dereference it anyhow). I'm certain
this API was never meant to modify the actual pointer value.

I've switched to explicit typing, too.
2023-09-22 21:52:52 +02:00
Arne Welzel
f362935a66 RecordVal: Embed record_val
This should remove one malloc/free per created and destroyed record instance
and avoid one extra pointer indirection to access fields.
2023-09-22 19:43:07 +02:00
Vern Paxson
d70a0fae85 fixes for vector assignments involving "any"/"vector of any" types 2023-08-24 15:48:00 -07:00
Tim Wojtulewicz
e8ef169b27 Merge remote-tracking branch 'origin/topic/timw/3059-set-vector-conversion'
* origin/topic/timw/3059-set-vector-conversion:
  Fix conversion with record types
  Add conversion between set and vector using 'as' keyword
  Add std::move for a couple of variables passed by value
2023-08-11 10:35:06 -07:00
Tim Wojtulewicz
fe9926e538 Fix conversion with record types 2023-08-10 13:42:23 -07:00
Tim Wojtulewicz
af9e852c28 Add conversion between set and vector using 'as' keyword 2023-08-09 14:41:54 -07:00
Arne Welzel
73a7fdad95 TableVal: Unify &default and &default_insert lookups
Introduce DefaultAttr() helper to avoid a bit of duplicated code.
2023-08-04 12:31:27 +02:00
Arne Welzel
431767d04b Add &default_insert attribute for tables
This is based on the discussion in zeek/zeek#2668. Using &default with tables
can be confusing as the default value is not inserted. The following example
prints an empty table at the end even new Service records was instantiated.

    type Service: record {
        occurrences: count &default=0;
        last_seen: time &default=network_time();
    };

    global services: table[string] of Service &default=Service();

    event zeek_init()
        {
        services["http"]$occurrences += 1;
        services["http"]$last_seen = network_time();

        print services;
        }

Changing above &default to &default_insert will insert the newly created
default value upon a missed lookup and act less surprising.

Other examples that caused confusion previously revolved around table of sets
 or table of vectors and `add` or `+=` not working as expected.

    tbl_of_vector["http"] += 1
    add tbl_of_set["http"][1];
2023-08-04 12:30:36 +02:00
Vern Paxson
6b0d595dae avoid constructing TypeList's on-the-fly for ListVal's with fixed types 2023-07-17 16:31:30 -07:00
Tim Wojtulewicz
64b78f6fb9 Use emplace_back over push_back where appropriate 2023-07-07 09:17:05 -07:00
Tim Wojtulewicz
de13bb6361 Avoid unnecessary type names in return statements 2023-07-07 09:17:05 -07:00
Tim Wojtulewicz
90d0bc64fa Replace empty destructor bodies with =default definitions 2023-07-07 09:17:05 -07:00
Vern Paxson
1505fd4aa1 added some class accessors/set-ers 2023-06-30 09:36:14 +02:00
Arne Welzel
480d52ca1f from_json: Support function to normalize key names
When a JSON document contains key names containing colons or other
special characters that are not valid in Zeek identifiers, from_json()
cannot be used to parse such input.

This change allows a customizable normalization function.

Closes #3142.
2023-06-29 15:57:49 +02:00
Arne Welzel
6efc696179 formatters/JSON: Prepare to remove rapidjson from installed Zeek headers
threading/formatters/JSON.h currently includes rapidjson headers for declaring
the NullDoubleWriter. This appears mostly an internal detail, but
results in the situation that 1) we need to ship rapidjson headers with
the Zeek install tree and 2) taking care that external plugins are able
to find these headers should they include formatters/JSON.h.

There are currently no other Zeek headers that include rapidjson, so this
seems very unfortunate and self-inflicted given it's not actually required.

Attempt to hide this implementation detail with the goal to remove the
rapidjson includes with v7.1 and then also stop bundling and exposing
the include path to external plugins.

The NullDoubleWriter implementation moves into a new formatters/detail/json.h
header which is not installed.

Closes #3128
2023-06-17 13:48:25 +02:00
Arne Welzel
7a043e5e8f all: Fix typos identified by typos pre-commit hook 2023-06-13 17:57:32 +02:00
Arne Welzel
1facc34e09 Fixup Val.h/Val.cc: Actually move ValFromJSON into zeek::detail
Lost during merge..
2023-05-09 11:23:32 +02:00
Arne Welzel
264284150b Merge remote-tracking branch 'amazing-pp/topic/fupeng/from_json_bif'
* amazing-pp/topic/fupeng/from_json_bif:
  Implement from_json bif

Minor updates during merge: Moved ValFromJSON into zeek::detail for the
time being, removed gotos, normalized some error messages to lower case,
minimal test extension and added a raw reader input framework test reading
"json lines" as a demo, adding notes about the implicit type
conversions.
2023-05-09 10:36:58 +02:00
Fupeng Zhao
584e68434d Implement from_json bif 2023-05-06 00:42:46 +00:00
Arne Welzel
89c828ac14 Merge remote-tracking branch 'origin/topic/vern/record-optimizations.Apr23B'
* origin/topic/vern/record-optimizations.Apr23B:
  different fix for MSVC compiler issues
  more general approach for addressing MSVC compiler issues with IntrusivePtr
  restored RecordType::Create, now marked as deprecated tidying of namespaces and private class members simplification of flagging record field initializations that should be skipped address peculiar MSVC compilation complaint for IntrusivePtr's
  clarifications and tidying for record field initializations
  optimize record construction by deferring initializations of aggregates
  compile-scripts-to-C++ speedups by switching to raw record access
  logging speedup by switching to raw record access
  remove redundant record coercions

Removed the `#if 0` hunk during merging: Probably could have gone with a
doctest instead.
2023-04-19 11:59:56 +02:00
Vern Paxson
c19eba62d6 restored RecordType::Create, now marked as deprecated
tidying of namespaces and private class members
simplification of flagging record field initializations that should be skipped
address peculiar MSVC compilation complaint for IntrusivePtr's
2023-04-18 09:41:45 -07:00
Vern Paxson
ee358affda clarifications and tidying for record field initializations 2023-04-15 20:12:49 -07:00
Arne Welzel
a0540f96a1 Revert "Type: Add TypeManager->TypeList() and use for ListVal()"
This reverts commit 24c606b4df.

This commit introduced a memory leak ListVal::Append() modifying
the cached TYPE_ANY type list.
2023-04-14 09:49:05 +02:00