Commit graph

16736 commits

Author SHA1 Message Date
Christian Kreibich
737b1a2013 Remove the Supervisor's internal ClusterEndpoint struct.
This eliminates one place in which we currently need to mirror changes to the
script-land Cluster::Node record. Instead of keeping an exact in-core equivalent, the
Supervisor now treats the data structure as opaque, and stores the whole cluster
table as a JSON string.

We may replace the script-layer Supervisor::ClusterEndpoint in the future, using
Cluster::Node directly. But that's a more invasive change that will affect how
people invoke Supervisor::create() and similar functions.

Relying on JSON for serialization has the side-effect of removing the
Supervisor's earlier quirk of using 0/tcp, not 0/unknown, to indicate unused
ports in the Supervisor::ClusterEndpoint record.
2024-07-02 14:52:17 -07:00
Christian Kreibich
a98ec6b08b Provide a script-layer equivalent to Supervisor::__init_cluster().
If the script layer is able to access the current node's config via
Supervisor::node(), it can handle populating Cluster::nodes. That code
is much more straightforward than an equivalent in-core implementation
(especially with the upcoming change to the cluster table's implementation).
This introduces base/frameworks/cluster/supervisor.zeek and
Cluster::Supervisor::__init_cluster_nodes() for that purpose.

The @load of the Supervisor API in cluster/main.zeek isn't technically
necessary since we already load it explicitly even in init-bare.zeek,
but being explicit seems better.
2024-07-02 14:52:13 -07:00
Christian Kreibich
3d6954dfd4 Merge branch 'topic/christian/json-improvements'
* topic/christian/json-improvements:
  Update NEWS file to cover JSON enhancements
  Support JSON roundtripping via to_json()/from_json() for patterns
  Support table deserialization in from_json()
  Support map-based definition of ports in from_json()
  Document the field_escape_pattern in the to_json() BiF
2024-07-02 14:47:24 -07:00
Christian Kreibich
5f8b6986a2 Update NEWS file to cover JSON enhancements 2024-07-02 14:46:16 -07:00
Christian Kreibich
0179a5e75c Support JSON roundtripping via to_json()/from_json() for patterns
This needed a small tweak in the deserialization, since each roundtrip
would otherwise pad the prior pattern with an extra /^?(...)$?/.

This expands the language.set test to also verify serializing/unserializing for
sets, similarly to tables in the previous commit.
2024-07-02 14:46:16 -07:00
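The padding problem described in 0179a5e75c can be illustrated with a toy string model. This is only a sketch: the `^?(` and `)$?` wrapper text comes from the commit message, while the function names and the idea of stripping one wrapper layer during deserialization are assumptions made for illustration, not Zeek's actual code.

```python
# Toy model: a pattern is represented by its source text only.
PREFIX, SUFFIX = "^?(", ")$?"


def to_text(pattern_source):
    """Model of serialization: emit the anchored form of the pattern."""
    return f"{PREFIX}{pattern_source}{SUFFIX}"


def from_text(text):
    """Model of deserialization: drop one anchoring layer if present,
    so that re-serializing does not add another layer."""
    if text.startswith(PREFIX) and text.endswith(SUFFIX):
        return text[len(PREFIX):-len(SUFFIX)]
    return text


source = "foo|bar"
once = to_text(source)
twice = to_text(from_text(once))  # one full roundtrip, then serialize again
```

With the stripping step in place, `once` and `twice` are identical, so repeated roundtrips leave the pattern text stable.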
Christian Kreibich
92c1098e97 Support table deserialization in from_json()
This allows additional data roundtripping through JSON since to_json() already
supports tables. There are some subtleties around the formatting of strings in
JSON object keys, for which this adds a bit of helper infrastructure.

This also expands the language.table test to verify the roundtrips, and adapts
bif.from_json to include a table in the test record.
2024-07-02 14:46:16 -07:00
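As a generic illustration of why table roundtripping (92c1098e97) needs care around keys: JSON object keys are always strings, so a reader has to convert them back to the table's index type. The sketch below uses hypothetical helper names and says nothing about how Zeek's helper infrastructure formats keys.

```python
import json


def table_to_json(table):
    """Serialize a dict; every key becomes a JSON string."""
    return json.dumps({str(k): v for k, v in table.items()}, sort_keys=True)


def table_from_json(text, key_type):
    """Deserialize and convert each string key back to key_type."""
    return {key_type(k): v for k, v in json.loads(text).items()}


original = {1: "One", 2: "Two"}
encoded = table_to_json(original)
decoded = table_from_json(encoded, int)
```

Without the `key_type` conversion, `decoded` would be keyed by the strings `"1"` and `"2"` and would no longer compare equal to `original`.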
Christian Kreibich
df645e9bb2 Support map-based definition of ports in from_json()
The from_json() BiF and its underlying code in Val.cc currently expect ports
expressed as a string ('80/tcp' etc). Zeek's own serialization via ToJSON()
renders them as an object ('{"port":80, "proto":"tcp"}'). This adds support
for the latter format to from_json(), so serialized values can be read back.
2024-07-02 14:46:16 -07:00
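The two port representations named in df645e9bb2 can be handled by one reader, roughly as sketched below. The two input shapes come from the commit message; the function name and error handling are assumptions for illustration and do not mirror the code in Val.cc.

```python
import json


def parse_port(value):
    """Return (number, proto) from '80/tcp' or {"port": 80, "proto": "tcp"}."""
    if isinstance(value, str):
        number, sep, proto = value.partition("/")
        if not sep or not number.isdigit():
            raise ValueError(f"not a port string: {value!r}")
        return int(number), proto
    if isinstance(value, dict):
        try:
            return int(value["port"]), str(value["proto"])
        except KeyError as exc:
            raise ValueError(f"port object lacks field {exc}") from None
    raise ValueError(f"unsupported port representation: {value!r}")


string_form = parse_port(json.loads('"80/tcp"'))
object_form = parse_port(json.loads('{"port": 80, "proto": "tcp"}'))
```

Both calls yield the same `(80, "tcp")` pair, which is the property that lets serialized values be read back.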
Christian Kreibich
a29f862f95 Document the field_escape_pattern in the to_json() BiF
This argument, and its corresponding use in Val.cc's BuildJSON(),
were never explained.
2024-07-02 14:46:16 -07:00
Arne Welzel
c2dd3dfad0 Bump cmake submodule [nomail] 2024-07-02 19:42:29 +02:00
Arne Welzel
e57aa5932e Merge remote-tracking branch 'origin/topic/awelzel/3682-bad-pipe-op-3'
* origin/topic/awelzel/3682-bad-pipe-op-3:
  threading/Manager: Warn if threads are added after termination
  iosource/Manager: Reap dry sources while computing timeout
  threading/MsgThread: Decouple IO source and thread lifetimes
  iosource/Manager: Do not manage lifetime of pkt_src
  iosource/Manager: Honor manage_lifetime and dont_count for short-lived IO sources
2024-07-02 14:41:54 +02:00
Arne Welzel
f050d96503 threading/Manager: Warn if threads are added after termination
The core.file-analyzer-violation test showed that it's possible to
create new threads (log writers) when Zeek is in the process of
terminating. This can result in the IO manager's destructor
deleting IO sources for threads that are still running.

This is arguably a scripting issue, so for now log a reporter warning
when it happens, to leave a breadcrumb about what might be
going on. In the future it might make sense to guard APIs with
zeek_is_terminating().
2024-07-02 12:34:28 +02:00
Arne Welzel
739a8ac509 iosource/Manager: Reap dry sources while computing timeout
Avoids looping over the sources vector twice and should result
in the same behavior.
2024-07-02 11:32:05 +02:00
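The single-pass idea in 739a8ac509 (reaping dry sources in the same loop that computes the timeout) might look roughly like this. All names here are hypothetical, and the convention that a negative timeout means "no timeout" is an assumption of the sketch, not a statement about the iosource API.

```python
from dataclasses import dataclass


@dataclass
class Source:
    name: str
    is_open: bool
    next_timeout: float  # seconds; negative means "no timeout" (assumed)


def reap_and_find_timeout(sources):
    """One pass: drop dry (closed) sources and track the smallest timeout."""
    live = []
    best = None
    for src in sources:
        if not src.is_open:
            continue  # dry source: reaped, does not influence the timeout
        live.append(src)
        if src.next_timeout >= 0 and (best is None or src.next_timeout < best):
            best = src.next_timeout
    return live, best


sources = [Source("a", True, 0.5), Source("b", False, 0.1), Source("c", True, 0.2)]
live, timeout = reap_and_find_timeout(sources)
```

The dry source `b` is removed and its shorter timeout is ignored, which matches what two separate loops (reap first, then compute) would produce.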
Arne Welzel
b3118d2a48 threading/MsgThread: Decouple IO source and thread lifetimes
MsgThread acting as an IO source can result in the situation where the
threading manager's heartbeat timer deletes a finished MsgThread instance,
but at the same time this thread is in the list of ready IO sources the
main loop is currently processing.

Fix this by decoupling the lifetime of the IO source part and properly
registering as lifetime managed IO sources with the IO manager.

Fixes #3682
2024-07-02 11:00:37 +02:00
Arne Welzel
0451a4038c iosource/Manager: Do not manage lifetime of pkt_src
Now that dry sources are properly reaped and freed, an offline packet
source would be deleted once dry, resulting in GetPktSrc() returning
a wild pointer. Don't manage the packet source lifetime and instead
free it during Manager destruction.
2024-07-02 10:47:08 +02:00
Arne Welzel
fcca8670d3 iosource/Manager: Honor manage_lifetime and dont_count for short-lived IO sources
If an IO source is registered and becomes dry at runtime, the IO
manager would not honor its manage_lifetime or dont_count attribute
during collection, resulting in memory leaks.

This probably hasn't mattered so far as there are no IO sources registered
in-tree at runtime using manage_lifetime=true.
2024-07-02 10:46:59 +02:00
Arne Welzel
43804fa3b5 Merge remote-tracking branch 'origin/topic/awelzel/fix-coveralls-no-token'
* origin/topic/awelzel/fix-coveralls-no-token:
  coverage/lcov_html: Allow missing coveralls token
2024-06-26 13:16:38 +02:00
Arne Welzel
5248f60806 coverage/lcov_html: Allow missing coveralls token
This is a fixup for 0cd023b839, which
currently causes ASAN coverage builds to fail for non-master branches
due to a missing COVERALLS_REPO_TOKEN.

Instead of bailing out for non-master branches, pass `--dry-run` to the
coveralls-lcov invocation to test more of the script.
2024-06-25 17:23:45 +02:00
Benjamin Bannier
0987d9cd37 Merge remote-tracking branch 'origin/topic/bbannier/bump-spicy' 2024-06-25 13:39:09 +02:00
Benjamin Bannier
f0dad976e6 Bump auxil/spicy to latest development snapshot 2024-06-25 12:47:35 +02:00
Arne Welzel
2ebb8824b2 Merge remote-tracking branch 'origin/topic/awelzel/bump-zeekctl-file-extract-dir'
* origin/topic/awelzel/bump-zeekctl-file-extract-dir:
  NEWS: Add entry about FileExtractDir
  Update zeekctl submodule
2024-06-25 11:32:27 +02:00
Arne Welzel
4b26dfa715 zeek-testing-private: Update baseline, after merge 2024-06-24 11:25:21 +02:00
Arne Welzel
3097a79539 Merge remote-tracking branch 'origin/topic/vern/record-script-opt'
* origin/topic/vern/record-script-opt:
  script optimization for record operations sourced (in part) from other records
2024-06-24 11:19:31 +02:00
Vern Paxson
4b719ef45a script optimization for record operations sourced (in part) from other records 2024-06-24 09:38:37 +02:00
Christian Kreibich
eb5ea66012 Merge branch 'topic/awelzel/topic/awelzel/ssh-invalid-version-2'
* topic/awelzel/topic/awelzel/ssh-invalid-version-2:
  zeek-testing-private: Update baseline
  ssh: Revert half-duplex robustness
2024-06-20 18:17:57 -07:00
Christian Kreibich
398b41af5a Merge branch 'topic/dopheide/runtime-includes' of github.com:/dopheide-esnet/zeek
* 'topic/dopheide/runtime-includes' of github.com:/dopheide-esnet/zeek:
  Fixes build error of OpenVPN spicy plugin
2024-06-20 17:34:21 -07:00
Michael Dopheide
ad543a4803 Fixes build error of OpenVPN spicy plugin 2024-06-20 17:05:03 -05:00
Robin Sommer
b5206f818a Merge remote-tracking branch 'origin/topic/robin/gh-3521-zeek-val'
* origin/topic/robin/gh-3521-zeek-val:
  Bump Spicy and documentation submodules.
  Spicy: Provide runtime API to access Zeek-side globals.
  Spicy: Reformat `zeek.spicy` with `spicy-format`.
  Spicy: Extend exception hierarchy.
2024-06-20 15:54:17 +02:00
Robin Sommer
98760a0683 Bump Spicy and documentation submodules. 2024-06-20 14:41:49 +02:00
Robin Sommer
4fc57294f1 Spicy: Provide runtime API to access Zeek-side globals.
This allows reading Zeek global variables from inside Spicy code. The
main challenge here is supporting all of Zeek's data types in a
type-safe manner.

The most straight-forward API is a set of functions
`get_<type>(<id>)`, where `<type>` is the Zeek-side type
name (e.g., `count`, `string`, `bool`) and `<id>` is the fully scoped
name of the Zeek-side global (e.g., `MyModule::Boolean`). These
functions then return the corresponding Zeek value, converted into an
appropriate Spicy type. Example:

    Zeek:
        module Foo;

        const x: count = 42;
        const y: string = "xxx";

    Spicy:
        import zeek;

        assert zeek::get_count("Foo::x") == 42;
        assert zeek::get_string("Foo::y") == b"xxx"; # returns bytes(!)

For container types, the `get_*` functions return an opaque type that
can be used to access the containers' values. An additional set of
functions `as_<type>` allows converting opaque values of atomic
types to Spicy equivalents. Example:

    Zeek:
        module Foo;

        const s: set[count] = { 1, 2 };
        const t: table[count] of string = { [1] = "One", [2] = "Two" };

    Spicy:
        import zeek;

        # Check set membership.
        local set_ = zeek::get_set("Foo::s");
        assert zeek::set_contains(set_, 1) == True;

        # Look up table element.
        local table_ = zeek::get_table("Foo::t");
        local value = zeek::table_lookup(table_, 1);
        assert zeek::as_string(value) == b"One";

There are also functions for accessing elements of Zeek-side vectors
and records.

If any of these `zeek::*` conversion functions fails (e.g., due to a
global of that name not existing), it will throw an exception.

Design considerations:

    - We support only reading Zeek variables, not writing. This is
      both to simplify the API, and also conceptually to avoid
      offering backdoors into Zeek state that could end up with a very
      tight coupling of Spicy and Zeek code.

    - We accept that a single access might be relatively slow due to
      name lookup and data conversion. This is primarily meant for
      configuration-style data, not for transferring lots of dynamic
      state over.

    - In that spirit, we don't support deep-copying complex data types
      from Zeek over to Spicy. This is (1) to avoid performance
      problems when accidentally copying large containers over,
      potentially even at every access; and (2) to avoid the two sides
      getting out of sync if one ends up modifying a container without
      the other being able to see it.
2024-06-20 12:02:54 +02:00
Arne Welzel
5c56969ca4 zeek-testing-private: Update baseline 2024-06-19 19:47:54 +02:00
Arne Welzel
5dfff4492c ssh: Revert half-duplex robustness
This reverts part of commit a0888b7e36 because
it inhibits analyzer violations when parsing non-SSH traffic once
the &restofdata path is entered.

@J-Gras reported the analyzer not being disabled when sending HTTP
traffic on port 22.

This adds the verbose analyzer.log baselines such that future improvements
of these scenarios become visible.
2024-06-19 16:04:51 +02:00
Robin Sommer
93dd9d6797 Spicy: Reformat zeek.spicy with spicy-format. 2024-06-19 10:22:36 +02:00
Robin Sommer
751c35b476 Spicy: Extend exception hierarchy.
We move the current `TypeMismatch` into a new `ParameterMismatch`
exception that's derived from a now more general `TypeMismatch`, which
can also be used for other, non-parameter mismatches.
2024-06-18 12:46:47 +02:00
Arne Welzel
a7f10df4f7 Merge remote-tracking branch 'origin/topic/christian/ci-updates'
* origin/topic/christian/ci-updates:
  CMakeLists: Disable -Werror for 3rdparty/sqlite3.c
  Bump zeek-3rdparty to pull in sqlite move to 3.46
  CI: drop Fedora 38, add 40
2024-06-18 10:53:09 +02:00
Arne Welzel
003d2d1468 CMakeLists: Disable -Werror for 3rdparty/sqlite3.c
We package vanilla sqlite from upstream and on Fedora 40 with sqlite 3.46
there's the following compiler warning:

    In function 'sqlite3Strlen30',
        inlined from 'sqlite3ColumnSetColl' at
        ../../src/3rdparty/sqlite3.c:122105:10:
        ../../src/3rdparty/sqlite3.c:35003:28: error: 'strlen' reading 1 or more bytes from a region of size 0 [-Werror=stringop-overread]
    35003 |   return 0x3fffffff & (int)strlen(z);
          |                            ^~~~~~~~~
    In function 'sqlite3ColumnSetColl':

Disabling -Werror on sqlite3.c seems sensible given we have little
control over that code.
2024-06-18 10:03:32 +02:00
Christian Kreibich
5af23757fa Bump zeek-3rdparty to pull in sqlite move to 3.46
This avoids a compiler warning/error on Fedora 40.
2024-06-17 18:45:43 -07:00
Christian Kreibich
59d0f311a5 CI: drop Fedora 38, add 40 2024-06-17 18:45:39 -07:00
Robin Sommer
8c755af8b2 Merge remote-tracking branch 'origin/topic/robin/gh-3783-replaces-two'
* origin/topic/robin/gh-3783-replaces-two:
  Spicy: Disallow repeating replacements of the same analyzer.
  Bump Spicy.
2024-06-14 13:51:06 +02:00
Robin Sommer
4318d5ab9e Spicy: Disallow repeating replacements of the same analyzer.
We now reject EVT files that attempt to replace the same built-in
analyzer multiple times as doing so would be ill-defined and not very
intuitive in what exactly it means.

Closes #3783.
2024-06-14 13:10:47 +02:00
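The check introduced in 4318d5ab9e amounts to detecting a built-in analyzer that is named as a replacement target more than once. A minimal sketch follows; the pair-based input format and the function name are hypothetical and are not taken from the EVT parser.

```python
from collections import Counter


def find_repeated_replacements(replacements):
    """Given (new_analyzer, replaced_builtin) pairs, return the built-in
    analyzers that are replaced more than once, sorted by name."""
    counts = Counter(replaced for _, replaced in replacements)
    return sorted(name for name, n in counts.items() if n > 1)


declared = [
    ("spicy::MySSH", "SSH"),
    ("spicy::OtherSSH", "SSH"),
    ("spicy::MyHTTP", "HTTP"),
]
repeated = find_repeated_replacements(declared)
```

A non-empty result would be the condition under which the input is rejected.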
Robin Sommer
956e147f70 Bump Spicy. 2024-06-14 13:10:47 +02:00
Arne Welzel
9e95ef7f0f NEWS: Add entry about FileExtractDir 2024-06-11 15:36:06 +02:00
Arne Welzel
9ad77ea9da Update zeekctl submodule 2024-06-11 15:36:06 +02:00
Benjamin Bannier
345fc31dcc Merge remote-tracking branch 'origin/topic/bbannier/ci-centos8-stream-eol' 2024-06-11 15:11:52 +02:00
Benjamin Bannier
20eeb6dbf6 Drop EOL centos8-stream in CI 2024-06-11 14:48:35 +02:00
Arne Welzel
1e3b5ee68b Merge remote-tracking branch 'origin/topic/timw/civetweb-shutdown-data-race'
* origin/topic/timw/civetweb-shutdown-data-race:
  Suppress a known data race during civetweb shutdown
2024-06-11 12:01:10 +02:00
Arne Welzel
3081a40a2a Merge remote-tracking branch 'origin/topic/awelzel/asan-coverage-fixes'
* origin/topic/awelzel/asan-coverage-fixes:
  Bump cmake for -fprofile-update=atomic usage
  cirrus: Unset CCACHE_BASEDIR for asan/coverage build
2024-06-11 11:03:14 +02:00
Arne Welzel
8bf3d3c7fc Bump cmake for -fprofile-update=atomic usage 2024-06-11 08:58:21 +02:00
Arne Welzel
f228cf878a cirrus: Unset CCACHE_BASEDIR for asan/coverage build
When CCACHE_BASEDIR is set, ccache will rewrite absolute paths to
relative paths in order to allow compilation in different source
directories. We do not need this feature on Cirrus (the checkout
is always in /zeek) and using absolute paths avoids
confusion/normalization needs for the gcov -p results.

We could consider removing the global CCACHE_BASEDIR, but it'd
bust the ccache of every other task, too.
2024-06-11 08:56:46 +02:00
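The path rewriting that f228cf878a avoids can be modelled as follows. This is a simplified sketch of the behavior described in the commit message (absolute paths under the base directory become relative), with a hypothetical function name; it is not ccache's implementation.

```python
import posixpath


def rewrite_path(path, base_dir, cwd):
    """Return path relative to cwd if it is absolute and lies under base_dir;
    otherwise return it unchanged. An empty base_dir disables rewriting."""
    if not base_dir or not posixpath.isabs(path):
        return path
    prefix = base_dir.rstrip("/") + "/"
    if path == base_dir or path.startswith(prefix):
        return posixpath.relpath(path, cwd)
    return path


with_basedir = rewrite_path("/zeek/src/Val.cc", "/zeek", "/zeek/build")
without_basedir = rewrite_path("/zeek/src/Val.cc", "", "/zeek/build")
```

With the base directory unset, the absolute path is preserved, which is the form the commit prefers for the gcov results.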
zeek-bot
d603653495 Update doc submodule [nomail] [skip ci] 2024-06-08 00:11:59 +00:00
Tim Wojtulewicz
753127be6d Suppress a known data race during civetweb shutdown 2024-06-07 11:31:34 -07:00