Commit graph

261 commits

Author SHA1 Message Date
Arne Welzel
3dd1f8d38a logging/WriterFrontend: Add LogWriteHeader as member
The header captures the enum values as well as the fields
2024-12-04 12:37:22 +01:00
Arne Welzel
f5d4526eac logging: Add filter_name to WriterInfo
...with this change, it'll be possible to identify WriterFrontend's
based on (stream, filter_name, path) pairs in addition to (stream,
writer, path) pairs.
2024-12-04 12:37:22 +01:00
Arne Welzel
65037fa822 logging/Manager: Fix using filename from input.h in debug log
...and remove network_time, it's always included.
2024-11-15 15:46:24 +01:00
Arne Welzel
78999d147d logging/Manager: Extract another CreateWriter() helper
For other cluster backends, CreateWriter() will use a logger's filter
configuration rather than receiving all configuration through CreateLog.
Extract a helper out from WriteToFilters() for reuse.
2024-09-27 15:32:09 +02:00
Arne Welzel
16cca62292 logging/Manager: Extract path_func invocation into helper 2024-09-27 15:32:09 +02:00
Arne Welzel
0d925e935e logging: Dedicated log flush timer
Log flushing is currently triggered based on the threading heartbeat timer
of WriterBackends and the hard-coded WRITE_BUFFER_SIZE 1000.

This change introduces a separate timer that is managed by the logger
manager instead of piggy-backing on the heartbeat timer, as well as a
const &redef for the buffer size.

This allows to modify the log flush frequency and batch size independently
of the threading heartbeat interval. Later, this will allow to re-use the
buffering and flushing logic of writer frontends for non-Broker cluster
backends, too.

One change here is that even frontends that do not have a backend will
be flushed regularly. This is wanted for non-Broker backends and should be
very cheap. Possibly, Broker can piggy back on this timer down the road, too,
rather than using its own script-level timer (see Broker::log_flush()).
2024-09-27 15:30:35 +02:00
Arne Welzel
77b9510c8a all: Change to use Func::GetName() 2024-09-27 15:11:17 +02:00
Arne Welzel
a9290cc031 logging: Switch index-assignment of raw pointers to emplace_back() 2024-08-30 10:59:55 +02:00
Arne Welzel
245fd0c94f broker/logging: Change threading::Value** usage std::vector instead
This allows to leverage automatic memory management, less allocations
and using move semantics for expressing ownership.

This breaks the existing logging and broker API, but keeps the plugin
DoWrite() and HookLogWrite() methods functioning.

It further changes ValToLogVal to return a threading::Value rather than
a threading::Value*. The vector_val and set_val fields unfortunately
use the same pointer-to-array-of-pointers approach. this can'tbe changed
as it'd break backwards compatibility for plugin provided input readers
and log writers.
2024-08-30 10:58:57 +02:00
Tim Wojtulewicz
93717ca8f8 Remove is_sum arguments from counters and gauges 2024-05-31 13:36:37 -07:00
Tim Wojtulewicz
46ff48c29a Change all instruments to only handle doubles 2024-05-31 13:36:37 -07:00
Tim Wojtulewicz
84aa308527 Rework everything to access the prometheus-cpp objects more directly 2024-05-31 13:30:31 -07:00
Tim Wojtulewicz
a0ae06b3cd Convert telemetry code to use prometheus-cpp 2024-05-31 13:30:31 -07:00
Dominik Charousset
bd3e5bedbb Integrate review feedback 2024-01-06 13:48:14 +01:00
Dominik Charousset
1bc5fda591 Backward compatibility for OpaqueVal serialization
External plugins depend on the API for `OpaqueVal`. This set of changes
brings back the previous signature for the `Serialize` and `Unserialize`
member functions. The new set of functions that operate on the recently
added `BrokerData` API were renamed accordingly and use a `Data` suffix to
distinguish between the old and new interface.

For the transition period, `OpaqueVal` now has two "sets" of
serialization functions: old and new (using the suffix). By default, the
new functions call the old API and then convert to the new types. Hence,
plugins that override the old set of member functions will continue to
work. New code should only override the new set of functions.

Since the macro `DECLARE_OPAQUE_VALUE` (a convenience macro for adding a
default set of member functions to a subtype of `OpaqueVal`) might be
used by 3rd parties, the macro has been "restored" to its previous
behavior, i.e., it will override the old set of member functions. The
new macro `DECLARE_OPAQUE_VALUE_V2` is similar but overrides the new set
of functions instead.

The class `BloomFilter` uses the same member function signatures as
`OpaqueVal` for serialization. Hence, the same old/new split was
implemented to keep the APIs consistent.
2024-01-06 10:52:06 +01:00
Vern Paxson
ead4b681aa bug fix for delayed logging 2023-12-12 09:45:19 +01:00
Christian Kreibich
0aef842f05 Merge branch 'topic/neverlord/broker-data'
* topic/neverlord/broker-data:
  Integrate review feedback
  Add facade types to avoid using raw Broker types
2023-12-04 12:32:35 -08:00
Arne Welzel
30314dd940 logging: Fix coverity std::move suggestions 2023-12-04 18:27:57 +01:00
Arne Welzel
52fba4aacf logging/Manager: Fix coverity null-deref
Prior code assumed non-null stream given the active_write_ctx matches,
but please coverity.
2023-12-04 18:27:57 +01:00
Dominik Charousset
647fdf7737 Add facade types to avoid using raw Broker types
By avoiding to use `broker::data` directly, we gain a degree of freedom
that allows us to swap out `broker::data` for something else (e.g.,
`broker::variant`) in the future. Furthermore, it also helps us to keep
Broker types "local" to the Broker manager and gives us a nicer
interface.

Also replaces uses of `broker::expected` with `std::optional`. While an
`expected `can carry additional information as to why a value is not
present, nothing in Zeek ever cared about that. Hence, using
`std::optional` removes an unnecessary dependency on a Broker detail
while also being more efficient (no extra heap allocation when no value
is present).
2023-12-04 15:23:28 +01:00
Tim Wojtulewicz
4fa06cef75 Fix some compiler warnings in logging::Manager 2023-12-01 11:49:26 -07:00
Arne Welzel
3c99b7ae9c logging/Manager: Fix token_val->AsCount() in debug logging
Second UBSAN error triggered from log delay merge.
2023-12-01 16:01:45 +01:00
Arne Welzel
acf4ed9c6c logging/Manager: Fix AsTime() to AsInterval()
Found by UBSAN after merge of log delay branch.
2023-12-01 13:26:40 +01:00
Arne Welzel
9956d96824 logging: Fix typos from review 2023-11-30 12:26:08 +01:00
Arne Welzel
ee65623600 logging/Manager: Make LogDelayExpiredTimer an implementation detail
The only reason this was a private component of Manager was to access
the Stream's function. Use a generic callback and a lambda to avoid
that exposure.
2023-11-30 12:25:49 +01:00
Arne Welzel
dfa8bac273 logging/WriteToFilters: Use range-based for loop 2023-11-30 11:37:10 +01:00
Arne Welzel
e3796894c6 logging: Do not keep delay state persistent
If Log::remove_stream() and Log::create_stream() is called for a stream,
do not restore the previously used max delay or max queue size.
2023-11-29 11:53:11 +01:00
Arne Welzel
fd096b1ce6 logging: delay documentation polishing
Based on PR feedback.
2023-11-29 11:53:11 +01:00
Arne Welzel
e2ce929fa4 logging: Better error messages for invalid Log::delay() calls
Add a test for Log::delay() usage within filter policy hooks, too.
2023-11-29 11:53:11 +01:00
Arne Welzel
5e046eee58 logging/Manager: Implement DelayTokenType as an actual opaque
With a bit of tweaking in the JavaScript plugin to support opaque types, this
will allow the delay functionality to work there, too.

Making the LogDelayToken an actual opaque seems reasonable, too. It's not
supposed to be user inspected.
2023-11-29 11:53:11 +01:00
Arne Welzel
2dbb467ba2 logging: Implement get_delay_queue_size()
Primarily for introspection given that re-delaying may exceed
queue sizes.
2023-11-29 11:53:11 +01:00
Arne Welzel
f0e67022fd logging: Introduce Log::delay() and Log::delay_finish()
This is a verbose, opinionated and fairly restrictive version of the log delay idea.
Main drivers are explicitly, foot-gun-avoidance and implementation simplicity.

Calling the new Log::delay() function is only allowed within the execution
of a Log::log_stream_policy() hook for the currently active log write.

Conceptually, the delay is placed between the execution of the global stream
policy hook and the individual filter policy hooks. A post delay callback
can be registered with every Log::delay() invocation. Post delay callbacks
can (1) modify a log record as they see fit, (2) veto the forwarding of the
log record to the log filters and (3) extend the delay duration by calling
Log::delay() again. The last point allows to delay a record by an indefinite
amount of time, rather than a fixed maximum amount. This should be rare and
is therefore explicit.

Log::delay() increases an internal reference count and returns an opaque
token value to be passed to Log::delay_finish() to release a delay reference.
Once all references are released, the record is forwarded to all filters
attached to a stream when the delay completes.

This functionality separates Log::log_stream_policy() and individual filter
policy hooks. One consequence is that a common use-case of filter policy hooks,
removing unproductive log records, may run after a record was delayed. Users
can lift their filtering logic to the stream level (or replicate the condition
before the delay decision). The main motivation here is that deciding on a
stream-level delay in per-filter hooks is too late. Attaching multiple filters
to a stream can additionally result in hard to understand behavior.

On the flip side, filter policy hooks are guaranteed to run after the delay
and can be used for further mangling or filtering of a delayed record.
2023-11-29 11:53:11 +01:00
Arne Welzel
dc552e647f logging/Manager: zeek::detail'ify
Introducing zeek::logging::detail requires detail:: references to be
qualified as preparation.
2023-11-29 11:53:11 +01:00
Arne Welzel
3afd6242c7 logging/Manager: Split Write()
If we delay in the stream policy hook, we'll need to resume writing
to the attached filters later on. Prepare for that by splitting out
the filter processing.
2023-11-29 11:53:11 +01:00
Dominik Charousset
cebb85b1e8 Fix unsafe and inefficient uses of copy_string
Add a new overload to `copy_string` that takes the input characters plus
size. The new overload avoids inefficient scanning of the input for the
null terminator in cases where we know the size beforehand. Furthermore,
this overload *must* be used when dealing with input character sequences
that may have no null terminator, e.g., when the input is from a
`std::string_view` object.
2023-11-03 15:25:38 +01:00
Benjamin Bannier
f5a76c1aed Reformat Zeek in Spicy style
This largely copies over Spicy's `.clang-format` configuration file. The
one place where we deviate is header include order since Zeek depends on
headers being included in a certain order.
2023-10-30 09:40:55 +01:00
Arne Welzel
cbaf43e8ea VectorVal: Embed vector_val
Similar motivation as for RecordVal, save an extra malloc/free
and pointer indirection.

This breaks the `auto& RawVec()` API which previously returned
a reference to the std::vector*. It now returns a reference
to the vector instead. It's commented as intended for internal
and compiled code, so even though it's public API,

The previous `std::vector<std::optional<ZVal>>*&` return type was also very
likely not intended (all consumers just dereference it anyhow). I'm certain
this API was never meant to modify the actual pointer value.

I've switched to explicit typing, too.
2023-09-22 21:52:52 +02:00
Tim Wojtulewicz
75188ea6d7 Fix minor type-clash warning on Windows 2023-05-25 16:50:22 -07:00
Arne Welzel
89c828ac14 Merge remote-tracking branch 'origin/topic/vern/record-optimizations.Apr23B'
* origin/topic/vern/record-optimizations.Apr23B:
  different fix for MSVC compiler issues
  more general approach for addressing MSVC compiler issues with IntrusivePtr
  restored RecordType::Create, now marked as deprecated tidying of namespaces and private class members simplification of flagging record field initializations that should be skipped address peculiar MSVC compilation complaint for IntrusivePtr's
  clarifications and tidying for record field initializations
  optimize record construction by deferring initializations of aggregates
  compile-scripts-to-C++ speedups by switching to raw record access
  logging speedup by switching to raw record access
  remove redundant record coercions

Removed the `#if 0` hunk during merging: Probably could have gone with a
doctest instead.
2023-04-19 11:59:56 +02:00
Arne Welzel
a5e7faf564 logging/Manager: Fix crash for rotation format function not returning
While working on a rotation format function, ran into Zeek crashing
when not returning a value from it, fix and recover the same way as
for scripting errors.
2023-04-13 09:23:51 +02:00
Vern Paxson
4600ca41f6 logging speedup by switching to raw record access 2023-04-10 11:43:19 -07:00
Arne Welzel
545b867ddd logging/Manager: Remove unused variable 2023-02-27 12:51:03 +01:00
Arne Welzel
69a98e2cbb logging: Add telemetry for streams and log writers
This adds one metric per log stream and one metric per log writer (path based)
to track the number of writes on a stream level as well as on a writer level.

    $ curl -sSf localhost:8181/metrics | grep Conn
    zeek_log_writer_writes_total{endpoint="",filter-name="default",module="HTTP",path="http",stream="HTTP::LOG",writer="Log::WRITER_SQLITE"} 1 1677497572770
    zeek_log_stream_writes_total{endpoint="",module="HTTP",stream="HTTP::LOG"} 1 1677497572770

The initial version of this change also included metrics around log
write vetoes, but given no log policies exist in the default configuration
and they are mostly interesting for a few streams/writers only, skip this
for now. These can always be added by the script writer, too.

The difference between the stream level writes and concrete writers can
be used to deduce the number of vetoes (or errors) as a starting point.
2023-02-27 12:51:03 +01:00
Tim Wojtulewicz
3b0e8ee6f1 Fix a bunch of missing class member initializations 2023-01-27 13:03:18 -07:00
Tim Wojtulewicz
2739275b88 Merge remote-tracking branch 'jsoref/spelling-src'
* jsoref/spelling-src:
  Spelling src
2022-11-11 12:49:15 -07:00
Josh Soref
cd201aa24e Spelling src
These are non-functional changes.

* accounting
* activation
* actual
* added
* addresult
* aggregable
* aligned
* alternatively
* ambiguous
* analysis
* analyzer
* anticlimactic
* apparently
* application
* appropriate
* arithmetic
* assignment
* assigns
* associated
* authentication
* authoritative
* barrier
* boundary
* broccoli
* buffering
* caching
* called
* canonicalized
* capturing
* certificates
* ciphersuite
* columns
* communication
* comparison
* comparisons
* compilation
* component
* concatenating
* concatenation
* connection
* convenience
* correctly
* corresponding
* could
* counting
* data
* declared
* decryption
* defining
* dependent
* deprecated
* detached
* dictionary
* directional
* directly
* directory
* discarding
* disconnecting
* distinguishes
* documentation
* elsewhere
* emitted
* empty
* endianness
* endpoint
* enumerator
* essentially
* evaluated
* everything
* exactly
* execute
* explicit
* expressions
* facilitates
* fiddling
* filesystem
* flag
* flagged
* for
* fragments
* guarantee
* guaranteed
* happen
* happening
* hemisphere
* identifier
* identifies
* identify
* implementation
* implemented
* implementing
* including
* inconsistency
* indeterminate
* indices
* individual
* information
* initial
* initialization
* initialize
* initialized
* initializes
* instantiate
* instantiated
* instantiates
* interface
* internal
* interpreted
* interpreter
* into
* it
* iterators
* length
* likely
* log
* longer
* mainly
* mark
* maximum
* message
* minimum
* module
* must
* name
* namespace
* necessary
* nonexistent
* not
* notifications
* notifier
* number
* objects
* occurred
* operations
* original
* otherwise
* output
* overridden
* override
* overriding
* overwriting
* ownership
* parameters
* particular
* payload
* persistent
* potential
* precision
* preexisting
* preservation
* preserved
* primarily
* probably
* procedure
* proceed
* process
* processed
* processes
* processing
* propagate
* propagated
* prototype
* provides
* publishing
* purposes
* queue
* reached
* reason
* reassem
* reassemble
* reassembler
* recommend
* record
* reduction
* reference
* regularly
* representation
* request
* reserved
* retrieve
* returning
* separate
* should
* shouldn't
* significant
* signing
* simplified
* simultaneously
* single
* somebody
* sources
* specific
* specification
* specified
* specifies
* specify
* statement
* subdirectories
* succeeded
* successful
* successfully
* supplied
* synchronization
* tag
* temporarily
* terminating
* that
* the
* transmitted
* true
* truncated
* try
* understand
* unescaped
* unforwarding
* unknown
* unknowndata
* unspecified
* update
* usually
* which
* wildcard

Signed-off-by: Josh Soref <2119212+jsoref@users.noreply.github.com>
2022-11-09 12:08:15 -05:00
Tim Wojtulewicz
e8dbfc1cb0 Fix a bunch of variable shadowing issues from LGTM 2022-11-02 15:54:51 -07:00
Tim Wojtulewicz
f624c18383 Deprecate bro_int_t and bro_uint_t 2022-07-12 12:01:23 -07:00
Arne Welzel
aaa47a709c logging: Introduce Log::default_logdir deprecate LogAscii::logdir and per writer logdir
Also modify FormatRotationPath to keep rotated logs within
Log::default_logdir unless the rotation function explicitly
set dir, e.g. by when the user redef'ed default_rotation_interval.
2022-07-06 18:54:29 +02:00
Tim Wojtulewicz
47e7fe2cd1 Convert Dictionary types to be templated classes
This has the fortunate side-effect of also making it so we can store
the value objects as typed pointers, instead of void*.
2022-07-05 13:33:34 -07:00