Mirror/zeek - git.uphillsecurity.com: We code.

mirror of https://github.com/zeek/zeek.git synced 2025-10-02 06:38:20 +00:00

Author	SHA1	Message	Date
Tim Wojtulewicz	0ec2161b04	Add options to filter at the stream level as well as globally	2025-08-12 17:31:28 -07:00
Tim Wojtulewicz	837fde1a08	Add metrics to track string and container fields limited by length	2025-08-12 17:31:28 -07:00
Tim Wojtulewicz	cd74a4e138	Replace unused stream argument from RecordToLogRecord with WriterInfo This also adds a WriterInfo argument to ValToLogVal and passes the one from RecordToLogRecord into it.	2025-08-12 17:31:28 -07:00
Tim Wojtulewicz	e2e7ab28da	Implement string- and container-length filtering at the log record level	2025-08-12 17:31:28 -07:00
Tim Wojtulewicz	e458da944f	Return weird if a log line is over a configurable size limit	2025-07-21 09:14:52 -07:00
Tim Wojtulewicz	f386deba94	Fix clang-tidy performance-enum-size warnings in headers	2025-06-23 08:35:24 -07:00
Arne Welzel	ab1d48c95a	logging/Manager: Implement new WriteBatchFromRemote()	2024-12-04 12:40:35 +01:00
Arne Welzel	78999d147d	logging/Manager: Extract another CreateWriter() helper For other cluster backends, CreateWriter() will use a logger's filter configuration rather than receiving all configuration through CreateLog. Extract a helper out from WriteToFilters() for reuse.	2024-09-27 15:32:09 +02:00
Arne Welzel	0d925e935e	logging: Dedicated log flush timer Log flushing is currently triggered based on the threading heartbeat timer of WriterBackends and the hard-coded WRITE_BUFFER_SIZE 1000. This change introduces a separate timer that is managed by the logger manager instead of piggy-backing on the heartbeat timer, as well as a const &redef for the buffer size. This allows to modify the log flush frequency and batch size independently of the threading heartbeat interval. Later, this will allow to re-use the buffering and flushing logic of writer frontends for non-Broker cluster backends, too. One change here is that even frontends that do not have a backend will be flushed regularly. This is wanted for non-Broker backends and should be very cheap. Possibly, Broker can piggy back on this timer down the road, too, rather than using its own script-level timer (see Broker::log_flush()).	2024-09-27 15:30:35 +02:00
Arne Welzel	245fd0c94f	broker/logging: Change threading::Value** usage to std::vector instead This allows to leverage automatic memory management, fewer allocations and using move semantics for expressing ownership. This breaks the existing logging and broker API, but keeps the plugin DoWrite() and HookLogWrite() methods functioning. It further changes ValToLogVal to return a threading::Value rather than a threading::Value*. The vector_val and set_val fields unfortunately use the same pointer-to-array-of-pointers approach. This can't be changed as it'd break backwards compatibility for plugin provided input readers and log writers.	2024-08-30 10:58:57 +02:00
Tim Wojtulewicz	46ff48c29a	Change all instruments to only handle doubles	2024-05-31 13:36:37 -07:00
Tim Wojtulewicz	a0ae06b3cd	Convert telemetry code to use prometheus-cpp	2024-05-31 13:30:31 -07:00
Arne Welzel	ee65623600	logging/Manager: Make LogDelayExpiredTimer an implementation detail The only reason this was a private component of Manager was to access the Stream's function. Use a generic callback and a lambda to avoid that exposure.	2023-11-30 12:25:49 +01:00
Arne Welzel	2dbb467ba2	logging: Implement get_delay_queue_size() Primarily for introspection given that re-delaying may exceed queue sizes.	2023-11-29 11:53:11 +01:00
Arne Welzel	f0e67022fd	logging: Introduce Log::delay() and Log::delay_finish() This is a verbose, opinionated and fairly restrictive version of the log delay idea. Main drivers are explicitness, foot-gun-avoidance and implementation simplicity. Calling the new Log::delay() function is only allowed within the execution of a Log::log_stream_policy() hook for the currently active log write. Conceptually, the delay is placed between the execution of the global stream policy hook and the individual filter policy hooks. A post delay callback can be registered with every Log::delay() invocation. Post delay callbacks can (1) modify a log record as they see fit, (2) veto the forwarding of the log record to the log filters and (3) extend the delay duration by calling Log::delay() again. The last point allows to delay a record by an indefinite amount of time, rather than a fixed maximum amount. This should be rare and is therefore explicit. Log::delay() increases an internal reference count and returns an opaque token value to be passed to Log::delay_finish() to release a delay reference. Once all references are released, the record is forwarded to all filters attached to a stream when the delay completes. This functionality separates Log::log_stream_policy() and individual filter policy hooks. One consequence is that a common use-case of filter policy hooks, removing unproductive log records, may run after a record was delayed. Users can lift their filtering logic to the stream level (or replicate the condition before the delay decision). The main motivation here is that deciding on a stream-level delay in per-filter hooks is too late. Attaching multiple filters to a stream can additionally result in hard to understand behavior. On the flip side, filter policy hooks are guaranteed to run after the delay and can be used for further mangling or filtering of a delayed record.	2023-11-29 11:53:11 +01:00
Arne Welzel	3afd6242c7	logging/Manager: Split Write() If we delay in the stream policy hook, we'll need to resume writing to the attached filters later on. Prepare for that by splitting out the filter processing.	2023-11-29 11:53:11 +01:00
Benjamin Bannier	f5a76c1aed	Reformat Zeek in Spicy style This largely copies over Spicy's `.clang-format` configuration file. The one place where we deviate is header include order since Zeek depends on headers being included in a certain order.	2023-10-30 09:40:55 +01:00
Vern Paxson	4600ca41f6	logging speedup by switching to raw record access	2023-04-10 11:43:19 -07:00
Arne Welzel	69a98e2cbb	logging: Add telemetry for streams and log writers This adds one metric per log stream and one metric per log writer (path based) to track the number of writes on a stream level as well as on a writer level. $ curl -sSf localhost:8181/metrics | grep Conn zeek_log_writer_writes_total{endpoint="",filter-name="default",module="HTTP",path="http",stream="HTTP::LOG",writer="Log::WRITER_SQLITE"} 1 1677497572770 zeek_log_stream_writes_total{endpoint="",module="HTTP",stream="HTTP::LOG"} 1 1677497572770 The initial version of this change also included metrics around log write vetoes, but given no log policies exist in the default configuration and they are mostly interesting for a few streams/writers only, skip this for now. These can always be added by the script writer, too. The difference between the stream level writes and concrete writers can be used to deduce the number of vetoes (or errors) as a starting point.	2023-02-27 12:51:03 +01:00
Josh Soref	cd201aa24e	Spelling src These are non-functional changes. * accounting * activation * actual * added * addresult * aggregable * aligned * alternatively * ambiguous * analysis * analyzer * anticlimactic * apparently * application * appropriate * arithmetic * assignment * assigns * associated * authentication * authoritative * barrier * boundary * broccoli * buffering * caching * called * canonicalized * capturing * certificates * ciphersuite * columns * communication * comparison * comparisons * compilation * component * concatenating * concatenation * connection * convenience * correctly * corresponding * could * counting * data * declared * decryption * defining * dependent * deprecated * detached * dictionary * directional * directly * directory * discarding * disconnecting * distinguishes * documentation * elsewhere * emitted * empty * endianness * endpoint * enumerator * essentially * evaluated * everything * exactly * execute * explicit * expressions * facilitates * fiddling * filesystem * flag * flagged * for * fragments * guarantee * guaranteed * happen * happening * hemisphere * identifier * identifies * identify * implementation * implemented * implementing * including * inconsistency * indeterminate * indices * individual * information * initial * initialization * initialize * initialized * initializes * instantiate * instantiated * instantiates * interface * internal * interpreted * interpreter * into * it * iterators * length * likely * log * longer * mainly * mark * maximum * message * minimum * module * must * name * namespace * necessary * nonexistent * not * notifications * notifier * number * objects * occurred * operations * original * otherwise * output * overridden * override * overriding * overwriting * ownership * parameters * particular * payload * persistent * potential * precision * preexisting * preservation * preserved * primarily * probably * procedure * proceed * process * processed * processes * processing * propagate * propagated * prototype * provides * publishing * purposes * queue * reached * reason * reassem * reassemble * reassembler * recommend * record * reduction * reference * regularly * representation * request * reserved * retrieve * returning * separate * should * shouldn't * significant * signing * simplified * simultaneously * single * somebody * sources * specific * specification * specified * specifies * specify * statement * subdirectories * succeeded * successful * successfully * supplied * synchronization * tag * temporarily * terminating * that * the * transmitted * true * truncated * try * understand * unescaped * unforwarding * unknown * unknowndata * unspecified * update * usually * which * wildcard Signed-off-by: Josh Soref <2119212+jsoref@users.noreply.github.com>	2022-11-09 12:08:15 -05:00
Vern Paxson	d758585e42	updated Bro->Zeek in comments in the source tree	2022-01-24 14:26:20 -08:00
Tim Wojtulewicz	d50dade24c	GH-1768: Properly clean up existing log stream when recreated with the same ID	2021-12-03 13:46:28 -07:00
Tim Wojtulewicz	331161138a	Unify all of the Tag types into one type - Remove tag types for each component type (analyzer, etc) - Add deprecated versions of the old types - Remove unnecessary tag element from templates for TaggedComponent and ComponentManager - Enable TaggedComponent to pass an EnumType when initializing Tag objects - Update some tests that are affected by the tag enum values changing order	2021-11-23 19:36:49 -07:00
Tim Wojtulewicz	b2f171ec69	Reformat the world	2021-09-16 15:35:39 -07:00
Christian Kreibich	795a7ea98e	Add a global log policy hook to the logging framework This addresses the need for a central hook on any log write, which wasn't previously doable without a lot of effort. The log manager invokes the new Log::log_stream_policy hook prior to any filter-specific hooks. Like filter-level hooks, it may veto a log write. Even when it does, filter-level hooks still get invoked, but cannot "un-veto". Includes test cases.	2021-07-02 12:42:45 -07:00
Tim Wojtulewicz	4ad08172d0	Remove obsolete ZEEK_FORWARD_DECLARE_NAMESPACED macros	2021-02-24 14:35:44 -07:00
Tim Wojtulewicz	0618be792f	Remove all of the random single-file deprecations These are the changes that don't require a ton of changes to other files outside of the original removal.	2021-01-27 10:52:40 -07:00
Tim Wojtulewicz	96d9115360	GH-1079: Use full paths starting with zeek/ when including files	2020-11-12 12:15:26 -07:00
Tim Wojtulewicz	fe0c22c789	Base: Clean up explicit uses of namespaces in places where they're not necessary. This commit covers all of the common and base classes.	2020-08-24 12:07:00 -07:00
Tim Wojtulewicz	4b61d60e80	Fix indentation of namespaced aliases	2020-08-20 16:11:46 -07:00
Tim Wojtulewicz	45b5c6e619	Move logging code to zeek namespaces	2020-08-20 15:55:17 -07:00
Tim Wojtulewicz	c9ab1f93e7	Move a few low-use classes to namespaces	2020-07-31 16:25:47 -04:00
Jon Siwek	a06ef66edc	Add Log::rotation_format_func and Log::default_rotation_dir options These may be redefined to customize log rotation path prefixes, including use of a directory. File extensions are still up to individual log writers to add themselves during the actual rotation. These new options also allow for some simplification of the default ASCII postprocessor function: it eliminates the need for it doing an extra/awkward rename() operation that only changes the timestamp format. This also teaches the supervisor framework to use these new options to rotate ascii logs into a log-queue/ directory with a specific file name format (intended for an external archiver process to monitor separately).	2020-07-07 18:42:37 -07:00
Jon Siwek	11949ce37a	Implement leftover log rotation/archival for supervised nodes This helps prevent a node from being killed/crashing in the middle of writing a log, restarting, and eventually clobbering that log file that never underwent the rotation/archival process. The old `archive-log` and `post-terminate` scripts as used by ZeekControl previously implemented this behavior, but the new logic is entirely in the ASCII writer. It uses ".shadow" log files stored alongside the real log to help detect such scenarios and rotate them correctly upon the next startup of the Zeek process.	2020-07-07 18:39:23 -07:00
Tim Wojtulewicz	64332ca22c	Move all Val classes to the zeek namespaces	2020-06-30 20:48:09 -07:00
Tim Wojtulewicz	137e416a03	Rename BroType to Type	2020-06-10 14:27:36 -07:00
Tim Wojtulewicz	ed13972924	Move Type types to zeek namespace	2020-06-09 17:20:45 -07:00
Johanna Amann	876c803d75	Merge remote-tracking branch 'origin/topic/timw/776-using-statements' * origin/topic/timw/776-using-statements: Remove 'using namespace std' from SerialTypes.h Remove other using statements from headers GH-776: Remove using statements added by PR 770 Includes small fixes in files that changed since the merge request was made. Also includes a few small indentation fixes.	2020-04-09 13:31:07 -07:00
Tim Wojtulewicz	cb01e098df	iosource/threading/input/logging: Replace nulls with nullptr	2020-04-07 16:08:34 -07:00
Tim Wojtulewicz	d53c1454c0	Remove 'using namespace std' from SerialTypes.h This unfortunately causes a ton of flow-down changes because a lot of other code was depending on that definition existing. This has a fairly large chance to break builds of external plugins, considering how many internal ones it broke.	2020-04-07 15:59:59 -07:00
Tim Wojtulewicz	5a237d3a3f	Use const-references in lots of places (performance-unnecessary-value-param)	2020-02-11 14:11:18 -08:00
Max Kellermann	0db61f3094	include cleanup The Zeek code base has very inconsistent #includes. Many sources included a few headers, and those headers included other headers, and in the end, nearly everything is included everywhere, so missing #includes were never noticed. Another side effect was a lot of header bloat which slows down the build. First step to fix it: in each source file, its own header should be included first to verify that each header's includes are correct, and none is missing. After adding the missing #includes, I replaced lots of #includes inside headers with class forward declarations. In most headers, object pointers are never referenced, so declaring the function prototypes with forward-declared classes is just fine. This patch speeds up the build by 19%, because each compilation unit gets smaller. Here are the "time" numbers for a fresh build (with a warm page cache but without ccache): Before this patch: 3144.94user 161.63system 3:02.87elapsed 1808%CPU (0avgtext+0avgdata 2168608maxresident)k 760inputs+12008400outputs (1511major+57747204minor)pagefaults 0swaps After this patch: 2565.17user 141.83system 2:25.46elapsed 1860%CPU (0avgtext+0avgdata 1489076maxresident)k 72576inputs+9130920outputs (1667major+49400430minor)pagefaults 0swaps	2020-02-04 20:51:02 +01:00
Dominik Charousset	c1f3fe7829	Switch from header guards to pragma once	2019-09-17 14:10:30 +02:00
Johanna Amann	dcd6454530	Remove RemoteSerializer and related code/types. Also removes broccoli from the source tree.	2019-05-03 15:00:13 -07:00
Robin Sommer	fe7e1ee7f0	Merge topic/actor-system through a squashed commit.	2018-05-18 22:39:23 +00:00
Johanna Amann	1f2bf50b49	Remove unimplemented & unused functions from header files. All of these functions were defined in header files without ever being implemented or used.	2018-03-16 18:38:04 -07:00
Robin Sommer	5cf7803e68	Fix some minor issues. From Daniel, thanks!	2017-02-23 17:18:43 -08:00
Robin Sommer	511ca9e043	Adding Broker ifdefs for new remote logging code.	2017-02-17 16:28:20 -08:00
Robin Sommer	a5e9a535a5	Changing semantics of Broker's remote logging to match old communication framework. Broker had changed the semantics of remote logging: it sent over the original Bro record containing the values to be logged, which on the receiving side would then pass through the logging framework normally, including triggering filters and events. The old communication system however special-cases logs: it sends already processed log entries, just as they go into the log files, and without any receiver-side filtering etc. This is more efficient as it short-cuts the processing path, and also avoids the more expensive Val serialization. It also lets the sender determine the specifics of what gets logged (and how). This commit changes Broker over to now use the same semantics as the old communication system. TODOs: - The new Broker code doesn't have consistent #ifdefs yet. - Right now, when a new log receiver connects, all existing logs are broadcast out again to all current clients. That doesn't do any harm, but is unnecessary. Need to add a way to send the existing logs to just the new client.	2017-02-10 18:46:45 -08:00
Jon Siwek	b06d82cced	broker integration: add API documentation (broxygen/doxygen) Also changed asynchronous data store query code a bit; trying to make memory management and handling of corner cases a bit clearer (former maybe could still be better, but I need to lookup queries by memory address to associate response cookies to them, and so wrapping pointers kind of just gets in the way).	2015-02-17 10:50:57 -06:00

72 commits