Commit graph

42 commits

Author SHA1 Message Date
Arne Welzel
3dd1f8d38a logging/WriterFrontend: Add LogWriteHeader as member
The header captures the enum values as well as the fields
2024-12-04 12:37:22 +01:00
Arne Welzel
f5d4526eac logging: Add filter_name to WriterInfo
...with this change, it'll be possible to identify WriterFrontend's
based on (stream, filter_name, path) pairs in addition to (stream,
writer, path) pairs.
2024-12-04 12:37:22 +01:00
Arne Welzel
0d925e935e logging: Dedicated log flush timer
Log flushing is currently triggered based on the threading heartbeat timer
of WriterBackends and the hard-coded WRITE_BUFFER_SIZE 1000.

This change introduces a separate timer that is managed by the logger
manager instead of piggy-backing on the heartbeat timer, as well as a
const &redef for the buffer size.

This allows to modify the log flush frequency and batch size independently
of the threading heartbeat interval. Later, this will allow to re-use the
buffering and flushing logic of writer frontends for non-Broker cluster
backends, too.

One change here is that even frontends that do not have a backend will
be flushed regularly. This is wanted for non-Broker backends and should be
very cheap. Possibly, Broker can piggy back on this timer down the road, too,
rather than using its own script-level timer (see Broker::log_flush()).
2024-09-27 15:30:35 +02:00
Arne Welzel
f0ab10a46c logging/WriterFrontend: No need for explicit CleanupWriteBuffer()
Any pending records will be cleaned in the destructor of WriterFrontend
and WriteBuffer, no need to do this explicitly.
2024-08-30 11:00:17 +02:00
Arne Welzel
245fd0c94f broker/logging: Change threading::Value** usage std::vector instead
This allows to leverage automatic memory management, less allocations
and using move semantics for expressing ownership.

This breaks the existing logging and broker API, but keeps the plugin
DoWrite() and HookLogWrite() methods functioning.

It further changes ValToLogVal to return a threading::Value rather than
a threading::Value*. The vector_val and set_val fields unfortunately
use the same pointer-to-array-of-pointers approach. this can'tbe changed
as it'd break backwards compatibility for plugin provided input readers
and log writers.
2024-08-30 10:58:57 +02:00
Tim Wojtulewicz
c3e869b827 Don't attempt to stop or flush disabled writer frontends 2024-04-22 16:45:55 -07:00
Benjamin Bannier
f5a76c1aed Reformat Zeek in Spicy style
This largely copies over Spicy's `.clang-format` configuration file. The
one place where we deviate is header include order since Zeek depends on
headers being included in a certain order.
2023-10-30 09:40:55 +01:00
Arne Welzel
7a043e5e8f all: Fix typos identified by typos pre-commit hook 2023-06-13 17:57:32 +02:00
Josh Soref
cd201aa24e Spelling src
These are non-functional changes.

* accounting
* activation
* actual
* added
* addresult
* aggregable
* aligned
* alternatively
* ambiguous
* analysis
* analyzer
* anticlimactic
* apparently
* application
* appropriate
* arithmetic
* assignment
* assigns
* associated
* authentication
* authoritative
* barrier
* boundary
* broccoli
* buffering
* caching
* called
* canonicalized
* capturing
* certificates
* ciphersuite
* columns
* communication
* comparison
* comparisons
* compilation
* component
* concatenating
* concatenation
* connection
* convenience
* correctly
* corresponding
* could
* counting
* data
* declared
* decryption
* defining
* dependent
* deprecated
* detached
* dictionary
* directional
* directly
* directory
* discarding
* disconnecting
* distinguishes
* documentation
* elsewhere
* emitted
* empty
* endianness
* endpoint
* enumerator
* essentially
* evaluated
* everything
* exactly
* execute
* explicit
* expressions
* facilitates
* fiddling
* filesystem
* flag
* flagged
* for
* fragments
* guarantee
* guaranteed
* happen
* happening
* hemisphere
* identifier
* identifies
* identify
* implementation
* implemented
* implementing
* including
* inconsistency
* indeterminate
* indices
* individual
* information
* initial
* initialization
* initialize
* initialized
* initializes
* instantiate
* instantiated
* instantiates
* interface
* internal
* interpreted
* interpreter
* into
* it
* iterators
* length
* likely
* log
* longer
* mainly
* mark
* maximum
* message
* minimum
* module
* must
* name
* namespace
* necessary
* nonexistent
* not
* notifications
* notifier
* number
* objects
* occurred
* operations
* original
* otherwise
* output
* overridden
* override
* overriding
* overwriting
* ownership
* parameters
* particular
* payload
* persistent
* potential
* precision
* preexisting
* preservation
* preserved
* primarily
* probably
* procedure
* proceed
* process
* processed
* processes
* processing
* propagate
* propagated
* prototype
* provides
* publishing
* purposes
* queue
* reached
* reason
* reassem
* reassemble
* reassembler
* recommend
* record
* reduction
* reference
* regularly
* representation
* request
* reserved
* retrieve
* returning
* separate
* should
* shouldn't
* significant
* signing
* simplified
* simultaneously
* single
* somebody
* sources
* specific
* specification
* specified
* specifies
* specify
* statement
* subdirectories
* succeeded
* successful
* successfully
* supplied
* synchronization
* tag
* temporarily
* terminating
* that
* the
* transmitted
* true
* truncated
* try
* understand
* unescaped
* unforwarding
* unknown
* unknowndata
* unspecified
* update
* usually
* which
* wildcard

Signed-off-by: Josh Soref <2119212+jsoref@users.noreply.github.com>
2022-11-09 12:08:15 -05:00
Tim Wojtulewicz
b2f171ec69 Reformat the world 2021-09-16 15:35:39 -07:00
Tim Wojtulewicz
4ad08172d0 Remove obsolete ZEEK_FORWARD_DECLARE_NAMESPACED macros 2021-02-24 14:35:44 -07:00
Tim Wojtulewicz
0618be792f Remove all of the random single-file deprecations
These are the changes that don't require a ton of changes to other files outside
of the original removal.
2021-01-27 10:52:40 -07:00
Tim Wojtulewicz
96d9115360 GH-1079: Use full paths starting with zeek/ when including files 2020-11-12 12:15:26 -07:00
Tim Wojtulewicz
fe0c22c789 Base: Clean up explicit uses of namespaces in places where they're not necessary.
This commit covers all of the common and base classes.
2020-08-24 12:07:00 -07:00
Tim Wojtulewicz
4b61d60e80 Fix indentation of namespaced aliases 2020-08-20 16:11:46 -07:00
Tim Wojtulewicz
45b5c6e619 Move logging code to zeek namespaces 2020-08-20 15:55:17 -07:00
Tim Wojtulewicz
64332ca22c Move all Val classes to the zeek namespaces 2020-06-30 20:48:09 -07:00
Max Kellermann
0db61f3094 include cleanup
The Zeek code base has very inconsistent #includes.  Many sources
included a few headers, and those headers included other headers, and
in the end, nearly everything is included everywhere, so missing
#includes were never noticed.  Another side effect was a lot of header
bloat which slows down the build.

First step to fix it: in each source file, its own header should be
included first to verify that each header's includes are correct, and
none is missing.

After adding the missing #includes, I replaced lots of #includes
inside headers with class forward declarations.  In most headers,
object pointers are never referenced, so declaring the function
prototypes with forward-declared classes is just fine.

This patch speeds up the build by 19%, because each compilation unit
gets smaller.  Here are the "time" numbers for a fresh build (with a
warm page cache but without ccache):

Before this patch:

 3144.94user 161.63system 3:02.87elapsed 1808%CPU (0avgtext+0avgdata 2168608maxresident)k
 760inputs+12008400outputs (1511major+57747204minor)pagefaults 0swaps

After this patch:

 2565.17user 141.83system 2:25.46elapsed 1860%CPU (0avgtext+0avgdata 1489076maxresident)k
 72576inputs+9130920outputs (1667major+49400430minor)pagefaults 0swaps
2020-02-04 20:51:02 +01:00
Dominik Charousset
c1f3fe7829 Switch from header guards to pragma once 2019-09-17 14:10:30 +02:00
Robin Sommer
fe7e1ee7f0 Merge topic/actor-system throug a squashed commit. 2018-05-18 22:39:23 +00:00
Johanna Amann
1f2bf50b49 Remove unimplemented & unused functions from header files.
All of these functions were defined in header files without ever being
implemented or used.
2018-03-16 18:38:04 -07:00
Robin Sommer
5cf7803e68 Fix some minor issues.
From Daniel, thanks!
2017-02-23 17:18:43 -08:00
Robin Sommer
a5e9a535a5 Changing semantics of Broker's remote logging to match old communication framework.
Broker had changed the semantics of remote logging: it sent over the
original Bro record containing the values to be logged, which on the
receiving side would then pass through the logging framework normally,
including triggering filters and events. The old communication system
however special-cases logs: it sends already processed log entries,
just as they go into the log files, and without any receiver-side
filtering etc. This more efficient as it short-cuts the processing
path, and also avoids the more expensive Val serialization. It also
lets the sender determine the specifics of what gets logged (and how).

This commit changes Broker over to now use the same semantics as the
old communication system.

TODOs:
     - The new Broker code doesn't have consistent #ifdefs yet.

     - Right now, when a new log receiver connects, all existing logs
     are broadcasted out again to all current clients. That doesn't so
     any harm, but is unncessary. Need to add a way to send the
     existing logs to just the new client.
2017-02-10 18:46:45 -08:00
Robin Sommer
4059d4b4f1 Merge remote-tracking branch 'origin/topic/johanna/bit-1683'
Looks like the right fix. Two tiny tweaks:

     - changed the order of arguments for DeleteVals() for consistency
       with the corresponding Manager function.

     - turned the InternalWarning into a Warning: if I understand
       correctly, this can happen when scripts on nodes diverge; which
       is a user-side problem, not an internal Bro logic issue.

BIT-1683 #merged

* origin/topic/johanna/bit-1683:
  Actually check if the number of fields in a write are equal to the number of fields required.
2016-09-27 12:40:36 -07:00
Johanna Amann
038dfa6273 Actually check if the number of fields in a write are equal to the
number of fields required.

Addresses BIT-1683

I do not think this quite fixes the underlying issue of BIT-1683 - it
should not be possible to get to this state in normal operations.

Also fixes a small memory leak for disabled writers.
2016-09-22 16:43:37 -07:00
Bernhard Amann
39f1b9e01f Change thread shutdown again to also work with input framework.
Seems to work, tests pass, but not really verified.

Major change 1:
finished flag in MsgThread was replaced by 2 flags:
child_finished and main_finished.

child_finished is set by child_thread and means that the processing
loop is stopped immediately (no longer needed, no new input messages
will be processed, if loop continues running there is an ugly delay
on shutdown). (This took me a while to realize...)

main_finished is set by a message that is sent back by the child
to the main thread when Finished() is called (and child_finished
is set). when main_finished is set, processing of output messages
stops. But all messages that the child thread pushed in the queue
before calling Finish() are still processed.

Change 2:
Logging terminate call was replaced by a smaller call that just
flushes out the cache held by the main thread. This call
has to be done before thread shutdown is called - otherwhise
the threads will be shut down before all messages are pushed
on them. (This also took me a while to realize...).

Change 3:
Input framework actually calls it stop methods correctly (everything
was prepared, function call was missing)
2013-05-14 23:45:55 -07:00
Robin Sommer
38e1dc9ca4 Support for cleaning up threads that have terminated.
Once a BasicThread leaves its run() method, a thread is now marked for
cleaning up, and the ThreadMgr will soon join it to release the OS
resources.

Also, adding a function Log::remove_stream() that remove a logging
stream, stopping all writer threads that are associated with it.

Note, however, that removing a *filter* from a stream still doesn't
clean up any threads. The problem is that because of the output paths
potentially being created dynamically it's unclear if the writer
thread will still be needed in the future. We could add clean writers
up with timeouts, but that doesn't sound great either. So for now, the
only way to sure clean up logging threads is to remove the entire
stream.

Also note that cleanup doesn't work with input threads yet, which
don't seem to terminate (at least in the case I tried).
2013-03-14 14:59:05 -07:00
Robin Sommer
24aea295fa Merge branch 'topic/robin/master-test'
* topic/robin/master-test: (60 commits)
  Script fix for Linux.
  Updating test base line.
  Another small change to MsgThread API.
  Bug fix for BasicThread.
  make version_ok return true for TLSv12
  Sed usage in canonifier script didn't work on non-Linux systems.
  Changing HTTP DPD port 3138 to 3128.
  Temporarily removing tuning/logs-to-elasticsearch.bro from the test-all-policy.
  More documentation updates.
  Revert "Fixing calc_next_rotate to use UTC based time functions."
  Some documentation updates for elasticsearch plugin.
  Give configure a --disable-perftools option.
  Updating tests for the #start/#end change.
  Further threading and API restructuring for logging and input frameworks.
  Reworking forceful thread termination.
  Moving the ASCII writer over to use UNIX I/O rather than stdio.
  Further reworking the thread API.
  Reworking thread termination logic.
  If a thread doesn't terminate, we log that but not longer proceed (because it could hang later still).
  Removing the thread kill functionality.
  ...
2012-07-23 16:20:44 -07:00
Robin Sommer
87e10b5f97 Further threading and API restructuring for logging and input
frameworks.

There were a number of cases that weren't thread-safe. In particular,
we don't use std::string anymore for anything that's passed between
threads (but instead plain old const char*, with manual memmory
managmenet).

This is still a check-point commit, I'll do more testing.
2012-07-19 22:28:30 -07:00
Robin Sommer
f6b883bafc Further reworking the thread API. 2012-07-19 21:22:28 -07:00
Robin Sommer
f73eb3b086 Reworking thread termination logic.
Turns out the finish methods weren't called correctly, caused by a
mess up with method names which all sounded too similar and the wrong
one ended up being called. I've reworked this by changing the
thread/writer/reader interfaces, which actually also simplifies them
by getting rid of the requirement for writer backends to call their
parent methods (i.e., less opportunity for errors).

This commit also includes the following (because I noticed the problem
above when working on some of these):

     - The ASCII log writer now includes "#start <timestamp>" and
      "#end <timestamp> lines in the each file. The latter supersedes
      Bernhard's "EOF" patch.

      This required a number of tests updates. The standard canonifier
      removes the timestamps, but some tests compare files directly,
      which doesn't work if they aren't printing out the same
      timestamps (like the comm tests).

     - The above required yet another change to the writer API to
       network_time to methods.

     - Renamed ASCII logger "header" options to "meta".

     - Fixes #763 "Escape # when first character in log file line".

All btests pass for me on Linux FC15. Will try MacOS next.
2012-07-19 21:21:53 -07:00
Robin Sommer
b38d1e1ec2 Reworking log writer API to make it easier to pass additional
information to a writer's initialization method.

However, for now the information provided is still the same.
2012-06-21 11:57:45 -07:00
Robin Sommer
077089a047 Merge branch 'topic/robin/log-threads'
* topic/robin/log-threads: (42 commits)
  Two more tweaks to reliably terminate when reading from trace.
  This could be fixing the memory problems finally.
  Fix compile errors due to now-explicit IPAddr ctors and global IPFamily enum.
  Switching log buffer size back to normal
  Teaching cmake to always link in tcmalloc if it finds it.
  Extending queue statistics.
  Small fixes and tweaks.
  Don't assert during shutdown.
  Reverting accidental commit.
  Finetuning communication CPU usage.
  Adding new leak tests involving remote logging.
  Removing some no longer needed checks.
  Fixing problem logging remotely when local logging was turned off.
  Preventing busy looping when no threads have been spawned.
  Prevent manager from busy looping.
  Adding missing includes needed on FreeBSD.
  Updating submodule(s).
  Updating submodule(s).
  A number of bugfixes for the recent threading updates.
  Making exchange of addresses between threads thread-safe.
  ...
2012-04-04 17:32:13 -07:00
Robin Sommer
d7c9471818 Extending queue statistics. 2012-03-23 15:57:25 -07:00
Robin Sommer
c0678e7e1f Fixing problem logging remotely when local logging was turned off.
For that, moved the remote logging from the Manager to the
WriterFrontend. That also simplifies the Manager a bit.
2012-03-08 17:30:18 -08:00
Bernhard Amann
8a6dfee00c Merge remote-tracking branch 'origin/topic/robin/log-threads' into topic/bernhard/log-threads 2012-02-13 02:30:24 -08:00
Robin Sommer
b8ec653ebf Bugfixes.
- Data queued at termination wasn't written out completely.

    - Fixed some race conditions.

    - Fixing IOSource integration.

    - Fixing setting thread names on Linux.

    - Fixing minor leaks.

All tests now pass for me on Linux in debug and non-debug compiles.

Remaining TODOs:

        - Needs leak check.

        - Test on MacOS and FreeBSD.

        - More testing:
            - High volume traffic.
            - Different platforms.
2012-02-12 13:07:26 -08:00
Bernhard Amann
a0487ecb30 move Value and Field from the logging namespace to the threading namespace, because other modules using threading will need them. 2012-02-03 14:12:29 -08:00
Robin Sommer
70fe7876a1 Updating thread naming.
Also includes experimental code to adapt the thread name as shown by
top, but it's untested.
2012-02-03 04:04:38 -08:00
Robin Sommer
29fc56105d Documenting logging API. 2012-02-03 04:04:37 -08:00
Robin Sommer
4f0fc571ef Doing bulkd writes instead of individual writes now.
Also slight change to Writer API, going back to how the rotate methods
were before.
2012-02-03 04:04:37 -08:00
Robin Sommer
e4e770d475 Threaded logging framework.
This is based on Gilbert's code but I ended up refactoring it quite a
bit. That's why I didn't do a direct merge but started with a new
branch and copied things over to adapt. It looks quite a bit different
now as I tried to generalize things a bit more to also support the
Input Framework.

The larger changes code are:

    - Moved all logging code into subdirectory src/logging/. Code
      here is in namespace "logging".

    - Moved all threading code into subdirectory src/threading/. Code
      here is in namespace "threading".

    - Introduced a central thread manager that tracks threads and is
      in charge of termination and (eventually) statistics.

    - Refactored logging independent threading code into base classes
      BasicThread and MsgThread. The former encapsulates all the
      pthread code with simple start/stop methods and provides a
      single Run() method to override.

      The latter is derived from BasicThread and adds bi-directional
      message passing between main and child threads. The hope is that
      the Input Framework can reuse this part quite directly.

    - A log writer is now split into a general WriterFrontend
      (LogEmissary in Gilbert's code) and a type-specific
      WriterBackend. Specific writers are implemented by deriving from
      the latter. (The plugin interface is almost unchanged compared
      to the 2.0 version.).

      Frontend and backend communicate via MsgThread's message
      passing.

    - MsgThread (and thus WriterBackend) has a Heartbeat() method that
      a thread can override to execute code on a regular basis. It's
      triggered roughly once a second by the main thread.

    - Integration into "the rest of Bro". Threads can send messages to
      the reporter and do debugging output; they are hooked into the
      I/O loop for sending messages back; and there's a new debugging
      stream "threading" that logs, well, threading activity.

This all seems to work for the most part, but it's not done yet.

TODO list:

    - Not all tests pass yet. In particular, diffs for the external
      tests seem to indicate some memory problem (no crashes, just an
      occasional weird character).

    - Only tested in --enable-debug mode.

    - Only tested on Linux.

    - Needs leak check.

    - Each log write is currently a single inter-thread message. Bring
      Gilbert's bulk writes back.

    - Code needs further cleanup.

    - Document the class API.

    - Document the internal structure of the logging framework.

    - Check for robustness: live traffic, aborting, signals, etc.

    - Add thread statistics to profile.log (most of the code is there).

    - Customize the OS-visible thread names on platforms that support it.
2012-01-27 17:16:14 -08:00