Commit graph

16 commits

Author SHA1 Message Date
Robin Sommer
e4e770d475 Threaded logging framework.
This is based on Gilbert's code but I ended up refactoring it quite a
bit. That's why I didn't do a direct merge but started with a new
branch and copied things over to adapt. It looks quite a bit different
now as I tried to generalize things a bit more to also support the
Input Framework.

The larger changes code are:

    - Moved all logging code into subdirectory src/logging/. Code
      here is in namespace "logging".

    - Moved all threading code into subdirectory src/threading/. Code
      here is in namespace "threading".

    - Introduced a central thread manager that tracks threads and is
      in charge of termination and (eventually) statistics.

    - Refactored logging independent threading code into base classes
      BasicThread and MsgThread. The former encapsulates all the
      pthread code with simple start/stop methods and provides a
      single Run() method to override.

      The latter is derived from BasicThread and adds bi-directional
      message passing between main and child threads. The hope is that
      the Input Framework can reuse this part quite directly.

    - A log writer is now split into a general WriterFrontend
      (LogEmissary in Gilbert's code) and a type-specific
      WriterBackend. Specific writers are implemented by deriving from
      the latter. (The plugin interface is almost unchanged compared
      to the 2.0 version.).

      Frontend and backend communicate via MsgThread's message
      passing.

    - MsgThread (and thus WriterBackend) has a Heartbeat() method that
      a thread can override to execute code on a regular basis. It's
      triggered roughly once a second by the main thread.

    - Integration into "the rest of Bro". Threads can send messages to
      the reporter and do debugging output; they are hooked into the
      I/O loop for sending messages back; and there's a new debugging
      stream "threading" that logs, well, threading activity.

This all seems to work for the most part, but it's not done yet.

TODO list:

    - Not all tests pass yet. In particular, diffs for the external
      tests seem to indicate some memory problem (no crashes, just an
      occasional weird character).

    - Only tested in --enable-debug mode.

    - Only tested on Linux.

    - Needs leak check.

    - Each log write is currently a single inter-thread message. Bring
      Gilbert's bulk writes back.

    - Code needs further cleanup.

    - Document the class API.

    - Document the internal structure of the logging framework.

    - Check for robustness: live traffic, aborting, signals, etc.

    - Add thread statistics to profile.log (most of the code is there).

    - Customize the OS-visible thread names on platforms that support it.
2012-01-27 17:16:14 -08:00
Jon Siwek
edc0a451f8 Teach LogWriterAscii to use BRO_LOG_SUFFIX env. var. (addresses #704) 2011-12-01 16:18:56 -06:00
Robin Sommer
8aaccf1c95 Logging speed improvements.
We now use Google's replacement functions for slow printf-based
num-to-ascii conversion.
2011-10-06 15:55:45 -07:00
Robin Sommer
630c256a72 Merge remote branch 'origin/topic/gilbert/ascii-header'
* origin/topic/gilbert/ascii-header:
  Updated tests; removed net type from type conversion code.
  Updated header format (see #558)
  Header modification to LogWriterAscii to make it easier for scripts to understand bro log files.

Notes:

    - I've refactored the code a bit, also adapting the style a bit.
      Also edited the header format slightly.

    - I'm skipping the testing/btest/profiles directory, which seems
      unrelated.

    - I'm also skipping the baseline updates as they weren't
      up-to-date anymore. Will update them in a subsequent commit.
2011-09-04 12:12:08 -07:00
Robin Sommer
96a9d488e0 Reworking logging's postprocessor logic.
The main change is that the postprocessor commands are no longer run
by the log writers themselves. Instead, the writers send back a
message to the log mgr once they have rotated. The manager then calls
a script level function to do somethign with the rotated file. By
default, it will be renamed to somethingn nice and then a
postprocessor shell command will be run on it if defined.

Pieces going into this:

    - Terminology change: "postprocessor" now refers to a script
    *function*. In addition, there are "postprocessor commands", which
    are shell commands that may be triggered by the function to run on
    a rotated file.

    - The RotationInfo record now comes with all the information that
    was previously provided internally to the C++ function running the
    post-processor command.

    - Changing the default time format to %Y-%m-%d-%H-%M-%S

    - rotation_path_func is gone

    - The default postprocessor function is defined individually by
      each LogWriter in frameworks/logging/plugin/*

    - The interface to postprocessor shell commands remains the same.

Needs a bit more testing ...
2011-07-29 17:32:33 -07:00
Robin Sommer
13a492091f Merge remote branch 'origin/topic/robin/logging-internals'
Includes some additional cleanup.
2011-04-20 21:30:41 -07:00
Robin Sommer
c6e3174bc8 The logging systems now supports fields of type set[<atomic_type>]. 2011-03-09 18:01:41 -08:00
Robin Sommer
b69ecff3ee More options for the ASCII writer.
# The prefix for the header line if included.
	const header_prefix = "# " &redef;

	# The string to use for empty string fields.
	const empty_field = "" &redef;

	# The string to use for an unset optional field.
	const unset_field = "-" &redef;
2011-03-09 16:52:46 -08:00
Robin Sommer
cb9e0a5d5a If a field value contains the separator, that is now escape with hex
characters.
2011-03-09 16:26:11 -08:00
Robin Sommer
c6d20dbfdf Adding a few options to the ASCII writer.
module LogAscii;

export {
	# Output everything to stdout rather than into files. This is primarily
	# for testing purposes.
	const output_to_stdout = F &redef;

	# The separator between fields.
	const separator = "\t" &redef;

	# True to include a header line with column names.
	const include_header = T &redef;
}
2011-03-08 21:44:46 -08:00
Robin Sommer
26eab74ecc The ASCII writer can now deal with /dev/* paths.
It will not longer try to add a ".log" extension.
2011-03-08 17:59:05 -08:00
Robin Sommer
d6cef16f77 Rotation support.
This follows rather closely how rotation currently works in
rotate-logs.bro. logging.bro now defines:

        # Default rotation interval; zero disables rotation.
        const default_rotation_interval = 0secs &redef;

        # Default naming suffix format.
        const default_rotation_date_format = "%y-%m-%d_%H.%M.%S" &redef;

        # Default postprocessor for writers outputting into files.
        const default_rotation_postprocessor = "" &redef;

        # Default function to construct the name of the rotated file.
        # The default implementation includes
        # default_rotation_date_format into the file name.
        global default_rotation_path_func: function(info: RotationInfo) : string &redef;

Writer support for rotation is optional, usually it will only make
sense for file-based writers.

TODO: Currently, there's no way to customize rotation on a per file
basis, there are only the global defaults as described above.
Individual customization is coming next.
2011-03-06 19:32:44 -08:00
Robin Sommer
ab15437339 Working on the logging API exposed to scripts.
- Moving all functions into the Log::* namespace, using the recent
  bifcl updates. Moved logging-specific stuff to logging.bif.

- Log::create_stream() now takes a record Log::Stream as its second
  argument, which specifies columns and (optionally) the event.

- All the internal BiFs are now called "Log::__<something>", with
  script-level wrappers "Log::<something>". That first allows to add
  additional code at the script-level, and second makes things better
  comprehendible as now all relevant functionality is collected (and
  later documetned) in policy/logging.bro.

- New function Log::flush(id), which does the obvious assuming the
  writer supports it.

- add_default_filter() is now called implicitly with every
  create_stream(). Seems that we usually want that functionality, and
  when not, remove_default_filter() gets rid of it.

- The namespace of a stream's ID is now used as the default "path"
  (e.g., if the namespace is SSH, the default log file is "ssh.log").

- Updated policy/test-logging.bro as well as the btest tests according
  to these changes.
2011-02-27 15:09:37 -08:00
Robin Sommer
cf148c8a25 New bif log_set_buf() to set the buffering state for a stream. 2011-02-21 17:33:29 -08:00
Robin Sommer
091547de4f Preparing LogWriter API for rotation and flushing. 2011-02-21 14:13:49 -08:00
Robin Sommer
68062e87f1 Lots of infracstructure for the new logging framework.
This pretty much follows the proposal on the projects page.

It includes:

    - A new LogMgr, maintaining the set of writers.

    - The abstract LogWriter API.

    - An initial implementation in the form of LogWriterAscii
      producing tab-separated columns.

Note that things are only partially working right now, things are
subject to change, and it's all not much tested at all. That's why I'm
creating separate branch for now.

Example:

     bro -B logging test-logging && cat debug.log
    1298063168.409852/1298063168.410368 [logging] Created new logging stream 'SSH::LOG_SSH'
    1298063168.409852/1298063168.410547 [logging] Created new filter 'default' for stream 'SSH::LOG_SSH'
    1298063168.409852/1298063168.410564 [logging]    writer    : Ascii
    1298063168.409852/1298063168.410574 [logging]    path      : ssh_log_ssh
    1298063168.409852/1298063168.410584 [logging]    path_func : not set
    1298063168.409852/1298063168.410594 [logging]    event     : not set
    1298063168.409852/1298063168.410604 [logging]    pred      : not set
    1298063168.409852/1298063168.410614 [logging]    field          t: time
    1298063168.409852/1298063168.410625 [logging]    field  id.orig_h: addr
    1298063168.409852/1298063168.410635 [logging]    field  id.orig_p: port
    1298063168.409852/1298063168.410645 [logging]    field  id.resp_h: addr
    1298063168.409852/1298063168.410655 [logging]    field  id.resp_p: port
    1298063168.409852/1298063168.410665 [logging]    field     status: string
    1298063168.409852/1298063168.410675 [logging]    field    country: string
    1298063168.409852/1298063168.410817 [logging] Wrote record to filter 'default' on stream 'SSH::LOG_SSH'
    1298063168.409852/1298063168.410865 [logging] Wrote record to filter 'default' on stream 'SSH::LOG_SSH'
    1298063168.409852/1298063168.410906 [logging] Wrote record to filter 'default' on stream 'SSH::LOG_SSH'
    1298063168.409852/1298063168.410945 [logging] Wrote record to filter 'default' on stream 'SSH::LOG_SSH'
    1298063168.409852/1298063168.411044 [logging] Wrote record to filter 'default' on stream 'SSH::LOG_SSH

> cat ssh_log_ssh.log
1298063168.40985        1.2.3.4 66770   2.3.4.5 65616   success unknown
1298063168.40985        1.2.3.4 66770   2.3.4.5 65616   failure US
1298063168.40985        1.2.3.4 66770   2.3.4.5 65616   failure UK
1298063168.40985        1.2.3.4 66770   2.3.4.5 65616   success BR
1298063168.40985        1.2.3.4 66770   2.3.4.5 65616   failure MX
2011-02-18 13:03:46 -08:00