Threaded logging framework.

This is based on Gilbert's code but I ended up refactoring it quite a
bit. That's why I didn't do a direct merge but started with a new
branch and copied things over to adapt. It looks quite a bit different
now as I tried to generalize things a bit more to also support the
Input Framework.

The larger code changes are:

    - Moved all logging code into subdirectory src/logging/. Code
      here is in namespace "logging".

    - Moved all threading code into subdirectory src/threading/. Code
      here is in namespace "threading".

    - Introduced a central thread manager that tracks threads and is
      in charge of termination and (eventually) statistics.

    - Refactored logging-independent threading code into base classes
      BasicThread and MsgThread. The former encapsulates all the
      pthread code with simple start/stop methods and provides a
      single Run() method to override.

      The latter is derived from BasicThread and adds bi-directional
      message passing between main and child threads. The hope is that
      the Input Framework can reuse this part quite directly.

    - A log writer is now split into a general WriterFrontend
      (LogEmissary in Gilbert's code) and a type-specific
      WriterBackend. Specific writers are implemented by deriving from
      the latter. (The plugin interface is almost unchanged compared
      to the 2.0 version.)

      Frontend and backend communicate via MsgThread's message
      passing.

    - MsgThread (and thus WriterBackend) has a Heartbeat() method that
      a thread can override to execute code on a regular basis. It's
      triggered roughly once a second by the main thread.

    - Integration into "the rest of Bro". Threads can send messages to
      the reporter and do debugging output; they are hooked into the
      I/O loop for sending messages back; and there's a new debugging
      stream "threading" that logs, well, threading activity.

This all seems to work for the most part, but it's not done yet.

TODO list:

    - Not all tests pass yet. In particular, diffs for the external
      tests seem to indicate some memory problem (no crashes, just an
      occasional weird character).

    - Only tested in --enable-debug mode.

    - Only tested on Linux.

    - Needs leak check.

    - Each log write is currently a single inter-thread message. Bring
      Gilbert's bulk writes back.

    - Code needs further cleanup.

    - Document the class API.

    - Document the internal structure of the logging framework.

    - Check for robustness: live traffic, aborting, signals, etc.

    - Add thread statistics to profile.log (most of the code is there).

    - Customize the OS-visible thread names on platforms that support it.
Robin Sommer 2012-01-26 17:47:36 -08:00
parent 60ae6f01d1
commit e4e770d475
28 changed files with 1745 additions and 503 deletions

src/logging/WriterBackend.h Normal file

@ -0,0 +1,189 @@
// See the file "COPYING" in the main distribution directory for copyright.
//
// Bridge class between main process and writer threads.
#ifndef LOGGING_WRITERBACKEND_H
#define LOGGING_WRITERBACKEND_H
#include "Manager.h"
#include "threading/MsgThread.h"
namespace logging {
// The backend runs in its own thread, separate from the main process.
class WriterBackend : public threading::MsgThread
{
public:
WriterBackend(const string& name);
virtual ~WriterBackend();
// One-time initialization of the writer to define the logged fields.
// Interpretation of "path" is left to the writer; it corresponds to
// the value configured at the script level.
//
// Returns false if an error occurred, in which case the writer must
// not be used further.
//
// The new instance takes ownership of "fields", and will delete them
// when done.
bool Init(string path, int num_fields, const Field* const * fields);
// Writes one log entry. The method takes ownership of "vals" and
// will return immediately after queueing the write request, which is
// potentially before output has actually been written out.
//
// num_fields and the types of the Values must match what was passed
// to Init().
//
// Returns false if an error occurred, in which case the writer must
// not be used any further.
bool Write(int num_fields, Value** vals);
// Sets the buffering status for the writer, if the writer supports
// that. (If not, it will be ignored).
bool SetBuf(bool enabled);
// Flushes any currently buffered output, if the writer supports
// that. (If not, it will be ignored).
bool Flush();
// Triggers rotation, if the writer supports that. (If not, it will
// be ignored).
bool Rotate(WriterFrontend* writer, string rotated_path, double open, double close, bool terminating);
// Finishes writing to this logger regularly. Must not be called if
// an error has been indicated earlier. After calling this, no
// further writing must be performed.
bool Finish();
//// Thread-safe methods that may be called from the writer
//// implementation.
// The following methods return the information as passed to Init().
const string Path() const { return path; }
int NumFields() const { return num_fields; }
const Field* const * Fields() const { return fields; }
// Returns the current buffering state.
bool IsBuf() { return buffering; }
// Signals to the log manager that a file has been rotated.
//
// writer: The frontend writer that triggered the rotation. This must
// be the value passed into DoRotate().
//
// new_name: The filename of the rotated file. old_name: The filename
// of the original file.
//
// open/close: The timestamps when the original file was opened and
// closed, respectively.
//
// terminating: True if the rotation request occurred due to the main Bro
// process shutting down.
bool FinishedRotation(WriterFrontend* writer, string new_name, string old_name,
double open, double close, bool terminating);
protected:
// Methods for writers to override. If any of these returns false, it
// will be assumed that a fatal error has occurred that prevents the
// writer from further operation. It will then be disabled and
// deleted. When returning false, the writer should also report the
// error via Error(). Note that even if a writer does not support the
// functionality for one of these methods (like rotation), it must still
// return true if that is not to be considered a fatal error.
//
// Called once for initialization of the writer.
virtual bool DoInit(string path, int num_fields,
const Field* const * fields) = 0;
// Called once per log entry to record.
virtual bool DoWrite(int num_fields, const Field* const * fields,
Value** vals) = 0;
// Called when the buffering status for this writer is changed. If
// buffering is disabled, the writer should attempt to write out
// information as quickly as possible even if doing so may have a
// performance impact. If enabled (which is the default), it may
// buffer data as helpful and write it out later in a way optimized
// for performance. The current buffering state can be queried via
// IsBuf().
//
// A writer may ignore buffering changes if it doesn't fit with its
// semantics (but must still return true in that case).
virtual bool DoSetBuf(bool enabled) = 0;
// Called to flush any currently buffered output.
//
// A writer may ignore flush requests if it doesn't fit with its
// semantics (but must still return true in that case).
virtual bool DoFlush() = 0;
// Called when a log output is to be rotated. Most directly this only
// applies to writers writing into files, which should then close the
// current file and open a new one. However, a writer may also
// trigger other appropriate actions if semantics are similar.
//
// Once rotation has finished, the implementation should call
// FinishedRotation() to signal the log manager that potential
// postprocessors can now run.
//
// "writer" is the frontend writer that triggered the rotation. The
// *only* purpose of this value is to be passed into
// FinishedRotation() once done. You must not otherwise access the
// frontend, it's running in a different thread.
//
// "rotated_path" reflects the path to where the rotated output is to
// be moved, with specifics depending on the writer. It should
// generally be interpreted in a way consistent with that of "path"
// as passed into DoInit(). As an example, for file-based output,
// "rotated_path" could be the original filename extended with a
// timestamp indicating the time of the rotation.
//
// "open" and "close" are the network times when the *current* file
// was opened and closed, respectively.
//
// "terminating" indicates whether the rotation request occurs due to
// the main Bro process terminating (and not because we've reached a
// regularly scheduled time for rotation).
//
// A writer may ignore rotation requests if it doesn't fit with its
// semantics (but must still return true in that case).
virtual bool DoRotate(WriterFrontend* writer, string rotated_path,
double open, double close, bool terminating) = 0;
// Called once on termination. Not called when any of the other
// methods has previously signaled an error, i.e., executing this
// method signals a regular shutdown of the writer.
virtual bool DoFinish() = 0;
// Triggered by regular heartbeat messages from the main process.
virtual bool DoHeartbeat(double network_time, double current_time) { return true; };
private:
friend class Manager;
// Returns true if the writer has been flagged as disabled after an
// error. The Manager checks this flag later and removes the
// writer.
bool Disabled() { return disabled; }
// Deletes the values as passed into Write().
void DeleteVals(Value** vals);
string path;
int num_fields;
const Field* const * fields;
bool buffering;
bool disabled;
// For implementing Fmt().
char* buf;
unsigned int buf_len;
};
}
#endif
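
The override contract described in the comments above (return false only for fatal errors, report them via Error(), and return true for operations the writer does not support) can be illustrated with a small standalone sketch. Everything here is an assumption made for illustration: SketchBackend is a hypothetical, heavily simplified stand-in for WriterBackend that drops the threading, the Field/Value types, and the Manager, and MemoryWriter is a made-up writer that is not part of this commit.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical stand-in for logging::WriterBackend. It keeps only the
// "Do*() returns bool" contract; no threads, no Field/Value, no Manager.
class SketchBackend {
public:
    virtual ~SketchBackend() {}

    // Each public method refuses to run once the writer is disabled and
    // disables the writer if the corresponding Do*() method returns false.
    bool Init(const std::string& path, int num_fields)
        { return ! disabled && Mark(DoInit(path, num_fields)); }
    bool Write(const std::vector<std::string>& vals)
        { return ! disabled && Mark(DoWrite(vals)); }
    bool Rotate(const std::string& rotated_path)
        { return ! disabled && Mark(DoRotate(rotated_path)); }

    // Regular shutdown only: skipped if an error was signaled earlier.
    bool Finish() { return ! disabled && DoFinish(); }

    bool Disabled() const { return disabled; }

protected:
    virtual bool DoInit(const std::string& path, int num_fields) = 0;
    virtual bool DoWrite(const std::vector<std::string>& vals) = 0;
    virtual bool DoRotate(const std::string& rotated_path) = 0;
    virtual bool DoFinish() = 0;

    // Assumed error hook; the sketch simply prints to stderr.
    void Error(const std::string& msg)
        { std::fprintf(stderr, "writer error: %s\n", msg.c_str()); }

private:
    bool Mark(bool ok) { if ( ! ok ) disabled = true; return ok; }
    bool disabled = false;
};

// Made-up writer that collects tab-separated lines in memory.
class MemoryWriter : public SketchBackend {
public:
    const std::string& Output() const { return out; }

protected:
    bool DoInit(const std::string& path, int num_fields) override {
        expected = num_fields;
        out += "#path " + path + "\n";
        return true;
    }

    bool DoWrite(const std::vector<std::string>& vals) override {
        if ( static_cast<int>(vals.size()) != expected ) {
            Error("field count does not match Init()"); // fatal: report it
            return false;
        }
        for ( std::size_t i = 0; i < vals.size(); ++i ) {
            if ( i > 0 )
                out += '\t';
            out += vals[i];
        }
        out += '\n';
        return true;
    }

    // Rotation is not supported here, which is not a fatal error.
    bool DoRotate(const std::string&) override { return true; }

    bool DoFinish() override { out += "#close\n"; return true; }

private:
    int expected = 0;
    std::string out;
};
```

In this sketch, a write with the wrong number of fields makes Write() return false and flags the writer as disabled, after which Finish() is skipped. A Rotate() call is accepted and ignored. The real class additionally runs the Do*() methods in a separate thread and reaches them through MsgThread's message passing, which the sketch leaves out.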