Threaded logging framework.

This is based on Gilbert's code but I ended up refactoring it quite a
bit. That's why I didn't do a direct merge but started with a new
branch and copied things over to adapt. It looks quite a bit different
now as I tried to generalize things a bit more to also support the
Input Framework.

The larger changes code are:

    - Moved all logging code into subdirectory src/logging/. Code
      here is in namespace "logging".

    - Moved all threading code into subdirectory src/threading/. Code
      here is in namespace "threading".

    - Introduced a central thread manager that tracks threads and is
      in charge of termination and (eventually) statistics.

    - Refactored logging independent threading code into base classes
      BasicThread and MsgThread. The former encapsulates all the
      pthread code with simple start/stop methods and provides a
      single Run() method to override.

      The latter is derived from BasicThread and adds bi-directional
      message passing between main and child threads. The hope is that
      the Input Framework can reuse this part quite directly.

    - A log writer is now split into a general WriterFrontend
      (LogEmissary in Gilbert's code) and a type-specific
      WriterBackend. Specific writers are implemented by deriving from
      the latter. (The plugin interface is almost unchanged compared
      to the 2.0 version.).

      Frontend and backend communicate via MsgThread's message
      passing.

    - MsgThread (and thus WriterBackend) has a Heartbeat() method that
      a thread can override to execute code on a regular basis. It's
      triggered roughly once a second by the main thread.

    - Integration into "the rest of Bro". Threads can send messages to
      the reporter and do debugging output; they are hooked into the
      I/O loop for sending messages back; and there's a new debugging
      stream "threading" that logs, well, threading activity.

This all seems to work for the most part, but it's not done yet.

TODO list:

    - Not all tests pass yet. In particular, diffs for the external
      tests seem to indicate some memory problem (no crashes, just an
      occasional weird character).

    - Only tested in --enable-debug mode.

    - Only tested on Linux.

    - Needs leak check.

    - Each log write is currently a single inter-thread message. Bring
      Gilbert's bulk writes back.

    - Code needs further cleanup.

    - Document the class API.

    - Document the internal structure of the logging framework.

    - Check for robustness: live traffic, aborting, signals, etc.

    - Add thread statistics to profile.log (most of the code is there).

    - Customize the OS-visible thread names on platforms that support it.
This commit is contained in:
Robin Sommer 2012-01-26 17:47:36 -08:00
parent 60ae6f01d1
commit e4e770d475
28 changed files with 1745 additions and 503 deletions

View file

@ -0,0 +1,129 @@
#include <sys/signal.h>
#include <signal.h>
#include "BasicThread.h"
#include "Manager.h"
using namespace threading;
BasicThread::BasicThread(const string& arg_name)
{
started = false;
terminating = false;
pthread = 0;
buf = 0;
buf_len = 1024;
char tmp[128];
snprintf(tmp, sizeof(tmp), "%s@%p", arg_name.c_str(), this);
name = string(tmp);
thread_mgr->AddThread(this);
}
BasicThread::~BasicThread()
{
}
const char* BasicThread::Fmt(const char* format, ...)
{
if ( ! buf )
buf = (char*) malloc(buf_len);
va_list al;
va_start(al, format);
int n = safe_vsnprintf(buf, buf_len, format, al);
va_end(al);
if ( (unsigned int) n >= buf_len )
{ // Not enough room, grow the buffer.
buf_len = n + 32;
buf = (char*) realloc(buf, buf_len);
// Is it portable to restart?
va_start(al, format);
n = safe_vsnprintf(buf, buf_len, format, al);
va_end(al);
}
return buf;
}
void BasicThread::Start()
{
if ( sem_init(&terminate, 0, 0) != 0 )
reporter->FatalError("Cannot create terminate semaphore for thread %s", name.c_str());
if ( pthread_create(&pthread, 0, BasicThread::launcher, this) != 0 )
reporter->FatalError("Cannot create thread %s", name.c_str());
DBG_LOG(DBG_THREADING, "Started thread %s", name.c_str());
started = true;
OnStart();
}
void BasicThread::Stop()
{
if ( ! started )
return;
if ( terminating )
return;
DBG_LOG(DBG_THREADING, "Signaling thread %s to terminate ...", name.c_str());
// Signal that it's ok for the thread to exit now.
if ( sem_post(&terminate) != 0 )
reporter->FatalError("Failure flagging terminate condition for thread %s", name.c_str());
terminating = true;
OnStop();
}
void BasicThread::Join()
{
if ( ! started )
return;
if ( ! terminating )
Stop();
DBG_LOG(DBG_THREADING, "Joining thread %s ...", name.c_str());
if ( pthread_join(pthread, 0) != 0 )
reporter->FatalError("Failure joining thread %s", name.c_str());
sem_destroy(&terminate);
DBG_LOG(DBG_THREADING, "Done with thread %s", name.c_str());
pthread = 0;
}
void* BasicThread::launcher(void *arg)
{
BasicThread* thread = (BasicThread *)arg;
// Block signals in thread. We handle signals only in the main
// process.
sigset_t mask_set;
sigfillset(&mask_set);
int res = pthread_sigmask(SIG_BLOCK, &mask_set, 0);
assert(res == 0); //
// Run thread's main function.
thread->Run();
// Wait until somebody actually wants us to terminate.
if ( sem_wait(&thread->terminate) != 0 )
reporter->FatalError("Failure flagging terminate condition for thread %s", thread->Name().c_str());
return 0;
}

View file

@ -0,0 +1,63 @@
#ifndef THREADING_BASICTHREAD_H
#define THREADING_BASICTHREAD_H
#include <pthread.h>
#include <semaphore.h>
#include "Queue.h"
#include "util.h"
using namespace std;
namespace threading {
class Manager;
class BasicThread
{
public:
BasicThread(const string& name); // Managed by manager, must not delete otherwise.
virtual ~BasicThread();
const string& Name() const { return name; }
void Start(); // Spawns the thread and enters Run().
void Stop(); // Signals the thread to terminate.
bool Terminating() const { return terminating; }
// A thread-safe version of fmt().
const char* Fmt(const char* format, ...);
protected:
virtual void Run() = 0;
virtual void OnStart() {}
virtual void OnStop() {}
private:
friend class Manager;
static void* launcher(void *arg);
// Used from the ThreadMgr.
void Join(); // Waits until the thread has terminated and then joins it.
bool started; // Set to to true once running.
bool terminating; // Set to to true to signal termination.
string name;
pthread_t pthread;
sem_t terminate;
// For implementing Fmt().
char* buf;
unsigned int buf_len;
};
}
extern threading::Manager* thread_mgr;
#endif

104
src/threading/Manager.cc Normal file
View file

@ -0,0 +1,104 @@
#include "Manager.h"
using namespace threading;
Manager::Manager()
{
DBG_LOG(DBG_THREADING, "Creating thread manager ...");
did_process = false;
next_beat = 0;
}
Manager::~Manager()
{
if ( all_threads.size() )
Terminate();
}
void Manager::Terminate()
{
DBG_LOG(DBG_THREADING, "Terminating thread manager ...");
// First process remaining thread output for the message threads.
do Process(); while ( did_process );
// Signal all to stop.
for ( all_thread_list::iterator i = all_threads.begin(); i != all_threads.end(); i++ )
(*i)->Stop();
// Then join them all.
for ( all_thread_list::iterator i = all_threads.begin(); i != all_threads.end(); i++ )
{
(*i)->Join();
delete *i;
}
all_threads.clear();
msg_threads.clear();
}
void Manager::AddThread(BasicThread* thread)
{
DBG_LOG(DBG_THREADING, "Adding thread %s ...", thread->Name().c_str());
all_threads.push_back(thread);
}
void Manager::AddMsgThread(MsgThread* thread)
{
DBG_LOG(DBG_THREADING, "%s is a MsgThread ...", thread->Name().c_str());
msg_threads.push_back(thread);
}
void Manager::GetFds(int* read, int* write, int* except)
{
}
double Manager::NextTimestamp(double* network_time)
{
if ( did_process || ! next_beat == 0 )
// If we had something to process last time (or haven't had a
// chance to check yet), we want to check for more asap.
return timer_mgr->Time();
// Else we assume we don't have much to do at all and wait for the next heart beat.
return next_beat;
}
void Manager::Process()
{
bool do_beat = (next_beat == 0 || network_time >= next_beat);
did_process = false;
for ( msg_thread_list::iterator i = msg_threads.begin(); i != msg_threads.end(); i++ )
{
MsgThread* t = *i;
if ( do_beat )
t->Heartbeat();
if ( ! t->HasOut() )
continue;
Message* msg = t->RetrieveOut();
if ( msg->Process() )
did_process = true;
else
{
string s = msg->Name() + " failed, terminating thread";
reporter->Error(s.c_str());
t->Stop();
}
delete msg;
}
if ( do_beat )
next_beat = network_time + HEART_BEAT_INTERVAL;
}

52
src/threading/Manager.h Normal file
View file

@ -0,0 +1,52 @@
#ifndef THREADING_MANAGER_H
#define THREADING_MANAGER_H
#include <list>
#include "IOSource.h"
#include "BasicThread.h"
#include "MsgThread.h"
namespace threading {
class Manager : public IOSource
{
public:
Manager();
~Manager();
void Terminate();
protected:
friend class BasicThread;
friend class MsgThread;
void AddThread(BasicThread* thread);
void AddMsgThread(MsgThread* thread);
// IOSource interface.
virtual void GetFds(int* read, int* write, int* except);
virtual double NextTimestamp(double* network_time);
virtual void Process();
virtual const char* Tag() { return "threading::Manager"; }
private:
static const int HEART_BEAT_INTERVAL = 1;
typedef std::list<BasicThread*> all_thread_list;
all_thread_list all_threads;
typedef std::list<MsgThread*> msg_thread_list;
msg_thread_list msg_threads;
bool did_process;
double next_beat;
};
}
extern threading::Manager* thread_mgr;
#endif

285
src/threading/MsgThread.cc Normal file
View file

@ -0,0 +1,285 @@
#include "DebugLogger.h"
#include "MsgThread.h"
#include "Manager.h"
using namespace threading;
static void strreplace(const string& s, const string& o, const string& n)
{
string r = s;
while ( true )
{
size_t i = r.find(o);
if ( i == std::string::npos )
break;
r.replace(i, o.size(), n);
}
}
namespace threading {
// Standard messages.
class TerminateMessage : public InputMessage<MsgThread>
{
public:
TerminateMessage(MsgThread* thread) : InputMessage("Terminate", thread) { }
virtual bool Process() { return true; }
};
class ReporterMessage : public OutputMessage<MsgThread>
{
public:
enum Type {
INFO, WARNING, ERROR, FATAL_ERROR, FATAL_ERROR_WITH_CORE,
INTERNAL_WARNING, INTERNAL_ERROR
};
ReporterMessage(Type arg_type, MsgThread* thread, const string& arg_msg)
: OutputMessage<MsgThread>("ReporterMessage", thread)
{ type = arg_type; msg = arg_msg; }
virtual bool Process();
private:
string msg;
Type type;
};
class HeartbeatMessage : public InputMessage<MsgThread>
{
public:
HeartbeatMessage(MsgThread* thread, double arg_network_time, double arg_current_time)
: InputMessage("Heartbeat", thread)
{ network_time = arg_network_time; current_time = arg_current_time; }
virtual bool Process() { return Object()->DoHeartbeat(network_time, current_time); }
private:
double network_time;
double current_time;
};
#ifdef DEBUG
class DebugMessage : public OutputMessage<MsgThread>
{
public:
DebugMessage(DebugStream arg_stream, MsgThread* thread, const string& arg_msg)
: OutputMessage<MsgThread>("DebugMessage", thread)
{ stream = arg_stream; msg = arg_msg; }
virtual bool Process()
{
string s = Object()->Name() + ": " + msg;
strreplace(s, "%", "%%");
debug_logger.Log(stream, s.c_str());
return true;
}
private:
string msg;
DebugStream stream;
};
#endif
}
// Methods.
Message::~Message()
{
}
bool ReporterMessage::Process()
{
string s = Object()->Name() + ": " + msg;
strreplace(s, "%", "%%");
const char* cmsg = s.c_str();
switch ( type ) {
case INFO:
reporter->Info(cmsg);
break;
case WARNING:
reporter->Warning(cmsg);
break;
case ERROR:
reporter->Error(cmsg);
break;
case FATAL_ERROR:
reporter->FatalError(cmsg);
break;
case FATAL_ERROR_WITH_CORE:
reporter->FatalErrorWithCore(cmsg);
break;
case INTERNAL_WARNING:
reporter->InternalWarning(cmsg);
break;
case INTERNAL_ERROR :
reporter->InternalError(cmsg);
break;
default:
reporter->InternalError("unknown ReporterMessage type %d", type);
}
return true;
}
MsgThread::MsgThread(const string& name) : BasicThread(name)
{
cnt_sent_in = cnt_sent_out = 0;
thread_mgr->AddMsgThread(this);
}
void MsgThread::OnStop()
{
// This is to unblock the current queue read operation.
SendIn(new TerminateMessage(this), true);
}
void MsgThread::Heartbeat()
{
SendIn(new HeartbeatMessage(this, network_time, current_time()));
}
void MsgThread::Info(const char* msg)
{
SendOut(new ReporterMessage(ReporterMessage::INFO, this, msg));
}
void MsgThread::Warning(const char* msg)
{
SendOut(new ReporterMessage(ReporterMessage::WARNING, this, msg));
}
void MsgThread::Error(const char* msg)
{
SendOut(new ReporterMessage(ReporterMessage::ERROR, this, msg));
}
void MsgThread::FatalError(const char* msg)
{
SendOut(new ReporterMessage(ReporterMessage::FATAL_ERROR, this, msg));
}
void MsgThread::FatalErrorWithCore(const char* msg)
{
SendOut(new ReporterMessage(ReporterMessage::FATAL_ERROR_WITH_CORE, this, msg));
}
void MsgThread::InternalWarning(const char* msg)
{
SendOut(new ReporterMessage(ReporterMessage::INTERNAL_WARNING, this, msg));
}
void MsgThread::InternalError(const char* msg)
{
SendOut(new ReporterMessage(ReporterMessage::INTERNAL_ERROR, this, msg));
}
#ifdef DEBUG
void MsgThread::Debug(DebugStream stream, const char* msg)
{
SendOut(new DebugMessage(stream, this, msg));
}
#endif
void MsgThread::SendIn(BasicInputMessage* msg, bool force)
{
if ( Terminating() && ! force )
return;
DBG_LOG(DBG_THREADING, "Sending '%s' to %s ...", msg->Name().c_str(), Name().c_str());
queue_in.Put(msg);
++cnt_sent_in;
}
void MsgThread::SendOut(BasicOutputMessage* msg, bool force)
{
if ( Terminating() && ! force )
return;
queue_out.Put(msg);
++cnt_sent_out;
}
BasicOutputMessage* MsgThread::RetrieveOut()
{
BasicOutputMessage* msg = queue_out.Get();
assert(msg);
#ifdef DEBUG
if ( msg->Name() != "DebugMessage" ) // Avoid recursion.
{
string s = Fmt("Retrieved '%s' from %s", msg->Name().c_str(), Name().c_str());
Debug(DBG_THREADING, s.c_str());
}
#endif
return msg;
}
BasicInputMessage* MsgThread::RetrieveIn()
{
BasicInputMessage* msg = queue_in.Get();
assert(msg);
#ifdef DEBUG
string s = Fmt("Retrieved '%s' in %s", msg->Name().c_str(), Name().c_str());
Debug(DBG_THREADING, s.c_str());
#endif
return msg;
}
void MsgThread::Run()
{
while ( true )
{
// When requested to terminate, we only do so when
// all input has been processed.
if ( Terminating() && ! queue_in.Ready() )
break;
BasicInputMessage* msg = RetrieveIn();
bool result = msg->Process();
if ( ! result )
{
string s = msg->Name() + " failed, terminating thread";
Error(s.c_str());
Stop();
break;
}
delete msg;
}
}
void MsgThread::GetStats(Stats* stats)
{
stats->sent_in = cnt_sent_in;
stats->sent_out = cnt_sent_out;
stats->pending_in = cnt_sent_in - queue_in.Size();
stats->pending_out = cnt_sent_out - queue_out.Size();
}

157
src/threading/MsgThread.h Normal file
View file

@ -0,0 +1,157 @@
#ifndef THREADING_MSGTHREAD_H
#define THREADING_MSGTHREAD_H
#include <pthread.h>
#include "DebugLogger.h"
#include "BasicThread.h"
#include "Queue.h"
namespace threading {
class BasicInputMessage;
class BasicOutputMessage;
class HeartbeatMessage;
class MsgThread : public BasicThread
{
public:
MsgThread(const string& name);
void SendIn(BasicInputMessage* msg) { return SendIn(msg, false); }
void SendOut(BasicOutputMessage* msg) { return SendOut(msg, false); }
BasicOutputMessage* RetrieveOut();
// Report an informational message, nothing that needs specific
// attention.
void Info(const char* msg);
// Report a warning that may indicate a problem.
void Warning(const char* msg);
// Report a non-fatal error. Processing proceeds normally after the error
// has been reported.
void Error(const char* msg);
// Report a fatal error. Bro will terminate after the message has been
// reported.
void FatalError(const char* msg);
// Report a fatal error. Bro will terminate after the message has been
// reported and always generate a core dump.
void FatalErrorWithCore(const char* msg);
// Report about a potential internal problem. Bro will continue
// normally.
void InternalWarning(const char* msg);
// Report an internal program error. Bro will terminate with a core
// dump after the message has been reported.
void InternalError(const char* msg);
#ifdef DEBUG
// Records a debug message for the given stream.
void Debug(DebugStream stream, const char* msg);
#endif
void Heartbeat();
struct Stats
{
uint64_t sent_in;
uint64_t sent_out;
uint64_t pending_in;
uint64_t pending_out;
};
void GetStats(Stats* stats);
protected:
friend class HeartbeatMessage;
virtual void Run();
virtual void OnStop();
virtual bool DoHeartbeat(double network_time, double current_time) { return true; }
private:
friend class Manager;
BasicInputMessage* RetrieveIn();
void SendIn(BasicInputMessage* msg, bool force);
void SendOut(BasicOutputMessage* msg, bool force);
bool HasIn() { return queue_in.Ready(); }
bool HasOut() { return queue_out.Ready(); }
Queue_<BasicInputMessage *> queue_in;
Queue_<BasicOutputMessage *> queue_out;
uint64_t cnt_sent_in;
uint64_t cnt_sent_out;
};
class Message
{
public:
virtual ~Message();
const string& Name() const { return name; }
virtual bool Process() = 0; // Thread will be terminated if returngin false.
protected:
Message(const string& arg_name) { name = arg_name; }
private:
string name;
};
class BasicInputMessage : public Message
{
protected:
BasicInputMessage(const string& name) : Message(name) {}
};
class BasicOutputMessage : public Message
{
protected:
BasicOutputMessage(const string& name) : Message(name) {}
};
template<typename O>
class InputMessage : public BasicInputMessage
{
public:
O* Object() const { return object; }
protected:
InputMessage(const string& name, O* arg_object) : BasicInputMessage(name)
{ object = arg_object; }
private:
O* object;
};
template<typename O>
class OutputMessage : public BasicOutputMessage
{
public:
O* Object() const { return object; }
protected:
OutputMessage(const string& name, O* arg_object) : BasicOutputMessage(name)
{ object = arg_object; }
private:
O* object;
};
}
#endif

150
src/threading/Queue.h Normal file
View file

@ -0,0 +1,150 @@
#ifndef THREADING_QUEUE_H
#define THREADING_QUEUE_H
#include <pthread.h>
#include <queue>
#include <deque>
#include <stdint.h>
#include "Reporter.h"
namespace threading {
/**
* Just a simple threaded queue wrapper class. Uses multiple queues and reads / writes in rotary fashion in an attempt to limit contention.
* Due to locking granularity, bulk put / get is no faster than single put / get as long as FIFO guarantee is required.
*/
template<typename T>
class Queue_
{
public:
Queue_();
~Queue_();
T Get();
void Put(T data);
bool Ready();
uint64_t Size();
private:
static const int NUM_QUEUES = 8;
pthread_mutex_t mutex[NUM_QUEUES]; // Mutex protected shared accesses.
pthread_cond_t has_data[NUM_QUEUES]; // Signals when data becomes available
std::queue<T> messages[NUM_QUEUES]; // Actually holds the queued messages
int read_ptr; // Where the next operation will read from
int write_ptr; // Where the next operation will write to
uint64_t size;
};
inline static void safe_lock(pthread_mutex_t* mutex)
{
if ( pthread_mutex_lock(mutex) != 0 )
reporter->FatalErrorWithCore("cannot lock mutex");
}
inline static void safe_unlock(pthread_mutex_t* mutex)
{
if ( pthread_mutex_unlock(mutex) != 0 )
reporter->FatalErrorWithCore("cannot unlock mutex");
}
template<typename T>
inline Queue_<T>::Queue_()
{
read_ptr = 0;
write_ptr = 0;
for( int i = 0; i < NUM_QUEUES; ++i )
{
if ( pthread_cond_init(&has_data[i], NULL) != 0 )
reporter->FatalError("cannot init queue condition variable");
if ( pthread_mutex_init(&mutex[i], NULL) != 0 )
reporter->FatalError("cannot init queue mutex");
}
}
template<typename T>
inline Queue_<T>::~Queue_()
{
for( int i = 0; i < NUM_QUEUES; ++i )
{
pthread_cond_destroy(&has_data[i]);
pthread_mutex_destroy(&mutex[i]);
}
}
template<typename T>
inline T Queue_<T>::Get()
{
safe_lock(&mutex[read_ptr]);
int old_read_ptr = read_ptr;
if ( messages[read_ptr].empty() )
pthread_cond_wait(&has_data[read_ptr], &mutex[read_ptr]);
T data = messages[read_ptr].front();
messages[read_ptr].pop();
--size;
read_ptr = (read_ptr + 1) % NUM_QUEUES;
safe_unlock(&mutex[old_read_ptr]);
return data;
}
template<typename T>
inline void Queue_<T>::Put(T data)
{
safe_lock(&mutex[write_ptr]);
int old_write_ptr = write_ptr;
bool need_signal = messages[write_ptr].empty();
messages[write_ptr].push(data);
++size;
if ( need_signal )
pthread_cond_signal(&has_data[write_ptr]);
write_ptr = (write_ptr + 1) % NUM_QUEUES;
safe_unlock(&mutex[old_write_ptr]);
}
template<typename T>
inline bool Queue_<T>::Ready()
{
safe_lock(&mutex[read_ptr]);
bool ret = (messages[read_ptr].size());
safe_unlock(&mutex[read_ptr]);
return ret;
}
template<typename T>
inline uint64_t Queue_<T>::Size()
{
safe_lock(&mutex[read_ptr]);
uint64_t s = size;
safe_unlock(&mutex[read_ptr]);
return s;
}
}
#endif