This moves all threading code in Bro from pthreads to the c++11
primitives, which make for shorter, easier to use, and less error-prone
code.
pthreads is still used in 2 places in Bro currently. BasicThread uses
two bits of functionality that are not available using the c++ API
(setting thread names & setting signal masks). Since all c++
implementations that I am aware of still use an underlying pthreads
implementation, we just use native_handle to access the underlying
pthreads implementation for these cases. I do not expect this to lead to
problems in the forseable future. If we ever encounter a platform where
a different thread architecture is used, we might have to change that
around.
This code is guarded by static_asserts, so we will notice if a platform
uses a different implementation.
sqlite also uses pthreads directly.
PrepareStop() is now SignalStop() and just signals a thread that it
should terminate. After that's called, WaitForStop() (formerly Stop())
wait for it to actually finish processing.
When stopping writers during operation, we now no longer wait for them
to finish.
Once a BasicThread leaves its run() method, a thread is now marked for
cleaning up, and the ThreadMgr will soon join it to release the OS
resources.
Also, adding a function Log::remove_stream() that remove a logging
stream, stopping all writer threads that are associated with it.
Note, however, that removing a *filter* from a stream still doesn't
clean up any threads. The problem is that because of the output paths
potentially being created dynamically it's unclear if the writer
thread will still be needed in the future. We could add clean writers
up with timeouts, but that doesn't sound great either. So for now, the
only way to sure clean up logging threads is to remove the entire
stream.
Also note that cleanup doesn't work with input threads yet, which
don't seem to terminate (at least in the case I tried).
Threads will now reliably get a call to DoFinish() no matter how the
thread terminates. This will always be called from within the thread,
whereas the destructor is called from the main thread after the child
thread has already terminated.
Also removing debugging code.
However, two problems remain with the ASCII writer (seeing them only
on MacOS):
- the #start/#end timestamps contain only dummy values right now.
The odd thing is that once I enable strftime() to print actual
timestamps, I get crashes (even though strftime() is supposed to
be thread-safe).
- occassionally, there's still output missing in tests. In those
cases, the file descriptor apparently goes bad: a write() will
suddently return EBADF for reasons I don't understand yet.
frameworks.
There were a number of cases that weren't thread-safe. In particular,
we don't use std::string anymore for anything that's passed between
threads (but instead plain old const char*, with manual memmory
managmenet).
This is still a check-point commit, I'll do more testing.
Turns out the finish methods weren't called correctly, caused by a
mess up with method names which all sounded too similar and the wrong
one ended up being called. I've reworked this by changing the
thread/writer/reader interfaces, which actually also simplifies them
by getting rid of the requirement for writer backends to call their
parent methods (i.e., less opportunity for errors).
This commit also includes the following (because I noticed the problem
above when working on some of these):
- The ASCII log writer now includes "#start <timestamp>" and
"#end <timestamp> lines in the each file. The latter supersedes
Bernhard's "EOF" patch.
This required a number of tests updates. The standard canonifier
removes the timestamps, but some tests compare files directly,
which doesn't work if they aren't printing out the same
timestamps (like the comm tests).
- The above required yet another change to the writer API to
network_time to methods.
- Renamed ASCII logger "header" options to "meta".
- Fixes#763 "Escape # when first character in log file line".
All btests pass for me on Linux FC15. Will try MacOS next.
* origin/fastpath:
Fix overrides of TCP_ApplicationAnalyzer::EndpointEOF.
Fix segfault when incrementing whole vector values.
Remove baselines for some leak-detecting unit tests.
Unblock SIGFPE, SIGILL, SIGSEGV and SIGBUS for threads.
According to POSIX, behavior is unspecified if a specific thread receives one of those signals (because of e.g. executing an invalid instruction) if the signal is blocked.
This resulted in segfaults in threads not propagating to the main thread.
Adresses #848
set frontend type before starting the thread. This means that the thread type will be output correctly in the error message.
return errno string of pthread functions called in thread start
- Data queued at termination wasn't written out completely.
- Fixed some race conditions.
- Fixing IOSource integration.
- Fixing setting thread names on Linux.
- Fixing minor leaks.
All tests now pass for me on Linux in debug and non-debug compiles.
Remaining TODOs:
- Needs leak check.
- Test on MacOS and FreeBSD.
- More testing:
- High volume traffic.
- Different platforms.
Sending SIGTERM triggers a normal shutdown of all threads that waits
until they have processed their remaining data. However, sending a 2nd
SIGTERM while waiting for them to finish will immediately kill them
all.
This is based on Gilbert's code but I ended up refactoring it quite a
bit. That's why I didn't do a direct merge but started with a new
branch and copied things over to adapt. It looks quite a bit different
now as I tried to generalize things a bit more to also support the
Input Framework.
The larger changes code are:
- Moved all logging code into subdirectory src/logging/. Code
here is in namespace "logging".
- Moved all threading code into subdirectory src/threading/. Code
here is in namespace "threading".
- Introduced a central thread manager that tracks threads and is
in charge of termination and (eventually) statistics.
- Refactored logging independent threading code into base classes
BasicThread and MsgThread. The former encapsulates all the
pthread code with simple start/stop methods and provides a
single Run() method to override.
The latter is derived from BasicThread and adds bi-directional
message passing between main and child threads. The hope is that
the Input Framework can reuse this part quite directly.
- A log writer is now split into a general WriterFrontend
(LogEmissary in Gilbert's code) and a type-specific
WriterBackend. Specific writers are implemented by deriving from
the latter. (The plugin interface is almost unchanged compared
to the 2.0 version.).
Frontend and backend communicate via MsgThread's message
passing.
- MsgThread (and thus WriterBackend) has a Heartbeat() method that
a thread can override to execute code on a regular basis. It's
triggered roughly once a second by the main thread.
- Integration into "the rest of Bro". Threads can send messages to
the reporter and do debugging output; they are hooked into the
I/O loop for sending messages back; and there's a new debugging
stream "threading" that logs, well, threading activity.
This all seems to work for the most part, but it's not done yet.
TODO list:
- Not all tests pass yet. In particular, diffs for the external
tests seem to indicate some memory problem (no crashes, just an
occasional weird character).
- Only tested in --enable-debug mode.
- Only tested on Linux.
- Needs leak check.
- Each log write is currently a single inter-thread message. Bring
Gilbert's bulk writes back.
- Code needs further cleanup.
- Document the class API.
- Document the internal structure of the logging framework.
- Check for robustness: live traffic, aborting, signals, etc.
- Add thread statistics to profile.log (most of the code is there).
- Customize the OS-visible thread names on platforms that support it.