A bunch of infrastructure work to move IOSource, IOSourceRegistry (now
iosource::Manager) and PktSrc/PktDumper code into iosource/, and over
to a plugin structure.
Other IOSources aren't touched yet; they are still in src/*.
It compiles and does something with a small trace, but that's all I've
tested so far. There are quite certainly a number of problems left, as
well as various TODOs and cleanup; and nothing's cast in stone yet.
Will continue to work on this.
A thread that is done/killed should signify that the thread manager has
some processing to do -- it needs to process any messages in its out
queue, join the thread, and delete it. Otherwise the thread manager
may reach a state where it makes no progress in processing the last
remaining done/killed thread.
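
The idea that a finished thread is itself pending work for the manager can be sketched roughly as follows. This is only an illustration, not the actual thread manager code: Worker, Manager, HasWork() and Process() are hypothetical names, and std::thread stands in for the real threading layer.

```cpp
#include <atomic>
#include <cstddef>
#include <deque>
#include <memory>
#include <mutex>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical worker: pushes n messages to its out queue, then marks itself done.
class Worker {
public:
    explicit Worker(int n) : thread_([this, n] { Run(n); }) {}
    ~Worker() { Join(); }

    bool Done() const { return done_.load(); }
    void Join() { if (thread_.joinable()) thread_.join(); }

    // Moves all queued messages into 'out'.
    void Drain(std::vector<std::string>& out) {
        std::lock_guard<std::mutex> lock(mutex_);
        for (auto& m : queue_) out.push_back(std::move(m));
        queue_.clear();
    }

private:
    void Run(int n) {
        for (int i = 0; i < n; ++i) {
            std::lock_guard<std::mutex> lock(mutex_);
            queue_.push_back("msg " + std::to_string(i));
        }
        done_.store(true);  // set only after the last message is queued
    }

    std::mutex mutex_;
    std::deque<std::string> queue_;
    std::atomic<bool> done_{false};
    std::thread thread_;  // declared last so the other members exist first
};

// Hypothetical manager: a done worker always counts as work to do.
class Manager {
public:
    void Add(int n) { workers_.push_back(std::unique_ptr<Worker>(new Worker(n))); }

    bool HasWork() const {
        for (const auto& w : workers_)
            if (w->Done()) return true;
        return false;
    }

    // Drain every out queue; join and delete workers that are done.
    void Process(std::vector<std::string>& out) {
        for (auto it = workers_.begin(); it != workers_.end();) {
            bool done = (*it)->Done();  // read before draining
            (*it)->Drain(out);
            if (done) { (*it)->Join(); it = workers_.erase(it); }
            else ++it;
        }
    }

    std::size_t Size() const { return workers_.size(); }

private:
    std::vector<std::unique_ptr<Worker>> workers_;
};
```

Reading Done() before draining matters in this sketch: if the flag is already set, every message has been queued, so the single drain that follows cannot miss one.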
This also fixes the deadlock issue without putting any
new strain on the main packet processing path.
Instead of occasionally returning true in MaybeReady,
we occasionally process threads if time_mgr time is not running.
If time_mgr time is running, we have heartbeat messages that will
trigger processing in any case -- processing always checks the
exact state of the Queues.
This fix probably also means that we can remove the communication
loads from all input framework tests and run them all simultaneously.
Now it should work. However - this commit changes a basic assumption
of the threading queue, namely that nothing can
be read out of the out-queue of a dead thread. I think that reading
out of the queue of a dead thread makes perfect sense (when the thread
shuts down, pushes the rest of its work on the queue and says bye,
and wants the main thread to pick it up afterwards) - however, I
guess one can be of a differing opinion here.
In any case, it makes stuff a bit easier to understand - in my opinion.
It took me a while to find out why the messages disappear into thin
air and never arrive in the main thread ;)
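
As a rough illustration of the changed assumption: a queue whose Get() never asks whether the producing thread is still alive. OutQueue and DrainAfterExit() are made-up names for this sketch, not the real threading queue.

```cpp
#include <deque>
#include <mutex>
#include <string>
#include <thread>
#include <utility>

// Hypothetical out queue; reading does not depend on the producer's state.
class OutQueue {
public:
    void Put(std::string msg) {
        std::lock_guard<std::mutex> lock(mutex_);
        items_.push_back(std::move(msg));
    }

    // Returns false only when the queue is empty.
    bool Get(std::string& msg) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (items_.empty()) return false;
        msg = std::move(items_.front());
        items_.pop_front();
        return true;
    }

private:
    std::mutex mutex_;
    std::deque<std::string> items_;
};

// The producer pushes the rest of its work, says bye, and exits;
// the main thread picks everything up afterwards.
inline int DrainAfterExit() {
    OutQueue q;
    std::thread producer([&q] { q.Put("last result"); q.Put("bye"); });
    producer.join();  // the producing thread is gone from here on

    int received = 0;
    std::string msg;
    while (q.Get(msg)) ++received;
    return received;
}
```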
PrepareStop() is now SignalStop() and just signals a thread that it
should terminate. After that's called, WaitForStop() (formerly Stop())
waits for it to actually finish processing.
When stopping writers during operation, we now no longer wait for them
to finish.
Once a BasicThread leaves its run() method, a thread is now marked for
cleaning up, and the ThreadMgr will soon join it to release the OS
resources.
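
A minimal sketch of the signal-then-wait split, assuming std::thread in place of the real pthread wrapper. CounterThread and its Finished() flag are invented for the example; only the SignalStop()/WaitForStop() names come from the description above.

```cpp
#include <atomic>
#include <thread>

// Hypothetical thread that spins until asked to stop.
class CounterThread {
public:
    CounterThread() : thread_([this] { Run(); }) {}
    ~CounterThread() { SignalStop(); WaitForStop(); }

    // Only asks the thread to terminate; returns immediately.
    void SignalStop() { stop_requested_.store(true); }

    // Blocks until Run() has returned, then joins the OS thread.
    void WaitForStop() { if (thread_.joinable()) thread_.join(); }

    // True once the thread has left Run() and can be cleaned up.
    bool Finished() const { return finished_.load(); }

private:
    void Run() {
        while (!stop_requested_.load())
            std::this_thread::yield();
        finished_.store(true);
    }

    std::atomic<bool> stop_requested_{false};
    std::atomic<bool> finished_{false};
    std::thread thread_;  // declared last so the flags exist before it starts
};
```

A caller that does not want to block can call SignalStop() alone and let a manager check Finished() later before joining.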
Also, adding a function Log::remove_stream() that removes a logging
stream, stopping all writer threads that are associated with it.
Note, however, that removing a *filter* from a stream still doesn't
clean up any threads. The problem is that because of the output paths
potentially being created dynamically it's unclear if the writer
thread will still be needed in the future. We could clean writers
up with timeouts, but that doesn't sound great either. So for now, the
only way to be sure to clean up logging threads is to remove the entire
stream.
Also note that cleanup doesn't work with input threads yet, which
don't seem to terminate (at least in the case I tried).
If a thread command fails (like the input framework not finding a
file), that now (1) no longer hangs Bro, and (2) even allows for
propagating error messages back before the thread stops.
(Actually, the thread doesn't really "stop"; the thread manager keeps
threads around independent of their success; but it no longer polls
them for input.)
Closes #858.
* topic/robin/master-test: (60 commits)
Script fix for Linux.
Updating test base line.
Another small change to MsgThread API.
Bug fix for BasicThread.
make version_ok return true for TLSv12
Sed usage in canonifier script didn't work on non-Linux systems.
Changing HTTP DPD port 3138 to 3128.
Temporarily removing tuning/logs-to-elasticsearch.bro from the test-all-policy.
More documentation updates.
Revert "Fixing calc_next_rotate to use UTC based time functions."
Some documentation updates for elasticsearch plugin.
Give configure a --disable-perftools option.
Updating tests for the #start/#end change.
Further threading and API restructuring for logging and input frameworks.
Reworking forceful thread termination.
Moving the ASCII writer over to use UNIX I/O rather than stdio.
Further reworking the thread API.
Reworking thread termination logic.
If a thread doesn't terminate, we log that but no longer proceed (because it could hang later still).
Removing the thread kill functionality.
...
frameworks.
There were a number of cases that weren't thread-safe. In particular,
we don't use std::string anymore for anything that's passed between
threads (but instead plain old const char*, with manual memory
management).
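
The pattern can be sketched like this: the sender makes a private heap copy of the text, and the receiving thread frees it. TextMessage, CopyString() and SendAndMeasure() are hypothetical names for illustration and are not taken from the actual code.

```cpp
#include <cstring>
#include <string>
#include <thread>

// Hypothetical message that owns a private heap copy of its text.
class TextMessage {
public:
    explicit TextMessage(const char* text) : text_(CopyString(text)) {}
    ~TextMessage() { delete[] text_; }

    TextMessage(const TextMessage&) = delete;
    TextMessage& operator=(const TextMessage&) = delete;

    const char* Text() const { return text_; }

private:
    // Manual copy; the message is the sole owner of the result.
    static const char* CopyString(const char* s) {
        char* copy = new char[std::strlen(s) + 1];
        std::strcpy(copy, s);
        return copy;
    }

    const char* text_;
};

// The sender hands over ownership; the receiving thread deletes the message.
inline std::size_t SendAndMeasure(const std::string& path) {
    TextMessage* msg = new TextMessage(path.c_str());
    std::size_t len = 0;
    std::thread receiver([msg, &len] {
        len = std::strlen(msg->Text());
        delete msg;
    });
    receiver.join();
    return len;
}
```

After the copy, no string buffer is shared between the two threads, which is the property the change is after.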
This is still a check-point commit, I'll do more testing.
* topic/robin/input-threads-merge: (130 commits)
And now it even compiles after my earlier changes.
A set of input framework refactoring, cleanup, and polishing.
another small memory leak in ascii reader:
and another small memory leak when using streaming reads.
fix another memory leak (when updating tables).
Input framework merge in progress.
filters have been called streams for eternity. And I always was too lazy to change it everywhere...
reactivate network_time check in threading manager. previously this line made all input framework tests fail - it works now. Some of the other recent changes of the threading manager must have fixed that problem.
fix up the executeraw test - now it works for the first time and does not always fail
baselines for the autostart removal.
remove last remnants of autostart, which has been removed for quite a while.
make input framework source (hopefully) adhere to the usual indentation style. No functional changes.
fix two memory leaks which occurred when one used filters.
update description to current interface.
rename a couple of structures and make the names in manager fit the api more.
fix memory leak in tables and vectors that are read into tables
fix missing get call for heart beat in benchmark reader.
fix heart_beat_interval -- initialization in constructor does not work anymore (probably due to change in init ordering?)
fix memory leak for tables... nearly completely.
fix a couple more leaks. But - still leaking quite a lot with tables.
...
This was easy :)
* change internal reader interface again
* remove some quite embarrassing bugs that must have been in the interface for rather long
* add different read methods to script & internal interface (like normal, streaming, etc). Not implemented in ascii reader yet.
But: there are still a few places where I am sure that there are race conditions & memory leaks & I do not really like the current interface & I have to add a few more messages between the front and backend.
But - it works :)
- Data queued at termination wasn't written out completely.
- Fixed some race conditions.
- Fixing IOSource integration.
- Fixing setting thread names on Linux.
- Fixing minor leaks.
All tests now pass for me on Linux in debug and non-debug compiles.
Remaining TODOs:
- Needs leak check.
- Test on MacOS and FreeBSD.
- More testing:
  - High volume traffic.
  - Different platforms.
Sending SIGTERM triggers a normal shutdown of all threads that waits
until they have processed their remaining data. However, sending a 2nd
SIGTERM while waiting for them to finish will immediately kill them
all.
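
One way to sketch the two-stage SIGTERM behaviour on a POSIX system is a handler that only counts signals and leaves the actual shutdown to the main loop. The names below (ShutdownMode, OnSigterm, InstallSigtermHandler, CurrentShutdownMode) are made up for the example and are not the real implementation.

```cpp
#include <signal.h>

// Hypothetical shutdown states driven by the number of SIGTERMs seen.
enum ShutdownMode { RUNNING = 0, GRACEFUL = 1, FORCED = 2 };

// Only a sig_atomic_t is touched inside the handler.
static volatile sig_atomic_t g_sigterm_count = 0;

extern "C" void OnSigterm(int) {
    if (g_sigterm_count < 2)
        g_sigterm_count = g_sigterm_count + 1;
}

// Installs the handler with sigaction() so it stays installed
// after the first delivery.
inline bool InstallSigtermHandler() {
    struct sigaction sa = {};
    sa.sa_handler = OnSigterm;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    return sigaction(SIGTERM, &sa, nullptr) == 0;
}

// The main loop polls this: GRACEFUL means wait for threads to finish
// their remaining data, FORCED means kill them right away.
inline ShutdownMode CurrentShutdownMode() {
    int count = g_sigterm_count;
    return static_cast<ShutdownMode>(count);
}
```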
This is based on Gilbert's code but I ended up refactoring it quite a
bit. That's why I didn't do a direct merge but started with a new
branch and copied things over to adapt. It looks quite a bit different
now as I tried to generalize things a bit more to also support the
Input Framework.
The larger code changes are:
- Moved all logging code into subdirectory src/logging/. Code
here is in namespace "logging".
- Moved all threading code into subdirectory src/threading/. Code
here is in namespace "threading".
- Introduced a central thread manager that tracks threads and is
in charge of termination and (eventually) statistics.
- Refactored logging independent threading code into base classes
BasicThread and MsgThread. The former encapsulates all the
pthread code with simple start/stop methods and provides a
single Run() method to override.
The latter is derived from BasicThread and adds bi-directional
message passing between main and child threads. The hope is that
the Input Framework can reuse this part quite directly.
- A log writer is now split into a general WriterFrontend
(LogEmissary in Gilbert's code) and a type-specific
WriterBackend. Specific writers are implemented by deriving from
the latter. (The plugin interface is almost unchanged compared
to the 2.0 version.)
Frontend and backend communicate via MsgThread's message
passing.
- MsgThread (and thus WriterBackend) has a Heartbeat() method that
a thread can override to execute code on a regular basis. It's
triggered roughly once a second by the main thread.
- Integration into "the rest of Bro". Threads can send messages to
the reporter and do debugging output; they are hooked into the
I/O loop for sending messages back; and there's a new debugging
stream "threading" that logs, well, threading activity.
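
The BasicThread/MsgThread layering and the Heartbeat() hook described in the list above could look roughly like the sketch below. It assumes std::thread instead of pthreads and plain strings as messages; the method names other than Run() and Heartbeat() (SendIn, SendOut, RetrieveOut, Process) and the EchoThread class are invented for illustration and do not reflect the real class API.

```cpp
#include <condition_variable>
#include <deque>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Wraps the OS thread; subclasses override Run().
class BasicThread {
public:
    virtual ~BasicThread() {}
    void Start() { thread_ = std::thread([this] { Run(); }); }
    void Join() { if (thread_.joinable()) thread_.join(); }

protected:
    virtual void Run() = 0;

private:
    std::thread thread_;
};

// Adds an in queue (main -> child) and an out queue (child -> main).
class MsgThread : public BasicThread {
public:
    void SendIn(const std::string& msg) {
        { std::lock_guard<std::mutex> lock(mutex_); in_.push_back(msg); }
        cond_.notify_one();
    }

    // Called by the main thread; returns what the child has sent so far.
    std::vector<std::string> RetrieveOut() {
        std::lock_guard<std::mutex> lock(mutex_);
        std::vector<std::string> result(out_.begin(), out_.end());
        out_.clear();
        return result;
    }

protected:
    virtual void Heartbeat() {}  // optional periodic hook
    virtual void Process(const std::string& msg) = 0;

    void SendOut(const std::string& msg) {
        std::lock_guard<std::mutex> lock(mutex_);
        out_.push_back(msg);
    }

    void Run() override {
        for (;;) {
            std::string msg;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                cond_.wait(lock, [this] { return !in_.empty(); });
                msg = in_.front();
                in_.pop_front();
            }
            if (msg == "finish") return;
            if (msg == "heartbeat") Heartbeat();
            else Process(msg);
        }
    }

private:
    std::mutex mutex_;
    std::condition_variable cond_;
    std::deque<std::string> in_, out_;
};

// Hypothetical writer-like child thread.
class EchoThread : public MsgThread {
protected:
    void Process(const std::string& msg) override { SendOut("wrote " + msg); }
    void Heartbeat() override { SendOut("flushed"); }
};
```

In this sketch the main thread would send "heartbeat" about once a second, and a thread must be joined with Join() before it is destroyed.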
This all seems to work for the most part, but it's not done yet.
TODO list:
- Not all tests pass yet. In particular, diffs for the external
tests seem to indicate some memory problem (no crashes, just an
occasional weird character).
- Only tested in --enable-debug mode.
- Only tested on Linux.
- Needs leak check.
- Each log write is currently a single inter-thread message. Bring
Gilbert's bulk writes back.
- Code needs further cleanup.
- Document the class API.
- Document the internal structure of the logging framework.
- Check for robustness: live traffic, aborting, signals, etc.
- Add thread statistics to profile.log (most of the code is there).
- Customize the OS-visible thread names on platforms that support it.