* topic/robin/exit-after-terminate:
Updating submodule(s).
Fixing exit-after-terminate when used with bare mode.
New option exit_only_after_terminate to prevent Bro from exiting.
Moved this functionality to be internal instead of in the script-layer
event handlers. The issue with the later is that bad things can happen
between the time a reporter event handler is dispatched and the time it
is executed, and if bro crashes in that time, the message may never be
seen/logged.
Addressed #930 (and revisits #836).
I've only tested that it compiles, not whether it still works. The
fact that we don't have any tests for this makes me uneasy ...
* remotes/origin/topic/seth/elasticsearch: (35 commits)
Some documentation updates for elasticsearch plugin.
Temporarily removing the ES timeout because it works with signals and is incompatible with Bro threads.
Changed ES index names to localtime and added a meta index.
New script for easily duplicating logs to ElasticSearch.
Some better elasticsearch reliability.
Fixed small elasticsearch problem in configure output.
Re-adding the needed call to FinishedRotation in the ES writer plugin.
Tiny updates.
Bringing elasticsearch branch up to date with master.
Adding a define to make the stdint C macros available.
Adding an extra header.
Fixed a bug with messed up time value passing to elasticsearch.
Small updates and a little standardization for config.h.in naming.
Bug fixes.
Bug fix and feature.
Forgot to call the parent method for DoHeartBeat.
Changed the escaping method.
Flush logs to ES daemon as Bro is shutting down.
Reduce the batch size to 1000 and add a maximum time interval for batches.
Reworked bulk operation string construction to use ODesc and added json escaping.
...
Using the default scripts, the events from RemoteSerializer::LogStats()
were attempting to use the logging framework after logging/threading
had been terminated which never worked right and sometimes caused
crashes with "fatal error: cannot lock mutex".
Also made communication log baseline test pass more reliably.
I copied the code over manually, no merging, because (1) it needed to
be adapted to the new threading API, and (2) there's more stuff in the
branch that I haven't ported yet.
The DS output generally seems to work, but it has seen no further
testing yet.
Not unit tests yet either.
Also renaming --enable-perftools to --enable-perftool-debug to
indicate that the switch is only relevant for debugging the heap. It's
not needed to pick up tcmalloc for better performance.
--with-perftools can still (and always) be used to give a hint where
to find the libraries.
With the threading, using tcmalloc improves memory usage on FreeBSD
significantly when running on a trace. If it fixes the live problems,
remains to be seen ...
In default mode, Bro would load the packet filter script framework
which installs a filter that allows all packets, but in bare mode
(the -b option), this old filter would not follow IPv6 protocol
chains and thus filter out packets with extension headers.
But: there are still a few places where I am sure that there are race conditions & memory leaks & I do not really like the current interface & I have to add a few more messages between the front and backend.
But - it works :)
- Data queued at termination wasn't written out completely.
- Fixed some race conditions.
- Fixing IOSource integration.
- Fixing setting thread names on Linux.
- Fixing minor leaks.
All tests now pass for me on Linux in debug and non-debug compiles.
Remaining TODOs:
- Needs leak check.
- Test on MacOS and FreeBSD.
- More testing:
- High volume traffic.
- Different platforms.
most stuff is inplace, logging framework needs a few changes merged before continuing here...
Conflicts:
src/CMakeLists.txt
src/LogMgr.h
src/logging/Manager.cc
src/main.cc
Sending SIGTERM triggers a normal shutdown of all threads that waits
until they have processed their remaining data. However, sending a 2nd
SIGTERM while waiting for them to finish will immediately kill them
all.
This is based on Gilbert's code but I ended up refactoring it quite a
bit. That's why I didn't do a direct merge but started with a new
branch and copied things over to adapt. It looks quite a bit different
now as I tried to generalize things a bit more to also support the
Input Framework.
The larger changes code are:
- Moved all logging code into subdirectory src/logging/. Code
here is in namespace "logging".
- Moved all threading code into subdirectory src/threading/. Code
here is in namespace "threading".
- Introduced a central thread manager that tracks threads and is
in charge of termination and (eventually) statistics.
- Refactored logging independent threading code into base classes
BasicThread and MsgThread. The former encapsulates all the
pthread code with simple start/stop methods and provides a
single Run() method to override.
The latter is derived from BasicThread and adds bi-directional
message passing between main and child threads. The hope is that
the Input Framework can reuse this part quite directly.
- A log writer is now split into a general WriterFrontend
(LogEmissary in Gilbert's code) and a type-specific
WriterBackend. Specific writers are implemented by deriving from
the latter. (The plugin interface is almost unchanged compared
to the 2.0 version.).
Frontend and backend communicate via MsgThread's message
passing.
- MsgThread (and thus WriterBackend) has a Heartbeat() method that
a thread can override to execute code on a regular basis. It's
triggered roughly once a second by the main thread.
- Integration into "the rest of Bro". Threads can send messages to
the reporter and do debugging output; they are hooked into the
I/O loop for sending messages back; and there's a new debugging
stream "threading" that logs, well, threading activity.
This all seems to work for the most part, but it's not done yet.
TODO list:
- Not all tests pass yet. In particular, diffs for the external
tests seem to indicate some memory problem (no crashes, just an
occasional weird character).
- Only tested in --enable-debug mode.
- Only tested on Linux.
- Needs leak check.
- Each log write is currently a single inter-thread message. Bring
Gilbert's bulk writes back.
- Code needs further cleanup.
- Document the class API.
- Document the internal structure of the logging framework.
- Check for robustness: live traffic, aborting, signals, etc.
- Add thread statistics to profile.log (most of the code is there).
- Customize the OS-visible thread names on platforms that support it.
* origin/topic/jsiwek/brofiler:
Fix superfluous/duplicate data getting in to testing coverage log.
Add "# @no-test" tag to blacklist statements from test coverage analysis.
Test coverage integration for external tests and complete suite.
Integrate Bro script coverage profiling with the btest suite.
Add simple profiling class to accumulate Stmt usage stats across runs.
Renaming environment variable BROFILER_FILE to BRO_PROFILER_FILE for
consistency. Yeah, I know, such a nice name! :)
Also replaced the --snaplen/-l command line option with a
scripting-layer option called "snaplen" (which can also be
redefined on the command line, e.g. `bro -i eth0 snaplen=65535`).
Use the BROFILER_FILE environment variable to point to a file in
which Stmt usage statistics from Bro script-layer can be output.
This should be able to be used to check Bro script coverage that
that e.g. the entire test suite covers.
Some of the changes only clean up at termination to make perftools
happt, but there were some "real" leaks as well.
This fixes all DNS leaks I could reproducem, including most likely
what's reported in #534. Closing #534.
I'm also adding a new btest subdir core/leaks with tests requiring
perftools support. These don't compare against base lines but abort
whenever perftools reports a leak (with stack information to track it
down). Right now, these are passing.
Reading from an interface like `bro -i en0` no longer expects to
start reading stdin for a script to load. Explicitly passing in
'-' as an additional command line argument still allows reading a
script from stdin.
Closes#561
The format string given to the reporter warning call wasn't printing
the handler names. Also changed it so that each warning message has
the full context of the warning.
The communication subsystem is now disabled until a new BiF,
enable_communication(), is called. The base scripts do this
automatically when either a Communication::Node is defined, or Bro is
asked to listen for incoming connections.