This helps prevent a node from being killed/crashing in the middle
of writing a log, restarting, and eventually clobbering that log
file that never underwent the rotation/archival process.
The old `archive-log` and `post-terminate` scripts as used by
ZeekControl previously implemented this behavior, but the new logic is
entirely in the ASCII writer. It uses ".shadow" log files stored
alongside the real log to help detect such scenarios and rotate them
correctly upon the next startup of the Zeek process.
I copied the same style that caf uses ("zk" with single dot and no space).
This gives some consistency with caf and avoids us wasting more
space beyond "bro: ". OSs only give 16 characters for thread names
so anything we can gain here is nice.
Note - this compiles, but you cannot run Bro anymore - it crashes
immediately with a 0-pointer access. The reason behind it is that the
required clone functionality does not work anymore.
Broker had changed the semantics of remote logging: it sent over the
original Bro record containing the values to be logged, which on the
receiving side would then pass through the logging framework normally,
including triggering filters and events. The old communication system
however special-cases logs: it sends already processed log entries,
just as they go into the log files, and without any receiver-side
filtering etc. This more efficient as it short-cuts the processing
path, and also avoids the more expensive Val serialization. It also
lets the sender determine the specifics of what gets logged (and how).
This commit changes Broker over to now use the same semantics as the
old communication system.
TODOs:
- The new Broker code doesn't have consistent #ifdefs yet.
- Right now, when a new log receiver connects, all existing logs
are broadcasted out again to all current clients. That doesn't so
any harm, but is unncessary. Need to add a way to send the
existing logs to just the new client.
is present. Testcase crashes on unpatched versions of Bro.
Found by Aaron Eppert <aeppert@gmail.com>.
This (probably) fixes the crash issue with sqlite a few people have
reported on the mailing list in the past.
Setting the thread name on every heartbeat uses a mild amount of
cycles and there's not much benefit to doing it there to get the
additional info regarding the number of processed messages since thread
names usually get truncated to 16 characters and omit that part anyway.
First step - factored out everything the logging classes
use ( so only output ).
Moved the script-level configuration to logging/main,
and made the individual writers just refer to it -
no idea if this is good design. It works. But I am happy
about opinions :)
Next step - add support for input...
There are now two FinishedRotation() methods, one that triggers
post-processing and one that doesn't. There's also insurance built in
against a writer not calling either (or both), in which case we abort
with an internal error.
This changes writer implementations to always respond to rotation
messages in their DoRotate() method, even for failure/no-op cases
with a new RotationFailedMessage. This informs the manager to
decrement its count of pending rotations.
Addresses #860.
failure.
Once a writer/reader Do* method has returned false, no further ones
will be executed anymore. This is primarily a safety mechanism to make
it easier for writer/reader authors as otherwise they would often need
to track the failure state themselves (because with the now delayed
termination from the earlier commit, furhter messages can now still
arrive for a little bit).
Since WriterFrontend objects are looked up internally by writer type and
path, and they also expect to write consistent field arguments, it could
be the case that more than one filter of a given stream attempts to
write to the same path (derived either from $path or $path_func fields
of the filter) with the same writer type. This won't work, so now
WriterFrontend objects are bound to the filter that instantiated them so
that we can warn about other filters attempting to write to the
conflicting writer/path and the write can be skipped. Remote logs don't
appear to suffer the same issue due to pre-filtering.
Addresses #842.
Instantiations of WriterInfo in RemoteSerializer::ProcessLogCreateWriter()
would leave the network_time member uninitialized which could later
cause localtime_r() calls in Ascii::Timestamp() to return a null pointer
due to the bizarre input and giving that to strftime() causes it to segfault.
frameworks.
There were a number of cases that weren't thread-safe. In particular,
we don't use std::string anymore for anything that's passed between
threads (but instead plain old const char*, with manual memmory
managmenet).
This is still a check-point commit, I'll do more testing.
Turns out the finish methods weren't called correctly, caused by a
mess up with method names which all sounded too similar and the wrong
one ended up being called. I've reworked this by changing the
thread/writer/reader interfaces, which actually also simplifies them
by getting rid of the requirement for writer backends to call their
parent methods (i.e., less opportunity for errors).
This commit also includes the following (because I noticed the problem
above when working on some of these):
- The ASCII log writer now includes "#start <timestamp>" and
"#end <timestamp> lines in the each file. The latter supersedes
Bernhard's "EOF" patch.
This required a number of tests updates. The standard canonifier
removes the timestamps, but some tests compare files directly,
which doesn't work if they aren't printing out the same
timestamps (like the comm tests).
- The above required yet another change to the writer API to
network_time to methods.
- Renamed ASCII logger "header" options to "meta".
- Fixes#763 "Escape # when first character in log file line".
All btests pass for me on Linux FC15. Will try MacOS next.
* robin/topic/writer-info:
Extending the log writer DoInit() API.
Reworking log writer API to make it easier to pass additional information to a writer's initialization method.
Conflicts:
src/logging/WriterBackend.cc
src/logging/WriterBackend.h
src/logging/WriterFrontend.cc
* origin/fastpath:
Fix inconsistencies in random number generation.
Updating input framework unit tests.
Add front-end name to InitMessage from WriterFrontend to Backend.
Small tweak to make test complete quicker.
Drain events before terminating log/thread managers.
Fix strict-aliasing warning in RemoteSerializer.cc (fixes#834).
Fix typos in event documentation
Fix typos in NEWS for Bro 2.1 beta
At the time WriterBackend::Init() happens, it's in a different thread
than its frontend member, but tried to access it directly to get its
name, that info is now sent in the InitMessage instead.
(Problem was observed segfaulting the unit test
scripts.base.frameworks.notice.mail-alarms on Ubuntu 12.04).
We now pass in a Info struct that contains:
- the path name (as before)
- the rotation interval
- the log_rotate_base_time in seconds
- a table of key/value pairs with further configuration options.
To fill the table, log filters have a new field "config: table[string]
of strings". This gives a way to pass arbitrary values from
script-land to writers. Interpretation is left up to the writer.
Also splits calc_next_rotate() into two functions, one of which is
thread-safe and can be used with the log_rotate_base_time value from
DoInit().
Includes also updates to the None writer:
- It gets its own script writers/none.bro.
- New bool option LogNone::debug to enable debug output. It then
prints out all the values passed to DoInit(). That's used by a
btest test to ensure the new DoInit() values are right.
- Fixed a bug that prevented Bro from terminating..
(scripts.base.frameworks.logging.rotate-custom currently fails.
Haven't yet investigated why.)
Instead, the `addr_to_uri` script-level function can be used to
explicitly add brackets to an address if it's IPv6 and will be
included in a URI or when a ":<port>" needs to be appended to it.
I copied the code over manually, no merging, because (1) it needed to
be adapted to the new threading API, and (2) there's more stuff in the
branch that I haven't ported yet.
The DS output generally seems to work, but it has seen no further
testing yet.
Not unit tests yet either.
As we can't use the IPAddr class (because it's not thread-safe), this
involved a bit manual address manipulation and also shuffling some
things around a bit.
Not fully working yet, the tests for remote logging still fail.
- Data queued at termination wasn't written out completely.
- Fixed some race conditions.
- Fixing IOSource integration.
- Fixing setting thread names on Linux.
- Fixing minor leaks.
All tests now pass for me on Linux in debug and non-debug compiles.
Remaining TODOs:
- Needs leak check.
- Test on MacOS and FreeBSD.
- More testing:
- High volume traffic.
- Different platforms.