This commit marks (hopefully) ever one-parameter constructor as explicit.
It also uses override in (hopefully) all circumstances where a virtual
method is overridden.
There are a very few other minor changes - most of them were necessary
to get everything to compile (like one additional constructor). In one
case I changed an implicit operation to an explicit string conversion -
I think the automatically chosen conversion was much more convoluted.
This took longer than I want to admit but not as long as I feared :)
The two hooks being added are:
void HookLogInit(const std::string& writer, const std::string& instantiating_filter, bool local, bool remote, const logging::WriterBackend::WriterInfo& info, int num_fields, const threading::Field* const* fields);
which is called when a writer is being instantiated and contains
information about the fields being logged, as well as
bool HookLogWrite(const std::string& writer, const std::string& filter, const logging::WriterBackend::WriterInfo& info, int num_fields, const threading::Field* const* fields, threading::Value** vals);
which is called for each log line being written by each writer. It
contains all the data being written. The data can be changed in the
function call and lines can be prevented from being written.
This commit also fixes a few small problems with plugin hooks itself,
and extends the tests that were already there, besides introducing tests
for the added functionality.
This feature can be enabled globally for all logs by setting
LogAscii::gzip_level to a value greater than 0.
This feature can be enabled on a per-log basis by setting gzip-level in
$confic to a value greater than 0.
Broker had changed the semantics of remote logging: it sent over the
original Bro record containing the values to be logged, which on the
receiving side would then pass through the logging framework normally,
including triggering filters and events. The old communication system
however special-cases logs: it sends already processed log entries,
just as they go into the log files, and without any receiver-side
filtering etc. This more efficient as it short-cuts the processing
path, and also avoids the more expensive Val serialization. It also
lets the sender determine the specifics of what gets logged (and how).
This commit changes Broker over to now use the same semantics as the
old communication system.
TODOs:
- The new Broker code doesn't have consistent #ifdefs yet.
- Right now, when a new log receiver connects, all existing logs
are broadcasted out again to all current clients. That doesn't so
any harm, but is unncessary. Need to add a way to send the
existing logs to just the new client.
Looks like the right fix. Two tiny tweaks:
- changed the order of arguments for DeleteVals() for consistency
with the corresponding Manager function.
- turned the InternalWarning into a Warning: if I understand
correctly, this can happen when scripts on nodes diverge; which
is a user-side problem, not an internal Bro logic issue.
BIT-1683 #merged
* origin/topic/johanna/bit-1683:
Actually check if the number of fields in a write are equal to the number of fields required.
number of fields required.
Addresses BIT-1683
I do not think this quite fixes the underlying issue of BIT-1683 - it
should not be possible to get to this state in normal operations.
Also fixes a small memory leak for disabled writers.
The order in which the plugin initializers are executed is compiler
dependent. With this change, Tags will always be generated in
alphabetical ordering, not in compiler-dependent order.
* origin/topic/seth/log-framework-ext:
Log extensions: series of small fixes and new tests.
Change the function for log extension to take a path only and update tests.
Final changes to log framework ext code.
Add logging framework metadata mechanism.
Add unrolling separator & field name map to logging framework.
The extensions now work with optional types, as well with complex types
(like subrecords). Not returning a record in the ext_func no longer
crashes bro.
The default_ext_func was switched to return void in
cases where no extension revord is defined (was bool).
I also got rid of the offsets in the indices - with the rest of the
implementation, that was not really necessary and made the code more
complex.
The "metadata" functionality has been renamed to "ext" to
represent that the logs are being extended. The function that
returns the record which is used to extend the log now receives
a log filter as it's single argument.
The field name "unrolling" is now renamed to "scope" so the variables
names now look like this: "Log::default_scope_sep"
This change introduces error events for Table and Event readers. Users
can now specify an event that is called when an info, warning, or error
is emitted by their input reader. This can, e.g., be used to raise
notices in case errors occur when reading an important input stream.
Example:
event error_event(desc: Input::TableDescription, msg: string, level: Reporter::Level)
{
...
}
event bro_init()
{
Input::add_table([$source="a", $error_ev=error_event, ...]);
}
For the moment, this converts all errors in the Asciiformatter into
warnings (to show that they are non-fatal) - the Reader itself also has
to throw an Error to show that a fatal error occurred and processing
will be abort.
It might be nicer to change this and require readers to mark fatal
errors as such when throwing them.
Addresses BIT-1181
This allows all threads accessing the same database to share sqlite
objects. This, for example, fixes the issue with several threads
simultaneously writing to the same database file.
See https://www.sqlite.org/sharedcache.html
Addresses BIT-1325
- When a log record is being "unrolled" (sub-records flattened
out into a single record), it's now possible to choose the
character/string to separate the outer name from the inner
name. This can be used to work around the problems
with ElasticSearch 2.0 not supporting dots "." in field names.
This value can be provided per-filter as well as a global
default value.
- Log fields can be renamed by providing a table per-filter
(or a global default) to rename fields for any log writer.
The name translation is performed after unrolling so the
value in the field name table must match whatever is being
used to separate field names.
For example if the unrolling separator was set to "*":
redef Log::default_unrolling_sep = "*";
The field name map would need to reflect it:
redef Log::default_field_name_map = {
["id*orig_h"] = "src",
["id*orig_p"] = "src_port",
["id*resp_h"] = "dst",
["id*resp_p"] = "dst_port",
};
Turns out that in error situations, the final finish message might not
reach the writer anymore, as communication between the threads will be
shut down. Instead of aborting, we now just clean up in that case and
proceed. This isn't changing any other behaviour. The original error
check was in place mostly for helping debug the data flow between the
threads anyways.
Addresses BIT-1331.
is present. Testcase crashes on unpatched versions of Bro.
Found by Aaron Eppert <aeppert@gmail.com>.
This (probably) fixes the crash issue with sqlite a few people have
reported on the mailing list in the past.
Also changed asynchronous data store query code a bit; trying to make
memory management and handling of corner cases a bit clearer (former
maybe could still be better, but I need to lookup queries by memory
address to associate response cookies to them, and so wrapping pointers
kind of just gets in the way).
It now works a bit differently than before: whether to send a remote log
write is now a property of the logging stream, not the logging filter
and it's now up the the receiver side filters to instantiate the desired
writer. i.e. the sender now has no say in what the receiver should use
as the log writer backend.
Under the new style of remote logging, the "Log::enable_remote_logging"
option is repurposed to set the default behavior for new logging
streams. There's also "Comm::{enable,disable}_remote_logging()" to
explicitly set the desired behavior for a given logging stream. To
receive remote logs, one calls "Comm::subscribe_to_logs(<topic>)", where
senders implicitly use topics of the form "bro/log/<stream id>".
Setting the thread name on every heartbeat uses a mild amount of
cycles and there's not much benefit to doing it there to get the
additional info regarding the number of processed messages since thread
names usually get truncated to 16 characters and omit that part anyway.