When looking up the postprocessor function from shadow files, id::find_func()
would abort() if the function wasn't available instead of falling back
to the default postprocessor.
Fix by using id::find() and checking the type explicitly, adding a
strict type check while at it.
This issue was tickled by loading the json-streaming-logs package,
Zeek creating shadow files containing its custom postprocessor function,
then restarting Zeek without the package loaded.
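The lookup-with-fallback idea can be sketched in Python rather than Zeek's C++. This is only an illustration of the control flow: `GLOBALS`, `find_postprocessor`, and `default_postprocessor` are hypothetical names, not Zeek APIs.

```python
def default_postprocessor(path):
    """Hypothetical default: leave the rotated file name unchanged."""
    return path

# Hypothetical stand-in for the script-level global scope. The package's
# custom postprocessor is absent because the package is no longer loaded;
# "not_a_func" exists but has the wrong type.
GLOBALS = {"not_a_func": 42}

def find_postprocessor(name):
    """Look up `name` without aborting. Fall back to the default when the
    identifier is missing or is not a function."""
    candidate = GLOBALS.get(name)
    if candidate is None or not callable(candidate):
        return default_postprocessor
    return candidate
```

A missing identifier and an identifier of the wrong type both resolve to the default instead of terminating the process.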
Closes #4562
When a shadow file is empty/missing during rotation, Zeek aborts
with an error message, but if the shadow file was empty, it'll still
be there after the restart, causing an endless restart loop. This
solution gracefully handles the rotation in such cases using the
default file extension and postprocessing function.
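A minimal Python sketch of this graceful handling follows. It assumes, for illustration only, that a shadow file holds the file extension on its first line and the postprocessor function name on its second. `read_shadow` and both default values are hypothetical, not Zeek's implementation.

```python
DEFAULT_EXTENSION = ".log"          # assumed default, for illustration
DEFAULT_POSTPROCESSOR = "default"   # hypothetical name of the default function

def read_shadow(path):
    """Return (extension, postprocessor_name) from a shadow file, using the
    defaults when the file is missing, empty, or incomplete."""
    try:
        with open(path, "r", encoding="utf-8") as f:
            lines = f.read().splitlines()
    except FileNotFoundError:
        lines = []
    if len(lines) < 2:
        # Empty or truncated shadow file: do not abort, rotate with defaults.
        return DEFAULT_EXTENSION, DEFAULT_POSTPROCESSOR
    return lines[0], lines[1]
```

Because an empty file no longer raises an error, a restart can complete the rotation instead of hitting the same file again.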
This largely copies over Spicy's `.clang-format` configuration file. The
one place where we deviate is header include order since Zeek depends on
headers being included in a certain order.
Removes a bit of reliance around the magic DoLog() rendering at the
cost of needing to open-code some of it. The new obj_desc_short()
helper makes that acceptable, though.
Also modify FormatRotationPath to keep rotated logs within
Log::default_logdir unless the rotation function explicitly
sets dir, e.g. when the user redef'ed default_rotation_interval.
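The directory choice described here can be sketched in Python. `rotation_target` and its parameters are hypothetical names chosen for illustration; the real logic lives in Zeek's FormatRotationPath and may differ in detail.

```python
import posixpath

def rotation_target(rotated_name, rotation_dir="", default_logdir=""):
    """Pick the location of a rotated log: an explicitly set dir wins,
    otherwise the log stays within default_logdir."""
    base = rotation_dir if rotation_dir else default_logdir
    return posixpath.join(base, rotated_name) if base else rotated_name
```

With neither directory set, the name is returned unchanged and the file lands relative to the working directory.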
With the introduction of LogAscii::logdir, log filenames can now include
parent directories rather than being plain basenames. Enabling log rotation
or leftover log rotation while setting LogAscii::logdir broke because this
situation was not handled.
This change ensures that .shadow files are placed within the directory where
the respective .log file is created. Previously, the .shadow. (or .tmp.shadow.)
prefix was simply prepended, yielding nonsensical paths such as
.tmp.shadow.foo/bar/packet_filter.log for a logdir of foo/bar.
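The difference between the old and new behavior can be shown with a small Python sketch. `naive_shadow_path` and `shadow_path` are hypothetical helper names, not functions from the Zeek source.

```python
import posixpath

def naive_shadow_path(log_path, prefix=".shadow."):
    """Old behavior: prepend the prefix to the whole path."""
    return prefix + log_path

def shadow_path(log_path, prefix=".shadow."):
    """New behavior: keep the directory and prefix only the basename."""
    directory, basename = posixpath.split(log_path)
    return posixpath.join(directory, prefix + basename)
```

For `foo/bar/packet_filter.log`, the first helper yields `.shadow.foo/bar/packet_filter.log`, while the second yields `foo/bar/.shadow.packet_filter.log`. A plain basename such as `conn.log` gives `.shadow.conn.log` in both cases.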
Additionally, respect LogAscii::logdir when searching for leftover log files
rather than defaulting to the current working directory.
The following quirk exists around LogAscii::logdir and will be addressed
in a follow-up:
* By default, logs are currently rotated into the working directory of the
process, rather than staying confined within LogAscii::logdir. One of
the added tests shows this behavior.
We're using shadow files for log rotation on systems with ext4 running
Linux 4.19. We've observed zero-length shadow files in the logger's working
directory after a power-outage. This leads to a broken/stuck logger
process due to empty shadow files being considered invalid and the
process exiting:
error: failed to process leftover log 'conn.log.gz': Found leftover log, 'conn.log.gz', but the associated shadow file, '.shadow.conn.log.gz', required to process it is invalid
PR #1137 introduced atomic renaming of shadow files and was supposed to
handle this. However, after more investigation, the rename() has to be
preceded by an fsync() in order to avoid zero-length files in the presence
of hard-crashes or power-failures. This is generally operating system
and filesystem dependent, but it should not hurt to add. The performance impact
is likely negligible due to the low frequency and limited number of
log streams.
This has happened to others, too. Some references around this issue:
* https://stackoverflow.com/questions/7433057/is-rename-without-fsync-safe
* https://unix.stackexchange.com/questions/464382/which-filesystems-require-fsync-for-crash-safety-when-replacing-an-existing-fi
* https://bugzilla.kernel.org/show_bug.cgi?id=15910
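The write, fsync, rename ordering can be sketched in Python. This is a minimal illustration of the pattern, not Zeek's C++ code; `write_shadow_atomically` is a hypothetical name.

```python
import os

def write_shadow_atomically(final_path, content):
    """Write content to a temporary file, flush it to stable storage,
    then rename it over the final path."""
    directory = os.path.dirname(final_path) or "."
    # e.g. ".shadow.conn.log.gz" -> ".tmp.shadow.conn.log.gz"
    tmp_path = os.path.join(directory, ".tmp" + os.path.basename(final_path))
    with open(tmp_path, "w", encoding="utf-8") as f:
        f.write(content)
        f.flush()             # push Python's buffer to the OS
        os.fsync(f.fileno())  # force the data to disk before renaming
    os.replace(tmp_path, final_path)  # atomic rename on POSIX filesystems
```

Without the `os.fsync()` call, the rename can reach the disk before the file data does, which is how a crash can leave a zero-length file under the final name.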
Reproducer
This issue was reproduced artificially on Linux using the sysrq-trigger
functionality to hard-reset the system shortly after a .shadow file was
renamed to its final destination with the following script watching for
.shadow.conn.log.gz:
#!/bin/bash
set -eu
dir=/data/logger-01/
# Allow everything via /proc/sysrq-trigger
echo "1" > /proc/sys/kernel/sysrq
inotifywait -m -e MOVED_TO --format '%e %w%f' "${dir}" | while read -r line; do
    if echo "${line}" | grep -q '^MOVED_TO .*/.shadow.conn.log.gz$'; then
        echo "RESET: $line"
        sleep 4
        # Trigger a hard-reset without sync/unmount
        echo "b" > /proc/sysrq-trigger
    fi
done
This quite reliably (4 out of 4 times) yielded a system with zero-length
shadow files and a broken logger after it came back online:
$ ls -lha /data/logger-01/.shadow.*
-rw-r--r-- 1 bro bro 0 Oct 14 02:26 .shadow.conn.log.gz
-rw-r--r-- 1 bro bro 0 Oct 14 02:26 .shadow.dns.log.gz
-rw-r--r-- 1 bro bro 0 Oct 14 02:26 .shadow.files.log.gz
After this change while running the reproducer, the shadow files always
contained content after a hard-reset.
Rework with util::safe_fsync helper
* 'logging/script-logdir' of https://github.com/kramse/zeek:
Copy of ascii-empty test, just changed path in the beginning
Logdir: Change requested by 0xxon, no problem
Introduce script-land variable that can be used to set logdir.
Closes GH-772
This enables locating the headers within the install-tree using the
dirs provided by `zeek-config --include_dir`.
To enable locating these headers within the build-tree, this change also
creates a 'build/src/include/zeek -> ..' symlink.
A logger process being terminated/killed while in the process of creating
a new .shadow file may leave an empty (invalid) one around. This in turn
causes the logger to error and exit during startup.
$ ls -lha .shadow.*
-rw-r--r-- 1 root root 0 Dec 16 18:48 .shadow.dns.log
-rw-r--r-- 1 root root 0 Dec 16 18:48 .shadow.packet_filter.log
$ zeek LogAscii::enable_leftover_log_rotation=T Log::default_rotation_interval=30sec -i wlp0s20f3
error in <params>, line 1: failed to process leftover log 'dns.log': Found leftover log, 'dns.log', but the associated shadow file, '.shadow.dns.log', required to process it is invalid
error in <params>, line 1: failed to process leftover log 'packet_filter.log': Found leftover log, 'packet_filter.log', but the associated shadow file, '.shadow.packet_filter.log', required to process it is invalid
$ ...
Prevent creating invalid .shadow files by atomically creating them.
We have a use case to rotate leftover log files in a non-supervisor
setup. There doesn't seem to be a strict requirement on supervisor
functionality. Allow enabling leftover log rotation through
LogAscii::enable_leftover_log_rotation and redef this for the
logger node in a supervisor setup individually.