270 commits

Author SHA1 Message Date
Tim Wojtulewicz
a7d3cb48ef Add concept of "parent" tag namespaces
This allows us to create an EnumType that groups all of the analyzer
tag values into a single type, while still having the existing types
that split them up. We can then use this for certain events that benefit
from taking all of the tag types at once.
2021-11-23 19:36:49 -07:00
Benjamin Bannier
72cbc7cd13 Move 3rdparty source files to 3rdparty/
This patch moves in-tree 3rdparty source files to `3rdparty/`. With that
we can remove special treatment of these files for `run-clang-format`.
2021-11-09 07:20:18 +01:00
Johanna Amann
1b3b9a3cfc Merge branch 'fsync-shadow-files-before-rename' of https://github.com/awelzel/zeek
* 'fsync-shadow-files-before-rename' of https://github.com/awelzel/zeek:
  logging/writers/ascii: shadow files: Add fsync() before rename()
2021-10-15 09:47:08 +01:00
Arne Welzel
dc6e21d6ae logging/writers/ascii: shadow files: Add fsync() before rename()
We're using shadow files for log rotation on systems with ext4 running
Linux 4.19. We've observed zero-length shadow files in the logger's working
directory after a power outage. This leads to a broken/stuck logger
process due to empty shadow files being considered invalid and the
process exiting:

    error: failed to process leftover log 'conn.log.gz': Found leftover log, 'conn.log.gz', but the associated shadow  file, '.shadow.conn.log.gz', required to process it is invalid

PR #1137 introduced atomic renaming of shadow files and was supposed to
handle this. However, after more investigation, the rename() has to be
preceded by an fsync() in order to avoid zero-length files in the presence
of hard crashes or power failures. This is generally operating-system
and filesystem dependent, but adding it should not hurt. The performance impact
is likely negligible due to the low frequency and limited number of
log streams.
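
The fsync()-before-rename() ordering described above can be sketched as follows. This is a minimal illustration using plain POSIX calls, not the code from the commit: the names `fsync_retry` and `write_file_durably` are hypothetical, and the real util::safe_fsync helper may behave differently.

```cpp
#include <cerrno>
#include <cstddef>
#include <string>
#include <fcntl.h>
#include <unistd.h>

// Hypothetical helper: retry fsync() if it is interrupted by a signal.
static bool fsync_retry(int fd) {
    int r;
    do {
        r = ::fsync(fd);
    } while (r < 0 && errno == EINTR);
    return r == 0;
}

// Write `data` to `tmp_path`, flush it to stable storage, and only then
// rename it to `final_path`. Without the fsync(), a crash shortly after the
// rename() can leave a zero-length file under the final name on some
// filesystems.
bool write_file_durably(const std::string& tmp_path,
                        const std::string& final_path,
                        const std::string& data) {
    int fd = ::open(tmp_path.c_str(), O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return false;

    std::size_t off = 0;
    while (off < data.size()) {
        ssize_t n = ::write(fd, data.data() + off, data.size() - off);
        if (n < 0) {
            if (errno == EINTR)
                continue;  // interrupted before writing anything; retry
            ::close(fd);
            return false;
        }
        off += static_cast<std::size_t>(n);
    }

    // Flush file contents before the rename makes the file visible.
    bool ok = fsync_retry(fd);
    ok = (::close(fd) == 0) && ok;
    if (!ok)
        return false;  // the temporary file is left behind for inspection

    return ::rename(tmp_path.c_str(), final_path.c_str()) == 0;
}
```

A stricter variant would also fsync() the containing directory after the rename so that the directory entry itself is durable; the commit message does not say whether that is needed here.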

This has happened to others, too. Some references around this issue:

* https://stackoverflow.com/questions/7433057/is-rename-without-fsync-safe
* https://unix.stackexchange.com/questions/464382/which-filesystems-require-fsync-for-crash-safety-when-replacing-an-existing-fi
* https://bugzilla.kernel.org/show_bug.cgi?id=15910

Reproducer

This issue was reproduced artificially on Linux using the sysrq-trigger
functionality to hard-reset the system shortly after a .shadow file was
renamed to its final destination with the following script watching for
.shadow.conn.log.gz:

    #!/bin/bash
    set -eu
    dir=/data/logger-01/

    # Allow everything via /proc/sysrq-trigger
    echo "1" > /proc/sys/kernel/sysrq

    inotifywait -m -e MOVED_TO --format '%e %w%f' "${dir}" | while read -r line; do
        if echo "${line}" | grep -q '^MOVED_TO .*/.shadow.conn.log.gz$'; then
            echo "RESET: $line"
            sleep 4
            # Trigger a hard-reset without sync/unmount
            echo "b" > /proc/sysrq-trigger
        fi
    done

This quite reliably (4 out of 4 times) yielded a system with zero-length
shadow files and a broken logger after it came back online:

    $ ls -lha /data/logger-01/.shadow.*
    -rw-r--r-- 1 bro bro 0 Oct 14 02:26 .shadow.conn.log.gz
    -rw-r--r-- 1 bro bro 0 Oct 14 02:26 .shadow.dns.log.gz
    -rw-r--r-- 1 bro bro 0 Oct 14 02:26 .shadow.files.log.gz

With this change applied, running the reproducer always left shadow files
with content after a hard reset.

Rework with util::safe_fsync helper
2021-10-14 15:54:45 +02:00
Tim Wojtulewicz
9af6b2f48d clang-format: Set penalty for breaking after assignment operator 2021-09-27 10:49:48 -07:00
Tim Wojtulewicz
4423574d26 clang-format: Set IndentCaseBlocks to false 2021-09-27 10:49:48 -07:00
Christian Kreibich
2585ccd873 Add unit tests for memory helpers 2021-09-20 17:51:43 -07:00
Christian Kreibich
c5cceaf5ad Add memory sizing/alignment helpers to util.cc/h
This functionality previously lived in the CompHash class, with one difference:
this removes a discrepancy between the offset aligner and the memory pointer
aligner/padder. The size aligner used to align the provided offset and then add an
additional alignment size (for example, 1 aligned to 4 wouldn't yield 4 but 8).
Like the memory aligners, it now only rounds up as needed.

Includes unit tests.
2021-09-20 17:51:43 -07:00
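
The rounding difference described in c5cceaf5ad can be illustrated with a short sketch. The function names below are hypothetical and are not the helpers added to util.cc/h; the code only reproduces the "1 aligned to 4" example from the commit message.

```cpp
#include <cstddef>

// Round `offset` up to the next multiple of `alignment`, leaving it unchanged
// when it is already aligned. Assumes alignment > 0.
constexpr std::size_t round_up_to(std::size_t offset, std::size_t alignment) {
    return offset % alignment == 0
               ? offset
               : offset + (alignment - offset % alignment);
}

// The earlier behaviour as the commit message describes it: align the offset,
// then add one more alignment unit on top.
constexpr std::size_t round_up_then_pad(std::size_t offset, std::size_t alignment) {
    return round_up_to(offset, alignment) + alignment;
}

static_assert(round_up_to(1, 4) == 4, "rounds up only as needed");
static_assert(round_up_then_pad(1, 4) == 8, "old behaviour adds an extra unit");
```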
Tim Wojtulewicz
b2f171ec69 Reformat the world 2021-09-16 15:35:39 -07:00
Tim Wojtulewicz
58cb9163d1 Fix mis-usage of string::append that leads to an overflow 2021-09-07 09:16:53 -07:00
Tim Wojtulewicz
404fed6923 Use json_escape_utf8 for all utf8 data in ODesc 2021-09-07 09:16:53 -07:00
Christian Kreibich
66e71c7fd2 Remove unnecessary >= 0 check in a UTF32 comparison
Resolves Coverity CID 1461523.
2021-08-25 14:13:17 -07:00
Christian Kreibich
ddbba17e57 Trivial signedness warning fix 2021-08-25 13:45:19 -07:00
Tim Wojtulewicz
f442893c98 Return fully-escaped string if utf8 conversion fails
This adds a new function for validating UTF-8 sequences by converting to
UTF-32. This allows us to also check for various blocks of codepoints
that we consider invalid while checking for valid sequences in general.
2021-08-19 08:56:27 -07:00
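
The idea in f442893c98 of validating UTF-8 by decoding it to UTF-32 and then inspecting the code points can be sketched as below. This is an illustration under assumptions, not the function from the commit: the names are hypothetical, and the rejected ranges in `is_rejected_block` are examples only, since the commit message does not list the blocks it treats as invalid.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical policy: code point ranges to reject (illustrative only).
static bool is_rejected_block(char32_t cp) {
    return (cp >= 0xD800 && cp <= 0xDFFF)   // UTF-16 surrogates
        || (cp >= 0xE000 && cp <= 0xF8FF)   // Private Use Area (assumption)
        || cp > 0x10FFFF;                   // beyond the Unicode range
}

// Decode UTF-8 into UTF-32. Returns false on malformed input, overlong
// encodings, or a code point that falls in a rejected block.
bool utf8_to_utf32(const std::string& in, std::vector<char32_t>& out) {
    // Smallest code point that legitimately needs a sequence of each length.
    static const char32_t min_for_len[] = {0, 0, 0x80, 0x800, 0x10000};

    out.clear();
    std::size_t i = 0;
    while (i < in.size()) {
        const unsigned char b0 = static_cast<unsigned char>(in[i]);
        char32_t cp;
        std::size_t len;
        if (b0 < 0x80)                { cp = b0;        len = 1; }
        else if ((b0 & 0xE0) == 0xC0) { cp = b0 & 0x1F; len = 2; }
        else if ((b0 & 0xF0) == 0xE0) { cp = b0 & 0x0F; len = 3; }
        else if ((b0 & 0xF8) == 0xF0) { cp = b0 & 0x07; len = 4; }
        else return false;  // stray continuation byte or invalid lead byte

        if (i + len > in.size())
            return false;  // truncated sequence

        for (std::size_t k = 1; k < len; ++k) {
            const unsigned char b = static_cast<unsigned char>(in[i + k]);
            if ((b & 0xC0) != 0x80)
                return false;  // not a continuation byte
            cp = (cp << 6) | (b & 0x3F);
        }

        if (cp < min_for_len[len] || is_rejected_block(cp))
            return false;

        out.push_back(cp);
        i += len;
    }
    return true;
}
```

A caller in the spirit of the commit title would fall back to emitting a fully escaped string whenever this check returns false.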
Johanna Amann
ec6b954499 Merge branch 'master' of https://github.com/sowmyaramapatruni/zeek
Fixes GH-1689

* 'master' of https://github.com/sowmyaramapatruni/zeek:
  Fix issue-1689
2021-08-03 10:25:26 +01:00
Sowmya Ramapatruni
58fae22708 Fix issue-1689 2021-08-02 13:52:43 -07:00
Christian Kreibich
63259ef9fa Use mallinfo2() instead of mallinfo() when available
glibc 2.33 deprecates mallinfo() in favor of mallinfo2(), whose struct
reports its members as size_t instead of int.
2021-07-01 16:40:28 -07:00
Andrew Benson
2ad482535e Fix incomplete-type for struct timeval 2021-03-29 22:41:31 -05:00
Tim Wojtulewicz
f45df63cd0 Merge remote-tracking branch 'origin/topic/vern/zval'
* origin/topic/vern/zval: (42 commits)
  whitespace tweaks
  resolved some TODO comments
  remove unnecessary casts, and change necessary ones to use static_cast<>
  explain cmp_func default
  change functions for ZVal type management to static members
  fix some unsigned/signed integer warnings
  address lint concern about uninitialized variable
  Remove use of obsolete forward-declaration macros
  fix #include's that lack zeek/ prefixes
  explicitly populate holes created in vectors
  fixes for now-incorrect assumption that GetField always returns an existing ValPtr
  memory management for assignment to vector elements
  memory management for assignment to record fields
  destructor cleanup from ZAM_vector/ZAM_record
  fix #include's that lack zeek/ prefixes
  overlooked another way in which vector holes can be created
  initialize vector holes to the correct corresponding type
  explicitly populate holes created in vectors
  fix other instances of GetField().get() assuming long-lived ValPtr's
  fix for now-incorrect assumption that GetField always returns an existing ValPtr
  ...
2021-03-23 20:44:19 -07:00
Jon Siwek
9ced370b48 Add starts_with()/ends_with() to zeek::util namespace 2021-02-26 14:43:55 -08:00
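
For readers unfamiliar with such helpers, a prefix/suffix check of the kind named in 9ced370b48 might look like the sketch below. The `_sketch` names and the `std::string_view` signatures are assumptions for illustration; the actual zeek::util declarations may differ.

```cpp
#include <string_view>

// True if `s` begins with `prefix`.
bool starts_with_sketch(std::string_view s, std::string_view prefix) {
    return s.size() >= prefix.size() &&
           s.compare(0, prefix.size(), prefix) == 0;
}

// True if `s` ends with `suffix`.
bool ends_with_sketch(std::string_view s, std::string_view suffix) {
    return s.size() >= suffix.size() &&
           s.compare(s.size() - suffix.size(), suffix.size(), suffix) == 0;
}
```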
Vern Paxson
62bab66114 migration to using new differentiated methods for setting record fields 2021-02-25 16:59:26 -08:00
Jon Siwek
dacdf5424b Merge remote-tracking branch 'origin/topic/jsiwek/deprecate-zeekenv'
* origin/topic/jsiwek/deprecate-zeekenv:
  Deprecate zeekenv() and use getenv() directly
2021-02-01 12:13:47 -08:00
Jon Siwek
8a8a983c49 Add missing zeek/ to header includes
Related to https://github.com/zeek/zeek/pull/1377
2021-01-29 19:16:29 -08:00
Jon Siwek
b8c563dbdd Deprecate zeekenv() and use getenv() directly 2021-01-29 16:55:44 -08:00
Tim Wojtulewicz
725e759560 Remove support for .bro script extension and BRO_ environment variables 2021-01-27 10:52:40 -07:00
Tim Wojtulewicz
0618be792f Remove all of the random single-file deprecations
These are the changes that don't require a ton of changes to other files outside
of the original removal.
2021-01-27 10:52:40 -07:00
Tim Wojtulewicz
96d9115360 GH-1079: Use full paths starting with zeek/ when including files 2020-11-12 12:15:26 -07:00
Tim Wojtulewicz
72ccaee4d5 GH-1256: Write out strerror when writing errno during safe_write 2020-10-30 15:45:32 -07:00
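
Reporting the strerror() text next to the raw errno value, as GH-1256 asks for, can be sketched like this. The function name, signature, and message format are hypothetical; the real safe_write may report and handle failures differently.

```cpp
#include <cerrno>
#include <cstddef>
#include <cstring>
#include <string>
#include <unistd.h>

// Write all `len` bytes of `buf` to `fd`, retrying when interrupted.
// On failure, store a message with both the errno value and its description.
bool safe_write_sketch(int fd, const char* buf, std::size_t len, std::string& err) {
    while (len > 0) {
        const ssize_t n = ::write(fd, buf, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;  // interrupted by a signal; retry
            const int saved = errno;  // keep errno before any other call changes it
            err = "write failed: errno " + std::to_string(saved) + " (" +
                  std::strerror(saved) + ")";
            return false;
        }
        buf += n;
        len -= static_cast<std::size_t>(n);
    }
    return true;
}
```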
Tim Wojtulewicz
69da2d7b1d Prep work for IP changes
- Move all of the time handling code out of PktSrc into RunState
- Call packet_mgr->ProcessPacket() from various places to set up layer 2 data in packets
2020-10-15 12:12:07 -07:00
Vlad Grigorescu
c3a395a6f0 Fix another umask issue. #1145 2020-08-26 18:07:21 -05:00
Vlad Grigorescu
e12db6bac0 Have mkdir in ensure_dir respect umask.
This also aligns with the mkdir bif. Fixes #1145
2020-08-26 10:01:20 -05:00
Jon Siwek
427a7de411 Merge remote-tracking branch 'origin/topic/timw/266-namespaces-part5'
- Did a few whitespace re-adjustments during merge

* origin/topic/timw/266-namespaces-part5:
  Update plugin btests for namespace changes
  Plugins: Clean up explicit uses of namespaces in places where they're not necessary.
  Base: Clean up explicit uses of namespaces in places where they're not necessary.
2020-08-25 19:51:42 -07:00
Tim Wojtulewicz
fe0c22c789 Base: Clean up explicit uses of namespaces in places where they're not necessary.
This commit covers all of the common and base classes.
2020-08-24 12:07:00 -07:00
Robin Sommer
165dcacd98 Make set_processing_status() signal-safe.
Closes #574.
2020-08-24 10:26:58 +00:00
Tim Wojtulewicz
54215ab9cd Rename methods in RunState to remove 'net' from their names 2020-08-20 16:11:47 -07:00
Tim Wojtulewicz
0ac3fafe13 Move zeek::net namespace to zeek::run_state namespace.
This also moves all of the code from Net.{h,cc} to RunState.{h,cc} and marks Net.h as deprecated
2020-08-20 16:11:47 -07:00
Tim Wojtulewicz
db36688bf0 Move a few smaller files to zeek namespaces 2020-08-20 16:11:46 -07:00
Tim Wojtulewicz
ddf48d7529 Move a few of the zeek::util methods and variables to zeek::util::detail 2020-08-20 16:11:44 -07:00
Tim Wojtulewicz
8d2d867a65 Move everything in util.h to zeek::util namespace.
This commit includes renaming a number of methods prefixed with bro_ to be prefixed with zeek_.
2020-08-20 16:00:33 -07:00
Tim Wojtulewicz
8862b585fa Deprecate ptr_compat_uint and ptr_compat_int in util.h 2020-08-20 15:55:17 -07:00
Tim Wojtulewicz
e7c6d51ae7 Move the functions and variables in Net.h to the zeek::net namespace. This includes moving network_time out of util.h. 2020-08-20 15:55:17 -07:00
Tim Wojtulewicz
be92bd536f Move iosource code to zeek namespaces 2020-08-20 15:55:17 -07:00
Tim Wojtulewicz
f1ed66d52c Fix some printf warnings with size_t values 2020-08-11 13:42:03 -07:00
Tim Wojtulewicz
4e9a5e9d98 Move ODesc to zeek namespace 2020-07-31 16:25:54 -04:00
Tim Wojtulewicz
a2a435360a Move all of the hashing classes/functions to zeek::detail namespace 2020-07-31 16:23:34 -04:00
Tim Wojtulewicz
bfab224d7c Move Reporter to zeek namespace 2020-07-31 16:22:41 -04:00
Jon Siwek
b17627fa09 Deprecate bro_srandom(), replace with zeek::seed_random().
The name zeek::srandom() is not used, to prevent confusion with srandom().
2020-07-22 14:01:33 -07:00
Jon Siwek
d486af06b1 Add zeek::max_random() & fix misuse of RAND_MAX w/ zeek::random_number()
In deterministic mode, RAND_MAX is not related to the result of
zeek::random_number() (formerly bro_random()), but some logic was
using RAND_MAX as an indication of the possible range of values.  The
new zeek::max_random() will give the correct upper bound regardless
of whether deterministic mode is used.
2020-07-22 14:01:33 -07:00
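
The RAND_MAX problem described in d486af06b1 shows up whenever a random value is scaled by an upper bound that does not belong to the generator that produced it. The sketch below uses hypothetical names; the bound 2147483646 is taken from the range quoted in the zeek::prng() commit, and 32767 is the smallest RAND_MAX the C standard permits.

```cpp
// Upper bound of the deterministic generator, per the range [1, 2147483646]
// quoted in the zeek::prng() commit. Hypothetical constant name.
constexpr long kSketchDeterministicMax = 2147483646L;

// Scale a random value into [0, 1] using the upper bound of the generator
// that actually produced it.
constexpr double to_unit_interval(long value, long generator_max) {
    return static_cast<double>(value) / static_cast<double>(generator_max);
}

// Correct bound: the largest value maps to exactly 1.0.
static_assert(to_unit_interval(kSketchDeterministicMax, kSketchDeterministicMax) == 1.0,
              "scaled with the generator's own maximum");

// Wrong bound: with a RAND_MAX of 32767 the result lands far outside [0, 1].
static_assert(to_unit_interval(kSketchDeterministicMax, 32767L) > 1.0,
              "scaled with an unrelated maximum");
```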
Jon Siwek
bde38893ce Deprecate bro_random(), replace with zeek::random_number()
Avoiding the use of zeek::random() due to potential for confusion
with random().
2020-07-22 14:01:33 -07:00
Jon Siwek
6bbb0a6b48 Deprecate bro_prng(), replace with zeek::prng()
The type used for storing the state of the RNG is changed from
`unsigned int` to `long int` since the former has a minimal range
of [0, 65535], while the RNG function itself has a range of
[1, 2147483646].  A `long int` must be capable of
[−2147483647, +2147483647] and is also the return type of `random()`,
which is what zeek::prng() aims to roughly match.
2020-07-22 14:01:33 -07:00
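
The range [1, 2147483646] quoted in 6bbb0a6b48 is what a Lehmer ("minimal standard") generator modulo 2^31 - 1 produces. Assuming that is the kind of generator in question, which the commit message does not state, the sketch below shows why a `long int` state is sufficient: every intermediate value stays within 32 signed bits. The function name is hypothetical.

```cpp
// One step of a Lehmer generator with multiplier 16807 and modulus 2^31 - 1,
// computed with Schrage's method so no intermediate exceeds the modulus in
// magnitude. For state in [1, 2147483646], the result is in the same range.
long int prng_sketch(long int state) {
    const long int a = 16807;        // multiplier
    const long int m = 2147483647;   // modulus, 2^31 - 1
    const long int q = 127773;       // m / a
    const long int r = 2836;         // m % a

    long int next = a * (state % q) - r * (state / q);
    if (next <= 0)
        next += m;
    return next;
}
```

An `unsigned int` that is only guaranteed to hold [0, 65535] could not store such a state portably, which matches the reasoning in the commit message.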