This unfortunately causes a ton of flow-down changes because a lot of other
code was depending on that definition existing. This has a fairly large chance
to break builds of external plugins, considering how many internal ones it broke.
Only one instance of base_type() getting a NewRef instead of AdoptRef
fixed in merge. All other changes are superficial formatting and
factoring.
* 'leaks' of https://github.com/MaxKellermann/zeek: (22 commits)
Stmt: use class IntrusivePtr
Stmt: remove unused default constructors and `friend` declarations
Val: remove unimplemented prototype recover_val()
Val: cast_value_to_type() returns IntrusivePtr
Val: use IntrusivePtr in check_and_promote()
Val: use nullptr instead of 0
zeekygen: use class IntrusivePtr
ID: use class IntrusivePtr
Expr: use class IntrusivePtr
Var: copy Location to stack, to fix use-after-free crash bug
Scope: lookup_ID() and install_ID() return IntrusivePtr<ID>
Scope: delete duplicate locals
EventRegistry: automatically delete EventHandlers
main: destroy event_registry after iosource_mgr
zeekygen/IdentifierInfo: delete duplicate fields
main: free the global scope in terminate_bro()
Scope: pop_scope() returns IntrusivePtr<>
Scope: unref all inits in destructor
Var: pass IntrusivePtr to add_global(), add_local() etc.
plugin/ComponentManager: hold a reference to the EnumType
...
Zeek scripts located on separate filesystems but sharing the same inode
number can end up not being loaded. The reason is that a `ScannedFile`
is identified only by `st_ino`, which is not enough to uniquely identify a
file in a system.
This problem may be hit when `ZEEKPATH` points to separate filesystems and
two script files happen to have the same `st_ino` value - definitely not very
likely, but possibly very confusing when it happens.
The following test case creates two zeek scripts on separate filesystems.
As the filesystems are freshly created and of the same type, the files will
(tested a few times with xfs/ext4) have the same `st_ino` values.
#!/bin/bash
ZEEKDIR=${ZEEKDIR:-/home/awelzel/projects/zeek}
export ZEEKPATH=.:${ZEEKDIR}/build/scripts:${ZEEKDIR}/scripts
cat << EOF > hello.zeek
event zeek_init() {
print("Hello, once or twice?");
}
EOF
for i in 1 2 ; do
    dd if=/dev/urandom of=img${i} count=16 bs=1M 2>/dev/null
    sudo mkfs.xfs -q ./img${i}
    mkdir -p mount${i}
    sudo mount ./img${i} ./mount${i}
    sudo cp hello.zeek ./mount${i}/hello.zeek
done
ls ./mount*/*zeek
stat -c "%n: device=%d inode=%i" ./mount*/hello.zeek
${ZEEKDIR}/build/src/zeek -b ./mount1/hello.zeek ./mount2/hello.zeek
# Cleanup
for i in 1 2 ; do
    sudo umount ./mount${i}
    rm -rfv ./img${i} ./mount${i}
    rm -rfv hello.zeek
done
Before this patch, `Hello, once or twice?` is printed only once,
afterwards twice:
$ sh testcase.sh
[sudo] password for awelzel:
./mount1/hello.zeek ./mount2/hello.zeek
./mount1/hello.zeek: device=1794 inode=6915
./mount2/hello.zeek: device=1795 inode=6915
Hello, once or twice?
Hello, once or twice?
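To illustrate why the inode alone is not a sufficient key, here is a minimal Python sketch that feeds the device and inode values printed above into two "already seen" sets. The helper names `file_key` and `is_new_file` are made up for this illustration and are not Zeek functions; the sketch only assumes that the fix keys files on the device number in addition to the inode.

```python
import os


def file_key(path):
    """Return a (device, inode) pair for path (hypothetical helper)."""
    st = os.stat(path)
    return (st.st_dev, st.st_ino)


def is_new_file(seen, key):
    """Record key in seen and report whether it was new."""
    if key in seen:
        return False
    seen.add(key)
    return True


# Device and inode values copied from the stat output of the test case.
seen_by_inode = set()
seen_by_dev_inode = set()
for dev, ino in [(1794, 6915), (1795, 6915)]:
    is_new_file(seen_by_inode, ino)              # inode only: second file collides
    is_new_file(seen_by_dev_inode, (dev, ino))   # device + inode: both are distinct
```

With the inode-only key the second script is treated as already scanned, which matches the "printed only once" behaviour described above.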
Strictly speaking, this is a memory leak because the Unref() call is
missing. Since this usually returns a "stock" object (`ValManager::b_true`
or `ValManager::b_false`), nothing really leaks. Eventually, however, the
reference counter will climb to `INT_MAX`, leading to a crash in
bad_ref().
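The effect can be modelled with a toy reference counter in Python. This is only a sketch of the failure mode, not Zeek's implementation: `StockValue`, `BadRef`, `leaky_call` and `balanced_call` are hypothetical names, and `MAX_REFS` is a tiny stand-in for `INT_MAX` so that the limit is reached after a handful of calls.

```python
MAX_REFS = 5  # stand-in for INT_MAX so the demo finishes quickly


class BadRef(Exception):
    """Raised when the reference count hits the limit (models bad_ref())."""


class StockValue:
    """A shared object that lives for the whole run, like a stock boolean."""

    def __init__(self):
        self.ref_cnt = 1

    def ref(self):
        self.ref_cnt += 1
        if self.ref_cnt >= MAX_REFS:
            raise BadRef("reference count reached the limit")
        return self

    def unref(self):
        self.ref_cnt -= 1


def leaky_call(stock):
    # Takes a new reference and never releases it.
    stock.ref()


def balanced_call(stock):
    # Takes a new reference and releases it again.
    stock.ref()
    stock.unref()
```

Calling `balanced_call` any number of times leaves the counter at 1, while repeated `leaky_call` invocations raise `BadRef` once the counter reaches `MAX_REFS`.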
The Zeek code base has very inconsistent #includes. Many sources
included a few headers, and those headers included other headers, and
in the end, nearly everything is included everywhere, so missing
#includes were never noticed. Another side effect was a lot of header
bloat which slows down the build.
First step to fix it: in each source file, its own header should be
included first to verify that each header's includes are correct, and
none is missing.
After adding the missing #includes, I replaced lots of #includes
inside headers with class forward declarations. In most headers,
object pointers are never dereferenced, so declaring the function
prototypes with forward-declared classes is just fine.
This patch speeds up the build by 19%, because each compilation unit
gets smaller. Here are the "time" numbers for a fresh build (with a
warm page cache but without ccache):
Before this patch:
3144.94user 161.63system 3:02.87elapsed 1808%CPU (0avgtext+0avgdata 2168608maxresident)k
760inputs+12008400outputs (1511major+57747204minor)pagefaults 0swaps
After this patch:
2565.17user 141.83system 2:25.46elapsed 1860%CPU (0avgtext+0avgdata 1489076maxresident)k
72576inputs+9130920outputs (1667major+49400430minor)pagefaults 0swaps
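For reference, the relative savings can be recomputed from the two "time" outputs above. The short Python sketch below does this for elapsed time, user CPU time and peak resident memory; it does not assume which of these measures the quoted percentage is based on.

```python
def seconds(minutes, secs):
    """Convert a minutes:seconds pair to seconds."""
    return minutes * 60 + secs


def reduction(before, after):
    """Relative reduction of a measurement, as a fraction of the old value."""
    return (before - after) / before


elapsed = reduction(seconds(3, 2.87), seconds(2, 25.46))
user_cpu = reduction(3144.94, 2565.17)
max_rss = reduction(2168608, 1489076)

print(f"elapsed: {elapsed:.1%}, user CPU: {user_cpu:.1%}, max RSS: {max_rss:.1%}")
```

The elapsed-time and user-CPU reductions come out at roughly 20% and 18% respectively, and peak memory use of the largest process drops by about 31%.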
The full process hierarchy isn't set up yet, but these changes
help prepare by doing two things:
- Add a -j option to enable supervisor-mode. Currently, just a single
"stem" process gets forked early on to be used as the basis for
further forking into real cluster nodes.
- Separate the parsing of command-line options from their consumption,
i.e. we need to parse whether we're in -j supervisor-mode before
modifying any global state, since that would taint the "stem" process.
The new intermediate structure containing the parsed options may
also serve as a way to pass configuration info from "stem" to its
descendant cluster node processes.
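The split between parsing and consumption can be sketched in Python as follows. This is an illustration of the idea only: the `Options` structure, its fields and `parse_options` are hypothetical and do not mirror the actual C++ code; only the `-j` flag is taken from the description above.

```python
import argparse
from dataclasses import dataclass, field


@dataclass
class Options:
    """Intermediate structure holding parsed options (hypothetical fields)."""
    supervisor_mode: bool = False
    scripts: list = field(default_factory=list)


def parse_options(argv):
    """Parse argv into an Options value without touching any global state."""
    parser = argparse.ArgumentParser(prog="zeek-sketch")
    parser.add_argument("-j", dest="supervisor_mode", action="store_true")
    parser.add_argument("scripts", nargs="*")
    ns = parser.parse_args(argv)
    return Options(supervisor_mode=ns.supervisor_mode, scripts=list(ns.scripts))
```

A caller would inspect `parse_options(argv).supervisor_mode` first, fork the "stem" process if it is set, and only afterwards apply the remaining options to global state. Because `Options` is a plain value, it could also be handed to child processes.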
* origin/topic/timw/deprecate-int-types:
Deprecate the internal int/uint types in favor of the cstdint types they were based on
Merge adjustments:
* A bpf type mistakenly got replaced (inside an unlikely #ifdef)
* Did a few substitutions that got missed (likely due to
pre-processing out of DEBUG macros)
For backward compatibility when reading values, we first check
the ZEEK-prefixed value, and if not set, then check the corresponding
BRO-prefixed value.
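The lookup order can be sketched in Python like this. The function name `getenv_with_fallback` and the variable suffix used in the usage note are placeholders chosen for the illustration, not names from the code base.

```python
import os


def getenv_with_fallback(suffix, environ=None):
    """Return the ZEEK-prefixed value, else the BRO-prefixed one, else None."""
    env = os.environ if environ is None else environ
    value = env.get("ZEEK" + suffix)
    if value is None:
        # Fall back to the legacy name for backward compatibility.
        value = env.get("BRO" + suffix)
    return value
```

For example, `getenv_with_fallback("_EXAMPLE", {"BRO_EXAMPLE": "old"})` returns `"old"`, while a `ZEEK_EXAMPLE` entry in the same mapping would take precedence.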
To be more exact: &encrypt, &mergeable, &rotate_interval, &rotate_size
Also removes no longer used redef-able constants:
log_rotate_interval, log_max_size, log_encryption_key
GH-243
This commit removes functions/events that were deprecated in Bro
2.6. It also removes the detection code that checks if the old
communication framework is used (since all the functions that are
checked were removed).
Addresses parts of GH-243
* All "Broxygen" usages have been replaced in
code, documentation, filenames, etc.
* Sphinx roles/directives like ":bro:see" are now ":zeek:see"
* The "--broxygen" command-line option is now "--zeexygen"
When searching for script files, look for both the new and old file
extensions. If a file with ".zeek" can't be found, then search for
a file with ".bro" as a fallback.
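A minimal Python sketch of this search order, assuming a single search directory; `find_script` is a hypothetical helper and not the function used in the code base:

```python
from pathlib import Path


def find_script(directory, name):
    """Look for name.zeek first, then fall back to name.bro."""
    for ext in (".zeek", ".bro"):
        candidate = Path(directory) / (name + ext)
        if candidate.is_file():
            return candidate
    return None
```

If both `name.zeek` and `name.bro` exist in the directory, the `.zeek` file wins because it is tried first.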
* origin/topic/vern/case-insensitive-patterns:
use PCRE syntax instead of the beautiful new (?i ...) syntax
nitlet in NEWS entry
test suite update for case-insensitive patterns
document use of double quotes to escape case-insensitivity
bug fix for recent memory leak patch
documentation updates for case-insensitive patterns
d'oh there's isalpha. I looked earlier for isletter :-P
fix for handling [:(lower|upper):] in case-insensitive patterns
implemented /re/i for case-insensitive patterns
* 'topic/vern/bit-ops' of https://github.com/bro/bro:
documentation clarification for "p1 | p2"
documentation for bitwise operators
document the '|' operator for patterns
test suite for bitwise operators
brief NEWS blurb
allow for "counter" operands (does anyone still use these?) for one (but not both) of the bitwise operands
bitwise operations for "count" types implemented
Starting branch for supporting bit operations on count's.
This environment variable is now set to listen only on IPv4 loopback
when running unit tests (instead of using the default INADDR_ANY).
This also moves some of the @loads out from init-bare.bro into a new
init-frameworks-and-bifs.bro in order to better support calling BIFs
(like `getenv`) from variable initializations in those particular
frameworks.
The configuration framework consists of three mostly distinct parts:
* option variables
* the config reader
* the script level framework
I will describe the three elements in the following.
Internally, this commit also performs a range of changes to the Input
manager; it marks a lot of functions as const and introduces a new
ValueToVal method (which could in theory replace the already existing
one - it is a bit more powerful).
This also changes SerialTypes to have a subtype for Values, just as
Fields already have it; I think it was mostly an oversight that this was
not introduced from the beginning. This should not necessitate any code
changes for people already using SerialTypes.
option variable
===============
The option keyword allows variables to be specified as run-time options.
Such variables cannot be changed using normal assignments. Instead, they
can be changed using Option::set. It is possible to "subscribe" to
options and be notified when an option value changes.
Change handlers can also change values before they are applied; this
gives them the opportunity to reject changes. Priorities can be
specified if there are several handlers for one option.
Example script:
option testbool: bool = T;

function option_changed(ID: string, new_value: bool): bool
{
    print fmt("Value of %s changed from %s to %s", ID, testbool, new_value);
    return new_value;
}

event bro_init()
{
    print "Old value", testbool;
    Option::set_change_handler("testbool", option_changed);
    Option::set("testbool", F);
    print "New value", testbool;
}
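The handler chain described above can be modelled in a few lines of Python. This is a toy model rather than the script-level API: `OptionStore` is a hypothetical class, and the ordering (handlers with higher priority run first) is an assumption made for the sketch.

```python
class OptionStore:
    """Toy model of options with prioritized change handlers."""

    def __init__(self):
        self.values = {}
        self.handlers = {}

    def set_change_handler(self, name, handler, priority=0):
        self.handlers.setdefault(name, []).append((priority, handler))

    def set(self, name, new_value):
        # Assumption: higher-priority handlers run first. Each handler may
        # return a different value, which is passed on to the next one.
        ordered = sorted(self.handlers.get(name, []),
                         key=lambda entry: entry[0], reverse=True)
        for _, handler in ordered:
            new_value = handler(name, new_value)
        self.values[name] = new_value
```

A handler rejects a change by returning the currently stored value instead of the proposed one, so the assignment at the end of `set` leaves the option unchanged.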
config reader
=============
The config reader provides a way to read configuration files back into
Bro. Most importantly it automatically converts values to the correct
types. This is important because it is at least inconvenient (and
sometimes near impossible) to perform the necessary type conversions in
Bro scripts themselves. This is especially true for sets/vectors.
Configuration files generally look like this:
[option name][tab/spaces][new variable value]
so, for example:
testaddr 2607:f8b0:4005:801::200e
testinterval 60
testtime 1507321987
test_set a b c d erdbeerschnitzel
The reader uses the option name to look up the type that variable has in
the Bro core and automatically converts the value to the correct type.
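The name-based conversion can be illustrated with a small Python sketch. The `OPTION_TYPES` table stands in for the type lookup that the reader performs in the core, and the converters are simplified assumptions (for instance, intervals and times are treated as plain seconds); none of this is the reader's actual code.

```python
import ipaddress

# Hypothetical type table; the real reader asks the core for the type of
# the option with the given name.
OPTION_TYPES = {
    "testaddr": "addr",
    "testinterval": "interval",
    "testtime": "time",
}

# Simplified converters, one per type name.
CONVERTERS = {
    "addr": ipaddress.ip_address,
    "interval": float,
    "time": float,
}


def parse_config_line(line):
    """Split one line into (option name, value converted to its type)."""
    name, raw_value = line.split(None, 1)
    convert = CONVERTERS[OPTION_TYPES[name]]
    return name, convert(raw_value.strip())
```

For example, `parse_config_line("testinterval 60")` yields the name together with a floating-point value, while the `testaddr` line from the example above yields an `ipaddress.IPv6Address`.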
Example script use:
type Idx: record {
    option_name: string;
};

type Val: record {
    option_val: string;
};

global currconfig: table[string] of string = table();

event InputConfig::new_value(name: string, source: string, id: string, value: any)
{
    print id, value;
}

event bro_init()
{
    Input::add_table([$reader=Input::READER_CONFIG, $source="../configfile",
                      $name="configuration", $idx=Idx, $val=Val,
                      $destination=currconfig, $want_record=F]);
}
Script-level config framework
=============================
The script-level framework ties these two features together and makes
them a bit more convenient to use. Configuration files can simply be
specified by placing them into Config::config_files. The framework also
creates a config.log that shows all value changes that took place.
Usage example:
redef Config::config_files += {configfile};
export {
    option testbool : bool = F;
}
The file is now monitored for changes; when a change occurs the
respective option values are automatically updated and the value change
is written to config.log.
* 'topic/corelight/load-hook' of https://github.com/corelight/bro:
Fix and extend behavior of HookLoadFile
I refactored some parts of scan.l to avoid the ambiguity of some
branches returning 0 and some branches not returning anything.