The options were never really used and do not seem especially useful;
initialization with a seed file still works.
This also fixes a bug with the initialization of the siphash key.
The test adds 170,000 IP addresses. After the recent hashing changes,
HLL estimates 171,250 entries (completely stable). Before, HLL estimated,
depending on the initial seeds, ~700 to 300,000 entries.
Added a new BIF haversine_distance that computes distance between two
geographic locations.
Added a new Bro script function haversine_distance_ip that does the same
but takes two IP addresses instead of latitude/longitude. This function
requires that Bro be built with libgeoip.
(Cleaned up some code a little bit.)
* origin/topic/seth/stats-improvement:
Fixing tests for stats improvements
Rename the reporting interval variable for stats.
Removing more broken functionality due to changed stats apis.
Removing some references to resource_usage()
Removing Broker stats, it was broken and incomplete.
Fixing default stats collection interval to every 5 minutes.
Add DNS stats to the stats.log
Small stats script tweaks and beginning broker stats.
Continued stats cleanup and extension.
More stats collection extensions.
More stats improvements
Slight change to Mach API for collecting memory usage.
Fixing some small mistakes.
Updating the cmake submodule for the stats updates.
Fix memory usage collection on Mac OS X.
Cleaned up stats collection.
BIT-1581 #merged
This bif works similar to the matching_subnet bif. The difference is
that, instead of returning a vector of the subnets that match, we return
a filtered view of the original set/table only containing the changed
subnets.
This commit also fixes a small bug in TableVal::UpdateTimestamp
(ReadOperation only has to be called when LoggingAccess() is true).
Without this patch, this scenario results in a segmentation fault.
I opted to keep the separator present for non-existing elements. Hence,
a vector a, [empty], b with separator "|" will result in
a||b
* 'topic/jgras/base64-logging' of https://github.com/J-Gras/bro:
Update calls of Base64 functions.
Refactoring of Base64 functions.
I've removed the additional bif for encoding with a connection, as I'm
not sure there's much of a use case for it; we can always add it back
later if it turns out there is. I've also renamed
decode_base64_intern() to decode_base64_conn() to be a bit more
explicit about the difference.
Base64Converter now uses a connection directly, instead of an analyzer
redirecting to the underlying connection for reporting to Weird. The new
built-in functions en-/decode_base64_intern make use of this to send
encoding-errors to Weird instead of Reporter.
According to the documentation, using the empty string as alphabet in
the built-in functions, will use the default alphabet. Therefore the
built-in functions can now use default arguments and
en-/decode_base64_custom is deprecated.
The tests have been updated accordingly.
This also patches a few tests to contain certificates that were removed.
Furthermore, we include the old CA file with the external tests and load
it automatically. Those traces are kind of old now, more and more of the
CAs in them are no longer valid and it does not really make sense to
update them on each change...
These functions are now deprecated in favor of alternative versions that
return a vector of strings rather than a table of strings.
Deprecated functions:
- split: use split_string instead.
- split1: use split_string1 instead.
- split_all: use split_string_all instead.
- split_n: use split_string_n instead.
- cat_string_array: see join_string_vec instead.
- cat_string_array_n: see join_string_vec instead.
- join_string_array: see join_string_vec instead.
- sort_string_array: use sort instead instead.
- find_ip_addresses: use extract_ip_addresses instead.
Changed functions:
- has_valid_octets: uses a string_vec parameter instead of string_array.
Addresses BIT-924, BIT-757.
* origin/topic/jsiwek/file-signatures:
File type detection changes and fix https.log {orig,resp}_fuids fields.
Various minor changes related to file mime type detection.
Refactor common MIME magic matching code.
Replace libmagic w/ Bro signatures for file MIME type identification.
Conflicts:
scripts/base/init-default.bro
testing/btest/Baseline/coverage.bare-load-baseline/canonified_loaded_scripts.log
testing/btest/Baseline/coverage.default-load-baseline/canonified_loaded_scripts.log
BIT-1143 #merged
Notable changes:
- libmagic is no longer used at all. All MIME type detection is
done through new Bro signatures, and there's no longer a means to get
verbose file type descriptions (e.g. "PNG image data, 1435 x 170").
The majority of the default file magic signatures are derived
from the default magic database of libmagic ~5.17.
- File magic signatures consist of two new constructs in the
signature rule parsing grammar: "file-magic" gives a regular
expression to match against, and "file-mime" gives the MIME type
string of content that matches the magic and an optional strength
value for the match.
- Modified signature/rule syntax for identifiers: they can no longer
start with a '-', which made for ambiguous syntax when doing negative
strength values in "file-mime". Also brought syntax for Bro script
identifiers in line with reality (they can't start with numbers or
include '-' at all).
- A new Built-In Function, "file_magic", can be used to get all
file magic matches and their corresponding strength against a given
chunk of data
- The second parameter of the "identify_data" Built-In Function
can no longer be used to get verbose file type descriptions, though it
can still be used to get the strongest matching file magic signature.
- The "file_transferred" event's "descr" parameter no longer
contains verbose file type descriptions.
- The BROMAGIC environment variable no longer changes any behavior
in Bro as magic databases are no longer used/installed.
- Reverted back to minimum requirement of CMake 2.6.3 from 2.8.0
(it's back to being the same requirement as the Bro v2.2 release).
The bump was to accomodate building libmagic as an external project,
which is no longer needed.
Addresses BIT-1143.
This changes the internal type that is used to signal that a vector
is unspecified from any to void.
I tried to verify that the behavior of Bro is still the same. After
a lot of playing around, I think everything still should worl as before.
However, it might be good for someone to take a look at this.
addresses BIT-1144
* origin/topic/bernhard/hyperloglog: (32 commits)
add clustered leak test for hll. No issues.
make gcc happy
(hopefully) fix refcounting problem in hll/bloom-filter opaque vals. Thanks Robin.
re-use same hash class for all add operations
get hll ready for merging
and forgot a file...
adapt to new structure
fix opaqueval-related memleak.
make it compile on case-sensitive file systems and fix warnings
make error rate configureable
add persistence test not using predetermined random seeds.
update cluster test to also use hll
persistence really works.
well, with this commit synchronizing the data structure should work.. ...if we had consistent hashing.
and also serialize the other things we need
ok, this bug was hard to find.
serialization compiles.
change plugin after feedback of seth
Forgot a file. Again. Like always. Basically.
do away with old file.
...
BIT-1048 #merged
I'm reverting the serializer version update for now as that breaks
Broccoli. Let's do that later for 2.2.
* topic/robin/topk-merge:
update documentation, rename get* to Get* and make hasher persistent
adapt to new folder structure
fix opaqueval-related memleak
synchronize pruned attribute
potentially found wrong Ref.
add sum function that can be used to get the number of total observed elements.
in cluster settings, the resultvals can apparently been uninitialized in some special cases
fix memory leaks
fix warnings
add topk cluster test
make size of topk-list configureable when using sumstats
implement merging for top-k.
add serialization for topk
make the get function const
topk for sumstats
well, a test that works..
implement topk.
* topic/robin/bloom-filter-merge:
Using a real hash function for hashing a BitVector's internal state.
Support UHF hashing for >= UHASH_KEY_SIZE bytes.
Changing the Bloom filter hashing so that it's independent of CompositeHash.
Add new BiF for low-level Bloom filter initialization.
Introduce global_hash_seed script variable.
Conflicts:
testing/btest/Baseline/bifs.bloomfilter/output
* origin/topic/bernhard/topk:
adapt to new folder structure
fix opaqueval-related memleak
synchronize pruned attribute
potentially found wrong Ref.
add sum function that can be used to get the number of total observed elements.
in cluster settings, the resultvals can apparently been uninitialized in some special cases
fix memory leaks
fix warnings
add topk cluster test
make size of topk-list configureable when using sumstats
implement merging for top-k.
add serialization for topk
make the get function const
topk for sumstats
well, a test that works..
implement topk.
CompositeHash.
We do this by hashing values added to a BloomFilter another time more
with a stable hash seeded only by either the filter's name or the
global_hash_seed (or Bro's random() seed if neither is defined).
I'm also adding a new bif bloomfilter_internal_state() that returns a
string representation of a Bloom filter's current internal state. This
is solely for writing tests that check that the filters end up
consistent when seeded with the same value.