BIT-1048 #merged
I'm reverting the serializer version update for now as that breaks
Broccoli. Let's do that later for 2.2.
* topic/robin/topk-merge:
update documentation, rename get* to Get* and make hasher persistent
adapt to new folder structure
fix opaqueval-related memleak
synchronize pruned attribute
potentially found wrong Ref.
add sum function that can be used to get the number of total observed elements.
in cluster settings, the resultvals can apparently been uninitialized in some special cases
fix memory leaks
fix warnings
add topk cluster test
make size of topk-list configureable when using sumstats
implement merging for top-k.
add serialization for topk
make the get function const
topk for sumstats
well, a test that works..
implement topk.
Nice solution with the ComponentManager/TaggedComponent!
BIT-1049 #Merged Merged into master.
* origin/topic/jsiwek/faf-updates:
Fix some build errors.
Minor fix to file/protocol analyzer plugin reference doc.
Internal refactoring of how plugin components are tagged/managed.
Factor out the need for a tag field in Files::AnalyzerArgs record.
Add a distinct tag class for file analyzers.
Fix various documentation, mostly related to file analysis.
* topic/robin/bloom-filter-merge:
Using a real hash function for hashing a BitVector's internal state.
Support UHF hashing for >= UHASH_KEY_SIZE bytes.
Changing the Bloom filter hashing so that it's independent of CompositeHash.
Add new BiF for low-level Bloom filter initialization.
Introduce global_hash_seed script variable.
Conflicts:
testing/btest/Baseline/bifs.bloomfilter/output
* origin/topic/bernhard/topk:
adapt to new folder structure
fix opaqueval-related memleak
synchronize pruned attribute
potentially found wrong Ref.
add sum function that can be used to get the number of total observed elements.
in cluster settings, the resultvals can apparently been uninitialized in some special cases
fix memory leaks
fix warnings
add topk cluster test
make size of topk-list configureable when using sumstats
implement merging for top-k.
add serialization for topk
make the get function const
topk for sumstats
well, a test that works..
implement topk.
CompositeHash.
We do this by hashing values added to a BloomFilter another time more
with a stable hash seeded only by either the filter's name or the
global_hash_seed (or Bro's random() seed if neither is defined).
I'm also adding a new bif bloomfilter_internal_state() that returns a
string representation of a Bloom filter's current internal state. This
is solely for writing tests that check that the filters end up
consistent when seeded with the same value.
This commit adds support for script-level specification of a seed to be used by
hashers. For example, if the given name of a Bloom filter is not empty, then
the seed used by the underlying hasher only depends on the Bloom filter name.
If the name is empty, we check whether the user defined a non-empty
global_hash_seed string variable at script and use it instead. If that script
variable does not exist, then we fall back to the initial seed computed a
Bro startup (which is affected ultimately by $BRO_SEED).
See Hasher::MakeSeed for details.
with a bloom-filter already containing values.
I assume that it is ok to merge an empty bloom-filter with any bloom-filter -
if not we have to change the patch to return an error in this case.
* origin/topic/jsiwek/exec-module:
Exec module changes/fixes.
Coverage test fixes and whitespace/doc tweaks.
Update to make Dir::monitor watch inodes instead of file names.
Updates to use new input framework mechanism to execute command line programs.
Added Exec, Dir, and ActiveHTTP modules.
BIT-1046 #merged.
Conflicts:
magic
testing/btest/Baseline/coverage.default-load-baseline/canonified_loaded_scripts.log
* origin/topic/seth/faf-updates: (27 commits)
Undoing the FTP tests I updated earlier.
Update the last two btest FAF tests.
File analysis fixes and test updates.
Fix a bug with getting analyzer tags.
A few test updates.
Some tests work now (at least they all don't fail anymore!)
Forgot a file.
Added protocol description functions that provide a super compressed log representation.
Fix a bug where orig file information in http wasn't working right.
Added mime types to http.log
Clean up queued but unused file_over_new_connections event args.
Add jar files to the default MHR lookups.
Adding CAB files for MHR checking.
Improve malware hash registry script.
Fix a small issue with finding smtp entities.
Added support for files to the notice framework.
Make the custom libmagic database a git submodule.
Add an is_orig parameter to file_over_new_connection event.
Make magic for emitting application/msword mime type less strict.
Disable more libmagic builtin checks that override the magic database.
...
Conflicts:
doc/scripts/DocSourcesList.cmake
scripts/base/init-bare.bro
scripts/test-all-policy.bro
testing/btest/Baseline/coverage.bare-load-baseline/canonified_loaded_scripts.log
testing/btest/Baseline/coverage.default-load-baseline/canonified_loaded_scripts.log
- Fix examples/references in the file analysis how-to/usage doc.
- Add Broxygen-generated docs for file analyzer plugins.
- Break FTP::Info type declaration out in to its own file to get
rid of some circular dependencies (between s/b/p/ftp/main and
s/b/p/ftp/utils).
- Several places were just using old variable names or not loading
scripts correctly after they'd been renamed/moved.
- Revert/adjust a change in how HTTP file handles are generated that
broke partial content responses.
- Turn some libmagic builtin checks back on; seems some are actually
useful (e.g. text detection seems to be a builtin). The rule going
forward probably will be only to turn off a builtin if we confirm it
causes issues.
- Removed some tests that are redundant or not necessary anymore because
the generic file analysis tests cover them.
- A couple FTP tests still fail that I think need an actual solution via
script changes.
in.
No more manual includes to pull them in.
(It doesn't quite work fully automatically yet for some bifs that need
script-level types defined, like the input and logging frameworks.
They still do a manual "@load foo.bif" in their main.bro to get the
order right. It's a bit tricky to fix that and would probably need
splitting main.bro into two parts; not sure that's worth it.)
- Give Dir::monitor() a param for the polling interval, so different
dirs can be monitored at different frequencies.
- Fix race in Exec::run() when reading extra output files produced by
a process -- it was possible for Exec::run() to return before all
extra output files had been fully read.
- Add test cases.
- Intel importing format has changed (refer to docs).
- All string matching is now case insensitive.
- SMTP intel script has been updated to extract email
addresses correctly.
- Small fix sneaking into the smtp base script to actually
extract individual email addresses in the To: field
correctly.