Commit graph

80 commits

Author SHA1 Message Date
Robin Sommer
6f9d28cc18 Merge branch 'topic/robin/hyperloglog-merge'
* topic/robin/hyperloglog-merge: (35 commits)
  Making the confidence configurable.
  Renaming HyperLogLog->CardinalityCounter.
  Fixing bug introduced during merging.
  add clustered leak test for hll. No issues.
  make gcc happy
  (hopefully) fix refcounting problem in hll/bloom-filter opaque vals. Thanks Robin.
  re-use same hash class for all add operations
  get hll ready for merging
  and forgot a file...
  adapt to new structure
  fix opaqueval-related memleak.
  make it compile on case-sensitive file systems and fix warnings
  make error rate configurable
  add persistence test not using predetermined random seeds.
  update cluster test to also use hll
  persistence really works.
  well, with this commit synchronizing the data structure should work.. ...if we had consistent hashing.
  and also serialize the other things we need
  ok, this bug was hard to find.
  serialization compiles.
  ...
2013-08-31 10:42:42 -07:00
Robin Sommer
4dcf8fc0db Merge remote-tracking branch 'origin/topic/bernhard/hyperloglog'
* origin/topic/bernhard/hyperloglog: (32 commits)
  add clustered leak test for hll. No issues.
  make gcc happy
  (hopefully) fix refcounting problem in hll/bloom-filter opaque vals. Thanks Robin.
  re-use same hash class for all add operations
  get hll ready for merging
  and forgot a file...
  adapt to new structure
  fix opaqueval-related memleak.
  make it compile on case-sensitive file systems and fix warnings
  make error rate configurable
  add persistence test not using predetermined random seeds.
  update cluster test to also use hll
  persistence really works.
  well, with this commit synchronizing the data structure should work.. ...if we had consistent hashing.
  and also serialize the other things we need
  ok, this bug was hard to find.
  serialization compiles.
  change plugin after feedback of seth
  Forgot a file. Again. Like always. Basically.
  do away with old file.
  ...
2013-08-30 11:30:05 -07:00
Bernhard Amann
d83edf8068 Merge remote-tracking branch 'origin/master' into topic/bernhard/hyperloglog
Conflicts:
	src/NetVar.cc
	src/NetVar.h
	src/SerialTypes.h
	src/probabilistic/CMakeLists.txt
	testing/btest/scripts/base/frameworks/sumstats/basic-cluster.bro
	testing/btest/scripts/base/frameworks/sumstats/basic.bro
2013-08-12 09:47:53 -07:00
Robin Sommer
2a0790c231 Changing the Bloom filter hashing so that it's independent of
CompositeHash.

We do this by hashing values added to a BloomFilter one more time
with a stable hash seeded only by either the filter's name or the
global_hash_seed (or Bro's random() seed if neither is defined).

I'm also adding a new bif bloomfilter_internal_state() that returns a
string representation of a Bloom filter's current internal state. This
is solely for writing tests that check that the filters end up
consistent when seeded with the same value.
2013-07-31 19:56:34 -07:00
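
The idea in the commit above can be illustrated with a minimal Python sketch. It is not Bro's C++ code: the names stable_positions, BloomFilter, global_seed and internal_state are hypothetical, SHA-256 stands in for whatever stable hash Bro uses, and the fallback to Bro's random() seed is left out. It only shows why two filters seeded with the same name end up with the same internal state when given the same values.

```python
import hashlib


def stable_positions(item, seed, num_hashes, num_bits):
    """Derive bit positions from a hash that depends only on seed and item."""
    positions = []
    for i in range(num_hashes):
        h = hashlib.sha256()
        h.update(len(seed).to_bytes(4, "big"))  # length prefix keeps seed and item apart
        h.update(seed)
        h.update(i.to_bytes(4, "big"))          # index selects the i-th hash function
        h.update(item)
        positions.append(int.from_bytes(h.digest()[:8], "big") % num_bits)
    return positions


class BloomFilter:
    def __init__(self, name=None, global_seed="0", num_bits=1024, num_hashes=4):
        # Assumption: seed with the filter's name if given, else a global seed.
        seed_text = name if name is not None else global_seed
        self.seed = seed_text.encode("utf-8")
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for clarity only

    def add(self, item):
        for p in stable_positions(item, self.seed, self.num_hashes, self.num_bits):
            self.bits[p] = 1

    def contains(self, item):
        return all(
            self.bits[p]
            for p in stable_positions(item, self.seed, self.num_hashes, self.num_bits)
        )

    def internal_state(self):
        # Digest of the bit vector, handy for comparing filters in tests.
        return hashlib.sha256(bytes(self.bits)).hexdigest()
```

Two filters created with the same name and fed the same values report the same internal_state(), which is the property the new test-only bif is meant to check.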
Bernhard Amann
83ce77e575 re-use same hash class for all add operations 2013-07-30 18:48:05 -07:00
Bernhard Amann
18c10f3cb5 get hll ready for merging 2013-07-30 16:47:26 -07:00
Bernhard Amann
32c2885742 Merge remote-tracking branch 'origin/master' into topic/bernhard/hyperloglog
Conflicts:
	src/Func.cc
	src/probabilistic/CMakeLists.txt
2013-07-25 14:46:38 -07:00
Bernhard Amann
b7cdfc0e6e adapt to new structure 2013-07-24 12:50:01 -07:00
Matthias Vallentin
5769c32f1e Support emptiness check on Bloom filters. 2013-07-24 13:18:19 +02:00
Matthias Vallentin
5736aef440 Refactor Bloom filter merging. 2013-07-24 13:05:38 +02:00
Matthias Vallentin
5383e8f75b Add bloomfilter_clear() BiF. 2013-07-24 11:21:10 +02:00
Bernhard Amann
9e0fd963e0 Merge remote-tracking branch 'origin/topic/robin/bloom-filter-merge' into topic/bernhard/hyperloglog
Conflicts:
	scripts/base/frameworks/sumstats/plugins/__load__.bro
	src/CMakeLists.txt
	src/NetVar.cc
	src/NetVar.h
	src/OpaqueVal.h
	src/SerialTypes.h
	src/bro.bif
2013-07-23 21:31:05 -07:00
Robin Sommer
474107fe40 Broifying the code.
Also extending API documentation a bit more and fixing a memory leak.
2013-07-23 20:10:32 -07:00
Robin Sommer
21685d2529 Merge remote-tracking branch 'origin/topic/matthias/bloom-filter'
I'm moving the new files into a subdirectory probabilistic, and into a
corresponding namespace. We can later put code for the other
probabilistic data structures there as well.

* origin/topic/matthias/bloom-filter: (45 commits)
  Implement and test Bloom filter merging.
  Make hash functions equality comparable.
  Make counter vectors mergeable.
  Use half adder for bitwise addition and subtraction.
  Fix and test counting Bloom filter.
  Implement missing CounterVector functions.
  Tweak hasher interface.
  Add missing include for GCC.
  Fixing for unserialization error.
  Small fixes and style tweaks.
  Only serialize Bloom filter type if available.
  Create hash policies through factory.
  Remove lingering debug code.
  Factor implementation and change interface.
  Expose Bro's linear congruence PRNG as utility function.
  H3 does not check for zero length input.
  Support seeding for hashers.
  Add utility function to access first random seed.
  Update H3 documentation (and minor style nits.)
  Make H3 seed configurable.
  ...
2013-07-23 16:40:56 -07:00
Matthias Vallentin
a39f980cd4 Implement and test Bloom filter merging. 2013-07-22 18:11:12 +02:00
Matthias Vallentin
5f70452a9a Small fixes and style tweaks. 2013-06-18 10:40:00 -07:00
Matthias Vallentin
14a701a237 Implement value merging.
The actual BloomFilter merging is still missing; this is just the first step
in the right direction from the user interface side.
2013-06-10 22:46:24 -07:00
Matthias Vallentin
880d02f720 Associate a Comphash with a BloomFilterVal.
We also keep track of the Bloom filter's element type inside each value. The
first use of the BiF bloomfilter_add will "typify" the Bloom filter and lock
the Bloom filter's type to the element type.
2013-06-05 16:25:48 -07:00
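
A rough Python sketch of the "typify on first add" behavior described above. The class name TypedBloomFilter is hypothetical and a plain set stands in for the real bit vector; it is only meant to show the type being locked by the first insertion.

```python
class TypedBloomFilter:
    def __init__(self):
        self.element_type = None  # unknown until the first add
        self.items = set()        # stand-in for the real bit vector

    def add(self, value):
        if self.element_type is None:
            # First use locks the filter to this element type.
            self.element_type = type(value)
        elif type(value) is not self.element_type:
            raise TypeError(
                "filter holds %s, got %s"
                % (self.element_type.__name__, type(value).__name__)
            )
        self.items.add(value)
```

After the first add of an integer, for example, adding a string raises TypeError rather than silently mixing element types.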
Matthias Vallentin
751cf61293 Add more serialization implementation. 2013-06-04 15:30:27 -07:00
Matthias Vallentin
f708cd4a36 Work on parameter estimation and serialization. 2013-06-03 22:55:21 -07:00
Bernhard Amann
3e74cdc6e0 Merge remote-tracking branch 'origin/master' into topic/bernhard/hyperloglog 2013-05-03 22:58:02 -07:00
Matthias Vallentin
9ac00f8c79 Do not allocate one OpaqueType per OpaqueVal.
Instead, we now allocate type information globally in NetVar.cc.

Addresses #986.
2013-05-03 15:48:06 -07:00
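
The change above can be pictured with a small Python sketch, under the assumption that type information is immutable and can be shared. OpaqueType, OpaqueVal and MD5_TYPE are illustrative names here, not Bro's actual declarations in NetVar.cc.

```python
class OpaqueType:
    def __init__(self, name):
        self.name = name


# Allocated once, globally, instead of once per value.
MD5_TYPE = OpaqueType("md5")


class OpaqueVal:
    def __init__(self, opaque_type):
        # Every value refers to the shared type object.
        self.type = opaque_type


def new_md5_val():
    return OpaqueVal(MD5_TYPE)
```

Every value returned by new_md5_val() points at the same OpaqueType instance, so creating many values no longer creates many type objects.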
Bernhard Amann
240d667e30 ok, this bug was hard to find.
hyperloglog.h was missing guards and randomly deleting memory at
addresses equal to variable contents.

I am not entirely sure why that did not crash before...
2013-04-10 13:45:21 -04:00
Bernhard Amann
a37ffab0ea serialization compiles.
Not entirely sure if it works too...
2013-04-10 13:15:31 -04:00
Bernhard Amann
53d6f3aae7 rework cardinality interface to use opaque.
I like it better...
2013-04-07 23:05:14 +02:00
Robin Sommer
da90976170 Merge remote-tracking branch 'origin/topic/matthias/opaque'
* origin/topic/matthias/opaque:
  Add new unit test for opaque serialization.
  Migrate entropy testing to opaque.
  C++ify RandTest.*
  Fix a hard-to-spot bug.
  Use more descriptive error message.
  Fix the fix :-/.
  Fix initialization of hash values.
  Be clearer about delegation.
  Implement serialization of opaque types.
  Update hash BiF documentation.
  Migrate free SHA* functions to SHA*Val::digest().
  Add missing type name that caused failing tests.
  Update base scripts and unit tests.
  Simplify hash function BiFs.
  Add support for opaque hash values.
  Adapt BiF & Bro parser to handle opaque types.
  More lexer/parser work.
  Implement equivalence relation for opaque types.
  Support basic serialization of opaque.
  Add opaque type to lexer, parser, and BroType.

Closes #925

Conflicts:
	aux/broccoli
2012-12-20 16:30:22 -08:00
Matthias Vallentin
b9d05f56d0 Migrate entropy testing to opaque. 2012-12-13 19:28:19 -08:00
Matthias Vallentin
652a015522 Be clearer about delegation.
Bro uses the Do* prefix to signify the implementation of an aspect. This commit
adapts the opaque values to use this pattern.
2012-12-12 14:54:07 -08:00
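
As an illustration of the delegation pattern mentioned above, here is a hedged Python sketch. In Bro the methods are C++ members with a Do prefix; the names serialize, do_serialize and CounterVal below are hypothetical stand-ins. The public method does the shared work and delegates the type-specific part to the do_* hook.

```python
class OpaqueVal:
    def serialize(self):
        # Public entry point: shared bookkeeping lives here, once.
        return {"type": type(self).__name__, "state": self.do_serialize()}

    def do_serialize(self):
        # Subclasses implement only the type-specific part.
        raise NotImplementedError


class CounterVal(OpaqueVal):
    def __init__(self, count):
        self.count = count

    def do_serialize(self):
        return {"count": self.count}
```

Callers only ever use serialize(); a new opaque value type overrides do_serialize() and inherits the common framing.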
Matthias Vallentin
ddd306f00f Migrate free SHA* functions to SHA*Val::digest(). 2012-12-12 10:28:56 -08:00
Matthias Vallentin
624003f036 Add support for opaque hash values. 2012-12-11 16:25:11 -08:00