28 commits

Author SHA1 Message Date
Tim Wojtulewicz
b2f171ec69 Reformat the world 2021-09-16 15:35:39 -07:00
Tim Wojtulewicz
96d9115360 GH-1079: Use full paths starting with zeek/ when including files 2020-11-12 12:15:26 -07:00
Tim Wojtulewicz
fe0c22c789 Base: Clean up explicit uses of namespaces in places where they're not necessary.
This commit covers all of the common and base classes.
2020-08-24 12:07:00 -07:00
Tim Wojtulewicz
f310795d79 Move probabilistic code into zeek namespaces 2020-08-20 15:55:17 -07:00
Tim Wojtulewicz
bfab224d7c Move Reporter to zeek namespace 2020-07-31 16:22:41 -04:00
Tim Wojtulewicz
1248411a2f Use properly-sized loop variables or convert to ranged-for (bugprone-too-small-loop-variable) 2020-07-28 12:36:40 -07:00
Tim Wojtulewicz
d53c1454c0 Remove 'using namespace std' from SerialTypes.h
This unfortunately causes a ton of flow-down changes because a lot of other
code was depending on that definition existing. This has a fairly large chance
to break builds of external plugins, considering how many internal ones it broke.
2020-04-07 15:59:59 -07:00
Tim Wojtulewicz
eda1b4a23e Use const references over copying variables (performance-unnecessary-copy-initialization, performance-for-range-copy) 2020-02-11 11:02:08 -08:00
Tim Wojtulewicz
95d2af4501 Move constructors/operators should be marked noexcept to avoid the compiler picking the copy constructor instead (performance-noexcept-move-constructor) 2020-02-11 11:02:08 -08:00
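The effect named in this commit subject can be observed with std::vector growth: on reallocation the standard library moves existing elements only when the move constructor is noexcept (or the type cannot be copied), and copies them otherwise. Below is a minimal sketch, assuming two hypothetical types, ThrowingMove and NoexceptMove, and a hypothetical helper copies_during_growth; none of these come from the Zeek source.

```cpp
#include <type_traits>
#include <vector>

// Hypothetical type whose move constructor is NOT marked noexcept.
struct ThrowingMove {
    static int copies;
    ThrowingMove() = default;
    ThrowingMove(const ThrowingMove&) { ++copies; }
    ThrowingMove(ThrowingMove&&) {}  // may throw as far as the compiler knows
};
int ThrowingMove::copies = 0;

// Same type, but with the move constructor marked noexcept.
struct NoexceptMove {
    static int copies;
    NoexceptMove() = default;
    NoexceptMove(const NoexceptMove&) { ++copies; }
    NoexceptMove(NoexceptMove&&) noexcept {}
};
int NoexceptMove::copies = 0;

static_assert(!std::is_nothrow_move_constructible<ThrowingMove>::value, "");
static_assert(std::is_nothrow_move_constructible<NoexceptMove>::value, "");

// Counts copy-constructor calls made while a one-element vector reallocates.
template <typename T>
int copies_during_growth() {
    std::vector<T> v;
    v.reserve(1);
    v.emplace_back();             // constructed in place, no copy or move
    T::copies = 0;
    v.reserve(v.capacity() + 1);  // forces a reallocation of the one element
    return T::copies;
}
```

With the usual standard library implementations, copies_during_growth<ThrowingMove>() reports one copy while copies_during_growth<NoexceptMove>() reports none.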
Max Kellermann
0db61f3094 include cleanup
The Zeek code base has very inconsistent #includes.  Many sources
included a few headers, and those headers included other headers, and
in the end, nearly everything is included everywhere, so missing
#includes were never noticed.  Another side effect was a lot of header
bloat which slows down the build.

First step to fix it: in each source file, its own header should be
included first to verify that each header's includes are correct, and
none is missing.

After adding the missing #includes, I replaced lots of #includes
inside headers with class forward declarations.  In most headers,
object pointers are never referenced, so declaring the function
prototypes with forward-declared classes is just fine.

This patch speeds up the build by 19%, because each compilation unit
gets smaller.  Here are the "time" numbers for a fresh build (with a
warm page cache but without ccache):

Before this patch:

 3144.94user 161.63system 3:02.87elapsed 1808%CPU (0avgtext+0avgdata 2168608maxresident)k
 760inputs+12008400outputs (1511major+57747204minor)pagefaults 0swaps

After this patch:

 2565.17user 141.83system 2:25.46elapsed 1860%CPU (0avgtext+0avgdata 1489076maxresident)k
 72576inputs+9130920outputs (1667major+49400430minor)pagefaults 0swaps
2020-02-04 20:51:02 +01:00
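The forward-declaration technique described in this commit message can be sketched in a single file. The names Widget and describe, and the file names in the comments, are hypothetical and only illustrate the pattern; they are not taken from the Zeek tree.

```cpp
// --- widget_user.h (hypothetical) ---
// The header only passes Widget by pointer, so a forward declaration is
// enough and "Widget.h" does not need to be included here.
class Widget;
int describe(const Widget* w);  // the prototype compiles with an incomplete type

// --- Widget.h (hypothetical) ---
class Widget {
public:
    int id = 7;
};

// --- widget_user.cc (hypothetical) ---
// The implementation dereferences the pointer, so it needs the full
// definition. It would include its own header first, then "Widget.h".
int describe(const Widget* w) {
    return w ? w->id : -1;
}
```

Only the translation units that actually look inside Widget pay for parsing its definition, which is where the smaller compilation units come from.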
Tim Wojtulewicz
54752ef9a1 Deprecate the internal int/uint types in favor of the cstdint types they were based on 2019-08-12 13:50:07 -07:00
Jon Siwek
399496efa8 Merge remote-tracking branch 'origin/topic/johanna/remove-serializer'
* origin/topic/johanna/remove-serializer:
  Fix memory leak introduced by removing opaque of ocsp_resp.
  Change return value of OpaqueVal::DoSerialize.
  Add missing ShallowClone implementation for SetType
  Remove opaque of ocsp_resp.
  Remove remnants of event serializer.
  Fix cardinalitycounter deserialization.
  Smaller compile fixes for the new opaque serialization.
  Reimplement serialization infrastructure for OpaqueVals.
  Couple of compile fixes.
  Remove const from ShallowClone.
  Remove test-case for removed functionality
  Implement a Shallow Clone operation for types.
  Remove value serialization.

Various changes I made:

- Fix memory leak in type-checker for opaque vals wrapped in broker::data

- Noticed the two "copy-all" leak tests weren't actually checking for
  memory leaks because the heap checker isn't active until after zeek_init()
  is evaluated.

- Change OpaqueVal::DoClone to use the clone caching mechanism

- Improve copy elision for broker::expected return types in the various
  OpaqueVal serialize methods

  - Not all compilers properly treat the return of a local/automatic
    variable as an rvalue that can be moved, and so end up copying it
    instead.

  - Particularly, until GCC 8, this pattern ends up copying instead of
    moving, and we still support platforms whose default compiler
    pre-dates that version.

  - Generally seems it's something that wasn't addressed until C++14.
    See http://www.open-std.org/jtc1/sc22/wg21/docs/cwg_defects.html#1579

- Change OpaqueVal::SerializeType to return broker::expected

- Change probabilistic DoSerialize methods to return broker::expected
2019-06-20 13:38:54 -07:00
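The copy-elision point in this merge message concerns returning a local variable whose type differs from the function's return type. Whether a plain "return local;" moves or copies in that case depends on the compiler, as the message notes. One way to avoid relying on it is to move explicitly. The sketch below assumes a hypothetical Expected<T> wrapper as a stand-in for broker::expected and a hypothetical Payload type that counts copies. It is not the code from the merge.

```cpp
#include <utility>

// Hypothetical payload that counts how often it is copied.
struct Payload {
    static int copies;
    Payload() = default;
    Payload(const Payload&) { ++copies; }
    Payload(Payload&&) noexcept {}
};
int Payload::copies = 0;

// Hypothetical stand-in for a wrapper return type such as broker::expected.
template <typename T>
struct Expected {
    Expected(T v) : value(std::move(v)) {}
    T value;
};

// The local has type Payload but the function returns Expected<Payload>.
// Moving explicitly does not depend on the compiler treating the returned
// local as an rvalue during the conversion.
Expected<Payload> serialize_explicit_move() {
    Payload p;
    return Expected<Payload>(std::move(p));
}
```

Calling serialize_explicit_move() leaves Payload::copies untouched, because every step from the local to the returned wrapper is a move.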
Johanna Amann
ca28b98fd4 Fix cardinalitycounter deserialization.
This one took me way too long to admit. Values were pushed back on
deserialization - instead of assigned. Meaning they were added to the
end of the already 0-assigned vector.

The mean thing here is that estimation still worked - just merging
resulted in 0. And estimation still was correct because m, V, alpha_m
are enough for this - and those were correctly copied...

With this change, all tests pass.
2019-06-18 08:59:31 -07:00
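The bug described in this commit message, appending to a vector that was already zero-filled instead of overwriting it, can be reproduced in a few lines. The Counter struct and the two restore functions below are hypothetical stand-ins, not the CardinalityCounter code.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical counter whose constructor zero-fills m buckets.
struct Counter {
    explicit Counter(std::size_t m) : buckets(m, 0) {}
    std::vector<std::uint8_t> buckets;
};

// Buggy restore: push_back appends after the m zeros, so indices 0..m-1
// stay 0 and the saved values end up at indices m..2m-1.
void restore_with_push_back(Counter& c, const std::vector<std::uint8_t>& saved) {
    for (std::uint8_t v : saved)
        c.buckets.push_back(v);
}

// Fixed restore: assign each saved value to its existing slot.
void restore_with_assign(Counter& c, const std::vector<std::uint8_t>& saved) {
    for (std::size_t i = 0; i < saved.size() && i < c.buckets.size(); ++i)
        c.buckets[i] = saved[i];
}
```

Any code that reads only the first m buckets of the buggy result sees all zeros, which is one way a merge over those buckets could yield 0 while separately stored parameters remain correct.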
Johanna Amann
618f0802f4 Smaller compile fixes for the new opaque serialization.
Also remove the non-existent clone function for EntropyVals - which now
can just use serialization :)
2019-06-17 14:48:02 -07:00
Robin Sommer
01e662b3e0 Reimplement serialization infrastructure for OpaqueVals.
We need this to send opaques through Broker, and we also leverage it for
cloning opaques. The serialization methods now produce Broker data
instances directly, and no longer go through the binary formatter.

Summary of the new API for types derived from OpaqueVal:

    - Add DECLARE_OPAQUE_VALUE(<class>) to the class declaration
    - Add IMPLEMENT_OPAQUE_VALUE(<class>) to the class' implementation file
    - Implement these two methods (which are declared by the 1st macro):
        - broker::data DoSerialize() const
        - bool DoUnserialize(const broker::data& data)

This machinery should work correctly from dynamic plugins as well.

OpaqueVal provides a default implementation of DoClone() as well that
goes through serialization. Derived classes can provide a more
efficient version if they want.

The declaration of the "OpaqueVal" class has moved into the header
file "OpaqueVal.h", along with the new serialization infrastructure.
This is breaking existing code that relies on the location, but
because the API is changing anyways that seems fine.

This adds an internal BiF
"Broker::__opaque_clone_through_serialization" that does what the name
says: deep-copying an opaque by serializing, then deserializing. That
can be used to test the new functionality from btests.

Not quite done yet. TODO:
    - Not all tests pass yet:
        [  0%] language.named-set-ctors ... failed
        [ 16%] language.copy-all-opaques ... failed
        [ 33%] language.set-type-checking ... failed
        [ 50%] language.table-init-container-ctors ... failed
        [ 66%] coverage.sphinx-zeekygen-docs ... failed
        [ 83%] scripts.base.frameworks.sumstats.basic-cluster ... failed

      (Some of the serialization may still be buggy.)

    - Clean up the code a bit more.
2019-06-17 16:13:54 +00:00
Johanna Amann
474efe9e69 Remove value serialization.
Note - this compiles, but you cannot run Bro anymore - it crashes
immediately with a 0-pointer access. The reason behind it is that the
required clone functionality does not work anymore.
2019-05-09 11:54:38 -07:00
Robin Sommer
4d84ee82da Merge remote-tracking branch 'origin/topic/johanna/bit-1612'
Adding a new random seed for external tests.

I added a wrapper around the siphash() function to make calling it a
little bit safer at least.

BIT-1612 #merged

* origin/topic/johanna/bit-1612:
  HLL: Fix missing typecast in test case.
  Remove the -K/-J options for setting keys.
  Add test checking the quality of HLL by adding a lot of elements.
  Fix serializing probabilistic hashers.
  Baseline updates after hash function change.
  Also switch BloomFilters from H3 to siphash.
  Change Hashing from H3 to Siphash.
  HLL: Remove unnecessary comparison.
  Hyperloglog: change calculation of Rho
2016-07-14 16:26:17 -07:00
Johanna Amann
f1bae871e9 Also switch BloomFilters from H3 to siphash.
This removes all dependencies on H3 in our source tree.
2016-07-13 09:04:10 -07:00
Johanna Amann
e1218cc7fa Change Hashing from H3 to Siphash.
This commit mostly changes the hash function that is used for internal
hashing of data < 36 bytes from H3 to Siphash. This change is motivated
by the fact that H3 apparently does not deliver a very good source of
data uniqueness; running HLL with H3 as the hash function gives quite
poor results (up to 75% off in my tests). In contrast, running HLL with
Siphash (or HMAC-MD5) brings this error down to ~2%.

This also fixes a long-standing bug in Hash.h which truncated our hash
values to 32 bit on most machines.

Furthermore, it once again fixes a problem with the Rank function in
HLL.
2016-07-13 06:44:51 -07:00
Johanna Amann
b7c64c4522 HLL: Remove unnecessary comparison.
Rank always returns at least 1, hence this check is not necessary.
2016-06-15 11:33:37 -07:00
Johanna Amann
3aabe83ec6 Hyperloglog: change calculation of Rho
This commit changes the calculation of the rho-value to be in line with
the implementation of the original research paper, counting the number
of zero bits before the data.

This also fixes an infinite loop in case the hash value is 0.

I also cleaned up the code a bit, converting the raw pointers that were
used to an STL vector.

Addresses BIT-1612
2016-06-13 15:18:44 -07:00
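The rho calculation described in this commit message can be sketched as follows, assuming the common HyperLogLog definition of rho as the number of leading zero bits plus one within a fixed-width window. The function name rho and its bits parameter are illustrative and not taken from the Zeek source. The loop is bounded by the window width, so an all-zero input terminates instead of spinning forever.

```cpp
#include <cstdint>

// Rank (rho) of the low `bits` bits of w (1 <= bits <= 64): the number of
// leading zero bits in that window plus one. Returns bits + 1 when the
// window holds no set bit, so a zero hash cannot cause an endless loop.
unsigned rho(std::uint64_t w, unsigned bits) {
    unsigned rank = 1;
    for (unsigned i = bits; i > 0; --i) {
        if ((w >> (i - 1)) & 1u)
            return rank;  // found the leftmost set bit
        ++rank;
    }
    return rank;  // all `bits` bits were zero
}
```

For a 4-bit window, rho(0x8, 4) is 1, rho(0x1, 4) is 4, and rho(0x0, 4) is 5. The result is always at least 1.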
Bernhard Amann
ecc20b932a and const 2 more functions 2013-09-16 11:00:54 -07:00
Bernhard Amann
c0f780c728 update hll documentation, make a few functions private and create
a new copy constructor.
2013-09-16 10:40:25 -07:00
Jon Siwek
0b97343ff7 Fix various potential memory leaks.
Though I expect most not to be exercised in practice.
2013-09-12 15:23:52 -05:00
Robin Sommer
de5bb65ff7 Removing the "uint8*" methods from SerializationFormat.
They conflict with the "char" version, so that other classes would now
pick the wrong one. Added a bit of casting to HLL to use the "char"
versions instead.
2013-08-31 11:17:49 -07:00
Robin Sommer
6f9d28cc18 Merge branch 'topic/robin/hyperloglog-merge'
* topic/robin/hyperloglog-merge: (35 commits)
  Making the confidence configurable.
  Renaming HyperLogLog->CardinalityCounter.
  Fixing bug introduced during merging.
  add clustered leak test for hll. No issues.
  make gcc happy
  (hopefully) fix refcounting problem in hll/bloom-filter opaque vals. Thanks Robin.
  re-use same hash class for all add operations
  get hll ready for merging
  and forgot a file...
  adapt to new structure
  fix opaqueval-related memleak.
  make it compile on case-sensitive file systems and fix warnings
  make error rate configureable
  add persistence test not using predetermined random seeds.
  update cluster test to also use hll
  persistence really works.
  well, with this commit synchronizing the data structure should work.. ...if we had consistent hashing.
  and also serialize the other things we need
  ok, this bug was hard to find.
  serialization compiles.
  ...
2013-08-31 10:42:42 -07:00
Robin Sommer
295987c8d0 Making the confidence configurable. 2013-08-31 10:34:50 -07:00
Robin Sommer
fb3ceae6d5 Renaming HyperLogLog->CardinalityCounter.
For consistency with the class' name.
2013-08-31 10:22:27 -07:00
Renamed from src/probabilistic/HyperLogLog.cc