By avoiding to use `broker::data` directly, we gain a degree of freedom
that allows us to swap out `broker::data` for something else (e.g.,
`broker::variant`) in the future. Furthermore, it also helps us to keep
Broker types "local" to the Broker manager and gives us a nicer
interface.
Also replaces uses of `broker::expected` with `std::optional`. While an
`expected `can carry additional information as to why a value is not
present, nothing in Zeek ever cared about that. Hence, using
`std::optional` removes an unnecessary dependency on a Broker detail
while also being more efficient (no extra heap allocation when no value
is present).
This largely copies over Spicy's `.clang-format` configuration file. The
one place where we deviate is header include order since Zeek depends on
headers being included in a certain order.
The Zeek code base has very inconsistent #includes. Many sources
included a few headers, and those headers included other headers, and
in the end, nearly everything is included everywhere, so missing
#includes were never noticed. Another side effect was a lot of header
bloat which slows down the build.
First step to fix it: in each source file, its own header should be
included first to verify that each header's includes are correct, and
none is missing.
After adding the missing #includes, I replaced lots of #includes
inside headers with class forward declarations. In most headers,
object pointers are never referenced, so declaring the function
prototypes with forward-declared classes is just fine.
This patch speeds up the build by 19%, because each compilation unit
gets smaller. Here are the "time" numbers for a fresh build (with a
warm page cache but without ccache):
Before this patch:
3144.94user 161.63system 3:02.87elapsed 1808%CPU (0avgtext+0avgdata 2168608maxresident)k
760inputs+12008400outputs (1511major+57747204minor)pagefaults 0swaps
After this patch:
2565.17user 141.83system 2:25.46elapsed 1860%CPU (0avgtext+0avgdata 1489076maxresident)k
72576inputs+9130920outputs (1667major+49400430minor)pagefaults 0swaps
* origin/topic/johanna/remove-serializer:
Fix memory leak introduced by removing opaque of ocsp_resp.
Change return value of OpaqueVal::DoSerialize.
Add missing ShallowClone implementation for SetType
Remove opaque of ocsp_resp.
Remove remnants of event serializer.
Fix cardinalitycounter deserialization.
Smaller compile fixes for the new opaque serialization.
Reimplement serialization infrastructure for OpaqueVals.
Couple of compile fixes.
Remove const from ShallowClone.
Remove test-case for removed functionality
Implement a Shallow Clone operation for types.
Remove value serialization.
Various changes I made:
- Fix memory leak in type-checker for opaque vals wrapped in broker::data
- Noticed the two "copy-all" leak tests weren't actually checking for
memory leaks because the heap checker isn't active until after zeek_init()
is evaluated.
- Change OpaqueVal::DoClone to use the clone caching mechanism
- Improve copy elision for broker::expected return types in the various
OpaqueVal serialize methods
- Not all compilers end up properly treating the return of
local/automatic variable as an rvalue that can be moved, and ends up
copying it instead.
- Particularly, until GCC 8, this pattern ends up copying instead of
moving, and we still support platforms whose default compiler
pre-dates that version.
- Generally seems it's something that wasn't addressed until C++14.
See http://www.open-std.org/jtc1/sc22/wg21/docs/cwg_defects.html#1579
- Change OpaqueVal::SerializeType to return broker::expected
- Change probabilistic DoSerialize methods to return broker::expected
We need this to sender through Broker, and we also leverage it for
cloning opaques. The serialization methods now produce Broker data
instances directly, and no longer go through the binary formatter.
Summary of the new API for types derived from OpaqueVal:
- Add DECLARE_OPAQUE_VALUE(<class>) to the class declaration
- Add IMPLEMENT_OPAQUE_VALUE(<class>) to the class' implementation file
- Implement these two methods (which are declated by the 1st macro):
- broker::data DoSerialize() const
- bool DoUnserialize(const broker::data& data)
This machinery should work correctly from dynamic plugins as well.
OpaqueVal provides a default implementation of DoClone() as well that
goes through serialization. Derived classes can provide a more
efficient version if they want.
The declaration of the "OpaqueVal" class has moved into the header
file "OpaqueVal.h", along with the new serialization infrastructure.
This is breaking existing code that relies on the location, but
because the API is changing anyways that seems fine.
This adds an internal BiF
"Broker::__opaque_clone_through_serialization" that does what the name
says: deep-copying an opaque by serializing, then-deserializing. That
can be used to tests the new functionality from btests.
Not quite done yet. TODO:
- Not all tests pass yet:
[ 0%] language.named-set-ctors ... failed
[ 16%] language.copy-all-opaques ... failed
[ 33%] language.set-type-checking ... failed
[ 50%] language.table-init-container-ctors ... failed
[ 66%] coverage.sphinx-zeekygen-docs ... failed
[ 83%] scripts.base.frameworks.sumstats.basic-cluster ... failed
(Some of the serialization may still be buggy.)
- Clean up the code a bit more.
Note - this compiles, but you cannot run Bro anymore - it crashes
immediately with a 0-pointer access. The reason behind it is that the
required clone functionality does not work anymore.
This builds upon the previous commit to make Zeek compile on FIPS
systems.
This patch makes the changes a bit more aggressive. Instead of having a
number of different hash functions with different return values, we now
standardize on EVP_MD_CTX and just have one set of functions, to which
the hash algorithm that is desired is passed.
On the positive side, this enables us to support a wider range of hash
algorithm (and to easily add to them in the future).
I reimplemented the internal_md5 function - we don't support ebdic
systems in any case.
The md5/sha1 serialization functions are now also tested (I don't think
they were before).
CompositeHash.
We do this by hashing values added to a BloomFilter another time more
with a stable hash seeded only by either the filter's name or the
global_hash_seed (or Bro's random() seed if neither is defined).
I'm also adding a new bif bloomfilter_internal_state() that returns a
string representation of a Bloom filter's current internal state. This
is solely for writing tests that check that the filters end up
consistent when seeded with the same value.
I'm moving the new files into a subdirectory probabilistic, and into a
corresponding namespace. We can later put code for the other
probabilistic data structures there as well.
* origin/topic/matthias/bloom-filter: (45 commits)
Implement and test Bloom filter merging.
Make hash functions equality comparable.
Make counter vectors mergeable.
Use half adder for bitwise addition and subtraction.
Fix and test counting Bloom filter.
Implement missing CounterVector functions.
Tweak hasher interface.
Add missing include for GCC.
Fixing for unserializion error.
Small fixes and style tweaks.
Only serialize Bloom filter type if available.
Create hash policies through factory.
Remove lingering debug code.
Factor implementation and change interface.
Expose Bro's linear congruence PRNG as utility function.
H3 does not check for zero length input.
Support seeding for hashers.
Add utility function to access first random seed.
Update H3 documentation (and minor style nits.)
Make H3 seed configurable.
...