Commit graph

52 commits

Author SHA1 Message Date
Tim Wojtulewicz
a8fc63e182 Merge remote-tracking branch 'microsoft/master'
* microsoft/master: (71 commits)
  Clang formatting
  Mask ports before inserting them into the map
  Fix compiler warning from applied patch
  Remove statistics plugin in favor of stats bif
  Add EventHandler version of stats plugin
  Mark a few EventHandler methods const
  Changed implementation from std::map to std::unordered_map of Val.cc
  Removed const, Windows build is now working
  Added fixes suggested in PR
  Update src/packet_analysis/protocol/ip/IP.cc
  Apply suggestions from code review
  Clang format again but now with v13.0.1
  Rewrote usages of define(_MSC_VER) to ifdef _MSC_VER
  Clang format it all
  Fixed initial CR comments
  Add NEWS entry about Windows port
  Add a couple of extra unistd.h includes to fix a build failure
  Use std::chrono instead of gettimeofday
  Update libkqueue submodule [nomail]
  Don't call tokenize_string if the input string is empty
  ...
2022-11-11 15:23:21 -07:00
Josh Soref
cd201aa24e Spelling src
These are non-functional changes.

* accounting
* activation
* actual
* added
* addresult
* aggregable
* aligned
* alternatively
* ambiguous
* analysis
* analyzer
* anticlimactic
* apparently
* application
* appropriate
* arithmetic
* assignment
* assigns
* associated
* authentication
* authoritative
* barrier
* boundary
* broccoli
* buffering
* caching
* called
* canonicalized
* capturing
* certificates
* ciphersuite
* columns
* communication
* comparison
* comparisons
* compilation
* component
* concatenating
* concatenation
* connection
* convenience
* correctly
* corresponding
* could
* counting
* data
* declared
* decryption
* defining
* dependent
* deprecated
* detached
* dictionary
* directional
* directly
* directory
* discarding
* disconnecting
* distinguishes
* documentation
* elsewhere
* emitted
* empty
* endianness
* endpoint
* enumerator
* essentially
* evaluated
* everything
* exactly
* execute
* explicit
* expressions
* facilitates
* fiddling
* filesystem
* flag
* flagged
* for
* fragments
* guarantee
* guaranteed
* happen
* happening
* hemisphere
* identifier
* identifies
* identify
* implementation
* implemented
* implementing
* including
* inconsistency
* indeterminate
* indices
* individual
* information
* initial
* initialization
* initialize
* initialized
* initializes
* instantiate
* instantiated
* instantiates
* interface
* internal
* interpreted
* interpreter
* into
* it
* iterators
* length
* likely
* log
* longer
* mainly
* mark
* maximum
* message
* minimum
* module
* must
* name
* namespace
* necessary
* nonexistent
* not
* notifications
* notifier
* number
* objects
* occurred
* operations
* original
* otherwise
* output
* overridden
* override
* overriding
* overwriting
* ownership
* parameters
* particular
* payload
* persistent
* potential
* precision
* preexisting
* preservation
* preserved
* primarily
* probably
* procedure
* proceed
* process
* processed
* processes
* processing
* propagate
* propagated
* prototype
* provides
* publishing
* purposes
* queue
* reached
* reason
* reassem
* reassemble
* reassembler
* recommend
* record
* reduction
* reference
* regularly
* representation
* request
* reserved
* retrieve
* returning
* separate
* should
* shouldn't
* significant
* signing
* simplified
* simultaneously
* single
* somebody
* sources
* specific
* specification
* specified
* specifies
* specify
* statement
* subdirectories
* succeeded
* successful
* successfully
* supplied
* synchronization
* tag
* temporarily
* terminating
* that
* the
* transmitted
* true
* truncated
* try
* understand
* unescaped
* unforwarding
* unknown
* unknowndata
* unspecified
* update
* usually
* which
* wildcard

Signed-off-by: Josh Soref <2119212+jsoref@users.noreply.github.com>
2022-11-09 12:08:15 -05:00
Tomer Lev
73e749a162 Clang format again but now with v13.0.1 2022-11-09 18:56:00 +02:00
Tomer Lev
5cdc6e150e Clang format it all 2022-11-09 18:55:51 +02:00
Tim Wojtulewicz
77c555a3a8 Fixing some issues from rebasing 2022-11-09 18:16:13 +02:00
Elad Solomon
3a80b79497 Compile Zeek with MSVC
Allow Zeek to be embedded in another project
2022-11-09 18:15:30 +02:00
Tim Wojtulewicz
3b69dd38f3 Add equality, inequality, copy, and move operators to HashKey 2022-10-10 10:08:58 -07:00
Tim Wojtulewicz
4d4c6280e9 Miscellaneous deprecations and renaming 2022-07-12 12:01:23 -07:00
Tim Wojtulewicz
f624c18383 Deprecate bro_int_t and bro_uint_t 2022-07-12 12:01:23 -07:00
Tim Wojtulewicz
70e63d4749 Remove deprecated MemoryAllocation() methods and related code 2022-06-30 18:56:52 +00:00
Tim Wojtulewicz
7c4fd382d9 Code modernization: Convert from deprecated C standard library headers 2022-06-27 09:47:31 -07:00
Tim Wojtulewicz
64748edab1 Replace most uses of typedef with using for type aliasing 2021-10-11 14:51:10 -07:00
Christian Kreibich
10e8d36340 Remove unused HashKey constructor and reorder for consistency
One of the HashKey constructors was only used in the old CompHash code.
This aso reorders some constructors and the destructor for readability.
2021-09-20 17:51:43 -07:00
Christian Kreibich
b6a11a69db Add debug string and ODesc support to HashKey class
This allows tracing of hash key buffer reservations, reads, and writes via a new
debug stream, and supports printing a summary of a HashKey object via
Describe(). The latter comes in handy e.g. in TableVal::Describe() (where
including the hash key is now available but commented out).
2021-09-20 17:51:43 -07:00
Christian Kreibich
82822b1e07 Refactor HashKey class to support read/write operations
This preserves the optimization of storing values directly in the key_u member
union when feasible, and using a variable size buffer otherwise. It also adds
bounds-checking for that buffer, moves size arguments to size_t, decouples
construction from hash computation, emulates the tagging feature found in
SerializationFormat to assist troubleshooting, and switches feasible
reinterpret_casts to static_casts.
2021-09-20 17:51:43 -07:00
Tim Wojtulewicz
b2f171ec69 Reformat the world 2021-09-16 15:35:39 -07:00
Tim Wojtulewicz
a7fd34375f GH-572: Mark MemoryAllocation() and related methods deprecated 2021-06-28 11:07:58 -07:00
Tim Wojtulewicz
4ad08172d0 Remove obsolete ZEEK_FORWARD_DECLARE_NAMESPACED macros 2021-02-24 14:35:44 -07:00
Tim Wojtulewicz
0618be792f Remove all of the random single-file deprecations
These are the changes that don't require a ton of changes to other files outside
of the original removal.
2021-01-27 10:52:40 -07:00
Tim Wojtulewicz
96d9115360 GH-1079: Use full paths starting with zeek/ when including files 2020-11-12 12:15:26 -07:00
Tim Wojtulewicz
fe0c22c789 Base: Clean up explicit uses of namespaces in places where they're not necessary.
This commit covers all of the common and base classes.
2020-08-24 12:07:00 -07:00
Tim Wojtulewicz
ddf48d7529 Move a few of the zeek::util methods and variables to zeek::util::detail 2020-08-20 16:11:44 -07:00
Tim Wojtulewicz
8d2d867a65 Move everything in util.h to zeek::util namespace.
This commit includes renaming a number of methods prefixed with bro_ to be prefixed with zeek_.
2020-08-20 16:00:33 -07:00
Tim Wojtulewicz
c9ab1f93e7 Move a few low-use classes to namespaces 2020-07-31 16:25:47 -04:00
Tim Wojtulewicz
a2a435360a Move all of the hashing classes/functions to zeek::detail namespace 2020-07-31 16:23:34 -04:00
Tim Wojtulewicz
736a3f53d4 Rename BroString to zeek::String 2020-07-02 16:15:01 -07:00
Tim Wojtulewicz
58c6e10b62 Move BroString to zeek namespace 2020-06-30 21:12:26 -07:00
Tim Wojtulewicz
937a462e70 Move Frame and Scope to zeek::detail namespace 2020-06-30 20:51:53 -07:00
Tim Wojtulewicz
64332ca22c Move all Val classes to the zeek namespaces 2020-06-30 20:48:09 -07:00
Jon Siwek
8561c79363 Remove inline from some static KeyedHash members
Coverity Scan builds currently encounter catastrophic error, claiming
alignas requires use on both declaration and definition, so appears to
actually not understand "static inline" in combo with alignas.
2020-06-05 18:20:05 -07:00
Jon Siwek
9c133b9b10 Integrate review feedback
* Add deprecation for MIME_Entity::ContentType(), use GetContentType()

* Add deprecation for MIME_Entity::ContentSubType(), use GetContentSubType()

* Add deprecation for MIME_Message::BuildHeaderVal(), use ToHeaderVal()

* Add deprecation for MIME_Message::BuildHeaderTable(), use ToHeaderTable()

* Add deprecation for mime::new_string_val(), use mime::to_stringval()

* Add deprecation for ARP_Analyzer::ConstructAddrVal(), use ToAddrVal()

* Add deprecation for ARP_Analyzer::EthAddrToStr(), use ToEthAddrStr()

* Change the Func::Call() replacement to be named Func::Invoke()
2020-05-29 19:14:35 -07:00
Jon Siwek
ca1e5fe4be Replace deprecated usage of BifFunc:: with zeek::BifFunc::
Names of functions also changed slightly, like bro_fmt -> fmt_bif.

Should generally be unusual/unexpected to see somone calling these
directly from C++ in their plugin, but since technically possible in
previous versions, I also removed the "private" restriction on accessing
the BifReturnVal member.
2020-05-14 17:26:30 -07:00
Johanna Amann
ce8b121e12 Hash unification: address PR feedback 2020-05-13 14:07:59 +00:00
Johanna Amann
04ed125941 Merge remote-tracking branch 'origin/master' into topic/johanna/hash-unification 2020-05-06 23:18:33 +00:00
Johanna Amann
3bce313b12 Switch file UID hashing from md5 to highwayhash.
This commit switches UID hashing from md5 to a highway hash. It also
moves the salt value out of the file plugin - and makes it
installation-specific instead - it is moved to the global namespace.

There now are digest hash functions to make "static"
installation-specific hashes that are stable over workers available to
everyone; hashes can be 64, 128 or 256 bits in size.

Due to the fact that we switch the file hashing algorithm, all file
hashes change.

The underlyigng algorithm that is used for hashing is highwayhash-128,
which is significantly faster than md5.
2020-04-30 10:20:09 -07:00
Johanna Amann
360c06a3f8 Start refactoring hashing.
This commit moves some of the hash datastructures and code from
util.cc into Hash.cc - where it seems more appropriate.

It also starts to make more Keyed hash functions available - still
using siphash as the default 64 bit keyed hash, but also making
128 and 256 bit highway hashes available.

There already are a few other functions that are defined but not
yet implemented - these will be "static" keyed hashes - which use
an installation specific key. These will be used to, e.g., get
rid of md5 hashing for the generation of file UIDs.
2020-04-24 18:27:09 -07:00
Johanna Amann
5e7915ae7a Remove the siphash->hmac-md5 switch after 36 bytes.
Currently, siphash is used for strings up to 36 bytes. hmac-md5 is used
for longer strings.

This switch-over is a remnant of the previous hash-function that was
used, which apparently was slower with longer input strings.

This change serves no purpose anymore. I performed a few performance tests
on strings of varying sizes:

For a 40 byte string with 10 million iterations:

siphash: 0.31 seconds
hmac-md5: 3.8 seconds

For a 1080 byte string with 10 million iterations:

siphash: 4.2 seconds
hmac-md5: 17 seconds

For a 18360 byte string with 10 million iterations:

siphash: 69 seconds
hmac-md5: 240 seconds

Hence, this commit removes the use of hmac-md5.

This change causes reordering of lines in a few logs.

This commit also changes the datastructure for the seed in probabilistic/Hasher
to get rid of a type-punning warning.
2020-04-24 13:14:29 -07:00
Tim Wojtulewicz
fd5e15b116 The Great Embooleanating
A large number of functions had return values and/or arguments changed
to use ``bool`` types instead of ``int``.
2020-03-31 06:41:54 +00:00
Max Kellermann
0db61f3094 include cleanup
The Zeek code base has very inconsistent #includes.  Many sources
included a few headers, and those headers included other headers, and
in the end, nearly everything is included everywhere, so missing
#includes were never noticed.  Another side effect was a lot of header
bloat which slows down the build.

First step to fix it: in each source file, its own header should be
included first to verify that each header's includes are correct, and
none is missing.

After adding the missing #includes, I replaced lots of #includes
inside headers with class forward declarations.  In most headers,
object pointers are never referenced, so declaring the function
prototypes with forward-declared classes is just fine.

This patch speeds up the build by 19%, because each compilation unit
gets smaller.  Here are the "time" numbers for a fresh build (with a
warm page cache but without ccache):

Before this patch:

 3144.94user 161.63system 3:02.87elapsed 1808%CPU (0avgtext+0avgdata 2168608maxresident)k
 760inputs+12008400outputs (1511major+57747204minor)pagefaults 0swaps

After this patch:

 2565.17user 141.83system 2:25.46elapsed 1860%CPU (0avgtext+0avgdata 1489076maxresident)k
 72576inputs+9130920outputs (1667major+49400430minor)pagefaults 0swaps
2020-02-04 20:51:02 +01:00
Dominik Charousset
c1f3fe7829 Switch from header guards to pragma once 2019-09-17 14:10:30 +02:00
Tim Wojtulewicz
54752ef9a1 Deprecate the internal int/uint types in favor of the cstdint types they were based on 2019-08-12 13:50:07 -07:00
Johanna Amann
6d612ced3d Mark one-parameter constructors as explicit & use override where possible
This commit marks (hopefully) ever one-parameter constructor as explicit.

It also uses override in (hopefully) all circumstances where a virtual
method is overridden.

There are a very few other minor changes - most of them were necessary
to get everything to compile (like one additional constructor). In one
case I changed an implicit operation to an explicit string conversion -
I think the automatically chosen conversion was much more convoluted.

This took longer than I want to admit but not as long as I feared :)
2018-03-27 07:17:32 -07:00
Johanna Amann
e1218cc7fa Change Hashing from H3 to Siphash.
This commit mostly changes the hash function that is used for Internal
hashing of data < 36 bytes from H3 to Siphash. This change is motivated
by the fact that it turns out that H3 apparently does not deliver a very
good source of data uniqueness; running HLL with H3 as a hashing
function results in quite poor results (up to of 75% off in my tests).
In difference, running HLL with Siphash (or HMAC-MD5) changes this
factor to ~2%.

This also fixes a long-standing bug in Hash.h which truncated our hash
values to 32 bit on most machines.

Furthermore, it once again fixes a problem with the Rank function in
HLL.
2016-07-13 06:44:51 -07:00
Jon Siwek
d7dafe2fe2 Refactoring various usages of new IPAddr class.
Reducing number of places that internal representation was exposed
via GetBytes/CopyIPv6.

Also fixed a bug in remask_addr bif.
2012-02-22 14:45:44 -06:00
Jon Siwek
0f207c243c Port DNS_Mgr to use new IPAddr class, enable lookups on IPv6 addrs.
Host lookups still need to be changed to also do AAAA queries.
2012-02-13 15:57:59 -06:00
Jon Siwek
f945f3c518 Change HashKey threshold for using H3 to 36 bytes.
This is enough to accommodate using H3 instead of HMAC/MD5 for IPv6
Conn::Key's and performs better since a hash happens for every packet.
2012-02-09 12:55:55 -06:00
Jon Siwek
495e987938 Remove $Id$ tags 2011-08-04 15:21:18 -05:00
Robin Sommer
59d6202104 Merge remote branch 'origin/topic/robin/conn-ids'
* origin/topic/robin/conn-ids:
  Moving uid from conn_id to connection, and making output determistic if a hash seed is given.
  Extending conn_id with a globally unique identifiers.
2011-04-22 22:13:44 -07:00
Robin Sommer
03c0d587a4 Removing code for unused hash functions. 2011-04-01 16:09:28 -07:00
Robin Sommer
881071cc99 Extending conn_id with a globally unique identifiers. 2011-03-15 20:42:56 -07:00