Mirror of https://github.com/zeek/zeek.git (synced 2025-10-04 07:38:19 +00:00)
Change Hashing from H3 to Siphash.
This commit mostly changes the hash function used for internal hashing of data < 36 bytes from H3 to Siphash. The change is motivated by the fact that H3 apparently does not deliver a very good source of data uniqueness: running HLL with H3 as the hash function gives quite poor results (up to 75% off in my tests). In contrast, running HLL with Siphash (or HMAC-MD5) brings the error down to ~2%. This also fixes a long-standing bug in Hash.h that truncated our hash values to 32 bits on most machines. Furthermore, it once again fixes a problem with the Rank function in HLL.
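To illustrate why both the hash width and the Rank function matter for HLL, here is a minimal sketch of a 64-bit find-last-set helper and a rank computation built on it. The names `fls64` and `rank_of`, and the idea of passing the bit width explicitly, are assumptions made for this example; this is not the code from the commit.

```cpp
#include <cstdint>

// Hypothetical portable find-last-set for 64-bit values. Returns the 1-based
// index of the most significant set bit, or 0 if no bit is set (the same
// contract as the BSD flsll()).
inline int fls64(std::uint64_t mask)
{
    int pos = 0;
    while ( mask != 0 )
    {
        ++pos;
        mask >>= 1;
    }
    return pos;
}

// Hypothetical HLL rank: position of the leftmost 1-bit within the `width`
// low-order bits of `remaining`, counted from 1. If all bits are zero the
// result is width + 1. Assumes 1 <= width <= 64 and remaining < 2^width.
inline std::uint8_t rank_of(std::uint64_t remaining, int width)
{
    return static_cast<std::uint8_t>(width - fls64(remaining) + 1);
}
```

With a 64-bit hash and, say, 6 bits used for the bucket index, `rank_of(w, 58)` ranges from 1 to 59. If the hash is silently truncated to 32 bits first, `fls64` can never report a position above 32, so the upper part of the hash never contributes to the rank.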
parent c15f48661d
commit e1218cc7fa
10 changed files with 257 additions and 25 deletions
@@ -165,6 +165,8 @@ private:
 	 */
 	uint8_t Rank(uint64_t hash_modified) const;
 
+	static int flsll(uint64_t mask);
+
 	/**
 	 * This is the number of buckets that will be stored. The standard
 	 * error is 1.04/sqrt(m), so the actual cardinality will be the
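The comment in the hunk above quotes the HLL standard error of 1.04/sqrt(m) for m buckets. As a worked example, the following sketch evaluates that formula and inverts it to find a bucket count for a target error. The function names are hypothetical, and rounding the bucket count up to a power of two is an assumption of this example, not something the diff states.

```cpp
#include <cmath>
#include <cstdint>

// Standard error of an HLL estimate with m buckets: 1.04 / sqrt(m).
inline double hll_standard_error(std::uint64_t m)
{
    return 1.04 / std::sqrt(static_cast<double>(m));
}

// Smallest power-of-two bucket count whose standard error does not exceed
// target_error. Assumes 0 < target_error < 1.04. The power-of-two rounding
// is an assumption made for this sketch.
inline std::uint64_t hll_buckets_for_error(double target_error)
{
    const double ratio = 1.04 / target_error;
    const double needed = ratio * ratio;
    std::uint64_t buckets = 1;
    while ( static_cast<double>(buckets) < needed )
        buckets <<= 1;
    return buckets;
}
```

For instance, `hll_standard_error(1024)` is 1.04/32, about 3.25%, and a 2% target needs at least 52² = 2704 buckets, which this sketch rounds up to 4096.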