Adding a new random seed for external tests.
I added a wrapper around the siphash() function to make calling it a
little bit safer at least.
BIT-1612 #merged
* origin/topic/johanna/bit-1612:
HLL: Fix missing typecast in test case.
Remove the -K/-J options for setting keys.
Add test checking the quality of HLL by adding a lot of elements.
Fix serializing probabilistic hashers.
Baseline updates after hash function change.
Also switch BloomFilters from H3 to siphash.
Change Hashing from H3 to Siphash.
HLL: Remove unnecessary comparison.
Hyperloglog: change calculation of Rho
This commit mostly changes the hash function that is used for internal
hashing of data < 36 bytes from H3 to Siphash. The change is motivated
by the fact that H3 apparently does not produce well-distributed hash
values; running HLL with H3 as the hash function gives quite poor
results (up to 75% off in my tests).
In contrast, running HLL with Siphash (or HMAC-MD5) reduces this
error to ~2%.
This also fixes a long-standing bug in Hash.h which truncated our hash
values to 32 bits on most machines.
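The exact cause in Hash.h is not reproduced here. As a generic
illustration of this class of bug, the sketch below shows how a 64-bit
hash loses information when it passes through a 32-bit type. All type
and function names are hypothetical and are not taken from Hash.h.

```cpp
#include <cstdint>

// Hypothetical type names; the real Hash.h definitions may differ.
using narrow_hash_t = std::uint32_t;
using wide_hash_t = std::uint64_t;

// Narrowing silently drops the upper 32 bits of the hash.
inline narrow_hash_t truncate_hash(wide_hash_t h) {
    return static_cast<narrow_hash_t>(h);
}

// Two distinct hashes that differ only in their upper half become
// indistinguishable after narrowing.
inline bool collide_after_truncation(wide_hash_t a, wide_hash_t b) {
    return a != b && truncate_hash(a) == truncate_hash(b);
}
```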
Furthermore, it once again fixes a problem with the Rank function in
HLL.
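For reference, a minimal sketch of how a rank (rho) function is commonly
defined for HyperLogLog: the 1-based position of the first set bit in
the part of the hash that is not used as the bucket index. This is an
illustration only, not the code changed in this commit; hll_rho and
index_bits are hypothetical names, and it assumes the top index_bits
bits of a 64-bit hash select the bucket.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not the actual HLL implementation.
// Scans the low (64 - index_bits) bits of the hash from the most
// significant of those bits downwards and returns the 1-based position
// of the first set bit, or (64 - index_bits + 1) if none is set.
inline unsigned hll_rho(std::uint64_t hash, unsigned index_bits) {
    assert(index_bits > 0 && index_bits < 64);
    const unsigned width = 64 - index_bits;
    unsigned rank = 1;
    for (unsigned i = width; i > 0; --i, ++rank) {
        if (hash & (std::uint64_t{1} << (i - 1)))
            return rank;
    }
    return width + 1;
}
```

A poorly distributed hash skews these ranks, which is one way a weak
hash function can degrade HLL estimates.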
The hash function was internally casting the void* data argument into an
unsigned char* and then using values from that to index another internal
array that's dimensioned based on the assumption of 256 values possible
for an unsigned char (8-bit chars/bytes). This is probably a correct
assumption most of the time, but it should be safer to use the limits
defined in the standard headers to get it right for the particular
system/compiler.
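A minimal sketch of the idea, assuming an H3-style table lookup hash:
the per-byte table is sized from UCHAR_MAX in <climits> instead of a
hard-coded 256. ByteTableHash, MaxLen and kByteRange are hypothetical
names for illustration, and the table is filled from std::mt19937_64
rather than whatever seeding the real H3 code uses.

```cpp
#include <cassert>
#include <climits>  // UCHAR_MAX
#include <cstddef>
#include <cstdint>
#include <random>

// Hypothetical H3-style hasher; not the actual H3 class.
template <std::size_t MaxLen>
class ByteTableHash {
public:
    // One column per value an unsigned char can take on this platform.
    static constexpr std::size_t kByteRange =
        static_cast<std::size_t>(UCHAR_MAX) + 1;

    explicit ByteTableHash(std::uint64_t seed) {
        std::mt19937_64 rng(seed);
        for (auto& row : table_)
            for (auto& cell : row)
                cell = rng();
    }

    // Callers pass sizes as reported by sizeof; the byte handling
    // stays internal to the hasher.
    std::uint64_t operator()(const void* data, std::size_t len) const {
        assert(len <= MaxLen);
        const unsigned char* p = static_cast<const unsigned char*>(data);
        std::uint64_t h = 0;
        for (std::size_t i = 0; i < len; ++i)
            h ^= table_[i][p[i]];  // p[i] <= UCHAR_MAX, so always in bounds
        return h;
    }

private:
    std::uint64_t table_[MaxLen][kByteRange];
};
```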
There was an unused variable cast to uint8* in HashKey::HashBytes that
seemed like it might have been meant to be passed to H3's hash function
as an unfinished attempt to solve the 8-bit byte assumption problem, but
that doesn't seem as good as taking care of it internally in H3, so that
users of the API are only concerned with byte sizes as reported by
`sizeof`. Removing the unused variable addresses #530.
Also a minor tweak to an hmac_md5 call that was casting away const from
one argument (which doesn't match the prototype).