mirror of
https://github.com/zeek/zeek.git
synced 2025-10-15 13:08:20 +00:00

Currently, siphash is used for strings up to 36 bytes, and hmac-md5 is used for longer strings. This switch-over is a remnant of the previous hash function, which apparently was slower with longer input strings.

The switch-over no longer serves a purpose. I performed a few performance tests on strings of varying sizes:

For a 40 byte string with 10 million iterations:
siphash: 0.31 seconds
hmac-md5: 3.8 seconds

For a 1080 byte string with 10 million iterations:
siphash: 4.2 seconds
hmac-md5: 17 seconds

For an 18360 byte string with 10 million iterations:
siphash: 69 seconds
hmac-md5: 240 seconds

Hence, this commit removes the use of hmac-md5.

This change causes reordering of lines in a few logs.

This commit also changes the data structure for the seed in probabilistic/Hasher to get rid of a type-punning warning.
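The timings quoted in the commit message can be turned into relative speedups. The following is a minimal sketch in Python that uses only the numbers reported there. The names `timings` and `speedups` are hypothetical and chosen for illustration; the script does not rerun the benchmark and is not part of the Zeek source.

```python
# Timings in seconds for 10 million iterations, copied from the commit message.
# Keys are input sizes in bytes.
timings = {
    40: {"siphash": 0.31, "hmac-md5": 3.8},
    1080: {"siphash": 4.2, "hmac-md5": 17.0},
    18360: {"siphash": 69.0, "hmac-md5": 240.0},
}


def speedups(table):
    """Return {input size: hmac-md5 time divided by siphash time}."""
    return {size: t["hmac-md5"] / t["siphash"] for size, t in table.items()}


if __name__ == "__main__":
    for size, ratio in speedups(timings).items():
        print(f"{size:>6} bytes: siphash is about {ratio:.1f}x faster")
```

By these figures siphash is roughly 12 times faster at 40 bytes and between 3.5 and 4 times faster at the two larger sizes, so it wins at every size above the old 36 byte threshold.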
19 lines
358 B
Text
/^?(one|foo|bar)$?/
/^?(two|oob)$?/
/^?(three|oob)$?/
/^?(four)$?/
-----------------
/^?(two|oob)$?/
/^?(four)$?/
/^?(one|foo|bar)$?/
/^?(three|oob)$?/
-----------------
/^?(two|oob)$?/, 1
/^?(four)$?/, 3
/^?(one|foo|bar)$?/, 0
/^?(three|oob)$?/, 2
-----------------
/^?(one|foo|bar)$?/, 2, 0
/^?(four)$?/, 5, 6
/^?(two|oob)$?/, 3, 2
/^?(three|oob)$?/, 4, 4