This is meant to be used for a new 'X' code in the history in scenarios when
packets are knowingly not processed or an unexpected unknown situation
is recognized.
Usually, these situations are currently reported via weirds or analyzer violations,
but being able to include it in the history field allows them to be more visible.
Will be used for exceeding tunnel depths first.
This largely copies over Spicy's `.clang-format` configuration file. The
one place where we deviate is header include order since Zeek depends on
headers being included in a certain order.
Borrows the `in_cksum` code from tcpdump, which borrowed from FreeBSD.
It handles unaligned data better and also unrolls the inner loop to
process 16 two-byte values at a time versus 2 one-byte values at a time
in the previous version. Generally measured as ~1.5x faster in a
release build. The new API should generally be more amenable to any
future optimization explorations since all relevant data blocks are
available within a single call rather than spread across multiple.
The Zeek code base has very inconsistent #includes. Many sources
included a few headers, and those headers included other headers, and
in the end, nearly everything is included everywhere, so missing
#includes were never noticed. Another side effect was a lot of header
bloat which slows down the build.
First step to fix it: in each source file, its own header should be
included first to verify that each header's includes are correct, and
none is missing.
After adding the missing #includes, I replaced lots of #includes
inside headers with class forward declarations. In most headers,
object pointers are never referenced, so declaring the function
prototypes with forward-declared classes is just fine.
This patch speeds up the build by 19%, because each compilation unit
gets smaller. Here are the "time" numbers for a fresh build (with a
warm page cache but without ccache):
Before this patch:
3144.94user 161.63system 3:02.87elapsed 1808%CPU (0avgtext+0avgdata 2168608maxresident)k
760inputs+12008400outputs (1511major+57747204minor)pagefaults 0swaps
After this patch:
2565.17user 141.83system 2:25.46elapsed 1860%CPU (0avgtext+0avgdata 1489076maxresident)k
72576inputs+9130920outputs (1667major+49400430minor)pagefaults 0swaps
* origin/topic/jsiwek/reassembly-improvements-map:
Rename a reassembly DataBlockList function
Add comments to reassembly classes
Use DataBlock value instead of pointer in reassembly map
Remove linked list from reassembly data structures
Use an std::map for reassembly DataBlock searches
Refactor Reassembler/DataBlock bookkeeping
Reorganize reassembly data structures
Remove a superfluous reassembler DataBlock member
* origin/topic/vern/perf-history:
only generate history threshold events for > 1 instance mention those events in NEWS
a different sort of history update
'W' for zero window implemented; logarithmic 'T'/'C'/'W' history repetitions
I reverted a change that made TCP window tracking unconditional (possibly
accepting out-of-order packets) until further verification of test suite
changes.
The main change is that reassembly code (e.g. for TCP) now uses
int64/uint64 (signedness is situational) data types in place of int
types in order to support delivering data to analyzers that pass 2GB
thresholds. There's also changes in logic that accompany the change in
data types, e.g. to fix TCP sequence space arithmetic inconsistencies.
Another significant change is in the Analyzer API: the *Packet and
*Undelivered methods now use a uint64 in place of an int for the
relative sequence space offset parameter.