This largely copies over Spicy's `.clang-format` configuration file. The
one place where we deviate is header include order since Zeek depends on
headers being included in a certain order.
This also includes:
- Deprecating the NetSessions name.
- Renaming the zeek::sessions global to zeek::session_mgr and deprecating the old name.
- Renaming Sessions.{h,cc} to SessionManager.{h,cc}.
* origin/topic/timw/nullptr:
The remaining nulls
plugin/probabilistic/zeekygen: Replace nulls with nullptr
file_analysis: Replace nulls with nullptr
analyzer: Replace nulls with nullptr
iosource/threading/input/logging: Replace nulls with nullptr
During merge, changed a bit of how Frame OffsetMap
assignments/contruction were handled to keep parity with old version.
* origin/topic/timw/structure-packing:
Lazy-initalize some of the fields in Frame to reduce the size of all Frames when they're not used
Set InternalHashTag to a uint16_t so CompositeHash doesn't have a gap in it.
Mark constants in List constexpr so they don't actually take up space in created objects
Reorder some class variables to fill in gaps in structure packing
The big hitters:
Dict: Fills in four 4-byte holes in the structure. This shrinks Dictionary from 136 bytes to 114 bytes.
Desc: Fills in a 6-byte hole in the structure. This shrinks ODesc from 152 bytes to 144 bytes.
Frame: Moves and combines 4 bool variables from a few places into one single 4-byte block. This resolves all of the holes at once. This shrinks Frame from 216 bytes to 192 bytes and removes one cache line.
Func: Moves one int32_t variable to fill in a 4-byte hole. This shrinks Func from 112 bytes to 104 bytes.
ID: Moves two bool variables to fill in a 3-byte hole. This leaves behind a 1-byte hole, but removes a 6-byte pad from the end of the structure. This shrinks ID from 144 bytes to 136 bytes.
Other changes:
RuleHdrTest: Fills in one 4-byte hole in the structure. This shrinks RuleHdrTest from 248 bytes to 240 bytes.
RuleEndpointState: Moves one bool variable down in the structure to reduce a 7-byte hole. This unfortunately causes a 3-byte hole later in the structure but there’s no easy way to filll it in. This does shrink RuleEndpointState from 128 bytes to 120 bytes though.
ScannedFile: Moves two bool values to reduce a 4-byte hole by 2 bytes. This shrinks ScannedFile from 64 bytes to 56 bytes.
Brofiler: Moves one char value to reduce a 4-byte hole by 1 byte. This shrinks Brofiler from 96 bytes to 88 bytes and removes one cache line.
DbgBreakpoint: Moves some values around to fill in a 4-byte hole and reduce a second. A 2-byte hole still exists, but the structure shrinks from 632 bytes to 624 bytes. It’s possible on this one that one of the int32_t values could be an int16_t and remove the last 2-byte gap.
ParseLocationRec: Moves one int to fill in a 4-byte hole. This shrinks ParseLocationRec from 32 bytes to 24 bytes.
DebugCmdInfo: Moves one bool variable to shift a few others up. This results in a 6-byte pad at the end of the structure but removes a 7-byte hole in the middle. This shrinks DebugCmdInfo from 56 bytes to 48 bytes.
FragReassembler: Moves one variable down to fill in a 4-byte hole. This shrinks FragReassembler from 272 bytes to 264 bytes.
nb_dns_result: Moves ones uint32_t variable to fill in a 4-byte hole, also removing a 4-byte pad from the end of the structure. This shrinks nb_dns_result from 32 bytes to 24 bytes.
nb_dns_entry: Moves one short value to fill in a 2-byte hole, also removing a 6-byte hole. This shrinks nb_dns_entry from 1064 bytes to 1056 bytes.
The Zeek code base has very inconsistent #includes. Many sources
included a few headers, and those headers included other headers, and
in the end, nearly everything is included everywhere, so missing
#includes were never noticed. Another side effect was a lot of header
bloat which slows down the build.
First step to fix it: in each source file, its own header should be
included first to verify that each header's includes are correct, and
none is missing.
After adding the missing #includes, I replaced lots of #includes
inside headers with class forward declarations. In most headers,
object pointers are never referenced, so declaring the function
prototypes with forward-declared classes is just fine.
This patch speeds up the build by 19%, because each compilation unit
gets smaller. Here are the "time" numbers for a fresh build (with a
warm page cache but without ccache):
Before this patch:
3144.94user 161.63system 3:02.87elapsed 1808%CPU (0avgtext+0avgdata 2168608maxresident)k
760inputs+12008400outputs (1511major+57747204minor)pagefaults 0swaps
After this patch:
2565.17user 141.83system 2:25.46elapsed 1860%CPU (0avgtext+0avgdata 1489076maxresident)k
72576inputs+9130920outputs (1667major+49400430minor)pagefaults 0swaps
* origin/topic/jsiwek/reassembly-improvements-map:
Rename a reassembly DataBlockList function
Add comments to reassembly classes
Use DataBlock value instead of pointer in reassembly map
Remove linked list from reassembly data structures
Use an std::map for reassembly DataBlock searches
Refactor Reassembler/DataBlock bookkeeping
Reorganize reassembly data structures
Remove a superfluous reassembler DataBlock member
Started by factoring some details into a new DataBlockList class to at
least make it more clear where modifications occur. More abstractions
likely to happen later as I experiment with alternate data structures
aimed at improving worse-case scenarios.
This commit marks (hopefully) ever one-parameter constructor as explicit.
It also uses override in (hopefully) all circumstances where a virtual
method is overridden.
There are a very few other minor changes - most of them were necessary
to get everything to compile (like one additional constructor). In one
case I changed an implicit operation to an explicit string conversion -
I think the automatically chosen conversion was much more convoluted.
This took longer than I want to admit but not as long as I feared :)
The main change is that reassembly code (e.g. for TCP) now uses
int64/uint64 (signedness is situational) data types in place of int
types in order to support delivering data to analyzers that pass 2GB
thresholds. There's also changes in logic that accompany the change in
data types, e.g. to fix TCP sequence space arithmetic inconsistencies.
Another significant change is in the Analyzer API: the *Packet and
*Undelivered methods now use a uint64 in place of an int for the
relative sequence space offset parameter.
Replaced some with InternalWarning or InternalAnalyzerError, the later
being a new method which signals the analyzer to not process further
input. Some usages I just removed if they didn't make sense or clearly
couldn't happen. Also did some minor refactors of related code while
reviewing/exploring ways to get rid of InternalError usages.
Also, for TCP content file write failures there's a new event:
"contents_file_write_failure".
- The script-layer 'pkt_hdr' type is extended with a new 'ip6' field
representing the full IPv6 header chain.
- The 'new_packet' event is now raised for IPv6 packets (addresses #523)
- A new event called 'ipv6_ext_header' is raised for any IPv6 packet
containing extension headers.
- A new event called 'esp_packet' is raised for any packets using ESP
('new_packet' and 'ipv6_ext_header' events provide connection info,
but that info can't be provided here since the upper-layer payload
is encrypted).
- The 'unknown_protocol' weird is now raised more reliably when Bro
sees a transport protocol or IPv6 extension header it can't handle.
(addresses #522)
Still need to do IPv6 fragment reassembly and needs more testing.