Helper to get information from the ContentLine analyzer about
bytes still pending to be delivered. In certain cases this can
be a signal for weirdness.
This largely copies over Spicy's `.clang-format` configuration file. The
one place where we deviate is header include order since Zeek depends on
headers being included in a certain order.
This commit marks (hopefully) ever one-parameter constructor as explicit.
It also uses override in (hopefully) all circumstances where a virtual
method is overridden.
There are a very few other minor changes - most of them were necessary
to get everything to compile (like one additional constructor). In one
case I changed an implicit operation to an explicit string conversion -
I think the automatically chosen conversion was much more convoluted.
This took longer than I want to admit but not as long as I feared :)
In ContentLine_Analyzer, prevent excessively long lines being assembled.
The line length will default to just under 16MB, but can be overriden on
a per-analyzer basis. This is done for the finger,ident, and irc
analyzers.
Singular CR or LF characters in multipart body content are no longer
converted to a full CRLF (thus corrupting the file) and it also no
longer considers the CRLF before the multipart boundary as part of the
content.
Addresses BIT-1235.
The main change is that reassembly code (e.g. for TCP) now uses
int64/uint64 (signedness is situational) data types in place of int
types in order to support delivering data to analyzers that pass 2GB
thresholds. There's also changes in logic that accompany the change in
data types, e.g. to fix TCP sequence space arithmetic inconsistencies.
Another significant change is in the Analyzer API: the *Packet and
*Undelivered methods now use a uint64 in place of an int for the
relative sequence space offset parameter.