Elsewhere (zeek/zeek#2482), it was observed that when binpac encounters
include failures, it still exits with 0 indicating success. Subsequent
compilation of the produced .h and .cc files likely fails.
Exit with 1 on include errors to make pin pointing issues easier by
having make/ninja stop earlier.
CMake documentation says that CMAKE_CFG_INTDIR is no longer necessary to
find the right binary for the configuration and is in fact deprecated in
recent versions of CMake.
For example:
inum: uint32 = case (ed & 0x0f) of {
0x00 -> n_8; # n_8 is a uint8
0x01 -> n_16; # n_16 is a uint16
0x02 -> n_32; # n_32 is a uint32
default -> 0;
};
Previously, the temporary storage used for evaluating the
case-expression was based on whatever type the first case yields, which
is a uint8 in the above example. That behavior can lead to a narrowing
conversion whenever the 0x01 or 0x02 cases occur.
The new behavior is to base the temporary storage's type on the largest
numeric type that the case-expression can yield, which is uint32 in the
above example.
Incremental flowbuffer parsing sought to first parse the "minimum header
length" required to get the full frame length, possibly from a record
field, but generating the logic to parse that field could greedily
bundle in additional boundary-checks for all subsequent fields of
known-size.
E.g. for flowunit parsing of this:
type HDR = record {
version: uint8;
reserved: uint8;
len: uint16;
} &byteorder=bigendian;
type FOO_PDU(is_orig: bool) = record {
hdr: HDR;
plen: uint8;
ptype: uint8;
something: bytestring &restofdata;
} &byteorder=bigendian, &length=hdr.len;
The flowbuffer was correctly seeking to buffer 4 bytes and parse the
"hdr.len" field, but the generated parsing logic for "hdr.len" included
a boundary check all the way up to include "plen" and "ptype".
This causes out-of-bounds exceptions to be thrown for inputs that should
actually be possible to incrementally parse via flowbuffer.
OpenBSD shared library names are like "libfoo.so.major.minor" and
binpac was previously letting the post-release number into the name
like "libbinpac.so.0.54-7", which isn't compatible with that scheme.
Related to https://github.com/zeek/zeek/issues/649
Array lengths use signed integer storage, so multiplication of that
by the element size for purpose of bounds checking against available
data may produce a signed integer overlow, which is undefined behavior.
The type used to store the index for a case-type now tracks the
type of the index expression rather than always using an "int".
The case fields also now have some checking done at code-gen-time to
ensure the constants used for cases does not exceed the numeric limit
of the type used in the case's index expression. Then, assuming, it
looks safe, the C++ case labels are generated with casts to the type
of the Binpac case's index expression to ensure compilers accept it
(since all Binpac numbers use "int" for storage/printing internally).
This allows for tunability of the following behaviors:
* Minimum flowbuffer capacity to use when parsing a new unit
* Threshold at which flowbuffer capacity is contracted back to the
minimum after parsing a complete unit and before parsing the next
* Maximum flowbuffer capacity to allow when parsing a given unit
Failed flowbuffer allocations due to reaching maximum capacity or any
other reason now throw ExceptionFlowBufferAlloc.
For arrays that are fields within a record, the bounds check was based
on a pointer to the start of the record rather than the start of the
array field.
For arrays with unknown element size, it used to cap the array length to
be the maximum it could be for the given input buffer, assuming 1-byte
elements. An exception is instead now raised for cases where the
maximum array size (number of elements) exceeds what's possibly in the
buffer. Using an exception versus capping the length may help prevent
protocol analyzers from unintentionally accessing array indices that
were not actually parsed even if the evauluated-length-expression for
that given array implies it may have been.
The former is easy to misuse by accidentally storing the contents of
the temporary string return value and accessing it later. There's also
potential pitfalls in changing it to return a pointer into a static
buffer, so instead start using strfmt() uniformly across the codebase
and change some methods to use strings instead of char*.