Incremental flowbuffer parsing sought to first parse the "minimum header
length" required to obtain the full frame length, possibly from a record
field, but generating the logic to parse that field could greedily
bundle in additional boundary checks for all subsequent fields of
known size.
E.g. for flowunit parsing of this:
    type HDR = record {
        version:  uint8;
        reserved: uint8;
        len:      uint16;
    } &byteorder=bigendian;

    type FOO_PDU(is_orig: bool) = record {
        hdr:       HDR;
        plen:      uint8;
        ptype:     uint8;
        something: bytestring &restofdata;
    } &byteorder=bigendian, &length=hdr.len;
The flowbuffer correctly sought to buffer 4 bytes and parse the
"hdr.len" field, but the generated parsing logic for "hdr.len" included
a boundary check extending all the way through "plen" and "ptype".
This caused out-of-bounds exceptions to be thrown for inputs that should
actually be possible to parse incrementally via the flowbuffer.
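As a rough sketch of the difference (illustrative names, not the exact
generated code; std::out_of_range stands in for binpac's exception):

    #include <cstdint>
    #include <stdexcept>

    // Parsing hdr.len only requires the 4 bytes of HDR to be buffered.
    void check_hdr_len(const uint8_t* begin, const uint8_t* end) {
        if ( end - begin < 4 )
            throw std::out_of_range("need 4 bytes for hdr.len");
    }

    // The buggy generator bundled in the checks for plen and ptype too,
    // demanding 6 bytes before the flowbuffer could have buffered them:
    void buggy_check_hdr_len(const uint8_t* begin, const uint8_t* end) {
        if ( end - begin < 6 )
            throw std::out_of_range("need 6 bytes for hdr.len");
    }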
Array lengths use signed integer storage, so multiplying the length by
the element size for the purpose of bounds checking against the
available data may produce a signed integer overflow, which is
undefined behavior.
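A minimal sketch of a well-defined form of that check, with
illustrative names (std::out_of_range standing in for the binpac
exception): widening to an unsigned 64-bit type before multiplying
keeps the arithmetic defined.

    #include <cstdint>
    #include <stdexcept>

    // num_elements uses signed storage, as in the generated parsers.
    bool array_fits(const uint8_t* begin, const uint8_t* end,
                    int num_elements, int elem_size) {
        if ( num_elements < 0 )
            throw std::out_of_range("negative array length");

        // int * int could overflow (undefined behavior); do the
        // arithmetic in uint64_t instead.
        uint64_t needed = uint64_t(num_elements) * uint64_t(elem_size);
        return needed <= uint64_t(end - begin);
    }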
The type used to store the index of a case-type now tracks the
type of the index expression rather than always using an "int".
The case fields also now get some checking at code-generation time to
ensure the constants used for cases do not exceed the numeric limits
of the type of the case's index expression. Then, assuming it
looks safe, the C++ case labels are generated with casts to the type
of the Binpac case's index expression to ensure compilers accept them
(since all Binpac numbers use "int" for storage/printing internally).
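For example (a sketch with hypothetical names, assuming a case indexed
by a uint8 expression), the generated switch might look like:

    #include <cstdint>

    // A case constant such as 300 would be rejected at generation
    // time, since it exceeds uint8's numeric limit; accepted labels
    // are cast to the index expression's type.
    void dispatch(uint8_t t_msg_type) {
        switch ( t_msg_type ) {
            case static_cast<uint8_t>(1):
                /* parse variant 1 */
                break;
            case static_cast<uint8_t>(255):
                /* parse variant 2 */
                break;
            default:
                break;
        }
    }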
For arrays that are fields within a record, the bounds check was based
on a pointer to the start of the record rather than the start of the
array field.
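A sketch of the corrected check, with illustrative names: the
available bytes must be measured from the array field's own start.

    #include <cstddef>
    #include <cstdint>
    #include <stdexcept>

    void check_array_field(const uint8_t* record_begin,
                           const uint8_t* end,
                           size_t field_offset, size_t array_bytes) {
        const uint8_t* array_begin = record_begin + field_offset;

        // The buggy form compared against (end - record_begin), which
        // overstates the available data by field_offset bytes.
        if ( array_bytes > size_t(end - array_begin) )
            throw std::out_of_range("array field out of bounds");
    }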
For arrays with unknown element size, the generated code used to cap
the array length at the maximum possible for the given input buffer,
assuming 1-byte elements. An exception is now raised instead for cases
where the maximum array size (number of elements) exceeds what could
possibly fit in the buffer. Using an exception rather than capping the
length helps prevent protocol analyzers from unintentionally accessing
array indices that were never actually parsed, even if the evaluated
length expression for the given array implies they may have been.
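A sketch of the new behavior, with illustrative names: since each
element consumes at least one byte, a count exceeding the remaining
input can never be satisfied and is rejected up front.

    #include <cstdint>
    #include <stdexcept>

    void check_array_count(const uint8_t* begin, const uint8_t* end,
                           uint64_t num_elements) {
        // Previously the count was silently capped to (end - begin);
        // now an exception is thrown instead.
        if ( num_elements > uint64_t(end - begin) )
            throw std::out_of_range("array length exceeds available data");
    }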
The former is easy to misuse by accidentally storing the contents of
the temporary string return value and accessing it later. There are
also potential pitfalls in changing it to return a pointer into a
static buffer, so instead start using strfmt() uniformly across the
codebase and change some methods to use strings instead of char*.
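An illustrative sketch of the pitfall (not the actual binpac
functions): a formatter returning char* ties the result to a buffer
the caller doesn't own, while returning std::string, as strfmt() does,
gives each call its own storage.

    #include <cstdio>
    #include <string>

    static const char* bad_fmt(int n) {
        static char buf[64];
        snprintf(buf, sizeof(buf), "value=%d", n);
        return buf;  // points into a shared static buffer
    }

    static std::string good_fmt(int n) {
        char buf[64];
        snprintf(buf, sizeof(buf), "value=%d", n);
        return std::string(buf);  // caller owns the result
    }

    void demo() {
        const char* a = bad_fmt(1);
        bad_fmt(2);                   // 'a' now also reads "value=2"
        std::string s = good_fmt(1);
        good_fmt(2);                  // 's' still holds "value=1"
        (void)a;
    }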
In this case, the bounds checking for individual elements can be
optimized out of the parsing loop in favor of a single, array-wide
bounds check beforehand.
For arrays with a length expression (e.g. uint16[size] instead of
uint16[]), the parsing loop would treat reaching the end of the data
buffer as a successful termination condition even if it had not yet
parsed the required number of elements.
Now, for such arrays, the loop terminates only via the loop counter
(derived from the length expression), or else it throws an
out-of-bounds exception upon trying to parse an element and finding
insufficient data in the buffer.
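A sketch of the fixed loop shape, with illustrative names and
std::out_of_range standing in for the binpac exception:

    #include <cstdint>
    #include <stdexcept>
    #include <vector>

    std::vector<uint16_t> parse_uint16_array(const uint8_t*& p,
                                             const uint8_t* end,
                                             uint32_t count) {
        std::vector<uint16_t> out;

        // Terminate only via the counter; hitting the end of the
        // buffer mid-array throws rather than ending "successfully".
        for ( uint32_t i = 0; i < count; ++i ) {
            if ( end - p < 2 )
                throw std::out_of_range("incomplete array element");
            out.push_back(uint16_t((p[0] << 8) | p[1]));
            p += 2;
        }

        return out;
    }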
Credit to Tomas Bortoli for reporting the problem and proposing
patches.
It should only suppress the parsing-loop boundary check in the case
where array elements are a single byte in length and thus already
covered by the boundary check (generated as a result of &length)
placed before the parsing loop.
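A sketch of when that elision is valid, with illustrative names: the
&length-derived check before the loop already guarantees every 1-byte
element is in bounds.

    #include <cstdint>
    #include <stdexcept>

    void parse_byte_array(const uint8_t* p, const uint8_t* end,
                          uint32_t len) {
        // Single array-wide check generated from &length:
        if ( len > uint64_t(end - p) )
            throw std::out_of_range("array out of bounds");

        for ( uint32_t i = 0; i < len; ++i ) {
            uint8_t b = p[i];  // no per-element boundary check needed
            (void)b;
        }
    }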
The issue is that t_begin_of_data + %s can sometimes overflow.
Bug reported and patch proposed by
Philippe Antoine <p.antoine@catenacyber.fr> from Catena cyber.
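A sketch of the hazard (illustrative; %s is the code generator's
placeholder for the length expression): comparing lengths instead of
derived pointers keeps the check free of overflow.

    #include <cstddef>
    #include <cstdint>

    bool unsafe_check(const uint8_t* t_begin_of_data,
                      const uint8_t* t_end_of_data, size_t len) {
        // The addition itself can wrap for large 'len' (undefined
        // behavior), making the comparison meaningless.
        return t_begin_of_data + len > t_end_of_data;
    }

    bool safe_check(const uint8_t* t_begin_of_data,
                    const uint8_t* t_end_of_data, size_t len) {
        return len > size_t(t_end_of_data - t_begin_of_data);
    }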
A common BinPAC construct for parsing records is a switch statement,
with no breaks between the cases, as control is expected to fall
through.
Coverity raises an error about this; this commit should fix that.
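A sketch of the construct (illustrative; this entry doesn't say whether
the fix used comments or attributes). Annotating the intent, e.g. with
C++17's [[fallthrough]] or a /* FALLTHRU */ comment that Coverity
recognizes, silences the finding:

    void parse_from(int first_field) {
        switch ( first_field ) {
            case 0:
                /* parse field 0 */
                [[fallthrough]];
            case 1:
                /* parse field 1 */
                [[fallthrough]];
            case 2:
                /* parse field 2 */
                break;
        }
    }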
Specifying &length on a record no longer skips generating boundary
checks for individual fields. E.g. a record field that specifies a
&length that extends beyond the &length of the record containing it
should throw binpac::ExceptionOutOfBound, the usual way of handling
out-of-bounds conditions.
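A sketch of the per-field check now generated, with illustrative names
(std::out_of_range standing in for binpac::ExceptionOutOfBound): the
end implied by the field's own &length must stay within the end
implied by the enclosing record's &length.

    #include <cstddef>
    #include <cstdint>
    #include <stdexcept>

    void check_field_length(const uint8_t* field_begin,
                            const uint8_t* record_end,
                            size_t field_length) {
        if ( field_length > size_t(record_end - field_begin) )
            throw std::out_of_range(
                "field &length exceeds the enclosing record's &length");
    }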
Field lengths derived from other data in the input could potentially
lead to reading from outside the bounds of the input buffer.
Reported by John Villamil and Chris Rohlf - Yahoo Paranoids
Strings with a constant &length expression can be checked for negative
length values while generating the parser instead of in the parser
itself (which likely just ends up being dead code).
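A sketch of the generation-time check, with illustrative names: a
negative constant &length is reported while generating the parser, so
no runtime check needs to be emitted at all.

    #include <cstdint>
    #include <stdexcept>

    // Runs in the code generator, not in the generated parser.
    void validate_const_length(int64_t const_length) {
        if ( const_length < 0 )
            throw std::runtime_error("&length expression is negative");
    }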
Add an interface for manually feeding input data into generated
parsers.
This consists of two parts:

1. The generated Flow classes expose their flow buffers via a new
   method flow_buffer().

2. Flow buffers get two new methods:

       // Interface for delayed parsing. Sometimes BinPAC doesn't get
       // the buffering right and then one can use these to feed parts
       // individually and assemble them internally. After calling
       // FinishBuffer(), one can send the upper-layer flow a FlowEOF()
       // to trigger parsing.
       void BufferData(const_byteptr data, const_byteptr end);
       void FinishBuffer();
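A minimal usage sketch, assuming a generated flow class named MyFlow
that exposes flow_buffer() (returning a FlowBuffer*) and the usual
FlowEOF() method; all names besides BufferData()/FinishBuffer() and
flow_buffer() are illustrative:

    void feed_in_parts(MyFlow* flow,
                       const_byteptr p1, const_byteptr p1_end,
                       const_byteptr p2, const_byteptr p2_end) {
        FlowBuffer* buf = flow->flow_buffer();
        buf->BufferData(p1, p1_end);  // feed and assemble the pieces
        buf->BufferData(p2, p2_end);
        buf->FinishBuffer();
        flow->FlowEOF();              // trigger parsing of the buffer
    }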
Switch to using a no-argument throw to preserve the dynamic type of
the binpac exception. Otherwise, the exception is "sliced" and can only
be subsequently handled as binpac::Exception and not a derived type.
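A self-contained sketch of the slicing hazard, with stand-in exception
types:

    #include <cstdio>

    struct Exception { virtual ~Exception() = default; };
    struct ExceptionOutOfBound : Exception { };

    void rethrow_demo() {
        try {
            try {
                throw ExceptionOutOfBound();
            }
            catch ( const Exception& e ) {
                // 'throw e;' would copy 'e' as a plain Exception,
                // slicing off the derived type; a bare 'throw;'
                // rethrows the original object unchanged.
                throw;
            }
        }
        catch ( const ExceptionOutOfBound& ) {
            std::puts("still catchable as the derived type");
        }
    }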
This allows analyzers to define their own types of the same name
without mistakenly shadowing the usages of binpac::Exception
and its derived types in the generated parser code.
If set, parsed elements won't actually be added to the array, and read
access to the array isn't permitted. This is helpful for saving memory
in the case of large arrays whose elements don't need to (or can't)
be buffered.
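A sketch of the effect, with illustrative names (this entry doesn't
name the attribute itself): elements are parsed and validated but
never stored, so memory use stays constant however long the array is.

    #include <cstdint>
    #include <stdexcept>

    void parse_discarding(const uint8_t*& p, const uint8_t* end,
                          uint32_t count) {
        for ( uint32_t i = 0; i < count; ++i ) {
            if ( end - p < 2 )
                throw std::out_of_range("incomplete array element");
            p += 2;  // element consumed but not appended to any array
        }
    }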