We should not be passing the untrusted TCP header length into
DeliverPacket(). Also, DeliverPacket() cap len parameter should
be the capture length of the packet, not remaining data.
With packet->len representing the wire length and other places
relying on it, ensure it's updated for fragments as well. This
assumes non-truncated fragments right now. Otherwise we'd need
to teach the FragmentReassembler to somehow track this independently
but it would be a mess.
The protocol analyzers are prepared to receive truncated data and
this way we give analyzers a chance to look at data. We previously
allowed empty data being passed: When len ended up 0 and remaining
was 0 too.
This may happen with truncated packets and will cause asan builds to bail out
before the packet can be forwarded along. The TCP analyzer already has this
check, but it's missing for UDP.
While writing documentation about troubleshooting and looking a bit
at the older stats.log, realized we don't have the packet lag metric
exposed as metric/telemetry. Add it.
This is a Zeek instance lagging behind in network time ~6second because
it's very overloaded:
zeek_net_packet_lag_seconds{endpoint=""} 6.169406 1684848998092
This seems to have relied on the reading file twice behavior simply
testing that 16 lines are observed. Switch to using two separate
files and doing a system("mv ...") to trigger the REREAD logic, there's
not force_update() needed and it wouldn't do anything if the file
hadn't changed anyway.
Found while writing documentation and being confused why
all lines and end_of_data() arrive twice during startup.
The test is a bit fuzzy, but does fail reliably without
the changes to Raw.cc
Also fix not checking dev in the MODE_REREAD path.
Closes#3053
It needs to be defined by the time we create zeek-config, which happens before
its current definition. To avoid a redundant TOLOWER when we check for presence
of --enable-debug at the beginning, this also switches this to a case-unadjusted
comparison to "Debug", which we use elsewhere in the file too.
Repeating the message for every new call to get_file_handle() is not
very useful. It's pretty much an analyzer configuration issue so logging
it once should be enough.
If DataIn() was called and a cur_entity_id (file_id) has been produced
previously, re-use it for calls to EndOfFile(). This avoids a costly
event_mgr.Drain() when we already have that information. It should be safer,
too, as `get_file_handle()` in script may generate a different ID and
thereby de-synchronizing.
I'm not sure if we somehow set this for oss-fuzz through the environment,
but didn't find anything obvious.
Running oss-fuzz reproducers locally can triggers lookups to malware.hash.cymru.com
and potentially other domains due to loading local.zeek.
When the parent of a support analyzer has been disabled, short-circuit
delivering stream or packet data to it.
The specific scenario this avoids is the Content-Line analyzer continuing
to feed data lines into an disabled SMTP analyzer in turn creating more
events.
This is primarily useful for our fuzzing setup where data chunks up to 1MB
are generated and fed into the analyzer pipeline. In the real-world, chunk
sizes are usually bounded to packet size. Certain TCP reassembly constellations
may trigger these scenarios, however.
Closes#168
OSS-Fuzz generated traffic containing a CWD command with a single very large
path argument (427kb) starting with ".___/` \x00\x00...", This is followed
by a large number of ftp replies with code 250. The directory logic in
ftp_reply() would match every incoming reply with the one pending CWD command,
triggering path buildup ending with something 120MB in size.
Protect from re-using a directory command by setting a flag in the
CmdArg record when it was consumed for the path traversal logic.
This doesn't prevent unbounded path build-up generally, but does prevent the
amplification of a single large command with very many small ftp_replies.
Re-using a pending path command seems like a bug as well.