This was exposed by OSS-Fuzz after the HTTP/0.9 changes in zeek/zeek#2851:
We do not check the result of parsing the from and last bytes of a
Content-Range header and would reference uninitialized values on the stack
if these were not valid.
This doesn't seem as bad as it sounds outside of yielding non-sensible values:
If the result was negative, we weird/bailed. If the result was positive, we
already had to treat it with suspicion anyway and the SetPlainDelivery()
logic accounts for that.
OSS-Fuzz tickled an assert when sending a HTTP response before a HTTP/0.9
request. Avoid this by resetting reply_message upon seeing a HTTP/0.9 request.
PCAP was generated artificially: Server sending a reply providing a
Content-Length. Because HTTP/0.9 processing would remove the ContentLine
support analyzer, more data was delivered to the HTTP_Message than
expected, triggering an assert.
This is a follow-up for zeek/zeek#2851.
oss-fuzz generated "HTTP traffic" containing 250k+ sequences of "T<space>\r\r"
which Zeek then logged as individual HTTP requests. Add a heuristic to bail
on such request lines. It's a bit specific to the test case, but should work.
There are more issues around handling HTTP/0.9, e.g. triggering
"not a http reply line" when HTTP/0.9 never had such a thing, but
I don't think that's worth fixing up.
Fixes#119
The current_entity tracking in HTTP assumes that client/server never
send HTTP entities at the same time. The attached pcap (generated
artificially) violates this and triggers:
1663698249.307259 expression error in <...>base/protocols/http/./entities.zeek, line 89: field value missing (HTTP::c$http$current_entity)
For the http-no-crlf test, include weird.log as baseline. Now that weird is
@load'ed from http, it is actually created and seems to make sense
to btest-diff it, too.
Now that it's loaded in bare mode, no need to load it explicitly.
The main thing that tests were relying on seems to be tracking of
c$service for conn.log baselines. Very few were actually checking
for dpd.log
This is a script-only change that unrolls File::Info records into
multiple files.log entries if the same file was seen over different
connections by single worker. Consequently, the File::Info record
gets the commonly used uid and id fields added. These fields are
optional for File::Info - a file may be analyzed without relation
to a network connection (e.g by using Input::add_analysis()).
The existing tx_hosts, rx_hosts and conn_uids fields of Files::Info
are not meaningful after this change and removed by default. Therefore,
files.log will have them removed, too.
The tx_hosts, rx_hosts and conn_uids fields can be revived by using the
policy script frameworks/files/deprecated-txhosts-rxhosts-connuids.zeek
included in the distribution. However, with v6.1 this script will be
removed.
By default all baslines are run through diff-remove-timestamp. On a BSD
sed implementation, this means that a newline is added to the end of the
file, if no newline was there originally. This behavior differs from GNU
sed, which does not add a newline.
In this commit we unify this behavior by always adding a newline, even
when using GNU sed. This commit also disables the canonifier for a bunch
of binary baselines, so we do not have to change them.
This is to avoid missing large sessions where a single side exceeds
the DPD buffer size. It comes with the trade-off that now the analyzer
can be triggered by anybody controlling one of the endpoints (instead
of both).
Test suite changes are minor, and nothing in "external".
Closes#343.
* origin/topic/timw/open-dict: (40 commits)
Move Dict constants to detail namespace
Add a few missing deprecation fixes
Adjust Dict whitespace/style
Adjust more btest timings
Improve termination reliability/speed for brokerstore btests
General btest cleanup
Update NEWS about change in Dictionary implementation
Improve Intel expire-item btest to be less time-sensitive
Improve btests with unstable table/set output ordering
Update doc submodule
Adjust a few btests that were unstable due to time-sensitivity
Fix DNS script deleting a table element while iterating
Improve a brokerstore btest to filter out Broker connection messages
Sort output of a few SumStats cluster tests
Fix extract_first_email_addr() to really return the first email
Add find_all_ordered() BIF
Extend external test suite canonifier with set-sorting logic
Update btests/baselines for OpenDict compat
Fix new/malloc/delete/free mismatches in Dictionary code
Add explanation for a Dict TODO item
...
- Use `-b` most everywhere, it will save time.
- Start some intel tests upon the input file being fully read instead of
at an arbitrary time.
- Improve termination condition for some sumstats/cluster tests.
- Filter uninteresting output from some supervisor tests.
- Test for `notice_policy.log` is no longer needed.
The body-lengths of sub-entities, like multipart messages, got counted
twice by mistake: once upon the end of the sub-entity and then again
upon the end of the top-level entity that contains all sub-entities.
The size of just the top-level entity is the correct one to use.
This also installs symlinks from "zeek" and "bro-config" to a wrapper
script that prints a deprecation warning.
The btests pass, but this is still WIP. broctl renaming is still
missing.
#239
The "orig_fuids", "orig_filenames", "orig_mime_types" http.log fields as
well as their "resp" counterparts are now limited to having
"HTTP::max_files_orig" or "HTTP::max_files_resp" entries, which are 15
by default. The limit can also be ignored case-by-case via the
"HTTP::max_files_policy" hook.
Fixes GH-289
This adds a slight patch to the HTTP analyzer, which recognizez when a connection is
upgraded to a different protocol (using a 101 reply with a few specific headers being
set).
In this case, the analyzer stops further processing of the connection (which will
result in DPD errors) and raises a new event:
event http_connection_upgrade(c: connection, protocol: string);
Protocol contains the name of the protocol that is being upgraded to, as specified in
one of the header values.
The change from #49 made it an error to not have a URI. That however
then led requests with an URI yet no version to abort as well.
Instead, we now check if the token following the method is an "HTTP/"
version identifier. If, so accept that the URI is empty (and trigger
a weird) but otherwise keep processing.
Adding test cases for both HTTP requests without URI and without
version.
The HTTP analyzer was propogating Gaps to the files framework even
in the case of a packet drop occurring immediately after the headers
are completed in an HTTP response when the response content length
was declared to be zero (no file started, so no loss).
Includes passing test.
The logic for determining whether a gap was entirely within a MIME
entity body was not asking the current entity, which may be better able
to answer that question if it was using the Content-Range header and
thus knows if the gap exceeds the length of the body that's still
expected.
Addresses BIT-1247
For example, if we have a connection between TCP "A" and TCP "B" and "A"
sends segments "1" and "2", but we don't see the first and then the next
acknowledgement from "B" is for everything up to, and including, "2",
the gap would be reported to include both segments instead of just the
first and then delivering the second. Put generally: any segments that
weren't yet delivered because they're waiting for an earlier gap to be
filled would be dropped when an ACK comes in that includes the gap as
well as those pending segments. (If a distinct ACK was seen for just
the gap, that situation would have worked).
Addresses BIT-1246.
As opposed to delaying until a certain-sized-buffer fills, which is
problematic because then the event becomes out of sync with the "rest of
the world". E.g. content_gap handlers being called sooner than
expected.
Addresses BIT-1240.
* topic/robin/http-connect:
HTTP fix for output handlers.
Expanding the HTTP methods used in the signature to detect HTTP traffic.
Updating submodule(s).
Fixing removal of support analyzers, plus some tweaking and cleanup of CONNECT code.
HTTP CONNECT proxy support.
BIT-1132 #merged
CONNECT code.
Removal of support analyzers was broken. The code now actually doesn't
delete them immediately anymore but instead just flags them as
disabled. They'll be destroyed with the parent analyzer later.
Also includes a new leak tests exercising the CONNECT code.
Lines starting # with '#' will be ignored, and an empty message aborts
the commit. # On branch topic/robin/http-connect # Changes to be
committed: # modified: scripts/base/protocols/http/main.bro #
modified: scripts/base/protocols/ssl/consts.bro # modified:
src/analyzer/Analyzer.cc # modified: src/analyzer/Analyzer.h #
modified: src/analyzer/protocol/http/HTTP.cc # new file:
testing/btest/core/leaks/http-connect.bro # modified:
testing/btest/scripts/base/protocols/http/http-connect.bro # #
Untracked files: # .tags # changes.txt # conn.log # debug.log # diff #
mpls-in-vlan.patch # newfile.pcap # packet_filter.log # reporter.log #
src/PktSrc.cc.orig # weird.log #
- Several places were just using old variable names or not loading
scripts correctly after they'd been renamed/moved.
- Revert/adjust a change in how HTTP file handles are generated that
broke partial content responses.
- Turn some libmagic builtin checks back on; seems some are actually
useful (e.g. text detection seems to be a builtin). The rule going
forward probably will be only to turn off a builtin if we confirm it
causes issues.
- Removed some tests that are redundant or not necessary anymore because
the generic file analysis tests cover them.
- A couple FTP tests still fail that I think need an actual solution via
script changes.
http.log now has files taken from request and response bodies in
different fields for each, and can now track multiple files per body.
That is, the "extraction_file" field is now "extracted_request_files"
and "extracted_response_files".