Commit graph

46 commits

Author SHA1 Message Date
Arne Welzel
377fd711bd HTTP: Implement FlipRoles()
When Zeek flips roles of a HTTP connection subsequent to the HTTP analyzer
being attached, that analyzer would not update its own ContentLine analyzer
state, resulting in the wrong ContentLine analyzer being switched into
plain delivery mode.

In debug builds, this would result in assertion failures, in production
builds, the HTTP analyzer would receive HTTP bodies as individual header
lines, or conversely, individual header lines would be delivered as a
large chunk from the ContentLine analyzer.

PCAPs were generated locally using tcprewrite to select well-known-http ports
for both endpoints, then editcap to drop the first SYN packet.

Kudos to @JordanBarnartt for keeping at it.

Closes #3789
2024-07-04 11:38:33 +02:00
Arne Welzel
e11c20e1eb test-all-policy: Do not load iso-9660.zeek
Changing the default_file_bof_buffer_size has subtle impact on
MIME type detection and changed the zeek-testing baseline. Do
not load this new script via test-all-policy to avoid this.

The new test was mainly an aid to understand what is actually going on.
In short, if default_file_bof_buffer_size is larger than the file MIME
detection only runs when the buffer is full, or when the file is removed.
When a file transfer happens over multiple HTTP connections, only
some or one of the http.log entries will have a proper response MIME type.

PCAP extracted from 2009-M57-day11-18.trace.gz.
2024-02-26 17:58:26 +01:00
Arne Welzel
d2409dd432 signatures: Fix ISO 9960 signature
This signature only really works when default_file_bof_buffer_size is bumped
to a sufficient value (40k).
2024-02-22 12:37:40 +01:00
Arne Welzel
2a858d252e MIME: Cap nested MIME analysis depth to 100
OSS-Fuzz managed to produce a MIME multipart message construction with
thousands of nested entities (or that's what Zeek makes out of it anyhow).
Prevent such deep analysis by capping at a nesting depth of 100,
preventing unnecessary resource usage. A new weird named exceeded_mime_max_depth
is reported when this limit is reached.

This change reduces the runtime of the OSS-Fuzz reproducer from ~45 seconds
to ~2.5 seconds.

The test PCAP was produced from a Python script using the email package
and sending the rendered version via POST to a HTTP server.

Closes #208
2024-01-17 10:18:13 -07:00
xb-anssi
c8103dd963
Test how the signature framework matches HTTP body
This adds a signatures/http-body-match btest to verify how the signature
framework matches HTTP body in requests and responses.

It currently fails because the 'http-request-body' and 'http-reply-body'
clauses never match anything when there is a '$' in their regular
expressions.

The other pattern clauses such as the 'payload' clause do not suffer
from that restriction and it is not documented as a limitation of HTTP
body pattern clauses either, so it is probably a bug.

The "http-body-match" btest shows that without a fix any signatures
which ends with a '$' in a http-request-body or http-reply-body rule
will never raise a signature_match() event, and that signatures which do
not end with a '$' cannot distinguish an HTTP body prefixed by the
matching pattern (ex: ABCD) from an HTTP body consisting entirely of the
matching pattern (ex: AB).

Test cases by source port:
- 13579:
  - GET without body, plain res body (CD, only)
- 13578:
  - GET without body, plain res body (CDEF, prefix)
- 24680:
  - POST plain req body (AB, only), plain res body (CD, only)
- 24681:
  - POST plain req body (ABCD, prefix), plain res body (CDEF, prefix)
- 24682:
  - POST gzipped req body (AB, only), gzipped res body (CD, only)
  - POST plain req body (CD, only), plain res body (EF, only)
- 33210:
  - POST multipart plain req body (AB;CD;EF, prefix)
  - plain res body (CD, only)
- 33211:
  - POST multipart plain req body (ABCD;EF, prefix)
  - plain res body (CDEF, prefix)
- 34527:
  - POST chunked gzipped req body (AB, only)
  - chunked gzipped res body (CD, only)
- 34528:
  - POST chunked gzipped req body (ABCD, prefix)
  - chunked gzipped res body (CDEF, prefix)

The tests with source ports 24680, 24682 and 34527 should
match the signature http_request_body_AB_only and the signature
http_request_body_AB_prefix, but they only match the latter.

The tests with source ports 13579, 24680, 24682, 33210 and 34527 should
match the signature http_response_body_CD_only and the signature
http_response_body_CD_prefix, but they only match the latter.

The tests with source ports 24680, 24681, 33210 and 33211 show how the
http_request_body_AB_then_CD signature with two http-request-body
conditions match either on one or multiple requests (documented
behaviour).

The test cases with other source ports show where the
http_request_body_AB_only and http_response_body_CD_only signatures
should not match because their bodies include more than the searched
patterns.
2023-11-03 15:28:15 +01:00
Johanna Amann
e18edfa452 Add extract_limit_includes_missing option for file extraction
Setting this option to false does not count missing bytes in files towards the
extraction limits, and allows to extract data up to the desired limit,
even when partial files are written.

When missing bytes are encountered, files are now written as sparse
files.

Using this option requires the underlying storage and utilities to support
sparse files.
2023-09-14 12:11:42 -07:00
Tim Wojtulewicz
5934e143aa Revert "Add extract_limit_includes_missing option for file extraction"
This reverts commit f4d0fdcd5c.
2023-09-14 12:10:40 -07:00
Johanna Amann
f4d0fdcd5c Add extract_limit_includes_missing option for file extraction
Setting this option to false does not count missing bytes in files towards the
extraction limits, and allows to extract data up to the desired limit,
even when partial files are written.

When missing bytes are encountered, files are now written as sparse
files.

Using this option requires the underlying storage and utilities to support
sparse files.

(cherry picked from commit afa6f3a0d3b8db1ec5b5e82d26225504c2891089)
2023-09-12 12:00:36 -07:00
Arne Welzel
af1714853f http: Prevent request/response de-synchronization and unbounded state growth
When http_reply events are received before http_request events, either
through faking traffic or possible re-ordering, it is possible to trigger
unbounded state growth due to later http_requests never being matched
again with responses.

Prevent this by synchronizing request/response counters when late
requests come in.

Also forcefully flush pending requests when http_replies are never
observed either due to the analyzer having been disabled or because
half-duplex traffic.

Fixes #1705
2023-08-28 15:02:58 +02:00
Arne Welzel
b18122da08 Merge branch 'master' of https://github.com/progmboy/zeek
* 'master' of https://github.com/progmboy/zeek:
  fix http AUTHORIZATION base64 decode failed

Added a test during merge.
2023-06-27 18:21:34 +02:00
Arne Welzel
c29b98b224 Merge remote-tracking branch 'origin/topic/awelzel/http-content-range-parsing-robustness'
* origin/topic/awelzel/http-content-range-parsing-robustness:
  HTTP: Make Content-Range parsing more robust
2023-03-13 18:41:16 +01:00
Arne Welzel
b21e6f72da HTTP: Make Content-Range parsing more robust
This was exposed by OSS-Fuzz after the HTTP/0.9 changes in zeek/zeek#2851:
We do not check the result of parsing the from and last bytes of a
Content-Range header and would reference uninitialized values on the stack
if these were not valid.

This doesn't seem as bad as it sounds outside of yielding non-sensible values:
If the result was negative, we weird/bailed. If the result was positive, we
already had to treat it with suspicion anyway and the SetPlainDelivery()
logic accounts for that.
2023-03-13 18:00:39 +01:00
Arne Welzel
fbf9d53c44 HTTP: Reset reply_message for HTTP/0.9
OSS-Fuzz tickled an assert when sending a HTTP response before a HTTP/0.9
request. Avoid this by resetting reply_message upon seeing a HTTP/0.9 request.

PCAP was generated artificially: Server sending a reply providing a
Content-Length. Because HTTP/0.9 processing would remove the ContentLine
support analyzer, more data was delivered to the HTTP_Message than
expected, triggering an assert.

This is a follow-up for zeek/zeek#2851.
2023-03-13 14:13:50 +01:00
Arne Welzel
c4302ec280 testing/http: http-11-request-then-cruft
A client sends a "proper" HTTP/1.1 request and afterwards a few T /\n\n sequences.
The latter ones aren't logged.
2023-01-26 19:59:39 +01:00
Arne Welzel
0b26866ecf testing/http: Add pcap extracted from m5-long external test-suite
This tests that the HTTP version is now updated if it changes in the
course of a connection.
2023-01-26 19:59:39 +01:00
Arne Welzel
6d19c49efe intel/seen/file-names: Use file_over_new_connection()
The seen/file-names script relies on f$info$filename to be populated.
For HTTP and other network protocols, however, this field is only
populated during file_over_new_connection() that's running after
file_new().

Use the file_new() event only for files without connections and
file_over_new_connection() implies that f$conns is populated, anyway.

Special case SMB to avoid finding files twice, because there's a
custom implementation in seen/smb-filenames.zeek.

Fixes #2647
2023-01-10 10:10:28 +01:00
Arne Welzel
1e06c8bfda frameworks/notice: Handle fa_file with no or more than a single connection better
* When a file is transferred over multiple connection, have
  create_file_info() just pick the first one instead of none.

* Do not unconditionally assume cid and cuid as set on a
  Notice::FileInfo object.
2022-12-06 11:17:30 +01:00
Arne Welzel
540fe7aff7 http: Heuristic around rejecting malformed HTTP/0.9 traffic
oss-fuzz generated "HTTP traffic" containing 250k+ sequences of "T<space>\r\r"
which Zeek then logged as individual HTTP requests. Add a heuristic to bail
on such request lines. It's a bit specific to the test case, but should work.

There are more issues around handling HTTP/0.9, e.g. triggering
"not a http reply line" when HTTP/0.9 never had such a thing, but
I don't think that's worth fixing up.

Fixes #119
2022-11-18 18:19:58 +01:00
Arne Welzel
38e226bf75 http: Prevent script errors when http$current_entity is not set
The current_entity tracking in HTTP assumes that client/server never
send HTTP entities at the same time. The attached pcap (generated
artificially) violates this and triggers:

    1663698249.307259 expression error in <...>base/protocols/http/./entities.zeek, line 89: field value missing (HTTP::c$http$current_entity)

For the http-no-crlf test, include weird.log as baseline. Now that weird is
@load'ed from http, it is actually created and seems to make sense
to btest-diff it, too.
2022-09-26 10:18:24 +02:00
Arne Welzel
d2314d2666 files.log: Unroll and introduce uid and id fields
This is a script-only change that unrolls File::Info records into
multiple files.log entries if the same file was seen over different
connections by single worker. Consequently, the File::Info record
gets the commonly used uid and id fields added. These fields are
optional for File::Info - a file may be analyzed without relation
to a network connection (e.g by using Input::add_analysis()).

The existing tx_hosts, rx_hosts and conn_uids fields of Files::Info
are not meaningful after this change and removed by default. Therefore,
files.log will have them removed, too.

The tx_hosts, rx_hosts and conn_uids fields can be revived by using the
policy script frameworks/files/deprecated-txhosts-rxhosts-connuids.zeek
included in the distribution. However, with v6.1 this script will be
removed.
2022-08-16 17:22:20 +02:00
jerome Grandvalet
8cabecec40 Fix HTTP evasion
- Happen when there is no CRLF at the end of HTTP
    - Fix by adding CRLF when packet is complete (in relation to content-length in header)
2021-07-23 09:28:29 +02:00
Peter Oettig
b2e6c9ac9a Initial implementation of Lower-Level analyzers 2020-09-23 11:13:25 -07:00
Robin Sommer
0af57d12b2 Change HTTP's DPD signatures so that each side can trigger the analyzer on its own.
This is to avoid missing large sessions where a single side exceeds
the DPD buffer size. It comes with the trade-off that now the analyzer
can be triggered by anybody controlling one of the endpoints (instead
of both).

Test suite changes are minor, and nothing in "external".

Closes #343.
2020-09-08 07:33:36 +00:00
Jon Siwek
363b167bd2 GH-1100: Fix reported body-length of HTTP messages w/ sub-entities
The body-lengths of sub-entities, like multipart messages, got counted
twice by mistake: once upon the end of the sub-entity and then again
upon the end of the top-level entity that contains all sub-entities.
The size of just the top-level entity is the correct one to use.
2020-08-04 14:21:03 -07:00
Jon Siwek
2000e2a424 GH-977: Improve pcap error handling
Switches from pcap_next() to pcap_next_ex() to better handle all error
conditions.  This allows, for example, to have a non-zero exit code for
a Zeek process that fails to fully process all packets in a pcap file.
2020-06-08 18:11:58 -07:00
Jan Grashoefer
788b56a652 Add speculative service script.
The speculative service script handles dpd_late_match events to extend
conn.log with infos about potential protocol identifications.
2019-08-29 11:47:04 +02:00
Jon Siwek
1f777b57b8 BIT-1926: add unit tests for misc. HTTP patches 2018-05-08 15:39:27 -05:00
Johanna Amann
eab80c8834 HTTP: Recognize and skip upgrade/websocket connections.
This adds a slight patch to the HTTP analyzer, which recognizez when a connection is
upgraded to a different protocol (using a 101 reply with a few specific headers being
set).

In this case, the analyzer stops further processing of the connection (which will
result in DPD errors) and raises a new event:

event http_connection_upgrade(c: connection, protocol: string);

Protocol contains the name of the protocol that is being upgraded to, as specified in
one of the header values.
2017-08-04 07:04:28 -07:00
Johanna Amann
ade9aa219b Better handling of % at end of line. 2017-07-27 22:04:47 -07:00
Seth Hall
90399db32d Additional test specifically for the HTTP filename handling. 2016-06-15 01:56:07 -04:00
wglodek
9ebe7b2a21 updated weird message and tests 2016-03-04 18:03:24 -05:00
wglodek
93f52fcdd2 detect possible HTTP evasion attempts 2016-02-07 11:22:09 -05:00
Robin Sommer
642ef5d3c1 Tweaking how HTTP requests without URIs are handled.
The change from #49 made it an error to not have a URI. That however
then led requests with an URI yet no version to abort as well.
Instead, we now check if the token following the method is an "HTTP/"
version identifier. If, so accept that the URI is empty (and trigger
a weird) but otherwise keep processing.

Adding test cases for both HTTP requests without URI and without
version.
2016-01-15 12:59:11 -08:00
Robin Sommer
c151a25843 Fix support for HTTP connect when server adds headers to response.
Patch by Eric Karasuda.

I slightly tweaked the patch to not need a new member variable. Also
turned the provided trace into a test case.
2015-10-23 13:10:33 -07:00
Robin Sommer
46e584daa2 Adding tests for Flash version parsing and plugin detection.
(The plugin detection isn't testing the Chrome behaviour actually,
don't have a trace for that.)
2015-07-30 07:23:14 -07:00
Seth Hall
ea2ce67c5f Fixes an issue with missing zlib headers on deflated HTTP content.
- Includes a test.
2015-05-18 14:30:32 -04:00
Seth Hall
89d66af792 Fix an issue with packet loss in http file reporting.
The HTTP analyzer was propogating Gaps to the files framework even
in the case of a packet drop occurring immediately after the headers
are completed in an HTTP response when the response content length
was declared to be zero (no file started, so no loss).

Includes passing test.
2015-04-08 13:39:42 -04:00
Jon Siwek
af9d31dcc1 Fix incorrect data delivery skips after gap in HTTP Content-Range.
The logic for determining whether a gap was entirely within a MIME
entity body was not asking the current entity, which may be better able
to answer that question if it was using the Content-Range header and
thus knows if the gap exceeds the length of the body that's still
expected.

Addresses BIT-1247
2014-09-11 14:53:47 -05:00
Jon Siwek
1e02d5d5b5 Fix file analysis placement of data after gap in HTTP Content-Range.
Addresses BIT-1248.
2014-09-11 12:25:43 -05:00
Jon Siwek
f1cef9d2a9 Fix issue w/ TCP reassembler not delivering some segments.
For example, if we have a connection between TCP "A" and TCP "B" and "A"
sends segments "1" and "2", but we don't see the first and then the next
acknowledgement from "B" is for everything up to, and including, "2",
the gap would be reported to include both segments instead of just the
first and then delivering the second.  Put generally: any segments that
weren't yet delivered because they're waiting for an earlier gap to be
filled would be dropped when an ACK comes in that includes the gap as
well as those pending segments.  (If a distinct ACK was seen for just
the gap, that situation would have worked).

Addresses BIT-1246.
2014-09-11 10:47:56 -05:00
Jon Siwek
f97f58e9db Raise http_entity_data in line with data arrival.
As opposed to delaying until a certain-sized-buffer fills, which is
problematic because then the event becomes out of sync with the "rest of
the world".  E.g. content_gap handlers being called sooner than
expected.

Addresses BIT-1240.
2014-09-10 13:20:47 -05:00
Seth Hall
dd0856a57f HTTP CONNECT proxy support.
- The HTTP analyzer now supports handling HTTP CONNECT proxies
   same as the SOCKS analyzer handles proxying.
2014-02-12 22:38:59 -05:00
Jon Siwek
0cb2a90da4 Add script to detect filtered TCP traces, addresses BIT-1119.
If reading a trace file w/ only TCP control packets, a warning is
emitted to suggest the 'detect_filtered_traces' option if the user
doesn't desire Bro to report missing TCP segments for such a trace file.
2014-01-31 17:04:58 -06:00
Jon Siwek
e18084b68d Add unit tests for new Bro Manual docs. 2014-01-21 16:01:55 -06:00
Jon Siwek
3cbef60f57 Fix HTTP multipart body file analysis.
Each part now gets assigned a different file handle/id.
2013-05-21 15:35:22 -05:00
Jon Siwek
59ed5c75f1 FileAnalysis: add unit tests covering current protocol integration.
And had to make various fixes/refinements after scrutinizing results.
2013-03-19 15:50:05 -05:00