diff --git a/CHANGES b/CHANGES
index 270c83ba16..99affd0461 100644
--- a/CHANGES
+++ b/CHANGES
@@ -1,3 +1,109 @@
+6.2.0-dev.102 | 2023-11-07 09:58:25 +0100
+
+  * Merge branch 'topic/xb-anssi/http_signature_body_end_match' of https://github.com/xb-anssi/zeek (Arne Welzel, Corelight)
+
+    * 'topic/xb-anssi/http_signature_body_end_match' of https://github.com/xb-anssi/zeek:
+      Let signature framework match HTTP body end
+      Test how the signature framework matches HTTP body
+
+  * Let signature framework match HTTP body end (xb-anssi)
+
+    The HTTP analyzer never tells the signature framework when the body of a
+    request or a response ends, so any signature regex ending in a '$' used
+    in an 'http-request-body' or in an 'http-reply-body' condition will
+    never match.
+
+    This made it impossible to write a signature which could distinguish an
+    HTTP body consisting only of something from an HTTP body prefixed by
+    that same something.
+
+    - Fix:
+
+    The fix notifies the signature framework on EndOfData() that there will
+    be no further data to match for this body by giving it an empty buffer
+    of length 0 with the eol parameter set to true and all others set to
+    false. This lets it reach the '$' state in its DFA, and doesn't affect
+    other documented HTTP match behaviours.
+
+    - Limitation:
+
+    Since the signature framework doesn't appear to keep previously consumed
+    data on hand, any match of an http-*-body condition whose patterns ends
+    with a '$' will lead to an empty data parameter being passed to the
+    signature_match() event because the body data is no longer available
+    when EndOfData() happens.
+
+    Due to segmentation there is anyway no guarantee the data parameter
+    would have held the entire match even without the '$', since the data
+    parameter only receives the last chunk of data which completed the match
+    condition, as can be seen on prefix matches in the btest cases where the
+    matching data spans multiple segments (the event gives 'B' and not
+    'AB'), so this is only an extreme case of partial data being given to
+    that event.
+
+  * Test how the signature framework matches HTTP body (xb-anssi)
+
+    This adds a signatures/http-body-match btest to verify how the signature
+    framework matches HTTP body in requests and responses.
+
+    It currently fails because the 'http-request-body' and 'http-reply-body'
+    clauses never match anything when there is a '$' in their regular
+    expressions.
+
+    The other pattern clauses such as the 'payload' clause do not suffer
+    from that restriction and it is not documented as a limitation of HTTP
+    body pattern clauses either, so it is probably a bug.
+
+    The "http-body-match" btest shows that without a fix any signatures
+    which ends with a '$' in a http-request-body or http-reply-body rule
+    will never raise a signature_match() event, and that signatures which do
+    not end with a '$' cannot distinguish an HTTP body prefixed by the
+    matching pattern (ex: ABCD) from an HTTP body consisting entirely of the
+    matching pattern (ex: AB).
+
+    Test cases by source port:
+    - 13579:
+      - GET without body, plain res body (CD, only)
+    - 13578:
+      - GET without body, plain res body (CDEF, prefix)
+    - 24680:
+      - POST plain req body (AB, only), plain res body (CD, only)
+    - 24681:
+      - POST plain req body (ABCD, prefix), plain res body (CDEF, prefix)
+    - 24682:
+      - POST gzipped req body (AB, only), gzipped res body (CD, only)
+      - POST plain req body (CD, only), plain res body (EF, only)
+    - 33210:
+      - POST multipart plain req body (AB;CD;EF, prefix)
+      - plain res body (CD, only)
+    - 33211:
+      - POST multipart plain req body (ABCD;EF, prefix)
+      - plain res body (CDEF, prefix)
+    - 34527:
+      - POST chunked gzipped req body (AB, only)
+      - chunked gzipped res body (CD, only)
+    - 34528:
+      - POST chunked gzipped req body (ABCD, prefix)
+      - chunked gzipped res body (CDEF, prefix)
+
+    The tests with source ports 24680, 24682 and 34527 should
+    match the signature http_request_body_AB_only and the signature
+    http_request_body_AB_prefix, but they only match the latter.
+
+    The tests with source ports 13579, 24680, 24682, 33210 and 34527 should
+    match the signature http_response_body_CD_only and the signature
+    http_response_body_CD_prefix, but they only match the latter.
+
+    The tests with source ports 24680, 24681, 33210 and 33211 show how the
+    http_request_body_AB_then_CD signature with two http-request-body
+    conditions match either on one or multiple requests (documented
+    behaviour).
+
+    The test cases with other source ports show where the
+    http_request_body_AB_only and http_response_body_CD_only signatures
+    should not match because their bodies include more than the searched
+    patterns.
+
 6.2.0-dev.99 | 2023-11-07 09:56:55 +0100
 
   * Fix unsafe and inefficient uses of copy_string (Dominik Charousset, Corelight)
diff --git a/VERSION b/VERSION
index 4f1c08cc3e..fd957554f0 100644
--- a/VERSION
+++ b/VERSION
@@ -1 +1 @@
-6.2.0-dev.99
+6.2.0-dev.102
diff --git a/src/analyzer/protocol/http/HTTP.cc b/src/analyzer/protocol/http/HTTP.cc
index 863c65ad02..4a4eaca9d0 100644
--- a/src/analyzer/protocol/http/HTTP.cc
+++ b/src/analyzer/protocol/http/HTTP.cc
@@ -69,6 +69,12 @@ void HTTP_Entity::EndOfData() {
         encoding = IDENTITY;
     }
 
+    zeek::detail::Rule::PatternType rule =
+        http_message->IsOrig() ? zeek::detail::Rule::HTTP_REQUEST_BODY : zeek::detail::Rule::HTTP_REPLY_BODY;
+
+    http_message->MyHTTP_Analyzer()->Conn()->Match(rule, reinterpret_cast<const u_char*>(""), 0, http_message->IsOrig(),
+                                                   false, true, false);
+
     if ( body_length )
         http_message->MyHTTP_Analyzer()->ForwardEndOfData(http_message->IsOrig());
 
diff --git a/testing/btest/Baseline/signatures.http-body-match/out b/testing/btest/Baseline/signatures.http-body-match/out
new file mode 100644
index 0000000000..6a8982e7ce
--- /dev/null
+++ b/testing/btest/Baseline/signatures.http-body-match/out
@@ -0,0 +1,27 @@
+### BTest baseline data generated by btest-diff. Do not edit. Use "btest -U/-u" to update. Requires BTest >= 0.63.
+HTTP body match for 192.0.2.42:13578 -> 192.88.99.42:80 with signature 'http_response_body_CD_prefix', data: 'D'
+HTTP body match for 192.0.2.42:13579 -> 192.88.99.42:80 with signature 'http_response_body_CD_only', data: ''
+HTTP body match for 192.0.2.42:13579 -> 192.88.99.42:80 with signature 'http_response_body_CD_prefix', data: 'D'
+HTTP body match for 192.0.2.42:24680 -> 192.88.99.42:80 with signature 'http_request_body_AB_only', data: ''
+HTTP body match for 192.0.2.42:24680 -> 192.88.99.42:80 with signature 'http_request_body_AB_prefix', data: 'B'
+HTTP body match for 192.0.2.42:24680 -> 192.88.99.42:80 with signature 'http_response_body_CD_only', data: ''
+HTTP body match for 192.0.2.42:24680 -> 192.88.99.42:80 with signature 'http_response_body_CD_prefix', data: 'D'
+HTTP body match for 192.0.2.42:24681 -> 192.88.99.42:80 with signature 'http_request_body_AB_prefix', data: 'B'
+HTTP body match for 192.0.2.42:24681 -> 192.88.99.42:80 with signature 'http_response_body_CD_prefix', data: 'D'
+HTTP body match for 192.0.2.42:24682 -> 192.88.99.42:80 with signature 'http_request_body_AB_only', data: ''
+HTTP body match for 192.0.2.42:24682 -> 192.88.99.42:80 with signature 'http_request_body_AB_prefix', data: 'AB'
+HTTP body match for 192.0.2.42:24682 -> 192.88.99.42:80 with signature 'http_request_body_AB_then_CD', data: 'CD'
+HTTP body match for 192.0.2.42:24682 -> 192.88.99.42:80 with signature 'http_response_body_CD_only', data: ''
+HTTP body match for 192.0.2.42:24682 -> 192.88.99.42:80 with signature 'http_response_body_CD_prefix', data: 'CD'
+HTTP body match for 192.0.2.42:33210 -> 192.88.99.42:80 with signature 'http_request_body_AB_prefix', data: 'AB'
+HTTP body match for 192.0.2.42:33210 -> 192.88.99.42:80 with signature 'http_request_body_AB_then_CD', data: 'CD'
+HTTP body match for 192.0.2.42:33210 -> 192.88.99.42:80 with signature 'http_response_body_CD_only', data: ''
+HTTP body match for 192.0.2.42:33210 -> 192.88.99.42:80 with signature 'http_response_body_CD_prefix', data: 'D'
+HTTP body match for 192.0.2.42:33211 -> 192.88.99.42:80 with signature 'http_request_body_AB_prefix', data: 'ABCD'
+HTTP body match for 192.0.2.42:33211 -> 192.88.99.42:80 with signature 'http_response_body_CD_prefix', data: 'D'
+HTTP body match for 192.0.2.42:34527 -> 192.88.99.42:80 with signature 'http_request_body_AB_only', data: ''
+HTTP body match for 192.0.2.42:34527 -> 192.88.99.42:80 with signature 'http_request_body_AB_prefix', data: 'AB'
+HTTP body match for 192.0.2.42:34527 -> 192.88.99.42:80 with signature 'http_response_body_CD_only', data: ''
+HTTP body match for 192.0.2.42:34527 -> 192.88.99.42:80 with signature 'http_response_body_CD_prefix', data: 'CD'
+HTTP body match for 192.0.2.42:34528 -> 192.88.99.42:80 with signature 'http_request_body_AB_prefix', data: 'ABCD'
+HTTP body match for 192.0.2.42:34528 -> 192.88.99.42:80 with signature 'http_response_body_CD_prefix', data: 'CDEF'
diff --git a/testing/btest/Traces/http/http-body-match.pcap b/testing/btest/Traces/http/http-body-match.pcap
new file mode 100644
index 0000000000..bc90359f1f
Binary files /dev/null and b/testing/btest/Traces/http/http-body-match.pcap differ
diff --git a/testing/btest/signatures/http-body-match.zeek b/testing/btest/signatures/http-body-match.zeek
new file mode 100644
index 0000000000..e7b944e65f
--- /dev/null
+++ b/testing/btest/signatures/http-body-match.zeek
@@ -0,0 +1,43 @@
+# @TEST-EXEC: zeek -b -r $TRACES/http/http-body-match.pcap %INPUT | sort >out
+# @TEST-EXEC: btest-diff out
+
+@load-sigs test.sig
+@load base/protocols/http
+
+@TEST-START-FILE test.sig
+signature http_request_body_AB_prefix {
+    http-request-body /^AB/
+    event "HTTP request body starting with AB"
+}
+
+signature http_request_body_AB_only {
+    http-request-body /^AB$/
+    event "HTTP request body containing AB only"
+}
+
+signature http_request_body_AB_then_CD {
+    http-request-body /AB/
+    http-request-body /CD/
+    event "HTTP request body containing AB and CD, but maybe not be on same request (documented behaviour)"
+}
+
+signature http_response_body_CD_prefix {
+    http-reply-body /^CD/
+    event "HTTP response body starting with CD"
+}
+
+signature http_response_body_CD_only {
+    http-reply-body /^CD$/
+    event "HTTP response body containing CD only"
+}
+@TEST-END-FILE
+
+event signature_match(state: signature_state, msg: string, data: string)
+{
+    print(fmt("HTTP body match for %s:%d -> %s:%d with signature '%s', data: '%s'",
+        state$conn$id$orig_h, state$conn$id$orig_p,
+        state$conn$id$resp_h, state$conn$id$resp_p,
+        state$sig_id,
+        data
+    ));
+}
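
Review note: the "Fix" and "Limitation" paragraphs in the CHANGES entry describe why an explicit end-of-data call is needed before a trailing '$' can match, and why the data seen at that point is empty. The following is a minimal sketch of that idea only. It is not Zeek's rule matcher: `ToyAnchoredMatcher` and `Feed` are hypothetical names, the pattern is reduced to a literal anchored at both ends (as in /^AB$/), and chunks are assumed to arrive in order.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical toy matcher for a pattern of the form /^LITERAL$/.
// It only mimics feeding body chunks incrementally and signalling
// the end of the body separately; it is not Zeek's implementation.
class ToyAnchoredMatcher {
public:
    explicit ToyAnchoredMatcher(std::string literal) : literal_(std::move(literal)) {}

    // Feed one chunk of body data. `eol` signals that no further data
    // will follow. Returns true once the pattern, including '$', matched.
    bool Feed(const char* data, std::size_t len, bool eol) {
        assert(data != nullptr || len == 0);

        for ( std::size_t i = 0; i < len; ++i ) {
            if ( pos_ < literal_.size() && data[i] == literal_[pos_] )
                ++pos_;
            else
                failed_ = true; // extra or different byte: the anchored pattern cannot match
        }

        // The '$' anchor can only be satisfied once the caller says the
        // body has ended. Without such a call, matched_ stays false.
        if ( eol && ! failed_ && pos_ == literal_.size() )
            matched_ = true;

        return matched_;
    }

private:
    std::string literal_;
    std::size_t pos_ = 0;
    bool failed_ = false;
    bool matched_ = false;
};
```

In this sketch, feeding "A", then "B", then an empty chunk with `eol` set to true reports a match only on the last call, when the chunk handed in is empty. That mirrors the empty data parameter described under "Limitation". Feeding "ABCD" followed by the same end-of-data call never matches, which is the only/prefix distinction the new btest checks.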