* origin/topic/vern/script-opt.Nov23:
retention of superseded AST elements to prevent pointer mis-aliasing
BTest updates for latest ZAM maintenance
greater ZAM optimization of inlined function calls
some minor ZAM optimization improvements
added "-O noinline" option to turn off ZAM inlining, to help with diagnosing optimization problems
fixes for a number of ZAM optimization bugs
allow explicitly marking an identifier as equivalent to special '_' identifier
fixed some warnings about mixing signed & unsigned integers
descriptions of "for" statements now include their "value variable" if present
```
## Tells Zeek to skip sending any further input data to the current analyzer.
## This is supported for protocol and file analyzers.
public function skip_input() : void;
```
Closes #3443.
* origin/topic/awelzel/deprecate-things-for-7.1:
Bump zeekctl
EventHandler: Deprecate SetUsed() and Used() as well.
EventRegistry: Deprecate UsedHandlers() and UnusedHandlers()
time machine: Mark leftovers for removal in v7.1
policy/misc/load-balancing: Deprecate script
cluster: Deprecate the Cluster::Node$interface field
The latter does not seem to be used outside of the functions that were
deprecated in the previous commit, and since UsageAnalyzer does not make
use of this information, it is unclear why we should keep it around.
Relates to #3187.
and check_for_unused_event_handlers: UsageAnalyzer is more thorough
and the previous ones weren't extended to work with &is_used and
should probably be considered superseded by the UsageAnalyzer even
if that currently does not provide a public API and just prints
out deprecation warnings.
I'm also tempted to deprecate SetUsed() and Used() of EventHandler
for the same reason.
Closes #3187.
This field isn't required by a worker and it's certainly not used by a
worker to listen on that specific interface. It also isn't required to
be set consistently, and its use in-tree is limited to the old
load-balancing script.
There's a bif called packet_source() which, on a worker, provides
information about the packet source that is actually in use.
Relates to zeek/zeek#2877.
* 'topic/xb-anssi/http_signature_body_end_match' of https://github.com/xb-anssi/zeek:
Let signature framework match HTTP body end
Test how the signature framework matches HTTP body
The HTTP analyzer never tells the signature framework when the body of a
request or a response ends, so any signature regex ending in a '$' used
in an 'http-request-body' or in an 'http-reply-body' condition will
never match.
This made it impossible to write a signature which could distinguish an
HTTP body consisting only of something from an HTTP body prefixed by
that same something.
- Fix:
The fix notifies the signature framework on EndOfData() that there will
be no further data to match for this body by giving it an empty buffer
of length 0 with the eol parameter set to true and all others set to
false. This lets it reach the '$' state in its DFA, and doesn't affect
other documented HTTP match behaviours.
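The effect of the end-of-data notification can be illustrated with a small
standalone sketch. It is not Zeek's matcher: `AnchoredLiteralMatcher`,
`Feed()` and `Matched()` are hypothetical names, and the class only handles
a pattern of the form `^literal$`. It shows why an empty chunk with the
end flag set is enough to let an end-anchored pattern match.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <utility>

// Hypothetical incremental matcher for a pattern of the form ^literal$.
// It consumes the input chunk by chunk and keeps no previously seen data.
class AnchoredLiteralMatcher {
public:
    explicit AnchoredLiteralMatcher(std::string literal) : literal_(std::move(literal)) {}

    // Feed the next chunk. eol == true signals that no further data follows.
    // Returns true once the whole input equals the literal AND the end was signalled.
    bool Feed(std::string_view chunk, bool eol) {
        for (char c : chunk) {
            if (failed_)
                break;
            if (pos_ < literal_.size() && literal_[pos_] == c)
                ++pos_;
            else
                failed_ = true; // mismatch, or data beyond the literal
        }
        if (eol)
            ended_ = true;
        return Matched();
    }

    bool Matched() const { return !failed_ && ended_ && pos_ == literal_.size(); }

private:
    std::string literal_;
    std::size_t pos_ = 0;
    bool failed_ = false;
    bool ended_ = false;
};

inline void anchored_matcher_demo() {
    // Body "AB" delivered in two segments: no match until the end is signalled.
    AnchoredLiteralMatcher only("AB");
    assert(!only.Feed("A", false));
    assert(!only.Feed("B", false));
    assert(only.Feed("", true)); // empty buffer, end flag set

    // Body "ABCD": "AB" is only a prefix, so the anchored pattern must not match.
    AnchoredLiteralMatcher prefix("AB");
    assert(!prefix.Feed("ABCD", false));
    assert(!prefix.Feed("", true));
}
```

Note that the final call carries no data, which mirrors the limitation
described below: whatever completes the match at that point is empty.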
- Limitation:
Since the signature framework doesn't appear to keep previously consumed
data on hand, any match of an http-*-body condition whose pattern ends
with a '$' will lead to an empty data parameter being passed to the
signature_match() event because the body data is no longer available
when EndOfData() happens.
Because of segmentation, there is no guarantee that the data parameter
would have held the entire match even without the '$': it only receives
the last chunk of data which completed the match condition. This can be
seen in the prefix matches in the btest cases where the matching data
spans multiple segments (the event gives 'B' and not 'AB'). The empty
data parameter is therefore only an extreme case of partial data being
given to that event.
This adds a signatures/http-body-match btest to verify how the signature
framework matches HTTP body in requests and responses.
It currently fails because the 'http-request-body' and 'http-reply-body'
clauses never match anything when there is a '$' in their regular
expressions.
The other pattern clauses such as the 'payload' clause do not suffer
from that restriction and it is not documented as a limitation of HTTP
body pattern clauses either, so it is probably a bug.
The "http-body-match" btest shows that without a fix any signature
which ends with a '$' in an http-request-body or http-reply-body rule
will never raise a signature_match() event, and that signatures which do
not end with a '$' cannot distinguish an HTTP body prefixed by the
matching pattern (ex: ABCD) from an HTTP body consisting entirely of the
matching pattern (ex: AB).
Test cases by source port:
- 13579:
- GET without body, plain res body (CD, only)
- 13578:
- GET without body, plain res body (CDEF, prefix)
- 24680:
- POST plain req body (AB, only), plain res body (CD, only)
- 24681:
- POST plain req body (ABCD, prefix), plain res body (CDEF, prefix)
- 24682:
- POST gzipped req body (AB, only), gzipped res body (CD, only)
- POST plain req body (CD, only), plain res body (EF, only)
- 33210:
- POST multipart plain req body (AB;CD;EF, prefix)
- plain res body (CD, only)
- 33211:
- POST multipart plain req body (ABCD;EF, prefix)
- plain res body (CDEF, prefix)
- 34527:
- POST chunked gzipped req body (AB, only)
- chunked gzipped res body (CD, only)
- 34528:
- POST chunked gzipped req body (ABCD, prefix)
- chunked gzipped res body (CDEF, prefix)
The tests with source ports 24680, 24682 and 34527 should
match the signature http_request_body_AB_only and the signature
http_request_body_AB_prefix, but they only match the latter.
The tests with source ports 13579, 24680, 24682, 33210 and 34527 should
match the signature http_response_body_CD_only and the signature
http_response_body_CD_prefix, but they only match the latter.
The tests with source ports 24680, 24681, 33210 and 33211 show how the
http_request_body_AB_then_CD signature with two http-request-body
conditions matches on either one or multiple requests (documented
behaviour).
The test cases with other source ports show where the
http_request_body_AB_only and http_response_body_CD_only signatures
should not match because their bodies include more than the searched
patterns.
Add a new overload to `copy_string` that takes the input characters plus
size. The new overload avoids inefficient scanning of the input for the
null terminator in cases where we know the size beforehand. Furthermore,
this overload *must* be used when dealing with input character sequences
that may have no null terminator, e.g., when the input is from a
`std::string_view` object.
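A minimal sketch of what such a sized overload might look like follows. It
is an illustration under assumptions, not the code from this change: the
`sketch` namespace, the `new[]`/`delete[]` ownership convention and the
demo function are all hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string_view>

namespace sketch {

// Copies exactly `len` characters and appends a null terminator.
// The caller owns the result and releases it with delete[] (assumed convention).
inline char* copy_string(const char* str, std::size_t len) {
    char* out = new char[len + 1];
    if (len > 0)
        std::memcpy(out, str, len); // no scan for '\0' needed
    out[len] = '\0';
    return out;
}

// Null-terminated variant: has to scan the input once to find its length.
inline char* copy_string(const char* str) {
    return copy_string(str, std::strlen(str));
}

} // namespace sketch

inline void copy_string_demo() {
    std::string_view whole = "http-request-body";
    // A substring view is not null-terminated after its last character,
    // so only the sized overload is safe here.
    std::string_view part = whole.substr(0, 4);
    char* copy = sketch::copy_string(part.data(), part.size());
    assert(std::strcmp(copy, "http") == 0);
    delete[] copy;
}
```

Passing `part.data()` to the single-argument variant would copy the whole
underlying string rather than the four characters of the view.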
* origin/topic/awelzel/3379-shared-ptr-and-micro-optimizations:
build_inner_connection: Use the outer packet's timestamp
build_inner_connection: Avoid one extra Init()
packet_analysis: Do not run DetectProtocol() on disabled analyzers
packet_analysis/Dispatcher: Do not index table twice
packet_analysis: Avoid shared_ptr copying for analyzer lookups
Packet::Init() is not as cheap as one might think: it computes a
timestamp from { 0, 0 } using double division. Avoid this by not
initializing an empty Packet.
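The two lookup-related changes in this merge (no shared_ptr copies, no
double table indexing) follow a common C++ pattern, sketched below. This is
not the packet_analysis code: `SketchAnalyzer`, `SketchDispatcher`,
`Register()` and `Lookup()` are hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <unordered_map>
#include <utility>

struct SketchAnalyzer {
    bool enabled = true;
};

using SketchAnalyzerPtr = std::shared_ptr<SketchAnalyzer>;

// Hypothetical dispatcher mapping an identifier to an analyzer.
class SketchDispatcher {
public:
    void Register(std::uint32_t id, SketchAnalyzerPtr a) { table_[id] = std::move(a); }

    // Returns a const reference, so the lookup itself does not touch the
    // reference count. A single find() replaces a contains-then-index pair.
    const SketchAnalyzerPtr& Lookup(std::uint32_t id) const {
        static const SketchAnalyzerPtr none;
        auto it = table_.find(id);
        return it == table_.end() ? none : it->second;
    }

private:
    std::unordered_map<std::uint32_t, SketchAnalyzerPtr> table_;
};

inline void dispatcher_demo() {
    SketchDispatcher d;
    d.Register(1, std::make_shared<SketchAnalyzer>());

    const SketchAnalyzerPtr& ref = d.Lookup(1);
    assert(ref.use_count() == 1); // binding a reference made no copy

    SketchAnalyzerPtr copy = d.Lookup(1);
    assert(copy.use_count() == 2); // copying by value bumps the count

    assert(!d.Lookup(2)); // unknown id yields an empty pointer
}
```

Each shared_ptr copy costs an atomic increment and decrement, which adds up
when a lookup happens for every packet.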