Commit graph

143 commits

Author SHA1 Message Date
Kshitiz Bartariya
40935c31b1 Ignore case when matching prefix in http analyzer 2025-04-25 10:33:11 -07:00
Arne Welzel
377fd711bd HTTP: Implement FlipRoles()
When Zeek flips roles of a HTTP connection subsequent to the HTTP analyzer
being attached, that analyzer would not update its own ContentLine analyzer
state, resulting in the wrong ContentLine analyzer being switched into
plain delivery mode.

In debug builds, this would result in assertion failures. In production
builds, the HTTP analyzer would receive HTTP bodies as individual header
lines or, conversely, individual header lines would be delivered as a
large chunk from the ContentLine analyzer.

PCAPs were generated locally using tcprewrite to select well-known-http ports
for both endpoints, then editcap to drop the first SYN packet.

Kudos to @JordanBarnartt for keeping at it.

Closes #3789
2024-07-04 11:38:33 +02:00
Robin Sommer
2ec44f098f Extend PIA's FirstPacket API.
`FirstPacket()` so far supported only TCP. To extend this to UDP, we
move the method into the PIA base class; give it a protocol parameter
for the case that no actual packet is available; and add the
ability to create fake UDP packets as well, not just TCP.

This whole thing is pretty ugly to begin with, and this doesn't make
it nicer, but we need this extension so that we can feed UDP data
that's tunneled over other protocols into the signature engine. Without
the fake packets, DPD signatures in particular wouldn't have anything to
match on.
2024-05-07 18:19:46 +02:00
Arne Welzel
11e0322f0f HTTP: Coverity std::move suggestion 2024-01-24 10:50:42 +01:00
Arne Welzel
4d81389df0 HTTP/CONNECT: Also weird on extra data in reply 2024-01-22 18:54:38 +01:00
Arne Welzel
de836ab528 HTTP/Upgrade: Weird when more data is available
After an HTTP upgrade to another protocol, create a weird if the packet
that contains the HTTP reply *also* contains additional data already
belonging to the upgraded-to protocol.
2024-01-22 18:54:38 +01:00
Arne Welzel
7967ef993b HTTP: Drain event queue after instantiating upgrade analyzer
With configurability through script-land comes the drawback
that we actually need to execute event handlers in the middle
of the parsing process. This might not be the best model, but
the script-side configurability it enables is kind of nice.

This explicit call only matters here when the HTTP reply is
directly followed by some WebSocket message data within the
same network packet, otherwise the queue is drained once the
packet has been completely processed anyhow.
2024-01-22 18:54:38 +01:00
Arne Welzel
8ebd054abc HTTP: Add mechanism to instantiate Upgrade analyzer
When an HTTP upgrade request/reply is detected, look up an analyzer tag
from HTTP::upgrade_analyzers or, if nothing is found, attach PIA_TCP.
2024-01-22 18:54:38 +01:00
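
The lookup-with-fallback described in 8ebd054abc can be sketched roughly as follows. This is a Python model, not Zeek's C++ or script code. Only HTTP::upgrade_analyzers and PIA_TCP come from the commit message; the table entry, the function name, and the case-insensitive comparison are assumptions made for illustration.

```python
# Hypothetical stand-in for the script-level HTTP::upgrade_analyzers table.
# The single entry below is illustrative only.
UPGRADE_ANALYZERS = {"websocket": "WEBSOCKET"}

# Analyzer attached when the table has no entry for the protocol.
FALLBACK_ANALYZER = "PIA_TCP"


def select_upgrade_analyzer(upgrade_protocol, table=UPGRADE_ANALYZERS):
    """Return the analyzer tag for an upgraded connection, or the fallback."""
    # Assumption: protocol names are compared case-insensitively.
    key = upgrade_protocol.strip().lower()
    return table.get(key, FALLBACK_ANALYZER)
```

Under these assumptions, an Upgrade value with a table entry selects the configured analyzer, and any other value falls through to PIA_TCP.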
Arne Welzel
8a13155a41 Merge branch 'topic/xb-anssi/http_signature_body_end_match' of https://github.com/xb-anssi/zeek
* 'topic/xb-anssi/http_signature_body_end_match' of https://github.com/xb-anssi/zeek:
  Let signature framework match HTTP body end
  Test how the signature framework matches HTTP body
2023-11-07 09:58:59 +01:00
xb-anssi
9e61bfd010 Let signature framework match HTTP body end
The HTTP analyzer never tells the signature framework when the body of a
request or a response ends, so any signature regex ending in a '$' used
in an 'http-request-body' or in an 'http-reply-body' condition will
never match.

This made it impossible to write a signature which could distinguish an
HTTP body consisting only of something from an HTTP body prefixed by
that same something.

- Fix:

The fix notifies the signature framework on EndOfData() that there will
be no further data to match for this body by giving it an empty buffer
of length 0 with the eol parameter set to true and all others set to
false. This lets it reach the '$' state in its DFA, and doesn't affect
other documented HTTP match behaviours.

- Limitation:

Since the signature framework doesn't appear to keep previously consumed
data on hand, any match of an http-*-body condition whose pattern ends
with a '$' will lead to an empty data parameter being passed to the
signature_match() event because the body data is no longer available
when EndOfData() happens.

Due to segmentation, there is in any case no guarantee that the data parameter
would have held the entire match even without the '$', since the data
parameter only receives the last chunk of data which completed the match
condition, as can be seen on prefix matches in the btest cases where the
matching data spans multiple segments (the event gives 'B' and not
'AB'), so this is only an extreme case of partial data being given to
that event.
2023-11-03 15:28:24 +01:00
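
The effect of the end-of-body notification in 9e61bfd010 can be illustrated with a toy streaming matcher. This is a Python model under simplifying assumptions, not the signature framework: the class and method names are hypothetical, and the model buffers the data it has seen, whereas the commit message says the real framework does not appear to keep consumed data. It only covers patterns that end in '$'.

```python
import re


class BodyMatcher:
    """Toy streaming matcher for a '$'-terminated pattern such as ^AB$."""

    def __init__(self, pattern):
        self._regex = re.compile(pattern.encode())
        # Simplification: a real DFA keeps matcher state, not the data itself.
        self._seen = b""
        self.matched = False

    def feed(self, data, eol=False):
        """Consume one chunk; eol=True signals that the body has ended."""
        self._seen += data
        # A pattern ending in '$' can only be decided once the end is known.
        if eol and self._regex.fullmatch(self._seen):
            self.matched = True


matcher = BodyMatcher("^AB$")
matcher.feed(b"A")
matcher.feed(b"B")
before_end = matcher.matched  # still False: the end was never signalled
matcher.feed(b"", eol=True)   # empty buffer with eol set, as in the fix
after_end = matcher.matched
```

In this model the match is only reported while processing the final, empty chunk, which mirrors the limitation that signature_match() receives an empty data parameter.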
Benjamin Bannier
f5a76c1aed Reformat Zeek in Spicy style
This largely copies over Spicy's `.clang-format` configuration file. The
one place where we deviate is header include order since Zeek depends on
headers being included in a certain order.
2023-10-30 09:40:55 +01:00
Arne Welzel
7a043e5e8f all: Fix typos identified by typos pre-commit hook 2023-06-13 17:57:32 +02:00
Arne Welzel
c29b98b224 Merge remote-tracking branch 'origin/topic/awelzel/http-content-range-parsing-robustness'
* origin/topic/awelzel/http-content-range-parsing-robustness:
  HTTP: Make Content-Range parsing more robust
2023-03-13 18:41:16 +01:00
Arne Welzel
b21e6f72da HTTP: Make Content-Range parsing more robust
This was exposed by OSS-Fuzz after the HTTP/0.9 changes in zeek/zeek#2851:
We do not check the result of parsing the from and last bytes of a
Content-Range header and would reference uninitialized values on the stack
if these were not valid.

This doesn't seem as bad as it sounds outside of yielding nonsensical values:
If the result was negative, we weird/bailed. If the result was positive, we
already had to treat it with suspicion anyway and the SetPlainDelivery()
logic accounts for that.
2023-03-13 18:00:39 +01:00
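
The robustness fix in b21e6f72da amounts to checking every parsed number before using it. A minimal Python sketch of that idea follows. It is not the analyzer's C++ code: the function name is hypothetical, and the grammar is simplified to "bytes first-last/length" with plain digits, so forms such as an unknown length ("*") are rejected here.

```python
import re
from typing import Optional, Tuple

# Simplified grammar: "bytes <first>-<last>/<length>", ASCII digits only.
_CONTENT_RANGE = re.compile(r"bytes\s+(\d+)-(\d+)/(\d+)", re.ASCII | re.IGNORECASE)


def parse_content_range(value: str) -> Optional[Tuple[int, int, int]]:
    """Return (first, last, length), or None if any part fails to parse."""
    match = _CONTENT_RANGE.fullmatch(value.strip())
    if match is None:
        # Nothing was parsed, so there are no values to hand back.
        return None
    first, last, length = (int(group) for group in match.groups())
    # Reject ranges that parse as numbers but make no sense.
    if first > last or last >= length:
        return None
    return first, last, length
```

Returning None forces the caller to handle the failure explicitly instead of continuing with whatever happened to be in the output variables.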
Arne Welzel
fbf9d53c44 HTTP: Reset reply_message for HTTP/0.9
OSS-Fuzz tickled an assert when sending a HTTP response before a HTTP/0.9
request. Avoid this by resetting reply_message upon seeing a HTTP/0.9 request.

PCAP was generated artificially: Server sending a reply providing a
Content-Length. Because HTTP/0.9 processing would remove the ContentLine
support analyzer, more data was delivered to the HTTP_Message than
expected, triggering an assert.

This is a follow-up for zeek/zeek#2851.
2023-03-13 14:13:50 +01:00
Tim Wojtulewicz
9cb6de7447 Add weird for unknown HTTP/0.9 request method 2023-03-10 15:45:11 -07:00
Tim Wojtulewicz
0003495a9b Special case HTTP 0.9 early on
Mostly, treat HTTP/0.9 completely separately. Because we're doing raw
delivery of a body directly, fake enough state (connection_close=1, and
finish headers manually) so that the MIME infrastructure thinks it is
seeing a body.

This deals better with the body because it accounts for the first line. It
also fully bypasses the content line analyzer, so that analyzer no longer
strips CRLF/LF only for the analyzer to then add CRLF back
unconditionally.

Concretely, the vlan-mpls test case contains an HTTP response with LF only,
but the previous implementation would use CRLF, accounting for too many bytes.
The same goes for the http.no-version test, which would previously report a
body length of 280 and is now at 323 (which agrees with Wireshark).

Further, the mime_type detection for the http-09 test case works because
it's now seeing the full body.

Drawback: We don't extract headers when a server actually replies with
a HTTP/1.1 message, but grrr, something needs to give I guess.
2023-03-10 09:52:34 -07:00
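
The byte-count difference between line-based and raw delivery mentioned in 0003495a9b can be shown with a small Python model. The function names and the sample body are assumptions for illustration. The sketch does not reproduce the 280 and 323 figures from the test cases; it only shows how an LF-only body is over-counted when each line ending is stripped and replaced by CRLF.

```python
def length_via_line_delivery(body):
    """Model of the old path: strip each line ending, then re-add CRLF."""
    total = 0
    for line in body.splitlines():  # drops LF and CRLF alike
        total += len(line) + len(b"\r\n")
    return total


def length_via_raw_delivery(body):
    """Model of the new path: count the bytes exactly as they arrived."""
    return len(body)


# Illustrative body that uses LF-only line endings.
lf_only_body = b"<html>\nhello\n</html>\n"
line_length = length_via_line_delivery(lf_only_body)  # 24
raw_length = length_via_raw_delivery(lf_only_body)    # 21
```

Each of the three LF-terminated lines gains one byte in the line-based model, which is the kind of over-count the commit message describes.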
Tim Wojtulewicz
220d8a2795 Remove a couple unnecessary break statements 2023-03-10 09:52:34 -07:00
Arne Welzel
71bcd15d2e analyzer/http: Do not assume char is signed
On aarch64, char is unsigned, so is_HTTP_token_char() allowed
non-ASCII stuff with the high-bit set.

Fixes part of #2742
2023-02-02 14:57:57 +01:00
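
Why char signedness matters in 71bcd15d2e can be modelled in Python. This is a sketch under assumptions, not the body of is_HTTP_token_char(): the lower-bound-only check below is a hypothetical stand-in for a test that implicitly relies on high-bit bytes being negative, and it ignores the separator characters a real token check must also exclude.

```python
def as_char(byte, char_is_signed):
    """Value a C `char` holding this byte would have on the given platform."""
    if char_is_signed and byte >= 0x80:
        return byte - 256
    return byte


def naive_is_token_char(c):
    """Hypothetical check that only rejects control characters and DEL."""
    return c > 31 and c != 127


def portable_is_token_char(c):
    """Same check with an explicit upper bound, independent of signedness."""
    return 31 < c < 127
```

With a signed char, the byte 0xE9 becomes -23 and the naive check rejects it by accident. With an unsigned char it stays 233 and slips through, while the bounded check rejects it in both cases.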
Arne Welzel
3af6b97c63 analyzers/http: Update request_version on subsequent SetVersion() calls
The #124 PR introduced special treatment when HTTP version 0.9
was set. With #127, a reproducer that set HTTP/1.0 in the first
request was created and subsequent requests wouldn't reset to
HTTP version 0.9.

This is subtle, but it doesn't seem like things fall apart.

Improves runtime from 20 seconds to 2 seconds for the given
reproducer.

Fixes #127.
2023-01-26 19:59:02 +01:00
Arne Welzel
76ba9d4698 ContentLine: Fix spelling of "suppress", deprecate SupressWeirds()
Closes #2547
2022-12-02 12:40:47 +01:00
Arne Welzel
540fe7aff7 http: Heuristic around rejecting malformed HTTP/0.9 traffic
OSS-Fuzz generated "HTTP traffic" containing 250k+ sequences of "T<space>\r\r"
which Zeek then logged as individual HTTP requests. Add a heuristic to bail
on such request lines. It's a bit specific to the test case, but should work.

There are more issues around handling HTTP/0.9, e.g. triggering
"not a http reply line" when HTTP/0.9 never had such a thing, but
I don't think that's worth fixing up.

Fixes #119
2022-11-18 18:19:58 +01:00
Tim Wojtulewicz
2739275b88 Merge remote-tracking branch 'jsoref/spelling-src'
* jsoref/spelling-src:
  Spelling src
2022-11-11 12:49:15 -07:00
Josh Soref
cd201aa24e Spelling src
These are non-functional changes.

* accounting
* activation
* actual
* added
* addresult
* aggregable
* aligned
* alternatively
* ambiguous
* analysis
* analyzer
* anticlimactic
* apparently
* application
* appropriate
* arithmetic
* assignment
* assigns
* associated
* authentication
* authoritative
* barrier
* boundary
* broccoli
* buffering
* caching
* called
* canonicalized
* capturing
* certificates
* ciphersuite
* columns
* communication
* comparison
* comparisons
* compilation
* component
* concatenating
* concatenation
* connection
* convenience
* correctly
* corresponding
* could
* counting
* data
* declared
* decryption
* defining
* dependent
* deprecated
* detached
* dictionary
* directional
* directly
* directory
* discarding
* disconnecting
* distinguishes
* documentation
* elsewhere
* emitted
* empty
* endianness
* endpoint
* enumerator
* essentially
* evaluated
* everything
* exactly
* execute
* explicit
* expressions
* facilitates
* fiddling
* filesystem
* flag
* flagged
* for
* fragments
* guarantee
* guaranteed
* happen
* happening
* hemisphere
* identifier
* identifies
* identify
* implementation
* implemented
* implementing
* including
* inconsistency
* indeterminate
* indices
* individual
* information
* initial
* initialization
* initialize
* initialized
* initializes
* instantiate
* instantiated
* instantiates
* interface
* internal
* interpreted
* interpreter
* into
* it
* iterators
* length
* likely
* log
* longer
* mainly
* mark
* maximum
* message
* minimum
* module
* must
* name
* namespace
* necessary
* nonexistent
* not
* notifications
* notifier
* number
* objects
* occurred
* operations
* original
* otherwise
* output
* overridden
* override
* overriding
* overwriting
* ownership
* parameters
* particular
* payload
* persistent
* potential
* precision
* preexisting
* preservation
* preserved
* primarily
* probably
* procedure
* proceed
* process
* processed
* processes
* processing
* propagate
* propagated
* prototype
* provides
* publishing
* purposes
* queue
* reached
* reason
* reassem
* reassemble
* reassembler
* recommend
* record
* reduction
* reference
* regularly
* representation
* request
* reserved
* retrieve
* returning
* separate
* should
* shouldn't
* significant
* signing
* simplified
* simultaneously
* single
* somebody
* sources
* specific
* specification
* specified
* specifies
* specify
* statement
* subdirectories
* succeeded
* successful
* successfully
* supplied
* synchronization
* tag
* temporarily
* terminating
* that
* the
* transmitted
* true
* truncated
* try
* understand
* unescaped
* unforwarding
* unknown
* unknowndata
* unspecified
* update
* usually
* which
* wildcard

Signed-off-by: Josh Soref <2119212+jsoref@users.noreply.github.com>
2022-11-09 12:08:15 -05:00
Arne Welzel
6ef9423f3c analyzer/HTTP: Call TCP_ApplicationAnalyzer::Done() after RequestMade()/ReplyMade() 2022-11-08 16:44:42 -07:00
Tim Wojtulewicz
7c4fd382d9 Code modernization: Convert from deprecated C standard library headers 2022-06-27 09:47:31 -07:00
Tim Wojtulewicz
612212568a Add analyzer_confirmation and analyzer_violation events 2021-11-23 19:36:50 -07:00
Tim Wojtulewicz
9af6b2f48d clang-format: Set penalty for breaking after assignment operator 2021-09-27 10:49:48 -07:00
Tim Wojtulewicz
4423574d26 clang-format: Set IndentCaseBlocks to false 2021-09-27 10:49:48 -07:00
Tim Wojtulewicz
9cb54f5d44 clang-format: Force zeek-config.h to be earlier in the config ordering 2021-09-25 11:52:55 -07:00
Tim Wojtulewicz
b2f171ec69 Reformat the world 2021-09-16 15:35:39 -07:00
jerome Grandvalet
83f4903250 Fix case where HTTP headers span several packets 2021-07-26 15:58:14 +02:00
jerome Grandvalet
8cabecec40 Fix HTTP evasion
- Happens when there is no CRLF at the end of HTTP
    - Fixed by adding CRLF when the packet is complete (relative to the Content-Length header)
2021-07-23 09:28:29 +02:00
Vern Paxson
245108e86e remove unnecessary casts, and change necessary ones to use static_cast<> 2021-03-18 13:24:25 -07:00
Vern Paxson
62bab66114 migration to using new differentiated methods for setting record fields 2021-02-25 16:59:26 -08:00
Jon Siwek
c44cbe1feb Prefix #includes of .bif.h files with zeek/
This enables locating the headers within the install-tree using the
dirs provided by `zeek-config --include_dir`.

To enable locating these headers within the build-tree, this change also
creates a 'build/src/include/zeek -> ..' symlink.
2021-02-02 19:15:05 -08:00
Jon Siwek
8a8a983c49 Add missing zeek/ to header includes
Related to https://github.com/zeek/zeek/pull/1377
2021-01-29 19:16:29 -08:00
Tim Wojtulewicz
e27008ef26 GH-1184: Add 'source' field to weird log denoting where the weird was reported 2020-12-01 09:34:37 -07:00
Tim Wojtulewicz
5589484f26 Fix includes of bif.h and _pac.h files to use full paths inside build directory 2020-11-12 12:15:26 -07:00
Tim Wojtulewicz
96d9115360 GH-1079: Use full paths starting with zeek/ when including files 2020-11-12 12:15:26 -07:00
Tim Wojtulewicz
70c2397f69 Plugins: Clean up explicit uses of namespaces in places where they're not necessary.
This commit covers all of the plugin classes.
2020-08-24 12:07:03 -07:00
Tim Wojtulewicz
0ac3fafe13 Move zeek::net namespace to zeek::run_state namespace.
This also moves all of the code from Net.{h,cc} to RunState.{h,cc} and marks Net.h as deprecated
2020-08-20 16:11:47 -07:00
Tim Wojtulewicz
a34e632eef Move NetVar from zeek to zeek::detail namespace 2020-08-20 16:11:46 -07:00
Tim Wojtulewicz
8d2d867a65 Move everything in util.h to zeek::util namespace.
This commit includes renaming a number of methods prefixed with bro_ to be prefixed with zeek_.
2020-08-20 16:00:33 -07:00
Tim Wojtulewicz
e7c6d51ae7 Move the functions and variables in Net.h to the zeek::net namespace. This includes moving network_time out of util.h. 2020-08-20 15:55:17 -07:00
Tim Wojtulewicz
715ca6549b Move the remainder of the analyzers to zeek namespaces 2020-08-20 15:55:17 -07:00
Tim Wojtulewicz
914ffcadae Move arp, tcp, udp, pia, and stepping stone analyzers 2020-08-20 15:55:17 -07:00
Tim Wojtulewicz
14408235b8 Move file_analysis code to zeek namespaces 2020-08-20 15:55:17 -07:00
Jon Siwek
363b167bd2 GH-1100: Fix reported body-length of HTTP messages w/ sub-entities
The body-lengths of sub-entities, like multipart messages, got counted
twice by mistake: once upon the end of the sub-entity and then again
upon the end of the top-level entity that contains all sub-entities.
The size of just the top-level entity is the correct one to use.
2020-08-04 14:21:03 -07:00
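
The double counting fixed in 363b167bd2 can be sketched with a toy entity tree. This is a Python model, not the MIME/HTTP entity classes in Zeek; the Entity type and both function names are hypothetical, and the model assumes each entity's size already includes the bytes of its nested sub-entities.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Entity:
    """Toy MIME entity whose size already includes all nested sub-entities."""
    size: int
    children: List["Entity"] = field(default_factory=list)


def double_counted_length(top):
    """Buggy model: add a length at the end of every entity, nested or not."""
    return top.size + sum(double_counted_length(child) for child in top.children)


def body_length(top):
    """Fixed model: only the top-level entity's size is reported."""
    return top.size


# Illustrative multipart message: 100 bytes in total, two parts inside it.
message = Entity(100, [Entity(40), Entity(50)])
buggy = double_counted_length(message)  # 190
fixed = body_length(message)            # 100
```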
Tim Wojtulewicz
7fefdd97af Move Conn and related types to zeek namespace 2020-07-31 16:25:54 -04:00