RFC 7230 section 4.2.3 states that:
"A recipient SHOULD consider 'x-gzip' to be equivalent to 'gzip'"
This could lead to evasions as an attacker could use:
Content-Encoding: x-gzip
To bypass Bro's decompression.
This adds a slight patch to the HTTP analyzer, which recognizez when a connection is
upgraded to a different protocol (using a 101 reply with a few specific headers being
set).
In this case, the analyzer stops further processing of the connection (which will
result in DPD errors) and raises a new event:
event http_connection_upgrade(c: connection, protocol: string);
Protocol contains the name of the protocol that is being upgraded to, as specified in
one of the header values.
BIT-1611 #merged
* origin/topic/seth/remove-unescaped_special_char-weird:
Add urldecoding for the unofficial %u00AE style of encoding.
Remove the unescaped_special_char HTTP weird.
This weird points out a lot of benign stuff and it would
be easily reimplemented in a Bro script. This commit
also makes the minor change to update the reserved and
unreserved characters from a newer from of the URI RFC.
More specifically, this removes the functions:
strcasecmp_n
strchr_n
strrchr_n
and replaces the calls with the respective C-library calls that should
be part of just about all operating systems by now.
The change from #49 made it an error to not have a URI. That however
then led requests with an URI yet no version to abort as well.
Instead, we now check if the token following the method is an "HTTP/"
version identifier. If, so accept that the URI is empty (and trigger
a weird) but otherwise keep processing.
Adding test cases for both HTTP requests without URI and without
version.
The spec doesn't actually seem to permit these, but Seth had a (private)
pcap showing them used in the wild (and the HTTP/MIME analyzer failed to
parse content as a result).
* origin/topic/seth/more-file-type-ident-fixes:
File API updates complete.
Fixes for file type identification.
API changes to file analysis mime type detection.
Make HTTP 206 reassembly require ETags by default.
More file type identification improvements
Fix an issue with files having gaps before the bof_buffer is filled.
Fix an issue with packet loss in http file reporting.
Adding WOFF fonts to file type identification.
Extended JSON matching and added OCSP responses.
Another large signature update.
More signature updates.
Even more file type ident clean up.
Lots of fixes for file type identification.
BIT-1368 #merged
The HTTP analyzer was propogating Gaps to the files framework even
in the case of a packet drop occurring immediately after the headers
are completed in an HTTP response when the response content length
was declared to be zero (no file started, so no loss).
Includes passing test.
The logic for determining whether a gap was entirely within a MIME
entity body was not asking the current entity, which may be better able
to answer that question if it was using the Content-Range header and
thus knows if the gap exceeds the length of the body that's still
expected.
Addresses BIT-1247
As opposed to delaying until a certain-sized-buffer fills, which is
problematic because then the event becomes out of sync with the "rest of
the world". E.g. content_gap handlers being called sooner than
expected.
Addresses BIT-1240.
Singular CR or LF characters in multipart body content are no longer
converted to a full CRLF (thus corrupting the file) and it also no
longer considers the CRLF before the multipart boundary as part of the
content.
Addresses BIT-1235.
The initial file ID I think is still ambiguous and/or depends on
script-layer state tracking enough that it still needs to request a file
ID via an event at first, but once that is assigned to an HTTP (MIME)
entity, it never makes sense that it can change (so re-using a cached ID
works).
The main change is that reassembly code (e.g. for TCP) now uses
int64/uint64 (signedness is situational) data types in place of int
types in order to support delivering data to analyzers that pass 2GB
thresholds. There's also changes in logic that accompany the change in
data types, e.g. to fix TCP sequence space arithmetic inconsistencies.
Another significant change is in the Analyzer API: the *Packet and
*Undelivered methods now use a uint64 in place of an int for the
relative sequence space offset parameter.
CONNECT code.
Removal of support analyzers was broken. The code now actually doesn't
delete them immediately anymore but instead just flags them as
disabled. They'll be destroyed with the parent analyzer later.
Also includes a new leak tests exercising the CONNECT code.
Lines starting # with '#' will be ignored, and an empty message aborts
the commit. # On branch topic/robin/http-connect # Changes to be
committed: # modified: scripts/base/protocols/http/main.bro #
modified: scripts/base/protocols/ssl/consts.bro # modified:
src/analyzer/Analyzer.cc # modified: src/analyzer/Analyzer.h #
modified: src/analyzer/protocol/http/HTTP.cc # new file:
testing/btest/core/leaks/http-connect.bro # modified:
testing/btest/scripts/base/protocols/http/http-connect.bro # #
Untracked files: # .tags # changes.txt # conn.log # debug.log # diff #
mpls-in-vlan.patch # newfile.pcap # packet_filter.log # reporter.log #
src/PktSrc.cc.orig # weird.log #
* origin/topic/jsiwek/http-file-id-caching:
Revert use of HTTP file ID caching for gaps range request content.
Extend file analysis API to allow file ID caching, adapt HTTP to use it.
BIT-1125 #merged
This allows an analyzer to either provide file IDs associated with some
file content or to cache a file ID that was already determined by
script-layer logic so that subsequent calls to the file analysis
interface can bypass costly detours through script-layer. This can
yield a decent performance improvement for analyzers that are able to
take advantage of it and deal with streaming content (like HTTP).
Replaced some with InternalWarning or InternalAnalyzerError, the later
being a new method which signals the analyzer to not process further
input. Some usages I just removed if they didn't make sense or clearly
couldn't happen. Also did some minor refactors of related code while
reviewing/exploring ways to get rid of InternalError usages.
Also, for TCP content file write failures there's a new event:
"contents_file_write_failure".
Thanks to git this merge was less troublesome that I was afraid it
would be. Not all tests pass yet though (and file hashes have changed
unfortunately).
Conflicts:
cmake
doc/scripts/DocSourcesList.cmake
scripts/base/init-bare.bro
scripts/base/protocols/ftp/main.bro
scripts/base/protocols/irc/dcc-send.bro
scripts/test-all-policy.bro
src/AnalyzerTags.h
src/CMakeLists.txt
src/analyzer/Analyzer.cc
src/analyzer/protocol/file/File.cc
src/analyzer/protocol/file/File.h
src/analyzer/protocol/http/HTTP.cc
src/analyzer/protocol/http/HTTP.h
src/analyzer/protocol/mime/MIME.cc
src/event.bif
src/main.cc
src/util-config.h.in
testing/btest/Baseline/coverage.bare-load-baseline/canonified_loaded_scripts.log
testing/btest/Baseline/coverage.default-load-baseline/canonified_loaded_scripts.log
testing/btest/Baseline/istate.events-ssl/receiver.http.log
testing/btest/Baseline/istate.events-ssl/sender.http.log
testing/btest/Baseline/istate.events/receiver.http.log
testing/btest/Baseline/istate.events/sender.http.log