Special case HTTP 0.9 early on

Mostly, treat HTTP0.9 completely separate. Because we're doing raw
delivery of a body directly, fake enough (connection_close=1, and finish
headers manually) so that the MIME infrastructure thinks it is seeing a
body.

This deals better with the body due to accounting for the first line. Also
it avoids the content line analyzer to strip CRLF/LF and the analyzer
then adding CRLF unconditionally by fully bypassing the content line
analyzer.

Concretely, the vlan-mpls test case contains a HTTP response with LF only,
but the previous implementation would use CRLF, accounting for two many bytes.
Same for the http.no-version test which would previously report a body
length of 280 and now is at 323 (which agrees with wireshark).

Further, the mime_type detection for the http-09 test case works because
it's now seeing the full body.

Drawback: We don't extract headers when a server actually replies with
a HTTP/1.1 message, but grrr, something needs to give I guess.
This commit is contained in:
Tim Wojtulewicz 2023-03-06 16:19:37 -07:00
parent 220d8a2795
commit 0003495a9b
11 changed files with 82 additions and 23 deletions

View file

@ -7,5 +7,6 @@
#open XXXX-XX-XX-XX-XX-XX
#fields ts uid id.orig_h id.orig_p id.resp_h id.resp_p trans_depth method host uri referrer version user_agent origin request_body_len response_body_len status_code status_msg info_code info_msg tags username password proxied orig_fuids orig_filenames orig_mime_types resp_fuids resp_filenames resp_mime_types
#types time string addr port addr port count string string string string string string string count count count string count string set[enum] string string set[string] vector[string] vector[string] vector[string] vector[string] vector[string] vector[string]
XXXXXXXXXX.XXXXXX CHhAvVGS1DHFjwGM9 131.243.1.23 1035 131.243.1.10 80 1 GET - /cgi-bin/formmail.pl?email=f2@aol.com&subject=www-nrg.ee/cgi-bin/formmail.pl&recipient=unknownz@buy2save.com&msg=w00t - - - - 0 0 - - - - (empty) - - - - - - - - -
XXXXXXXXXX.XXXXXX CHhAvVGS1DHFjwGM9 131.243.1.23 1035 131.243.1.10 80 1 GET - /cgi-bin/formmail.pl?email=f2@aol.com&subject=www-nrg.ee/cgi-bin/formmail.pl&recipient=unknownz@buy2save.com&msg=w00t - 0.9 - - 0 323 0 <empty> - - (empty) - - - - - - FKWfMA1PSzh7CAB0Qj - text/html
XXXXXXXXXX.XXXXXX CHhAvVGS1DHFjwGM9 131.243.1.23 1035 131.243.1.10 80 2 - - - - - - - 0 0 - - - - (empty) - - - - - - - - -
#close XXXX-XX-XX-XX-XX-XX