TCP_Reassembler can now keep a history of old TCP segments using the
`tcp_max_old_segments` option. A value of zero will disable it.
An overlapping segment with different data can indicate a possible
TCP injection attack. The rexmit_inconsistency event will fire if this
is the case.
* origin/topic/seth/more-file-type-ident-fixes:
File API updates complete.
Fixes for file type identification.
API changes to file analysis mime type detection.
Make HTTP 206 reassembly require ETags by default.
More file type identification improvements
Fix an issue with files having gaps before the bof_buffer is filled.
Fix an issue with packet loss in http file reporting.
Adding WOFF fonts to file type identification.
Extended JSON matching and added OCSP responses.
Another large signature update.
More signature updates.
Even more file type ident clean up.
Lots of fixes for file type identification.
BIT-1368 #merged
Under remote communication overload conditions, the child->parent
chunked IO may start rejecting chunks if over the hard cap. Some
messages are made of two chunks, accepting the first part, but rejecting
the second can put the parent in a bad state and the next two chunks it
reads are likely to cause the error.
This patch just removes the rejecting functionality completely and so
now relies solely on shutting down remote peer connections to help
alleviate temporary overload conditions. The
"chunked_io_buffer_soft_cap" script variable can now tune when this
shutting down starts happening and the default setting is now double
what it used to be. For constant overload conditions, communication.log
should keep stating "queue to parent filling up; shutting down heaviest
connection".
An alternative to completely removing the hard cap rejection code could
be ensuring that messages that involve a pair of chunks can never have
the second chunk be rejected when attempting to write it.
Addresses BIT-1376
Removed "file_mime_type" and "file_mime_types" event, replacing them
with a new event called "file_metadata_inferred". It has a record
argument of type "inferred_file_metadata", which contains the mime type
information that the earlier events used to supply. The idea here is
that future extensions to the record with new metadata will be less
likely to break user code than the alternatives (adding new events or
new event parameters).
Addresses BIT-1368.
* origin/topic/jsiwek/file-signatures:
File type detection changes and fix https.log {orig,resp}_fuids fields.
Various minor changes related to file mime type detection.
Refactor common MIME magic matching code.
Replace libmagic w/ Bro signatures for file MIME type identification.
Conflicts:
scripts/base/init-default.bro
testing/btest/Baseline/coverage.bare-load-baseline/canonified_loaded_scripts.log
testing/btest/Baseline/coverage.default-load-baseline/canonified_loaded_scripts.log
BIT-1143 #merged
Notable changes:
- libmagic is no longer used at all. All MIME type detection is
done through new Bro signatures, and there's no longer a means to get
verbose file type descriptions (e.g. "PNG image data, 1435 x 170").
The majority of the default file magic signatures are derived
from the default magic database of libmagic ~5.17.
- File magic signatures consist of two new constructs in the
signature rule parsing grammar: "file-magic" gives a regular
expression to match against, and "file-mime" gives the MIME type
string of content that matches the magic and an optional strength
value for the match.
- Modified signature/rule syntax for identifiers: they can no longer
start with a '-', which made for ambiguous syntax when doing negative
strength values in "file-mime". Also brought syntax for Bro script
identifiers in line with reality (they can't start with numbers or
include '-' at all).
- A new Built-In Function, "file_magic", can be used to get all
file magic matches and their corresponding strength against a given
chunk of data
- The second parameter of the "identify_data" Built-In Function
can no longer be used to get verbose file type descriptions, though it
can still be used to get the strongest matching file magic signature.
- The "file_transferred" event's "descr" parameter no longer
contains verbose file type descriptions.
- The BROMAGIC environment variable no longer changes any behavior
in Bro as magic databases are no longer used/installed.
- Reverted back to minimum requirement of CMake 2.6.3 from 2.8.0
(it's back to being the same requirement as the Bro v2.2 release).
The bump was to accomodate building libmagic as an external project,
which is no longer needed.
Addresses BIT-1143.
The event now really returns the extension. If openssl supports printing
it, it is converted into the openssl ascii output.
The output does not always look pretty because it can contain newlines.
New event syntax:
event x509_extension(c: connection, is_orig: bool, cert:X509, extension: X509_extension_info)
Example output for extension:
[name=X509v3 Extended Key Usage,
short_name=extendedKeyUsage,
oid=2.5.29.37,
critical=F,
value=TLS Web Server Authentication, TLS Web Client Authentication]
[name=X509v3 Certificate Policies,
short_name=certificatePolicies,
oid=2.5.29.32,
critical=F,
value=Policy: 1.3.6.1.4.1.6449.1.2.1.3.4^J CPS: https://secure.comodo.com/CPS^J]
in an easily readable form.
This is for debugging purposes, obviously.
Example, including only SMTP events:
> bro -r smtp.trace misc/dump-events.bro DumpEvents::include=/smtp/
[...]
1254722768.219663 smtp_reply
[0] c: connection = [id=[orig_h=10.10.1.4, orig_p=1470/tcp, resp_h=74.53.140.153, [...]
[1] is_orig: bool = F
[2] code: count = 220
[3] cmd: string = >
[4] msg: string = xc90.websitewelcome.com ESMTP Exim 4.69 #1 Mon, 05 Oct 2009 01:05:54 -0500
[5] cont_resp: bool = T
1254722768.219663 smtp_reply
[0] c: connection = [id=[orig_h=10.10.1.4, orig_p=1470/tcp, resp_h=74.53.140.153, [...]
[1] is_orig: bool = F
[2] code: count = 220
[3] cmd: string = >
[4] msg: string = We do not authorize the use of this system to transport unsolicited,
[5] cont_resp: bool = T
[...]
BIT-1048 #merged
I'm reverting the serializer version update for now as that breaks
Broccoli. Let's do that later for 2.2.
* topic/robin/topk-merge:
update documentation, rename get* to Get* and make hasher persistent
adapt to new folder structure
fix opaqueval-related memleak
synchronize pruned attribute
potentially found wrong Ref.
add sum function that can be used to get the number of total observed elements.
in cluster settings, the resultvals can apparently been uninitialized in some special cases
fix memory leaks
fix warnings
add topk cluster test
make size of topk-list configureable when using sumstats
implement merging for top-k.
add serialization for topk
make the get function const
topk for sumstats
well, a test that works..
implement topk.
This commit adds support for script-level specification of a seed to be used by
hashers. For example, if the given name of a Bloom filter is not empty, then
the seed used by the underlying hasher only depends on the Bloom filter name.
If the name is empty, we check whether the user defined a non-empty
global_hash_seed string variable at script and use it instead. If that script
variable does not exist, then we fall back to the initial seed computed a
Bro startup (which is affected ultimately by $BRO_SEED).
See Hasher::MakeSeed for details.
Thanks to git this merge was less troublesome that I was afraid it
would be. Not all tests pass yet though (and file hashes have changed
unfortunately).
Conflicts:
cmake
doc/scripts/DocSourcesList.cmake
scripts/base/init-bare.bro
scripts/base/protocols/ftp/main.bro
scripts/base/protocols/irc/dcc-send.bro
scripts/test-all-policy.bro
src/AnalyzerTags.h
src/CMakeLists.txt
src/analyzer/Analyzer.cc
src/analyzer/protocol/file/File.cc
src/analyzer/protocol/file/File.h
src/analyzer/protocol/http/HTTP.cc
src/analyzer/protocol/http/HTTP.h
src/analyzer/protocol/mime/MIME.cc
src/event.bif
src/main.cc
src/util-config.h.in
testing/btest/Baseline/coverage.bare-load-baseline/canonified_loaded_scripts.log
testing/btest/Baseline/coverage.default-load-baseline/canonified_loaded_scripts.log
testing/btest/Baseline/istate.events-ssl/receiver.http.log
testing/btest/Baseline/istate.events-ssl/sender.http.log
testing/btest/Baseline/istate.events/receiver.http.log
testing/btest/Baseline/istate.events/sender.http.log
- Introducing analyzer::<protocol> namespaces.
- Moving protocol-specific events out of events.bif into analyzer/protocol/<protocol>/events.bif
- Moving ARP over (even though it's not an actual analyzer).
- Moving NetFlow over (even though it's not an actual analyzer).
- Moving MIME over (even though it's not an actual analyzer).
- FileAnalysis::Info is now just a record used for logging, the fa_file
record type is defined in init-bare.bro as the analogue to a
connection record.
- Starting to transfer policy hook triggers and analyzer results to
events.
Also moving src/analyzer.bif to src/analyzer/analyzer.bif, along with
the infrastructure to build/incude bif code at other locations.
We should generally move to having per-directory CMakeLists.txt. I'll
convert the others over later.
This is a larger internal change that moves the analyzer
infrastructure to a more flexible model where the available analyzers
don't need to be hardcoded at compile time anymore. While currently
they actually still are, this will in the future enable external
analyzer plugins. For now, it does already add the capability to
dynamically enable/disable analyzers from script-land, replacing the
old Analyzer::Available() methods.
There are three major parts going into this:
- A new plugin infrastructure in src/plugin. This is independent
of analyzers and will eventually support plugins for other parts
of Bro as well (think: readers and writers). The goal is that
plugins can be alternatively compiled in statically or loadead
dynamically at runtime from a shared library. While the latter
isn't there yet, there'll be almost no code change for a plugin
to make it dynamic later (hopefully :)
- New analyzer infrastructure in src/analyzer. I've moved a number
of analyzer-related classes here, including Analyzer and DPM;
the latter now renamed to Analyzer::Manager. More will move here
later. Currently, there's only one plugin here, which provides
*all* existing analyzers. We can modularize this further in the
future (or not).
- A new script interface in base/framework/analyzer. I think that
this will eventually replace the dpm framework, but for now
that's still there as well, though some parts have moved over.
I've also remove the dpd_config table; ports are now configured via
the analyzer framework. For exmaple, for SSH:
const ports = { 22/tcp } &redef;
event bro_init() &priority=5
{
...
Analyzer::register_for_ports(Analyzer::ANALYZER_SSH, ports);
}
As you can see, the old ANALYZER_SSH constants have more into an enum
in the Analyzer namespace.
This is all hardly tested right now, and not everything works yet.
There's also a lot more cleanup to do (moving more classes around;
removing no longer used functionality; documenting script and C++
interfaces; regression tests). But it seems to generally work with a
small trace at least.
The debug stream "dpm" shows more about the loaded/enabled analyzers.
A new option -N lists loaded plugins and what they provide (including
those compiled in statically; i.e., right now it outputs all the
analyzers).
This is all not cast-in-stone yet, for some things we need to see if
they make sense this way. Feedback welcome.
Added a generic gtpv1_message event generated for any GTP message type.
Added specific events for the create/update/delete PDP context
request/response messages.
Addresses #934.
This currently supports automatic decapsulation of GTP-U packets on
UDP port 2152.
The GTPv1 headers for such tunnels can be inspected by handling the
"gtpv1_g_pdu_packet" event, which has a parameter of type "gtpv1_hdr".
Analyzer and test cases are derived from submissions by Carsten Langer.
Addresses #690.
The "tunnel_port" and "parse_udp_tunnels" options are still gone
as those did not work entirely (e.g. IPv6 support and misnaming
of tunnel_port/udp_tunnel_port).
- Unserializing files that were previously kicked out of the open-file
cache would cause them to be fopen'd with the original access
permissions which is usually 'w' and causes truncation. They
are now opened in 'a' mode. (addresses #780)
- Add 'max_files_in_cache' script option to manually set the maximum
amount of opened files to keep cached. Mainly this just helped
to create a simple test case for the above change.
- Remove unused NO_HAVE_SETRLIMIT preprocessor switch.
- On systems that don't enforce a limit on number of files opened for
the process, raise default max size of open-file cache from
32 to 512.
UDP tunnel support removed for now, to be re-added in specific
analyzers later, but IP-in-IP is now decapsulated recursively
so nested tunnels can be seen and the inner packets get sent
through the IP fragment reassembler if necessary.