- Add a timeout flag to file_analysis.log so it's easy to tell which
files have had at least one timeout trigger.
- Fix ftp-data service tag not being set for reused connections.
- Fix HTTP::Incorrect_File_Type: the mime types returned by the FAF
still include the charset, but the HTTP::mime_types_extensions table
does not, and the lookup requires an exact string match. (still ugly)
- Add TRIGGER_NEW_CONN to track files going over multiple connections.
- Add an initial file/mime type guess for non-linear file transfers.
- Fix a case where file/mime type detection would never be attempted
if the start of the file was a content gap.
- Improve mime type tracking of HTTP byte-range/partial-content,
even if the requests are pipelined or over multiple connections.
- I changed the modbus.events test because having the baseline output
be 80+ MB is nuts and it was sensitive to connection record redefs.
The notable difference here is that ftp.log now logs the PORT, PASV,
EPRT, and EPSV commands by default, as well as a separate line for
ftp-data channels in which file extraction was requested.
This difference isn't a direct result of now doing the file extraction
through the file analysis framework; it's because I noticed that even
the old way of tracking the extracted-file name didn't work right, and
this was the way I came up with to associate a locally extracted file
with a data channel, and that data channel in turn with a control
channel.
This is a larger internal change that moves the analyzer
infrastructure to a more flexible model where the available analyzers
don't need to be hardcoded at compile time anymore. While currently
they actually still are, this will in the future enable external
analyzer plugins. For now, it does already add the capability to
dynamically enable/disable analyzers from script-land, replacing the
old Analyzer::Available() methods.
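As a rough sketch of what that script-land control could look like
(assuming the framework exposes per-analyzer enable/disable functions
named along these lines; the exact names are an assumption here):

    event bro_init()
        {
        # Assumption: per-analyzer toggles provided by the new framework.
        Analyzer::disable_analyzer(Analyzer::ANALYZER_SYSLOG);
        Analyzer::enable_analyzer(Analyzer::ANALYZER_SSH);
        }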
There are three major parts going into this:
- A new plugin infrastructure in src/plugin. This is independent
of analyzers and will eventually support plugins for other parts
of Bro as well (think: readers and writers). The goal is that
plugins can be alternatively compiled in statically or loaded
dynamically at runtime from a shared library. While the latter
isn't there yet, there'll be almost no code change for a plugin
to make it dynamic later (hopefully :)
- New analyzer infrastructure in src/analyzer. I've moved a number
of analyzer-related classes here, including Analyzer and DPM;
the latter now renamed to Analyzer::Manager. More will move here
later. Currently, there's only one plugin here, which provides
*all* existing analyzers. We can modularize this further in the
future (or not).
- A new script interface in base/frameworks/analyzer. I think that
this will eventually replace the dpd framework, but for now
that's still there as well, though some parts have moved over.
I've also removed the dpd_config table; ports are now configured via
the analyzer framework. For example, for SSH:
    const ports = { 22/tcp } &redef;

    event bro_init() &priority=5
        {
        ...
        Analyzer::register_for_ports(Analyzer::ANALYZER_SSH, ports);
        }
As you can see, the old ANALYZER_SSH constant has moved into an enum
in the Analyzer namespace.
This is all hardly tested right now, and not everything works yet.
There's also a lot more cleanup to do (moving more classes around;
removing no longer used functionality; documenting script and C++
interfaces; regression tests). But it seems to generally work with a
small trace at least.
The debug stream "dpm" shows more about the loaded/enabled analyzers.
A new option -N lists loaded plugins and what they provide (including
those compiled in statically; i.e., right now it outputs all the
analyzers).
This is all not cast-in-stone yet, for some things we need to see if
they make sense this way. Feedback welcome.
This is in contrast to synchronous function calls, which don't work
well because the function call can see script-layer state that doesn't
reflect the state as it will be in terms of the event/network stream.
The framework now cycles through callbacks based on a table indexed
by analyzer tags, or the special case of service strings if a given
analyzer is overloaded for multiple protocols (FTP/IRC data). This
lets each protocol script bundle implement the callback locally and
reduces the FAF's external dependencies.
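Roughly, the registration pattern looks like the following sketch; the
table and function names here are illustrative placeholders, not the
actual FAF interface:

    # Hypothetical callback table, indexed by analyzer tag.
    global handle_callbacks: table[Analyzer::Tag]
        of function(c: connection, is_orig: bool): string &redef;

    # A protocol bundle (HTTP here) implements its callback locally ...
    function http_file_handle(c: connection, is_orig: bool): string
        {
        return fmt("%s %s %s", Analyzer::ANALYZER_HTTP, c$uid, is_orig);
        }

    # ... and registers it for its analyzer tag.
    redef handle_callbacks += { [Analyzer::ANALYZER_HTTP] = http_file_handle };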
In addition to checking for a finished SSL handshake over an FTP
connection, it now also requires that the SSL handshake occurs after
the FTP client requested AUTH GSSAPI, more specifically identifying the
characteristics of GridFTP control channels.
Addresses #891.
* origin/topic/jsiwek/gridftp:
Add memory leak unit test for GridFTP.
Enable GridFTP detection by default. Track/log SSL client certs.
Add analyzer for GSI mechanism of GSSAPI FTP AUTH method.
Add an example of a GridFTP data channel detection script.
In the *service* field of connection records, GridFTP control channels
are labeled as "gridftp" and data channels as "gridftp-data".
Added *client_subject* and *client_issuer_subject* as &log'd fields to
SSL::Info record. Also added *client_cert* and *client_cert_chain*
fields to track client cert chain.
GSI authentication involves an encoded TLS/SSL handshake over the FTP
control session. Decoding the exchanged tokens and passing them to an
SSL analyzer instance allows use of all the familiar script-layer events
in inspecting the handshake (e.g. client/server certificates are
available). For FTP sessions that attempt GSI authentication, the
service field of the connection record will have both "ftp" and "ssl".
One additional change is an FTP server's acceptance of an AUTH request
no longer causes analysis of the connection to cease (because further
analysis likely wasn't possible). This decision can be made more
dynamically at the script-layer (plus there's now the fact that further
analysis can be done at least on the GSSAPI AUTH method).
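As a small illustration of what that enables (a sketch only; the event
and utility function are the stock ones, the logic itself is just an
example):

    @load base/utils/conn-ids

    # Once the GSI tokens are forwarded to an SSL analyzer, the usual SSL
    # events also fire for the FTP control connection.
    event ssl_established(c: connection)
        {
        if ( "ftp" in c$service && "ssl" in c$service )
            print fmt("GSI-authenticated FTP control channel: %s",
                      id_string(c$id));
        }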
Instead, the `addr_to_uri` script-level function can be used to
explicitly add brackets to an address when it's IPv6 and will be
included in a URI, or when a ":<port>" needs to be appended to it.
This is to avoid ambiguity between compressed hex notation and
module namespacing, both of which use "::". E.g.: "aaaa::bbbb" could
be an identifier or an IPv6 address, but "[aaaa::bbbb]" is now
clearly the address.
Also added IPv6 mixed notation to allow an IPv4 dotted-decimal
address to be specified in the lower 32-bits.
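For example (a quick sketch; the strings in the comments follow from
the rules above):

    @load base/utils/addrs

    event bro_init()
        {
        local a = [aaaa::bbbb];            # brackets now required for IPv6 literals
        local b = [::ffff:192.168.1.100];  # mixed notation: IPv4 in the low 32 bits

        print addr_to_uri(a);                    # prints "[aaaa::bbbb]"
        print addr_to_uri(1.2.3.4);              # prints "1.2.3.4" (IPv4 stays bare)
        print fmt("%s:%d", addr_to_uri(a), 80);  # prints "[aaaa::bbbb]:80"
        }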
- Large rework of the packet filter framework to make many things easier.
- Removed the PacketFilter::all_packets variable because it was confusing.
- New variable (PacketFilter::enable_auto_protocol_capture_filters) to re-enable the old filtering model of only sniffing ports for analyzed protocols.
- In-progress plugin model for adding filtering mechanisms.
- capture_filters now defaults to a single entry: { ["default"] = PacketFilter::default_capture_filter } (see the sketch after this list).
- Mechanism and helper functions to "shunt" traffic with filters.
- Created the Protocols framework to assist with reworking how base protocol scripts are registered with DPD and other things.
- The Protocols framework creates BPF filters for registered analyzers (when using the PacketFilter framework in that mode).
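A quick sketch of what user-side configuration could look like under
the new model (the "dns" entry is just an illustrative addition, not a
shipped default):

    # Add an extra BPF expression alongside the built-in ["default"] entry.
    redef capture_filters += { ["dns"] = "port 53" };

    # Assumption: switch back to the old model of only sniffing ports for
    # protocols that have enabled analyzers.
    redef PacketFilter::enable_auto_protocol_capture_filters = T;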
- Log paths are generated in script-land now. The default Log stream
ID to path string mapping works like this:
  - Notice::LOG -> "notice"
  - Notice::POLICY_LOG -> "notice_policy"
  - TestModule::LOG -> "test_module"
- Logging streams updated across all of the shipped
scripts to be more user friendly. Instead of
the logging stream ID HTTP::HTTP, we now have
HTTP::LOG, etc.
- The priorities on some bro_init handlers have been adjusted to make
the process of applying filters or disabling streams easier for
users (see the sketch after this list).
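A minimal sketch of the kind of user-side tuning this enables; the
priority value and the "http_custom" path are illustrative assumptions:

    event bro_init() &priority=-5
        {
        # By this (lower) priority the shipped scripts have already created
        # their streams, so they can be disabled or refiltered here.
        Log::disable_stream(Notice::POLICY_LOG);

        # Send HTTP logs to a custom path instead of the default "http".
        Log::add_filter(HTTP::LOG, [$name="http-custom", $path="http_custom"]);
        }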
- policy/ renamed to scripts/
- By default BROPATH now contains:
- scripts/
- scripts/policy
- scripts/site
- *Nearly* all tests pass.
- All of scripts/base/ is loaded by main.cc
- Can be disabled by setting $BRO_NO_BASE_SCRIPTS
- Scripts in scripts/base/ don't use relative path loading, to ease use of BRO_NO_BASE_SCRIPTS (so that such a script can be copied and pasted).
- The scripts in scripts/base/protocols/ only (or soon will only) do logging and state building.
- The scripts in scripts/base/frameworks/ add functionality without causing any additional overhead.
- All "detection" activity happens through scripts in scripts/policy/.
- Communications framework temporarily modified to require an environment variable to actually enable it (ENABLE_COMMUNICATION=1)
- This is so the communications framework can be loaded as part
of the base without causing trouble when it's not needed.
- This will be removed once a resolution to ticket #540 is reached.