- This caused us to lose signatures for POP3 and Bittorrent. These will
need discovered in the repository again when we add scripts
for those analyzers.
- Recorrected the module name to Files.
- Added Files::analyzer_name to get a more readable name for a
file analyzer.
- Improved and just overall better handled multipart mime
transfers in HTTP and SMTP. HTTP now has orig_fuids and resp_fuids
log fields since multiple "files" can be transferred with
multipart mime in a single request/response pair. SMTP has
an fuids field which has file unique IDs for all parts
transferred. FTP and IRC have a log field named fuid added
because only a single file can be transferred per irc and ftp
log line.
Closes#1030.
* origin/topic/seth/packet-filter-updates:
Missed a test fix.
Updating test baselines.
Updates for the PacketFilter framework to simplify it.
Last test update for PacketFilter framework.
Several final fixes for PacketFilter framework.
Packet filter framework checkpoint.
Checkpoint on the packet filter framework.
Initial rework of packet filter framework.
Thanks to git this merge was less troublesome that I was afraid it
would be. Not all tests pass yet though (and file hashes have changed
unfortunately).
Conflicts:
cmake
doc/scripts/DocSourcesList.cmake
scripts/base/init-bare.bro
scripts/base/protocols/ftp/main.bro
scripts/base/protocols/irc/dcc-send.bro
scripts/test-all-policy.bro
src/AnalyzerTags.h
src/CMakeLists.txt
src/analyzer/Analyzer.cc
src/analyzer/protocol/file/File.cc
src/analyzer/protocol/file/File.h
src/analyzer/protocol/http/HTTP.cc
src/analyzer/protocol/http/HTTP.h
src/analyzer/protocol/mime/MIME.cc
src/event.bif
src/main.cc
src/util-config.h.in
testing/btest/Baseline/coverage.bare-load-baseline/canonified_loaded_scripts.log
testing/btest/Baseline/coverage.default-load-baseline/canonified_loaded_scripts.log
testing/btest/Baseline/istate.events-ssl/receiver.http.log
testing/btest/Baseline/istate.events-ssl/sender.http.log
testing/btest/Baseline/istate.events/receiver.http.log
testing/btest/Baseline/istate.events/sender.http.log
- It's derived from the magic database of libmagic 5.14, but with most
everything not related to mime types removed.
- The custom database is always used by default for mime detection, but
the more verbose file type detection will fall back on the default
libmagic installation's database. The result is: mime type strings
are now guaranteed to be consistent across platforms, but the verbose
file type descriptions are not.
- The custom database gets installed in $prefix/share/bro/magic, and
should even be extensible if files with new patterns are added inside
the directory.
- The search path for the mime magic database can be controlled via
BROMAGIC environment variable.
- Remove mime_desc field from ftp.log.
- Stop using the mime/file type canonifier with unit tests.
- libmagic >= 5.04 is now a requirement.
And added an event called "event_queue_flush_point" to mark where that
occured in the event stream. The FAF now uses an explicit event queue
flush instead of buffering input in order to wait for a file handle to
be returned from script-layer.
- FileAnalysis::Info is now just a record used for logging, the fa_file
record type is defined in init-bare.bro as the analogue to a
connection record.
- Starting to transfer policy hook triggers and analyzer results to
events.
This reverts commit fc267d010d.
There were some diffs caused by this in external test suites I'm
unsure about, I'm going to go over optimizations more closely in
a different branch.
When a file handle is needed and the last event in the queue is also
a get_file_handle event with the same arguments, instead of queueing
a new event, just remember to cache/re-use the resulting handle from
the previous event. This depends on get_file_handle handlers not
changing global state that is also used to derive the file handle
string.
I've used the opportunity to also cleanup DPD's expect_connection()
infrastructure, and renamed that bif to schedule_analyzer(), which
seems more appropiate. One can now also schedule more than one
analyzer per connection.
TODOs:
- "make install" is probably broken.
- Broxygen is probably broken for plugin-defined events.
- event groups are broken (do we want to keep them?)
- parallel btest is broken, but I'm not sure why ...
(tests all pass individually, but lots of error when running
in parallel; must be related to *.bif restructuring).
- Document API for src/plugin/*
- Document API for src/analyzer/Analyzer.h
- Document API for scripts/base/frameworks/analyzer
- Add a timeout flag to file_analysis.log so it's easy to tell what
has had at least one timeout trigger happen.
- Fix ftp-data service tag not being set for reused connections.
- Fix HTTP::Incorrect_File_Type because mime types returned by FAF have
the charset still in them, but the HTTP::mime_types_extensions table
does not and it requires an exact string match. (still ugly)
- Add TRIGGER_NEW_CONN to track files going over multiple connections.
- Add an initial file/mime type guess for non-linear file transfers.
- Fix a case where file/mime type detection would never be attempted
if the start of the file was a content gap.
- Improve mime type tracking of HTTP byte-range/partial-content,
even if the requests are pipelined or over multiple connections.
- I changed the modbus.events test because having the baseline output
be 80+ MB is nuts and it was sensitive to connection record redefs.
The notable difference here is that ftp.log now logs by default
the PORT, PASV, EPRT, EPSV commands as well as a separate line for
ftp-data channels in which file extraction was requested.
This difference isn't a direct result of now doing the file extraction
through the file analysis framework, it's just because I noticed even
the old way of tracking extracted-file name didn't work right and this
was the way I came up with so that a locally extracted file can be
associated with a data channel and then that data channel associated
with a control channel.
This is a larger internal change that moves the analyzer
infrastructure to a more flexible model where the available analyzers
don't need to be hardcoded at compile time anymore. While currently
they actually still are, this will in the future enable external
analyzer plugins. For now, it does already add the capability to
dynamically enable/disable analyzers from script-land, replacing the
old Analyzer::Available() methods.
There are three major parts going into this:
- A new plugin infrastructure in src/plugin. This is independent
of analyzers and will eventually support plugins for other parts
of Bro as well (think: readers and writers). The goal is that
plugins can be alternatively compiled in statically or loadead
dynamically at runtime from a shared library. While the latter
isn't there yet, there'll be almost no code change for a plugin
to make it dynamic later (hopefully :)
- New analyzer infrastructure in src/analyzer. I've moved a number
of analyzer-related classes here, including Analyzer and DPM;
the latter now renamed to Analyzer::Manager. More will move here
later. Currently, there's only one plugin here, which provides
*all* existing analyzers. We can modularize this further in the
future (or not).
- A new script interface in base/framework/analyzer. I think that
this will eventually replace the dpm framework, but for now
that's still there as well, though some parts have moved over.
I've also remove the dpd_config table; ports are now configured via
the analyzer framework. For exmaple, for SSH:
const ports = { 22/tcp } &redef;
event bro_init() &priority=5
{
...
Analyzer::register_for_ports(Analyzer::ANALYZER_SSH, ports);
}
As you can see, the old ANALYZER_SSH constants have more into an enum
in the Analyzer namespace.
This is all hardly tested right now, and not everything works yet.
There's also a lot more cleanup to do (moving more classes around;
removing no longer used functionality; documenting script and C++
interfaces; regression tests). But it seems to generally work with a
small trace at least.
The debug stream "dpm" shows more about the loaded/enabled analyzers.
A new option -N lists loaded plugins and what they provide (including
those compiled in statically; i.e., right now it outputs all the
analyzers).
This is all not cast-in-stone yet, for some things we need to see if
they make sense this way. Feedback welcome.
Versus from synchronous function calls, which doesn't work well because
the function call can see a script-layer state that doesn't reflect
the state as it will be in terms of the event/network stream.
The framework now cycles through callbacks based on a table indexed
by analyzer tags, or the special case of service strings if a given
analyzer is overloaded for multiple protocols (FTP/IRC data). This
lets each protocol script bundle implement the callback locally and
reduces the FAF's external dependencies.
In addition to checking for a finished SSL handshake over an FTP
connection, it now also requires that the SSL handshake occurs after
the FTP client requested AUTH GSSAPI, more specifically identifying the
characteristics of GridFTP control channels.
Addresses #891.
* origin/topic/jsiwek/gridftp:
Add memory leak unit test for GridFTP.
Enable GridFTP detection by default. Track/log SSL client certs.
Add analyzer for GSI mechanism of GSSAPI FTP AUTH method.
Add an example of a GridFTP data channel detection script.
In the *service* field of connection records, GridFTP control channels
are labeled as "gridftp" and data channels as "gridftp-data".
Added *client_subject* and *client_issuer_subject* as &log'd fields to
SSL::Info record. Also added *client_cert* and *client_cert_chain*
fields to track client cert chain.
GSI authentication involves an encoded TLS/SSL handshake over the FTP
control session. Decoding the exchanged tokens and passing them to an
SSL analyzer instance allows use of all the familiar script-layer events
in inspecting the handshake (e.g. client/server certificats are
available). For FTP sessions that attempt GSI authentication, the
service field of the connection record will have both "ftp" and "ssl".
One additional change is an FTP server's acceptance of an AUTH request
no longer causes analysis of the connection to cease (because further
analysis likely wasn't possible). This decision can be made more
dynamically at the script-layer (plus there's now the fact that further
analysis can be done at least on the GSSAPI AUTH method).
Instead, the `addr_to_uri` script-level function can be used to
explicitly add brackets to an address if it's IPv6 and will be
included in a URI or when a ":<port>" needs to be appended to it.
This is to avoid ambiguity between compressed hex notation and
module namespacing, both which use "::". E.g.: "aaaa::bbbb" could
be an identifier or an IPv6 address, but "[aaaa::bbbb]" is now
clearly the address.
Also added IPv6 mixed notation to allow an IPv4 dotted-decimal
address to be specified in the lower 32-bits.