And added an event called "event_queue_flush_point" to mark where that
occured in the event stream. The FAF now uses an explicit event queue
flush instead of buffering input in order to wait for a file handle to
be returned from script-layer.
- FileAnalysis::Info is now just a record used for logging, the fa_file
record type is defined in init-bare.bro as the analogue to a
connection record.
- Starting to transfer policy hook triggers and analyzer results to
events.
This reverts commit fc267d010d.
There were some diffs caused by this in external test suites I'm
unsure about, I'm going to go over optimizations more closely in
a different branch.
When a file handle is needed and the last event in the queue is also
a get_file_handle event with the same arguments, instead of queueing
a new event, just remember to cache/re-use the resulting handle from
the previous event. This depends on get_file_handle handlers not
changing global state that is also used to derive the file handle
string.
- Add a timeout flag to file_analysis.log so it's easy to tell what
has had at least one timeout trigger happen.
- Fix ftp-data service tag not being set for reused connections.
- Fix HTTP::Incorrect_File_Type because mime types returned by FAF have
the charset still in them, but the HTTP::mime_types_extensions table
does not and it requires an exact string match. (still ugly)
- Add TRIGGER_NEW_CONN to track files going over multiple connections.
- Add an initial file/mime type guess for non-linear file transfers.
- Fix a case where file/mime type detection would never be attempted
if the start of the file was a content gap.
- Improve mime type tracking of HTTP byte-range/partial-content,
even if the requests are pipelined or over multiple connections.
- I changed the modbus.events test because having the baseline output
be 80+ MB is nuts and it was sensitive to connection record redefs.
The notable difference here is that ftp.log now logs by default
the PORT, PASV, EPRT, EPSV commands as well as a separate line for
ftp-data channels in which file extraction was requested.
This difference isn't a direct result of now doing the file extraction
through the file analysis framework, it's just because I noticed even
the old way of tracking extracted-file name didn't work right and this
was the way I came up with so that a locally extracted file can be
associated with a data channel and then that data channel associated
with a control channel.
This is a larger internal change that moves the analyzer
infrastructure to a more flexible model where the available analyzers
don't need to be hardcoded at compile time anymore. While currently
they actually still are, this will in the future enable external
analyzer plugins. For now, it does already add the capability to
dynamically enable/disable analyzers from script-land, replacing the
old Analyzer::Available() methods.
There are three major parts going into this:
- A new plugin infrastructure in src/plugin. This is independent
of analyzers and will eventually support plugins for other parts
of Bro as well (think: readers and writers). The goal is that
plugins can be alternatively compiled in statically or loadead
dynamically at runtime from a shared library. While the latter
isn't there yet, there'll be almost no code change for a plugin
to make it dynamic later (hopefully :)
- New analyzer infrastructure in src/analyzer. I've moved a number
of analyzer-related classes here, including Analyzer and DPM;
the latter now renamed to Analyzer::Manager. More will move here
later. Currently, there's only one plugin here, which provides
*all* existing analyzers. We can modularize this further in the
future (or not).
- A new script interface in base/framework/analyzer. I think that
this will eventually replace the dpm framework, but for now
that's still there as well, though some parts have moved over.
I've also remove the dpd_config table; ports are now configured via
the analyzer framework. For exmaple, for SSH:
const ports = { 22/tcp } &redef;
event bro_init() &priority=5
{
...
Analyzer::register_for_ports(Analyzer::ANALYZER_SSH, ports);
}
As you can see, the old ANALYZER_SSH constants have more into an enum
in the Analyzer namespace.
This is all hardly tested right now, and not everything works yet.
There's also a lot more cleanup to do (moving more classes around;
removing no longer used functionality; documenting script and C++
interfaces; regression tests). But it seems to generally work with a
small trace at least.
The debug stream "dpm" shows more about the loaded/enabled analyzers.
A new option -N lists loaded plugins and what they provide (including
those compiled in statically; i.e., right now it outputs all the
analyzers).
This is all not cast-in-stone yet, for some things we need to see if
they make sense this way. Feedback welcome.
Versus from synchronous function calls, which doesn't work well because
the function call can see a script-layer state that doesn't reflect
the state as it will be in terms of the event/network stream.
* origin/topic/matthias/opaque:
Add new unit test for opaque serialization.
Migrate entropy testing to opaque.
C++ify RandTest.*
Fix a hard-to-spot bug.
Use more descriptive error message.
Fix the fix :-/.
Fix initialization of hash values.
Be clearer about delegation.
Implement serialization of opaque types.
Update hash BiF documentation.
Migrate free SHA* functions to SHA*Val::digest().
Add missing type name that caused failing tests.
Update base scripts and unit tests.
Simplify hash function BiFs.
Add support for opaque hash values.
Adapt BiF & Bro parser to handle opaque types.
More lexer/parser work.
Implement equivalence relation for opaque types.
Support basic serialization of opaque.
Add opaque type to lexer, parser, and BroType.
Closes#925
Conflicts:
aux/broccoli
entities.
Before, whether they did depended on libmagic. To do that,
smpt/entities.bro gets a new option `never_calc_md5`.
Also restructuring the tests a bit so that load a common
testing-setup.bro scripts that can set a global configuration.
- Log path's are generated in the scripting land
now. The default Log stream ID to path string
mapping works like this:
- Notice::LOG -> "notice"
- Notice::POLICY_LOG -> "notice_policy"
- TestModule::LOG -> "test_module"
- Logging streams updated across all of the shipped
scripts to be more user friendly. Instead of
the logging stream ID HTTP::HTTP, we now have
HTTP::LOG, etc.
- The priorities on some bro_init handlers have
been adjusted to make the process of applying
filters or disabling streams easier for users.
- While updating, I did some further work on the branch.
- New function in the base/utils/files for extracting filenames
from content-dispositions.
- New script for entity excerpt extraction if you aren't interested
in full extraction. The data goes a log field too.
- Some renaming and reorganization of types.
- Updated tests to work with new code.
* origin/topic/jsiwek/smtp-refactor:
Make the doc.coverage test happy.
SMTP script refactor. (addresses #509)
Conflicts:
doc/scripts/DocSourcesList.cmake
policy/protocols/smtp/__load__.bro
policy/protocols/smtp/base/__load__.bro
- policy/ renamed to scripts/
- By default BROPATH now contains:
- scripts/
- scripts/policy
- scripts/site
- *Nearly* all tests pass.
- All of scripts/base/ is loaded by main.cc
- Can be disabled by setting $BRO_NO_BASE_SCRIPTS
- Scripts in scripts/base/ don't use relative path loading to ease use of BRO_NO_BASE_SCRIPTS (to copy and paste that script).
- The scripts in scripts/base/protocols/ only (or soon will only) do logging and state building.
- The scripts in scripts/base/frameworks/ add functionality without causing any additional overhead.
- All "detection" activity happens through scripts in scripts/policy/.
- Communications framework modified temporarily to need an environment variable to actually enable (ENABLE_COMMUNICATION=1)
- This is so the communications framework can be loaded as part
of the base without causing trouble when it's not needed.
- This will be removed once a resolution to ticket #540 is reached.