Commit graph

78 commits

Author SHA1 Message Date
Jon Siwek
2747e839fb FileAnalysis: insert explicit event queue flush points.
And added an event called "event_queue_flush_point" to mark where that
occured in the event stream.  The FAF now uses an explicit event queue
flush instead of buffering input in order to wait for a file handle to
be returned from script-layer.
2013-04-10 16:48:10 -05:00
Jon Siwek
d9321e2203 FileAnalysis: remove some file events.
The file_new event now takes over the function of file_type, file_bof,
and file_bof_buffer.
2013-04-10 14:34:23 -05:00
Jon Siwek
a2d9b47bcd FileAnalysis: finish switching hooks to events. 2013-04-10 11:13:43 -05:00
Jon Siwek
641154f8e8 FileAnalysis: checkpoint in middle of big reorganization.
- FileAnalysis::Info is now just a record used for logging, the fa_file
  record type is defined in init-bare.bro as the analogue to a
  connection record.

- Starting to transfer policy hook triggers and analyzer results to
  events.
2013-04-09 15:49:58 -05:00
Jon Siwek
393d35dc60 Revert "FileAnalysis: optimize get_file_handle event queueing."
This reverts commit fc267d010d.

There were some diffs caused by this in external test suites I'm
unsure about, I'm going to go over optimizations more closely in
a different branch.
2013-04-03 09:49:39 -05:00
Jon Siwek
fc267d010d FileAnalysis: optimize get_file_handle event queueing.
When a file handle is needed and the last event in the queue is also
a get_file_handle event with the same arguments, instead of queueing
a new event, just remember to cache/re-use the resulting handle from
the previous event.  This depends on get_file_handle handlers not
changing global state that is also used to derive the file handle
string.
2013-04-02 16:21:51 -05:00
Jon Siwek
3642ecc73e FileAnalysis: misc. tweaks/fixes.
- Add a timeout flag to file_analysis.log so it's easy to tell what
  has had at least one timeout trigger happen.

- Fix ftp-data service tag not being set for reused connections.

- Fix HTTP::Incorrect_File_Type because mime types returned by FAF have
  the charset still in them, but the HTTP::mime_types_extensions table
  does not and it requires an exact string match. (still ugly)

- Add TRIGGER_NEW_CONN to track files going over multiple connections.

- Add an initial file/mime type guess for non-linear file transfers.

- Fix a case where file/mime type detection would never be attempted
  if the start of the file was a content gap.

- Improve mime type tracking of HTTP byte-range/partial-content,
  even if the requests are pipelined or over multiple connections.

- I changed the modbus.events test because having the baseline output
  be 80+ MB is nuts and it was sensitive to connection record redefs.
2013-03-28 16:59:29 -05:00
Jon Siwek
7e895a3a2f FileAnalysis: replace script-layer FTP file analysis.
The notable difference here is that ftp.log now logs by default
the PORT, PASV, EPRT, EPSV commands as well as a separate line for
ftp-data channels in which file extraction was requested.

This difference isn't a direct result of now doing the file extraction
through the file analysis framework, it's just because I noticed even
the old way of tracking extracted-file name didn't work right and this
was the way I came up with so that a locally extracted file can be
associated with a data channel and then that data channel associated
with a control channel.
2013-03-27 12:59:38 -05:00
Jon Siwek
497496ec83 FileAnalysis: replace script-layer SMTP file analysis.
Notable differences:

- Removed SMTP::MD5 notice.

- Removed ability to specify mime entity excerpt length per mime-type.
2013-03-26 15:48:52 -05:00
Robin Sommer
af1809aaa3 First prototype of new analyzer framework.
This is a larger internal change that moves the analyzer
infrastructure to a more flexible model where the available analyzers
don't need to be hardcoded at compile time anymore. While currently
they actually still are, this will in the future enable external
analyzer plugins. For now, it does already add the capability to
dynamically enable/disable analyzers from script-land, replacing the
old Analyzer::Available() methods.

There are three major parts going into this:

    - A new plugin infrastructure in src/plugin. This is independent
      of analyzers and will eventually support plugins for other parts
      of Bro as well (think: readers and writers). The goal is that
      plugins can be alternatively compiled in statically or loadead
      dynamically at runtime from a shared library. While the latter
      isn't there yet, there'll be almost no code change for a plugin
      to make it dynamic later (hopefully :)

    - New analyzer infrastructure in src/analyzer. I've moved a number
      of analyzer-related classes here, including Analyzer and DPM;
      the latter now renamed to Analyzer::Manager. More will move here
      later. Currently, there's only one plugin here, which provides
      *all* existing analyzers. We can modularize this further in the
      future (or not).

    - A new script interface in base/framework/analyzer. I think that
      this will eventually replace the dpm framework, but for now
      that's still there as well, though some parts have moved over.

I've also remove the dpd_config table; ports are now configured via
the analyzer framework. For exmaple, for SSH:

    const ports = { 22/tcp } &redef;

    event bro_init() &priority=5
        {
        ...
        Analyzer::register_for_ports(Analyzer::ANALYZER_SSH, ports);
        }

As you can see, the old ANALYZER_SSH constants have more into an enum
in the Analyzer namespace.

This is all hardly tested right now, and not everything works yet.
There's also a lot more cleanup to do (moving more classes around;
removing no longer used functionality; documenting script and C++
interfaces; regression tests). But it seems to generally work with a
small trace at least.

The debug stream "dpm" shows more about the loaded/enabled analyzers.

A new option -N lists loaded plugins and what they provide (including
those compiled in statically; i.e., right now it outputs all the
analyzers).

This is all not cast-in-stone yet, for some things we need to see if
they make sense this way. Feedback welcome.
2013-03-26 11:05:38 -07:00
Jon Siwek
84a0c2fdac FileAnalysis: file handles now set from events.
Versus from synchronous function calls, which doesn't work well because
the function call can see a script-layer state that doesn't reflect
the state as it will be in terms of the event/network stream.
2013-03-25 15:37:58 -05:00
Jon Siwek
661677d452 FileAnalysis: separating IRC/FTP data analyzers.
It simplifies the file handle string callbacks.
2013-03-20 11:12:06 -05:00
Jon Siwek
550c3c477d FileAnalysis: integrate w/ SMTP analyzer.
More generally w/ MIME_Mail messages, which POP3 analyzer also uses.
2013-03-18 11:30:59 -05:00
Robin Sommer
da90976170 Merge remote-tracking branch 'origin/topic/matthias/opaque'
* origin/topic/matthias/opaque:
  Add new unit test for opaque serialization.
  Migrate entropy testing to opaque.
  C++ify RandTest.*
  Fix a hard-to-spot bug.
  Use more descriptive error message.
  Fix the fix :-/.
  Fix initialization of hash values.
  Be clearer about delegation.
  Implement serialization of opaque types.
  Update hash BiF documentation.
  Migrate free SHA* functions to SHA*Val::digest().
  Add missing type name that caused failing tests.
  Update base scripts and unit tests.
  Simplify hash function BiFs.
  Add support for opaque hash values.
  Adapt BiF & Bro parser to handle opaque types.
  More lexer/parser work.
  Implement equivalence relation for opaque types.
  Support basic serialization of opaque.
  Add opaque type to lexer, parser, and BroType.

Closes #925

Conflicts:
	aux/broccoli
2012-12-20 16:30:22 -08:00
Seth Hall
0cf98ac325 Improved file name extraction for SMTP when file name is included in Content-Type header. 2012-12-13 10:27:08 -05:00
Matthias Vallentin
30bab14dbf Update base scripts and unit tests. 2012-12-11 16:26:17 -08:00
Vlad Grigorescu
f43576cff3 Fix some Info:Record field documentation. 2012-07-13 14:04:24 -04:00
Seth Hall
4753f2aeca Adding extra fields to smtp and http to track transaction depth.
- This will for help linking in analysis scripts and databases later.

- Test baseline updates coming in a few minutes.
2011-10-25 11:34:48 -04:00
Seth Hall
6d67f7830d Added to the likely_server_ports set for protocols with analyzers.
- Updated some tests since Bro is getting the direction
  correct now.

- Updated BPF filter test since I added a few ports to IRC
  as well.
2011-10-07 13:44:28 -04:00
Robin Sommer
9ee8a9f806 Testing/external scripts no longer compute MD5 checksums for SMTP
entities.

Before, whether they did depended on libmagic. To do that,
smpt/entities.bro gets a new option `never_calc_md5`.

Also restructuring the tests a bit so that load a common
testing-setup.bro scripts that can set a global configuration.
2011-09-15 15:42:10 -07:00
Seth Hall
11c437faa3 Logging framework update and mass Log::ID renaming.
- Log path's are generated in the scripting land
  now.  The default Log stream ID to path string
  mapping works like this:
    - Notice::LOG -> "notice"
    - Notice::POLICY_LOG -> "notice_policy"
    - TestModule::LOG -> "test_module"

- Logging streams updated across all of the shipped
  scripts to be more user friendly.  Instead of
  the logging stream ID HTTP::HTTP, we now have
  HTTP::LOG, etc.

- The priorities on some bro_init handlers have
  been adjusted to make the process of applying
  filters or disabling streams easier for users.
2011-09-03 01:10:17 -04:00
Seth Hall
fc5f22cb5d Merge remote-tracking branch 'origin/topic/jsiwek/reorg-followup' 2011-08-25 16:44:31 -04:00
Jon Siwek
59e5fc5633 Merge branch 'master' into topic/jsiwek/reorg-followup
Conflicts:
	scripts/base/frameworks/cluster/setup-connections.bro
	scripts/base/protocols/ssh/main.bro
2011-08-11 10:56:20 -05:00
Seth Hall
b45c175147 Split out more SMTP analysis functionality. 2011-08-11 08:26:20 -04:00
Jon Siwek
f517d0e0ad Merge branch 'master' into topic/jsiwek/reorg-followup 2011-08-10 19:59:18 -05:00
Jon Siwek
47500ceef4 Add a test that checks each individual script can be loaded in bare-mode.
Fixed most @load dependency issues in the process.  The test is still
failing in a "known" way due to hot.conn.bro and scan.bro.

Adressess #545
2011-08-10 15:38:21 -05:00
Seth Hall
adc486c673 Merge remote-tracking branch 'origin/topic/jsiwek/smtp-refactor'
- While updating, I did some further work on the branch.

- New function in the base/utils/files for extracting filenames
  from content-dispositions.

- New script for entity excerpt extraction if you aren't interested
  in full extraction.  The data goes a log field too.

- Some renaming and reorganization of types.

- Updated tests to work with new code.

* origin/topic/jsiwek/smtp-refactor:
  Make the doc.coverage test happy.
  SMTP script refactor. (addresses #509)

Conflicts:
	doc/scripts/DocSourcesList.cmake
	policy/protocols/smtp/__load__.bro
	policy/protocols/smtp/base/__load__.bro
2011-08-10 13:34:31 -04:00
Seth Hall
597a4d6704 Hopefully the last major script reorganization.
- policy/ renamed to scripts/

- By default BROPATH now contains:
	- scripts/
	- scripts/policy
	- scripts/site

- *Nearly* all tests pass.

- All of scripts/base/ is loaded by main.cc
	- Can be disabled by setting $BRO_NO_BASE_SCRIPTS
	- Scripts in scripts/base/ don't use relative path loading to ease use of BRO_NO_BASE_SCRIPTS (to copy and paste that script).

- The scripts in scripts/base/protocols/ only (or soon will only) do logging and state building.

- The scripts in scripts/base/frameworks/ add functionality without causing any additional overhead.

- All "detection" activity happens through scripts in scripts/policy/.

- Communications framework modified temporarily to need an environment variable to actually enable (ENABLE_COMMUNICATION=1)
	- This is so the communications framework can be loaded as part
	  of the base without causing trouble when it's not needed.
	- This will be removed once a resolution to ticket #540 is reached.
2011-08-05 23:09:53 -04:00