Commit graph

2220 commits

Author SHA1 Message Date
Jon Siwek
b8c98b8bf7 FileAnalysis: change terminology s/action/analyzer 2013-04-11 14:53:54 -05:00
Jon Siwek
e81f2ae7b0 FileAnalysis: libmagic tweaks.
Remove verbose file type detection and automatically strip out charset
from mime type.
2013-04-11 13:11:46 -05:00
Jon Siwek
2fba37e277 FileAnalysis: add bif for setting timeout interval 2013-04-11 12:08:46 -05:00
Jon Siwek
e2fbee9054 FileAnalysis: add more params to some events. 2013-04-11 11:24:18 -05:00
Seth Hall
a615601269 Trying to fix a state maintenance issue. 2013-04-11 09:42:46 -04:00
Jon Siwek
2747e839fb FileAnalysis: insert explicit event queue flush points.
And added an event called "event_queue_flush_point" to mark where that
occured in the event stream.  The FAF now uses an explicit event queue
flush instead of buffering input in order to wait for a file handle to
be returned from script-layer.
2013-04-10 16:48:10 -05:00
Jon Siwek
d9321e2203 FileAnalysis: remove some file events.
The file_new event now takes over the function of file_type, file_bof,
and file_bof_buffer.
2013-04-10 14:34:23 -05:00
Jon Siwek
a2d9b47bcd FileAnalysis: finish switching hooks to events. 2013-04-10 11:13:43 -05:00
Bernhard Amann
f10ed9e29a change plugin after feedback of seth 2013-04-10 10:45:45 -04:00
Jon Siwek
641154f8e8 FileAnalysis: checkpoint in middle of big reorganization.
- FileAnalysis::Info is now just a record used for logging, the fa_file
  record type is defined in init-bare.bro as the analogue to a
  connection record.

- Starting to transfer policy hook triggers and analyzer results to
  events.
2013-04-09 15:49:58 -05:00
Bernhard Amann
07d44f3aa0 Merge remote-tracking branch 'origin/topic/seth/metrics-merge' into topic/bernhard/hyperloglog-with-measurement 2013-04-08 10:56:18 +02:00
Bernhard Amann
bcd610fd50 Forgot a file. Again. Like always. Basically. 2013-04-08 10:55:00 +02:00
Bernhard Amann
7eee2f0d17 measurement framework with hll unique 2013-04-08 10:00:34 +02:00
Bernhard Amann
2cc1f82425 Merge remote-tracking branch 'origin/master' into topic/bernhard/thread-cleanup 2013-04-07 20:43:47 +02:00
Robin Sommer
1a30a57816 Porting syslog analyzer as another example.
The diff to this commit shows what "porting" involves ...

This also adds a small test for syslog.
2013-04-05 13:13:30 -07:00
Robin Sommer
bccaea6883 Adding options Analyzer::disable_all to disable all analyzers at
startup.

One can then selectively enable the ones one wants inside a bro_init()
handler.
2013-04-04 15:24:15 -07:00
Robin Sommer
bfda42b9e9 Removing legacy binpac analyzer for DNS and HTTP. 2013-04-03 13:40:45 -07:00
Jon Siwek
393d35dc60 Revert "FileAnalysis: optimize get_file_handle event queueing."
This reverts commit fc267d010d.

There were some diffs caused by this in external test suites I'm
unsure about, I'm going to go over optimizations more closely in
a different branch.
2013-04-03 09:49:39 -05:00
Jon Siwek
fc267d010d FileAnalysis: optimize get_file_handle event queueing.
When a file handle is needed and the last event in the queue is also
a get_file_handle event with the same arguments, instead of queueing
a new event, just remember to cache/re-use the resulting handle from
the previous event.  This depends on get_file_handle handlers not
changing global state that is also used to derive the file handle
string.
2013-04-02 16:21:51 -05:00
Seth Hall
f2ac938603 Merge remote-tracking branch 'origin/topic/robin/thread-cleanup' into topic/seth/exec-module 2013-04-02 15:12:38 -04:00
Seth Hall
d86748969a Merge remote-tracking branch 'origin/topic/bernhard/input-update' into topic/seth/exec-module 2013-04-02 09:24:19 -04:00
Seth Hall
e8b60d1ba8 Updated FTP bruteforce detection and a few other small changes. 2013-04-02 00:55:25 -04:00
Seth Hall
423bf3b3bf Test updates and cleanup. 2013-04-02 00:30:14 -04:00
Seth Hall
0e3c84e863 Fixed the measurement "sample" plugin. 2013-04-02 00:19:06 -04:00
Seth Hall
f1d165956a Fix path compression to include removing "/./".
- This involved a fix to the FTP scripts that relied on the old behavior.
2013-04-02 00:16:56 -04:00
Seth Hall
b477d2b02d Measurement framework is ready for testing.
- New, expanded API.
 - Calculations moved into plugins.
 - Scripts using measurement framework ported.
 - Updated the script-land queue implementation to make it more generic.
 -
2013-04-01 17:04:15 -04:00
Robin Sommer
e0c4bd1a82 Lots of cleanup and API documentation for the analyzer/* classes.
I've used the opportunity to also cleanup DPD's expect_connection()
infrastructure, and renamed that bif to schedule_analyzer(), which
seems more appropiate. One can now also schedule more than one
analyzer per connection.

TODOs:
        - "make install" is probably broken.
        - Broxygen is probably broken for plugin-defined events.
        - event groups are broken (do we want to keep them?)
        - parallel btest is broken, but I'm not sure why ...
          (tests all pass individually, but lots of error when running
          in parallel; must be related to *.bif restructuring).
        - Document API for src/plugin/*
        - Document API for src/analyzer/Analyzer.h
        - Document API for scripts/base/frameworks/analyzer
2013-04-01 13:12:21 -07:00
Seth Hall
93eca70e6b Merge remote-tracking branch 'origin/master' into topic/seth/metrics-merge 2013-04-01 14:16:46 -04:00
Seth Hall
53f9948b02 Measurement framework tests all pass now. 2013-04-01 14:16:37 -04:00
Robin Sommer
19c1816ebb Infrastructure for modularizing protocol analyzers.
There's now a new directory "src/protocols/", and the plan is for each
protocol analyzer to eventually have its own subdirectory in there
that contains everything it defines (C++/pac/bif). The infrastructure
to make that happen is in place, and two analyzers have been
converted to the new model, HTTP and SSL; there's no further
HTTP/SSL-specific code anywhere else in the core anymore (I believe :-)

Further changes:

    - -N lists available plugins, -NN lists more details on what these
      plugins provide (analyzers, bif elements). (The latter does not
      work for analyzers that haven't been converted yet).

    - *.bif.bro files now go into scripts/base/bif/; and
      scripts/base/bif/plugins/ for bif files provided by plugins.

    - I've factored out the bifcl/binpac CMake magic from
      src/CMakeLists.txt to cmake/{BifCl,Binpac}

    - There's a new cmake/BroPlugin that contains magic to allow
      plugins to have a simple CMakeLists.txt. The hope is that
      eventually the same CMakeLists.txt can be used for compiling a
      plugin either statically or dynamically.

    - bifcl has a new option -c that changes the code it generates so
      that it can be used with a plugin.

TODOs:
    - "make install" is probably broken.
    - Broxygen is probably broken for plugin-defined events.
    - event groups are broken (do we want to keep them?)
2013-03-29 19:59:31 -07:00
Jon Siwek
83f47d6f7a FileAnalysis: first pass over documentation. 2013-03-29 13:41:37 -05:00
Jon Siwek
3642ecc73e FileAnalysis: misc. tweaks/fixes.
- Add a timeout flag to file_analysis.log so it's easy to tell what
  has had at least one timeout trigger happen.

- Fix ftp-data service tag not being set for reused connections.

- Fix HTTP::Incorrect_File_Type because mime types returned by FAF have
  the charset still in them, but the HTTP::mime_types_extensions table
  does not and it requires an exact string match. (still ugly)

- Add TRIGGER_NEW_CONN to track files going over multiple connections.

- Add an initial file/mime type guess for non-linear file transfers.

- Fix a case where file/mime type detection would never be attempted
  if the start of the file was a content gap.

- Improve mime type tracking of HTTP byte-range/partial-content,
  even if the requests are pipelined or over multiple connections.

- I changed the modbus.events test because having the baseline output
  be 80+ MB is nuts and it was sensitive to connection record redefs.
2013-03-28 16:59:29 -05:00
Jon Siwek
27e47f0a57 FileAnalysis: replace script-layer IRC file analysis. 2013-03-27 14:02:20 -05:00
Jon Siwek
7e895a3a2f FileAnalysis: replace script-layer FTP file analysis.
The notable difference here is that ftp.log now logs by default
the PORT, PASV, EPRT, EPSV commands as well as a separate line for
ftp-data channels in which file extraction was requested.

This difference isn't a direct result of now doing the file extraction
through the file analysis framework, it's just because I noticed even
the old way of tracking extracted-file name didn't work right and this
was the way I came up with so that a locally extracted file can be
associated with a data channel and then that data channel associated
with a control channel.
2013-03-27 12:59:38 -05:00
Robin Sommer
2be985433c Test-suite passes.
All tests pass with one exception: some Broxygen tests are broken
because dpd_config doesn't exist anymore. Need to update the mechanism
for auto-documenting well-known ports.
2013-03-26 15:40:23 -07:00
Jon Siwek
497496ec83 FileAnalysis: replace script-layer SMTP file analysis.
Notable differences:

- Removed SMTP::MD5 notice.

- Removed ability to specify mime entity excerpt length per mime-type.
2013-03-26 15:48:52 -05:00
Robin Sommer
af1809aaa3 First prototype of new analyzer framework.
This is a larger internal change that moves the analyzer
infrastructure to a more flexible model where the available analyzers
don't need to be hardcoded at compile time anymore. While currently
they actually still are, this will in the future enable external
analyzer plugins. For now, it does already add the capability to
dynamically enable/disable analyzers from script-land, replacing the
old Analyzer::Available() methods.

There are three major parts going into this:

    - A new plugin infrastructure in src/plugin. This is independent
      of analyzers and will eventually support plugins for other parts
      of Bro as well (think: readers and writers). The goal is that
      plugins can be alternatively compiled in statically or loadead
      dynamically at runtime from a shared library. While the latter
      isn't there yet, there'll be almost no code change for a plugin
      to make it dynamic later (hopefully :)

    - New analyzer infrastructure in src/analyzer. I've moved a number
      of analyzer-related classes here, including Analyzer and DPM;
      the latter now renamed to Analyzer::Manager. More will move here
      later. Currently, there's only one plugin here, which provides
      *all* existing analyzers. We can modularize this further in the
      future (or not).

    - A new script interface in base/framework/analyzer. I think that
      this will eventually replace the dpm framework, but for now
      that's still there as well, though some parts have moved over.

I've also remove the dpd_config table; ports are now configured via
the analyzer framework. For exmaple, for SSH:

    const ports = { 22/tcp } &redef;

    event bro_init() &priority=5
        {
        ...
        Analyzer::register_for_ports(Analyzer::ANALYZER_SSH, ports);
        }

As you can see, the old ANALYZER_SSH constants have more into an enum
in the Analyzer namespace.

This is all hardly tested right now, and not everything works yet.
There's also a lot more cleanup to do (moving more classes around;
removing no longer used functionality; documenting script and C++
interfaces; regression tests). But it seems to generally work with a
small trace at least.

The debug stream "dpm" shows more about the loaded/enabled analyzers.

A new option -N lists loaded plugins and what they provide (including
those compiled in statically; i.e., right now it outputs all the
analyzers).

This is all not cast-in-stone yet, for some things we need to see if
they make sense this way. Feedback welcome.
2013-03-26 11:05:38 -07:00
Jon Siwek
84a0c2fdac FileAnalysis: file handles now set from events.
Versus from synchronous function calls, which doesn't work well because
the function call can see a script-layer state that doesn't reflect
the state as it will be in terms of the event/network stream.
2013-03-25 15:37:58 -05:00
Jon Siwek
71f0e2d276 FileAnalysis: replace script-layer http file analysis.
Other misc:

- Remove HTTP::MD5 notice.

- Add "last_active" field to FileAnalysis::Info record.

- Replace "conn_uids", "conn_ids" fields in FileAnalysis::Info record
  with just a "conns" fields containing full connection records.

- The http-methods unit test is failing now, but I think it will be
  fixed once I change the file handle callback mechanism to use events
  instead.
2013-03-22 16:14:06 -05:00
Jon Siwek
7034785810 FileAnalysis: add logging, file_analysis.log. 2013-03-20 13:31:11 -05:00
Jon Siwek
661677d452 FileAnalysis: separating IRC/FTP data analyzers.
It simplifies the file handle string callbacks.
2013-03-20 11:12:06 -05:00
Jon Siwek
59ed5c75f1 FileAnalysis: add unit tests covering current protocol integration.
And had to make various fixes/refinements after scrutinizing results.
2013-03-19 15:50:05 -05:00
Seth Hall
6dc204b385 Checkpoint, don't try running this. It's broken all over the place. 2013-03-19 11:39:58 -04:00
Bernhard Amann
8875953751 A bunch of more changes for the raw reader
* send end_of_data event for all kind of streams
* send process_finished event containing exit code of child process for executed programs
* move raw-tests to separate directory
* expose name of input stream to readers
* better handling of some error cases in raw reader
* new force_kill option for raw reader which SIGKILLs progesses on exit

The ordering of events how they arrive in the main loop is a bit peculiar at the moment.
The process_finished event arrives in scriptland before all of the other events, even though
it should be sent last. I have not yet fully figured that out.
2013-03-18 21:49:16 -07:00
Jon Siwek
294570ec2e Merge branch 'master' into topic/jsiwek/file-analysis 2013-03-18 11:48:05 -05:00
Jon Siwek
550c3c477d FileAnalysis: integrate w/ SMTP analyzer.
More generally w/ MIME_Mail messages, which POP3 analyzer also uses.
2013-03-18 11:30:59 -05:00
Robin Sommer
d58a02aa01 Merge remote-tracking branch 'origin/topic/bernhard/base64'
* origin/topic/bernhard/base64:
  and re-enable caching of extracted certs
  and add bae64 bif tests.
  re-unify classes
  and modernize script.
  add base64-encode functionality and bif.

Closes #965.
2013-03-17 13:00:52 -07:00
Robin Sommer
38e1dc9ca4 Support for cleaning up threads that have terminated.
Once a BasicThread leaves its run() method, a thread is now marked for
cleaning up, and the ThreadMgr will soon join it to release the OS
resources.

Also, adding a function Log::remove_stream() that remove a logging
stream, stopping all writer threads that are associated with it.

Note, however, that removing a *filter* from a stream still doesn't
clean up any threads. The problem is that because of the output paths
potentially being created dynamically it's unclear if the writer
thread will still be needed in the future. We could add clean writers
up with timeouts, but that doesn't sound great either. So for now, the
only way to sure clean up logging threads is to remove the entire
stream.

Also note that cleanup doesn't work with input threads yet, which
don't seem to terminate (at least in the case I tried).
2013-03-14 14:59:05 -07:00
Jon Siwek
e0f3713912 FileAnalysis: change file handle -> file id mapping process.
They're now actually directly related via a hash function that will
produce the same results among different instances in a cluster.
2013-03-14 14:08:26 -05:00
Jon Siwek
637fe69cf9 FileAnalysis: buffer input that can't get unique file handle immediately
A retry happens on every new input and also periodically based on a
timer.  If a file handle is returned at those times, the input is
forwarded for analysis, else it keeps retrying until a timeout
threshold.
2013-03-14 10:57:16 -05:00