Commit graph

880 commits

Author SHA1 Message Date
Seth Hall
f1d165956a Fix path compression to include removing "/./".
- This involved a fix to the FTP scripts that relied on the old behavior.
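
To illustrate the behavior described, here is a minimal Python sketch of path compression that drops "/./" segments. It is not the Bro implementation: the name compress_path_sketch is hypothetical, and the handling of duplicate slashes and ".." is an assumption added for completeness.

```python
def compress_path_sketch(path: str) -> str:
    """Collapse '.', empty, and '..' segments in a slash-separated path.

    Hypothetical helper for illustration only; the commit message only
    states that "/./" is now removed.
    """
    absolute = path.startswith("/")
    kept = []
    for segment in path.split("/"):
        if segment in ("", "."):
            # "" comes from duplicate slashes, "." from "/./": both are no-ops.
            continue
        if segment == "..":
            if kept and kept[-1] != "..":
                kept.pop()              # step back one directory
            elif not absolute:
                kept.append("..")       # a relative path may climb above its start
            continue
        kept.append(segment)
    joined = "/".join(kept)
    if absolute:
        return "/" + joined
    return joined or "."


print(compress_path_sketch("/pub/./files/./a.txt"))
```

Under these assumptions, an FTP path such as "/pub/./files/./a.txt" compresses to "/pub/files/a.txt".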
2013-04-02 00:16:56 -04:00
Robin Sommer
e0c4bd1a82 Lots of cleanup and API documentation for the analyzer/* classes.
I've used the opportunity to also clean up DPD's expect_connection()
infrastructure, and renamed that bif to schedule_analyzer(), which
seems more appropriate. One can now also schedule more than one
analyzer per connection.

TODOs:
        - "make install" is probably broken.
        - Broxygen is probably broken for plugin-defined events.
        - event groups are broken (do we want to keep them?)
        - parallel btest is broken, but I'm not sure why ...
          (tests all pass individually, but lots of errors when running
          in parallel; must be related to *.bif restructuring).
        - Document API for src/plugin/*
        - Document API for src/analyzer/Analyzer.h
        - Document API for scripts/base/frameworks/analyzer
2013-04-01 13:12:21 -07:00
Seth Hall
93eca70e6b Merge remote-tracking branch 'origin/master' into topic/seth/metrics-merge 2013-04-01 14:16:46 -04:00
Jon Siwek
3642ecc73e FileAnalysis: misc. tweaks/fixes.
- Add a timeout flag to file_analysis.log so it's easy to tell what
  has had at least one timeout trigger happen.

- Fix ftp-data service tag not being set for reused connections.

- Fix HTTP::Incorrect_File_Type because mime types returned by FAF have
  the charset still in them, but the HTTP::mime_types_extensions table
  does not and it requires an exact string match. (still ugly)

- Add TRIGGER_NEW_CONN to track files going over multiple connections.

- Add an initial file/mime type guess for non-linear file transfers.

- Fix a case where file/mime type detection would never be attempted
  if the start of the file was a content gap.

- Improve mime type tracking of HTTP byte-range/partial-content,
  even if the requests are pipelined or over multiple connections.

- I changed the modbus.events test because having the baseline output
  be 80+ MB is nuts and it was sensitive to connection record redefs.
2013-03-28 16:59:29 -05:00
Jon Siwek
27e47f0a57 FileAnalysis: replace script-layer IRC file analysis. 2013-03-27 14:02:20 -05:00
Jon Siwek
7e895a3a2f FileAnalysis: replace script-layer FTP file analysis.
The notable difference here is that ftp.log now logs by default
the PORT, PASV, EPRT, EPSV commands as well as a separate line for
ftp-data channels in which file extraction was requested.

This difference isn't a direct result of now doing the file extraction
through the file analysis framework, it's just because I noticed even
the old way of tracking extracted-file name didn't work right and this
was the way I came up with so that a locally extracted file can be
associated with a data channel and then that data channel associated
with a control channel.
2013-03-27 12:59:38 -05:00
Robin Sommer
2be985433c Test-suite passes.
All tests pass with one exception: some Broxygen tests are broken
because dpd_config doesn't exist anymore. Need to update the mechanism
for auto-documenting well-known ports.
2013-03-26 15:40:23 -07:00
Jon Siwek
497496ec83 FileAnalysis: replace script-layer SMTP file analysis.
Notable differences:

- Removed SMTP::MD5 notice.

- Removed ability to specify mime entity excerpt length per mime-type.
2013-03-26 15:48:52 -05:00
Robin Sommer
af1809aaa3 First prototype of new analyzer framework.
This is a larger internal change that moves the analyzer
infrastructure to a more flexible model where the available analyzers
don't need to be hardcoded at compile time anymore. While currently
they actually still are, this will in the future enable external
analyzer plugins. For now, it does already add the capability to
dynamically enable/disable analyzers from script-land, replacing the
old Analyzer::Available() methods.

There are three major parts going into this:

    - A new plugin infrastructure in src/plugin. This is independent
      of analyzers and will eventually support plugins for other parts
      of Bro as well (think: readers and writers). The goal is that
      plugins can be alternatively compiled in statically or loaded
      dynamically at runtime from a shared library. While the latter
      isn't there yet, there'll be almost no code change for a plugin
      to make it dynamic later (hopefully :)

    - New analyzer infrastructure in src/analyzer. I've moved a number
      of analyzer-related classes here, including Analyzer and DPM;
      the latter now renamed to Analyzer::Manager. More will move here
      later. Currently, there's only one plugin here, which provides
      *all* existing analyzers. We can modularize this further in the
      future (or not).

    - A new script interface in base/framework/analyzer. I think that
      this will eventually replace the dpm framework, but for now
      that's still there as well, though some parts have moved over.

I've also removed the dpd_config table; ports are now configured via
the analyzer framework. For example, for SSH:

    const ports = { 22/tcp } &redef;

    event bro_init() &priority=5
        {
        ...
        Analyzer::register_for_ports(Analyzer::ANALYZER_SSH, ports);
        }

As you can see, the old ANALYZER_SSH constants have moved into an enum
in the Analyzer namespace.

This is all hardly tested right now, and not everything works yet.
There's also a lot more cleanup to do (moving more classes around;
removing no longer used functionality; documenting script and C++
interfaces; regression tests). But it seems to generally work with a
small trace at least.

The debug stream "dpm" shows more about the loaded/enabled analyzers.

A new option -N lists loaded plugins and what they provide (including
those compiled in statically; i.e., right now it outputs all the
analyzers).

None of this is cast in stone yet; for some things we need to see
whether they make sense this way. Feedback welcome.
2013-03-26 11:05:38 -07:00
Jon Siwek
84a0c2fdac FileAnalysis: file handles now set from events.
Versus from synchronous function calls, which doesn't work well because
the function call can see a script-layer state that doesn't reflect
the state as it will be in terms of the event/network stream.
2013-03-25 15:37:58 -05:00
Jon Siwek
71f0e2d276 FileAnalysis: replace script-layer http file analysis.
Other misc:

- Remove HTTP::MD5 notice.

- Add "last_active" field to FileAnalysis::Info record.

- Replace "conn_uids", "conn_ids" fields in FileAnalysis::Info record
  with just a "conns" field containing full connection records.

- The http-methods unit test is failing now, but I think it will be
  fixed once I change the file handle callback mechanism to use events
  instead.
2013-03-22 16:14:06 -05:00
Jon Siwek
661677d452 FileAnalysis: separating IRC/FTP data analyzers.
It simplifies the file handle string callbacks.
2013-03-20 11:12:06 -05:00
Jon Siwek
59ed5c75f1 FileAnalysis: add unit tests covering current protocol integration.
And had to make various fixes/refinements after scrutinizing results.
2013-03-19 15:50:05 -05:00
Jon Siwek
294570ec2e Merge branch 'master' into topic/jsiwek/file-analysis 2013-03-18 11:48:05 -05:00
Jon Siwek
550c3c477d FileAnalysis: integrate w/ SMTP analyzer.
More generally w/ MIME_Mail messages, which POP3 analyzer also uses.
2013-03-18 11:30:59 -05:00
Jon Siwek
637fe69cf9 FileAnalysis: buffer input that can't get unique file handle immediately
A retry happens on every new input and also periodically based on a
timer.  If a file handle is returned at those times, the input is
forwarded for analysis, else it keeps retrying until a timeout
threshold.
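
The retry logic above can be sketched roughly as follows. This is a Python illustration, not the C++ code from this commit: the class name, the callback signatures, and the choice to drop buffered input once the timeout is reached are all assumptions.

```python
from typing import Callable, List, Optional


class PendingInput:
    """Buffer input until a file handle is available (hypothetical sketch)."""

    def __init__(self,
                 get_handle: Callable[[], Optional[str]],
                 forward: Callable[[str, bytes], None],
                 timeout: float) -> None:
        self.get_handle = get_handle    # returns None while no handle exists
        self.forward = forward          # hands data on for analysis
        self.timeout = timeout
        self.chunks: List[bytes] = []
        self.first_seen: Optional[float] = None
        self.handle: Optional[str] = None
        self.expired = False

    def _retry(self, now: float) -> None:
        if self.handle is None:
            self.handle = self.get_handle()
        if self.handle is not None:
            # A handle is available: forward everything buffered so far.
            for chunk in self.chunks:
                self.forward(self.handle, chunk)
            self.chunks.clear()
        elif self.first_seen is not None and now - self.first_seen >= self.timeout:
            # Assumption: give up and drop the buffer at the timeout threshold.
            self.chunks.clear()
            self.expired = True

    def on_input(self, data: bytes, now: float) -> None:
        """Retry on every new input."""
        if self.expired:
            return
        if self.first_seen is None:
            self.first_seen = now
        self.chunks.append(data)
        self._retry(now)

    def on_timer(self, now: float) -> None:
        """Retry periodically from a timer."""
        if not self.expired and self.chunks:
            self._retry(now)
```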
2013-03-14 10:57:16 -05:00
Seth Hall
8778761c07 Checkpoint 2013-03-13 22:55:03 -04:00
Jon Siwek
878dfff2f2 FileAnalysis: decentralize unique file handle generator callbacks.
The framework now cycles through callbacks based on a table indexed
by analyzer tags, or the special case of service strings if a given
analyzer is overloaded for multiple protocols (FTP/IRC data).  This
lets each protocol script bundle implement the callback locally and
reduces the FAF's external dependencies.
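
A rough Python sketch of the dispatch idea follows. It is not the framework's script API: the table names, the registration functions, the "analyzer_tag"/"services" keys, and the tag strings used in the example are all hypothetical.

```python
from typing import Callable, Dict, Optional

Conn = Dict[str, object]
HandleCallback = Callable[[Conn], Optional[str]]

# One table indexed by analyzer tag, plus a special-case table indexed by
# service string for analyzers shared by several protocols.
callbacks_by_tag: Dict[str, HandleCallback] = {}
callbacks_by_service: Dict[str, HandleCallback] = {}


def register_by_tag(tag: str, callback: HandleCallback) -> None:
    callbacks_by_tag[tag] = callback


def register_by_service(service: str, callback: HandleCallback) -> None:
    callbacks_by_service[service] = callback


def file_handle_for(conn: Conn) -> Optional[str]:
    """Pick the callback for this connection and return its file handle."""
    callback = callbacks_by_tag.get(str(conn.get("analyzer_tag")))
    if callback is not None:
        return callback(conn)
    # Overloaded analyzer: fall back to the connection's service strings.
    for service in conn.get("services", ()):
        callback = callbacks_by_service.get(service)
        if callback is not None:
            return callback(conn)
    return None


# Each protocol bundle registers its own callback locally.
register_by_tag("ANALYZER_HTTP", lambda c: "http-%s" % c["uid"])
register_by_service("ftp-data", lambda c: "ftp-data-%s" % c["uid"])
register_by_service("irc-dcc-data", lambda c: "irc-dcc-%s" % c["uid"])
```

With this layout the central lookup needs no knowledge of any protocol; adding a protocol means adding one registration call in that protocol's own scripts.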
2013-03-13 10:48:26 -05:00
Jon Siwek
3dd513e26e FileAnalysis: move unique file handle string generation to script-layer
And add minimal integration with HTTP analyzer.
2013-03-12 13:44:31 -05:00
Bernhard Amann
5e8e12182a add base64-encode functionality and bif.
This allows replacing an ugly openssl-call from one of
the policy scripts. The openssl call is now replaced with
a still-but-less-ugly call to base64_encode.

I do not know if I split the Base64 classes in a "smart" way... :)
2013-03-05 16:05:07 -08:00
Seth Hall
a2556642e6 Merge remote-tracking branch 'origin/topic/matthias/notary'
* origin/topic/matthias/notary:
  Small cosmetic changes.
  Give log buffer the correct name.
  Simplify delayed logging of SSL records.
  Implement delay-token style SSL logging.
  More style tweaks: replace spaces with tabs.
  Factor notary code into separate file.
  Adhere to Bro coding style guidelines.
  Enhance ssl.log with information from notary.

Closes #928
2013-02-05 02:06:33 -05:00
Jon Siwek
69afc4a882 Add an error for record coercions that would orphan a field.
These cases should be avoidable by fixing scripts where they occur and
they can also help catch typos that would lead to unintentional runtime
behavior.

Adding this already revealed several scripts where a field in an inlined
record was never removed after a code refactor.
2013-01-24 09:56:19 -06:00
Matthias Vallentin
32a0ead698 Give log buffer the correct name. 2012-12-24 23:06:56 -08:00
Matthias Vallentin
7ff15f4599 Simplify delayed logging of SSL records. 2012-12-24 22:57:49 -08:00
Matthias Vallentin
9e81342c92 Implement delay-token style SSL logging.
This commit moves the notary script into the policy directory, along with some
architectural changes: the main SSL script now has functionality to add and
remove tokens for a given record. When adding a token, the script delays the
logging until the token has been removed or until the record exceeds a maximum
delay time.

As before, the base SSL script stores all records sequentially and buffers even
non-delayed records for the sake of having an ordered log file. If this turns
out to be not so important, we can easily revert to a simpler logic.

(This is still WiP, some debugging statements still linger.)
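
For readers unfamiliar with the delay-token idea, here is a minimal Python sketch of it. It is not the Bro script from this commit: the class and method names are hypothetical, and timestamps are passed in explicitly to keep the example deterministic.

```python
from collections import deque
from typing import Callable, Deque, Set


class Entry:
    def __init__(self, record: dict, created: float) -> None:
        self.record = record
        self.created = created
        self.tokens: Set[str] = set()   # outstanding delay tokens


class DelayedLog:
    """Write records in arrival order, holding back delayed ones (sketch)."""

    def __init__(self, max_delay: float, write: Callable[[dict], None]) -> None:
        self.max_delay = max_delay
        self.write = write
        self.queue: Deque[Entry] = deque()

    def enqueue(self, record: dict, now: float) -> Entry:
        # Even non-delayed records are buffered to keep the log ordered.
        entry = Entry(record, now)
        self.queue.append(entry)
        return entry

    def add_token(self, entry: Entry, token: str) -> None:
        entry.tokens.add(token)

    def remove_token(self, entry: Entry, token: str, now: float) -> None:
        entry.tokens.discard(token)
        self.flush(now)

    def flush(self, now: float) -> None:
        # Stop at the first record that still holds a token and has not
        # yet exceeded the maximum delay.
        while self.queue:
            head = self.queue[0]
            if head.tokens and now - head.created < self.max_delay:
                break
            self.write(self.queue.popleft().record)
```

A record without tokens still waits behind an earlier delayed record, which is the ordering cost mentioned above.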
2012-12-22 20:30:17 -08:00
Matthias Vallentin
8a569facd6 More style tweaks: replace spaces with tabs. 2012-12-21 18:04:19 -08:00
Matthias Vallentin
382262e286 Factor notary code into separate file.
There exists one complication: the new file notary.bro requires the definition
of the SSL::Info record, but as does main.bro. Because I did not really know
where to put the common code (it's not a constant, so ssl/const.bro does not
really fit), I put it into __load.bro__ so that it sticks out for now. If
anybody has an idea how to solve this elegantly, please let me know.
2012-12-21 17:56:31 -08:00
Matthias Vallentin
7355a0089a Adhere to Bro coding style guidelines. 2012-12-21 17:17:58 -08:00
Matthias Vallentin
ff8184242a Enhance ssl.log with information from notary.
This commit enhances each log line with the data from the notary when
available. The added fields include:

  - notary.first_seen
  - notary.last_seen
  - notary.times_seen
  - notary.valid

The semantics of these fields map 1-to-1 to the corresponding fields in DNS TXT
lookups from the notary. The implementation of this feature required a bit of
plumbing: when Bro finishes the analysis, the log record is copied into a table
indexed by connection ID where it remains until either Bro terminates or the
answer of the notary arrives. The script accumulates requests for a given
digest into a "waitlist," to avoid multiple redundant lookups for high-profile
websites that receive a large chunk of traffic. When a DNS reply arrives
asynchronously, the when handler clears the waitlist and assigns the
information to all records in the buffer.
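
The waitlist can be pictured with the following Python sketch. It is an illustration under assumptions rather than the script's code: the class name, the lookup callback, and the dictionary-shaped records are hypothetical.

```python
from typing import Callable, Dict, List


class NotaryWaitlist:
    """Coalesce lookups per certificate digest (hypothetical sketch)."""

    def __init__(self, lookup: Callable[[str], None]) -> None:
        self.lookup = lookup                      # starts one async query
        self.waiting: Dict[str, List[dict]] = {}  # digest -> waiting records

    def request(self, digest: str, record: dict) -> bool:
        """Queue a record; return True only if a new lookup was issued."""
        if digest in self.waiting:
            self.waiting[digest].append(record)   # a lookup is already in flight
            return False
        self.waiting[digest] = [record]
        self.lookup(digest)
        return True

    def on_reply(self, digest: str, info: dict) -> List[dict]:
        """Clear the waitlist for a digest and fill in every waiting record."""
        records = self.waiting.pop(digest, [])
        for record in records:
            record["notary"] = info
        return records
```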

The script also adds each log entry into a double-ended queue to make sure the
records arrive on disk in the same way Bro sees them. Each reply also triggers
a sweep through this deque which flushes the buffer up to the first outstanding
reply.

Here is an example from the public M57 trace from 2009:

  % bro-cut ts id.orig_h id.resp_h server_name notary.first_seen notary.last_seen notary.times_seen notary.valid < ssl.log
  1258562650.121682 192.168.1.104 208.97.132.223  mail.m57.biz  - - - -
  1258535660.267128 192.168.1.104 65.55.184.16  - - - - -
  1258561662.604948 192.168.1.105 66.235.128.158  - - - - -
  1258561885.571010 192.168.1.105 65.55.184.155 www.update.microsoft.com  - - - -
  1258563578.455331 192.168.1.103 208.97.132.223  - - - - -
  1258563716.527681 192.168.1.103 96.6.248.124  - - - - -
  1258563884.667153 192.168.1.103 66.235.139.152  - - - - -
  1258564818.755676 192.168.1.103 12.41.118.177 - - - - -
  1258564821.637874 192.168.1.103 12.41.118.177 - - - - -
  1258564821.637871 192.168.1.103 12.41.118.177 - - - - -
  1258564821.637876 192.168.1.103 12.41.118.177 - - - - -
  1258564821.638126 192.168.1.103 12.41.118.177 - - - - -
  1258562467.525034 192.168.1.104 208.97.132.223  mail.m57.biz  15392 15695 301 F
  1258563063.965975 192.168.1.104 63.245.209.105  aus2.mozilla.org  - - - -
  1258563064.091396 192.168.1.104 63.245.209.91 addons.mozilla.org  - - - -
  1258563329.202273 192.168.1.103 208.97.132.223  - 15392 15695 301 F
  1258563712.945933 192.168.1.103 65.55.16.121  - - - - -
  1258563714.044500 192.168.1.103 65.54.186.79  - - - - -
  1258563716.146680 192.168.1.103 96.6.248.124  - - - - -
  1258563737.432312 192.168.1.103 96.6.245.186  - - - - -
  1258563716.526933 192.168.1.103 96.6.245.186  - - - - -
  1258563716.527430 192.168.1.103 96.6.245.186  - - - - -
  1258563716.527179 192.168.1.103 96.6.245.186  - - - - -
  1258563716.527683 192.168.1.103 96.6.245.186  - - - - -
  1258563716.527432 192.168.1.103 96.6.245.186  - - - - -
  1258563751.178683 192.168.1.103 66.235.139.152  - - - - -
  1258563751.171938 192.168.1.103 65.54.234.75  - - - - -
  1258563751.182433 192.168.1.103 65.242.27.35  - - - - -
  1258563883.414188 192.168.1.103 65.55.16.121  - - - - -
  1258563884.702380 192.168.1.103 65.242.27.35  - - - - -
  1258563885.678766 192.168.1.103 65.54.186.79  - - - - -
  1258563886.124987 192.168.1.103 65.54.186.79  - - - - -
  1258564027.877525 192.168.1.103 65.54.234.75  - - - - -
  1258564688.206859 192.168.1.103 65.54.186.107 - - - - -
  1258567162.001225 192.168.1.105 208.97.132.223  mail.m57.biz  - - - -
  1258568040.512840 192.168.1.103 208.97.132.223  - - - - -
  1258564688.577376 192.168.1.103 207.46.120.170  - - - - -
  1258564723.029005 192.168.1.103 65.54.186.107 - - - - -
  1258564723.784032 192.168.1.103 65.55.194.249 - - - - -
  1258564748.521756 192.168.1.103 65.54.186.107 - - - - -
  1258564817.601152 192.168.1.103 12.41.118.177 - - - - -
  1258565684.353653 192.168.1.105 208.97.132.223  mail.m57.biz  15392 15695 301 F
  1258565710.188691 192.168.1.105 74.125.155.109  pop.gmail.com - - - -
  1258566061.103696 192.168.1.103 208.97.132.223  - 15392 15695 301 F
  1258566893.914987 192.168.1.102 208.97.132.223  - 15392 15695 301 F
2012-12-21 17:03:39 -08:00
Robin Sommer
da90976170 Merge remote-tracking branch 'origin/topic/matthias/opaque'
* origin/topic/matthias/opaque:
  Add new unit test for opaque serialization.
  Migrate entropy testing to opaque.
  C++ify RandTest.*
  Fix a hard-to-spot bug.
  Use more descriptive error message.
  Fix the fix :-/.
  Fix initialization of hash values.
  Be clearer about delegation.
  Implement serialization of opaque types.
  Update hash BiF documentation.
  Migrate free SHA* functions to SHA*Val::digest().
  Add missing type name that caused failing tests.
  Update base scripts and unit tests.
  Simplify hash function BiFs.
  Add support for opaque hash values.
  Adapt BiF & Bro parser to handle opaque types.
  More lexer/parser work.
  Implement equivalence relation for opaque types.
  Support basic serialization of opaque.
  Add opaque type to lexer, parser, and BroType.

Closes #925

Conflicts:
	aux/broccoli
2012-12-20 16:30:22 -08:00
Seth Hall
0cf98ac325 Improved file name extraction for SMTP when file name is included in Content-Type header. 2012-12-13 10:27:08 -05:00
Matthias Vallentin
816965f3c7 Merge remote-tracking branch 'origin/master' into topic/matthias/opaque 2012-12-11 16:32:01 -08:00
Matthias Vallentin
30bab14dbf Update base scripts and unit tests. 2012-12-11 16:26:17 -08:00
Robin Sommer
57510464a1 Adapting the HTTP request line parsing to only accept methods
consisting of letters [A-Za-z].

I had some bogus HTTP sessions now with the test-suite that reported
data as HTTP because it started with "<!... ". Requiring letters seems
a reasonable constraint.
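
The constraint can be illustrated with a small Python sketch. It is not the analyzer's C++ parsing code: the function name is hypothetical, and splitting on the first space is an assumption about where the method ends.

```python
import re
from typing import Optional

# Methods may consist of ASCII letters only, as described above.
METHOD_RE = re.compile(r"[A-Za-z]+")


def parse_method(request_line: str) -> Optional[str]:
    """Return the method if the first word is all letters, else None."""
    first_word = request_line.split(" ", 1)[0]
    if METHOD_RE.fullmatch(first_word):
        return first_word
    return None


print(parse_method("GET /index.html HTTP/1.1"))
print(parse_method("<!DOCTYPE html>"))
```

Under this rule a payload starting with "<!" yields no method and would not be reported as an HTTP request.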
2012-12-05 16:56:54 -08:00
Robin Sommer
177c014cb7 Merge remote-tracking branch 'vlad/topic/vladg/http-verbs'
* vlad/topic/vladg/http-verbs:
  A test for HTTP methods, including some horribly illegal requests.
  Remove hardcoded HTTP verbs from the analyzer (#741)

I added a "bad_HTTP_request" weird for HTTP request lines that don't
have more than a single word.

Closes #741.
2012-12-05 15:27:42 -08:00
Vlad Grigorescu
e98343b562 Remove hardcoded HTTP verbs from the analyzer (#741) 2012-11-30 20:08:20 -05:00
Seth Hall
c98301e51f Fixed a DNS attribute issue (reported by Matt Thompson). 2012-11-26 15:58:25 -05:00
Seth Hall
3546d93f36 Merging master. 2012-11-21 12:18:03 -05:00
Robin Sommer
4e12813445 Fixing tests after modbus merge. 2012-11-05 15:58:38 -08:00
Robin Sommer
86ce564107 Merge remote-tracking branch 'remotes/origin/topic/seth/modbus-merge'
* remotes/origin/topic/seth/modbus-merge:
  Small modbus documentation update and tiny refactoring.
  Final touches to modbus analyzer for now.
  Major revisions to Modbus analyzer support (not quite done yet).
  put some make-up on Modbus analyser
  Modbus analyser, added support: FC=20,21
  Modbus analyzer,added support: FC=1,2,15,24
  Modbus analyzer, current support: FC=3,4,5,6,7,16,22,23

Closes #915.
2012-11-05 15:26:57 -08:00
Seth Hall
c32b179ac5 Small modbus documentation update and tiny refactoring. 2012-10-31 23:57:38 -04:00
Seth Hall
a2f336cc72 Final touches to modbus analyzer for now.
- There are still some broken events in the modbus analyzer because
  I don't have traffic to test with (coil and record related events primarily).

- There are a few example scripts in policy/protocols/modbus
2012-10-31 23:34:43 -04:00
Jon Siwek
18f8427579 Change how "gridftp" gets added to service field of connection records.
In addition to checking for a finished SSL handshake over an FTP
connection, it now also requires that the SSL handshake occurs after
the FTP client requested AUTH GSSAPI, more specifically identifying the
characteristics of GridFTP control channels.

Addresses #891.
2012-10-17 12:09:12 -05:00
Robin Sommer
5e12a53ae5 Merge remote-tracking branch 'origin/topic/jsiwek/gridftp'
* origin/topic/jsiwek/gridftp:
  Add memory leak unit test for GridFTP.
  Enable GridFTP detection by default.  Track/log SSL client certs.
  Add analyzer for GSI mechanism of GSSAPI FTP AUTH method.
  Add an example of a GridFTP data channel detection script.
2012-10-12 10:43:16 -07:00
Jon Siwek
e34f6d9e3b Enable GridFTP detection by default. Track/log SSL client certs.
In the *service* field of connection records, GridFTP control channels
are labeled as "gridftp" and data channels as "gridftp-data".

Added *client_subject* and *client_issuer_subject* as &log'd fields to
SSL::Info record.  Also added *client_cert* and *client_cert_chain*
fields to track client cert chain.
2012-10-08 11:38:29 -05:00
Seth Hall
db62369508 Fix for DNS log problem when a DNS response is seen with 0 RRs. 2012-10-05 13:48:49 -04:00
Jon Siwek
49b8c7e390 Add analyzer for GSI mechanism of GSSAPI FTP AUTH method.
GSI authentication involves an encoded TLS/SSL handshake over the FTP
control session.  Decoding the exchanged tokens and passing them to an
SSL analyzer instance allows use of all the familiar script-layer events
in inspecting the handshake (e.g. client/server certificates are
available).  For FTP sessions that attempt GSI authentication, the
service field of the connection record will have both "ftp" and "ssl".

One additional change is an FTP server's acceptance of an AUTH request
no longer causes analysis of the connection to cease (because further
analysis likely wasn't possible).  This decision can be made more
dynamically at the script-layer (plus there's now the fact that further
analysis can be done at least on the GSSAPI AUTH method).
2012-10-05 10:43:23 -05:00
Jon Siwek
68aead024a Add an example of a GridFTP data channel detection script.
It relies on the heuristics that GridFTP data channels commonly default to
SSL mutual authentication with a NULL bulk cipher and that they usually
transfer large datasets (default threshold of script is 1 GB).  The
script also defaults to skip_further_processing() after detection to try
to save cycles analyzing the large, benign connection.

Also added a script in base/protocols/conn/polling that generalizes the
process of polling a connection for interesting features.  The GridFTP
data channel detection script depends on it to monitor bytes
transferred.
2012-10-01 12:32:24 -05:00
Seth Hall
009efbcb27 Major revisions to Modbus analyzer support (not quite done yet).
- Renamed many data structures to align with most recent standard.

- Reworked modbus events to make them more canonically "Bro".

- Converted the Modbus analyzer to a simpler style for easier maintenance.

- Modbus coil related events still don't work (I haven't finished the
  function for converting the data structures).

- Modbus file record events remain incomplete.
2012-09-17 09:19:52 -04:00
Robin Sommer
cbb31cedc3 Merge remote-tracking branch 'origin/topic/dina/modbus' into topic/robin/modbus-merge
* origin/topic/dina/modbus:
  put some make-up on Modbus analyser
  Modbus analyser, added support: FC=20,21
  Modbus analyzer,added support: FC=1,2,15,24
  Modbus analyzer, current support: FC=3,4,5,6,7,16,22,23

I cleaned up the code a bit, mainly layout style.

I did not include the *.bro scripts for now, but a test script
../testing/btest/scripts/base/protocols/modbus/events.bro that prints
out the value for each event.

Merged the Modbus traces from the ics repository into a single trace
as input for the test. They currently trigger 20 of the 34 events.

Addresses #870.
2012-08-29 17:58:41 -07:00