Commit graph

81 commits

Author SHA1 Message Date
Daniel Thayer
8f2336f531 Add README files for base/protocols
The text from these README files appears on the "Bro Script Packages"
page after building the documentation.
2013-10-17 12:47:32 -05:00
Daniel Thayer
fe60404f0f Fix typos and formatting in the http protocol docs
Also adjusted line numbers in scripting doc due to changes in http/main.bro
2013-10-16 13:13:53 -05:00
Robin Sommer
984e9793db Merge remote-tracking branch 'origin/topic/seth/faf-updates'
* origin/topic/seth/faf-updates: (27 commits)
  Undoing the FTP tests I updated earlier.
  Update the last two btest FAF tests.
  File analysis fixes and test updates.
  Fix a bug with getting analyzer tags.
  A few test updates.
  Some tests work now (at least they all don't fail anymore!)
  Forgot a file.
  Added protocol description functions that provide a super compressed log representation.
  Fix a bug where orig file information in http wasn't working right.
  Added mime types to http.log
  Clean up queued but unused file_over_new_connections event args.
  Add jar files to the default MHR lookups.
  Adding CAB files for MHR checking.
  Improve malware hash registry script.
  Fix a small issue with finding smtp entities.
  Added support for files to the notice framework.
  Make the custom libmagic database a git submodule.
  Add an is_orig parameter to file_over_new_connection event.
  Make magic for emitting application/msword mime type less strict.
  Disable more libmagic builtin checks that override the magic database.
  ...

Conflicts:
	doc/scripts/DocSourcesList.cmake
	scripts/base/init-bare.bro
	scripts/test-all-policy.bro
	testing/btest/Baseline/coverage.bare-load-baseline/canonified_loaded_scripts.log
	testing/btest/Baseline/coverage.default-load-baseline/canonified_loaded_scripts.log
2013-07-29 14:21:52 -07:00
Jon Siwek
939619889d File analysis fixes and test updates.
- Several places were just using old variable names or not loading
  scripts correctly after they'd been renamed/moved.

- Revert/adjust a change in how HTTP file handles are generated that
  broke partial content responses.

- Turn some libmagic builtin checks back on; seems some are actually
  useful (e.g. text detection seems to be a builtin).  The rule going
  forward probably will be only to turn off a builtin if we confirm it
  causes issues.

- Removed some tests that are redundant or not necessary anymore because
  the generic file analysis tests cover them.

- A couple FTP tests still fail that I think need an actual solution via
  script changes.
2013-07-25 16:51:16 -05:00
Seth Hall
0bfdcc1fbc Added protocol description functions that provide a super compressed log representation. 2013-07-16 12:01:50 -04:00
Seth Hall
4dd4c5344e Fix a bug where orig file information in http wasn't working right. 2013-07-12 16:12:26 -04:00
Seth Hall
b14f5a853e Added mime types to http.log 2013-07-12 16:06:40 -04:00
Seth Hall
2e0912b543 Merge remote-tracking branch 'origin/topic/seth/bittorrent-fix-and-dpd-sig-breakout' into topic/seth/faf-updates
Conflicts:
	magic
	scripts/base/protocols/http/__load__.bro
	scripts/base/protocols/irc/__load__.bro
	scripts/base/protocols/smtp/__load__.bro
2013-07-10 16:28:38 -04:00
Seth Hall
39444b5af7 Moved DPD signatures into script specific directories.
- This caused us to lose signatures for POP3 and Bittorrent.  These will
   need discovered in the repository again when we add scripts
   for those analyzers.
2013-07-09 22:44:55 -04:00
Jon Siwek
73155c321b Add an is_orig parameter to file_over_new_connection event. 2013-07-09 15:58:28 -05:00
Seth Hall
cdf6b7864e More file analysis updates.
- Recorrected the module name to Files.

  - Added Files::analyzer_name to get a more readable name for a
    file analyzer.

  - Improved and just overall better handled multipart mime
    transfers in HTTP and SMTP.  HTTP now has orig_fuids and resp_fuids
    log fields since multiple "files" can be transferred with
    multipart mime in a single request/response pair.  SMTP has
    an fuids field which has file unique IDs for all parts
    transferred. FTP and IRC have a log field named fuid added
    because only a single file can be transferred per irc and ftp
    log line.
2013-07-09 11:50:54 -04:00
Seth Hall
58d133e764 Merge remote-tracking branch 'origin/master' into topic/seth/faf-updates
Conflicts:
	scripts/base/frameworks/files/main.bro
	scripts/base/init-bare.bro
	scripts/base/protocols/ftp/file-analysis.bro
	scripts/base/protocols/http/file-analysis.bro
	scripts/base/protocols/irc/file-analysis.bro
	scripts/base/protocols/smtp/file-analysis.bro
	src/const.bif
	src/event.bif
	src/file_analysis/Analyzer.h
	src/file_analysis/file_analysis.bif
2013-07-05 02:13:27 -04:00
Seth Hall
df2841458d Large overhaul in name and appearance for file analysis. 2013-07-05 02:00:14 -04:00
Seth Hall
4149724f59 Updates for the PacketFilter framework to simplify it. 2013-07-05 01:12:22 -04:00
Robin Sommer
d8b05af7e5 Merge remote-tracking branch 'origin/topic/jsiwek/faf-cleanup'
Closes #1002.

* origin/topic/jsiwek/faf-cleanup:
  Move file analyzers to new plugin infrastructure.
  Add a general file analysis overview/how-to document.
  Improve file analysis doxygen comments.
  Improve tracking of HTTP file extraction (addresses #988).
  Fix HTTP multipart body file analysis.
  Remove logging of analyzers field of FileAnalysis::Info.
  Remove extraction counter in default file extraction scripts.
  Remove FileAnalysis::postpone_timeout.
  Make default get_file_handle handlers &priority=5.
  Add input interface to forward data for file analysis.
  File analysis framework interface simplifications.
2013-07-03 16:27:16 -07:00
Jon Siwek
f2574636b6 Merge branch 'master' into topic/jsiwek/faf-cleanup
Conflicts:
	scripts/base/protocols/ftp/file-analysis.bro
	scripts/base/protocols/http/file-analysis.bro
	scripts/base/protocols/irc/file-analysis.bro
	scripts/base/protocols/smtp/file-analysis.bro
	src/file_analysis/File.cc
	src/file_analysis/File.h
	src/file_analysis/Manager.cc
	src/file_analysis/Manager.h
	testing/btest/Baseline/scripts.base.frameworks.file-analysis.logging/file_analysis.log
	testing/btest/Baseline/scripts.base.protocols.ftp.ftp-extract/ftp-item-0.dat
	testing/btest/Baseline/scripts.base.protocols.ftp.ftp-extract/ftp-item-1.dat
	testing/btest/Baseline/scripts.base.protocols.ftp.ftp-extract/ftp-item-2.dat
	testing/btest/Baseline/scripts.base.protocols.ftp.ftp-extract/ftp-item-3.dat
	testing/btest/Baseline/scripts.base.protocols.ftp.ftp-extract/ftp-item-BTsa70Ua9x7-1.dat
	testing/btest/Baseline/scripts.base.protocols.ftp.ftp-extract/ftp-item-BTsa70Ua9x7.dat
	testing/btest/Baseline/scripts.base.protocols.ftp.ftp-extract/ftp-item-Rqjkzoroau4-0.dat
	testing/btest/Baseline/scripts.base.protocols.ftp.ftp-extract/ftp-item-Rqjkzoroau4.dat
	testing/btest/Baseline/scripts.base.protocols.ftp.ftp-extract/ftp-item-VLQvJybrm38-2.dat
	testing/btest/Baseline/scripts.base.protocols.ftp.ftp-extract/ftp-item-VLQvJybrm38.dat
	testing/btest/Baseline/scripts.base.protocols.ftp.ftp-extract/ftp-item-zrfwSs9K1yk-3.dat
	testing/btest/Baseline/scripts.base.protocols.ftp.ftp-extract/ftp-item-zrfwSs9K1yk.dat
	testing/btest/Baseline/scripts.base.protocols.ftp.ftp-extract/ftp.log
	testing/btest/Baseline/scripts.base.protocols.http.http-extract-files/http-item-BFymS6bFgT3-0.dat
	testing/btest/Baseline/scripts.base.protocols.http.http-extract-files/http-item-BFymS6bFgT3.dat
	testing/btest/Baseline/scripts.base.protocols.http.http-extract-files/http-item.dat
	testing/btest/Baseline/scripts.base.protocols.http.http-extract-files/http.log
	testing/btest/Baseline/scripts.base.protocols.irc.dcc-extract/irc-dcc-item-wqKMAamJVSb-0.dat
	testing/btest/Baseline/scripts.base.protocols.irc.dcc-extract/irc-dcc-item-wqKMAamJVSb.dat
	testing/btest/Baseline/scripts.base.protocols.irc.dcc-extract/irc-dcc-item.dat
	testing/btest/Baseline/scripts.base.protocols.irc.dcc-extract/irc.log
	testing/btest/Baseline/scripts.base.protocols.smtp.mime-extract/smtp-entity-0.dat
	testing/btest/Baseline/scripts.base.protocols.smtp.mime-extract/smtp-entity-1.dat
	testing/btest/Baseline/scripts.base.protocols.smtp.mime-extract/smtp-entity-Ltd7QO7jEv3-1.dat
	testing/btest/Baseline/scripts.base.protocols.smtp.mime-extract/smtp-entity-Ltd7QO7jEv3.dat
	testing/btest/Baseline/scripts.base.protocols.smtp.mime-extract/smtp-entity-cwR7l6Zctxb-0.dat
	testing/btest/Baseline/scripts.base.protocols.smtp.mime-extract/smtp-entity-cwR7l6Zctxb.dat
	testing/btest/Baseline/scripts.base.protocols.smtp.mime-extract/smtp_entities.log
	testing/btest/scripts/base/protocols/ftp/ftp-extract.bro
	testing/btest/scripts/base/protocols/http/http-extract-files.bro
	testing/btest/scripts/base/protocols/irc/dcc-extract.test
	testing/btest/scripts/base/protocols/smtp/mime-extract.test
2013-06-07 15:44:36 -05:00
Jon Siwek
705a84d688 Improve tracking of HTTP file extraction (addresses #988).
http.log now has files taken from request and response bodies in
different fields for each, and can now track multiple files per body.
That is, the "extraction_file" field is now "extracted_request_files"
and "extracted_response_files".
2013-05-21 16:42:35 -05:00
Jon Siwek
3cbef60f57 Fix HTTP multipart body file analysis.
Each part now gets assigned a different file handle/id.
2013-05-21 15:35:22 -05:00
Jon Siwek
28f51a9a22 Remove extraction counter in default file extraction scripts. 2013-05-21 11:12:00 -05:00
Jon Siwek
bc5cd3acc8 Make default get_file_handle handlers &priority=5.
So they're easier to override (just provide a new handler without
specifying a priority).
2013-05-21 10:34:19 -05:00
Robin Sommer
eb637f9f3e Merge remote-tracking branch 'origin/master' into topic/robin/plugins
Thanks to git this merge was less troublesome that I was afraid it
would be. Not all tests pass yet though (and file hashes have changed
unfortunately).

Conflicts:
	cmake
	doc/scripts/DocSourcesList.cmake
	scripts/base/init-bare.bro
	scripts/base/protocols/ftp/main.bro
	scripts/base/protocols/irc/dcc-send.bro
	scripts/test-all-policy.bro
	src/AnalyzerTags.h
	src/CMakeLists.txt
	src/analyzer/Analyzer.cc
	src/analyzer/protocol/file/File.cc
	src/analyzer/protocol/file/File.h
	src/analyzer/protocol/http/HTTP.cc
	src/analyzer/protocol/http/HTTP.h
	src/analyzer/protocol/mime/MIME.cc
	src/event.bif
	src/main.cc
	src/util-config.h.in
	testing/btest/Baseline/coverage.bare-load-baseline/canonified_loaded_scripts.log
	testing/btest/Baseline/coverage.default-load-baseline/canonified_loaded_scripts.log
	testing/btest/Baseline/istate.events-ssl/receiver.http.log
	testing/btest/Baseline/istate.events-ssl/sender.http.log
	testing/btest/Baseline/istate.events/receiver.http.log
	testing/btest/Baseline/istate.events/sender.http.log
2013-05-16 17:58:48 -07:00
Jon Siwek
98f7907dbb FileAnalysis: optimize file handle construction.
cat is slightly faster than fmt.
2013-04-19 11:38:11 -05:00
Jon Siwek
b8c98b8bf7 FileAnalysis: change terminology s/action/analyzer 2013-04-11 14:53:54 -05:00
Jon Siwek
e81f2ae7b0 FileAnalysis: libmagic tweaks.
Remove verbose file type detection and automatically strip out charset
from mime type.
2013-04-11 13:11:46 -05:00
Jon Siwek
e2fbee9054 FileAnalysis: add more params to some events. 2013-04-11 11:24:18 -05:00
Jon Siwek
2747e839fb FileAnalysis: insert explicit event queue flush points.
And added an event called "event_queue_flush_point" to mark where that
occured in the event stream.  The FAF now uses an explicit event queue
flush instead of buffering input in order to wait for a file handle to
be returned from script-layer.
2013-04-10 16:48:10 -05:00
Jon Siwek
d9321e2203 FileAnalysis: remove some file events.
The file_new event now takes over the function of file_type, file_bof,
and file_bof_buffer.
2013-04-10 14:34:23 -05:00
Jon Siwek
a2d9b47bcd FileAnalysis: finish switching hooks to events. 2013-04-10 11:13:43 -05:00
Jon Siwek
641154f8e8 FileAnalysis: checkpoint in middle of big reorganization.
- FileAnalysis::Info is now just a record used for logging, the fa_file
  record type is defined in init-bare.bro as the analogue to a
  connection record.

- Starting to transfer policy hook triggers and analyzer results to
  events.
2013-04-09 15:49:58 -05:00
Jon Siwek
393d35dc60 Revert "FileAnalysis: optimize get_file_handle event queueing."
This reverts commit fc267d010d.

There were some diffs caused by this in external test suites I'm
unsure about, I'm going to go over optimizations more closely in
a different branch.
2013-04-03 09:49:39 -05:00
Jon Siwek
fc267d010d FileAnalysis: optimize get_file_handle event queueing.
When a file handle is needed and the last event in the queue is also
a get_file_handle event with the same arguments, instead of queueing
a new event, just remember to cache/re-use the resulting handle from
the previous event.  This depends on get_file_handle handlers not
changing global state that is also used to derive the file handle
string.
2013-04-02 16:21:51 -05:00
Jon Siwek
3642ecc73e FileAnalysis: misc. tweaks/fixes.
- Add a timeout flag to file_analysis.log so it's easy to tell what
  has had at least one timeout trigger happen.

- Fix ftp-data service tag not being set for reused connections.

- Fix HTTP::Incorrect_File_Type because mime types returned by FAF have
  the charset still in them, but the HTTP::mime_types_extensions table
  does not and it requires an exact string match. (still ugly)

- Add TRIGGER_NEW_CONN to track files going over multiple connections.

- Add an initial file/mime type guess for non-linear file transfers.

- Fix a case where file/mime type detection would never be attempted
  if the start of the file was a content gap.

- Improve mime type tracking of HTTP byte-range/partial-content,
  even if the requests are pipelined or over multiple connections.

- I changed the modbus.events test because having the baseline output
  be 80+ MB is nuts and it was sensitive to connection record redefs.
2013-03-28 16:59:29 -05:00
Jon Siwek
497496ec83 FileAnalysis: replace script-layer SMTP file analysis.
Notable differences:

- Removed SMTP::MD5 notice.

- Removed ability to specify mime entity excerpt length per mime-type.
2013-03-26 15:48:52 -05:00
Robin Sommer
af1809aaa3 First prototype of new analyzer framework.
This is a larger internal change that moves the analyzer
infrastructure to a more flexible model where the available analyzers
don't need to be hardcoded at compile time anymore. While currently
they actually still are, this will in the future enable external
analyzer plugins. For now, it does already add the capability to
dynamically enable/disable analyzers from script-land, replacing the
old Analyzer::Available() methods.

There are three major parts going into this:

    - A new plugin infrastructure in src/plugin. This is independent
      of analyzers and will eventually support plugins for other parts
      of Bro as well (think: readers and writers). The goal is that
      plugins can be alternatively compiled in statically or loadead
      dynamically at runtime from a shared library. While the latter
      isn't there yet, there'll be almost no code change for a plugin
      to make it dynamic later (hopefully :)

    - New analyzer infrastructure in src/analyzer. I've moved a number
      of analyzer-related classes here, including Analyzer and DPM;
      the latter now renamed to Analyzer::Manager. More will move here
      later. Currently, there's only one plugin here, which provides
      *all* existing analyzers. We can modularize this further in the
      future (or not).

    - A new script interface in base/framework/analyzer. I think that
      this will eventually replace the dpm framework, but for now
      that's still there as well, though some parts have moved over.

I've also remove the dpd_config table; ports are now configured via
the analyzer framework. For exmaple, for SSH:

    const ports = { 22/tcp } &redef;

    event bro_init() &priority=5
        {
        ...
        Analyzer::register_for_ports(Analyzer::ANALYZER_SSH, ports);
        }

As you can see, the old ANALYZER_SSH constants have more into an enum
in the Analyzer namespace.

This is all hardly tested right now, and not everything works yet.
There's also a lot more cleanup to do (moving more classes around;
removing no longer used functionality; documenting script and C++
interfaces; regression tests). But it seems to generally work with a
small trace at least.

The debug stream "dpm" shows more about the loaded/enabled analyzers.

A new option -N lists loaded plugins and what they provide (including
those compiled in statically; i.e., right now it outputs all the
analyzers).

This is all not cast-in-stone yet, for some things we need to see if
they make sense this way. Feedback welcome.
2013-03-26 11:05:38 -07:00
Jon Siwek
84a0c2fdac FileAnalysis: file handles now set from events.
Versus from synchronous function calls, which doesn't work well because
the function call can see a script-layer state that doesn't reflect
the state as it will be in terms of the event/network stream.
2013-03-25 15:37:58 -05:00
Jon Siwek
71f0e2d276 FileAnalysis: replace script-layer http file analysis.
Other misc:

- Remove HTTP::MD5 notice.

- Add "last_active" field to FileAnalysis::Info record.

- Replace "conn_uids", "conn_ids" fields in FileAnalysis::Info record
  with just a "conns" fields containing full connection records.

- The http-methods unit test is failing now, but I think it will be
  fixed once I change the file handle callback mechanism to use events
  instead.
2013-03-22 16:14:06 -05:00
Jon Siwek
661677d452 FileAnalysis: separating IRC/FTP data analyzers.
It simplifies the file handle string callbacks.
2013-03-20 11:12:06 -05:00
Jon Siwek
59ed5c75f1 FileAnalysis: add unit tests covering current protocol integration.
And had to make various fixes/refinements after scrutinizing results.
2013-03-19 15:50:05 -05:00
Jon Siwek
637fe69cf9 FileAnalysis: buffer input that can't get unique file handle immediately
A retry happens on every new input and also periodically based on a
timer.  If a file handle is returned at those times, the input is
forwarded for analysis, else it keeps retrying until a timeout
threshold.
2013-03-14 10:57:16 -05:00
Jon Siwek
878dfff2f2 FileAnalysis: decentralize unique file handle generator callbacks.
The framework now cycles through callbacks based on a table indexed
by analyzer tags, or the special case of service strings if a given
analyzer is overloaded for multiple protocols (FTP/IRC data).  This
lets each protocol script bundle implement the callback locally and
reduces the FAF's external dependencies.
2013-03-13 10:48:26 -05:00
Jon Siwek
3dd513e26e FileAnalysis: move unique file handle string generation to script-layer
And add minimal integration with HTTP analyzer.
2013-03-12 13:44:31 -05:00
Jon Siwek
69afc4a882 Add an error for record coercions that would orphan a field.
These cases should be avoidable by fixing scripts where they occur and
they can also help catch typos that would lead to unintentional runtime
behavior.

Adding this already revealed several scripts where a field in an inlined
record was never removed after a code refactor.
2013-01-24 09:56:19 -06:00
Robin Sommer
da90976170 Merge remote-tracking branch 'origin/topic/matthias/opaque'
* origin/topic/matthias/opaque:
  Add new unit test for opaque serialization.
  Migrate entropy testing to opaque.
  C++ify RandTest.*
  Fix a hard-to-spot bug.
  Use more descriptive error message.
  Fix the fix :-/.
  Fix initialization of hash values.
  Be clearer about delegation.
  Implement serialization of opaque types.
  Update hash BiF documentation.
  Migrate free SHA* functions to SHA*Val::digest().
  Add missing type name that caused failing tests.
  Update base scripts and unit tests.
  Simplify hash function BiFs.
  Add support for opaque hash values.
  Adapt BiF & Bro parser to handle opaque types.
  More lexer/parser work.
  Implement equivalence relation for opaque types.
  Support basic serialization of opaque.
  Add opaque type to lexer, parser, and BroType.

Closes #925

Conflicts:
	aux/broccoli
2012-12-20 16:30:22 -08:00
Matthias Vallentin
816965f3c7 Merge remote-tracking branch 'origin/master' into topic/matthias/opaque 2012-12-11 16:32:01 -08:00
Matthias Vallentin
30bab14dbf Update base scripts and unit tests. 2012-12-11 16:26:17 -08:00
Robin Sommer
57510464a1 Adapting the HTTP request line parsing to only accept methods
consisting of letters [A-Za-z].

I had some bogus HTTP sessions now with the test-suite that reported
data as HTTP because it started with "<!... ". Requiring letters seems
a reasonable constraint.
2012-12-05 16:56:54 -08:00
Robin Sommer
177c014cb7 Merge remote-tracking branch 'vlad/topic/vladg/http-verbs'
* vlad/topic/vladg/http-verbs:
  A test for HTTP methods, including some horribly illegal requests.
  Remove hardcoded HTTP verbs from the analyzer (#741)

I added a "bad_HTTP_request" weird for HTTP request lines that don't
have more than a single word.

Closes #741.
2012-12-05 15:27:42 -08:00
Vlad Grigorescu
e98343b562 Remove hardcoded HTTP verbs from the analyzer (#741) 2012-11-30 20:08:20 -05:00
Robin Sommer
ce4b8dd4ac Changing HTTP DPD port 3138 to 3128.
Addresses #857.
2012-07-20 09:57:38 -07:00
Vlad Grigorescu
f43576cff3 Fix some Info:Record field documentation. 2012-07-13 14:04:24 -04:00