Commit graph

129 commits

Author SHA1 Message Date
Seth Hall
38dbba7622 More file reassembly work.
- The reassembly behavior can be modified per-file by enabling or
   disabling the reassembler and/or modifying the size of the reassembly
   buffer.

 - Changed the file extraction analyzer to use the stream to avoid
   issues with the chunk based approach not immediately triggering
   the file_new event due to mime-type detection delay.  Early chunks
   frequently ended up lost before.

 - Generally things are working now and I'd consider this in testing.
2014-01-05 04:58:01 -05:00
Seth Hall
0b78f444a1 Initial commit of file reassembly. 2013-12-20 00:05:08 -05:00
Robin Sommer
987452beff Cleanup of plugin component API.
- Move more functionality into base class.
- Remove cctors and assignment operators (weren't actually needed anymore)
- Switch from const char* to std::string.
2013-12-16 10:07:20 -08:00
Robin Sommer
d34f23c8d4 A set of file analysis extensions.
- Enable manager to associate analyzers with a MIME type. With that,
  one can now say enable all analyzers for, e.g., "image/gif". This is
  exposed to script-land as

    Files::add_analyzers_for_mime_type(f: fa_file, mtype: string)

  For MIME types identified via libmagic, this happens automatically
  (via the file_new() handler in files/main.bro).

- Extend the analyzer API to better match that of protocol analyzers:

    - Adding unique analyzer IDs so that we can refer to instances
      from script-land.

    - Adding subtypes to Components so that a single analyzer
      implementation can support different types of analyzers
      internally.

    - Add an analyzer method SetTag() that allows to set the tag after
      construction.

    - Adding Init() and Done() methods for consistency with what other
      classes offer.

- Add debug logging to the file_analysis stream.

TODO: test cases missing for the new script-land functionality.
2013-11-26 11:20:14 -08:00
Jon Siwek
89ae4ffd05 Add options to limit extracted file sizes w/ 100MB default. 2013-08-22 16:37:58 -05:00
Jon Siwek
99c89b42d7 Internal refactoring of how plugin components are tagged/managed.
Made some class templates for code that seemed duplicated between
file/protocol tags and managers.  Seems like it helps a bit and
hopefully can be also be used to transition other things that have
enum value "tags" (e.g. logging writers, input readers) to the
plugin system.
2013-08-01 10:35:47 -05:00
Jon Siwek
9bd7a65071 Merge branch 'master' into topic/jsiwek/faf-updates
Conflicts:
	testing/btest/Baseline/coverage.default-load-baseline/canonified_loaded_scripts.log
2013-07-31 10:05:36 -05:00
Jon Siwek
5fa9c5865b Factor out the need for a tag field in Files::AnalyzerArgs record.
This cleans up internals of how analyzer instances get identified by the
tag plus any args given to it and doesn't change script code a user
would write.
2013-07-31 09:48:19 -05:00
Robin Sommer
984e9793db Merge remote-tracking branch 'origin/topic/seth/faf-updates'
* origin/topic/seth/faf-updates: (27 commits)
  Undoing the FTP tests I updated earlier.
  Update the last two btest FAF tests.
  File analysis fixes and test updates.
  Fix a bug with getting analyzer tags.
  A few test updates.
  Some tests work now (at least they all don't fail anymore!)
  Forgot a file.
  Added protocol description functions that provide a super compressed log representation.
  Fix a bug where orig file information in http wasn't working right.
  Added mime types to http.log
  Clean up queued but unused file_over_new_connections event args.
  Add jar files to the default MHR lookups.
  Adding CAB files for MHR checking.
  Improve malware hash registry script.
  Fix a small issue with finding smtp entities.
  Added support for files to the notice framework.
  Make the custom libmagic database a git submodule.
  Add an is_orig parameter to file_over_new_connection event.
  Make magic for emitting application/msword mime type less strict.
  Disable more libmagic builtin checks that override the magic database.
  ...

Conflicts:
	doc/scripts/DocSourcesList.cmake
	scripts/base/init-bare.bro
	scripts/test-all-policy.bro
	testing/btest/Baseline/coverage.bare-load-baseline/canonified_loaded_scripts.log
	testing/btest/Baseline/coverage.default-load-baseline/canonified_loaded_scripts.log
2013-07-29 14:21:52 -07:00
Jon Siwek
1a60fae41c Clean up queued but unused file_over_new_connections event args. 2013-07-11 11:36:49 -05:00
Jon Siwek
73155c321b Add an is_orig parameter to file_over_new_connection event. 2013-07-09 15:58:28 -05:00
Jon Siwek
6a5b825058 Delay file_over_new_connection events until after file_new occurs. 2013-07-09 14:25:41 -05:00
Seth Hall
58d133e764 Merge remote-tracking branch 'origin/master' into topic/seth/faf-updates
Conflicts:
	scripts/base/frameworks/files/main.bro
	scripts/base/init-bare.bro
	scripts/base/protocols/ftp/file-analysis.bro
	scripts/base/protocols/http/file-analysis.bro
	scripts/base/protocols/irc/file-analysis.bro
	scripts/base/protocols/smtp/file-analysis.bro
	src/const.bif
	src/event.bif
	src/file_analysis/Analyzer.h
	src/file_analysis/file_analysis.bif
2013-07-05 02:13:27 -04:00
Seth Hall
2b48396d23 Check file_over_new_connetion to fire for each connection (including the first). 2013-07-05 02:00:35 -04:00
Jon Siwek
f2574636b6 Merge branch 'master' into topic/jsiwek/faf-cleanup
Conflicts:
	scripts/base/protocols/ftp/file-analysis.bro
	scripts/base/protocols/http/file-analysis.bro
	scripts/base/protocols/irc/file-analysis.bro
	scripts/base/protocols/smtp/file-analysis.bro
	src/file_analysis/File.cc
	src/file_analysis/File.h
	src/file_analysis/Manager.cc
	src/file_analysis/Manager.h
	testing/btest/Baseline/scripts.base.frameworks.file-analysis.logging/file_analysis.log
	testing/btest/Baseline/scripts.base.protocols.ftp.ftp-extract/ftp-item-0.dat
	testing/btest/Baseline/scripts.base.protocols.ftp.ftp-extract/ftp-item-1.dat
	testing/btest/Baseline/scripts.base.protocols.ftp.ftp-extract/ftp-item-2.dat
	testing/btest/Baseline/scripts.base.protocols.ftp.ftp-extract/ftp-item-3.dat
	testing/btest/Baseline/scripts.base.protocols.ftp.ftp-extract/ftp-item-BTsa70Ua9x7-1.dat
	testing/btest/Baseline/scripts.base.protocols.ftp.ftp-extract/ftp-item-BTsa70Ua9x7.dat
	testing/btest/Baseline/scripts.base.protocols.ftp.ftp-extract/ftp-item-Rqjkzoroau4-0.dat
	testing/btest/Baseline/scripts.base.protocols.ftp.ftp-extract/ftp-item-Rqjkzoroau4.dat
	testing/btest/Baseline/scripts.base.protocols.ftp.ftp-extract/ftp-item-VLQvJybrm38-2.dat
	testing/btest/Baseline/scripts.base.protocols.ftp.ftp-extract/ftp-item-VLQvJybrm38.dat
	testing/btest/Baseline/scripts.base.protocols.ftp.ftp-extract/ftp-item-zrfwSs9K1yk-3.dat
	testing/btest/Baseline/scripts.base.protocols.ftp.ftp-extract/ftp-item-zrfwSs9K1yk.dat
	testing/btest/Baseline/scripts.base.protocols.ftp.ftp-extract/ftp.log
	testing/btest/Baseline/scripts.base.protocols.http.http-extract-files/http-item-BFymS6bFgT3-0.dat
	testing/btest/Baseline/scripts.base.protocols.http.http-extract-files/http-item-BFymS6bFgT3.dat
	testing/btest/Baseline/scripts.base.protocols.http.http-extract-files/http-item.dat
	testing/btest/Baseline/scripts.base.protocols.http.http-extract-files/http.log
	testing/btest/Baseline/scripts.base.protocols.irc.dcc-extract/irc-dcc-item-wqKMAamJVSb-0.dat
	testing/btest/Baseline/scripts.base.protocols.irc.dcc-extract/irc-dcc-item-wqKMAamJVSb.dat
	testing/btest/Baseline/scripts.base.protocols.irc.dcc-extract/irc-dcc-item.dat
	testing/btest/Baseline/scripts.base.protocols.irc.dcc-extract/irc.log
	testing/btest/Baseline/scripts.base.protocols.smtp.mime-extract/smtp-entity-0.dat
	testing/btest/Baseline/scripts.base.protocols.smtp.mime-extract/smtp-entity-1.dat
	testing/btest/Baseline/scripts.base.protocols.smtp.mime-extract/smtp-entity-Ltd7QO7jEv3-1.dat
	testing/btest/Baseline/scripts.base.protocols.smtp.mime-extract/smtp-entity-Ltd7QO7jEv3.dat
	testing/btest/Baseline/scripts.base.protocols.smtp.mime-extract/smtp-entity-cwR7l6Zctxb-0.dat
	testing/btest/Baseline/scripts.base.protocols.smtp.mime-extract/smtp-entity-cwR7l6Zctxb.dat
	testing/btest/Baseline/scripts.base.protocols.smtp.mime-extract/smtp_entities.log
	testing/btest/scripts/base/protocols/ftp/ftp-extract.bro
	testing/btest/scripts/base/protocols/http/http-extract-files.bro
	testing/btest/scripts/base/protocols/irc/dcc-extract.test
	testing/btest/scripts/base/protocols/smtp/mime-extract.test
2013-06-07 15:44:36 -05:00
Jon Siwek
0ef074594d Add input interface to forward data for file analysis.
The new Input::add_analysis function is used to automatically forward
input data on to the file analysis framework.
2013-05-21 10:29:22 -05:00
Jon Siwek
90fa331279 File analysis framework interface simplifications.
- Remove script-layer data input interface (will be managed directly
  by input framework later).

- Only track files internally by file id hash.  Chance of collision
  too small to justify also tracking unique file string.
2013-05-20 12:02:48 -05:00
Robin Sommer
eb637f9f3e Merge remote-tracking branch 'origin/master' into topic/robin/plugins
Thanks to git this merge was less troublesome that I was afraid it
would be. Not all tests pass yet though (and file hashes have changed
unfortunately).

Conflicts:
	cmake
	doc/scripts/DocSourcesList.cmake
	scripts/base/init-bare.bro
	scripts/base/protocols/ftp/main.bro
	scripts/base/protocols/irc/dcc-send.bro
	scripts/test-all-policy.bro
	src/AnalyzerTags.h
	src/CMakeLists.txt
	src/analyzer/Analyzer.cc
	src/analyzer/protocol/file/File.cc
	src/analyzer/protocol/file/File.h
	src/analyzer/protocol/http/HTTP.cc
	src/analyzer/protocol/http/HTTP.h
	src/analyzer/protocol/mime/MIME.cc
	src/event.bif
	src/main.cc
	src/util-config.h.in
	testing/btest/Baseline/coverage.bare-load-baseline/canonified_loaded_scripts.log
	testing/btest/Baseline/coverage.default-load-baseline/canonified_loaded_scripts.log
	testing/btest/Baseline/istate.events-ssl/receiver.http.log
	testing/btest/Baseline/istate.events-ssl/sender.http.log
	testing/btest/Baseline/istate.events/receiver.http.log
	testing/btest/Baseline/istate.events/sender.http.log
2013-05-16 17:58:48 -07:00
Robin Sommer
7610aa31b6 Various smalle tweaks in preparation for merging. 2013-05-13 16:47:00 -07:00
Jon Siwek
0141f51801 FileAnalysis: load custom mime magic database just once.
This works around a bug in libmagic since version 5.12 (current at
time of writing is 5.14) -- second call to magic_load() w/ non-default
database segfaults.
2013-04-29 12:49:22 -05:00
Jon Siwek
f07760ba00 FileAnalysis: add is_orig field to fa_file & Info. 2013-04-23 10:50:43 -05:00
Jon Siwek
b8c98b8bf7 FileAnalysis: change terminology s/action/analyzer 2013-04-11 14:53:54 -05:00
Jon Siwek
e81f2ae7b0 FileAnalysis: libmagic tweaks.
Remove verbose file type detection and automatically strip out charset
from mime type.
2013-04-11 13:11:46 -05:00
Jon Siwek
2fba37e277 FileAnalysis: add bif for setting timeout interval 2013-04-11 12:08:46 -05:00
Jon Siwek
e2fbee9054 FileAnalysis: add more params to some events. 2013-04-11 11:24:18 -05:00
Jon Siwek
2747e839fb FileAnalysis: insert explicit event queue flush points.
And added an event called "event_queue_flush_point" to mark where that
occured in the event stream.  The FAF now uses an explicit event queue
flush instead of buffering input in order to wait for a file handle to
be returned from script-layer.
2013-04-10 16:48:10 -05:00
Jon Siwek
d9321e2203 FileAnalysis: remove some file events.
The file_new event now takes over the function of file_type, file_bof,
and file_bof_buffer.
2013-04-10 14:34:23 -05:00
Jon Siwek
a2d9b47bcd FileAnalysis: finish switching hooks to events. 2013-04-10 11:13:43 -05:00
Jon Siwek
641154f8e8 FileAnalysis: checkpoint in middle of big reorganization.
- FileAnalysis::Info is now just a record used for logging, the fa_file
  record type is defined in init-bare.bro as the analogue to a
  connection record.

- Starting to transfer policy hook triggers and analyzer results to
  events.
2013-04-09 15:49:58 -05:00
Renamed from src/file_analysis/Info.cc (Browse further)