Commit graph

81 commits

Author SHA1 Message Date
Seth Hall
e879aa78f5 Merge remote-tracking branch 'origin/topic/seth/mime-updates' into topic/seth/files-reassembly-and-mime-updates
Conflicts:
	scripts/base/init-bare.bro
	testing/btest/Baseline/scripts.policy.misc.dump-events/all-events-no-args.log
	testing/btest/Baseline/scripts.policy.misc.dump-events/all-events.log
2014-11-05 11:42:34 -05:00
Seth Hall
842dfd8b4a Merge remote-tracking branch 'origin/topic/seth/files-tracking' into topic/seth/files-reassembly-and-mime-updates
Conflicts:
	testing/btest/Baseline/scripts.base.frameworks.file-analysis.http.multipart/out
	testing/btest/Baseline/scripts.policy.misc.dump-events/all-events.log
2014-11-05 11:40:26 -05:00
Seth Hall
7ee34981aa Improve TAR file detection and other small changes.
- Remove all of the x-c detections.  Nearly all false
    positives.

  - Remove the back up TAR detections.  Not very helpful.

  - Remove one of the x-elc detections that was too loose
    and caused many false positives.
2014-11-05 11:31:48 -05:00
Seth Hall
d77243823f Updates for file mime type identification.
- Change to the default BOF buffer size to 3000 (was 1024).
 - Reorganized MS signatures into a separate file
 - Improved lots of the signatures and added new ones.
2014-10-08 02:12:10 -04:00
Seth Hall
80656d5294 Improves shockwave flash file signatures.
- This moves the signatures out of the libmagic imported signatures
   and into our own general.sig.

 - Expand the detection to LZMA compressed flash files.
2014-10-06 11:13:13 -04:00
Seth Hall
cafd35e746 Updates the files event api and brings file reassembly up to master. 2014-09-26 00:40:37 -04:00
Seth Hall
42b2d56279 Merge remote-tracking branch 'origin/master' into topic/seth/files-tracking
Conflicts:
	scripts/base/frameworks/files/main.bro
	src/file_analysis/File.cc
	testing/btest/Baseline/scripts.base.frameworks.file-analysis.actions.data_event/out
2014-09-23 13:05:39 -04:00
Daniel Thayer
d226fef723 Fixed some "make doc" warnings caused by reST formatting 2014-09-16 12:44:51 -05:00
Jon Siwek
69b1ba653d Minor adjustments to plugin code/docs.
Mostly whitespace/typos.
Moved some Plugin methods out from public access.
2014-07-30 16:48:23 -05:00
Robin Sommer
c9524757d2 Adding Files::register_for_mime_type() to associate a file analyzer
with a MIME type.

Whenever that MIME is detected, Bro will now automatically activate
the analyzer. The interface mimics how well-known ports are defined
for protocol analyzers.

This isn't actually used by any existing file analyzer (because we
don't have any yet that target a specific file format), but there's a
test making sure it works.
2014-07-21 16:31:22 +02:00
Robin Sommer
9616cd8e61 Further polishing and cleanup in preparation for merge. 2014-07-12 18:12:09 -07:00
Seth Hall
8d9940c8c3 Merge remote-tracking branch 'origin/master' into topic/seth/files-tracking
Conflicts:
	src/Reassem.cc
	src/Reassem.h
	src/analyzer/protocol/tcp/TCP_Reassembler.cc
	testing/btest/Baseline/scripts.base.frameworks.file-analysis.bifs.set_timeout_interval/bro..stdout
	testing/btest/Baseline/scripts.base.frameworks.file-analysis.http.partial-content/b.out
	testing/btest/Baseline/scripts.base.frameworks.file-analysis.http.partial-content/c.out
	testing/btest/Baseline/scripts.base.frameworks.file-analysis.logging/files.log
2014-05-27 10:56:11 -04:00
Robin Sommer
bbd409d274 Merge remote-tracking branch 'origin/master' into topic/robin/dynamic-plugins-2.3
(Never good to name a branch after version anticipated to include it ...)
2014-05-14 16:23:04 -07:00
Robin Sommer
9efb549236 Merge remote-tracking branch 'origin/topic/jsiwek/file-signatures'
* origin/topic/jsiwek/file-signatures:
  File type detection changes and fix https.log {orig,resp}_fuids fields.
  Various minor changes related to file mime type detection.
  Refactor common MIME magic matching code.
  Replace libmagic w/ Bro signatures for file MIME type identification.

Conflicts:
	scripts/base/init-default.bro
	testing/btest/Baseline/coverage.bare-load-baseline/canonified_loaded_scripts.log
	testing/btest/Baseline/coverage.default-load-baseline/canonified_loaded_scripts.log

BIT-1143 #merged
2014-03-30 22:51:05 +02:00
Jon Siwek
8dad5026fd File type detection changes and fix https.log {orig,resp}_fuids fields.
- Removed "binary" and "octet-stream" mime type detections. They don't
  provide any more information than an uninitialized mime_type field
  which implicitly means no magic signature matches and so the media
  type is unknown to Bro.

- Slight change to "text/plain" signature.  It's still not the most
  accurate, which is reflected in its -20 strength value.

- The logic for adding file ids to {orig,resp}_fuids fields of
  the http.log incorrectly depended on the state of
  {orig,resp}_mime_types fields, so sometimes not all file ids
  associated w/ the session were logged.
2014-03-25 12:44:11 -05:00
Jon Siwek
b1fd161274 Improve type checking of records, addresses BIT-1159. 2014-03-20 13:54:26 -05:00
Jon Siwek
095a68b2ec Various minor changes related to file mime type detection.
- Improve or just remove some file magic signatures ported from libmagic
  that were too general and matched incorrectly too often.

- Fix MHR script's use of fa_file$mime_type before checking if it's
  initialized.  It may be uninitialized if no signatures match.

- The "fa_file" record now contains a "mime_types" field that contains
  all magic signatures that matched the file content (where the
  "mime_type" field is just a shortcut for the strongest match).
2014-03-06 11:41:10 -06:00
Jon Siwek
b22ca5d0a3 Replace libmagic w/ Bro signatures for file MIME type identification.
Notable changes:

- libmagic is no longer used at all.  All MIME type detection is
  done through new Bro signatures, and there's no longer a means to get
  verbose file type descriptions (e.g. "PNG image data, 1435 x 170").
  The majority of the default file magic signatures are derived
  from the default magic database of libmagic ~5.17.

- File magic signatures consist of two new constructs in the
  signature rule parsing grammar: "file-magic" gives a regular
  expression to match against, and "file-mime" gives the MIME type
  string of content that matches the magic and an optional strength
  value for the match.

- Modified signature/rule syntax for identifiers: they can no longer
  start with a '-', which made for ambiguous syntax when doing negative
  strength values in "file-mime".  Also brought syntax for Bro script
  identifiers in line with reality (they can't start with numbers or
  include '-' at all).

- A new Built-In Function, "file_magic", can be used to get all
  file magic matches and their corresponding strength against a given
  chunk of data

- The second parameter of the "identify_data" Built-In Function
  can no longer be used to get verbose file type descriptions, though it
  can still be used to get the strongest matching file magic signature.

- The "file_transferred" event's "descr" parameter no longer
  contains verbose file type descriptions.

- The BROMAGIC environment variable no longer changes any behavior
  in Bro as magic databases are no longer used/installed.

- Reverted back to minimum requirement of CMake 2.6.3 from 2.8.0
  (it's back to being the same requirement as the Bro v2.2 release).
  The bump was to accomodate building libmagic as an external project,
  which is no longer needed.

Addresses BIT-1143.
2014-03-04 11:12:06 -06:00
Seth Hall
38dbba7622 More file reassembly work.
- The reassembly behavior can be modified per-file by enabling or
   disabling the reassembler and/or modifying the size of the reassembly
   buffer.

 - Changed the file extraction analyzer to use the stream to avoid
   issues with the chunk based approach not immediately triggering
   the file_new event due to mime-type detection delay.  Early chunks
   frequently ended up lost before.

 - Generally things are working now and I'd consider this in testing.
2014-01-05 04:58:01 -05:00
Robin Sommer
d34f23c8d4 A set of file analysis extensions.
- Enable manager to associate analyzers with a MIME type. With that,
  one can now say enable all analyzers for, e.g., "image/gif". This is
  exposed to script-land as

    Files::add_analyzers_for_mime_type(f: fa_file, mtype: string)

  For MIME types identified via libmagic, this happens automatically
  (via the file_new() handler in files/main.bro).

- Extend the analyzer API to better match that of protocol analyzers:

    - Adding unique analyzer IDs so that we can refer to instances
      from script-land.

    - Adding subtypes to Components so that a single analyzer
      implementation can support different types of analyzers
      internally.

    - Add an analyzer method SetTag() that allows to set the tag after
      construction.

    - Adding Init() and Done() methods for consistency with what other
      classes offer.

- Add debug logging to the file_analysis stream.

TODO: test cases missing for the new script-land functionality.
2013-11-26 11:20:14 -08:00
Daniel Thayer
b5af589246 Improvements to file analysis docs
Fixed reference to wrong field name.
Added documentation of a function arg.
Added a couple references to other parts of the documentation.
Explained how not specifying extraction filename results in automatic
filename generation.
Several other minor clarifications.
2013-10-11 16:31:53 -05:00
Daniel Thayer
60b2c5f1fe Add README files for most Bro frameworks
The text from these README files appears on the "Bro Script Packages"
page after building the documentation.  The text for these was mostly just
copied from the existing docs.
2013-10-11 00:19:37 -05:00
Daniel Thayer
0d712b35d8 Fix typos and a broken link in the file analysis doc 2013-10-10 12:23:34 -05:00
Jon Siwek
5fa9c5865b Factor out the need for a tag field in Files::AnalyzerArgs record.
This cleans up internals of how analyzer instances get identified by the
tag plus any args given to it and doesn't change script code a user
would write.
2013-07-31 09:48:19 -05:00
Jon Siwek
d84f6e012c Fix various documentation, mostly related to file analysis.
- Fix examples/references in the file analysis how-to/usage doc.

- Add Broxygen-generated docs for file analyzer plugins.

- Break FTP::Info type declaration out in to its own file to get
  rid of some circular dependencies (between s/b/p/ftp/main and
  s/b/p/ftp/utils).
2013-07-29 16:15:37 -05:00
Jon Siwek
939619889d File analysis fixes and test updates.
- Several places were just using old variable names or not loading
  scripts correctly after they'd been renamed/moved.

- Revert/adjust a change in how HTTP file handles are generated that
  broke partial content responses.

- Turn some libmagic builtin checks back on; seems some are actually
  useful (e.g. text detection seems to be a builtin).  The rule going
  forward probably will be only to turn off a builtin if we confirm it
  causes issues.

- Removed some tests that are redundant or not necessary anymore because
  the generic file analysis tests cover them.

- A couple FTP tests still fail that I think need an actual solution via
  script changes.
2013-07-25 16:51:16 -05:00
Seth Hall
0bfdcc1fbc Added protocol description functions that provide a super compressed log representation. 2013-07-16 12:01:50 -04:00
Jon Siwek
73155c321b Add an is_orig parameter to file_over_new_connection event. 2013-07-09 15:58:28 -05:00
Seth Hall
cdf6b7864e More file analysis updates.
- Recorrected the module name to Files.

  - Added Files::analyzer_name to get a more readable name for a
    file analyzer.

  - Improved and just overall better handled multipart mime
    transfers in HTTP and SMTP.  HTTP now has orig_fuids and resp_fuids
    log fields since multiple "files" can be transferred with
    multipart mime in a single request/response pair.  SMTP has
    an fuids field which has file unique IDs for all parts
    transferred. FTP and IRC have a log field named fuid added
    because only a single file can be transferred per irc and ftp
    log line.
2013-07-09 11:50:54 -04:00
Seth Hall
58d133e764 Merge remote-tracking branch 'origin/master' into topic/seth/faf-updates
Conflicts:
	scripts/base/frameworks/files/main.bro
	scripts/base/init-bare.bro
	scripts/base/protocols/ftp/file-analysis.bro
	scripts/base/protocols/http/file-analysis.bro
	scripts/base/protocols/irc/file-analysis.bro
	scripts/base/protocols/smtp/file-analysis.bro
	src/const.bif
	src/event.bif
	src/file_analysis/Analyzer.h
	src/file_analysis/file_analysis.bif
2013-07-05 02:13:27 -04:00
Seth Hall
df2841458d Large overhaul in name and appearance for file analysis. 2013-07-05 02:00:14 -04:00