Commit graph

476 commits

Author SHA1 Message Date
Jon Siwek
095a68b2ec Various minor changes related to file mime type detection.
- Improve or just remove some file magic signatures ported from libmagic
  that were too general and matched incorrectly too often.

- Fix MHR script's use of fa_file$mime_type before checking if it's
  initialized.  It may be uninitialized if no signatures match.

- The "fa_file" record now contains a "mime_types" field that contains
  all magic signatures that matched the file content (where the
  "mime_type" field is just a shortcut for the strongest match).
2014-03-06 11:41:10 -06:00
Jon Siwek
0865b152bb Refactor common MIME magic matching code.
Put some methods in file_analysis::Manager that can perform the
matching process and return MIME type results.  Also helps to
centralize the management/re-use of a signature matcher object.
2014-03-05 10:49:57 -06:00
Jon Siwek
b22ca5d0a3 Replace libmagic w/ Bro signatures for file MIME type identification.
Notable changes:

- libmagic is no longer used at all.  All MIME type detection is
  done through new Bro signatures, and there's no longer a means to get
  verbose file type descriptions (e.g. "PNG image data, 1435 x 170").
  The majority of the default file magic signatures are derived
  from the default magic database of libmagic ~5.17.

- File magic signatures consist of two new constructs in the
  signature rule parsing grammar: "file-magic" gives a regular
  expression to match against, and "file-mime" gives the MIME type
  string of content that matches the magic and an optional strength
  value for the match.

- Modified signature/rule syntax for identifiers: they can no longer
  start with a '-', which made for ambiguous syntax when doing negative
  strength values in "file-mime".  Also brought syntax for Bro script
  identifiers in line with reality (they can't start with numbers or
  include '-' at all).

- A new Built-In Function, "file_magic", can be used to get all
  file magic matches and their corresponding strength against a given
  chunk of data

- The second parameter of the "identify_data" Built-In Function
  can no longer be used to get verbose file type descriptions, though it
  can still be used to get the strongest matching file magic signature.

- The "file_transferred" event's "descr" parameter no longer
  contains verbose file type descriptions.

- The BROMAGIC environment variable no longer changes any behavior
  in Bro as magic databases are no longer used/installed.

- Reverted back to minimum requirement of CMake 2.6.3 from 2.8.0
  (it's back to being the same requirement as the Bro v2.2 release).
  The bump was to accomodate building libmagic as an external project,
  which is no longer needed.

Addresses BIT-1143.
2014-03-04 11:12:06 -06:00
Bernhard Amann
7eb6b5133e Fix circular reference problem and a few other small things.
SSL::Info now holds a reference to Files::Info instead of the
fa_files record.

Everything should work now, if everyone thinks that the interface is
ok I will update the test baselines in a bit.

addresses BIT-953, BIT-760
2014-03-04 05:30:32 -08:00
Bernhard Amann
110d9fbd6a X509 file analyzer nearly done. Verification and most other policy scripts
work fine now.

Todo:
 * update all baselines
 * fix the circular reference to the fa_file structure I introduced :)
   Sadly this does not seem to be entirely straightforward.

addresses BIT-953, BIT-760
2014-03-03 17:07:50 -08:00
Bernhard Amann
a1f2ab34ac Add verify functionality, including the ability to get the validated
chain. This means that it is now possible to get information about the
root-certificates that were used to secure a connection.

Intermediate commit before changing the script interface again.

addresses BIT-953, BIT-760
2014-03-03 10:49:28 -08:00
Bernhard Amann
7ba6bcff2c Second try on the event interface.
Now the x509 opaque is wrapped in the certificate structure. After
pondering on it for a bit, this might not be the brightest idea.
2014-02-28 02:43:16 -08:00
Bernhard Amann
1735e33691 Backport crash fix that made it into master with the x509_extension
backport from here.
2014-02-28 02:09:06 -08:00
Bernhard Amann
30860e4226 Merge remote-tracking branch 'origin/master' into topic/bernhard/file-analysis-x509
Conflicts:
	src/analyzer/protocol/ssl/events.bif
	src/analyzer/protocol/ssl/ssl-analyzer.pac
2014-02-28 01:49:16 -08:00
Robin Sommer
d4b5da1597 Merge remote-tracking branch 'origin/topic/jsiwek/http-file-id-caching'
* origin/topic/jsiwek/http-file-id-caching:
  Revert use of HTTP file ID caching for gaps range request content.
  Extend file analysis API to allow file ID caching, adapt HTTP to use it.

BIT-1125 #merged
2014-01-31 08:41:31 -08:00
Jon Siwek
1842d324cb Extend file analysis API to allow file ID caching, adapt HTTP to use it.
This allows an analyzer to either provide file IDs associated with some
file content or to cache a file ID that was already determined by
script-layer logic so that subsequent calls to the file analysis
interface can bypass costly detours through script-layer.  This can
yield a decent performance improvement for analyzers that are able to
take advantage of it and deal with streaming content (like HTTP).
2014-01-29 15:34:24 -06:00
Bernhard Amann
f821a13cce Merge remote-tracking branch 'origin/master' into topic/bernhard/file-analysis-x509
Conflicts:
	src/analyzer/protocol/ssl/events.bif

Still broken.
2014-01-28 06:43:08 -08:00
Bernhard Amann
2c7e7f962e Make x509 certificates an opaque type 2014-01-28 06:39:50 -08:00
Robin Sommer
3f47c5bc87 Merge remote-tracking branch 'origin/master' into topic/robin/dynamic-plugins-2.3 2014-01-24 20:26:00 -08:00
Jon Siwek
e09763e061 Fix file_over_new_connection event to trigger when entire file is missed.
If a file is nothing but gaps (e.g. due to missing/dropped packets), Bro
can sometimes detect a file is supposed to have been present and never
saw any of its content, but failed to raise file_over_new_connection
events for it.  This was mostly apparent because the tx_hosts/rx_hosts
fields in files.log would not be populated in such cases (but are now
with this change).
2014-01-24 16:47:00 -06:00
Robin Sommer
2c34101394 Moving existing built-in plugins over to new interface. 2014-01-20 13:39:11 -08:00
Seth Hall
38dbba7622 More file reassembly work.
- The reassembly behavior can be modified per-file by enabling or
   disabling the reassembler and/or modifying the size of the reassembly
   buffer.

 - Changed the file extraction analyzer to use the stream to avoid
   issues with the chunk based approach not immediately triggering
   the file_new event due to mime-type detection delay.  Early chunks
   frequently ended up lost before.

 - Generally things are working now and I'd consider this in testing.
2014-01-05 04:58:01 -05:00
Seth Hall
0b78f444a1 Initial commit of file reassembly. 2013-12-20 00:05:08 -05:00
Robin Sommer
987452beff Cleanup of plugin component API.
- Move more functionality into base class.
- Remove cctors and assignment operators (weren't actually needed anymore)
- Switch from const char* to std::string.
2013-12-16 10:07:20 -08:00
Robin Sommer
d34f23c8d4 A set of file analysis extensions.
- Enable manager to associate analyzers with a MIME type. With that,
  one can now say enable all analyzers for, e.g., "image/gif". This is
  exposed to script-land as

    Files::add_analyzers_for_mime_type(f: fa_file, mtype: string)

  For MIME types identified via libmagic, this happens automatically
  (via the file_new() handler in files/main.bro).

- Extend the analyzer API to better match that of protocol analyzers:

    - Adding unique analyzer IDs so that we can refer to instances
      from script-land.

    - Adding subtypes to Components so that a single analyzer
      implementation can support different types of analyzers
      internally.

    - Add an analyzer method SetTag() that allows to set the tag after
      construction.

    - Adding Init() and Done() methods for consistency with what other
      classes offer.

- Add debug logging to the file_analysis stream.

TODO: test cases missing for the new script-land functionality.
2013-11-26 11:20:14 -08:00
Daniel Thayer
5b6468a302 Add documentation for event parameters
Added documentation that was missing for some event parameters, and
fixed documented name of event parameters.
2013-11-22 16:36:08 -06:00
Daniel Thayer
f0f1918954 Merge remote-tracking branch 'origin/master' into topic/dnthayer/doc-changes-for-2.2 2013-10-14 17:26:52 -05:00
Daniel Thayer
b5af589246 Improvements to file analysis docs
Fixed reference to wrong field name.
Added documentation of a function arg.
Added a couple references to other parts of the documentation.
Explained how not specifying extraction filename results in automatic
filename generation.
Several other minor clarifications.
2013-10-11 16:31:53 -05:00
Jon Siwek
b828a6ddc7 Review usage of Reporter::InternalError, addresses BIT-1045.
Replaced some with InternalWarning or InternalAnalyzerError, the later
being a new method which signals the analyzer to not process further
input.  Some usages I just removed if they didn't make sense or clearly
couldn't happen.  Also did some minor refactors of related code while
reviewing/exploring ways to get rid of InternalError usages.

Also, for TCP content file write failures there's a new event:
"contents_file_write_failure".
2013-10-10 14:45:06 -05:00
Jon Siwek
775ec6795e Fix uninitialized (or unused) fields. 2013-09-27 10:13:52 -05:00
Bernhard Amann
df552ca87d parse out extension. One event for general extensions (just returns the
openssl-parsed string-value), one event for basicconstraints (is a certificate
a CA or not) and one event for subject-alternative-names (only DNS parts).
2013-09-19 14:41:34 -07:00
Jon Siwek
a316878d01 Add checks to avoid improper negative values use. 2013-09-17 16:42:48 -05:00
Bernhard Amann
e5a589dbfe Very basic file-analyzer for x509 certificates. Mostly ripped from
the ssl-analyzer and the topic/bernhard/x509 branch.

Simply prints information about the encountered certificates (I have
not yet my mind up, what I will log...).

Next step: extensions...
2013-09-16 14:08:22 -07:00
Jon Siwek
5c119561ad UID optimizations addressing BIT-1016.
Max UID bit-length is now 128, but can be increased w/ trivial source
code change of BRO_UID_LEN.
2013-08-28 15:35:18 -05:00
Robin Sommer
f6b689db81 Merge remote-tracking branch 'origin/topic/jsiwek/uid'
* origin/topic/jsiwek/uid:
  Fix UID compiler warning/error & missed baselines.
  Increase UIDs to 96 bits w/ C/F prefix - BIT-1016
2013-08-27 12:36:12 -07:00
Jon Siwek
22bf3e1196 Increase UIDs to 96 bits w/ C/F prefix - BIT-1016
- The bit-length is adjustable via redef'ing bits_per_uid.

- Prefix 'C' is used for connection UIDS (including IP tunnels) and
  'F' for files.
2013-08-26 15:36:31 -05:00
Jon Siwek
814d827c44 Use macros to create file analyzer plugin classes. 2013-08-22 17:03:50 -05:00
Jon Siwek
89ae4ffd05 Add options to limit extracted file sizes w/ 100MB default. 2013-08-22 16:37:58 -05:00
Robin Sommer
83eae53f54 Merge remote-tracking branch 'origin/topic/seth/unified2-analyzer'
BIT-1054 #merged

* origin/topic/seth/unified2-analyzer:
  Fixes in case a packet isn't seen that matches an event.
  Finished work on unified2 analyzer.
  Fixed some tests.
  Working unified2 analyzer.
  Unified2 file analyzer updated to new plugin style.
  Adding the unified2 analyzer.

Conflicts:
	testing/btest/Baseline/coverage.bare-load-baseline/canonified_loaded_scripts.log
	testing/btest/Baseline/coverage.default-load-baseline/canonified_loaded_scripts.log
2013-08-13 18:37:52 -07:00
Seth Hall
f7c6dd7f7e Finished work on unified2 analyzer. 2013-08-13 03:21:43 -04:00
Seth Hall
091c8f3ebc Working unified2 analyzer.
- No output by default yet.  Most of the activity is centered
   around generating the Unified2::alert event which ties together
   an IDSEvent and a packet.
2013-08-12 14:57:12 -04:00
Seth Hall
04de4ce24b Unified2 file analyzer updated to new plugin style. 2013-08-10 22:26:32 -04:00
Seth Hall
b7877792c9 First commit of file entropy analyzer.
- Code comments need cleaned up still.
2013-08-05 00:02:48 -04:00
Jon Siwek
0d39e00bc4 Fix a ref counting bug.
BIT-1049 #request-merge
2013-08-01 14:39:35 -05:00
Jon Siwek
ee7dba806d Fix some build errors.
On GCC, some namespace sensitivity and file analyzer plugins now need
to link in Analyzer since it's not just a header anymore.
2013-08-01 12:17:51 -05:00
Jon Siwek
99c89b42d7 Internal refactoring of how plugin components are tagged/managed.
Made some class templates for code that seemed duplicated between
file/protocol tags and managers.  Seems like it helps a bit and
hopefully can be also be used to transition other things that have
enum value "tags" (e.g. logging writers, input readers) to the
plugin system.
2013-08-01 10:35:47 -05:00
Jon Siwek
9bd7a65071 Merge branch 'master' into topic/jsiwek/faf-updates
Conflicts:
	testing/btest/Baseline/coverage.default-load-baseline/canonified_loaded_scripts.log
2013-07-31 10:05:36 -05:00
Jon Siwek
5fa9c5865b Factor out the need for a tag field in Files::AnalyzerArgs record.
This cleans up internals of how analyzer instances get identified by the
tag plus any args given to it and doesn't change script code a user
would write.
2013-07-31 09:48:19 -05:00
Jon Siwek
8df4df0b8b Add a distinct tag class for file analyzers.
This should prevent assignment mismatches between file and protocol
analyzer tags.
2013-07-30 15:19:48 -05:00
Robin Sommer
984e9793db Merge remote-tracking branch 'origin/topic/seth/faf-updates'
* origin/topic/seth/faf-updates: (27 commits)
  Undoing the FTP tests I updated earlier.
  Update the last two btest FAF tests.
  File analysis fixes and test updates.
  Fix a bug with getting analyzer tags.
  A few test updates.
  Some tests work now (at least they all don't fail anymore!)
  Forgot a file.
  Added protocol description functions that provide a super compressed log representation.
  Fix a bug where orig file information in http wasn't working right.
  Added mime types to http.log
  Clean up queued but unused file_over_new_connections event args.
  Add jar files to the default MHR lookups.
  Adding CAB files for MHR checking.
  Improve malware hash registry script.
  Fix a small issue with finding smtp entities.
  Added support for files to the notice framework.
  Make the custom libmagic database a git submodule.
  Add an is_orig parameter to file_over_new_connection event.
  Make magic for emitting application/msword mime type less strict.
  Disable more libmagic builtin checks that override the magic database.
  ...

Conflicts:
	doc/scripts/DocSourcesList.cmake
	scripts/base/init-bare.bro
	scripts/test-all-policy.bro
	testing/btest/Baseline/coverage.bare-load-baseline/canonified_loaded_scripts.log
	testing/btest/Baseline/coverage.default-load-baseline/canonified_loaded_scripts.log
2013-07-29 14:21:52 -07:00
Jon Siwek
d84f6e012c Fix various documentation, mostly related to file analysis.
- Fix examples/references in the file analysis how-to/usage doc.

- Add Broxygen-generated docs for file analyzer plugins.

- Break FTP::Info type declaration out in to its own file to get
  rid of some circular dependencies (between s/b/p/ftp/main and
  s/b/p/ftp/utils).
2013-07-29 16:15:37 -05:00
Seth Hall
7ba51786e5 In progress checkpoint. Things are starting to work. 2013-07-27 08:10:08 -04:00
Seth Hall
1e098bae8d Moving the PE analyzer to the new plugin structure. 2013-07-27 00:07:47 -04:00
Seth Hall
998cedb3b8 Merge remote-tracking branch 'origin/master' into topic/seth/file-analysis-exe-analyzer
Conflicts:
	src/CMakeLists.txt
	src/binpac_bro.h
	src/event.bif
	src/file_analysis.bif
	src/file_analysis/AnalyzerSet.cc
2013-07-27 00:04:40 -04:00
Robin Sommer
4a7046848c bif files declared with bif_target() are now automatically compiled
in.

No more manual includes to pull them in.

(It doesn't quite work fully automatically yet for some bifs that need
script-level types defined, like the input and logging frameworks.
They still do a manual "@load foo.bif" in their main.bro to get the
order right. It's a bit tricky to fix that and would probably need
splitting main.bro into two parts; not sure that's worth it.)
2013-07-25 10:12:52 -07:00