Commit graph

27 commits

Author SHA1 Message Date
Seth Hall
809660d48a Tiny mime-type fix from Dan Caselden. 2017-02-14 07:21:00 -08:00
Seth Hall
645ec39f4b New file types sigs from Keith Lehigh. 2017-01-31 23:33:58 -05:00
Seth Hall
04d41dce5c Tiny xlsx file signature fix.
Thanks to Dan Caselden for noticing!
2016-12-08 08:32:45 -05:00
Seth Hall
15f5deed87 Add a files framework signature for VIM tmp files. 2016-11-02 11:51:38 -04:00
Seth Hall
514dfc3479 Merge remote-tracking branch 'origin/master' into topic/seth/smb
# Conflicts:
#	testing/btest/Baseline/plugins.hooks/output
#	testing/btest/Baseline/scripts.policy.misc.dump-events/all-events.log
#	testing/btest/Baseline/scripts.policy.misc.dump-events/smtp-events.log
2016-06-29 09:43:31 -04:00
Seth Hall
2e9491482f Add ACE archive files to the identified file types.
Addresses BIT-1609.  Thanks Stephen Hosom!
2016-06-14 22:27:09 -04:00
Seth Hall
1fe9e522fb Merge remote-tracking branch 'origin/master' into topic/seth/smb 2016-04-14 21:39:48 -04:00
Seth Hall
9aa9618473 Additional mime types for file identification and a few fixes.
Some of the existing mime types received extended matchers
to fix problems with UTF-16 BOMs.

New file mime types:
 - .ini files
 - MS Registry policy files
 - MS Registry files
 - MS Registry format files (e.g. DESKTOP.DAT)
 - MS Outlook PST files
 - Apple AFPInfo files

Mime type fixes:
 - MP3 files with ID3 tags.
 - JSON and XML matchers were extended
2016-04-14 10:06:58 -04:00
Seth Hall
017fa13393 Fix mime type identification for Windows LNK files. 2016-04-04 15:20:03 -04:00
dmfreemon@users.noreply.github.com
b14b189d12 add support for MIME type video/MP2T
BIT-1457 #merged
2015-08-21 17:32:19 -07:00
Seth Hall
217ccf6063 Add signature support for F4M files. 2015-06-02 12:48:53 -04:00
Robin Sommer
ed91732e09 Merge remote-tracking branch 'origin/topic/seth/more-file-type-ident-fixes'
* origin/topic/seth/more-file-type-ident-fixes:
  File API updates complete.
  Fixes for file type identification.
  API changes to file analysis mime type detection.
  Make HTTP 206 reassembly require ETags by default.
  More file type identification improvements
  Fix an issue with files having gaps before the bof_buffer is filled.
  Fix an issue with packet loss in http file reporting.
  Adding WOFF fonts to file type identification.
  Extended JSON matching and added OCSP responses.
  Another large signature update.
  More signature updates.
  Even more file type ident clean up.
  Lots of fixes for file type identification.

BIT-1368 #merged
2015-04-20 13:31:00 -07:00
Seth Hall
faabe8a5e3 Fixes for file type identification.
- Backed out eTag changes.  The real world is more complicated
   than just using eTags to identify the same file.
 - A bit of code simplication in the http base scripts.
 - Test updates (more existing small problems were identified!).
 -
2015-04-20 09:34:09 -04:00
Seth Hall
e8c87e19bd More file type identification improvements
- Split fonts into their own file.
 - Improved JSON matching.
 - Added XML-RPC content matching using application/xml-rpc
 - Added OCSP requests
2015-04-09 01:23:55 -04:00
Seth Hall
8fd5e7f382 Adding WOFF fonts to file type identification. 2015-04-07 02:06:02 -04:00
Seth Hall
422e558d77 Extended JSON matching and added OCSP responses. 2015-04-07 00:46:10 -04:00
Seth Hall
99061fff4c Another large signature update.
- Lots of cleanup and expansion of XML match types.
   - Signatures for ATOM and RSS (text/atom, text/rss).
   - Improved SOAP signature.
   - Improved text/cross-domain-policy signature
 - Improved and expanded javascript matching a bit.
 - Removed a lot of potentially problematic signatures (performance)
 - Split out more signatures from libmagic.sig
 - Added a signature for matching JSON.  Seems to work ok.
 - Signature for MPEGv4 audio.
 - Expanded java applet signature.
 - Improved PNG matching.
 - Improved MP3 matching.
2015-04-06 23:40:20 -04:00
Seth Hall
6861ecc046 More signature updates. 2015-04-06 17:21:53 -04:00
Seth Hall
19f498b4a4 Even more file type ident clean up.
- Add detection for ColdFusion scripts.
 - Support detection of XML/HTML with prefixed comment blocks.
2015-03-14 00:25:13 -04:00
Seth Hall
ee3e885712 Lots of fixes for file type identification.
- Plain text now identified with BOMs for UTF8,16,32
   (even though 16 and 32 wouldn't get identified as plain text, oh-well)
 - X.509 certificates are now populating files.log with
   the mime type application/pkix-cert.
 - File signatures are split apart into file types
   to help group and organize signatures a bit better.
 - Normalized some FILE_ANALYSIS debug messages.
 - Improved Javascript detection.
 - Improved HTML detection.
 - Removed a bunch of bad signatures.
 - Merged a bunch of signatures that ultimately detected
   the same mime type.
 - Added detection for MS LNK files.
 - Added detection for cross-domain-policy XML files.
 - Added detection for SOAP envelopes.
2015-03-13 22:14:44 -04:00
Seth Hall
7ee34981aa Improve TAR file detection and other small changes.
- Remove all of the x-c detections.  Nearly all false
    positives.

  - Remove the back up TAR detections.  Not very helpful.

  - Remove one of the x-elc detections that was too loose
    and caused many false positives.
2014-11-05 11:31:48 -05:00
Seth Hall
d77243823f Updates for file mime type identification.
- Change to the default BOF buffer size to 3000 (was 1024).
 - Reorganized MS signatures into a separate file
 - Improved lots of the signatures and added new ones.
2014-10-08 02:12:10 -04:00
Seth Hall
80656d5294 Improves shockwave flash file signatures.
- This moves the signatures out of the libmagic imported signatures
   and into our own general.sig.

 - Expand the detection to LZMA compressed flash files.
2014-10-06 11:13:13 -04:00
Robin Sommer
9efb549236 Merge remote-tracking branch 'origin/topic/jsiwek/file-signatures'
* origin/topic/jsiwek/file-signatures:
  File type detection changes and fix https.log {orig,resp}_fuids fields.
  Various minor changes related to file mime type detection.
  Refactor common MIME magic matching code.
  Replace libmagic w/ Bro signatures for file MIME type identification.

Conflicts:
	scripts/base/init-default.bro
	testing/btest/Baseline/coverage.bare-load-baseline/canonified_loaded_scripts.log
	testing/btest/Baseline/coverage.default-load-baseline/canonified_loaded_scripts.log

BIT-1143 #merged
2014-03-30 22:51:05 +02:00
Jon Siwek
8dad5026fd File type detection changes and fix https.log {orig,resp}_fuids fields.
- Removed "binary" and "octet-stream" mime type detections. They don't
  provide any more information than an uninitialized mime_type field
  which implicitly means no magic signature matches and so the media
  type is unknown to Bro.

- Slight change to "text/plain" signature.  It's still not the most
  accurate, which is reflected in its -20 strength value.

- The logic for adding file ids to {orig,resp}_fuids fields of
  the http.log incorrectly depended on the state of
  {orig,resp}_mime_types fields, so sometimes not all file ids
  associated w/ the session were logged.
2014-03-25 12:44:11 -05:00
Jon Siwek
095a68b2ec Various minor changes related to file mime type detection.
- Improve or just remove some file magic signatures ported from libmagic
  that were too general and matched incorrectly too often.

- Fix MHR script's use of fa_file$mime_type before checking if it's
  initialized.  It may be uninitialized if no signatures match.

- The "fa_file" record now contains a "mime_types" field that contains
  all magic signatures that matched the file content (where the
  "mime_type" field is just a shortcut for the strongest match).
2014-03-06 11:41:10 -06:00
Jon Siwek
b22ca5d0a3 Replace libmagic w/ Bro signatures for file MIME type identification.
Notable changes:

- libmagic is no longer used at all.  All MIME type detection is
  done through new Bro signatures, and there's no longer a means to get
  verbose file type descriptions (e.g. "PNG image data, 1435 x 170").
  The majority of the default file magic signatures are derived
  from the default magic database of libmagic ~5.17.

- File magic signatures consist of two new constructs in the
  signature rule parsing grammar: "file-magic" gives a regular
  expression to match against, and "file-mime" gives the MIME type
  string of content that matches the magic and an optional strength
  value for the match.

- Modified signature/rule syntax for identifiers: they can no longer
  start with a '-', which made for ambiguous syntax when doing negative
  strength values in "file-mime".  Also brought syntax for Bro script
  identifiers in line with reality (they can't start with numbers or
  include '-' at all).

- A new Built-In Function, "file_magic", can be used to get all
  file magic matches and their corresponding strength against a given
  chunk of data

- The second parameter of the "identify_data" Built-In Function
  can no longer be used to get verbose file type descriptions, though it
  can still be used to get the strongest matching file magic signature.

- The "file_transferred" event's "descr" parameter no longer
  contains verbose file type descriptions.

- The BROMAGIC environment variable no longer changes any behavior
  in Bro as magic databases are no longer used/installed.

- Reverted back to minimum requirement of CMake 2.6.3 from 2.8.0
  (it's back to being the same requirement as the Bro v2.2 release).
  The bump was to accomodate building libmagic as an external project,
  which is no longer needed.

Addresses BIT-1143.
2014-03-04 11:12:06 -06:00