Commit graph

100 commits

Author SHA1 Message Date
Jon Siwek
a83941d64d Deprecate internal_val() and internal_const_val()
Replaced with zeek::lookup_val() and zeek::lookup_const()
2020-05-14 17:23:19 -07:00
Jon Siwek
1abed4fd4c Migrate Tag classes to use IntrusivePtr
Deprecates various methods that previously took raw pointers
2020-05-14 17:18:00 -07:00
Johanna Amann
04ed125941 Merge remote-tracking branch 'origin/master' into topic/johanna/hash-unification 2020-05-06 23:18:33 +00:00
Johanna Amann
3bce313b12 Switch file UID hashing from md5 to highwayhash.
This commit switches UID hashing from md5 to a highway hash. It also
moves the salt value out of the file plugin - and makes it
installation-specific instead - it is moved to the global namespace.

There now are digest hash functions to make "static"
installation-specific hashes that are stable over workers available to
everyone; hashes can be 64, 128 or 256 bits in size.

Due to the fact that we switch the file hashing algorithm, all file
hashes change.

The underlyigng algorithm that is used for hashing is highwayhash-128,
which is significantly faster than md5.
2020-04-30 10:20:09 -07:00
Jon Siwek
2a63e4a4a2 Deprecate BuildConnVal() methods and update usages to ConnVal()
The later being a new method that returns IntrusivePtr
2020-04-16 17:00:01 -07:00
Jon Siwek
094d6de979 Update all BIFs to return IntrusivePtr instead of Val* 2020-04-16 17:00:01 -07:00
Jon Siwek
93f4c5871b Update deprecated ValManager::GetCount usages 2020-04-16 16:46:36 -07:00
Jon Siwek
0ddac4abcf Update deprecated ValManager::GetInt usages 2020-04-16 16:44:35 -07:00
Jon Siwek
d9edd855da Update deprecated ValManager::GetBool usages 2020-04-16 16:44:33 -07:00
Johanna Amann
876c803d75 Merge remote-tracking branch 'origin/topic/timw/776-using-statements'
* origin/topic/timw/776-using-statements:
  Remove 'using namespace std' from SerialTypes.h
  Remove other using statements from headers
  GH-776: Remove using statements added by PR 770

Includes small fixes in files that changed since the merge request was
made.

Also includes a few small indentation fixes.
2020-04-09 13:31:07 -07:00
Tim Wojtulewicz
393b8353cb file_analysis: Replace nulls with nullptr 2020-04-07 16:08:34 -07:00
Tim Wojtulewicz
d53c1454c0 Remove 'using namespace std' from SerialTypes.h
This unfortunately cuases a ton of flow-down changes because a lot of other
code was depending on that definition existing. This has a fairly large chance
to break builds of external plugins, considering how many internal ones it broke.
2020-04-07 15:59:59 -07:00
Jon Siwek
6980f63a91 Deprecate EventMgr::QueueEventFast() and update usages to Enqueue() 2020-03-25 16:09:33 -07:00
Max Kellermann
ba35ebec4c Type: return IntrusivePtr 2020-03-06 09:06:38 +01:00
Max Kellermann
0a6ddfb6b5 Val: add TableVal::Assign() overload with IntrusivePtr
Prepare the transition to IntrusivePtr for various callers.
2020-03-06 09:06:38 +01:00
Max Kellermann
0cf5799ca6 file_analysis: include cleanup 2020-02-13 10:12:03 +01:00
Tim Wojtulewicz
5a237d3a3f Use const-references in lots of places (preformance-unnecessary-value-param) 2020-02-11 14:11:18 -08:00
Max Kellermann
0db61f3094 include cleanup
The Zeek code base has very inconsistent #includes.  Many sources
included a few headers, and those headers included other headers, and
in the end, nearly everything is included everywhere, so missing
#includes were never noticed.  Another side effect was a lot of header
bloat which slows down the build.

First step to fix it: in each source file, its own header should be
included first to verify that each header's includes are correct, and
none is missing.

After adding the missing #includes, I replaced lots of #includes
inside headers with class forward declarations.  In most headers,
object pointers are never referenced, so declaring the function
prototypes with forward-declared classes is just fine.

This patch speeds up the build by 19%, because each compilation unit
gets smaller.  Here are the "time" numbers for a fresh build (with a
warm page cache but without ccache):

Before this patch:

 3144.94user 161.63system 3:02.87elapsed 1808%CPU (0avgtext+0avgdata 2168608maxresident)k
 760inputs+12008400outputs (1511major+57747204minor)pagefaults 0swaps

After this patch:

 2565.17user 141.83system 2:25.46elapsed 1860%CPU (0avgtext+0avgdata 1489076maxresident)k
 72576inputs+9130920outputs (1667major+49400430minor)pagefaults 0swaps
2020-02-04 20:51:02 +01:00
Jon Siwek
47235b57a6 Merge remote-tracking branch 'origin/topic/timw/deprecate-int-types'
* origin/topic/timw/deprecate-int-types:
  Deprecate the internal int/uint types in favor of the cstdint types they were based on

Merge adjustments:
  * A bpf type mistakenly got replaced (inside an unlikely #ifdef)
  * Did a few substitutions that got missed (likely due to
    pre-processing out of DEBUG macros)
2019-08-14 15:49:24 -07:00
Jon Siwek
87f85ecca1 Cleanups related to PDict -> std::map replacements 2019-08-13 19:57:42 -07:00
Tim Wojtulewicz
e6558d1f19 Remove other simple uses of PDict 2019-08-13 19:57:42 -07:00
Tim Wojtulewicz
54752ef9a1 Deprecate the internal int/uint types in favor of the cstdint types they were based on 2019-08-12 13:50:07 -07:00
Jon Siwek
b6862c5c59 Add methods to queue events without handler existence check
Added ConnectionEventFast() and QueueEventFast() methods to avoid
redundant event handler existence checks.

It's common practice for caller to already check for event handler
existence before doing all the work of constructing the arguments, so
it's desirable to not have to check for existence again.

E.g. going through ConnectionEvent() means 3 existence checks:
one you do yourself before calling it, one in ConnectionEvent(), and then
another in QueueEvent().

The existence check itself can be more than a few operations sometimes
as it needs to check a few flags that determine if it's enabled, has
a local body, or has any remote receivers in the old comm. system or
has been flagged as something to publish in the new comm. system.
2019-04-11 20:30:25 -07:00
Jon Siwek
8bc65f09ec Cleanup/improve PList usage and Event API
Majority of PLists are now created as automatic/stack objects,
rather than on heap and initialized either with the known-capacity
reserved upfront or directly from an initializer_list (so there's no
wasted slack in the memory that gets allocated for lists containing
a fixed/known number of elements).

Added versions of the ConnectionEvent/QueueEvent methods that take
a val_list by value.

Added a move ctor/assign-operator to Plists to allow passing them
around without having to copy the underlying array of pointers.
2019-04-11 20:30:25 -07:00
Johanna Amann
ffa6756255 Merge branch 'master' of https://github.com/rbclark/bro into topic/johanna/md5-fips
* 'master' of https://github.com/rbclark/bro:
  Tell OpenSSL that MD5 is not used for security in order to allow bro to work properly on a FIPS system
2019-01-18 15:34:06 -08:00
Robert Clark
a72e9a8126
Tell OpenSSL that MD5 is not used for security in order to allow bro to work properly on a FIPS system 2019-01-11 16:09:42 -05:00
Jon Siwek
2982765128 Pre-allocate and re-use Vals for bool, int, count, enum and empty string 2019-01-09 18:29:23 -06:00
Jon Siwek
05b10fe2e7 BIT-1544: allow NULs in file analysis handles 2018-08-15 18:03:02 -05:00
Johanna Amann
41285abea5 Make nearly all bool operators explicit.
These are a bit dangerous because the casting can happen in quite
unexpected circumstances and lead to undesirable comparison results.
2018-01-18 14:02:03 -08:00
Robin Sommer
0b5894ce23 Merge remote-tracking branch 'origin/topic/johanna/ocsp-sct-validate'
* origin/topic/johanna/ocsp-sct-validate:
  SSL SCT/OCSP: small fixes by robin; mostly update comments.
2017-08-04 13:28:08 -07:00
Johanna Amann
d5678418da SSL SCT/OCSP: small fixes by robin; mostly update comments.
SetMime now only works on the first call (as it was documented) and
unused code was used from one of the x.509 functions.
2017-08-01 16:30:08 -07:00
Robin Sommer
faa4150154 Merge remote-tracking branch 'origin/topic/johanna/ocsp-sct-validate'
Closes #1830.

* origin/topic/johanna/ocsp-sct-validate: (82 commits)
  Tiny script changes for SSL.
  Update CT Log list
  SSL: Update OCSP/SCT scripts and documentation.
  Revert "add parameter 'status_type' to event ssl_stapled_ocsp"
  Revert "parse multiple OCSP stapling responses"
  SCT: Fix script error when mime type of file unknown.
  SCT: another memory leak in SCT parsing.
  SCT validation: fix small memory leak (public keys were not freed)
  Change end-of-connection handling for validation
  OCSP/TLS/SCT: Fix a number of test failures.
  SCT Validate: make caching a bit less aggressive.
  SSL: Fix type of ssl validation result
  TLS-SCT: compile on old versions of OpenSSL (1.0.1...)
  SCT: Add caching support for validation
  SCT: Add signed certificate timestamp validation script.
  SCT: Allow verification of SCTs in Certs.
  SCT: only compare correct OID/NID for Cert/OCSP.
  SCT: add validation of proofs for extensions and OCSP.
  SCT: pass timestamp as uint64 instead of time
  Add CT log information to Bro
  ...
2017-07-30 08:49:41 -07:00
Johanna Amann
9fd7816501 Allow File analyzers to direcly pass mime type.
This makes it much easier for protocols where the mime type is known in
advance like, for example, TLS. We now do no longer have to perform deep
script-level magic.
2017-02-10 17:03:33 -08:00
Seth Hall
ee3e885712 Lots of fixes for file type identification.
- Plain text now identified with BOMs for UTF8,16,32
   (even though 16 and 32 wouldn't get identified as plain text, oh-well)
 - X.509 certificates are now populating files.log with
   the mime type application/pkix-cert.
 - File signatures are split apart into file types
   to help group and organize signatures a bit better.
 - Normalized some FILE_ANALYSIS debug messages.
 - Improved Javascript detection.
 - Improved HTML detection.
 - Removed a bunch of bad signatures.
 - Merged a bunch of signatures that ultimately detected
   the same mime type.
 - Added detection for MS LNK files.
 - Added detection for cross-domain-policy XML files.
 - Added detection for SOAP envelopes.
2015-03-13 22:14:44 -04:00
Jon Siwek
cbbe7b52dc Review/fix/change file reassembly functionality.
- Re-arrange how some fa_file fields (e.g. source, connection info, mime
  type) get updated/set for consistency.

- Add more robust mechanisms for flushing the reassembly buffer.
  The goal being to report all gaps and deliveries to file analyzers
  regardless of the state of the reassembly buffer at the time it has to
  be flushed.
2014-12-16 14:05:15 -06:00
Seth Hall
cafd35e746 Updates the files event api and brings file reassembly up to master. 2014-09-26 00:40:37 -04:00
Seth Hall
42b2d56279 Merge remote-tracking branch 'origin/master' into topic/seth/files-tracking
Conflicts:
	scripts/base/frameworks/files/main.bro
	src/file_analysis/File.cc
	testing/btest/Baseline/scripts.base.frameworks.file-analysis.actions.data_event/out
2014-09-23 13:05:39 -04:00
Robin Sommer
f4cbcb9b03 Converting log writers and input readers to plugins. 2014-07-20 19:17:58 +02:00
Robin Sommer
9616cd8e61 Further polishing and cleanup in preparation for merge. 2014-07-12 18:12:09 -07:00
Seth Hall
8d9940c8c3 Merge remote-tracking branch 'origin/master' into topic/seth/files-tracking
Conflicts:
	src/Reassem.cc
	src/Reassem.h
	src/analyzer/protocol/tcp/TCP_Reassembler.cc
	testing/btest/Baseline/scripts.base.frameworks.file-analysis.bifs.set_timeout_interval/bro..stdout
	testing/btest/Baseline/scripts.base.frameworks.file-analysis.http.partial-content/b.out
	testing/btest/Baseline/scripts.base.frameworks.file-analysis.http.partial-content/c.out
	testing/btest/Baseline/scripts.base.frameworks.file-analysis.logging/files.log
2014-05-27 10:56:11 -04:00
Robin Sommer
bbd409d274 Merge remote-tracking branch 'origin/master' into topic/robin/dynamic-plugins-2.3
(Never good to name a branch after version anticipated to include it ...)
2014-05-14 16:23:04 -07:00
Jon Siwek
af3b87e100 Fix buffer over-reads in file_analysis::Manager::Terminate() 2014-05-06 12:36:02 -05:00
Jon Siwek
4b059ea15a Improve file analysis manager shutdown/cleanup.
file_analysis::Manager's dtor now doesn't assume any more analysis
progress can be made because too many of Bro's other subsystems
are shutdown by that point.  Any file analysis requests made after
Terminate cannot be reliably processed.
2014-04-29 12:44:53 -05:00
Jon Siwek
bc5c02cb74 Refactor file analysis file ID lookup.
Now using a dictionary instead of std::map as order doesn't matter and
lookup time shouldn't increase as more files are in process of being
analyzed.
2014-04-18 16:35:43 -05:00
Jon Siwek
095a68b2ec Various minor changes related to file mime type detection.
- Improve or just remove some file magic signatures ported from libmagic
  that were too general and matched incorrectly too often.

- Fix MHR script's use of fa_file$mime_type before checking if it's
  initialized.  It may be uninitialized if no signatures match.

- The "fa_file" record now contains a "mime_types" field that contains
  all magic signatures that matched the file content (where the
  "mime_type" field is just a shortcut for the strongest match).
2014-03-06 11:41:10 -06:00
Jon Siwek
0865b152bb Refactor common MIME magic matching code.
Put some methods in file_analysis::Manager that can perform the
matching process and return MIME type results.  Also helps to
centralize the management/re-use of a signature matcher object.
2014-03-05 10:49:57 -06:00
Robin Sommer
d4b5da1597 Merge remote-tracking branch 'origin/topic/jsiwek/http-file-id-caching'
* origin/topic/jsiwek/http-file-id-caching:
  Revert use of HTTP file ID caching for gaps range request content.
  Extend file analysis API to allow file ID caching, adapt HTTP to use it.

BIT-1125 #merged
2014-01-31 08:41:31 -08:00
Jon Siwek
1842d324cb Extend file analysis API to allow file ID caching, adapt HTTP to use it.
This allows an analyzer to either provide file IDs associated with some
file content or to cache a file ID that was already determined by
script-layer logic so that subsequent calls to the file analysis
interface can bypass costly detours through script-layer.  This can
yield a decent performance improvement for analyzers that are able to
take advantage of it and deal with streaming content (like HTTP).
2014-01-29 15:34:24 -06:00
Seth Hall
38dbba7622 More file reassembly work.
- The reassembly behavior can be modified per-file by enabling or
   disabling the reassembler and/or modifying the size of the reassembly
   buffer.

 - Changed the file extraction analyzer to use the stream to avoid
   issues with the chunk based approach not immediately triggering
   the file_new event due to mime-type detection delay.  Early chunks
   frequently ended up lost before.

 - Generally things are working now and I'd consider this in testing.
2014-01-05 04:58:01 -05:00
Robin Sommer
987452beff Cleanup of plugin component API.
- Move more functionality into base class.
- Remove cctors and assignment operators (weren't actually needed anymore)
- Switch from const char* to std::string.
2013-12-16 10:07:20 -08:00