Commit graph

81 commits

Author SHA1 Message Date
Jon Siwek
d9edd855da Update deprecated ValManager::GetBool usages 2020-04-16 16:44:33 -07:00
Jon Siwek
9af84bb2b0 Update deprecated ValManager GetTrue/GetFalse usages 2020-04-16 16:40:59 -07:00
Johanna Amann
876c803d75 Merge remote-tracking branch 'origin/topic/timw/776-using-statements'
* origin/topic/timw/776-using-statements:
  Remove 'using namespace std' from SerialTypes.h
  Remove other using statements from headers
  GH-776: Remove using statements added by PR 770

Includes small fixes in files that changed since the merge request was
made.

Also includes a few small indentation fixes.
2020-04-09 13:31:07 -07:00
Tim Wojtulewicz
393b8353cb file_analysis: Replace nulls with nullptr 2020-04-07 16:08:34 -07:00
Tim Wojtulewicz
d53c1454c0 Remove 'using namespace std' from SerialTypes.h
This unfortunately cuases a ton of flow-down changes because a lot of other
code was depending on that definition existing. This has a fairly large chance
to break builds of external plugins, considering how many internal ones it broke.
2020-04-07 15:59:59 -07:00
Tim Wojtulewicz
fd5e15b116 The Great Embooleanating
A large number of functions had return values and/or arguments changed
to use ``bool`` types instead of ``int``.
2020-03-31 06:41:54 +00:00
Jon Siwek
e7e5cf0f89 Use const-ref parameter for zeek::val_list_to_args()
It ended up being used a bit more than initially expected and this
is closer to the style we're generally aiming for.
2020-03-26 11:32:01 -07:00
Jon Siwek
e394ea38bc Deprecate file_analysis::File::FileEvent methods using val_list args
And update usages to the overload that takes a zeek::Args instead.
2020-03-25 18:40:49 -07:00
Jon Siwek
6980f63a91 Deprecate EventMgr::QueueEventFast() and update usages to Enqueue() 2020-03-25 16:09:33 -07:00
Jon Siwek
b62727a7fa Merge branch 'intrusive_ptr' of https://github.com/MaxKellermann/zeek
* 'intrusive_ptr' of https://github.com/MaxKellermann/zeek: (32 commits)
  Scope: store IntrusivePtr in `local`
  Scope: pass IntrusivePtr to AddInit()
  DNS_Mgr: use class IntrusivePtr
  Scope: use class IntrusivePtr
  Attr: use class IntrusivePtr
  Expr: check_and_promote_expr() returns IntrusivePtr
  Frame: use class IntrusivePtr
  Val: RecordVal::LookupWithDefault() returns IntrusivePtr
  Type: RecordType::FieldDefault() returns IntrusivePtr
  Val: TableVal::Delete() returns IntrusivePtr
  Type: base_type() returns IntrusivePtr
  Type: init_type() returns IntrusivePtr
  Type: merge_types() returns IntrusivePtr
  Type: use class IntrusivePtr in VectorType
  Type: use class IntrusivePtr in EnumType
  Type: use class IntrusivePtr in FileType
  Type: use class IntrusivePtr in TypeDecl
  Type: make TypeDecl `final` and the dtor non-`virtual`
  Type: use class IntrusivePtr in TypeType
  Type: use class IntrusivePtr in FuncType
  ...
2020-03-17 22:51:46 -07:00
Max Kellermann
79570fdfd6 Val: RecordVal::LookupWithDefault() returns IntrusivePtr 2020-03-06 09:06:46 +01:00
Max Kellermann
73cea5dcad Type: use class IntrusivePtr in TypeList 2020-03-06 09:06:38 +01:00
Max Kellermann
de0289125b Type: use class IntrusivePtr in IndexType 2020-03-06 09:06:38 +01:00
Max Kellermann
674e141a15 Val: use class IntrusivePtr in class TableVal 2020-03-06 09:06:38 +01:00
Max Kellermann
0a6ddfb6b5 Val: add TableVal::Assign() overload with IntrusivePtr
Prepare the transition to IntrusivePtr for various callers.
2020-03-06 09:06:38 +01:00
Max Kellermann
0db61f3094 include cleanup
The Zeek code base has very inconsistent #includes.  Many sources
included a few headers, and those headers included other headers, and
in the end, nearly everything is included everywhere, so missing
#includes were never noticed.  Another side effect was a lot of header
bloat which slows down the build.

First step to fix it: in each source file, its own header should be
included first to verify that each header's includes are correct, and
none is missing.

After adding the missing #includes, I replaced lots of #includes
inside headers with class forward declarations.  In most headers,
object pointers are never referenced, so declaring the function
prototypes with forward-declared classes is just fine.

This patch speeds up the build by 19%, because each compilation unit
gets smaller.  Here are the "time" numbers for a fresh build (with a
warm page cache but without ccache):

Before this patch:

 3144.94user 161.63system 3:02.87elapsed 1808%CPU (0avgtext+0avgdata 2168608maxresident)k
 760inputs+12008400outputs (1511major+57747204minor)pagefaults 0swaps

After this patch:

 2565.17user 141.83system 2:25.46elapsed 1860%CPU (0avgtext+0avgdata 1489076maxresident)k
 72576inputs+9130920outputs (1667major+49400430minor)pagefaults 0swaps
2020-02-04 20:51:02 +01:00
Jon Siwek
47235b57a6 Merge remote-tracking branch 'origin/topic/timw/deprecate-int-types'
* origin/topic/timw/deprecate-int-types:
  Deprecate the internal int/uint types in favor of the cstdint types they were based on

Merge adjustments:
  * A bpf type mistakenly got replaced (inside an unlikely #ifdef)
  * Did a few substitutions that got missed (likely due to
    pre-processing out of DEBUG macros)
2019-08-14 15:49:24 -07:00
Tim Wojtulewicz
54752ef9a1 Deprecate the internal int/uint types in favor of the cstdint types they were based on 2019-08-12 13:50:07 -07:00
Jon Siwek
b6862c5c59 Add methods to queue events without handler existence check
Added ConnectionEventFast() and QueueEventFast() methods to avoid
redundant event handler existence checks.

It's common practice for caller to already check for event handler
existence before doing all the work of constructing the arguments, so
it's desirable to not have to check for existence again.

E.g. going through ConnectionEvent() means 3 existence checks:
one you do yourself before calling it, one in ConnectionEvent(), and then
another in QueueEvent().

The existence check itself can be more than a few operations sometimes
as it needs to check a few flags that determine if it's enabled, has
a local body, or has any remote receivers in the old comm. system or
has been flagged as something to publish in the new comm. system.
2019-04-11 20:30:25 -07:00
Jon Siwek
8bc65f09ec Cleanup/improve PList usage and Event API
Majority of PLists are now created as automatic/stack objects,
rather than on heap and initialized either with the known-capacity
reserved upfront or directly from an initializer_list (so there's no
wasted slack in the memory that gets allocated for lists containing
a fixed/known number of elements).

Added versions of the ConnectionEvent/QueueEvent methods that take
a val_list by value.

Added a move ctor/assign-operator to Plists to allow passing them
around without having to copy the underlying array of pointers.
2019-04-11 20:30:25 -07:00
Jon Siwek
995368e68c Remove variable content from weird names
This changes many weird names to move non-static content from the
weird name into the "addl" field to help ensure the total number of
weird names is reasonably bounded.  Note the net_weird and flow_weird
events do not have an "addl" parameter, so information may no longer
be available in those cases -- to make it available again we'd need
to either (1) define new events that contain such a parameter, or
(2) change net_weird/flow_weird event signature (which is a breaking
change for user-code at the moment).

Also, the generic handling of binpac exceptions for analyzers which
to not otherwise catch and handle them has been changed from a Weird
to a ProtocolViolation.

Finally, a new "file_weird" event has been added for reporting
weirdness found during file analysis.
2019-04-01 18:30:11 -07:00
Jon Siwek
2982765128 Pre-allocate and re-use Vals for bool, int, count, enum and empty string 2019-01-09 18:29:23 -06:00
Jon Siwek
1e4964de77 Preallocate all possible PortVals.
The performance benefit is small (maybe ~1% at most), however, it's a
trivial change without downsides.
2017-12-11 15:29:28 -06:00
Johanna Amann
d5678418da SSL SCT/OCSP: small fixes by robin; mostly update comments.
SetMime now only works on the first call (as it was documented) and
unused code was used from one of the x.509 functions.
2017-08-01 16:30:08 -07:00
Johanna Amann
9fd7816501 Allow File analyzers to direcly pass mime type.
This makes it much easier for protocols where the mime type is known in
advance like, for example, TLS. We now do no longer have to perform deep
script-level magic.
2017-02-10 17:03:33 -08:00
Johanna Amann
1de6cfc2e3 Fix memory leak in file analyzer.
This undoes the changes applied in merge 9db27a6d60
and goes back to the state in the branch as of the merge 5ab3b86.

Getting rid of the additional layer of removing analyzers and just
keeping them in the set introduced subtle differences in behavior since
a few calls were still passed along. Skipping all of these with SetSkip
introduced yet other subtle behavioral differences.
2017-02-04 16:47:07 -08:00
Johanna Amann
9db27a6d60 Merge remote-tracking branch 'origin/topic/robin/file-analysis-fixes'
* origin/topic/robin/file-analysis-fixes:
  Adding test with command line that used to trigger a crash.
  Cleaning up a couple of comments.
  Fix delay in disabling file analyzers.
  Fix file analyzer memory management.

The merge changes around functionality a bit again - instead of having
a list of done analyzers, analyzers are simply set to skipping when they
are removed, and cleaned up later on destruction of the AnalyzerSet.

BIT-1782 #merged
2017-02-01 14:20:14 -08:00
Robin Sommer
fead5f5d5e Fix delay in disabling file analyzers.
When a file analyzer signaled being done with data delivery, the
analyzer would only be scheduled for removal at that poing, meaning it
could still receive more data until that action actually took effect.
Now we make sure to not send any more data to an analyzer.
2017-01-28 13:24:13 -08:00
Robin Sommer
3ce6a031d4 Fix file analyzer memory management.
File analyzers got deleted immediately once the queue with the
corresponding removal operation got drained. That however can happen
while the analyzer is still doing stuff: the queue is drained whenever
any the "special" file analysis events needing immediate attention has
been executed. This fix now only schedules the analyzer for deletion
at that time, but postpones the actual operation until file object
itself is being destroyed.
2017-01-28 13:07:51 -08:00
Johanna Amann
c464cf78dd Fix a number of format errors when using debug macros. 2016-08-12 15:42:02 -07:00
Johanna Amann
2756dfe581 Make x509 intel seen script robust against file analyzer ordering.
Now it consistently works, even if the SHA1 file analyzer gets the data
before the X509 file analyzer.
2016-08-11 16:12:08 -07:00
Robin Sommer
ed91732e09 Merge remote-tracking branch 'origin/topic/seth/more-file-type-ident-fixes'
* origin/topic/seth/more-file-type-ident-fixes:
  File API updates complete.
  Fixes for file type identification.
  API changes to file analysis mime type detection.
  Make HTTP 206 reassembly require ETags by default.
  More file type identification improvements
  Fix an issue with files having gaps before the bof_buffer is filled.
  Fix an issue with packet loss in http file reporting.
  Adding WOFF fonts to file type identification.
  Extended JSON matching and added OCSP responses.
  Another large signature update.
  More signature updates.
  Even more file type ident clean up.
  Lots of fixes for file type identification.

BIT-1368 #merged
2015-04-20 13:31:00 -07:00
Seth Hall
ed375167c8 File API updates complete.
Addresses BIT-1368.
2015-04-20 10:46:48 -04:00
Seth Hall
038e4c24f6 Merge remote-tracking branch 'origin/topic/jsiwek/bit-1368' into topic/seth/more-file-type-ident-fixes
Conflicts:
	src/file_analysis/File.cc
	testing/btest/Baseline/plugins.hooks/output
2015-04-20 09:36:40 -04:00
Jon Siwek
a55ce01ef3 API changes to file analysis mime type detection.
Removed "file_mime_type" and "file_mime_types" event, replacing them
with a new event called "file_metadata_inferred".  It has a record
argument of type "inferred_file_metadata", which contains the mime type
information that the earlier events used to supply.  The idea here is
that future extensions to the record with new metadata will be less
likely to break user code than the alternatives (adding new events or
new event parameters).

Addresses BIT-1368.
2015-04-10 16:31:29 -05:00
Seth Hall
6162d986a2 Fix an issue with files having gaps before the bof_buffer is filled.
When files had gaps prior to the bof_buffer completely filling, the
file gap handling code was never sniffing and passing along as much
data as possible so file type identification wasn't working correctly.
2015-04-08 13:41:03 -04:00
Seth Hall
ee3e885712 Lots of fixes for file type identification.
- Plain text now identified with BOMs for UTF8,16,32
   (even though 16 and 32 wouldn't get identified as plain text, oh-well)
 - X.509 certificates are now populating files.log with
   the mime type application/pkix-cert.
 - File signatures are split apart into file types
   to help group and organize signatures a bit better.
 - Normalized some FILE_ANALYSIS debug messages.
 - Improved Javascript detection.
 - Improved HTML detection.
 - Removed a bunch of bad signatures.
 - Merged a bunch of signatures that ultimately detected
   the same mime type.
 - Added detection for MS LNK files.
 - Added detection for cross-domain-policy XML files.
 - Added detection for SOAP envelopes.
2015-03-13 22:14:44 -04:00
Jon Siwek
1012539ded Merge branch 'topic/seth/small-files-bof-handling-fix'
* topic/seth/small-files-bof-handling-fix:
  Fix a bug in the core files framework with handling the BOF buffer.

BIT-1310 #merged
2015-02-05 10:10:00 -06:00
Seth Hall
a97cd1f3a2 Fix a bug in the core files framework with handling the BOF buffer.
- Any files where the total size was below the size of the
   default bof_buffer size couldn't have stream analyzers successfully
   attached because the bof_buffer never reached the full size
   and was never flushed.  This branch explicitly marks the buf_buffer
   as full and flushes it when the file is being removed.
2015-02-05 09:09:08 -05:00
Jon Siwek
6941538f81 Fix reference counting bug in refactored file reassembly code. 2014-12-16 20:58:27 -06:00
Jon Siwek
cbbe7b52dc Review/fix/change file reassembly functionality.
- Re-arrange how some fa_file fields (e.g. source, connection info, mime
  type) get updated/set for consistency.

- Add more robust mechanisms for flushing the reassembly buffer.
  The goal being to report all gaps and deliveries to file analyzers
  regardless of the state of the reassembly buffer at the time it has to
  be flushed.
2014-12-16 14:05:15 -06:00
Seth Hall
cafd35e746 Updates the files event api and brings file reassembly up to master. 2014-09-26 00:40:37 -04:00
Seth Hall
42b2d56279 Merge remote-tracking branch 'origin/master' into topic/seth/files-tracking
Conflicts:
	scripts/base/frameworks/files/main.bro
	src/file_analysis/File.cc
	testing/btest/Baseline/scripts.base.frameworks.file-analysis.actions.data_event/out
2014-09-23 13:05:39 -04:00
Jon Siwek
7a46a70b77 BIT-1240: Fix MIME entity file data/gap ordering.
MIME entities buffered data and passed it along to protocol analyzers in
discrete amounts, but a gap is always passed along right away, so the
ordering of these "events" can cause incorrect file analysis.  The
change here is to never leave any MIME data buffered -- it should now be
passed along line by line as it is seen, but may still temporarily make
use of a buffer allocated by the analyzer as it works on decoding
content.
2014-09-08 18:04:03 -05:00
Seth Hall
8d9940c8c3 Merge remote-tracking branch 'origin/master' into topic/seth/files-tracking
Conflicts:
	src/Reassem.cc
	src/Reassem.h
	src/analyzer/protocol/tcp/TCP_Reassembler.cc
	testing/btest/Baseline/scripts.base.frameworks.file-analysis.bifs.set_timeout_interval/bro..stdout
	testing/btest/Baseline/scripts.base.frameworks.file-analysis.http.partial-content/b.out
	testing/btest/Baseline/scripts.base.frameworks.file-analysis.http.partial-content/c.out
	testing/btest/Baseline/scripts.base.frameworks.file-analysis.logging/files.log
2014-05-27 10:56:11 -04:00
Robin Sommer
bbd409d274 Merge remote-tracking branch 'origin/master' into topic/robin/dynamic-plugins-2.3
(Never good to name a branch after version anticipated to include it ...)
2014-05-14 16:23:04 -07:00
Jon Siwek
8126f06ffb Enforce data size limit when checking files for MIME matches.
The value of *bof_buffer_size* in the *fa_file* record was supposed to
always limit the amount of data used by the signature matching engine,
but some corner cases would cause matching to be performed on data
beyond that.
2014-04-21 16:51:45 -05:00
Robin Sommer
9efb549236 Merge remote-tracking branch 'origin/topic/jsiwek/file-signatures'
* origin/topic/jsiwek/file-signatures:
  File type detection changes and fix https.log {orig,resp}_fuids fields.
  Various minor changes related to file mime type detection.
  Refactor common MIME magic matching code.
  Replace libmagic w/ Bro signatures for file MIME type identification.

Conflicts:
	scripts/base/init-default.bro
	testing/btest/Baseline/coverage.bare-load-baseline/canonified_loaded_scripts.log
	testing/btest/Baseline/coverage.default-load-baseline/canonified_loaded_scripts.log

BIT-1143 #merged
2014-03-30 22:51:05 +02:00
Jon Siwek
095a68b2ec Various minor changes related to file mime type detection.
- Improve or just remove some file magic signatures ported from libmagic
  that were too general and matched incorrectly too often.

- Fix MHR script's use of fa_file$mime_type before checking if it's
  initialized.  It may be uninitialized if no signatures match.

- The "fa_file" record now contains a "mime_types" field that contains
  all magic signatures that matched the file content (where the
  "mime_type" field is just a shortcut for the strongest match).
2014-03-06 11:41:10 -06:00
Jon Siwek
0865b152bb Refactor common MIME magic matching code.
Put some methods in file_analysis::Manager that can perform the
matching process and return MIME type results.  Also helps to
centralize the management/re-use of a signature matcher object.
2014-03-05 10:49:57 -06:00