Commit graph

110 commits

Author SHA1 Message Date
Max Kellermann
0a6ddfb6b5 Val: add TableVal::Assign() overload with IntrusivePtr
Prepare the transition to IntrusivePtr for various callers.
2020-03-06 09:06:38 +01:00
Max Kellermann
528cf11a5c Scope: lookup_ID() and install_ID() return IntrusivePtr<ID>
This fixes several memory leaks and double free bugs.
2020-02-27 12:02:55 +01:00
Max Kellermann
7f97f74203 RuleMatcher: delete PatternSet instances in destructor (memleak) 2020-02-24 12:14:10 +01:00
Max Kellermann
e98cf0a4a0 Val: eliminate the "BroString.h" include 2020-02-13 09:13:59 +01:00
Tim Wojtulewicz
d23b15c08f Use std::move in a few places instead of copying a pass-by-value argument (performance-unnecessary-value-param) 2020-02-11 14:11:22 -08:00
Max Kellermann
0db61f3094 include cleanup
The Zeek code base has very inconsistent #includes.  Many sources
included a few headers, and those headers included other headers, and
in the end, nearly everything is included everywhere, so missing
#includes were never noticed.  Another side effect was a lot of header
bloat which slows down the build.

First step to fix it: in each source file, its own header should be
included first to verify that each header's includes are correct, and
none is missing.

After adding the missing #includes, I replaced lots of #includes
inside headers with class forward declarations.  In most headers,
object pointers are never referenced, so declaring the function
prototypes with forward-declared classes is just fine.

This patch speeds up the build by 19%, because each compilation unit
gets smaller.  Here are the "time" numbers for a fresh build (with a
warm page cache but without ccache):

Before this patch:

 3144.94user 161.63system 3:02.87elapsed 1808%CPU (0avgtext+0avgdata 2168608maxresident)k
 760inputs+12008400outputs (1511major+57747204minor)pagefaults 0swaps

After this patch:

 2565.17user 141.83system 2:25.46elapsed 1860%CPU (0avgtext+0avgdata 1489076maxresident)k
 72576inputs+9130920outputs (1667major+49400430minor)pagefaults 0swaps
2020-02-04 20:51:02 +01:00
Jon Siwek
4959d438fa Initial structure for supervisor-mode
The full process hierarchy isn't set up yet, but these changes
help prepare by doing two things:

- Add a -j option to enable supervisor-mode.  Currently, just a single
  "stem" process gets forked early on to be used as the basis for
  further forking into real cluster nodes.

- Separates the parsing of command-line options from their consumption.
  i.e. need to parse whether we're in -j supervisor-mode before
  modifying any global state since that would taint the "stem" process.
  The new intermediate structure containing the parsed options may
  also serve as a way to pass configuration info from "stem" to its
  descendent cluster node processes.
2019-09-27 19:17:58 -07:00
Jon Siwek
316e8bb671 GH-554: don't init PIA endpoint matchers if there's only file-magic
The logic for initializing PIA endpoint matchers was previously
skipped if "there's no global rule matcher", and that's only true
when no signature files get loaded.

But when using `zeek -b`, some file-magic signatures still get loaded
by default, so the PIA endpoint matchers still get initialized even
though they don't need to be -- file-magic patterns play no part
in PIA.

For typical use-cases (not using the `-b` flag), this change won't
help any, but we do at least use `-b` often within the test suite.
2019-08-27 16:32:30 -07:00
Jon Siwek
8c9b3bd3ae GH-554: remove use of file magic in protocol-based signature logic
This can be a significant performance/memory improvement since
otherwise the protocol-based rule matching logic ends up superfluously
creating file-matching state per file-matcher per connection/endpoint.
2019-08-27 16:16:39 -07:00
Jon Siwek
6255ab6584 Fix misc. Coverity warnings 2019-08-14 16:19:56 -07:00
Jon Siwek
47235b57a6 Merge remote-tracking branch 'origin/topic/timw/deprecate-int-types'
* origin/topic/timw/deprecate-int-types:
  Deprecate the internal int/uint types in favor of the cstdint types they were based on

Merge adjustments:
  * A bpf type mistakenly got replaced (inside an unlikely #ifdef)
  * Did a few substitutions that got missed (likely due to
    pre-processing out of DEBUG macros)
2019-08-14 15:49:24 -07:00
Tim Wojtulewicz
e6558d1f19 Remove other simple uses of PDict 2019-08-13 19:57:42 -07:00
Tim Wojtulewicz
54752ef9a1 Deprecate the internal int/uint types in favor of the cstdint types they were based on 2019-08-12 13:50:07 -07:00
Tim Wojtulewicz
6144f459e1 Mark List::append/insert deprecated in favor of push_back/push_front for consistency with Queue 2019-07-22 09:47:43 -07:00
Jon Siwek
7ccf3c0a69 Fix debug build due to old int_list usage within assert 2019-07-15 19:00:24 -07:00
Tim Wojtulewicz
e51f02737b Convert uses of loop_over_list to ranged-for loops 2019-07-15 19:00:24 -07:00
Tim Wojtulewicz
a4e2cfa2be Change int_list in CCL.h to be a vector, fix uses of int_list to match 2019-07-15 18:58:48 -07:00
Robin Sommer
789cb376fd GH-239: Rename bro to zeek, bro-config to zeek-config, and bro-path-dev to zeek-path-dev.
This also installs symlinks from "zeek" and "bro-config" to a wrapper
script that prints a deprecation warning.

The btests pass, but this is still WIP. broctl renaming is still
missing.

#239
2019-05-01 21:43:45 +00:00
Daniel Thayer
7366155bad Update script search logic for new file extension
When searching for script files, look for both the new and old file
extensions.  If a file with ".zeek" can't be found, then search for
a file with ".bro" as a fallback.
2019-04-09 01:26:16 -05:00
Jon Siwek
2982765128 Pre-allocate and re-use Vals for bool, int, count, enum and empty string 2019-01-09 18:29:23 -06:00
Robin Sommer
5cf2320fbc Fix a couple of problems with signature matching.
- IPv4 CIDR specifications didn't work with dst-ip/src-ip.

    - The "payload-size" condition was unreliable with UDP traffic.
2016-10-19 14:23:43 -07:00
Seth Hall
d9d579c52c Merge remote-tracking branch 'origin/master' into topic/seth/stats-improvement 2016-05-02 14:34:29 -04:00
Jeannette Dopheide
6dddd35d21 Correcting spelling errors found under bro 2.4.1+dfsg-2 here:
https://lintian.debian.org/full/bengen@debian.org.html#bro_2.4.1_x2bdfsg-2
2016-04-25 11:49:04 -05:00
Seth Hall
cfdabb901f Continued stats cleanup and extension. 2016-01-09 01:14:13 -05:00
Robin Sommer
3957091e1b Renaming config.h to bro-config.h.
A couple times now I had this conflicting with files of the same name
in other projects.
2015-07-28 11:57:04 -07:00
Robin Sommer
51aed48d67 Adding back in a call to match pure rules when clearing signature
state.

Previous change had removed this, but I believe we still need it.
2015-04-10 08:09:47 -07:00
Robin Sommer
ea7bc11aa1 Merge remote-tracking branch 'origin/topic/jsiwek/bit-844'
BIT-844 #merged

* origin/topic/jsiwek/bit-844:
  Remove stale signature benchmarking code (-L command-line option).
  BIT-844: fix UDP payload signatures to match packet-wise
2015-04-09 14:52:44 -07:00
Jon Siwek
2aae90d4f2 Remove stale signature benchmarking code (-L command-line option).
I don't think this is seeing much use or will ever see much use, and
unless compilers optimize it out, it's just wasting cycles.
2015-04-06 15:46:08 -05:00
Jon Siwek
56a7bf7936 BIT-844: fix UDP payload signatures to match packet-wise 2015-04-06 15:22:26 -05:00
Jon Siwek
8b7d5a68b2 Fix reference counting for lookup_ID() usages.
That function refs the ID before returning it, but callers were never
assuming responsibility for that reference.
2014-05-01 15:00:03 -05:00
Jon Siwek
171c6ce86b Refactor regex/signature AcceptingSet data structure and usages.
Several parts of that code would do membership checks and that's going
to be more efficient with a set instead of a list data structure.
2014-04-21 16:55:51 -05:00
Jon Siwek
acc721c36c Fix mem leak and unchecked dynamic cast reported by Coverity. 2014-03-31 16:32:58 -05:00
Jon Siwek
b22ca5d0a3 Replace libmagic w/ Bro signatures for file MIME type identification.
Notable changes:

- libmagic is no longer used at all.  All MIME type detection is
  done through new Bro signatures, and there's no longer a means to get
  verbose file type descriptions (e.g. "PNG image data, 1435 x 170").
  The majority of the default file magic signatures are derived
  from the default magic database of libmagic ~5.17.

- File magic signatures consist of two new constructs in the
  signature rule parsing grammar: "file-magic" gives a regular
  expression to match against, and "file-mime" gives the MIME type
  string of content that matches the magic and an optional strength
  value for the match.

- Modified signature/rule syntax for identifiers: they can no longer
  start with a '-', which made for ambiguous syntax when doing negative
  strength values in "file-mime".  Also brought syntax for Bro script
  identifiers in line with reality (they can't start with numbers or
  include '-' at all).

- A new Built-In Function, "file_magic", can be used to get all
  file magic matches and their corresponding strength against a given
  chunk of data

- The second parameter of the "identify_data" Built-In Function
  can no longer be used to get verbose file type descriptions, though it
  can still be used to get the strongest matching file magic signature.

- The "file_transferred" event's "descr" parameter no longer
  contains verbose file type descriptions.

- The BROMAGIC environment variable no longer changes any behavior
  in Bro as magic databases are no longer used/installed.

- Reverted back to minimum requirement of CMake 2.6.3 from 2.8.0
  (it's back to being the same requirement as the Bro v2.2 release).
  The bump was to accomodate building libmagic as an external project,
  which is no longer needed.

Addresses BIT-1143.
2014-03-04 11:12:06 -06:00
Jon Siwek
92d2fdd4a6 Close signature files after done parsing. 2013-12-05 13:22:50 -06:00
Jon Siwek
47d7d9047b Merge branch 'master' into topic/jsiwek/broxygen 2013-10-16 13:34:14 -05:00
Jon Siwek
b828a6ddc7 Review usage of Reporter::InternalError, addresses BIT-1045.
Replaced some with InternalWarning or InternalAnalyzerError, the later
being a new method which signals the analyzer to not process further
input.  Some usages I just removed if they didn't make sense or clearly
couldn't happen.  Also did some minor refactors of related code while
reviewing/exploring ways to get rid of InternalError usages.

Also, for TCP content file write failures there's a new event:
"contents_file_write_failure".
2013-10-10 14:45:06 -05:00
Jon Siwek
90477df973 Refactor search_for_file() util function.
It was getting too bloated and allocated memory in ways that were
difficult to understand how to manage.  Separated out primarily in to
new find_file() and open_file()/open_package() functions.

Also renamed other util functions for path-related things.
2013-10-07 15:01:03 -05:00
Jon Siwek
0b97343ff7 Fix various potential memory leaks.
Though I expect most not to be exercised in practice.
2013-09-12 15:23:52 -05:00
Jon Siwek
4e8ba6eaa2 Fix signatures that use identifiers of type table. 2013-09-05 13:01:40 -05:00
Robin Sommer
5dc630f722 Working on TODOs.
- Introducing analyzer::<protocol> namespaces.
- Moving protocol-specific events out of events.bif into analyzer/protocol/<protocol>/events.bif
- Moving ARP over (even though it's not an actual analyzer).
- Moving NetFlow over (even though it's not an actual analyzer).
- Moving MIME over (even though it's not an actual analyzer).
2013-04-18 21:01:15 -07:00
Robin Sommer
af1809aaa3 First prototype of new analyzer framework.
This is a larger internal change that moves the analyzer
infrastructure to a more flexible model where the available analyzers
don't need to be hardcoded at compile time anymore. While currently
they actually still are, this will in the future enable external
analyzer plugins. For now, it does already add the capability to
dynamically enable/disable analyzers from script-land, replacing the
old Analyzer::Available() methods.

There are three major parts going into this:

    - A new plugin infrastructure in src/plugin. This is independent
      of analyzers and will eventually support plugins for other parts
      of Bro as well (think: readers and writers). The goal is that
      plugins can be alternatively compiled in statically or loadead
      dynamically at runtime from a shared library. While the latter
      isn't there yet, there'll be almost no code change for a plugin
      to make it dynamic later (hopefully :)

    - New analyzer infrastructure in src/analyzer. I've moved a number
      of analyzer-related classes here, including Analyzer and DPM;
      the latter now renamed to Analyzer::Manager. More will move here
      later. Currently, there's only one plugin here, which provides
      *all* existing analyzers. We can modularize this further in the
      future (or not).

    - A new script interface in base/framework/analyzer. I think that
      this will eventually replace the dpm framework, but for now
      that's still there as well, though some parts have moved over.

I've also remove the dpd_config table; ports are now configured via
the analyzer framework. For exmaple, for SSH:

    const ports = { 22/tcp } &redef;

    event bro_init() &priority=5
        {
        ...
        Analyzer::register_for_ports(Analyzer::ANALYZER_SSH, ports);
        }

As you can see, the old ANALYZER_SSH constants have more into an enum
in the Analyzer namespace.

This is all hardly tested right now, and not everything works yet.
There's also a lot more cleanup to do (moving more classes around;
removing no longer used functionality; documenting script and C++
interfaces; regression tests). But it seems to generally work with a
small trace at least.

The debug stream "dpm" shows more about the loaded/enabled analyzers.

A new option -N lists loaded plugins and what they provide (including
those compiled in statically; i.e., right now it outputs all the
analyzers).

This is all not cast-in-stone yet, for some things we need to see if
they make sense this way. Feedback welcome.
2013-03-26 11:05:38 -07:00
Robin Sommer
b9811e87e5 Merge remote-tracking branch 'origin/topic/jsiwek/ipv6-sigs'
* origin/topic/jsiwek/ipv6-sigs:
  Add IPv6 support to signature header conditions.

Closes #774.
Closes #880.
2012-10-19 15:06:00 -07:00
Jon Siwek
e835a55229 Add IPv6 support to signature header conditions.
- "src-ip" and "dst-ip" conditions can now use IPv6 addresses/subnets.
  They must be written in colon-hexadecimal representation and enclosed
  in square brackets (e.g. [fe80::1]).  Addresses #774.

- "icmp6" is now a valid protocol for use with "ip-proto" and "header"
  conditions.  This allows signatures to be written that can match
  against ICMPv6 payloads.  Addresses #880.

- "ip6" is now a valid protocol for use with the "header" condition.
  (also the "ip-proto" condition, but it results in a no-op in that
  case since signatures apply only to the inner-most IP packet when
  packets are tunneled).  This allows signatures to match specifically
  against IPv6 packets (whereas "ip" only matches against IPv4 packets).

- "ip-proto" conditions can now match against IPv6 packets.  Before,
  IPv6 packets were just silently ignored which meant DPD based on
  signatures did not function for IPv6 -- protocol analyzers would only
  get attached to a connection over IPv6 based on the well-known ports
  set in the "dpd_config" table.
2012-10-17 11:11:51 -05:00
Robin Sommer
42066cc1fd Teaching cmake to always link in tcmalloc if it finds it.
Also renaming --enable-perftools to --enable-perftool-debug to
indicate that the switch is only relevant for debugging the heap. It's
not needed to pick up tcmalloc for better performance.

--with-perftools can still (and always) be used to give a hint where
to find the libraries.

With the threading, using tcmalloc improves memory usage on FreeBSD
significantly when running on a trace. If it fixes the live problems,
remains to be seen ...
2012-03-28 15:42:09 -07:00
Robin Sommer
edc9bb14af Making exchange of addresses between threads thread-safe.
As we can't use the IPAddr class (because it's not thread-safe), this
involved a bit manual address manipulation and also shuffling some
things around a bit.

Not fully working yet, the tests for remote logging still fail.
2012-02-28 15:57:43 -08:00
Robin Sommer
ada5f38d04 Merge branch 'master-merge-helper'
* master-merge-helper:
  possible use after free forbidden
  Suppression of unused code
  Fix of some memory leaks
  removing dead code
  A destructor must free the memory allocated by the constructor
  Good overridance with the good qualifier
  Better use of operators priorities
  protection from bad frees on unallocated strings
2012-02-24 16:37:45 -08:00
Julien Sentier
2e069c9596 Fix of some memory leaks 2012-02-24 15:39:50 -08:00
Robin Sommer
94b9644da7 Working on merging the v6-addr branch. This is checkpoint, tests don't
pass yet.

Changes:

- Gave IPAddress/IPPrefix methods AsString() so that one doesn't need
  to cast to get a string represenation.

- Val::AsAddr()/AsSubnet() return references rather than pointers. I
  find that more intuitive.

- ODesc/Serializer/SerializationFormat get methods to support
  IPAddress/IPPrefix directly.

- Reformatted the comments in IPAddr.h from /// to /** style.

- Given IPPrefix a Contains() method.

- A bit of cleanup.
2012-02-16 20:39:16 -08:00
Robin Sommer
7458ebf385 Checkpoint after pass. 2012-02-15 13:07:08 -08:00
Jon Siwek
b3f1f45082 Remove --enable-brov6 flag, IPv6 now supported by default.
Internally, all BROv6 preprocessor switches were removed and
addr/subnet representations wrapped in the new IPAddr/IPPrefix classes.

Some script-layer changes of note:

- dns_AAAA_reply event signature changed: the string representation
  of an IPv6 addr is easily derived from the addr value, it doesn't
  need to be another parameter.  This event also now generated directly
  by the DNS analyzer instead of being "faked" into a dns_A_reply event.

- removed addr_to_count BIF.  It used to return the host-order
  count representation of IPv4 addresses only.  To make it more
  generic, we might later add a BIF to return a vector of counts
  in order to support IPv6.

- changed the result of enclosing addr variables in vertical pipes
  (e.g. |my_addr|) to return the bit-width of the address type which
  is 128 for IPv6 and 32 for IPv4.  It used to function the same
  way as addr_to_count mentioned above.

- remove bro_has_ipv6 BIF
2012-02-03 16:46:58 -06:00