Commit graph

2398 commits

Author SHA1 Message Date
Johanna Amann
930a5c8ebd TableSync: rename auto_store -> table_store 2020-07-17 11:40:59 -07:00
Johanna Amann
6d2aa84952 SyncTables: address feedback part 1 - naming (broker and zeek)
This commit fixes capitalization issues.
2020-07-17 10:56:28 -07:00
Jon Siwek
c84a51ac09 GH-837: emit Reporter errors for Broker errors
Instead of only writing them in broker.log, which may be easy to
overlook.
2020-07-16 18:07:00 -07:00
Johanna Amann
7c37226eaa Merge remote-tracking branch 'origin/master' into topic/johanna/table-changes 2020-07-13 17:11:55 -07:00
Johanna Amann
2b2a40f49c Zeek Table<->Brokerstore: cleanup, documentation, small fixes
This commit adds script/c++ documentation and fixes a few loose ends.
It also adds tests for corner cases and massively improves error
messages.

This also actually introduces type-compatibility checking and introduces
a new attribute that lets a user override this if they really know what
they are doing. I am not quite sure if we should really let that stay in
- but it can be very convenient to have this functionality.

One test is continuing to fail - the expiry test is very flaky. This is,
I think, caused by delays of the broker store forwarding. I am unsure if
we can actually do anything about that.
2020-07-10 16:58:34 -07:00
Jon Siwek
6908d1b919 GH-1019: deprecate icmp_conn params for ICMP events
Previously, a single `icmp_conn` record was built per ICMP "connection"
and re-used for all events generated from it.  This may have been a
historical attempt at performance optimization, but:

  * By default, Zeek does not load any scripts that handle ICMP events.

  * The one script Zeek ships with that does handle ICMP events,
    "detect-traceroute", is already noted as being disabled due to
    potential performance problems of doing that kind of analysis.

  * Re-use of the original `icmp_conn` record tends to misreport
    TTL and length values since they come from the original packet
    instead of the current one.

  * Even if we chose to still re-use `icmp_conn` records and just fill
    in a new TTL and length value each packet, a user script could have
    stored a reference to the record and not be expecting those values
    to be changed out from underneath them.

Now, a new `icmp_info` record is created/populated in all ICMP events
and should be used instead of `icmp_conn`.  It also removes the
orig_h/resp_h fields as those are redundant with what's already
available in the connection record.
2020-07-10 11:06:28 -07:00
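
The aliasing hazard described in the last bullet of the GH-1019 message above can be sketched in a few lines. This is only an illustration of the general problem, not Zeek's implementation: `IcmpInfo`, `events_with_shared_record`, and `events_with_fresh_record` are hypothetical names, and the packet values are made up.

```python
from dataclasses import dataclass


@dataclass
class IcmpInfo:
    """Hypothetical stand-in for a per-event ICMP record."""
    ttl: int
    length: int


def events_with_shared_record(packets):
    """Reuse one record for the whole 'connection', refilling it per packet."""
    shared = IcmpInfo(ttl=0, length=0)
    for ttl, length in packets:
        shared.ttl, shared.length = ttl, length
        yield shared  # every event hands out the same object


def events_with_fresh_record(packets):
    """Create and populate a new record for each event."""
    for ttl, length in packets:
        yield IcmpInfo(ttl=ttl, length=length)


# Made-up (ttl, length) pairs for two packets of one ICMP "connection".
packets = [(64, 84), (63, 1200)]

# A handler that stores what it receives keeps references, not copies.
stored_shared = list(events_with_shared_record(packets))
stored_fresh = list(events_with_fresh_record(packets))

# With the shared record, the first stored entry now shows the second
# packet's values; with fresh records, each entry keeps its own values.
print(stored_shared[0], stored_fresh[0])
```

Creating a record per event costs an allocation each time, but a stored reference can no longer change underneath the script that holds it.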
Johanna Amann
67917b83aa Merge remote-tracking branch 'origin/master' into topic/johanna/table-changes 2020-07-09 17:02:57 -07:00
Johanna Amann
e1a45d33e0 Merge remote-tracking branch 'origin/master' into topic/johanna/table-changes
* origin/master: (47 commits)
  scan.l: Remove "constant" did_module_restore logic
  Fix FreeBSD CI script to install right SWIG package
  Update submodule(s)
  GH-928: use realpath() instead of inode to de-duplicate scripts
  Update submodule(s)
  GH-1040: Add zero-indexed version of str_split
  Fix WhileStmt to call Stmt(Tag) ctor
  GH-1041: Move compress_path to a bif that uses normalize_path
  Update submodule(s)
  Update submodule(s)
  Update submodule(s)
  Fix --enable-mobile-ipv6 build
  Fix namespace of GetCurrentLocation() to zeek::detail
  Add backtrace() and print_backtrace()
  Rename BroString files to ZeekString
  Update NEWS entry with note about class renames
  Rename BroObj to Obj
  Rename BroString to zeek::String
  Move Func up to zeek namespace, rename BroFunc to ScriptFunc
  Mark global val_mgr as deprecated and fix uses of it to use namespaced version
  ...
2020-07-09 14:07:03 -07:00
Jon Siwek
7669f560d1 Integrate Supervisor code review suggestions 2020-07-09 13:56:11 -07:00
Johanna Amann
3eac12b40d BrokerStore<->Zeek Tables: Fix a few small test failures. 2020-07-09 19:43:45 +00:00
Jon Siwek
10709c627b Add Supervisor::{stdout,stderr}_hook
These allow capturing/handling the stdout/stderr of child processes
via Zeek scripts.
2020-07-07 20:21:32 -07:00
Jon Siwek
a06ef66edc Add Log::rotation_format_func and Log::default_rotation_dir options
These may be redefined to customize log rotation path prefixes,
including use of a directory.  File extensions are still up to
individual log writers to add themselves during the actual rotation.

These new options also allow for some simplification of the default
ASCII postprocessor function: it eliminates the need for it doing an
extra/awkward rename() operation that only changes the timestamp format.

This also teaches the supervisor framework to use these new options
to rotate ascii logs into a log-queue/ directory with a specific
file name format (intended for an external archiver process to
monitor separately).
2020-07-07 18:42:37 -07:00
Jon Siwek
7b15b82009 Merge remote-tracking branch 'origin/topic/timw/1040-str-split'
* origin/topic/timw/1040-str-split:
  GH-1040: Add zero-indexed version of str_split
2020-07-06 21:06:51 -07:00
Tim Wojtulewicz
e6871ed3e9 GH-1040: Add zero-indexed version of str_split 2020-07-06 17:05:40 -07:00
Ron Wellman
e7146c2a6b Implement EDNS Client Subnet Option 2020-07-06 15:09:03 -04:00
Tim Wojtulewicz
560ee0c05e GH-1041: Move compress_path to a bif that uses normalize_path 2020-07-06 11:43:44 -07:00
Jon Siwek
a1c19840ce Add backtrace() and print_backtrace() 2020-07-03 14:09:31 -07:00
Johanna Amann
a220b02722 BrokerStore<->Zeek tables: &backend works for in-memory stores.
Currently this requires using this with a normal cluster - or sending
messages by yourself.

It, in principle, should also work with SQLITE - but that is a bit
nonsensical without being able to change the storage location.
2020-07-01 16:38:10 -07:00
Johanna Amann
318a72c303 BrokerStore<->Zeek table - introduce &backend attribute
The &backend attribute allows for a much more convenient way of
interacting with brokerstores. One does not need to create a broker
store anymore - instead all of this is done internally.

The current state of this partially works. This should work fine for
persistence - but clones are currently not yet correctly attached.
2020-06-30 16:33:52 -07:00
Johanna Amann
a5a51de3c4 Merge remote-tracking branch 'origin/topic/jsiwek/gh-1036-print-log-network-time'
* origin/topic/jsiwek/gh-1036-print-log-network-time:
  GH-1036: change print.log to log network time instead of current

Fixes GH-1036
2020-06-29 19:25:16 +00:00
Jon Siwek
54d8954c80 GH-1036: change print.log to log network time instead of current 2020-06-26 19:55:09 -07:00
Justin Azoff
f086928c5c reduce memory usage of ConnPolling
Instead of scheduling the event with the full 'connection' record,
schedule it with the smaller 'conn_id' record.
2020-06-26 18:51:29 -04:00
Johanna Amann
af2110cfc9 Merge remote-tracking branch 'origin/topic/jsiwek/reduce-ftp-cluster-msg-sizes'
* origin/topic/jsiwek/reduce-ftp-cluster-msg-sizes:
  Minimize data published for expected FTP data channel analysis
2020-06-18 20:07:26 +00:00
Jon Siwek
7e9a3e1e00 Minimize data published for expected FTP data channel analysis
Previously, more data than could effectively be utilized by any remote
Zeek was published (e.g. full list of pending commands or other
transient state that may add up to a non-trivial number of bytes).
2020-06-17 12:45:21 -07:00
Jon Siwek
51e738a1c0 GH-998: Fix Reporter::conn_weird() to handle expired connections
This introduces a new sampling state-map for expired connections to fix
segfaults that previously occurred when passing in a `connection` record
to `Reporter::conn_weird()` for which the internal `Connection` object
had already been expired and deleted.  This also introduces a new event
called `expired_conn_weird`, which is similar to `conn_weird`, except
the full `connection` record is no longer available, just the `conn_id`
and UID string.
2020-06-15 12:57:47 -07:00
Tim Wojtulewicz
503ef26a17 Merge remote-tracking branch 'origin/topic/jsiwek/gh-893-intrusive-ptr-migration'
* origin/topic/jsiwek/gh-893-intrusive-ptr-migration: (151 commits)
  Integrate review feedback
  Switch Broker Val converter visitor to return IntrusivePtr
  Change BroFunc ctor to take const-ref IntrusivePtr<ID>
  Add version of Frame::SetElement() taking IntrusivePtr<ID>
  Change Scope/Func inits from id_list* to vector<IntrusivePtr<ID>>
  Change Scope::GenerateTemporary() to return IntrusivePtr
  Deprecate Scope::ReturnType(), replace with GetReturnType()
  Deprecate Scope::ScopeID(), replace with GetID()
  Switch parsing to use vector<IntrusivePtr<Attr>> from attr_list
  Deprecate TableVal::FindAttr(), replace with GetAttr()
  Deprecate TypeDecl::FindAttr(), replace with GetAttr()
  Deprecate ID::FindAttr(), replace with GetAttr()
  Deprecate Attributes::FindAttr(), replace with Find()
  Deprecate Attributes::AddAttrs(Attributes*)
  Add Attributes ctor that takes IntrusivePtrs
  Change Attributes to store std:vector<IntrusivePtr<Attr>>
  Change Attr::SetAttrExpr() to non-template
  Deprecate Attr::AttrExpr(), replace with GetExpr()
  Deprecate ID::Attrs(), replace with GetAttrs()
  Remove weak_ref param from ID::SetVal()
  ...
2020-06-01 10:58:02 -07:00
Johanna Amann
433e1154da Merge branch 'add_bzar_dce_rpc_consts' of https://github.com/ct-square/zeek
* 'add_bzar_dce_rpc_consts' of https://github.com/ct-square/zeek:
  Remove duplicate DCE-RPC endpoint
  Add DCE-RPC constants from BZAR project

Closes GH-953
2020-05-26 22:04:33 +00:00
Jon Siwek
78e3267c44 Deprecate internal_handler(), replace with EventRegistry::Register()
Added a couple explicit event declarations that were missing: "net_done"
and "dns_mapping_name_changed".
2020-05-14 17:25:02 -07:00
Johanna Amann
2aeb3d8e39 Merge remote-tracking branch 'origin/topic/timw/906-find-all-urls-regex'
* origin/topic/timw/906-find-all-urls-regex:
  Restore previous url scheme capture group
  GH-906: Fix the regex in url.zeek to better match for find_all_urls
2020-05-13 15:05:54 -07:00
Johanna Amann
a259e8bbda Merge remote-tracking branch 'origin/master' into topic/johanna/hash-unification 2020-05-12 00:29:02 +00:00
Jon Siwek
b5531ecbd3 Merge branch 'set_to_regex-docs' of https://github.com/jlagermann/zeek
- Adjusted the formatting during merge

* 'set_to_regex-docs' of https://github.com/jlagermann/zeek:
  added examples to set_to_regex comments Signed-off-by: James Lagermann <james.lagermann@corelight.com>
2020-05-08 11:48:44 -07:00
James Lagermann
2c04a56236 added examples to set_to_regex comments
Signed-off-by: James Lagermann <james.lagermann@corelight.com>
2020-05-08 12:31:56 -05:00
Johanna Amann
04ed125941 Merge remote-tracking branch 'origin/master' into topic/johanna/hash-unification 2020-05-06 23:18:33 +00:00
Jon Siwek
b749dda520 Fix SSL scripting error leading to access of uninitialized field
Reported by Justin Azoff
2020-05-06 09:52:31 -07:00
Johanna Amann
7d28a6ee9a Remove outdated comment on set_to_regex.
We can add patterns at runtime since 2.6.
2020-05-05 14:23:33 -07:00
Jon Siwek
156686b237 Correct spelling of DCE/RPC operation string NetrLogonSameLogonWithFlags
Fixes GH-952
2020-05-04 18:03:14 -07:00
V
45a5b1b0cf Remove duplicate DCE-RPC endpoint 2020-05-04 18:02:04 +02:00
V
7cf8c7a6d2 Add DCE-RPC constants from BZAR project 2020-05-04 17:15:27 +02:00
Johanna Amann
3bce313b12 Switch file UID hashing from md5 to highwayhash.
This commit switches UID hashing from md5 to a highway hash. It also
moves the salt value out of the file plugin - and makes it
installation-specific instead - it is moved to the global namespace.

There now are digest hash functions to make "static"
installation-specific hashes that are stable over workers available to
everyone; hashes can be 64, 128 or 256 bits in size.

Due to the fact that we switch the file hashing algorithm, all file
hashes change.

The underlying algorithm that is used for hashing is highwayhash-128,
which is significantly faster than md5.
2020-04-30 10:20:09 -07:00
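
The idea of installation-specific digests in 64, 128, or 256 bits can be sketched as follows. Note the substitution: Python's standard library has no HighwayHash, so this sketch uses keyed BLAKE2b from `hashlib` purely as a stand-in. The function name `static_digest` and the salt values are hypothetical and do not reflect Zeek's actual API.

```python
import hashlib


def static_digest(data: bytes, salt: bytes, bits: int = 128) -> bytes:
    """Return a salted digest of `data` that is `bits` wide.

    Keyed BLAKE2b stands in for HighwayHash here; only the shape of the
    idea (a per-installation key, selectable output size) is illustrated.
    """
    if bits not in (64, 128, 256):
        raise ValueError("bits must be 64, 128 or 256")
    return hashlib.blake2b(data, key=salt, digest_size=bits // 8).digest()


# Hypothetical per-installation salts.
salt_a = b"installation-A-salt"
salt_b = b"installation-B-salt"

# Same salt and input give the same digest, wherever it is computed.
uid_1 = static_digest(b"example file identity", salt_a)
uid_2 = static_digest(b"example file identity", salt_a)

# A different salt gives an unrelated digest for the same input.
uid_other = static_digest(b"example file identity", salt_b)
```

Because the digest depends only on the salt and the input, every worker that shares the salt derives the same value, while a different installation derives a different one.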
Seth Hall
dac96a6be3 Fixes a small bug in one signature with a duplicate name.
Also update a single failing test.
2020-04-29 11:22:42 -04:00
Seth Hall
15d43dfbcd Organized and added to the shipped file identification signatures.
- Added ISO 9660 disk image
- Created new files for categorizing signatures better.
  - executable.sig - Executable (and bytecode) files.
  - java.sig - Java related files (class/jar, etc).
  - programming.sig - Mostly scripting language identification
2020-04-29 11:08:32 -04:00
Johanna Amann
faa8a38578 Merge remote-tracking branch 'origin/topic/jsiwek/gh-854-preserve-header-name'
* origin/topic/jsiwek/gh-854-preserve-header-name:
  GH-854: provide access to original HTTP/MIME header names
2020-04-27 19:31:49 +00:00
Vern Paxson
fe46ef06a0 unused variables found via use-def analysis (plus an indentation micro-nit) 2020-04-25 18:06:47 -07:00
Jon Siwek
5032993b94 GH-854: provide access to original HTTP/MIME header names
The "http_header" event now has an "original_name" parameter that allows
access to the original header name (the "name" parameter remains the
same as before: it's the uppercased header name).

The "mime_header_rec" record type now also includes an "original_name"
field to similarly provide access to original header name in the
following events: "http_all_headers", "mime_one_header", and
"mime_all_headers".
2020-04-20 16:56:41 -07:00
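
The relationship between the two header-name parameters in GH-854 above can be sketched like this. `header_event_args` is a hypothetical helper that only mirrors the description in the commit message (the uppercased name alongside the name as received); it is not Zeek's code.

```python
def header_event_args(raw_name: str, value: str) -> dict:
    """Build illustrative event arguments for one HTTP/MIME header.

    "name" is the uppercased header name, as before; "original_name"
    preserves the header name exactly as it appeared on the wire.
    """
    return {
        "original_name": raw_name,
        "name": raw_name.upper(),
        "value": value,
    }


args = header_event_args("Content-Type", "text/html")
# args["name"] is "CONTENT-TYPE"; args["original_name"] is "Content-Type".
```

Keeping both fields lets existing handlers continue matching on the uppercased form while new ones can inspect the sender's original casing.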
Jon Siwek
8843f69002 Remove ineffective &default in netcontrol cluster event handler args 2020-04-16 15:40:27 -07:00
Jon Siwek
c8e070b8ee Add default function for Kerberos constant-lookup-tables 2020-04-16 12:34:41 -07:00
Tim Wojtulewicz
612c59e099 Restore previous url scheme capture group 2020-04-14 16:33:19 -07:00
Tim Wojtulewicz
ba1c03188f Merge remote-tracking branch 'origin/topic/jsiwek/alternate-hook-event-prototypes'
* origin/topic/jsiwek/alternate-hook-event-prototypes:
  Add warning for ineffective &default arguments in handlers
  Fix frame size allocation of alternate event/hook handlers
  Emit error for alternate event/hook prototype args with attributes
  Improve alternate event/hook prototype matching
  Allow alternate event/hook prototype declarations
2020-04-13 15:00:25 -07:00
Tim Wojtulewicz
0d31d39de9 GH-906: Fix the regex in url.zeek to better match for find_all_urls 2020-04-13 13:17:57 -07:00
Jon Siwek
ce9183a2ed Fix Broker topics used to uniquely identify cluster nodes
Node-specific topic prefix subscriptions/publications now add a trailing
slash like "zeek/cluster/node/<name>/".  Without the trailing slash,
messages attempting to target "proxy-10" may also be sent to "proxy-1"
since subscription matching is prefix-based.
2020-04-10 14:36:00 -07:00
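
The prefix-matching problem fixed in ce9183a2ed above can be reproduced with plain string operations. This is a minimal sketch that assumes a subscription matches any topic it is a prefix of, as the commit message states. `matches` and `node_topic` are hypothetical helpers, not Broker's API.

```python
def matches(subscription: str, topic: str) -> bool:
    """Prefix-based matching: a subscription receives any topic it prefixes."""
    return topic.startswith(subscription)


def node_topic(name: str, trailing_slash: bool) -> str:
    """Build a node-specific topic, optionally with the trailing slash."""
    base = "zeek/cluster/node/" + name
    return base + "/" if trailing_slash else base


# Without the trailing slash, proxy-1's subscription is a prefix of the
# topic meant for proxy-10, so proxy-1 also receives that message.
old_leak = matches(node_topic("proxy-1", False), node_topic("proxy-10", False))

# With the trailing slash, the two topics no longer share that prefix.
new_leak = matches(node_topic("proxy-1", True), node_topic("proxy-10", True))

# Each node still matches its own topic in the new scheme.
new_self = matches(node_topic("proxy-10", True), node_topic("proxy-10", True))

print(old_leak, new_leak, new_self)
```

The trailing slash acts as a terminator: no node name can continue past it, so one node's topic can never be a prefix of another's.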