* origin/topic/timw/906-find-all-urls-regex:
Restore previous url scheme capture group
GH-906: Fix the regex in url.zeek to better match for find_all_urls
- Adjusted the formatting during merge
* 'set_to_regex-docs' of https://github.com/jlagermann/zeek:
added examples to set_to_regex comments Signed-off-by: James Lagermann <james.lagermann@corelight.com>
A call such as
decompose_uri("git://git.kernel.org:/pub/scm/linux/");
would raise an error along the lines of
error in /usr/local/zeek-3.0.0/share/zeek/base/utils/urls.zeek, line 122: bad conversion to count (to_count(parts[1]) and )
This was because an empty string got passed to the to_count()
function.
Let's improve the behaviour and rather consider the portnum component
of the URI to be uninitialized.
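The idea of treating an empty port as "not set" can be sketched in
Python. This is an illustration only, not the code in urls.zeek; the
helper name split_host_port is hypothetical, and IPv6 literals in
brackets are not handled.

```python
from typing import Optional, Tuple


def split_host_port(netloc: str) -> Tuple[str, Optional[int]]:
    """Split 'host[:port]' into its parts.

    A missing port or an empty one (as in 'git.kernel.org:') yields None
    instead of attempting a numeric conversion on an empty string.
    """
    host, sep, port = netloc.partition(":")
    if not sep or port == "":
        return host, None
    return host, int(port)


print(split_host_port("git.kernel.org:"))
print(split_host_port("example.com:8080"))
```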
This commit removed functions/events that have been deprecated in Bro
2.6. It also removes the detection code that checks if the old
communication framework is used (since all the functions that are
checked were removed).
Addresses parts of GH-243
* All "Broxygen" usages have been replaced in
code, documentation, filenames, etc.
* Sphinx roles/directives like ":bro:see" are now ":zeek:see"
* The "--broxygen" command-line option is now "--zeexygen"
* is_valid_ip() is now implemented as a BIF instead of in
base/utils/addrs
* The IPv4 and IPv6 regular expressions provided by base/utils/addrs
have been improved/corrected (previously they could possibly match
some invalid IPv4 decimals, or various "zero compressed" IPv6 strings
with too many hextets)
* extract_ip_addresses() should give better results as a result of
the above two points
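The kinds of strings the old expressions could wrongly accept can be
illustrated with a short Python sketch. It uses Python's standard
ipaddress module as a stand-in validator; it is not the Zeek BIF, and
the function below only borrows the name for readability.

```python
import ipaddress


def is_valid_ip(candidate: str) -> bool:
    """Return True if candidate parses as an IPv4 or IPv6 address.

    Stand-in for illustration; relies on Python's ipaddress module.
    """
    try:
        ipaddress.ip_address(candidate)
    except ValueError:
        return False
    return True


# An IPv4 decimal out of range and a zero-compressed IPv6 string
# with too many hextets are both rejected.
for text in ("10.0.0.1", "999.1.1.1", "2001:db8::1", "1:2:3:4::5:6:7:8:9"):
    print(text, is_valid_ip(text))
```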
Scripting errors/mistakes now consistently generate a runtime error
which has the behavior of unwinding the call stack all the way out of
the current event handler.
Before, such errors were not treated consistently and either aborted
the process entirely or emitted a message while continuing to execute
subsequent statements without well-defined behavior (possibly causing
a cascade of errors).
The previous behavior also would only unwind out of the current
function (if within a function body), not out of the current event
handler, which is especially problematic for functions that return
a value: the caller is essentially left a mess with no way to deal
with it.
This also changes the behavior of the startup/initialization process
to abort if there are errors during bro_init() rather than continue on
to the main run loop. The `allow_init_errors` option may change this
new, default behavior.
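The unwinding behavior described above resembles exception propagation.
The following Python sketch models it under stated assumptions: the
names ScriptRuntimeError, helper, handler and run_events are
hypothetical, and it assumes that processing continues with the next
event once the failing handler has been abandoned.

```python
class ScriptRuntimeError(Exception):
    """Hypothetical stand-in for a scripting runtime error."""


def to_count(text):
    # Mimics a conversion that fails on non-numeric input.
    if not text.isdigit():
        raise ScriptRuntimeError(f"bad conversion to count: {text!r}")
    return int(text)


def helper(text, trace):
    value = to_count(text)
    trace.append("helper finished")  # skipped when to_count() fails
    return value


def handler(text, trace):
    trace.append("handler started")
    helper(text, trace)
    trace.append("handler finished")  # skipped as well: the error unwinds past it


def run_events(inputs):
    """Run the handler once per input; an error abandons only that handler."""
    trace = []
    for text in inputs:
        try:
            handler(text, trace)
        except ScriptRuntimeError as err:
            trace.append(f"error: {err}")
    return trace


for line in run_events(["", "42"]):
    print(line)
```

The caller of helper() never sees a half-computed return value: the
error leaves both the function and the handler in one step.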
These are probably some of the most desired options to be dynamically
changeable; since they only are accessed in script-land there should not
be any problems with them changing on the fly.
* origin/topic/seth/fix-raw-reader-subprocess-exit:
Fix an issue with raw reader culling streams for dead processes.
Updated the 'exec' utility to no longer remove input streams for
processes that are finished as the core C++ code will take care of that
(and trying to remove a stream multiple times emits a warning message).
* origin/topic/vern/vec-append:
d'oh, still have a (deprecated) string_array rather than string_vector
forgot to update test suite results for v += e
reap the fruits of v += e
test case for v += e
documentation of v += e
v += e implemented
Fixed a mistake in find_ip_addresses()
BIT-1594 #merged
* origin/topic/johanna/rawleak:
Exec: fix reader cleanup when using read_files
Raw Writer: First step - make code more c++11-y, remove raw pointers.
- SMTP protocol headers now do some minimal parsing to clean up
email addresses.
- New function named split_mime_email_addresses to take MIME headers
and get addresses split apart but including the display name.
- Update tests.
When using read_files, the Exec framework called Input::remove on the
wrong input stream: it always got called on the input stream of the
execution, not on the input stream of the current file that was being
read.
This led to threads never being closed and file handles being kept open
until Bro is closed. This means that before this patch, every time
ActiveHTTP is used, a thread stays around and several file handles are
used.
We now extract email addresses in the fields that one would expect
to contain addresses. This makes downstream processing of these
fields easier, for example in log analysis or when using them in the
Intel framework. The primary downside is that any other content
in these fields, such as full name and any group information, is no
longer available. I believe the simplification of the content in
these fields is worth the change.
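The trade-off can be illustrated with a small Python sketch that
reduces a header value to bare addresses. It uses Python's standard
email.utils module rather than the parsing in the SMTP scripts, and
extract_addresses is a hypothetical name.

```python
from email.utils import getaddresses
from typing import List


def extract_addresses(header_value: str) -> List[str]:
    """Return only the address part of each mailbox in a header value.

    Display names are dropped, which mirrors the downside noted above.
    """
    return [addr for _name, addr in getaddresses([header_value]) if addr]


print(extract_addresses('"Jane Doe" <jane@example.com>, bob@example.org'))
```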
Added "cc" to the script that feeds information from SMTP into the
Intel framework.
A new script for email handling utility functions has been created
as a side effect of these changes.
Added a new BIF haversine_distance that computes distance between two
geographic locations.
Added a new Bro script function haversine_distance_ip that does the same
but takes two IP addresses instead of latitude/longitude. This function
requires that Bro be built with libgeoip.
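For reference, the haversine formula itself can be sketched in Python.
This is not the BIF's source; the Earth radius constant and the choice
of miles as the unit are assumptions made for this example.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # assumed mean Earth radius


def haversine_distance(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    # Clamp to guard against rounding slightly above 1.0.
    return 2 * EARTH_RADIUS_MILES * math.asin(min(1.0, math.sqrt(a)))


# A quarter of the way around the equator.
print(haversine_distance(0.0, 0.0, 0.0, 90.0))
```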
BIT-1550 #merged
* origin/topic/johanna/netcontrol: (72 commits)
Update baselines and news
Move prefixtable back to all IPv6 internal handling.
NetControl: Add functions to search for rules affecting IPs/subnets
Add check_subnet bif that allows exact membership test for subnet tables.
Rewrite internal handling of rules.
Add bif that allows searching for all matching subnets in table.
Add signaling of successful initialization of plugins to NetControl.
Add rule hooks to the acld plugin.
Add new logfiles for shunting and drops to netcontrol
Extend NetControl logging and fix bugs.
Update OpenFlow API and events.
small acld plugin fix
Revert "introduce &weaken attribute"
Fix crash when printing type of recursive structures.
Testcase for crash when a record contains a function referencing a record.
Rename Pacf to NetControl
fix acld plugin to use address instead of subnet (and add functions for conversion)
implement quarantine
miscellaneous missing bits and pieces
Acld implementation for Pacf - Bro side.
...
The server-reported file size was being collected poorly and if
a file name had a number in it, that was reported as the file
size instead of the actual size.
A new test is included to avoid reintroducing the problem.
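The failure mode can be reproduced with a short Python sketch. The
reply format used here, with the size given as "(N bytes)" after the
file name, is an assumption for illustration, and both function names
are hypothetical.

```python
import re
from typing import Optional

# Anchors on the "(N bytes)" suffix instead of the first run of digits.
_SIZE_RE = re.compile(r"\((\d+) bytes\)")


def first_number(reply: str) -> Optional[int]:
    """Naive approach: take the first number found anywhere in the reply."""
    match = re.search(r"\d+", reply)
    return int(match.group(0)) if match else None


def reported_size(reply: str) -> Optional[int]:
    """Stricter approach: only accept a number in the '(N bytes)' position."""
    match = _SIZE_RE.search(reply)
    return int(match.group(1)) if match else None


reply = "Opening BINARY mode data connection for report2016.pdf (4096 bytes)"
print(first_number(reply), reported_size(reply))
```

With a number in the file name, the naive version picks up that number,
while the stricter version returns the size or nothing at all.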
The API now does not follow the OpenFlow specification quite as closely;
however, I think it is much more usable. Furthermore, the Ryu plugin was
basically completely rewritten and is now more usable for general flow
manipulation.
This also adds a debug mode that just outputs the json fragments that
would be sent to ryu. At the moment, Ryu still assumes that every
request that it receives succeeds - it is not possible to get an error
message from the controller. Instead, one has to check if a flow was
added by doing a second REST request, which seems unnecessary and also
requires complete JSON parsing functionality. Hence we are not doing
that at the moment.
The alternative would be to use an external script for the actual
add-and-check-operation.