One tweak: I made ts optional and set it to network_time() if not given.
BIT-1578 #merged
* origin/topic/johanna/bit-1578:
Weird: fix potential small issue when ignoring duplicates
Rewrite weird logging.
BIT-1594 #merged
* origin/topic/johanna/rawleak:
Exec: fix reader cleanup when using read_files
Raw Writer: First step - make code more c++11-y, remove raw pointers.
- SMTP protocol headers are now minimally parsed to clean up
email addresses.
- New function split_mime_email_addresses takes a MIME header and
splits it into individual addresses, keeping the display names.
- Update tests.
When using read_files, the Exec framework called Input::remove on the
wrong input stream: it always got called on the input stream of the
execution, not on the input stream of the current file that was being
read.
This led to threads never being closed and file handles being kept open
until Bro is closed. This means that before this patch, every time
ActiveHTTP was used, a thread stayed around and several file handles
remained open.
In all versions so far, the identifier string that was used for
comparisons might have been different from the identifier string that
was added (when certain notices are used).
This commit rewrites the way that weirds are logged and fixes a number
of issues on the way. Most prominently, flow weirds now actually log
information about the flow that they occur in (before this change, they
only logged the name of the weird, which is only marginally helpful).
Besides restructuring how weird logging works internally, weirds can now
also be generated by calling Weird::weird with the info record directly,
allowing more fine-grained passing of information. This is used, e.g.,
for DNS weirds, which no longer have the connection record available
when they are generated (previously, data such as the connection ID was
simply not logged in these cases).
Addresses BIT-1578
We now extract email addresses in the fields that one would expect
to contain addresses. This makes downstream processing of these
fields, such as log analysis or use in the Intel framework, easier.
The primary downside is that any other content in these fields, such
as the full name and any group information, is no longer available.
I believe the simplification of the content in these fields is worth
the change.
Added "cc" to the script that feeds information from SMTP into the
Intel framework.
A new script for email handling utility functions has been created
as a side effect of these changes.
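
The two views of an address header described above (full entries with
display names versus bare addresses) can be illustrated with a small
Python sketch. This is not the Bro script code: it relies on Python's
standard email.utils module, and the function names
split_with_display_names and addresses_only are hypothetical.

```python
from email.utils import formataddr, getaddresses


def split_with_display_names(header: str) -> list:
    """Split a header value into entries, keeping any display name."""
    pairs = getaddresses([header])
    # Skip entries for which no address could be parsed.
    return [formataddr(pair) for pair in pairs if pair[1]]


def addresses_only(header: str) -> list:
    """Return just the bare addresses, dropping display names."""
    return [addr for _name, addr in getaddresses([header]) if addr]


header = "Alice Example <alice@example.com>, bob@example.org"
with_names = split_with_display_names(header)
bare = addresses_only(header)
```

The bare form is the kind of value that is easy to match in logs or
against intelligence data; the form with display names keeps the
information that the simplified fields drop.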
This changes the HTTP log format slightly but shouldn't mess
up anything that anyone was doing because the old "filename"
field was never actually filled out. Tests are updated as well.
All frame types other than data frames, as well as frame subtypes
without payload, are skipped.
Header length is determined based on presence of QoS and flags
indicating the use of the 4th address field. Handling of aggregated
MSDUs is explicitly prevented.
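
A rough Python sketch of the header handling described above. It
assumes the usual IEEE 802.11 layout (24-byte base header, 6 more bytes
for the 4th address when both DS flags are set, a 2-byte QoS control
field for QoS subtypes, A-MSDU indicated by bit 7 of the first QoS
control byte). The function name data_header_length is hypothetical,
and other optional fields such as HT Control are ignored; this is not
the code Bro uses.

```python
from typing import Optional

BASE_HEADER_LEN = 24   # frame control, duration, 3 addresses, sequence control
ADDR4_LEN = 6          # 4th address, present when ToDS and FromDS are both set
QOS_CONTROL_LEN = 2    # QoS control field of QoS data subtypes
TYPE_DATA = 2


def data_header_length(frame: bytes) -> Optional[int]:
    """Return the 802.11 header length, or None if the frame is skipped."""
    if len(frame) < BASE_HEADER_LEN:
        return None
    fc0, fc1 = frame[0], frame[1]
    frame_type = (fc0 >> 2) & 0x03
    subtype = (fc0 >> 4) & 0x0F

    if frame_type != TYPE_DATA:
        return None                      # only data frames are processed
    if subtype & 0x04:
        return None                      # subtypes that carry no payload

    length = BASE_HEADER_LEN
    if (fc1 & 0x03) == 0x03:
        length += ADDR4_LEN              # ToDS and FromDS: 4th address in use
    if subtype & 0x08:                   # QoS data subtype
        if len(frame) < length + QOS_CONTROL_LEN:
            return None
        if frame[length] & 0x80:
            return None                  # aggregated MSDU: not handled
        length += QOS_CONTROL_LEN
    return length
```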
The expiration attribute expression is now evaluated for every use. Thus
later adjustments of the value (e.g. by redefining a const) will now
take effect. Values less than 0 will disable expiration.
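
The difference between evaluating the expiration value once and
evaluating it on every use can be sketched in Python. The class
ExpiringTable and its parameters are hypothetical and only model the
behavior described above; they do not mirror Bro's table
implementation or its different expiration attributes.

```python
class ExpiringTable:
    """Toy table whose expiration interval is re-evaluated on each lookup."""

    def __init__(self, expire_interval, clock):
        self._interval = expire_interval  # callable, evaluated on every use
        self._clock = clock               # callable returning the current time
        self._entries = {}                # key -> (value, time stored)

    def put(self, key, value):
        self._entries[key] = (value, self._clock())

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        value, stored = entry
        interval = self._interval()       # picks up later adjustments
        # A negative interval disables expiration entirely.
        if interval >= 0 and self._clock() - stored >= interval:
            del self._entries[key]
            return None
        return value
```

Because the interval is a callable rather than a stored number, changing
the underlying setting after the table was created affects entries that
are already present.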
Added a new BIF haversine_distance that computes distance between two
geographic locations.
Added a new Bro script function haversine_distance_ip that does the same
but takes two IP addresses instead of latitude/longitude. This function
requires that Bro be built with libgeoip.
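
For reference, a minimal Python sketch of the haversine formula. It
assumes coordinates in degrees and a mean Earth radius of 6371 km; the
unit and radius used by the BIF are not stated here, so the radius
parameter is an assumption. The IP-based variant would first need a
geolocation lookup, which is not shown.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumption: mean Earth radius in kilometers


def haversine_distance(lat1, lon1, lat2, lon2, radius=EARTH_RADIUS_KM):
    """Great-circle distance between two points given in degrees."""
    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    # Clamp to 1.0 to guard against rounding slightly above the asin domain.
    return 2 * radius * math.asin(min(1.0, math.sqrt(a)))
```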
I changed the patch slightly: debug.log is now only created if a debug
stream is enabled.
BIT-1616 #merged
* origin/topic/dnthayer/ticket1616:
Don't create debug.log immediately upon startup
The RFB analyzer's state machine did not foresee that a server could
send two consecutive messages in one packet. This would result in the
error. Patch by Martin van Hensbergen.
BIT-1611 #merged
* origin/topic/seth/remove-unescaped_special_char-weird:
Add urldecoding for the unofficial %u00AE style of encoding.
Remove the unescaped_special_char HTTP weird.
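
To illustrate the unofficial %uXXXX encoding, here is a small Python
sketch that decodes both %XX and %uXXXX escapes. The function name
url_decode is hypothetical, and treating each %XX escape as a single
code point is a simplification; this is not how Bro's decoder is
implemented.

```python
import re

# %uXXXX (four hex digits) is tried first, then the standard %XX form.
_ESCAPE = re.compile(r"%u([0-9A-Fa-f]{4})|%([0-9A-Fa-f]{2})")


def url_decode(text: str) -> str:
    """Decode %XX and %uXXXX escapes; leave malformed escapes untouched."""
    def replace(match):
        digits = match.group(1) or match.group(2)
        return chr(int(digits, 16))

    return _ESCAPE.sub(replace, text)
```

With this sketch, "%u00AE" decodes to the registered sign (U+00AE),
while a malformed escape such as "%uZZZZ" is passed through unchanged.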