Rewrite the MHR detection description.

Now that the MHR script uses the file analysis framework, the
description needed to be rewritten to reflect the changes.  Robin
commented that he didn't feel the MHR script was a good introductory
script and he might be right, however, I couldn't find one that was
easier to explain.
This commit is contained in:
Scott Runnels 2013-09-20 13:25:49 -04:00
parent 5fede2f73e
commit 8e3c6ada0f

@ -10,13 +10,6 @@ Writing Bro Scripts
Understanding Bro Scripts
=========================
.. todo::
The MHR integration has changed significantly since the text was
written. We need to update it, however I'm actually not sure this
script is a good introductory example anymore unfortunately.
-Robin
Bro includes an event-driven scripting language that provides
the primary means for an organization to extend and customize Bro's
functionality. Virtually all of the output generated by Bro
@ -51,82 +44,96 @@ appropriate DNS lookup and parsing the response.
.. btest-include:: ${BRO_SRC_ROOT}/scripts/policy/frameworks/files/detect-MHR.bro
Visually, there are three distinct sections of the script. A base
level with no indentation followed by an indented and formatted
section explaining the custom variables being provided (``export``) and another
indented and formatted section describing the instructions for a
specific event (``event log_http``). Don't get discouraged if you don't
level with no indentation where libraries are included in the script through ``@load``
and a namespace is defined with ``module``. This is followed by an indented and formatted
section explaining the custom variables being provided (``export``) as part of the script's namespace.
Finally there is a second indented and formatted section describing the instructions to take for a
specific event (``event file_hash``). Don't get discouraged if you don't
understand every section of the script; we'll cover the basics of the
script and much more in following sections.
.. btest-include:: ${BRO_SRC_ROOT}/scripts/policy/frameworks/files/detect-MHR.bro
:lines: 7-11
:lines: 4-6
Lines 4 through 6 of the script are ``@load`` directives; the first two process
the ``__load__.bro`` script in the
respective directories being loaded. The ``@load`` directives are
often considered good practice or even just good manners when writing
Bro scripts to make sure they can be
used on their own. While it's unlikely that in a
Bro scripts to make sure they can be used on their own. While it's unlikely that in a
full production deployment of Bro these additional resources wouldn't
already be loaded, it's not a bad habit to try to get into as you get
more experienced with Bro scripting. If you're just starting out,
this level of granularity might not be entirely necessary though.
this level of granularity might not be entirely necessary. The ``@load`` directives
ensure that the Files framework, the Notice framework, and the script to hash all files have
been loaded by Bro.
.. btest-include:: ${BRO_SRC_ROOT}/scripts/policy/frameworks/files/detect-MHR.bro
:lines: 12-24
:lines: 10-31
The export section redefines an enumerable constant that describes the
type of notice we will generate with the logging framework. Bro
type of notice we will generate with the Notice framework. Bro
allows for re-definable constants, which at first might seem
counter-intuitive. We'll get more in-depth with constants in a later
chapter; for now, think of them as variables that can only be altered
before Bro starts running. The notice type listed allows for the use
of the :bro:id:`NOTICE` function to generate notices of type
``Malware_Hash_Registry_Match`` as done in the next section. Notices
``TeamCymruMalwareHashRegistry::Match`` as done in the next section. Notices
allow Bro to generate some kind of extra notification beyond its
default log types. Often times, this extra notification comes in the
form of an email generated and sent to a preconfigured address.
form of an email generated and sent to a preconfigured address, but this can be altered
depending on the needs of the deployment. The export section is finished off with
the definition of two constants: one lists the kinds of files we want to match against and
the other sets the minimum detection rate, as a percentage, in which we are interested.
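As a hedged illustration of re-definable constants, a site-specific script could raise
the detection threshold before Bro starts running. This is only a minimal sketch: it assumes
``notice_threshold`` is declared with the ``&redef`` attribute, and the value ``50`` is arbitrary.

.. code:: bro

    # Load the MHR detection script so that its namespace and constants exist.
    @load frameworks/files/detect-MHR

    # Assumption: notice_threshold carries &redef in the script being loaded.
    # Only notices for samples flagged at a rate of 50 or more would be raised.
    redef TeamCymruMalwareHashRegistry::notice_threshold = 50;

Because the ``redef`` is evaluated when scripts are loaded, the new value is in place
before any event handler runs.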
Up until this point, the script has merely done some basic setup. With the next section,
the script starts to define instructions to take in a given event.
.. btest-include:: ${BRO_SRC_ROOT}/scripts/policy/frameworks/files/detect-MHR.bro
:lines: 26-44
:lines: 33-57
The workhorse of the script is contained in the event handler for
``log_http``. The ``log_http`` event is defined as an event-hook in
the :doc:`/scripts/base/protocols/http/main` script and allows scripts
to handle a connection as it is being passed to the logging framework.
The event handler is passed an :bro:type:`HTTP::Info` data structure
which will be referred to as ``rec`` in body of the event handler.
``file_hash``. The ``file_hash`` event is defined in the
:doc:`/scripts/base/bif/plugins/Bro_FileHash.events.bif.bro` script and allows scripts to access
the information associated with a file for which Bro's file analysis framework has
generated a hash. The event handler is passed the file itself as ``f``, the type of digest
algorithm used as ``kind`` and the hash generated as ``hash``.
An ``if`` statement is used to check for the existence of a data structure
named ``md5`` nested within the ``rec`` data structure. Bro uses the ``$`` as
a deference operator and as such, and it is employed in this script to
check if ``rec$md5`` is present by including the ``?`` operator within the
path. If the ``rec`` data structure includes a nested data structure
named ``md5``, the statement is processed as true and a local variable
named ``hash_domain`` is provisioned and given a format string based on
the contents of ``rec$md5`` to produce a valid DNS lookup.
On line 35, an ``if`` statement is used to check for the correct type of hash, in this case
a SHA1 hash. It also checks for a MIME type of interest, as defined in the
constant ``match_file_types``. The comparison is made against the variable ``f$mime_type`` which uses
the ``$`` dereference operator to check the value ``mime_type`` inside the variable ``f``. If both
checks resolve to true, a local variable is defined to hold a string composed of the SHA1 hash concatenated
with ".malware.hash.cymru.com"; this value will be the domain queried in the malware hash registry.
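The check and the construction of the lookup domain can be sketched on their own as follows.
This is a simplified stand-in rather than the script itself: the shortened ``match_file_types``
pattern is a hypothetical local definition, the ``f?$mime_type`` guard is an addition that
assumes ``mime_type`` is an optional field, and the ``print`` statement exists only to show
the result.

.. code:: bro

    # Ensure file_hash events are generated for the files Bro sees.
    @load frameworks/files/hash-all-files

    # Hypothetical, shortened list of MIME types for this sketch only.
    const match_file_types = /application\/x-dosexec/ |
                             /application\/pdf/ &redef;

    event file_hash(f: fa_file, kind: string, hash: string)
        {
        # Only act on SHA1 hashes of files whose MIME type matches the pattern.
        # The ?$ test guards against the mime_type field being unset.
        if ( kind == "sha1" && f?$mime_type && match_file_types in f$mime_type )
            {
            # Build the name to query, e.g. "<sha1>.malware.hash.cymru.com".
            local hash_domain = fmt("%s.malware.hash.cymru.com", hash);
            print hash_domain;
            }
        }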
The rest of the script is contained within a ``when`` block. In
short, a ``when`` block is used when Bro needs to perform asynchronous
actions, such a DNS lookup, to ensure that performance isn't effected.
actions, such as a DNS lookup, to ensure that performance isn't affected.
The ``when`` block performs a DNS TXT lookup and stores the result
in the local variable ``MHR_result``. Effectively, processing for
this event continues and upon receipt of the values returned by
:bro:id:`lookup_hostname_txt`, the ``when`` block is executed. The
``when`` block splits the string returned into two separate values and
checks to ensure an expected format. If the format is invalid, the
script assumes that the hash wasn't found in the repository and
processing is concluded. If the format is as expected and the
detection rate is above the threshold set by ``MHR_threshold``, two
new local variables are created and used in the notice issued by
:bro:id:`NOTICE`.
``when`` block splits the string returned into a portion for the date on which
the malware was first detected and a portion for the detection rate by splitting on a space
and storing the values returned in a local table variable. On line 42, if the table
returned by ``split1`` has two entries, indicating a successful split, we store the detection
date in ``mhr_first_detect`` and the rate in ``mhr_detect_rate`` on lines 44 and 45 respectively,
using the appropriate conversion functions. From this point on, Bro knows it has seen a file
transmitted which has a hash that has been seen by the Team Cymru Malware Hash Registry; the rest
of the script is dedicated to producing a notice.
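To make the parsing step concrete, the following sketch applies the same split and
conversions to a fixed string instead of a live DNS answer. The sample value ``"1221154281 53"``
is made up for illustration, and the choice of ``double_to_time``, ``to_double`` and ``to_count``
as the conversion functions is an assumption about what the script uses.

.. code:: bro

    event bro_init()
        {
        # Made-up answer in the "<first detected> <detection rate>" form.
        local MHR_result = "1221154281 53";

        # split1 splits on the first match only and returns a table
        # indexed by count, starting at 1.
        local MHR_answer = split1(MHR_result, / /);

        if ( |MHR_answer| == 2 )
            {
            # Epoch seconds -> time value; rate string -> count.
            local mhr_first_detect = double_to_time(to_double(MHR_answer[1]));
            local mhr_detect_rate = to_count(MHR_answer[2]);
            print mhr_first_detect, mhr_detect_rate;
            }
        }

If the answer does not contain a space, the table has a single entry and the body of the
``if`` statement is skipped.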
In approximately 15 lines of actual code, Bro provides an amazing
On line 47, the detection time is processed into a string representation and stored in
``readable_first_detected``. The script then compares the detection rate against the
``notice_threshold`` that was defined on line 30. If the detection rate is high enough, the script
creates a concise description of the notice on line 50, builds a URL that can be used to check the sample against
virustotal.com's database, and makes the call to :bro:id:`NOTICE` to hand the relevant information
off to the Notice framework.
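The notice-producing steps might look roughly like the sketch below. Everything outside the
names used in the text is hypothetical: the ``MHRSketch`` module, the hard-coded input values,
the message format, the placeholder hash (the SHA1 of an empty input) and the exact
VirusTotal URL layout are assumptions made for illustration.

.. code:: bro

    @load base/frameworks/notice

    module MHRSketch;

    export {
        # Hypothetical notice type and threshold for this sketch only.
        redef enum Notice::Type += { Match };
        const notice_threshold = 10 &redef;
    }

    event bro_init()
        {
        # Stand-ins for values the real script derives from the DNS answer.
        local mhr_first_detect = double_to_time(1221154281.0);
        local mhr_detect_rate = 53;
        local hash = "da39a3ee5e6b4b0d3255bfef95601890afd80709";

        # Render the first-detection time as a human-readable string.
        local readable_first_detected = strftime("%Y-%m-%d %H:%M:%S", mhr_first_detect);

        if ( mhr_detect_rate >= notice_threshold )
            {
            local message = fmt("%s %d", readable_first_detected, mhr_detect_rate);
            # Assumed URL layout; adjust to whatever lookup page is in use.
            local virustotal_url = fmt("https://www.virustotal.com/en/file/%s/analysis/", hash);
            NOTICE([$note=Match, $msg=message, $sub=virustotal_url]);
            }
        }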
In approximately 25 lines of code, Bro provides an amazing
utility that would be incredibly difficult to implement and deploy
with other products. In truth, claiming that Bro does this in 15
with other products. In truth, claiming that Bro does this in 25
lines is a misdirection; there is a truly massive number of things
going on behind-the-scenes in Bro, but it is the inclusion of the
scripting language that gives analysts access to those underlying
layers in a succinct and well defined manner.
layers in a succinct and well defined manner.
The Event Queue and Event Handlers
==================================