From 17bc3955f90b62659f55a85ad11dfc710be2a484 Mon Sep 17 00:00:00 2001 From: Scott Runnels Date: Fri, 20 Sep 2013 11:43:45 -0400 Subject: [PATCH 1/5] Update the lines included from events.bif.bro. Previously listed connection_established and connection_finished which are no longer in place in events.bif.bro. --- doc/scripting/index.rst | 7 +------ 1 file changed, 1 insertion(+), 6 deletions(-) diff --git a/doc/scripting/index.rst b/doc/scripting/index.rst index 197180241e..aca1a9472e 100644 --- a/doc/scripting/index.rst +++ b/doc/scripting/index.rst @@ -197,13 +197,8 @@ such, there are events defined for the primary parts of the connection life-cycle as you'll see from the small selection of connection-related events below. -.. todo:: - - Update the line numbers, this isn't pulling in the right events - anymore but I don't know which ones it were. - .. btest-include:: ${BRO_SRC_ROOT}/build/scripts/base/bif/event.bif.bro - :lines: 135-138,154,204-208,218,255-256,266,335-340,351 + :lines: 69-72,88,106-109,129,132-137,148 Of the events listed, the event that will give us the best insight into the connection record data type will be From 5fede2f73e6d7758bbb27a73440a01efba04731c Mon Sep 17 00:00:00 2001 From: Scott Runnels Date: Fri, 20 Sep 2013 12:22:12 -0400 Subject: [PATCH 2/5] Spelling corrections. Apparently I am unable to spell "separate". --- doc/scripting/index.rst | 44 ++++++++++++++++++++--------------------- 1 file changed, 21 insertions(+), 23 deletions(-) diff --git a/doc/scripting/index.rst b/doc/scripting/index.rst index aca1a9472e..b01165a5bc 100644 --- a/doc/scripting/index.rst +++ b/doc/scripting/index.rst @@ -33,7 +33,7 @@ are invalid. This entire process is setup by telling Bro that should it see a server or client issue an SSL ``HELLO`` message, we want to know about the information about that connection. 
-It's often the easiest to understand Bro's scripting language by +It's often easiest to understand Bro's scripting language by looking at a complete script and breaking it down into its identifiable components. In this example, we'll take a look at how Bro queries the `Team Cymru Malware hash registry @@ -76,7 +76,7 @@ this level of granularity might not be entirely necessary though. The export section redefines an enumerable constant that describes the type of notice we will generate with the logging framework. Bro -allows for redefinable constants, which at first, might seem +allows for re-definable constants, which at first, might seem counter-intuitive. We'll get more in-depth with constants in a later chapter, for now, think of them as variables that can only be altered before Bro starts running. The notice type listed allows for the use @@ -84,7 +84,7 @@ of the :bro:id:`NOTICE` function to generate notices of type ``Malware_Hash_Registry_Match`` as done in the next section. Notices allow Bro to generate some kind of extra notification beyond its default log types. Often times, this extra notification comes in the -form of an email generated and sent to a pre-configured address. +form of an email generated and sent to a preconfigured address. .. btest-include:: ${BRO_SRC_ROOT}/scripts/policy/frameworks/files/detect-MHR.bro :lines: 26-44 @@ -112,9 +112,9 @@ The ``when`` block performs a DNS TXT lookup and stores the result in the local variable ``MHR_result``. Effectively, processing for this event continues and upon receipt of the values returned by :bro:id:`lookup_hostname_txt`, the ``when`` block is executed. The -``when`` block splits the string returned into two seperate values and +``when`` block splits the string returned into two separate values and checks to ensure an expected format. 
If the format is invalid, the -script assumes that the hash wasn't found in the respository and +script assumes that the hash wasn't found in the repository and processing is concluded. If the format is as expected and the detection rate is above the threshold set by ``MHR_threshold``, two new local variables are created and used in the notice issued by @@ -168,7 +168,7 @@ the event, and a concise explanation of the functions use. :lines: 29-54 Above is a segment of the documentation for the event -:bro:id:`dns_request` (and the preceeding link points to the +:bro:id:`dns_request` (and the preceding link points to the documentation generated out of that). It's organized such that the documentation, commentary, and list of arguments precede the actual event definition used by Bro. As Bro detects DNS requests being @@ -240,7 +240,7 @@ information gleaned from the analysis of a connection as a complete unit. To break down this collection of information, you will have to make use of use Bro's field delimiter ``$``. For example, the originating host is referenced by ``c$id$orig_h`` which if given a -narritive relates to ``orig_h`` which is a member of ``id`` which is +narrative relates to ``orig_h`` which is a member of ``id`` which is a member of the data structure referred to as ``c`` that was passed into the event handler." Given that the responder port (``c$id$resp_p``) is ``53/tcp``, it's likely that Bro's base DNS scripts @@ -338,7 +338,7 @@ Constants Bro also makes use of constants, which are denoted by the ``const`` keyword. Unlike globals, constants can only be set or altered at parse time if the ``&redef`` attribute has been used. Afterwards (in -runtime) the constants are unalterable. In most cases, redefinable +runtime) the constants are unalterable. In most cases, re-definable constants are used in Bro scripts as containers for configuration options. 
For example, the configuration option to log password decrypted from HTTP streams is stored in @@ -354,7 +354,7 @@ following line to our ``site/local.bro`` file before firing up Bro. .. btest-include:: ${DOC_ROOT}/scripting/data_type_const_simple.bro -While the idea of a redefinable constant might be odd, the constraint +While the idea of a re-definable constant might be odd, the constraint that constants can only be altered at parse-time remains even with the ``&redef`` attribute. In the code snippet below, a table of strings indexed by ports is declared as a constant before two values are added @@ -412,7 +412,7 @@ The table below shows the atomic types used in Bro, of which the first four should seem familiar if you have some scripting experience, while the remaining six are less common in other languages. It should come as no surprise that a scripting language for a Network Security -Monitoring platform has a fairly robust set of network centric data +Monitoring platform has a fairly robust set of network-centric data types and taking note of them here may well save you a late night of reinventing the wheel. @@ -474,7 +474,7 @@ the ``for`` loop, the next element is chosen. Since sets are not an ordered data type, you cannot guarantee the order of the elements as the ``for`` loop processes. -To test for membership in a set the ``in`` statment can be combined +To test for membership in a set the ``in`` statement can be combined with an ``if`` statement to return a true or false value. If the exact element in the condition is already in the set, the condition returns true and the body executes. The ``in`` statement can also be @@ -541,7 +541,7 @@ iterate over, say, the directors; we have to iterate with the exact format as the keys themselves. In this case, we need squared brackets surrounding four temporary variables to act as a collection for our iteration. 
While this is a contrived example, we could easily have -had keys containin IP addresses (``addr``), ports (``port``) and even a ``string`` +had keys containing IP addresses (``addr``), ports (``port``) and even a ``string`` calculated as the result of a reverse hostname lookup. .. btest-include:: ${DOC_ROOT}/scripting/data_struct_table_complex.bro @@ -642,7 +642,7 @@ subnet ~~~~~~ Bro has full support for CIDR notation subnets as a base data type. -There is no need to manage the IP and the subnet mask as two seperate +There is no need to manage the IP and the subnet mask as two separate entities when you can provide the same information in CIDR notation in your scripts. The following example below uses a Bro script to determine if a series of IP addresses are within a set of subnets @@ -802,7 +802,7 @@ composite type. We have, in fact, already encountered a a complex example of the ``record`` data type in the earlier sections, the :bro:type:`connection` record passed to many events. Another one, :bro:type:`Conn::Info`, which corresponds to the fields logged into -``conn.log``, is shown by the exerpt below. +``conn.log``, is shown by the excerpt below. .. btest-include:: ${BRO_SRC_ROOT}/scripts/base/protocols/conn/main.bro :lines: 10-12,16,17,19,21,23,25,28,31,35,37,56,62,68,90,93,97,100,104,108,109,114 @@ -813,7 +813,7 @@ definition is within the confines of an export block, what is defined is, in fact, ``Conn::Info``. The formatting for a declaration of a record type in Bro includes the -descriptive name of the type being defined and the seperate fields +descriptive name of the type being defined and the separate fields that make up the record. The individual fields that make up the new record are not limited in type or number as long as the name for each field is unique. @@ -829,7 +829,7 @@ string, a set of ports, and a count to define a service type. 
Also included is a function to print each field of a record in a formatted fashion and a :bro:id:`bro_init` event handler to show some functionality of working with records. The definitions of the DNS and -HTTP services are both done inline using squared brackets before being +HTTP services are both done in-line using squared brackets before being passed to the ``print_service`` function. The ``print_service`` function makes use of the ``$`` dereference operator to access the fields within the newly defined Service record type. @@ -846,7 +846,7 @@ record. @TEST-EXEC: btest-rst-cmd bro ${DOC_ROOT}/scripting/data_struct_record_02.bro The example above includes a second record type in which a field is -used as the data type for a set. Records can be reapeatedly nested +used as the data type for a set. Records can be repeatedly nested within other records, their fields reachable through repeated chains of the ``$`` dereference operator. @@ -1123,7 +1123,7 @@ which we will cover shortly. +---------------------+------------------------------------------------------------------+----------------+----------------------------------------+ | policy_items | set[count] | &log &optional | Policy items that have been applied | +---------------------+------------------------------------------------------------------+----------------+----------------------------------------+ -| email_body_sections | vector | &optinal | Body of the email for email notices. | +| email_body_sections | vector | &optional | Body of the email for email notices. | +---------------------+------------------------------------------------------------------+----------------+----------------------------------------+ | email_delay_tokens | set[string] | &optional | Delay functionality for email notices. 
| +---------------------+------------------------------------------------------------------+----------------+----------------------------------------+ @@ -1137,7 +1137,7 @@ has been heuristically detected and the originating hostname is one that would raise suspicion. Effectively, the script attempts to define a list of hosts from which you would never want to see SSH traffic originating, like DNS servers, mail servers, etc. To -accomplish this, the script adhere's to the seperation of detection +accomplish this, the script adheres to the separation of detection and reporting by detecting a behavior and raising a notice. Whether or not that notice is acted upon is decided by the local Notice Policy, but the script attempts to supply as much information as @@ -1221,7 +1221,7 @@ Bro. In the :doc:`/scripts/policy/protocols/ssl/expiring-certs` script which identifies when SSL certificates are set to expire and raises -notices when it crosses a pre-defined threshold, the call to +notices when it crosses a predefined threshold, the call to ``NOTICE`` above also sets the ``$identifier`` entry by concatenating the responder IP, port, and the hash of the certificate. The selection of responder IP, port and certificate hash fits perfectly @@ -1257,7 +1257,7 @@ In short, there will be notice policy considerations where a broad decision can be made based on the ``Notice::Type`` alone. To facilitate these types of decisions, the Notice Framework supports Notice Policy shortcuts. These shortcuts are implemented through the -means of a group of data structures that map specific, pre-defined +means of a group of data structures that map specific, predefined details and actions to the effective name of a notice. 
Primarily implemented as a set or table of enumerables of :bro:type:`Notice::Type`, Notice Policy shortcuts can be placed as a single directive in your @@ -1303,5 +1303,3 @@ Notice::emailed_types set while the shortcut below alters the length of time for which those notices will be suppressed. .. btest-include:: ${DOC_ROOT}/scripting/framework_notice_shortcuts_02.bro - - From 8e3c6ada0fc0249fece2864faaa04dfd1b330c2c Mon Sep 17 00:00:00 2001 From: Scott Runnels Date: Fri, 20 Sep 2013 13:25:49 -0400 Subject: [PATCH 3/5] Rewrite the MHR detection description. Now that the MHR script uses the file analysis framework, the description needed to be rewritten to reflect the changes. Robin commented that he didn't feel the MHR script was a good introductory script and he might be right, however, I couldn't find one that was easier to explain. --- doc/scripting/index.rst | 95 ++++++++++++++++++++++------------------- 1 file changed, 51 insertions(+), 44 deletions(-) diff --git a/doc/scripting/index.rst b/doc/scripting/index.rst index b01165a5bc..077d5a9c45 100644 --- a/doc/scripting/index.rst +++ b/doc/scripting/index.rst @@ -10,13 +10,6 @@ Writing Bro Scripts Understanding Bro Scripts ========================= -.. todo:: - - The MHR integration has changed significantly since the text was - written. We need to update it, however I'm actually not sure this - script is a good introductory example anymore unfortunately. - -Robin - Bro includes an event-driven scripting language that provides the primary means for an organization to extend and customize Bro's functionality. Virtually all of the output generated by Bro @@ -51,82 +44,96 @@ appropriate DNS lookup and parsing the response. .. btest-include:: ${BRO_SRC_ROOT}/scripts/policy/frameworks/files/detect-MHR.bro Visually, there are three distinct sections of the script. 
A base -level with no indentation followed by an indented and formatted -section explaining the custom variables being provided (``export``) and another -indented and formatted section describing the instructions for a -specific event (``event log_http``). Don't get discouraged if you don't +level with no indentation where libraries are included in the script through ``@load`` +and a namespace is defined with ``module``. This is followed by an indented and formatted +section explaining the custom variables being provided (``export``) as part of the script's namespace. +Finally there is a second indented and formatted section describing the instructions to take for a +specific event (``event file_hash``). Don't get discouraged if you don't understand every section of the script; we'll cover the basics of the script and much more in following sections. .. btest-include:: ${BRO_SRC_ROOT}/scripts/policy/frameworks/files/detect-MHR.bro - :lines: 7-11 + :lines: 4-6 Lines 7 and 8 of the script process the ``__load__.bro`` script in the respective directories being loaded. The ``@load`` directives are often considered good practice or even just good manners when writing -Bro scripts to make sure they can be -used on their own. While it's unlikely that in a +Bro scripts to make sure they can be used on their own. While it's unlikely that in a full production deployment of Bro these additional resources wouldn't already be loaded, it's not a bad habit to try to get into as you get more experienced with Bro scripting. If you're just starting out, -this level of granularity might not be entirely necessary though. +this level of granularity might not be entirely necessary. The ``@load`` directives +are ensuring the Files framework, the Notice framework and the script to hash all files have +been loaded by Bro. ..
btest-include:: ${BRO_SRC_ROOT}/scripts/policy/frameworks/files/detect-MHR.bro - :lines: 12-24 + :lines: 10-31 The export section redefines an enumerable constant that describes the -type of notice we will generate with the logging framework. Bro +type of notice we will generate with the Notice framework. Bro allows for re-definable constants, which at first, might seem counter-intuitive. We'll get more in-depth with constants in a later chapter, for now, think of them as variables that can only be altered before Bro starts running. The notice type listed allows for the use of the :bro:id:`NOTICE` function to generate notices of type -``Malware_Hash_Registry_Match`` as done in the next section. Notices +``TeamCymruMalwareHashRegistry::Match`` as done in the next section. Notices allow Bro to generate some kind of extra notification beyond its default log types. Often times, this extra notification comes in the -form of an email generated and sent to a preconfigured address. +form of an email generated and sent to a preconfigured address, but can be altered +depending on the needs of the deployment. The export section is finished off with +the definition of two constants that list the kind of files we want to match against and +the minimum percentage of detection threshold in which we are interested. + +Up until this point, the script has merely done some basic setup. With the next section, +the script starts to define instructions to take in a given event. .. btest-include:: ${BRO_SRC_ROOT}/scripts/policy/frameworks/files/detect-MHR.bro - :lines: 26-44 + :lines: 33-57 The workhorse of the script is contained in the event handler for -``log_http``. The ``log_http`` event is defined as an event-hook in -the :doc:`/scripts/base/protocols/http/main` script and allows scripts -to handle a connection as it is being passed to the logging framework. 
-The event handler is passed an :bro:type:`HTTP::Info` data structure -which will be referred to as ``rec`` in body of the event handler. +``file_hash``. The ``file_hash`` event is defined in the +:doc:`/scripts/base/bif/plugins/Bro_FileHash.events.bif.bro` script and allows scripts to access +the information associated with a file for which Bro's file analysis framework has +generated a hash. The event handler is passed the file itself as ``f``, the type of digest +algorithm used as ``kind`` and the hash generated as ``hash``. -An ``if`` statement is used to check for the existence of a data structure -named ``md5`` nested within the ``rec`` data structure. Bro uses the ``$`` as -a deference operator and as such, and it is employed in this script to -check if ``rec$md5`` is present by including the ``?`` operator within the -path. If the ``rec`` data structure includes a nested data structure -named ``md5``, the statement is processed as true and a local variable -named ``hash_domain`` is provisioned and given a format string based on -the contents of ``rec$md5`` to produce a valid DNS lookup. +On line 35, an ``if`` statement is used to check for the correct type of hash, in this case +a SHA1 hash. It also checks for a mime type we've defined as being of interest as defined in the +constant ``match_file_types``. The comparison is made against the variable ``f$mime_type`` which uses +the ``$`` dereference operator to check the value ``mime_type`` inside the variable ``f``. Once both +values resolve to true, a local variable is defined to hold a string comprised of the SHA1 hash concatenated +with ".malware.hash.cymru.com"; this value will be the domain queried in the malware hash registry. The rest of the script is contained within a ``when`` block. In short, a ``when`` block is used when Bro needs to perform asynchronous -actions, such a DNS lookup, to ensure that performance isn't effected. 
+actions, such as a DNS lookup, to ensure that performance isn't affected. The ``when`` block performs a DNS TXT lookup and stores the result in the local variable ``MHR_result``. Effectively, processing for this event continues and upon receipt of the values returned by :bro:id:`lookup_hostname_txt`, the ``when`` block is executed. The -``when`` block splits the string returned into two separate values and -checks to ensure an expected format. If the format is invalid, the -script assumes that the hash wasn't found in the repository and -processing is concluded. If the format is as expected and the -detection rate is above the threshold set by ``MHR_threshold``, two -new local variables are created and used in the notice issued by -:bro:id:`NOTICE`. +``when`` block splits the string returned into a portion for the date on which +the malware was first detected and the detection rate by splitting on an text space +and storing the values returned in a local table variable. In line 42, if the table +returned by ``split1`` has two entries, indicating a sucessful split, we store the detection +date in ``mhr_first_detect`` and the rate in ``mhr_detect_rate`` on lines 45 and 45 respectively +using the appropriate conversion functions. From this point on, Bro knows it has seen a file +transmitted which has a hash that has been seen by the Team Cymru Malware Hash Registry, the rest +of the script is dedicated to producing a notice. -In approximately 15 lines of actual code, Bro provides an amazing +On line 47, the detection time is processed into a string representation and stored in +``readable_first_detected``. The script then compares the detection rate against the +``notice_threshold`` that was defined on line 30.
If the detection rate is high enough, the script +creates a concise description of the notice on line 50, a possible URL to check the sample against +virustotal.com's database, and makes the call to :bro:id:`NOTICE` to hand the relevant information +off to the Notice framework. + +In approximately 25 lines of code, Bro provides an amazing utility that would be incredibly difficult to implement and deploy -with other products. In truth, claiming that Bro does this in 15 +with other products. In truth, claiming that Bro does this in 25 lines is a misdirection; there is a truly massive number of things going on behind-the-scenes in Bro, but it is the inclusion of the scripting language that gives analysts access to those underlying -layers in a succinct and well defined manner. +layers in a succinct and well defined manner. The Event Queue and Event Handlers ================================== From 89090ec34af8e2bba63d069fd077825080962103 Mon Sep 17 00:00:00 2001 From: Scott Runnels Date: Fri, 20 Sep 2013 13:33:44 -0400 Subject: [PATCH 4/5] Include a better description for detect-MHR.bro I added a better more concise and accurate description of what is going on behind the scenes of detect-MHR.bro to not only bring it into line with the Files framework but to help make it a bit more clear as to where the various responsibilities lie. --- doc/scripting/index.rst | 21 ++++++++++----------- 1 file changed, 10 insertions(+), 11 deletions(-) diff --git a/doc/scripting/index.rst b/doc/scripting/index.rst index 077d5a9c45..aeaeac1726 100644 --- a/doc/scripting/index.rst +++ b/doc/scripting/index.rst @@ -29,17 +29,16 @@ about the information about that connection. It's often easiest to understand Bro's scripting language by looking at a complete script and breaking it down into its identifiable components. In this example, we'll take a look at how -Bro queries the `Team Cymru Malware hash registry -`_ for downloads via -HTTP. 
Part of the Team Cymru Malware Hash registry includes the -ability to do a host lookup on a domain with the format -``MALWARE_HASH.malware.hash.cymru.com`` where ``MALWARE_HASH`` is the MD5 or -SHA1 hash of a file. Team Cymru also populates the TXT record of -their DNS responses with both a "last seen" timestamp and a numerical -"detection rate". The important aspect to understand is Bro already -generates hashes for files it can parse from HTTP streams, but the -script ``detect-MHR.bro`` is responsible for generating the -appropriate DNS lookup and parsing the response. +Bro checks the SHA1 hash of various files extracted from network traffic +against the `Team Cymru Malware hash registry +`_. Part of the Team Cymru Malware +Hash registry includes the ability to do a host lookup on a domain with the format +``MALWARE_HASH.malware.hash.cymru.com`` where ``MALWARE_HASH`` is the SHA1 hash of a file. +Team Cymru also populates the TXT record of their DNS responses with both a "first seen" +timestamp and a numerical "detection rate". The important aspect to understand is Bro already +generates hashes for files via the Files framework, but it is the +script ``detect-MHR.bro`` that is responsible for generating the +appropriate DNS lookup, parsing the response, and generating a notice if appropriate. .. btest-include:: ${BRO_SRC_ROOT}/scripts/policy/frameworks/files/detect-MHR.bro From 261b9e1e9747631bd5b0ae8f4bbfcc3b4cda9f0a Mon Sep 17 00:00:00 2001 From: Scott Runnels Date: Fri, 20 Sep 2013 13:36:56 -0400 Subject: [PATCH 5/5] Spelling corrections.
--- doc/scripting/index.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/doc/scripting/index.rst b/doc/scripting/index.rst index aeaeac1726..7c484af7e9 100644 --- a/doc/scripting/index.rst +++ b/doc/scripting/index.rst @@ -113,7 +113,7 @@ this event continues and upon receipt of the values returned by ``when`` block splits the string returned into a portion for the date on which the malware was first detected and the detection rate by splitting on an text space and storing the values returned in a local table variable. In line 42, if the table -returned by ``split1`` has two entries, indicating a sucessful split, we store the detection +returned by ``split1`` has two entries, indicating a successful split, we store the detection date in ``mhr_first_detect`` and the rate in ``mhr_detect_rate`` on lines 45 and 45 respectively using the appropriate conversion functions. From this point on, Bro knows it has seen a file transmitted which has a hash that has been seen by the Team Cymru Malware Hash Registry, the rest