zeek/doc/user-manual/scripting.rst

88 lines
9.1 KiB
ReStructuredText

=========
Scripting
=========
.. toctree::
:maxdepth: 2
:numbered:
Understanding Bro Scripts
Event Handlers
Data types
A complete script
Guidelines for writing scripts
Understanding Bro Scripts
=========================
Bro includes an event queue driven scripting language that provides the primary means for an organization to extend and customize Bro's functionality. An overwhelming amount of the output generated by Bro is, in fact, generated by Bro scripts. It's almost easier to consider Bro to be an entity behind-the-scenes processing connections and generating events while Bro's scripting language is the medium through which we mere mortals can achieve communication. Bro scripts effectively notify Bro that should there be an event of a type we define, then let us have the information about the connection so we can perform some function on it. For example, the ssl.log file is generated by a Bro script that walks the entire certificate chain and issues notifications if any of the steps along the certificate chain are invalid. This entire process is setup by telling Bro that should it see a server or client issue an SSL HELLO message, we want to know about the information about that connection.
It's often easier to understand Bro's scripting language by looking at a complete script and breaking it down into its identifiable components. In this example, we'll take a look at how Bro queries the Team Cymru Malware hash registry for detected downloads via HTTP. Part of the Team Cymru Malware Hash registry includes the ability to do a host lookup on a domain with the format MALWARE_HASH.malware.hash.cymru.com where MALWARE_HASH is the md5 or sha1 hash of a file. The important aspect to understand is Bro already generates hashes for files it can parse from HTTP streams, but the script detect-MHR.bro is responsible for generating the appropriate DNS lookup and parsing the response.
.. literalinclude:: ../../scripts/policy/protocols/http/detect-MHR.bro
:language: bro
:linenos:
Visually, there are three distinct sections of the script. A base level with no indentation followed by an indented and formatted section explaining custom the variables being set (export) and another indented and formatted section describing the instructions for a specific event (event log_http). Don't get discouraged if you don't understand every section of the script; we'll cover the basics of the script and much more in following sections.
.. literalinclude:: ../../scripts/policy/protocols/http/detect-MHR.bro
:language: bro
:linenos:
:lines: 7-11
Lines 7 and 8 of the script process the __load__.bro script in the respective directories being loaded. In a full production deployment of Bro it's likely that these files would already be loaded and their contents made available to the script, but including them explicitly ensures that Bro can be run in modes such as "bare" and still have scripts that reliably work.
.. literalinclude:: ../../scripts/policy/protocols/http/detect-MHR.bro
:language: bro
:linenos:
:lines: 12-18
The export section redefines an enumerable constant that describes the type of notice we will generate with the logging framework. The notice type listed allows for the use of the NOTICE() function to generate notices of type Malware_Hash_Registry_Match as done in the next section.
.. literalinclude:: ../../scripts/policy/protocols/http/detect-MHR.bro
:language: bro
:linenos:
:lines: 20-37
The workhorse of the script is contained in the event handler for log_http. The log_http event is defined as an event-hook in the base/protocols/http/main.bro script and allows scripts to handle connection as it is being passed to the logging framework. The event handler is passed an HTTP::Info data structure which will be referred to as "rec" in body of the event handler.
An if statement is used to check for the existence of a data structure named "md5" nested within the rec data structure. Bro uses the "$" as a deference operator and as such, and it is employed in this script to check if rec$md5 is present by including the "?" operator within the path. If the rec data structure includes a nested data structure named "md5", the statement is processed as true and a local variable named "hash_domain" is provisioned and given a format string based on the contents of rec$md5. The script then provisions another local variable named "addrs" and fills it with the output of the lookup_hostname() function which will perform a DNS lookup on the arguments passed to it.
DNS lookups against the Team Cymru Malware Hash Registry will return a '127.0.0.2' if the hash being queried is found in the hash registry. If the response includes 127.0.0.2, the script proceeds to build two new local variables to be used in the notice and then issues the NOTICE().
In approximately 15 lines of actual code, Bro provides an amazing utility that would be incredibly difficult to implement and deploy with other products. In truth, claiming that Bro does this in 15 lines is a misdirection; There is a truly massive number of things going on behind-the-scenes in Bro, but it is the inclusion of the scripting language that gives analysts access to those underlying layers in a succinct and well defined manner.
The Event Queue and Event Handlers
==================================
Bro's scripting language is event driven which is a gear change from the majority of scripting languages with which most users will have previous experience. Scripting in Bro depends on handling the events generated by Bro as it processes network traffic, altering the state of data structures through those events, and making decisions on the information provided. This approach to scripting can often cause confusion to users who come to Bro from a procedural or functional language, but once the initial shock wears off it becomes more clear with each exposure.
Bro's core acts to place events into an ordered "Event Queue", allowing event handlers to process them on a first-come-first-serve basis. In effect, this is Bro's core functionality as without the scripts written to perform discrete actions on events, there would be little to no usable output. As such, a basic understanding of the Event Queue, the Events being generated, and the way in which Event Handlers process those events is a basis for not only learning to write scripts for Bro but for understanding Bro itself.
Gaining familiarity with the specific events generated by Bro is a big step towards building a mind set for working with Bro scripts. The majority of events generated by Bro are defined in the built-in-function files or .bif files which also act as the basis for online event documentation. Whether starting a script from scratch or reading and maintaining someone else's script, having the built-in event definitions available is an excellent resource to have on hand. Before release version 2.0 the Bro developers put significant effort into organization and documentation of every event. This effort resulted in built-in-function files organized such that each entry contains a descriptive event name, the arguments passed to the event, and a concise explanation of the functions use.
.. literalinclude:: ../../build/src/base/event.bif.bro
:language: bro
:linenos:
:lines: 4124-4149
Above is a segment of event.bif.bro showing the documentation for the event dns_request(). It's organized such that the documentation, commentary, and list of arguments precede the actual event definition used by Bro. As Bro detects DNS requests being issued by an originator, it issues this event and any number of scripts then have access to the data Bro passes along with the event. In this example, Bro passes not only the message, the query, query type and query class for the DNS request, but also a then record used for the connection itself.
The Connection Record Data Type
===============================
Of all the events defined in Bro's event.bif.bro file, an overwhelmingly large number of them are passed the connection record data type, in effect, making it the backbone of many scripting solutions. The connection record itself, as we will see in a moment, is a mass of nested data types used to track state on a connection through its lifetime. Let's walk through the process of selecting an appropriate event, generating some output to standard out and dissecting the connection record so as to get an overview of it. We will cover data types in more detail later.
While Bro is capable of packet level processing, its strengths lay in the context of a connection between an originator and a responder. As such, there are events defined for the primary parts of the connection life-cycle as you'll see from the small selection of connection-related events below.
.. literalinclude:: ../../build/src/base/event.bif.bro
:language: bro
:linenos:
:lines: 135-138,154,204-208,218,255-256,266,335-340,351
Of the events listed, the event that will give us the best insight into the connection record data type will be connection_state_remove(). As detailed in the in-line documentation, Bro generates this event just before it decides to remove this event from memory, effectively forgetting about it.