mirror of
https://github.com/zeek/zeek.git
synced 2025-10-02 06:38:20 +00:00

This is based on commit 2731def9159247e6da8a3191783c89683363689c from the zeek-docs repo.
====================
Monitoring With Zeek
====================

Detection and Response Workflow
===============================

As noted in the previous sections, Zeek is optimized, more or less “out of the
box,” to provide two of the four types of network security monitoring data.
Without any major configuration, Zeek offers transaction data and extracted
content data, in the form of logs summarizing protocols and files seen
traversing the wire. Zeek can also provide some degree of alert data in the
form of notices, and analysts can modify Zeek to create custom alerts if
desired. A dedicated intrusion detection engine like Suricata or Snort might be
more appropriate, however. Finally, Zeek does not collect full content data in
pcap format, although other open source projects do provide that functionality.

Broadly speaking, incident detection and response begins with the collection of
security data, followed by its analysis. In the analysis phase, in the absence
of an explicit alert of malicious activity, investigators can work two broad
investigative categories: “matching” and “hunting.” Matching means querying and
reviewing security data for signs of known indicators of compromise. Hunting
means working without indicators of compromise, relying instead on creating a
hypothesis of how adversary activity might manifest in security data. Matching
is the sort of activity that can be easily automated. Hunting is an activity
that is difficult to automate because it relies upon the creation of a cyber
security “experiment” to yield results and often a little bit of human
intuition.

In the common vernacular, some security teams believe hunting involves querying
data for indicators of compromise. That is really just a search function, i.e.,
looking for matches of “expected bad” in collected data. True hunting involves
more of a scientific method that requires formulating a hypothesis, testing the
hypothesis in sample and production data, and then refining the process until
it yields results or is disproved. Investigative methods which yield results
can then be automated as matching activities.

Zeek data plays a role in both matching and hunting operations. Analysts may
query a store of Zeek transaction logs for indicators of compromise, and begin
a security investigation when they see a match on an IP address, a username, an
HTTP user-agent string, or any single element or combination of the hundreds of
elements Zeek derives from network traffic. Analysts can also pose a hypothesis
of how certain adversary behavior may appear in Zeek data, and then query that
data for signs that prove or disprove their hypothesis.

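As a rough illustration of the matching case, the following Python sketch
checks a handful of records against a small set of indicators. The field names
(``uid``, ``id.orig_h``, ``id.resp_h``, ``user_agent``) follow Zeek's
``http.log``, but the indicator values, the sample records, and the
``match_indicators`` helper are hypothetical and exist only for this example.

```python
# Hypothetical indicators of compromise; real values would come from a
# threat intelligence source.
IOC_IPS = {"203.0.113.7"}
IOC_USER_AGENTS = {"EvilBot/1.0"}


def match_indicators(records):
    """Yield (record, reason) pairs for records that match a known indicator."""
    for rec in records:
        # Check both endpoints of the connection against the IP indicators.
        for field in ("id.orig_h", "id.resp_h"):
            if rec.get(field) in IOC_IPS:
                yield rec, f"{field}={rec[field]}"
        # Check the HTTP user-agent string, if the record has one.
        if rec.get("user_agent") in IOC_USER_AGENTS:
            yield rec, f"user_agent={rec['user_agent']}"


# Hypothetical records shaped like entries from Zeek's http.log.
sample = [
    {"uid": "C1", "id.orig_h": "192.168.1.10", "id.resp_h": "198.51.100.5",
     "user_agent": "Mozilla/5.0"},
    {"uid": "C2", "id.orig_h": "192.168.1.11", "id.resp_h": "203.0.113.7",
     "user_agent": "Mozilla/5.0"},
    {"uid": "C3", "id.orig_h": "192.168.1.12", "id.resp_h": "198.51.100.9",
     "user_agent": "EvilBot/1.0"},
]

hits = [(rec["uid"], reason) for rec, reason in match_indicators(sample)]
print(hits)
```

Because the check is a plain set lookup per record, it is the kind of work that
automates easily, in contrast to hunting, where the query itself is the
hypothesis under test.
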
Beyond the matching and hunting paradigms, analysts can use Zeek within an
“incident detection alert” workflow. In this scenario, an IDS creates an alert
that catches the attention of a security team member. Because IDS alerts are
often light on details, analysts require corroborating data to decide if the
alert represents normal, suspicious, or malicious activity. Analysts can
“pivot” from the IDS alert to a variety of logs generated by Zeek. If the IDS
alert provides the community identification (community ID) supported by Zeek,
the analyst can easily tie the IDS alert to specific Zeek logs. Based on the
data provided by Zeek, analysts may be able to resolve the incident. At the
very least, the analyst can accelerate the alert validation and verification
process by having access to data beyond the initial IDS notification.

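The pivot can be pictured as a simple join on the community ID value. The
sketch below assumes that both the IDS alert and the Zeek records have already
been loaded as dictionaries carrying a ``community_id`` key. The helper names
and the sample values are hypothetical, and the ID strings are placeholders
rather than real community ID hashes.

```python
from collections import defaultdict


def index_by_community_id(zeek_records):
    """Group Zeek log records by their community ID value."""
    index = defaultdict(list)
    for rec in zeek_records:
        cid = rec.get("community_id")
        if cid:
            index[cid].append(rec)
    return index


def pivot(alert, index):
    """Return every Zeek record that shares the alert's community ID."""
    return index.get(alert.get("community_id"), [])


# Placeholder IDs for illustration only; these are not real hashes.
zeek_records = [
    {"log": "conn", "uid": "C1", "community_id": "1:placeholder-flow-a"},
    {"log": "http", "uid": "C1", "community_id": "1:placeholder-flow-a"},
    {"log": "conn", "uid": "C2", "community_id": "1:placeholder-flow-b"},
]
alert = {"signature": "hypothetical IDS rule", "community_id": "1:placeholder-flow-a"}

related = pivot(alert, index_by_community_id(zeek_records))
print([(rec["log"], rec["uid"]) for rec in related])
```

In practice the same join would usually be expressed as a query in whatever
platform stores the logs. The point is only that a shared flow identifier
removes the need to correlate on timestamps and address tuples by hand.
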
Finally, analysts can use Zeek data to improve the validation process when
prompted by any other external stimulus. For example, an analyst might notice
an odd process running on a system, as reported by their endpoint detection and
response (EDR) or anti-virus agent. Alternatively, an analyst might receive a
report from a user or a peer involving suspicious activity on an
Internet-facing Web server. In either case, the analyst with access to Zeek
data can seek to learn all they can about the systems in question, simply by
querying the repository storing their Zeek logs. This security design pattern
has immense benefits, as it does not affect the end state of the suspicious
asset. Not touching a system that may be compromised has two benefits. First,
an intruder who has compromised the asset remains unaware that the security
team is investigating it. Second, the forensic integrity of the asset remains
intact, as the analyst is working with logs stored off-device.

Instrumentation and Collection
==============================

Zeek is designed to watch live network traffic. Although Zeek can process
packet captures saved in PCAP format, most users deploy Zeek to gain
near-real-time insights into network usage patterns. Administrators run Zeek
by telling it to “sniff” one or more network interfaces, generating
transaction logs, insights, and extracted file contents, based on the network
traffic seen on those network interfaces.

Some users may choose to run Zeek on a single computer used for general
computing purposes, watching network traffic to and from that single
computer. That system might be an office laptop used for business purposes,
chosen for experimentation with Zeek. This is a simple way to become familiar
with the logs that Zeek creates. This approach is similar to running Tcpdump
or Wireshark on one’s computer for the same educational purposes.

Most users, however, run Zeek on a computer selected solely for the purpose
of network security monitoring. Security personnel call that computer a
“sensor” and they select, configure, and deploy it specifically to watch
network traffic. They select a location in an environment that offers
visibility to multiple computers, and deploy the sensor with Zeek to
instrument that network segment.

When choosing a place to deploy a sensor, users will likely prioritize a
requirement like the following:

Identify a single location in the network to instrument with a network tap or
switch span port that provides the maximum visibility. This means seeing
traffic from all devices on the network, with a strong preference for
identifying devices by observing them with their original source IP address.

Users new to Zeek may choose to try Zeek in their home or in a small office
environment. Figure 1 depicts the standard SOHO network architecture. Letters
A-D are possible monitoring locations, to be discussed below.

.. figure:: /images/collection-figure1.png

   Figure 1: Standard SOHO Architecture

Most home users and many small office environments are connected to the
Internet via customer premises equipment (CPE) provided by their Internet
service provider (ISP). This box may or may not be available or visible to
the customer. In the context of a system like Verizon FIOS, for example, the
ISP CPE is the box attached to the outside of a residence, with a warning
that only Verizon technicians should open it. For fiber connectivity, the ISP
might call this device an Optical Network Terminal or ONT.

The ISP also provides a gateway device that provides routing and wireless
access point (WAP) functionality. This is the piece of equipment familiar to
most home and small office users. It typically has a gigabit copper Ethernet
connection that connects to the ISP CPE, on its wide area network (WAN) side,
and four gigabit copper Ethernet ports for devices on its local area network
(LAN) side. Customer devices gain network access via WiFi to the ISP WAP or
via copper Ethernet cables to the embedded switch on the same device.

On the WAN side of the router, the device usually has a public IP address
provided by the ISP. This may not necessarily be the case, however. On the
LAN side of the router, the device provides RFC 1918 private addresses, often
in the 192.168.0.0/16 subnet. The router acts as a gateway, using network
address translation (NAT), or for the more strictly minded, network address
port translation (NAPT), so that client devices share a single IP address
provided by the ISP. (Note that in some situations, multiple residences even
share the same public IP address, and differentiate between each other via
the port range. We'll not consider this further for now, as it is extraneous
to the discussion.)

Where does one monitor, given this architecture?

Location A is off limits to the customer. It is likely a cable exiting the
ISP CPE and entering the ground.

Location B is a possibility, assuming the cable between the ISP CPE and
router is a copper Ethernet cable. One could insert a reliable network tap
(typically outside the home user’s budget) or a decent small managed switch
with a span port (like a Netgear GS30Xe model).

However, and this is crucial: because of the NAT done by the router, all
traffic will appear to originate from a single IP address. Whether the
customer has 100 devices or 1 device, they will all share the single IP
address. This reality makes it much more difficult for a security analyst to
track down the originator of suspicious or malicious network traffic.

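A toy model makes the loss of visibility concrete. The sketch below rewrites
the source address of each flow the way a NAT gateway would, then counts the
distinct sources visible on each side. It is deliberately simplified: port
translation is ignored, the addresses come from documentation ranges, and the
``apply_source_nat`` helper is hypothetical.

```python
def apply_source_nat(flows, public_ip):
    """Rewrite each flow's source address to the router's public address.

    Simplification: a real gateway also rewrites source ports and keeps a
    translation table; only the address rewrite matters for this example.
    """
    return [(public_ip, dst) for _src, dst in flows]


# Flows as (source, destination) pairs, as seen on the LAN side of the router.
lan_flows = [
    ("192.168.1.10", "198.51.100.5"),
    ("192.168.1.11", "198.51.100.5"),
    ("192.168.1.12", "203.0.113.7"),
]

# The same flows as seen between the router and the ISP CPE (location B).
wan_flows = apply_source_nat(lan_flows, "192.0.2.1")

sources_on_lan = {src for src, _dst in lan_flows}
sources_on_wan = {src for src, _dst in wan_flows}
print(len(sources_on_lan), len(sources_on_wan))
```

A sensor placed after translation sees one originator regardless of how many
devices sit behind the router, which is why the monitoring location matters
more than the monitoring tool.
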
Location C is essentially not possible. Yes, there are various penetration
testing tools and wireless network troubleshooting tools that can try to
access WiFi traffic. However, they do not expose the traffic in a form usable
to security analysts, assuming that the WiFi protocols in use are at all
modern.

Location D is a possibility, assuming that the user installs a network tap or
switch span port as in location B. However, monitoring only at location D
would ignore WiFi traffic.

In other words, the standard SOHO network architecture is not well-suited for
network security monitoring, because there isn’t a good place, by default, to
see the originating IP addresses, which are generally needed to investigate
suspicious and malicious activity.

In contrast, the Visible Network Architecture shown in Figure 2 depicts the
sort of setup one needs if visibility is designed into the architecture,
rather than added as an afterthought.

.. figure:: /images/collection-figure2.png

   Figure 2: Visible Network Architecture

The major changes include the following:

- The ISP router is no longer also acting as a WAP. The WiFi capability is
  disabled. No other changes are required on the router. Strictly speaking,
  WiFi need not be disabled, so long as no one uses it.

- The customer has purchased her own router. That device may or may not also
  provide NAT.

- The customer explicitly owns a switch, to which wired devices may connect.
  That switch has a span port.

- The customer explicitly owns her own wireless access point, acting as a
  bridge, and not offering NAT.

Don’t be fooled into thinking that one need only buy a new combination
router/WAP. It’s essential to split these functions. Consumer-grade routers
do not offer span ports, whereas inexpensive consumer-grade network switches
do. This architecture takes advantage of that fact in order to provide
suitable monitoring locations.

Let’s review the options.

Location A is still off-limits.

Location B is still a bad idea.

Location C is a good option, if one places a network tap here, or another
small switch with a span port, and neither the customer router nor customer
WAP is doing NAT.

Location D is a better option. Now one need only ensure that the customer WAP
is not doing NAT. In fact, one need not introduce another switch or tap here,
assuming one can span the uplink port on the customer switch.

Location E would only see wired devices, and is not a good option because it
ignores WiFi devices.

Location F would only see WiFi devices, and is not a good option because it
ignores wired devices.

Location G is essentially impossible, as with location C in Figure 1.

The bottom line is that location D is the best monitoring location,
assuming that the customer WAP is not doing NAT. If the customer WAP is
acting as a router with NAT, then all of the wireless devices will have the
same source IP address as seen in location D.

In an architecture designed for visibility, introducing a network tap, or
simply spanning the uplink from the network switch, at point D, satisfies the
visibility requirement.

It is possible to simplify the architecture shown in Figure 2 to that which
follows:

.. figure:: /images/collection-figure3.png

   Figure 3: Simplified Visible Network Architecture

The customer router between monitoring points C and D is gone, as one can
rely upon the ISP router if so desired.

In summary, one could deploy a Zeek sensor at location D, or at location C if
the simplified architecture is in place, as C and D are logically similar.
Going forward, we'll discuss monitoring at location D.

Gaining access to traffic at point D requires either a span port to be
enabled on the customer switch, or a network tap to be deployed at location
D. Professional Zeek users prefer high-quality, powered network taps wherever
possible, for a variety of reasons. When they are not available, as in the
case of a SOHO or test environment, then a span port on a managed switch is
an acceptable alternative.

Once the network tap or span port is providing network traffic to the Zeek
sensor, one can turn to matters beyond instrumentation and collection.

Storage and Review
==================

As Zeek ingests network traffic, either by monitoring one or more live
network interfaces or by processing stored traffic in a capture file, it
creates a variety of logs and other artifacts. By default, Zeek writes that
data to a storage location designated via its configuration files. Zeek
possesses the capability to write the logs in several formats and perform
certain log management processes like compression and archiving.

Analysts make use of Zeek data by reviewing the logs it generates. Review
methods can be as simple as using text processing tools packaged with the
underlying operating system. Depending on the format of the logs, users may
apply more specialized processing tools, some of which are available with
Zeek. In many cases, Zeek administrators ship logs to specialized storage and
review applications. These are usually referred to collectively as Security
Information and Event Management (SIEM) platforms. Some of these log
management and SIEM platforms are available as open source offerings, while
others are commercially available.

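For readers who want to see what simple review looks like in practice, here is
a minimal Python sketch that reads Zeek's tab-separated log format. It assumes
the default tab separator and relies on the ``#fields`` header line to name
the columns. The ``parse_zeek_tsv`` helper and the sample log content are
hypothetical, and the sketch does not handle custom separators, set or vector
values, or type conversion.

```python
def parse_zeek_tsv(lines):
    """Turn the lines of a tab-separated Zeek log into a list of dicts."""
    fields = []
    records = []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("#fields"):
            # The header names every column, in order.
            fields = line.split("\t")[1:]
        elif line.startswith("#") or not line:
            # Skip other metadata lines (#separator, #types, #close, ...).
            continue
        else:
            records.append(dict(zip(fields, line.split("\t"))))
    return records


# Hypothetical, heavily trimmed log content for illustration.
sample_log = [
    "#separator \\x09",
    "#fields\tts\tuid\tid.orig_h\tid.resp_h\tproto",
    "#types\ttime\tstring\taddr\taddr\tenum",
    "1600000000.000000\tCexample1\t192.168.1.10\t198.51.100.5\ttcp",
    "#close\t2020-09-13-12-26-40",
]

records = parse_zeek_tsv(sample_log)
print(records[0]["id.orig_h"], records[0]["proto"])
```

All values stay as strings here. A fuller reader would use the ``#types`` line
to convert timestamps, ports, and counts, which is one reason many deployments
prefer a structured output format or a dedicated tool when shipping logs to a
SIEM.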