Practicing Backward And Forward Tracking Hunts on A Windows Host

Xiaokui Shu and Ian Molloy · August 16, 2021 · 15 min read

In our previous blog post, we showed how to get started with the Kestrel Threat Hunting Language, such as connecting to data sources and performing your first hunts using the GET and FIND commands. In this post, we’ll introduce the APPLY keyword, which adds powerful analytics and enrichment capabilities to hunts.

We will show a Kestrel hunt performing backward and forward tracking on a Windows host to unearth the root cause and impact of activities related to an IP address. We will walk the process tree and extend analysis to network traffic and IP addresses, and we will use both pattern matching and analytic hunting steps to build the huntflow.

Table of Contents

Hunting Environment
Start From An IP
From IP to Process
Backward Tracking
Forward Tracking
Applying An Analytics

Kestrel Installation And Monitoring System Setup

If you’re unfamiliar with setting up Kestrel for your environment, start by reviewing our previous blog post, Building A Huntbook to Discover Persistent Threats from Scheduled Windows Tasks. There you will find the steps to monitor a Windows host with Sysmon, stream the log data to Elasticsearch via winlogbeat, install Kestrel in a Python virtual environment, and install and configure the STIX-shifter elastic_ecs connector to access the log data.

Entity-based Reasoning From An IP Address

Sysmon traces Windows activities and stores individual events as records in Elasticsearch. Reasoning on records directly requires extensive knowledge of record semantics, and the burden grows when querying multiple data sources with different record formats. The Kestrel runtime identifies entities in records and automatically collects all available information about them from related records before presenting the complete view to the user. Kestrel thus enables us to hunt through entities: for example, we can establish an IP address entity in a Kestrel variable from hundreds of records, each of which describes one aspect or appearance of the entity, then find all process entities connected to that IP address, and trace other related entities such as network-traffic, file, or registry-key. This entity-based approach makes it easy for a human to organize hunts and pivot from one hunt step to another, and it enables composable huntflow development.
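To make the idea of assembling one entity from many records concrete, here is a minimal Python sketch. It is not how the Kestrel runtime is implemented; the record layout, the field names (ip, dst_port, process_name), and the helper assemble_entities are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical, simplified log records: each one mentions an IP address
# and carries only part of what is known about it.
records = [
    {"ip": "72.21.81.200", "dst_port": 443, "process_name": "iexplore.exe"},
    {"ip": "72.21.81.200", "dst_port": 443, "process_name": "devenv.exe"},
    {"ip": "10.0.0.5", "dst_port": 53, "process_name": "svchost.exe"},
]


def assemble_entities(records, id_key):
    """Group records by an identifying attribute and merge their values."""
    entities = defaultdict(lambda: defaultdict(set))
    for record in records:
        entity = entities[record[id_key]]
        for key, value in record.items():
            if key != id_key:
                entity[key].add(value)
    # Convert the nested defaultdicts into plain dicts
    return {eid: dict(attrs) for eid, attrs in entities.items()}


entities = assemble_entities(records, "ip")
ip200 = entities["72.21.81.200"]
print(len(entities), sorted(ip200["process_name"]))
```

The point of the sketch is the shift in viewpoint: after grouping, the hunter works with one object per IP address rather than with each log line separately.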

In this tutorial, we will start from the IP address 72.21.81.200 and walk the associated process tree to reason about whether the IP address is suspicious.

First, let’s create a Kestrel variable starting from the IP address as an entity:

ip200 = GET ipv4-addr FROM stixshifter://host101
        WHERE [ipv4-addr:value = '72.21.81.200']
        START t'2021-04-01T00:00:00Z' STOP t'2021-04-06T00:00:00Z'

We specify the type of entity, ipv4-addr, that we would like to get from the monitored Windows host host101, and describe the matching criteria using the STIX Pattern [ipv4-addr:value = '72.21.81.200']. We limit the search to the first five days of April 2021. A time range is required in the first GET command of a huntflow to set the scope. Otherwise, STIX-shifter defaults to the past five minutes.

If you have set up multiple data sources besides host101, you can use auto-complete in Kestrel to list all available data sources by pressing TAB after stixshifter://.

After putting the command in a Jupyter cell and executing it, we get a summary for the code block: one IP address entity and 350 records/logs, each of which has some information about the IP. We need not worry about the number of records, since we are doing entity-based reasoning and Kestrel assembles the entity from the 350 records for us.

Obtaining Associated Network-Traffic And Processes

Finding hosts connecting to the IP address is good, and pinning down the processes on the hosts that communicate with the IP is even better. To achieve this, we need to connect to an endpoint monitoring system such as Sysmon mentioned in our previous blog post. From an endpoint’s view, a process creates a network-traffic that reaches a remote host or an IP address entity. We can use the FIND command to navigate through connected entities; the Kestrel runtime will generate/execute corresponding data source queries and assemble entities from returned records.

According to the relation chart in the FIND command syntax, we use the ACCEPTED BY relation to state that the IP addresses in ip200 are the destination IPs of the network-traffic returned. We then use the CREATED relation to state that the network-traffic was created by the process to be returned. The DISP command prints selected attributes of the entities in a Kestrel variable without side effects. Now let’s put all four commands together into a code block and execute it in a Jupyter Notebook cell:

# obtain network traffic that has ip200 as the destination IP
ip200nt = FIND network-traffic ACCEPTED BY ip200
DISP ip200nt ATTR src_ref.value, src_port, dst_ref.value, dst_port

# obtain processes creating the network traffic
p = FIND process CREATED ip200nt
DISP p ATTR pid, name, command_line

We get back 350 network-traffic entities in ip200nt and 34 process entities in p. The output of the first DISP command shows a partial list of the 350 network-traffic entities in ip200nt.

Is all of the network-traffic HTTPS (port 443)? If we display only the destination attributes of the entities in ip200nt, DISP will deduplicate the results before output, so it is easy to find the answer using another DISP command in a new cell:

DISP ip200nt ATTR dst_ref.value, dst_port

After executing the cell, it is clear we guessed right: all network-traffic to the IP 72.21.81.200 points to its port 443.
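The effect of projecting a few attributes and then deduplicating can be sketched in plain Python. This only mimics the behavior described above, not the actual implementation of DISP; the sample entities and the helper project_unique are hypothetical.

```python
# Hypothetical network-traffic entities; only the attribute names mirror the post.
ip200nt = [
    {"src_port": 50123, "dst_ref.value": "72.21.81.200", "dst_port": 443},
    {"src_port": 50124, "dst_ref.value": "72.21.81.200", "dst_port": 443},
    {"src_port": 50125, "dst_ref.value": "72.21.81.200", "dst_port": 443},
]


def project_unique(entities, attrs):
    """Keep only the requested attributes and drop duplicate rows, preserving order."""
    rows = [tuple(entity[a] for a in attrs) for entity in entities]
    return list(dict.fromkeys(rows))


rows = project_unique(ip200nt, ["dst_ref.value", "dst_port"])
print(rows)
```

With the source port left out, the three sample entities collapse into a single row, which is why a narrow projection answers the "is everything on port 443?" question at a glance.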

Nothing is explicitly suspicious about ip200nt, so let’s move on to the results about p from our previously executed block, where the DISP p command shows an abridged list of the 34 associated processes.

We know that all 34 processes in the Kestrel variable p connected to 72.21.81.200:443. Next, we pick the first process, BackgroundDownload.exe, to start walking the process tree and further our understanding of the activities on the Windows host.

Walking Up The Process Tree of BackgroundDownload.exe

Backward tracking is a hunting strategy to walk back the control-flow or data-flow of entities and understand their origin or provenance. The most common task for process entities is to backtrack their control-flow and walk up the process tree to check whether given processes are created by a malicious or potentially compromised process.

Let’s take a close look at the subset of processes named BackgroundDownload.exe. From their command lines we can guess they are benign and belong to Microsoft Visual Studio. Let’s check their parent processes to verify this.

# get a subset of entities from variable `p`
bgdownloads = GET process FROM p WHERE [process:name = 'BackgroundDownload.exe']

# obtain parent processes of bgdownloads
bgdp = FIND process CREATED bgdownloads
DISP bgdp ATTR pid, name, command_line

We get four BackgroundDownload.exe processes from p and three processes as their parent processes. From the parent process names and executable paths, we are more confident that the BackgroundDownload.exe processes are part of Microsoft Visual Studio and were spawned by processes from the suite.

Next, let’s see if we can trace back one level to find the grandparent processes:

# grandparent processes of `BackgroundDownload.exe`
bgdpp = FIND process CREATED bgdp

Good, we see two processes, and Kestrel also gets back some records with network activities when trying to get the most complete information of the entities—there are 146 network-traffic records related to the two processes. However, this is only a summary of the command execution, and we are not sure whether the network-traffic entities are directly or indirectly linked to the grandparent processes. Let’s print out details of the grandparent processes and the network traffic:

DISP bgdpp ATTR pid, name, command_line

bgdppnt = FIND network-traffic CREATED BY bgdpp
DISP bgdppnt ATTR src_ref.value, src_port, dst_ref.value, dst_port

Zero network activities: the 146 related network-traffic records that were cached could be indirectly associated with the processes in bgdpp, for example through their parent or child processes.

It is easy to guess that the devenv.exe in bgdp is spawned from the explorer.exe in bgdpp (likely a double click by a human user), and that the svchost.exe is spawned from services.exe. We can choose the former to verify:

bgdp_devenv = GET process FROM bgdp WHERE [process:name = 'devenv.exe']
bgdp_devenv_parent = FIND process CREATED bgdp_devenv
DISP bgdp_devenv_parent ATTR pid, name, command_line

Bingo. We can keep walking up the process chain from bgdp_devenv_parent:

# Let's go further up to pull out parent process of `bgdp_devenv_parent`.
ppp = FIND process CREATED bgdp_devenv_parent
DISP ppp ATTR pid, name, command_line

The explorer.exe process is spawned by svchost.exe -k DcomLaunch -p, which is the Windows DCOM Server Process Launcher, so this behavior is expected. The DCOM launcher is the great-grandparent of BackgroundDownload.exe, and it is one of the core Windows system services. We could stop here, but we can also try to trace back further to see how far Sysmon’s visibility reaches into the very early phases of system bootup. Of course, Sysmon can only see things after it has itself been started by a Windows service process.

# Let's see how far we can go to the origin of the process tree in sysmon
pppp = FIND process CREATED ppp
DISP pppp ATTR pid, name, command_line

OK. We just hit the limit of Sysmon, which did not log the birth of the processes in ppp.
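The repeated FIND process CREATED steps above amount to following parent links until the log runs out. Here is a minimal Python sketch of that loop, assuming a hypothetical creation_log mapping built from process-creation events; the pids are made up and the helper walk_up is not part of Kestrel.

```python
# Hypothetical process-creation log: child pid -> (parent pid, child name).
# A process whose creation was never logged has no entry here.
creation_log = {
    4000: (3000, "BackgroundDownload.exe"),
    3000: (2000, "devenv.exe"),
    2000: (1000, "explorer.exe"),
    # pid 1000 started before monitoring began, so its birth is missing
}


def walk_up(pid, creation_log):
    """Return the chain of ancestor pids, stopping where the log has no record."""
    chain = []
    seen = {pid}
    while pid in creation_log:
        parent_pid, _name = creation_log[pid]
        if parent_pid in seen:  # guard against pid reuse creating a loop
            break
        chain.append(parent_pid)
        seen.add(parent_pid)
        pid = parent_pid
    return chain


ancestors = walk_up(4000, creation_log)
print(ancestors)
```

The walk ends at the first ancestor whose own creation event is absent, which corresponds to the visibility limit of the monitor rather than to the true root of the process tree.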

Forward Tracking From iexplore.exe

After backward tracking and finding some interesting branches in the process tree, we can perform forward tracking, or walk down the tree, to check other activities from the entities. This is a common hunting strategy to understand the impacts of an entity.
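Walking down the tree is the mirror image of walking up: start from a process and follow child links. A minimal Python sketch of that traversal follows, assuming a hypothetical children mapping (parent pid to child pids) with made-up pids; descendants and leaves are illustrative helpers, not Kestrel functions.

```python
from collections import deque

# Hypothetical parent pid -> child pids mapping built from process-creation logs.
children = {
    100: [110, 120],
    110: [111],
    120: [],
}


def descendants(root, children):
    """Breadth-first walk down the tree, returning every descendant pid."""
    found = []
    queue = deque([root])
    visited = {root}
    while queue:
        pid = queue.popleft()
        for child in children.get(pid, []):
            if child not in visited:
                visited.add(child)
                found.append(child)
                queue.append(child)
    return found


def leaves(root, children):
    """Descendants that spawned no further processes."""
    return [pid for pid in descendants(root, children) if not children.get(pid)]


print(descendants(100, children), leaves(100, children))
```

The leaf processes are natural places to pivot from control-flow to other entity types, such as the network traffic they create.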

We start from the processes that talk to IP address 72.21.81.200, and we already have all such processes in variable p. Let’s list all process names in p:

DISP p ATTR name

In the last section, we found the parent processes of BackgroundDownload.exe. We could find the siblings of BackgroundDownload.exe by walking down the process tree from bgdp (the parent processes of BackgroundDownload.exe). We can also go beyond the process tree and forward track files, network-traffic, and registry-keys, and even reach other processes via files (data-flow analysis).

Let’s try a simple task: start from the iexplore.exe processes in p to (i) walk down their process tree if they fork processes, and (ii) go beyond the process tree to network activities at leaf processes.

# first walk down the tree from the iexplore processes in `p`
ie = GET process FROM p WHERE [process:name = 'iexplore.exe']
DISP ie ATTR pid, name, command_line

ie_children = FIND process CREATED BY ie
DISP ie_children ATTR pid, name, command_line

As shown in the execution summary, the Kestrel variable ie contains two process entities with pids 7356 and 11368. There is only one process in ie_children, with pid 8508. Let’s check its network-traffic as we discussed:

# second let's go beyond the process tree for network activities of the child process
ient = FIND network-traffic CREATED BY ie_children
DISP ient ATTR dst_ref.value, dst_port

Entity Enrichment With Kestrel Analytics

We find 76 network-traffic entities from the IE process with pid 8508 shown above, all of which look like web connections. However, could a malicious C&C server hide in the list? Attackers usually do not use IP addresses directly for C&C, but rather domain names created by domain generation algorithms (DGA). If we can enrich the IP addresses with their domain names, we may discover something malicious.

Enriching entities is another type of hunting step besides pattern matching, and Kestrel supports such hunting steps as analytics. A Kestrel analytics is given all records of a list of entities. It can run pre-programmed logic on the entities, check them against external threat intelligence, or match them with a pre-trained machine learning model. Finally, the analytics generates new attributes for the entities and gives them back to the Kestrel runtime to complete the enrichment.
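The shape of such an enrichment step can be sketched in a few lines of Python. This is only an illustration of "entities in, entities with extra attributes out"; it does not reproduce the Kestrel analytics interface. The lookup table, its made-up contents (documentation-range addresses and example domains), and the helper enrich_with_domain are all hypothetical.

```python
# Hypothetical lookup table standing in for an external intelligence source.
# The addresses come from documentation ranges and the names are made up.
DOMAIN_TABLE = {
    "192.0.2.10": "example.com",
    "198.51.100.7": "example.org",
}


def enrich_with_domain(entities, table):
    """Return copies of the entities with an added x_domain_name attribute."""
    enriched = []
    for entity in entities:
        updated = dict(entity)  # leave the input untouched
        updated["x_domain_name"] = table.get(entity["dst_ref.value"])
        enriched.append(updated)
    return enriched


traffic = [
    {"dst_ref.value": "192.0.2.10", "dst_port": 443},
    {"dst_ref.value": "203.0.113.9", "dst_port": 443},
]
result = enrich_with_domain(traffic, DOMAIN_TABLE)
print(result)
```

Entities with no match simply receive an empty value for the new attribute, so the enriched set keeps the same size and order as the input.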

The analytic we use here is domainnamelookup in the Kestrel analytics repository. It is a Kestrel analytic executed via the docker interface. To use the analytics, first clone the repo and go to the domainnamelookup analytics directory. Then do docker build:

$ docker build -t kestrel-analytics-enrichdomain .

The analytic is now available as enrichdomain in Kestrel. Kestrel calls analytics using the APPLY command (more details in the documentation). After typing APPLY docker://, we can press TAB to list all available analytics. If enrichdomain does not show up, restart the Jupyter kernel to re-initialize the Kestrel kernel and its analytics interface manager.

# next let's apply the analytics to all entities in `ient`.
APPLY docker://enrichdomain ON ient

# print out all attributes including the ones added/enriched by `enrichdomain`
INFO ient

Looking for attributes about domain names, we find two attributes newly added by the analytic: x_domain_name and x_domain_organization. We can now display them with DISP:

DISP ient ATTR dst_ref.value, dst_port, x_domain_name, x_domain_organization

No domain names here appear to be generated by naive DGA (examples from the article Domain Generation Algorithms – Why so effective?). Not bad to rule out a threat.
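When the list of domain names is long, eyeballing it for random-looking names gets tedious. One naive heuristic is to score the character entropy of each name, as in the Python sketch below. This is not the method of the cited article and not something Kestrel or the enrichdomain analytic provides; the functions, the threshold, and the minimum length are assumptions picked for illustration, and real DGA detection needs far more than this.

```python
import math
from collections import Counter


def label_entropy(domain):
    """Shannon entropy (bits per character) of the leftmost label of a domain name."""
    label = domain.split(".")[0].lower()
    if not label:
        return 0.0
    counts = Counter(label)
    total = len(label)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def looks_random(domain, threshold=3.5, min_length=10):
    """Flag long leftmost labels whose characters are close to uniformly spread."""
    label = domain.split(".")[0]
    return len(label) >= min_length and label_entropy(domain) >= threshold


for name in ["google.com", "xkqzjvbwpmrt.com"]:
    print(name, round(label_entropy(name), 2), looks_random(name))
```

A score like this can only prioritize names for a human to review: long legitimate names can score high, and DGAs built from dictionary words score low.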

Stretch Hunts

We hope you enjoyed this tutorial on how to use the Kestrel Threat Hunting Language to extend your searches and pivot between entity types to perform provenance tracking and impact analysis. More can be done by backward and forward tracking control-flow (through the process tree) and data-flow (through files) and applying other analytics in the hunts, and we hope to bring more powerful capabilities in future releases.

Kestrel 1.0 has only the process-executable relation for files, which is not very powerful for data-flow tracking. In the future, Kestrel will support STIX 2.1 with SROs, including universal file type support, for more powerful data-flow tracking.
In the Kestrel analytics repository, there is an example analytic SANS IP Enrichment that enriches network traffic entities with IOC information provided by SANS.
It is easy to build your own analytics, especially the ones run as docker containers. Check out the analytics template to start, and watch out for a future blog post to guide you in detail!

Until next time, happy threat hunting!
