A Multifaceted Approach to Understand the Botnet Phenomenon


Published: Internet Measurement Conference (IMC) 2006
Presented by Wei-Cheng Xiao
2016/3/29
Outline
 Introduction
 Overview of IRC-based botnets
 Data collection methodology
 Analysis results
 Related work
 Conclusion
Introduction
 Botnet:
a network of infected hosts, called bots, that are
controlled by botmasters
 The characteristic of botnets
 The command and control (C&C) channel
 Communication mechanisms
 IRC (the majority, easy to distribute)
 P2P
 HTTP
Why Choose IRC
 Supports several forms of communication
 Point-to-point, point-to-multipoint
 Supports several forms of data dissemination
 Has open-source implementations
Motivation and Goals
 Motivation
 Botnet activity is increasing, but little is known about
botnet behavior.
 Goals
 Gain a better understanding of botnets, including
 the prevalence of botnet activity
 the diversity of botnet subspecies
 the evolution of a botnet
Contributions
 The development of a multifaceted infrastructure to
capture and concurrently track multiple botnets in the
wild
 A comprehensive analysis of measurements reflecting
several important structural and behavioral aspects of
botnets
The Life Cycle of A Botnet Infection
Step 1: Exploit
 Exploit a software
vulnerability on the victim
host
 by means of worms or malicious
email attachments
Step 2: Download Bot Binary
 Execute shellcode to
download the bot binary
from a specific location
and install it
Step 3: DNS Lookup (optional)
 Resolve the domain name of
the IRC server coded in the
binary
 Using a domain name keeps the
server reachable if its IP is blocked
Step 4: Join
 Join the IRC server and C&C
channel listed in the binary
 3 types of authentication
1. Bots authenticate to join the server using passwords in the binary
2. Bots authenticate to join the C&C channel using passwords in the binary
3. Botmasters authenticate to the bot population to send commands
Step 5: Parse and Execute
Commands
 Parse commands from the
channel topic and execute
them
 The topic contains default
commands for all bots
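A minimal sketch of how a tracker might read the default commands out of a channel topic. It assumes the server announces the topic with the standard IRC RPL_TOPIC (332) numeric and that several commands are separated by "|". The separator, the sample command names in the usage note, and the function name are illustrative assumptions, not taken from the paper.

```python
def parse_topic_commands(line, separator="|"):
    """Return the commands carried in an IRC RPL_TOPIC (332) line.

    Expected shape: ":server 332 <nick> <#channel> :<topic text>"
    The separator between commands is an assumption for illustration.
    """
    _prefix, _, rest = line.rstrip("\r\n").partition(" ")
    parts = rest.split(" ", 3)
    if len(parts) < 4 or parts[0] != "332":
        return []  # not a topic reply
    topic = parts[3]
    if topic.startswith(":"):
        topic = topic[1:]  # drop the marker of the trailing parameter
    return [cmd.strip() for cmd in topic.split(separator) if cmd.strip()]
```

For example, a hypothetical line ":irc.example.invalid 332 drone1 #cc :.scan 445 | .sysinfo" would be split into the two commands ".scan 445" and ".sysinfo".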
The Overall Data Collection Architecture
The Three Main Phases
1. Malware collection
 Goal: collect bot binaries
2. Binary analysis via gray-box testing
 Goal: analyze the binaries
3. Longitudinal tracking of botnets
 Goal: track real botnets using the analysis results
Phase 1: Malware Collection
Darknet: an allocated but unused portion of the IP address space
Malware Collection
 Environment setup
 There are 14 nodes distributed across the PlanetLab testbed.
 These nodes have access to the darknet, whose IP space is located in
10 different class A networks.
 Nepenthes
 Mimics replies generated by vulnerable services to obtain shellcode
 Passes URLs found in the shellcode to the download station, which fetches
the bot binaries (why?)
 Honeynet
 Used to handle cases where Nepenthes failed
 Runs unpatched Windows XP in a VM
 VLAN
Gateway
 Route darknet traffic to Nepenthes and honeypots
 half to Nepenthes, half to honeypots
 Rotate routing among 8 class-C networks in the darknet
 Use NAT to keep the number of honeypots small
 Act as a firewall to prevent honeypots from launching outgoing
attacks and cross-infecting each other (VLAN)
 Detect and manage IRC connections
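The routing policy above can be sketched as a small decision function. The slide only states the half-and-half split and the rotation among 8 class-C networks, so everything else here is an assumption made for illustration: the prefixes are private-range stand-ins, the sinks are named "nepenthes" and "honeypot", and the rotation is modeled as flipping the assignment on every step.

```python
import ipaddress

# Hypothetical stand-ins for the 8 class-C (/24) darknet networks.
DARKNET_PREFIXES = [ipaddress.ip_network(f"10.0.{i}.0/24") for i in range(8)]


def route(dst_ip, rotation_step):
    """Pick a sink for a packet addressed to dst_ip.

    Assumed policy: prefixes alternate between the two sinks, and the
    assignment flips on every rotation step, so each sink always
    receives traffic for half of the prefixes.
    """
    address = ipaddress.ip_address(dst_ip)
    for index, prefix in enumerate(DARKNET_PREFIXES):
        if address in prefix:
            if (index + rotation_step) % 2 == 0:
                return "nepenthes"
            return "honeypot"
    return "drop"  # not darknet traffic
```

Under this assumed policy every prefix is exposed to both collectors over time, while each collector handles four of the eight prefixes at any moment.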
Phase 2: Binary Analysis (Gray-Box)
Binary Analysis
 Environment setup
 A sink (IRC server) monitors all network traffic.
 A client, a VM with a clean Windows XP installation
on which the binary is executed, is connected to
the sink.
 Two steps
 Creating network fingerprints
 Extracting IRC-related features
The Two Steps
 Creating network fingerprints (network level)
 fnet = {DNS, IPs, Ports, Scan}
 DNS: targets of any DNS requests
 IPs: destination IP addresses
 Ports: contacted ports on the server side
 Scan: whether or not IP scanning behavior is detected
 Extracting IRC-related features (application level)
 When an IRC session is detected, an IRC fingerprint is
created:
 firc = {PASS, NICK, USER, MODE, JOIN}
 fnet and firc provide enough information to join a botnet
in the wild.
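The two fingerprints can be pictured as simple records. The sketch below is one possible representation, not the authors' implementation: the class name, the field types, and the rule of keeping the first value seen for each IRC command are assumptions. It extracts firc from the lines a client sends when it opens an IRC session.

```python
from dataclasses import dataclass, field


@dataclass
class NetFingerprint:
    """fnet = {DNS, IPs, Ports, Scan} (field types are assumptions)."""
    dns: set = field(default_factory=set)    # targets of DNS requests
    ips: set = field(default_factory=set)    # destination IP addresses
    ports: set = field(default_factory=set)  # contacted server-side ports
    scan: bool = False                       # IP scanning detected?


IRC_FIELDS = ("PASS", "NICK", "USER", "MODE", "JOIN")


def extract_irc_fingerprint(client_lines):
    """Collect the first PASS/NICK/USER/MODE/JOIN parameters a client sends."""
    firc = {}
    for line in client_lines:
        command, _, params = line.strip().partition(" ")
        command = command.upper()
        if command in IRC_FIELDS and command not in firc:
            firc[command] = params
    return firc
```

A tracker holding both records knows where to connect (fnet) and what to send once connected (firc).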
Dialect
 Dialect: the syntax of botmasters’ commands and their
responses
 Learning a botnet’s dialect is required for mimicking
actual bot behavior.
 An IRC query engine plays the role of botmaster.
 Commands come from
 those observed in honeypots
 source code of publicly known bots
 The output of the querying process becomes the
template.
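A minimal sketch of what such a template could look like, assuming it is a lookup table from a command keyword to the response the real binary gave when queried. Keying on the first word of the command, and the function names, are illustrative assumptions. The slide does not describe the template's actual format.

```python
def build_template(observations):
    """Build a template from (command, response) pairs.

    The pairs are what the query engine recorded while playing botmaster.
    Assumption: the first word of a command identifies it.
    """
    template = {}
    for command, response in observations:
        words = command.split()
        if words:
            template.setdefault(words[0].lower(), response)
    return template


def answer(template, command):
    """Return the recorded response for a command, or None if unknown."""
    words = command.split()
    if not words:
        return None
    return template.get(words[0].lower())
```

Returning None for an unknown command leaves the caller free to stay silent, which is a design choice of this sketch rather than something the source specifies.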
Phase 3: Longitudinal Tracking of
Botnets
IRC Tracker (Drone):
 An IRC client that can join a real-world IRC channel.
 A drone is given firc and the template.
 Automatically answer queries based on the template
 Pretend to be a dutiful bot
 Must be intelligent enough
 Mimicry improvement
 Randomly join and leave
 Change external IP
DNS Tracking
 Most bots locate their IRC servers via DNS queries.
 Probe about 800,000 real-world DNS servers
 Query domain names of the IRC servers
 A cache hit implies one or more bots
 Shortcomings
 Not all DNS servers are probed.
 The number of hits provides only a lower bound on the number of bots
 Still useful when the broadcast feature in a botnet is
turned off.
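One common way to probe a DNS cache is to send a query with the recursion-desired (RD) bit cleared, so the server answers only from what it already holds. The sketch below builds such a query and interprets a reply header using the standard DNS wire format. Whether the study used exactly this technique is an assumption, as are the function names. Sending the packet over UDP is left out.

```python
import struct


def build_cache_probe(domain, query_id=0x1234):
    """Build a DNS A-record query with the RD bit cleared."""
    flags = 0x0000  # standard query, recursion NOT desired
    # Header: ID, flags, QDCOUNT, ANCOUNT, NSCOUNT, ARCOUNT
    header = struct.pack("!HHHHHH", query_id, flags, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in domain.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN


def is_cache_hit(response):
    """Treat a successful reply that carries answers as a cache hit."""
    if len(response) < 12:
        return False
    _id, flags, _qd, ancount, _ns, _ar = struct.unpack("!HHHHHH", response[:12])
    is_response = bool(flags & 0x8000)  # QR bit
    rcode = flags & 0x000F
    return is_response and rcode == 0 and ancount > 0
```

Counting the servers for which is_cache_hit returns True gives the number of hits, which, as noted above, is only a lower bound on the number of bots.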
Data collected
 Collection started on Feb. 1st, 2006, and includes
 Traffic traces over the span of 3 months
 IRC logs over the span of 3 months, covering
data from more than 100 botnet channels
 Results of DNS cache hits from tracking 65 IRC
servers on 800,000 DNS servers for more than 45
days
Botnet Traffic Share
 27% of SYNs are from
known botnet spreaders.
 76% of SYNs are directed to
the ports those spreaders target.
 The two curves reveal
similar traffic patterns.
 This is a lower-bound
estimate.
Botnet Prevalence: A Global Look
 About 85,000 DNS servers are involved in at least
one botnet activity.
Botnet Spreading Patterns
 Two types of botnets:
 Type-I: fixed scanning
algorithm
 Type-II: variable
scanning algorithm
 Out of 192 IRC bots, 34
are Type-I.
Summary of Type-II scanning practices
Botnet Growth Patterns
Predominant Botnet Structures
1. Single IRC server (70%)
 Prevalent among small botnets
2. Multiple IRC servers, bridged botnet (30%)
 25% of which are publicly known servers
3. A botmaster controls multiple botnets
4. Some botnets migrate
Effective Botnet Sizes and
Botnet Lifetime
 Effective size: the number of bots online at a given time
 The observed effective size was much smaller than
the footprint.
 Bots usually stay connected for only about 25 minutes.
 May be due to client unavailability
 More likely, botmasters ask them to leave.
 Botnets, however, have long lifetimes
 84% of IRC servers were still up at the end of the study.
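The gap between the two size measures can be illustrated with a log of channel joins and leaves. In this sketch the footprint is taken to mean the number of distinct bots seen so far, which the slide does not define, and the event format and function name are assumptions for illustration.

```python
def footprint_and_effective_size(events, at_time):
    """Return (footprint, effective size) as of at_time.

    events: iterable of (timestamp, bot_id, action) tuples, where action
    is "join" or "leave". Footprint here means every distinct bot seen
    up to at_time; effective size counts only bots still online.
    """
    seen = set()
    online = set()
    for timestamp, bot_id, action in sorted(events):
        if timestamp > at_time:
            break
        if action == "join":
            seen.add(bot_id)
            online.add(bot_id)
        elif action == "leave":
            online.discard(bot_id)
    return len(seen), len(online)
```

When bots stay for only a short time, the online set keeps shrinking while the set of bots ever seen only grows, which is why the effective size ends up much smaller than the footprint.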
Botnet Software Taxonomy
Related Work
 Botnet Tracking: Exploring a Root-Cause
Methodology to Prevent DoS Attacks. ESORICS,
2005
 Introduces the idea of using honeypots and
active responders to analyze botnet behavior
 Scalability, Fidelity, and Containment in the
Potemkin Virtual Honeyfarm. ACM SIGOPS, 2005
 A very useful tool for botnet detection, but not
appropriate for long term botnet tracking
Conclusion
 A multifaceted approach is proposed to
understand the botnet phenomenon.
 The results show that botnets are a major contributor
to unwanted network traffic.
 The scanning patterns of botnets are quite
different from those of autonomous malware.
 The effective sizes of botnets are much smaller than
their footprints.