Worms Taxonomy and Detection Survey

Worms: Taxonomy and Detection
Mark Shaneck
3/15/2004
Outline
 Introduction
 Worm Classification
     Spreading Media
     Target Acquisition
     Polymorphic Worms
 Detection / Prevention
 Conclusion
Introduction
 Common and costly
 So far, mostly benign…
 Need to react within seconds - too fast for a human
Spreading Media
 Traditional
 Email
 Windows File Sharing
 Hybrid
Traditional
 Self-propagates through the network
 Exploits some vulnerability to automatically execute the worm payload
     Most common - buffer overflow
 Least common type in existence
 Largest potential danger
     Spreads fastest
 Main subject of detection and containment research
Email
 Spreads through email
 Relies on humans or poor application design
     Most are executable attachments
     Nimda executed automatically when previewed
 Most common form of worm
 Very hard to detect, but they spread slowly
Windows File Sharing
 Spreads through Windows file shares
 Worms don’t generally spread solely this way
     Very hard to penetrate a network perimeter this way
     Usually use other methods to penetrate the network and then this method to spread within it
Hybrid Worms
 Combination of methods
 Example: Nimda
     Spread through email
     Copied itself to open network shares (was executed if someone viewed it in Windows Explorer)
     Traditional methods
         Used subnet scanning to look for open Code Red II and Sadmind backdoors
         Exploited multiple IIS Directory Traversal vulnerabilities
         Modified web pages to cause clients to download and execute the worm payload
Hybrid Worms
 Detection difficulties
     Propagation pattern is difficult to predict since humans are involved
     If one method is blocked it might find another way in…
Target Acquisition
 Random Scanning
 Subnet Scanning
 Routing Worm
 Pre-generated Hit List
 Topological
 Stealth / Passive
Random Scanning
 A 32-bit number is randomly generated and used as the IP address
 Slammer and Code Red I
 Hits black (unallocated) IP space frequently
     Only 28.6% of IP space is allocated
Subnet Scanning
 Generate the last 1, 2, or 3 bytes of the IP address randomly
 Code Red II and Blaster
 Some scans must be completely random to infect the whole Internet
Routing Worm
 BGP information can tell which IP address blocks are allocated
 This information is publicly available
     http://www.routeviews.org/
     http://www.ripe.net/ris/
BGP Routing Worm
 By including routable prefixes in the worm payload, it can limit its scanning to allocated addresses
 Could reduce scanning space by 71.4%
 Aggregation and compression could reduce the space needed to 175 KB
 Compare
     Slammer: 376 bytes
     Blaster: 6 KB
     Nimda: 57 KB
Class A Routing Worm
 By examining BGP data you can see which Class A addresses are allocated
 Only 116 of 256 Class A addresses are publicly routable (45.3% of total IP space)
 Only 116 extra bytes are needed to cut the scanning space roughly in half
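The percentages on the two routing-worm slides follow from simple arithmetic. A minimal sketch that recomputes them, taking the 28.6% allocation figure and the 116-of-256 count from the slides as given. The variable names are illustrative, and the one-byte-per-first-octet encoding is an assumption about where the 116 bytes come from.

```python
# Recompute the scanning-space figures quoted on the slides.
ALLOCATED_FRACTION = 0.286   # share of IPv4 space that is allocated (from the slides)
ROUTABLE_CLASS_A = 116       # publicly routable Class A blocks (from the slides)
TOTAL_CLASS_A = 256          # all possible first-octet values

# BGP routing worm: everything that is not allocated can be skipped.
bgp_reduction = 1.0 - ALLOCATED_FRACTION

# Class A routing worm: only the routable first octets are scanned.
class_a_fraction = ROUTABLE_CLASS_A / TOTAL_CLASS_A

# Assumption: one byte per routable first octet is enough to store the list.
class_a_table_bytes = ROUTABLE_CLASS_A * 1

print(f"BGP routing worm skips {bgp_reduction:.1%} of the address space")
print(f"Class A routing worm scans {class_a_fraction:.1%} of the address space")
print(f"Class A table size: {class_a_table_bytes} bytes")
```

The Class A variant gives up some precision (45.3% scanned instead of 28.6%) in exchange for a table that is far smaller than the 175 KB prefix list.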
Pre-generated Hit List
 Hit list of vulnerable machines is sent with the payload
     Determined before worm launch by scanning
 Gives the worm a boost in the slow start phase
 Skips the phase that follows the exponential model
     Infection rate looks linear in the rapid propagation phase
 Can avoid detection by the early detection systems
Topological
 Uses info on the infected host to find the next target
     Morris Worm used Network Yellow Pages and the /etc/hosts file to find more hosts
     Email worms use address books
     P2P systems usually store info about the hosts they connect to
Stealth / Passive
 Waits for a vulnerable system to contact it
 Hides the infection among normal traffic
     No active scanning
 Nimda - modification of server web pages
 P2P systems - infected host could respond to requests with the worm
Polymorphic Worms
 Worms can easily be enhanced for self-modification
 Simple encryption with a random key would randomize the payload
     Small decryption routine would remain
     This could be obfuscated and randomized as well
         Random do-nothing instructions
         Random padding
 Exploit might remain common
     Nimda email - no exploit data
     Buffer overflow - return address might be the same
Detection / Prevention
 Ideal: Dynamic Quarantine and Automatic Signature Generation
 IPv6 vs. Worms
 EarlyBird
 Honeycomb
 BGP Information
 Kalman Filter
 Hidden Markov Models
 Email Worm Detection
Ideal
 Detect worm outbreak quickly
 Automatically generate signatures and filter packets immediately
 Distribute alerts and signatures faster than worms can spread
 Is this possible?
IPv6 vs. Worms
 IPv6 has 2^128 IP addresses
 Smallest subnet has 2^64 addresses
     4 billion IPv4 internets
 Consider a sub-network
     1,000,000 vulnerable hosts
     100,000 scans per second (Slammer - 4,000)
     1,000 initially infected hosts
     It would take 40 years to infect 50% of the vulnerable population with random scanning
 Scan-based worms will be ineffective
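The 40-year figure can be checked against the simple epidemic (logistic) model. The sketch below is a rough reconstruction, not necessarily how the slide's number was derived. It assumes the scan rate is per infected host and that the address space is one 2^64 subnet; the function and parameter names are hypothetical.

```python
import math

def seconds_to_infect(fraction, vulnerable, initial, scan_rate, address_space):
    """Time for a random-scanning worm to reach `fraction` of the vulnerable
    hosts under the simple epidemic model

        dI/dt = (scan_rate / address_space) * I * (N - I)

    where I is the number of infected hosts and N the vulnerable population.
    """
    growth = scan_rate * vulnerable / address_space   # exponential rate per second
    start_odds = initial / (vulnerable - initial)     # I / (N - I) at t = 0
    target_odds = fraction / (1.0 - fraction)         # I / (N - I) at the target
    return math.log(target_odds / start_odds) / growth

SECONDS_PER_YEAR = 365.25 * 24 * 3600

t = seconds_to_infect(
    fraction=0.5,
    vulnerable=1_000_000,
    initial=1_000,
    scan_rate=100_000,       # scans per second, assumed to be per infected host
    address_space=2 ** 64,   # one smallest-size IPv6 subnet
)
years = t / SECONDS_PER_YEAR
print(f"{years:.1f} years to infect 50% of the vulnerable hosts")
```

Under these assumptions the result comes out at roughly 40 years, in line with the slide.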
EarlyBird
 “Flows” are identified by packet content (or a hash of the content)
 Counters of distinct sources and destinations are kept for popular flows
 When the counts cross a threshold, the flow is considered a worm and its content is used as the signature
 Additional “guilt” can be assigned to flows sent to black address space
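The counting idea can be sketched in a few lines. This is an illustration only, not EarlyBird's implementation: it keeps exact Python sets where a real system would need memory-efficient counters, and the class name, the SHA-1 content key, and the threshold value are all assumptions made for the example.

```python
import hashlib
from collections import defaultdict

class ContentTracker:
    """Track distinct sources and destinations seen for each content hash."""

    def __init__(self, threshold=3):
        self.threshold = threshold          # hypothetical cutoff, not from the slides
        self.sources = defaultdict(set)
        self.destinations = defaultdict(set)

    def observe(self, src, dst, payload: bytes):
        """Record one packet; return the content key once it looks worm-like."""
        key = hashlib.sha1(payload).hexdigest()
        self.sources[key].add(src)
        self.destinations[key].add(dst)
        if (len(self.sources[key]) >= self.threshold
                and len(self.destinations[key]) >= self.threshold):
            return key
        return None

tracker = ContentTracker(threshold=3)
packets = [
    ("10.0.0.1", "10.0.1.1", b"same-bytes"),
    ("10.0.0.2", "10.0.1.2", b"same-bytes"),
    ("10.0.0.1", "10.0.1.1", b"ordinary request"),
    ("10.0.0.3", "10.0.1.3", b"same-bytes"),
]
alerts = []
for src, dst, payload in packets:
    flagged = tracker.observe(src, dst, payload)
    if flagged:
        alerts.append(flagged)
print(len(alerts), "content hash(es) flagged")
```

Only the content seen from three distinct sources to three distinct destinations is flagged. Repeated traffic between a single pair of hosts never crosses the threshold, however many packets it sends.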
EarlyBird
 Benefits
     Counts distinct sources and destinations
     Most systems simply examine total traffic on a particular port and look for changes in the traffic pattern
EarlyBird
 Packet content examination can be evaded with simple polymorphism
     They suggest using sampled Rabin fingerprinting to find commonly occurring fixed-length strings
     If only 4 bytes are in common for a polymorphic worm, then the packets will be identified by only 4 bytes… How to differentiate packets?
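To make the fixed-length substring idea concrete, here is a minimal sketch. It uses a polynomial rolling hash in the Rabin-Karp style as a stand-in for true Rabin fingerprints, and the base, modulus, sampling mask, and sample packets are all assumptions chosen for illustration.

```python
BASE = 257
MOD = (1 << 61) - 1     # large prime modulus for the rolling hash

def window_fingerprints(data: bytes, k: int):
    """Yield (offset, fingerprint) for every k-byte window of `data`."""
    if len(data) < k:
        return
    high = pow(BASE, k - 1, MOD)            # weight of the byte leaving the window
    h = 0
    for byte in data[:k]:
        h = (h * BASE + byte) % MOD
    yield 0, h
    for i in range(k, len(data)):
        # Drop the oldest byte, shift, and append the newest byte.
        h = ((h - data[i - k] * high) * BASE + data[i]) % MOD
        yield i - k + 1, h

def sampled(data: bytes, k: int, mask: int = 0x3):
    """Keep only fingerprints whose low bits are zero (value sampling)."""
    return {fp for _, fp in window_fingerprints(data, k) if fp & mask == 0}

def shared_fingerprints(packet_a: bytes, packet_b: bytes, k: int):
    """Fingerprints of k-byte windows that occur in both packets."""
    fa = {fp for _, fp in window_fingerprints(packet_a, k)}
    fb = {fp for _, fp in window_fingerprints(packet_b, k)}
    return fa & fb

pkt_a = b"\x13\x37random-A INVARIANT tail-1"
pkt_b = b"other-B INVARIANT \x99\x42"
print(len(shared_fingerprints(pkt_a, pkt_b, k=9)), "shared 9-byte windows")
```

Shrinking `k` to 4 shows the slide's concern: the shorter the invariant part, the more unrelated packets share a window by coincidence.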
Honeycomb
 Plugin to honeyd
 Assumption: All traffic to a honeypot is suspicious
 For every inbound packet - use longest common substring (LCS) algorithm to find a signature (after performing header analysis)
 Adds signature to the signature pool
 Periodically outputs signature pool to Snort/Bro
 Problems: Traffic to regular hosts? Polymorphism?
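The longest-common-substring step can be sketched as follows. This is a plain dynamic-programming version chosen for readability, not Honeycomb's own implementation, and the payloads and the `signature_pool` name are made up for the example.

```python
def longest_common_substring(a: bytes, b: bytes) -> bytes:
    """Return the longest contiguous byte string present in both inputs."""
    best_len, best_end = 0, 0
    prev = [0] * (len(b) + 1)       # match lengths ending at the previous byte of `a`
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                curr[j] = prev[j - 1] + 1       # extend the match ending here
                if curr[j] > best_len:
                    best_len, best_end = curr[j], i
        prev = curr
    return a[best_end - best_len:best_end]

# Two hypothetical payloads captured by a honeypot.
first = b"hdr-1|COMMON-EXPLOIT-BYTES|pad-aaaa"
second = b"h2|COMMON-EXPLOIT-BYTES|zz"

signature_pool = set()
signature_pool.add(longest_common_substring(first, second))
print(signature_pool)
```

If a polymorphic worm randomizes most of its payload, the common substring shrinks to a few bytes and stops being useful as a signature, which is the problem the slide raises.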
BGP Information
 Use black address space to watch for scans
     Will only be useful in detecting random scanning worms
 Use AS profiling to build a model of how much traffic comes from each AS and watch for drastic changes
     Will it detect in time?
Kalman Filter
 Worm propagation follows the epidemic model
[Figure: number of infected hosts I_t (×10^4, axis 0 to 10) versus time t in seconds (axis 0 to 200)]
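The shape of the epidemic curve can be reproduced with a short discrete-time simulation. This is a sketch under assumed parameters: the population size, initial infections, and `beta` are hypothetical values picked to give a readable curve, not figures from the slides.

```python
def simulate_epidemic(vulnerable, initial, beta, steps):
    """Discrete-time simple epidemic model:
    I[t+1] = I[t] + beta * I[t] * (N - I[t])
    """
    infected = [float(initial)]
    for _ in range(steps):
        current = infected[-1]
        infected.append(current + beta * current * (vulnerable - current))
    return infected

N = 100_000     # hypothetical vulnerable population
curve = simulate_epidemic(vulnerable=N, initial=10, beta=1.0e-6, steps=200)
for t in (0, 50, 100, 150, 200):
    print(t, round(curve[t]))
```

The output shows a slow start, then rapid near-exponential growth, then saturation as few vulnerable hosts remain. Raising `initial` shortens the slow start.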
Kalman Filter
 Best current system, by Don Towsley et al.
 Distribute sensors (ingress and egress filters) around the network to measure
     Scan rate
     Scan distribution
     Total number of scans
     Total number of infected hosts
 Info sent to a centralized Malware Warning Center (MWC)
Kalman Filter
[Figure: monitored illegitimate traffic rate and on-line estimation of the exponential rate α, shown for a non-worm traffic burst and for worm traffic]
Kalman Filter
 MWC uses a Kalman filter to calculate the trend in the growth
     If it matches the exponential model, it is considered a worm
 Sensors measure the info by packets sent to black IP space
 Sensors must monitor 2^20 IP addresses to get accurate information
 Can be circumvented by a hit-list or topological worm
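The trend estimation can be illustrated with a scalar Kalman filter. This is a deliberately simplified sketch, not the model used by the system described on the slides: it assumes the monitored count grows as counts[t] ≈ (1 + alpha) * counts[t-1], treats alpha as a nearly constant hidden state, and uses hypothetical noise variances.

```python
def estimate_growth_rate(counts, process_var=1e-5, measurement_var=1e-2):
    """Scalar Kalman filter tracking alpha in counts[t] ~ (1 + alpha) * counts[t-1]."""
    alpha, variance = 0.0, 1.0              # initial guess and its uncertainty
    history = []
    for prev, curr in zip(counts, counts[1:]):
        if prev <= 0:
            continue                        # growth ratio undefined without traffic
        observed = (curr - prev) / prev     # one-step measurement of alpha
        variance += process_var             # predict: alpha modelled as nearly constant
        gain = variance / (variance + measurement_var)
        alpha += gain * (observed - alpha)  # update toward the measurement
        variance *= (1.0 - gain)
        history.append(alpha)
    return history

worm_like = [100 * 1.1 ** t for t in range(50)]   # steady 10% growth per interval
flat = [100] * 50                                 # constant background scanning

worm_estimates = estimate_growth_rate(worm_like)
flat_estimates = estimate_growth_rate(flat)
print(round(worm_estimates[-1], 3), round(flat_estimates[-1], 3))
```

An estimate that settles at a stable positive value matches the exponential model. An estimate that stays at or around zero does not.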
Hidden Markov Model
 Not very useful in worm detection
 HMMs are based on changes in states
 Worm outbreaks effectively consist of two states - vulnerable and infected
 To be of use the transition to infected would need to be detected, which is basically worm detection…
Email Worm Detection
 Email Mining Toolkit (EMT) - Columbia
 Cliques - users usually send email to particular sets of users
 Assumption: If a user sends to a set that is not a subset of a clique, something is wrong
 Anomaly detection to find suspicious email to be examined in more detail
 Problems: If a user sends one broadcast email, the clique is useless. False positives.
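The clique assumption reduces to a subset test. The sketch below is illustrative only and does not use EMT's API; the function name, the addresses, and the way cliques are represented are assumptions.

```python
def is_suspicious(recipients, cliques):
    """Flag a message whose recipient set is not contained in any known clique."""
    recipient_set = frozenset(recipients)
    return not any(recipient_set <= clique for clique in cliques)

# Hypothetical cliques learned from a user's past email.
cliques = [
    frozenset({"alice@example.org", "bob@example.org", "carol@example.org"}),
    frozenset({"dave@example.org", "erin@example.org"}),
]

normal = ["alice@example.org", "carol@example.org"]
odd = ["alice@example.org", "erin@example.org"]      # spans two cliques
print(is_suspicious(normal, cliques), is_suspicious(odd, cliques))

# One broadcast email creates a clique containing everyone, after which
# no recipient set drawn from those contacts is flagged any more.
broadcast = frozenset().union(*cliques)
print(is_suspicious(odd, cliques + [broadcast]))
```

The last call shows the weakness named on the slide: a single broadcast message makes the clique test useless for that user.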
Conclusion
 Ideal in fighting worms - detection and quarantine / signature generation
 Most research focuses on early detection
 It is not clear how to protect after detection
     Is it enough to close the port?
     Ban offending IP addresses temporarily?
 Is it possible to automatically generate signatures for any worm?