Autograph: Toward Automated, Distributed Worm Signature Detection

Hyang-Ah Kim, Carnegie Mellon University
Brad Karp, Intel Research & Carnegie Mellon University
Usenix Security 2004
Internet Worm Quarantine

Internet Worm Quarantine Techniques
  Destination port blocking
  Infected source host IP blocking
  Content-based blocking [Moore et al., 2003]
Worm Signature
Signature: A Payload Content String Specific To A Worm

Signature for CodeRed II
05:45:31.912454 90.196.22.196.1716 > 209.78.235.128.80: . 0:1460(1460) ack 1 win 8760 (DF)
0x0000  4500 05dc 84af 4000 6f06 5315 5ac4 16c4  E.....@.o.S.Z...
0x0010  d14e eb80 06b4 0050 5e86 fe57 440b 7c3b  .N.....P^..WD.|;
0x0020  5010 2238 6c8f 0000 4745 5420 2f64 6566  P."8l...GET./def
0x0030  6175 6c74 2e69 6461 3f58 5858 5858 5858  ault.ida?XXXXXXX
0x0040  5858 5858 5858 5858 5858 5858 5858 5858  XXXXXXXXXXXXXXXX
. . . . .
0x00e0  5858 5858 5858 5858 5858 5858 5858 5858  XXXXXXXXXXXXXXXX
0x00f0  5858 5858 5858 5858 5858 5858 5858 5858  XXXXXXXXXXXXXXXX
0x0100  5858 5858 5858 5858 5858 5858 5858 5858  XXXXXXXXXXXXXXXX
0x0110  5858 5858 5858 5858 5825 7539 3039 3025  XXXXXXXXX%u9090%
. . . . .
0x01a0  303d 6120 4854 5450 2f31 2e30 0d0a 436f  0=a.HTTP/1.0..Co
Content-based Blocking
[Figure: a traffic filter in front of our network drops (X) Internet traffic that matches the signature for CodeRed II]
→ Can be used by Bro, Snort, Cisco’s NBAR, ...
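As a rough illustration of content-based blocking, the sketch below drops any payload that contains a known signature byte string. The function name `matches_signature`, the example signature, and both payloads are made up for this illustration; they are not taken from Bro, Snort, or NBAR, and real filters use richer matching than a plain substring test.

```python
def matches_signature(payload, signatures):
    """Return True if any signature byte string occurs in the payload."""
    return any(sig in payload for sig in signatures)


# Illustrative signature only: a substring of a CodeRed II style request.
signatures = [b"/default.ida?XXXXXXXX"]

# Toy payloads (assumptions, not captured traffic).
worm_like = b"GET /default.ida?" + b"X" * 200 + b"%u9090 HTTP/1.0\r\n"
benign = b"GET /index.html HTTP/1.0\r\n"

dropped = [matches_signature(p, signatures) for p in (worm_like, benign)]
```

Here `dropped` marks the worm-like request for blocking and lets the benign request through.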
Signature derivation is too slow

Current Signature Derivation Process
  New worm outbreak
  Report of anomalies from people via phone/email/newsgroup
  Worm trace is captured
  Manual analysis by security experts
  Signature generation

Labor-intensive, Human-mediated
Goal
Automatically generate signatures of previously unknown Internet worms
  as accurately as possible → Content-Based Analysis
  as quickly as possible → Automation, Distributed Monitoring
Assumptions

We focus on TCP worms that propagate via scanning
  Actually, any transport
  - in which spoofed sources cannot communicate successfully
  - in which transport framing is known to the monitor

A worm’s payloads share a common substring
  Vulnerability exploit part is not easily mutable
  Not polymorphic
Outline


Problem and Motivation
Automated Signature Detection
  Desiderata
  Technique
  Evaluation
Distributed Signature Detection
  Tattler
  Evaluation
Related Work
Conclusion
Desiderata

Automation: Minimal manual intervention
Signature quality: Sensitive & specific
  Sensitive: match all worms → low false negative rate
  Specific: match only worms → low false positive rate
Timeliness: Early detection
Application neutrality
  Broad applicability
Automated Signature Generation
[Figure: an Autograph monitor observes Internet traffic entering our network and supplies signatures to the traffic filter, which drops (X) matching traffic]

Step 1: Select suspicious flows using heuristics
Step 2: Generate signature using content-prevalence analysis
S1: Suspicious Flow Selection
Reduce the work by filtering out the vast amount of innocuous flows

Heuristic: Flows from scanners are suspicious
  Focus on the successful flows from IPs that made unsuccessful connections to more than s destinations in the last 24 hours
  - Suitable heuristic for TCP worms that scan the network

Suspicious Flow Pool
  Holds reassembled, suspicious flows captured during the last time period t
  Triggers signature generation if there are more than θ flows

[Figure: Autograph (s = 2). Labels: two non-existent destinations; "This flow will be selected".]
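The selection heuristic and the pool trigger can be sketched as below. This is a minimal illustration under stated assumptions, not Autograph's implementation: the class names `ScannerDetector` and `SuspiciousFlowPool`, their methods, the parameter name `theta` for the flow-count trigger, and the example addresses are all hypothetical. Only the threshold s, the 24-hour window, the holding period t, and the "more than" comparisons come from the slide.

```python
from collections import defaultdict

WINDOW_SECONDS = 24 * 60 * 60  # "last 24 hours" from the slide


class ScannerDetector:
    """Flag sources with failed connections to more than s destinations."""

    def __init__(self, s=2):
        self.s = s
        # source IP -> {destination IP: time of most recent failed attempt}
        self.failed = defaultdict(dict)

    def record_failure(self, src, dst, now):
        self.failed[src][dst] = now

    def is_scanner(self, src, now):
        recent = [t for t in self.failed[src].values() if now - t <= WINDOW_SECONDS]
        return len(recent) > self.s


class SuspiciousFlowPool:
    """Hold flows seen during the last t seconds; trigger above theta flows."""

    def __init__(self, t, theta):
        self.t = t
        self.theta = theta
        self.flows = []  # list of (arrival time, reassembled payload)

    def add(self, payload, now):
        self.flows.append((now, payload))
        # Keep only flows captured during the last time period t.
        self.flows = [(ts, p) for ts, p in self.flows if now - ts <= self.t]
        return len(self.flows) > self.theta


detector = ScannerDetector(s=2)
for dst in ("10.0.0.1", "10.0.0.2", "10.0.0.3"):
    detector.record_failure("192.0.2.7", dst, now=100.0)

pool = SuspiciousFlowPool(t=3600, theta=1)
triggered = []
# Each tuple stands for a successfully completed flow (source, payload).
for src, payload in [("192.0.2.7", b"worm-1"), ("198.51.100.9", b"benign"), ("192.0.2.7", b"worm-2")]:
    if detector.is_scanner(src, now=200.0):
        triggered.append(pool.add(payload, now=200.0))
```

In this run only the two flows from the scanning source enter the pool, and the second of them pushes the pool past `theta`, which is where signature generation would start.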
S2: Signature Generation
Use the most frequent byte sequences across
suspicious flows as signatures
All instances of a worm have a common byte
pattern specific to the worm
Rationale
  - Worms propagate by duplicating themselves
  - Worms propagate by exploiting a vulnerability of a service
How to find the most frequent byte sequences?
Worm-specific Pattern Detection

Use the entire payload

Brittle to byte insertion, deletion, reordering
Flow 1
GARBAGEEABCDEFGHIJKABCDXXXX
Flow 2
GARBAGEABCDEFGHIJKABCDXXXXX
Worm-specific Pattern Detection
Partition flows into non-overlapping small blocks
and count the number of occurrences

Fixed-length Partition

Still brittle to byte insertion, deletion, reordering
Flow 1
GARBAGEEABCDEFGHIJKABCDXXXX
Flow 2
GARBAGEABCDEFGHIJKABCDXXXXX
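The brittleness of fixed-length partitioning can be checked directly on the two example flows. The helper `fixed_blocks` and the block size of 4 are assumptions chosen for this illustration.

```python
def fixed_blocks(payload, size):
    """Split payload into non-overlapping blocks of `size` bytes (last block may be shorter)."""
    return [payload[i:i + size] for i in range(0, len(payload), size)]


flow1 = b"GARBAGEEABCDEFGHIJKABCDXXXX"
flow2 = b"GARBAGEABCDEFGHIJKABCDXXXXX"

# Blocks that appear in both flows.
shared = set(fixed_blocks(flow1, 4)) & set(fixed_blocks(flow2, 4))
```

The single extra byte in Flow 1 shifts every later block boundary, so `shared` contains only the leading `GARB` block and the trailing `XXX` block; none of the common `ABCDEFGHIJKABCD` content lines up.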
Worm-specific Pattern Detection

Content-based Payload Partitioning (COPP)


Partition if Rabin fingerprint of a sliding window matches Breakmark → Content Blocks
Configurable parameters: content block size (minimum, average, maximum), breakmark, sliding window
Flow 1
GARBAGEEABCDEFGHIJKABCDXXXX
Flow 2
GARBAGEABCDEFGHIJKABCDXXXXX
Breakmark = last 8 bits of fingerprint (ABCD)
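A minimal sketch of content-based partitioning follows. It is an illustration under assumptions, not the authors' code: a simple polynomial hash (`window_hash`) stands in for a true Rabin fingerprint, it is recomputed for every window rather than rolled, and the names `copp`, `mask_bits`, `min_size`, and `max_size` are hypothetical. A block ends wherever the low `mask_bits` bits of the window hash equal the breakmark, subject to the minimum and maximum block sizes.

```python
def window_hash(window, base=257, mod=2**32):
    """Polynomial hash of a byte window; a stand-in for a Rabin fingerprint."""
    h = 0
    for byte in window:
        h = (h * base + byte) % mod
    return h


def copp(payload, window=4, mask_bits=8, breakmark=0, min_size=1, max_size=1024):
    """Partition payload into content blocks at content-determined cut points."""
    mask = (1 << mask_bits) - 1
    blocks, start = [], 0
    for end in range(1, len(payload) + 1):
        size = end - start
        # Only test the breakmark once a full window of bytes is available.
        at_breakmark = (
            end >= window
            and (window_hash(payload[end - window:end]) & mask) == breakmark
        )
        if (at_breakmark and size >= min_size) or size >= max_size:
            blocks.append(payload[start:end])
            start = end
    if start < len(payload):
        blocks.append(payload[start:])  # trailing bytes after the last cut
    return blocks


flow1 = b"GARBAGEEABCDEFGHIJKABCDXXXX"
flow2 = b"GARBAGEABCDEFGHIJKABCDXXXXX"
# mask_bits=2 is chosen only so that these short toy payloads get split at all.
blocks1 = copp(flow1, window=4, mask_bits=2)
blocks2 = copp(flow2, window=4, mask_bits=2)
```

Because each cut depends only on the bytes inside the window, an inserted or deleted byte disturbs only the blocks near it. With the default `min_size` and a `max_size` that is never reached, every block after the next breakmark comes out identical in both flows, which is the property that fixed-length partitioning lacks.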
Why Prevalence?
[Figure: prevalence distribution in the suspicious flow pool, from a 24-hr HTTP traffic trace. Labeled content blocks: Nimda, CodeRed2, Nimda (16 different payloads), WebDAV exploit; innocuous, misclassified.]


Worm flows dominate in the suspicious flow pool
Content-blocks from worms are highly ranked
Select Most Frequent Content Block
W: target coverage in suspicious flow pool (W ≥ 90%)
P: minimum occurrence to be selected (P ≥ 3)

Suspicious flow pool (content blocks in each flow)
  f0: C F
  f1: C D G
  f2: A B D
  f3: A C E
  f4: A B E
  f5: A B D
  f6: H I J
  f7: I H J
  f8: G I J

[Figure: histogram of content-block occurrences across the pool, redrawn after each selection]

Step 1: A is the most frequent content block → Signature: A
  Flows containing A (f2, f3, f4, f5) leave the pool
Step 2: I is the most frequent content block among the remaining flows → Signature: A, I
  Flows containing I (f6, f7, f8) leave the pool
Remaining flows: f0, f1. No remaining content block occurs at least P = 3 times, so selection stops.
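The greedy selection walked through on this slide can be sketched as follows. This is a hedged reconstruction from the slide, not the authors' code: the function name `select_signatures`, the dictionary representation of the pool, and the alphabetical tie-break are assumptions (the slide picks I over the equally frequent J but does not say why).

```python
from collections import Counter


def select_signatures(pool, w=0.90, p=3):
    """Repeatedly pick the most frequent content block as a signature.

    pool maps a flow name to the set of content blocks in that flow.
    Stops once a fraction w of the pool is covered, or when no remaining
    block occurs in at least p of the remaining flows.
    """
    remaining = dict(pool)
    signatures = []
    while remaining and 1 - len(remaining) / len(pool) < w:
        counts = Counter(block for blocks in remaining.values() for block in blocks)
        if not counts:
            break
        # Assumed tie-break: highest count first, then alphabetical order.
        block, count = min(counts.items(), key=lambda kv: (-kv[1], kv[0]))
        if count < p:
            break
        signatures.append(block)
        # Flows matched by the new signature leave the pool.
        remaining = {f: b for f, b in remaining.items() if block not in b}
    return signatures


pool = {
    "f0": {"C", "F"}, "f1": {"C", "D", "G"}, "f2": {"A", "B", "D"},
    "f3": {"A", "C", "E"}, "f4": {"A", "B", "E"}, "f5": {"A", "B", "D"},
    "f6": {"H", "I", "J"}, "f7": {"I", "H", "J"}, "f8": {"G", "I", "J"},
}
signatures = select_signatures(pool)  # ["A", "I"], as on the slide
```

With W = 90% and P = 3 the loop stops after two signatures: coverage is 7 of 9 flows, and the best remaining block (C) occurs only twice.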
Behavior of Signature Generation

Objectives
  Effect of COPP parameters on signature quality
Metrics
  Sensitivity = # of true alarms / total # of worm flows → false negatives
  Efficiency = # of true alarms / # of alarms → false positives
Trace
  Contains 24-hour HTTP traffic
  Includes 17 different types of worm payloads
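The two metrics are simple ratios. The sketch below computes them for made-up counts; the numbers are hypothetical and are not results from the talk.

```python
def sensitivity(true_alarms, total_worm_flows):
    """Fraction of worm flows that raised an alarm (1 - false negative rate)."""
    return true_alarms / total_worm_flows


def efficiency(true_alarms, total_alarms):
    """Fraction of alarms that were raised by worm flows."""
    return true_alarms / total_alarms


# Hypothetical counts, for illustration only.
worm_flows = 200
alarms = 190
true_alarms = 180

sens = sensitivity(true_alarms, worm_flows)
eff = efficiency(true_alarms, alarms)
false_negatives = worm_flows - true_alarms
false_positives = alarms - true_alarms
```

With these counts, 20 worm flows are missed and 10 alarms are false, so sensitivity is 180/200 and efficiency is 180/190.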
Signature Quality


Larger block sizes generate more specific signatures
A range of W (90-95%, workload dependent) produces a good signature
Signature Generation Speed

Bounded by worm payload accumulation speed
  Aggressiveness of scanner detection heuristic
    s: # of failed connection peers to detect a scanner
  # of payloads enough for content analysis
    θ: suspicious flow pool size to trigger signature generation

Single Autograph
  Worm payload accumulation is slow
Distributed Autograph
  Share scanner IP list
  Tattler: limit bandwidth consumption within a predefined cap

[Figure: Autograph monitors (A) at sites around the Internet, connected by tattler]
Benefit from tattler


Worm payload accumulation (time to catch 5 worms), as fraction of infected hosts

Info Sharing | Autograph Monitor | Aggressive (s = 1) | Conservative (s = 4)
None         | Luckiest          | 2%                 | 60%
None         | Median            | 25%                | --
Tattler      | All               | <1%                | 15%

Aggressive: many innocuous flows misclassified
Signature generation
  More aggressive scanner detection (s) and signature generation trigger (θ) → faster signature generation, more false positives
  With s = 2 and θ = 15, Autograph generates a good worm signature before 2% of hosts are infected
Related Work

Automated Worm Signature Detection

                       | EarlyBird [Singh et al. 2003]           | HoneyComb [Kreibich et al. 2003] | Autograph
Signature Generation   | Content prevalence → Address dispersion | Honeypot + Pairwise LCS          | Suspicious flow selection → Content prevalence
Deployment             | Network                                 | Host                             | Network
Flow Reassembly        | No                                      | Yes                              | Yes
Distributed Monitoring | No                                      | No                               | Yes

Distributed Monitoring
  Honeyd [Provos 2003], DOMINO [Yegneswaran et al. 2004]
  Corroborate faster accumulation of worm payloads/scanner IPs
Future Work

Attacks
  Overload Autograph
  Abuse Autograph for DoS attacks
Online evaluation with diverse traces & deployment on distributed sites
Broader set of suspicious flow selection heuristics
  Non-scanning worms (e.g., hit-list worms, topological worms, email worms)
  UDP worms
  Egress detection
Distributed agreement for signature quality testing
  Trusted aggregation
Conclusion


Stopping spread of novel worms requires early generation of signatures
Autograph: automated signature detection system
  Automated suspicious flow selection → Automated content prevalence analysis
  COPP: robustness against payload variability
  Distributed monitoring: faster signature generation
Autograph finds sensitive & specific signatures early in real network traces
For more information, visit
http://www.cs.cmu.edu/~hakim/autograph
Attacks

Overload due to flow reassembly
  Solutions
  - Multiple instances of Autograph on separate HW (port-disjoint)
  - Suspicious flow sampling under heavy load

Abuse Autograph for DoS: pollute suspicious flow pool
  Port scan and then send innocuous traffic
    Solution
    - Distributed verification of signatures at many monitors
  Source-address-spoofed port scan
    Solution
    - Reply with SYN/ACK on behalf of non-existent hosts/services
Number of Signatures

Smaller block sizes generate a smaller number of signatures
tattler


A modified RTCP (RTP Control Protocol)
Limit the total bandwidth of announcements sent to the
group within a predetermined cap
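The bandwidth cap can be illustrated with the scaling rule that standard RTCP uses: each member spaces its announcements in proportion to the group size, so the group total stays constant. The sketch below shows only that rule. The function name and the example numbers are assumptions, RTCP's randomization and minimum interval are omitted, and nothing here describes how tattler modifies RTCP.

```python
def announce_interval(members, avg_announcement_bytes, cap_bytes_per_sec):
    """Seconds each member waits between announcements so the group total stays within the cap."""
    return members * avg_announcement_bytes / cap_bytes_per_sec


# Hypothetical group sizes, announcement size, and cap.
interval_small = announce_interval(members=10, avg_announcement_bytes=100, cap_bytes_per_sec=500)
interval_large = announce_interval(members=100, avg_announcement_bytes=100, cap_bytes_per_sec=500)

# Aggregate rate for the larger group: 100 members, 100 bytes each, once per interval.
aggregate_bytes_per_sec = 100 * 100 / interval_large
```

Growing the group from 10 to 100 members stretches each member's interval tenfold, and the aggregate rate stays at the cap.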
Simulation Setup



About 340,000 vulnerable hosts from about 6400 ASes
  Took small size edge networks (/16s) based on BGP table of 19th of July, 2001.
Service deployment
  50% of address space within the vulnerable ASes is reachable
  25% of reachable hosts run web server
  340,000 vulnerable hosts are randomly placed.
Scanning
  10 probes per second
  Scanning the entire non-class-D IP address space
Network/processing delays
  Randomly chosen in [0.5, 1.5] seconds
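As a back-of-the-envelope reading of these parameters, the sketch below estimates how often a single infected host finds a new victim at the start of the outbreak. It assumes uniformly random probes, that every vulnerable host is reachable, and that "non-class-D" means the full IPv4 space minus 224.0.0.0/4; delays are ignored. These are simplifying assumptions for illustration, not the simulator used in the talk.

```python
VULNERABLE_HOSTS = 340_000
PROBES_PER_SECOND = 10
ADDRESS_SPACE = 2**32 - 2**28  # IPv4 minus class D (224.0.0.0/4), an assumption

# Chance that one uniformly random probe lands on a vulnerable host.
hit_probability = VULNERABLE_HOSTS / ADDRESS_SPACE

# Expected new victims per second for a single infected host.
initial_rate = PROBES_PER_SECOND * hit_probability

# Expected seconds between hits for that host.
seconds_per_hit = 1 / initial_rate
```

Under these assumptions a lone infected host needs on the order of 20 minutes to find its first victim, which is why the early phase of such an outbreak is slow.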