ARO_botnet - Computer Science Division


High-Performance Network
Anomaly/Intrusion Detection &
Mitigation System (HPNAIDM)
Yan Chen
Department of Electrical Engineering and
Computer Science
Northwestern University
Lab for Internet & Security Technology (LIST)
http://list.cs.northwestern.edu
Current Intrusion Detection Systems
• Mostly not scalable to high-speed networks
– Slammer worm infected 75K machines in <10 minutes
– Host-based schemes are inefficient & user dependent
• Statistical detection is unscalable for flow-level detection
• Mostly simple signature-based
– Cannot detect unknown and polymorphic attacks
• Cannot differentiate malicious events from unintentional anomalies
High-Performance Network
Anomaly/Intrusion Detection and
Mitigation System (HPNAIDM)
• Online traffic recording
[SIGCOMM IMC 2004, IEEE INFOCOM 2006, ToN to appear]
– Reversible sketch for data streaming computation
– Record millions of flows (GBs of traffic) in a few hundred KB
– Infer the key (e.g., src IP) even when it is not directly recorded
• Online sketch-based flow-level anomaly detection
[IEEE ICDCS 2006] [IEEE CG&A, Security Visualization 06]
– As a first step, detect TCP SYN flooding and horizontal and vertical scans, even when mixed
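To make the "millions of flows in a few hundred KB" idea concrete, here is a minimal sketch of a count-min sketch in Python. This is a simpler, non-reversible relative of the reversible sketch named above, not the authors' data structure: it can estimate the count for a key you already know, but it cannot infer keys. The class name, table dimensions and hash choice are all assumptions made for illustration.

```python
import hashlib


class CountMinSketch:
    """Fixed-size counter table that approximates per-key counts."""

    def __init__(self, depth=4, width=1024):
        self.depth = depth
        self.width = width
        self.rows = [[0] * width for _ in range(depth)]

    def _index(self, row, key):
        # One hash function per row, derived by salting with the row number.
        digest = hashlib.blake2b(
            key.encode(), digest_size=8, salt=bytes([row])
        ).digest()
        return int.from_bytes(digest, "big") % self.width

    def add(self, key, count=1):
        for row in range(self.depth):
            self.rows[row][self._index(row, key)] += count

    def estimate(self, key):
        # Collisions can only inflate a counter, so the minimum over the
        # rows is the tightest available estimate (never an undercount).
        return min(
            self.rows[row][self._index(row, key)] for row in range(self.depth)
        )
```

Memory use depends only on depth and width, not on the number of distinct flows, which is the property the slide relies on. Recovering the heavy keys themselves requires the extra structure of the reversible sketch.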
HPNAIDM (II)
Integrated approach for false positive
reduction
• Polymorphic worm detection (Hamsa)
[IEEE Symposium on Security and Privacy 2006]
• Accurate network diagnostics
[ACM SIGCOMM 2006]
• Scalable and robust distributed intrusion
alert fusion with DHT
[ACM SIGCOMM Workshop on Large Scale Attack
Defense 2006]
HPNAIDM Architecture
[Figure: block diagram of the HPNAIDM architecture.
Part I, sketch-based monitoring & detection: streaming packet data; reversible sketch monitoring; filtering; local sketch records (sent out for aggregation); remote aggregated sketch records; sketch-based statistical anomaly detection (SSAD).
Part II, per-flow monitoring & detection: per-flow monitoring; signature-based detection; polymorphic worm detection (Hamsa); network fault diagnosis; intrusion or anomaly alarms.
Flow labels: normal flows, suspicious flows, keys of normal flows, keys of suspicious flows.
Legend: data path, control path, modules on the critical path, modules on the non-critical path.]
IRC-based Botnet Detection on
Routers
Trend on Botnets
• Total infected bot hosts 800,000 - 900,000
[CERT CA-2003-08]
• Symantec identified an average of about
10,000 bot infected computers per day [Mar.
2006 Internet Security Threat Report]
• # of Botnets - increasing
• Bots per Botnet - decreasing
– Used to be 80k-140k, now 1000s
• More firepower:
– Broadband (1Mbps Up) x 100s = OC3
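The firepower claim above is simple arithmetic, sketched here in Python. The OC-3 line rate of 155.52 Mbit/s is the standard figure. The function name is hypothetical, and the assumption that every bot sustains its full 1 Mbps uplink is an idealization.

```python
import math

OC3_MBPS = 155.52  # OC-3 line rate in Mbit/s


def bots_to_fill(link_mbps, uplink_mbps_per_bot=1.0):
    """Number of bots whose combined uplink reaches the given link rate."""
    return math.ceil(link_mbps / uplink_mbps_per_bot)


print(bots_to_fill(OC3_MBPS))  # 156
```

So a botnet of only a few hundred broadband hosts is enough to saturate an OC-3 link, which is consistent with the trend toward smaller botnets.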
Geographical Distribution of Bots
Note that this doesn’t reflect where the attackers are.
Trend on Botnets II
• Distribution of Command and Control servers
– Top 3: USA (48%), South Korea (9%) and Canada (6%)
• US also experienced the highest percentage of growth in bot-infected computers
– The number of bot-infected computers increased by 39% in the second half of 2005
– Wide adoption of broadband?
• Bot-related malicious code reported to
Symantec accounted for 20% of the top 50
malicious code reports, up from 14%.
Problem Definition
For an ISP or enterprise network operator monitoring at the edge router/gateway: how can a botnet server/channel be detected, even when its traffic is encrypted?
• Identify the attacker
• Disable botnets
[Figure: an edge network connected to the Internet, with the question "botnet server/channel?"]
Existing Work on Botnet Detection
• Mostly honeypot-based approaches
– Trap bots and analyze their behavior
– E.g., Honeynet project, U Michigan [SRUTI 05]
– Hard to generate traffic signatures for network detection
• Identify botnet channel
– Assuming the IRC traffic is already known, look for channels in which the majority of hosts perform TCP SYN scans [SRUTI 06]
– Hard to differentiate from P2P & game traffic
– What about bots with emerging infection schemes (SMTP)?
Existing Work on Botnet Detection II
• IDS-based approaches such as Snort
– Use port numbers and keywords (e.g., PRIVMSG, lsass, NICK)
– High false positive and/or false negative rates
– E.g., what about an encrypted bot channel?
– Complementary to our approach
Our Approach
Two steps:
• Separate IRC traffic from normal
traffic
• Identify botnet traffic in the IRC
traffic
Separating IRC Traffic from Other
Traffic
• Key characteristic: relay (broadcast)
– Upon an incoming packet of size x, broadcast a
packet to one or many different IPs (with packet
size similar to x)
• Packet size: median packet size < 100B
• Duration: average life time 3.5 hours
• Port numbers: 6667, 6668, 6669, 7000, 7514
– But IRC/botnets can also run on non-standard
ports
• Combine all these
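One way the indicators above could be combined is a simple vote count, sketched here in Python. The port list and the 100-byte median come from the slide. Everything else is an assumption made for illustration: the `FlowStats` fields, the one-hour duration cutoff, the 0.5 relay ratio and the minimum score of 3.

```python
from dataclasses import dataclass

IRC_PORTS = {6667, 6668, 6669, 7000, 7514}  # ports listed on the slide


@dataclass
class FlowStats:
    server_port: int
    median_packet_size: float  # bytes
    duration_s: float          # session lifetime in seconds
    relay_ratio: float         # share of inbound packets re-sent to other
                               # IPs at a similar size (hypothetical metric)


def irc_score(flow, min_duration_s=3600.0, min_relay_ratio=0.5):
    """Count how many of the four IRC indicators a flow matches (0 to 4)."""
    indicators = [
        flow.relay_ratio >= min_relay_ratio,  # relay (broadcast) behaviour
        flow.median_packet_size < 100,        # small packets
        flow.duration_s >= min_duration_s,    # long-lived session
        flow.server_port in IRC_PORTS,        # well-known port
    ]
    return sum(indicators)


def looks_like_irc(flow, min_score=3):
    """Assumed rule: any three indicators suffice, so the port is optional."""
    return irc_score(flow) >= min_score
```

Requiring three of four indicators means a server on a non-standard port can still be flagged, which matches the caveat about ports on the slide.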
Preliminary Analysis
• IRC traffic observed at a university edge router
– Mostly packet headers with limited payload
– Collected in April, 2006
Data size: 378M
Duration: 5 days
# of packets: 925390
Mean packet length (bytes): 164
# of sessions: 664
Mean session duration (s): 12591
# of IRC servers: 54
# of IRC clients: 39
Counts by IRC message type:
PING: 39638
PONG: 39672
PRIVMSG: 241591
JOIN: 41905
QUIT: 34439
ISON: 15129
WHO: 9144
MODE: 6804
CDF of Session Durations
Packet Length Distribution
• Large packets caused by membership listing
These Metrics Are Not Enough!
Online games and P2P systems show the same characteristics:
• Relay broadcast
– Game updates, query broadcast from supernodes, e.g., Gnutella (not for all P2P systems)
• Small average packet size
– FPS (first-person shooter), e.g., Counter-Strike
» All UDP packets, with packet sizes distributed over 40~120B
– RTS (real-time strategy), e.g., Warcraft III
» TCP packets with extremely small payloads, 5~10B
– P2P supernodes only broadcast small query packets, without real file transfer
• Long session durations
Additional Characteristics for IRC
Traffic
• IRC traffic is usually generated by human typing or by bots reporting command execution
– Low packet frequency and throughput per IP
– Key differentiator from RTS games
» Each game client sends out at least 5~10 packets per second
• Still, what about P2P?
– Existing traffic studies do not have the answer
» Transport layer identification of P2P traffic [IMC 04] uses port # to separate IRC traffic
» Study of Internet chat systems [IMC 03] uses port # and keywords to identify IRC traffic
– Our approach: complement with active probing
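The packet-rate differentiator above can be sketched as a per-IP rate computation in Python. The 5 packets/s cutoff is taken from the RTS figure quoted on the slide. The function names and the one-second minimum span are assumptions made for illustration.

```python
from collections import defaultdict


def packets_per_second(packets):
    """packets: iterable of (timestamp_s, src_ip). Returns {src_ip: mean packets/s}."""
    first, last, count = {}, {}, defaultdict(int)
    for ts, ip in packets:
        first[ip] = min(ts, first.get(ip, ts))
        last[ip] = max(ts, last.get(ip, ts))
        count[ip] += 1
    rates = {}
    for ip, n in count.items():
        # Clamp the span to one second so a single burst does not divide by ~0.
        span = max(last[ip] - first[ip], 1.0)
        rates[ip] = n / span
    return rates


def typing_rate_clients(packets, max_pps=5.0):
    """IPs whose mean rate stays below the RTS floor of 5 packets/s."""
    return {ip for ip, pps in packets_per_second(packets).items() if pps < max_pps}
```

A mean rate over the whole session is a coarse measure. A real deployment would probably use sliding windows, since an idle game client or a chatty IRC user can blur the boundary.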
Identify Botnet Traffic
(with Packet Header only)
• When the attacker sends a command to the bots, most of them finish within a certain period and send back similar replies
– Identify groups of IPs that belong to different channels
– Identify the bot channel, which has a large number of non-control messages of similar sizes at the same time
• Bots repeatedly reconnect to the IRC server when a connection fails
– They even ignore error messages from IRC servers, e.g., connecting too fast or nickname in use
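The "similar sizes at the same time" test above needs only packet headers. A minimal Python sketch: bucket packets by time window and by size, then look for buckets shared by many distinct sources. The window length, size bucket width and host threshold are assumptions, and the function name is hypothetical.

```python
from collections import defaultdict


def synchronized_reply_bursts(packets, window_s=10.0, size_bucket=8, min_hosts=20):
    """packets: iterable of (timestamp_s, src_ip, size_bytes) for one suspected channel.

    Returns [(window_start_s, bucket_low_bytes, host_count)] for every time
    window in which at least min_hosts distinct sources sent packets whose
    sizes fall in the same bucket.
    """
    hosts = defaultdict(set)
    for ts, ip, size in packets:
        key = (int(ts // window_s), size // size_bucket)
        hosts[key].add(ip)
    bursts = []
    for (win, bucket), ips in sorted(hosts.items()):
        if len(ips) >= min_hosts:
            bursts.append((win * window_s, bucket * size_bucket, len(ips)))
    return bursts
```

Fixed buckets can split a burst that straddles a window or size boundary. Overlapping windows or a clustering step would be more robust, at a higher cost on the router.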
Identify Botnet Traffic
(with payload)
• Most normal IRC servers are unencrypted
• Look for command keywords
– E.g., bot*, ddos*, scan* in Agobot
• Check the content similarity of client replies
– Most bots' replies are similar, e.g., as measured by Hamming distance
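Both payload checks above can be sketched in a few lines of Python. The three keyword patterns come from the slide. The helper names, stripping leading `:`, `.` and `!` from tokens, and padding unequal-length strings before taking the Hamming distance are all assumptions made for illustration.

```python
from fnmatch import fnmatch
from itertools import combinations

BOT_COMMAND_PATTERNS = ["bot*", "ddos*", "scan*"]  # prefixes from the slide


def has_bot_keyword(message):
    """True if any whitespace-separated token matches a command pattern."""
    return any(
        fnmatch(token.lstrip(":.!").lower(), pattern)
        for token in message.split()
        for pattern in BOT_COMMAND_PATTERNS
    )


def hamming_distance(a, b):
    """Positions where a and b differ; the shorter string is padded with NULs."""
    width = max(len(a), len(b))
    return sum(x != y for x, y in zip(a.ljust(width, "\0"), b.ljust(width, "\0")))


def mean_pairwise_distance(replies):
    """Average normalised Hamming distance over all reply pairs (0 = identical)."""
    pairs = list(combinations(replies, 2))
    if not pairs:
        return 0.0
    total = sum(hamming_distance(a, b) / max(len(a), len(b), 1) for a, b in pairs)
    return total / len(pairs)
```

A low mean distance across the replies in one channel suggests templated bot output. Hamming distance is sensitive to shifts, so replies whose variable fields change length (for example a longer IP address) will look less similar than they are.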
Preliminary Analysis
• Packet traffic from a botnet IRC server at a
compromised machine
Data size: 6MB
Duration: 7 minutes on Jun 05
Number of packets: 65,135
Mean packet length (bytes): 104
# of bots: 1221
Counts by IRC message type:
Total: 21,004
PING: 6596
PONG: 5846
PRIVMSG: 971
JOIN: 1198
QUIT: 52
NOTICE: 3373
Content Analysis
• [:IRC] PRIVMSG #r00t# :[nickname]: lsass:
exploited (192.168.1.103) 210, 201
• [:IRC] PRIVMSG #r00t# :[nickname]: ftp:
64.198.252.197 on 13836 157, 149
• [:IRC] PRIVMSG #scan :[nickname]
:CSendFile(0x0546EFA0h): Transfer to
149.76.159.9 finished. 105, 97
• [nickname] PRIVMSG #bz-sniff :FTP sniff
"82.227.37.93:4868" to
"24.205.128.157:8500": - "USER
administrator " 60, 45
Message lengths within each type are very similar, because only the IP, port number or number of bytes changes
Content Analysis II
Note that sniff reports can vary a lot in length and content
• PRIVMSG #sniff :HTTP sniff "64.62.222.77:80" to "10.0.0.8:1583": "HTTP/1.1 200 OK Server: Microsoft-IIS/5.0 Date: Wed, 13 Oct 2004
03:33:19 GMT Content-Length: 46 Content-Type: text/html Set-Cookie: BID=12523219; expires=Mon, 12-Oct-2009 07:00:00 GMT;
path=/ Set-Cookie: PID=1156; expires=Mon, 12-Oct-2009 07:00:00
GMT; path=/ Cache-control:
private
<html><body>5.00_5.00_UG</body></html> " 17
• PRIVMSG #vuln :VULN sniff "206.230.3.199:80" to "10.0.0.209:4690": "HTTP/1.0 200 OK Server: Apache/1.3.27 (Unix) (Red-Hat/Linux)
mod_ssl/2.8.12 OpenSSL/0.9.6b DAV/1.0.3 mod_perl/1.26
mod_oas/5.6.1 mod_cap/2.0 P3P: CP="NON NID PSAa PSDa OUR IND
UNI COM NAV STA",policyref="/w3c/p3p.xml" P3P: CP="NON NID
PSAa PSDa OUR IND UNI COM NAV
STA",policyref="/w3c/p3p.xml" Last-Modified: Mon, 04 Oct 2004
22:22:54 GMT ETag: "10003e-2a65-4161cd3e" Accept-Ranges:
bytes Content-Length: 10853 Content-Type: image/gif Date: Wed, 13
Oct 2004 03:36:38 GMT Connection: keep-alive GIF89a 24
Bots Making Repeated Connection Attempts
• Even after receiving error messages, e.g.,
connecting too fast or nickname used
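Repeated connection attempts are visible from TCP SYN packets alone, so they can be counted without payload. A minimal Python sketch using a sliding window per client: the 60-second window and the five-attempt threshold are assumptions, and the function name is hypothetical.

```python
from collections import defaultdict


def persistent_reconnectors(attempts, window_s=60.0, min_attempts=5):
    """attempts: iterable of (timestamp_s, client_ip) for TCP SYNs sent to one IRC server.

    Returns the clients that opened at least min_attempts connections within
    some interval of at most window_s seconds.
    """
    by_client = defaultdict(list)
    for ts, ip in attempts:
        by_client[ip].append(ts)
    flagged = set()
    for ip, times in by_client.items():
        times.sort()
        start = 0
        for end, ts in enumerate(times):
            # Advance the left edge so the window never spans more than window_s.
            while ts - times[start] > window_s:
                start += 1
            if end - start + 1 >= min_attempts:
                flagged.add(ip)
                break
    return flagged
```

A human client that hits a "connecting too fast" error usually backs off, so it rarely accumulates this many attempts in one window. A bot that ignores the error keeps retrying and crosses the threshold quickly.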
Summary
Goal: Detect botnet server/channel at edge
network routers/gateways even when such
traffic is encrypted
• Separate IRC traffic from normal traffic
• Identify botnet traffic in the IRC traffic
Contact: Yan Chen
[email protected]