
Global Intrusion Detection
Using Distributed Hash Tables
Jason Skicewicz, Laurence Berland, Yan Chen
Northwestern University, 6/2004
Current Architecture

Intrusion Detection Systems
• Vulnerable to attack
• Many false responses
• Limited network view
• Varying degrees of intelligence

Centralized Data Aggregation
• Generally done manually
• Post-mortem global view
• Not real time!
Sensor Fusion Centers

Sensor fusion centers (SFCs) aggregate
information from sensors throughout the
network
• More global view
• Larger information pool
• Still vulnerable to attack
• Overload potential if multiple simultaneous
attacks occur

Can’t we leverage all the participants?
Distributed Fusion Centers
Different fusion centers for different
anomalies
• Must attack all fusion centers, or
know more about fusion center
assignments
• Still needs to be manually set up
and routed to
• What if things were redundant and
self-organizing?

What is a DHT?

A DHT, or Distributed Hash Table, is a
peer-to-peer system where the location of a
resource or file is found by hashing on the
key
• DHTs include CHORD, CAN, PASTRY, and
TAPESTRY
• A DHT attempts to spread the keyspace
across as many nodes as possible
• Different DHTs use different topologies
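The key-to-node mapping above can be sketched in a few lines. This is a hypothetical illustration, not from the slides: the keyspace size, node count, and equal-range ownership scheme are all assumptions.

```python
import hashlib

# Assumed parameters for illustration only.
NUM_NODES = 8
KEYSPACE = 2 ** 16

def node_for_key(key: str) -> int:
    """Hash a key into the keyspace and return the index of the owning node."""
    point = int(hashlib.sha1(key.encode()).hexdigest(), 16) % KEYSPACE
    # Each node owns an equal contiguous slice of the keyspace.
    return point * NUM_NODES // KEYSPACE
```

Any participant can compute the same mapping locally, which is what makes the lookup decentralized.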
CAN

CAN is based on a multi-reality
n-dimensional toroid for routing
(Ratnasamy et al.)
CAN
• Each reality is a complete toroid,
providing full redundancy
• The network covers the entire address
space, dynamically splitting the space
• Routes across the CAN, so you don’t
need to connect directly to the
fusion center
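Routing "across the CAN" means each hop forwards toward the destination point on the torus. A minimal sketch of greedy CAN-style routing on a 2-D torus, assuming one node per unit grid cell (the grid size is an assumption for illustration):

```python
GRID = 8  # assumed 8x8 grid of nodes, one per unit cell

def torus_dist(a, b):
    """Shortest wrap-around (toroidal) distance between two grid points."""
    return sum(min(abs(x - y), GRID - abs(x - y)) for x, y in zip(a, b))

def route(src, dst):
    """Greedily forward to the neighbor closest to dst; return the path."""
    path, cur = [src], src
    while cur != dst:
        x, y = cur
        neighbors = [((x + 1) % GRID, y), ((x - 1) % GRID, y),
                     (x, (y + 1) % GRID), (x, (y - 1) % GRID)]
        cur = min(neighbors, key=lambda n: torus_dist(n, dst))
        path.append(cur)
    return path
```

Because the space wraps around, a report never needs to travel more than half the grid in any dimension.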

GIDS over DHT

Fusion centers are organized on a
distributed hash table
• Peer-to-peer
• Self-organized
• Decentralized
• Resilient

We use a Content Addressable Network
(CAN)
• Highly redundant
• N-dimensional toroid enhances reachability
DIDS diagram

[Diagram: an infected machine sends a worm probe across the Internet; the host IDS on the probed host and the NIDS report to the fusion center, with the peer-to-peer CAN directing reports to the appropriate fusion center.]
Reporting Information

• Fusion centers need enough information
to make reasonable decisions
• ID systems all have different proprietary
reporting formats
• Fusion centers would be overloaded with
data if full packet dumps were sent
• We need a concise, standardized format
for reporting anomalies
Symptom Vector
• A standardized set of information
reported to fusion centers
• Plugins for an IDS could be written to
produce these vectors and actually
connect to the CAN
• Flexibility for reporting more details

Symptom Vector
<src_addr,dst_addr,proto,src_port,dst_port,payload
,event_type,lower_limit,upper_limit>
• Payload: a descriptor of the actual packet
payload. This is most useful for worms. Two
choices we’ve considered so far are a hash of the
contents or the size in bytes
• Event_type: a code specifying an event type such as a
worm probe or a SYN flood
• Based on the event_type, upper_limit and lower_limit
are two numerical fields available for the reporting IDS
to provide more information
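The symptom vector above could be sketched as a simple record type. Field names follow the slides; the types and the example values are assumptions for illustration:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SymptomVector:
    src_addr: str
    dst_addr: str
    proto: str
    src_port: int
    dst_port: int
    payload: str      # descriptor of the payload, e.g. a hash or a byte size
    event_type: int   # code such as "worm probe" or "SYN flood"
    lower_limit: float
    upper_limit: float

# Hypothetical report: a probe against TCP port 445 with a 376-byte payload.
sv = SymptomVector("10.0.0.5", "10.0.0.9", "tcp", 31337, 445,
                   "size:376", 1, 0.0, 0.0)
```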
Payload Reporting

Hash: a semi-unique string produced by
performing mathematical transformations
on the content
• Effectively uniquely identifies the content
• Cannot easily be matched based on “similarity”,
so it’s hard to spot polymorphic worms

Size: the number of bytes the worm takes
up
• Non-unique: two worms could be the same
size, though we’re doing research to see how
often that actually occurs
• Much easier to spot polymorphism: simple
changes cause no or only small changes in size
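The trade-off between the two descriptors is easy to demonstrate. A sketch, with made-up probe strings standing in for worm payloads:

```python
import hashlib

def payload_hash(payload: bytes) -> str:
    """Content hash: effectively unique to one exact payload."""
    return hashlib.sha1(payload).hexdigest()

def payload_size(payload: bytes) -> int:
    """Byte size: non-unique, but stable under small mutations."""
    return len(payload)

a = b"GET /default.ida?XXXX"   # hypothetical original probe
b = b"GET /default.ida?YYYY"   # trivially mutated variant, same length

# The hash changes completely, but the size descriptor is identical,
# which is why size is better at catching simple polymorphism.
```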
Routing Information

A DHT is traditionally a peer-to-peer file
sharing network
• Locates content based on name, hash,
etc.
• Not traditionally used to locate
resources

We develop a routing vector in place
of traditional DHT addressing
methods, and use it to locate the
appropriate fusion center(s)
Routing Vector
• Based on the anomaly type
• Generalized to ensure similar
anomalies go to the same fusion
center, while disparate anomalies are
distributed across the network for
better resource allocation
• Worm routing vector:
<dst_port,payload,event_type,lower_limit,upper_limit>

Routing Vector
• The worm routing vector avoids using
less relevant fields such as the source
port or IP addresses
• Designed to use only information
that will be fairly consistent across
any given worm
• Used to locate the fusion center, which
receives the full symptom vector for
detailed analysis
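Locating a fusion center then amounts to hashing the routing vector's fields to a point in the CAN's coordinate space. A hypothetical sketch: the field names follow the slides, but the 2-D coordinate scheme and resolution are assumptions.

```python
import hashlib

GRID = 256  # assumed coordinate resolution per dimension

def can_point(dst_port, payload, event_type, lower_limit, upper_limit):
    """Hash the worm routing vector to a 2-D CAN point owned by one fusion center."""
    key = f"{dst_port}|{payload}|{event_type}|{lower_limit}|{upper_limit}"
    digest = hashlib.sha1(key.encode()).digest()
    return (digest[0] % GRID, digest[1] % GRID)
```

Identical routing vectors always hash to the same point, so all reports of one worm converge on one fusion center, while unrelated anomalies scatter across the network.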

Size and the boundary problem

• Assume a CAN with several nodes, each
allocated a range of sizes, say in blocks of 1000
bytes
• Assume node A has range 4000-5000 and node B
has range 5000-6000
• If a polymorphic worm has sizes ranging between
4980 and 5080, the information is split
• Solution? Send information across the
boundary: node A sends copies of anything with
size >4900 to node B, and node B sends anything
with size <5100 to A
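The boundary-overlap rule above is small enough to sketch directly, using the slide's numbers (1000-byte blocks, 100-byte overlap):

```python
BLOCK, OVERLAP = 1000, 100

def recipients(size: int) -> list:
    """Return the block indices (nodes) that should receive a size report."""
    owner = size // BLOCK
    nodes = [owner]
    if size % BLOCK >= BLOCK - OVERLAP:       # within 100 bytes of the upper boundary
        nodes.append(owner + 1)
    if size % BLOCK < OVERLAP and owner > 0:  # within 100 bytes of the lower boundary
        nodes.append(owner - 1)
    return nodes
```

With this rule, a polymorphic worm spanning 4980-5080 bytes is seen in full by both node A (block 4) and node B (block 5), so neither has only half the picture.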
To DHT or not to DHT

• The DHT automatically organizes everything
for us
• The DHT ensures anomalies are somewhat
spread out across the network
• The DHT routes in real time, without
substantial prior knowledge of the
anomaly
• The DHT is redundant, making an attack
against the sensor fusion centers tricky at
best and impossible to coordinate at worst,
from the attacker’s perspective
Simulating the system
• We build a simple array of nodes
and have them generate the
symptom and routing vectors as they
encounter anomalies
• Not yet complete; work in progress
• Demonstrates that information fuses
appropriately, and that multiple simultaneous
anomalies do not interfere

Further Work
• Complete paper (duh)
• Add CAN to the simulation to actually
route
• Include real-world packet dumps in
the simulation
• Test on more complex topologies?
