Global Intrusion Detection
Using Distributed Hash Tables
Jason Skicewicz, Laurence
Berland, Yan Chen
Northwestern University 6/2004
Current Architecture
Intrusion Detection Systems
• Vulnerable to attack
• Many false alarms
• Limited network view
• Varying degrees of intelligence
Centralized Data Aggregation
• Generally done manually
• Post-mortem global view
• Not real time!
Sensor Fusion Centers
Sensor fusion centers (SFCs) aggregate
information from sensors throughout the
network
• More global view
• Larger information pool
• Still vulnerable to attack
• Overload potential if multiple simultaneous attacks
Can’t we leverage all the participants?
Distributed Fusion Centers
Different fusion centers for different
anomalies
An attacker must hit all fusion centers, or
know more about fusion center
assignments
Still needs to be manually set up and
routed to
What if things were redundant and
self-organizing?
What is DHT
A DHT, or Distributed Hash Table, is a peer-to-peer
system where the location of a resource or file is
found by hashing on the key
DHTs include CHORD, CAN, PASTRY, and
TAPESTRY
DHT attempts to spread the keyspace
across as many nodes as possible
Different DHTs use different topologies
CAN
CAN is based on a multi-reality n-dimensional toroid for routing
(Ratnasamy et al)
CAN
Each reality is a complete toroid,
providing full redundancy
Network covers entire address space,
dynamically splits space
Routes across the CAN, so you don’t
need to connect directly to the
Fusion Center
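As a rough illustration of how a key lands on a CAN node (the hash choice and dimension count here are our assumptions, not the construction from the CAN paper), a routing key can be mapped to a point in the d-dimensional coordinate space:

import hashlib

def key_to_point(key: str, dims: int = 2) -> tuple:
    # Map a routing key to a point in the CAN's d-dimensional
    # coordinate space, each coordinate in [0, 1).
    point = []
    for i in range(dims):
        digest = hashlib.sha1(f"{key}:{i}".encode()).digest()
        point.append(int.from_bytes(digest[:4], "big") / 2**32)
    return tuple(point)

# Whichever node owns the zone containing this point handles the key.
print(key_to_point("dst_port=445|event=worm_probe"))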
GIDS over DHT
Fusion centers are organized on a
distributed hash table
• Peer-to-peer
• Self-organized
• Decentralized
• Resilient
We use Content Addressable Network
(CAN)
• Highly redundant
• N-dimensional toroid enhances reachability
DIDS diagram
[Diagram: an infected machine on the Internet sends a worm probe; the host IDS on the probed host and the NIDS report to the fusion center, with the peer-to-peer CAN directing reports to that fusion center]
Reporting Information
Fusion Centers need enough information
to make reasonable decisions
ID systems all have different proprietary
reporting formats
Fusion Centers would be overloaded with
data if full packet dumps were sent
We need a concise, standardized format
for reporting anomalies
Symptom Vector
Standardized set of information
reported to fusion centers.
Plugins for existing IDSes could be written
to produce these vectors and actually
connect to the CAN
Flexibility for reporting more details
Symptom Vector
<src_addr,dst_addr,proto,src_port,dst_port,payload,event_type,lower_limit,upper_limit>
• Payload: some descriptor of the actual packet payload. This is
most useful for worms. Two choices we’ve considered so far are
a hash of the contents, or the size in bytes
• Event_type: A code specifying an event type such as a
worm probe or a SYN flood
• Based on the event_type, upper_limit and lower_limit
are two numerical fields available for the reporting IDS
to provide more information
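One possible encoding of this vector, as a minimal sketch (field names follow the slide; the types are our assumption):

from dataclasses import dataclass

@dataclass
class SymptomVector:
    src_addr: str
    dst_addr: str
    proto: str          # e.g. "tcp", "udp", "icmp"
    src_port: int
    dst_port: int
    payload: str        # content hash or size descriptor
    event_type: int     # code, e.g. worm probe or SYN flood
    lower_limit: float  # meaning depends on event_type
    upper_limit: float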
Payload Reporting
Hash: a semi-unique string produced by
performing mathematical transformations
on the content
• Uniquely identifies the content
• Cannot easily be matched based on “similarity”
so it’s hard to spot polymorphic worms
Size: the number of bytes the worm takes
up
• Non-unique: two worms could be of the same
size, though we’re doing research to see how
often that actually occurs
• Much easier to spot polymorphism: simple
changes cause no or only small changes in size
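A hedged sketch of the two payload descriptors discussed above (the helper name and hash function are illustrative choices, not the system's actual implementation):

import hashlib

def payload_descriptor(payload: bytes, use_hash: bool = True) -> str:
    # Describe a packet payload either by a content hash (precise,
    # but defeated by polymorphic worms) or by its size in bytes
    # (non-unique, but robust to small content changes).
    if use_hash:
        return hashlib.md5(payload).hexdigest()
    return f"len:{len(payload)}"

probe = b"\x90" * 100 + b"EXPLOIT"
variant = b"\x90" * 99 + b"X" + b"EXPLOIT"   # slightly mutated copy
print(payload_descriptor(probe), payload_descriptor(variant))               # hashes differ
print(payload_descriptor(probe, False), payload_descriptor(variant, False)) # sizes match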
Routing Information
DHT is traditionally a peer-to-peer file
sharing network
• Locates content based on name, hash,
etc
• Not traditionally used to locate
resources
We develop a routing vector in place
of traditional DHT addressing
methods, and use it to locate the
appropriate fusion center(s)
Routing Vector
Based on the anomaly type
Generalized to ensure similar
anomalies go to the same fusion
center, while disparate anomalies are
distributed across the network for
better resource allocation
Worm routing vector:
<dst_port,payload,event_type,lower_limit,upper_limit>
Routing Vector
Worm routing vector avoids using
less relevant fields such as source
port or IP addresses
Designed to utilize only information
that will be fairly consistent across
any given worm
Used to locate fusion center, which
receives full symptom vector for
detailed analysis
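A minimal sketch of this projection (the symptom vector is shown here as a plain dict, and the helper name is ours, not the paper's):

def routing_vector(sv: dict) -> tuple:
    # Keep only the fields that stay consistent across instances of the
    # same worm; source port and IP addresses are deliberately dropped.
    return (sv["dst_port"], sv["payload"], sv["event_type"],
            sv["lower_limit"], sv["upper_limit"])

report_a = {"src_addr": "10.0.0.5", "src_port": 51234, "dst_addr": "128.2.1.1",
            "proto": "tcp", "dst_port": 445, "payload": "len:376",
            "event_type": 1, "lower_limit": 0, "upper_limit": 0}
report_b = dict(report_a, src_addr="172.16.3.9", src_port=40001)
# The same worm seen from different sources maps to the same fusion center.
assert routing_vector(report_a) == routing_vector(report_b)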
Size and the boundary problem
Assume a CAN with several nodes. Each is
allocated a range of sizes, say in blocks of 1000
bytes.
Assume node A has range 4000-5000 and node B
has range 5000-6000
If a polymorphic worm has size ranging between
4980 and 5080, the information is split
Solution? Have information sent across the
boundary. Node A sends copies of anything with
size >4900 to node B and node B sends anything
with size <5100 to A
To DHT or not to DHT
DHT automatically organizes everything
for us
DHT ensures anomalies are somewhat
spread out across the network
DHT routes in real time, without
substantial prior knowledge of the
anomaly
DHT is redundant, making an attack
against the sensor fusion center tricky at
worst and impossible to coordinate at best
Simulating the system
We build a simple array of nodes,
and have them generate the
symptom and routing vectors as they
encounter anomalies
Not yet complete, work in progress
Demonstrates that information fuses
appropriately and that multiple simultaneous
anomalies do not interfere with one another
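A toy stand-in for that simulation (node counts, event mix, and names are all illustrative assumptions): each node reports the anomalies it sees, and reports of the same anomaly should collapse onto a single routing key while distinct anomalies land in distinct buckets.

import random
from collections import defaultdict

ANOMALIES = [
    ("worm_probe", 445, "len:376"),
    ("worm_probe", 80, "len:512"),
    ("syn_flood", 80, ""),
]

def run(num_nodes=10, num_reports=200, seed=1):
    random.seed(seed)
    buckets = defaultdict(set)
    for _ in range(num_reports):
        node = random.randrange(num_nodes)
        event, dst_port, payload = random.choice(ANOMALIES)
        buckets[(dst_port, payload, event)].add(node)
    return buckets

for key, nodes in run().items():
    print(key, "reported by", len(nodes), "nodes")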
Further Work
Complete paper (duh)
Add CAN to simulation to actually
route
Include real-world packet dumps in
the simulation
Test on more complex topologies?