Hash-Based IP Traceback
Alex C. Snoeren, Craig Partridge, Luis A. Sanchez,
Christine E. Jones, Fabrice Tchakountio, Stephen T.
Kent, and W. Timothy Strayer
SIGCOMM, Aug. 2001, San Diego, CA
Presented by
Chris Dion
Tonight’s Outline
• Introduction to the problem
• What is IP Traceback?
• Some Previous Work
• Overview of the Proposed Solution
• Implementation/Simulation
Internet Anonymity
• Not all attacks are large flooding DoS
attacks
• Well-placed single-packet attacks can be
just as effective
• These packets can be spoofed to appear
from almost anywhere
• How can we track these attacks and find
their origin?
Current Methods
• Use of ingress filtering to limit source
address
– Not all routers can look at every packet's
source address
• Spoofed addresses are all too often found in
legitimate traffic
– NAT
– Mobile IP
– Hybrid satellite architectures
IP Traceback
• Some Assumptions about the network
– Packets may be multicast or broadcast
• Tracing system must be prepared for multiple packets
– Attackers can get into routers
• Tracing must not be confounded by a motivated attacker
– Routing behavior of the network can be unstable
• Tracing must be prepared to handle divergent information
– Packet size should not grow due to tracing
– End hosts may be resource constrained
– Tracing is an infrequent operation
• Can use a router's control path rather than its data path
Attack Path
[Figure: network diagram labeled with attack packet #1, attack packet #2, possible compromised routers, and the victim]
Packet Transformations
• Packets may be modified for a number of valid
reasons
– Packet fragmentation
– IP option processing
– ICMP processing
– Packet duplication
– NAT
– IPsec tunneling
• These affected less than 3% of Internet traffic in 2000
• Attackers can use these!
Some Previous Work
• 2 approaches to determining route:
– Audit of flow as it traverses network
• Can grow packet with route information, use fields
in header, or use out-of-band signaling
– Inference of flow based on its impact on state
of network
• Systematically flood the network and watch for
variations in the received packet flow
• Becomes infeasible when flow sizes approach a
single packet
Packet Digests
• We do not need the entire packet
– Reduces storage requirements
– Need only the packet header to determine the attacker
– Still need to uniquely identify the packet
– Security concerns
• Mask out fields that are modified along a packet's
route:
– Type of Service
– TTL
– Checksum
– IP Options
IP Packet fields for Hash Input
Why 28 bytes?
• WAN trace from an OC-3 gateway router
• LAN trace from an active 100 Mb segment
• Collision rate for 28 bytes
– 0.00092% WAN
– 0.139% LAN
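As an illustration of the masking step, here is a minimal Python sketch of how the 28-byte hash input might be assembled for an IPv4 packet. The function name digest_input is hypothetical. The sketch assumes the 28 bytes are the fixed 20-byte IP header plus the first 8 bytes of payload, and it handles IP options by skipping them entirely, which is a simplification made here rather than a detail taken from the slides.

```python
def digest_input(packet: bytes) -> bytes:
    """Return a masked 28-byte prefix of an IPv4 packet (illustrative only)."""
    ihl = (packet[0] & 0x0F) * 4       # header length in bytes, options included
    header = bytearray(packet[:20])    # fixed header only; options are skipped
    header[1] = 0                      # mask Type of Service
    header[8] = 0                      # mask TTL
    header[10:12] = b"\x00\x00"        # mask header checksum
    payload = packet[ihl:ihl + 8]      # first 8 bytes after the header
    # Pad short payloads so the result is always 28 bytes.
    return bytes(header) + payload.ljust(8, b"\x00")
```

Two copies of the same packet observed at different hops, differing only in TTL and checksum, map to the same 28 bytes under this sketch.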
Bloom filters
• Used to store digests in router
• From Communications of the ACM, July 1970
• Computes k distinct packet digests for
each packet using hash functions
• Uses results to index into a bit array
• Could potentially create false positives
[Figure: Bloom filter, with k hash functions producing n-bit digests for each packet received]
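The Bloom filter idea above can be sketched in a few lines of Python. This is a generic illustration, not the router implementation: the class name and the use of index-prefixed SHA-256 to derive k hash functions are assumptions made for the example.

```python
import hashlib

class BloomFilter:
    """Bit array indexed by k hash functions (illustrative sketch)."""

    def __init__(self, n_bits: int, k: int):
        self.n_bits = n_bits
        self.k = k
        self.bits = bytearray((n_bits + 7) // 8)

    def _indexes(self, item: bytes):
        # Derive k hash values by prefixing the item with the function index.
        for i in range(self.k):
            h = hashlib.sha256(bytes([i]) + item).digest()
            yield int.from_bytes(h[:8], "big") % self.n_bits

    def add(self, item: bytes) -> None:
        for idx in self._indexes(item):
            self.bits[idx // 8] |= 1 << (idx % 8)

    def __contains__(self, item: bytes) -> bool:
        # True if every indexed bit is set; may be a false positive.
        return all(self.bits[idx // 8] & (1 << (idx % 8))
                   for idx in self._indexes(item))
```

A stored item is always reported as present. An item that was never stored is reported as present only if all k of its bits happen to have been set by other items, which is the false-positive case.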
Bloom Filters (cont)
• Restrictions on the hash family
– Must distribute a highly correlated set of inputs
(packet digests) evenly
– Independent collision events (false positives
at one router are independent of those at
neighboring routers)
• Called universal hash families
– Must be easy to compute at high link speeds
Source Path Isolation Engine
SPIE System
• DGA – Data Generation Agent
– Produces packet digests of each departing packet
and stores them in a digest table
– Represents the traffic forwarded in a given time
interval
• SCAR – SPIE Collection and Reduction Agent
– When an attack is detected, the SCAR produces an attack graph
for its region
• STM – SPIE Traceback Manager
– Interface to the intrusion detection system
– Gathers complete attack graph
Traceback processing
• IDS will signal potential attack and give STM:
– Packet P
– Victim V, must be expressed in terms of the last-hop
routers
– Time of attack T; the request must be made in a timely fashion
• STM immediately asks all SCARs in the domain to
poll their DGAs for digests
• Each SCAR returns an attack graph; the STM then works
backwards to identify the source
What if Packet is Transformed?
• Need a TLT – Transform Lookup Table with each
packet digest:
– IP packet digest
– Indirect flag
– Type of transform (ICMP, NAT, etc.)
– Variable for packet data needed to transform
Graph Construction
• Each SCAR is responsible for its region
• After gathering all digest tables, the SCAR simulates
reverse-path flooding (RPF)
• If the packet is found at a router, the node is marked
and its arrival time there becomes the latest possible
time to search for at upstream routers
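The graph construction steps can be sketched as a breadth-first search outward from the victim's last-hop router. This is a simplified illustration: build_attack_graph and its arguments are hypothetical names, digest tables are modeled as plain Python sets rather than Bloom filters, and the arrival-time bound described above is left out for brevity.

```python
from collections import deque

def build_attack_graph(neighbors, digest_tables, digest, last_hop):
    """Simulate reverse-path flooding from last_hop.

    neighbors:     router -> list of adjacent routers
    digest_tables: router -> set of digests the router recorded
    Returns router -> list of upstream routers that also saw the digest.
    """
    if digest not in digest_tables.get(last_hop, set()):
        return {}
    graph = {last_hop: []}
    queue = deque([last_hop])
    while queue:
        router = queue.popleft()
        for upstream in neighbors.get(router, []):
            if upstream in graph:
                continue  # already marked
            if digest in digest_tables.get(upstream, set()):
                graph[upstream] = []            # mark the node
                graph[router].append(upstream)  # record the edge
                queue.append(upstream)
    return graph
```

With real Bloom filters the membership test can return false positives, so the resulting graph may contain extra nodes beyond the true attack path.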
Graph Construction Example
[Figure: example topology showing attack paths and SPIE queries]
Implementation
• Universal hash family is simulated using
MD5 hashing (128-bit output)
• A random number is prepended to each
packet for independence
• Output is taken as four 32-bit digests
• Size of Digest Table varies with the total
traffic capacity of the router
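A minimal Python sketch of the digesting step described above: a per-router random value is prepended to the packet bytes, the result is hashed with MD5, and the 128-bit output is split into four 32-bit digests. The function name packet_digests, the salt length, and the big-endian split are assumptions made for illustration.

```python
import hashlib
import os
import struct

def packet_digests(salt: bytes, masked_prefix: bytes):
    """Return four 32-bit digests from MD5(salt + packet bytes)."""
    md5 = hashlib.md5(salt + masked_prefix).digest()  # 16 bytes = 128 bits
    return struct.unpack(">4I", md5)                  # four unsigned 32-bit ints

# Example: each router draws its own random salt.
router_salt = os.urandom(4)
digests = packet_digests(router_salt, b"\x00" * 28)
```

Each digest could then be reduced modulo the digest table size to pick a bit position. Because every router uses a different salt, the same packet maps to unrelated positions at different routers.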
Possible DGA in hardware
False Positive Analysis
• Use a false-positive probability of P = 1/(8d) for a
theoretical limit (d = the router's degree, i.e. its number of neighbors)
– Assuming a 32-node path length, approaching the diameter
of the Internet
• For simulation, used the topology of a major ISP
– 70 backbone routers with links from T-1 (1.54 Mbps) to OC-3
(155 Mbps)
• Sent 1000 attack packets at a constant rate to
one victim, with background traffic set to give a fixed
false-positive rate P
Simulation Result
• The low value was due to link utilization
• Considerable gap between the theoretical limit and simulation
Time and Memory Analysis
• Allow one minute to identify an attack packet
• Memory is linear in link capacity
– Consider a Bloom filter with 3 digesting
functions and a capacity factor of 5, for a false-
positive rate of P = 0.092 when full
– Average-sized packets (1000 bits)
• Using this we get a rule of thumb
– SPIE requires 0.5% of total link capacity in digest storage
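The two numbers on this slide can be checked with the standard Bloom filter approximation. The sketch below reads "capacity factor" as the number of filter bits per stored packet, which is an interpretation made here; the function names are hypothetical.

```python
import math

def bloom_false_positive_rate(k: int, capacity_factor: float) -> float:
    """Approximate false-positive rate of a full Bloom filter.

    Uses the standard approximation (1 - e^(-k/c))^k, where c is the
    number of filter bits per stored item.
    """
    return (1.0 - math.exp(-k / capacity_factor)) ** k

def storage_fraction(capacity_factor: float, avg_packet_bits: float) -> float:
    """Filter bits needed per bit of traffic carried."""
    return capacity_factor / avg_packet_bits

p_full = bloom_false_positive_rate(3, 5)   # about 0.092
fraction = storage_fraction(5, 1000)       # 0.005, i.e. 0.5%
```

With 3 hash functions and 5 bits per packet the approximation gives about 0.092, and 5 bits of storage per 1000-bit packet gives the 0.5% rule of thumb.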
Time and Memory Analysis (cont)
• 4 OC-3 links = 47 MB of storage
• 32 OC-192 links = 23.4GB for one minute
• Access Time is also important
– Given a DRAM cycle time of 50 ns, routers
processing more than 1 OC-192 link will need
SRAM (only 16 Mb, which must be paged)
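As a rough cross-check of the storage figures, the 0.5% rule of thumb can be applied to nominal line rates. The sketch below assumes OC-3 is 155.52 Mbps and OC-192 is 9953.28 Mbps, and it counts each link in both directions. The both-directions factor is an assumption made here because it brings the results into the same range as the slide's numbers; the slides do not say how the figures were derived.

```python
OC3_BPS = 155.52e6      # nominal OC-3 line rate
OC192_BPS = 9953.28e6   # nominal OC-192 line rate

def digest_memory_bytes(link_rate_bps, n_links, seconds=60,
                        fraction=0.005, directions=2):
    """Digest storage in bytes for a router, under the 0.5% rule of thumb.

    directions=2 is an assumption: each link is counted once per direction.
    """
    bits = link_rate_bps * n_links * directions * seconds * fraction
    return bits / 8

access_router = digest_memory_bytes(OC3_BPS, 4)     # about 46.7e6 bytes
core_router = digest_memory_bytes(OC192_BPS, 32)    # about 23.9e9 bytes
```

Under these assumptions the access router needs roughly 47 MB and the core router roughly 24 GB for one minute of traffic, close to the figures quoted above.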
Some Issues
• Traceback may be requested when the
network is unstable
– Possibly from the attack itself
– Best solution would be out-of-band
management
– Priority handling may work for in-band
• ISP-ISP deployment
– Possible sharing of SPIE infrastructure?
– Grant STM requests to other domains
Conclusions
• Traceback of a single packet is very difficult
• SPIE's key contribution is that it is feasible
– Low Storage
– Does not aid in eavesdropping
– Complete System
• Future work could discard packet digests
probabilistically as they age to allow for longer
traceback times