Techniques and Issues for Anonymizing Network Traces

Analyzing Network Traffic in
the Presence of Adversaries
Vern Paxson
International Computer Science Institute /
Lawrence Berkeley National Laboratory
October 18, 2004
Roadmap
• In today’s Internet, attacks are the norm
– Adversaries can create fundamental problems for
network traffic analysis
• #1 problem: evasion by ambiguity
• “Active mapping” to resolve ambiguities
• “Normalization” to eliminate ambiguities
• #2: flooding directed at network devices
• How to design robust analysis hardware
[Figures: growth plots annotated “= 80% growth/year” (data courtesy of Rick Adams), “= 60% growth/year”, and “= 596% growth/year” (courtesy Mark Dedlow)]
In Today’s Internet
Attacks are the Norm
⇒ Great interest in watching network traffic and
analyzing what it’s doing
• Watching: monitor traffic at chokepoints, capture
copy or perhaps intercept
• Analyzing: reconstruct protocol layers as seen by
endpoints, interpret semantics
• How hard can it be?
• Attackers are adversaries: they don’t want to be
caught and they want to make it painful for us to
operate
The Problem of Evasion
• Evasion raises fundamental problems
• Network traffic seen from within a network is
inherently ambiguous.
• Analyzing network traffic at a high semantic level
requires extensive state …
… which an adversary can target.
• Consider a network intrusion detection
system (IDS; “Bro”) detecting occurrences of
the string “root” inside a network connection
(Let’s disregard the wholly separate issue of false
positives: whether this is a good “signature”)
Detecting “root”: Attempt #1
• Method: scan each packet for ‘r’, ‘o’, ‘o’, ‘t’
– Perhaps using Boyer-Moore, Aho-Corasick,
Bloom filters …
Packet 1: …….….root………..…………
But: TCP protocol doesn’t preserve text boundaries
Packet 1: …….….ro
Packet 2: ot………..…………
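A minimal sketch of Attempt #1, assuming each packet payload is available as a `bytes` object. The function name `scan_per_packet` is illustrative and is not taken from Bro; plain substring search stands in for Boyer-Moore or Aho-Corasick.

```python
def scan_per_packet(packets, signature=b"root"):
    """Return True if any single packet payload contains the signature."""
    return any(signature in payload for payload in packets)


# Signature fits inside one packet: detected.
print(scan_per_packet([b"....root...."]))       # True
# Same bytes split across two TCP segments: missed.
print(scan_per_packet([b"....ro", b"ot...."]))  # False
```

The second call shows the failure: no individual packet contains the whole string, so a stateless scanner never sees it.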
Detecting “root”: Attempt #2
• Method: remember match from end of
previous packet
Packet 1 + Packet 2: …….….ro + ot………..…………
– Now we’re managing state
But: TCP protocol doesn’t guarantee in-order arrival
Packet 2: ot………..………… ? Packet 1: …….….ro
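Attempt #2 might be sketched as follows, assuming packets are fed to the scanner in arrival order. The class name `CarryOverScanner` is hypothetical; it keeps only the last `len(signature) - 1` bytes as its state.

```python
class CarryOverScanner:
    """Scan packets in arrival order, remembering the tail of the previous one."""

    def __init__(self, signature=b"root"):
        self.signature = signature
        self.tail = b""  # last len(signature) - 1 bytes seen so far

    def feed(self, payload):
        """Return True if the signature ends inside this payload."""
        window = self.tail + payload
        found = self.signature in window
        keep = len(self.signature) - 1
        self.tail = window[-keep:] if keep else b""
        return found


in_order = CarryOverScanner()
print([in_order.feed(p) for p in (b"....ro", b"ot....")])   # [False, True]

reordered = CarryOverScanner()
print([reordered.feed(p) for p in (b"ot....", b"....ro")])  # [False, False]
```

The state fixes the split-signature case, but only when arrival order matches stream order. The reordered run misses the match.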
Detecting “root”: Attempt #3
• Method: reassemble entire byte stream
– Keep track of full TCP connection state
– So much for “simple”
– What happens if we run out of memory?
• And:
– Still evadable …
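A rough sketch of Attempt #3, assuming segments are `(seq, payload)` pairs with sequence numbers given as byte offsets from 0 (no wraparound, no SYN/FIN handling). The `policy` parameter is an assumption added to show why reassembly alone is still evadable: when a retransmission overlaps earlier data with different bytes, the result depends on which copy is kept.

```python
def reassemble(segments, policy="first"):
    """Rebuild a byte stream from (seq, payload) pairs given in arrival order.

    policy="first": bytes already received are never overwritten.
    policy="last":  a later segment overwrites overlapping bytes.
    """
    stream = {}
    for seq, payload in segments:
        for offset, byte in enumerate(payload):
            pos = seq + offset
            if policy == "last" or pos not in stream:
                stream[pos] = byte
    # Assumes no gaps remain; a missing byte would simply be skipped here.
    return bytes(stream[pos] for pos in sorted(stream))


# Out-of-order arrival is now handled.
print(reassemble([(6, b"ot...."), (0, b"....ro")]))  # b'....root....'

# Inconsistent retransmission: two different payloads for offsets 2-3.
ambiguous = [(0, b"ro"), (2, b"xx"), (2, b"ot")]
print(reassemble(ambiguous, policy="first"))  # b'roxx'
print(reassemble(ambiguous, policy="last"))   # b'root'
```

If the monitor keeps the first copy while the end host keeps the last (or only one copy ever reaches the host), the monitor analyzes a different byte stream than the one the host acts on.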
Evading Detection Via
Ambiguous TCP Retransmission
It’s Not Just TTL Expiration
• Systematic study (w/ M. Handley & C.
Kreibich) to analyze ambiguous protocol
fields:
– 73 exploitable ambiguities in IP/TCP/UDP/ICMP
– E.g.: control flags, flow control window, “don’t
fragment”, old timestamps, service class,
redundant length field, filtering on unused bits
– Internet protocols not designed for analysis
– Attacker toolkits already exist for exploiting these
• Answer: alert upon seeing ambiguous traffic?
The Problem of “Crud”
• Unfortunately, ambiguities occur in benign
traffic, too:
– Legitimate tiny fragments, overlapping fragments
– Receivers that acknowledge data they did not
receive
– Senders that retransmit different data than
originally
• In a diverse traffic stream, you will see these:
– What is the intent?
• Loss of alert precision ⇒ “Maybe there’s an
attack”
Countering Evasion-by-Ambiguity:
Active Mapping
• Idea (w/ Umesh Shankar, UCB): Probe
end-host in advance to resolve vantage-point ambiguities
– E.g., how many hops to it?
– E.g., how does it resolve ambiguous
retransmissions?
– Gray-box testing
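One way a monitor could use mapping results, sketched under stated assumptions: the `HostProfile` record, the `PROFILES` table, the address `10.0.0.5`, and the function `plan_for` are all hypothetical names, and the profile values stand in for whatever the probing step measured.

```python
from dataclasses import dataclass


@dataclass
class HostProfile:
    routers_to_host: int  # routers between the monitor and the host
    overlap_policy: str   # "first" or "last": which copy of overlapping data the host keeps


# Hypothetical mapping results, keyed by destination address.
PROFILES = {"10.0.0.5": HostProfile(routers_to_host=3, overlap_policy="first")}


def plan_for(dst, segments):
    """segments: (seq, ttl_at_monitor, payload) tuples in arrival order.

    Returns the host's overlap policy and the segments that can reach it.
    Each router decrements TTL and drops the packet at zero, so a packet
    needs TTL > routers_to_host when it passes the monitor.
    """
    profile = PROFILES[dst]
    reachable = [(seq, data) for seq, ttl, data in segments
                 if ttl > profile.routers_to_host]
    return profile.overlap_policy, reachable


policy, seen = plan_for("10.0.0.5", [(0, 64, b"ro"), (2, 2, b"xx"), (2, 64, b"ot")])
print(policy, seen)  # first [(0, b'ro'), (2, b'ot')]
```

With the hop count known in advance, the low-TTL decoy segment is discarded from analysis because it cannot reach the host.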
Mapping Setup
Gray-box Inference of Reassembly Policy
A Plethora of Inferred Policies
Issues for Active Mapping
• Probing for most ambiguities requires eliciting
a response
– Some hosts won’t respond when not actively
engaged
– For some responses, need to trick host into
echoing back what it saw
• Have to take churn into account
– At a large site, something’s always changing
– Lack of identity due to NAT, DHCP
– Our implementation takes ≈ 5 sec/host
Countering Evasion-by-Ambiguity:
Normalization
• Idea (w/ Mark Handley, Christian Kreibich):
Introduce network element to rewrite traffic
passing through it to eliminate ambiguities
– E.g., regenerate low TTLs (dicey!)
– E.g., regularize flags, unused fields
– E.g., trim out-of-window data
– E.g., reassemble streams & remove inconsistent
retransmissions
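A minimal sketch of three of these normalizations, assuming a simplified `Segment` record rather than real packet headers. The field names, the `min_ttl=32` floor, and the relative sequence numbers are illustrative assumptions; stream reassembly and removal of inconsistent retransmissions are not covered here.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Segment:
    ttl: int
    reserved: int   # header bits that should be zero
    seq: int        # relative sequence number of the first payload byte
    payload: bytes


def normalize(seg, rcv_next, rcv_wnd, min_ttl=32):
    """Return a copy of seg with ambiguities removed.

    rcv_next / rcv_wnd describe the receiver's window as tracked by the
    normalizer: only bytes in [rcv_next, rcv_next + rcv_wnd) are kept.
    """
    ttl = max(seg.ttl, min_ttl)   # regenerate low TTLs (the "dicey" one)
    start = max(seg.seq, rcv_next)
    end = min(seg.seq + len(seg.payload), rcv_next + rcv_wnd)
    if end > start:
        payload = seg.payload[start - seg.seq:end - seg.seq]
        seq = start
    else:
        payload, seq = b"", seg.seq   # nothing falls inside the window
    return replace(seg, ttl=ttl, reserved=0, seq=seq, payload=payload)


raw = Segment(ttl=2, reserved=5, seq=96, payload=b"0123456789")
print(normalize(raw, rcv_next=100, rcv_wnd=4))
# Segment(ttl=32, reserved=0, seq=100, payload=b'4567')
```

After normalization, the monitor and the end host have far fewer ways to disagree about what this segment means.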
Issues for Normalization
• Effect on end-to-end semantics?
– Some normalizations harmless (e.g., inconsistent
streams)
– Some actually improve protocol (e.g., reliable
RSTs)
– Some degrade performance in the presence of
cold start (e.g., stripping TCP window scaling)
• Performance: element is in-line
– Prototype (1.1 GHz): 400 Mbps
– Would like to use custom hardware …
Robust Hardware for Analyzing Traffic
in the Presence of Adversaries
• Ongoing work w/ Sarang Dharmapurikar
(WUSTL)
• Basic building-block for boosting network
analysis: in-line TCP stream reassembly
– If data arrives in-sequence, hand it to
analyzer module
– If data arrives out-of-sequence, it creates a
“hole”
• Buffer for later delivery
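The building block could look roughly like this for a single connection, assuming relative sequence numbers starting at 0 and an analyzer exposed as a callback. `InlineReassembler` and its method names are hypothetical, and this software sketch says nothing about how the hardware design organizes its memory.

```python
class InlineReassembler:
    """Deliver in-sequence bytes immediately; buffer data that follows a hole."""

    def __init__(self, deliver):
        self.deliver = deliver  # analyzer callback, called with in-order bytes
        self.next_seq = 0       # next byte offset the analyzer expects
        self.pending = {}       # seq -> payload for data beyond a hole

    def buffered_bytes(self):
        return sum(len(p) for p in self.pending.values())

    def on_segment(self, seq, payload):
        if seq > self.next_seq:
            # Out of sequence: a hole precedes this data, so buffer it.
            self.pending.setdefault(seq, payload)
            return
        self._deliver_from(seq, payload)
        # Filling the hole may release buffered segments.
        while self.pending:
            first = min(self.pending)
            if first > self.next_seq:
                break
            self._deliver_from(first, self.pending.pop(first))

    def _deliver_from(self, seq, payload):
        fresh = payload[self.next_seq - seq:]  # skip bytes already delivered
        if fresh:
            self.deliver(fresh)
            self.next_seq += len(fresh)


out = []
r = InlineReassembler(out.append)
r.on_segment(6, b"ot....")   # arrives early: creates a hole at offsets 0-5
print(r.buffered_bytes())    # 6
r.on_segment(0, b"....ro")   # fills the hole; buffered data is released
print(b"".join(out))         # b'....root....'
```

`buffered_bytes()` is the quantity the following slides ask about: how much memory holes consume, and what happens when an adversary inflates it.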
• How hard can it be?
How Much Buffer for Holes
Do We Need?
• Most previous work says: “Zero”
– Skip out-of-sequence packets
• Commercial work says: “Yes”
– Claim out-of-sequence packets buffered, but with
no details
• Answer for sound operation depends critically
on whether we consider adversaries …
Measured Buffer Required Per-Hole
Measured Duration of Holes
Instantaneous Aggregate Hole Buffer
How Much Buffer for Holes
Do We Need?, cont’d
• Trace analysis says: a few hundred KB
suffices even for a large site’s access link ...
• … But: an adversary can maliciously create
holes, overflowing the buffer. On overflow,
we can either:
– Stop analyzing evicted connection, allowing
adversary to evade
– Kill unanalyzable connection, allowing adversary
to inflict collateral damage
Adversary-Resistant
Stream Reassembly
• Trace analysis also says:
– Very few connections have concurrent holes
• Can limit adversary to one hole per connection
– No hosts have concurrent connections w/ holes
• Can limit adversary to one hole per Zombie
• Consider randomized eviction:
– If buffer size >> requirements of legit connections,
then most evictions evict the attacker’s own holes
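A toy simulation of randomized eviction, assuming a page-granular buffer that evicts one uniformly random page when full. `HoleBuffer`, `simulate`, and all the sizes are illustrative assumptions, and the per-connection and per-host hole limits from the slide are not modeled.

```python
import random


class HoleBuffer:
    def __init__(self, total_pages, rng=None):
        self.total_pages = total_pages
        self.pages = []  # one entry per page: the owner of that page
        self.rng = rng or random.Random()

    def store(self, owner):
        """Store one page for owner; return the evicted page's owner, if any."""
        evicted = None
        if len(self.pages) >= self.total_pages:
            victim = self.rng.randrange(len(self.pages))
            evicted = self.pages[victim]
            self.pages[victim] = self.pages[-1]  # swap-remove the victim
            self.pages.pop()
        self.pages.append(owner)
        return evicted


def simulate(total_pages=1000, legit_pages=10, attack_pages=5000, seed=0):
    """Fill a few legitimate pages, then flood with attacker pages."""
    buf = HoleBuffer(total_pages, random.Random(seed))
    evictions = {"legit": 0, "attacker": 0}
    for _ in range(legit_pages):
        buf.store("legit")
    for _ in range(attack_pages):
        owner = buf.store("attacker")
        if owner is not None:
            evictions[owner] += 1
    return evictions


print(simulate())
```

Because the buffer is 100 times larger than the legitimate demand in this setup, at most 10 of the 4010 evictions can hit legitimate pages; the rest land on the attacker's own holes.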
Zombie Equations
• Let:
– M, P = total memory (pages) available for holes
– M_l, P_l = memory (pages) for legitimate holes
– e = tolerable eviction rate for legit. connections
– r = rate at which a zombie can transmit (bytes/sec)
– g = page size (granularity) for hole buffer
– Z = # of zombies required to achieve eviction rate
• Then for attacker creating small/large holes:
Zombie Implications
• If we only terminate connections with > 2
packets buffered; allow each connection
10KB of buffer; and use 512MB DRAM …
• … then collateral damage rate X of legitimate
connections terminated per second is:
⇒ By throwing memory at the problem, we can
weather a large attack
Summary
• The lay of the land has changed
– Ecosystem of endemic hostility
• Adversaries can exploit ambiguity and
pressures of holding state to evade
detection or inflict collateral damage
• Internet protocols not designed with
“wire analysis” in mind …
• … But it is possible to design to address
these issues if they are properly
considered
Summary, cont’d
• Network analysis amidst adversaries is
a new area:
– Did not talk about: application-level
evasion, polymorphism, tunneling,
compromising passive monitors
– In many ways, reminiscent of Internet
measurement a decade ago:
• Low-hanging fruit
• Daunting problems
• Fun!