Tamper Resistant Network Tracing


Bunker: A Tamper Resistant
Platform for Network Tracing
Stefan Saroiu
University of Toronto
Motivation

Today’s traces help build tomorrow’s systems

ISPs view raw network traces as a liability
 Traces can compromise user privacy
 Protecting users’ privacy increasingly important
Trace anonymization mitigates these issues
Offline Anonymization

Trace anonymized after raw data is collected
 Privacy risk until raw data is deleted
Today’s traces require deep packet inspection
 Headers insufficient to understand phishing or P2P
 Payload traces pose a serious privacy risk
Risk to user privacy is too high
 Two universities rejected offline anonymization
Offline’s Privacy Vulnerabilities

Two types of attacks:
1. Traditional: Network intrusion attacks
2. New: Raw data can be subpoenaed
Both universities required that subpoenas
would not affect privacy
Online Anonymization

Trace anonymized while tracing
 Raw data resides in RAM only
Difficult to meet performance demands
 Extraction and anonymization must be done at line speeds
Code is frequently buggy and difficult to maintain
 Low-level languages (e.g. C) + “Home-made” parsers
Small bugs cause large amounts of data loss
 Introduces consistent bias against long-lived flows
Simple Tasks can be Very Slow

Regular expression for phishing:
" ((password)|(<form)|(<input)|(PIN)|(username)|(<script)|
(user id)|(sign in)|(log in)|(login)|(signin)|(log on)|
(sign on)|(signon)|(passcode)|(logon)|(account)|(activate)|(verify)|
(payment)|(personal)|(address)|(card)|(credit)|(error)|(terminated)|
(suspend))[^A-Za-z]"

libpcre: 5.5 s for 30 MB = 44 Mbps max
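
The throughput bound is plain arithmetic, and the pattern can be tried in any regex engine. The sketch below uses Python's `re` module, not libpcre, so it illustrates the pattern and the calculation only and does not reproduce the 5.5 s timing. The function names are illustrative, and the leading space inside the quoted pattern is dropped on the assumption that it is a slide artifact.

```python
import re

# Keyword pattern from the slide: any listed keyword followed by a non-letter.
# Assumption: the space after the opening quote on the slide is not part of it.
PHISHING_RE = re.compile(
    r"((password)|(<form)|(<input)|(PIN)|(username)|(<script)|"
    r"(user id)|(sign in)|(log in)|(login)|(signin)|(log on)|"
    r"(sign on)|(signon)|(passcode)|(logon)|(account)|(activate)|(verify)|"
    r"(payment)|(personal)|(address)|(card)|(credit)|(error)|(terminated)|"
    r"(suspend))[^A-Za-z]"
)


def looks_phishy(payload: str) -> bool:
    """Return True if the payload contains any of the slide's keywords."""
    return PHISHING_RE.search(payload) is not None


def max_throughput_mbps(megabytes: float, seconds: float) -> float:
    """Convert 'N megabytes scanned in T seconds' into megabits per second."""
    return megabytes * 8 / seconds


print(looks_phishy("Please verify your account now"))
print(round(max_throughput_mbps(30, 5.5)))
```

Scanning 30 MB in 5.5 s gives 30 × 8 / 5.5 ≈ 43.6 Mbps, which the slide rounds to 44 Mbps.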
Our solution: Bunker

Combines best of both worlds
 Same privacy benefits as online anonymization
 Same engineering benefits as offline anonymization

Pre-load analysis and anonymization code
 Lock it and throw away the key (tamper-resistance)
Threat Model

Accidental disclosure:
 Risk is substantial whenever humans are handling data
Subpoenas:
 Attacker has physical access to tracing system
 Subpoenas force researchers and ISPs to cooperate
  As long as cooperation is not “unduly burdensome”
Implication: Nobody can have access to raw data
Is Developing Bunker Legal?
It Depends on Intent of Use

Developing Bunker is like developing encryption
Must consider purpose and uses of Bunker
 Developing Bunker for user privacy is legal
 Misuse of Bunker to bypass law is illegal
Outline





Motivation
Design of our platform
System evaluation
Case study: Phishing
Conclusions
Logical Design
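[Figure: data-flow diagram. Capture Hardware feeds the stages capture, assemble, parse, and anonymize, which are grouped into Online and Offline parts. The anonymize stage uses the Anon. Key, and anonymized data leaves through a One-Way Interface.]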
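
To make the stage names in the diagram concrete, here is a minimal sketch of an assemble, parse, anonymize chain. It is an illustration under stated assumptions and not Bunker's code: the packet tuple layout, the record fields, and the use of HMAC-SHA256 as the keyed anonymization function are all hypothetical choices made for this example.

```python
import hashlib
import hmac
import secrets

# Hypothetical anonymization key, generated inside the box and never exported.
ANON_KEY = secrets.token_bytes(32)


def assemble(packets):
    """Group (flow_id, src_ip, payload) packets into per-flow records."""
    flows = {}
    for flow_id, src_ip, payload in packets:
        flow = flows.setdefault(flow_id, {"src_ip": src_ip, "data": b""})
        flow["data"] += payload
    return list(flows.values())


def parse(flow):
    """Extract only the fields the study needs (here, just the byte count)."""
    return {"src_ip": flow["src_ip"], "bytes": len(flow["data"])}


def anonymize(record, key=ANON_KEY):
    """Replace the source address with a keyed hash (a stable pseudonym)."""
    digest = hmac.new(key, record["src_ip"].encode(), hashlib.sha256).hexdigest()
    return {"src": digest[:16], "bytes": record["bytes"]}


def run_pipeline(packets):
    """Only the output of this function would cross the one-way interface."""
    return [anonymize(parse(flow)) for flow in assemble(packets)]


packets = [(1, "10.0.0.1", b"ab"), (1, "10.0.0.1", b"cd"), (2, "10.0.0.2", b"x")]
for row in run_pipeline(packets):
    print(row)
```

Because the hash is keyed, the same address always maps to the same pseudonym within one trace, while the mapping cannot be recomputed without the key.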
VM-based Implementation
[Figure, shown in two builds: the same stages placed in virtual machines on a Hypervisor. The Closed-box VM holds capture, encrypt and decrypt (Enc. Key), assemble, parse, and anonymize (Anon. Key), with Encrypted Raw Data stored between the Online and Offline parts. A One-Way Socket carries anonymized data to the Open-box VM, which is labeled with save trace, logging, and maintenance. Capture Hardware and an Open-box NIC sit below the Hypervisor.]
Benefits

Strong privacy properties
 Raw trace and other sensitive data cannot be leaked
Trace processing done offline
 Can use your favorite language!
 Parsing can be done with off-the-shelf components
Key Technologies

“Closed-box” VM protects sensitive data
 Contains all raw trace data & processing code
 No interactive access to closed-box (e.g. no console)
Encryption protects on-disk data
 Randomly generated key held in volatile memory
 Data cannot be decrypted upon reboot
“Safe-on-reboot” VM mitigates hardware attacks
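
The "key held in volatile memory" idea can be sketched as follows. This is a toy illustration only: the slides do not say which cipher Bunker uses, and the HMAC-based keystream below is a standard-library stand-in for a vetted cipher such as AES. The class name `EphemeralStore` and its methods are hypothetical.

```python
import hashlib
import hmac
import secrets


class EphemeralStore:
    """Encrypts data under a random key that lives only in process memory."""

    def __init__(self):
        # The key is never written to disk, so it is lost on restart.
        self._key = secrets.token_bytes(32)

    def _keystream(self, nonce: bytes, length: int) -> bytes:
        # Toy keystream: HMAC-SHA256 over nonce || counter (illustration only).
        blocks = []
        counter = 0
        while 32 * len(blocks) < length:
            msg = nonce + counter.to_bytes(8, "big")
            blocks.append(hmac.new(self._key, msg, hashlib.sha256).digest())
            counter += 1
        return b"".join(blocks)[:length]

    def encrypt(self, plaintext: bytes) -> bytes:
        nonce = secrets.token_bytes(16)
        stream = self._keystream(nonce, len(plaintext))
        return nonce + bytes(p ^ s for p, s in zip(plaintext, stream))

    def decrypt(self, blob: bytes) -> bytes:
        nonce, body = blob[:16], blob[16:]
        stream = self._keystream(nonce, len(body))
        return bytes(c ^ s for c, s in zip(body, stream))


store = EphemeralStore()
blob = store.encrypt(b"raw packet bytes")   # what would sit on disk
print(store.decrypt(blob))                  # readable while the key is in memory

rebooted = EphemeralStore()                 # a restart produces a fresh key
print(rebooted.decrypt(blob) == b"raw packet bytes")
```

A new instance models a reboot: the on-disk blob is still there, but the key that produced it is gone, so the data stays unreadable.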
Software Engineering Benefits
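[Figure: bar chart of lines of code (0 to 60,000), split into Python and C, for three tracing systems: UW, Toronto, and Bunker. Value labels, in extraction order: 63,382; 53,995; 1,350; 5,512.]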
One order of magnitude difference between online and offline
Development time: Bunker - 2 months, UW/Toronto - years
Work Deferral
[Figure: line plot of queue size (GB, 0 to 200) against time of day, from 12:00 PM through 12:00 PM the next day.]
Don’t do now what you can do later
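
The deferral idea can be sketched as a queue that absorbs bursts and drains later. The arrival and drain rates below are made-up numbers for illustration and are not taken from the plot; `simulate_queue` is a hypothetical helper.

```python
def simulate_queue(arrivals_gb, drain_gb_per_hour):
    """Return the backlog (GB) left at the end of each hour."""
    backlog = 0.0
    history = []
    for arrived in arrivals_gb:
        # New raw data joins the queue; offline processing drains what it can.
        backlog = max(0.0, backlog + arrived - drain_gb_per_hour)
        history.append(backlog)
    return history


# Hypothetical load: six busy hours followed by six quiet hours.
arrivals = [30.0] * 6 + [5.0] * 6
history = simulate_queue(arrivals, drain_gb_per_hour=20.0)
print(history)
```

While arrivals exceed the drain rate the backlog grows; once traffic drops, the same processing rate works the queue back down to zero, so the processing code never has to keep up with peak line speed.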
Error Recovery
[Figure: stacked bar chart, % of flows per tracer]
                     Online Tracer    Tamper Resistant Tracer
Parsing OK           68.20%           99.92%
Collateral damage    31.72%           (none shown)
Parsing errors       0.08%            0.08%
Small bugs lead to small errors in the trace -- not huge gaps
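
One way to read this result is that offline processing can handle each flow in isolation, so a parser bug costs only the flow that triggers it. The sketch below shows that pattern with a toy parser; it is an assumption about the general technique, not a description of how Bunker is implemented, and `parse_flow` and `parse_trace` are hypothetical names.

```python
def parse_flow(raw: bytes) -> dict:
    """Toy parser: expects b'METHOD path' and raises on anything else."""
    method, path = raw.split(b" ", 1)
    return {"method": method.decode(), "path": path.decode()}


def parse_trace(flows):
    """Parse every flow independently; a failure drops only that flow."""
    parsed, failed = [], 0
    for raw in flows:
        try:
            parsed.append(parse_flow(raw))
        except Exception:
            failed += 1  # count the error and continue with the next flow
    return parsed, failed


parsed, failed = parse_trace([b"GET /a", b"garbage", b"POST /b"])
print(len(parsed), failed)
```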
Phishing is Bad





Costs U.S. economy hundreds of millions
Affects 1+ million U.S. Internet users
2004 - mid 2006: # of phishing sites grew 10x
Banks claim phishing is #1 source of fraud
Phishing messages now personalized

Harder to filter
Two Day Hotmail Trace
Tues Jan 29/08 11:15am - Thurs Jan 31 11:23am,
University of Toronto at Mississauga
                                    Hotmail
Users                               3,062
# of E-mails Received               13,438
# of From Addresses                 7,422
# of To Addresses                   25,456
Median # of Words in E-mail Body    130
Questions



How often are URLs present in e-mails?
How often do people click on links in e-mails?
Do people verify an e-mail for legitimacy
before clicking on a link?
Links in Email
[Figure: bar chart (0% to 100%) comparing Users and Emails on three measures: % with URLs, % with Clicks, and % with Clicks <= 2 s. Value labels, in extraction order: 90.80%, 78.80%, 18.70%, 1.53%, 0.54%, 5.86%.]
Conclusions

Today’s tracing experiments need to look “deep” into
network activity
 IP-level trace vs. email and browse history
Serious privacy concerns
 Physical security isn’t enough: subpoenas
Bunker provides
 the safety of online anonymization
 the simplicity of offline anonymization
Acknowledgments



Andrew Miklas (U. of Toronto)
Alec Wolman (Microsoft Research)
Angela Demke Brown (U. of Toronto)
Questions?
http://www.cs.toronto.edu/~stefan
Design
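[Figure: system design on the Xen Hypervisor. The Closed-box VM (Domain0) is labeled with Online Software, Offline Software, the Enc. Key, the Anon. Key, the Encrypted Raw Trace, and the Capture NIC. The Open-box VM (DomainU) is labeled with Untrusted Software and the Open NIC. A One-Way Interface connects the two VMs.]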
Phishy Mail Leaks through Filters
[Figure: horizontal bar chart of % of emails (0% to 20%) matching each filter rule. Rules: MURTY_PHISHING1, NORMAL_HTTP_TO_IP, SCREENTIP, MURTY_PHISHING3, HTML_OBFUSCATE_05_10, HTML_OBFUSCATE_10_20, SARE_BANK_URI_IP, SARE_SPOOF_BADURL, SARE_EBAY_SPOOF_NAME. Value labels, in extraction order: 4.33%, 2.93%, 0.85%, 17.10%, 0.42%, 0.10%, 0.03%, 0.01%, 0.22%.]
[Figures: three builds of the design diagram. The first shows Capture Hardware feeding capture, assemble, parse, and anonymize (Anon. Key), grouped into Online and Offline parts, and producing an Anonymized Trace. The second places these stages on a Hypervisor with an Inaccessible VM and a Commodity VM joined by a One-Way Socket, and adds save trace, logging, and maintenance. The third adds encrypt and decrypt (Enc. Key) and an Encrypted Raw Trace.]
Overall Privacy Goal
[Figure: timeline with the events Tracing Starts and Tamper Attack on a Time axis, and regions labeled Data Protected and Data Exposed.]
Goal: Ensure that user’s privacy is “no worse off”
when a trace is in progress