Tamper Resistant Network Tracing
Bunker: A Tamper Resistant Platform for Network Tracing
Stefan Saroiu
University of Toronto
Motivation
Today’s traces help build tomorrow’s systems
ISPs view raw network traces as a liability
Traces can compromise user privacy
Protecting users’ privacy increasingly important
Trace anonymization mitigates these issues
Offline Anonymization
Trace anonymized after raw data is collected
Today’s traces require deep packet inspection
Privacy risk until raw data is deleted
Headers insufficient to understand phishing or P2P
Payload traces pose a serious privacy risk
Risk to user privacy is too high
Two universities rejected offline anonymization
Offline’s Privacy Vulnerabilities
Two types of attacks:
Traditional: Network intrusion attacks
New: Raw data can be subpoenaed
Both universities required that subpoenas would not affect privacy
Online Anonymization
Trace anonymized while tracing
Difficult to meet performance demands
Extraction and anonymization must be done at line speeds
Code is frequently buggy and difficult to maintain
Raw data resides in RAM only
Low-level languages (e.g. C) + “Home-made” parsers
Small bugs cause large amounts of data loss
Introduces consistent bias against long-lived flows
Simple Tasks can be Very Slow
Regular expression for phishing:
" ((password)|(<form)|(<input)|(PIN)|(username)|(<script)|
(user id)|(sign in)|(log in)|(login)|(signin)|(log on)|
(sign on)|(signon)|(passcode)|(logon)|(account)|(activate)|(verify)|
(payment)|(personal)|(address)|(card)|(credit)|(error)|(terminated)|
(suspend))[^A-Za-z]"
libpcre: 5.5 s for 30 MB = 44 Mbps max
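The throughput figure is simple arithmetic: 30 MB scanned in 5.5 s works out to roughly 44 Mbps. The sketch below reproduces that calculation and times the same pattern with Python's `re` module instead of libpcre, so absolute timings will differ from the slide. The pattern is transcribed without the surrounding quotes and without the space after the opening quote, on the assumption that the space is a transcription artifact. The helper names `throughput_mbps` and `measure` are hypothetical.

```python
import re
import time

# The phishing keyword pattern from the slide, written as one alternation.
PHISHING_RE = re.compile(
    r"((password)|(<form)|(<input)|(PIN)|(username)|(<script)|"
    r"(user id)|(sign in)|(log in)|(login)|(signin)|(log on)|"
    r"(sign on)|(signon)|(passcode)|(logon)|(account)|(activate)|(verify)|"
    r"(payment)|(personal)|(address)|(card)|(credit)|(error)|(terminated)|"
    r"(suspend))[^A-Za-z]"
)


def throughput_mbps(num_bytes, seconds):
    """Convert bytes scanned in a given time to megabits per second."""
    return num_bytes * 8 / seconds / 1e6


def measure(payload):
    """Time one full scan of payload and return the observed rate in Mbps."""
    start = time.perf_counter()
    for _ in PHISHING_RE.finditer(payload):
        pass
    # Guard against a zero reading from the timer on very small inputs.
    elapsed = max(time.perf_counter() - start, 1e-9)
    return throughput_mbps(len(payload.encode("utf-8")), elapsed)


# The slide's figure: 30 MB in 5.5 s.
slide_rate = throughput_mbps(30e6, 5.5)
```

Calling `measure` on a representative payload gives a rough local rate to compare against the link speed being traced.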
Our solution: Bunker
Combines best of both worlds
Same privacy benefits as online anonymization
Same engineering benefits as offline anonymization
Pre-load analysis and anonymization code
Lock it and throw away the key (tamper-resistance)
Threat Model
Accidental disclosure:
Risk is substantial whenever humans are handling data
Subpoenas:
Attacker has physical access to tracing system
Subpoenas force researchers and ISPs to cooperate
As long as cooperation is not “unduly burdensome”
Implication: Nobody can have access to raw data
Is Developing Bunker Legal?
It Depends on Intent of Use
Developing Bunker is like developing encryption
Must consider purpose and uses of Bunker
Developing Bunker for user privacy is legal
Misuse of Bunker to bypass law is illegal
Outline
Motivation
Design of our platform
System evaluation
Case study: Phishing
Conclusions
Logical Design
[Diagram: Capture Hardware feeds capture (Online); assemble, parse, and anonymize with the Anon. Key run Offline; anonymized data leaves through a One-Way Interface]
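The stages in the logical design can be sketched as a small pipeline in which only anonymized records cross the one-way interface. This is a minimal illustration under stated assumptions, not Bunker's implementation: the packet format, the field names, the function names, and the use of HMAC-SHA256 as the keyed pseudonym function are all hypothetical choices made for the example.

```python
import hashlib
import hmac
import secrets
from collections import defaultdict

# Hypothetical anonymization key; generated at start-up and never exported.
ANON_KEY = secrets.token_bytes(32)


def capture():
    """Stand-in for the online stage: yield (src_ip, flow_id, payload) packets."""
    yield ("10.0.0.1", 1, b"GET /index.html ")
    yield ("10.0.0.2", 2, b"GET /login ")
    yield ("10.0.0.1", 1, b"HTTP/1.1\r\n")


def assemble(packets):
    """Group packet payloads into per-flow byte streams."""
    flows = defaultdict(bytes)
    for src_ip, flow_id, payload in packets:
        flows[(src_ip, flow_id)] += payload
    return flows


def parse(flows):
    """Extract the fields of interest from each reassembled flow."""
    for (src_ip, _flow_id), data in flows.items():
        method = data.split()[0]
        yield {"src_ip": src_ip, "method": method.decode()}


def anonymize(records):
    """Replace identifying fields with keyed pseudonyms before anything is emitted."""
    for rec in records:
        digest = hmac.new(ANON_KEY, rec["src_ip"].encode(), hashlib.sha256).hexdigest()
        yield {"src": digest[:16], "method": rec["method"]}


def one_way_interface():
    """Only anonymized records ever cross this boundary."""
    return list(anonymize(parse(assemble(capture()))))


trace = one_way_interface()
```

Because the pseudonym depends on a key that never leaves the process, the same source address maps to the same token within one trace, while the raw address does not appear in the output.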
VM-based Implementation
[Diagram: Capture Hardware and an Open-box NIC sit below the Hypervisor. Closed-box VM: capture and encrypt with the Enc. Key (Online), Encrypted Raw Data, then decrypt, assemble, parse, and anonymize with the Anon. Key (Offline). A One-Way Socket connects the Closed-box VM to the Open-box VM, which handles save trace, logging, and maintenance]
Benefits
Strong privacy properties
Raw trace and other sensitive data cannot be leaked
Trace processing done offline
Can use your favorite language!
Parsing can be done with off-the-shelf components
Key Technologies
“Closed-box” VM protects sensitive data
Contains all raw trace data & processing code
No interactive access to closed-box (e.g. no console)
Encryption protects on-disk data
Randomly generated key held in volatile memory
Data cannot be decrypted upon reboot
“Safe-on-reboot” VM mitigates hardware attacks
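The volatile-key idea can be illustrated in a few lines: a key that exists only in memory makes previously written ciphertext unreadable once the process (or machine) restarts. The sketch below is an assumption-laden toy, not the cipher Bunker uses. The class name `VolatileKeyStore` is hypothetical, and the SHAKE-256 keystream is used only because it is available in the Python standard library; a real system would use a vetted, authenticated cipher.

```python
import hashlib
import secrets


class VolatileKeyStore:
    """Holds a random key in process memory only (hypothetical helper)."""

    def __init__(self):
        self._key = secrets.token_bytes(32)  # never written to disk

    def _keystream(self, nonce, length):
        # Toy keystream from SHAKE-256; illustration only, not a vetted cipher.
        return hashlib.shake_256(self._key + nonce).digest(length)

    def encrypt(self, plaintext):
        nonce = secrets.token_bytes(16)
        stream = self._keystream(nonce, len(plaintext))
        return nonce + bytes(a ^ b for a, b in zip(plaintext, stream))

    def decrypt(self, blob):
        nonce, body = blob[:16], blob[16:]
        stream = self._keystream(nonce, len(body))
        return bytes(a ^ b for a, b in zip(body, stream))


before_reboot = VolatileKeyStore()
on_disk = before_reboot.encrypt(b"raw packet payload")

# A reboot discards the old key; the new instance draws a fresh one.
after_reboot = VolatileKeyStore()
```

Here `before_reboot.decrypt(on_disk)` recovers the payload, while `after_reboot.decrypt(on_disk)` yields unrelated bytes because the original key no longer exists anywhere.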
Software Engineering Benefits
[Bar chart: lines of code (Python and C) per tracing system. UW: 63,382; Toronto: 53,995; Bunker: 1,350 and 5,512]
One order of magnitude between online and offline
Development time: Bunker - 2 months, UW/Toronto - years
Work Deferral
[Plot: Queue Size (GB, 0 to 200) over Time, from 12:00 PM to 12:00 PM the next day]
Don’t do now what you can do later
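Work deferral can be pictured as a queue that absorbs the gap between capture rate and processing rate: the backlog grows during busy hours and drains when traffic is light. The sketch below simulates that behaviour. The hourly volumes and the processing capacity are hypothetical numbers chosen for illustration, not measurements from the talk.

```python
def simulate_queue(arrivals_gb, capacity_gb):
    """Return the backlog (GB) after each interval for a fixed processing rate."""
    backlog = 0.0
    history = []
    for arrived in arrivals_gb:
        # Process up to capacity_gb per interval; the rest waits in the queue.
        backlog = max(0.0, backlog + arrived - capacity_gb)
        history.append(backlog)
    return history


# Hypothetical hourly capture volumes (GB): a busy period, then a quiet one.
arrivals = [30, 30, 30, 30, 5, 5, 5, 5, 5, 5]
history = simulate_queue(arrivals, capacity_gb=15)
```

The processing stage only needs to keep up with the average load, as long as the encrypted queue has room for the peak backlog.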
Error Recovery
[Bar chart: % of Flows]
Online Tracer: Parsing OK 68.20%, Collateral damage 31.72%, Parsing errors 0.08%
Tamper Resistant Tracer: Parsing OK 99.92%, Parsing errors 0.08%
Small bugs lead to small errors in the trace -- not huge gaps
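One way to read this result: when parsing runs offline over stored flows, each flow can be parsed independently, so a parser bug costs only the flow that triggers it. The sketch below shows that isolation pattern. The parser, its input format, and the names `parse_flow` and `parse_all` are hypothetical and stand in for whatever off-the-shelf parsers are actually used.

```python
def parse_flow(data):
    """Hypothetical parser: expects 'METHOD PATH' and raises on anything else."""
    method, path = data.split(" ", 1)
    if method not in {"GET", "POST"}:
        raise ValueError(f"unknown method: {method!r}")
    return {"method": method, "path": path}


def parse_all(flows):
    """Parse each stored flow independently so one failure stays contained."""
    parsed, failed = [], []
    for flow_id, data in flows.items():
        try:
            parsed.append((flow_id, parse_flow(data)))
        except ValueError:
            failed.append(flow_id)  # record it and move on; other flows are unaffected
    return parsed, failed


flows = {1: "GET /a", 2: "garbage", 3: "POST /b", 4: "BREW /pot"}
parsed, failed = parse_all(flows)
```

Flows 2 and 4 end up in `failed`, while flows 1 and 3 are parsed normally, which mirrors the small, contained error rate in the chart.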
Phishing is Bad
Costs U.S. economy hundreds of millions
Affects 1+ million U.S. Internet users
2004 - mid 2006: # of phishing sites grew 10x
Banks claim phishing is #1 source of fraud
Phishing messages now personalized
Harder to filter
Two Day Hotmail Trace
Tues Jan 29/08 11:15am - Thurs Jan 31 11:23am, University of Toronto at Mississauga
Hotmail
Users: 3,062
# of E-mails Received: 13,438
# of From Addresses: 7,422
# of To Addresses: 25,456
Median # of Words in E-mail Body: 130
Questions
How often are URLs present in e-mails?
How often do people click on links in e-mails?
Do people verify an e-mail for legitimacy before clicking on a link?
Links in Email
[Bar chart: % of Users and % of Emails with URLs, with Clicks, and with Clicks <= 2 s. Values shown: 90.80%, 78.80%, 18.70%, 5.86%, 1.53%, 0.54%]
Conclusions
Today’s tracing experiments need to look “deep” into network activity
Serious privacy concerns
IP-level trace vs. email and browsing history
Physical security isn’t enough: subpoenas
Bunker provides
the safety of online anonymization
the simplicity of offline anonymization
Acknowledgments
Andrew Miklas (U. of Toronto)
Alec Wolman (Microsoft Research)
Angela Demke Brown (U. of Toronto)
Questions?
http://www.cs.toronto.edu/~stefan
Design
[Diagram: XEN Hypervisor hosting a Closed-box VM (Domain0) and an Open-box VM (DomainU) joined by a One-Way Interface. Labels: Online Software, Offline Software, Enc. Key, Anon. Key, Encrypted Raw Trace, Untrusted Software, Capture NIC, Open NIC]
Phishy Mail Leaks through Filters
[Bar chart: % of Emails (0% to 20%) matching each filter rule. Rules: MURTY_PHISHING1, NORMAL_HTTP_TO_IP, SCREENTIP, MURTY_PHISHING3, HTML_OBFUSCATE_05_10, HTML_OBFUSCATE_10_20, SARE_BANK_URI_IP, SARE_SPOOF_BADURL, SARE_EBAY_SPOOF_NAME. Values shown: 4.33%, 2.93%, 0.85%, 17.10%, 0.42%, 0.10%, 0.03%, 0.01%, 0.22%]
[Diagram: Capture Hardware feeds capture (Online); assemble, parse, and anonymize with the Anon. Key run Offline and produce the Anonymized Trace]
[Diagram: the same pipeline inside an Inaccessible VM on a Hypervisor; a One-Way Socket leads to a Commodity VM that handles save trace, logging, and maintenance]
[Diagram: the same design with encrypt and decrypt stages using an Enc. Key, and an Encrypted Raw Trace stored between the Online and Offline stages]
Overall Privacy Goal
[Timeline: Time axis marked with Tracing Starts and Tamper Attack; regions labeled Data Protected and Data Exposed]
Goal: Ensure that users’ privacy is “no worse off” when a trace is in progress