20070327_tuesdaysem

802.11 User Fingerprinting
Jeff Pang, Ben Greenstein, Ramki Gummadi, Srini Seshan, and David Wetherall
Most slides borrowed from Ben
Location Privacy is at Risk
Your MAC address:
00:0E:35:CE:1F:59
Usually < 100m
You
“The adversary”
(a.k.a., some dude with a laptop)
Are pseudonyms enough?
MAC address now:
00:0E:35:CE:1F:59
MAC address later:
00:AA:BB:CC:DD:EE
Implicit Identifiers Remain

Consider one user at SIGCOMM 2004

Visible in an “anonymized” trace






MAC addresses scrubbed
Effectively a pseudonym
Transferred 512 MB via BitTorrent
=> Crappy performance for everyone else
Let’s call him Bob
Can we figure out who Bob is?
Implicit Identifier: SSIDs

SSIDs in Probe Requests



Windows XP and Mac OS X probe for your preferred networks by default
Set of networks advertised in a traffic sample
Determined by a user’s preferred networks list
SSID Probe:
“roofnet”
Bob
What if Bob used pseudonyms?



The “roofnet” probe occurred during a different session than the BitTorrent download
Can no longer explicitly associate “roofnet” with poor network etiquette
Can we do it implicitly?
Implicit Identifier:
Network Destinations

Network Destinations



Set of IP <address, port> pairs in a traffic sample
In the SIGCOMM trace, each pair is visited by 1.15 users on average
A user is likely to visit a site repeatedly
(e.g., an email server)
SSH/IMAP server:
159.16.40.45
Bob
What if network is encrypted?


Can’t see IP addresses through link-layer encryption like WPA
Is Bob safe now?
Implicit Identifier:
Broadcast Packet Sizes

Broadcast Packet Sizes



Set of 802.11 broadcast packet sizes in a traffic sample
E.g., Windows machines’ NetBIOS naming advertisements; FileMaker and Microsoft Office advertise themselves
In the SIGCOMM trace, there are only 16% more unique <application, size> tuples than unique sizes, so a size alone nearly identifies the application
Broadcast packet sizes:
239, 245, 257
Bob
Implicit Identifier:
MAC Protocol Fields

MAC Protocol Fields



Header bits (e.g., power mgmt., order)
Supported rates
Offered authentication algorithms
MAC Protocol Fields:
11,4,2,1Mbps, WEP, etc.
Bob
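The four implicit identifiers above can each be treated as a set collected from one traffic sample. The following is a minimal sketch of such a container, not the authors' code. The names `TrafficSample` and `build_sample`, the `(kind, value)` record format, and port 22 for the SSH server are all assumptions made for illustration; the example values are taken from the slides.

```python
from dataclasses import dataclass, field


@dataclass
class TrafficSample:
    """Implicit identifiers seen in one traffic sample (hypothetical layout)."""
    ssids: set = field(default_factory=set)        # SSIDs in probe requests
    netdests: set = field(default_factory=set)     # (ip_address, port) pairs
    bcast_sizes: set = field(default_factory=set)  # broadcast packet sizes
    mac_fields: set = field(default_factory=set)   # rates, auth algorithms, ...


def build_sample(records):
    """Fold (kind, value) records into a TrafficSample.

    The record format is an assumption; a real tool would parse frames.
    """
    sample = TrafficSample()
    targets = {
        "ssid": sample.ssids,
        "netdest": sample.netdests,
        "bcast": sample.bcast_sizes,
        "field": sample.mac_fields,
    }
    for kind, value in records:
        if kind not in targets:
            raise ValueError(f"unknown identifier kind: {kind!r}")
        targets[kind].add(value)
    return sample


# Example values from the slides; port 22 is an assumed SSH port.
bob = build_sample([
    ("ssid", "roofnet"),
    ("netdest", ("159.16.40.45", 22)),
    ("bcast", 239), ("bcast", 245), ("bcast", 257),
    ("field", "WEP"),
])
```

Keeping each identifier as a plain set makes the later set-similarity comparison straightforward.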
What else do implicit identifiers tell us?
David J. Wetherall
Anonymized 802.11 Traces from SIGCOMM 2004
Search on Wigle for “djw” in the Seattle area
A pseudonym
Google pinpoints David’s home (to within 200 ft)
Automating Implicit Identifiers
TRAINING: Collect some traffic known to be from Bob
OBSERVATION: Which traffic is from Bob?
Methodology

Simulate using SIGCOMM, UCSD traces

“The adversary”


Split trace into
training data and
observation data
Sample = 1 hour of traffic to/from a user
Assume pseudonyms
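The methodology above (hourly samples, then a training/observation split) might be sketched as follows. This is an illustration under stated assumptions, not the authors' tooling: the record format `(timestamp_seconds, pseudonym, feature)`, the function names, and the cutoff-by-hour split are all hypothetical.

```python
from collections import defaultdict

SAMPLE_SECONDS = 3600  # one sample = 1 hour of traffic to/from a user


def split_into_samples(records):
    """Group (timestamp, pseudonym, feature) records into hourly samples.

    Returns a dict mapping (pseudonym, hour_index) to the set of features
    observed for that pseudonym during that hour.
    """
    samples = defaultdict(set)
    for timestamp, pseudonym, feature in records:
        hour_index = int(timestamp // SAMPLE_SECONDS)
        samples[(pseudonym, hour_index)].add(feature)
    return dict(samples)


def split_train_observe(samples, cutoff_hour):
    """Samples before cutoff_hour are training data, the rest observation."""
    train = {k: v for k, v in samples.items() if k[1] < cutoff_hour}
    observe = {k: v for k, v in samples.items() if k[1] >= cutoff_hour}
    return train, observe


# Toy records with made-up timestamps and pseudonyms.
records = [
    (10, "a", "roofnet"),
    (20, "a", "linksys"),
    (3700, "a", "roofnet"),
    (50, "b", "djw"),
]
samples = split_into_samples(records)
train, observe = split_train_observe(samples, cutoff_hour=1)
```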
Did this traffic sample come
from Bob?
Naïve Bayesian Classifier:
We say sample s (with features fi) is from Bob if
Pr[s from Bob | s has features fi] > T
How to convert implicit identifiers into features?
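The decision rule above can be sketched with the standard naïve Bayes assumption that features are conditionally independent. This is a minimal illustration, not the authors' implementation: the function names, the prior, and the per-feature likelihood values are hypothetical, and all likelihoods are assumed to be strictly positive.

```python
import math


def posterior_from_bob(likelihoods_bob, likelihoods_other, prior_bob):
    """Pr[s from Bob | features], assuming independent features.

    likelihoods_bob[i]   = Pr[f_i | sample is from Bob]
    likelihoods_other[i] = Pr[f_i | sample is not from Bob]
    All probabilities must be strictly between 0 and 1 (assumption).
    """
    log_bob = math.log(prior_bob) + sum(math.log(p) for p in likelihoods_bob)
    log_other = math.log(1 - prior_bob) + sum(
        math.log(p) for p in likelihoods_other
    )
    # Normalize in log space to avoid underflow with many features.
    shift = max(log_bob, log_other)
    num = math.exp(log_bob - shift)
    return num / (num + math.exp(log_other - shift))


def is_from_bob(likelihoods_bob, likelihoods_other, prior_bob, threshold):
    """Say the sample is from Bob if the posterior exceeds threshold T."""
    return posterior_from_bob(likelihoods_bob, likelihoods_other, prior_bob) > threshold


# Made-up numbers for two features.
p = posterior_from_bob([0.9, 0.8], [0.1, 0.2], prior_bob=0.5)
```

Raising the threshold T trades true positives for fewer false positives, which is the knob behind the TPR/FPR figures quoted later.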
Did This Traffic Sample Come
from Bob?
Features:
Set similarity (Jaccard Index), weighted by frequency:
[Figure: SSID sets for a profile from training and a sample for validation, with SSIDs (linksys, djw, IR_Guest, SIGCOMM_1) placed on a rare-to-common scale]
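A frequency-weighted Jaccard index can be sketched as below. The slides do not give the exact weighting, so this assumes each element is weighted by the inverse of the number of users who emit it, so that rare elements count more than common ones. The user counts and the split of SSIDs between profile and sample are made up for illustration.

```python
def weighted_jaccard(profile, sample, user_counts):
    """Jaccard index where element e has weight 1 / user_counts[e].

    Elements missing from user_counts are treated as seen by one user
    (an assumption), giving them the maximum weight of 1.
    """
    def weight(element):
        return 1.0 / user_counts.get(element, 1)

    union = profile | sample
    if not union:
        return 0.0
    shared = profile & sample
    return sum(weight(e) for e in shared) / sum(weight(e) for e in union)


# Hypothetical counts: "djw" is rare, "linksys" and "SIGCOMM_1" are common.
user_counts = {"djw": 1, "IR_Guest": 2, "linksys": 50, "SIGCOMM_1": 100}
profile = {"djw", "linksys", "IR_Guest"}
sample = {"djw", "linksys", "SIGCOMM_1"}

similarity = weighted_jaccard(profile, sample, user_counts)
unweighted = len(profile & sample) / len(profile | sample)
```

With these made-up counts the weighted score is higher than the unweighted one, because the match on the rare SSID dominates while the mismatch on the common SSID barely matters.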
Individual Feature Accuracy

60% TPR with 99% TNR (1% FPR)


Higher FPR, likely due to not
being user specific
Useful in combination with
other features, to rule out
identities
Multi-feature Accuracy

Samples from 1 in 4 users are identified
>50% of the time with 0.001 FPR
[Plot legend: bcast + ssids; bcast + ssids + fields; bcast + ssids + fields + netdests]
Was Bob here today?

Maybe…



Suppose N users present
Over an 8-hour day, 8*N opportunities to misclassify a user’s traffic
Instead, say Bob is present iff
multiple samples are classified as his
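The effect of requiring multiple matching samples can be illustrated with a binomial tail. This sketch assumes Bob is absent, that each of the 8*N hourly samples is misclassified as his independently with the same per-sample false positive rate, and that N = 25 and FPR = 0.01; those values and the function name are assumptions for illustration only.

```python
from math import comb


def prob_at_least(k, n, p):
    """Pr[at least k successes in n independent trials with probability p]."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


num_users = 25          # N users present (assumed)
hours = 8               # one sample per user per hour
fpr = 0.01              # per-sample false positive rate (assumed)
opportunities = hours * num_users  # 8*N chances to misclassify

# Probability of wrongly declaring Bob present when requiring
# at least 1, 2, or 3 samples classified as his.
false_presence = {k: prob_at_least(k, opportunities, fpr) for k in (1, 2, 3)}
```

Under these assumptions the false-presence probability drops as the required number of matching samples grows, which is the motivation for the rule above.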
Was Bob here today?



In a busy coffee shop
with 25 concurrent
users, more than half
(54%) can be
identified with 90%
accuracy
Median of 4 hours to detect (4 samples)
27% can be identified with 99% accuracy (two 9s)
Conclusion:
Pseudonyms Are Insufficient

4 new identifiers: netdests, ssids, fields, bcast
Average user emits highly distinguishing identifiers
Adversary can combine features

Future





Uncover more identifiers (timing, etc.)
Validate on longer/more diverse traces
(SSIDs stable in home setting for >=2 weeks)
Build a better link layer