Anonymity and Privacy in Electronic Services


Privacy and anonymity
Claudia Diaz
Katholieke Universiteit Leuven
Dept. Electrical Engineering – ESAT/COSIC
UPC Seminar, Barcelona, November 24, 2008
Introducing myself...

2000: I.T.S. de Telecomunicaciones, Universidad de Vigo

Master thesis (PfC) partially done at KULeuven as Erasmus student

2000-2001: Pre-doctoral student at COSIC/KULeuven
2001-2005: PhD student at COSIC/KULeuven
2005-now: Post-doc researcher at COSIC/KULeuven

My work:
Anonymity metrics
Anonymous communications: analysis and attacks
Traffic analysis
Stego file systems
Other PETs: private search, negative databases, ...
http://www.esat.kuleuven.be/~cdiaz
Introducing COSIC (1/2)

Group at the Dept. of Electrical Engineering of the Katholieke Universiteit Leuven
COSIC: COmputer Security and Industrial Cryptography
Professors: Bart Preneel, Ingrid Verbauwhede and Vincent Rijmen
5 post-docs (more arriving soon)
40 PhD students (more arriving soon)
5-15 visitors and external associated researchers at any given time
Very international: 15-25 nationalities, ~60% non-Belgians
Introducing COSIC (2/2)

Research areas
  Symmetric Key Crypto
  Public Key Crypto
  Secure Hardware Implementations
  Identity Management
  Privacy Technologies
  Mobile Secure Networks
  Software Security

Involved in 25-30 active projects
  ~10 European Projects (FP6 and FP7)
  Coordinator of FP6 NoE ECRYPT, FP7 NoE ECRYPT II, FP7 IP TAS3

Biggest success: Rijndael encryption algorithm (AES)
… a few words on the scope of the talk

A new field, still in development
Give an idea of the problems, concepts and solutions for anonymity
Outline

Privacy and anonymity
Anonymous communications
Anonymity metrics
Social networks
Solove on “I have nothing to hide”

"the problem with the ‘nothing to hide’ argument is its
underlying assumption that privacy is about hiding bad
things.“

"Society involves a great deal of friction, and we are
constantly clashing with each other. Part of what makes
a society a good place in which to live is the extent to
which it allows people freedom from the
intrusiveness of others. A society without privacy
protection would be suffocation, and it might not be a
place in which most would want to live."
Claudia Diaz (K.U.Leuven)
7
Diffie and Landau on Internet eavesdropping

Governments are expanding their surveillance powers to protect against crime and terrorism.
BUT:
  Secrecy, lack of transparency, and diminished safeguards may easily lead to abuses.
  "Will the government's monitoring tools be any more secure than the network they are trying to protect? If not, we run the risk that the surveillance facilities will be subverted or actually used against the U.S.A."
They conclude: "Communication is fundamental to our species; private communication is fundamental to both our national security and our democracy. Our challenge is to maintain this privacy in the midst of new communications technologies and serious national security threats. But it is critical to make choices that preserve privacy, communications security and the ability to innovate. Otherwise, all hope of having a free society will vanish."
Privacy properties from a technical point of view

Anonymity: hiding the link between identity and action
Unlinkability: hiding the link between two or more actions / identities
Pseudonymity: one-time pseudonyms / persistent pseudonyms
Unobservability: hiding user activity
Plausible deniability: not possible to prove that a user knows / has done something
The concept of Privacy [Solove]

Privacy threats we are trying to protect against (out of 16 identified by Solove):
Surveillance: monitoring of electronic transactions. Preventive properties: anonymity, unobservability
Interrogation: forcing people to disclose information. Preventive property: plausible deniability
Aggregation: combining several sources of information. Preventive property: unlinkability
Identification: connecting data to individuals. Preventive properties: anonymity and unlinkability
Anonymity – Data and Communication Layers

[Diagram: Alice and Bob communicating; anonymity can be considered at the application (App), communication (Com) and IP layers]
Classical Security Model

• Confidentiality
• Integrity
• Authentication
• Non-repudiation
• Availability

[Diagram: Alice communicates with Bob; Eve is a passive or active eavesdropper]
Anonymity – Concept and Model

[Diagram: a set of Alices (possible senders) and a set of Bobs (possible recipients)]
Anonymity Adversary

• Passive / Active
• Partial / Global
• Internal / External
Recipient? Third parties?
Anonymity Adversary

The adversary will:
  Try to find who is sending messages to whom
  Observe all links (Global Passive Adversary) or only some links
  Modify, delay, delete or inject messages
  Control some nodes in the network

The adversary's limitations:
  Cannot break cryptographic primitives
  Cannot see inside nodes he does not control
Soft privacy-enhancing technologies

Hard privacy:
  Focus on data minimization
  Adversarial data holder / service provider

Soft privacy:
  Policies, access control, liability, right to correct information
  Adversary: 3rd parties, corrupt insider in an honest SP, errors
  BUT the user has already lost control of her data

(Slide taken from G. Danezis)
Other privacy-enhancing technologies

Anonymous credentials / e-cash / ZK protocols
Steganography / covert communication
Censorship resistance techniques
Anonymous publication
Private information retrieval (PIR)
Private search
K-anonymity
Location privacy
Outline

Privacy and anonymity
Anonymous communications
Anonymity metrics
Social networks
Concept of Mix: collect inputs

Router that hides the correspondence between inputs and outputs
Concept of Mix: mix and flush

Router that hides the correspondence between inputs and outputs
Functionality of Mixes

Mixes modify:

The appearance of messages
  Encryption / decryption: Sender → Mix1 : {Mix2, {Rec, msg}_KMix2}_KMix1 (illustrated below)
  Padding / compression
  Substitution of information (e.g., IP)

The flow of messages
  Reordering
  Delaying (real-time requirements!)
  Dummy traffic (cost of traffic!)
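To make the layered-encryption notation above concrete, here is a minimal sketch (my own illustration, not code from the talk) of a sender wrapping a message for a two-mix path. Fernet symmetric keys stand in for the mixes' real public-key (hybrid) encryption, and the helper names (wrap, at_mix1, at_mix2) are hypothetical. Each mix can only peel its own layer, so it learns the next hop but never both sender and recipient.

```python
# Sketch of:  Sender -> Mix1 : {Mix2, {Rec, msg}_KMix2}_KMix1
# Fernet (symmetric) keys stand in for the mixes' real public-key encryption.
from cryptography.fernet import Fernet

mix1 = Fernet(Fernet.generate_key())
mix2 = Fernet(Fernet.generate_key())

def wrap(recipient: bytes, msg: bytes) -> bytes:
    """Sender side: build the two-layer onion."""
    inner = mix2.encrypt(recipient + b"|" + msg)   # {Rec, msg}_KMix2
    return mix1.encrypt(b"Mix2|" + inner)          # {Mix2, inner}_KMix1

def at_mix1(onion: bytes):
    """Mix1 peels one layer: it learns only the next hop, not the recipient."""
    next_hop, _, payload = mix1.decrypt(onion).partition(b"|")
    return next_hop, payload

def at_mix2(payload: bytes):
    """Mix2 peels the last layer and can deliver msg to the recipient."""
    rec, _, msg = mix2.decrypt(payload).partition(b"|")
    return rec, msg

onion = wrap(b"Rec", b"hello")
hop, payload = at_mix1(onion)   # hop == b"Mix2"
rec, msg = at_mix2(payload)     # rec == b"Rec", msg == b"hello"
```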
Pool Mixes

Based on the mix proposed by Chaum in 1981. One round:
1. Collect N inputs
2. Shuffle
3. Flush (forward)

Pool selection algorithm
  No pool / static pool / dynamic pool
  Influences the performance and anonymity provided by the mix

Flushing condition
  Time / threshold
  Deterministic / random
Example of pool mix

Deterministic threshold static pool mix with pool = 2 and threshold = 4: each round the mix waits until 4 new messages have arrived and then flushes all but 2 randomly chosen messages, which stay behind in the pool for the next round (a toy simulation is sketched below).
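As a rough illustration of the example above (my own sketch, with hypothetical class and method names), the following simulates a deterministic threshold static-pool mix with threshold 4 and pool 2.

```python
import random

class ThresholdPoolMix:
    """Toy deterministic-threshold, static-pool mix: every time `threshold`
    new messages have arrived, it flushes everything except `pool_size`
    randomly chosen messages, which stay behind in the pool."""

    def __init__(self, threshold=4, pool_size=2):
        self.threshold = threshold
        self.pool_size = pool_size
        self.buffer = []    # messages currently held by the mix
        self.arrived = 0    # arrivals since the last flush

    def receive(self, msg):
        self.buffer.append(msg)
        self.arrived += 1
        return self.flush() if self.arrived >= self.threshold else []

    def flush(self):
        random.shuffle(self.buffer)   # hide the arrival order
        out, self.buffer = self.buffer[self.pool_size:], self.buffer[:self.pool_size]
        self.arrived = 0
        return out

mix = ThresholdPoolMix()
for i in range(12):
    flushed = mix.receive(f"msg{i}")
    if flushed:
        print("flushed:", flushed, "| pool keeps:", mix.buffer)
```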
Stop-and-Go Mix

Proposed by Kesdogan in 1998
Reordering strategy based on delaying (M/M/∞ queue)
Delays generated by the user from an exponential distribution (sketched below)
Timestamping to prevent active attacks (trusted time service)
Anonymity estimates based on the assumption of Poisson incoming traffic
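A small sketch of the idea (my own illustration; the rate parameters are assumed): the sender draws each hop's delay from an exponential distribution, and under Poisson arrivals the expected number of messages simultaneously inside the mix, i.e. the crowd a message hides in, is the ratio of the arrival rate to the delay rate.

```python
import random

LAMBDA = 10.0   # assumed Poisson arrival rate (messages per second)
MU = 0.5        # assumed rate of the exponential delay (mean delay = 1/MU = 2 s)

def sg_mix_delay() -> float:
    """The sender itself samples the per-hop delay from Exp(MU), as in SG-mixes."""
    return random.expovariate(MU)

# M/M/infinity intuition: the expected number of messages concurrently inside
# the mix (the crowd a target message hides in) is lambda/mu.
print("mean delay per hop:", 1 / MU, "s")
print("expected messages inside the mix:", LAMBDA / MU)
print("example sampled delays:", [round(sg_mix_delay(), 2) for _ in range(5)])
```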
Network topology

Mixes are combined in networks in order to:
  Distribute trust
  Improve availability

[Diagram: example topologies: cascade, fully connected network, restricted route network]
Cascades vs Free Route topologies

Flexibility of routing: advantage free routes
Surface of attack / availability: advantage free routes
Intersection attacks: advantage cascades (anonymity set smaller, but no partitioning possible)
Trust: advantage free routes (more choices available to the user)
Peer-to-peer vs client-server architectures

Symmetric vs asymmetric systems

Surface of attack, liability issues: advantage client-server
Availability, resources / incentives / quality of service: advantage peer-to-peer
Sybil attacks: advantage? Depends on admission controls (for peers/servers)
Deployed systems I

Anon.penet.fi (Helsingius, 1993)
  Simple proxy, substituted email headers
  Kept a table of nym-email correspondences
  Brought down by legal attack in 1996

Type I Cypherpunk remailers (Hughes, Finney, 1996)
  No tables (routing info in the messages themselves)
  PGP encryption (no attacks based on content), but attacks based on size are possible
  Chains of mixes (distribution of trust)
  Reusable reply blocks (source of insecurity)
Deployed systems II

Mixmaster (Cottrell, evolving since 1995)
  Fixed message size (padding / dividing large messages)
  Integrity protection measures
  Multiple paths for better reliability
  No replies

Mixminion (Danezis, 2003)
  SURBs (Single-Use Reply Blocks)
  Packet format: detection of tagging attacks (all-or-nothing)
  Forward security: trail of keys, updated with one-way functions
Low-latency applications

Stream-based instead of message-based communication
  Bi-directional circuits
  Ephemeral session keys, onion-encrypted (forward secrecy)
Real-time requirements
  Web browsing, interactive applications, voice over IP, etc.
  Delaying not an option

Proposed systems
  C-S: Onion Routing, ISDN mixes, Web Mixes
  P2P: Crowds, P5 (broadcast), Herbivore (DC-nets)
Implemented systems: TOR, JAP
Onion encryption

[Diagram: the sender wraps a message for destination D in successive layers of encryption for routers R3, R2, R1]
Onion Routing

[Diagram: the onion is forwarded through R1, R2 and R3; each router peels off one layer of encryption until the innermost message reaches D]
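Generalising the earlier two-mix sketch, the toy code below (again my own illustration; Fernet stands in for the routers' real hybrid encryption and the helper names are hypothetical) builds a three-layer onion and lets each router peel exactly one layer, learning only the next hop.

```python
from cryptography.fernet import Fernet

# Hypothetical 3-hop path; Fernet keys stand in for the routers' real keys.
path = ["R1", "R2", "R3"]
keys = {name: Fernet(Fernet.generate_key()) for name in path}

def build_onion(dest: str, payload: bytes) -> bytes:
    """Sender side: wrap the innermost layer (for R3) first, the outermost (R1) last."""
    hops = path + [dest]                 # R1 -> R2 -> R3 -> D
    onion = payload
    for i in range(len(path) - 1, -1, -1):
        onion = keys[path[i]].encrypt(hops[i + 1].encode() + b"|" + onion)
    return onion

def route(onion: bytes):
    """Each router peels one layer: it learns the next hop, nothing more."""
    for name in path:
        next_hop, _, onion = keys[name].decrypt(onion).partition(b"|")
        print(f"{name} forwards to {next_hop.decode()}")
    return next_hop.decode(), onion

dest, msg = route(build_onion("D", b"hello"))   # dest == "D", msg == b"hello"
```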
TOR – adversary model
Anonymizing http traffic is not trivial

Difficult to conceal the traffic pattern
Difficult to pad
  Lots of padding: scalability / cost problem
  Little padding: not enough to conceal the pattern
Vulnerable to strong adversaries (entry + exit), e.g. at Internet exchanges
Fingerprinting attacks: the adversary observes only the user side
Dummy Traffic

Fake messages/traffic introduced to confuse the attacker
Indistinguishable from real traffic
Dummies improve anonymity by making traffic analysis more difficult
Necessary for unobservability
Expensive
(Some) Attacks on mix-based systems

Passive attacks
  Long-term intersection attacks (statistical disclosure)
    Persistent communication patterns
    Extract social network information
  Traffic correlation / confirmation
  Fingerprinting
  Source separation
  Epistemic attacks (route selection)

Active attacks
  N-1 attacks
  Sybil
  Tagging
  Replay
  DoS
Long-Term Intersection Attacks

Family of attacks with many variants:
  Disclosure attack (Agrawal, Kesdogan)
  Hitting set attack (Kesdogan)
  Statistical disclosure attack (Danezis, Serjantov)
  Extensions to SDA (Dingledine and Mathewson)
  Two-sided SDA (Danezis, Diaz, Troncoso)
  Receiver-bound cover traffic (Mallesh, Wright)

Assumption: Alice has persistent communication relationships (she communicates repeatedly with her friends)
Intuition: if we observe rounds in which Alice sends, her likely recipients will appear frequently
Method: combine many observations (looking at who receives when Alice sends)
Result: we can create a vector that expresses Alice's sending probabilities (a sending profile); a toy version is sketched below
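The sketch below (a deliberately simplified illustration, not the estimator from any of the papers cited above) shows the intuition: recipients that appear more often in rounds where Alice sends than in background rounds score highest in her estimated sending profile.

```python
from collections import Counter

def sending_profile(rounds_with_alice, background_rounds):
    """Toy statistical-disclosure sketch: each round is just the list of
    recipients who received a message in that round."""
    def freq(rounds):
        counts = Counter(r for rnd in rounds for r in rnd)
        total = sum(counts.values())
        return {r: c / total for r, c in counts.items()}

    alice, bg = freq(rounds_with_alice), freq(background_rounds)
    # Recipients over-represented in Alice's rounds are likely her contacts.
    return {r: alice[r] - bg.get(r, 0.0) for r in alice}

# Hypothetical observations: Bob shows up unusually often when Alice sends.
with_alice = [["Bob", "Carol"], ["Bob", "Dave"], ["Bob", "Eve"], ["Carol", "Bob"]]
background = [["Carol", "Dave"], ["Eve", "Dave"], ["Carol", "Eve"]]
profile = sending_profile(with_alice, background)
print(sorted(profile.items(), key=lambda kv: -kv[1]))   # Bob scores highest
```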
Outline

Privacy and anonymity
Anonymous communications
Anonymity metrics
Social networks
Definition [PfiHan2000]

First clear definition of anonymity (2000):
  Anonymity is the state of being not identifiable within a set of subjects, the anonymity set.
  The anonymity set is the set of all possible subjects who might cause an action or be addressed.

Anonymity depends on:
  The number of subjects in the anonymity set
  The probability of each subject in the anonymity set being the target
Example: computing anonymity metrics for a pool mix

[Diagram: a pool mix delivering messages to recipients R1, R2, R3, R4, ...]
Potentially infinite subjects in the recipient anonymity set
Probability of recipient Ri: p_i = 2^-i (p_1 = 2^-1, p_2 = 2^-2, p_3 = 2^-3, p_4 = 2^-4, ...)
Entropy: information-theoretic anonymity metrics [DSCP02, SD02]

Entropy: measure of the amount of information required on average to describe the random variable
Measure of the uncertainty of a random variable
Increases with N and with the uniformity of the distribution

H = -\sum_{i=1}^{N} p_i \log_2 p_i

A distribution with entropy H is equivalent to a uniform distribution over 2^H subjects (computed below for the pool-mix example)
Other information-theoretic metrics: min-entropy, max-entropy, Rényi entropy, relative entropy, mutual information, ...
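As a quick check (my own example, not from the slides), the metric can be computed for the recipient distribution p_i = 2^-i of the pool-mix example above; the geometric series gives H = 2 bits, i.e. the equivalent of four equiprobable recipients.

```python
import math

def entropy(probs):
    """Shannon entropy H = -sum(p_i * log2(p_i)) of the attacker's
    probability distribution over the anonymity set."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Recipient distribution from the pool-mix example: p_i = 2^-i
# (truncated at 60 terms; the remaining tail mass is negligible).
probs = [2.0 ** -i for i in range(1, 60)]
H = entropy(probs)
print(f"H = {H:.4f} bits -> equivalent uniform anonymity set of 2^H = {2**H:.2f}")
```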
Combining traffic analysis with social network knowledge

"On the Impact of Social Network Profiling on Anonymity" [DTS-PETS08]
Use of Bayesian inference to combine different sources of information (SN knowledge, text mining)

Results:
  "Combined" anonymity decreases with network growth (if only one source of information is considered, growing the network does not decrease anonymity)
  Anonymity degrades as more profiles become known
  Errors (bad-quality SN profile information):
    Small profile errors lead to large errors in the results
    If the adversary is not completely confident of the SN info, he cannot have any confidence in the results
Combinatorial approaches

Edman et al.
  Consider deanonymization for the system as a whole (instead of individual users)
  Find perfect matchings between inputs and outputs
  Perfect anonymity for t messages: t! equiprobable combinations (see the toy computation below)

GTDPV-WPES08
  Edman et al.'s metric overestimates anonymity if users send/receive more than one message
  Generalization to users sending/receiving more than one message
  Divide-and-conquer algorithm to compute the metric
  Upper and lower bounds on anonymity (easy to compute)
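Here is a small brute-force illustration of the combinatorial idea (my own sketch; Edman et al. actually use the permanent of a doubly stochastic probability matrix, whereas this 0/1 version simply counts the perfect matchings that remain possible).

```python
from itertools import permutations
from math import factorial, log

def permanent(A):
    """Brute-force permanent of a small 0/1 matrix: the number of perfect
    matchings between inputs and outputs consistent with A."""
    n = len(A)
    return sum(all(A[i][p[i]] for i in range(n)) for p in permutations(range(n)))

def combinatorial_anonymity(A):
    """log(perm(A)) / log(t!): 1 means perfect anonymity (all t! matchings
    possible), 0 means a single matching remains (full deanonymisation)."""
    t = len(A)
    return log(permanent(A)) / log(factorial(t))

full = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]      # mix leaks nothing: 3! matchings
partial = [[1, 1, 0], [1, 1, 0], [0, 0, 1]]   # one output is already linked
print(combinatorial_anonymity(full), combinatorial_anonymity(partial))
```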
Outline

Privacy and anonymity
Anonymous communications
Anonymity metrics
Social networks
A social network approach

Most anonymity techniques focus on channels and one-to-one relationships
Mailing lists, twikis, blogs, online social networks:
  Many-to-many communication (communities)
  Shared spaces and collective content (group photo albums, forums)
  Moreover, information about third parties (not in the community) is also leaked (pictures, stories, references in posts)
This scenario presents new privacy challenges
“The Economics of Mass Surveillance” [DanWit06]

Model: people and clubs
Assumptions:
  If one of a club's participants is under surveillance, all information shared in this club, and the membership list of the club, become known to Eve
  Eve has a limited budget (the number of people she can put under surveillance)
Target selection

How best to choose those to be put under surveillance, to maximize returns in terms of observed clubs, people and their relationships?
How does the lack of information, due to the use of an anonymizing network, affect the effectiveness of such target selection?

Data: mail archives of a political network, 2003-2006
  Clubs = mailing lists (373)
  People = email addresses posting more than 5 times to those mailing lists (2338)
  Links = number of posts of a person over 3 years (3879)
With and without anonymizing network

Without:
  Eve can observe all traffic flows (not the content) and thus construct the social graph (links between people and clubs)
  Eve chooses to put under surveillance the nodes with the highest degrees, excluding all spaces already under surveillance
  Result: surveilling 8% of the people is enough to control 100% of the clubs
  (A toy version of this greedy target selection is sketched below.)

With:
  Eve can only observe the aggregated amount of traffic generated by each person
  Eve puts under surveillance the nodes that generate the most volume
  Results: 50% of links known when surveilling 5% of the people; to get 80% of the links the adversary needs to observe 30% of the people (vs. 3%)
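The toy code below (my own sketch with made-up data; in [DanWit06] the clubs are the mailing lists and the people the posters) illustrates the greedy target selection described above: repeatedly pick the person whose surveillance exposes the most clubs not yet observed.

```python
def greedy_targets(memberships, budget):
    """Pick `budget` people by remaining degree: each new target exposes all
    clubs they belong to, so degrees are re-counted over unobserved clubs."""
    observed, targets = set(), []
    for _ in range(budget):
        person = max(memberships, key=lambda p: len(memberships[p] - observed))
        if not memberships[person] - observed:
            break                      # nothing new left to expose
        targets.append(person)
        observed |= memberships[person]
    return targets, observed

# Hypothetical people -> clubs graph.
memberships = {
    "anna": {"list1", "list2", "list3"},
    "ben":  {"list2", "list4"},
    "cho":  {"list4", "list5"},
    "dan":  {"list1"},
}
targets, observed = greedy_targets(memberships, budget=2)
print(targets, "->", sorted(observed))   # ['anna', 'cho'] -> all 5 lists
```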
The Economics of Covert Community Detection and Hiding [Nag08]

Adversary tries to uncover communities and learn community membership
Counter-surveillance techniques that can be deployed by covert groups (e.g., topological rewiring)
"Our study confirms results from previous work showing that invasion of privacy of most people is cheap and easy"
With certain counter-surveillance strategies: "70% of the covert community going undetected even when 99% of the network is under direct surveillance"
Conclusions

Privacy / anonymity is an area of research getting increasing attention
  Technical, legal, sociological, psychological, economic, political, philosophical aspects

Economics of privacy:
  Crypto: little overhead → lots of security
  Anonymity: lots of overhead → a little bit of security

Privacy invasion:
  Profitable for businesses
  Increases the power of government

High-latency applications (email): problems with persistent user behavior
Low-latency applications: insecure towards strong adversaries

Anonymous communications are fragile. If you want to propose a new system:
  Check the literature
  Check known attacks
(Some) Open Questions

Do individuals care enough to pay the price of privacy? Will they in the future? What is privacy anyway?
Will privacy technology be implemented to re-establish the tradeoffs and power balances of the pre-information era? Or will society and individuals redefine privacy and power tradeoffs and adapt their behaviour to accommodate a reality of privacy invasion and surveillance?

Anonymity metrics:
  Do we really understand what anonymity is?
  Which is the best way to measure anonymity? Which are the "adequate levels"?
  Current metrics are very limited in application (e.g., how to measure privacy losses due to personal data becoming public?)
  Real user behavioral models
  Models for adversary knowledge

Solutions for low-latency anonymous communication?
Solutions for social networks?
Thank you!
Recommended bibliography on the subject:
http://www.freehaven.net/anonbib/