CS 259
2008
Anonymity and
Probabilistic Model Checking
John Mitchell
Course schedule
Lectures
• Prob. model checking, other tools (with examples)
Homework 2
• Posted last week, due Tues Feb 12
• Simple exercises using probabilistic tool
Projects
• Presentation 2: Feb 19, 21
– Describe tool or approach and the properties you will check
• Presentation 3: Mar 4 – 13 (two or three meetings)
– Final results: turn in slides and tool input
Dining Cryptographers
Clever idea how to make a message public in a
perfectly untraceable manner
• David Chaum. “The dining cryptographers problem:
unconditional sender and recipient untraceability.”
Journal of Cryptology, 1988.
Guarantees information-theoretic anonymity for
message senders
• This is an unusually strong form of security: defeats
adversary who has unlimited computational power
Impractical, requires huge amount of randomness
• In group of size N, need N random bits to send 1 bit
Three-Person DC Protocol
Three cryptographers are having dinner.
Either NSA is paying for the dinner, or
one of them is paying, but wishes to remain anonymous.
1. Each diner flips a coin and shows it to his left neighbor.
• Every diner will see two coins: his own and his right neighbor’s.
2. Each diner announces whether the two coins are the same. If he is the payer, he lies (says the opposite).
3. Odd number of “same” ⇒ NSA is paying; even number of “same” ⇒ one of them is paying
• But a non-payer cannot tell which of the other two is paying!
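The three steps above can be sketched in Python; the function names and representation (True = "same") are illustrative, not part of the protocol description.

```python
import random

def dining_cryptographers(payer=None):
    """One round of the 3-person protocol.  `payer` is the index (0..2)
    of the paying diner, or None if the NSA pays.  Returns the list of
    announcements (True = "same")."""
    coins = [random.randrange(2) for _ in range(3)]
    announcements = []
    for i in range(3):
        # Diner i compares his own coin with his right neighbor's coin
        same = coins[i] == coins[(i + 1) % 3]
        if i == payer:
            same = not same  # the payer lies
        announcements.append(same)
    return announcements

def nsa_paid(announcements):
    # Odd number of "same" announcements means the NSA is paying
    return sum(announcements) % 2 == 1

assert all(nsa_paid(dining_cryptographers(None)) for _ in range(100))
assert not any(nsa_paid(dining_cryptographers(0)) for _ in range(100))
```

The parity works because each coin is compared twice, so the "different" announcements always cancel in pairs; a single lie flips the overall parity.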
Non-Payer’s View: Same Coins
[Diagram: the non-payer’s two coins are the same; in each of two scenarios he hears “same” from one of the other diners and “different” from the other, and in either scenario one of them could be the payer]
Without knowing the coin toss between the other two, non-payer cannot tell which of them is lying
Non-Payer’s View: Different Coins
[Diagram: the non-payer’s two coins are different; in each of two scenarios he hears “same” from both of the other diners, and in either scenario one of them could be the payer]
Without knowing the coin toss between the other two, non-payer cannot tell which of them is lying
Superposed Sending
This idea generalizes to any group of size N
For each bit of the message, every user generates
1 random bit and sends it to 1 neighbor
• Every user learns 2 bits (his own and his neighbor’s)
Each user announces (own bit XOR neighbor’s bit)
Sender announces (own bit XOR neighbor’s bit XOR
message bit)
XOR of all announcements = message bit
• Every randomly generated bit occurs in this sum twice
(and is canceled by XOR), message bit occurs once
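The cancellation argument above can be checked directly in Python; this is a sketch of one ring round, with names assumed for illustration.

```python
import random

def dc_net_round(n, sender=None, message_bit=0):
    """Superposed sending of one bit in a ring of n users.  bits[i] is the
    random bit user i generates and shares with one neighbor, so user i
    knows bits[i] (his own) and bits[(i - 1) % n] (his neighbor's)."""
    bits = [random.randrange(2) for _ in range(n)]
    announcements = []
    for i in range(n):
        a = bits[i] ^ bits[(i - 1) % n]  # own bit XOR neighbor's bit
        if i == sender:
            a ^= message_bit             # sender also XORs in the message bit
        announcements.append(a)
    # Every random bit appears in exactly two announcements and cancels,
    # so the XOR of all announcements equals the message bit.
    result = 0
    for a in announcements:
        result ^= a
    return result

assert dc_net_round(10, sender=3, message_bit=1) == 1
assert dc_net_round(10, sender=3, message_bit=0) == 0
assert dc_net_round(10) == 0  # nobody sends: XOR of all announcements is 0
```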
DC-Based Anonymity is Impractical
Requires secure pairwise channels between
group members
• Otherwise, random bits cannot be shared
Requires massive communication overhead and
large amounts of randomness
DC-net (a group of dining cryptographers) is robust even if some members collude
• Guarantees perfect anonymity for the other members
A great protocol to analyze
• Difficult to reason about each member’s knowledge
Definitions of Anonymity
“Anonymity is the state of being not identifiable
within a set of subjects.”
• There is no such thing as absolute anonymity
Unlinkability of action and identity
• E.g., sender and his email are no more related within the system than they are related in a priori knowledge
Unobservability
• Any item of interest (message, event, action) is
indistinguishable from any other item of interest
“Anonymity is bullshit” - Joan Feigenbaum
Anonymity and Knowledge
Anonymity deals with hiding information
• User’s identity is hidden
• Relationship between users is hidden
• User cannot be identified within a set of suspects
Natural way to express anonymity is to state
what the adversary should not know
• Good application for logic of knowledge
• Not supported by conventional formalisms for
security (process calculi, I/O automata, …)
To determine whether anonymity holds, need
some representation of knowledge
k-Anonymity
Basic idea
• Someone robbed the bank
• Detectives know that it is one of k people
Advantage
• Does not involve probability
Disadvantages
• Does not involve probability
• Depends on absence of additional information
Data Anonymity
Problem: de-identifying data does not necessarily make it anonymous. It can often be re-identified:
[Diagram: linkage attack. De-identified medical data (ethnicity, visit date, diagnosis, procedure, medication, total bill) and public voter lists (name, address, date registered, party, date last voted) overlap on ZIP, birth date, and sex]
SOURCE: LATANYA SWEENEY
Date of birth, gender + 5-digit ZIP uniquely
identifies 87.1% of U.S. population
[Scatter plot: each dot is one ZIP code. ZIP 60623 has 112,167 people, of whom 11% — not 0% — are uniquely identified; insufficient number of people over 55 live there]
SOURCE: LATANYA SWEENEY
Anonymity via Random Routing
Hide message source by routing it randomly
• Popular technique: Crowds, Freenet, Onion Routing
Routers don’t know for sure if the apparent source
of a message is the true sender or another router
• Only secure against local attackers!
Onion Routing
[Reed, Syverson, Goldschlag ’97]
[Diagram: Alice’s message travels through randomly chosen routers R1, R2, R3, R4 — among many routers, some honest, some hostile — to Bob]
Sender chooses a random sequence of routers
• Some routers are honest, some hostile
• Sender controls the length of the path
• Similar to a MIX cascade
Goal: hostile routers shouldn’t learn that Alice is talking to Bob
The Onion
[Diagram: Alice → R1 → R2 → R3 → R4 → Bob]
Alice hands R1 the onion:
{R2,k1}pk(R1), { {R3,k2}pk(R2), { {R4,k3}pk(R3), { {B,k4}pk(R4), {M}pk(B) }k4 }k3 }k2 }k1
• Routing info for each link encrypted with router’s public key
• Each router learns only the identity of the next router
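The layered structure can be mimicked with plain nesting in Python. The tuples below merely stand in for the per-layer encryptions {next hop, key}pk(R); this sketch performs no actual cryptography.

```python
def wrap_onion(route, message):
    """Build an onion for route = [R1, R2, ..., Rn]; `message` goes to the
    final recipient.  Each tuple layer (next_hop, inner) stands in for an
    encryption under the current router's public key -- no real crypto here."""
    onion = ("deliver", message)
    for hop in reversed(route[1:]):
        onion = (hop, onion)
    return onion  # handed to route[0]

# Alice builds an onion for R1 -> R2 -> R3; each router "peels" one layer
onion = wrap_onion(["R1", "R2", "R3"], "M for Bob")
next_hop, inner = onion           # R1 learns only that R2 is next
assert next_hop == "R2"
next_hop, inner = inner           # R2 learns only that R3 is next
assert next_hop == "R3"
tag, msg = inner                  # R3 delivers the innermost payload
assert (tag, msg) == ("deliver", "M for Bob")
```

With real per-layer encryption, the "learns only the next router" property comes from each router being able to decrypt only its own layer.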
Crowds System
[Reiter, Rubin ‘98]
Messages encrypted with shared symmetric keys
[Diagram: sender C0 forwards to C1, which forwards to C2, then C3; each crowd member forwards to another random member with probability pf or sends to the recipient with probability 1-pf; the path ends at C4, which delivers to the recipient]
Routers form a random path when establishing connection
• In onion routing, random path is chosen in advance by sender
After receiving a message, honest router flips a biased coin
• With probability pf randomly selects next router and forwards msg
• With probability 1-pf sends directly to the recipient
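The biased-coin forwarding rule can be sketched as a small Monte Carlo simulation; function and parameter names are assumed for illustration.

```python
import random

def build_path(n_routers, p_f, sender=0):
    """Sketch of Crowds path construction.  The sender always forwards to a
    random router; each subsequent router forwards again with probability
    p_f, or sends to the recipient with probability 1 - p_f."""
    path = [sender, random.randrange(n_routers)]
    while random.random() < p_f:
        path.append(random.randrange(n_routers))
    return path  # the last router on the path sends to the recipient

random.seed(0)
paths = [build_path(10, 0.8) for _ in range(10000)]
avg_len = sum(len(p) for p in paths) / len(paths)
# Geometric forwarding: expected extra hops = p_f / (1 - p_f) = 4,
# so the average path length should be close to 2 + 4 = 6
assert 5.0 < avg_len < 7.0
```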
Probabilistic Notions of Anonymity
Beyond suspicion
• The observed source of the message is no more
likely to be the true sender than anybody else
Probable innocence
• Probability that the observed source of the message is the true sender is less than 50%
• Guaranteed by Crowds if there are sufficiently many honest routers:
  Ngood + Nbad ≥ pf/(pf − 0.5) · (Nbad + 1)
Possible innocence
• Non-trivial probability that the observed source of the message is not the true sender
A Couple of Issues
Is probable innocence enough?
[Pie chart: one suspect with probability 49%, many others with 1% each]
Maybe Ok for “plausible deniability”
Multiple-paths vulnerability
• Can attacker relate multiple paths from same sender?
– E.g., browsing the same website at the same time of day
• Each new path gives attacker a new observation
• Can’t keep paths static since members join and leave
Probabilistic Model Checking
Participants are finite-state machines
• Same as Murφ
State transitions are probabilistic
• Transitions in Murφ are nondeterministic
[Diagram: a state with outgoing transitions labeled 0.2, 0.3, 0.5]
Standard intruder model
• Same as Murφ: model cryptography with abstract data types
Murφ question:
• Is “bad state” reachable?
Probabilistic model checking question:
• What’s the probability of reaching “bad state”?
Discrete-Time Markov Chains
A discrete-time Markov chain is a tuple (S, s0, T, L)
S is a finite set of states
s0 ∈ S is an initial state
T : S × S → [0,1] is the transition relation
• For every s ∈ S: Σs'∈S T(s,s’) = 1
L is a labeling function
Markov Chain: Simple Example
[Diagram: s0 = A with transitions A →0.8 B, A →0.2 C; B →0.1 C, B →0.9 D; C →0.5 E, C →0.5 D; D →1.0 D; E →1.0 E]
Probabilities of outgoing transitions sum up to 1.0 for every state
• Probability of reaching E from s0 is 0.2·0.5 + 0.8·0.1·0.5 = 0.14
• The chain has infinite paths if state graph has loops
– Need to solve a system of linear equations to compute probabilities
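The linear system mentioned above can be solved by simple fixed-point iteration; this Python sketch encodes the example chain as a dictionary (representation assumed for illustration).

```python
def reach_prob(T, target, states, tol=1e-12):
    """Probability of eventually reaching `target` from each state,
    computed by iterating the linear equations x(s) = sum_t T(s,t) x(t)
    with x(target) fixed at 1."""
    x = {s: 0.0 for s in states}
    x[target] = 1.0
    for _ in range(10000):
        new = {s: 1.0 if s == target else
               sum(p * x[t] for t, p in T[s].items()) for s in states}
        if max(abs(new[s] - x[s]) for s in states) < tol:
            return new
        x = new
    return x

# The example chain from the slide
T = {
    "A": {"B": 0.8, "C": 0.2},
    "B": {"C": 0.1, "D": 0.9},
    "C": {"E": 0.5, "D": 0.5},
    "D": {"D": 1.0},   # absorbing
    "E": {"E": 1.0},   # absorbing target
}
probs = reach_prob(T, "E", list(T))
assert abs(probs["A"] - 0.14) < 1e-9   # matches 0.2*0.5 + 0.8*0.1*0.5
```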
PRISM
[Kwiatkowska et al., U. of Birmingham]
Probabilistic model checker
System specified as a Markov chain
• Parties are finite-state machines w/ local variables
• State transitions are associated with probabilities
– Can also have nondeterminism (Markov decision processes)
• All parameters must be finite
Correctness condition specified as PCTL formula
Computes probabilities for each reachable state
– Enumerates reachable states
– Solves system of linear equations to find probabilities
PRISM Syntax
[Diagram: the same example chain, with states A..E numbered 1..5]
module Simple
  state: [1..5] init 1;
  [] state=1 -> 0.8: (state’=2) + 0.2: (state’=3);
  [] state=2 -> 0.1: (state’=3) + 0.9: (state’=4);
  [] state=3 -> 0.5: (state’=4) + 0.5: (state’=5);
endmodule
IF state=3 THEN with prob. 50% assign 4 to state, with prob. 50% assign 5 to state
Modeling Crowds with PRISM
Model probabilistic path construction
Each state of the model corresponds to a
particular stage of path construction
• 1 router chosen, 2 routers chosen, …
Three probabilistic transitions
• Honest router chooses next router with probability pf,
terminates the path with probability 1-pf
• Next router is probabilistically chosen from N candidates
• Chosen router is hostile with certain probability
Run path construction protocol several times and
look at accumulated observations of the intruder
PRISM: Path Construction in Crowds
module crowds
. . .
// N = total # of routers, C = # of corrupt routers
// badC = C/N, goodC = 1-badC
// Next router is corrupt with certain probability
[] (!good & !bad) ->
     goodC: (good’=true) & (revealAppSender’=true) +
     badC: (badObserve’=true);
// Forward with probability PF, else deliver
[] (good & !deliver) ->
     PF: (pIndex’=pIndex+1) & (forward’=true) +
     notPF: (deliver’=true);
. . .
endmodule
PRISM: Intruder Model
module crowds
. . .
// Record the apparent sender and deliver
[] (badObserve & appSender=0) ->
(observe0’=observe0+1) & (deliver’=true);
. . .
// Record the apparent sender and deliver
[] (badObserve & appSender=15) ->
(observe15’=observe15+1) & (deliver’=true);
. . .
endmodule
• For each observed path, bad routers record apparent sender
• Bad routers collaborate, so treat them as a single attacker
• No cryptography, only probabilistic inference
PCTL Logic
[Hansson, Jonsson ‘94]
Probabilistic Computation Tree Logic
Used for reasoning about probabilistic temporal
properties of probabilistic finite state spaces
Can express properties of the form “under any
scheduling of processes, the probability that event
E occurs is at least p’’
• By contrast, Murφ can express only properties of the form “does event E ever occur?”
PCTL Syntax
State formulas
• First-order propositions over a single state
  φ ::= True | a | φ ∧ φ | ¬φ | P>p[ψ]
  – a is a predicate over state variables (just like a Murφ invariant)
  – P>p[ψ]: path formula ψ holds with probability > p
Path formulas
• Properties of chains of states
  ψ ::= Xφ | φ U≤k φ | φ U φ
  – Xφ: the state formula holds in the next state of the chain
  – φ1 U φ2: the first state formula holds for every state in the chain until the second becomes true
PCTL: State Formulas
A state formula is a first-order state predicate
• Just like non-probabilistic logic
φ = (y>1) | (x=1)
[Diagram: the example chain; φ is True in states (x=1,y=1) and (x=1,y=2), False in states (x=3,y=0) and (x=2,y=0)]
PCTL: Path Formulas
A path formula is a temporal property of a chain of states
• φ1 U φ2 = “φ1 is true until φ2 becomes true”
φ = (y>0) U (x>y) holds for this chain
[Diagram: s0 = (x=1,y=1) → (x=1,y=2) → (x=3,y=0)]
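The until semantics can be checked mechanically on a finite path; this minimal sketch represents states as (x, y) pairs, an encoding assumed for illustration.

```python
def until_holds(path, phi1, phi2):
    """Check phi1 U phi2 on a finite path: some state satisfies phi2,
    and phi1 holds in every state strictly before it."""
    for state in path:
        if phi2(state):
            return True
        if not phi1(state):
            return False
    return False

# The chain from the slide, as (x, y) pairs
path = [(1, 1), (1, 2), (3, 0)]
holds = until_holds(path,
                    lambda s: s[1] > 0,      # phi1: y > 0
                    lambda s: s[0] > s[1])   # phi2: x > y
assert holds  # (y>0) U (x>y) holds for this chain
```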
PCTL: Probabilistic State Formulas
Specify that a certain predicate or path formula holds with probability no less than some bound
φ = P>0.5[(y>0) U (x=2)]
[Diagram: the example chain; φ is True in states from which the path formula holds with probability greater than 0.5, False in the others]
Intruder Model Redux
module crowds
. . .
// Record the apparent sender and deliver
[] (badObserve & appSender=0) ->
(observe0’=observe0+1) & (deliver’=true);
. . .
// Record the apparent sender and deliver
[] (badObserve & appSender=15) ->
(observe15’=observe15+1) & (deliver’=true);
. . .
endmodule
Every time a hostile crowd member receives a message
from some honest member, he records his observation
(increases the count for that honest member)
Negation of Probable Innocence
launch -> [true U (observe0 > observe1) & done] > 0.5
…
launch -> [true U (observe0 > observe9) & done] > 0.5
“The probability of reaching a state in which hostile crowd
members completed their observations and observed the
true sender (crowd member #0) more often than any of
the other crowd members (#1 … #9) is greater than 0.5”
Analyzing Multiple Paths with PRISM
Use PRISM to automatically compute interesting
probabilities for chosen finite configurations
“Positive”: P(K0 > 1)
• Observing the true sender more than once
“False positive”: P(Ki≠0 > 1)
• Observing a wrong crowd member more than once
“Confidence”: P(Ki≠0 ≤ 1 | K0 > 1)
• Observing only the true sender more than once
Ki = how many times crowd member i was recorded as apparent sender
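PRISM computes these probabilities exactly; a Monte Carlo sketch of the intruder's observations gives the intuition. All parameter values and names below are assumed for illustration (10 honest routers, 2 corrupt).

```python
import random

def observed_sender(n_honest, n_bad, p_f, sender=0):
    """One Crowds path: return the honest member a corrupt router records
    as the apparent sender, or None if no corrupt router is on the path.
    Routers 0..n_honest-1 are honest; the rest are corrupt."""
    n = n_honest + n_bad
    prev = sender
    nxt = random.randrange(n)
    while True:
        if nxt >= n_honest:
            return prev              # corrupt router records its predecessor
        prev = nxt
        if random.random() >= p_f:
            return None              # honest router delivers to the recipient
        nxt = random.randrange(n)

random.seed(7)
tally = [0] * 10
for _ in range(50000):
    o = observed_sender(10, 2, 0.8)
    if o is not None:
        tally[o] += 1
# The true sender (member 0) is recorded far more often than anyone else:
# it is the recorded predecessor whenever the very first hop is corrupt
assert tally[0] == max(tally) and tally[0] > 2 * tally[1]
```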
Size of State Space
[3-D bar chart: number of states (up to ~15,000,000) as a function of the number of honest routers (5, 10, 15, 20) and the number of paths (3, 4, 5, 6)]
All hostile routers are treated as a single router, selected with probability 1/6
Sender Detection (Multiple Paths)
[Chart: probability of sender detection (0%–50%) vs. number of paths (3–6) and crowd size (6, 12, 18, 24 routers); 1/6 of routers are hostile]
All configurations satisfy probable innocence
Probability of observing the true sender increases with the number of paths observed
… but decreases with the increase in crowd size
Is this an attack?
Reiter & Rubin: absolutely not
But…
• Can’t avoid building new paths
• Hard to prevent attacker from
correlating same-sender paths
Attacker’s Confidence
[Chart: attacker’s confidence (80%–100%) vs. number of paths (3–6) and crowd size (6, 12, 18, 24 routers); 1/6 of routers are hostile]
“Confidence” = probability of detecting only the true sender
Confidence grows with crowd size
Maybe this is not so strange
• True sender appears in every path, others only with small probability
• Once attacker sees somebody twice, he knows it’s the true sender
Is this an attack?
Large crowds: lower probability to catch
senders but higher confidence that the
caught user is the true sender
But what about deniability?
Probabilistic Contract Signing
Slides borrowed from Vitaly Shmatikov
Probabilistic Fair Exchange
Two parties exchange items of value
• Signed commitments (contract signing)
• Signed receipt for an email message (certified email)
• Digital cash for digital goods (e-commerce)
Important if parties don’t trust each other
• Need assurance that if one does not get what it
wants, the other doesn’t get what it wants either
Fairness is hard to achieve
• Gradual release of verifiable commitments
• Convertible, verifiable signature commitments
• Probabilistic notions of fairness
Properties of Fair Exchange Protocols
Fairness
• At each step, the parties have approximately equal
probabilities of obtaining what they want
Optimism
• If both parties are honest, then exchange succeeds
without involving a judge or trusted third party
Timeliness
• If something goes wrong, the honest party does not
have to wait for a long time to find out whether
exchange succeeded or not
Rabin’s Beacon
A “beacon” is a trusted party that publicly
broadcasts a randomly chosen number between
1 and N every day
• Michael Rabin. “Transaction protection by beacons”.
Journal of Computer and System Sciences, Dec 1983.
[Example: daily broadcasts Jan 27 – Feb 1, e.g., 28, 25, 15, 11, 2, 2]
Contract
CONTRACT(A, B, future date D, contract terms)
Exchange of commitments must be
concluded by this date
Rabin’s Contract Signing Protocol
sigA”I am committed if 1 is broadcast on day D”
sigB”I am committed if 1 is broadcast on day D”
CONTRACT(A, B, future date D, contract terms)
sigA”I am committed if i is broadcast on day D”
sigB”I am committed if i is broadcast on day D”
…
sigA”I am committed if N is broadcast on day D”
sigB”I am committed if N is broadcast on day D”
2N messages are exchanged if both parties are honest
Probabilistic Fairness
Suppose B stops after receiving A’s ith message
• B has sigA”committed if 1 is broadcast”,
sigA”committed if 2 is broadcast”,
…
sigA”committed if i is broadcast”
• A has sigB”committed if 1 is broadcast”, ...
sigB”committed if i-1 is broadcast”
… and beacon broadcasts number b on day D
• If b < i, then both A and B are committed
• If b > i, then neither A nor B is committed
• If b = i, then only A is committed
– This happens only with probability 1/N
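The case analysis above can be verified by simulation; this minimal sketch (names assumed) draws the beacon value and checks which side ends up committed.

```python
import random

def rabin_exchange(N, i):
    """B stops after receiving A's i-th message, so B holds A's commitments
    for values 1..i and A holds B's for 1..i-1; the beacon then broadcasts
    b in 1..N.  Returns (a_committed, b_committed)."""
    b = random.randrange(1, N + 1)
    a_committed = b <= i       # B can use sigA "committed if b is broadcast"
    b_committed = b <= i - 1   # A can use sigB "committed if b is broadcast"
    return a_committed, b_committed

random.seed(1)
N, i, trials = 100, 40, 100000
unfair = sum(rabin_exchange(N, i) == (True, False) for _ in range(trials))
# Only b == i leaves A committed but not B: probability 1/N
assert abs(unfair / trials - 1 / N) < 0.005
```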
Properties of Rabin’s Protocol
Fair
• The difference between A’s probability to obtain
B’s commitment and B’s probability to obtain A’s
commitment is at most 1/N
– But communication overhead is 2N messages
Not optimistic
• Need input from third party in every transaction
– Same input for all transactions on a given day sent out as
a one-way broadcast. Maybe this is not so bad!
Not timely
• If one of the parties stops communicating, the
other does not learn the outcome until day D
BGMR Probabilistic Contract Signing
[Ben-Or, Goldreich, Micali, Rivest ’85-90]
Doesn’t need beacon input in every transaction
Uses sigA”I am committed with probability pA” instead of
sigA”I am committed if i is broadcast on day D”
Each party decides how much to increase the probability at each step
• A receives sigB”I am committed with probability pB” from B
• Sets pA = min(1, α·pB), where α is a parameter chosen by A
• Sends sigA”I am committed with probability pA” to B
… the algorithm for B is symmetric
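The escalation rule generates a message flow like the one on the next slide; this sketch assumes, for illustration, that both parties use the same step factor (called `alpha` here) and that A opens with probability `p0`.

```python
def bgmr_messages(alpha=2.0, p0=0.1):
    """Sketch of the BGMR escalation with a common step factor alpha > 1.
    Each party replies with min(1, alpha * other's probability); the
    exchange ends once both have sent probability 1.0.  Returns the
    alternating (sender, probability) list."""
    other = {"A": "B", "B": "A"}
    msgs = [("A", p0)]
    while len(msgs) < 2 or msgs[-1][1] < 1.0 or msgs[-2][1] < 1.0:
        who, p = msgs[-1]
        msgs.append((other[who], min(1.0, alpha * p)))
    return msgs

msgs = bgmr_messages(2.0, 0.1)
# Probabilities double each round until both parties are fully committed
assert [p for _, p in msgs] == [0.1, 0.2, 0.4, 0.8, 1.0, 1.0]
```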
BGMR Message Flow
sigA”I am committed with probability 0.10 “
sigB”I am committed with probability 0.12 “
CONTRACT(A, B, future date D, contract terms)
sigA”I am committed with probability 0.20 “
sigB”I am committed with probability 0.23 “
…
sigA”I am committed with probability 1.00 “
sigB”I am committed with probability 1.00 “
Conflict Resolution
[Diagram: A and B exchange sigA”I am committed with probability pA1”, sigB”I am committed with probability pB1”, sigA”I am committed with probability pA2”, … ; one party stops responding and submits the other’s latest signed message to the judge]
• Judge waits until date D, then picks a random ρ ∈ [0,1]
• If ρ ≤ the probability value in the submitted evidence, contract is binding, else contract is canceled
• Same verdict (“Binding” or “Canceled”) for both parties
Judge
Waits until date D to decide
Announces verdict to both parties
Tosses coin once for each contract
Remembers previous coin tosses
• Constant memory: use pseudo-random functions with a secret input to produce repeatable coin tosses for each contract
Does not remember previous verdicts
• Same coin toss combined with different evidence (signed message with a different probability value) may result in a different verdict
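The constant-memory trick can be sketched with an HMAC as the pseudo-random function; the key, identifiers, and the strict-inequality convention below are illustrative assumptions, not the protocol's specification.

```python
import hashlib
import hmac

SECRET = b"judge-secret-key"  # the judge's private PRF key (illustrative)

def contract_coin(contract_id: bytes) -> float:
    """Repeatable 'coin toss' in [0, 1): an HMAC-based pseudo-random
    function of the contract, so the judge needs no per-contract memory."""
    digest = hmac.new(SECRET, contract_id, hashlib.sha256).digest()
    return int.from_bytes(digest[:6], "big") / 2**48  # 48 bits => value < 1.0

def verdict(contract_id: bytes, evidence_prob: float) -> str:
    # Binding iff the coin falls below the probability value in the evidence;
    # the same coin with higher-probability evidence may flip the verdict
    return "binding" if contract_coin(contract_id) < evidence_prob else "canceled"

rho = contract_coin(b"contract-42")
assert rho == contract_coin(b"contract-42")        # same toss every time
assert verdict(b"contract-42", 1.0) == "binding"   # probability 1.0 always binds
assert verdict(b"contract-42", 0.0) == "canceled"  # probability 0.0 never binds
```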
Privilege and Fairness
Privilege
• A party is privileged if it has the evidence to cause the judge to declare contract binding
• Intuition: the contract binds either both parties, or neither; what matters is the ability to make the contract binding
Fairness
• At any step where Prob(B is privileged) > ν, Prob(A is not privileged | B is privileged) < ε
• Intuition: at each step, the parties should have comparable probabilities of causing the judge to declare contract binding (privilege must be symmetric)
Properties of BGMR Protocol
Fair
• Privilege is almost symmetric at each step:
if Prob(B is privileged) > pA0, then Prob(A is not privileged | B is privileged) < 1 − 1/α
(where α is the factor by which the probability values increase at each step)
Optimistic
• Two honest parties don’t need to invoke a judge
Not timely
• Judge waits until day D to toss the coin
• What if the judge tosses the coin and announces the
verdict as soon as he is invoked?
Formal Model
Protocol should ensure fairness given any
possible behavior by a dishonest participant
• Contact judge although communication hasn’t stopped
• Contact judge more than once
• Delay messages from judge to honest participant
Need nondeterminism
• To model dishonest participant’s choice of actions
Need probability
• To model judge’s coin tosses
The model is a Markov decision process
Constructing the Model
Discretize probability space of coin tosses
• The coin takes any of N values with equal probability
Fix each party’s “probability step”
Defines state space
• Rate of increases in the probability value contained in
the party’s messages determines how many messages
are exchanged
A state is unfair if privilege is asymmetric
• Difference in evidence, not difference in commitments
Compute probability of reaching an unfair state for
different values of the parties’ probability steps
Use PRISM
Attack Strategy
Dishonest B’s probability of driving the protocol
to an unfair state is maximized by this strategy:
1. Contact judge as soon as first message from A arrives
2. Judge tries to send verdict to A (the verdict is probably negative, since A’s message contains a low probability value)
3. B delays judge’s verdicts sent to A
4. B contacts judge again with each new message from A until a positive verdict is obtained
This strategy only works in the timely protocol
• In the original protocol, coin is not tossed and verdict
is not announced until day D
Conflict between optimism and timeliness
Analysis Results
[Chart: probability of reaching a state where B is privileged and A is not, vs. the increase in B’s probability value at each step (lower increase means more messages must be exchanged)]
For a higher probability of winning, dishonest B must exchange more messages with honest A
Attacker’s Tradeoff
[Chart: expected number of messages before an unfair state is reached, vs. probability of reaching a state where B is privileged and A is not]
Linear tradeoff for dishonest B between probability of winning and ability to delay judge’s messages to A
Without complete control of the communication network, B may settle for a lower probability of winning
Summary
Probabilistic contract signing is a good testbed
for probabilistic model checking techniques
• Standard formal analysis techniques not applicable
• Combination of nondeterminism and probability
• Good for quantifying tradeoffs
Probabilistic contract signing is subtle
• Unfairness as asymmetric privilege
• Optimism cannot be combined with timeliness, at
least not in the obvious way