Constraint-based methods
Download
Report
Transcript Constraint-based methods
Constraint-Based Methods:
Adding Algebraic Properties to
Symbolic Models
Vitaly Shmatikov
SRI International
One-Slide Summary
“Constraint solving” is a symbolic analysis
method for cryptographic protocols
Decidable without finite bounds on the attacker
Big win over finite-state checking (FDR, Mur, etc.)
Only need to specify behavior of honest participants
Can be extended with algebraic theories for
XOR, modular multiplication, Diffie-Hellman
Push-button procedure for finding both Dolev-Yao and
algebraic attacks (e.g., Pereira-Quisquater)
Works only for a finite number of sessions
“Attack template” must be expressed as a symbolic execution trace
Protocol Analysis Techniques
Protocol Analysis Techniques
Formal Models
Modal Logics
Finite-state
Decidable
(no probabilities)
Process Calculi
Computational Models
Inductive Proofs
…
Probabilistic poly-time
Random oracle
…
Infinite message space, finite sessions
Free attacker algebra
Attacker algebra with
equational theory
Protocol Analysis Meets Algebra
Dolev-Yao model uses “black-box” cryptography
Many crypto primitives are not black boxes
Attacker can and will exploit algebraic properties
XOR: ab = ba; aa = 0
Modular exponentiation: xy = yx; (xy)x-1 = y
Ryan-Schneider attack on Bull’s recursive authentication protocol
Pereira-Quisquater attack on A-GDH.2 protocol
Goal: fully automated analysis of protocols with
relevant algebraic theories
GDOI, group key management protocols, …
A-GDH.2 Protocol
[Ateniese, Steiner, Tsudik ’00]
Parties start with pairwise keys Kaz,Kbz,Kcz
The goal is to establish common session key rarbrcrz
A
,ra
B
ra,rb,rarb
p is prime
q is prime divisor of p-1
is generator of cyclic subgroup of Z*p of order q
C
rbrc,rarc,rarb,rarbrc
rbrcrzKaz, rarcrzKbz, rarbrzKcz
Computes session key rarbrcrz
-1
as (rarcrzKbz)Kbz
rb
Z
Is This Protocol Secure?
Suppose two sessions are run concurrently, and malicious C wants to
learn the session key of the session from which he is excluded
A
B
,ra
rbrcrzKaz,
rarcrzKbz,
rarbrzKcz
ra,rb,
rarb
Z
C
A
B
qa,
qb,
qaqb
,qa
rbrc,
rarc,
rarb,
rarbrc
C
qbqzKaz,
qaqzKbz
Z
Can the attacker who controls the network and participates in
the 1st session learn the session key of the 2nd session?
Model Checking Approach
Two sources of infinite behavior
Multiple protocol sessions, multiple participant roles
Message space or data space may be infinite
Finite approximation
Assume finite number of participants
Example: 2 clients, 2 servers
Assume finite message space
This restriction is necessary
(or the problem is undecidable)
Represent random numbers by r1, r2, r3, …
Do not allow encrypt(encrypt(encrypt(…)))
This is restriction is not necessary
for fully automated analysis!
Infinite-State Protocol Model
[Amadio and Lugiez ‘00]
Finite number of processes
[Millen and Shmatikov ‘01]
Variables represent data under attacker’s control
Attacker capabilities modeled by a term algebra
[Boreale ‘01]
Messages modeled as terms with variables
Each process models a protocol role
[Rusinowitch and Turuani ‘01]
No artificial bounds on attacker computations
Generates an infinite space of possible attacker messages
Protocol analysis problem reduces to a
decidable symbolic constraint solving problem
Easy-to-use, practical software for protocol analysis
Roles in A-GDH.2 Protocol
A
,ra
B
ra,rb,rarb
C
rbrc,rarc,rarb,rarbrc
B role
,X1
B
B X1,rb,X1rb
B Y1,Y2Kbz,Y3
rbrcrzKaz, rarcrzKbz, rarbrzKcz
Z
Z role
Z Z1,Z2,Z3,Z4
Z Z1rzKaz,Z2rzKbz,Z3rzKcz
Variables represent terms unknown to the party who plays the role
Attacker can instantiate a variable with any value, but instantiation
must be consistent in all terms where it occurs
Symbolic Execution Trace
Suppose two sessions are run concurrently, and malicious C wants to
learn the session key of the session from which he is excluded
A
B
,ra
rbrcrzKaz,
rarcrzKbz,
rarbrzKcz
ra,rb,
rarb
Z
C
A
B
,qa
rbrc,
rarc,
rarb,
rarbrc
B and Z from 1st session
Partial execution trace
(there are finitely many)
qa,
qb,
qaqb
C
B ,X1
qbqzKaz,
X
r
X
r
1
b
1
b
B , ,
qaqzKbz
Z Z1,Z2,Z3,Z4
Z
Z
r
K
Z
r
K
Z
r
K
Z 1 z az, 2 z bz, 3 z cz
B from 2nd session
B ,V1
B V1,qb,V1qb
B W1,W2Kbz,W3
Is There A Feasible Attack?
B ,X1
B X1,rb,X1rb
Z Z1,Z2,Z3,Z4
Z Z1rzKaz,Z2rzKbz,Z3rzKcz
B ,V1
B V1,qb,V1qb
B will use this value as session key.
If attacker can learn (and announce) it,
B W1,W2Kbz,W3
W2qb
the protocol is broken.
This attack is feasible if and only if
the attacker can consistently instantiate all variables in the trace so
that he can produce every message received by B and Z
Symbolic Attack Traces
Attack is modeled as a symbolic execution trace
Adequate for trace-based security properties
A trace is a sequence of message send and receive events
Attack trace ends in a violation (e.g., attacker learns the secret)
Messages contain variables, modeling data controlled by attacker
Secrecy, authentication, some forms of fairness…
A symbolic trace may or may not have a feasible
concrete instantiation
Finding whether such an instantiation exists is the main goal of
symbolic (infinite-state) protocol analysis
From Attack Traces to Constraints
For each message sent by the attacker in the
attack trace, create a symbolic constraint
mi
from t1,
…, tn
mi is the message attacker needs to send
t1,…,tn are the messages observed by attacker up to this point
Attack is feasible if and only if
all constraints are satisfiable simultaneously
There exists an instantiation such that i mi can be derived
from t1, …, tn in attacker’s term algebra
Constraint Generation for A-GDH.2
B ,X1
B X1,rb,X1rb
Z Z1,Z2,Z3,Z4
Z Z1rzKaz,Z2rzKbz,Z3rzKcz
B ,V1
B V1,qb,V1qb
B W1,W2Kbz,W3
W2qb
,X
1
from ,kcz
(attacker’s initial knowledge)
Z1,Z2,
X1,rb,X1rb
from
,k
,
cz
Z
Z
3, 4
,V1
W1,W2Kbz,
W3
from ,kcz,X1,rb,X1rb,
Z1rzKaz,Z2rzKbz,Z3rzKcz
from ,kcz,X1,rb,X1rb,
Z1rzKaz,Z2rzKbz,Z3rzKcz,
V1,qb,V1qb
W2qb from ,kcz,X1,rb,X1rb,
Z1rzKaz,Z2rzKbz,Z3rzKcz,
V1,qb,V1qb
Dolev-Yao Term Algebra
Attacker’s term algebra is a set of derivation rules
vT
if u=v for some
Tu
T[u,v]
Tu
T[u,v]
Tv
Tu Tv
T[u,v]
Tu
Tv
Tcryptu[v]
Tcryptu[v] Tu
Tv
Symbolic constraint m from t1, …, tn is satisfiable
if and only if there is a substitution such that
t1, …, tn m is derivable using these rules
Properties of Term Algebra
No restriction on structural size of terms
Untyped
The closure of any term set under derivation rules is infinite
There is no a priori bound on attacker computations
Attacker doesn’t have to comply with the protocol specification
Attacker may substitute a ciphertext for a random number, a
key for an output of a hash function, etc.
Symmetric encryption with non-atomic keys
Can add an equational theory to model
algebraic properties of cryptographic functions
XOR, modular exponentiation, blinded signatures, …
Solving Symbolic Constraints
[Millen and Shmatikov CCS ’01]
Constraint reduction rules
Replace each mi from Ti with one or more simpler constraints
Preserve essential properties of the constraint sequence
Nondeterministic reduction procedure
Structure-driven, but several rules may apply in any state
Exponential in the worst case (the problem is NP-complete)
The procedure is terminating and complete
If T m is derivable in attacker’s term algebra,
1. There exists reduction rule r=r() which is applicable to m from T and
produces some m’ from T’ such that
2. T’ m’ is derivable in attacker’s term algebra
Reduction Rules
[m1,m2] from T
m1 from T
m2 from T
(pair)
m from T, v
(elim)
m from T
m from [u,v], T
(split)
m from u, v, T
cryptk[m] from T
(enc)
m from T
k from T
m from t, T
(un)
___
add mgu(t,m) to
m from cryptu[v], T
(dec)
u from cryptuv, T
m from cryptu[v], v, T
Reduction Procedure
Initial
constraint sequence
apply every possible reduction rule
to first m from T where m is not a variable
•••
•••
No rule is
applicable
or
v1 from T1
• • •
vN from TN
If reduction tree has at least
one such sequence as a leaf,
there is a solution, and
attack trace is feasible
Symbolic Analysis Summary
specified by
the analyst
fully
automated
Formal specification of protocol roles
attacker is implicit!
variables model attacker’s input
Attack (violating execution trace)
may not have a feasible instantiation
Sequence of symbolic constraints
satisfiable if and only if there exists
a feasible instantiation of attack trace
Decidable constraint solving procedure
Let’s Add Algebraic Properties
Verification of trace-based security properties
… is decidable for protocols with XOR
Comon-Lundh and Shmatikov (LICS ’03)
Chevalier, Kϋsters, Rusinowitch, Turuani (LICS ’03)
… reduces to a system of quadratic Diophantine equations
for protocols with Abelian groups
Millen and Shmatikov (CSFW ’03)
… is decidable for a restricted class of protocols with
modular exponentiation
Chevalier, Kϋsters, Rusinowitch, Turuani (FST/TCS ’03)
… is decidable for any well-defined protocol with products
and modular exponentiation
Shmatikov (ESOP ’04)
Attacker Term Algebra
Dolev-Yao
vT
Tu
T[u,v]
Tu
Tu Tv
T[u,v]
Attacker can’t
take discrete logs
or solve DiffieHellman problem
T[u,v]
Tv
Tu
Tv
Tcryptu[v]
Tuv
Tv
Tuv Tuw
Tuvw
Tcryptu[v] Tu
Tv
Tu Tv
Tuv
Tu Tu Tv
Tuv
Tu-1
Associative: (x y) z = x (y z)
Commutative: x y = y x
Normalization xx-1 1
x1 x
rules: (x-1)-1 x (xy)-1 y-1x-1
x1 x
(xy)z xyz
Key Insights For Decidability
In a well-defined protocol, honest participants
don’t need to guess values of attacker inputs
Leads to a syntactic condition on usage of variables
If attacker can derive u from T, then there is a
derivation which uses only subterms of T and u
If constraints are satisfiable, then there is an
attack in which every variable is instantiated by a
product of subterms drawn from a finite set
Origination Stability
Variable origination condition
If C is a constraint sequence generated from an execution trace,
then there exists a linear ordering < on Vars(C) such that
if x appears for the first time in mi from Ti C,
then x Vars(mi) and y Vars(Ti) y < x
This condition must be satisfied by C after any partial
substitution
Rules out only ill-defined protocols
AB
BA
XY
X
Requires B to split a product
of two unknown values
Normal Derivations
t1T
t2T
tnT
Tt1
Tt2
Ttn
…
…
Tv
Tv1
ANALYSIS stage:
All intermediate
terms are products
of subterms of T
…
Tvk
…
Tu
SYNTHESIS stage:
Only pairing, encryption,
multiplication, inverse &
exponentiation used
Lemma: if Tu is derivable, then there is a normal derivation
Conservative Solutions
Conservative solution only uses subterms from
the original, uninstantiated constraint sequence
x Subterms(x) Subterms(C) closed under , inverse and
exponentiation
All subterms used in the conservative solution are drawn from
a finite set which is known before any variables are instantiated
Lemma: if C has a solution,
then C has a conservative solution
This lemma allows to derive a bound on the
size of the attack
Symbolic Decision Procedure
{ u1 from T1 , …, un from Tn }
1.
3.
4.
Monotonic: T1 … Tn
Satisfy the variable stability condition
Guess all equalities between subterms
2.
symbolic constraints
generated from protocol
Finite number of possible unifiers modulo AG
Guess the order in which subterms are derived
Replace exponentiation by and inverse
Reduce to a decidable system of Solvable iff a linear
quadratic Diophantine equations subsystem is solvable
Back to A-GDH.2
X1
Z1
Z2
Z3
Z4
V1
W2Kbz
W2qb
from ,kcz
from -”-,X1,rb,X1rb
from -”from -”from -”from -”-,
Z1rzKaz,Z2rzKbz,
Z3rzKcz
from -”-,
V1,qb,V1qb
from -”-
X1
rb-1Z1
rb-1Z2
rb-1Z3
rb-1Z4
from
from
from
from
from
kcz
kcz
kcz
kcz
kcz
Only and inverse
used in derivation.
Reduces to system
of Diophantine
equations.
Z3-1rz-1kcz-1V1 from kcz
Z2-1rz-1W2 from kcz
V1-1W2 from kcz
Key insight: under the Diffie-Hellman assumption, attacker can produce
-1
x from y if and only if he can produce y-1x (x=(y)y x)
Decidable Quadratic Equations
u1X11…X1k1 from t11, …, tm1
u2X21…X2k2 from t21, …, tm2
…
unXn1…Xnkn from tn1, …, tmn
Only and inverse used
in each derivation
Convert each constraint into a Diophantine equation
uiXi1…Xik from ti1, …, tim becomes uiXi1…Xik = ti1z1 …timzm for integer zj
If some tij is a variable, equation becomes quadratic, for example
a2X = (ab)z1
a6 = (ab)z2(bX)z3
Equations associated with execution traces have special structure
If a variable occurs on the right, it must previously occur on the left
All terms used to construct the variable where it first occurred are
available in every subsequent constraint
Intuition Behind Decidability
a2X = (ab)z1
a6 = (ab)z2(bX)z3
substitute X
a6 = (ab)z2(ba-2(ab)z1)z3
group (ab) terms together
a6 = (ab)z’(ba-2)z3
z’ = z2 + z1z3
Quadratic part always has a solution
because z2 is unconstrained
Is There A Feasible Attack? Yes!
B ,X1
B X1,rb,X1rb
Z Z1,Z2,Z3,Z4
Z Z1rzKaz,Z2rzKbz,Z3rzKcz
B ,V1
B V1,qb,V1qb
Attacker can learn this value by
B W1,W2Kbz,W3
W2qb
clever variable instantiation
Attack on A-GDH.2
Suppose two sessions are run concurrently, and malicious C wants to
learn the session key of the session from which he is excluded
A
B
,ra
rbrcrzKaz,
rarcrzKbz,
rarbrzKcz
1. Replace with 1
A
ra,rb,
rarb
Z
B
,qb
rbrc,
rarc,
rarb,
rarbrc
qbqzKaz,
qaqzKbz
3. Replace with rbrzkcz
qb,
qb,
qaqb
Attacks of this type
can be found
automatically from
protocol specification
Z
4. Replace with rbrzkbz
2. Replace with rb,rb,rb,rb
Attack: B will use r r q as session key, which attacker
can compute as (r r k q )k
b z b
b z cz b
-1
cz
Decision Procedures
Free (“black-box”) algebra: decidable
XOR: decidable
All integer variables are equal to 0 or 1
(Group) Diffie-Hellman: decidable
Implemented as an easy-to-use analysis tool
System of quadratic Diophantine equations, which is
solvable if and only if a linear subsystem is solvable
Some restrictions (no products in exponentiation base)
Blind signatures, super-exponentiation, ...
Axiomatic models of various cryptographic primitives
Current
research