Constraint-based methods

Download Report

Transcript Constraint-based methods

Constraint-Based Methods:
Adding Algebraic Properties to
Symbolic Models
Vitaly Shmatikov
SRI International
One-Slide Summary

“Constraint solving” is a symbolic analysis
method for cryptographic protocols

Decidable without finite bounds on the attacker
 Big win over finite-state checking (FDR, Mur, etc.)


Only need to specify behavior of honest participants
Can be extended with algebraic theories for
XOR, modular multiplication, Diffie-Hellman


Push-button procedure for finding both Dolev-Yao and
algebraic attacks (e.g., Pereira-Quisquater)
Works only for a finite number of sessions
 “Attack template” must be expressed as a symbolic execution trace
Protocol Analysis Techniques
Protocol Analysis Techniques
Formal Models
Modal Logics
Finite-state
Decidable
(no probabilities)
Process Calculi
Computational Models
Inductive Proofs
…
Probabilistic poly-time
Random oracle
…
Infinite message space, finite sessions
Free attacker algebra
Attacker algebra with
equational theory
Protocol Analysis Meets Algebra


Dolev-Yao model uses “black-box” cryptography
Many crypto primitives are not black boxes



Attacker can and will exploit algebraic properties



XOR: ab = ba; aa = 0
Modular exponentiation: xy = yx; (xy)x-1 = y
Ryan-Schneider attack on Bull’s recursive authentication protocol
Pereira-Quisquater attack on A-GDH.2 protocol
Goal: fully automated analysis of protocols with
relevant algebraic theories

GDOI, group key management protocols, …
A-GDH.2 Protocol
[Ateniese, Steiner, Tsudik ’00]


Parties start with pairwise keys Kaz,Kbz,Kcz
The goal is to establish common session key rarbrcrz
A
,ra
B
ra,rb,rarb
p is prime
q is prime divisor of p-1
 is generator of cyclic subgroup of Z*p of order q
C
rbrc,rarc,rarb,rarbrc
rbrcrzKaz, rarcrzKbz, rarbrzKcz
Computes session key rarbrcrz
-1
as (rarcrzKbz)Kbz
rb
Z
Is This Protocol Secure?
Suppose two sessions are run concurrently, and malicious C wants to
learn the session key of the session from which he is excluded
A
B
,ra
rbrcrzKaz,
rarcrzKbz,
rarbrzKcz
ra,rb,
rarb
Z
C
A
B
qa,
qb,
qaqb
,qa
rbrc,
rarc,
rarb,
rarbrc
C
qbqzKaz,
qaqzKbz
Z
Can the attacker who controls the network and participates in
the 1st session learn the session key of the 2nd session?
Model Checking Approach

Two sources of infinite behavior



Multiple protocol sessions, multiple participant roles
Message space or data space may be infinite
Finite approximation

Assume finite number of participants
 Example: 2 clients, 2 servers

Assume finite message space
This restriction is necessary
(or the problem is undecidable)
 Represent random numbers by r1, r2, r3, …
 Do not allow encrypt(encrypt(encrypt(…)))
This is restriction is not necessary
for fully automated analysis!
Infinite-State Protocol Model
[Amadio and Lugiez ‘00]

Finite number of processes


[Millen and Shmatikov ‘01]
Variables represent data under attacker’s control
Attacker capabilities modeled by a term algebra



[Boreale ‘01]
Messages modeled as terms with variables


Each process models a protocol role
[Rusinowitch and Turuani ‘01]
No artificial bounds on attacker computations
Generates an infinite space of possible attacker messages
Protocol analysis problem reduces to a
decidable symbolic constraint solving problem

Easy-to-use, practical software for protocol analysis
Roles in A-GDH.2 Protocol
A
,ra
B
ra,rb,rarb
C
rbrc,rarc,rarb,rarbrc
B role
,X1
B
B  X1,rb,X1rb
B  Y1,Y2Kbz,Y3


rbrcrzKaz, rarcrzKbz, rarbrzKcz
Z
Z role
Z  Z1,Z2,Z3,Z4
Z  Z1rzKaz,Z2rzKbz,Z3rzKcz
Variables represent terms unknown to the party who plays the role
Attacker can instantiate a variable with any value, but instantiation
must be consistent in all terms where it occurs
Symbolic Execution Trace
Suppose two sessions are run concurrently, and malicious C wants to
learn the session key of the session from which he is excluded
A
B
,ra
rbrcrzKaz,
rarcrzKbz,
rarbrzKcz
ra,rb,
rarb
Z
C
A
B
,qa
rbrc,
rarc,
rarb,
rarbrc
B and Z from 1st session
Partial execution trace
(there are finitely many)
qa,
qb,
qaqb
C
B  ,X1
qbqzKaz,
X
r
X
r
1
b
1
b
B   , ,
qaqzKbz
Z  Z1,Z2,Z3,Z4
Z
Z
r
K
Z
r
K
Z
r
K
Z   1 z az, 2 z bz, 3 z cz
B from 2nd session
B  ,V1
B  V1,qb,V1qb
B  W1,W2Kbz,W3
Is There A Feasible Attack?
B  ,X1
B  X1,rb,X1rb
Z  Z1,Z2,Z3,Z4
Z  Z1rzKaz,Z2rzKbz,Z3rzKcz
B  ,V1
B  V1,qb,V1qb
B will use this value as session key.
If attacker can learn (and announce) it,
B  W1,W2Kbz,W3


W2qb
the protocol is broken.
This attack is feasible if and only if
the attacker can consistently instantiate all variables in the trace so
that he can produce every message received by B and Z
Symbolic Attack Traces

Attack is modeled as a symbolic execution trace




Adequate for trace-based security properties


A trace is a sequence of message send and receive events
Attack trace ends in a violation (e.g., attacker learns the secret)
Messages contain variables, modeling data controlled by attacker
Secrecy, authentication, some forms of fairness…
A symbolic trace may or may not have a feasible
concrete instantiation

Finding whether such an instantiation exists is the main goal of
symbolic (infinite-state) protocol analysis
From Attack Traces to Constraints

For each message sent by the attacker in the
attack trace, create a symbolic constraint
mi

from t1,
…, tn

mi is the message attacker needs to send

t1,…,tn are the messages observed by attacker up to this point
Attack is feasible if and only if
all constraints are satisfiable simultaneously

There exists an instantiation  such that i mi can be derived
from t1, …, tn in attacker’s term algebra
Constraint Generation for A-GDH.2
B  ,X1
B  X1,rb,X1rb
Z  Z1,Z2,Z3,Z4
Z  Z1rzKaz,Z2rzKbz,Z3rzKcz
B  ,V1
B  V1,qb,V1qb
B  W1,W2Kbz,W3
 W2qb
,X
1
from ,kcz
(attacker’s initial knowledge)
Z1,Z2,
X1,rb,X1rb
from
,k
,
cz
Z
Z
 3, 4
,V1
W1,W2Kbz,
W3
from ,kcz,X1,rb,X1rb,
Z1rzKaz,Z2rzKbz,Z3rzKcz
from ,kcz,X1,rb,X1rb,
Z1rzKaz,Z2rzKbz,Z3rzKcz,
V1,qb,V1qb
W2qb from ,kcz,X1,rb,X1rb,
Z1rzKaz,Z2rzKbz,Z3rzKcz,
V1,qb,V1qb
Dolev-Yao Term Algebra
Attacker’s term algebra is a set of derivation rules
vT
if u=v for some 
Tu
T[u,v]
Tu
T[u,v]
Tv
Tu Tv
T[u,v]
Tu
Tv
Tcryptu[v]
Tcryptu[v] Tu
Tv
Symbolic constraint m from t1, …, tn is satisfiable
if and only if there is a substitution  such that
t1, …, tn  m is derivable using these rules
Properties of Term Algebra

No restriction on structural size of terms



Untyped




The closure of any term set under derivation rules is infinite
There is no a priori bound on attacker computations
Attacker doesn’t have to comply with the protocol specification
Attacker may substitute a ciphertext for a random number, a
key for an output of a hash function, etc.
Symmetric encryption with non-atomic keys
Can add an equational theory to model
algebraic properties of cryptographic functions

XOR, modular exponentiation, blinded signatures, …
Solving Symbolic Constraints
[Millen and Shmatikov CCS ’01]


Constraint reduction rules

Replace each mi from Ti with one or more simpler constraints

Preserve essential properties of the constraint sequence
Nondeterministic reduction procedure



Structure-driven, but several rules may apply in any state
Exponential in the worst case (the problem is NP-complete)
The procedure is terminating and complete

If T  m is derivable in attacker’s term algebra,
1. There exists reduction rule r=r() which is applicable to m from T and
produces some m’ from T’ such that
2. T’  m’ is derivable in attacker’s term algebra
Reduction Rules
[m1,m2] from T
m1 from T
m2 from T
(pair)
m from T, v
(elim)
m from T
m from [u,v], T
(split)
m from u, v, T
cryptk[m] from T
(enc)
m from T
k from T
m from t, T
(un)
___
add mgu(t,m) to 
m from cryptu[v], T
(dec)
u from cryptuv, T
m from cryptu[v], v, T
Reduction Procedure
Initial
constraint sequence
apply every possible reduction rule
to first m from T where m is not a variable
•••
•••
No rule is
applicable
or
v1 from T1
• • •
vN from TN
If reduction tree has at least
one such sequence as a leaf,
there is a solution, and
attack trace is feasible
Symbolic Analysis Summary
specified by
the analyst
fully
automated
Formal specification of protocol roles
attacker is implicit!
variables model attacker’s input
Attack (violating execution trace)
may not have a feasible instantiation
Sequence of symbolic constraints
satisfiable if and only if there exists
a feasible instantiation of attack trace
Decidable constraint solving procedure
Let’s Add Algebraic Properties
Verification of trace-based security properties
… is decidable for protocols with XOR
 Comon-Lundh and Shmatikov (LICS ’03)
 Chevalier, Kϋsters, Rusinowitch, Turuani (LICS ’03)
… reduces to a system of quadratic Diophantine equations
for protocols with Abelian groups
 Millen and Shmatikov (CSFW ’03)
… is decidable for a restricted class of protocols with
modular exponentiation
 Chevalier, Kϋsters, Rusinowitch, Turuani (FST/TCS ’03)
… is decidable for any well-defined protocol with products
and modular exponentiation
 Shmatikov (ESOP ’04)
Attacker Term Algebra
Dolev-Yao
vT
Tu
T[u,v]
Tu
Tu Tv
T[u,v]
Attacker can’t
take discrete logs
or solve DiffieHellman problem
T[u,v]
Tv
Tu
Tv
Tcryptu[v]
Tuv
Tv
Tuv Tuw
Tuvw
Tcryptu[v] Tu
Tv
Tu Tv
Tuv
Tu Tu Tv
Tuv
Tu-1
Associative: (x  y)  z = x  (y  z)
Commutative: x  y = y  x
Normalization xx-1  1
x1  x
rules: (x-1)-1  x (xy)-1  y-1x-1
x1  x
(xy)z  xyz
Key Insights For Decidability

In a well-defined protocol, honest participants
don’t need to guess values of attacker inputs

Leads to a syntactic condition on usage of variables

If attacker can derive u from T, then there is a
derivation which uses only subterms of T and u

If constraints are satisfiable, then there is an
attack in which every variable is instantiated by a
product of subterms drawn from a finite set
Origination Stability

Variable origination condition



If C is a constraint sequence generated from an execution trace,
then there exists a linear ordering < on Vars(C) such that
if x appears for the first time in mi from Ti  C,
then x  Vars(mi) and  y  Vars(Ti) y < x
This condition must be satisfied by C after any partial
substitution 
Rules out only ill-defined protocols
AB
BA
XY
X
Requires B to split a product
of two unknown values
Normal Derivations
t1T
t2T
tnT
Tt1
Tt2
Ttn
…
…
Tv
Tv1
ANALYSIS stage:
All intermediate
terms are products
of subterms of T
…
Tvk
…
Tu
SYNTHESIS stage:
Only pairing, encryption,
multiplication, inverse &
exponentiation used
Lemma: if Tu is derivable, then there is a normal derivation
Conservative Solutions



Conservative solution only uses subterms from
the original, uninstantiated constraint sequence

x Subterms(x)  Subterms(C) closed under , inverse and
exponentiation

All subterms used in the conservative solution are drawn from
a finite set which is known before any variables are instantiated
Lemma: if C has a solution,
then C has a conservative solution
This lemma allows to derive a bound on the
size of the attack
Symbolic Decision Procedure
{ u1 from T1 , …, un from Tn }


1.
3.
4.
Monotonic: T1  …  Tn
Satisfy the variable stability condition
Guess all equalities between subterms

2.
symbolic constraints
generated from protocol
Finite number of possible unifiers modulo AG
Guess the order in which subterms are derived
Replace exponentiation by  and inverse
Reduce to a decidable system of Solvable iff a linear
quadratic Diophantine equations subsystem is solvable
Back to A-GDH.2
X1
Z1
Z2
Z3
Z4
V1
W2Kbz
W2qb
from ,kcz
from -”-,X1,rb,X1rb
from -”from -”from -”from -”-,
Z1rzKaz,Z2rzKbz,
Z3rzKcz
from -”-,
V1,qb,V1qb
from -”-
X1
rb-1Z1
rb-1Z2
rb-1Z3
rb-1Z4
from
from
from
from
from
kcz
kcz
kcz
kcz
kcz
Only  and inverse
used in derivation.
Reduces to system
of Diophantine
equations.
Z3-1rz-1kcz-1V1 from kcz
Z2-1rz-1W2 from kcz
V1-1W2 from kcz
Key insight: under the Diffie-Hellman assumption, attacker can produce
-1
x from y if and only if he can produce y-1x (x=(y)y x)
Decidable Quadratic Equations
u1X11…X1k1 from t11, …, tm1
u2X21…X2k2 from t21, …, tm2
…
unXn1…Xnkn from tn1, …, tmn

Only  and inverse used
in each derivation
Convert each constraint into a Diophantine equation

uiXi1…Xik from ti1, …, tim becomes uiXi1…Xik = ti1z1 …timzm for integer zj

If some tij is a variable, equation becomes quadratic, for example
a2X = (ab)z1
a6 = (ab)z2(bX)z3

Equations associated with execution traces have special structure
If a variable occurs on the right, it must previously occur on the left
 All terms used to construct the variable where it first occurred are
available in every subsequent constraint

Intuition Behind Decidability
a2X = (ab)z1
a6 = (ab)z2(bX)z3
substitute X
a6 = (ab)z2(ba-2(ab)z1)z3
group (ab) terms together
a6 = (ab)z’(ba-2)z3
z’ = z2 + z1z3
Quadratic part always has a solution
because z2 is unconstrained
Is There A Feasible Attack? Yes!
B  ,X1
B  X1,rb,X1rb
Z  Z1,Z2,Z3,Z4
Z  Z1rzKaz,Z2rzKbz,Z3rzKcz
B  ,V1
B  V1,qb,V1qb
Attacker can learn this value by
B  W1,W2Kbz,W3

W2qb
clever variable instantiation
Attack on A-GDH.2
Suppose two sessions are run concurrently, and malicious C wants to
learn the session key of the session from which he is excluded
A
B
,ra
rbrcrzKaz,
rarcrzKbz,
rarbrzKcz
1. Replace with 1
A
ra,rb,
rarb
Z
B
,qb
rbrc,
rarc,
rarb,
rarbrc
qbqzKaz,
qaqzKbz
3. Replace with rbrzkcz
qb,
qb,
qaqb
Attacks of this type
can be found
automatically from
protocol specification
Z
4. Replace with rbrzkbz
2. Replace with rb,rb,rb,rb
Attack: B will use r r q as session key, which attacker
can compute as (r r k q )k
b z b
b z cz b
-1
cz
Decision Procedures

Free (“black-box”) algebra: decidable


XOR: decidable


All integer variables are equal to 0 or 1
(Group) Diffie-Hellman: decidable



Implemented as an easy-to-use analysis tool
System of quadratic Diophantine equations, which is
solvable if and only if a linear subsystem is solvable
Some restrictions (no products in exponentiation base)
Blind signatures, super-exponentiation, ...

Axiomatic models of various cryptographic primitives



Current
research