HashFunctionsFormal
Download
Report
Transcript HashFunctionsFormal
4.1 Hash Functions and Data
Integrity
A cryptographic hash function can provide assurance
of data integrity.
ex:
(x, y)
Alice
Bob
Bob can verify if y = hK(x)
h is a hash function
x is a message
y is the authentication tag (message digest)
K is key
1
Hash Functions and Data
Integrity
Definition 4.1: A hash family is a four-tuple (X, Y, K,H),
where the following condition are satisfied:
1 X is a set of possible messages
2 Y is a finite set of possible message digests or
authentication tags
3 K, the keyspace, is a finite set of possible keys
4 For each K K, there is a hash function hK H.
Each hk: X Y
2
Hash Functions and Data
Integrity
h is compress functions
X is a finite set
Y is a finite set
|X | |Y | or stronger, |X | 2|Y |
A pair (x ,y) XY is said to be valid under the key K
hK(x) = y.
Let FX,Y denote the set of all function from X to Y.
|X| = N and |Y| = M.
X,Y| = MN.
|F
X,Y is termed an (N,M)-hash family.
F F
An unkeyed hash function is a function
h: X Y
3
4.2 Security of Hash Functions
If a hash function is to be considered secure, these
three problems are difficult to solve
Problem 4.1: Preimage
Instance: A hash function h: X Y and an
element y Y.
Find: x X such that f(x) = y
Problem 4.2: Second Preimage
Instance: A hash function h: X Y and an
element x X
Find: x’ X such that x’ ≠ x and h(x’) = h(x)
Problem 4.3: Collision
Instance: A hash function h: X Y .
Find: x, x’ X such that x’ ≠ x and h(x’) = h(x)
4
Security of Hash Functions
A hash function for which Preimage cannot be
efficiently solved is often said to be one-way or
preimage resistant.
A hash function for which Second Preimage cannot be
efficiently solved is often said to be second preimage
resistant.
A hash function for which Collision cannot be
efficiently solved is often said to be collision resistant.
5
Security of Hash Functions
4.2.1 The Random Oracle Model
The random oracle model provides a mathematical
model of an “ideal” hash function.
In this model, a hash function h: X Y is chosen
randomly from FX,Y
The only way to compute a value h(x) is to query the
oracle.
THEOREM 4.1 Suppose that h FX,Y is chosen
randomly, and let X0 X. Suppose that the values
h(x) have been determined (by querying an oracle
for h) if and only if x X0. Then Pr[h(x)=y] = 1/M
for all x X \ X0 and all y Y.
6
Security of Hash Functions
4.2.2 Algorithms in the Random Oracle Model
Randomized algorithms make random choices
during their execution.
A Las Vegas algorithm is a randomized algorithm
may fail to give an answer
if the algorithm does return an answer, then
the answer must be correct.
A randomized algorithm has average-case success
probability ε if the probability that the algorithm
returns a correct answer, averaged over all
problem instances of a specified size , is at least ε
(0≤ε<1).
7
Security of Hash Functions
We use the terminology (ε,q)-algorithm to denote a
Las Vegas algorithm with average-case success
probability ε
the number of oracle queries made by algorithms
is at most q.
Algorithm 4.1: FIND PREIMAGE (h, y, q)
choose any X0 X,|X0| = q
for each x X0
do if h(x) = y
then return (x)
return (failure)
8
Security of Hash Functions
THEOREM 4.2 For any X0 X with |X0| = q, the
average-case success probability of Algorithm 4.1 is
ε=1 - (1-1/M)q.
proof
Let y Y be fixed. Let Χ0 = {x1,x..,xq}.
For 1 ≤ i ≤ q, let Ei denote the event “h(xi) = y”.
From Theorem 4.1 that the Ei’s are independent
events, and Pr[Ei] = 1/M for all 1 ≤ i ≤ q.
q
1
Therefore Pr[ E1 E1 ... Eq ] 1 1 M
The success probability of Algorithm 4.1, for any
fixed y, is constant.
Therefore, the success probability averaged over
all y Y is identical, too.
9
Security of Hash Functions
Algorithm 4.3: FIND COLLISION (h,q)
choose X0 X , |X0 | = q
for each x X0
do yx h(x)
if yx = yx’ for some x’ ≠ x
then return (x, x’)
else return (failure)
10
Security of Hash Functions
Birthday paradox
In a group of 23 randomly chosen people, at least
two will share a birthday with probability at least
½.
Finding two people with the same birthday is the
same thing as finding a collision for this particular
hash function.
ex: Algorithm 4.3 has success probability at least
½ when q = 23 and M = 365
Algorithm 4.3 is analogous to throwing q balls
randomly into M bins and then checking to see if
some bin contains at least two balls.
11
Security of Hash Functions
THEOREM 4.4 For any X0 X with |X0| = q, the
success probability of Algorithm 4.3 is
M 1 M 2
M q 1
1 (
)(
)...(
)
M
M
M
proof Let X0 = {x1,..,xq}.
Ei : the event “h(xi) {h(x1),..,h(xi-1)}.” , 2 i q
Using induction, from Theorem 4.1 that Pr[E1] = 1
and
M i 1
Pr[ Ei | E1 E2 .. Ei 1 ]
for 2 ≤ i ≤ q.
M
M 1 M 2 M q 1
Pr[ E1 E2 ... Eq ] (
)(
)..(
)
M
M
M
12
Security of Hash Functions
x is small
1-x e-x
The probability of finding no collision is
i
i
q ( q 1)
q 1
q 1
iq11
i
M
(1
) eM e
e 2M
M
i 1
i 1
ε denotes the probability of finding at least one
collision
q ( q 1)
q(q 1)
1
2
ln(
1
)
q q 2M ln
2M
2
M
e
1
1
1
1
Ignore –q,
ε= 0.5, q ≈ 1.17 M
Take M = 365, we get q ≈ 22.3
q
2M ln
13
Security of Hash Functions
This says that hashing just over M random
elements of X yields a collision with a prob. of
50%.
A different choice of εleads to a different
constant factor, but q will still be proportional
to M . So this algorithm is a (1/2, O( M ))algorithm.
14
Security of Hash Functions
The birthday attack imposes a lower bound on the size
of secure message digests. A 40-bit message digest
would be very in secure, since a collision could be found
with prob. ½ with just over 2^20 (about a million)
random hashes.
It is usually suggested that the minimum acceptable
size of a message digest is 128 bits (the birthday attack
will require over 2^64 hashes in this case). In fact, a
160-bit message digest (or larger) is usually
recommended.
15