Message Integrity and Hash Function

Download Report

Transcript Message Integrity and Hash Function

Hash Function
What are hash functions?
• Just a method of compressing strings
– E.g., H : {0,1}*  {0,1}160
– Input is called “message”, output is “digest”
• Why would you want to do this?
– Short, fixed-size better than long, variable-size
• True also for non-crypto hash functions
– Digest can be added for redundancy
– Digest hides possible structure in message
How are they built?
But not
always…
Typically using Merkle-Damgård iteration:
1. Start from a “compression function”
– h: {0,1}b+n{0,1}n
|M|=b=512 bits
h
c =160 bits
d=h(c,M)=160 bits
2. Iterate it
M1
IV=d0
M2
h
d1
ML-1
h
d2
…
ML
h
dL-1
h
dL
d=H(M)
The Merkle-Damgard iterated construction
m[0]
IV
(fixed)
m[1]
m[2]
m[3] ll PB
H(m)
h
h
h
h
Thm: h collision resistant ⇒ H collision resistant
Can we use H(.) to directly build a MAC?
No,Given H( k ll m) can compute H( k ll m ll PB ll w ) for any w.
What are they good for?
“Modern, collision resistant hash functions were designed to create
small, fixed size message digests so that a digest could act as a
proxy for a possibly very large variable length message in a digital
signature algorithm, such as RSA or DSA. These hash functions
have since been widely used for many other “ancillary” applications,
including hash-based message authentication codes, pseudo
random number generators, and key derivation functions.”
Important Properties of Cryptographic
Hash Functions
•
•
•
•
First pre-image resistance
Second pre-image resistance
Strong collision resistance
Efficient
First pre-image resistance:
• Given y in Y, it is “computationally infeasible”
to compute a value x in X such that h(x) = y
• Hard to invert
• Why we need this property?
– If hash function is invertible, then it is not useful
for crypto applications n E.g., password
– authentication will be broken
Second pre-image resistance:
• Given x in X, it is “computationally infeasible” to
compute a different value x’ in X with x!=x’ such that
h(x) = h(x’)
• Weak collision-resistance (sometimes also called target
collision resistance)
• Why we need this property?
– If hash function is not weak collision-resistant, then it is
not useful for crypto applications n E.g., password
authentication will be broken because even if the attacker
doesn’t get your actual password, he can still get another
string of bits that “collides with” and is as good as your
actual password
Strong collision-resistance:
• It is “computationally infeasible” to find distinct inputs x, x’
with x!=x’ such that h(x) = h(x’)
• Why we need this property?
– Usability of hash functions
– Security of hash-then-sign schemes
•
•
•
•
Given h, attacker computes m,m’ such that h(m) = h(m’)
Attacker gives m to Alice to hash-then-sign
Alice produces <m,sign(m)>
Attacker replaces it with <m’,sign(m)> and claims Alice signed it
• Strong collision resistance implies weak collision resistance
Birthday Paradox
Birthday Paradox
Two variants:
ƒ when drawing elements randomly (with replacement) from
a set of N elements, with high probability a repeated element
will be encountered after ~sqrt(N) selections
ƒ if we have a set of N elements, and we randomly select two
subsets of size ~sqrt(N) each, then with high probability, the
intersection of the two subsets will not be empty
These facts have a profound impact on the design of hash
functions (and other cryptographic algorithms and protocols)!
Birthday Paradox
ƒ Given a set of N elements, from which we draw k elements randomly
(with replacement). What is the probability of encountering at least
one repeating element?
ƒ first, compute the probability of no repetition:
– the first element x1 can be anything
– when choosing the second element x2, the probability of x2 ≠ x1 is 1-1/N
– when choosing x3, the probability of x3 ≠ x2 and x3 ≠ x1 is 1-2/N
–…
– when choosing the k-th element, the probability of no repetition is
1-(k-1)/N
– the probability of no repetition is (1 - 1/N)(1 - 2/N)…(1 – (k-1)/N)
– when x is small, (1-x) ≈ e-x
– (1 - 1/N)(1 - 2/N)…(1 – (k-1)/N) = e-1/Ne-2/N … e-(k-1)/N = e-k(k-1)/2N
ƒ the probability of at least one repetition after k drawing is
1 – e-k(k-1)/2N
Birthday Paradox
How many drawings do you need, if you want the probability of
at least one repetition to be ε ?
ƒ solve the following for k:
ε = 1 – e-k(k-1)/2N
k(k-1) = 2N ln(1/1-ε)
k ≈ sqrt(2N ln(1/1-ε))
ƒ examples:
ε = ½ -> k ≈ 1.177 sqrt(N)
ε = ¾ -> k ≈ 1.665 sqrt(N)
ε = 0.9 -> k ≈ 2.146 sqrt(N)
ƒ origin of the name “birthday paradox”:
– elements are dates in a year (N = 365)
– among 1.177 sqrt(365) ≈ 23 randomly selected people, there will be
at least two that have the same birthday with probability ½
Choosing the output size of hash
function
good hash functions can be modeled as follows:
– given a hash value y, the probability that a randomly chosen input x
maps to y is ~2-n
– the probability that two randomly chosen inputs x and x’ map into
the same hash value is also ~2-n
Æ n should be at least 64, but 80 is even better
ƒ birthday attacks
– among ~sqrt(2n) = 2n/2 randomly chosen messages, with high
probability there will be a collision pair
– it is easier to find collisions than to find preimages or 2nd
preimages for a given hash value
- in order to resist birthday attacks, n should be at least 128, but
160 is even better
Standardized method: HMAC
(Hash-MAC)
Most widely used MAC on the Internet.
H: hash function.
example: SHA-256 ; output is 256 bits
Building a MAC out of a hash function:
HMAC:
S( k, m ) = H( kopad ll H( kipad ll m ) )
HMAC in pictures
k⨁ipad
IV
(fixed)
>
h
m[0]
>
m[1]
h
>
m[2] ll PB
h
h
>
k⨁opad
tag
IV
(fixed)
>
h
>
h
Similar to the NMAC PRF.
main difference: the two keys k1, k2 are dependent
Warning: verification timing attacks
[L’09]
Example: Keyczar crypto library (Python)
[simplified]
def Verify(key, msg, sig_bytes):
return HMAC(key, msg) == sig_bytes
The problem: ‘==‘ implemented as a byte-bybyte comparison
• Comparator returns false when first inequality
found
Warning: verification timing attacks
[L’09]
m , tag
target
msg m
k
accept or reject
Timing attack: to compute tag for target message m
do:
Step 1: Query server with random tag
Step 2: Loop over all possible first bytes and query server.
stop when verification takes a little longer than in step 1
Step 3: repeat for all tag bytes until valid tag found
Make string comparator always take same time
(Python) :
return false if sig_bytes has wrong length
result = 0
for x, y in zip( HMAC(key,msg) , sig_bytes):
result |= ord(x) ^ ord(y)
return result == 0
Or
def Verify(key, msg, sig_bytes):
mac = HMAC(key, msg)
return HMAC(key, mac) == HMAC(key, sig_bytes)
Lesson
Don’t implement crypto yourself !