Transcript H - KIAS
Universal Hash Families
Universal hash families
Family of hash functions
Finite multiset H of string-valued functions, each h ∈ H having the same
nonempty domain A ⊆{0,1}* and range B ⊆{0,1}*, for some constant b
[Definition] -almost universal2 (-AU2) & -almost XOR
universal2 (-AXU2)
A family of hash functions H ={h: A {0,1}b} is -almost universal2 ,
written -AU2, if, for all distinct x, x’∈A, Prh∈ H [h(x)=h(x’)] .
The family of hash functions H is -almost XOR universal2 , written AXU2, if, for all distinct x, x’∈A, and for all c∈{0,1}b, Prh∈ H
[h(x)h(x’)=c] .
=maxxx’{Prh [h(x)=h(x’)]} : collision probability
Principle measure of an AU2 : How small its collision probability is
and how fast one can compute its functions
1
Composition of universal hash families
Make the domain of a hash family bigger
H ={h: {0,1}a{0,1}b}
Hm ={h: {0,1}am{0,1}bm}
its elements are the same as in H where h(x1 x2… xm), for | xi |=a, is defined
by h(x1)|| h(x2)||…|| h(xm)
[Proposition] If H is -AU2, then Hm is -AU2
Make the collision probability smaller
H1 ={h: A {0,1}b1}, H2 ={h: A {0,1}b2}
H1 & H2 ={h: A {0,1}b1+ b2}
its elements are pairs of functions (h1, h2)∈H1H2 and where (h1, h2)(x) is
defined as h1(x)||h2(x)
[Proposition] If H1 is 1-AU2 and H2 is 2-AU2, then H1 & H2 is 12-AU2
2
Composition of universal hash families(cont.)
Make the image of a hash function shorter
H1 ={h: {0,1}a{0,1}b}, H2 ={h: {0,1}b{0,1}c}
H2 ○ H1 ={h: {0,1}a{0,1}c}
its elements are pairs of functions (h1, h2)∈H1H2 and where (h1, h2)(x) is
defined as h2(h1(x))
[Proposition] If H1 is 1-AU2 and H2 is 2-AU2, then H2 ○ H1 is (1+2)AU2
Turn an AU2 family H1 and AXU2 family H2 into an AXU2 family
H2 ○ H1
[Proposition] Suppose H1 ={h: AB} is 1-AU2 and H2 ={h: BC} is
2-AXU2. Then H2 ○ H1 ={h: AC} is (1+2)-AXU2
3
Related researches
Carter-Wegman(1979, 1981)
Efficient authentication code under strongly universal hash functions
Key observations
Long messages can be authenticated efficiently using short keys if the
number of bits in the authentication tag is increased slightly compared to
‘perfect’ schemes
If a message is hashed to a short authentication tag, weaker properties are
sufficient for the first stage of the compression
Under certain conditions, the hash function can remain the same for many
plaintexts, provided that hash results is encrypted using a one-time pad
Stinson(1994)
Improves the works by Wegman-Carter and establishes an explicit link
between authentication codes and strongly universal hash functions
Johansson-Kabatianskii-Smeets(1996)
Establish a relation between authentication codes and codes correcting
independent errors
4
Related researches(cont.)
Krawczyk(1994, 1995)
Propose universal hash functions that are linear with respect to bitwise
XOR
Makes it easier to reuse the authentication code
Encrypt the m-bit hash result for each new message using a one-time pad
Simple and efficient constructions based on polynomials and LFSR
Shoup(1996)
Propose and analyze the constructions based on polynomials over finite
fields
Rogaway : Bucket hashing (1995)
Halevi-Krawczyk : MMH (1997)
Make optimal used of the multiply and accumulate instruction of the
Pentium MMX processor
Black-Halevi-Krawczyk-Krovertz-Rogaway : UMAC (1999)
Further improved the performance on high end processors
5
Constructions
Bucket hashing
is an -AU introduced by Rogaway
Defining the Bucket Hash Family B
word size w(1), parameters n(1), N(3)
domain D={0,1}wn, range R={0,1}wN
Let h B and let X=X1 Xn be the string we want to hash, where each
|Xi|=w. Then h(X) is defined by the following algorithm. First, for each
j{1,, N}, initialize Yj to 0w. Then, for each i{1,, n} and k hi,
replace Yk by Yk Xi. When done, set h(X) = Y1||Y2||||YN.
Pseudocode
for j 1 to N do Yj 0w
for i 1 to n do
Yhi1 Yhi1 Xi
Yhi2 Yhi2 Xi
Yhi3 Yhi3 Xi
return Y1||Y2||||YN
6
Constructions(cont.)
Bucket Hashing with Small Key Size
N=2s/L
Each hash function hB’[w,M,N] is specified by a list of length M
each entry contains L integers in the interval [0, N-1]
L arrays are introduced, each containing N buckets
Next, each array is compressed to s/L words, using a fixed primitive
element GF(2s/L)
The hash result is equal to the concatenation of the L compressed arrays,
each containing s/L words
7
Constructions(cont.)
Hash Family Based on Fast Polynomial Evaluation
is based on polynomial evaluation over a finite field
q = 2r, Q = 2m = 2r+s, n = 1+2s, : a linear mapping from GF(Q) onto
GF(q)
Q = q0m, q = q0r , q0 : a prime power
fa(x) =a0 + a1x + + an-1xn-1
x, y, a0, a1, , an-1 GF(Q), z GF(q)
H = {hx,y,z : hx,y,z(a) = hx,y,z(a0, a1, , an-1) = (y fa(x)) + z}
8
Constructions(cont.)
Hash Family Using Toeplitz Matrices
Toeplitz matrices are matrices with constant values on the left-to-right
diagonals
A Toeplitz matrix of dimension n m can be used to hash messages of
length m to hash results of length n by vector-matrix multiplication
The Toeplitz construction uses matrices generated by sequences of
length n + m - 1 drawn from -biased distributions
-biased distributions are a tool for replacing truly random sequences by
more compact and easier to generate sequences
The lower , the more random the sequence is
Krawczyk proves that the family of hash functions associated with a
family of Toeplitz-matrices corresponding to sequences selected from a
-biased distribution is -AXU with = 2-n +
9
Constructions(cont.)
Evaluation Hash Function
is one of the variants analyzed by Shoup
The input (of length tn) : viewed as a polynomial M(x) of degree < t
over GF(2n)
The key : a random element GF(2n)
the hash result : equal to M() GF(2n)
This family of hash functions is -AXU with = t/2n
10
Constructions(cont.)
Division Hash Function
represents the input as a polynomial M(x) of degree less than tn over
GF(2)
The hash key : a random irreducible polynomial p(x) of degree n over
GF(2)
The hash result : m(x) xn mod p(x)
This family of hash functions is -AXU with = tn/2n
The total number of irreducible polynomials of degree n is roughly equal
to 2n/n
11
Constructions(cont.)
MMH(Multilinear Modular Hashing) hashing
consists of a (modified) inner product between message and key
modulo a prime p (close to 2w, with w the word length; below w = 32)
is an -AXU2, but with xor replaced by subtraction modulo p
The core hash function maps 32 32-bit message words and 32 32-bit key
words to a 32-bit result
The key size is 1024 bits and = 1.5/ 230
For larger messages, a tree construction can be used
the value of and the key length have to be multiplied by the height of the
tree
This algorithm is very fast on the Pentium Pro, which has a multiply
and accumulate instruction
On a 32-bit machine, MMH requires only 2 instructions per byte for a 32bit result
12
Comparing the Hash Functions
13
Comparing the Hash Functions(cont.)
Scheme A
the input : divided into 32 blocks of 8 Kbyte
each block is hashed using the same bucket hash function with N = 160
results in an intermediate string of 20480 bytes
Scheme B
the input : divided into 64 blocks of 4 Kbyte
each block is hashed using the same bucket hash function with short
key(s=42, L=6, N=128)
results in an intermediate string of 10752 bytes
Scheme C
the input is divided into 64 blocks of 4 Kbyte
each block is hashed using a 331024 Toeplitz matrix, based on a -biased
sequence of length 1056 generated using an 88-bit LFSR
The length of the intermediate string is 8448 bytes
14
Comparing the Hash Functions(cont.)
Scheme D
the input : hashed twice using the polynomial evaluation hash function
with = 2-15 resulting in a combined value of 2-30
W = 5
The performance is slightly key dependent. Therefore an average over a
number of keys has been computed.
Scheme E
this is simply the evaluation hash function with t = 32768
the resulting value of is too small
However, choosing a smaller value of n that is not a multiple of 32 induces
a performance penalty
Scheme F
the input : divided into 2048 blocks of 128 bytes
each block is hashed twice using MMH
the length of the intermediate string is 16384 bytes
– It is not possible to obtain a value of closer to 2-32 in an efficient way
15
Universal Hashing MAC
Message authentication based on Universal hashing
Message authentication based on Universal hashing
Wegman-Carter approach
The parties share a secret key k=(h,P)
– P : infinite random string
– h : function drawn randomly from a strongly universal2 family of hash
functions H
» H is strongly universal2 if, for all xx’, the random variable
h1(x)||h2(x), for h ∈ H , is uniformly distributed
To authenticate a message x, the sender transmits h(x) xored with the next
piece of the pad P
Standard cryptographic technique
use of a pseudorandom function family, F
MAC( h ,a ) ( xi ) (i, Fa (i ) h( xi ))
[Theorem] Assume H is -AXU2, and that F is replaced by the truly
random function family R of functions. In this case, if an adversary makes
q1 queries to the authentication algorithm S and q2 queries to the
verification algorithm V, the probability of forging a MAC is at most q2
17
Universal hashing MAC
Why Universal hashing MAC?
The speed of a universal hashing MAC depends on the speed of the
hashing step and encrypting step
The encryption does not take long
hash function compresses messages => the encrypting message is short
The combinatorial properties of the universal hash function family is
mathematically proven
needs no “over-design” or “safe margin” the way a cryptographic primitive
would
Universal hashing MAC makes for desirable security properties
can select a cryptographically conservative design for the encrypting
step
can pay with only a minor impact on speed
the cryptographic primitive is applied only to the much shorter hashed
image of the message
security and efficiency are not conflicting requirements
18
UMAC
The UMAC algorithm
species how the message, key, and nonce determine an authentication
tag
The sender
will need to provide the receiver with the message, nonce, and tag
The receiver
can then compute what “should be” the tag for this particular message and
nonce, and see if it matches the received tag
employs a subkey generation process in which the shared
(convenient-length) key is mapped into UMAC's internal keys
subkey generation is done just once, at the beginning of a
communication session during which the key does not change, and so
subkey-generation is usually not performance-critical
UMAC depends on a few different parameters
19
UMAC(cont.)
An illustrative special case of UMAC
Subkey generation:
Using a PRG, map Key to K = K1K2 K1024 and to A
– each Ki : a 32-bit word, |A| = 512
Hashing the message Msg to HM = NHXKey(Msg):
Let Len be |Msg| mod 4096, encoded as a 2-byte string
Append to Msg the minimum number of 0 bits to make |Msg| divisible by
8
Let Msg = Msg1 || Msg2 || || Msgt where each Msgi is 1024 words
except for Msgt, which has between 2 and 1024 words
Let HM = NHK(Msg1) || NHK(Msg2) || || NHK(Msgt) || Len
Computing the authentication tag:
The tag is Tag = HMAC-SHA1A(HM || Nonce)
20
UMAC(cont.)
Definition of NH
blocksize n 2, wordsize w 1
domain : A = {0, 1}2w {0, 1}2w {0, 1}nw
range : B = {0, 1}2w
a random function in NH[n,w] is given by a random nw-bit string K
Uw : {0, , 2w-1}, U2w : {0, , 22w -1}
for integers x, y let (x +w y) denote (x + y) mod 2w
M A and M = M1 Ml , |M1| = = |Ml| = w
K = {0, 1}nw and K = K1 Kn , |K1| = = |Kn| = w
21
UMAC(cont.)
where mi Uw is the number that Mi represents (as an unsigned integer)
where ki Uw is the number that Ki represents (as an unsigned integer)
the right-hand side of the above equation is understood to name the
(unique) 2w-bit string which represents (as an unsigned integer) the
U2w-valued integer result
22
Compression Function of MD4
23
Compression Function of MD5
24
Compression Function of RIPEMD-160
25
Compression Function of SHA-1
26
Conclusions