chapter1_l2 - Big

Download Report

Transcript chapter1_l2 - Big

Cryptanalysis
•
•
•
•
Four kinds of attacks (recall)
The objective: determine the key (Herckhoff principle)
Assumption: English plaintext text
Basic techniques: frequency analysis based on:
– Probabilities of occurrences of 26 letters
– Common digrams and trigrams.
1
Cryptanalysis -- statistical analysis
• Probabilities of occurrences of 26 letters
–
–
–
–
–
–
E, having probability about 0.120 (12%)
T,A,O,I,N,S,H,R, each between 0.06 and 0.09
D,L, each around 0.04
C,U,M,W,F,G,Y,P,B, each between 0.015 and 0.028
V,K,J,X,Q,Z, each less than 0.01
See table 1.1, page 26
• 30 common digrams (in decreasing order):
– TH, HE, IN, ER, AN, RE,…
• 12 common trigrams (in decreasing order):
– THE, ING,AND,HER,ERE,…
2
Cryptanalysis of Affine Cipher
• Suppose a attacker got the following Affine cipher
– FMXVEDKAPHFERBNDKRXRSREFNORUDSDKDVSHVUFEDKAPRKDLYEVLR
HHRH
• Cryptanalysis steps:
– Compute the frequency of occurrences of letters
• R: 8, D:7, E,H,K:5, F,S,V: 4 (see table 1.2, page 27)
– Guess the letters, solve the equations, decrypt the cipher, judge correct or not.
• First guess: Re, Dt, i.e., eK(4)=17, eK(19)=3
– Thus, 4a+b=17
 a=6, b=19, since gcd (6,26)=2, so incorrect.
19a+b=3
• Next guess: Re, Et, the result will be a=13, not correct.
• Guess again: Re, Ht, the result will be a=8, not correct again.
• Guess again: Re, Kt, the result will be a=3, b=5.
– K=(3,5), eK(x)=3x+5 mod 26, and dK(y)=9y-19 mod 26.
– Decrypt the cipher: algorithmsarequitegeneraldefinitionsofarithmeticprocesses
•
•
• If the decrypted text is not meaningful, try another guess.
Need programming: compute frequency and solve equations
Since Affine cipher has 12*26=312 keys, can write a program to try all keys.
3
Cryptanalysis of substitution cipher
• Final goal is to find the corresponding plaintext
letter for each ciphertext letter.
• Ciphertext: example 1.11, page 28
• Steps:
– Frequency computation, see table 1.3, page 29
• Guess Ze, quite sure
• C,D,F,J,M,R,Y are t,a,o,i,n,s,h,r, but not exact
– Look at digrams, especially –Z or Z-.
• Since ZW occurs 4 times, but no WZ, so guess Wd (because
ed is a common digram, but not de)
• Continue to guess
– Look at the trigrams, especially THE, ING, AND,…
4
Cryptanalysis of Vigenere cipher
• In some sense, the cryptanalysis of Vigenere
cipher is a systematic method and can be
totally programmed.
• Step 1: determine the length m of the keyword
– Kasiski test and index of coincidence
• Step 2: determine K=(k1,k2,…,km)
– Determine each ki separately.
5
Kasiski test—determine keyword length m
• Observation: two identical plaintext segments will
be encrypted to the same ciphertext whenever they
appear  positions apart in plaintext, where 0
mod m. Vice Versa.
• So search ciphertext for pairs of identical
segments, record the distance between their
starting positions, such as 1, 2,…, then m should
divide all of i’s. i.e., m divides gcd of all i’s.
6
Index of coincidence
• Can be used to determine m as well as to confirm
m, determined by Kasiski test
• Definition: suppose x=x1x2,…,xn is a string of
length n. The index of coincidence of x, denoted by
Ic(x), is defined to be the probability that two
random elements of x are identical.
– Denoted the frequencies of A,B,…,Z in x by f0,f1,…,f25
25 f
25
i
( )
 fi(fi-1)
2
i=0
= i=0
( Formula IC )
--Ic(x)=
n(n-1)
(n )
2
7
Index of coincidence (cont.)
Suppose x is a string of English text, denote the expected
probability of occurrences of A,B,…,Z by p0,p1,…,p25 with
values from table 1.1, then Ic(x)  pi2 =0.0822+0.0152+…+0.0012=0.065
(since the probability that two random elements both are A is p02, both are B is p12,…)
Question: if y is a ciphertext obtained by shift cipher, what is the Ic(y)?
Answer: should be 0.065, because the individual probabilities will be
permuted, but the pi2 will be unchanged.
Therefore, suppose y=y1y2…yn is the ciphertext from Vigenere cipher.
For any given m, divide y into m substrings:
y1=y1ym+1y2m+1…
if m is indeed the keyword length, then
y2=y2ym+2y2m+2…
each yi is a shift cipher, Ic(yi) is about 0.065.
…
ym=ymy2my3m…
otherwise, Ic(yi)  26(1/26)2 = 0.038.
8
Index of coincidence (cont.)
For purpose of verify keyword length m, divide the ciphertext into m substrings,
compute the index of coincidence by formula IC for each substring. If all IC values
of the substrings are around 0.065, then m is the correct keyword length. Otherwise
m is not the correct keyword length.
If want to use Ic to determine correct keyword length m, what to do?
Beginning from m=2,3, … until an m, for which all substrings have
IC value around 0.065.
Now, how to determine keyword K=(k1,k2,…,km)? Assume m is given.
9
Determine keyword K=(k1,k2,…,km)
1. Determine each ki (from yi) independently.
2. Observation:
2.1 let f0,f1,…,f25 denote the frequencies of A,B,…,Z in yi and n′=n/m
2.2 then probability distribution of 26 letters in yi is:
f0
f25
n′,
, n′
2.3 if the shift key is ki, then f0+ki (i.e., A+ki) is the frequency of
a in the corresponding plaintext xi , …, f25+ki (note the subscript
25+ki should be computed by modulo 26) is the frequency
of z in xi. Since xi is normal English text, probability distribution
f0+ki
f25+ki
should be “close to” ideal probability
of
n′,
, n′
distribution p0,p1,…,p25.
p0, …,
p25
So: p f0+ki +…+ p f25+ki
0
25
n'
n'
p02+…+p252 =0.065
10
Determine keyword K=(k1,k2,…,km) (cont.)
On the other hand, for any g !=ki,
p0 f0+g
n'
f25+g
p
+…+ 25
n'
will not be close to 0.065.
Therefore, define:
Mg=i=025 pi
fi+g
n′
When g=ki, Mg will generally be around 0.065 (i.e., i=025 pi2).
Otherwise Mg will be quite smaller than 0.065.
So let g from 0, until 25, compute Mg, and for some g, if Mg
is around 0.065, then ki=g.
Note: the subscript i+g should be seen as modulo 26.
11
Cryptanalysis of Vigenere cipher--example
• Example 1.12, page 33.
– Using Kasiski test to determine the keyword length
• CHR appears five times at 1,166,236,276,286
• the distance is 165, 235,275,285, the gcd is 5, so m=5.
– Using index of coincidence to verify m=5.
• Divide ciphertext into y1, y2, y3, y4, y5
• Compute f0,f1,…,f25 for each yi and then Ic(yi), get 0.063,
0.068,0.069,0.061,0.072, so m=5 is correct.
– Determine ki for i=1,…,5.
• Compute Mg for g=0,1,…,25 and if Mg  0.065, then let
ki=g. where
fi+g
25
Mg=i=0 pi
n′
As a result, k1=9,k2=0,k3=13,k4=4,k5=19, i.e., JANET
12
Cryptanalysis of Hill cipher
• Difficult to break based on ciphertext only
• Easily to break based on both ciphertext and
plaintext.
• Suppose given at least m distinct plaintextciphertext pairs: xj=(x1,j,x2,j,…,xm,j)
yj=(y1,j,y2,j,…,ym,j)
then define two matrices X=(xi,j) and Y=(yi,j)
Let Y=XK, if X is invertible, then K=X-1Y.
13
Cryptanalysis of Hill cipher--example
• Suppose plaintext is: friday
and ciphertext is: PQCFKU
and the m=2. Then eK(f,r)=(P,Q), eK(i,d)=(C,F).
That is:
(
15 16
2 5)
=
5 17
( 8 3 )K
5 17 -1 15 16
9 1 15 16
K=(8 3 ) ( 2 5 )=( 2 15 )( 2 5)=( 7 19)
8 3
Then using the third pair, i.e., (a,y) and (K,U) to verify K.
In case m is unknown, try m=2,3, …
14
Cryptanalysis of LFSR stream cipher
• Vulnerable to known-plaintext attack.
• Suppose m, plaintext binary string x1,x2,…,xn and
ciphertext binary string y1,y2,…,yn are known, as
long as n>2m, the key can be broken:
– Keystream is: zi=(xi+yi) mod 2. (i=1,2,…,n)
– Then the initialization vector of K is z1,…, zm.
– Next is to determine coefficients (c0,c1,…,cm-1) of K
(recall that zi+m = m-1j=0cjzi+j mod 2 for all i1)
i.e,
z1 z2 … zm
–(zm+1,zm+2,…,z2m)=(c0,c1,…,cm-1) z2 z3 … zm+1
……………
zm zm+1 … z2m-1
15
Cryptanalysis of LFSR stream cipher (cont.)
Therefore:
-1
z1 z2 … zm
–(c0,c1,…,cm-1)=(zm+1,zm+2,…,z2m) z2 z3 … zm+1
……………
zm zm+1 … z2m-1
16
Cryptanalysis of LFSR stream cipher --example
Example 1.14, page 37. Suppose LFSR 5 with the following:
Ciphertext string: 101101011110010
Plaintext string: 011001111111000
Then keystream: 110100100001010
Therefore initialization vector is: 11010.
For next five key elements: 01000, set up equation
for coefficients (c0,c1,c2,c3,c4) and solve it.
The result is: (c0,c1,c2,c3,c4) =(1,0,0,1,0)
i.e., zi+5=(zi+zi+3) mod 2.
17