ENGG2013 Lecture 11 - Chinese University of Hong Kong

Download Report

Transcript ENGG2013 Lecture 11 - Chinese University of Hong Kong

ENGG2013 Unit 11
Row-Rank
Feb, 2011.
Last week
• Caesar Cipher
– Modulo arithmetic: divide and take the remainder
– Encryption: y = x + 3 mod 26
– Decryption: x = y – 3 mod 26
• Hill cipher
– Matrix with entries 0,1,2,…,or 25.
– Need to compute matrix inverse mod 26
kshum
ENGG2013
2
Today
• More examples on linear algebra mod n
• Rank of a matrix
– Row-rank
– How to compute row-rank by RREF
kshum
ENGG2013
3
TASTE FROM DISCRETE MATH
kshum
ENGG2013
4
Continuous vs discrete vs finite
2
0.25
Continuous
–1
0
1
e
The real number line
Discrete
0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, …..
Finite, for example
0, 1. (binary bit)
kshum
ENGG2013
5
Example of Finite mathematics
A
B
C
D
E
F
G
H
I
J
K
L
M
0
1
2
3
4
5
6
7
8
9
10
11
12
N
O
P
Q
R
S
T
U
V
W
X
Y
Z
13
14
15
16
17
18
19
20
21
22
23
24
25
Fix a one-to-one correspondence between the English alphabets
and the integers mod 26.
Caesar’s cipher: shift a letter to the right by 3, x  x+3 mod 26
kshum
ENGG2013
6
A psychological barrier
• In discrete and finite math., we need to forget
the fractions and decimal nimbers.
– In application like Caesar cipher, 0.25 has no
meaning.
• Geometry is often missing. We need to redefine some operations like taking inverse in a
pure algebraic way.
kshum
ENGG2013
7
Review
• What is the meaning of “square root of 5”?
kshum
ENGG2013
8
Square root of a positive real
number
• Geometric meaning:
1
0
1
2
5
• Algebraic meaning:
“Square root of 5” is a real number x which
satisfies the equation x2 = 5.
kshum
ENGG2013
9
Review
• What is the meaning of “1/3”?
• What is the meaning of “(2/5)-1”
kshum
ENGG2013
10
Multiplicative inverse of a real
number
• Geometric meaning: 1/3 is the length
obtained if we divide a line segment of unit
length into three equal parts.
0
1/3
1
• Algebraic meaning:
“One third” is a real number x which satisfies
the equation 3x = 1.
kshum
ENGG2013
11
Multiplicative inverse of a real
number
• Geometric meaning: What is (2/5)-1?
2/5
Area = 1
x
• Algebraic meaning:
“(2/5)-1” is a real number x which satisfies
the equation (2/5)x = 1.
(2/5)-1 = 5/2 = 2.5
kshum
ENGG2013
12
What is a number?
• We will adopt the viewpoint that
A number is the root of some polynomial
(with integer coefficients).
• For example:
– “2” is a number which satisfies x2 – 2=0.
– “3/4” is a number which satisfies 4x – 3=0.
– “10” is a number which satisfies x – 10=0
More precisely, the above numbers are “algebraic number”.
Some numbers such as  and e are not algebraic,
 and e are examples of “transcendental numbers”.
kshum
ENGG2013
13
Square root mod 10
• Geometric meaning: N/A
• Algebraic meaning:
Square root of 5 mod 10 is an integer x between
0 and 9 which satisfies x2 = 5 mod 10.
Example: 5 mod 10 is equal to 5.
Example: 4 mod 10 is either 2 or 8.
Example: 3 mod 10 does not exist
kshum
ENGG2013
14
Multiplicative inverse mod 10
• Geometric meaning: N/A
• Algebraic meaning: What is the multiplicative
inverse of 3 mod 10?
What we are going to do is:
If we can find an integer x between 0 and 9 such
that 3x=1 mod 10, then we say that this x is
the multiplicative inverse of 3 mod 10, and
write 3-1 = x mod 10.
kshum
ENGG2013
15
Find it by exhaustive search
Simply try all 10 possibilities one by one:
38 = 24 = 4 mod 10
31 = 3 mod 10
39 = 27 = 7 mod 10
32= 6 mod 10
30 = 27 = 0 mod 10
33 = 9 mod 10
34 = 12 = 2 mod 10
Answer:
3-1 = 7 mod 10
35 = 15 = 5 mod 10
36 = 18 = 8 mod 10
37 = 21 = 1 mod 10
Aha!
kshum
ENGG2013
16
Multiplicative inverse mod 5
• Geometric meaning: N/A
• Algebraic meaning: What is the multiplicative
inverse of 3 mod 5?
Try to find an integer x between 0 and 4 such
that 3x=1 mod 5. If succeed, we say that this x
is the multiplicative inverse of 3 mod 5, and
write 3-1 = x mod 5.
kshum
ENGG2013
17
Find it by exhaustive search
Simply try all 5 possibilities one by one:
31 = 3 mod 5
32= 6 = 1 mod 5
33 = 9 = 4 mod 5
34 = 12 = 2 mod 5
30 = 0 mod 5
Answer:
3-1 = 2 mod 5
kshum
ENGG2013
18
Multiplicative inverse mod n
in general
• Given an integer b and integer n.
• Definition:
If we can find an integer x between 0 and n-1
such that bx=1 mod n, then we say that this x
is the multiplicative inverse of b mod n, and
write b-1 = x mod n.
Two keywords
kshum
ENGG2013
19
First keyword “if”:
inverse may not exist
• What is the multiplicative inverse of 2 mod
10?
– Try to find an integer x between 0 and 9, such that
2x=1 mod 10.
21 = 2 mod 10
27 = 14 = 4 mod 10
22 = 4 mod 10
28 = 16 = 6 mod 10
23 = 6 mod 10
24 = 8 mod 10
29 = 18 = 8 mod 10
20 = 0 mod 10
25 = 10 = 0 mod 10
26 = 12 = 2 mod 10
kshum
None of them is equal to 1.
2-1 mod 10 does not exist.
ENGG2013
20
Property 1
• If the greatest common divisor of b and n are
larger than 1, then we cannot find any
multiplicative inverse of b mod n.
• Reason:
– Let d be the greatest common divisor of b and n.
Suppose d > 1.
– For any integer x, bx is a multiple of d.
But n is also a multiple of d.
– The remainder of n divided by bx is also a
multiple of d, which ENGG2013
by assumption is not 1.
kshum
21
Property 2, “the”:
multiplicative inverse is unique
• Suppose that the gcd of b and n is 1. (This assumption makes sense
because of property 1).
• Suppose that b has two multiplicative inverses mod n, say x1 and x2,
such that bx1= 1 mod n and bx2 = 1 mod n.
• Let q1 be the quotient when we divide bx1 by n and q2 be the
quotient when we divide bx2 by n. We have:
– bx1 = q1 n + 1,
– bx2 = q2 n + 1.
• Subtracting the above two equations, we get
b(x1 – x2) = (q1 – q2) n
•  b(x1 – x2) is divisible by n
• Because n and b has no common factor, (x1 – x2) is divisible by n.
• Because x1 and x2 are both between 0 and n – 1 , we must have
x1 = x2
• Therefore, there is only one solution to bx = 1 mod n.
kshum
ENGG2013
22
Example: Multiplicative inverse
mod 10
• 0-1 mod 10 does not exist.
• 5-1 mod 10 does not exist, because
gcd(5,10)=2.
• 1-1 = 1 mod 10, because 11=1
mod 10.
• 6-1 mod 10 does not exist, because
gcd(6,10)=2.
• 2-1 mod 10 does not exist, because
gcd(2,10)=2.
• 3-1 = 7 mod 10, because 37=21=1
mod 10.
• 4-1 mod 10 does not exist, because
gcd(4,10)=2.
kshum
• 7-1 = 3 mod 10, because 73=21=
mod 10.
• 8-1 mod 10 does not exist, because
gcd(8,10)=2.
• 9-1 = 9 mod 10, because 91=81=1
mod 10.
ENGG2013
23
2x2 Hill cipher
• Encryption matrix
a,b,c,d are integers
between 0 and 25.
• Encryption is done by matrix multiplication
• Need to find a decryption matrix D such that
kshum
ENGG2013
24
Formula for 2x2 matrix inverse
This is an integer between 0 and 25.
The multiplicative inverse calculated as in pp.12-15
• Example: Suppose that the encryption matrix
is given by
Task: Find the matrix inverse of E mod 26.
kshum
ENGG2013
25
Find the multiplicative inverse of
det by exhaustive search
• det(E) = 23 – 19= – 3 = 23 mod 26.
• Find an integer x between 0 and 25 such that
23x = 1 mod 26.
• By property 1, we only need to search for x
which has no common factor with 26.
231 = 23 mod 26
233 = 17 mod 26
239 = 25 mod 26
2311 = 19 mod 26
We can skip 13
235 = 11 mod 26
2315 = 7 mod 26
237 = 5 mod 26
kshum
2317 = 1 mod 26
ENGG2013
Stop. The multiplicative inverse
of 23 is found.
26
The inverse mod 26
• det(E) -1 = 23-1 = 17 mod 26,
– because 2317=391 = 1526+1
• The inverse of E mod 26 is
• Check:
kshum
ENGG2013
27
Encryption by Hill cipher
• The encryption and decryption matrix must be kept secret.
• Encryption of “HATE”:
– convert the English letters to integers between 0 and 25
HATE  7, 0, 19, 4.
– group the letters into blocks of length 2, and then multiply by
the matrix E.
– Convert 14, 7, 22, 5 to the ciphertext “OHWF”
A
B
C
D
E
F
G
H
I
J
K
L
M
0
1
2
3
4
5
6
7
8
9
10
11
12
N
O
P
Q
R
S
T
U
V
W
X
Y
Z
13
14
15
16
17
18
19
20
21
22
23
24
25
kshum
ENGG2013
28
Decryption by Hill cipher
• Decryption of “OHWF”,
– convert the English letters to integers between 0 and 25
OHWF  14, 7, 22, 5.
– group the letters into blocks of length 2, and then multiply
by the decryption matrix D.
– Convert 7, 0, 19, 4 to “HATE”.
A
B
C
D
E
F
G
H
I
J
K
L
M
0
1
2
3
4
5
6
7
8
9
10
11
12
N
O
P
Q
R
S
T
U
V
W
X
Y
Z
13
14
15
16
17
18
19
20
21
22
23
24
25
kshum
ENGG2013
29
RANK
kshum
ENGG2013
30
Back to the “real” world
• In the remainder of this lecture, all variables
are real numbers.
• In a system of linear equations, the notion of
“rank” measures the number of essentially
different equations.
They are multiple of each other.
We can remove one of them
without altering the solutions.
kshum
ENGG2013
31
A larger example
• In a large system of linear equations, how to
find and remove redundant equations?
kshum
ENGG2013
32
The representation using
augmented matrix
• Recall that the augmented matrix
representation of a system of linear equations.
The third equation is the sum
of the first and second equation
The third row is equal to the sum
of the first and second rows
kshum
ENGG2013
33
Recall: Linearly dependence
• Definition: m vectors v1, v2, …, vm (each of
them have n components) are linearly
dependent if there are m real numbers, a1, a2,
…, am, not all zero, such that the linear
combination is equal to zero
a1 v1+ a2 v2+ … + am vm= 0
kshum
ENGG2013
34
Simple fact
• If one of the m vectors v1, v2, …, vm is the zero
vector 0, then v1, v2, …, vm are linearly
dependent.
• Reason: Suppose v1= 0.
1 v1+ 0 v2+ … + 0 vm= 0
Not all coefficients are zero. In fact, the first
coefficient is not zero.
kshum
ENGG2013
35
Theorem
v1, v2, …, vm are linearly dependent
if and only if one of them is a linear
combination of the others.
Proof () Suppose that v1, v2, …, vm are linearly
dependent. By definition, we can find m real
numbers, a1, a2, …, am, not all zero such that
a1 v1+ a2 v2+ … + am vm= 0. Suppose that ai0.
Then
vi = (-1/ai)(a1 v1+ …+ai-1 vi-1+ ai+1 vi+1+ … + am vm)
The i-th vector vi is a linear combination of the others.
kshum
ENGG2013
36
Theorem
v1, v2, …, vm are linearly dependent
if and only if one of them is a linear
combination of the others.
Proof () Suppose that vi is a linear combination of the
other vector: vi = c1 v1+ …+ci-1 vi-1+ ci+1 vi+1+ … + cm vm
for some constants c1, …,ci-1, ci+1,…cm.
Then 0 = c1 v1+ …+ci-1 vi-1– vi+ ci+1 vi+1+ … + cm vm
The i-th coefficient is non-zero (it is equal to -1). The
zero vector can be expressed as a linear combination
of v1, v2, …, vm, in which not all coefficients are zero.
kshum
ENGG2013
37
Another way to say the same thing
• If v1, v2, …, vm are linearly independent
then none of them can be written as a linear
combination of the others,
and vice versa.
kshum
ENGG2013
38
Definition: Row-rank of a matrix
• Given a matrix M, the row-rank of M is the
maximum number of linearly independent
rows in M.
• Example:
– Because the third row is the sum of the other two rows, i.e., is a linear
combination of the other two rows., by the theorem in p.32, the three
rows are linearly dependent.
– The first and second row are linearly independent.
– Therefore the maximum number of linearly independent rows is 2.
–  the row-rank of this matrix is equal to 2.
kshum
ENGG2013
39
Example
• What is the row-rank of
• What is the row-rank of
kshum
ENGG2013
?
Ans: 2
?
Ans: 4
40
Theorem
Elementary row operations do not affect the row-rank
Proof: Let A be an nm matrix.
Let the i-th row of a matrix A be ri.
1. Exchanging two rows does not affect the
maximal number linearly independent rows.
2. Pick any k rows in A. Because of (1) above, we
may exchange the rows and re-label them as r1,
r2, …, rk. We want to see the effect of multiplying
the p-th row (1  p  k) by a nonzero constant c.
kshum
ENGG2013
41
Proof (step 2)
Elementary row operations do not affect the row-rank
Suppose r1, r2 … rp, …, rk are linearly dependent.
By definition a1r1+a2r2 +…+ ap rp+ …+ ak rk = 0.
(not all a1, a2, …, ak are zero).
Divide an multiply simultaneously by c,
a1r1+a2r2 +…+ (ap/c) (c rp)+ …+ak rk = 0.
Not all a1, a2, …, ap/c, …, ak are zero.
Therefore r1, r2 ,…, crp, …, rk are linearly
dependent.
kshum
ENGG2013
42
Proof (step 2 cont’d)
Suppose r1, r2 … crp, …, rk are linearly dependent.
By def., b1r1+b2r2 +…+ bp (crp)+ …+ bk rk = 0.
(not all b1, b2, …, bk are zero).
Just bring the constant c out of the parenthesis,
b1r1+b2r2 +…+ (cbp) rp+ …+bk rk = 0.
Not all b1, b2, …, cbp, …, bk are zero.
 r1, r2,…, rp, …, rk are linearly dependent.
Therefore, r1, r2,…, rp, …, rk are linearly
dependent if and only if r1, r2,…, crp, …, rk
kshum
ENGG2013
43
Proof (step 3)
3. What is the effect of adding d times the 2-nd
row to the 1-st row? (Because exchange of
rows does not affect the row-rank,
considering adding d times r2 to r1 is
sufficient. There is no loss of generality.)
Suppose r1, r2 …, rk are linearly independent.
By def., e1r1+e2r2 +…+ ek rk = 0
is possible only when e1= e2= …= ek = 0.
kshum
ENGG2013
44
Proof (step 3)
Now we want to check whether (r1+dr2), r2 , r3 ,
…,rk are linearly independent.
If f1(r1+dr2)+f2r2 +f3r3 +… …+ fk rk = 0, for some
constants f1, f2, …, fk
then f1r1+(f1d+f2)r2 +f3r3 +… …+ fk rk = 0.
But this is possible only if
f1= (f1d+f2)= …= fk = 0.
f2=0
kshum
All coefficients f1, f2 …, fk are 0
(r1+dr2), r2 ,r3,rk are linearly independent.
ENGG2013
45
Proof (step 3 reverse direction)
Suppose r1+dr2, r2 …, rk are linearly independent.
By def., g1(r1+dr2)+g2r2 +… …+ gk rk = 0
is possible only when g1= g2= …= gk = 0.
We want to check whether r1, r2 ,r3,…,rk are linearly
independent.
If h1r1+h2r2 +…+ hk rk = 0, then
h1(r1+dr2–dr2)+h2r2 +…+ hk rk = 0
h1(r1+dr2)+(–h1d+h2)r2 +…+ hk rk = 0.
But this is possible only when
h1= (–h1d+h2) = … = hk = 0
h2=0
kshum
All coefficients h1, h2… hk are 0
r1, r2 ,r3,…,rk are linearly independent.
ENGG2013
46
An algorithm for computing
row-rank
• Algorithm for computing the row-rank of M.
1. Apply elementary row operations to M, and
transform it to a matrix in RREF.
2. Count the number of non-zero rows in the
resulting RREF.
• Because
– Row reductions does not change the row-rank.
– The row-rank of a matrix in reduced row-echelon
form is easy to calculate: just count the number of
nonzero rows.
kshum
ENGG2013
47
RREF for rank computation
• For mxn matrix, computing the row-rank of a
matrix using the definition in p.39 requires
checking 2m-1 combinations. This is not feasible
even if the number of rows, m, is moderately
large.
• By reduction to RREF, the running times is of the
order m3. This is polynomial time in m.
• However, in Matlab, the computation of rank is
done by SVD (singular value decomposition),
because SVD is more stable numerically.
(type “doc rank” in Matlab for more details)
kshum
ENGG2013
48
Notes
 For modulo arithmetic, you can find out more in
 http://en.wikipedia.org/wiki/Modular_arithmetic
 http://en.wikibooks.org/wiki/High_School_Mathematics_Extensions/Primes/Modular_Arithm
etic
 Or any introductory book on number theory
 In some books “a=b mod n” is written as “ab mod n”,
(pronounced as “a is congruent to b mod n”)
 In some webpages or books, you may see the symbols
They stands for the set of integers and the set of rational
numbers respectively.
 The letter “Z” comes from German “Zahlen”, which
means “numbers”. “Q” comes from English “quotients”.
http://en.wikipedia.org/wiki/Integer
kshum
ENGG2013
49