CPSC 490 Number Theory Primes, Factoring and Euler's Phi

Download Report

Transcript CPSC 490 Number Theory Primes, Factoring and Euler's Phi

CPSC 490 Number Theory
Primes, Factoring and Euler
Phi-function
Mar.31st, 2006
Sam Chan
Clarifications
• The Euclidean Algorithm runs in
logarithmic time.
• Finding modular inverses can be done
using the extended Euclidean algorithm.
• Ex. 5x = 1 (mod 7) is equivalent to writing
5x – 7y = 1 or 5x + 7(-y) = 1. Use
extended Euclidean Algorithm to find x.
Primes
• Def: a positive integer p > 1 is prime if and
only if it has exactly two divisors, 1 and p.
• Ex. 2, 3, 5, 7, 11, 13, 17, 19, 23, 29 are the
fist 10 primes.
• How many primes are there exactly?
Euclid’s Proof
• Euclid first proved that there exists an infinite number of primes.
(300 BC)
• Assume there are finitely many primes.
• Let S be the set of all n primes, p1p2p3…pn
• Let q = p1p2p3…pn + 1.
• There must exist a prime factorization for q. We claim that this
factorization does not contain an element of S. This would imply that
q uses a prime outside the set S, thus there are in fact more than n
primes.
• Assume that pi, 1 <= i <= n, is in the prime factorization of q. Then it
must divide p1p2p3…pn + 1. Specificially pi | 1.
• But since primes are greater than 1, this is our contradiction.
First approach
• Consider the following code:
bool isPrime( int n ) {
for( int i = 2; i*i <= n; i++ )
if( n % i == 0 ) return true;
return false;
}
• We attempt to find divisors of n. If we find one then
return false.
• Has a running time of O(n1/2)
• In practice, primality testing comes in to varieties.
Primality testing
• Deterministic tests will determine with
absolute certainty whether a given integer
n is prime.
• Probabilistic tests will determine with a
small probability of error, whether a given
integer n is prime.
Example of a probabilistic test
• Fermat’s Little Theorem
• Let n be a prime and a be any positive integer
such that a < n. Then a raised to the nth power
is congruent to a modulo n.
• an = a (mod n) or an-1-1 = 0 (mod n)
• Given an integer n, all we have to do is find the
remainder of an (mod n) and see if it equals a.
• We know how to compute modulo powers from
our previous discussion.
Fermat’s Little Theorem
• If an integer n satisfies this theorem, does this
imply that n is prime?
• NO!
• If an (mod n) is not congruent to a then we are
certain that n is not prime.
• If it is congruent to a then there is a good chance
that n is prime.
• As it turns out, there are composite numbers that
also satisfy Fermat’s Little Theorem.
Fermat’s Little Theorem
• A composite number n is a Carmichael
number if n satisfies Fermat’s Little
Theorem for every a that is relatively prime
to n.
• Smallest Carmichael number is 561.
• Other examples include: 89 = 8 (mod 9)
• These are called pseudoprimes in general
• What can we do to avoid having our test
fail?
Fermat’s Little Theorem
• We are free to choose any a we like, as
long as a < n.
• Pick many integers a to test for primality.
• If n passes this test for a lot of different
values of a, then it is highly probable that n
is prime.
• Run this test many times to reduce our
probability of error.
Running times
• Probabilistic tests have running times
around O(logn).
• In 2002, Agrawal et al discovered the first
algorithm for a deterministic test that is
polynomial in the number of digits of n.
• Running time is O(log7.5n). However, they
seem to run in O(log3n) in practice.
• Largest known prime currently is
230402457-1. It contains 9152052 digits.
Sieve of Eratosthenes
• What if we want a list of primes?
• Given n, we can find all the primes up to n
by using the Sieve of Eratosthenes.
• Start with a list of integers from 2 to n.
• At iteration i, mark the smallest integer a in
the set as a prime, then cross off all
multiples of a in the set.
• Repeat until i2 <= n.
Sieve of Eratosthenes
• Ex. Let n = 10, and x denote a crossed off
integer.
• 2, 3, 4, 5, 6, 7, 8, 9, 10
• 2, 3, x, 5, x, 7, x, 9, x
• 2, 3, x, 5, x, 7, x, x, x
• Done!
• So primes up to n = 10 are 2, 3, 5, and 7.
Sieve of Eratosthenes
• Here is an implementation:
bool prime[1000000];
void sieve( int n ) {
memset( prime, 1, sizeof(prime) );
prime[0] = prime[1] = 0;
for( int i = 2; i <= n; i++ )
if( prime[i] ) {
for( int j = i+i; j <= n; j += i )
prime[j] = 0;
}
}
Sieve of Eratosthenes
• At the end of this algorithm, prime[i] = 1 if
and only if i is prime, 0 otherwise.
• Outer loop starts at the smallest position
that hasn’t been crossed off.
• Inner loop repeatedly crosses of multiples
of i.
• With some analysis, we can show that this
has a running time of O(nlogn).
• Can we improve this algorithm?
Sieve of Eratosthenes
• Consider after having done the first
iteration in the outer loop. So i = 3.
• We start at i + i = 6.
• However, we already crossed off all
multiples of 2 (all the even numbers) in the
first iteration. So why are we looking here?
• Similarly, when we have crossed off all
multiples of 3, we should not look at any
multiples of 3 in any future iteration.
Sieve of Eratosthenes
• Improved implementation
bool prime[1000000];
void sieve( int n ) {
memset( prime, 1, sizeof(prime) );
prime[0] = prime[1] = 0;
for( int i = 2; i*i <= n; i++ )
if( prime[i] ) {
for( int j = i*i; j <= n; j += i )
prime[j] = 0;
}
}
• Changes are in bold.
Sieve of Eratosthenes
• Outer loop only goes up to the square root of n.
• Ex. n = 10, there is no need to check any
multiples of a where a > 3. Surely they will all
have been crossed off by some previous
combination. (ie. 2x5 = 5x2)
• Inner loop starts at i*i. This prevents looking at
positions that obviously is crossed off.
• Improved running time is still O(nlogn), but in
practice the speed usually speeds it up by 50%.
Sieve of Eratosthenes
• For anyone interested:
• There are other implementations that exploit the
fact that our prime[] array is just an array of 1’s
and 0’s.
• Use bits to lower memory usage, perhaps speed
things up.
• Other optimizations that can improve running
time in practice.
• Sieve of Atkin, optimized version of the Sieve of
Eratosthenes.
Factoring
• Given n, we are not interested in finding
positive integers a and b such that a*b = n.
• Factoring is hard, when n contains 100 or
more digits.
• Many cryptographic systems depend
heavily on this assumption.
Is factoring really that hard?
• Consider n = 2193 (n is given as such because
my calculator doesn’t like numbers that have
more than 10 digits, so pretend we do not know
the prime factorization of n ahead of time). How
long would it take to factor this number?
• It turns out that we can factor this number in 193
division operations. We just divide our number
by 2 193 times.
• It was trivial to factor this number, so why is
factoring considered a hard problem?
First attempt at factoring
• Recall our first approach in primality testing.
bool isPrime( int n ) {
for( int i = 2; i*i <= n; i++ )
if( n % i == 0 ) return true;
return false;
}
• We can do exactly the same thing, except
instead of returning true when we have found a
factor, we simply return that number that divided
n.
Further optimizations of our first
attempt
• To speed things up even further, we can
only divide by primes between 2 and
square root of n.
• How well does this work in practice?
• What is the worst case input?
Worst case for factoring
• What if we are given two 100 digit prime
numbers a and b. Let n = a*b.
• To factor this number with our approach, we will
have to check all primes between 2 and the
minimum of a and b.
• As an exmaple, consider n = 2193 – 1.
• The first prime divisor is 13,821,503.
• Assuming that our computer can perform a
billion division instructions per second, it will take
more than 35,000 years to find the second
largest factor of n!
Worst case for factoring
• As it turns out, the full factorization of n is:
• n = 13,821,503 *
61,654,440,233,248,340,616,559 *
14,732, 265,321,145,317,331,353,282,383.
• Conclusion: Factoring is hard for most cases.
RSA Factoring Challenge
• http://www.rsasecurity.com/rsalabs/node.asp?id
=2093
• Monetary reward for successful factoring of very
large integers.
• Latest factored number: RSA-640. An integer
that is 640 bits long.
• To quote the website: “The effort took
approximately 30 2.2 GHz- Opteron-CPU years
according to the submitters, over five months of
calendar time. (This is about half the effort for
RSA-20)
Factoring and the Sieve
• We know we can’t factor general numbers very efficiently.
• Here is an implementation of an augmented Sieve of Eratosthenes
that will factor for us:
int factors[1000000];
void fsieve( int n ) {
for( int i = 1; i <= n; i++ ) factors[i] = i;
for( int i = 2; i <= n; i++ )
if( factors[i] == I ) {
for( int j = i*i; j <= n; j += i )
if( factors[j] == j ) f[j] = i;
}
}
Factoring Sieve
• Again we loop over the numbers from 2 to n.
• The inner loop will set factor[j] = i if i is a multiple of j.
Since we loop in ascending order, it is the smallest
multiple of j.
• We can retrieve the list o factors for any arbitrary
position.
• Ex. n = 6
• Before the first iteration, factors[]: 1, 2, 3, 4, 5, 6
• After the last iteration, factors[]: 1, 2, 3, 2, 5, 2
• To retrieve the factors of 6, go to factors[6]. Divide that 6
by that value and get 3, go to factors[3], divide 6 by that
value and get 2, go to factors[2], divide 6 by 2 to get 1.
Stop.
• Factors are 1, 2, 3 and 6.
Euler’s Phi Function
• Def: “Phi of n”, denoted as Φ(n), counts
the number of integers in the range of
[1,n-1] that are relatively prime to n.
• Def: Two positive integers a and b are said
to be relatively prime if gcd(a,b) = 1.
• Note that the definition does not require
either a or b to be prime themselves.
• Ex. a = 6, b = 25. (6,25) = 1
Euler Phi Function
• Ex. Φ(6) = 2 since 1 and 5 are relatively
prime to 6.
• Ex. Φ(2000) = 800. How do we know this?
• Let’s first establish some ways we can
calculate Φ(n).
Case when n is prime
• If n is prime, then by definition, it contains
no factors other than itself and 1. Since
(n,1) = 1, the number integers less than n
that are relatively prime to n is just n – 1.
• Ex. Φ(7) = 6 since 1, 2, 3, 4, 5, 6 are all
relatively prime to 7.
Case when n = pr, r > 1
• If n is a power of a prime number p, can we
derive a formula for Φ(n)?
• Let’s consider what pr is not relatively prime to.
• Obviously, (p,pr) = p. In fact, all integers
ap <= pr where a is some positive integer are not
relatively prime to p.
• Also, all powers of p, up to pr-1 are not relatively
prime to p.
Case when n = pr, r > 1
• Thus the numbers p, 2p, 3p, … (pr-1-1)p are all
not relatively prime to p.
• There are pr-1 numbers less than pr. There
are pr-1-1 numbers that are not relatively
prime to pr.
• So Φ(n) = (pr-1)-(pr-1-1) = pr - pr-1.
General Case
• Given n and it’s prime factorization, can we
derive a general formula for n?
n  p p ...p
r1
1
r2
2
rm
m
 (n)   ( p ) ( p )... ( p )
r1
1
In general:
r2
2
rm
m


1 
1 
1 
 (n)  n1  1  ..........1  
 p1  p2 
 pk 
Looking ahead
• Next class, Andrew will be talking about cryptography.
• One of the more famous algorithms for public-key
encryption is RSA.
• Each step of the algorithm involves something that we
talked about in each of the 3 (including today)
discussions.
• We will need to find two large prime numbers. (primality)
• We will need to solve linear equations such as ax + by =
1. (extended Euclidean algorithm)
• We will need to compute Φ(n)
• Try to be at least a little familiar with all 3 or otherwise,
the next discussion may be hard to follow.
References
• http://mathworld.wolfram.com/PrimalityTes
t.html
• http://www.fortunecity.com/emachines/e11/
86/tourist2d.html
• http://www.rsasecurity.com/rsalabs/node.a
sp?id=2093
• http://en.wikipedia.org/wiki/RSA
• Last Year’s notes.