primality[1]

Download Report

Transcript primality[1]

A Prime Example
Lecture 20
CS 15-251
A positive integer p 2 is prime if the only positive
integers that divide p are 1 and p itself.
Positive integers n 2 which are not prime are called
composite.
The prime factorization of a positive integer n
is an expression of n as a product of primes
n = p1 p2 p3 …pk
Theorem: (Fundamental Theorem of Arithmetic)
Every positive integer n 2 has a unique prime
factorization (up to order of the prime factors).
Proof.
Existence: Suppose m is the smallest positive integer 
2 with no prime factorization.
m is prime m itself is a prime factorization
m is composite 
m = m1m2 for 1 < m1 < m and 1 < m2 < m
m1 and m2 have prime factorizations
so m1m2 yields a prime factorization for m

Uniqueness: Suppose m is the smallest positive integer 2
with two different prime factorizations.
m = p1p2…pi
m = p1’p2’…pk’
p1 | p1’p2’…pk’
p1 | pj’ for some 1 j k
p1 = pj’ (since both are prime)
Remove p1 from p1p2…pi and pj’ from p1’p2’…pk’
to get m’ with two different prime factorizations:
m’ = p2…pi = p1’p2’…pj-1’pj+1’…pk’
But m’ < m, so m’ must have a unique prime
factorization. 
Three Classic Problems
1. Density
How many primes are in {1…n}?
2. Generation
List all the primes in {1…n}.
3. Testing
Given a positive integer n, is n prime?
These problems go back to the ancient
Greeks!
…And a Modern Problem
4. Random choice
Pick a random prime number in {1...n}.
This problem arises in cryptographic
algorithms (such as RSA) that need large
prime numbers to make cryptographic
keys. We’ll learn more about these
algorithms in a future lecture.
1. Density of Primes
Let p(n)  the number of primes in {1…n}
p(10) = 4
p(20) = 8
(2,3,5,7)
(2,3,5,7,11,13,17,19)
Theorem (Euclid): The set of prime numbers is infinite.
Proof:
Suppose the primes are the finite set {p1, p2, …, pk }
Let m = p1 p2 … pk + 1
m is not divisible by any prime pi, so m must be a multiple
of a prime that is not in the “set of all primes”.
Thanks to Euclid, we know p(n) as n
. But how thickly distributed are the
primes? That is, if we pick a random
number in {1…n}, what’s the probability of
getting a prime? We’ll see a better
characterization of p(n) shortly.
2. Generating Primes
Give an algorithm that lists all the primes in {1...n}.
Sieve of Eratosthenes
set prime [2..n] = 1
for p = 2 to n do
if prime [p] = 1 then
print “p is prime”
for m = 2 to n/p do
prime [mp] = 0
For each prime p, the Sieve eliminates all multiples of p.
No prime will ever be eliminated, and every composite (which
must have a prime factor smaller than itself) is guaranteed to be
eliminated before the outer loop reaches it.
Running time: O(p(n)n (logn)2 + n) if multiplication is O((logn)2)
3. Testing Primality
Give an algorithm for deciding whether a number n
is prime.
A) Trial division
for k = 2 to n do
if k | n then
return “n is not prime”
otherwise return “n is prime”
O(n (logn)2) time if division is O((logn)2)
B) Sieve method
Run the Sieve of Eratosthenes on {1...n}
O(p(n)n (logn)2 + n) time
C) Trial division up to n
for k = 2 to n do
if k |n then
return “n is not prime”
otherwise return “n is prime”
O(n (logn)2) time if division is O((logn)2)
Claim: if n is composite, then n has a prime factor p  n.
Proof: By contradiction. Suppose some composite n has a
prime factorization n = p1 p2 … pk where all pi > n. But
then
n = p1 p2 … pk > (n )k
which is a contradiction unless k < 2, that is, unless n is
prime. 
The trial division algorithm can be easily
adapted to find at least one factor of n:
for k = 2 to n do
if k |n then
return “k is a factor of n”
otherwise return “n is prime”
This algorithm runs in O(n (logn)2) time,
which is O(n) (in fact, sublinear). So why do
we think factoring is so hard? Why are
banks and governments willing to trust
their secrets to a number that can be
factored in O(n) time?
Trial division is exponential as a function of the
length of its input!
The input number n can be represented by
k=O(logn) bits. As a function of k, the trial
division algorithm runs in O(n) = 2O(k) time
4. Random Primes
Give an algorithm to pick a random prime number
in {1..n}.
Brute force algorithm
Generate all the primes in {1…n}
Pick one at random
Randomized algorithm
Pick a random number m {1…n}
Test if m is prime
If m is prime, return it
Otherwise try again
What is the expected number of tries before finding
a prime?
Let p = probability of picking a prime on one try
p = p(n)/n
Expected number of tries to get a prime is
1/p = n/p(n)
The randomized algorithm motivates us to look for
better solutions to the classic problems in prime
numbers.
Density of primes
We need some assurance that primes are numerous
enough that the expected number of tries n/p(n) is
about O(logn).
Testing primality
The O(n) trial division algorithm is intractably slow.
We need something more like O(logn).
Solving these two problems will consume the rest of
the lecture.
Density of Primes (revisited)
Prime Density Theorem
p ( n)
lim
1
n n / ln n
In other words, as n , p(n)  n / lnn.
This deep and famous result was conjectured
by Euler around 1750, but not proved until 150
years later, by Hadamard. The proof is hard.
Corollary: The density of primes p(n)/n  1/lnn as
n .
Example: How many primes are in {1 … 1010}?
p(1010)  1010 / ln 1010  434,000,000
Example: What is the probability that a random
100-digit number is prime?
p(10100)/ 10100  1 / ln 10100  1 in 230
For our purposes, we don’t need to prove the Prime
Density Theorem. All we need is the weaker
statement:
p(n) = W (n /logn)
If this statement is true, we can put a lower bound on
the probability that a randomly chosen number in
{1..n} is prime:
p(n)/n = W (1/logn)
So picking randomly from {1…n}, we need only
expected O(logn) tries to find a prime number.
Theorem: p(n) = W (n /logn)
Proof:
For a prime p and a positive integer n, define the
multiplicity of p in n as the number of times p
occurs in the prime factorization of n.
500 = 5×5×5×2×2
multiplicity of 5 in 500 = 3
multiplicity of 2 in 500 = 2
multiplicity of 11 in 500 = 0
Claim: for any prime p and positive integer n, the
multiplicity of p in n ! is
n

 pi 
i 1 


Look at n ! = 1×2×3××××n
This product contains
n/p multiples of p
n/p2 multiples of p2
n/pi multiples of pi
So the formula above counts the multiples of p, p2,…
such that a multiple of pi (but not pi+1) is counted
exactly i times. Since only multiples of p can
contribute p factors, the formula counts the
number of times p appears in the prime
factorization of n !
Let r (p) be the natural number such that
pr (p)  2n < pr (p)+1
Claim: for any prime p and positive integer n, the
 2n 
multiplicity of p in  n  is at most r (p).
 2n  (2n)! 2n(2n  1)( 2n  2) (n  1)
  

n(n  1)( n  2) 1
 n  n!n!
Each factor p in the denominator cancels out a factor p
 2n 
in the numerator. So the multiplicity of p in  n  is

 2n 
 n     2n   n 

 pi   2  pi      pi   2  pi 
i 1 
i 1 

 i 1     

Lemma:
a - a < 1
2a - 2 a < 2
2a - 2 a < 2
2a - 2 a  1
  2n   n 
    i   2  i  since p r (p)+1 >2n
i 1   p 
 p 
by lemma at left
 r ( p)
r ( p)
 2n 
 
n
The prime factorization of
is the product of its
prime factors raised to the power of their
multiplicities. Since the multiplicity of any prime
 2n 
factor p in  n  is at most r (p), we get:
 2n 
r ( p)
  
p

 n  prime factors p of  2 n 
 n 
 

p
r ( p)
prime p  2 n

 2n
prime p  2 n
 ( 2n) p ( 2 n )
since the prime factors of
 2n 
  are at most 2n
n
since p
r (p)
 2n by definition
 2n
   ( 2n) p ( 2 n )
 n
Taking logs of both sides and rearranging terms gives
 2n
log 
 n
p ( 2n) 
log 2n
 2n 2n(2n  1)(2n  2) (n  1)
Since   
 2n ,
n(n  1)(n  2)1
 n
n
p ( 2n) 
log 2n
So we have the desired result:
n
p ( n) 
2 log n
A quick detour...
The prime density theorem states that n/lnn is a good
estimate for p(n). But how large is the error?
The Riemann Hypothesis
lim
n
p (n)  n / ln n
n
0
Riemann proposed this conjecture in 1859. It is one of
the most famous open problems in mathematics.
Fermat’s Little Theorem
For all primes p and integers a such that 1  a  p -1,
a p -1 = 1 (mod p)
For example:
a = 5, p = 7
56 = 15,625 = 2,232×7 + 1 = 1 (mod 7)
What is 238 (mod 7)?
238 = (26)6×22 = 16×22 = 4 (mod 7)
Recall From Group Theory
A group (G, ×) is a set G and a binary operation × that
has closure, associativity, inverses, and identity.
The set
Zn* = { x | 1  x < n and gcd(x,n) = 1}
is a group with respect to multiplication mod n.
In particular, for prime p,
Zp* = {1 ,…, p -1}
is a group with multiplication mod p.
Lagrange: If G is a finite group and S is a subgroup of
G, then the size of S divides the size of G.
Theorem. (Fermat)
For all primes p and integers a such that 1  a  p -1,
a p -1 = 1 (mod p)
Proof:
Given a  Zp*
Let aZp* = { ax | x  Zp*}
So aZp* = { a, 2a, 3a, …, (p-1)a }
aZp* = Zp* because ax  aZp* ax  Zp*
and x  Zp* x = a (a -1 x)  aZp*
Multiply all the elements in each set (mod p):
a×2a×3a× × × (p -1)a = 1×2×3× × × (p -1)
(mod p)
a p -1 (1×2×3× × × (p -1)) = 1×2×3× × × (p -1)
(mod p)
p -1
Testing Primality (revisited)
A simple randomized primality test is based on
Fermat’s Little Theorem.
To test if n is prime:
pick a  {1…n -1} at random
if a n -1  1 (mod n),
return “n is not prime”
else
O(log n)
return “n may be prime”
multiplies
by using
repeated
squaring
Hmm...
What if n = 341, a = 2?
2340 = (210)34
= 102434
= (3×341 + 1)34
= 1 (mod 341)
But 341 is not prime. 341 = 11×31
Fermat’s Little Theorem doesn’t
work in both directions. Just
because some numbers n and a
satisfy the Fermat property, that
doesn’t necessarily mean that n is
prime!
The situation is even worse than that. Some
composite numbers (called Carmichael numbers)
actually satisfy Fermat’s Little Theorem for all 1 
a  n-1. Our simple primality test cannot tell
Carmichael numbers from prime numbers.
First three Carmichael numbers:
561
1105
1729
Fortunately Carmichael numbers
are extremely rare -- only 255 of
them occur in the first
100,000,000 integers. We will
ignore them for now.
If n is not a Carmichael number, we can bound the
probability that the primality test goes wrong.
Define
Kn* = { a  Zn* | an-1 = 1 (mod n)}
So
Kp* = Zp* for primes p (by Fermat’s Theorem)
K341* contains 2
Kk* = Zk* for Carmichael numbers k (by definition)
If our choice of a for the test happens to fall in Kn*,
then the test would give the wrong answer.
We want Kn* to be small relative to {1…n-1}, so that a
random a {1…n-1} has a good chance of missing Kn*.
Theorem. If n is composite but not a Carmichael
number, then Pr (a  Kn*)  1/2.
Proof: Recall Kn* = { a  Zn* | an-1 = 1 (mod n)}
Claim: Kn* is a subgroup of Zn*
Closure: an-1 = 1 (mod n), bn-1 = 1 (mod n)  (ab)n-1 = 1 (mod n)
Identity: 1n-1 = 1 (mod n)
Associativity: multiplication is associative
Inverses: an-1 = 1  (an-1)-1 = 1  (a -1)n-1 = 1 (mod n)
Since n is neither prime nor Carmichael, Kn*  Zn*
|Kn*| < |Zn*|
But |Kn*| divides |Zn*|
(Lagrange’s Theorem)
So |Kn*|  1/2 |Zn*|  1/2 (n-1).
We can boost the probability of the randomized
primality test as high as we like by repeating it.
Randomized primality test:
repeat k times
pick a  {1…n -1} at random
if a n -1  1 (mod n) then
return “n is not prime”
return “n is probably prime or Carmichael”
(Chance of mistake < 1/2k as we vary over the
possible coin tosses of the algorithm)
A slightly more complicated version of this algorithm
(the “Miller-Rabin” algorithm) detects and eliminates
Carmichael numbers with probability  1 - 1/2k.