Probability --
Download
Report
Transcript Probability --
Discrete Math
CS 2800
Prof. Bart Selman
[email protected]
Module
Probability --- Part e)
1) The Probabilistic Method
2) Randomized Algorithms
1
The Probabilistic Method
2 2
The Probabilistic Method
Method for providing non-constructive existence proofs:
Thm. If the probability that a randomly selected element of the set S does
not have a particular property is less than 1, then there exists an
element in S with this property.
Alternatively: If the probability that a random element of S has a
particular property is larger than 0, then there exists at least one
element with that property in S.
Note: We saw an earlier example of the probabilistic method when discussing
the 7/8 alg. for 3-CNF.
3
Example: Lower bound for Ramsey numbers
Recall the definition of Ramsey number R(k,k):
Let R(k,k) be the minimal n such that if the edges of the complete
graph on n nodes are colored Red and Blue, then there either is a
complete subgraph of k nodes with all edges Red or a complete
subgraph of k nodes with all edges Blue.
R(3,3) = 6. So, any complete 6 node graphs has either a Red or
a Blue triangle. (Proof: see “party problem”.)
4
Reminder: “The party problem”
Dinner party of six: Either there is a group of 3 who all
know each other, or there is a group of 3 who are all
strangers.
By contradiction. Assume we have a party of six
where no three people all know each other
Let’s say she knows 3 others.
and no three people are all strangers.
She either knows or doesn’t
know each other person.
If any of those 3 know each other, we
have a blue , which means 3 people
know each other. Contradicts
assumption.
So they all must be strangers. But then we
have three strangers. Contradicts
assumption.
But there are 5 other people!
So, she knows, or doesn’t
know, at least 3 others.
(GPH)
The case where she doesn’t know 3
others is similar. Also, leads
to constradiction.
So, such a party does not exist!
QED
Consider one
person.
How do we get a lower bound on R(k,k)?
E.g., lower bound for R(3,3)?
We need to find an n such that the complete graph on
n nodes does not contain a Red or a Blue triangle.
E.g.
So, R(3,3) > 5.
I.e., we have a lower
bound on the Ramsey
Number for k=3.
Can we do this for R(k,k) in general? Very difficult to construct
the graphs, but… we can prove they exist for non-trivial k.
Thm. For k ≥ 4, R(k,k) ≥ 2k/2
Unbiased coloring is not
essential. Only matters that
each possible coloring has
some non-zero probability.
So, e.g., k = 20, then there exists a Red/Blue coloring of the complete graph
with 2^10 – 1 = 1023 nodes that does not have any complete
monochromatic sub graph of size 20. (But we have no idea of how to find
such a coloring!)
Proof: Consider a sample space where each possible coloring of
the n-node complete graph is equally likely. A sample coloring
can be obtained by randomly coloring each edge. I.e., with
probability ½ set edge to Blue, otherwise Red.
Let n < 2k/2
Aside: each particular coloring has probability (1/2) (n * (n-1) /2)
We want to show that there is a larger than 0 probability of getting
a coloring with no monochromatic k-clique. Hmm…
Why?
Consider a subset of k nodes. There will be
subsets, S_1, S_2, … S_
.
of such
Let E_i event that S_i is a monochromatic subgraph (a Red or
a Blue clique).
So, a sample is a randomly selected graph coloring. What is
the probability of E_i?
Why?
Note: number of edges in clique of size k is:
So, the probability that the randomly selected coloring will
have some monochromatic k-clique is:
If we can show that this probability is strictly less than 1, then
we know that there must exist some coloring that does not have
any monochromatic k-clique!
We’ll do this. It’s mainly a matter of rewriting the combinatorial
expression for the probability and finding an upper bound.
First step: Can we just add up the individual probabilities?
But, computing the exact value of
very tricky. Why??
is
Events are not disjoint and not independent!
A coloring can have multiple
monochromatic cliques. Also, having e.g. several
Blue cliques in part of the graph, makes it more likely
to have more Blue cliques on other cliques that share many
edges with the Blue cliques. So, all kinds of subtle
dependencies!
Would need to use the inclusion-exclusion formula (many terms)!
Fortunately, we “just” need to upper bound the probability. So,
we can truncate the formula to the first set of
terms. This gives us Boole’s Inequality:
p(E1 U E2) ≤ p(E1) + p(E2)
General form, see exercise
15, sect. 6.2.
?
So, we get:
<1
So, we have “upper bounded” our probability. What’s left?
We need to show that the left hand side is strictly less than 1.
“Just” combinatorics…
O(nk)
Note: We have many terms in the sum, but p(E_i) can
be pretty small.
Ex. 17, 5.4
by assumption:
by assumption:
So,
Note: Proof is non-constructive.
No effective method known
for finding such coverings!
So,
when
and
So, is the probability of having a monochromatic k-clique is
strictly less than 1. Therefore, there must exist edge colorings
in our sample space that do not have such monochromatic
k-cliques!
So, R(k,k) ≥ 2k/2
QED
Probabilistic Algorithms
14 14
Monte Carlo Algorithm
Probabilistic Algorithms --- algorithms that make random choices at one
or more steps.
Decision problems --- problems for which the answer is True or False.
E.g., is this number composite?
Monte Carlo algorithms – there is a small probability that they will return
an incorrect answer.
15
Probabilistic Primality Test:
Miller’s Test
Let n be a positive integer and let n-1 = 2s t (i.e. divide out the
factors of 2), where s is a non-negative integer and t is an odd
positive integer.
We say that n passes the Miller test for the base b if either
bt ≡1 (mod n)
or
b
2jt
1(mod n), for some j, 0 j s 1
It can be shown that a composite integer n passes the Miller’s test
for fewer than n/4 bases b, with 1 < b < n.
16
Probabilistic Primality Testing
Goal of the algorithm is to decide the question – “Is n composite?”
The basic structure of randomized primality tests is as follows:
1. Randomly pick a number 1 < b < n.
2. Does n pass the Miller test for b?
If n fails the test, then n is a composite number, and the answer is
True; b is known as a witness for the compositeness, and the test
STOPS.
Otherwise the answer is “unknown”
3. Repeat step 1, k times until the required certainty is achieved.
After k iterations, if n is not found to be a composite
number, then it can be declared probably prime.
17
Probabilistic primality testing
Note:
This algorithm can only make mistakes when the answer is unknown.
The probability that a composite integer n passes the Miller’s test for
a
randomly selected base b is less than ¼.
By repeating it k times, given that the iterations are independent, the
probability that n is composite but the algorithm responds that is
prime is less than ¼ k
18
Probabilistic primality testing
By taking k sufficient large, we can make the probability of error
really, really small.
10 iterations less than 1 in 106
30 iterations less than 1 in 1018
If we use n as prime as one of the two primes used in the RSA
cryptosystem and n is actually a composite, the procedures used to
decrypt messages will not produce the original encrypted message.
They key is then discarded and two new possible primes are used.
19
Probability theory is a very rich topic of
increasing importance in computer science, but
for now, this ends our probabilistic adventures.
And, our adventures in Discrete Mathematics!!
Of course, you’re still left with the popular party question:
How does discrete math differ from indiscrete math?