
Lecture 7
5.1 5.2 Discrete Probability
5.1 Probabilities
• Important in the study of the complexity of algorithms.
• Modeling the uncertain world: information, data.
• Applications in error-correcting codes, data compression, data restoration,
medical expert systems, search engines, etc.
• Modern AI deals with uncertainty in the world (was my measurement correct,
were my assumptions correct?). Probability theory is the answer to that.
• Developed historically in the context of gambling.
5.1
Q: If we roll a die VERY (infinitely) often, what is the frequency (or fraction)
that we find “1” ?
A: 1/6 (if the die is fair).
Each side has equal chance of landing face up.
Sample space: The space of all possible outcomes S = {1, 2, 3, 4, 5, 6}.
Event: A subset of the sample space E={1}.
Probability of an event: p(E)=|E|/|S|=1/6
In this example all outcomes are equally likely! This is not the case in general,
as we will see later.
Q: If we roll 2 dice, what is the probability that the sum is 7?
A: |S| = 6 x 6 = 36.
E = {(1,6), (2,5), (3,4), (4,3), (5,2), (6,1)}, so |E| = 6.
p(E)=1/6
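
A quick sanity check by enumeration (a minimal Python sketch; the variable names are our own):

from fractions import Fraction

# enumerate all 36 equally likely outcomes of rolling two dice
outcomes = [(a, b) for a in range(1, 7) for b in range(1, 7)]
event = [o for o in outcomes if sum(o) == 7]
print(Fraction(len(event), len(outcomes)))  # prints 1/6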
5.1
It’s all counting again!
example: A lottery awards a big prize if you correctly guess all four digits of a
4-digit number in the right order. A small prize is won if you guess exactly 3
digits correctly, each at the correct location.
|S| = 10^4.
|E-big| = 1
|E-small| = ?
There are 4 ways to choose which 3 digits are correct (and therefore which one digit is wrong).
For each of these, the number of possibilities is:
1 x 1 x 1 x 9 (9 choices for the incorrect digit).
|E-small| = 4 x 9 = 36.
p(E-small) = 36 / 10^4.
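
The count can be confirmed by brute force; a small Python sketch (the winning number 1234 is an arbitrary choice for illustration):

from itertools import product

winning = "1234"
small = 0
for guess in product("0123456789", repeat=4):
    # exactly 3 digits correct, each in the correct position
    matches = sum(g == w for g, w in zip(guess, winning))
    if matches == 3:
        small += 1
print(small, small / 10**4)  # prints 36 0.0036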
5.1
Examples:
1) What is the probability of drawing a full house from a deck of cards
(3 of one kind & 2 of another kind)?
⇒ First draw the 3 of a kind, then the 2 of a kind (order matters).
P(13,2) is number of ways to draw 2 different kinds out of 13 kinds.
C(4,3) is number of ways to pick 3 cards among 4 (order doesn’t matter).
C(4,2) is number of ways to pick 2 cards among 4 (order doesn’t matter).
C(52,5) is total number of 5 cards drawn from a deck of 52.
solution: P(13,2) x C(4,3) x C(4,2) / C(52,5).
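
These factors map directly onto Python's math.comb and math.perm (a quick check, assuming Python 3.8+):

from math import comb, perm

# P(13,2) ordered choice of kinds, C(4,3) and C(4,2) suits, C(52,5) hands
p_full_house = perm(13, 2) * comb(4, 3) * comb(4, 2) / comb(52, 5)
print(p_full_house)  # ~0.00144 (= 3744 / 2598960)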
2) Probability of the sequence 11, 4, 17, 39, 23 out of 50 numbers,
when sampling a) without replacement, b) with replacement.
a) |E| = 1, |S| = P(50,5) ⇒ p(E) = 1/P(50,5). (sorry – two different P's!)
b) |E| = 1, |S| = 50^5 ⇒ p(E) = 1/50^5.
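
Both sample-space sizes are one-liners (a minimal sketch):

from math import perm

p_without = 1 / perm(50, 5)  # without replacement: 1/P(50,5)
p_with = 1 / 50**5           # with replacement: 1/50^5
print(p_without, p_with)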
5.1
Theorem: Let Ē denote the complement of E.
Then:
p(Ē) = 1 − p(E)
Proof: |Ē| = |S| − |E|.
p(Ē) = (|S| − |E|) / |S| = 1 − p(E).
Example: We generate a bit-string of length 10. What is the probability that at least
one bit is a 0?
E = large and hard to enumerate directly.
Ē = small: no zeros in the bit-string! Only one possibility (all ones).
p(E) = 1 − p(Ē) = 1 − 1/2^10 = 1023/1024.
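
Enumerating all 2^10 strings confirms the complement shortcut (a small sketch):

from itertools import product

strings = list(product("01", repeat=10))
with_a_zero = [s for s in strings if "0" in s]
print(len(with_a_zero), len(with_a_zero) / len(strings))  # 1023 0.9990234375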
5.1
Set S1 is the set of all Irish citizens with blue eyes.
Set S2 is the set of all Irish citizens with black hair.
Set S3 is the set of all Irish citizens.
|S1| = 1000, |S2| = 3000, |S3| = 10,000, |S1 ∩ S2| = 200
Q: If we meet a random Irish citizen in the streets of Dublin, what is
the probability that he/she has blue eyes OR black hair?
A: Total number of possibilities |S3| = 10,000.
Total “area covered” by S1 ∪ S2 is:
|E| = |S1 ∪ S2| = |S1| + |S2| − |S1 ∩ S2|
p(E) = p(S1 ∪ S2) = (|S1| + |S2| − |S1 ∩ S2|) / |S3|
= p(S1) + p(S2) − p(S1 ∩ S2)
Theorem: p(E1 ∪ E2) = p(E1) + p(E2) − p(E1 ∩ E2)
[Venn diagram: sets S1 and S2 overlap inside S3; the overlap is labeled |S1 ∩ S2|]
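
Plugging in the numbers above (a quick check):

s1, s2, s3, s1_and_s2 = 1000, 3000, 10_000, 200
p = (s1 + s2 - s1_and_s2) / s3  # inclusion-exclusion, divided by |S3|
print(p)                        # prints 0.38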
5.1
A famous example:
You participate in a game show where there are 3 doors, with only one hiding a big prize.
You pick a door.
The game show host (who knows where the prize is) opens another, empty door
and offers you the chance to switch. Should you?
⇒ You don't switch:
You have probability 1/3 of picking the correct door. If you don't switch, that doesn't
change (imagine doing the experiment a million times with this strategy).
⇒ You switch:
If you had the correct door (prob. 1/3) and you switch, you lose.
If you had the wrong door (prob. 2/3) and you switch, you win!
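
A simulation makes the 1/3 vs. 2/3 split tangible (a minimal Python sketch; the door labels and trial count are arbitrary):

import random

def play(switch, trials=100_000):
    wins = 0
    for _ in range(trials):
        prize = random.randrange(3)
        pick = random.randrange(3)
        # the host opens a door that is neither the pick nor the prize
        opened = next(d for d in range(3) if d != pick and d != prize)
        if switch:
            # move to the remaining unopened door
            pick = next(d for d in range(3) if d != pick and d != opened)
        wins += (pick == prize)
    return wins / trials

print(play(switch=False))  # ~0.333
print(play(switch=True))   # ~0.667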
5.2
Until now, we have only dealt with events that are equally likely.
What about this example:
Q: We have a coin that is heavier on one side than on the other. Consequently
it comes up heads ¾ of the time.
What is the frequency with which we see tails?
A: If we see it heads up ¾ of the time, we’ll see it tails up ¼ of the time.
Formally:
Let S be the sample space (a set) and s be an element of S.
To each element of S we assign a probability p(s) that fulfils 2 conditions:
1) 0 ≤ p(s) ≤ 1 for every s ∈ S
2) Σ_{s∈S} p(s) = 1
5.2
p(s) is called a probability distribution.
This is a generalization of the definition of probability in the previous section
because there it was assumed that all elements were equally likely: p(s) = 1 / |S|.
This is called the uniform distribution.
Definition: assume that E is a subset of S with |E| elements.
The total probability of the event is now given by:
p(E) = Σ_{s∈E} p(s)
5.2
Example: We have a loaded die such that 3 appears twice as often as any other side,
while the other sides appear equally often. What is the probability of finding an odd
number when we roll the die?
p(1) = p(2) = p(4) = p(5) = p(6) = q
p(3) = 2q
Σ_{s∈S} p(s) = 1 ⇒ 5q + 2q = 1 ⇒ q = 1/7
odd outcomes: E = {1, 3, 5}
p(E) = p(1) + p(3) + p(5) = q + 2q + q = 4/7.
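
The same computation with an explicit distribution (a small sketch using exact fractions):

from fractions import Fraction

q = Fraction(1, 7)
p = {1: q, 2: q, 3: 2 * q, 4: q, 5: q, 6: q}
assert sum(p.values()) == 1           # condition 2: probabilities sum to 1
print(sum(p[s] for s in {1, 3, 5}))   # prints 4/7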
5.2
Theorem: Let Ē denote the complement of E.
Then:
p(Ē) = 1 − p(E)
Still true, because:
p(Ē) = Σ_{s∈Ē} p(s) = 1 − Σ_{s∈E} p(s) = 1 − p(E)
where we used that Σ_{s∈S} p(s) = 1.
5.2
Theorem: When E1, E2, ..., En are disjoint subsets of S we have:
p(E1 ∪ ... ∪ En) = Σ_{s ∈ E1 ∪ ... ∪ En} p(s) = Σ_{i=1}^{n} Σ_{s∈Ei} p(s) = Σ_{i=1}^{n} p(Ei)
Situation is more difficult if we have overlaps in the subsets:
For 2 subsets we have:
Theorem:
p(E1 ∪ E2) = p(E1) + p(E2) − p(E1 ∩ E2)
because:
p(E1 ∪ E2) = Σ_{s∈E1} p(s) + Σ_{s∈E2} p(s) − Σ_{s∈E1∩E2} p(s)
(elements in E1 ∩ E2 would otherwise be counted twice)
= p(E1) + p(E2) − p(E1 ∩ E2)
5.2 Exercises
Ex. 20 p. 377: Find the smallest number of people in a room such that the
probability that someone has his/her birthday today is larger than ½.
assuming: a year has 366 days and all birthdays are equally likely.
Let’s say there are n people.
S = {all combinations of birthdays for n people}
|S| = 366^n
E1 = {one person has his/her birthday today}
E2 = {two people have their birthday today}
E3 =....
Ē = none of the people has their birthday today.
|Ē| = 365^n
p(E) = 1 − p(Ē) = 1 − |Ē| / |S| = 1 − 365^n / 366^n > 0.5
⇒ n > log 2 / log(366/365) ≈ 253.35
⇒ n = 254
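
Solving the inequality numerically (a quick check):

from math import ceil, log

n = log(2) / log(366 / 365)
print(n, ceil(n))               # 253.35... 254
print(1 - (365 / 366) ** 254)   # ≈ 0.5009, indeed > 0.5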
5.2 Exercises
Exercise 12 p. 377
p(E) = 0.7
p(F) = 0.5
p(E ∪ F) = p(E) + p(F) − p(E ∩ F)
We know p(E) and p(F), and we know that:
p(E ∩ F) ≤ 0.5 (total overlap: F is a subset of E)
p(E ∩ F) ≥ 0.2 (minimal overlap, since p(E) + p(F) − p(E ∩ F) ≤ 1)
⇒ 0.7 ≤ p(E ∪ F) ≤ 1
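
The two extremes as a quick arithmetic check:

from fractions import Fraction

p_e, p_f = Fraction(7, 10), Fraction(1, 2)
low = p_e + p_f - min(p_e, p_f)     # overlap maximal (F ⊆ E): p(E ∪ F) = p(E)
high = min(Fraction(1), p_e + p_f)  # overlap minimal: the union covers everything
print(low, high)                    # prints 7/10 1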