Basic discrete probability

Download Report

Transcript Basic discrete probability

Discrete Probability

Sample space (set) S of elementary event
 eg.

An event A is a subset of S
 eg.


The 36 ways of 2 dices can fall
Rolling 7 with 2 dices
A probability distribution Pr{} is a map from
events of S to R
Probability Axiom:
 Pr{ A}  0
 Pr{S }  1
 Pr{ A  B}  Pr{ A}  Pr{B}, if A  B  
 Theorem :
Pr  A  B  Pr  A  Pr B  Pr  A  B
 Pr  A  Pr  B

Theorem :
For discrete probability distributions
Pr  A   Pr s
sA

A random variable (r.v.) X is a function from S to R
 The
event “X = x” is defined as {sS : X(s) = x}
 eg. Rolling 2 dices:





|S|=36 possible outcomes
Uniform distribution: Each element has the same probability
1/|S|=1/36
Let X be the sum of dice
Pr{ X = 5 } = 4/36,
{(1, 4), (2, 3), (3, 2), (4, 1)}
Expected value: E  X    x x Pr  X  x 
Linearity: E  aX  Y   aE  X   E Y 
 X1:
number on dice 1
 X2: number on dice 2
 X=X1+X2,
E[X1]=E[X2]=1/6(1+2+3+4+5+6)=21/6
Independence

Two random variables X and Y are independent if
x, y, Pr  X  x and Y  y  Pr  X  x  Pr Y  y

If X and Y are independent, then
E  XY   E  X  E Y 

Indicator random variables
 Given
a sample space S and an event A, the indicator
random variable I{A} associated with event A is defined
as:
I  A 

1
0
if A occurs
o/w

E.g.: Consider flipping a fair coin:
 Sample
space S = { H,T }
 Define random variable Y with Pr{ Y=H } = Pr{ Y=T }=1/2
 We can define an indicator r.v. XH associated with the
coin coming up heads, i.e. Y=H
X H  I Y  H  

1
0
if Y  H
if Y  T
E  X H   E  I Y  H 
 1  Pr Y  H   0  Pr Y  T 
1
 Pr Y  H  
2
 Lemma :
Given a sample space S and an event A in the
sample space S, let X A  I { A}
Then E  X A   Pr  A
Proof :
E  X A   E  I  A  1 Pr  A  0  Pr  A
 Pr  A

The birthday paradox:
 How
many people must there be in a room before there
is a 50% chance that two of them born on the same day
of the year?

(1)
 Suppose
there are k people and there are n days in a
year,
bi : i-th person’s birthday, i =1,…,k
 Pr{bi=r}=1/n, for i =1,…,k and r=1,2,…,n
 Pr{bi=r, bj=r}=Pr{bi=r}.Pr{bj=r} = 1/n2
1
Pr
b

r
,
b

r



i
j
n
r 1
 Define event Ai : Person i’s birthday is different from
person j’s for j < i
k
 Bk 
A : the event that k people have distinct birthday
i 1 i
 Bk 1 Ak
 Pr{Bk} = Pr{Bk-1∩Ak} = Pr{Bk-1}Pr{Ak|Bk-1}
where Pr{B1} = Pr{A1}=1
 Pr
b  b   
i
n
j
 Pr{Bk }  Pr{Bk 1}Pr{ Ak
| Bk 1}
 Pr{Bk  2 }Pr{ Ak 1 | Bk  2 }Pr{ Ak | Bk 1}
 ...  Pr{B1}Pr{ A2 | B1}Pr{ A3 | B2 }...Pr{ Ak | Bk 1}
 1  ( nn1 )( n n 2 )...( n nk 1 )
 1n
 n2

( k 1)
n
 1 (1  1n )(1  n2 )...(1  kn1 )  e  e
e
1 x  ex
k 1
k ( k 1)
 i / n
 2n
i 1
e
e
 12
where  k (2kn1)  ln( 12 )

k (k  1)  2n ln 2  the prob.  12 ,
For n  365, we have k  23
k  (1  1  (8ln 2)n ) / 2
■

(2) Analysis using indicator random variables
 For
each pair (i, j) of the k people in the room, define
the indicator r.v. Xij, for 1≤ i < j ≤ k, by
X ij  I person i and person j have the same birthday
1 i and j have the same birthday

0 o/w


E  X ij   Pr person i and j have the same birthday
 1n
k
k
Let X   i 1  j i 1 X ij
k
k
E  X   E   i 1  j i 1 X ij 


k (k  1)
k
k
k
  i 1  j i 1 E  X ij     / n 
2n
2
k(k-1) ≥ 2n, the expected number of pairs of
people with the same birthday is at least 1
 When

1  1  8n
k  k  2n  0  k 
2
k  ( n ), n  365  k  28, we expect to find at least
one matching pair
■
2

Balls and bins problem:
 Randomly
toss identical balls into b bins, numbered
1,2,…,b. The probability that a tossed ball lands in any
given bin is 1/b
 (a) How many balls fall in a given bin?
 If n balls are tossed, the expected number of balls that fall in
the given bin is n/b
 (b) How many balls must one toss, on the average, until
a given bin contains a ball?
 By geometric distribution with probability 1/b
e  1  b1  2  (1  b1 ) b1  3  (1  b1 ) 2 b1  ...
 e  (1  b1 )e  b1  (1  b1 ) b1  (1  b1 ) 2 b1  ...
 b1 ( 1(11 1 ) )  1
 e 1 e  b
1
b
b
 (c)






(Coupon collector’s problem)
How many balls must one toss until every bin
contains at least one ball?
Want to know the expected number n of tosses required to get b
hits. The ith stage consists of the tosses after the (i-1)st hit
until the ith hit.
For each toss during the ith stage, there are i-1 bins that
contain balls and b-i+1 empty bins
Thus, for each toss in the ith stage, the probability of obtaining
a hit is (b-i+1)/b
Let ni be the number of tosses in the ith stage. Thus the number
of tosses required to get b hits is n=∑bi=1 ni
Each ni has a geometric distribution with probability of success
(b-i+1)/b → E[ni]=b/b-i+1
b
b
b
b 1
b


E  n   E  i 1 ni   i 1 E  ni    i 1 b i 1  b i 1 i


 b(ln b  O(1))  O(b ln b)
■
Streaks


Flip a fair coin n times, what is the longest streak
of consecutive heads?
Ans:θ(lg n)
Let Aik be the event that a streak of heads of length at
least k begins with the ith coin flip
Pr  Aik   1/ 2k
For k  2 lg n 


Pr Ai ,2lg n   2

2 lg n 

1
n2
For j=0,1,2,…,n, let Lj be the event that the longest streak
of heads has Length exactly j, and let L be the length of
the longest streak.
n
E  L   j 0 j  Pr L j 
 Note that the events L j for j  0,1,...,n are disjoint.
So the probability that a streak of heads of length  2 lg n 
begins anywhere is  j 2lg n Pr L j 
 
n
1
Thus,  j  2 lg n  Pr  L j   n
n


Pr  L   1
 Pr L   1. We have 
E  L    j  Pr L 

j  Pr  L   
j  Pr  L 

(2 lg n  ) Pr  L   
n Pr  L 
 2 lg n  
Pr  L   n
Pr L 
while
2 lg n  1
n
j 0
n
j 0
j
j 0
2 lg n  1
j
j
n
j 0
2 lg n  1
j  2 lg n 
j
j
n
j 0
j
2 lg n  1
j 0
j
 2 lg n  1  n  (1/ n)  O(lg n)
j  2 lg n 
n
j  2 lg n 
j
j


r lg n 
r
 Pr A

1
2

1
n
i , r lg n 


 The probability is  n n r  1 n r 1 that the largest streak
is  r lg n 
 Claim :
The expected length of the longest streak of heads in n
coin flips is   lg n 

We look for streaks of length s by partitioning the n
flips into approximately n/s groups of s flips each.
 Take s  (lg n) / 2 




Pr Ai ,(lg n ) / 2  1 2
s
s
s
n
(lg n ) / 2
1
n
The probability that a streak of heads of length
 (lg n) / 2  does not begin in position i is  1  1 n
 The  n
 groups are mutually exclusive, ind. coin flips,
 (lg n) / 2 
the prob. that every one of the groups fails to be a streak of
length (lg n) / 2  is at most
(1  1
n)
n

 (lg n ) / 2  


1 n / (lg n ) / 2  1
n
 2 n / lg n 1 / n
 (1 
e
)
 (1 
1 2 n / lg n 1
n
)
 O  e  lg n   O  1n 
 Thus, the prob. that the longest streak exceeds  (lg n) / 2  is

n
j  (lg n ) / 2  1
 E  L 



n
j  Pr L j 
j 0
(lg n ) / 2 
j 0
n
Pr  L j   1  O 1/ n  WHY?
j  Pr  L j    j  (lg n ) / 2 1 j  Pr L j 
j  (lg n ) / 2  1
n

(lg n) / 2   Pr  L j 

 (lg n) / 2   j  (lg n ) / 2 1 Pr  L j 
n


 (lg n) / 2  1  O 1/ n    (lg n)
■

Using indicator r.v. :
Let X ik  I  Aik 
n  k 1
 Let X  
X ik
i 1
 E  X   E  n  k 1 X 
ik
  i 1

n  k 1
n  k 1
n  k 1
  i 1 E  X ik    i 1 Pr  Aik    i 1 1/ 2k 

n  k 1
2k
k  c lg n, for some constant c,
n  c lg n  1 n  c lg n  1
1
(c lg n  1) / n
EX  



2c lg n
nc
nc 1
nc 1
1
 ( c 1 )
n
 If

If c is large, the expected number of streaks of
length clgn is very small.
 
1

1 
 If c  , then we obtain E  X     1 1    n 2
 n2 
and we expect that there will be a large number of streaks
of length ( 12 ) lg n
1
2
Therefore, one streak of such a length is very
likely to occur.
 Conclusion :
The length of the longest streak is (lg n)

■
The on-line hiring problem:

To hire an assistant, an employment agency
sends one candidate each day. After interviewing
that person you decide to either hire that person
or not. The process stops when a person is hired.

What is the trade-off between minimizing the
amount of interviewing and maximizing the quality
of the candidate hired?
What is the best k?
Let M(j) = max 1ij{score(i)}.
Let S be the event that the best-qualified
applicant is chosen.
Let Si be the event the best-qualified applicant
chosen is the i-th one interviewed.
Si are disjoint and we have Pr{S}=  ji=1Pr{Si}.
If the best-qualified applicant is one of the first k,
we have that Pr{Si}=0 and thus
Pr{S}=  ji=k+1Pr{Si}.








Let Bi be the event that the best-qualified applicant
must be in position i.
Let Oi denote the event that none of the applicants
in position k+1 through i-1 are chosen
If Si happens, then Bi and Oi must both happen.
Bi and Oi are independent! Why?
Pr{Si} = Pr{Bi  Oi} = Pr{Bi} Pr{Oi}.
Clearly, Pr{Bi} = 1/n.
Pr{Oi} = k/(i-1). Why???
Thus Pr{Si} = k/(n(i-1)).
n
Pr{S} =  Pr{Si }
i  k 1
n
k
 
i  k 1 n(i  1)
n
1
 ( k / n) 
i  k 1 (i  1)
n 1
1
 ( k / n) 
i k i
k
n1
n1 1
1
dx    
dx 
k 1 x
x
i
i k
n1
k (ln n  ln k )  Pr{S}  k (ln(n 1)  ln(k 1)).
n
n
k
Differentiate (ln n  ln k )with respect to k.
n
1
We have (ln n  ln k 1)  0.
n
Thus k  n / e and Pr{S}  1/ e.