Lecture-10-Conditional-Probability-Mass


Conditional Probability Mass Function
Introduction
• P[A|B] is the probability of an event A, given that some other event B has occurred.
• Unless A and B are independent, B will affect the probability of A.
$$P[A \mid B] = \frac{P[A \cap B]}{P[B]}$$
• Example:
We choose one of two coins, one fair and one weighted, and toss it 4 times.
What is the probability of observing 2 or more heads?
The probability depends on which coin is selected (the condition).
$p_X[k \mid \text{coin 1 chosen}]$ is a binomial PMF that depends on $p_1$.
$p_X[k \mid \text{coin 2 chosen}]$ is a binomial PMF that depends on $p_2$.
$$p_X[k] = p_X[k \mid \text{coin 1 chosen}]\,P[\text{coin 1 chosen}] + p_X[k \mid \text{coin 2 chosen}]\,P[\text{coin 2 chosen}]$$
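Let X be the discrete RV describing the outcome of the coin choice:
$$X = \begin{cases} 1 & \text{if coin 1 is chosen} \\ 2 & \text{if coin 2 is chosen} \end{cases}$$
Since $S_X = \{1,2\}$, we assign to X the PMF
$$p_X[i] = \begin{cases} \alpha & i = 1 \\ 1-\alpha & i = 2 \end{cases} \qquad 0 < \alpha < 1$$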
The second part of the experiment consists of tossing the chosen coin 4 times in succession.
Let Y be the number of heads observed, so that SY = {0,1,2,3,4}.
The event A corresponds to 2 or more heads.
$$P[A] = \sum_{\{(i,j):(i,j)\in A\}} p_{X,Y}[i,j] = \sum_{i=1}^{2}\sum_{j=2}^{4} p_{X,Y}[i,j]$$
Only the joint PMF is needed to determine the desired probability. To find it we use
$$p_{X,Y}[i,j] = P[X = i, Y = j], \qquad p_X[i] = P[X = i]$$
By using the definition of conditional probability for events we have
$$\begin{aligned}
p_{X,Y}[i,j] &= P[X = i, Y = j] && \text{(definition of joint PMF)} \\
&= P[Y = j \mid X = i]\,P[X = i] && \text{(definition of cond. prob.)} \\
&= P[Y = j \mid X = i]\,p_X[i] && \text{(definition of marginal PMF)}
\end{aligned}$$
$$p_{X,Y}[i,j] = P[Y = j \mid X = i]\,p_X[i]$$
The marginal PMF was given earlier:
$$p_X[i] = \begin{cases} \alpha & i = 1 \\ 1-\alpha & i = 2 \end{cases}$$
P[Y = j | X = i] can be determined from the experimental description:
$$P[Y = j \mid X = i] = \binom{4}{j} p_i^{\,j} (1-p_i)^{4-j}, \quad j = 0,1,2,3,4$$
Note that this probability depends on the outcome X = i via $p_i$.
For a given value of X = i, the probability has all the usual properties of a PMF:
$$0 \le P[Y = j \mid X = i] \le 1, \qquad \sum_{j=0}^{4} P[Y = j \mid X = i] = 1$$
Then
pY|X[j|i] = P[Y = j | X = i] is a conditional PMF, for example with
i = 1, p1 = 1/4
i = 2, p2 = 1/2
Now that we know pY|X[j|i] and pX[i], we have
$$p_{X,Y}[i,j] = p_{Y|X}[j \mid i]\,p_X[i]$$
• The joint PMF is then given by
$$p_{X,Y}[i,j] = \begin{cases}
\binom{4}{j} p_1^{\,j} (1-p_1)^{4-j}\,\alpha & i = 1;\ j = 0,1,2,3,4 \\
\binom{4}{j} p_2^{\,j} (1-p_2)^{4-j}\,(1-\alpha) & i = 2;\ j = 0,1,2,3,4
\end{cases}$$
• Finally, the desired probability of event A is
$$P[A] = \sum_{j=2}^{4} p_{X,Y}[1,j] + \sum_{j=2}^{4} p_{X,Y}[2,j] = \sum_{j=2}^{4} \binom{4}{j} p_1^{\,j} (1-p_1)^{4-j}\,\alpha + \sum_{j=2}^{4} \binom{4}{j} p_2^{\,j} (1-p_2)^{4-j}\,(1-\alpha)$$
• As an example, if p1 = ¼ and p2 = ¾, then for α = ½ we have P[A] = 0.6055, but for α = 1/8 we have P[A] = 0.8633. Why?
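The two numbers quoted above can be checked numerically. The following is a minimal Python sketch, not part of the original slides; the function name `prob_two_or_more_heads` and its parameter names are illustrative choices.

```python
from math import comb

def prob_two_or_more_heads(p1, p2, alpha, n=4, k_min=2):
    """P[A] = sum_j pX,Y[1, j] + sum_j pX,Y[2, j] for j = k_min..n."""
    def tail(p):
        # P[Y >= k_min | coin with heads probability p]: a binomial tail sum
        return sum(comb(n, j) * p**j * (1 - p)**(n - j)
                   for j in range(k_min, n + 1))
    # Weight each conditional tail by the probability of choosing that coin
    return alpha * tail(p1) + (1 - alpha) * tail(p2)

pa_half = prob_two_or_more_heads(0.25, 0.75, 0.5)
pa_eighth = prob_two_or_more_heads(0.25, 0.75, 0.125)
print(round(pa_half, 4), round(pa_eighth, 4))  # 0.6055 0.8633
```

One way to read the result: with α = 1/8, coin 2 (p2 = ¾) is chosen 7/8 of the time, so the coin that favors heads dominates the weighted sum.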
• The conditional PMF can be expressed as
$$p_{Y|X}[j \mid i] = \frac{p_{X,Y}[i,j]}{p_X[i]}$$
• To make the connection with conditional probability, let $A_j = \{Y = j\}$ and $B_i = \{X = i\}$. Then
$$p_{Y|X}[j \mid i] = P[Y = j \mid X = i] = \frac{P[X = i, Y = j]}{P[X = i]} = \frac{P[A_j \cap B_i]}{P[B_i]} = P[A_j \mid B_i]$$
Hence, pY|X[j|i] is a conditional probability for the events Aj and Bi.
Joint, Conditional, and Marginal PMFs
• The conditional PMF is defined as
$$p_{Y|X}[y_j \mid x_i] = \frac{p_{X,Y}[x_i, y_j]}{p_X[x_i]}$$
• Each PMF in the family is a valid PMF when xi is considered to be a constant.
• In the previous example {pY|X[j|1], pY|X[j|2]} is a family of valid PMFs:
$$\sum_{j=-\infty}^{\infty} p_{Y|X}[j \mid 1] = 1, \qquad \sum_{j=-\infty}^{\infty} p_{Y|X}[j \mid 2] = 1$$
But in general the sum over the conditioning index is not one:
$$\sum_{i=-\infty}^{\infty} p_{Y|X}[j \mid i] \neq 1$$
Example: Two Dice toss
• Two dice are tossed. All outcomes are equally likely. The numbers of dots are added together.
What is the conditional PMF of the sum if it is known that the sum is even?
Let Y be the sum and let
$$X = \begin{cases} 0 & \text{if the sum is odd} \\ 1 & \text{if the sum is even} \end{cases}$$
We wish to determine pY|X[j|0] and pY|X[j|1] for all j.
The sample space for Y is SY = {2,3,…,12}.
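• The conditional probability of the sum equaling j, given that the sum is even, is
$$p_{Y|X}[j \mid 1] = \frac{p_{X,Y}[1,j]}{p_X[1]}, \quad j = 2,4,6,8,10,12$$
For even j, the event that the sum is even and equals j is the same as the event that the sum equals j, so
$$p_{X,Y}[1,j] = p_Y[j], \quad j = 2,4,6,8,10,12$$
and, since $p_X[1] = 1/2$,
$$p_{Y|X}[j \mid 1] = \frac{p_Y[j]}{1/2} = \frac{N_j\,(1/36)}{1/2} = \frac{1}{18} N_j$$
Nj is the number of equally likely outcomes of the two dice for which the sum is j.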
$$p_{Y|X}[j \mid 1] = \begin{cases}
1/18 & j = 2 \\
3/18 & j = 4 \\
5/18 & j = 6 \\
5/18 & j = 8 \\
3/18 & j = 10 \\
1/18 & j = 12
\end{cases}
\qquad
p_{Y|X}[j \mid 0] = \begin{cases}
2/18 & j = 3 \\
4/18 & j = 5 \\
6/18 & j = 7 \\
4/18 & j = 9 \\
2/18 & j = 11
\end{cases}$$
Note that
$$\sum_j p_{Y|X}[j \mid 1] = 1 \qquad \text{but} \qquad p_{Y|X}[j \mid 0] \neq 1 - p_{Y|X}[j \mid 1]$$
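The two conditional PMFs of the dice sum can be reproduced by enumerating the 36 equally likely outcomes. This is a small Python sketch under that assumption; the function name `conditional_pmf_of_sum` is illustrative and not from the lecture.

```python
from fractions import Fraction
from itertools import product

def conditional_pmf_of_sum(parity):
    """Return pY|X[j | parity] for the sum of two fair dice (1 = even, 0 = odd)."""
    counts = {}
    for d1, d2 in product(range(1, 7), repeat=2):
        s = d1 + d2
        if (s % 2 == 0) == (parity == 1):
            counts[s] = counts.get(s, 0) + 1  # N_j: outcomes whose sum is j
    total = sum(counts.values())              # 18 outcomes for either parity
    return {j: Fraction(n, total) for j, n in sorted(counts.items())}

pmf_even = conditional_pmf_of_sum(1)
pmf_odd = conditional_pmf_of_sum(0)
```

Exact fractions are used so that each conditional PMF can be seen to sum to exactly one. `Fraction` reduces automatically, so 3/18 is stored as 1/6.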
Properties of PMF
• Property 1. The joint PMF yields the conditional PMFs.
If the joint PMF pX,Y[xi, yj] is known, then the conditional PMFs are
$$p_{Y|X}[y_j \mid x_i] = \frac{p_{X,Y}[x_i, y_j]}{\sum_j p_{X,Y}[x_i, y_j]} \qquad \text{(the denominator is } p_X[x_i]\text{)}$$
$$p_{X|Y}[x_i \mid y_j] = \frac{p_{X,Y}[x_i, y_j]}{\sum_i p_{X,Y}[x_i, y_j]} \qquad \text{(the denominator is } p_Y[y_j]\text{)}$$
Hence, the conditional PMF is the joint PMF with xi fixed and then
normalized so that it sums to one.
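Property 1 can be sketched in code using the two-coin experiment from earlier in the lecture (p1 = ¼, p2 = ¾, α = ½). The helper names `joint_pmf` and `conditional_y_given_x` are assumptions made for this illustration.

```python
from math import comb

def joint_pmf(p, alpha, n=4):
    """pX,Y[i, j] for the two-coin experiment; p maps coin index i to its heads probability."""
    p_x = {1: alpha, 2: 1 - alpha}
    return {(i, j): comb(n, j) * p[i]**j * (1 - p[i])**(n - j) * p_x[i]
            for i in p for j in range(n + 1)}

def conditional_y_given_x(joint, i):
    """Fix x = i in the joint PMF and normalize by pX[i] = sum_j pX,Y[i, j]."""
    row = {j: v for (ii, j), v in joint.items() if ii == i}
    p_x_i = sum(row.values())
    return {j: v / p_x_i for j, v in row.items()}

joint = joint_pmf({1: 0.25, 2: 0.75}, alpha=0.5)
cond = conditional_y_given_x(joint, 1)
```

Normalizing the i = 1 slice of the joint PMF should recover the binomial PMF of coin 1, which is the conditional PMF the joint was built from.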
• Property 2. Conditional PMFs are related:
$$p_{X|Y}[x_i \mid y_j] = \frac{p_{Y|X}[y_j \mid x_i]\,p_X[x_i]}{p_Y[y_j]}$$
Proof: Interchanging X and Y in the definition of the conditional PMF gives
$$p_{X|Y}[x_i \mid y_j] = \frac{p_{Y,X}[y_j, x_i]}{p_Y[y_j]}$$
but
$$p_{Y,X}[y_j, x_i] = P[Y = y_j, X = x_i] = P[X = x_i, Y = y_j] = p_{X,Y}[x_i, y_j]$$
therefore
$$p_{X|Y}[x_i \mid y_j] = \frac{p_{X,Y}[x_i, y_j]}{p_Y[y_j]} \qquad (*)$$
Using $p_{X,Y}[x_i, y_j] = p_{Y|X}[y_j \mid x_i]\,p_X[x_i]$ yields the desired result.
• Property 3. The conditional PMF is expressible using Bayes' rule:
$$p_{Y|X}[y_j \mid x_i] = \frac{p_{X|Y}[x_i \mid y_j]\,p_Y[y_j]}{\sum_j p_{X|Y}[x_i \mid y_j]\,p_Y[y_j]}$$
Proof: From Property 1
$$p_{Y|X}[y_j \mid x_i] = \frac{p_{X,Y}[x_i, y_j]}{\sum_j p_{X,Y}[x_i, y_j]} \qquad (**)$$
and using (*) we have
$$p_{X,Y}[x_i, y_j] = p_{X|Y}[x_i \mid y_j]\,p_Y[y_j]$$
Substituting this into (**) yields the desired result.
• Property 4. A conditional PMF and its corresponding marginal PMF yield the joint PMF:
$$p_{X,Y}[x_i, y_j] = p_{Y|X}[y_j \mid x_i]\,p_X[x_i]$$
$$p_{X,Y}[x_i, y_j] = p_{X|Y}[x_i \mid y_j]\,p_Y[y_j]$$
• Property 5. A conditional PMF and its corresponding marginal PMF yield the other marginal PMF:
$$p_Y[y_j] = \sum_i p_{Y|X}[y_j \mid x_i]\,p_X[x_i]$$
This is the law of total probability.
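Bayes' rule and the law of total probability can be illustrated with the two-coin experiment, with the roles of X and Y interchanged relative to Property 3: given that j heads were observed, which coin was probably chosen? The following Python sketch assumes p1 = ¼, p2 = ¾ and α = ½; the function name `posterior_coin_given_heads` is hypothetical.

```python
from math import comb

def posterior_coin_given_heads(j, p, p_x, n=4):
    """pX|Y[i | j] via Bayes' rule, with pY[j] obtained from Property 5."""
    # Likelihoods pY|X[j | i]: binomial PMF of the chosen coin
    like = {i: comb(n, j) * p[i]**j * (1 - p[i])**(n - j) for i in p}
    # Property 5 (law of total probability): pY[j] = sum_i pY|X[j | i] pX[i]
    p_y_j = sum(like[i] * p_x[i] for i in p)
    # Bayes' rule: numerator is one term of the sum in the denominator
    return {i: like[i] * p_x[i] / p_y_j for i in p}

post = posterior_coin_given_heads(4, p={1: 0.25, 2: 0.75}, p_x={1: 0.5, 2: 0.5})
```

Under these assumptions, observing 4 heads in 4 tosses gives coin 2 a posterior probability of 81/82, since the likelihoods are in the ratio 3^4 : 1.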
Conditional PMF relationships
X and Y can also be interchanged for similar results.
Simplifying Probability Calculations Using Conditioning
• Conditional PMFs can be used to simplify probability calculations.
• Find the PMF of Z = X + Y, if X and Y are independent.
If X were known, say X = i, we could find the PMF of Z because Z = i + Y.
This is a transformation of one discrete RV, Y, to another discrete RV, Z:
$$p_{Z|X}[j \mid i] = p_{Y|X}[j - i \mid i] \qquad (*)$$
To find the unconditional PMF of Z we use Property 5:
$$p_Z[j] = \sum_{i=-\infty}^{\infty} p_{Z|X}[j \mid i]\,p_X[i]$$
and, by (*),
$$p_Z[j] = \sum_{i=-\infty}^{\infty} p_{Y|X}[j - i \mid i]\,p_X[i]$$
If X and Y are independent, so that pY|X = pY, then
$$p_Z[j] = \sum_{i=-\infty}^{\infty} p_Y[j - i]\,p_X[i]$$
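The last sum is a discrete convolution of pX and pY. As a sketch, it can be evaluated for PMFs with finite support stored as dictionaries; the example of two fair dice and the function name `pmf_of_sum` are illustrative assumptions, not taken from this slide.

```python
from fractions import Fraction

def pmf_of_sum(p_x, p_y):
    """pZ[j] = sum_i pY[j - i] pX[i] for independent X, Y given as {value: probability}."""
    p_z = {}
    for i, px in p_x.items():
        for k, py in p_y.items():  # k plays the role of j - i
            p_z[i + k] = p_z.get(i + k, 0) + px * py
    return p_z

die = {face: Fraction(1, 6) for face in range(1, 7)}
p_z = pmf_of_sum(die, die)  # PMF of the sum of two independent fair dice
```

Looping over the supports of both PMFs avoids the infinite limits in the formula: terms where either PMF is zero contribute nothing.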
Mean of the Conditional PMF
We can determine attributes such as the expected value of an RV Y
when it is known that X = xi:
$$E_{Y|X}[Y \mid x_i] = \sum_j y_j\,p_{Y|X}[y_j \mid x_i]$$
The mean of the conditional PMF is a constant when xi is fixed.
Generally, the mean is a function of xi.
Example: Two dice are tossed and the quantity of interest is the sum, given
that the sum is even or odd. The means of the conditional PMFs are
$$E_{Y|X}[Y \mid 1] = 2\left(\tfrac{1}{18}\right) + 4\left(\tfrac{3}{18}\right) + 6\left(\tfrac{5}{18}\right) + 8\left(\tfrac{5}{18}\right) + 10\left(\tfrac{3}{18}\right) + 12\left(\tfrac{1}{18}\right) = 7$$
$$E_{Y|X}[Y \mid 0] = 3\left(\tfrac{2}{18}\right) + 5\left(\tfrac{4}{18}\right) + 7\left(\tfrac{6}{18}\right) + 9\left(\tfrac{4}{18}\right) + 11\left(\tfrac{2}{18}\right) = 7$$
The two conditional means are equal here, but usually they are not.
The conditional variance is defined similarly:
$$\mathrm{var}[Y \mid x_i] = \sum_j \left(y_j - E_{Y|X}[Y \mid x_i]\right)^2 p_{Y|X}[y_j \mid x_i]$$
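The conditional mean and variance can be computed directly from the two conditional PMFs of the dice sum given earlier. This is a sketch in Python; the function name `cond_mean_var` is an illustrative choice, and the variance values it produces are not stated in the slides.

```python
from fractions import Fraction

def cond_mean_var(pmf):
    """Mean and variance of a conditional PMF given as {y_j: pY|X[y_j | x_i]}."""
    mean = sum(y * p for y, p in pmf.items())
    var = sum((y - mean) ** 2 * p for y, p in pmf.items())
    return mean, var

# Conditional PMFs of the sum of two dice, given an even (1) or odd (0) sum
pmf_even = {2: Fraction(1, 18), 4: Fraction(3, 18), 6: Fraction(5, 18),
            8: Fraction(5, 18), 10: Fraction(3, 18), 12: Fraction(1, 18)}
pmf_odd = {3: Fraction(2, 18), 5: Fraction(4, 18), 7: Fraction(6, 18),
           9: Fraction(4, 18), 11: Fraction(2, 18)}

mean_even, var_even = cond_mean_var(pmf_even)
mean_odd, var_odd = cond_mean_var(pmf_odd)
```

Both conditional means come out as 7, while the conditional variances differ (19/3 for an even sum and 16/3 for an odd sum), which shows that the two conditional PMFs are different even though their means coincide.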
Example: Toss one of two dice
Two dice are given: D1 = {1,2,3,4,5,6} and D2 = {2,3,2,3,2,3}.
One die is selected at random and tossed.
What is the expected number of dots observed for the tossed die?
We can view this problem as a conditional one by letting
$$X = \begin{cases} 1 & \text{if die 1 is chosen} \\ 2 & \text{if die 2 is chosen} \end{cases}$$
and Y be the number of dots observed. Thus, we wish to determine
EY|X[Y|1] and EY|X[Y|2].
$$p_{Y|X}[j \mid 1] = \frac{1}{6}, \quad j = 1,2,3,4,5,6 \qquad E_{Y|X}[Y \mid 1] = \sum_{j=1}^{6} j\,p_{Y|X}[j \mid 1] = \frac{7}{2}$$
$$p_{Y|X}[j \mid 2] = \frac{1}{2}, \quad j = 2,3 \qquad E_{Y|X}[Y \mid 2] = \sum_{j=2}^{3} j\,p_{Y|X}[j \mid 2] = \frac{5}{2}$$
(Figure: outcomes when die 1 is chosen, mean = 3.88, true mean = 3.5; outcomes when die 2 is chosen, mean = 3.58, true mean = 2.5.)
• What is the unconditional mean (the mean of Y)?
• The unconditional mean is the mean number of dots observed without
first conditioning on which die was chosen.
• Intuitively,
$$E_Y[Y] = \frac{1}{2} E_{Y|X}[Y \mid 1] + \frac{1}{2} E_{Y|X}[Y \mid 2]$$
Unconditional mean
• Let us determine EY[Y] for the following experiment:
1. Choose die 1 or die 2, each with probability ½.
2. Toss the chosen die.
3. Count the number of dots on the face of the tossed die; this is the RV Y.
To determine the theoretical mean of Y we need pY[j]:
$$p_Y[j] = \sum_i p_{X,Y}[i,j]$$
$$p_{X,Y}[i,j] = p_{Y|X}[j \mid i]\,p_X[i] = \begin{cases} 1/12 & i = 1;\ j = 1,2,3,4,5,6 \\ 1/4 & i = 2;\ j = 2,3 \end{cases}$$
$$p_Y[j] = \sum_i p_{X,Y}[i,j] = \begin{cases} p_{X,Y}[1,j] = \dfrac{1}{12} & j = 1,4,5,6 \\[2mm] p_{X,Y}[1,j] + p_{X,Y}[2,j] = \dfrac{1}{12} + \dfrac{1}{4} = \dfrac{1}{3} & j = 2,3 \end{cases}$$
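• Thus the unconditional mean becomes
$$E_Y[Y] = \sum_{j=1}^{6} j\,p_Y[j] = 1\left(\tfrac{1}{12}\right) + 2\left(\tfrac{1}{3}\right) + 3\left(\tfrac{1}{3}\right) + 4\left(\tfrac{1}{12}\right) + 5\left(\tfrac{1}{12}\right) + 6\left(\tfrac{1}{12}\right) = 3$$
• The other way to find the unconditional mean is
$$E_Y[Y] = E_{Y|X}[Y \mid 1]\,p_X[1] + E_{Y|X}[Y \mid 2]\,p_X[2] = \tfrac{7}{2}\cdot\tfrac{1}{2} + \tfrac{5}{2}\cdot\tfrac{1}{2} = 3$$
That is, the unconditional mean is the average of the conditional means, weighted by pX.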
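Both routes to the unconditional mean can be checked with exact arithmetic. The sketch below encodes the two-dice experiment as nested dictionaries; the variable names are illustrative and not from the slides.

```python
from fractions import Fraction

half = Fraction(1, 2)
p_x = {1: half, 2: half}                                   # pX[i]: which die is chosen
p_y_given_x = {1: {j: Fraction(1, 6) for j in range(1, 7)},  # die 1: faces 1..6
               2: {2: half, 3: half}}                        # die 2: faces 2 and 3

# Route 1: marginal PMF pY[j] = sum_i pY|X[j | i] pX[i], then E[Y] = sum_j j pY[j]
p_y = {}
for i, cond in p_y_given_x.items():
    for j, p in cond.items():
        p_y[j] = p_y.get(j, 0) + p * p_x[i]
mean_direct = sum(j * p for j, p in p_y.items())

# Route 2: average the conditional means, weighted by pX[i]
cond_means = {i: sum(j * p for j, p in cond.items())
              for i, cond in p_y_given_x.items()}
mean_by_conditioning = sum(cond_means[i] * p_x[i] for i in p_x)
```

Both `mean_direct` and `mean_by_conditioning` evaluate to 3, matching the two calculations above.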
Unconditional mean (Proof)
• In general, the unconditional mean is found as
$$E_Y[Y] = \sum_i E_{Y|X}[Y \mid x_i]\,p_X[x_i]$$
Proof:
$$\begin{aligned}
\sum_i E_{Y|X}[Y \mid x_i]\,p_X[x_i]
&= \sum_i \left( \sum_j y_j\,p_{Y|X}[y_j \mid x_i] \right) p_X[x_i] && \text{(def. of cond. mean)} \\
&= \sum_i \sum_j y_j\,\frac{p_{X,Y}[x_i, y_j]}{p_X[x_i]}\,p_X[x_i] && \text{(def. of cond. PMF)} \\
&= \sum_j y_j \sum_i p_{X,Y}[x_i, y_j] \\
&= \sum_j y_j\,p_Y[y_j] = E_Y[Y] && \text{(marginal PMF from joint PMF)}
\end{aligned}$$
Modeling human learning
• A child learns by attempting to pick up a toy, dropping it, and picking it up again after having learned something.
• Each time the experiment, “attempting to pick up the toy”, is repeated, the child learns something or, equivalently, narrows down the number of possible strategies.
• Many models of human learning employ a Bayesian framework.
By using it we are able to discriminate the right strategy with
more accuracy as we repeatedly perform an experiment and
observe the output.
Modeling human learning: Example
• Suppose we wish to “learn” whether a coin is fair (p = ½) or weighted (p ≠ ½).
• Our certainty about whether the coin is fair will increase as the number of trials increases.
• In the Bayesian model we assume that p is an RV. In reality, the coin has a fixed probability of heads, but it is unknown to us.
• Let the probability of heads be denoted by the RV Y and its values by yj.
$$p_Y[y_j] = \frac{1}{M+1} \quad \text{for } y_j = 0, \frac{1}{M}, \frac{2}{M}, \ldots, \frac{M-1}{M}, 1$$
This is the prior PMF; it summarizes our state of knowledge before the experiment is performed.
• Let N be the number of coin tosses and let X denote the number of heads observed in the N tosses.
• X ~ bin(N, p), i.e. X is binomially distributed; however, the probability of heads Y is unknown.
• We can only specify the PMF of X conditionally on Y = yj. The conditional PMF of the number of heads X = i is then
$$p_{X|Y}[i \mid y_j] = \binom{N}{i} y_j^{\,i} (1 - y_j)^{N-i}, \quad i = 0,1,\ldots,N$$
• We are interested in the probability of heads, that is, in the PMF of Y after observing the outcomes of N coin tosses, pY|X[yj|i].
pY|X[yj|i] is the posterior PMF, since it is determined after the experiment is performed.
• The posterior PMF pY|X[yj|i] contains all the information about the probability of heads that results from our prior knowledge, summarized by pY, and our “data” knowledge, summarized by pX|Y.
• The posterior PMF is given by Bayes' rule with xi = i as
$$p_{Y|X}[y_j \mid i] = \frac{p_{X|Y}[i \mid y_j]\,p_Y[y_j]}{\sum_j p_{X|Y}[i \mid y_j]\,p_Y[y_j]}$$
$$p_{Y|X}[y_j \mid i] = \frac{\binom{N}{i} y_j^{\,i} (1 - y_j)^{N-i}\,\frac{1}{M+1}}{\sum_{j=0}^{M} \binom{N}{i} y_j^{\,i} (1 - y_j)^{N-i}\,\frac{1}{M+1}} = \frac{y_j^{\,i} (1 - y_j)^{N-i}}{\sum_{j=0}^{M} y_j^{\,i} (1 - y_j)^{N-i}}$$
for $y_j = 0, 1/M, \ldots, 1$ and $i = 0,1,\ldots,N$.
Note that pY|X[yj|i] depends on the observed number of heads i.
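The posterior PMF can be evaluated on the grid yj = j/M with a few lines of code. This is a minimal Python sketch; the grid size M = 10 and the function name `posterior_pmf` are assumptions made for illustration, while N = 10 and i = 4 match one of the cases listed for the lecture's plots.

```python
def posterior_pmf(i, n, m):
    """Posterior pY|X[y_j | i] on the grid y_j = j/m, under the uniform prior 1/(m + 1)."""
    ys = [j / m for j in range(m + 1)]
    # The binomial coefficient and the uniform prior cancel in Bayes' rule,
    # leaving y_j^i (1 - y_j)^(n - i) normalized over the grid
    weights = [y**i * (1 - y)**(n - i) for y in ys]
    total = sum(weights)
    return ys, [w / total for w in weights]

ys, post = posterior_pmf(i=4, n=10, m=10)
y_map = ys[post.index(max(post))]  # grid value with the largest posterior mass
```

With 4 heads in 10 tosses the posterior peaks at the grid point 0.4, the observed fraction of heads. Increasing `n` with a similar fraction of heads should concentrate the posterior mass more tightly around that value.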
(Figure: posterior PMFs pY|X[yj|i]. Panel labels: p = 0.5; N = 10, i = 4; N = 20, i = 11; N = 40, i = 19; N = 10, i = 2; N = 20, i = 5; N = 40, i = 7.)
Problems
• A fair coin is tossed. If it comes up heads, then X = 1, and if it
comes up tails, then X = 0. Next, a point is selected at random
from the area A if X = 1 and from the area B if X = 0, as shown.
(Figure: the square with regions A and B marked; axis ticks at 1.)
The area of the square is 4 and A and B both have areas of 3/2. If
the point selected is in an upper quadrant, we set Y = 1 and if it is
in a lower quadrant, we set Y = 0. Find the conditional PMF pY|X[j|i]
for all values of i and j. Next, compute P[Y = 0].
• Prove that
$$\mathrm{var}(Y \mid x_i) = E_{Y|X}[Y^2 \mid x_i] - E_{Y|X}^2[Y \mid x_i]$$
• If X and Y are independent RVs, find the PMF of Z = |X − Y|. Assume
that SX = {0,1,…} and SY = {0,1,…}.
Hint: The answer is
$$p_Z[k] = \begin{cases}
\displaystyle\sum_{i=0}^{\infty} p_X[i]\,p_Y[i] & k = 0 \\[3mm]
\displaystyle\sum_{i=0}^{\infty} \left( p_Y[i]\,p_X[i+k] + p_X[i]\,p_Y[i+k] \right) & k = 1,2,\ldots
\end{cases}$$
As an intermediate step, show that
$$p_{Z|X}[k \mid i] = \begin{cases}
p_Y[i] & k = 0 \\
p_Y[i+k] + p_Y[i-k] & k \neq 0
\end{cases}$$