Transcript Chapter3-1

Chapter 3. Conditional Probability and Independence
Introduction
• Statistics deals with uncertainty
– Weather forecast
– Stock prices
– Hurricane prediction
• Availability of information reduces uncertainty
– Weather forecast with more information
• Toss two dice and suppose each of the 36 possible
outcomes is equally likely. If we observe
that the first die is a 3, what is the probability
that the sum of the two dice equals 8?
• Given the first die is 3, the sample space can
be reduced to {(3,1), (3,2), (3,3), (3,4), (3,5),
(3,6)} and the outcomes are still equally likely.
Only (3,5) gives a sum of 8, so the desired
probability is 1/6.
[Figure: Venn diagram of the sample space S of 36 outcomes, with
E = {(2,6), (3,5), (4,4), (5,3), (6,2)},
F = {(3,1), (3,2), (3,3), (3,4), (3,5), (3,6)},
and (3,5) as the only outcome in both E and F.]
E: The sum of the two dice is 8
F: The first die is 3
• P(E|F) = (# outcomes in EF) / (# outcomes in F)
= [(# outcomes in EF) / (# outcomes in S)] / [(# outcomes in F) / (# outcomes in S)]
= P(EF)/P(F)
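The counting argument above can be checked by brute-force enumeration. The following is a minimal Python sketch, assuming the 36 outcomes are equally likely; the names `sample_space`, `prob`, `E`, and `F` are illustrative choices, not part of the slides.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes (first die, second die).
sample_space = list(product(range(1, 7), repeat=2))

def prob(event):
    """Probability of an event (a set of outcomes) under equal likelihood."""
    return Fraction(len(event), len(sample_space))

E = {o for o in sample_space if o[0] + o[1] == 8}  # sum of the dice is 8
F = {o for o in sample_space if o[0] == 3}         # first die is 3

# Conditional probability P(E|F) = P(EF) / P(F).
p_E_given_F = prob(E & F) / prob(F)
print(p_E_given_F)  # 1/6
```

Exact fractions are used so the result can be compared directly with the 1/6 obtained by counting.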
• A coin is flipped twice. Assuming that all four points
in the sample space S = {(h, h), (h, t), (t, h), (t, t)} are
equally likely, what is the conditional probability that
both flips land on heads, given that (a) the first flip
lands on heads; (b) at least one flip lands on heads?
• Let B = {(h,h)} be the event that both flips land on
heads; let F = {(h,h),(h,t)} be the event that the first
flip lands on heads; and let A = {(h,h),(h,t),(t,h)} be
the event that at least one flip lands on heads.
$$P(B \mid F) = \frac{P(BF)}{P(F)} = \frac{P(\{(h,h)\})}{P(\{(h,h),(h,t)\})} = \frac{1/4}{2/4} = \frac{1}{2}$$
$$P(B \mid A) = \frac{P(BA)}{P(A)} = \frac{P(\{(h,h)\})}{P(\{(h,h),(h,t),(t,h)\})} = \frac{1/4}{3/4} = \frac{1}{3}$$
• Toss two dice and suppose each of the 36 possible
outcomes is equally likely. Suppose you observe that
the first die is a 3 and you bet on the sum being one
of the following numbers: 4, 5, 6, 7, 8, 9, which all
have the same probability of 1/6. Do you gain any
advantage compared to not seeing the first die?
• If you had not seen the first die, there are 11
possible sums: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12,
which have probabilities 1/36, 2/36, 3/36, 4/36,
5/36, 6/36, 5/36, 4/36, 3/36, 2/36, 1/36. The best
bet is 7, which has the same probability of 1/6, so
seeing a 3 on the first die gives no advantage.
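The comparison between betting with and without seeing the first die can be reproduced numerically. This is a small Python sketch under the same equal-likelihood assumption; `sum_probs`, `best_sum`, and `cond_probs` are hypothetical names introduced only for this illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Count how many of the 36 equally likely outcomes give each sum.
counts = Counter(a + b for a, b in product(range(1, 7), repeat=2))
sum_probs = {s: Fraction(c, 36) for s, c in counts.items()}

# Best bet without seeing the first die.
best_sum = max(sum_probs, key=sum_probs.get)
print(best_sum, sum_probs[best_sum])  # 7 1/6

# Given the first die is 3, each sum 4..9 has probability 1/6.
cond_probs = {3 + b: Fraction(1, 6) for b in range(1, 7)}
```

The best conditional bet and the best unconditional bet both win with probability 1/6.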
Digitalis therapy
• Digitalis therapy is often beneficial to
patients who have suffered congestive
heart failure, but there is the risk of
digitalis intoxication, a serious side
effect that is, moreover, difficult to
diagnose. To improve the chances of a
correct diagnosis, the concentration of
digitalis in the blood can be measured.
Bellar (1971) conducted a study of the
relation of the concentration of digitalis
in the blood to digitalis intoxication in
135 patients. Their results are simplified
slightly in the following table.
• T+ = high blood concentration (positive test)
• T- = low blood concentration (negative test)
• D+ = toxicity (disease present)
• D- = no toxicity (disease absent)
         D+    D-    Total
T+       25    14    39
T-       18    78    96
Total    43    92    135
25 of the 135 patients had a high blood concentration
of digitalis and suffered toxicity.
         D+     D-     Total
T+       .185   .104   .289
T-       .133   .578   .711
Total    .318   .682   1.000
• P(T+) = .289, P(D+) = .318
• If the patient has high blood concentration (T+),
what is the probability of disease (D+)?
• P(D+|T+) = 25/39 = .64
• P(D+|T+) = P(D+T+)/P(T+) = .185/.289 = .64
• P(D+|T-) = P(D+T-)/P(T-) = .133/.711 = .187
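The conditional probabilities above can also be computed directly from the counts in the table. The sketch below is one way to do it in Python; the dictionary layout and the function name `p_disease_given_test` are assumptions made for illustration.

```python
from fractions import Fraction

# Counts from the table: keys are (test result, disease status).
counts = {
    ("T+", "D+"): 25, ("T+", "D-"): 14,
    ("T-", "D+"): 18, ("T-", "D-"): 78,
}
total = sum(counts.values())  # 135 patients

def p_disease_given_test(test):
    """P(D+ | test) = P(D+ and test) / P(test), using relative frequencies."""
    joint = Fraction(counts[(test, "D+")], total)
    marginal = Fraction(counts[(test, "D+")] + counts[(test, "D-")], total)
    return joint / marginal

print(p_disease_given_test("T+"))  # 25/39, about 0.64
print(p_disease_given_test("T-"))  # 3/16, about 0.19
```

Working from exact counts avoids the small rounding differences that appear when the three-decimal proportions are divided.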
• A student is taking a makeup examination with a
one-hour time limit. Suppose the probability
that the student will finish the exam in less
than x hours is x/2, for all 0 ≤ x ≤ 1. Given that
the student is still working after 0.75 hours,
what is the conditional probability that the full
hour is used?
• F: the full hour is used
• L_x: the exam is finished in less than x hours
$$P(F) = P(L_1^c) = 1 - P(L_1) = .5$$
$$P(F \mid L_{.75}^c) = \frac{P(F L_{.75}^c)}{P(L_{.75}^c)} = \frac{P(F)}{1 - P(L_{.75})} = \frac{.5}{.625} = .8$$
• Ex 2e. Celine is undecided as to whether to take a
French course or a chemistry course. She estimated
that her probability of receiving an A grade would be
½ in a French course and 2/3 in a chemistry course. If
she decides to base her decision on the flip of a fair
coin, what is the probability that she gets an A in
chemistry?
• What are the events?
– A: receiving an A grade; C: taking chemistry; F: taking
French.
– P(A|F) = 1/2, P(A|C) = 2/3, P(C) = P(F) = 1/2.
• P(CA)?
• P(A|C) = P(AC)/P(C)
⇒ P(AC) = P(A|C)P(C) = (2/3)(1/2) = 1/3
Multiplication rule
$$P(E_1 E_2 E_3 \cdots E_n) = P(E_1)\,P(E_2 \mid E_1)\,P(E_3 \mid E_1 E_2) \cdots P(E_n \mid E_1 \cdots E_{n-1})$$
Proof:
$$P(E_1)\,\frac{P(E_1 E_2)}{P(E_1)}\,\frac{P(E_1 E_2 E_3)}{P(E_1 E_2)} \cdots \frac{P(E_1 E_2 \cdots E_n)}{P(E_1 E_2 \cdots E_{n-1})} = P(E_1 E_2 \cdots E_n)$$
• What is the probability that Celine gets an A
in either French or chemistry?
• P(A) = P(AC) + P(AF)
= P(C)P(A|C) + P(F)P(A|F)
= (1/2)(2/3) + (1/2)(1/2)
= 7/12
A useful formula for calculating probabilities
$$E = EF \cup EF^c$$
$$P(E) = P(EF) + P(EF^c)$$
$$= P(E \mid F)P(F) + P(E \mid F^c)P(F^c)$$
$$= P(E \mid F)P(F) + P(E \mid F^c)[1 - P(F)]$$
• Ex 3a part 1
• An insurance company believes that people can be divided into
two classes: those who are accident prone and those who are
not. Their statistics show that an accident-prone person will
have an accident at some time within a fixed 1-year period
with probability .4, whereas this probability decreases to .2 for
a non-accident-prone person. If we assume that 30 percent of
the population is accident prone, what is the probability that a
new policyholder will have an accident within a year of
purchasing a policy?
• The policyholder is either accident prone or not.
• A1: the policyholder will have an accident within a year of
purchase.
• A: the policyholder is accident prone.
• P(A1|A) = .4; P(A1|A^c) = .2; P(A) = .3; P(A^c) = .7; P(A1) = ?
• P(A1) = P(A1|A)P(A) + P(A1|A^c)P(A^c)
= (.4)(.3) + (.2)(.7) = .26
• Ex 3a part 2
• Suppose that a new policyholder has an accident
within a year of purchasing a policy. What is the
probability that he or she is accident prone?
• P(A1|A) = .4; P(A1|A^c) = .2; P(A) = .3; P(A^c) = .7
• P(A|A1)?
• P(A|A1) = P(AA1)/P(A1)
= P(A1|A)P(A)/P(A1)
= (.3)(.4)/.26 = 6/13
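Both parts of Ex 3a can be reproduced in a few lines. The following Python sketch uses exact fractions; the variable names are illustrative, and the inputs are the probabilities stated in the example.

```python
from fractions import Fraction

p_prone = Fraction(3, 10)                # P(A)
p_acc_given_prone = Fraction(4, 10)      # P(A1 | A)
p_acc_given_not_prone = Fraction(2, 10)  # P(A1 | A^c)

# Law of total probability: P(A1) = P(A1|A)P(A) + P(A1|A^c)P(A^c).
p_acc = p_acc_given_prone * p_prone + p_acc_given_not_prone * (1 - p_prone)

# Bayes: P(A | A1) = P(A1|A)P(A) / P(A1).
p_prone_given_acc = p_acc_given_prone * p_prone / p_acc

print(p_acc)              # 13/50
print(p_prone_given_acc)  # 6/13
```

Here 13/50 equals the .26 found in part 1, and 6/13 matches part 2.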
• 3d. A laboratory blood test is 99 percent effective in
detecting a certain disease when it is, in fact, present.
However, the test also yields a “false positive” result
for 1 percent of the healthy persons tested. (That is, if
a healthy person is tested, with probability 0.01, the
test result will imply he or she has the disease.) If .2
percent of the population actually has the disease,
what is the probability a person has the disease given
that the test result is positive?
• D: Event that the tested person has the disease.
• E: Event that the test result is positive.
$$P(D \mid E) = \frac{P(DE)}{P(E)} = \frac{P(E \mid D)P(D)}{P(E \mid D)P(D) + P(E \mid D^c)P(D^c)}$$
$$= \frac{0.99 \times 0.002}{0.99 \times 0.002 + 0.01 \times 0.998} = \frac{0.00198}{0.00198 + 0.00998} \approx .166$$
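The same calculation can be wrapped in a small function to see how the answer depends on the inputs. This is a hedged Python sketch; the function name `posterior` and its parameter names are assumptions for illustration, not terminology from the slides.

```python
def posterior(prior, sensitivity, false_positive_rate):
    """P(D | E) for a positive test E, by Bayes' formula."""
    true_pos = sensitivity * prior                 # P(E|D) P(D)
    false_pos = false_positive_rate * (1 - prior)  # P(E|D^c) P(D^c)
    return true_pos / (true_pos + false_pos)

# Values from example 3d: 0.2 percent prevalence, 99 percent detection,
# 1 percent false positives.
p = posterior(prior=0.002, sensitivity=0.99, false_positive_rate=0.01)
print(round(p, 3))  # 0.166
```

Even with an accurate test, the low prevalence keeps the posterior probability of disease small.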
• Ex 3f. At a certain stage of a criminal investigation the
inspector in charge is 60 percent convinced of the guilt of a
certain suspect. Suppose now that a new piece of evidence that
shows the criminal has a certain characteristic (such as
left-handedness, baldness, or brown hair) is uncovered. If 15
percent of the population possesses this characteristic, how
certain of the guilt of the suspect should the inspector now be
if it turns out that the suspect has this characteristic?
• G: event that the suspect is guilty
• C: event that he possesses the characteristic of the criminal
• P(G|C)?
• P(G|C) = P(GC)/P(C)
= P(C|G)P(G) / [P(C|G)P(G) + P(C|G^c)P(G^c)]
= 1(.6)/[1(.6) + (.15)(.4)] ≈ .91
Monty Hall problem
• Suppose you're on a game show, and you're
given the choice of three doors: Behind one
door is a car; behind the others, goats. You
pick a door, say No. 1, and the host, who
knows what's behind the doors, opens another
door, say No. 3, which has a goat. He then
says to you, "Do you want to pick door No. 2?"
Is it to your advantage to switch your choice?
• Cswitch: get a car by switching
• Cstay: get a car by staying
• Ecar: originally picked the car
• Egoat: originally picked a goat
• P(Cswitch) = P(Cswitch|Ecar)P(Ecar) + P(Cswitch|Egoat)P(Egoat)
= 0(1/3) + 1(2/3) = 2/3
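The 2/3 result can be confirmed by listing every case. The Python sketch below assumes the car is equally likely to be behind each door, the initial pick is independent of the car's location, and the host always opens a goat door other than the one picked; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

doors = (1, 2, 3)
# Each (car location, initial pick) pair is equally likely: 9 cases.
cases = list(product(doors, doors))

# Staying wins when the initial pick is the car. Switching wins otherwise,
# because the host must open the other goat door, leaving the car as the
# only door to switch to.
stay_wins = sum(1 for car, pick in cases if pick == car)
switch_wins = sum(1 for car, pick in cases if pick != car)

p_stay = Fraction(stay_wins, len(cases))
p_switch = Fraction(switch_wins, len(cases))
print(p_stay, p_switch)  # 1/3 2/3
```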
• We can express the change in the probability
of a hypothesis when new evidence is
introduced in a compact form using change in
the odds of the hypothesis.
• The odds of an event A are defined by
P(A)/P(A^c) = P(A)/[1-P(A)]
• The odds of an event A tell how much more
likely it is that the event A occurs than that
it does not occur.
Change of probability with new
evidence
• Hypothesis H with probability P(H).
• P(H|E) = P(E|H)P(H)/P(E)
• P(H^c|E) = P(E|H^c)P(H^c)/P(E)
$$\frac{P(H \mid E)}{P(H^c \mid E)} = \frac{P(H)}{P(H^c)} \cdot \frac{P(E \mid H)}{P(E \mid H^c)}$$
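The odds form of the update can be illustrated with the numbers from the inspector example (Ex 3f): prior P(G) = .6, P(C|G) = 1, and P(C|G^c) = .15. The Python sketch below is only an illustration; the helper names `odds` and `prob_from_odds` are hypothetical.

```python
def odds(p):
    """Odds of an event with probability p: p / (1 - p)."""
    return p / (1 - p)

def prob_from_odds(o):
    """Convert odds back to a probability."""
    return o / (1 + o)

# Inspector example: prior odds of guilt times the likelihood ratio
# P(C|G) / P(C|G^c) gives the posterior odds of guilt.
prior_odds = odds(0.6)
likelihood_ratio = 1 / 0.15
posterior_odds = prior_odds * likelihood_ratio

print(round(prob_from_odds(posterior_odds), 2))  # 0.91
```

The posterior odds are about 10 to 1, which corresponds to the probability of roughly .91 found in Ex 3f.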
• 3g. In the world bridge championships held in Buenos Aires
in May 1965 the famous British bridge partnership of
Terrence Reese and Boris Schapiro was accused of cheating
by using a system of finger signals that could indicate the
number of hearts held by the players. Reese and Schapiro
denied the accusation, and eventually a hearing was held by
the British bridge league. The hearing was in the form of a
legal proceeding with a prosecuting and defense team, both
having the power to call and cross-examine witnesses.
During the course of these proceedings the prosecutor
examined specific hands played by Reese and Schapiro and
claimed that their playing in these hands was consistent
with the hypothesis that they were guilty of having illicit
knowledge of the heart suit. At this point, the defense
attorney pointed out that their play of these hands was also
perfectly consistent with their standard line of play.
However, the prosecution then argued that as long as their
play was consistent with the hypothesis of guilt, then it must
be counted as evidence toward this hypothesis. What do you
think of the reasoning of the prosecution?
Bayes’ Formula
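$$P(B \mid A) = \frac{P(AB)}{P(A)} = \frac{P(A \mid B)P(B)}{P(A \mid B)P(B) + P(A \mid B^c)P(B^c)}$$
If F_1, ..., F_n are mutually exclusive events with
$$\bigcup_{i=1}^{n} F_i = S,$$
then
$$E = \bigcup_{i=1}^{n} EF_i$$
$$P(E) = \sum_{i=1}^{n} P(EF_i) = \sum_{i=1}^{n} P(E \mid F_i)P(F_i)$$
$$P(F_j \mid E) = \frac{P(EF_j)}{P(E)} = \frac{P(E \mid F_j)P(F_j)}{\sum_{i=1}^{n} P(E \mid F_i)P(F_i)}$$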
Occupational Mobility
• Suppose that occupations are grouped into
upper (U), middle (M), and lower (L) levels.
U1 will denote the event that a father’s
occupation is upper-level; U2 will denote the
event that a child’s occupation is upper-level,
etc. (the subscripts index generations). Glass
and Hall (1954) compiled the following
statistics on occupational mobility in England
and Wales:
      U2    M2    L2
U1    .45   .48   .07
M1    .05   .70   .25
L1    .01   .50   .49
• This table is called a transition probability
matrix.
• If a father is in U, the probability that his son is
in U is .45, the probability that his son is in M
is .48, etc.
• The entries are conditional probabilities, such as P(U2|U1) = .45.
• Suppose that of the father’s generation, 10%
are in U, 40% in M, and 50% in L. What is the
probability that a child in the next generation is
in U?
• P(U2) = P(U2|U1)P(U1) + P(U2|M1)P(M1) +
P(U2|L1)P(L1)
= .45×.10 + .05×.40 + .01×.50 = .07
• Suppose we ask: if a child has occupational
status U2, what is the probability that his father
had occupational status U1? P(U1|U2)?
• P(U1|U2) = P(U1U2)/P(U2)
= P(U2|U1)P(U1) / [P(U2|U1)P(U1) +
P(U2|M1)P(M1) + P(U2|L1)P(L1) ]
= .45×.10 / .07 = .64
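Both occupational mobility calculations follow the general pattern of the law of total probability followed by Bayes' formula. The Python sketch below applies that pattern to the transition table; the dictionary layout and the function names `child_prob` and `father_given_child` are assumptions made for this illustration.

```python
# Rows: father's level; columns: child's level (U2, M2, L2).
transition = {
    "U1": {"U2": 0.45, "M2": 0.48, "L2": 0.07},
    "M1": {"U2": 0.05, "M2": 0.70, "L2": 0.25},
    "L1": {"U2": 0.01, "M2": 0.50, "L2": 0.49},
}
father = {"U1": 0.10, "M1": 0.40, "L1": 0.50}  # P(U1), P(M1), P(L1)

def child_prob(level):
    """Law of total probability over the father's level."""
    return sum(transition[f][level] * father[f] for f in father)

def father_given_child(f_level, c_level):
    """Bayes' formula: P(father's level | child's level)."""
    return transition[f_level][c_level] * father[f_level] / child_prob(c_level)

print(round(child_prob("U2"), 2))                # 0.07
print(round(father_given_child("U1", "U2"), 2))  # 0.64
```

The same two functions work for any of the other levels, for example `father_given_child("M1", "L2")`.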
• Suppose that we have 3 cards identical in form
except that both sides of the first card are
colored red, both sides of the second card are
colored black, and one side of the third card is
colored red and the other side black. The 3
cards are mixed up in a hat, and 1 card is
randomly selected and put down on the
ground. If the upper side of the chosen card is
colored red, what is the probability that the
other side is colored black?
• Let
– RR: the all-red card
– BB: the all-black card
– RB: the red-black card
– R: the upturned side of the chosen card is red
– P(RB|R)?
$$P(RB \mid R) = \frac{P(RB \cap R)}{P(R)} = \frac{P(R \mid RB)P(RB)}{P(R \mid RR)P(RR) + P(R \mid RB)P(RB) + P(R \mid BB)P(BB)}$$
$$= \frac{(1/2)(1/3)}{1(1/3) + (1/2)(1/3) + 0(1/3)} = \frac{1}{3}$$
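The answer of 1/3 can also be seen by counting sides rather than cards. The following Python sketch assumes each of the six (card, upturned side) combinations is equally likely; the names `cards`, `faces`, `red_up`, and `black_down` are illustrative.

```python
from fractions import Fraction

# Each card is a pair of sides. Every (card, upturned side) combination
# is treated as equally likely, giving 6 cases in total.
cards = {"RR": ("red", "red"), "BB": ("black", "black"), "RB": ("red", "black")}
faces = [(name, sides[i], sides[1 - i])   # (card, upper side, lower side)
         for name, sides in cards.items() for i in (0, 1)]

red_up = [f for f in faces if f[1] == "red"]         # event R
black_down = [f for f in red_up if f[2] == "black"]  # other side is black

p = Fraction(len(black_down), len(red_up))
print(p)  # 1/3
```

Of the three red sides that could be facing up, only one belongs to the red-black card.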