Revision Conditional Probability Power point


Conditional Probability
AS91585
Section One
THE LANGUAGE OF PROBABILITY QUESTIONS.
The question "Do you smoke?" was asked of 100 people. The results are shown in the table.

          Male   Female   Total
Yes         19       12      31
No          41       28      69
Total       60       40     100
(i) What is the probability of a randomly
selected individual being a male who smokes?
This is just a joint probability question:
The number of "Male and Smoke" divided by
the total = 19/100 = 0.19
(ii) What is the probability of a randomly
selected individual being a male?
This is the total for male divided by the
total = 60/100 = 0.60.
Since no mention is made of smoking or
not smoking, it includes all the cases.
(iii) What is the probability of a randomly
selected individual smoking?
Again, since no mention is made of gender, it is
the total who smoke divided by the total =
31/100 = 0.31.
(iv) What is the probability of a randomly
selected male smoking?
This time, you're told that you have a male,
so we only consider the males. What is the
probability that the male smokes? Well, 19
males smoke out of 60 males, so 19/60 ≈ 0.317.
(v) What is the probability that a randomly
selected smoker is male?
This time, you're told that you have a smoker
and asked to find the probability that the
smoker is also male. There are 19 male
smokers out of 31 total smokers, so 19/31 ≈ 0.613.
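The five answers in this section can be checked with a short script. This is only a sketch: the dictionary layout and the helper name `count` are choices made here for illustration, not part of the original slides.

```python
from fractions import Fraction

# Counts from the "Do you smoke?" table, keyed by (answer, gender).
counts = {
    ("yes", "male"): 19, ("yes", "female"): 12,
    ("no", "male"): 41, ("no", "female"): 28,
}
total = sum(counts.values())  # 100 people

def count(answer=None, gender=None):
    """Add up the cells that match the given answer and/or gender."""
    return sum(n for (a, g), n in counts.items()
               if answer in (None, a) and gender in (None, g))

# (i) joint probability: male AND smokes, out of everyone
p_male_and_smokes = Fraction(count("yes", "male"), total)
# (ii) and (iii) use a row or column total, out of everyone
p_male = Fraction(count(gender="male"), total)
p_smokes = Fraction(count(answer="yes"), total)
# (iv) and (v) are conditional: the denominator shrinks to the given group
p_smokes_given_male = Fraction(count("yes", "male"), count(gender="male"))
p_male_given_smokes = Fraction(count("yes", "male"), count(answer="yes"))
```

The only thing that changes between the joint and the conditional questions is the denominator: the whole sample for (i) to (iii), the given group for (iv) and (v).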
Section Two
CREATING TABLES
Question 2
There are three major manufacturing companies
that make a product: Aberations, Brochmailians,
and Chompielians. Aberations has a 50% market
share, and Brochmailians has a 30% market
share. 5% of Aberations' product is defective, 7%
of Brochmailians' product is defective, and 10%
of Chompielians' product is defective.
Choose 1000 for the total. Use A, B and C to save writing time.

Company               Good   Defective   Total
Aberations (A)
Brochmailians (B)
Chompielians (C)
Total                                     1000
Aberations has a 50% market share, and
Brochmailians has a 30% market share.
Fill in gaps as you go.

Company   Good   Defective   Total
A                              500
B                              300
C                              200
Total                         1000
5% of Aberations' product is defective, 7% of
Brochmailians' product is defective, and 10% of
Chompielians' product is defective.

Company   Good   Defective   Total
A                       25     500
B                       21     300
C                       20     200
Total                   66    1000
Fill in the gaps (Good = Total - Defective).

Company   Good   Defective   Total
A          475          25     500
B          279          21     300
C          180          20     200
Total      934          66    1000
(i) What is the probability a randomly selected product
is defective?
66/1000 = 0.066
(ii) What is the probability that a defective product
came from Brochmailians?
P(Brochmailian|Defective)
= P(Brochmailian and Defective) / P(Defective)
= 21/66 = 7/22 = 0.318 (approx).
(iii) Are these events independent?
No. If they were, then P(Brochmailians|Defective) = 0.318 would
have to equal P(Brochmailians) = 0.30, but it doesn't. That is, if
the events were independent:
$P(B \mid D) = \frac{P(B \cap D)}{P(D)} = \frac{P(B) \times P(D)}{P(D)} = P(B)$
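The table-building steps for Question 2 can be sketched in code as follows. The variable names are illustrative only, and Chompielians' 20% share is inferred as the remainder after the 50% and 30% shares, as in the table above.

```python
total = 1000  # convenient total, as chosen in the slides

# Percentages from the question; C's share is the remaining 20%.
share_pct = {"A": 50, "B": 30, "C": 20}
defect_pct = {"A": 5, "B": 7, "C": 10}

# Row totals first, then the defective column, then fill the gap for "good".
made = {c: total * share_pct[c] // 100 for c in share_pct}
defective = {c: made[c] * defect_pct[c] // 100 for c in share_pct}
good = {c: made[c] - defective[c] for c in share_pct}

all_defective = sum(defective.values())

p_defective = all_defective / total                     # (i)
p_b_given_defective = defective["B"] / all_defective    # (ii)
p_b = made["B"] / total

# (iii) independence would need P(B | D) to equal P(B)
independent = abs(p_b_given_defective - p_b) < 1e-9
```

Integer arithmetic is used for the counts so that the cells come out as whole numbers, matching the hand-filled table.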
SECTION THREE
PRACTICING METHODS
Using Tree diagrams
There are three major manufacturing companies
that make a product: Aberations, Brochmailians,
and Chompielians. Aberations has a 50% market
share, and Brochmailians has a 30% market
share. 5% of Aberations' product is defective, 7%
of Brochmailians' product is defective, and 10%
of Chompielians' product is defective.
(i) What is the probability a randomly
selected product is defective?
Tree diagram (first branch: company; second branch: D = defective):
  0.5  A:  D (0.05), Not D
  0.3  B:  D (0.07), Not D
  0.2  C:  D (0.1),  Not D

$P(D) = 0.5 \times 0.05 + 0.3 \times 0.07 + 0.2 \times 0.1 = 0.066$
(ii) What is the probability that a defective
product came from Brochmailians?
$P(B \mid D) = \frac{0.3 \times 0.07}{0.5 \times 0.05 + 0.3 \times 0.07 + 0.2 \times 0.1} = \frac{0.021}{0.066} = \frac{7}{22} \approx 0.318$
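The tree-diagram method (multiply along a branch, add across branches) can also be written out in a few lines. This is a sketch; the `tree` dictionary is simply one assumed way of storing the branch probabilities.

```python
# Each branch of the tree: company -> (P(company), P(defective | company)).
tree = {"A": (0.5, 0.05), "B": (0.3, 0.07), "C": (0.2, 0.10)}

# Multiply along each branch that ends in D ...
path_to_d = {c: p_company * p_def for c, (p_company, p_def) in tree.items()}

# ... then add those branches to get P(D).
p_d = sum(path_to_d.values())

# P(B | D): the B-and-D branch over all branches that end in D.
p_b_given_d = path_to_d["B"] / p_d

print(round(p_d, 3), round(p_b_given_d, 3))
```

The numerator of the conditional probability is always one of the terms already added up in the denominator, which is a useful check when working by hand.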
Question Three
Players are strongly advised to warm up before playing
sports games to reduce their risk of injury from playing
the game.
For a particular sports team of 20 players:
• 14 of the players warmed up before the last game
• 5 of the players were injured during the last game
• 2 of the players did not warm up and were not injured
during the last game.
Using this information, calculate the probability that a
randomly chosen player from the team was injured, given
that the player did not warm up before the last game.
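Venn Diagram
Warmed up only: 13
Warmed up and injured: 1
Injured only (did not warm up): 4
Neither (did not warm up, not injured): 2
Total: 20
P(injured | did not warm up) = 4/6 = 2/3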
Table: initial information

                   Injured   Not injured   Totals
Warmed up                                      14
Did not warm up                        2
Totals                   5                     20
Table: fill the gaps

                   Injured   Not injured   Totals
Warmed up                1            13       14
Did not warm up          4             2        6
Totals                   5            15       20

P(injured | did not warm up) = 4/6 = 2/3
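Filling the gaps in the Question Three table is just a chain of subtractions, sketched below. The variable names are made up for this illustration.

```python
from fractions import Fraction

# Information given in the question.
total, warmed_up, injured = 20, 14, 5
no_warm_not_injured = 2          # the only inner cell we are told

# Fill the gaps one subtraction at a time.
no_warm = total - warmed_up                        # did not warm up
no_warm_injured = no_warm - no_warm_not_injured    # did not warm up, injured
warm_injured = injured - no_warm_injured           # warmed up, injured
warm_not_injured = warmed_up - warm_injured        # warmed up, not injured

# Given "did not warm up", only that row of the table counts.
p_injured_given_no_warm = Fraction(no_warm_injured, no_warm)
```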
Question 4
On a certain type of aircraft the warning lights
(showing green for normal and red for trouble)
for the engines are accurate 90% of the time. If
there are problems with the engines on 2% of
all flights, find the probability that there is a
fault with an engine, given that the warning
light shows red.
We can also create a table:
Assume we look at 1000 flights.
“problems with the engines on 2% of flights”

          Trouble    OK   Totals
Red
Green
Totals         20   980     1000
“warning lights are accurate 90% of the time”

          Trouble    OK   Totals
Red            18
Green               882
Totals         20   980     1000
Finish the table

          Trouble    OK   Totals
Red            18    98      116
Green           2   882      884
Totals         20   980     1000
Find the probability that there is a fault with an
engine, given that the warning light shows red.
P(trouble | red) = 18/116 = 9/58
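Tree diagram (T = trouble, T' = no trouble; R = red light, G = green light):
  0.02  T:   R (0.9), G (0.1)
  0.98  T':  R (0.1), G (0.9)

$P(T \mid R) = \frac{0.02 \times 0.9}{0.02 \times 0.9 + 0.98 \times 0.1} = \frac{0.018}{0.116} \approx 0.155$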
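The warning-light calculation can be wrapped in a small function. This is a sketch that assumes, as the slides do, that "accurate 90% of the time" means the light is right 90% of the time both when there is trouble and when there is not; the function name `posterior` is hypothetical.

```python
def posterior(prior, p_red_given_trouble, p_red_given_ok):
    """P(trouble | red light) for a light that is either red or green."""
    red_and_trouble = prior * p_red_given_trouble      # correct red light
    red_and_ok = (1 - prior) * p_red_given_ok          # false alarm
    return red_and_trouble / (red_and_trouble + red_and_ok)

# 2% of flights have engine trouble; the light is wrong 10% of the time.
p_trouble_given_red = posterior(prior=0.02,
                                p_red_given_trouble=0.9,
                                p_red_given_ok=0.1)
```

Because trouble is rare, most red lights come from the 98% of flights with no problem, which is why the answer is far below 90%.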
SECTION FOUR
READING TABLES
Question 5
One of the biggest problems with conducting a
mail survey is the poor response rate. In an
effort to reduce nonresponse, several different
techniques for formatting questionnaires have
been proposed. An experiment was conducted
to study the effect of the questionnaire layout
and page size on response in a mail survey. The
results are given below.
Format                      Responses   Nonresponses   Total
Typewritten (small page)           86             57     143
Typewritten (large page)          191             97     288
Typeset (small page)               72             69     141
Typeset (large page)              192             92     284
Total                             541            315     856
(a) What proportion of the sample
responded to the questionnaire?
541/856 ≈ 0.632
(b) What proportion of the sample received the
typeset small-page version?
141/856 ≈ 0.165
(c) What proportion of those who received a typeset
large-page version actually responded to the questionnaire?
192/284 = 48/71 ≈ 0.676
(d) What proportion of the sample received a typeset
large-page questionnaire and responded?
192/856 = 24/107 ≈ 0.224
(e) What proportion of those who responded to the
questionnaire actually received a typewritten large-page
questionnaire?
191/541 ≈ 0.353
(f) By looking at the response rates for each of the four
formats, what do you conclude from the study?

Format                      Responses   Nonresponses   Total   Response rate
Typewritten (small page)           86             57     143   86/143 = 60.1%
Typewritten (large page)          191             97     288   191/288 = 66.3%
Typeset (small page)               72             69     141   72/141 = 51.1%
Typeset (large page)              192             92     284   192/284 = 67.6%
Total                             541            315     856
Typeset (large page) gave the best response rate at 68%,
with typewritten (large page) almost the same at 66%;
typeset (small page) was the worst at 51%. Allowing for a
margin of error, I would conclude that response rates
seem to be better for the large page size.
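The proportions read from the survey table can be reproduced with a short script. This is a sketch; the dictionary keys are shortened labels chosen here for convenience.

```python
from fractions import Fraction

# (responses, nonresponses) for each questionnaire format, from the table.
survey = {
    "typewritten small": (86, 57),
    "typewritten large": (191, 97),
    "typeset small": (72, 69),
    "typeset large": (192, 92),
}

responses = sum(r for r, _ in survey.values())
sample = sum(r + n for r, n in survey.values())

# (a) proportion of the whole sample that responded
p_responded = Fraction(responses, sample)

# (f) response rate within each format: responses over that row's total
response_rate = {fmt: r / (r + n) for fmt, (r, n) in survey.items()}
best = max(response_rate, key=response_rate.get)
worst = min(response_rate, key=response_rate.get)
```

As with the earlier tables, the question wording decides the denominator: "of the sample" means 856, while "of those who received ..." or "of those who responded" means a row or column total.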
SECTION FIVE
HARDER PROBLEMS
Question 6
A cab was involved in a hit-and-run accident at
night. There are two cab companies that operate in
the city, a Blue Cab company and a Green Cab
company. It is known that 85% of the cabs in the
city are Green and 15% are Blue. A witness at the
scene identified the cab involved in the accident as
a Blue Cab. The witness was tested under similar
visibility conditions, and made correct colour
identifications in 80% of the trial instances. What is
the probability that the cab involved in the accident
was a Blue cab as stated by the witness?
If your answer was 80%, you are in the majority.
The 80% answer shows how we have a tendency
to primarily consider only the last evidence
given to us, ignoring earlier evidence.
If we are simply told that a cab was involved in a
hit-and-run accident, and are not given the
information about the witness, then the
majority of us will correctly estimate the
probability of it being a Blue cab as 15%.
Given new evidence (the 80% reliable witness)
we throw away the first calculation and base our
answer solely on the reliability of the witness.
We do this to simplify the calculation, but in this
case it leads to the wrong answer.
There are four possible scenarios.
1. Green (85%) and correctly identified as
Green (80%). Chance is 68%
2. Green (85%) and misidentified as Blue (20%).
Chance is 17%
3. Blue (15%) and correctly identified as Blue
(80%). Chance is 12%
4. Blue (15%) and misidentified as Green (20%).
Chance is 3%
Conditional probability
In this case, we know that the witness said it
was a Blue cab, so we only need to consider
those cases where the cab was identified as
Blue.
That means it was either a misidentified Green
(17%) or a correctly identified Blue (12%). So the
chance that it was actually Blue is the chance of
it being correctly identified as Blue (12%) over
the chance that it was identified as Blue,
whichever colour it actually was (12% + 17%, or
29%). That means that the chance of it being
Blue, after being identified as Blue, is 12/29, or
about 41%.
The chance that it was actually Green is the
remaining 59%.
But with a witness who is 80% reliable, how can
he be so likely to get it wrong? The catch is that
his small chance of a wrong identification applies
to the huge number of Green cabs, which make it
so much more likely that any cab in the city is
Green.
Basically, with a compound probability like this
you have to be careful to check out the
contribution of both the correct (correctly
identified Blue) and the incorrect (misidentified
Green) terms. Otherwise, you may miss a large
contribution which works against your intuition.
If only 5% of the cabs in the city are Blue, the
chances drop to 4/23, or 17%. In other words, if
only 5% of the cabs are Blue and our 80%
reliable witness identifies a Blue cab in an
accident, there is only a 17% chance that he's
actually right. Our 80% reliable witness is nearly
5 times more likely to be wrong than right (19/23
against 4/23)!
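The effect of the proportion of Blue cabs can be explored with a small function. This is a sketch; the function name is made up for this illustration, and exact fractions are used so the results match the 12/29 and 4/23 quoted above.

```python
from fractions import Fraction

def p_blue_given_says_blue(p_blue, reliability):
    """Chance the cab really was Blue, given the witness said Blue."""
    blue_called_blue = p_blue * reliability                 # correct identification
    green_called_blue = (1 - p_blue) * (1 - reliability)    # misidentified Green
    return blue_called_blue / (blue_called_blue + green_called_blue)

reliability = Fraction(80, 100)
with_15_percent_blue = p_blue_given_says_blue(Fraction(15, 100), reliability)
with_5_percent_blue = p_blue_given_says_blue(Fraction(5, 100), reliability)
```

Trying other values of `p_blue` shows how strongly the answer depends on the base rate, even though the witness's reliability never changes.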
Question 7a
A survey of newspaper purchasing patterns in a
particular region found that 3⁄5 of the
households surveyed purchased a daily
newspaper during the previous week. The
survey also found that in 7⁄10 of the households
where a daily newspaper was purchased during
the previous week, a Sunday paper was also
purchased. A Sunday paper was purchased in 1⁄4
of households where no daily newspaper was
purchased.
What is the probability that, in a household
chosen at random from those surveyed:
(i) a daily newspaper was purchased or a
Sunday newspaper was purchased (but not
both)?
(ii) a daily newspaper was purchased, given
that a Sunday newspaper was purchased?
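Table (assume 100 households: 3/5 of 100 = 60 bought a daily paper,
7/10 of those 60 = 42 also bought a Sunday paper, and 1/4 of the
other 40 = 10 bought a Sunday paper)

                    Daily paper   No daily paper   Totals
Sunday paper                 42               10       52
No Sunday paper              18               30       48
Totals                       60               40      100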
(i) a daily newspaper was purchased or a Sunday
newspaper was purchased (but not both)?
(18 + 10)/100 = 28/100 = 7/25
(ii) a daily newspaper was purchased, given that a
Sunday newspaper was purchased?
42/52 = 21/26
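Question 7a can also be done directly with the fractions from the question, without choosing a total. The sketch below uses illustrative variable names.

```python
from fractions import Fraction

# Fractions given in the question.
p_daily = Fraction(3, 5)
p_sunday_given_daily = Fraction(7, 10)
p_sunday_given_no_daily = Fraction(1, 4)

# Multiply along the branches to get the three cells we need.
both = p_daily * p_sunday_given_daily                    # daily and Sunday
daily_only = p_daily * (1 - p_sunday_given_daily)        # daily, no Sunday
sunday_only = (1 - p_daily) * p_sunday_given_no_daily    # Sunday, no daily

p_exactly_one = daily_only + sunday_only                 # (i) one but not both
p_daily_given_sunday = both / (both + sunday_only)       # (ii)
```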
7b In a different survey of 120 households, where 70 of them
purchased a daily newspaper, the following information was
obtained.
Households may contain both primary and secondary
students.
(i) Find the probability that a household chosen at random from
this sample contains both primary and secondary school students.
There are 120 households, but 146 are listed, which
means 26 are counted twice. Answer: 26/120 = 13/60 ≈ 0.217
(ii) Demonstrate mathematically that the event ‘the household contains
both primary and secondary school students’ and the event ‘the
household purchases a daily newspaper’ are not independent.
70 out of 120 purchased a daily paper.
14 of these were counted twice.
$P(P\&S) = \frac{26}{120}; \quad P(D) = \frac{70}{120}$
$P(P\&S \cap D) = \frac{14}{120} = \frac{84}{720}$
$P(P\&S) \times P(D) = \frac{26}{120} \times \frac{70}{120} = \frac{91}{720}$
$P(P\&S \cap D) \ne P(P\&S) \times P(D)$
Events are not independent.
(iii)
Find the probability that a household contains at
least one secondary school student, given that the
household purchases at least one daily
newspaper.
Conditional probability P(S|D):
$P(S \mid D) = \frac{P(S \cap D)}{P(D)} = \frac{21/120}{70/120} = \frac{21}{70} = \frac{3}{10}$
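The independence test in 7b(ii) and the conditional probability in 7b(iii) can be checked with exact fractions. This sketch only uses the counts quoted in the working above (26, 70, 14 and 21 out of 120); the variable names are illustrative.

```python
from fractions import Fraction

households = 120
p_both_levels = Fraction(26, households)            # primary and secondary students
p_daily = Fraction(70, households)                  # buys a daily paper
p_both_levels_and_daily = Fraction(14, households)

# Independence would require P(A and B) = P(A) x P(B).
product = p_both_levels * p_daily
independent = (p_both_levels_and_daily == product)

# (iii) P(S | D) = P(S and D) / P(D)
p_secondary_given_daily = Fraction(21, households) / p_daily
```

Using `Fraction` avoids any doubt about whether two decimals differ only through rounding.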
Scholarship 2010 Q3a
Statsmobiles may be classified as “two door” or “at
least three door” models. Fifty-five percent of all
two-door Statsmobiles are non air-conditioned and
30% of all Statsmobiles are both air-conditioned
and have at least three doors.
• The proportion of air-conditioned Statsmobiles
that have at least three doors is the same as the
overall proportion of Statsmobiles that have at
least three doors.
Find the value of this proportion.
Using a tree diagram
  1-p  2 door:    NAC (0.55), AC (0.45)
  p    3+ doors:  NAC (1-q),  AC (q)
P(3+ doors and AC) = 0.3
The proportion of air-conditioned cars with 3+ doors must equal p, and
P(AC) = 0.3 + 0.45(1 - p), so:
$p = \frac{0.3}{0.3 + 0.45(1 - p)}$
$0.3p + 0.45p - 0.45p^2 - 0.3 = 0$
$0.45p^2 - 0.75p + 0.3 = 0$
$p = \frac{2}{3} \text{ or } 1$
There are some 2 door cars, so p cannot be 1 and p = 2/3.
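The quadratic can be checked numerically. This is a sketch using the standard quadratic formula; it is a check on the working above, not part of the original solution.

```python
import math

# 0.45 p^2 - 0.75 p + 0.3 = 0, from p = 0.3 / (0.3 + 0.45 (1 - p))
a, b, c = 0.45, -0.75, 0.3
disc = b * b - 4 * a * c
roots = sorted((-b + sign * math.sqrt(disc)) / (2 * a) for sign in (-1, 1))

# p = 1 would mean there are no two-door cars, so keep the smaller root.
p = min(roots)

# Substituting back into the original equation should reproduce p.
check = 0.3 / (0.3 + 0.45 * (1 - p))
```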