Lecture 9 - Statistics


Chapter 5
Probability in Our Daily Lives
Learn:
• About probability, the way we quantify uncertainty
• How to measure the chances of the possible outcomes of random phenomena
• How to find and interpret probabilities
Agresti/Franklin Statistics, 1 of 87
Section 5.1
How Can Probability Quantify
Randomness?
Randomness

Applies to the outcomes of a response
variable

Possible outcomes are known, but it is
uncertain which will occur for any given
observation
Some Popular Randomizers

Rolling dice

Spinning a wheel

Flipping a coin

Drawing cards
Random Phenomena

Individual outcomes are
unpredictable

With a large number of observations,
predictable patterns occur
Random Phenomena

With random phenomena, the
proportion of times that something
happens is highly random and
variable in the short run but very
predictable in the long run.
Jacob Bernoulli: Law of
Large Numbers

As the number of trials of a random
phenomenon increases, the
proportion of occurrences of any
given outcome approaches a
particular number “in the long run”.
Probability

With a random phenomenon, the
probability of a particular outcome is
the proportion of times that the
outcome would occur in a long run
of observations.
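A minimal Python sketch of the long-run idea, assuming a fair six-sided die simulated with the standard random module (the function name and the fixed seed are illustrative, not from the slides). The proportion of sixes is erratic for a few rolls and settles near 1/6 ≈ 0.167 as the number of rolls grows:

```python
import random

def proportion_of_sixes(num_rolls, seed=0):
    """Simulate num_rolls rolls of a fair die; return the proportion showing a 6."""
    rng = random.Random(seed)  # fixed seed (an assumption) so the run is repeatable
    sixes = sum(1 for _ in range(num_rolls) if rng.randint(1, 6) == 6)
    return sixes / num_rolls

# Short runs vary a lot; long runs settle near 1/6.
for n in (10, 100, 10_000, 100_000):
    print(n, round(proportion_of_sixes(n), 4))
```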
Roll a Die
What is the probability of rolling a ‘6’?
a. .22
b. .10
c. .17
Question about Random
Phenomena

If a family has four girls in a row and
is expecting another child, does the
next child have more than a ½
chance of being a boy?
Independent Trials

Different trials of a random
phenomenon are independent if the
outcome of any one trial is not
affected by the outcome of any
other trial.
Section 5.2
How Can We Find Probabilities?
Sample Space

For a random phenomenon, the
sample space is the set of all
possible outcomes
Example: Roll a Die Once

The Sample Space consists of six
possible outcomes:
{1, 2, 3, 4, 5, 6}
Example: Flip a Coin Twice

The Sample Space consists of the
four possible outcomes:
{(H,H) (H,T) (T,H) (T,T)}
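One way to list a small sample space mechanically, sketched in Python with itertools.product (the variable name is illustrative). Each flip is H or T, and two flips give every ordered pair:

```python
from itertools import product

# Every ordered pair of results from two flips of a coin.
sample_space = list(product("HT", repeat=2))

print(sample_space)       # [('H', 'H'), ('H', 'T'), ('T', 'H'), ('T', 'T')]
print(len(sample_space))  # 4
```

Changing repeat to 3 lists the 8 outcomes for three flips, which is where a tree diagram starts to get crowded.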
Example: A 3-Question Multiple
Choice Quiz

Diagram of the Sample Space
Tree Diagram

An ideal way of visualizing sample
spaces with a small number of
outcomes

As the number of trials or the number
of possible outcomes on each trial
increases, the tree diagram becomes
impractical
Event

An event is a subset of the sample
space
Probabilities for a Sample
Space

The probability of each individual
outcome is between 0 and 1

The total of all the individual
probabilities equals 1
Example: Assigning Subjects to
Echinacea or Placebo for Treating
Colds

Experiment
• Multi-center randomized experiment to compare an herbal remedy to a placebo for treating the common cold
• Half of the volunteers are randomly chosen to receive the herbal remedy and the other half will receive the placebo
• Clinic in Madison, Wisconsin has four volunteers
  • Two men: Jamal and Ken
  • Two women: Linda and Mary
Example: Assigning Subjects to
Echinacea or Placebo for Treating
Colds

Sample Space to receive the herbal
remedy:
{(Jamal, Ken), (Jamal, Linda), (Jamal, Mary),
(Ken, Linda), (Ken, Mary), (Linda, Mary)}

These six possible outcomes are equally
likely
Example: Assigning Subjects to
Echinacea or Placebo for Treating
Colds

What is the probability of the event that
the sample chosen to receive the herbal
remedy consists of one man and one
woman?
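A short Python sketch of this question, assuming the six pairs are equally likely as stated on the previous slide (variable names are illustrative). It counts the pairs with exactly one man and divides by the size of the sample space:

```python
from fractions import Fraction
from itertools import combinations

men = {"Jamal", "Ken"}
volunteers = ["Jamal", "Ken", "Linda", "Mary"]

# Sample space: every pair of volunteers that could receive the herbal remedy.
sample_space = list(combinations(volunteers, 2))

# Event: the pair contains exactly one man (and therefore one woman).
event = [pair for pair in sample_space if sum(name in men for name in pair) == 1]

probability = Fraction(len(event), len(sample_space))
print(len(sample_space), len(event), probability)  # 6 4 2/3
```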
Probability of an Event


The probability of an event A, denoted by
P(A), is obtained by adding the probabilities of
the individual outcomes in the event.
When all the possible outcomes are
equally likely:
P(A) = (number of outcomes in event A) / (number of outcomes in the sample space)
Example: What are the Chances
of a Taxpayer being Audited?

Each year, the Internal Revenue
Service audits a sample of tax forms
to verify their accuracy
Example: What are the Chances
of a Taxpayer being Audited?

What is the sample space for selecting a
taxpayer?
{(under $25,000, Yes), (under $25,000, No),
($25,000 - $49,000, Yes) …}
Example: What are the Chances
of a Taxpayer being Audited?

For a randomly selected taxpayer in
2002, what is the probability of an
audit?
Example: What are the Chances
of a Taxpayer being Audited?

For a randomly selected taxpayer in
2002, what is the probability of an
income of $100,000 or more?
Basic Rules for Finding Probabilities
about a Pair of Events

Complement of an Event

Intersection of 2 Events

Union of 2 Events
Complement of an Event

Complement of Event A:
• Consists of all outcomes in the sample space that are not in A
• Is denoted by A^c
• The probabilities of A and A^c add to 1
• P(A^c) = 1 – P(A)
Disjoint Events

Two events, A and B, are disjoint if
they do not have any common
outcomes
Example: Disjoint Events

Pop Quiz: 3 Multiple-Choice Questions
• Event A: Student answers exactly 1 question correctly
• Event B: Student answers exactly 2 questions correctly
Intersection of Two Events

The intersection of A and B: consists
of outcomes that are in both A and B
Union of Two Events


The union of A and B: Consists of
outcomes that are in A or B
In probability, “A or B” denotes that A
occurs or B occurs or both occur
Intersection and Union of
Two Events
How Can We Find the Probability
that A or B Occurs?

Addition Rule: Probability of the
Union of Two Events
• For the union of two events,
P(A or B) = P(A) + P(B) – P(A and B)
• If the events are disjoint, P(A and B) = 0,
so P(A or B) = P(A) + P(B)
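A small Python check of the addition rule on one roll of a fair die. The two events are hypothetical examples chosen for illustration, not taken from the slides: A is "roll is even" and B is "roll is greater than 4". Python sets give intersection (&) and union (|) directly:

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}  # hypothetical event: roll is even
B = {5, 6}     # hypothetical event: roll is greater than 4

def prob(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event), len(sample_space))

by_rule = prob(A) + prob(B) - prob(A & B)  # P(A) + P(B) - P(A and B)
directly = prob(A | B)                     # count outcomes in the union

print(by_rule, directly)  # 2/3 2/3
```

The outcome 6 lies in both events, so adding P(A) and P(B) without subtracting P(A and B) would count it twice.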
How Can We Find the Probability
that A and B Occurs?

Multiplication Rule: Probability of the
Intersection of Independent Events
• For the intersection of two independent
events, A and B:
P(A and B) = P(A) x P(B)
Example: Two Rolls of A Die

P(6 on roll 1 and 6 on roll 2):
1/6 x 1/6 = 1/36
Example: Guessing on a Pop
Quiz

Pop Quiz with 3 Multiple-choice
questions
• Each question has 5 options

A student is totally unprepared and
randomly guesses the answer to each
question
Example: Guessing on a Pop
Quiz


The probability of selecting the
correct answer by guessing = 0.20
Responses on each question are
independent
Tree Diagram for the Pop Quiz
Example: Guessing on a Pop
Quiz

What is the probability that a student
answers at least 2 questions
correctly?
P(CCC) + P(CCI) + P(CIC) + P(ICC) =
0.008 + 3(0.032) = 0.104
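The same calculation can be sketched in Python by walking the eight branches of the tree (C = correct, I = incorrect). This assumes, as the slides do, that each guess is correct with probability 0.20 and that questions are independent; the function and variable names are illustrative:

```python
from itertools import product

P_CORRECT = 0.20  # chance of guessing 1 of 5 options correctly

def outcome_probability(outcome):
    """Multiply per-question probabilities; valid because guesses are independent."""
    p = 1.0
    for answer in outcome:
        p *= P_CORRECT if answer == "C" else 1 - P_CORRECT
    return p

# All 8 outcomes: CCC, CCI, CIC, CII, ICC, ICI, IIC, III
outcomes = ["".join(o) for o in product("CI", repeat=3)]
at_least_two = sum(outcome_probability(o) for o in outcomes if o.count("C") >= 2)

print(round(at_least_two, 3))  # 0.104
```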
Events Often Are Not
Independent

Example: A Pop Quiz with 2 Multiple Choice Questions
• Data giving the proportions for the actual responses of students in a class
Outcome:      II    IC    CI    CC
Probability:  0.26  0.11  0.05  0.58
Events Often Are Not
Independent

Define the events A and B as follows:
• A: {first question is answered correctly}
• B: {second question is answered
correctly}
Events Often Are Not
Independent

P(A) = P{(CI), (CC)} = 0.05 + 0.58 = 0.63

P(B) = P{(IC), (CC)} = 0.11 + 0.58 = 0.69

P(A and B) = P{(CC)} = 0.58

If A and B were independent,
P(A and B) = P(A) x P(B) = 0.63 x 0.69 = 0.43
The actual P(A and B) is 0.58, not 0.43, so A and B are not independent
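A brief Python sketch of this comparison, using the four outcome proportions from the class data (the dictionary layout and names are illustrative). It sets the product P(A) x P(B) beside the observed P(A and B):

```python
# Proportions of each outcome (first question, second question); C = correct, I = incorrect
probabilities = {"II": 0.26, "IC": 0.11, "CI": 0.05, "CC": 0.58}

p_a = probabilities["CI"] + probabilities["CC"]  # A: first question correct
p_b = probabilities["IC"] + probabilities["CC"]  # B: second question correct
p_a_and_b = probabilities["CC"]                  # both correct

print(round(p_a, 2), round(p_b, 2), round(p_a * p_b, 2), p_a_and_b)
# 0.63 0.69 0.43 0.58
```

The gap between 0.43 and 0.58 is far larger than rounding could explain, which is why independence is rejected here.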
Question of Independence

Don’t assume that events are
independent unless you have given
this assumption careful thought and
it seems plausible
Example: A family has two
children
If each child is equally likely to be a girl or
boy, find the probability that the family has
two girls.
a. 1/2
b. 1/3
c. 1/4
d. 1/8
Section 5.3
Conditional Probability: What’s the
Probability of A, Given B?
Conditional Probability

For events A and B, the conditional
probability of event A, given that
event B has occurred, is:
P(A | B) = P(A and B) / P(B)
Example: What are the Chances
of a Taxpayer being Audited?
Example: Probabilities of a
Taxpayer Being Audited

What was the probability of being
audited, given that the income was ≥
$100,000?
• Event A: Taxpayer is audited
• Event B: Taxpayer’s income ≥ $100,000
Example: Probabilities of a
Taxpayer Being Audited
P(A | B) = P(A and B) / P(B) = 0.0010 / 0.1334 = 0.007
Example: The Triple Blood Test
for Down Syndrome

A positive test result states that the
condition is present

A negative test result states that the
condition is not present
Example: The Triple Blood Test
for Down Syndrome

False Positive: Test states the
condition is present, but it is actually
absent

False Negative: Test states the
condition is absent, but it is actually
present
Example: The Triple Blood Test
for Down Syndrome

A study of 5282 women aged 35 or
over analyzed the Triple Blood Test to
test its accuracy
Example: The Triple Blood Test
for Down Syndrome

Assuming the sample is representative
of the population, find the estimated
probability of a positive test for a
randomly chosen pregnant woman 35
years or older
Example: The Triple Blood Test
for Down Syndrome

P(POS) = 1355/5282 = 0.257
Example: The Triple Blood Test
for Down Syndrome

Given that the diagnostic test result is
positive, find the estimated
probability that Down syndrome truly
is present
Example: The Triple Blood Test
for Down Syndrome
P(D | POS) = P(D and POS) / P(POS) = (48 / 5282) / (1355 / 5282) = 0.009 / 0.257 = 0.035
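The arithmetic can be reproduced in a few lines of Python from the counts quoted in the slides (5282 women, 1355 positive tests, 48 positive tests with Down syndrome present); the constant names are illustrative. The study total cancels, so the conditional probability is simply 48 / 1355:

```python
TOTAL = 5282            # women in the study
POSITIVE = 1355         # positive test results
DOWN_AND_POSITIVE = 48  # positive results where Down syndrome was present

p_pos = POSITIVE / TOTAL
p_d_and_pos = DOWN_AND_POSITIVE / TOTAL
p_d_given_pos = p_d_and_pos / p_pos  # the TOTAL cancels: same as 48 / 1355

print(round(p_pos, 3), round(p_d_and_pos, 3), round(p_d_given_pos, 3))
# 0.257 0.009 0.035
```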
Example: The Triple Blood Test
for Down Syndrome

Summary: Of the women who tested
positive, fewer than 4% actually had
fetuses with Down syndrome
Multiplication Rule for Finding
P(A and B)

For events A and B, the probability
that A and B both occur equals:
• P(A and B) = P(A|B) x P(B)
• also P(A and B) = P(B|A) x P(A)
Example: How Likely is a Double
Fault in Tennis?

Roger Federer: 2004 men’s
champion in the Wimbledon tennis
tournament
• He made 64% of his first serves
• He faulted on the first serve 36% of the time
• Given that he made a fault with his first serve, he made a fault on his second serve only 6% of the time
Example: How Likely is a Double
Fault in Tennis?

Assuming these are typical of his
serving performance, when he serves,
what is the probability that he makes
a double fault?
Example: How Likely is a Double
Fault in Tennis?



P(F1) = 0.36
P(F2|F1) = 0.06
P(F1 and F2) = P(F2|F1) x P(F1)
= 0.06 x 0.36 = 0.02
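For reference, the unrounded product, as a two-line Python sketch (variable names are illustrative):

```python
p_fault_first = 0.36               # P(F1)
p_fault_second_given_first = 0.06  # P(F2 | F1)

# Multiplication rule: P(F1 and F2) = P(F2 | F1) x P(F1)
p_double_fault = p_fault_second_given_first * p_fault_first

print(round(p_double_fault, 4))  # 0.0216
```

So a double fault is expected on roughly 2 of every 100 service points, assuming these percentages are typical.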
Sampling Without
Replacement

Once subjects are selected from a
population, they are not eligible to be
selected again
Example: How Likely Are You to
Win the Lotto?

In Georgia’s Lotto, 6 numbers are
randomly sampled without
replacement from the integers 1 to 49

You buy a Lotto ticket. What is the
probability that it is the winning
ticket?
Example: How Likely Are You to
Win the Lotto?

P(have all 6 numbers) = P(have 1st and 2nd and 3rd and 4th and 5th and 6th)
= P(have 1st) x P(have 2nd | have 1st) x P(have 3rd | have 1st and 2nd) x … x P(have 6th | have 1st, 2nd, 3rd, 4th, 5th)
Example: How Likely Are You to
Win the Lotto?
6/49 x 5/48 x 4/47 x 3/46 x 2/45 x 1/44
= 0.00000007
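A Python sketch of the same product using exact fractions (the loop variable and names are illustrative). As a cross-check, the result equals 1 divided by the number of ways to choose 6 numbers from 49, computed with math.comb:

```python
from fractions import Fraction
from math import comb

p_win = Fraction(1)
for k in range(6):
    # With k numbers already matched, (6 - k) of your numbers
    # remain among the (49 - k) numbers not yet drawn.
    p_win *= Fraction(6 - k, 49 - k)

print(p_win)                  # 1/13983816
print(f"{float(p_win):.8f}")  # 0.00000007
print(p_win == Fraction(1, comb(49, 6)))  # True
```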
Independent Events Defined
Using Conditional Probabilities

Two events A and B are independent
if the probability that one occurs is
not affected by whether or not the
other event occurs
Independent Events Defined
Using Conditional Probabilities

Events A and B are independent if:
P(A|B) = P(A)

If this holds, then also P(B|A) = P(B)

Also, P(A and B) = P(A) x P(B)
Checking for Independence

Here are three ways to check whether
events A and B are independent:
• Is P(A|B) = P(A)?
• Is P(B|A) = P(B)?
• Is P(A and B) = P(A) x P(B)?

If any of these is true, the others are also
true and the events A and B are
independent
Example: How to Check Whether
Two Events are Independent

The diagnostic blood test for Down
syndrome:
POS = positive result
NEG = negative result
D = Down Syndrome
D^c = Unaffected
Example: How to Check Whether
Two Events are Independent
Blood Test:
Status   POS     NEG     Total
D        0.009   0.001   0.010
D^c      0.247   0.742   0.990
Total    0.257   0.743   1.000
Example: How to Check Whether
Two Events are Independent

Are the events POS and D
independent or dependent?
• Is P(POS|D) = P(POS)?
Example: How to Check Whether
Two Events are Independent

Is P(POS|D) = P(POS)?

P(POS|D) = P(POS and D)/P(D)
= 0.009/0.010 = 0.90

P(POS) = 0.257

Since 0.90 is not equal to 0.257, the events POS and D are
dependent
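A Python sketch of two of the three checks, using the joint probabilities from the table (the dictionary layout and the key "Dc" for the complement of D are illustrative). The cells are rounded to three decimals, so their sum for POS comes out as 0.256 rather than the 0.257 obtained from the raw counts; the conclusion does not depend on that difference:

```python
# Joint probabilities from the table, rounded to three decimals
joint = {
    ("D", "POS"): 0.009, ("D", "NEG"): 0.001,
    ("Dc", "POS"): 0.247, ("Dc", "NEG"): 0.742,
}

p_d = joint[("D", "POS")] + joint[("D", "NEG")]     # P(D)
p_pos = joint[("D", "POS")] + joint[("Dc", "POS")]  # P(POS), summed from rounded cells
p_pos_given_d = joint[("D", "POS")] / p_d           # P(POS | D)

# Check 1: is P(POS | D) equal to P(POS)?
print(round(p_pos_given_d, 2), round(p_pos, 3))

# Check 2: is P(POS and D) equal to P(POS) x P(D)?
print(joint[("D", "POS")], round(p_pos * p_d, 4))
```

Both comparisons show a clear mismatch, so either check alone is enough to call the events dependent.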