Probability

Chapter 5
Modeling Variation with Probability
Calculated Risks...
Our lives are full of false-positives. When the smoke
alarm goes off because you've burned something on
the stove, that's a false-positive. It's positive because
an alarm goes off alerting you to a danger. It's false
because your house is not actually burning down.
After September 11th, the Transportation Security
Administration (TSA) was established and charged with
installing a screening system for airports that would
detect weapons and bombs on individuals or in
baggage. Since January 1, 2003, TSA has been
screening all checked luggage.
Calculated Risks...
The machines for checking baggage are costly, over $1
million per machine. Unfortunately, the technology is
not perfect. Shampoo, for example, which has the
same density as certain explosives, can be mistaken
for explosives and generate a false-positive.
Other items that produce false-positives are certain
food items (like cheese or chocolate), books,
deodorant sticks, and toothpaste. The machines also
flag luggage that has items the scanner can't see
through, such as laptops, camera equipment, and cell
phones. TSA screeners will hand-search bags that
register a positive reading.
Calculated Risks...
• In the context of the problem, what are these,
and which of the following is most serious?
• True negative
• True positive
• False negative
• False positive
Calculated Risks...
• True negative: machine says no explosives/weapons &
there really aren’t any explosives/weapons
• True positive: machine says explosives/weapons &
there really are explosives/weapons
• False negative: machine says no explosives/weapons &
there really are explosives/weapons
• False positive: machine says explosives/weapons &
there really aren’t any explosives/weapons
Calculated Risks....
Which is most serious in the context of this situation?
• True negative: machine says no explosives/weapons &
there really aren’t any explosives/weapons
• True positive: machine says explosives/weapons & there
really are explosives/weapons
• False negative: machine says no explosives/weapons &
there really are explosives/weapons
• False positive: machine says explosives/weapons & there
really aren’t any explosives/weapons
Probability...
• Probability calculations are the basis for
inference (making decisions about a
population based on a sample).
• What we learn in this chapter will help us
describe statistics from random samples &
randomized comparative experiments later in
the course.
1-in-6 game…
As a special promotion for its 20-ounce bottles of soda, a soft
drink company printed a message on the inside of each bottle
cap. Some of the caps said, “Please try again!” while others
said, “You’re a winner!” The company advertised the
promotion with the slogan “1 in 6 wins a prize.” The prize is a
free 20-ounce bottle of soda, which comes out of the store
owner’s profits.
Seven friends each buy one 20-ounce bottle at a local
convenience store. The store clerk is surprised when three of
them win a prize. The store owner is concerned about losing
money from giving away too many free sodas. She wonders if
this group of friends is just lucky or if the company’s 1-in-6
claim is inaccurate. In this Activity, you and your classmates
will perform a simulation to help answer this question.
1-in-6 game…
For now, let’s assume that the company is telling
the truth, and that every 20-ounce bottle of
soda it fills has a 1-in-6 chance of getting a cap
that says, “You’re a winner!” We can model the
status of an individual bottle with a six-sided
die: let 1 through 5 represent “Please try again!”
and 6 represent “You’re a winner!”
1-in-6 game…
1. Roll your die seven times to imitate the process of the
seven friends buying their sodas. How many of them won
a prize? Repeat 3 times.
2. Write your three results on the board. Using Minitab,
input the data and create a dot plot displaying the
number of prize winners you got in Step 1 on the graph.
3. What percent of the time did the friends come away
with three or more prizes, just by chance? Does it seem
plausible that the company is telling the truth, but that
the seven friends just got lucky? Explain.
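The die-rolling activity above can also be run many times by computer. The following is a minimal sketch, assuming the company's 1-in-6 claim is true; the function names, the seed, and the number of repetitions are illustrative choices, not part of the original activity.

```python
import random

def count_winners(rng, n_bottles=7):
    """Simulate one group of friends: each bottle is one roll of a fair die,
    and a 6 stands for "You're a winner!"."""
    return sum(1 for _ in range(n_bottles) if rng.randint(1, 6) == 6)

def estimate_three_or_more(n_reps=50_000, seed=1):
    """Estimate the chance that 3 or more of the 7 friends win a prize."""
    rng = random.Random(seed)  # fixed seed (an assumption) so runs are repeatable
    hits = sum(1 for _ in range(n_reps) if count_winners(rng) >= 3)
    return hits / n_reps

estimate = estimate_three_or_more()
print(f"Estimated P(3 or more winners) = {estimate:.3f}")
```

With many repetitions the estimate should settle near the exact binomial value of about 0.096, so three winners out of seven is unusual but not impossible if the claim is true.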
Whose book is this?
Suppose that four friends (including Ariana Grande) get together to
study at a doughnut shop for their next test in high school statistics.
When they leave their table to go get a doughnut, the doughnut shop
owner decides to mess with them (you know… because of Ariana’s
recent doughnut scandal) and makes a tower using their textbooks.
Unfortunately, none of the students wrote their name in their book, so
when they leave the doughnut shop, each student takes one of the
books at random.
When the students return the books at the end of the year and the
clerk scans their barcodes, the students are surprised to learn that
none of the four had their own book. How likely is it that none of the
four students ended up with the correct book? … simulation time!
On four equally-sized slips of paper, write “Student 1,” “Student 2,” “Student
3,” and “Student 4.” Likewise, on four equally-sized slips of paper, write
“Book 1,” “Book 2,” “Book 3,” and “Book 4.”
Place the four papers with the student numbers on your desk. Then shuffle
the papers with book numbers and randomly place one paper on each
“student.” If the book number matches the student number, this represents a
student choosing their own book from the tower of textbooks.
Count the number of students who get the correct book. Repeat this process
three times. Then write your results on the board. Input the data and create
a dot plot in Minitab.
How likely is it for none of the students to end up with their own book?
What if we were to do this entire simulation again? Would you expect to get
exactly the same results? Why or why not?
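The slips-of-paper simulation can be sketched in code as follows. This is one possible translation, assuming each student is equally likely to take any remaining book; the function names and the repetition count are hypothetical.

```python
import random
from itertools import permutations

def none_correct(rng, n_students=4):
    """Shuffle the books once; True if no student receives their own book."""
    books = list(range(n_students))
    rng.shuffle(books)
    return all(book != student for student, book in enumerate(books))

def estimate_no_matches(n_reps=50_000, seed=1):
    """Estimate P(no student gets the correct book) by repeated shuffling."""
    rng = random.Random(seed)
    return sum(none_correct(rng) for _ in range(n_reps)) / n_reps

# With only 4 books, every possible ordering can also be listed directly.
orderings = list(permutations(range(4)))
no_match_count = sum(
    all(book != student for student, book in enumerate(order))
    for order in orderings
)

print(f"Simulated: {estimate_no_matches():.3f}")
print(f"Exact: {no_match_count}/{len(orderings)}")
```

Listing all 24 orderings shows that 9 of them give nobody the correct book, so the simulated proportion should be close to 9/24 = 0.375. A different seed gives slightly different simulated results, which is the point of the last question above.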
Investigating Randomness… & More
Simulation
• Pretend that you are flipping a fair coin. Without actually
flipping a coin, imagine the first toss. Write down the
result you see in your mind, heads (H) or tails (T), below.
• Imagine a second coin flip. Write down the result below.
• Keep doing this until you have recorded the results of 25
imaginary flips. Write all 25 of your results in groups of 5
to make them easier to read, like this: HTHTH TTHHT, etc.
Investigating Randomness… & More
Simulation…
• A run is a repetition of the same result. In the
previous example, there is a run of two tails
followed by a run of two heads in the first 10 coin
flips. Read through your 25 imagined coin flips
that you wrote above and find the longest run
(doesn’t matter if it was heads or tails; just your
longest run).
• On the board, write the length of the longest run
you wrote (within your 25 values). Input into
Minitab and create a dot plot of the class's data.
Investigating Randomness… & More
Simulation
• Now, use a random digits table, technology, or
a coin to generate a similar list of 25 coin flips.
Find the longest run that you have.
• Now let's create another dot plot with this
new data from the class. Plot the length of
the longest run you got above.
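If "technology" is used to generate the flips, the longest run can be found automatically. A minimal sketch, with hypothetical function names and an arbitrary seed:

```python
import random

def longest_run(flips):
    """Length of the longest streak of identical consecutive results."""
    if not flips:
        return 0
    best = current = 1
    for prev, cur in zip(flips, flips[1:]):
        current = current + 1 if cur == prev else 1
        best = max(best, current)
    return best

def simulate_longest_runs(n_reps=10_000, n_flips=25, seed=1):
    """Generate many sets of 25 fair coin flips and record each longest run."""
    rng = random.Random(seed)
    return [
        longest_run([rng.choice("HT") for _ in range(n_flips)])
        for _ in range(n_reps)
    ]

runs = simulate_longest_runs()
print("Example from the slide:", longest_run("HTHTHTTHHT"))
print("Average longest run in 25 random flips:", sum(runs) / len(runs))
```

The list `runs` plays the role of the class's second dot plot and can be compared with the longest runs from the imagined flips.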
Randomness…
• The idea of probability is that randomness is
predictable in the long run. Unfortunately, our
intuition about randomness tries to tell us that
random phenomena should also be predictable
in the short run.
• Probability Applet (www.whfreeman.com/tps5e)
Random Phenomenon...
We call an event ‘random’ if individual outcomes
are uncertain but there is nonetheless a
regular distribution of outcomes in a large
number of repetitions.
Big Idea
Chance behavior (random phenomenon) is unpredictable
in the short run, but has a regular and predictable
pattern in the long run.
Individual outcomes are uncertain; but a regular
distribution of outcomes emerges in a large number of
repetitions.
The probability of any outcome of a random phenomenon is
the proportion of times the outcome would occur in a
very long series of repetitions. Probability is a long-term relative frequency (simulations are very helpful).
Probability vs. Odds
• Probability = successes / total
• Odds = successes / failures
Careful...
• It makes no sense to discuss the probability of an
event that has already occurred.
• Meaningless to ask what the probability is of an
already-flipped coin being a tail. It’s already been
decided.
• Probability: future event
• Statistics: past event
Definition: Simulation is...
• the imitation of chance behavior, based on a
model that accurately reflects the
phenomenon under consideration.
• Examples include...
Simulation...
• Why would we want to simulate a situation
(rather than carry the event out in reality)?
• Discuss with a partner for one minute.
Simulation… model must
match situation...
• What model could we use to simulate the
probability of a soon-to-be new-born baby
being a girl or a boy?
Simulation...
What couldn’t be used as a model to simulate
this situation?
Discuss for one minute.
Simulation...
• ... can be an effective tool/method for finding
the likelihood of complex results IF you have a
trustworthy model.
• If not (if model does not correctly describe the
random phenomenon), probabilities derived
from model will also be incorrect/worthless.
Simulation Steps...
• State. Ask a question of interest about some
chance process
• Plan. Describe how to use a chance device to
imitate one repetition of the process. Tell
what you will record at the end of each
repetition
• Do. Perform many repetitions of the
simulation
• Conclude. Use the results of your simulation
to answer the question of interest
• Do following simulation if time permits
Simulation: Should I guess?
State – Plan – Do - Conclude
A multiple-choice test is scored as follows: For each
question you answer correctly, you get 4 points. For
each question you answer incorrectly, you lose 1 point.
For simplicity, suppose that there are 10 multiple-choice questions with four choices for each question.
Suppose Mr. Deming doesn’t know the answers to any of
the questions, and he guesses on each one. Use
simulation methods to determine Mr. Deming’s
expected score.
Should I guess?
(let's use SPDC)
State – Plan – Do - Conclude
• Probability of guessing correctly is 0.25. Let digits 00 to 24
correspond to a correct solution & digits 25 to 99
correspond to incorrect solution.
• Teams of two; Simulate 2 or 3 trials; write your results on
board; class will create a graphical representation of all our
results
• State conclusion.
• Expected (theoretical) score is 2.5
Should I guess?
Note: What if there were five choices, so that a guess
has a probability of 0.20 of being correct?
On average, students who guess at all ten questions
would get two correct for a score of 2 x 4 = 8.
But they miss, on average, eight of the ten
questions, so they lose 8 points for these wrong
answers.
Final score: 8 - 8 = 0
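Both guessing scenarios (four choices and five choices) can be checked with a short calculation and a simulation. This sketch assumes the scoring rule stated earlier (+4 for a correct answer, -1 for an incorrect one); the function and parameter names are illustrative.

```python
import random

def expected_score(n_questions=10, n_choices=4, right=4, wrong=-1):
    """Theoretical average score when every answer is a pure guess."""
    p = 1 / n_choices
    return n_questions * (p * right + (1 - p) * wrong)

def simulate_score(rng, n_questions=10, n_choices=4, right=4, wrong=-1):
    """One simulated test: randrange(n_choices) == 0 stands for a correct guess."""
    score = 0
    for _ in range(n_questions):
        score += right if rng.randrange(n_choices) == 0 else wrong
    return score

def average_simulated_score(n_reps=20_000, seed=1, **kwargs):
    """Average score over many simulated tests."""
    rng = random.Random(seed)
    return sum(simulate_score(rng, **kwargs) for _ in range(n_reps)) / n_reps

print("Four choices, theoretical:", expected_score(n_choices=4))
print("Four choices, simulated:  ", average_simulated_score(n_choices=4))
print("Five choices, theoretical:", expected_score(n_choices=5))
```

With four choices the theoretical value is 10 x (0.25 x 4 - 0.75 x 1) = 2.5, and with five choices the gains and losses cancel, matching the final score of 0 above.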
Models …
• ŷ = a + bx, simulations, etc.
• Basis for all probability models:
• Sample space: list of all possible outcomes; can
be very simple or very complex
• Event: a subset of sample space
Probability Models
• Accurately counting outcomes is critical in probability
• Example: all possibilities when rolling 1 red die and 1
green die
(1r, 1g), (1r, 2g), (1r, 3g), etc.
or
(1g, 1r), (1g, 2r), (1g, 3r), etc.
• Tossing a penny & a quarter
(hp, hq), (hp, tq), (tp, hq), (tp, tq)
and
(hq, hp), (hq, tp), (tq, hp), (tq, tp)
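Listing a sample space by hand is error-prone, so it can help to generate it. A small sketch using the same labels as the examples above (the variable names are assumptions):

```python
from itertools import product

faces = range(1, 7)

# Every (red die, green die) pair, labelled as in the slide: (1r, 1g), (1r, 2g), ...
dice_space = [(f"{r}r", f"{g}g") for r, g in product(faces, faces)]

# Every (penny, quarter) pair: (hp, hq), (hp, tq), (tp, hq), (tp, tq)
coin_space = [(f"{p}p", f"{q}q") for p, q in product("ht", "ht")]

print(len(dice_space), "outcomes for two dice")
print(coin_space)
```

Swapping the order within each pair (green first, or quarter first) relabels the outcomes but does not change how many there are.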
Probability
• Tree diagrams are sometimes a helpful tool
• Good graphical technique for listing the entire sample
space when it is relatively small (not if the sample
space has, say, 2^10 outcomes)
• Diagram: Flip a coin then roll a die
or
• Diagram: Roll a die then flip a coin
Sampling with/without Replacement
• Without Replacement: Choosing a card from a
deck; keeping that card, then choosing another
card
• These are not independent events; caution
• With Replacement: Choosing a card from a deck;
putting that card back into the deck, then
choosing another card
• Sometimes not possible; but a good general
practice
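To see how replacement changes a probability, consider drawing two aces in a row from a standard 52-card deck. This is an illustrative example (not from the slides), sketched with exact fractions:

```python
from fractions import Fraction

def p_two_aces(with_replacement):
    """Probability that two cards drawn in a row are both aces."""
    first = Fraction(4, 52)
    # Without replacement, one ace and one card are gone for the second draw.
    second = Fraction(4, 52) if with_replacement else Fraction(3, 51)
    return first * second

print("With replacement:   ", p_two_aces(True))
print("Without replacement:", p_two_aces(False))
```

With replacement the second draw has the same probability as the first (1/13 x 1/13 = 1/169); without replacement it changes (1/13 x 1/17 = 1/221), which is why those draws are not independent.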
Probability Rules ...
1. All probabilities are values between 0 & 1
(remember density curves?)
Consider event A:
0 ≤ P(A) ≤ 1
2. Sum of probabilities of all outcomes = 1
S = sample space
P(S) = 1
Momentary detour...
Examples of disjoint/mutually exclusive events
include:
• miss a bus; catch a bus
• play chess; sleep
• turn left; turn right
• sit down; stand up
Non-examples of disjoint/mutually exclusive
events include:
• listen to music; do homework
• sleep; dream
Mutually Exclusive/Disjoint events
are...
• Events that cannot happen simultaneously
• Other examples of mutually exclusive/disjoint
events?
Another brief detour… “union”
* The union of any collection of events is the
event that at least one of the events in the collection occurs.
* Symbol “U”
* P(A or B or C) = P(A U B U C)
Back to the Probability Rules ...
3. If 2 events have no outcomes in common
(disjoint/mutually exclusive) then the probability
of one or the other occurring is the sum of their
individual probabilities.
P(A or B) = P(A) + P(B)
(Addition Rule for Disjoint Events)
Example: P (rolling a 2 or rolling an odd)
Non-example: P (rolling a 4 or rolling an even)
…more examples
P (A or B or C) = P(A U B U C)
= P (A) + P (B) + P (C) only if events are disjoint
A: freshman
B: sophomore
C: junior
D: senior
P(A) = 0.30
P(B) = 0.35
P(C) = 0.20
P(D) = 0.15
All disjoint events.
P(B U C) =
P(A U D) =
P (A U B U C U D) =
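Because the four class years are disjoint, each union above is just a sum. A minimal sketch using the probabilities given (the dictionary and function names are assumptions; fractions avoid rounding noise):

```python
from fractions import Fraction

p = {
    "freshman": Fraction(30, 100),
    "sophomore": Fraction(35, 100),
    "junior": Fraction(20, 100),
    "senior": Fraction(15, 100),
}

def p_union(*events):
    """Addition rule for disjoint events: add the individual probabilities.
    Only valid because no student can be in two class years at once."""
    return sum(p[e] for e in events)

print("P(B U C) =", float(p_union("sophomore", "junior")))
print("P(A U D) =", float(p_union("freshman", "senior")))
print("P(A U B U C U D) =", float(p_union(*p)))
```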
Probability Rules ...
4. Probability that an event does not occur is one minus
the probability that the event will occur (complement
rule)
P(A^c) = 1 - P(A)
Example: P (person has brown hair) = 0.53
So, P (person does not have brown hair) = 1 – 0.53 = 0.47
What would A U A^c = ?
What would A ∩ A^c = ?
Probability Rules ....
... one more probability rule later... Stay tuned ...
Probability Rules Practice
Distance learning courses are rapidly gaining popularity among
college students. The probability of any age group is just the
proportion of all distance learners in that age group. Here is the
probability model:
Are rules 1 & 2 satisfied above?
Are the above groups mutually exclusive events? Why or why not?
P (18-23 yr & 30-39 yr) =
P (not being in 18-23 yr category) =
P (24-29 yr or 30-39 yr) =
Caution...
Be careful to apply the addition rule only to
disjoint/mutually exclusive events
P (queen or heart) =
4/52 + 13/52 (??) .... not disjoint... this
probability rule would not be correct in this
case
More on this later…
Review/Preview...
Mutually Exclusive/Disjoint
• sleeping; playing chess
• walking; riding a bike
Overlapping Events (not mutually exclusive)
• roll an even; roll a prime
• select 12th grader; select athlete
• choose hard-cover book; choose fiction
What if ...
• What if events are not disjoint/mutually
exclusive? i.e., they can occur simultaneously
(overlapping events)
• How do we calculate P(A or B)?
Data Collection Time…
Which do you use/do most?
              | Emailing | Social Media | Total
Cell Phone    |          |              |
iPad/Tablet   |          |              |
Computer      |          |              |
Total         |          |              |
General Addition Rule (disjoint or overlapping)
P (A or B) = P (A) + P (B) – P (A and B)
P (A U B) = P (A) + P (B) – P (A∩ B)
Pierced ears, anyone?
Find the probability that a given student:
• has pierced ears
• is a male
• is male and has pierced ears
• is male or has pierced ears
Moral of the story?
Be careful to apply the addition rule for mutually
exclusive events only to disjoint/mutually
exclusive events
P (queen or heart) =
4/52 + 13/52 .... not disjoint... counted queen
of hearts twice
P (queen or heart) = 4/52 + 13/52 – 1/52
(think of a Venn diagram; overlap)
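The queen-or-heart calculation can be verified by counting cards directly. This sketch builds a standard 52-card deck (the rank and suit labels are illustrative) and compares the direct count with the general addition rule:

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]
deck = list(product(ranks, suits))  # 52 (rank, suit) pairs

def prob(event):
    """Probability of an event when one card is drawn at random."""
    return Fraction(sum(1 for card in deck if event(card)), len(deck))

def is_queen(card):
    return card[0] == "Q"

def is_heart(card):
    return card[1] == "hearts"

# Direct count of cards that are a queen, a heart, or both
p_direct = prob(lambda c: is_queen(c) or is_heart(c))

# General addition rule: subtract the overlap (the queen of hearts)
p_rule = prob(is_queen) + prob(is_heart) - prob(lambda c: is_queen(c) and is_heart(c))

print(p_direct, p_rule)
```

Both routes give 16/52 = 4/13; simply adding 4/52 + 13/52 would give 17/52 because the queen of hearts is counted twice.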
Venn Diagrams…
(a) Event A and its complement A^c
(b) A, B mutually exclusive/disjoint
Venn Diagrams
(a) Intersection of A & B (and)
(b) Union of A, B (or)
Conditional Probability...
Remember... Probability assigned to an event can
change if we know that some other event has
occurred (“given”)
Conditional Probability...
P (A | B) is read “the probability of A given B”
P (female) =
versus
P (female | 15-17 years) =
Conditional Probability... caution
P (male | 18-24 yr) =
P (18-24 yr | male) =
Formula…
To find the conditional probability P (A | B):
P (A | B) = P (A ∩ B) / P (B)
The conditional probability P (B | A) is given by
P (B | A) = P (B ∩ A) / P (A)
General Multiplication Rule for Any Two Events
The joint probability that events A and B both
happen is P (A ∩ B) = P (A) P (B|A)
A = female
B = 15-17 years
P (female and 15-17 yr) = 89/16,639
P (A ∩ B) = P (A) P (B|A)
= (9,321/16,639) x (89/9,321)
= 89/16,639 ✓
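The check above can be reproduced with exact fractions. This sketch uses only the three counts quoted in the example (the variable names are assumptions):

```python
from fractions import Fraction

total = 16_639          # everyone in the table
females = 9_321         # event A
females_15_17 = 89      # event A and B together

p_a = Fraction(females, total)
p_b_given_a = Fraction(females_15_17, females)

# General multiplication rule: P(A and B) = P(A) x P(B | A)
p_a_and_b = p_a * p_b_given_a

# Rearranging gives the conditional probability formula back
p_b_given_a_again = p_a_and_b / p_a

print(p_a_and_b, float(p_a_and_b))
```

The 9,321 cancels, leaving 89/16,639, which is the same value obtained by reading the joint count straight from the table.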
Tree diagram…
About 27% of adult Internet users are 18 to 29
years old, another 45% are 30 to 49 years
old, and the remaining 28% are 50 and over.
The Pew Internet and American Life Project finds
that 70% of Internet users aged 18 to 29 have
visited a video-sharing site, along with 51% of
those aged 30 to 49 and 26% of those 50 or
older.
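The slide does not state the question, but a tree diagram for this setting is typically used to find the overall chance that an adult Internet user has visited a video-sharing site, and then to reverse the condition. A sketch under that assumption, with hypothetical variable names:

```python
# First branches: age group of a randomly chosen adult Internet user
age_share = {"18-29": 0.27, "30-49": 0.45, "50+": 0.28}

# Second branches: P(visited a video-sharing site | age group)
visit_rate = {"18-29": 0.70, "30-49": 0.51, "50+": 0.26}

# Multiply along each branch, then add the branches that end in "visited"
joint = {group: age_share[group] * visit_rate[group] for group in age_share}
p_video = sum(joint.values())

# Reverse the condition: P(aged 18-29 | visited)
p_young_given_video = joint["18-29"] / p_video

print(f"P(visited) = {p_video:.4f}")
print(f"P(18-29 | visited) = {p_young_given_video:.4f}")
```

The three branch products are 0.189, 0.2295, and 0.0728, which add to 0.4913.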
Review/Preview ....
Two events A & B are independent if knowing that one occurs
does not change the probability that the other occurs.
Examples:
- Roll a die twice. What I roll the first time does not change
the probability of what I will roll the second time.
- Win at chess; win the lottery
- Student on debate team; student on swim team
So, if events A and B are independent, then P (A|B) = P(A) and
likewise P (B|A) = P (B).
A = {The person chosen is male}
B = {The person chosen is 25 – 34 years }
(a) Explain why P(A) = 0.4397.
(b) Find P(B).
(c) Are the events A and B independent?
a) P (A): 7317/16,639 = .4397
b) P (B): (3,494)/16,639 = .2100
c) P (A|B) must equal P (A) for events to be independent
(1589/3494) = .4547 ≠ .4397
so events A and B are not independent
Are these events independent?
Event A: Honors student
Event B: Basketball

                | Honors Student | Non-Honors Student | Total
Basketball      | 450            |                    | 1,800
Non-BB Player   |                |                    |
Total           | 1,500          |                    | 6,000

P (B) = 1800/6000 = 0.3
P (B|A) = 450/1500 = 0.3
P (A) = 1500/6000 = 0.25
P (A|B) = 450/1,800 = 0.25
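The independence check for the honors/basketball example can be written out with the four counts given. A minimal sketch (variable names are assumptions):

```python
from fractions import Fraction

total = 6_000
honors = 1_500       # event A
basketball = 1_800   # event B
both = 450           # honors students who play basketball

p_a = Fraction(honors, total)
p_b = Fraction(basketball, total)
p_a_and_b = Fraction(both, total)

p_b_given_a = p_a_and_b / p_a
p_a_given_b = p_a_and_b / p_b

# Three equivalent ways to state independence
independent = (p_b_given_a == p_b) and (p_a_given_b == p_a) and (p_a_and_b == p_a * p_b)

print(float(p_b), float(p_b_given_a), float(p_a), float(p_a_given_b), independent)
```

Knowing that a student is an honors student leaves the probability of playing basketball at 0.3, so the two events are independent in this table.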
So remember… Independent Events
Two events A and B that both have positive
probability are independent if
P (B|A) = P (B)
or
P (A|B) = P (A)
Last Probability Rule ....
If events A & B are independent, then
P (A & B) = P (A) P(B)
(this is the multiplication rule for independent events)
Example: Consider the following probabilities.
P (student has 4.0 GPA) = 0.15
P (student misses bus) = 0.30
If these two events are independent, then P (4.0 GPA &
missing bus) = (0.15)(0.30) = 0.045
Caution...
P (heart & 3), drawing without replacement: the draws are not
independent; knowing the outcome of the first pick changes
the probabilities for the second pick
Independent is not mutually exclusive/disjoint.
Mutually exclusive/disjoint is not independent.
(remember... mutually exclusive/disjoint events can’t
happen at same time; independent events can)
Free stuff...
• If events A and B are independent, then:
– their complements, A^c and B^c, are also independent
– A^c and B are also independent
– A and B^c are also independent
• Also extends to collections of more than 2 events,
i.e., independence of events A, B, & C means that
no information about any one or any two can
change the probability of the remaining events
General Probability Practice...
An automobile manufacturer buys computer
chips from a supplier. The supplier sends a
shipment containing 5% defective chips. Each
chip chosen from this shipment has
probability 0.05 of being defective, and each
automobile uses 12 chips selected
independently. What is the probability that all
12 chips in a car will work properly?
Answer ...
The probability that all 12 chips in the car will
work is (1 – 0.05)^12 = (0.95)^12 ≈ 0.540.
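The arithmetic is quick to confirm. This sketch assumes, as the problem states, that the 12 chips are selected independently, so the multiplication rule for independent events applies:

```python
p_defective = 0.05
n_chips = 12

# Each chip works with probability 0.95; independence lets us multiply.
p_all_work = (1 - p_defective) ** n_chips

# Complement rule: at least one defective chip in the car
p_at_least_one_defective = 1 - p_all_work

print(f"P(all 12 work) = {p_all_work:.3f}")
print(f"P(at least one defective) = {p_at_least_one_defective:.3f}")
```

The power evaluates to about 0.540, so nearly half of all cars would be expected to contain at least one defective chip.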
Draw a venn diagram for...
• Mutually exclusive/disjoint events
• Independent events
• Dependent events
• Overlapping events
Case Closed...
• True negative: machine says no explosives/weapons &
there really aren’t any explosives/weapons
• True positive: machine says explosives/weapons &
there really are explosives/weapons
• False negative: machine says no explosives/weapons &
there really are explosives/weapons
• False positive: machine says explosives/weapons &
there really aren’t any explosives/weapons
Case Closed Questions...
• It is said that the occurrence of false-positives
in airport screenings has been about 30%.
What does that mean?
• The probability that the alarm will sound
(incorrectly) when scanning luggage that does
not contain explosives, guns, or knives is 0.3
Case-Closed Questions...
• In an FAA test, 40% of explosives planted by
government agents made it through security
checkpoints; and the occurrence of false-positives in
airport screenings has been about 30%.
• Assume that on average 1 suitcase in 10,000 has a
bomb in it. Construct a tree diagram to help you find
the probability that a suitcase with a bomb would be
detected. What’s the probability that a piece of
luggage that has a bomb in it would escape detection?
Case Closed...
Luggage
  Bomb (1/10,000)
    Detected (0.60)
    Not Detected (0.40)
  No Bomb (9,999/10,000)
    Detected (0.30)
    Not Detected (0.70)
Case Closed...
• Find the probability that no alarm is sounded
for a suitcase that has no bomb.
• Answer: (9,999/10,000) x (0.70) = 0.69993
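All four branches of the luggage tree can be computed the same way. This sketch uses the figures from the slides (1 bomb per 10,000 suitcases, 60% detection, 30% false-positive rate); the final step, the chance that a flagged suitcase really holds a bomb, is an added illustration rather than a question from the slides.

```python
p_bomb = 1 / 10_000
p_detect_given_bomb = 0.60      # 40% of planted explosives got through
p_alarm_given_no_bomb = 0.30    # false-positive rate

# Multiply along each branch of the tree
joint = {
    ("bomb", "alarm"): p_bomb * p_detect_given_bomb,
    ("bomb", "no alarm"): p_bomb * (1 - p_detect_given_bomb),
    ("no bomb", "alarm"): (1 - p_bomb) * p_alarm_given_no_bomb,
    ("no bomb", "no alarm"): (1 - p_bomb) * (1 - p_alarm_given_no_bomb),
}

# Of all suitcases that set off the alarm, what fraction contain a bomb?
p_alarm = joint[("bomb", "alarm")] + joint[("no bomb", "alarm")]
p_bomb_given_alarm = joint[("bomb", "alarm")] / p_alarm

for branch, value in joint.items():
    print(branch, round(value, 5))
print(f"P(bomb | alarm) = {p_bomb_given_alarm:.6f}")
```

Under these assumptions only about 2 in 10,000 alarms correspond to a real bomb, which is why hand searches of flagged bags so rarely find anything.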
Let’s Make a Deal!
• Go to New York Times website and do
simulation
• Discuss which strategy is best.
• Textbook Page 228
• Show clip of game show
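For classes without access to the online simulation, the game can be sketched in a few lines. This assumes the usual three-door rules (the host always opens a losing door that the contestant did not pick, then offers a switch), which the slides do not spell out; the function names are hypothetical.

```python
import random

def play(rng, switch):
    """Play one round; return True if the contestant wins the prize."""
    doors = [0, 1, 2]
    prize = rng.choice(doors)
    pick = rng.choice(doors)
    # Host opens a door that is neither the contestant's pick nor the prize
    opened = rng.choice([d for d in doors if d != pick and d != prize])
    if switch:
        # Move to the one remaining unopened door
        pick = next(d for d in doors if d != pick and d != opened)
    return pick == prize

def win_rate(switch, n_reps=30_000, seed=1):
    """Proportion of wins over many rounds for a given strategy."""
    rng = random.Random(seed)
    return sum(play(rng, switch) for _ in range(n_reps)) / n_reps

print(f"Stay:   {win_rate(switch=False):.3f}")
print(f"Switch: {win_rate(switch=True):.3f}")
```

Under these rules, staying wins about 1/3 of the time and switching about 2/3, which gives the class a concrete result to discuss.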