6.3 Binomial and Geometric Random Variables

Objectives
Students will be able to (SWBAT):
• DETERMINE whether the conditions for using a binomial random variable are met.
• COMPUTE and INTERPRET probabilities involving binomial distributions.
• CALCULATE the mean and standard deviation of a binomial random variable. INTERPRET these
values in context.
• FIND probabilities involving geometric random variables.
What are the conditions for a binomial setting?
A binomial setting arises when we perform several independent trials of the same chance
process and record the number of times that a particular outcome occurs. The four conditions for a
binomial setting are:
• Binary? The possible outcomes of each trial can be classified as “success” or “failure.”
• Independent? Trials must be independent; that is, knowing the result of one trial must not tell us
anything about the result of any other trial.
• Number? The number of trials n of the chance process must be fixed in advance.
• Success? There is the same probability p of success on each trial.
What is a binomial random variable? What are the possible values of a binomial
random variable?
• When the same chance process is repeated several times, we are often interested in
whether a particular outcome does or doesn’t happen on each repetition. Some random
variables count the number of times the outcome of interest occurs in a fixed number of
repetitions. They are called binomial random variables.
• Some examples of binomial random variables:
• Toss a coin 5 times. Count the number of heads.
• Spin a roulette wheel 8 times. Record how many times the ball lands in a red slot.
• Take a random sample of 100 cats. Count the number of females.
• Binomial variables are discrete!
What are the parameters of a binomial distribution?
• Consider tossing a coin n times. Each toss gives either heads or tails. Knowing the
outcome of one toss does not change the probability of an outcome on any other
toss.
• If we define heads as a success, then p is the probability of a head and is 0.5 on
any toss.
• The number of heads in n tosses is a binomial random variable X. The probability
distribution of X is called a binomial distribution.
The count X of successes in a binomial setting is a binomial random variable.
The probability distribution of X is a binomial distribution with parameters n and
p, where n is the number of trials of the chance process and p is the probability of
a success on any one trial. The possible values of X are the whole numbers from 0
to n.
What is the most common mistake students make on binomial distribution
questions?
• On many questions involving binomial settings, students do not recognize
that using the binomial distribution is appropriate. In fact, free response
questions about the binomial distribution are often among the lowest scoring
questions on the exam.
• You need to spend time making sure you are able to identify a binomial
distribution. If you aren’t sure how to answer a probability question, check if
you are working with a binomial setting.
Dice, Cars, and Hoops
Determine whether the random variables below have a binomial distribution. Justify
your answer.
a) Roll a fair die 10 times and let X = the number of sixes.
Binary?
Yes, success = six, failure = not a six.
Independent?
Yes, knowing the outcomes of past rolls doesn’t provide additional information
about the outcomes of future rolls.
Number?
Yes, there are 10 trials.
Success?
Yes, the probability of success is always 1/6.
This is a binomial setting. X is binomial with n = 10 and p = 1/6.
b) Shoot a basketball 20 times from various distances on the court. Let Y = number of
shots made.
Binary?
Yes, success = make the shot, failure = miss the shot.
Independent?
Yes, evidence suggests it is reasonable to assume that making a shot doesn’t
change the probability of making the next shot.
Number?
Yes, there are 20 trials.
Success?
No, the probability of success changes because the shots are taken from various
distances.
This is not a binomial setting. Y is not binomial.
c) Observe the next 100 cars that go by and let C = color.
Binary?
No. There are more than two possible colors. Also, C is not even a random
variable since the outcomes aren’t numerical.
Rolling Sixes: In many games involving dice, rolling a 6 is desirable. The
probability of rolling a six when rolling a fair die is 1/6. If X = the number of
sixes in 4 rolls of a fair die, then X is binomial with n = 4 and p =1/6.
What is P(X = 0)? That is, what is the probability that all 4 rolls are not sixes?
What is P(X = 1)?
P(X = 2)?
P(X = 3)?
P(X = 4)?
In general, how can we calculate binomial probabilities? Is the formula on the
formula sheet?
$P(X = k) = P(\text{exactly } k \text{ successes in } n \text{ trials})$
$= (\text{number of arrangements}) \times p^k (1 - p)^{n-k}$
Binomial Probability
If X has the binomial distribution with n trials and probability p of success
on each trial, the possible values of X are 0, 1, 2, …, n. If k is any one of
these values,
$$P(X = k) = \binom{n}{k} p^k (1 - p)^{n-k}$$
• $\binom{n}{k}$ is the binomial coefficient: the number of arrangements of k successes, that is, the number of ways that exactly k successes can occur in a set of n trials.
• $p^k$ is the probability of k successes.
• $(1 - p)^{n-k}$ is the probability of n − k failures.
There is a variant of this on the formula sheet.
• Let’s go back to the die example and look at how we can use this formula
when X = 3. Reminder, this probability was 0.015.
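$$P(X = 3) = \binom{4}{3} \left(\tfrac{1}{6}\right)^3 \left(\tfrac{5}{6}\right)^1 = 4 \cdot \tfrac{1}{216} \cdot \tfrac{5}{6} \approx 0.015$$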
The binompdf function calculates these probabilities for us. To locate,
go to 2nd, DISTR, A: binompdf.
binompdf(n, p, k) computes P(X = k)
n is the number of trials, p is the probability of success, and k is the x
value (the number of successes).
In the previous example, we saw that the probability of rolling exactly
three 6’s was 0.015.
On the calculator, we would enter:
binompdf(trials: 4, p: 1/6, x value: 3)
We would get 0.015.
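The same calculation can be sketched away from the calculator. The short Python example below applies the binomial probability formula directly to the rolling-sixes example. The function name binomial_pmf is a made-up helper for illustration, not a calculator command, and the code is only a sketch of the formula.

```python
from math import comb

def binomial_pmf(n, p, k):
    """P(X = k) for X binomial with n trials and success probability p.

    Hypothetical helper that mirrors binompdf(n, p, k).
    """
    # comb(n, k) counts the arrangements of k successes among n trials
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Rolling sixes: X = number of sixes in 4 rolls of a fair die
die_probs = [binomial_pmf(4, 1 / 6, k) for k in range(5)]
for k, prob in enumerate(die_probs):
    print(f"P(X = {k}) = {prob:.4f}")
```

The five probabilities add to 1, and the value for k = 3 rounds to the 0.015 found above.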
Let’s try another example.
At a certain intersection, the light for eastbound traffic is red for 15 seconds, yellow for
5 seconds, and green for 30 seconds. Find the probability that out of the next eight
eastbound cars that arrive randomly at the light, exactly three will be stopped by a red
light.
Let X = the number of stops at a red light
n = 8 and p = 15/50
(note: success is a red light, failure is any other color)
Technology:
binompdf(trials: 8, p: 15/50, x value: 3) = 0.254
Try this problem:
Each child of a particular set of parents has probability 0.25 of having type O blood.
Suppose the parents have 5 children. Find the probability that exactly 3 of the children
have type O blood.
Let X = the number of children with type O blood.
n = 5 and p =0.25
Technology:
binompdf(trials: 5, p: 0.25, x value: 3) = 0.088
• Let’s stick with the same problem and change the question.
Each child of a particular set of parents has probability 0.25 of having type O
blood. Suppose the parents have 5 children. Find the probability that the
parents have less than 2 children with type O blood.
Let X = the number of children with type O blood
This problem is asking us to find the probability that the parents have 0 or 1
children with type O blood. In other terms, it is asking to find 𝑃(𝑋 ≤ 1).
𝑃(𝑋 ≤ 1) = binompdf(trials: 5, p: 0.25, x value: 0) + binompdf(trials: 5, p: 0.25, x value: 1) = 0.6328
Each child of a particular set of parents has probability 0.25 of having type O
blood. Suppose the parents have 5 children. Find the probability that the
parents have more than 3 children with type O blood.
Let X = the number of children with type O blood
This problem is asking us to find the probability that the parents have 4 or 5
children with type O blood. In other terms, it is asking to find 𝑃(𝑋 ≥ 4), or
𝑃(𝑋 > 3).
𝑃(𝑋 ≥ 4) = binompdf(trials: 5, p: 0.25, x value: 4) + binompdf(trials: 5, p: 0.25, x value: 5) = 0.0156
It seems like there would be a more convenient way to do this…
When working with a binomial distribution, we won’t always find probabilities
for an exact X value. Sometimes we will find probabilities less than an X value
and sometimes we will find probabilities greater than an X value. These require
a different formula and different command on the calculator.
• The command for this situation is binomcdf, option B in the DISTR menu.
It is okay to use binompdf and binomcdf commands on the AP exam! However,
you need to keep a few things in mind:
Step 1: State the distribution and the values of interest. Specify a binomial distribution with the
number of trials n, success probability p, and the values of the variable clearly identified.
Step 2: Perform calculations—show your work!
Do one of the following:
(i) Use the binomial probability formula to find the desired probability; or
(ii) Use binompdf or binomcdf command and label each of the inputs.
Step 3: Answer the question.
Let’s revisit: Each child of a particular set of parents has probability 0.25 of having type O
blood. Suppose the parents have 5 children. Find the probability that the parents have less
than 2 children with type O blood. When we did this before, we got 0.6328.
𝑃(𝑋 ≤ 1) = binomcdf(trials: 5, p: 0.25, x value: 1) = 0.6328
Each child of a particular set of parents has probability 0.25 of having type O blood.
Suppose the parents have 5 children. Find the probability that the parents have more than 3
children with type O blood. We got 0.0156 before.
𝑃(𝑋 > 3) = 1 − 𝑃(𝑋 ≤ 3) = 1 − binomcdf(trials: 5, p: 0.25, x value: 3) = 0.0156
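To see what the cumulative command is doing, here is a small Python sketch that adds up the individual binomial probabilities. The names binomial_pmf and binomial_cdf are hypothetical helpers chosen for this illustration; they stand in for binompdf and binomcdf and are not calculator syntax.

```python
from math import comb

def binomial_pmf(n, p, k):
    """P(X = k) from the binomial probability formula."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def binomial_cdf(n, p, k):
    """P(X <= k): add up P(X = 0), P(X = 1), ..., P(X = k)."""
    return sum(binomial_pmf(n, p, i) for i in range(k + 1))

# Type O blood: n = 5 children, p = 0.25
n, p = 5, 0.25
fewer_than_2 = binomial_cdf(n, p, 1)      # P(X <= 1)
more_than_3 = 1 - binomial_cdf(n, p, 3)   # P(X > 3) = 1 - P(X <= 3)

print(round(fewer_than_2, 4))  # 0.6328
print(round(more_than_3, 4))   # 0.0156
```

Both results match the sums of binompdf values found earlier.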
• We need to be very careful when using the binomcdf command. It always computes
𝑃(𝑋 ≤ 𝑥), so we need to choose the x value we enter into the calculator
accordingly.
• If you are asked to find the probability less than or equal to some value, you are
fine, because that is how the command works. For example, if you wanted to
find the probability less than or equal to 3, or 𝑃(𝑋 ≤ 3), you would enter 3 as
the x value.
• If you wanted to find the probability less than a value, for example less than 7,
or 𝑃(𝑋 < 7), this is really the same thing as 𝑃(𝑋 ≤ 6), so you need to enter 6 as
the x value.
• If you want to find the probability greater than some value, for example 𝑃(𝑋 >
10), we need to enter 1 − 𝑃(𝑋 ≤ 10) into the calculator, so our x value is 10.
• If you want to find the probability greater than or equal to some value, for
example 𝑃(𝑋 ≥ 8), think about this as being the same as 𝑃(𝑋 > 7), so in the
calculator we need to enter 1 − 𝑃(𝑋 ≤ 7), meaning our x value is 7.
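The four cases above can be checked with a short Python sketch. The setting (n = 15 trials, p = 0.4) is a hypothetical one chosen only so that values such as 10 are possible, and binomial_pmf and binomial_cdf are made-up helper names that play the roles of binompdf and binomcdf.

```python
from math import comb

def binomial_pmf(n, p, k):
    """P(X = k) for a binomial random variable."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def binomial_cdf(n, p, k):
    """P(X <= k), the quantity the cumulative command reports."""
    return sum(binomial_pmf(n, p, i) for i in range(k + 1))

# Hypothetical setting chosen only for illustration
n, p = 15, 0.4

at_most_3 = binomial_cdf(n, p, 3)           # P(X <= 3): x value is 3
less_than_7 = binomial_cdf(n, p, 6)         # P(X < 7) = P(X <= 6): x value is 6
more_than_10 = 1 - binomial_cdf(n, p, 10)   # P(X > 10) = 1 - P(X <= 10): x value is 10
at_least_8 = 1 - binomial_cdf(n, p, 7)      # P(X >= 8) = 1 - P(X <= 7): x value is 7

# Cross-check the last case by adding P(X = 8) + ... + P(X = 15) directly
direct_at_least_8 = sum(binomial_pmf(n, p, k) for k in range(8, n + 1))
print(at_least_8, direct_at_least_8)
```

The two printed values agree up to rounding error, which confirms that 7 is the right x value for "at least 8."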
Roulette: In Roulette, 18 of the 38 spaces on the wheel are black. Suppose you
observe the next 10 spins of a roulette wheel.
a) What is the probability that exactly 4 of the spins land on black?
Let X = the number of times the ball lands in a black slot
There are 10 independent trials of the chance process, each with success
probability 18/38. So X has a binomial distribution with n =10 and p = 18/38.
To find P(X = 4): binompdf(trials: 10, p: 18/38, x value: 4) = 0.225
There is a 0.225 probability that exactly 4 of the 10 spins land on black.
Roulette: In Roulette, 18 of the 38 spaces on the wheel are black. Suppose you
observe the next 10 spins of a roulette wheel.
b) What is the probability that at least 8 of the spins land on black?
Let Y = the number of times the ball lands in a black slot
There are 10 independent trials of the chance process, each with success
probability 18/38. So Y has a binomial distribution with n =10 and p = 18/38.
1 - binomcdf(trials: 10, p: 18/38, x value: 7) = 1 – 0.9615 = 0.0385
There is a 0.0385 probability that at least 8 of the 10 spins land on black.
How can you calculate the mean and SD of a binomial distribution? Are these on
the formula sheet?
Mean and Standard Deviation of a Binomial Random Variable
If a count X has the binomial distribution with number of trials n and probability of success p,
the mean and standard deviation of X are
$\mu_X = np$
$\sigma_X = \sqrt{np(1 - p)}$
These are on the formula sheet.
Roulette: Let X = the number of the next 10 spins of a roulette wheel that land
on black.
a) Calculate and interpret the mean and standard deviation of X.
n = 10 and p = 18/38
$\mu_X = np = 10(18/38) = 4.7368$
If many individuals spun a roulette wheel 10 times each, the average number of spins landing on
black would be about 4.7368.
$\sigma_X = \sqrt{np(1 - p)} = \sqrt{10(18/38)(20/38)} = 1.5789$
If many individuals spun a roulette wheel 10 times each, the number of spins landing on black
would typically vary by about 1.5789 spins from the mean (4.7368).
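As a check on the shortcut formulas, the Python sketch below computes the mean and standard deviation for the roulette example two ways: with np and the square root of np(1 − p), and from the definition of the mean and variance of a discrete random variable. The variable names are illustrative only.

```python
from math import comb, sqrt

# Roulette: X = number of the next 10 spins that land on black
n, p = 10, 18 / 38

# Shortcut formulas for a binomial random variable
mean_formula = n * p
sd_formula = sqrt(n * p * (1 - p))

# Same quantities from the definition, using the full probability distribution
pmf = [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]
mean_def = sum(k * prob for k, prob in enumerate(pmf))
var_def = sum((k - mean_def) ** 2 * prob for k, prob in enumerate(pmf))
sd_def = sqrt(var_def)

print(round(mean_formula, 4), round(sd_formula, 4))  # 4.7368 1.5789
print(round(mean_def, 4), round(sd_def, 4))          # 4.7368 1.5789
```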
When is it okay to use a binomial distribution when sampling without
replacement? Why is this an issue?
• The binomial distributions are important in statistics when we wish to make inferences
about the proportion p of successes in a population.
• Almost all real-world sampling, such as taking an SRS from a population of interest, is done
without replacement. However, sampling without replacement leads to a violation of the
independence condition.
• When the population is much larger than the sample, a count of successes in an SRS of size n
has approximately the binomial distribution with n equal to the sample size and p equal to
the proportion of successes in the population.
10% Condition
When taking an SRS of size n from a population of size N, we can use a
binomial distribution to model the count of successes in the sample as
long as
$$n \le \frac{1}{10} N$$
• What this means is that the binomial distribution is a good approximation as long as
we don’t sample more than 10% of our population.
• Do not mistake this to mean that we want a small sample. This is almost never the
case. It just means that if we have a sample that is larger than 10% of the population,
we shouldn’t use the binomial distribution.
Almost everyone has one—a drawer that holds miscellaneous batteries of all sizes.
Suppose that your drawer contains 8 AAA batteries but only 6 of them are good. You
need to choose 4 for your graphing calculator. If you randomly select 4 batteries, what
is the probability that all 4 of them will work?
If we treat this as a binomial setting, the calculator gives binompdf(trials: 4, p: 6/8, x value: 4) = 0.3164.
The problem is that we are taking a sample of 4 batteries from a population of 8
batteries, so our sample size is 50% of our population, which violates the 10%
condition.
Therefore, a binomial distribution is not an appropriate distribution to use. Because the
batteries are chosen without replacement, the real probability here is:
(6/8)(5/7)(4/6)(3/5) = 0.2143
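The effect of the 10% condition can be seen numerically. The Python sketch below compares the binomial calculation with the exact without-replacement probability for the battery drawer, and then repeats the comparison for a much larger drawer. The large drawer (800 batteries, 600 good) is a hypothetical example added for illustration, and prob_all_good is a made-up helper name.

```python
from math import comb

def prob_all_good(good, total, draws):
    """Exact probability that all `draws` items chosen without replacement are good."""
    # ways to choose every item from the good ones / ways to choose from all items
    return comb(good, draws) / comb(total, draws)

# Battery drawer: 8 batteries, 6 good, choose 4 (sample is 50% of the population)
binomial_small = (6 / 8) ** 4
exact_small = prob_all_good(6, 8, 4)
print(round(binomial_small, 4), round(exact_small, 4))  # 0.3164 0.2143

# Hypothetical large drawer: 800 batteries, 600 good, choose 4 (sample is 0.5%)
binomial_large = (600 / 800) ** 4
exact_large = prob_all_good(600, 800, 4)
print(binomial_large, exact_large)
```

With the small drawer the two answers differ by about 0.10. With the large drawer, where the sample is far below 10% of the population, they nearly agree.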
• In a binomial setting, the number of trials n is fixed in advance, and
the binomial random variable X counts the number of successes. The
possible values of X are 0, 1, 2,…,n. In other situations, the goal is to
repeat a chance process until a success occurs.
• Roll a pair of dice until you get doubles
• In basketball, attempt a three-point shot until you make one.
• Keep placing a $1 bet on the number 7 in roulette until you win.
• These are all examples of a geometric setting.
• Although the number of trials isn’t fixed in advance (we don’t know how long
it will take until we achieve a success), the trials are independent and the
probability of success remains constant.
What are the conditions for a geometric setting?
• A geometric setting arises when we perform independent trials of the same
chance process and record the number of trials it takes to get one success.
On each trial, the probability p of success must be the same.
What is a geometric random variable? What are the possible values of a
geometric random variable?
What are the parameters of a geometric distribution?
• The number of trials Y that it takes to get a success in a geometric setting is a
geometric random variable. The probability distribution of Y is a geometric
distribution with parameter p, the probability of a success on any trial. The
possible values of Y are 1, 2, 3, . . . .
Like binomial random variables, it is important to be able to distinguish
situations in which the geometric distribution does and doesn’t apply!
Monopoly: In the board game Monopoly, one way to get out of jail is to roll
doubles. Suppose that a player has to stay in jail until he or she rolls doubles.
The probability of rolling doubles is 1/6.
a) Explain why this is a geometric setting.
The random variable of interest in this example is Y = number of attempts it
takes to roll doubles one time. Each attempt is one trial of the chance process.
Knowing the outcome of previous rolls does not tell us anything about future
rolls (trials are independent). On each roll, the probability of success is 1/6.
This is a geometric setting. Because Y counts the number of attempts it takes
to get doubles, it is a geometric random variable with parameter p = 1/6.
b) Define the geometric random variable and state its distribution.
Y = number of attempts it takes to roll doubles one time
Geometric distribution with parameter p = 1/6.
c) Find the probability it takes exactly three rolls to get out of jail.
Exactly four rolls.
100 rolls.
In general, how can you calculate geometric probabilities? Is this
formula on the formula sheet?
Geometric Probability Formula
If Y has the geometric distribution with probability p of success on each
trial, the possible values of Y are 1, 2, 3, … . If k is any one of these
values,
$$P(Y = k) = (1 - p)^{k-1} p$$
This is NOT on the formula sheet.
On calculator: geometpdf(p, k) computes P(Y = k).
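Here is a short Python sketch of the geometric probability formula applied to the Monopoly example with p = 1/6. The helper name geometric_pmf is made up for this illustration, and reading "100 rolls" as "exactly 100 rolls" is an assumption.

```python
def geometric_pmf(p, k):
    """P(Y = k): k - 1 failures in a row, then one success."""
    return (1 - p) ** (k - 1) * p

p = 1 / 6  # probability of rolling doubles

three_rolls = geometric_pmf(p, 3)   # (5/6)^2 (1/6)
four_rolls = geometric_pmf(p, 4)    # (5/6)^3 (1/6)
hundred_rolls = geometric_pmf(p, 100)  # assumed to mean exactly 100 rolls

print(round(three_rolls, 4))  # 0.1157
print(round(four_rolls, 4))   # 0.0965
print(hundred_rolls)          # extremely small
```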
On average, how many rolls should it take to escape jail in Monopoly?
What do you think?
The probability of rolling doubles is 1/6. So on average, how many rolls
should it take before you roll doubles?
6 rolls!
In general, how do you calculate the mean of a geometric distribution? Is the
formula on the formula sheet?
Mean (Expected Value) Of A Geometric Random Variable
If Y is a geometric random variable with probability p of success on each
trial, then its mean (expected value) is $E(Y) = \mu_Y = 1/p$.
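One way to see that 1/p is reasonable is to add up k · P(Y = k) over many values of k. The Python sketch below does this for p = 1/6. Stopping the sum at k = 1000 is an arbitrary cutoff chosen for illustration; the terms left out are negligibly small.

```python
p = 1 / 6  # probability of rolling doubles

# Mean of a discrete random variable: sum of (value) x (probability).
# The geometric distribution has infinitely many values, so we stop at k = 1000.
partial_mean = sum(k * (1 - p) ** (k - 1) * p for k in range(1, 1001))

print(round(partial_mean, 4))  # 6.0
print(1 / p)
```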
What is the probability it takes longer than average to escape jail? What does
this probability tell you about the shape of the distribution?
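We want to find P(Y > 6). However, there are an infinite number of possible
values for Y greater than 6. We need to use our complement rule:
P(Y > 6) = 1 − P(Y ≤ 6) = (5/6)^6 ≈ 0.3349, because taking more than 6 rolls
means the first 6 rolls all fail to produce doubles.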
The shape of the distribution is skewed right, as all geometric distributions
are. The probability is highest at Y = 1 and decreases steadily as Y increases,
producing a long tail to the right.
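The complement calculation and the shape of the distribution can both be checked with a few lines of Python. This is only a sketch for the Monopoly example with p = 1/6, and the variable names are illustrative.

```python
p = 1 / 6  # probability of rolling doubles

# P(Y = 1), ..., P(Y = 6) from the geometric probability formula
first_six = [(1 - p) ** (k - 1) * p for k in range(1, 7)]

# Complement rule: P(Y > 6) = 1 - P(Y <= 6)
p_longer_than_average = 1 - sum(first_six)
print(round(p_longer_than_average, 4))  # 0.3349

# Each probability is smaller than the one before it, which gives the right skew
print([round(prob, 4) for prob in first_six])
```

Only about a third of the time does it take longer than the mean of 6 rolls, which fits a right-skewed distribution where most of the probability sits at small values of Y.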