Chapter 17 Probability Models

math2200
O’Neal’s free throws
• Suppose Shaq makes 45.1% of his free throws on average
• Let X be the number of free throws Shaq
needs to shoot until he makes one
• Pr(X=2)=?
• Pr(X=5)=?
• E(X)=?
Bernoulli trials
• Only two possible outcomes
– Success or failure
• Probability of success, denoted by p, is the
same for every trial
• The trials are independent
• Examples
– tossing a coin
– Free throw in a basketball game
Independence
• Be careful when sampling without
replacement from a finite population
• Strictly speaking, these draws are not independent
• But if the population is large
enough, we can treat them as independent
– Rule of thumb: the sample size is smaller than
10% of the population
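A rough illustration of why the rule of thumb works: the sketch below compares the chance that the first two draws are both successes with and without replacement. The population sizes and the 50% share of successes are made-up values chosen only for illustration.

```python
from fractions import Fraction

def two_successes_without_replacement(successes, population):
    """P(first two draws are both successes), drawing without replacement."""
    first = Fraction(successes, population)
    second = Fraction(successes - 1, population - 1)
    return first * second

def two_successes_with_replacement(successes, population):
    """Same probability if the two draws were truly independent."""
    return Fraction(successes, population) ** 2

# Hypothetical populations in which half the members count as "successes".
for population in (10, 100, 10_000):
    successes = population // 2
    dependent = float(two_successes_without_replacement(successes, population))
    independent = float(two_successes_with_replacement(successes, population))
    print(f"N={population:>6}: without={dependent:.4f}  with={independent:.4f}")
```

As the population grows relative to the sample of two, the two probabilities become nearly identical, which is what the 10% condition relies on.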
Geometric model
• How long does it take to achieve a success in
Bernoulli trials?
• A Geometric probability model tells us the
probability for a random variable that counts the
number of Bernoulli trials until the first success
• Geom(p)
– p = probability of success
– q = 1-p = probability of failure
– X: number of trials until the first success occurs
– $P(X = x) = q^{x-1} p$
– $E(X) = 1/p$
– $\mathrm{Var}(X) = q/p^2$
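The Geom(p) formulas can be applied to the opening questions about Shaq's free throws. This is a minimal sketch; the function names `geom_pmf`, `geom_mean`, and `geom_var` are illustrative and not from the slides.

```python
def geom_pmf(x, p):
    """P(X = x): the first success occurs on trial x (x = 1, 2, 3, ...)."""
    q = 1 - p
    return q ** (x - 1) * p

def geom_mean(p):
    """E(X) = 1/p."""
    return 1 / p

def geom_var(p):
    """Var(X) = q/p^2."""
    return (1 - p) / p ** 2

p = 0.451  # Shaq's free-throw percentage from the slides
print(f"P(X=2) = {geom_pmf(2, p):.4f}")
print(f"P(X=5) = {geom_pmf(5, p):.4f}")
print(f"E(X)   = {geom_mean(p):.3f}")
print(f"Var(X) = {geom_var(p):.3f}")
```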
• What is the probability that Shaq makes at
least one successful throw in the first four
attempts?
– $1 - P(NNNN) = 1 - (1-0.451)^4 = 0.9092$
– P(X=1)+P(X=2)+P(X=3)+P(X=4)
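Both routes to the answer can be checked numerically. The sketch below uses only p = 0.451 from the slides; the variable names are illustrative.

```python
p = 0.451   # probability Shaq makes a single free throw
q = 1 - p   # probability he misses

# Route 1: complement of four straight misses.
via_complement = 1 - q ** 4

# Route 2: the first success lands on trial 1, 2, 3, or 4 (Geometric model).
via_geometric = sum(q ** (x - 1) * p for x in range(1, 5))

print(f"complement: {via_complement:.4f}")
print(f"geometric:  {via_geometric:.4f}")
```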
Binomial model
• A Binomial model tells us the probability
for a random variable that counts the
number of successes in a fixed number of
Bernoulli trials.
• Binom(n,p)
• Let X be the number of successes in n
Bernoulli trials
• p = probability of success
• q = 1-p = probability of failure
The Binomial Model (cont.)
• In n trials, there are
$_nC_k = \frac{n!}{k!\,(n-k)!}$
ways to have k successes.
– Read nCk as “n choose k.”
• Note: n! = n x (n-1) x … x 2 x 1, and n! is
read as “n factorial.”
The Binomial Model (cont.)
Binomial probability model for Bernoulli trials:
Binom(n,p)
n = number of trials
p = probability of success
q = 1 – p = probability of failure
X = number of successes in n trials
$P(X = x) = \binom{n}{x} p^x q^{n-x}$, where $\binom{n}{x} = \frac{n!}{x!\,(n-x)!}$
$\mu = np$
$\sigma = \sqrt{npq}$
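The Binom(n,p) formulas translate directly into code. The following is a sketch with illustrative function names; the choice n = 5 is an assumption made only to keep the printed table short.

```python
from math import comb, sqrt

def binom_pmf(x, n, p):
    """P(X = x) = (n choose x) * p^x * q^(n-x)."""
    q = 1 - p
    return comb(n, x) * p ** x * q ** (n - x)

def binom_mean(n, p):
    """mu = np."""
    return n * p

def binom_sd(n, p):
    """sigma = sqrt(npq)."""
    return sqrt(n * p * (1 - p))

n, p = 5, 0.451  # n is a hypothetical choice; p is Shaq's percentage
for x in range(n + 1):
    print(f"P(X={x}) = {binom_pmf(x, n, p):.4f}")
print(f"mean = {binom_mean(n, p):.3f}, sd = {binom_sd(n, p):.3f}")
```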
How do we find E(X) and Var(X)?
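• Use P(X=x) directly
• A Binomial random variable can be viewed
as the sum of the outcomes of n Bernoulli
trials
• Let $Y_1, \dots, Y_n$ be the outcomes of the individual
Bernoulli trials (1 for success, 0 for failure)
• $E(Y_1) = \dots = E(Y_n) = p \cdot 1 + q \cdot 0 = p$
• $\mathrm{Var}(Y_1) = \dots = \mathrm{Var}(Y_n) = (1-p)^2 p + (0-p)^2 q = pq$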
Mean and variance of sum
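• Suppose $Y_1, \dots, Y_n$ are independent and
have the same mean $\mu$ and variance $\sigma^2$
• Let $X = Y_1 + \dots + Y_n$
• $E(X) = E(Y_1) + \dots + E(Y_n) = n\mu$
• $\mathrm{Var}(X) = \mathrm{Var}(Y_1) + \dots + \mathrm{Var}(Y_n) = n\sigma^2$
• For Bernoulli trials the mean is p and the variance is pq,
so for the Binomial E(X) = np and Var(X) = npq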
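The two routes to E(X) and Var(X) can be compared numerically: summing over P(X=x) directly versus using the sum-of-Bernoulli-trials shortcut. This sketch assumes n = 20 and p = 0.451 as example values.

```python
from math import comb

n, p = 20, 0.451  # example values; p is Shaq's free-throw percentage
q = 1 - p

# Route 1: use P(X = x) directly.
pmf = [comb(n, x) * p ** x * q ** (n - x) for x in range(n + 1)]
mean_direct = sum(x * px for x, px in enumerate(pmf))
var_direct = sum((x - mean_direct) ** 2 * px for x, px in enumerate(pmf))

# Route 2: X is a sum of n independent Bernoulli trials,
# each with mean p and variance pq.
mean_sum = n * p
var_sum = n * p * q

print(f"mean: direct={mean_direct:.6f}  n*p={mean_sum:.6f}")
print(f"var:  direct={var_direct:.6f}  n*p*q={var_sum:.6f}")
```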
• If Shaq shoots 20 free throws, what is the
probability that he makes no more than
two?
• Binom(n,p), p=0.451, n=20
• P(X=0 or 1 or 2)
= P(X=0) + P(X=1) + P(X=2) = 0.0009
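The sum above can be reproduced with a small cumulative-probability helper. The name `binom_cdf` is illustrative; it plays the same role as a calculator's binomial cdf function.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binom(n, p), summing the individual probabilities."""
    q = 1 - p
    return sum(comb(n, x) * p ** x * q ** (n - x) for x in range(k + 1))

# Shaq shoots 20 free throws; probability he makes no more than two.
print(f"P(X <= 2) = {binom_cdf(2, 20, 0.451):.4f}")
```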
Normal approximation to Binomial
• If X ~ Binomial(n,p), n=10000
• P(X<2000)=?
• When dealing with a large number of trials in a Binomial
situation, making direct calculations of the probabilities
becomes tedious (or outright impossible).
• When n is large and np is neither too small nor too large,
Binomial(n,p) looks similar to a Normal model with mean = np
and variance = npq
• P(X<2000) ≈ P(Z<(2000-np)/sqrt(npq))
• Success/failure condition: np>=10 and nq>=10
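A sketch of the approximation, compared against the exact Binomial sum. The slide does not state p for the n = 10000 example, so the comparison below uses assumed values (n = 100, p = 0.451, cutoff 40) that are small enough for the exact sum to be computed. The function names are illustrative.

```python
from math import comb, erf, sqrt

def normal_cdf(z):
    """Standard Normal CDF, P(Z <= z), via the error function."""
    return 0.5 * (1 + erf(z / sqrt(2)))

def binom_less_than_exact(k, n, p):
    """Exact P(X < k) for X ~ Binomial(n, p); practical only for moderate n."""
    q = 1 - p
    return sum(comb(n, x) * p ** x * q ** (n - x) for x in range(k))

def binom_less_than_normal(k, n, p):
    """Normal approximation P(Z < (k - np)/sqrt(npq)), no continuity correction."""
    mu = n * p
    sigma = sqrt(n * p * (1 - p))
    return normal_cdf((k - mu) / sigma)

def success_failure_ok(n, p):
    """Success/failure condition: np >= 10 and nq >= 10."""
    return n * p >= 10 and n * (1 - p) >= 10

n, p, k = 100, 0.451, 40  # n and k are assumed values for illustration
print("condition met:", success_failure_ok(n, p))
print(f"exact  P(X<{k}) = {binom_less_than_exact(k, n, p):.4f}")
print(f"normal P(X<{k}) = {binom_less_than_normal(k, n, p):.4f}")
```

The two numbers should be close but not identical, since a continuous curve is standing in for a discrete distribution.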
Continuous Random Variables
• When we use the Normal model to
approximate the Binomial model, we are
using a continuous random variable to
approximate a discrete random variable.
• So, when we use the Normal model, we
no longer calculate the probability that the
random variable equals a particular value,
but only that it lies between two values.
Poisson model
• For small p and large n, even when np<10, we
can approximate Binomial(n,p) by Poisson(np)
• Setting λ = np, we can use the Poisson model to
approximate Binomial probabilities.
• Poisson(λ)
– λ : mean number of occurrences
– X: number of occurrences
$P(X = x) = \frac{e^{-\lambda} \lambda^x}{x!}$
$E(X) = \lambda$
$SD(X) = \sqrt{\lambda}$
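To see how closely Poisson(np) tracks Binomial(n,p) when p is small and n is large, the sketch below prints both side by side. The values n = 1000 and p = 0.003 are hypothetical, chosen so that np = 3 is below 10.

```python
from math import comb, exp, factorial

def poisson_pmf(x, lam):
    """P(X = x) = e^(-lambda) * lambda^x / x!"""
    return exp(-lam) * lam ** x / factorial(x)

def binom_pmf(x, n, p):
    """P(X = x) for X ~ Binomial(n, p)."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

n, p = 1000, 0.003  # hypothetical values: large n, small p
lam = n * p         # Poisson mean

for x in range(7):
    print(f"x={x}: binomial={binom_pmf(x, n, p):.4f}  "
          f"poisson={poisson_pmf(x, lam):.4f}")
```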
The Poisson Model (cont.)
• Although it was originally an approximation
to the Binomial, the Poisson model is also
used directly to model the probability of
the occurrence of events for a variety of
phenomena.
– It’s a good model to consider whenever your
data consist of counts of occurrences.
– It requires only that the events be
independent and that the mean number of
occurrences stays constant.
More about Poisson model
• It scales with the sample size
– If the mean number of occurrences in a sample of size
35,000 is 3.85,
– then the mean number of occurrences in a sample of size
3,500 is 0.385
• Occurrence of the past events doesn’t
change the probability of future events
– Even though the events appear to cluster, the
probability of another event occurring is still
the same
An application of Poisson model
• In 1946, the British statistician R.D. Clarke
studied the distribution of hits of flying
bombs in London during World War II.
• He wanted to know whether the Germans were
targeting particular districts or whether the
distribution of hits was due to chance.
• Clarke began by dividing an area into
hundreds of tiny, equally sized plots.
Flying bomb hits on London
• The average number of hits per square is then
537/576 = 0.9323 hits per square

# of hits    # of plots with that many hits    Poisson fit
0            229                               226.7
1            211                               211.4
2            93                                98.5
3            35                                30.6
4            7                                 7.1
5            1                                 1.6
• No need to move people from one sector to another,
even after several hits!
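The Poisson fit column can be reproduced from the two totals on the slide, 537 hits over 576 plots. One assumption in this sketch: the last row is treated as "5 or more hits", since that reading reproduces the 1.6 shown; the slide itself labels the row simply as 5.

```python
from math import exp, factorial

observed = [229, 211, 93, 35, 7, 1]  # plots with 0, 1, 2, 3, 4, 5 hits
total_hits = 537
total_plots = 576
lam = total_hits / total_plots       # mean hits per plot

def poisson_pmf(x, lam):
    """P(X = x) = e^(-lambda) * lambda^x / x!"""
    return exp(-lam) * lam ** x / factorial(x)

# Expected number of plots with exactly k hits, for k = 0..4.
expected = [total_plots * poisson_pmf(k, lam) for k in range(5)]
# Assumption: the final row collects every plot with 5 or more hits.
expected.append(total_plots - sum(expected))

for k, (obs, fit) in enumerate(zip(observed, expected)):
    print(f"{k} hits: observed={obs:>3}  Poisson fit={fit:.1f}")
```

The close agreement between the observed and fitted counts is what supports the conclusion that the hits were consistent with chance rather than targeting.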
What Can Go Wrong?
• Be sure you have Bernoulli trials.
– You need two outcomes per trial, a constant
probability of success, and independence.
– Remember that the 10% Condition provides a
reasonable substitute for independence.
• Don’t confuse Geometric and Binomial
models.
• Don’t use the Normal approximation with
small n.
– You need at least 10 successes and 10
failures to use the Normal approximation.
What have we learned?
– Geometric model
• When we’re interested in the number of Bernoulli trials until
the next success.
– Binomial model
• When we’re interested in the number of successes in a
certain number of Bernoulli trials.
– Normal model
• To approximate a Binomial model when we expect at least 10
successes and 10 failures.
– Poisson model
• To approximate a Binomial model when the probability of
success, p, is very small and the number of trials, n, is very
large.
TI-83
• 2nd + VARS (DISTR)
• pdf: P(X=x)
– geometpdf(prob,x)
– binompdf(n,prob,x)
– poissonpdf(mean,x)
• cdf: P(X<=x)
– geometcdf(prob,x)
– binomcdf(n,prob,x)
– poissoncdf(mean,x)