S2.1 Binomial and Poisson distributions
S2 The Poisson distribution
Contents
The Poisson distribution
Poisson tables
Mean and variance
Approximating a Binomial by a Poisson
Approximating the Poisson using a normal
Introduction
We are sometimes interested in the number of times an
event occurs in a given interval of space or time:
1) Sam counts the number of cars travelling past her on a
quiet country road. X represents the number of cars passing
her in 15 minutes.
2) Xiu uses a Geiger counter to record the number of particles
emitted by a radioactive substance. X is the number of
emissions in one minute.
3) Scott counts the number of people leaving a pub. X is the
number of people leaving in a 5 minute interval.
4) Selina is taking samples of sea water. X is the number of a
particular kind of organism that she finds in a 1 ml sample
of water.
5) Ankur has produced a first draft of a novel. X is the number
of typing mistakes made on a page.
6) Steve records the number of accidents that occur on a
stretch of motorway. X is the number of accidents that occur
in a day.
The Poisson distribution
In each of these situations, the random variable X counts the
number of times an event occurs in a given amount of space
or time. X takes the values 0, 1, 2, 3, … .
The Poisson distribution is a model that can sometimes be
used for count data. The distribution is named after the French
mathematician and scientist Siméon Denis Poisson (1781–1840).
A Poisson model is appropriate only when certain conditions hold.
Conditions for a Poisson distribution
A random variable, X, which counts the number of times an
event occurs in a given unit of space or time will have a
Poisson distribution if:
the events occur independently of each other and at random;
the events occur at a constant rate (in the sense that the mean number of events occurring in a given interval is directly proportional to the length of that interval);
the events occur singly (that is, one at a time).
The Poisson distribution
The notation used to indicate that a random variable X has a
Poisson distribution is
X ~ Po(λ)
The distribution is fully specified by a single parameter λ,
representing the mean number of events that occur in the given
unit of space or time.
We will now reconsider the six situations presented
earlier. Decide whether the Poisson distribution might be an
appropriate model in each case.
The Poisson distribution
1) The number of cars passing along a quiet
country road in 15 minutes.
Could be Poisson
2) The number of emissions from a radioactive
substance in one minute.
Poisson
3) The number of people leaving a pub in a 5
minute interval.
Not Poisson (for example, people often leave in groups, so the events are unlikely to occur singly or independently)
4) The number of a particular kind of organism
found in a 1 ml sample of seawater.
Could be Poisson
5) Ankur has produced a first draft of a novel.
X is the number of typing mistakes made on
a page.
Could be Poisson
6) Steve records the number of accidents that
occur on a stretch of motorway. X is the
number of accidents that occur in a day.
Not Poisson (for example, the accident rate is unlikely to be constant, and one accident may lead to others)
Calculating probabilities
If X ~ Po(λ), then
$$P(X = x) = \frac{e^{-\lambda}\lambda^{x}}{x!} \quad \text{for } x = 0, 1, 2, 3, \dots$$
Suppose X ~ Po(0.85). Find P(X = 3).
$$P(X = 3) = \frac{e^{-0.85} \times 0.85^{3}}{3!} = 0.0437 \text{ (3 s.f.)}$$
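The probability formula can also be evaluated numerically. The following is a minimal Python sketch, not part of the original slides; the function name poisson_pmf is chosen here purely for illustration.

```python
import math

def poisson_pmf(x: int, lam: float) -> float:
    """P(X = x) for X ~ Po(lam), using e^(-lam) * lam^x / x!."""
    return math.exp(-lam) * lam ** x / math.factorial(x)

# Worked example from the slide: X ~ Po(0.85), find P(X = 3).
p3 = poisson_pmf(3, 0.85)
print(round(p3, 4))  # 0.0437
```

The same function is reused conceptually in the later examples by changing x and the parameter.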
Calculating probabilities
X ~ Po(0.85). Find P(X > 2).
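P(X > 2) = 1 – P(X = 0) – P(X = 1) – P(X = 2).
$$P(X = 0) = \frac{e^{-0.85} \times 0.85^{0}}{0!} = 0.4274$$
$$P(X = 1) = \frac{e^{-0.85} \times 0.85^{1}}{1!} = 0.3633$$
$$P(X = 2) = \frac{e^{-0.85} \times 0.85^{2}}{2!} = 0.1544$$
Therefore, P(X > 2) = 1 – (0.4274 + 0.3633 + 0.1544)
= 1 – 0.9451
= 0.0549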
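As a check on the complement method, the calculation can be sketched in Python. This is an illustrative sketch only; the helper names poisson_pmf and poisson_cdf are assumptions made here and do not appear in the slides.

```python
import math

def poisson_pmf(x, lam):
    """P(X = x) for X ~ Po(lam)."""
    return math.exp(-lam) * lam ** x / math.factorial(x)

def poisson_cdf(x, lam):
    """P(X <= x), found by adding P(X = 0), ..., P(X = x)."""
    return sum(poisson_pmf(k, lam) for k in range(x + 1))

# X ~ Po(0.85): P(X > 2) = 1 - P(X <= 2)
p_more_than_2 = 1 - poisson_cdf(2, 0.85)
print(round(p_more_than_2, 4))  # 0.0549
```

Working with the complement avoids summing the infinitely many terms P(X = 3), P(X = 4), and so on.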
Calculating probabilities
On average a call centre receives 1.75 phone calls per
minute.
a) Assuming a Poisson distribution, find the probability that the
number of phone calls received in a randomly chosen
minute is:
(i) exactly 4;
(ii) no more than 2.
b) Find the probability that 6 phone calls are received in a 4
minute period.
Calculating probabilities
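a) Let X = number of phone calls received in 1 minute.
Then X ~ Po(1.75).
(i)
$$P(X = 4) = \frac{e^{-1.75} \times 1.75^{4}}{4!} = 0.0679 \text{ (3 s.f.)}$$
(ii) P(X ≤ 2) = P(X = 0) + P(X = 1) + P(X = 2)
$$P(X = 0) = \frac{e^{-1.75} \times 1.75^{0}}{0!} = 0.1738$$
$$P(X = 1) = \frac{e^{-1.75} \times 1.75^{1}}{1!} = 0.3041$$
$$P(X = 2) = \frac{e^{-1.75} \times 1.75^{2}}{2!} = 0.2661$$
Therefore, P(X ≤ 2) = 0.1738 + 0.3041 + 0.2661 = 0.744 (3 s.f.)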
Calculating probabilities
b) Let Y = number of phone calls received in 4 minutes.
The mean number of calls in 4 minutes is
1.75 × 4 = 7
So, Y ~ Po(7).
Therefore,
$$P(Y = 6) = \frac{e^{-7} \times 7^{6}}{6!} = 0.149 \text{ (3 s.f.)}$$
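The call centre answers can be reproduced with a short Python sketch. It is an illustration under the question's own assumption of a Poisson model; the variable and function names are chosen here and are not from the slides.

```python
import math

def poisson_pmf(x, lam):
    """P(X = x) for X ~ Po(lam)."""
    return math.exp(-lam) * lam ** x / math.factorial(x)

rate_per_minute = 1.75

# a) One-minute interval, so the parameter is 1.75.
p_exactly_4 = poisson_pmf(4, rate_per_minute)
p_at_most_2 = sum(poisson_pmf(k, rate_per_minute) for k in range(3))

# b) Four-minute interval: the parameter scales with the interval length.
lam_4_minutes = rate_per_minute * 4
p_six_calls = poisson_pmf(6, lam_4_minutes)

print(round(p_exactly_4, 4), round(p_at_most_2, 3), round(p_six_calls, 3))
```

The key step in part b) is rescaling the parameter to match the new interval before applying the formula.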
Examination-style question
A gardener has calculated that weeds in his garden occur at a
mean rate of 3.25 per square metre. Assuming that a Poisson
distribution is appropriate:
a) Find the probability that there will be fewer than 4 weeds
in an area of 2 m².
b) State what needs to be assumed about the distribution
of weeds in order for the Poisson distribution to be fully
justified.
Examination-style question
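Let X = number of weeds in an area of 2 m².
a) The mean number of weeds in 2 m² is λ = 3.25 × 2 = 6.5, so X ~ Po(6.5).
P(X < 4) = P(X = 0) + P(X = 1) + P(X = 2) + P(X = 3)
$$P(X < 4) = \frac{e^{-6.5} \times 6.5^{0}}{0!} + \frac{e^{-6.5} \times 6.5^{1}}{1!} + \frac{e^{-6.5} \times 6.5^{2}}{2!} + \frac{e^{-6.5} \times 6.5^{3}}{3!}$$
= 0.00150 + 0.00977 + 0.03176 + 0.06881
= 0.112 (3 s.f.)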
b) For a Poisson distribution to be justified, the weeds
would need to occur randomly, independently of one another,
and at a constant mean rate across the garden.
Poisson tables
© Boardworks Ltd 2006
Poisson tables
Tables of probabilities exist for many Poisson distributions.
The tables are cumulative, that is they give P(X ≤ x).
λ        0.5     1.0     1.5     2.0     2.5
x = 0    0.6065  0.3679  0.2231  0.1353  0.0821
x = 1    0.9098  0.7358  0.5578  0.4060  0.2873
x = 2    0.9856  0.9197  0.8088  0.6767  0.5438
x = 3    0.9982  0.9810  0.9344  0.8571  0.7576
x = 4    0.9998  0.9963  0.9814  0.9473  0.8912
x = 5    1.0000  0.9994  0.9955  0.9834  0.9580
x = 6    1.0000  0.9999  0.9991  0.9955  0.9858
If X ~ Po(1.5), P(X ≤ 4) = 0.9814
If X ~ Po(1.5), P(X = 2) = P(X ≤ 2) – P(X ≤ 1)
= 0.8088 – 0.5578
= 0.251
If Y ~ Po(2), P(Y > 1) = P(Y = 2, 3, 4, …)
= 1 – P(Y ≤ 1)
= 1 – 0.4060
= 0.594
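The cumulative table and the two table look-ups above can be regenerated in Python. This is a minimal sketch for checking table values, not a replacement for the printed tables; poisson_cdf is a helper name assumed here.

```python
import math

def poisson_cdf(x, lam):
    """P(X <= x) for X ~ Po(lam), summing e^(-lam) * lam^k / k! for k = 0..x."""
    return sum(math.exp(-lam) * lam ** k / math.factorial(k) for k in range(x + 1))

# Rebuild the cumulative table for the same parameter values as the slides.
lams = [0.5, 1.0, 1.5, 2.0, 2.5]
print("x    " + "  ".join(f"{lam:6.1f}" for lam in lams))
for x in range(7):
    print(f"{x:<5}" + "  ".join(f"{poisson_cdf(x, lam):6.4f}" for lam in lams))

# P(X = 2) for X ~ Po(1.5), as a difference of two cumulative values.
p_exactly_2 = poisson_cdf(2, 1.5) - poisson_cdf(1, 1.5)
# P(Y > 1) for Y ~ Po(2), as a complement.
p_more_than_1 = 1 - poisson_cdf(1, 2.0)
print(round(p_exactly_2, 4), round(p_more_than_1, 4))
```

Both look-up patterns rely only on the cumulative values P(X ≤ x), which is why the tables are printed in cumulative form.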
Examination-style question
A corner shop has on average 18 customers per hour.
Assume that a Poisson distribution is appropriate.
a) Calculate the probability that
i) more than 10 customers will arrive in a 15 minute interval;
ii) exactly 2 customers will arrive in a 1 minute interval.
b) Find the time interval such that the probability of no
customers arriving during that interval is 0.2.
Examination-style question
a) Let X1 be the random variable for the number of customers
arriving in a 15 minute interval.
X1 ~ Po(18 ÷ 4), so X1 ~ Po(4.5).
P(X1 > 10) = 1 – P(X1 ≤ 10)
= 1 – 0.9933 (using tables)
= 0.0067
Let X2 be the random variable for the number of customers
arriving in a 1 minute interval.
X2 ~ Po(18 ÷ 60), so X2 ~ Po(0.3).
P(X2 = 2) = P(X2 ≤ 2) – P(X2 ≤ 1)
= 0.9964 – 0.9631 (from tables)
= 0.0333
Examination-style question
b) Let Y be the number of customers arriving in an interval of
length t minutes.
Then Y ~ Po(18t ÷ 60), so Y ~ Po(0.3t).
From the question, P(Y = 0) = 0.2
We can find P(Y = 0) in terms of t:
$$P(Y = 0) = \frac{e^{-0.3t}(0.3t)^{0}}{0!} = e^{-0.3t}$$
$$e^{-0.3t} = 0.2$$
$$-0.3t = \ln 0.2$$
$$t = -\frac{\ln 0.2}{0.3} = 5.36 \text{ minutes (3 s.f.)}$$
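The value of t can be checked numerically. The sketch below assumes the same Poisson model as the question; the variable names are illustrative only.

```python
import math

rate_per_minute = 18 / 60   # 18 customers per hour
target = 0.2                # required probability of no customers

# P(Y = 0) = e^(-rate * t) = target, so t = -ln(target) / rate.
t = -math.log(target) / rate_per_minute
print(round(t, 2))  # 5.36

# Substituting back should recover the target probability.
check = math.exp(-rate_per_minute * t)
```

Substituting t back into e^(-0.3t) is a quick way to confirm that the sign of the logarithm has been handled correctly.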
Mean and variance
Suppose that X ~ Po(λ).
It can be shown that the mean and variance of X are equal:
E(X) = Var(X) = λ
This result provides us with a useful, informal way to test
whether a variable could be modelled by a Poisson
distribution.
Mean and variance
Example: The table below shows the number of goals
scored by each team in matches in the Premiership during
the period from August 21st to September 12th 2005.
Goals, r        0    1    2    3    4    5 or more
Frequency, f    21   19   10   3    3    0
Calculate the values of the mean and variance of this
data. Discuss whether these values support the use of
a Poisson distribution as a model for the data.
Mean and variance
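The mean of the data is:
$$\bar{x} = \frac{(0 \times 21) + (1 \times 19) + (2 \times 10) + (3 \times 3) + (4 \times 3)}{56} = \frac{60}{56} = 1.071$$
Now calculate the variance:
$$\sum x^{2} f = (0^{2} \times 21) + (1^{2} \times 19) + \dots + (4^{2} \times 3) = 134$$
$$\text{Variance} = \frac{1}{n}\sum x^{2} f - \bar{x}^{2} = \frac{134}{56} - \left(\frac{60}{56}\right)^{2} = 1.245 \text{ (4 s.f.)}$$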
It can be seen that the mean and the variance are
approximately equal, suggesting that a Poisson distribution
might be a suitable model for this data.
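The mean and variance of the frequency table can be recomputed with a few lines of Python. This is an illustrative sketch using the frequencies from the table; the dictionary layout is a choice made here.

```python
# Goals scored (r) mapped to frequency (f), taken from the table.
freq = {0: 21, 1: 19, 2: 10, 3: 3, 4: 3}

n = sum(freq.values())
mean = sum(r * f for r, f in freq.items()) / n
# Variance = (1/n) * sum(r^2 * f) - mean^2
variance = sum(r * r * f for r, f in freq.items()) / n - mean ** 2

print(f"n = {n}, mean = {mean:.3f}, variance = {variance:.3f}")
# n = 56, mean = 1.071, variance = 1.245
```

Comparing the two printed values is the informal Poisson check described above: they should be roughly equal.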
Fitting a Poisson model to data
It is possible to fit a Poisson model to a set of data.
The table below shows the number of goals scored by each
team in matches in the Premiership during the period from
August 21st to September 12th 2005.
Goals, r        0    1    2    3    4    5 or more
Frequency, f    21   19   10   3    3    0
Using a Poisson distribution with the same mean as the data,
calculate the theoretical frequencies for 0, 1, 2, 3, 4, or at
least 5 goals in a match.
Fitting a Poisson model to data
Let X represent the number of goals scored by a team in a
Premiership match.
The mean of the data was 1.071 goals per match.
We therefore adopt a Po(1.071) distribution to model X:
$$P(X = r) = \frac{e^{-\lambda}\lambda^{r}}{r!}$$
$$P(X = 0) = \frac{e^{-1.071} \times 1.071^{0}}{0!} = 0.3427 \text{ (4 s.f.)}$$
$$P(X = 1) = \frac{e^{-1.071} \times 1.071^{1}}{1!} = 0.3670 \text{ (4 s.f.)}$$
$$P(X = 2) = \frac{e^{-1.071} \times 1.071^{2}}{2!} = 0.1965 \text{ (4 s.f.)}$$
and so on.
Fitting a Poisson model to data
x            P(X = x)   Expected frequency
0            0.3427     19.2
1            0.3670     20.6
2            0.1965     11.0
3            0.0702     3.9
4            0.0188     1.1
5 or more    0.0048     0.3
P(X ≥ 5) is found by subtracting the sum of the other probabilities from 1.
The expected frequencies are found by multiplying the probabilities by the total frequency, i.e. 56.
Fitting a Poisson model to data
x            Observed frequency, f   Expected frequency
0            21                      19.2
1            19                      20.6
2            10                      11.0
3            3                       3.9
4            3                       1.1
5 or more    0                       0.3
We can see that these expected frequencies are quite
close to the frequencies that were actually observed,
which suggests that the Poisson distribution appears to
be a reasonable model for the data.
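The fitting procedure can be sketched in Python. This sketch uses the rounded mean 1.071 as in the slides; the list names are assumptions made for illustration.

```python
import math

observed = [21, 19, 10, 3, 3, 0]   # frequencies for 0, 1, 2, 3, 4, 5 or more goals
labels = ["0", "1", "2", "3", "4", "5 or more"]
lam = 1.071                        # mean of the data, rounded as in the slides
total = sum(observed)              # 56

# P(X = r) for r = 0..4, then P(X >= 5) as 1 minus the sum of the others.
probs = [math.exp(-lam) * lam ** r / math.factorial(r) for r in range(5)]
probs.append(1 - sum(probs))

# Expected frequency = probability x total frequency.
expected = [total * p for p in probs]

for label, obs, exp in zip(labels, observed, expected):
    print(f"{label:<10}{obs:>4}{exp:>8.1f}")
```

Because the final category is defined as a complement, the expected frequencies always add up to the total frequency.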
Approximating a binomial by a Poisson
The previous activity showed that there are circumstances
when a Poisson distribution provides a good approximation
to a binomial distribution.
If X ~ B(n, p), then X can reasonably be approximated by a
Poisson distribution with mean np if
n is large, and
p is small.
The rule of thumb is
n > 50 and np < 5
Note: It is sometimes convenient to approximate a binomial
with a Poisson distribution because it is slightly easier to
calculate probabilities using a Poisson distribution.
Approximating a binomial by a Poisson
A drug manufacturer has found that 2% of patients taking a
particular drug will experience a particular side-effect.
A hospital consultant prescribes the drug to 150 of her
patients.
Using a suitable approximation calculate the probability that:
a) None of her patients suffer from the side-effects.
b) No more than 5 suffer from the side-effects.
Approximating a binomial by a Poisson
Let X represent the number of patients experiencing these
side-effects.
The exact distribution of X is B(150, 0.02).
Since n is large and p is small, X ≈ Po(150 × 0.02)
So, X ≈ Po(3).
a)
$$P(X = 0) = \frac{e^{-3} \times 3^{0}}{0!} = 0.0498 \text{ (3 s.f.)}$$
b) P(X ≤ 5) = 0.9161 (directly from tables).
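To see how close the approximation is in this example, the exact binomial probabilities can be compared with the Poisson values. The following Python sketch is an illustration only; the helper names binomial_cdf and poisson_cdf are assumptions made here.

```python
import math

def binomial_cdf(x, n, p):
    """Exact P(X <= x) for X ~ B(n, p)."""
    return sum(math.comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(x + 1))

def poisson_cdf(x, lam):
    """P(X <= x) for X ~ Po(lam)."""
    return sum(math.exp(-lam) * lam ** k / math.factorial(k) for k in range(x + 1))

n, p = 150, 0.02
lam = n * p   # Poisson parameter np = 3

# x = 0 gives part a), x = 5 gives part b).
for x in (0, 5):
    exact = binomial_cdf(x, n, p)
    approx = poisson_cdf(x, lam)
    print(f"P(X <= {x}): binomial {exact:.4f}, Poisson {approx:.4f}")
```

Here n = 150 > 50 and np = 3 < 5, so the rule of thumb is satisfied and the two columns should agree closely.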
Examination-style question:
The probability that a directory enquiry service gives out the
correct phone number has been estimated to be 0.975.
a) Sabah requires 10 phone numbers. Find the probability
that the service gives her at least 9 correct numbers.
b) A large organisation requests 140 phone numbers. Find
the probability that more than 135 of them are given out
correctly.
Examination-style question
a) Let X be the random variable for the number of correct
phone numbers given to Sabah.
Then X ~ B(10, 0.975).
P(X ≥ 9) = P(X = 9) + P(X = 10).
$$P(X = 9) = {}^{10}C_{9} \times 0.975^{9} \times (1 - 0.975)^{1} = 0.1991$$
$$P(X = 10) = {}^{10}C_{10} \times 0.975^{10} \times (1 - 0.975)^{0} = 0.7763$$
So, P(X ≥ 9) = 0.1991 + 0.7763 = 0.9754
Examination-style question
b) The probability of being given the correct phone number
(0.975) is not small.
However, the probability of receiving an incorrect phone
number (0.025) is small.
Therefore we consider Y, the number of incorrect numbers
received.
The exact distribution of Y is B(140, 0.025).
This can be approximated by Po(3.5), since 140 × 0.025 = 3.5.
The probability of more than 135 correct numbers is
equivalent to the probability of 4 or fewer incorrect numbers.
Using tables: P(Y ≤ 4) = 0.7254
Approximating the Poisson using a normal
Key result: If X ~ Po(λ) and λ is large, then X is approximately
normally distributed:
X ≈ N[λ, λ]
Recall that the mean and variance of a Poisson distribution
are equal.
There is a widely used rule of thumb that can be applied to tell
you when the approximation will be reasonable:
A Poisson can be approximated reasonably
well by a normal distribution provided λ > 15.
Note: A continuity correction is required
because we are approximating a discrete
distribution using a continuous one.
Approximating the Poisson using a normal
Example: An animal rescue centre finds a new home for
an average of 3.5 dogs each day.
a) What assumptions must be made for a Poisson
distribution to be an appropriate distribution?
b) Assuming that a Poisson distribution is appropriate:
i. Find the probability that at least one dog is
rehoused in a randomly chosen day.
ii. Find the probability that, in a period of 20 days,
fewer than 65 dogs are found new homes.
Approximating the Poisson using a normal
Solution:
a) For a Poisson distribution to be appropriate we would need to
assume the following:
1. The dogs are rehoused independently of one another
and at random;
2. The dogs are rehoused one at a time;
3. The dogs are rehoused at a constant rate.
b) i) Let X represent the number of dogs rehoused on a
given day. So, X ~ Po(3.5).
P(X ≥ 1) = 1 – P(X = 0)
= 1 – 0.0302 (from tables)
= 0.9698
Approximating the Poisson using a normal
b) ii) Let Y represent the number of dogs rehoused over a
period of 20 days. So, Y ~ Po(3.5 × 20) i.e. Po(70).
As λ is large, we can approximate this Poisson distribution
by a normal distribution:
Y ≈ N[70, 70].
Applying a continuity correction, P(Y < 65) → P(Y ≤ 64.5).
Standardizing from N[70, 70] to N[0, 1]:
$$z = \frac{64.5 - 70}{\sqrt{70}} = -0.657$$
Approximating the Poisson using a normal
P(Y ≤ 64.5) = P(Z ≤ –0.657)
= 1 – 0.7445
= 0.2555
So the probability that fewer than 65 dogs are
rehoused is approximately 0.2555.
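The normal approximation can be compared with the exact Poisson probability in Python. This is a sketch under the example's assumptions, using NormalDist from the standard library statistics module; the helper poisson_cdf is a name chosen here.

```python
import math
from statistics import NormalDist

def poisson_cdf(x, lam):
    """Exact P(Y <= x) for Y ~ Po(lam)."""
    return sum(math.exp(-lam) * lam ** k / math.factorial(k) for k in range(x + 1))

lam = 3.5 * 20   # 70 dogs on average over 20 days

# Normal approximation N[70, 70]; P(Y < 65) becomes P(Y <= 64.5)
# after the continuity correction.
approx = NormalDist(mu=lam, sigma=math.sqrt(lam)).cdf(64.5)

# Exact Poisson value for comparison: P(Y < 65) = P(Y <= 64).
exact = poisson_cdf(64, lam)

print(f"normal approximation: {approx:.4f}, exact Poisson: {exact:.4f}")
```

Note that NormalDist takes the standard deviation, so the variance 70 must be square-rooted before it is passed in.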
Examination-style question: An electrical retailer has
estimated that he sells a mean number of 5 digital radios
each week.
a) Assuming that the number of digital radios sold in
any week can be modelled by a Poisson distribution,
find the probability that the retailer sells fewer than 2
digital radios in a randomly chosen week.
b) Use a suitable approximation to decide how many
digital radios he should have in stock in order for him
to be at least 90% certain of being able to meet the
demand for radios over the next 5 weeks.
Examination-style question
Solution:
a) Let X represent the number of digital radios sold in a week.
So X ~ Po(5).
P(X < 2) = P(X ≤ 1)
= 0.0404 (from tables).
So the probability that the retailer sells fewer than 2
digital radios in a week is 0.0404.
b) Let Y represent the number of digital radios sold in a
period of 5 weeks.
So, Y ~ Po(25).
We require the smallest y such that P(Y ≤ y) ≥ 0.9.
Examination-style question
Since the parameter of the Poisson distribution is large, we can
use a normal approximation:
Y ≈ N[25, 25].
P(Y ≤ y) → P(Y ≤ y + 0.5) (using a continuity correction).
Standardizing from N[25, 25] to N[0, 1], and using the fact that
the 90% point of a standard normal is 1.282:
$$\frac{y + 0.5 - 25}{5} = 1.282$$
$$y = 5 \times 1.282 + 24.5$$
$$y = 30.91$$
So the retailer would need to keep
31 digital radios in stock.
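The stock calculation can be sketched in Python using the same normal approximation. This is an illustrative sketch, not part of the original solution; it uses NormalDist.inv_cdf from the standard library in place of the printed percentage-point table.

```python
import math
from statistics import NormalDist

lam = 5 * 5                       # mean sales over 5 weeks, so Y ~ Po(25)
z = NormalDist().inv_cdf(0.90)    # 90% point of N[0, 1], about 1.282

# Solve (y + 0.5 - lam) / sqrt(lam) = z for y (continuity correction included).
y = lam + z * math.sqrt(lam) - 0.5

# Stock must be a whole number of radios, so round up.
stock = math.ceil(y)
print(round(z, 3), round(y, 2), stock)
```

Rounding up rather than to the nearest integer keeps the probability of meeting demand at or above the required level under the approximation.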