Using the Normal pdf
Approximations to
Several Distributions
8 - 61
Approximating
Distributions
• The normal has been found to
be a useful distribution in
approximating other
distributions.
• Although it is a continuous
distribution, it can also be used
to approximate discrete
distributions, specifically the
binomial and the Poisson.
8 - 62
Approximating the
Binomial
• As n becomes large,
calculating binomial
probabilities can become time
consuming.
• The normal distribution is
useful in approximating
binomial probabilities.
• The larger the binomial
parameter n, the more
accurate the approximation.
8 - 63
Mean and Variance
(from Binomial to Normal)
• If the normal is to approximate
a binomial, it seems
reasonable that the mean and
variance of the normal should
be the same as the mean and
variance of the binomial that is
being approximated.
• Specifically, let
μ = np and σ² = np(1-p).
8 - 64
Example 10, N(10,5)
To approximate a binomial with
n = 20 and p = 0.5 would require
a normal distribution with
μ = (20)(0.5) = 10,
σ² = (20)(0.5)(1-0.5) = 5,
and
σ = √5 = 2.236.
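The arithmetic in this example can be checked with a few lines of Python. This is a minimal sketch using only the standard library; the function name `normal_params_for_binomial` is a hypothetical helper introduced here, not something from the slides.

```python
import math

def normal_params_for_binomial(n, p):
    """Return (mean, variance, std dev) of the normal that matches a binomial(n, p)."""
    mean = n * p
    variance = n * p * (1 - p)
    return mean, variance, math.sqrt(variance)

mean, variance, sd = normal_params_for_binomial(20, 0.5)
print(mean, variance, round(sd, 3))  # prints: 10.0 5.0 2.236
```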
8 - 65
Example 10 - Shape
• In this instance, the shapes of
the distribution are quite
similar, and consequently the
approximation will be good.
8 - 66
Good Approximations?
two tests on np and nq
• Generally, the approximation is
reasonable when the mean of
the binomial, np, is greater
than or equal to 5 and n(1-p) is
greater than or equal to 5.
• The approximation becomes
quite good when np is greater
than or equal to 10 and n(1-p)
is greater than or equal to 10.
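The two tests on np and n(1-p) can be sketched as a small Python function. The labels "reasonable" and "quite good" come from the slide; the function name and the "not recommended" label for the remaining case are assumptions made for illustration.

```python
def approximation_quality(n, p):
    """Classify the normal approximation to a binomial(n, p) by the np and n(1-p) tests."""
    smallest = min(n * p, n * (1 - p))
    if smallest >= 10:
        return "quite good"
    if smallest >= 5:
        return "reasonable"
    return "not recommended"  # assumed label: neither test on the slide is met

print(approximation_quality(20, 0.5))  # np = 10 and n(1-p) = 10
print(approximation_quality(10, 0.5))  # np = 5 and n(1-p) = 5
print(approximation_quality(10, 0.1))  # np = 1
```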
8 - 67
Example 11
• Suppose that 2,000
subjects are asked to
select whether Pepsi
or Coke tastes better.
• If it is assumed that there is no
difference in product
preference, what is the
probability of observing 900 or
fewer subjects who thought
Coke was superior?
8 - 68
Example 11 - Solution
• Let X = number of subjects
that selected Coke as superior.
• The actual distribution of X is
binomial (n = 2000, p = .5), but
the desired probability is too
difficult to calculate directly.
8 - 69
Example 11 - Solution
• The expected value of X is
μ = E(X) = np
= (2000)(.5) = 1000.
• The variance of X is
σ² = V(X) = np(1 - p)
= (2000)(.5)(1 - .5) = 500.
• The standard deviation is
σ = sqrt(500) = 22.36.
8 - 70
Example 11 - Solution
• Let Y be a normally distributed
random variable with a mean of
1,000 and standard deviation of
22.36. Then
P(X ≤ 900) is approximately
P(Y ≤ 900) = P(z ≤ (900 - 1000)/22.36)
= P(z ≤ -4.472) ≈ .000004.
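With software, the normal approximation for this example can be evaluated without tables, and the exact binomial probability can also be computed for comparison. The sketch below assumes Python 3.8 or later (for `math.comb`) and uses exact integer arithmetic for the binomial sum, which works here because p = 1/2 makes every outcome equally likely. The helper name `normal_cdf` is introduced for illustration.

```python
import math

def normal_cdf(x, mu, sigma):
    """P(Y <= x) for Y ~ Normal(mu, sigma), via the complementary error function."""
    return 0.5 * math.erfc(-(x - mu) / (sigma * math.sqrt(2.0)))

n, p = 2000, 0.5
mu = n * p                           # 1000
sigma = math.sqrt(n * p * (1 - p))   # about 22.36

# Normal approximation to P(X <= 900), no continuity correction
approx = normal_cdf(900, mu, sigma)

# Exact binomial probability: with p = 1/2 each of the 2**n outcomes is equally likely
exact = sum(math.comb(n, k) for k in range(901)) / 2**n

print(f"normal approximation: {approx:.3g}")
print(f"exact binomial:       {exact:.3g}")
```

Both values are on the order of a few in one million, so the conclusion does not depend on which one is used.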
8 - 71
Example 11 - Solution
The probability is so small (about
4 in one million) that observing 900
or fewer persons who prefer Coke
is almost certain not to occur if
the assumption of no preference
is true. [The result is similar with the
continuity correction, P(Y ≤ 900.5).]
8 - 72
Continuity Correction
• The normal approximation to
the binomial can be improved
by using a continuity
correction.
• Example 12
Suppose that you wished to
determine the probability that a
binomial random variable
(n = 20, p = .5) is equal to 5.
8 - 73
Example 12
• To approximate the probability using
the normal would be equivalent to
approximating the area of the
shaded rectangle in the figure.
• To approximate the area of the
rectangle using the normal would
require finding the area under the
curve between 4.5 and 5.5.
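The rectangle-versus-curve idea can be checked numerically. The following is a minimal sketch for the binomial with n = 20 and p = .5, using only the Python standard library; `normal_cdf` is a helper defined here for illustration.

```python
import math

def normal_cdf(x, mu, sigma):
    """P(Y <= x) for Y ~ Normal(mu, sigma)."""
    return 0.5 * math.erfc(-(x - mu) / (sigma * math.sqrt(2.0)))

n, p = 20, 0.5
mu = n * p
sigma = math.sqrt(n * p * (1 - p))

# Exact height (and area) of the rectangle at x = 5
exact = math.comb(n, 5) * p**5 * (1 - p)**(n - 5)

# Area under the normal curve between 4.5 and 5.5
approx = normal_cdf(5.5, mu, sigma) - normal_cdf(4.5, mu, sigma)

print(f"exact P(X = 5):         {exact:.4f}")
print(f"normal area 4.5 to 5.5: {approx:.4f}")
```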
8 - 74
Example 12
• Similarly, to use the normal to
approximate the probability
that a binomial random
variable was 5 or less implies
finding the area of the
rectangles for 0, 1, 2, 3, 4,
and 5.
8 - 75
Example 12
• Instead of using the normal
approximation P(X ≤ 5), use
the continuity correction P(X ≤
5.5) in order to accumulate all
of the probability under the
normal curve that corresponds
to the rectangle associated
with the point 5.
8 - 76
Example 12
• If the problem were to find the
probability that the binomial
random variable were greater
than 4, then the continuity
correction for the normal
approximation would be
P(X ≥ 4.5).
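To see how much the continuity correction helps, the probability that the binomial (n = 20, p = .5) is 5 or less can be computed three ways: exactly, with the normal cut at 5, and with the normal cut at 5.5. This is an illustrative sketch using the standard library only; the variable names are assumptions.

```python
import math

def normal_cdf(x, mu, sigma):
    """P(Y <= x) for Y ~ Normal(mu, sigma)."""
    return 0.5 * math.erfc(-(x - mu) / (sigma * math.sqrt(2.0)))

n, p = 20, 0.5
mu = n * p
sigma = math.sqrt(n * p * (1 - p))

# Exact: sum of the rectangles for 0, 1, 2, 3, 4, and 5
exact = sum(math.comb(n, k) * p**k * (1 - p)**(n - k) for k in range(6))

uncorrected = normal_cdf(5.0, mu, sigma)  # cuts the rectangle at 5 in half
corrected = normal_cdf(5.5, mu, sigma)    # includes the whole rectangle at 5

print(f"exact:       {exact:.4f}")
print(f"uncorrected: {uncorrected:.4f}")
print(f"corrected:   {corrected:.4f}")
```

In this case the corrected value lands noticeably closer to the exact probability than the uncorrected one.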
8 - 77
Example 13
• A supplier of diskettes has
recently raised its prices.
• A company which purchases
large quantities of diskettes
has decided to look for other
suppliers.
• One of the critical concerns in
the purchase of diskettes is
the fraction that will not format
properly.
8 - 78
Example 13
• Disks that will not format will
be rejected by duplicating
equipment.
• A potential supplier claims that
only 1 percent of their disks
will not format.
• Assume that the supplier's
claim is correct.
8 - 79
Example 13
• If a sample of 1000 diskettes
is purchased, what are the
answers to the questions that
follow?
• Let X = the number of
diskettes which will not format
in a sample of 1000.
• X has a binomial distribution
with n = 1000 and p = .01.
8 - 80
Example 13 - A
What is the expected number
of diskettes in the sample that
will not format?
μ = np = (1000)(.01) = 10
8 - 81
Example 13 - B
What is the standard deviation of the
number of diskettes in the sample that will
not format?
σ = √(npq) = √(1000(.01)(.99)) = 3.1464
Note: Since np and n(1-p) are both ≥ 5, X has
an approximately normal distribution with
μ = np = (1000)(.01) = 10 and
σ = √(npq) = √(1000(.01)(.99)) = 3.1464.
Since np = 10 ≤ 10, use continuity
correction.
8 - 82
Ex13 – C, x=16,17,…
are included
What is the probability that
more than 15 of the diskettes
in the sample will not format?
[pillar for 16 begins at 15.5]
P(X > 15) ≈ P(X > 15.5)
= P((X - μ)/σ > (15.5 - 10)/3.1464)
= P(z > 1.75)
= .5 - P(0 < z < 1.75)
= .5 - .4599 = .0401
8 - 83
Ex13 – D, x=21,22,…
are included
What is the probability that
more than 20 of the diskettes
in the sample will not format?
P(X > 20) ≈ P(X > 20.5)
= P{[(X - μ)/σ] > [(20.5 - 10)/3.1464]}
= P(z > 3.34)
= .5 - .4996 = .0004
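The tail probabilities in parts C and D can be reproduced without a z table by standardizing with μ = 10 and σ = 3.1464. The sketch below uses only the Python standard library; `upper_tail` and `prob_more_than` are hypothetical helper names. Small differences from the table values come from rounding z to two decimals.

```python
import math

def upper_tail(z):
    """P(Z > z) for a standard normal Z."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

n, p = 1000, 0.01
mu = n * p
sigma = math.sqrt(n * p * (1 - p))

def prob_more_than(k):
    """Normal approximation to P(X > k) with continuity correction (cut at k + 0.5)."""
    return upper_tail((k + 0.5 - mu) / sigma)

part_c = prob_more_than(15)
part_d = prob_more_than(20)

print(f"P(X > 15) is about {part_c:.4f}")
print(f"P(X > 20) is about {part_d:.4f}")
```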
8 - 84
Example 13 - E
Suppose you observed 22
diskettes fail. Would you
believe the supplier's claim?
Give reasons for your
conclusions.
No. From part D, we know
that if p = .01, P(X > 20) = .0004;
i.e., if p = .01, it is very unlikely (.04%
chance) that the number of defective
diskettes is greater than 20.
8 - 85
Approximating the
Poisson by the Normal
• To use the normal
approximation, the mean and
variance of the normal should
be set to the mean and
variance of the Poisson.
• Since the mean and variance
of the Poisson are both λ, the
appropriate mean, variance,
and standard deviation for the
normal would be
μ = λ, σ² = λ, σ = √λ.
8 - 86
Example 14
A company manufacturing metal
sheets believes that the number
of defects on a 10’ by 10’ sheet
of metal follows a Poisson
distribution with an average
defect rate of 5 per sheet.
Metal Inc.
8 - 87
Example 14 - A
Find the standard deviation of
the number of defects per
sheet.
σ = √λ = √5 = 2.236
8 - 88
No Continuity
Correction if λ ≥ 5.
• Let X = the number of defects
on a 10' by 10' sheet of metal.
• X has a Poisson distribution
with a mean of 5 and standard
deviation of 2.236.
• X also has an approximately
normal distribution with mean
of 5 and standard deviation of
2.236.
• No continuity correction is
necessary because λ = 5 ≥ 5.
8 - 89
Compute the exact
prob. (Poisson Tables)
Using the Poisson table in the
appendix, find the exact
probability of observing at
least 10 defects per sheet.
P(X ≥ 10) = P(X=10) + P(X=11) +
P(X=12) + P(X=13) + P(X=14) +
P(X=15) + ...
= .0181 + .0082 + .0034 + .0013
+ .0005 + .0002 + 0 = .0317
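The table lookup can be cross-checked by computing the Poisson tail directly from the probability function. This is a minimal standard-library sketch; the helper name `poisson_pmf` is an assumption. The result differs slightly from the table sum because the table entries are rounded to four decimals.

```python
import math

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return math.exp(-lam) * lam**k / math.factorial(k)

lam = 5

# P(X >= 10) = 1 - P(X <= 9)
exact_tail = 1.0 - sum(poisson_pmf(k, lam) for k in range(10))

print(f"P(X = 10)  = {poisson_pmf(10, lam):.4f}")
print(f"P(X >= 10) = {exact_tail:.4f}")
```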
8 - 90
Compute Normal
Approx to Poisson
Using the normal
approximation to the Poisson,
find the probability of
observing at least 10 defects
per sheet.
P(X ≥ 10) = P((X - μ)/σ ≥ (10 - 5)/2.236)
= P(z ≥ 2.24)
= .5 - P(0 < z < 2.24)
= .5 - .4875 = .0125
8 - 91
Normal Approx
underestimates prob.
How do the answers in parts B
and C compare?
The difference is
.0125 - .0317 = -.0192.
i.e., the normal approximation
underestimated the actual
probability by about .02.
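The comparison can be reproduced in a few lines. The sketch below computes the exact Poisson tail and the normal approximation without continuity correction for λ = 5, then takes the difference; it uses only the Python standard library, and the variable names are assumptions. The values differ slightly from the table-based figures because nothing is rounded along the way.

```python
import math

lam = 5
sigma = math.sqrt(lam)

# Exact Poisson tail: P(X >= 10) = 1 - P(X <= 9)
exact = 1.0 - sum(math.exp(-lam) * lam**k / math.factorial(k) for k in range(10))

# Normal approximation without continuity correction: P(Z >= (10 - 5)/sigma)
z = (10 - lam) / sigma
approx = 0.5 * math.erfc(z / math.sqrt(2.0))

difference = approx - exact  # negative: the approximation is too small

print(f"exact:      {exact:.4f}")
print(f"normal:     {approx:.4f}")
print(f"difference: {difference:.4f}")
```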
8 - 92
How Stats Learning can
Help in Life
• Problem Solving skills
• Attention to Detail and
focusing on the problem
• Clear Statement of the
Problem (avoiding fuzzy)
• Delineation of the steps in
solving problems and the
patience in implementing
them.
8 - 93