Transcript Document

4 Continuous Random Variables and Probability Distributions
4.1 Probability Density Functions
Probability Density Functions
A discrete random variable is one whose possible values
either constitute a finite set or else can be listed in an
infinite sequence (a list in which there is a first element, a
second element, etc.).
A random variable whose set of possible values is an entire
interval of numbers is not discrete.
Probability Density Functions
A random variable is continuous if both of the following
apply:
1. Its set of possible values consists either of all numbers in
a single interval on the number line or all numbers in a
disjoint union of such intervals (e.g., [0, 10] ∪ [20, 30]).
2. No possible value of the variable has positive probability,
that is, P(X = c) = 0 for any possible value c.
Probability Distributions for Continuous Variables
Definition
Let X be a continuous rv. Then a probability distribution
or probability density function (pdf) of X is a function f(x)
such that for any two numbers a and b with a ≤ b,

P(a ≤ X ≤ b) = ∫[a, b] f(x) dx
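As a quick numerical illustration (not part of the original text), an interval probability is just the integral of the pdf over that interval. The sketch below assumes Python with scipy; the pdf f(x) = 3x² on [0, 1] is a made-up example.

from scipy.integrate import quad

# Hypothetical pdf f(x) = 3x^2 on [0, 1] (nonnegative and integrates to 1)
def f(x):
    return 3 * x**2 if 0 <= x <= 1 else 0.0

a, b = 0.2, 0.6
prob, _ = quad(f, a, b)        # P(a <= X <= b) = integral of f from a to b
total, _ = quad(f, 0, 1)       # total area under the density curve
print(round(prob, 3), round(total, 3))   # 0.208 and 1.0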
Probability Distributions for Continuous Variables
That is, the probability that X takes on a value in the
interval [a, b] is the area above this interval and under the
graph of the density function, as illustrated in Figure 4.2.
P(a ≤ X ≤ b) = the area under the density curve between a and b
Figure 4.2
The graph of f(x) is often referred to as the density curve.
Probability Distributions for Continuous Variables
For f(x) to be a legitimate pdf, it must satisfy the following
two conditions:
1. f(x) ≥ 0 for all x
2. ∫[–∞, ∞] f(x) dx = area under the entire graph of f(x) = 1
Example 4
The direction of an imperfection with respect to a reference
line on a circular object such as a tire, brake rotor, or
flywheel is, in general, subject to uncertainty.
Consider the reference line connecting the valve stem on a
tire to the center point, and let X be the angle measured
clockwise to the location of an imperfection. One possible
pdf for X is

f(x) = 1/360 for 0 ≤ x < 360, and f(x) = 0 otherwise
Example 4
cont’d
The pdf is graphed in Figure 4.3.
The pdf and probability from Example 4
Figure 4.3
Example 4
cont’d
Clearly f(x) ≥ 0. The area under the density curve
is just the area of a rectangle:

(height)(base) = (1/360)(360) = 1

The probability that the angle is between 90° and 180° is

P(90 ≤ X ≤ 180) = ∫[90, 180] (1/360) dx = 90/360 = .25
Probability Distributions for Continuous Variables
Because whenever 0 ≤ a ≤ b ≤ 360 in Example 4.4,
P(a ≤ X ≤ b) depends only on the width b – a of the interval,
X is said to have a uniform distribution.
Definition
A continuous rv X is said to have a uniform distribution
on the interval [A, B] if the pdf of X is

f(x; A, B) = 1/(B – A) for A ≤ x ≤ B, and f(x; A, B) = 0 otherwise
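For readers who want to experiment, here is a minimal sketch (not from the text) using scipy's uniform distribution, which is parameterized by loc = A and scale = B – A; the values below follow the tire-angle rv of Example 4.

from scipy.stats import uniform

# Uniform distribution on [A, B]; scipy uses loc = A and scale = B - A
A, B = 0, 360                     # the angle rv of Example 4
X = uniform(loc=A, scale=B - A)

print(X.pdf(100))                 # density 1/360 at any point of [0, 360]
print(X.cdf(180) - X.cdf(90))     # P(90 <= X <= 180) = 0.25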
Probability Distributions for Continuous Variables
When X is a discrete random variable, each possible value
is assigned positive probability.
This is not true of a continuous random variable (that is, the
second condition of the definition is satisfied) because the
area under a density curve that lies above any single value
is zero:

P(X = c) = ∫[c, c] f(x) dx = lim(ε→0) ∫[c – ε, c + ε] f(x) dx = 0
Probability Distributions for Continuous Variables
The fact that P(X = c) = 0 when X is continuous has an
important practical consequence: The probability that X lies
in some interval between a and b does not depend on
whether the lower limit a or the upper limit b is included in
the probability calculation:
P(a ≤ X ≤ b) = P(a < X < b) = P(a < X ≤ b) = P(a ≤ X < b)          (4.1)
If X is discrete and both a and b are possible values (e.g.,
X is binomial with n = 20 and a = 5, b = 10), then all four of
the probabilities in (4.1) are different.
Example 5
“Time headway” in traffic flow is the elapsed time between
the time that one car finishes passing a fixed point and the
instant that the next car begins to pass that point.
Let X = the time headway for two randomly chosen
consecutive cars on a freeway during a period of heavy
flow. The following pdf of X is essentially the one suggested
in “The Statistical Properties of Freeway Traffic” (Transp.
Res., vol. 11: 221–228):

f(x) = .15e^(–.15(x – .5)) for x ≥ .5, and f(x) = 0 otherwise
Example 5
cont’d
The graph of f(x) is given in Figure 4.4; there is no density
associated with headway times less than .5, and headway
density decreases rapidly (exponentially fast) as x
increases from .5.
The density curve for time headway in Example 5
Figure 4.4
Example 5
cont’d
Clearly, f(x) ≥ 0; to show that ∫[–∞, ∞] f(x) dx = 1, we use the
calculus result ∫[a, ∞) e^(–kx) dx = (1/k)e^(–ka).
Then

∫[–∞, ∞] f(x) dx = ∫[.5, ∞) .15e^(–.15(x – .5)) dx = .15e^(.075) ∫[.5, ∞) e^(–.15x) dx

                 = .15e^(.075) · (1/.15)e^(–(.15)(.5)) = 1
Example 5
cont’d
The probability that headway time is at most 5 sec is

P(X ≤ 5) = ∫[–∞, 5] f(x) dx = ∫[.5, 5] .15e^(–.15(x – .5)) dx

         = .15e^(.075) ∫[.5, 5] e^(–.15x) dx

         = .15e^(.075) · [–(1/.15)e^(–.15x)] evaluated from x = .5 to x = 5
Example 5
cont’d
= e^(.075)(–e^(–.75) + e^(–.075))
= 1.078(–.472 + .928)
= .491
= P(less than 5 sec) = P(X < 5)
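A short numerical check of Example 5 (a sketch, assuming Python with scipy and numpy, which the text itself does not use):

import numpy as np
from scipy.integrate import quad

# Headway pdf: f(x) = .15 exp(-.15(x - .5)) for x >= .5, and 0 otherwise
def f(x):
    return 0.15 * np.exp(-0.15 * (x - 0.5)) if x >= 0.5 else 0.0

print(quad(f, 0.5, np.inf)[0])    # total area under the curve, 1.0
print(quad(f, 0.5, 5)[0])         # P(X <= 5), approximately .491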
The Cumulative Distribution Function
The Cumulative Distribution Function
The cumulative distribution function (cdf) F(x) for a discrete
rv X gives, for any specified number x, the probability
P(X ≤ x).
It is obtained by summing the pmf p(y) over all possible
values y satisfying y ≤ x.
The cdf of a continuous rv gives the same probabilities
P(X ≤ x) and is obtained by integrating the pdf f(y) between
the limits –∞ and x.
The Cumulative Distribution Function
Definition
The cumulative distribution function F(x) for a
continuous rv X is defined for every number x by
F(x) = P(X ≤ x) = ∫[–∞, x] f(y) dy
For each x, F(x) is the area under the density curve to the
left of x. This is illustrated in Figure 4.5, where F(x)
increases smoothly as x increases.
A pdf and associated cdf
Figure 4.5
Example 6
Let X, the thickness of a certain metal sheet, have a
uniform distribution on [A, B].
The density function is shown in Figure 4.6.
The pdf for a uniform distribution
Figure 4.6
Example 6
cont’d
For x < A, F(x) = 0, since there is no area under the graph
of the density function to the left of such an x.
For x ≥ B, F(x) = 1, since all the area is accumulated to the
left of such an x. Finally, for A ≤ x ≤ B,

F(x) = ∫[–∞, x] f(y) dy = ∫[A, x] 1/(B – A) dy = (x – A)/(B – A)
Example 6
cont’d
The entire cdf is

F(x) = 0 for x < A, (x – A)/(B – A) for A ≤ x < B, and 1 for x ≥ B

The graph of this cdf appears in Figure 4.7.
The cdf for a uniform distribution
Figure 4.7
Using F(x) to Compute Probabilities
Using F(x) to Compute Probabilities
The importance of the cdf here, just as for discrete rv’s, is
that probabilities of various intervals can be computed from
a formula for or table of F(x).
Proposition
Let X be a continuous rv with pdf f(x) and cdf F(x). Then for
any number a,
P(X > a) = 1 – F(a)
and for any two numbers a and b with a < b,
P(a ≤ X ≤ b) = F(b) – F(a)
Using F(x) to Compute Probabilities
Figure 4.8 illustrates the second part of this proposition; the
desired probability is the shaded area under the density
curve between a and b, and it equals the difference
between the two shaded cumulative areas.
Computing P(a ≤ X ≤ b) from cumulative probabilities
Figure 4.8
This is different from what is appropriate for a discrete
integer-valued random variable (e.g., binomial or Poisson):
P(a ≤ X ≤ b) = F(b) – F(a – 1) when a and b are integers.
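A brief sketch of this contrast (assuming Python with scipy; the particular distributions below are illustrative choices, not from the text):

from scipy.stats import uniform, binom

# Continuous rv: endpoints carry no probability, so P(a <= X <= b) = F(b) - F(a)
U = uniform(loc=0, scale=10)              # uniform on [0, 10]
print(U.cdf(7) - U.cdf(3))                # P(3 <= U <= 7) = 0.4

# Discrete integer-valued rv: P(a <= X <= b) = F(b) - F(a - 1), so x = a is included
B = binom(n=20, p=0.5)
print(B.cdf(10) - B.cdf(5 - 1))           # P(5 <= X <= 10)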
Example 7
Suppose the pdf of the magnitude X of a dynamic load on a
bridge (in newtons) is

f(x) = 1/8 + (3/8)x for 0 ≤ x ≤ 2, and f(x) = 0 otherwise

For any number x between 0 and 2,

F(x) = ∫[0, x] (1/8 + (3/8)y) dy = x/8 + (3/16)x²
Example 7
cont’d
Thus the complete cdf is

F(x) = 0 for x < 0, x/8 + (3/16)x² for 0 ≤ x ≤ 2, and 1 for x > 2
The graphs of f(x) and F(x) are shown in Figure 4.9.
The pdf and cdf for Example 4.7
Figure 4.9
Example 7
cont’d
The probability that the load is between 1 and 1.5 is
P(1 ≤ X ≤ 1.5) = F(1.5) – F(1) = 19/64 = .297

The probability that the load exceeds 1 is

P(X > 1) = 1 – P(X ≤ 1) = 1 – F(1)
Example 7
cont’d
= 1 – F(1) = 1 – 5/16 = 11/16 = .688

Once the cdf has been obtained, any probability involving X
can easily be calculated without any further integration.
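A numerical check of Example 7 (a sketch assuming Python with scipy and the pdf constants reconstructed above):

from scipy.integrate import quad

f = lambda x: 1/8 + (3/8) * x             # pdf on [0, 2], as reconstructed above
F = lambda x: x/8 + (3/16) * x**2         # its cdf on [0, 2]

print(F(1.5) - F(1))                      # P(1 <= X <= 1.5) = 0.296875
print(1 - F(1))                           # P(X > 1) = 0.6875
print(quad(f, 1, 1.5)[0])                 # direct integration gives the same .2969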
Obtaining f(x) from F(x)
Obtaining f(x) from F(x)
For X discrete, the pmf is obtained from the cdf by taking
the difference between two F(x) values. The continuous
analog of a difference is a derivative.
The following result is a consequence of the Fundamental
Theorem of Calculus.
Proposition
If X is a continuous rv with pdf f(x) and cdf F(x), then at
every x at which the derivative F′(x) exists, F′(x) = f(x).
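The derivative relationship can be checked numerically; here is a minimal sketch (Python assumed, reusing the Example 7 functions from above):

f = lambda x: 1/8 + (3/8) * x             # pdf
F = lambda x: x/8 + (3/16) * x**2         # cdf

x, h = 1.2, 1e-6
deriv = (F(x + h) - F(x - h)) / (2 * h)   # central-difference estimate of F'(x)
print(deriv, f(x))                        # both approximately 0.575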
Example 8
When X has a uniform distribution, F(x) is differentiable
except at x = A and x = B, where the graph of F(x) has
sharp corners.
Since F(x) = 0 for x < A and F(x) = 1 for
x > B, F′(x) = 0 = f(x) for such x.
For A < x < B,

F′(x) = d/dx[(x – A)/(B – A)] = 1/(B – A) = f(x)
Percentiles of a Continuous Distribution
Percentiles of a Continuous Distribution
When we say that an individual’s test score was at the 85th
percentile of the population, we mean that 85% of all
population scores were below that score and 15% were
above.
Similarly, the 40th percentile is the score that exceeds 40%
of all scores and is exceeded by 60% of all scores.
Percentiles of a Continuous Distribution
Proposition
Let p be a number between 0 and 1. The (100p)th
percentile of the distribution of a continuous rv X, denoted
by η(p), is defined by

p = F(η(p)) = ∫[–∞, η(p)] f(y) dy          (4.2)

According to Expression (4.2), η(p) is that value on the
measurement axis such that 100p% of the area under the
graph of f(x) lies to the left of η(p) and 100(1 – p)% lies to
the right.
Percentiles of a Continuous Distribution
Thus η(.75), the 75th percentile, is such that the area under
the graph of f(x) to the left of η(.75) is .75.
Figure 4.10 illustrates the definition.
The (100p)th percentile of a continuous distribution
Figure 4.10
Example 9
The distribution of the amount of gravel (in tons) sold by a
particular construction supply company in a given week is a
continuous rv X with pdf

f(x) = (3/2)(1 – x²) for 0 ≤ x ≤ 1, and f(x) = 0 otherwise

The cdf of sales for any x between 0 and 1 is

F(x) = ∫[0, x] (3/2)(1 – y²) dy = (3/2)(x – x³/3)
Example 9
cont’d
The graphs of both f(x) and F(x) appear in Figure 4.11.
The pdf and cdf for Example 4.9
Figure 4.11
Example 9
cont’d
The (100p)th percentile of this distribution satisfies the
equation

p = F(η(p)) = (3/2)[η(p) – (η(p))³/3]

that is,

(η(p))³ – 3η(p) + 2p = 0

For the 50th percentile, p = .5, and the equation to be
solved is η³ – 3η + 1 = 0; the solution is η = η(.5) = .347. If
the distribution remains the same from week to week, then
in the long run 50% of all weeks will result in sales of less
than .347 ton and 50% in more than .347 ton.
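The percentile equation can also be solved numerically; a sketch assuming Python with scipy's root finder:

from scipy.optimize import brentq

# Solve eta^3 - 3*eta + 2p = 0 for eta in [0, 1] (gravel-sales percentiles, Example 9)
def percentile(p):
    return brentq(lambda eta: eta**3 - 3 * eta + 2 * p, 0, 1)

print(round(percentile(0.5), 3))    # median, approximately 0.347
print(round(percentile(0.75), 3))   # 75th percentile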
Percentiles of a Continuous Distribution
Definition
The median of a continuous distribution, denoted by μ̃, is
the 50th percentile, so μ̃ satisfies .5 = F(μ̃). That is, half the
area under the density curve is to the left of μ̃ and half is to
the right of μ̃.
A continuous distribution whose pdf is symmetric—the
graph of the pdf to the left of some point is a mirror image
of the graph to the right of that point—has median equal
to the point of symmetry, since half the area under the
curve lies to either side of this point.
Percentiles of a Continuous Distribution
Figure 4.12 gives several examples. The error in a
measurement of a physical quantity is often assumed to
have a symmetric distribution.
Medians of symmetric distributions
Figure 4.12
Expected Values
Expected Values
For a discrete random variable X, E(X) was obtained by
summing x · p(x) over possible X values.
Here we replace summation by integration and the pmf by
the pdf to get a continuous weighted average.
Definition
The expected or mean value of a continuous rv X with
pdf f(x) is

μ_X = E(X) = ∫[–∞, ∞] x · f(x) dx
Example 10
The pdf of weekly gravel sales X was

f(x) = (3/2)(1 – x²) for 0 ≤ x ≤ 1, and f(x) = 0 otherwise

So

E(X) = ∫[0, 1] x · (3/2)(1 – x²) dx = (3/2)(x²/2 – x⁴/4) evaluated from 0 to 1 = 3/8
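A quick numerical confirmation (sketch, assuming Python with scipy):

from scipy.integrate import quad

f = lambda x: 1.5 * (1 - x**2)                   # gravel-sales pdf on [0, 1]
mean, _ = quad(lambda x: x * f(x), 0, 1)         # E(X) = integral of x * f(x)
print(mean)                                      # 0.375, i.e. 3/8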
Expected Values
When the pdf f(x) specifies a model for the distribution of
values in a numerical population, then μ is the population
mean, which is the most frequently used measure of
population location or center.
Often we wish to compute the expected value of some
function h(X) of the rv X.
If we think of h(X) as a new rv Y, techniques from
mathematical statistics can be used to derive the pdf of Y,
and E(Y) can then be computed from the definition.
Expected Values
Fortunately, as in the discrete case, there is an easier way
to compute E[h(X)].
Proposition
If X is a continuous rv with pdf f(x) and h(X) is any function
of X, then
E[h(X)] = μ_h(X) = ∫[–∞, ∞] h(x) · f(x) dx
Example 11
Two species are competing in a region for control of a
limited amount of a certain resource.
Let X = the proportion of the resource controlled by species
1 and suppose X has pdf
f(x) = 1 for 0 ≤ x ≤ 1, and f(x) = 0 otherwise

which is a uniform distribution on [0, 1]. (In her book
Ecological Diversity, E. C. Pielou calls this the “broken-stick”
model for resource allocation, since it is analogous to
breaking a stick at a randomly chosen point.)
Example 11
cont’d
Then the species that controls the majority of this resource
controls the amount
h(X) = max(X, 1 – X) = 1 – X if 0 ≤ X < 1/2, and X if 1/2 ≤ X ≤ 1

The expected amount controlled by the species having
majority control is then

E[h(X)] = ∫[–∞, ∞] max(x, 1 – x) · f(x) dx
Example 11
cont’d
= ∫[0, 1] max(x, 1 – x) · 1 dx

= ∫[0, 1/2] (1 – x) · 1 dx + ∫[1/2, 1] x · 1 dx = 3/4
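A numerical check of this expected value (sketch, Python with scipy assumed):

from scipy.integrate import quad

# E[max(X, 1 - X)] with X uniform on [0, 1]; the kink at x = 0.5 is flagged to quad
expected, _ = quad(lambda x: max(x, 1 - x), 0, 1, points=[0.5])
print(expected)    # 0.75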
Variance
For h(X), a linear function, E[h(X)] = E(aX + b) = aE(X) + b.
In the discrete case, the variance of X was defined as the
expected squared deviation from  and was calculated by
summation. Here again integration replaces summation.
Definition
The variance of a continuous random variable X with pdf
f(x) and mean value μ is

σ_X² = V(X) = ∫[–∞, ∞] (x – μ)² · f(x) dx = E[(X – μ)²]

The standard deviation (SD) of X is σ_X = √V(X).
Variance
The variance and standard deviation give quantitative
measures of how much spread there is in the distribution or
population of x values.
Again σ is roughly the size of a typical deviation from μ.
Computation of σ² is facilitated by using the same shortcut
formula employed in the discrete case.
Proposition
V(X) = E(X²) – [E(X)]²
Example 12
For weekly gravel sales, we computed E(X) = 3/8. Since

E(X²) = ∫[–∞, ∞] x² · f(x) dx = ∫[0, 1] x² · (3/2)(1 – x²) dx

      = (3/2) ∫[0, 1] (x² – x⁴) dx = 1/5

V(X) = 1/5 – (3/8)² = 19/320 = .059
Example 12
cont’d
and X = .244
When h(X) = aX + b, the expected value and variance of
h(X) satisfy the same properties as in the discrete case:
E[h(X)] = a + b and V[h(X)] = a2  2.
56
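The mean, E(X²), variance, and standard deviation can be reproduced numerically; a sketch assuming Python with scipy and numpy:

import numpy as np
from scipy.integrate import quad

f = lambda x: 1.5 * (1 - x**2)                      # gravel-sales pdf on [0, 1]
mean = quad(lambda x: x * f(x), 0, 1)[0]            # 0.375
ex2  = quad(lambda x: x**2 * f(x), 0, 1)[0]         # E(X^2) = 0.2
var  = ex2 - mean**2                                # shortcut formula E(X^2) - [E(X)]^2
print(mean, var, np.sqrt(var))                      # 0.375, 0.059375, 0.2437...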
The Normal Distribution
The Normal Distribution
The normal distribution is the most important one in all of
probability and statistics.
Many numerical populations have distributions that can be
fit very closely by an appropriate normal curve.
Examples: heights, weights, measurement errors in
scientific experiments, anthropometric measurements on
fossils, reaction times in psychological experiments,
measurements of intelligence and aptitude, scores on
various tests, and numerous economic measures and
indicators.
The Normal Distribution
Definition
A continuous rv X is said to have a normal distribution
with parameters μ and σ (or μ and σ²), where –∞ < μ < ∞
and 0 < σ, if the pdf of X is

f(x; μ, σ) = (1/(σ√(2π))) e^(–(x – μ)²/(2σ²))    for –∞ < x < ∞          (4.3)

where e = 2.71828… is the base of the natural logarithm and π = 3.14159….
The Normal Distribution
The statement that X is normally distributed with
parameters μ and σ² is often abbreviated X ~ N(μ, σ²).
Clearly f(x; μ, σ) ≥ 0, but a somewhat complicated calculus
argument must be used to verify that ∫[–∞, ∞] f(x; μ, σ) dx = 1.
It can be shown that E(X) = μ and V(X) = σ², so the
parameters are the mean and the standard deviation of X.
The Normal Distribution
Figure 4.13 presents graphs of f(x; μ, σ) for several
different (μ, σ) pairs.
Two different normal density curves
Figure 4.13(a)
Visualizing  and  for a normal
distribution
Figure 4.13(b)
[Figure: A family of normal density curves. In one panel the means are the same (μ = 15) while the standard deviations differ (σ = 2, 4, and 6); in the other the means differ (μ = 10, 15, and 20) while the standard deviations are the same (σ = 3).]
The Standard Normal Distribution
The Standard Normal Distribution
The computation of P(a ≤ X ≤ b) when X is a normal rv with
parameters μ and σ requires evaluating

∫[a, b] (1/(σ√(2π))) e^(–(x – μ)²/(2σ²)) dx          (4.4)
None of the standard integration techniques can be used to
accomplish this. Instead, for μ = 0 and σ = 1, Expression
(4.4) has been calculated using numerical techniques
and tabulated for certain values of a and b.
This table can also be used to compute probabilities for any
other values of  and  under consideration.
The Standard Normal Distribution
Definition
The normal distribution with parameter values μ = 0 and
σ = 1 is called the standard normal distribution.
A random variable having a standard normal distribution is
called a standard normal random variable and will be
denoted by Z. The pdf of Z is

f(z; 0, 1) = (1/√(2π)) e^(–z²/2)    for –∞ < z < ∞

The graph of f(z; 0, 1) is called the standard normal (or z)
curve. Its inflection points are at 1 and –1. The cdf of Z is

P(Z ≤ z) = ∫[–∞, z] f(y; 0, 1) dy

which we will denote by Φ(z).
The Standard Normal Distribution
The standard normal distribution almost never serves as a
model for a naturally arising population.
Instead, it is a reference distribution from which information
about other normal distributions can be obtained.
Appendix Table A.3 gives Φ(z) = P(Z ≤ z), the area under the
standard normal density curve to the left of z, for
z = –3.49, –3.48,..., 3.48, 3.49.
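Software offers an alternative to Table A.3; a minimal sketch (Python with scipy assumed):

from scipy.stats import norm

# Phi(z) = P(Z <= z) for the standard normal rv
print(norm.cdf(1.25))                      # 0.8944, used in Example 13(a)
print(1 - norm.cdf(1.25))                  # 0.1056, Example 13(b)
print(norm.cdf(1.25) - norm.cdf(-0.38))    # 0.5424, Example 13(d)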
The Standard Normal Distribution
Figure 4.14 illustrates the type of cumulative area
(probability) tabulated in Table A.3. From this table, various
other probabilities involving Z can be calculated.
Standard normal cumulative areas tabulated in Appendix Table A.3
Figure 4.14
Example 13
Let’s determine the following standard normal probabilities:
(a) P(Z ≤ 1.25),
(b) P(Z > 1.25),
(c) P(Z ≤ –1.25), and
(d) P(–.38 ≤ Z ≤ 1.25).
a. P(Z ≤ 1.25) = Φ(1.25), a probability that is tabulated in
Appendix Table A.3 at the intersection of the row
marked 1.2 and the column marked .05.
The number there is .8944, so P(Z ≤ 1.25) = .8944.
Example 13
cont’d
Figure 4.15(a) illustrates this probability.
Normal curve areas (probabilities) for Example 13
Figure 4.15(a)
b. P(Z > 1.25) = 1 – P(Z ≤ 1.25) = 1 – Φ(1.25), the area
under the z curve to the right of 1.25 (an upper-tail
area). Then Φ(1.25) = .8944 implies that
P(Z > 1.25) = .1056.
Example 13
cont’d
Since Z is a continuous rv, P(Z ≥ 1.25) = .1056.
See Figure 4.15(b).
Normal curve areas (probabilities) for Example 13
Figure 4.15(b)
c. P(Z ≤ –1.25) = Φ(–1.25), a lower-tail area. Directly from
Appendix Table A.3, Φ(–1.25) = .1056.
By symmetry of the z curve, this is the same answer as
in part (b).
Example 13
cont’d
d. P(–.38 ≤ Z ≤ 1.25) is the area under the standard
normal curve above the interval whose left endpoint is
–.38 and whose right endpoint is 1.25.
From Section 4.2, if X is a continuous rv with cdf F(x),
then P(a ≤ X ≤ b) = F(b) – F(a).
Thus P(–.38 ≤ Z ≤ 1.25) = Φ(1.25) – Φ(–.38)
= .8944 – .3520
= .5424
Example 13
cont’d
See Figure 4.16.
P(–.38 ≤ Z ≤ 1.25) as the difference between two cumulative areas
Figure 4.16
Percentiles of the Standard Normal Distribution
Percentiles of the Standard Normal Distribution
For any p between 0 and 1, Appendix Table A.3 can be
used to obtain the (100p)th percentile of the standard
normal distribution.
Example 14
The 99th percentile of the standard normal distribution is
that value on the horizontal axis such that the area under
the z curve to the left of the value is .9900.
Appendix Table A.3 gives for fixed z the area under the
standard normal curve to the left of z, whereas here we
have the area and want the value of z. This is the “inverse”
problem to P(Z ≤ z) = ?,
so the table is used in an inverse fashion: Find in the
middle of the table .9900; the row and column in which it
lies identify the 99th z percentile.
Example 14
cont’d
Here .9901 lies at the intersection of the row marked 2.3
and column marked .03, so the 99th percentile is
(approximately) z = 2.33.
(See Figure 4.17.)
Finding the 99th percentile
Figure 4.17
Example 14
cont’d
By symmetry, the first percentile is as far below 0 as the
99th is above 0, so equals –2.33 (1% lies below the first
and also above the 99th).
(See Figure 4.18.)
The relationship between the 1st and 99th percentiles
Figure 4.18
Percentiles of the Standard Normal Distribution
In general, the (100p)th percentile is identified by the row
and column of Appendix Table A.3 in which the entry p is
found (e.g., the 67th percentile is obtained by finding .6700
in the body of the table, which gives z = .44).
If p does not appear, the number closest to it is often used,
although linear interpolation gives a more accurate answer.
Percentiles of the Standard Normal Distribution
For example, to find the 95th percentile, we look for .9500
inside the table.
Although .9500 does not appear, both .9495 and .9505 do,
corresponding to z = 1.64 and 1.65, respectively.
Since .9500 is halfway between the two probabilities that
do appear, we will use 1.645 as the 95th percentile and
–1.645 as the 5th percentile.
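Percentiles can also be obtained with software instead of searching the body of the table; a sketch (Python with scipy assumed), where ppf is scipy's name for the inverse cdf:

from scipy.stats import norm

print(norm.ppf(0.99))    # 99th percentile, approximately 2.33
print(norm.ppf(0.95))    # 95th percentile, approximately 1.645
print(norm.ppf(0.05))    # 5th percentile, approximately -1.645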
z Notation for z Critical Values
z Notation for z Critical Values
In statistical inference, we will need the values on the
horizontal z axis that capture certain small tail areas under
the standard normal curve.
Notation
z_α will denote the value on the z axis for which α of the
area under the z curve lies to the right of z_α.
(See Figure 4.19.)
z_α notation illustrated
Figure 4.19
z Notation for z Critical Values
For example, z.10 captures upper-tail area .10, and z.01
captures upper-tail area .01.
Since α of the area under the z curve lies to the right of z_α,
1 – α of the area lies to its left. Thus z_α is the 100(1 – α)th
percentile of the standard normal distribution.
By symmetry the area under the standard normal curve to
the left of –z_α is also α. The z_α's are usually referred to as
z critical values.
z Notation for z Critical Values
Table 4.1 lists the most useful z percentiles and z_α values.
Standard Normal Percentiles and Critical Values
Table 4.1
Example 15
z.05 is the 100(1 – .05)th = 95th percentile of the standard
normal distribution, so z.05 = 1.645.
The area under the standard normal curve to the left of
–z.05 is also .05. (See Figure 4.20.)
Finding z.05
Figure 4.20
Nonstandard Normal Distributions
Nonstandard Normal Distributions
When X ~ N(μ, σ²), probabilities involving X are computed
by “standardizing.” The standardized variable is (X – μ)/σ.
Subtracting μ shifts the mean from μ to zero, and then
dividing by σ scales the variable so that the standard
deviation is 1 rather than σ.
Proposition
If X has a normal distribution with mean μ and standard
deviation σ, then

Z = (X – μ)/σ
Nonstandard Normal Distributions
has a standard normal distribution. Thus

P(a ≤ X ≤ b) = P((a – μ)/σ ≤ Z ≤ (b – μ)/σ) = Φ((b – μ)/σ) – Φ((a – μ)/σ)

P(X ≤ a) = Φ((a – μ)/σ)          P(X ≥ b) = 1 – Φ((b – μ)/σ)
Nonstandard Normal Distributions
The key idea of the proposition is that by standardizing, any
probability involving X can be expressed as a probability
involving a standard normal rv Z, so that Appendix Table
A.3 can be used.
This is illustrated in Figure 4.21.
Equality of nonstandard and standard normal curve areas
Figure 4.21
Nonstandard Normal Distributions
The proposition can be proved by writing the cdf of
Z = (X – μ)/σ as

P(Z ≤ z) = P(X ≤ σz + μ) = ∫[–∞, σz + μ] f(x; μ, σ) dx

Using a result from calculus, this integral can be
differentiated with respect to z to yield the desired pdf
f(z; 0, 1).
Example 16
The time that it takes a driver to react to the brake lights on
a decelerating vehicle is critical in helping to avoid rear-end
collisions.
The article “Fast-Rise Brake Lamp as a Collision-Prevention Device” (Ergonomics, 1993: 391–395) suggests
that reaction time for an in-traffic response to a brake signal
from standard brake lights can be modeled with a normal
distribution having mean value 1.25 sec and standard
deviation of .46 sec.
Example 16
cont’d
What is the probability that reaction time is between 1.00
sec and 1.75 sec? If we let X denote reaction time, then
standardizing gives
1.00 ≤ X ≤ 1.75

if and only if

(1.00 – 1.25)/.46 ≤ (X – 1.25)/.46 ≤ (1.75 – 1.25)/.46

Thus

P(1.00 ≤ X ≤ 1.75) = P((1.00 – 1.25)/.46 ≤ Z ≤ (1.75 – 1.25)/.46)
Example 16
cont’d
= P(–.54 ≤ Z ≤ 1.09) = Φ(1.09) – Φ(–.54)
= .8621 – .2946 = .5675

This is illustrated in Figure 4.22.
Normal curves for Example 16
Figure 4.22
Example 16
cont’d
Similarly, if we view 2 sec as a critically long reaction
time, the probability that actual reaction time will exceed
this value is

P(X > 2) = P(Z > (2 – 1.25)/.46) = P(Z > 1.63) = 1 – Φ(1.63) = .0516
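A check of both Example 16 probabilities (sketch, Python with scipy assumed):

from scipy.stats import norm

mu, sigma = 1.25, 0.46                    # reaction-time parameters
X = norm(loc=mu, scale=sigma)

# P(1.00 <= X <= 1.75): standardize by hand or use the nonstandard cdf directly
print(norm.cdf((1.75 - mu) / sigma) - norm.cdf((1.00 - mu) / sigma))
print(X.cdf(1.75) - X.cdf(1.00))          # same value, approximately .57

print(X.sf(2))                            # P(X > 2), approximately .05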
Percentiles of an Arbitrary Normal Distribution
Percentiles of an Arbitrary Normal Distribution
The (100p)th percentile of a normal distribution with mean
μ and standard deviation σ is easily related to the (100p)th
percentile of the standard normal distribution.
Proposition

(100p)th percentile for normal (μ, σ) = μ + [(100p)th percentile for standard normal] · σ

Another way of saying this is that if z is the desired
percentile for the standard normal distribution, then the
desired percentile for the normal (μ, σ) distribution is z
standard deviations from μ.
Example 18
The amount of distilled water dispensed by a certain
machine is normally distributed with mean value 64 oz and
standard deviation .78 oz.
What container size c will ensure that overflow occurs only
.5% of the time? If X denotes the amount dispensed, the
desired condition is that P(X > c) = .005, or, equivalently,
that P(X ≤ c) = .995.
Thus c is the 99.5th percentile of the normal distribution
with μ = 64 and σ = .78.
Example 18
cont’d
The 99.5th percentile of the standard normal distribution is
2.58, so
c = η(.995) = 64 + (2.58)(.78) = 64 + 2.0 = 66 oz
This is illustrated in Figure 4.23.
Distribution of amount dispensed for Example 18
Figure 4.23
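The container size can also be computed directly from the inverse cdf; a sketch (Python with scipy assumed):

from scipy.stats import norm

mu, sigma = 64, 0.78
c = norm.ppf(0.995, loc=mu, scale=sigma)   # 99.5th percentile of N(64, .78^2)
print(round(c, 1))                         # approximately 66.0 oz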
The Normal Distribution and Discrete Populations
The Normal Distribution and Discrete Populations
The normal distribution is often used as an approximation
to the distribution of values in a discrete population.
In such situations, extra care should be taken to ensure
that probabilities are computed in an accurate manner.
Example 19
IQ in a particular population (as measured by a standard
test) is known to be approximately normally distributed with
 = 100 and  = 15.
What is the probability that a randomly selected individual
has an IQ of at least 125?
Letting X = the IQ of a randomly chosen person, we wish
P(X ≥ 125).
The temptation here is to standardize X ≥ 125 as in
previous examples. However, the IQ population distribution
is actually discrete, since IQs are integer-valued.
Example 19
cont’d
So the normal curve is an approximation to a discrete
probability histogram, as pictured in Figure 4.24.
A normal approximation to a discrete distribution
Figure 4.24
The rectangles of the histogram are centered at integers,
so IQs of at least 125 correspond to rectangles beginning
at 124.5, as shaded in Figure 4.24.
Example 19
cont’d
Thus we really want the area under the approximating
normal curve to the right of 124.5.
Standardizing this value gives P(Z ≥ 1.63) = .0516,
whereas standardizing 125 results in P(Z ≥ 1.67) = .0475.
The difference is not great, but the answer .0516 is more
accurate. Similarly, P(X = 125) would be approximated by
the area between 124.5 and 125.5, since the area under
the normal curve above the single value 125 is zero.
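The continuity correction is easy to see numerically; a sketch (Python with scipy assumed):

from scipy.stats import norm

mu, sigma = 100, 15

print(norm.sf(124.5, loc=mu, scale=sigma))   # with the correction, about .051
print(norm.sf(125,   loc=mu, scale=sigma))   # standardizing 125 directly, about .048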
Example 19
cont’d
The correction for discreteness of the underlying
distribution in Example 19 is often called a continuity
correction.
It is useful in the following application of the normal
distribution to the computation of binomial probabilities.
Approximating the Binomial Distribution
Approximating the Binomial Distribution
Recall that the mean value and standard deviation of a
binomial random variable X are μ_X = np and σ_X = √(np(1 – p)),
respectively.
Approximating the Binomial Distribution
Figure 4.25 displays a binomial probability histogram for
the binomial distribution with n = 20, p = .6, for which
 = 20(.6) = 12 and  =
Binomial probability histogram for n = 20, p = .6 with
normal approximation curve superimposed
Figure 4.25
106
Approximating the Binomial Distribution
A normal curve with this μ and σ has been superimposed
on the probability histogram.
Although the probability histogram is a bit skewed (because
p ≠ .5), the normal curve gives a very good approximation,
especially in the middle part of the picture.
The area of any rectangle (probability of any particular
X value) except those in the extreme tails can be
accurately approximated by the corresponding normal
curve area.
Approximating the Binomial Distribution
For example,
P(X = 10) = B(10; 20, .6) – B(9; 20, .6) = .117,
whereas the area under the normal curve between 9.5 and
10.5 is P(–1.14 ≤ Z ≤ –.68) = .1212.
More generally, as long as the binomial probability
histogram is not too skewed, binomial probabilities can be
well approximated by normal curve areas.
It is then customary to say that X has approximately a
normal distribution.
Approximating the Binomial Distribution
Proposition
Let X be a binomial rv based on n trials with success
probability p. Then if the binomial probability histogram is
not too skewed, X has approximately a normal distribution
with μ = np and σ = √(np(1 – p)).
In particular, for x = a possible value of X,

P(X ≤ x) = B(x; n, p) ≈ (area under the normal curve to the left of x + .5)
         = Φ((x + .5 – np)/√(np(1 – p)))
Approximating the Binomial Distribution
In practice, the approximation is adequate provided that
both np ≥ 10 and n(1 – p) ≥ 10, since there is then enough
symmetry in the underlying binomial distribution.
A direct proof of this result is quite difficult. In the next
chapter we’ll see that it is a consequence of a more general
result called the Central Limit Theorem.
In all honesty, this approximation is not so important for
probability calculation as it once was.
This is because software can now calculate binomial
probabilities exactly for quite large values of n.
Example 20
Suppose that 25% of all students at a large public
university receive financial aid.
Let X be the number of students in a random sample of
size 50 who receive financial aid, so that p = .25.
Then μ = 12.5 and σ = 3.06.
Since np = 50(.25) = 12.5 ≥ 10 and n(1 – p) = 37.5 ≥ 10, the
approximation can safely be applied.
Example 20
cont’d
The probability that at most 10 students receive aid is

P(X ≤ 10) = B(10; 50, .25) ≈ Φ((10 + .5 – 12.5)/3.06) = Φ(–.65) = .2578

Similarly, the probability that between 5 and 15 (inclusive)
of the selected students receive aid is

P(5 ≤ X ≤ 15) = B(15; 50, .25) – B(4; 50, .25)
             ≈ Φ((15.5 – 12.5)/3.06) – Φ((4.5 – 12.5)/3.06) = .8320
Example 20
cont’d
The exact probabilities are .2622 and .8348, respectively,
so the approximations are quite good.
In the last calculation, the probability P(5 ≤ X ≤ 15) is being
approximated by the area under the normal curve between
4.5 and 15.5—the continuity correction is used for both the
upper and lower limits.
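The comparison in Example 20 can be reproduced with software (sketch, Python with scipy and numpy assumed): exact binomial cdf values versus the normal approximation with continuity correction.

import numpy as np
from scipy.stats import binom, norm

n, p = 50, 0.25
mu, sigma = n * p, np.sqrt(n * p * (1 - p))

print(binom.cdf(10, n, p))                             # exact P(X <= 10) = .2622
print(norm.cdf((10.5 - mu) / sigma))                   # approximation, about .26

print(binom.cdf(15, n, p) - binom.cdf(4, n, p))        # exact P(5 <= X <= 15) = .8348
print(norm.cdf((15.5 - mu) / sigma) - norm.cdf((4.5 - mu) / sigma))   # about .83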