Chapter 1: Statistics
Chapter 6 ~ Normal Probability Distributions
[Figure: normal curve with the area P(a ≤ x ≤ b) shaded between x = a and x = b]
Chapter Goals
• Learn about the normal, bell-shaped, or
Gaussian distribution
• How probabilities are found
• How probabilities are represented
• How normal distributions are used in the real
world
6.1 ~ Normal Probability Distributions
• The normal probability distribution is the most
important distribution in all of statistics
• Many continuous random variables have normal
or approximately normal distributions
• Need to learn how to describe a normal
probability distribution
Normal Probability Distribution
1. A continuous random variable
2. Description involves two functions:
a. A function to determine the ordinates of the graph
picturing the distribution
b. A function to determine probabilities
3. Normal probability distribution function:
$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}}$
This is the function for the normal (bell-shaped) curve
4. The probability that x lies in some interval is the area
under the curve
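As a rough illustration of the first function (the ordinates of the curve), the sketch below evaluates the normal density directly and compares it with Python's standard-library `statistics.NormalDist`. The helper name `normal_pdf` is an assumption made for this example and is not part of the slides.

```python
import math
from statistics import NormalDist

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Ordinate f(x) of the normal curve with mean mu and standard deviation sigma."""
    coeff = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return coeff * math.exp(-0.5 * ((x - mu) / sigma) ** 2)

# Height of the standard normal curve at its mean, computed two ways
peak = normal_pdf(0.0)
library_peak = NormalDist(0, 1).pdf(0.0)
print(round(peak, 4), round(library_peak, 4))
```

The ordinate is a height, not a probability; probabilities come from areas under the curve.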
The Normal Probability Distribution
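[Figure: bell-shaped normal curve centered at μ, with the horizontal axis marked at μ - 3σ, μ - 2σ, μ - σ, μ, μ + σ, μ + 2σ, and μ + 3σ]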
Probabilities for a Normal Distribution
• Illustration: $P(a \le x \le b) = \int_a^b f(x)\,dx$
[Figure: normal curve with the area between x = a and x = b shaded]
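To make the "probability is area" idea concrete, here is a minimal sketch that approximates the integral numerically with the trapezoidal rule and compares it with the cumulative distribution function from `statistics.NormalDist`. The helper `area_under_curve` and the bounds a = -1 and b = 2 are illustrative assumptions, not values from the slides.

```python
from statistics import NormalDist

def area_under_curve(f, a, b, steps=10_000):
    """Approximate the integral of f from a to b with the trapezoidal rule."""
    width = (b - a) / steps
    total = 0.5 * (f(a) + f(b))
    for i in range(1, steps):
        total += f(a + i * width)
    return total * width

z = NormalDist(0, 1)
a, b = -1.0, 2.0                       # illustrative bounds only

approx = area_under_curve(z.pdf, a, b)  # numerical integral of f(x)
exact = z.cdf(b) - z.cdf(a)             # same area from the CDF
print(round(approx, 4), round(exact, 4))
```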
Notes
• The definite integral is a calculus topic
• We will use the TI-83/84 to find probabilities for normal distributions
• We will learn how to compute probabilities for one special normal distribution: the standard normal distribution
• We will learn to transform all other normal probability questions to this special distribution
• Recall the empirical rule: the percentages that lie within certain intervals about the mean come from the normal probability distribution
• We need to refine the empirical rule to be able to find the percentage that lies between any two numbers
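The empirical-rule percentages can be recovered from the normal distribution itself. The sketch below, which assumes Python's `statistics.NormalDist` in place of the calculator, computes the share of the distribution within 1, 2, and 3 standard deviations of the mean.

```python
from statistics import NormalDist

z = NormalDist(0, 1)

# Share of the distribution within k standard deviations of the mean
within = {k: z.cdf(k) - z.cdf(-k) for k in (1, 2, 3)}

for k, share in within.items():
    print(f"within {k} standard deviation(s) of the mean: {share:.4f}")
```

The same subtraction of two cumulative areas works for any pair of numbers, which is the refinement the notes call for.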
Percentage, Proportion & Probability
• Basically the same concepts
• Percentage (30%) is usually used when talking
about a proportion (3/10) of a population
• Probability is usually used when talking about
the chance that the next individual item will
possess a certain property
• Area is the graphic representation of all three
when we draw a picture to illustrate the situation
6.2 ~ The Standard Normal Distribution
• There are infinitely many normal probability
distributions
• They are all related to the standard normal
distribution
• The standard normal distribution is the
normal distribution of the standard variable z
(the z-score)
Standard Normal Distribution
Properties:
• The total area under the normal curve is equal to 1
• The distribution is mounded and symmetric; it extends indefinitely in
both directions, approaching but never touching the horizontal axis
• The distribution has a mean of 0 and a standard deviation of 1
• The mean divides the area in half, 0.50 on each side
• Nearly all the area is between z = -3.00 and z = 3.00
Notes:
Table 3, Appendix B lists the probabilities associated with the intervals
from the mean (0) to a specific value of z
Probabilities of other intervals are found using the table
entries, addition, subtraction, and the properties above
Table 3, Appendix B Entries
• The table contains the area under the standard normal curve between 0 and a specific value of z
[Figure: standard normal curve with the area between 0 and z shaded]
Example
Example: Find the area under the standard normal curve between z = 0 and z = 1.45
[Figure: standard normal curve with the area between 0 and 1.45 shaded]
• A portion of Table 3:
   z   | 0.00   0.01   0.02   0.03   0.04   0.05   0.06
  ...  |
  1.4  |                                    0.4265
  ...  |
$P(0 < z < 1.45) = 0.4265$
Using the TI 83/84
• To find the area between 0 and 1.45, do the following:
  – Press 2nd DISTR 2, which is normalcdf(
  – Enter the lower bound of 0
  – Enter a comma
  – Then enter 1.45
  – Close the parentheses if you like, then press ENTER
• The value of .426 is shown as the answer!
• Interpretation of the result: The probability that Z lies between 0 and 1.45 is 0.426
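For readers without the calculator, the same area can be sketched in Python. The function below is a hypothetical stand-in named after the calculator's normalcdf( and built on `statistics.NormalDist`; it is not the TI implementation.

```python
from statistics import NormalDist

def normalcdf(lower, upper, mean=0.0, sd=1.0):
    """Area under the normal curve between lower and upper.

    Hypothetical helper that mirrors the calculator's argument order.
    """
    dist = NormalDist(mean, sd)
    return dist.cdf(upper) - dist.cdf(lower)

area = normalcdf(0, 1.45)   # area between z = 0 and z = 1.45
print(round(area, 4))
```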
Example
Example: Find the area under the normal curve to the right
of z = 1.45; P(z > 1.45)
[Figure: standard normal curve; the area 0.4265 lies between 0 and 1.45, and the area asked for is the right tail beyond z = 1.45]
$P(z > 1.45) = 0.5000 - 0.4265 = 0.0735$
Using the TI 83/84
• To find the area between 1.45 and ∞, do the following:
  – Press 2nd DISTR 2, which is normalcdf(
  – Enter the lower bound of 1.45
  – Enter a comma
  – Then enter 1 2nd EE 99
  – Close the parentheses if you like, then press ENTER
• The value of .074 is shown as the answer!
• Interpretation of the result: The probability that Z is greater than 1.45 is 0.074
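A right-tail area can be sketched in Python in two equivalent ways: as the complement of the cumulative area, or with a very large upper bound in the spirit of the calculator's 1E99. This assumes `statistics.NormalDist`; the variable names are illustrative.

```python
from statistics import NormalDist

z = NormalDist(0, 1)

# Complement rule: everything to the right of 1.45
right_tail = 1.0 - z.cdf(1.45)

# Calculator-style: a huge number stands in for infinity
right_tail_big_bound = z.cdf(1e99) - z.cdf(1.45)

# Table-style: half the curve minus the area between 0 and 1.45
right_tail_table = 0.5 - (z.cdf(1.45) - z.cdf(0.0))

print(round(right_tail, 4))
```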
Example
Example: Find the area to the left of z = 1.45; P(z < 1.45)
[Figure: standard normal curve; the area 0.5000 lies to the left of 0 and the area 0.4265 lies between 0 and 1.45]
$P(z < 1.45) = 0.5000 + 0.4265 = 0.9265$
Using The TI 83/84
• To find the area between -∞ and 1.45, do the following:
  – Press 2nd DISTR 2, which is normalcdf(
  – Enter the lower bound of -1 2nd EE 99
  – Enter a comma
  – Then enter 1.45
  – Close the parentheses if you like, then press ENTER
• The value of 0.926 is shown as the answer!
• Interpretation of the result: The probability that Z is less than 1.45 is 0.926
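In Python the left-tail area is simply the cumulative distribution function, so no stand-in for -∞ is needed. A small sketch, assuming `statistics.NormalDist`:

```python
from statistics import NormalDist

z = NormalDist(0, 1)

left_area = z.cdf(1.45)                        # area from -infinity up to 1.45
table_route = 0.5 + (z.cdf(1.45) - z.cdf(0))   # 0.5000 plus the table entry

print(round(left_area, 4))
```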
Notes
• The addition and subtraction used in the previous examples are correct because the “areas” represent mutually exclusive events
• The symmetry of the normal distribution is a key factor in determining probabilities associated with values below (to the left of) the mean. For example, the area between the mean and z = -1.37 is exactly the same as the area between the mean and z = +1.37
• When finding normal distribution probabilities, a sketch is always helpful
Example
Example: Find the area between the mean (z = 0) and
z = -1.26
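[Figure: standard normal curve with the area asked for shaded between z = -1.26 and 0; by symmetry it equals the area between 0 and z = 1.26]
$P(-1.26 < z < 0) = 0.3962$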
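The symmetry argument can be checked numerically. The sketch below, assuming `statistics.NormalDist`, computes the area on each side of the mean and confirms they agree.

```python
from statistics import NormalDist

z = NormalDist(0, 1)

left_side = z.cdf(0.0) - z.cdf(-1.26)    # area between z = -1.26 and the mean
right_side = z.cdf(1.26) - z.cdf(0.0)    # mirror-image area on the right

print(round(left_side, 4), round(right_side, 4))
```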
Using the TI 83/84
• Find the area to the left of z = -0.98
• Use -1E99 for -∞ and enter 2nd DISTR
• normalcdf(-1E99, -0.98) gives .164
[Figure: standard normal curve with the area to the left of z = -0.98 shaded]
Example
Example: Find the area between z = -2.30 and z = 1.80
[Figure: standard normal curve with area 0.4893 between z = -2.30 and 0 and area 0.4641 between 0 and z = 1.80]
$P(-2.30 < z < 1.80) = P(-2.30 < z < 0) + P(0 < z < 1.80)$
$= 0.4893 + 0.4641 = 0.9534$
Using the TI 83/84
• Find the area between z = -2.30 and z = 1.80
• Enter 2nd DISTR, normalcdf(-2.3, 1.80) and press ENTER
• .953 is given as the answer
• Remember, the function normalcdf( takes the form normalcdf(lower limit, upper limit, mean, standard deviation). If you are working with a distribution other than the standard normal (recall mean = 0, standard deviation = 1), you must enter the values for the mean and standard deviation
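The four-argument form can be sketched in Python as well. The helper below is a hypothetical analogue of normalcdf( built on `statistics.NormalDist`. The second call uses an assumed distribution with mean 100 and standard deviation 15, chosen only for illustration, with bounds placed 2.30 standard deviations below and 1.80 above the mean so that the area should match the standard normal result.

```python
from statistics import NormalDist

def normalcdf(lower, upper, mean=0.0, sd=1.0):
    """Hypothetical analogue of the calculator's normalcdf(lower, upper, mean, sd)."""
    dist = NormalDist(mean, sd)
    return dist.cdf(upper) - dist.cdf(lower)

# Standard normal: mean and standard deviation default to 0 and 1
standard_area = normalcdf(-2.30, 1.80)

# Assumed non-standard distribution: mean 100, standard deviation 15
mean, sd = 100.0, 15.0
other_area = normalcdf(mean - 2.30 * sd, mean + 1.80 * sd, mean, sd)

print(round(standard_area, 4), round(other_area, 4))
```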
Normal Distribution Note
The normal distribution table may also be used to determine
a z-score if we are given the area (working backwards)
Example: What is the z-score associated with the 85th
percentile?
Using the TI 83/84
• There is another function in the DISTR list that is
used to find the value of z (or x) when the
probability is given. For the previous problem,
we are actually asking what is the value of z such
that 85% of the distribution lies below it.
Using the TI 83/84
• Use 2nd DISTR invNorm( to calculate this value
• Entering 2nd DISTR invNorm(.85) and pressing ENTER gives a value of 1.036
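Working backwards from an area to a z-score corresponds to the inverse cumulative distribution function. A minimal sketch, assuming the `inv_cdf` method of `statistics.NormalDist` as a stand-in for invNorm(:

```python
from statistics import NormalDist

z = NormalDist(0, 1)

# z-score with 85% of the distribution below it (the 85th percentile)
z_85 = z.inv_cdf(0.85)
print(round(z_85, 3))

# Round trip: the area to the left of z_85 should be 0.85 again
area_back = z.cdf(z_85)
```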
Example
Example: What z-scores bound the middle 90% of a
standard normal distribution?
Using the TI 83/84
• invNorm( works with the area from -∞ up to the value of z we are interested in (the area to the left). Therefore, we must get a little creative to solve some problems.
• The idea that the total area equals one comes in very handy here!
• For the example given, we want the values of z that bound the middle 90%, so the two tails together hold the remaining 10%. Because the distribution is symmetric, divide this in two, which gives 5% in each tail.
Using the TI 83/84
• Now use 2nd DISTR invNorm( with .05 as the argument: invNorm(.05)
• This gives an answer of -1.645
  – Since the distribution is symmetric, the upper limit is 1.645, so 90% of the distribution lies between -1.645 and 1.645
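The tail-splitting reasoning can be written out as a short sketch, again assuming `statistics.NormalDist.inv_cdf` in place of invNorm(. The variable names are illustrative.

```python
from statistics import NormalDist

z = NormalDist(0, 1)

middle = 0.90
tail = (1.0 - middle) / 2.0        # 5% in each tail

lower = z.inv_cdf(tail)            # area to the left is 0.05
upper = z.inv_cdf(1.0 - tail)      # area to the left is 0.95

covered = z.cdf(upper) - z.cdf(lower)
print(round(lower, 3), round(upper, 3), round(covered, 2))
```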
Using the TI 83/84
Now let’s work the problems on page 279
6.3 ~ Applications of Normal Distributions
• Apply the techniques learned for the z distribution
to all normal distributions
• Start with a probability question in terms of
x-values
• Convert, or transform, the question into an
equivalent probability statement involving
z-values
Standardization
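• Suppose x is a normal random variable with mean μ and standard deviation σ
• The random variable $z = \frac{x - \mu}{\sigma}$ has a standard normal distribution
[Figure: normal curve with an x-axis and a z-axis; the mean μ on the x-axis corresponds to 0 on the z-axis, and a value c corresponds to $\frac{c - \mu}{\sigma}$]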
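The sketch below illustrates why standardization works: a probability computed on the x scale matches the probability computed on the z scale after the transformation. The mean, standard deviation, and cutoff are hypothetical values chosen for this example, and `statistics.NormalDist` is assumed.

```python
from statistics import NormalDist

mu, sigma = 50.0, 10.0      # hypothetical mean and standard deviation
c = 62.0                    # hypothetical cutoff on the x scale

z_c = (c - mu) / sigma      # the same cutoff expressed as a z-score

p_on_x_scale = NormalDist(mu, sigma).cdf(c)   # P(x < c)
p_on_z_scale = NormalDist(0, 1).cdf(z_c)      # P(z < (c - mu) / sigma)

print(z_c, round(p_on_x_scale, 4), round(p_on_z_scale, 4))
```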
Example
Example: A bottling machine is adjusted to fill bottles with a
mean of 32.0 oz of soda and standard deviation of
0.02. Assume the amount of fill is normally distributed
and a bottle is selected at random:
1) Find the probability the bottle contains between 32.00 oz and
32.025 oz
2) Find the probability the bottle contains more than 31.97 oz
Solutions:
1) When x = 32.00: $z = \frac{x - \mu}{\sigma} = \frac{32.00 - 32.0}{0.02} = 0.00$
When x = 32.025: $z = \frac{x - \mu}{\sigma} = \frac{32.025 - 32.0}{0.02} = 1.25$
Solution Continued
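[Figure: normal curve with the area asked for shaded between x = 32.0 and x = 32.025, corresponding to z = 0 and z = 1.25]
$P(32.0 < x < 32.025) = P\left(\frac{32.0 - 32.0}{0.02} < \frac{x - 32.0}{0.02} < \frac{32.025 - 32.0}{0.02}\right)$
$= P(0 < z < 1.25) = 0.3944$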
Example, Part 2
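2) [Figure: normal curve with the area to the right of x = 31.97 shaded; x = 31.97 corresponds to z = -1.50 and x = 32.0 to z = 0]
$P(x > 31.97) = P\left(\frac{x - 32.0}{0.02} > \frac{31.97 - 32.0}{0.02}\right) = P(z > -1.50)$
$= 0.5000 + 0.4332 = 0.9332$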
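Both parts of the bottling example can be checked without converting to z by hand. The sketch assumes `statistics.NormalDist` with the mean and standard deviation given in the example; the results should agree with the table-based answers to about four decimal places.

```python
from statistics import NormalDist

fill = NormalDist(mu=32.0, sigma=0.02)   # ounces, values from the example

# 1) P(32.00 < x < 32.025)
p_between = fill.cdf(32.025) - fill.cdf(32.00)

# 2) P(x > 31.97)
p_more = 1.0 - fill.cdf(31.97)

print(round(p_between, 4), round(p_more, 4))
```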
Notes
• The normal table may be used to answer many kinds of questions
involving a normal distribution
• Often we need to find a cutoff point: a value of x such that there is
a certain probability in a specified interval defined by x
Example: The waiting time x at a certain bank is approximately
normally distributed with a mean of 3.7 minutes and a
standard deviation of 1.4 minutes. The bank would
like to claim that 95% of all customers are waited on
by a teller within c minutes. Find the value of c that
makes this statement true.
Solution
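[Figure: normal curve with area 0.5000 to the left of the mean, area 0.4500 between the mean and c, and area 0.0500 in the right tail; the x-axis is marked at 3.7 and c, the z-axis at 0 and 1.645]
$P(x \le c) = 0.95$
$P\left(\frac{x - 3.7}{1.4} \le \frac{c - 3.7}{1.4}\right) = 0.95$
$P\left(z \le \frac{c - 3.7}{1.4}\right) = 0.95$
$\frac{c - 3.7}{1.4} = 1.645$
$c = (1.645)(1.4) + 3.7 = 6.003$
$c \approx 6$ minutes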
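The cutoff can be found in one step with an inverse CDF, or in two steps that mirror the hand solution. This is a sketch assuming `statistics.NormalDist`; the variable names are illustrative.

```python
from statistics import NormalDist

wait = NormalDist(mu=3.7, sigma=1.4)    # minutes, values from the example

# One step: the waiting time with 95% of customers at or below it
c_direct = wait.inv_cdf(0.95)

# Two steps, as in the hand solution: find z, then undo the standardization
z_cut = NormalDist(0, 1).inv_cdf(0.95)
c_from_z = 3.7 + z_cut * 1.4

print(round(c_direct, 3), round(c_from_z, 3))
```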
Example
Example: A radar unit is used to measure the speed of
automobiles on an expressway during rush-hour traffic. The
speeds of individual automobiles are normally distributed with a
mean of 62 mph. Find the standard deviation of all speeds if 3% of
the automobiles travel faster than 72 mph.
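[Figure: normal curve with area 0.4700 between the mean and 72 and area 0.0300 in the right tail; the x-axis is marked at 62 and 72, the z-axis at 0 and 1.88]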
Solution
$P(x > 72) = 0.03$
$P(z > 1.88) = 0.03$
$z = \frac{x - \mu}{\sigma}$; $1.88 = \frac{72 - 62}{\sigma}$
$1.88\sigma = 10$
$\sigma = 10 / 1.88 = 5.32$
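Solving for the standard deviation follows the same steps in code: find the z-score with 3% of the area to its right, then rearrange the standardization formula. A sketch assuming `statistics.NormalDist`, with a check that the resulting distribution really puts 3% of speeds above 72 mph:

```python
from statistics import NormalDist

mean_speed = 62.0       # mph, from the example
cutoff = 72.0           # mph
tail_area = 0.03        # share of cars faster than the cutoff

# z-score with 3% of the area to its right (97% to its left)
z_cut = NormalDist(0, 1).inv_cdf(1.0 - tail_area)

# z = (x - mu) / sigma  rearranged to  sigma = (x - mu) / z
sigma = (cutoff - mean_speed) / z_cut

# Check: with this sigma, the right-tail area beyond 72 mph should be 0.03
check = 1.0 - NormalDist(mean_speed, sigma).cdf(cutoff)

print(round(z_cut, 2), round(sigma, 2))
```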
Notation
• If x is a normal random variable with mean μ and standard deviation σ, this is often denoted: x ~ N(μ, σ)
Example: Suppose x is a normal random variable with μ = 35 and σ = 6. A convenient notation to identify this random variable is: x ~ N(35, 6).
6.4 ~ Notation
• The z-score is used throughout statistics in a variety of ways
• Need convenient notation to indicate the area under the standard normal distribution
• z(α) is the algebraic name for the z-score (point on the z-axis) such that there is α of the area (probability) to the right of z(α)
Illustrations
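z(0.10) represents the value of z such that the area to the right under the standard normal curve is 0.10
[Figure: standard normal curve with area 0.10 shaded to the right of z(0.10), which lies to the right of 0]
z(0.80) represents the value of z such that the area to the right under the standard normal curve is 0.80
[Figure: standard normal curve with area 0.80 shaded to the right of z(0.80), which lies to the left of 0]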
Example
Example: Find the numerical value of z(0.10):
[Figure: standard normal curve; the notation gives the area 0.10 to the right of z(0.10), and the table shows the area 0.4000 between 0 and z(0.10)]
z(0.10) = 1.28
Example
Example: Find the numerical value of z(0.80):
[Figure: standard normal curve with z(0.80) to the left of 0; the area between z(0.80) and 0 is 0.3000, and z must be negative]
• Use Table 3: look for an area as close as possible to 0.3000
• z(0.80) = -0.84
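Since z(α) is defined by the area to its right, it corresponds to the inverse CDF evaluated at 1 - α. The helper name `z_alpha` is an assumption made for this sketch, which relies on `statistics.NormalDist`.

```python
from statistics import NormalDist

def z_alpha(alpha):
    """z-score with area alpha to its RIGHT under the standard normal curve."""
    return NormalDist(0, 1).inv_cdf(1.0 - alpha)

z_010 = z_alpha(0.10)   # positive: only 10% of the area lies to the right
z_080 = z_alpha(0.80)   # negative: 80% of the area lies to the right

print(round(z_010, 2), round(z_080, 2))
```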
Notes
• The values of z that will be used regularly come from
one of the following situations:
1. The z-score such that there is a specified area in one
tail of the normal distribution
2. The z-scores that bound a specified middle
proportion of the normal distribution
Example
Example: Find the numerical value of z(0.99):
[Figure: standard normal curve with area 0.01 in the left tail, to the left of z(0.99)]
• Because of the symmetrical nature of the normal distribution, z(0.99) = -z(0.01)
Example
Example: Find the z-scores that bound the middle 0.99 of the
normal distribution:
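[Figure: standard normal curve with area 0.005 in each tail and area 0.495 on each side of 0; the left bound is z(0.995), or -z(0.005), and the right bound is z(0.005)]
z(0.005) = 2.575 and z(0.995) = -z(0.005) = -2.575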
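The symmetry relation and the bounds of the middle 0.99 can be checked with the same inverse-CDF idea. The helper `z_alpha` is hypothetical and uses `statistics.NormalDist`. Software gives roughly 2.576 for z(0.005), while the slide's 2.575 comes from reading between two table entries, so small differences in the third decimal are expected.

```python
from statistics import NormalDist

def z_alpha(alpha):
    """z-score with area alpha to its RIGHT under the standard normal curve."""
    return NormalDist(0, 1).inv_cdf(1.0 - alpha)

# Symmetry: z(0.99) = -z(0.01)
z_099 = z_alpha(0.99)
z_001 = z_alpha(0.01)

# Bounds of the middle 0.99: 0.005 in each tail
upper = z_alpha(0.005)
lower = z_alpha(0.995)

print(round(z_099, 2), round(lower, 3), round(upper, 3))
```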
6.5 ~ Normal Approximation of the Binomial
• Recall: the binomial distribution is a probability
distribution of the discrete random variable x, the
number of successes observed in n repeated
independent trials
• Binomial probabilities can be reasonably
estimated by using the normal probability
distribution
Background & Histogram
• Background: Consider the distribution of the binomial
variable x when n = 20 and p = 0.5
• Histogram:
[Figure: histogram of P(x) for x = 0, 1, 2, ..., 20; the vertical axis P(x) runs from 0.00 to 0.18]
The histogram may be approximated by a normal curve
Notes
• The normal curve has mean and standard deviation from the binomial distribution:
$\mu = np = (20)(0.5) = 10$
$\sigma = \sqrt{npq} = \sqrt{(20)(0.5)(0.5)} = \sqrt{5} \approx 2.236$
• Can approximate the area of the rectangles with the area under the normal curve
• The approximation becomes more accurate as n becomes larger
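To see how closely a rectangle's area matches the area under the curve, the sketch below compares the exact binomial probability of x = 10 with the normal area over the rectangle's base, from 9.5 to 10.5. It assumes `math.comb` for the exact value and `statistics.NormalDist` for the curve; the variable names are illustrative.

```python
import math
from statistics import NormalDist

n, p = 20, 0.5
q = 1.0 - p

mu = n * p                      # mean of the binomial distribution
sigma = math.sqrt(n * p * q)    # standard deviation of the binomial distribution

# Exact height (and area, since the width is 1) of the rectangle at x = 10
exact = math.comb(n, 10) * p**10 * q**(n - 10)

# Area under the matching normal curve over the rectangle's base
curve = NormalDist(mu, sigma)
approx = curve.cdf(10.5) - curve.cdf(9.5)

print(mu, round(sigma, 3), round(exact, 4), round(approx, 4))
```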
Two Problems
1. As p moves away from 0.5, the binomial distribution is less
symmetric, less normal-looking
Solution: The normal distribution provides a reasonable
approximation to a binomial probability distribution whenever the
values of np and n(1 - p) both equal or exceed 5
2. The binomial distribution is discrete, and the normal distribution
is continuous
Solution: Use the continuity correction factor. Add or subtract
0.5 to account for the width of each rectangle.
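Both remedies can be illustrated with a small experiment. The sketch checks the np and n(1 - p) rule and then compares the exact binomial probability P(x ≤ 8) for n = 20 and p = 0.5 with the normal approximation, with and without the 0.5 continuity correction. The choice of the event x ≤ 8 is an assumption made for illustration; `math.comb` and `statistics.NormalDist` are used.

```python
import math
from statistics import NormalDist

n, p = 20, 0.5
q = 1.0 - p

# Rule of thumb from the slide: both np and n(1 - p) should be at least 5
rule_ok = n * p >= 5 and n * q >= 5

# Exact binomial probability of 8 or fewer successes
exact = sum(math.comb(n, k) * p**k * q**(n - k) for k in range(0, 9))

curve = NormalDist(n * p, math.sqrt(n * p * q))

# The rectangle for x = 8 extends to 8.5, so the corrected bound is 8.5
corrected = curve.cdf(8.5)
uncorrected = curve.cdf(8.0)

print(rule_ok, round(exact, 4), round(corrected, 4), round(uncorrected, 4))
```

In this case the corrected value lands much closer to the exact probability than the uncorrected one.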
Example
Example: Research indicates 40% of all students entering a
certain university withdraw from a course during
their first year. What is the probability that fewer
than 650 of this year’s entering class of 1800 will
withdraw from a class?
• Let x be the number of students that withdraw from a course
during their first year
• x has a binomial distribution: n = 1800, p = 0.4
• The probability function is given by:
$P(x) = \binom{1800}{x}(0.4)^{x}(0.6)^{1800-x}$ for x = 0, 1, 2, ..., 1800
Solution
• Use the normal approximation method:
$\mu = np = (1800)(0.4) = 720$
$\sigma = \sqrt{npq} = \sqrt{(1800)(0.4)(0.6)} = \sqrt{432} \approx 20.78$
$P(x \text{ is fewer than } 650) = P(x < 650)$ (for discrete variable x)
$= P(x < 649.5)$ (for a continuous variable x)
$= P\left(\frac{x - 720}{20.78} < \frac{649.5 - 720}{20.78}\right)$
$= P(z < -3.39)$
$= 0.5000 - 0.4997 = 0.0003$
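The steps of this solution can be sketched in Python as follows, assuming `statistics.NormalDist` for the normal curve. Because software does not round z to two decimals, the final probability may differ slightly from the table-based 0.0003.

```python
import math
from statistics import NormalDist

n, p = 1800, 0.4
q = 1.0 - p

mu = n * p                       # expected number of withdrawals
sigma = math.sqrt(n * p * q)     # standard deviation of the count

# "Fewer than 650" means x <= 649, so the corrected boundary is 649.5
boundary = 649.5
z_score = (boundary - mu) / sigma

prob = NormalDist(0, 1).cdf(z_score)   # P(z < z_score)

print(round(mu, 1), round(sigma, 2), round(z_score, 2), round(prob, 4))
```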
Random Number Generation
• With each rand execution, the TI-84 Plus generates the same random-number sequence for a given seed value. The TI-84 Plus factory-set seed value for rand is 0.
• To generate a different random-number sequence, store any nonzero seed value to rand. To restore the factory-set seed value, store 0 to rand or reset the defaults (Chapter 18).
• Note: The seed value also affects randInt(, randNorm(, and randBin( instructions.