Module 12 Lesson 3 Notes

The Normal Distribution
NOTES
In earlier courses, you have explored data in the following
ways:
 By plotting data (histogram, stemplot, bar graph,
etc.)
 By looking for patterns (shape, center, spread,
outliers, etc.)
 By calculating (mean, median, mode, spread, etc.)
Sometimes, the overall pattern of a large
number of observations is so regular that
we can describe it using a curve.
LET’S START BY LOOKING AT A DENSITY
CURVE.
Suppose you have created the histogram on the left to represent a set of data.
Because it appears symmetric and its ends behave similarly, we could approximate
it with a curve, or model, as shown on the right.
It will be easier for us to work with
the curve than with the histogram
itself.
THIS IS BECAUSE WE WILL NOT HAVE TO
WORRY ABOUT THE CATEGORIES FROM OUR
HISTOGRAM WHEN TRYING TO DESCRIBE
THE DATA.
THE PROPERTIES OF THE CURVE WILL
ALLOW US TO DESCRIBE OUR DATA MORE
QUICKLY AND ACCURATELY THAN BEFORE.
LET’S SEE HOW THIS PROCESS WORKS.
Our curve is an example of a NORMAL CURVE.
 A normal curve is used to describe a normal distribution.
 A normal curve is symmetric.
 A normal curve has a single peak.
 A normal curve is bell-shaped.
 All normal curves have the same overall shape; they differ only in their mean and standard deviation.
 The total area underneath the curve is exactly 1, which represents 100% of the observations.
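The symmetry and area properties can be checked numerically. The sketch below is only an illustration and is not part of the original notes: it assumes the standard formula for the height of a normal curve, and the values mu = 10 and sigma = 2 are arbitrary choices made for the demonstration.

```python
import math

def normal_pdf(x, mu, sigma):
    """Height of the normal curve with mean mu and standard deviation sigma at x."""
    coeff = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return coeff * math.exp(-0.5 * ((x - mu) / sigma) ** 2)

def area_under_curve(mu, sigma, half_width=8.0, steps=10_000):
    """Midpoint-rule estimate of the area between mu - 8*sigma and mu + 8*sigma."""
    lo = mu - half_width * sigma
    dx = (2 * half_width * sigma) / steps
    return sum(normal_pdf(lo + (i + 0.5) * dx, mu, sigma) for i in range(steps)) * dx

mu, sigma = 10.0, 2.0  # arbitrary illustrative values (assumption)

area = area_under_curve(mu, sigma)
left = normal_pdf(mu - 3.0, mu, sigma)   # height 3 units below the mean
right = normal_pdf(mu + 3.0, mu, sigma)  # height 3 units above the mean

print(abs(area - 1.0) < 1e-6)     # the total area is essentially 1
print(abs(left - right) < 1e-12)  # the curve is symmetric about the mean
```

Changing mu or sigma moves or stretches the curve, but the area stays at 1.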
More About the Normal Curve
 The mean, µ, is located at
the center of the curve.
 The mean is the same as
the median.
 The standard deviation,
σ, is the measure of
spread for normal
distributions. The points
at which the curvature
changes are located at a
distance of σ from the
mean.
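To see the change in curvature at a distance of σ from the mean, we can estimate how the curve bends on either side of that point. This is a rough sketch, assuming Python's standard-library `statistics.NormalDist`; the helper `curvature` and the values mu = 10 and sigma = 2 are hypothetical and chosen only for illustration.

```python
from statistics import NormalDist

def curvature(dist, x, h=1e-3):
    """Approximate second derivative of the curve at x (negative means it bends downward)."""
    return (dist.pdf(x + h) - 2 * dist.pdf(x) + dist.pdf(x - h)) / h**2

curve = NormalDist(mu=10, sigma=2)  # arbitrary illustrative values (assumption)

inside = curvature(curve, curve.mean + 0.5 * curve.stdev)   # closer to the mean than sigma
outside = curvature(curve, curve.mean + 1.5 * curve.stdev)  # farther from the mean than sigma

print(inside < 0)   # True: the curve bends downward near the mean
print(outside > 0)  # True: the curve bends upward farther out
```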
[Figure: a normal curve with the mean, µ, marked at the center and distances of one standard deviation, σ, and two standard deviations, 2σ, marked from the mean.]
Normal Distributions are Important Because…
 They are good descriptions for some distributions of
data…like SAT scores, characteristics of populations,
and even some scores on psychological tests.
 They are good approximations of the results of many kinds of chance outcomes, like the number of heads in many coin flips or the total of many rolls of a die.
 Many of the inferences we can make using a normal
distribution can be applied to other situations in
which data is almost symmetric.
Keep in mind…
 Not all sets of data are modeled by a normal
distribution.
 Some sets of data are skewed towards the right (like
income distributions).
 Some sets of data are skewed towards the left (like
the average number of letters in words we say each
day).
Now, let’s begin to learn how to
work with a normal distribution.
The 68-95-99.7 Rule
 In the normal distribution with mean µ and standard deviation σ:
 About 68% of the observations fall within σ of the mean µ.
 About 95% of the observations fall within 2σ of the mean µ.
 About 99.7% of the observations fall within 3σ of the mean µ.
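The three percentages in the rule are rounded values. As a quick check, the sketch below computes them from the standard normal curve, assuming Python's standard-library `statistics.NormalDist`; the helper name `proportion_within` is ours, not something from the course.

```python
from statistics import NormalDist

standard = NormalDist()  # mean 0, standard deviation 1

def proportion_within(k):
    """Proportion of observations within k standard deviations of the mean."""
    return standard.cdf(k) - standard.cdf(-k)

for k in (1, 2, 3):
    print(k, round(proportion_within(k) * 100, 1))
# 1 68.3
# 2 95.4
# 3 99.7
```

Because every normal curve has the same overall shape, the same proportions hold for any mean and standard deviation.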
The distribution of heights of adult males is approximately normal with
mean 69 inches and standard deviation 2.5 inches.
 Between what heights do the middle 95% of men fall?
 95% will fall within 2 standard deviations of the mean.
 2 standard deviations above the mean is 69 + 2(2.5) = 74 inches.
 2 standard deviations below the mean is 69 - 2(2.5) = 64 inches.
 This means that 95% of men have heights that fall between 64 and 74 inches.
 What percent of men are taller than 74 inches?
 74 inches is 2 standard deviations above the mean.
 If we take 100% of men and subtract away the 95% that fall within 2 standard deviations, we are left with 5% of men.
 Because the curve is symmetric, half of this 5% will be above the 2 standard deviation mark, so 2.5% of men are taller than 74 inches.
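The arithmetic in these two height questions can be written out as a short sketch. It uses only the 68-95-99.7 Rule and the mean and standard deviation given in the example; the variable names are ours.

```python
mean_height = 69.0  # inches, from the example
sd_height = 2.5     # inches, from the example

# Middle 95%: within 2 standard deviations of the mean
low = mean_height - 2 * sd_height
high = mean_height + 2 * sd_height

# What is left outside the middle 95% is split evenly between the two tails
percent_outside = 100 - 95
percent_taller = percent_outside / 2

print(low, high)       # 64.0 74.0
print(percent_taller)  # 2.5
```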
Standard Normal Distribution
 Not all predictions that we need to make will be exactly one or two standard deviations away from the mean. Because of this, we need to standardize our values.
 The standard normal distribution is the normal distribution N(0, 1) with mean 0 and standard deviation 1.
 To standardize a variable x with a normal distribution N(µ, σ), we will use z = (x - µ) / σ.
 This is often referred to as a z-score.
 At this point, if you haven’t already done so, print the standard normal table linked in the introduction to these notes on the course page.
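Standardizing is a one-line calculation: subtract the mean, then divide by the standard deviation. Here is a minimal sketch using the height values from the earlier example; the function name `z_score` is just an illustrative choice.

```python
def z_score(x, mu, sigma):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mu) / sigma

print(z_score(74, 69, 2.5))  # 2.0
print(z_score(64, 69, 2.5))  # -2.0
print(z_score(69, 69, 2.5))  # 0.0
```

A positive z-score means the value is above the mean, a negative one means it is below, and 0 means it is exactly at the mean.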
Let’s practice!
 Let’s find the proportion of adult men who are less than 70 inches tall.
 Remember from before that the mean is 69 inches and the standard deviation is 2.5 inches.
 Because 70 isn’t a whole number of standard deviations from the mean, let’s standardize it!
 The formula: z = (70 - 69) / 2.5 = 0.4
 So, this is 0.4 standard deviations more than the mean height.
 To actually determine the proportion this z-score represents, we can use the area under the standard normal curve.
 Look at your standard normal table and find z = 0.4.
 The corresponding standard normal probability is .6554.
 This means that 65.54% of adult men are less than 70 inches tall.
 This means that 100% - 65.54% = 34.46% of adult men are more than 70 inches tall.
 Keep in mind that the numbers in the standard normal table will always represent the area under the curve to the LEFT of z.
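If you want to check a table lookup, software can compute the same left-hand area. The sketch below assumes Python's standard-library `statistics.NormalDist`, whose `cdf` method returns the area to the LEFT of a value, just like the table; the helper `proportion_below` is a name we made up for this example.

```python
from statistics import NormalDist

def proportion_below(x, mu, sigma):
    """Standardize x, then return the area under the standard normal curve to the left of z."""
    z = (x - mu) / sigma
    return NormalDist().cdf(z)

below_70 = proportion_below(70, mu=69, sigma=2.5)

print(round(below_70, 4))      # 0.6554 (area to the left)
print(round(1 - below_70, 4))  # 0.3446 (area to the right)
```

Subtracting from 1 gives the area to the RIGHT of z, because the total area under the curve is 1.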
You try this one. Then, go to the next slide to check your
answer.
 Find the proportion of adult men who are at least 79 inches (or 6 feet, 7 inches) tall.
How well did you do?
 Find the z-score: z = (79 - 69) / 2.5 = 4.0
 Go to the table and find the probability associated with the z-score of 4.0: .99997
 This means that 1 - .99997 = .00003, or .003%, of adult men are at least 79 inches tall.
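Software does not actually need the standardizing step, which makes it a handy way to confirm an answer. This sketch, again assuming Python's `statistics.NormalDist`, finds the proportion at least 79 inches tall directly from N(69, 2.5) and then again through the z-score; both routes should agree.

```python
from statistics import NormalDist

heights = NormalDist(mu=69, sigma=2.5)  # adult male heights, from the example

# Route 1: work with the height distribution directly
at_least_79 = 1 - heights.cdf(79)

# Route 2: standardize first, then use the standard normal curve
z = (79 - 69) / 2.5
via_z = 1 - NormalDist().cdf(z)

print(z)                       # 4.0
print(f"{at_least_79:.5f}")    # 0.00003
print(abs(at_least_79 - via_z) < 1e-9)  # True: the two routes match
```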
Margin of Error
 The margin of error shows how accurate we believe our estimate is, based on the variability of our estimates.
 We can discuss margin of error using the properties
of the standard normal curve.
Example
 The sampling distribution of a set of test scores is
approximately normal with a mean of 280 and a
standard deviation of 1.9.
 According to the 68-95-99.7 Rule, about 95% of all the
values would fall within 2 standard deviations, or within
3.8, of the mean of this curve.
 This means that the margin of error for this distribution is ±3.8, meaning that we estimate the actual mean score for everyone taking the test to be within 3.8 points of 280 (or between 276.2 and 283.8).
 We can say we are about 95% confident in this range of values because it covers about 95% of all values in the sampling distribution.
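The margin of error in this example is just two standard deviations of the sampling distribution. Here is a small sketch of that calculation using the numbers from the example; the variable names are ours, and the factor of 2 comes from the 68-95-99.7 Rule.

```python
center = 280.0  # mean of the sampling distribution, from the example
sd = 1.9        # standard deviation of the sampling distribution, from the example

margin_of_error = 2 * sd  # about 95% of values fall within 2 standard deviations
lower = center - margin_of_error
upper = center + margin_of_error

print(round(margin_of_error, 1), round(lower, 1), round(upper, 1))  # 3.8 276.2 283.8
```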