The Standard Deviation as a Ruler and the Normal


Chapter 6: The Standard Deviation as a Ruler and the Normal Model

The Standard Deviation as a Ruler



Use standard deviation when comparing unlike measures.
Standard deviation is the most common measure of spread.
Remember: standard deviation is the square root of the variance.
Standardizing


We standardize to eliminate units.
A standardized value can be found by subtracting the mean from the value and
dividing by the standard deviation: z = (value - mean) / standard deviation.
  It has no units.
A z-score measures the distance of each data value from the mean in standard
deviations.
  Negative z-score: data value below the mean
  Positive z-score: data value above the mean
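
The standardizing step above can be sketched in a few lines of Python. The function name `z_score` and the exam numbers are hypothetical, chosen only to illustrate the formula.

```python
def z_score(value, mean, sd):
    """Standardize one value: subtract the mean, then divide by the SD."""
    return (value - mean) / sd

# Hypothetical exam with mean 75 and standard deviation 5
above = z_score(85, 75, 5)   # positive: the value is above the mean
below = z_score(65, 75, 5)   # negative: the value is below the mean
print(above, below)
```

The sign shows which side of the mean the value is on, and the size shows how many standard deviations away it is.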
Benefits of Standardizing


Standardized values (z-scores) are expressed in the standard statistical unit
of standard deviations from the mean.
Values that are measured on different scales
or in different units can now be compared.
Example

Which performance is better?


Bacher ran the 800-m in 129 seconds, which was 8 seconds
faster than the mean of 137 seconds. How many standard
deviations better than the mean is that? The standard deviation
of all the qualifying runners was 5 seconds, so her time was
(129 - 137)/5 = -1.6, or 1.6 standard deviations better than the
mean (a lower time is better). Prokhonova’s winning jump was
60 cm longer than the mean 6-m jump. The standard deviation was
30 cm, so the winning jump was 60/30 = 2 standard deviations
better than the mean.
The long jump was the better performance because it was a greater
improvement over its mean than the winning 800-m time, as measured
in standard deviations.
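
The comparison in this example can be reproduced as a short calculation. This is only a sketch: the variable names are made up, and the jump is written in centimetres (660 cm against a 600 cm mean) so that it shares units with its 30 cm standard deviation.

```python
# 800-m run, in seconds: a LOWER time is better
run_z = (129 - 137) / 5

# Long jump, in centimetres: a LONGER jump is better
jump_z = (660 - 600) / 30

# Express both as "standard deviations better than the mean"
run_sds_better = -run_z      # flip the sign because lower times are better
jump_sds_better = jump_z

better = "long jump" if jump_sds_better > run_sds_better else "800-m run"
print(run_z, jump_z, better)
```

Flipping the sign of the running z-score is what makes the two events comparable on the same "bigger is better" scale.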
Shifting Data

Adding or subtracting a constant amount to each value just adds or
subtracts the same constant to:
  the mean and median
  the maximum, minimum, and quartiles
The spread does not change because the distribution is simply shifting.
  The range, IQR, and standard deviation remain the same.
Recap: Adding a constant to every data value adds
the same constant to measures of center and
percentiles, but leaves measures of spread
unchanged.
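
A quick numerical check of the shifting rule, using Python's standard `statistics` module. The list `scores` is invented for illustration and is not data from the slides.

```python
from statistics import mean, median, stdev

scores = [12, 27, 30, 32, 35, 36, 38]      # hypothetical quiz scores
shifted = [x + 10 for x in scores]         # add the same constant to every value

# Measures of center move by the constant
mean_change = mean(shifted) - mean(scores)
median_change = median(shifted) - median(scores)

# Measures of spread do not move
range_change = (max(shifted) - min(shifted)) - (max(scores) - min(scores))
sd_change = stdev(shifted) - stdev(scores)

print(mean_change, median_change, range_change, sd_change)
```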
Rescaling Data

Rescaling data is multiplying or dividing all values by the same number.
  It changes the measurement units.
  Ex. Feet to inches (multiply by 12)
When we divide or multiply all the data values by any constant value,
both measures of location (mean and median) and measures of spread
(range, IQR, and standard deviation) are divided or multiplied by that
same value.
Example

Suppose the class took a 40-point quiz. The results show a mean score
of 30, median of 32, IQR 8, SD 6, min 12, and Q1 27. (Suppose YOU got
a 35.) What happens to each statistic if:
  I decide to weight the quiz as 50 points, and will add 10 points to
  each score? Your score is now a 45.
  I decide to weight the quiz as 80 points, and I double each score?
  Your score is now a 70.
  I decide to count the quiz as 100 points; I’ll double each score and
  add 20 points? Your score is now a 90.
Table
Statistic    Original x   x + 10   2x    2x + 20
Mean         30           40       60    80
Median       32           42       64    84
IQR          8            8        16    16
SD           6            6        12    12
Minimum      12           22       24    44
Q1           27           37       54    74
Your score   35           45       70    90
What happened?
Measures of center and position are affected by both addition and
multiplication.
Measures of spread are affected only by multiplication.
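
The two rules can be applied directly to the quiz summary statistics. The sketch below is an illustration, not part of the original slides: the function `transform_summary` and its parameter names are hypothetical, and it assumes a positive multiplier (a negative one would swap minimum and maximum).

```python
SPREAD = {"IQR", "SD"}   # statistics that ignore an added constant

def transform_summary(summary, multiply_by=1, add=0):
    """Summary statistics after replacing each score x by multiply_by * x + add."""
    if multiply_by <= 0:
        raise ValueError("this sketch assumes a positive multiplier")
    result = {}
    for name, value in summary.items():
        if name in SPREAD:
            result[name] = multiply_by * value          # spread: multiplication only
        else:
            result[name] = multiply_by * value + add    # center and position: both
    return result

original = {"Mean": 30, "Median": 32, "IQR": 8, "SD": 6,
            "Minimum": 12, "Q1": 27, "Your score": 35}

plus_10 = transform_summary(original, add=10)
doubled = transform_summary(original, multiply_by=2)
doubled_plus_20 = transform_summary(original, multiply_by=2, add=20)
print(doubled_plus_20)
```

Each result should match the corresponding column of the quiz table.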

Back to z-scores


Standardizing data into z-scores is shifting them by the mean and
rescaling them by the standard deviation.
Standardizing:
  does not change the shape of the distribution of a variable
  changes the center by making the mean 0
  changes the spread by making the standard deviation 1
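
To see the shift and rescale at work on a whole data set, here is a minimal sketch using the standard `statistics` module. The data are hypothetical, and `stdev` (the n - 1 version) is an assumption about which standard deviation is intended.

```python
from statistics import mean, stdev

def standardize(values):
    """Shift by the mean, then rescale by the standard deviation."""
    m = mean(values)
    s = stdev(values)
    return [(v - m) / s for v in values]

data = [12, 27, 30, 32, 35, 36, 38]   # hypothetical scores
z = standardize(data)

print(round(mean(z), 10), round(stdev(z), 10))
```

The order of the values is unchanged, so the shape of the distribution is the same. Only the center and spread change.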
When is a z-score BIG?


Normal models: appropriate for distributions whose shapes are unimodal
and roughly symmetric
Parameter: a numerically valued attribute of a model
  Ex. The values of μ (mean) and σ (standard deviation) in the N(μ, σ)
  model are parameters.
Summaries of data are called statistics.
Standard Normal model (standard Normal distribution): the Normal model
with mean μ = 0 and standard deviation σ = 1
The 68-95-99.7 Rule

In a Normal model:
  about 68% of the data fall within one standard deviation of the mean
  about 95% of the data fall within two standard deviations of the mean
  about 99.7% of the data fall within three standard deviations of the mean
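
The three percentages can be checked against the standard Normal model with Python's `statistics.NormalDist` (Python 3.8 or later). The helper name `within` is hypothetical.

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0, sigma=1)   # the standard Normal model

def within(k):
    """Fraction of a Normal model lying within k standard deviations of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} SD: {within(k):.4f}")
```

The printed values are close to, but not exactly, 68%, 95%, and 99.7%, which is why the rule says "about".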
The First Three Rules for
Working with Normal Models

Make a picture.

Make a picture.

Make a picture.
Working with the 68-95-99.7 Rule
Step by Step


SAT scores are designed to have an overall mean of 500 and standard
deviation of 100. Where do you stand among other students if you
earned a 600? (Use the 68-95-99.7 Rule.)
  Make a picture.
  Model the score with N(500, 100).
Continued (page 110 and 111)


A score of 600 is one standard deviation above the mean.
About 32% (100% - 68%) of those who took the test were more than
one standard deviation away from the mean, but only half of those
were on the high side.
  About 16% of the test scores were better than 600.
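
The rule-of-thumb answer can be compared with the value the N(500, 100) model gives directly. This sketch uses Python's `statistics.NormalDist`; the variable names are illustrative.

```python
from statistics import NormalDist

sat = NormalDist(mu=500, sigma=100)      # the model N(500, 100) from the slide

z = (600 - sat.mean) / sat.stdev         # 600 is one SD above the mean

rule_estimate = (1 - 0.68) / 2           # half of the 32% outside one SD
model_value = 1 - sat.cdf(600)           # area above 600 under the model

print(z, rule_estimate, round(model_value, 4))
```

The two answers agree to about two decimal places, so the 68-95-99.7 Rule is a reasonable shortcut here.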
Finding Normal Percentiles by Hand


The normal percentile corresponding to a z-score
gives the percentage of values in a standard Normal
distribution found at that z-score or below.
Table of Normal percentiles: used when a value doesn’t fall exactly
1, 2, or 3 standard deviations from the mean
  convert the data value into a z-score before using the table
  look down the left column of the table for the first two digits of z
  look across the top row for the third digit
  the table entry where that row and column meet is your percentile
Normal Percentiles Using Technology

normalcdf: finds the area between 2 z-scores
  Example: find the area between z = -5 and z = 10.
  2nd DISTR, normalcdf(zLeft, zRight)
  2nd DISTR, normalcdf(-5, 10)
When you want infinity as your cut point, use -99 or 99.
  Ex. What percentage of a Normal model is more than 1.8 standard
  deviations above the mean?
  2nd DISTR, normalcdf(1.8, 99) = .0359
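
For readers without the calculator, the same areas can be found in Python. The function below is a hypothetical stand-in that mimics the calculator's `normalcdf(zLeft, zRight)` call. It is built on `statistics.NormalDist` and is not the calculator's own implementation.

```python
from statistics import NormalDist

std_normal = NormalDist()   # standard Normal model: mean 0, SD 1

def normalcdf(z_left, z_right):
    """Area under the standard Normal curve between two z-scores."""
    return std_normal.cdf(z_right) - std_normal.cdf(z_left)

# Same trick as on the calculator: 99 stands in for infinity
upper_tail = normalcdf(1.8, 99)
print(round(upper_tail, 4))
```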

From Percentiles to Scores: z in
Reverse

What z-score represents the first quartile in a normal
model? (25th percentile)

  go to 2nd DISTR, invNorm
  specify the desired percentile, invNorm(.25), and press ENTER
  the cut point for the 25th percentile is z = -.674
What z-score cuts off the highest 10% of a Normal model?
  Since we want the cut point for the highest 10%, we know that 90%
  must be below the z-score.
  invNorm(.90) = 1.28
  10% of the area in a Normal model is more than 1.28 standard
  deviations above the mean.
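
Going from a percentile back to a z-score corresponds to the inverse CDF. The sketch below uses `NormalDist.inv_cdf` from Python's standard library in place of `invNorm`. The last step, converting the z-score to a score under the N(500, 100) model used earlier, is added only as an illustration.

```python
from statistics import NormalDist

std_normal = NormalDist()

first_quartile_z = std_normal.inv_cdf(0.25)   # 25% of the area lies below this z
top_10_percent_z = std_normal.inv_cdf(0.90)   # 90% below, so 10% above

# Illustration: undo the standardizing to get a score under N(500, 100)
top_10_percent_score = 500 + top_10_percent_z * 100

print(round(first_quartile_z, 3), round(top_10_percent_z, 2))
```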
Are You Normal? How Can You Tell?

Draw a histogram: if the histogram is unimodal and symmetric, the
Normal model is appropriate to use.
  Usually the easiest way to tell if the distribution is Normal
Normal probability plot: a display to assess whether a distribution
of data is Normal
  The Normal model is appropriate if the plot is nearly straight.
  Deviations from a straight line indicate that the distribution is
  not Normal.
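
The points of a Normal probability plot can be computed without any plotting library. This is a rough sketch under stated assumptions: the plotting position (i - 0.5)/n is one common convention, using a correlation as a numeric "straightness" check is an addition not found in the slides, and both data sets are invented.

```python
from statistics import NormalDist, mean

def normal_plot_points(data):
    """Pair each sorted value with the z-score expected for its rank."""
    n = len(data)
    std_normal = NormalDist()
    expected_z = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(expected_z, sorted(data)))

def straightness(points):
    """Correlation of the plotted points; values near 1 mean a nearly straight plot."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

roughly_symmetric = [48, 52, 55, 57, 60, 62, 63, 65, 68, 71, 75]   # hypothetical
right_skewed = [1, 1, 2, 2, 3, 3, 4, 6, 9, 20, 60]                 # hypothetical

r_symmetric = straightness(normal_plot_points(roughly_symmetric))
r_skewed = straightness(normal_plot_points(right_skewed))
print(round(r_symmetric, 3), round(r_skewed, 3))
```

The symmetric data give points that lie close to a line, while the skewed data bend away from it. The picture is still the better diagnostic, so the number should not replace the plot.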
What Can Go Wrong?


Don’t use Normal models
when the distribution is not
unimodal and symmetric.
Don’t use the mean and standard deviation when outliers are present.
  Both the mean and standard deviation can be distorted by outliers.
Let’s Try One! Page 102, #19 a-c

What percent of a standard Normal model is found in each region?

a) z > 1.5
  normalcdf(1.5, 99) = 6.68%
b) z < 2.25
  normalcdf(-99, 2.25) = 98.78%
c) -1 < z < 1.15
  normalcdf(-1, 1.15) = 71.6%
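
These three answers can be double-checked with Python's `statistics.NormalDist`, which handles the open-ended regions directly, so the -99 and 99 stand-ins are not needed. The variable names are illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist()   # standard Normal model

part_a = 1 - std_normal.cdf(1.5)                    # z > 1.5
part_b = std_normal.cdf(2.25)                       # z < 2.25
part_c = std_normal.cdf(1.15) - std_normal.cdf(-1)  # -1 < z < 1.15

for label, area in (("a", part_a), ("b", part_b), ("c", part_c)):
    print(f"{label}) {area:.2%}")
```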
Let’s Try Another! Page 102, #21 a-c

In a Normal model, what values of z cut off the region described?

a) highest 20%
  invNorm(.8) = .842
b) highest 75%
  invNorm(.25) = -.674
c) lowest 3%
  invNorm(.03) = -1.881
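
A short check of these cut points, again using `NormalDist.inv_cdf` as a stand-in for `invNorm`. The two helper names are hypothetical. The key step is turning a "highest" region into the area below the cut point.

```python
from statistics import NormalDist

std_normal = NormalDist()

def cutoff_for_highest(fraction):
    """z-score with the given fraction of the model above it."""
    return std_normal.inv_cdf(1 - fraction)

def cutoff_for_lowest(fraction):
    """z-score with the given fraction of the model below it."""
    return std_normal.inv_cdf(fraction)

part_a = cutoff_for_highest(0.20)
part_b = cutoff_for_highest(0.75)
part_c = cutoff_for_lowest(0.03)
print(round(part_a, 3), round(part_b, 3), round(part_c, 3))
```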