Sullivan 2nd ed Chapter 3

Chapter 3 – Section 4
Measures of Position
● Learning objectives
1 Determine and interpret z-scores
2 Determine and interpret percentiles
3 Determine and interpret quartiles
4 Check a set of data for outliers
● Mean / median describe the “center” of the data
● Variance / standard deviation describe the
“spread” of the data
● This section discusses more precise ways to
describe the relative position of a data value
within the entire set of data
● The standard deviation is a measure of dispersion expressed in the same units as the data (remember the Empirical Rule)
● The distance of a data value from the mean, measured in standard deviations, is a useful way to describe its position
● This distance is called the z-score
If the mean was 20 and the standard deviation was 6
▪ The value 26 would have a z-score of 1.0 (1.0 standard deviation higher than the mean): $z = \frac{26 - 20}{6} = 1.0$
▪ The value 14 would have a z-score of –1.0 (1.0 standard deviation lower than the mean): $z = \frac{14 - 20}{6} = -1.0$
▪ The value 17 would have a z-score of –0.5 (0.5 standard deviations lower than the mean): $z = \frac{17 - 20}{6} = -0.5$
▪ The value 20 would have a z-score of 0.0 (equal to the mean): $z = \frac{20 - 20}{6} = 0.0$
● The population z-score is calculated using the population mean and population standard deviation
$z = \frac{x - \mu}{\sigma}$
● The sample z-score is calculated using the sample mean and sample standard deviation
$z = \frac{x - \bar{x}}{s}$
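The formula can be sketched in a few lines of Python. This is only an illustration: the function name `z_score` and the check for a positive standard deviation are assumptions, not part of the slides. The same function serves for the population and the sample version, depending on which mean and standard deviation are passed in.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    if sd <= 0:
        # Assumption: a z-score is only meaningful when the data vary
        raise ValueError("standard deviation must be positive")
    return (x - mean) / sd


# Values from the slides: mean 20, standard deviation 6
for value in (26, 14, 17, 20):
    print(value, z_score(value, 20, 6))
```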
● z-scores can be used to compare the relative positions of data values in different samples
▪ Pat received a grade of 82 on her statistics exam, where the mean grade was 74 and the standard deviation was 12
▪ Pat received a grade of 72 on her biology exam, where the mean grade was 65 and the standard deviation was 10
▪ Pat received a grade of 91 on her kayaking exam, where the mean grade was 88 and the standard deviation was 6
● Statistics
▪ Grade of 82
▪ z-score of (82 – 74) / 12 = 0.67
● Biology
▪ Grade of 72
▪ z-score of (72 – 65) / 10 = 0.70
● Kayaking
▪ Grade of 91
▪ z-score of (91 – 88) / 6 = 0.50
● Biology was the highest relative grade
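The comparison of Pat's three grades can be reproduced with a short script. The dictionary layout and variable names below are illustrative choices; only the grades, means, and standard deviations come from the slides.

```python
# (grade, class mean, class standard deviation) for each exam, from the slides
exams = {
    "statistics": (82, 74, 12),
    "biology": (72, 65, 10),
    "kayaking": (91, 88, 6),
}

# z-score of each grade relative to its own class
z_scores = {
    name: (grade - mean) / sd
    for name, (grade, mean, sd) in exams.items()
}

# The highest z-score marks the best grade relative to the class
best = max(z_scores, key=z_scores.get)

for name, z in z_scores.items():
    print(f"{name}: z = {z:.2f}")
print("highest relative grade:", best)
```

The raw grade in kayaking is the highest of the three, but its z-score is the lowest, which is why the comparison is made on z-scores and not on raw grades.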
● Remember the Empirical Rule (68-95-99.7): for bell-shaped data, about 68%, 95%, and 99.7% of the values lie within 1, 2, and 3 standard deviations of the mean
● A manufacturer of bolts has a quality-control policy that requires it to destroy any bolts that are more than 2 standard deviations from the mean. The mean length of the bolts is 8 cm with a standard deviation of 0.05 cm.
● For what lengths will the bolts be destroyed?
● What percentage of the bolts will be destroyed?
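One way to work the bolt example is sketched below. The cutoffs follow directly from the mean and standard deviation given in the problem. The percentage relies on the Empirical Rule, so it is an approximation that assumes the bolt lengths are roughly bell-shaped. The variable and function names are illustrative.

```python
mean_cm = 8.0    # mean bolt length from the problem
sd_cm = 0.05     # standard deviation from the problem
k = 2            # bolts more than k standard deviations from the mean are destroyed

lower_cm = mean_cm - k * sd_cm   # about 7.90 cm
upper_cm = mean_cm + k * sd_cm   # about 8.10 cm


def is_destroyed(length_cm):
    """True when a bolt is more than k standard deviations from the mean."""
    return length_cm < lower_cm or length_cm > upper_cm


# Empirical Rule (assumes a bell-shaped distribution): about 95% of bolts
# fall within 2 standard deviations, so about 5% are destroyed.
pct_within = 95
pct_destroyed = 100 - pct_within

print(f"destroy bolts shorter than {lower_cm:.2f} cm or longer than {upper_cm:.2f} cm")
print(f"approximately {pct_destroyed}% of bolts are destroyed")
```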
● The median divides the lower 50% of the data
from the upper 50%
● The median is the 50th percentile
● If a number divides the lower 34% of the data
from the upper 66%, that number is the 34th
percentile
● The computation is similar to the one for the median
● Calculation
▪ Arrange the data in ascending order
▪ Compute the index i using the formula $i = \left(\frac{k}{100}\right)(n + 1)$
● If i is an integer, take the ith data value
● If i is not an integer, take the mean of the two values on either side of i
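The steps above can be sketched as a small Python function. It follows the index rule from this section, which differs from the interpolation used by many software packages, so results may not match library output. The name `percentile`, the assumption that k is a whole number, and the error raised when the index falls outside the data are illustrative choices, not part of the slides.

```python
def percentile(data, k):
    """k-th percentile using the index rule i = (k / 100) * (n + 1)."""
    values = sorted(data)          # arrange the data in ascending order
    n = len(values)
    # Split i into its whole part and remainder; exact when k is a whole number
    whole, rem = divmod(k * (n + 1), 100)
    if whole < 1 or whole > n or (whole == n and rem):
        # Assumption: the rule is undefined when i is not within positions 1..n
        raise ValueError("index i falls outside positions 1..n")
    if rem == 0:
        return values[whole - 1]   # i is an integer: take the i-th value
    # i is not an integer: mean of the two values on either side of i
    return (values[whole - 1] + values[whole]) / 2


data = [1, 3, 4, 7, 8, 15, 16, 19, 23, 24, 27, 31, 33, 54]
print(percentile(data, 28))  # 7.5
print(percentile(data, 60))  # 23
```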
● Compute the 60th percentile of
1, 3, 4, 7, 8, 15, 16, 19, 23, 24, 27, 31, 33, 34
● Calculations
▪ There are 14 numbers (n = 14)
▪ The 60th percentile (k = 60)
▪ The index $i = \left(\frac{k}{100}\right)(n + 1) = \left(\frac{60}{100}\right)(14 + 1) = 9$
● Take the 9th value, or P60 = 23, as the 60th percentile
● Compute the 28th percentile of
1, 3, 4, 7, 8, 15, 16, 19, 23, 24, 27, 31, 33, 54
● Calculations
▪ There are 14 numbers (n = 14)
▪ The 28th percentile (k = 28)
▪ The index $i = \left(\frac{k}{100}\right)(n + 1) = \left(\frac{28}{100}\right)(14 + 1) = 4.2$
● Take the average of the 4th and 5th values, or P28 = (7 + 8) / 2 = 7.5, as the 28th percentile
● The quartiles are the 25th, 50th, and 75th percentiles
▪ Q1 = 25th percentile (alternatively, the median of the lower 50%)
▪ Q2 = 50th percentile = median
▪ Q3 = 75th percentile (alternatively, the median of the upper 50%)
● Quartiles are the most commonly used
percentiles
● The 50th percentile and the second quartile Q2
are both other ways of defining the median
● Quartiles divide the data set into four equal parts
● The top quarter are the values between Q3 and
the maximum
● The bottom quarter are the values between the
minimum and Q1
● The interquartile range (IQR) is the difference between the third and first quartiles
IQR = Q3 – Q1
● The IQR is a resistant measure of dispersion (extreme values have little effect on it)
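The sketch below computes the quartiles and the IQR for the 14-value data set used in this section, under the two conventions the slides mention: the percentile index rule and the median of each half. The function names are illustrative, and leaving the middle value out of both halves when n is odd is an assumption, since the slides only show an even-sized data set.

```python
from statistics import median


def quartiles_by_index(data):
    """Q1, Q2, Q3 as the 25th, 50th, 75th percentiles, i = (k / 100) * (n + 1)."""
    values = sorted(data)
    n = len(values)
    if n < 3:
        raise ValueError("need at least 3 values")
    result = []
    for k in (25, 50, 75):
        whole, rem = divmod(k * (n + 1), 100)
        if rem == 0:
            result.append(values[whole - 1])
        else:
            result.append((values[whole - 1] + values[whole]) / 2)
    return tuple(result)


def quartiles_by_halves(data):
    """Q1 and Q3 as the medians of the lower and upper halves of the data."""
    values = sorted(data)
    n = len(values)
    half = n // 2
    # Assumption: when n is odd, the middle value belongs to neither half
    lower, upper = values[:half], values[n - half:]
    return median(lower), median(values), median(upper)


data = [1, 3, 4, 7, 8, 15, 16, 19, 23, 24, 27, 31, 33, 54]

q1, q2, q3 = quartiles_by_index(data)
print(q1, q2, q3, "IQR =", q3 - q1)   # 5.5 17.5 29.0 IQR = 23.5

h1, h2, h3 = quartiles_by_halves(data)
print(h1, h2, h3, "IQR =", h3 - h1)   # 7 17.5 27 IQR = 20
```

The two conventions agree on the median but not on Q1 and Q3, so the IQR depends on which one is used.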
● Extreme observations in the data are referred to
as outliers
● Outliers should be investigated
● Outliers could be
▪ Chance occurrences
▪ Measurement errors
▪ Data entry errors
▪ Sampling errors
● Outliers are not necessarily invalid data
● One way to check for outliers uses the quartiles
● Outliers can be detected as values that are
significantly too high or too low, based on the
known spread
● The fences used to identify outliers are
▪ Lower fence = LF = Q1 – 1.5 × IQR
▪ Upper fence = UF = Q3 + 1.5 × IQR
● Values less than the lower fence or more than
the upper fence could be considered outliers
● Is the value 54 an outlier?
1, 3, 4, 7, 8, 15, 16, 19, 23, 24, 27, 31, 33, 54
● Calculations
▪ Q1 = (4 + 7) / 2 = 5.5 (or the median of the lower 50%, which is 7)
▪ Q2 = (16 + 19) / 2 = 17.5
▪ Q3 = (27 + 31) / 2 = 29 (or the median of the upper 50%, which is 27)
▪ IQR = 29 – 5.5 = 23.5 (or 27 – 7 = 20)
▪ UF = Q3 + 1.5 × IQR = 29 + 1.5 × 23.5 = 64.25
▪ Or UF = Q3 + 1.5 × IQR = 27 + 1.5 × 20 = 57
● Under either method 54 is below the upper fence, so it is not flagged as an outlier
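The fence check can be sketched in Python as follows. The quartile values are taken as inputs under each of the two conventions used in this example (the percentile index rule, and the median of each half). The function name `fences` and the dictionary layout are illustrative.

```python
def fences(q1, q3):
    """Lower and upper fences: Q1 - 1.5 * IQR and Q3 + 1.5 * IQR."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


data = [1, 3, 4, 7, 8, 15, 16, 19, 23, 24, 27, 31, 33, 54]

# (Q1, Q3) under each convention for this data set
conventions = {
    "percentile index rule": (5.5, 29),
    "median of each half": (7, 27),
}

flagged = {}
for label, (q1, q3) in conventions.items():
    lf, uf = fences(q1, q3)
    # Values below the lower fence or above the upper fence are candidates
    flagged[label] = [x for x in data if x < lf or x > uf]
    print(f"{label}: LF = {lf}, UF = {uf}, flagged = {flagged[label]}")
```

With either pair of quartiles the upper fence lies above 54, so no value in this data set is flagged.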
Summary: Chapter 3 – Section 4
● z-scores
▪ Measure the distance from the mean in units of standard deviations
▪ Can be used to compare relative positions in different samples
● Percentiles and quartiles
▪ Divide the data so that a certain percent is lower and a certain percent is higher
● Outliers
▪ Extreme values of the variable
▪ Can be identified using the upper and lower fences