Sullivan 2nd ed Chapter 3
Chapter 3 – Section 4: Measures of Position
● Learning objectives
1 Determine and interpret z-scores
2 Determine and interpret percentiles
3 Determine and interpret quartiles
4 Check a set of data for outliers
● Mean / median describe the “center” of the data
● Variance / standard deviation describe the
“spread” of the data
● This section discusses more precise ways to
describe the relative position of a data value
within the entire set of data
● Learning objective 1: Determine and interpret z-scores
● The standard deviation is a measure of
dispersion that is expressed in the same units as the
data (remember the empirical rule)
● The distance of a data value from the mean,
calculated as the number of standard deviations,
would be a useful measurement
● This distance is called the z-score
If the mean was 20 and the standard deviation was 6
● The value 26 would have a z-score of 1.0 (1.0 standard deviation higher than the mean)
z = (26 – 20) / 6 = 1.0
● The value 14 would have a z-score of –1.0 (1.0 standard deviation lower than the mean)
z = (14 – 20) / 6 = –1.0
● The value 17 would have a z-score of –0.5 (0.5 standard deviations lower than the mean)
z = (17 – 20) / 6 = –0.5
● The value 20 would have a z-score of 0.0
z = (20 – 20) / 6 = 0.0
● The population z-score is calculated using the
population mean μ and population standard
deviation σ
z = (x – μ) / σ
● The sample z-score is calculated using the
sample mean x̄ and sample standard deviation s
z = (x – x̄) / s
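The z-score formula can be sketched in a few lines of Python. The function name `z_score` and its parameter names are illustrative choices, not part of the slides; the same arithmetic applies whether the mean and standard deviation come from a population or a sample.

```python
def z_score(x, mean, sd):
    """Return how many standard deviations x lies above (+) or below (-) the mean."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    return (x - mean) / sd


# Values from the slides: mean 20, standard deviation 6.
for value in (26, 14, 17, 20):
    print(value, z_score(value, mean=20, sd=6))
```

A positive result means the value is above the mean, a negative result means it is below, and zero means it equals the mean.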
● z-scores can be used to compare the relative
positions of data values in different samples
Pat received a grade of 82 on her statistics exam
where the mean grade was 74 and the standard
deviation was 12
Pat received a grade of 72 on her biology exam
where the mean grade was 65 and the standard
deviation was 10
Pat received a grade of 91 on her kayaking exam
where the mean grade was 88 and the standard
deviation was 6
● Statistics
Grade of 82
z-score of (82 – 74) / 12 = .67
● Biology
Grade of 72
z-score of (72 – 65) / 10 = .70
● Kayaking
Grade of 91
z-score of (91 – 88) / 6 = .50
● Biology was the highest relative grade
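The comparison of Pat's three grades can be reproduced with a short sketch. The dictionary layout and variable names are assumptions made for illustration; the grades, means, and standard deviations are the ones given on the slides.

```python
# (grade, class mean, class standard deviation) for each exam
exams = {
    "Statistics": (82, 74, 12),
    "Biology": (72, 65, 10),
    "Kayaking": (91, 88, 6),
}

# z-score = (grade - mean) / standard deviation
z_scores = {
    name: (grade - mean) / sd
    for name, (grade, mean, sd) in exams.items()
}

# The exam with the largest z-score is the best grade relative to its class.
best = max(z_scores, key=z_scores.get)

for name, z in z_scores.items():
    print(f"{name}: z = {z:.2f}")
print("Highest relative grade:", best)
```

Although 91 is the highest raw grade, it has the smallest z-score, which is why the raw grades alone would be misleading.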
● Remember the Empirical Rule: 68-95-99.7
● A manufacturer of bolts has a quality-control
policy that requires it to destroy any bolts that
are more than 2 standard deviations from the
mean. The mean length of the bolts is 8 cm with a
standard deviation of 0.05 cm.
● For what lengths will the bolts be destroyed?
● What percentage of the bolts will be destroyed?
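One way to work through the bolt question is sketched below. It assumes the bolt lengths are roughly bell-shaped, which is the condition under which the Empirical Rule (about 95% of values within 2 standard deviations of the mean) applies; the slides do not state the shape of the distribution, and the variable names are illustrative.

```python
mean_length = 8.0   # cm
sd_length = 0.05    # cm
k = 2               # policy: destroy bolts more than k standard deviations from the mean

# Lengths outside this interval are destroyed.
lower_limit = mean_length - k * sd_length
upper_limit = mean_length + k * sd_length

# Empirical Rule (assumes a bell-shaped distribution):
# about 95% of values lie within 2 standard deviations of the mean.
share_within = 0.95
share_destroyed = 1 - share_within

print(f"Destroy bolts shorter than {lower_limit:.2f} cm or longer than {upper_limit:.2f} cm")
print(f"Approximate share destroyed: {share_destroyed:.0%}")
```

Under that assumption, bolts shorter than 7.90 cm or longer than 8.10 cm are destroyed, which is roughly 5% of production.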
● Learning objective 2: Determine and interpret percentiles
● The median divides the lower 50% of the data
from the upper 50%
● The median is the 50th percentile
● If a number divides the lower 34% of the data
from the upper 66%, that number is the 34th
percentile
● The computation is similar to the one for the
median
● Calculation
Arrange the data in ascending order
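Compute the index i using the formula
i = (k / 100) (n + 1)
where k is the percentile wanted and n is the number of data values
● If i is an integer, take the ith data value
● If i is not an integer, take the mean of the two
data values in the positions just below and just above i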
● Compute the 60th percentile of
1, 3, 4, 7, 8, 15, 16, 19, 23, 24, 27, 31, 33, 34
● Calculations
There are 14 numbers (n = 14)
The 60th percentile (k = 60)
The index
i = (k / 100) (n + 1) = (60 / 100) (14 + 1) = 9
● Take the 9th value, or P60 = 23, as the 60th
percentile
● Compute the 28th percentile of
1, 3, 4, 7, 8, 15, 16, 19, 23, 24, 27, 31, 33, 54
● Calculations
There are 14 numbers (n = 14)
The 28th percentile (k = 28)
The index
i = (k / 100) (n + 1) = (28 / 100) (14 + 1) = 4.2
● Take the average of the 4th and 5th values, or
P28 = (7 + 8) / 2 = 7.5, as the 28th percentile
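The percentile procedure from these slides can be sketched as a small function. The name `percentile` and the error handling are assumptions added for illustration; the rule itself is the one shown above (index i = (k / 100)(n + 1), averaging the two neighbouring values when i is not an integer). Other textbooks and software packages use different percentile conventions and may give slightly different answers.

```python
def percentile(data, k):
    """k-th percentile using the index rule i = (k / 100)(n + 1)."""
    values = sorted(data)
    n = len(values)
    # whole is the integer part of i; remainder is nonzero when i is not an integer.
    whole, remainder = divmod(k * (n + 1), 100)
    whole = int(whole)
    if not 1 <= whole <= n or (remainder and whole == n):
        raise ValueError("index falls outside the data for this k and n")
    if remainder == 0:
        return values[whole - 1]                      # the i-th value (1-based)
    return (values[whole - 1] + values[whole]) / 2    # mean of the two neighbours


first = [1, 3, 4, 7, 8, 15, 16, 19, 23, 24, 27, 31, 33, 34]
second = [1, 3, 4, 7, 8, 15, 16, 19, 23, 24, 27, 31, 33, 54]

print(percentile(first, 60))   # index 9, so the 9th value
print(percentile(second, 28))  # index 4.2, so the mean of the 4th and 5th values
```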
● Learning objective 3: Determine and interpret quartiles
● The quartiles are the 25th, 50th, and 75th
percentiles
Q1 = 25th percentile / also median of the lower 50%
Q2 = 50th percentile = median
Q3 = 75th percentile / also median of the upper 50%
● Quartiles are the most commonly used
percentiles
● The 50th percentile and the second quartile Q2
are both other ways of defining the median
● Quartiles divide the data set into four equal parts
● The top quarter are the values between Q3 and
the maximum
● The bottom quarter are the values between the
minimum and Q1
● The interquartile range (IQR) is the difference
between the third and first quartiles
IQR = Q3 – Q1
● The IQR is a resistant measurement of
dispersion
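To illustrate what "resistant" means here, the sketch below compares the IQR and the sample standard deviation for the two versions of the data set used in this section, which differ only in the largest value (34 versus 54). The helper `percentile` is an illustrative implementation of this section's index rule i = (k / 100)(n + 1), not a library function.

```python
import statistics


def percentile(data, k):
    """k-th percentile using the index rule i = (k / 100)(n + 1)."""
    values = sorted(data)
    whole, remainder = divmod(k * (len(values) + 1), 100)
    whole = int(whole)
    if remainder == 0:
        return values[whole - 1]
    return (values[whole - 1] + values[whole]) / 2


def iqr(data):
    """Interquartile range: Q3 minus Q1."""
    return percentile(data, 75) - percentile(data, 25)


with_34 = [1, 3, 4, 7, 8, 15, 16, 19, 23, 24, 27, 31, 33, 34]
with_54 = [1, 3, 4, 7, 8, 15, 16, 19, 23, 24, 27, 31, 33, 54]

# Changing only the largest value leaves the IQR alone but moves the standard deviation.
print("IQR:", iqr(with_34), iqr(with_54))
print("Standard deviation:", statistics.stdev(with_34), statistics.stdev(with_54))
```

The IQR depends only on the middle of the ordered data, so a single extreme value does not change it, while the standard deviation grows.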
● Learning objective 4: Check a set of data for outliers
● Extreme observations in the data are referred to
as outliers
● Outliers should be investigated
● Outliers could be
Chance occurrences
Measurement errors
Data entry errors
Sampling errors
● Outliers are not necessarily invalid data
● One way to check for outliers uses the quartiles
● Outliers can be detected as values that are
significantly too high or too low, based on the
known spread
● The fences used to identify outliers are
Lower fence = LF = Q1 – 1.5 × IQR
Upper fence = UF = Q3 + 1.5 × IQR
● Values less than the lower fence or more than
the upper fence could be considered outliers
● Is the value 54 an outlier?
1, 3, 4, 7, 8, 15, 16, 19, 23, 24, 27, 31, 33, 54
● Calculations
Q1 = (4 + 7) / 2 = 5.5 / or median of lower 50% is 7
Q2 = (16 + 19) / 2 = 17.5
Q3 = (27 + 31) / 2 = 29 / or median of upper 50% is 27
IQR = 29 – 5.5 = 23.5 / or 27 – 7 = 20
UF = Q3 + 1.5 × IQR = 29 + 1.5 × 23.5 = 64.25
Or UF = Q3 + 1.5 × IQR = 27 + 1.5 × 20 = 57
● With either method, 54 is below the upper fence, so it is not flagged as an outlier
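The fence check can be sketched as follows, using the percentile-based quartiles (Q1 = 5.5, Q3 = 29). The helper `percentile` is an illustrative implementation of this section's index rule i = (k / 100)(n + 1); the variable names are likewise assumptions. Using the "median of each half" quartiles instead would give different fences.

```python
def percentile(data, k):
    """k-th percentile using the index rule i = (k / 100)(n + 1)."""
    values = sorted(data)
    whole, remainder = divmod(k * (len(values) + 1), 100)
    whole = int(whole)
    if remainder == 0:
        return values[whole - 1]
    return (values[whole - 1] + values[whole]) / 2


data = [1, 3, 4, 7, 8, 15, 16, 19, 23, 24, 27, 31, 33, 54]

q1 = percentile(data, 25)
q3 = percentile(data, 75)
iqr = q3 - q1

lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Values beyond either fence are candidates for closer investigation.
outliers = [x for x in data if x < lower_fence or x > upper_fence]

print("Q1:", q1, "Q3:", q3, "IQR:", iqr)
print("Fences:", lower_fence, upper_fence)
print("Flagged values:", outliers)
```

No value is flagged for this data set, since 54 lies inside the fences.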
Summary: Chapter 3 – Section 4
● z-scores
Measure the distance from the mean in units of
standard deviations
Can compare relative positions in different samples
● Percentiles and quartiles
Divide the data so that a certain percent is lower and
a certain percent is higher
● Outliers
Extreme values of the variable
Can be identified using the upper and lower fences