Section 6B - Gordon State College
Chapter 6: Putting Statistics to Work
Measures of Variation
Copyright © 2015, 2011, 2008 Pearson Education, Inc.
Chapter 6, Unit B, Slide 1
Why Variation Matters
Consider the following waiting times for 11 customers
at each of two banks.
Big Bank (three lines):
4.1 5.2 5.6 6.2 6.7 7.2
7.7 7.7 8.5 9.3 11.0
Best Bank (one line):
6.6 6.7 6.7 6.9 7.1 7.2
7.3 7.4 7.7 7.8 7.8
Which bank is likely to have more unhappy customers?
→ Big Bank, due to more surprise long waits
Range
The range of a data set is the difference between its
highest and lowest data values.
range = highest value (max) – lowest value (min)
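As a minimal sketch, the range formula can be applied to the bank waiting times from the opening example. The function name `data_range` is a hypothetical helper chosen for illustration, not something from the slides.

```python
def data_range(values):
    """Return the range: highest value (max) minus lowest value (min)."""
    return max(values) - min(values)


# Waiting times from the "Why Variation Matters" slide
big_bank = [4.1, 5.2, 5.6, 6.2, 6.7, 7.2, 7.7, 7.7, 8.5, 9.3, 11.0]
best_bank = [6.6, 6.7, 6.7, 6.9, 7.1, 7.2, 7.3, 7.4, 7.7, 7.8, 7.8]

print(f"Big Bank range:  {data_range(big_bank):.1f}")   # 6.9
print(f"Best Bank range: {data_range(best_bank):.1f}")  # 1.2
```

The much larger range at Big Bank matches the intuition that its customers see more surprise long waits.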
Example
Consider the following two sets of quiz scores
for nine students. Which set has the greater
range? Would you also say that the scores
in that set are more varied?
Quiz 1: 1 10 10 10 10 10 10 10 10
Quiz 2: 2 3 4 5 6 7 8 9 10
Example (cont)
Solution
The range for Quiz 1 is 10 – 1 = 9 points, which is
greater than the range for Quiz 2 of 10 – 2 = 8
points. However, aside from a single low score (an
outlier), Quiz 1 has no variation at all because every
other student got a 10. In contrast, no two students
got the same score on Quiz 2, and the scores are
spread throughout the list of possible scores. The
scores on Quiz 2 are more varied even though Quiz
1 has the greater range.
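The solution's point about the outlier can be checked with a short sketch. It assumes the outlier is simply the single lowest score on Quiz 1, and `data_range` is a hypothetical helper name.

```python
def data_range(values):
    """Return the range: highest value minus lowest value."""
    return max(values) - min(values)


quiz_1 = [1, 10, 10, 10, 10, 10, 10, 10, 10]
quiz_2 = [2, 3, 4, 5, 6, 7, 8, 9, 10]

print(data_range(quiz_1))  # 9
print(data_range(quiz_2))  # 8

# Drop the single lowest score on Quiz 1 (the outlier) and recompute.
quiz_1_without_outlier = sorted(quiz_1)[1:]
print(data_range(quiz_1_without_outlier))  # 0
```

One extreme value accounts for the entire range of Quiz 1, which is why the range alone can misrepresent how varied a data set is.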
Quartiles
The lower quartile (or first quartile) divides the
lowest fourth of a data set from the upper three-fourths.
It is the median of the data values in the
lower half of a data set.
The middle quartile (or second quartile) is the
overall median.
The upper quartile (or third quartile) divides the
lower three-fourths of a data set from the upper
fourth. It is the median of the data values in the
upper half of a data set.
The Five-Number Summary
The five-number summary for a data set
consists of the following five numbers:
low value
lower quartile
median
upper quartile
high value
A boxplot shows the five-number summary
visually, with a rectangular box enclosing the
lower and upper quartiles, a line marking the
median, and whiskers extending to the low and
high values.
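A five-number summary can be computed along the following lines. This is a sketch under one stated assumption: when the data set has an odd number of values, the overall median is left out of both the lower and upper halves before their medians are taken. The slides do not spell out this convention, and other conventions give slightly different quartiles. The function name `five_number_summary` is hypothetical.

```python
from statistics import median


def five_number_summary(values):
    """Return low, lower quartile, median, upper quartile, high.

    Assumption: for an odd number of values, the overall median is
    excluded from both halves before taking their medians.
    """
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower_half = data[:half]
    upper_half = data[half + 1:] if n % 2 else data[half:]
    return {
        "low value": data[0],
        "lower quartile": median(lower_half),
        "median": median(data),
        "upper quartile": median(upper_half),
        "high value": data[-1],
    }


# Waiting times from the "Why Variation Matters" slide
big_bank = [4.1, 5.2, 5.6, 6.2, 6.7, 7.2, 7.7, 7.7, 8.5, 9.3, 11.0]
best_bank = [6.6, 6.7, 6.7, 6.9, 7.1, 7.2, 7.3, 7.4, 7.7, 7.8, 7.8]

for name, times in [("Big Bank", big_bank), ("Best Bank", best_bank)]:
    print(name, five_number_summary(times))
```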
The Five-Number Summary
Five-number summary of the waiting times at each bank:
                     Big Bank    Best Bank
low value (min)      4.1         6.6
lower quartile       5.6         6.7
median               7.2         7.2
upper quartile       8.5         7.7
high value (max)     11.0        7.8
[Figure: the corresponding boxplots for Big Bank and Best Bank]
Standard Deviation
The standard deviation is the single number most
commonly used to describe variation.
standard deviation = √[ sum of (deviations from the mean)² / (total number of data values – 1) ]
Note: This is a sample standard deviation formula.
Calculating the Standard Deviation
The standard deviation is calculated by completing
the following steps:
1. Compute the mean of the data set. Then find the
deviation from the mean for every data value.
deviation from the mean = data value – mean
2. Find the squares of all the deviations from the mean.
3. Add all the squares of the deviations from the mean.
4. Divide this sum by the total number of data values
minus 1.
5. The standard deviation is the square root of this
quotient.
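The five steps above can be sketched in Python as follows. The function name `sample_standard_deviation` is hypothetical, and the data are the bank waiting times from the opening example.

```python
from math import sqrt


def sample_standard_deviation(values):
    """Sample standard deviation, following the five steps on the slide."""
    n = len(values)
    mean = sum(values) / n
    deviations = [x - mean for x in values]  # step 1: data value - mean
    squares = [d ** 2 for d in deviations]   # step 2: square each deviation
    total = sum(squares)                     # step 3: add the squares
    quotient = total / (n - 1)               # step 4: divide by (n - 1)
    return sqrt(quotient)                    # step 5: take the square root


big_bank = [4.1, 5.2, 5.6, 6.2, 6.7, 7.2, 7.7, 7.7, 8.5, 9.3, 11.0]
best_bank = [6.6, 6.7, 6.7, 6.9, 7.1, 7.2, 7.3, 7.4, 7.7, 7.8, 7.8]

print(f"Big Bank:  {sample_standard_deviation(big_bank):.2f}")   # 1.96
print(f"Best Bank: {sample_standard_deviation(best_bank):.2f}")  # 0.44
```

Both banks have the same mean waiting time (7.2), but the standard deviation at Big Bank is more than four times that at Best Bank.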
The Range Rule of Thumb
The standard deviation is approximately related to
the range of a data set by the range rule of
thumb:
standard deviation ≈ range / 4
If we know the standard deviation for a data set,
we estimate the low and high values as follows:
low value ≈ mean – (2 × standard deviation)
high value ≈ mean + (2 × standard deviation)
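To see how rough the first approximation is, the sketch below compares range / 4 with the sample standard deviation for the bank waiting times. The helper name `estimate_sd_from_range` is hypothetical; `statistics.stdev` from the Python standard library computes the sample standard deviation (dividing by the number of values minus 1).

```python
from statistics import stdev


def estimate_sd_from_range(values):
    """Range rule of thumb: standard deviation is roughly range / 4."""
    return (max(values) - min(values)) / 4


big_bank = [4.1, 5.2, 5.6, 6.2, 6.7, 7.2, 7.7, 7.7, 8.5, 9.3, 11.0]
best_bank = [6.6, 6.7, 6.7, 6.9, 7.1, 7.2, 7.3, 7.4, 7.7, 7.8, 7.8]

for name, times in [("Big Bank", big_bank), ("Best Bank", best_bank)]:
    estimate = estimate_sd_from_range(times)
    actual = stdev(times)
    print(f"{name}: rule of thumb {estimate:.1f}, computed {actual:.1f}")
# Big Bank: rule of thumb 1.7, computed 2.0
# Best Bank: rule of thumb 0.3, computed 0.4
```

For these small data sets the estimates land in the right neighborhood without matching exactly, which is what "approximately related" suggests.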
Example
Studies of the gas mileage of a Prius under varying
driving conditions show that it gets a mean of 45
miles per gallon with a standard deviation of 4 miles
per gallon. Estimate the minimum and maximum
gas mileage that you can expect under ordinary
driving conditions.
Solution
low value ≈ mean – (2 × standard deviation)
= 45 – (2 × 4)
= 37
Example (cont)
high value ≈ mean + (2 × standard deviation)
= 45 + (2 × 4)
= 53
The range of gas mileage for the car is roughly from
a minimum of 37 miles per gallon to a maximum of
53 miles per gallon.
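The same arithmetic can be written as a small reusable sketch. The function name `estimate_low_high` is hypothetical, and the inputs are the mean and standard deviation given in the example.

```python
def estimate_low_high(mean, standard_deviation):
    """Estimate low and high values as mean -/+ 2 standard deviations."""
    low = mean - 2 * standard_deviation
    high = mean + 2 * standard_deviation
    return low, high


# Prius example: mean of 45 mpg, standard deviation of 4 mpg
low, high = estimate_low_high(mean=45, standard_deviation=4)
print(low, high)  # 37 53
```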