6-B - rlhawkmath


Putting Statistics to Work
Copyright © 2011 Pearson Education, Inc.
Unit 6B
Measures of Variation
Why Variation Matters
Consider the following waiting times for 11 customers
at 2 banks.
Big Bank (three lines):
4.1 5.2 5.6 6.2 6.7 7.2
7.7 7.7 8.5 9.3 11.0
Best Bank (one line):
6.6 6.7 6.7 6.9 7.1 7.2
7.3 7.4 7.7 7.8 7.8
Which bank is likely to have more unhappy customers?
→ Big Bank, due to more surprise long waits
The range, R, of a variable is the difference
between the largest data value and the smallest
data value. That is
Range = R = Largest Data Value – Smallest Data Value
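As a quick illustration, the range formula can be applied to the two sets of bank waiting times listed earlier. This is a minimal sketch; the names `big_bank`, `best_bank`, and `data_range` are made up for the example and do not come from the slides.

```python
# Waiting times copied from the Big Bank / Best Bank example.
big_bank = [4.1, 5.2, 5.6, 6.2, 6.7, 7.2, 7.7, 7.7, 8.5, 9.3, 11.0]
best_bank = [6.6, 6.7, 6.7, 6.9, 7.1, 7.2, 7.3, 7.4, 7.7, 7.8, 7.8]


def data_range(values):
    """Range = largest data value - smallest data value."""
    return max(values) - min(values)


# Round to one decimal place to hide floating-point noise.
big_bank_range = round(data_range(big_bank), 1)
best_bank_range = round(data_range(best_bank), 1)

print("Big Bank range: ", big_bank_range)   # 6.9
print("Best Bank range:", best_bank_range)  # 1.2
```

The much larger range at Big Bank agrees with the observation that its customers face more surprise long waits.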
Quartiles

The lower quartile (or first quartile) divides the
lowest fourth of a data set from the upper three-fourths.
It is the median of the data values in the
lower half of a data set.

The middle quartile (or second quartile) is the
overall median.

The upper quartile (or third quartile) divides the
lower three-fourths of a data set from the upper
fourth. It is the median of the data values in the
upper half of a data set.
Quartiles divide data sets into fourths, or four equal parts.
• The 1st quartile, denoted Q1, divides the bottom 25% of the data from the top 75%.
Therefore, the 1st quartile is equivalent to the 25th percentile.
• The 2nd quartile divides the bottom 50% of the data from the top 50% of the data,
so that the 2nd quartile is equivalent to the 50th percentile, which is equivalent to
the median.
• The 3rd quartile divides the bottom 75% of the data from the top 25% of the data,
so that the 3rd quartile is equivalent to the 75th percentile.
© 2010 Pearson Prentice Hall. All rights reserved
EXAMPLE
Finding and Interpreting Quartiles
A group of Brigham Young University—Idaho students (Matthew Herring,
Nathan Spencer, Mark Walker, and Mark Steiner) collected data on the speed
of vehicles traveling through a construction zone on a state highway, where
the posted speed was 25 mph. The recorded speed of 14 randomly selected
vehicles is given below:
20, 24, 27, 28, 29, 30, 32, 33, 34, 36, 38, 39, 40, 40
Find and interpret the quartiles for speed in the construction zone.
Step 1: The data is already in ascending order.
Step 2: There are n = 14 observations, so the median, or second quartile, Q2, is the
mean of the 7th and 8th observations. Therefore, M = 32.5.
Step 3: The median of the bottom half of the data is the first quartile, Q1.
20, 24, 27, 28, 29, 30, 32
The median of these seven observations is 28. Therefore, Q1 = 28. The median of the
top half of the data is the third quartile, Q3. Therefore, Q3 = 38.
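The three steps above can be sketched in Python. The function name `quartiles` is hypothetical, and the sketch assumes the median-of-halves convention used in this example, with the overall median left out of both halves when the number of observations is odd. Statistical software often uses other quartile conventions and may give slightly different values.

```python
from statistics import median


def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves method.

    Assumption: when the number of values is odd, the overall median
    is excluded from both the lower and the upper half.
    """
    data = sorted(values)              # Step 1: ascending order
    half = len(data) // 2
    lower_half = data[:half]
    upper_half = data[len(data) - half:]
    q2 = median(data)                  # Step 2: overall median
    q1 = median(lower_half)            # Step 3: medians of the halves
    q3 = median(upper_half)
    return q1, q2, q3


speeds = [20, 24, 27, 28, 29, 30, 32, 33, 34, 36, 38, 39, 40, 40]
q1, q2, q3 = quartiles(speeds)
print(q1, q2, q3)  # 28 32.5 38
```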
Interpretation:
• 25% of the speeds are less than or equal to the first quartile, 28 miles
per hour, and 75% of the speeds are greater than 28 miles per hour.
• 50% of the speeds are less than or equal to the second quartile, 32.5
miles per hour, and 50% of the speeds are greater than 32.5 miles per
hour.
• 75% of the speeds are less than or equal to the third quartile, 38
miles per hour, and 25% of the speeds are greater than 38 miles per
hour.
The Five-Number Summary

The five-number summary for a data set
consists of the following five numbers:
• low value
• lower quartile
• median
• upper quartile
• high value
A boxplot shows the five-number summary
visually, with a rectangular box enclosing the
lower and upper quartiles, a line marking the
median, and whiskers extending to the low and
high values.
The Five-Number Summary
Five-number summary of the waiting times at each bank:
                     Big Bank    Best Bank
low value (min)      4.1         6.6
lower quartile       5.6         6.7
median               7.2         7.2
upper quartile       8.5         7.7
high value (max)     11.0        7.8
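The five-number summaries for the two banks can be reproduced with a short script. This is a sketch under the same median-of-halves assumption as the quartile definitions; `five_number_summary` is an illustrative name, not something from the slides.

```python
from statistics import median


def five_number_summary(values):
    """Return (low, lower quartile, median, upper quartile, high).

    Assumption: each quartile is the median of one half of the sorted
    data, and the overall median is left out of both halves when the
    number of values is odd.
    """
    data = sorted(values)
    half = len(data) // 2
    lower_half = data[:half]
    upper_half = data[len(data) - half:]
    return (data[0], median(lower_half), median(data),
            median(upper_half), data[-1])


big_bank = [4.1, 5.2, 5.6, 6.2, 6.7, 7.2, 7.7, 7.7, 8.5, 9.3, 11.0]
best_bank = [6.6, 6.7, 6.7, 6.9, 7.1, 7.2, 7.3, 7.4, 7.7, 7.8, 7.8]

big_summary = five_number_summary(big_bank)
best_summary = five_number_summary(best_bank)
print("Big Bank: ", big_summary)   # (4.1, 5.6, 7.2, 8.5, 11.0)
print("Best Bank:", best_summary)  # (6.6, 6.7, 7.2, 7.7, 7.8)
```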
The corresponding boxplot:
EXAMPLE
Determining and Interpreting the
Interquartile Range
Determine and interpret the interquartile range of the speed data.
Q1 = 28
Q3 = 38
IQR = Q3 – Q1
= 38 – 28
= 10
The range of the middle 50% of the speeds of cars traveling through the
construction zone is 10 miles per hour.
Suppose a 15th car travels through the construction zone at 100 miles per
hour. How does this value impact the mean, median, standard deviation, and
interquartile range?
                     Without 15th car    With 15th car
Mean                 32.1 mph            36.7 mph
Median               32.5 mph            33 mph
Standard deviation   6.2 mph             18.5 mph
IQR                  10 mph              11 mph
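The comparison can be checked with Python's standard `statistics` module. This is a sketch: the helpers `iqr` and `describe` are hypothetical names, and the IQR uses the median-of-halves convention assumed throughout these examples.

```python
from statistics import mean, median, stdev


def iqr(values):
    """Interquartile range, Q3 - Q1, using medians of the two halves."""
    data = sorted(values)
    half = len(data) // 2
    return median(data[len(data) - half:]) - median(data[:half])


def describe(values):
    """Return (mean, median, sample standard deviation, IQR)."""
    return (round(mean(values), 1), median(values),
            round(stdev(values), 1), iqr(values))


speeds = [20, 24, 27, 28, 29, 30, 32, 33, 34, 36, 38, 39, 40, 40]

without_car = describe(speeds)
with_car = describe(speeds + [100])   # add the 15th car at 100 mph

print("Without 15th car:", without_car)  # (32.1, 32.5, 6.2, 10)
print("With 15th car:   ", with_car)     # (36.7, 33, 18.5, 11)
```

The mean and standard deviation shift noticeably when the single extreme value is added, while the median and IQR barely move.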
The closing prices for 9 telecommunications stocks
are shown below. Compute the interquartile range,
IQR.
3.14 5.70 6.72
15.63 17.75 28.12
31.24 40.87 71.64
A. 29.845
B. 68.32
C. 6.21
D. 36.055
EXAMPLE
Determining and Interpreting the
Interquartile Range
Check the speed data for outliers.
Step 1: The first and third quartiles are Q1 = 28 mph and Q3 = 38 mph.
Step 2: The interquartile range is 10 mph.
Step 3: The fences are
Lower Fence = Q1 – 1.5(IQR) = 28 – 1.5(10) = 13 mph
Upper Fence = Q3 + 1.5(IQR) = 38 + 1.5(10) = 53 mph
Step 4: There are no values less than 13 mph or greater than 53 mph.
Therefore, there are no outliers.
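The four steps of the outlier check can be sketched as follows. The function name `find_outliers` is made up for illustration, and the quartiles are computed with the median-of-halves convention assumed in the earlier examples.

```python
from statistics import median


def find_outliers(values):
    """Return (lower fence, upper fence, list of outliers)."""
    data = sorted(values)
    half = len(data) // 2
    q1 = median(data[:half])                 # Step 1: quartiles
    q3 = median(data[len(data) - half:])
    iqr = q3 - q1                            # Step 2: interquartile range
    lower_fence = q1 - 1.5 * iqr             # Step 3: fences
    upper_fence = q3 + 1.5 * iqr
    # Step 4: anything outside the fences is an outlier.
    outliers = [x for x in data if x < lower_fence or x > upper_fence]
    return lower_fence, upper_fence, outliers


speeds = [20, 24, 27, 28, 29, 30, 32, 33, 34, 36, 38, 39, 40, 40]
lower, upper, outliers = find_outliers(speeds)
print(lower, upper, outliers)  # 13.0 53.0 []

# With the hypothetical 15th car at 100 mph, that value is flagged.
_, _, outliers_with_fast_car = find_outliers(speeds + [100])
print(outliers_with_fast_car)  # [100]
```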
EXAMPLE
Obtaining the Five-Number Summary
Every six months, the United States Federal Reserve Board conducts a survey of credit card
plans in the U.S. The following data are the interest rates charged by 10 credit card issuers
randomly selected for the July 2005 survey. Determine the five-number summary of the data.
Institution                              Rate
Pulaski Bank and Trust Company           6.5%
Rainier Pacific Savings Bank             12.0%
Wells Fargo Bank NA                      14.4%
Firstbank of Colorado                    14.4%
Lafayette Ambassador Bank                14.3%
Infibank                                 13.0%
United Bank, Inc.                        13.3%
First National Bank of The Mid-Cities    13.9%
Bank of Louisiana                        9.9%
Bar Harbor Bank and Trust Company        14.5%
Source: http://www.federalreserve.gov/pubs/SHOP/survey.htm
First, we write the data in ascending order:
6.5%, 9.9%, 12.0%, 13.0%, 13.3%, 13.9%, 14.3%, 14.4%, 14.4%, 14.5%
The smallest number is 6.5%. The largest number is 14.5%. The first
quartile is 12.0%. The second quartile is 13.6%. The third quartile
is 14.4%.
Five-number Summary:
6.5% 12.0% 13.6% 14.4% 14.5%
EXAMPLE
Constructing a Boxplot
Every six months, the United States Federal Reserve Board conducts a survey of
credit card plans in the U.S. The following data are the interest rates charged by
10 credit card issuers randomly selected for the July 2005 survey. Draw a boxplot
of the data.
Institution                              Rate
Pulaski Bank and Trust Company           6.5%
Rainier Pacific Savings Bank             12.0%
Wells Fargo Bank NA                      14.4%
Firstbank of Colorado                    14.4%
Lafayette Ambassador Bank                14.3%
Infibank                                 13.0%
United Bank, Inc.                        13.3%
First National Bank of The Mid-Cities    13.9%
Bank of Louisiana                        9.9%
Bar Harbor Bank and Trust Company        14.5%
Source: http://www.federalreserve.gov/pubs/SHOP/survey.htm
Step 1: The interquartile range (IQR) is 14.4% - 12% = 2.4%. The lower and
upper fences are:
Lower Fence = Q1 – 1.5(IQR) = 12 – 1.5(2.4) = 8.4%
Upper Fence = Q3 + 1.5(IQR) = 14.4 + 1.5(2.4) = 18.0%
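The numbers needed to draw the boxplot can be collected with a short script. This is a sketch with two assumptions that the slides do not state in the surviving text: quartiles are medians of the two halves of the sorted data, and each whisker ends at the most extreme data value that lies inside the fences (a common convention). All variable names are illustrative.

```python
from statistics import median

# Interest rates (percent) from the credit card example.
rates = [6.5, 12.0, 14.4, 14.4, 14.3, 13.0, 13.3, 13.9, 9.9, 14.5]

data = sorted(rates)
half = len(data) // 2
q1 = median(data[:half])
q2 = median(data)
q3 = median(data[len(data) - half:])

iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Values outside the fences are outliers and are plotted individually.
outliers = [x for x in data if x < lower_fence or x > upper_fence]

# Assumed convention: whiskers stop at the most extreme values that
# are still inside the fences.
inside = [x for x in data if lower_fence <= x <= upper_fence]
whisker_low, whisker_high = inside[0], inside[-1]

print("box from", q1, "to", q3, "with median", round(q2, 1))
print("fences:", round(lower_fence, 1), round(upper_fence, 1))
print("whiskers:", whisker_low, whisker_high)
print("outliers:", outliers)
```

Under these assumptions the only value outside the fences is 6.5%, and the whiskers end at 9.9% and 14.5%.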
Step 2: [Figure: boxplot of the interest rates drawn on a number line, with an asterisk (*) and brackets ([ ]) among its marks]
The interest rate boxplot indicates that the distribution is skewed left.
Use the boxplot to identify the first quartile.
[Figure: boxplot with labeled values 10, 18, 24, 26, and 30 above an axis running from 10 to 30 in steps of 2]
A. 10
B. 18
C. 24
D. 26
Copyright © 2010 Pearson Education, Inc.
Standard Deviation
The standard deviation is the single number most
commonly used to describe variation.
$$\text{standard deviation} = \sqrt{\frac{\text{sum of (deviations from the mean)}^2}{\text{total number of data values} - 1}}$$
Calculating the Standard Deviation
The standard deviation is calculated by completing
the following steps:
1. Compute the mean of the data set. Then find the
deviation from the mean for every data value.
deviation from the mean = data value – mean
2. Find the squares of all the deviations from the mean.
3. Add all the squares of the deviations from the mean.
4. Divide this sum by the total number of data values
minus 1.
5. The standard deviation is the square root of this
quotient.
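The five steps translate almost line for line into code. The sketch below applies them to the bank waiting times from the start of the unit; the function name `sample_standard_deviation` is an illustrative choice, and Python's built-in `statistics.stdev` should give the same result.

```python
from math import sqrt


def sample_standard_deviation(values):
    """Follow the five steps for the standard deviation."""
    # Step 1: mean, then the deviation of every data value from it.
    mean = sum(values) / len(values)
    deviations = [x - mean for x in values]
    # Step 2: square each deviation.
    squares = [d ** 2 for d in deviations]
    # Step 3: add the squared deviations.
    total = sum(squares)
    # Step 4: divide by the number of data values minus 1.
    quotient = total / (len(values) - 1)
    # Step 5: take the square root.
    return sqrt(quotient)


big_bank = [4.1, 5.2, 5.6, 6.2, 6.7, 7.2, 7.7, 7.7, 8.5, 9.3, 11.0]
best_bank = [6.6, 6.7, 6.7, 6.9, 7.1, 7.2, 7.3, 7.4, 7.7, 7.8, 7.8]

big_sd = sample_standard_deviation(big_bank)
best_sd = sample_standard_deviation(best_bank)
print(round(big_sd, 2))   # roughly 1.96
print(round(best_sd, 2))  # roughly 0.44
```

Both banks have the same mean wait, but the standard deviation at Big Bank is more than four times larger.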
Standard Deviation
Let A = {2, 8, 9, 12, 19} with a mean of 10. Find the sample
standard deviation of the data set A.
x (data value)    x – mean (deviation)    (deviation)²
2                 2 – 10 = –8             (–8)² = 64
8                 8 – 10 = –2             (–2)² = 4
9                 9 – 10 = –1             (–1)² = 1
12                12 – 10 = 2             (2)² = 4
19                19 – 10 = 9             (9)² = 81
Total                                     154
$$\text{standard deviation} = \sqrt{\frac{\text{sum of (deviations from the mean)}^2}{\text{total number of data values} - 1}} = \sqrt{\frac{154}{5 - 1}} \approx 6.2$$
The Range Rule of Thumb

The standard deviation is approximately related to
the range of a data set by the range rule of
thumb:
$$\text{standard deviation} \approx \frac{\text{range}}{4}$$

If we know the standard deviation for a data set,
we estimate the low and high values as follows:
$$\text{low value} \approx \text{mean} - 2 \times \text{standard deviation}$$
$$\text{high value} \approx \text{mean} + 2 \times \text{standard deviation}$$
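Both directions of the range rule of thumb can be tried on data from earlier slides. This is a rough sketch with made-up function names, and the results are only estimates, as the rule itself is approximate.

```python
def estimate_sd_from_range(low, high):
    """Range rule of thumb: standard deviation is about range / 4."""
    return (high - low) / 4


def estimate_low_high(mean, sd):
    """Estimate the low and high values as mean -/+ 2 standard deviations."""
    return mean - 2 * sd, mean + 2 * sd


# Big Bank waiting times run from 4.1 to 11.0.
sd_estimate = round(estimate_sd_from_range(4.1, 11.0), 3)
print(sd_estimate)  # 1.725

# Data set A = {2, 8, 9, 12, 19} has mean 10 and standard deviation about 6.2.
low, high = estimate_low_high(10, 6.2)
print(round(low, 1), round(high, 1))  # -2.4 22.4
```

For data set A the actual low and high values are 2 and 19, so the estimates are only ballpark figures.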
Assignment

P. 389-390 7-25 odd