Measures of Variation


Section 3.2 Measures of Dispersion
1. Range
2. Variance
3. Standard deviation
4. Empirical Rule for bell-shaped distributions
5. Chebyshev’s Inequality for any distribution
Range
The range of a set of data is the difference
between the maximum value and the
minimum value.
Range = (maximum value) – (minimum value)
EXAMPLE
The following data represent the travel times (in minutes)
to work for all seven employees of a start-up web
development company.
23, 36, 23, 18, 5, 26, 43
Find the range.
Range = 43 – 5 = 38 minutes
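The range calculation above can be sketched in a few lines of Python. The function name `data_range` is an illustrative choice, not something from the slides.

```python
# Travel times (in minutes) for the seven employees in the example.
travel_times = [23, 36, 23, 18, 5, 26, 43]


def data_range(values):
    """Return the range: (maximum value) - (minimum value)."""
    return max(values) - min(values)


print(data_range(travel_times))  # 43 - 5 = 38
```

Because the range uses only the two extreme values, a single unusual observation can change it a great deal.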
Variance
The population variance is the sum of squared
deviations about the population mean divided by the
number of observations in the population, N.
That is, it is the mean of the squared
deviations about the population mean.
The population variance is symbolically represented
by σ² (lowercase Greek sigma, squared).
EXAMPLE
Population Variance
The following data represent the travel times (in minutes) to
work for all seven employees of a start-up web development
company.
23, 36, 23, 18, 5, 26, 43
Compute the population variance of this data. Recall that
$\mu = \frac{174}{7} \approx 24.85714$
x_i    μ           x_i – μ      (x_i – μ)²
23     24.85714    -1.85714     3.44898
36     24.85714    11.14286     124.1633
23     24.85714    -1.85714     3.44898
18     24.85714    -6.85714     47.02041
5      24.85714    -19.8571     394.3061
26     24.85714    1.142857     1.306122
43     24.85714    18.14286     329.1633
                   Σ(x_i – μ)² = 902.8571

$\sigma^2 = \frac{\sum (x_i - \mu)^2}{N} = \frac{902.8571}{7} \approx 129.0 \text{ minutes}^2$
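As a check on the table arithmetic, here is a minimal Python sketch that follows the same steps (mean, deviations, squared deviations, divide by N). The variable names are illustrative. `statistics.pvariance` from the standard library is used only as a cross-check.

```python
from statistics import mean, pvariance

# Travel times (in minutes) for all seven employees: the whole population.
travel_times = [23, 36, 23, 18, 5, 26, 43]

mu = mean(travel_times)                               # 174 / 7
squared_deviations = [(x - mu) ** 2 for x in travel_times]
sigma_squared = sum(squared_deviations) / len(travel_times)

print(round(mu, 5))                        # 24.85714
print(round(sum(squared_deviations), 4))   # 902.8571
print(round(sigma_squared, 1))             # 129.0 (minutes squared)

# Cross-check against the standard library's population variance.
print(round(pvariance(travel_times), 1))   # 129.0
```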
The sample variance is computed by determining the
sum of squared deviations about the sample mean and
then dividing this result by n – 1.
EXAMPLE
Sample Variance
For the travel time data, assume we obtained the following simple random sample:
5, 36, 26.
Compute the sample variance of the travel times.
Travel Time,   Sample Mean,   Deviation about the Mean,   Squared Deviation about the Mean,
x_i            x̄              x_i – x̄                     (x_i – x̄)²
5              22.333         5 – 22.333 = -17.333        (-17.333)² = 300.432889
36             22.333         13.667                      186.786889
26             22.333         3.667                       13.446889
                                                          Σ(x_i – x̄)² = 500.66667

$s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1} = \frac{500.66667}{3 - 1} \approx 250.333$ square minutes
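The sample variance example can be reproduced with the short Python sketch below. The only change from the population calculation is the divisor, n – 1 instead of N. The variable names are illustrative, and `statistics.variance` serves as a cross-check.

```python
from statistics import variance

# Simple random sample of travel times (in minutes) from the example.
sample = [5, 36, 26]

n = len(sample)
x_bar = sum(sample) / n                               # sample mean, about 22.333
sum_sq_dev = sum((x - x_bar) ** 2 for x in sample)    # about 500.66667
s_squared = sum_sq_dev / (n - 1)                      # divide by n - 1, not n

print(round(x_bar, 3))        # 22.333
print(round(sum_sq_dev, 5))   # 500.66667
print(round(s_squared, 3))    # 250.333 (square minutes)

# statistics.variance also divides by n - 1.
print(round(variance(sample), 3))  # 250.333
```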
Standard Deviation
The standard deviation of a set of sample
values is a measure of variation of values
about the mean.
Population standard deviation:
σ = square root of the population variance, so that $\sigma = \sqrt{\sigma^2}$
Sample standard deviation:
s = square root of the sample variance, so that $s = \sqrt{s^2}$
EXAMPLE
Population Standard Deviation
The following data represent the travel times (in minutes) to
work for all seven employees of a start-up web development
company.
23, 36, 23, 18, 5, 26, 43
Compute the population standard deviation of this data.
Recall from the last objective that σ² = 129.0 minutes².
Therefore,
$\sigma = \sqrt{\sigma^2} = \sqrt{\frac{902.8571}{7}} \approx 11.4$ minutes
EXAMPLE
Sample Standard Deviation
Recall that the sample data 5, 26, 36 result in a sample variance of
$s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1} = \frac{500.66667}{3 - 1} \approx 250.333$ square minutes
Use this result to determine the sample standard deviation.
$s = \sqrt{s^2} = \sqrt{\frac{500.666667}{3 - 1}} \approx 15.8$ minutes
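Both standard deviation examples can be checked with Python's standard library. This is a small sketch: `pstdev` divides by N (population) and `stdev` divides by n – 1 (sample), so the two functions must not be swapped.

```python
from statistics import pstdev, stdev

# Population: travel times (in minutes) for all seven employees.
travel_times = [23, 36, 23, 18, 5, 26, 43]
# Sample: the simple random sample drawn from that population.
sample = [5, 26, 36]

sigma = pstdev(travel_times)   # square root of the population variance
s = stdev(sample)              # square root of the sample variance

print(round(sigma, 1))  # 11.4 minutes
print(round(s, 1))      # 15.8 minutes
```

Unlike the variance, the standard deviation is in the same units as the data (minutes here), which makes it easier to interpret.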
EXAMPLE
Comparing Standard Deviations
Wait Time at Wendy’s
1.50, 2.53, 1.88, 3.99, 0.90, 0.79, 1.20, 2.94, 1.90, 1.23,
1.01, 1.46, 1.40, 1.00, 0.92, 1.66, 0.89, 1.33, 1.54, 1.09,
0.94, 0.95, 1.20, 0.99, 1.72, 0.67, 0.90, 0.84, 0.35, 2.00
Wait Time at McDonald’s
3.50, 0.00, 1.97, 0.00, 3.08, 0.00, 0.26, 0.71, 0.28, 2.75,
0.38, 0.14, 2.22, 0.44, 0.36, 0.43, 0.60, 4.54, 1.38, 3.10,
1.82, 2.33, 0.80, 0.92, 2.19, 3.04, 2.54, 0.50, 1.17, 0.23
Determine the standard deviation of the waiting times
for Wendy’s and McDonald’s.
Which is the better company in terms of
waiting times?
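Sample standard deviation for Wendy’s: 0.738 minutes
Sample standard deviation for McDonald’s: 1.265 minutes
Wendy’s has the smaller standard deviation, so its waiting times are less dispersed, that is, more consistent from customer to customer.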
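The two sample standard deviations can be recomputed from the raw wait times. The sketch below assumes the values are in minutes and are listed in the order they appear in the transcript. The list names are illustrative.

```python
from statistics import mean, stdev

# Wait times (assumed to be in minutes), in the order given above.
wendys = [
    1.50, 2.53, 1.88, 3.99, 0.90, 0.79, 1.20, 2.94, 1.90, 1.23,
    1.01, 1.46, 1.40, 1.00, 0.92, 1.66, 0.89, 1.33, 1.54, 1.09,
    0.94, 0.95, 1.20, 0.99, 1.72, 0.67, 0.90, 0.84, 0.35, 2.00,
]
mcdonalds = [
    3.50, 0.00, 1.97, 0.00, 3.08, 0.00, 0.26, 0.71, 0.28, 2.75,
    0.38, 0.14, 2.22, 0.44, 0.36, 0.43, 0.60, 4.54, 1.38, 3.10,
    1.82, 2.33, 0.80, 0.92, 2.19, 3.04, 2.54, 0.50, 1.17, 0.23,
]

for name, times in [("Wendy's", wendys), ("McDonald's", mcdonalds)]:
    # stdev() is the sample standard deviation (divides by n - 1).
    print(name, "mean:", round(mean(times), 3), "s:", round(stdev(times), 3))
```

The standard deviations should agree with the 0.738 and 1.265 minutes reported in the example. Printing the means alongside them helps show whether the two restaurants differ in typical wait time or mainly in consistency.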
The Empirical Rule for bell-shaped distributions
For many sets of observations, especially if their histogram is bell-shaped:
1. Roughly 68% of the observations lie within 1 standard deviation of the average.
2. Roughly 95% of the observations lie within 2 standard deviations of the average.
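[Figure: bell curve with the horizontal axis marked at Ave – 2 s.d., Ave – s.d., Average, Ave + s.d., and Ave + 2 s.d.; 68% of the area lies within 1 s.d. of the average and 95% within 2 s.d.]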
The Empirical Rule
EXAMPLE
Using the Empirical Rule
The following data represent the serum HDL
cholesterol of the 54 female patients of a family
doctor.
41  62  67  60  54  45
48  75  69  60  54  47
43  77  69  60  55  47
38  58  70  61  56  48
35  82  65  62  56  48
37  39  72  63  56  50
44  85  74  64  57  52
44  55  74  64  58  52
44  54  74  64  59  53
(a) Compute the population mean and standard deviation.
(b) Draw a histogram to verify that the data are bell-shaped.
(c) Determine the percentage of patients that have serum HDL
within 3 standard deviations of the mean according to the
Empirical Rule.
(d) Determine the percentage of patients that have serum HDL
between 34 and 69.1 according to the Empirical Rule.
(e) Determine the actual percentage of patients that have serum
HDL between 34 and 69.1.
(Use the raw data directly, not the Empirical Rule, for this question, and see how
close the Empirical Rule approximation was.)
(a) Using a TI-83 Plus graphing calculator or Excel, we find
μ = 57.4 and σ = 11.7
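(b) [Figure: histogram of serum HDL, with the horizontal axis marked at 22.3, 34.0, 45.7, 57.4, 69.1, 80.8, and 92.5, that is, at the mean and at 1, 2, and 3 standard deviations on either side of it]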
(c) According to the Empirical Rule, 99.7% of the patients have serum HDL
within 3 standard deviations of the mean.
(d) Since 34.0 = μ – 2σ and 69.1 = μ + σ, 13.5% + 34% + 34% = 81.5% of patients
will have a serum HDL between 34.0 and 69.1 according to the Empirical Rule.
(e) 45 out of the 54, or 83.3%, of the patients have a serum HDL between 34.0 and
69.1.
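Parts (a) and (e) can be verified directly from the raw data. The sketch below treats the 54 patients as a population, as the example does. The helper `percent_within` is an illustrative name, not part of the slides.

```python
from statistics import mean, pstdev

# Serum HDL cholesterol of the 54 female patients, in the order listed above.
hdl = [
    41, 62, 67, 60, 54, 45,
    48, 75, 69, 60, 54, 47,
    43, 77, 69, 60, 55, 47,
    38, 58, 70, 61, 56, 48,
    35, 82, 65, 62, 56, 48,
    37, 39, 72, 63, 56, 50,
    44, 85, 74, 64, 57, 52,
    44, 55, 74, 64, 58, 52,
    44, 54, 74, 64, 59, 53,
]


def percent_within(values, low, high):
    """Percentage of values lying in the closed interval [low, high]."""
    inside = sum(1 for v in values if low <= v <= high)
    return 100 * inside / len(values)


mu = mean(hdl)
sigma = pstdev(hdl)  # population standard deviation
print(round(mu, 1), round(sigma, 1))               # 57.4 11.7

# (e) actual percentage between mu - 2*sigma (34.0) and mu + sigma (69.1),
# using the rounded endpoints from the slides.
print(round(percent_within(hdl, 34.0, 69.1), 1))   # 83.3
```

The actual figure of 83.3% is close to the 81.5% predicted by the Empirical Rule, which is what one would expect for roughly bell-shaped data.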
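A rule for distributions of any shape
• Chebyshev’s Inequality: for any distribution and any k > 1, at least (1 – 1/k²) · 100% of the observations lie within k standard deviations of the mean.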
EXAMPLE
Using Chebyshev’s Theorem
Using the data from the previous example, use Chebyshev’s
Theorem to
(a) determine the minimum percentage of patients that have serum HDL
within 3 standard deviations of the mean.
$\left(1 - \frac{1}{3^2}\right) \cdot 100\% \approx 88.9\%$
(b) determine the minimum percentage of patients that have serum
HDL between 34 and 80.8, that is, within 2 standard deviations of the mean.
$\left(1 - \frac{1}{2^2}\right) \cdot 100\% = 75\%$
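The two Chebyshev calculations follow the same formula with k = 3 and k = 2. A minimal sketch, with `chebyshev_min_percent` as an illustrative function name:

```python
def chebyshev_min_percent(k):
    """Minimum percentage of observations within k standard deviations
    of the mean, by Chebyshev's Inequality: (1 - 1/k^2) * 100%."""
    if k <= 1:
        # For k <= 1 the bound is zero or negative and says nothing useful.
        raise ValueError("k must be greater than 1")
    return (1 - 1 / k ** 2) * 100


print(round(chebyshev_min_percent(3), 1))  # 88.9
print(round(chebyshev_min_percent(2), 1))  # 75.0
```

These are lower bounds that hold for a distribution of any shape, so they are more conservative than the Empirical Rule's 99.7% and 95% for bell-shaped data.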