
SP 225
Lecture 8
Measures of Variation
Challenge Question
 A randomized, double-blind study of 50 subjects
shows daily administration of Echinacea
supplements shortens the average duration of
an Upper Respiratory Infection (URI) from 14 to
13 days.
 Based on this study, is Echinacea an effective
treatment for URIs?
Roll of the Dice
 All outcomes are equally likely
 The probability of any outcome is 1/6 or
16.7%
Casino Patrons: Risky Fun
Red, White and Blue Slots
 82% chance of loss on any spin
 Prizes for a dollar bet range
from $2400 to $1
 Patrons are expected to lose
$0.10 for each dollar bet
Casinos: False Risk
Soaring Eagle
 4300 slot machines
 25 spins per hour
 Open 24/7/365
 94,170,000 possible spins
Statistics vs. Parameters
 Statistics: numerical description of a
sample
 Parameter: numerical description of a
population
 Statistics are calculated from randomly
selected members of a population
Differences Between Statistics
and Parameters
Sample: 3 Randomly Selected
People
Population: All People
Parameter: 5 of 15 or 33% wear glasses
Statistic: 0 of 3 or 0% wear
glasses
Random Sampling Activity
 Number of siblings of each student in the
freshman class of Powers Catholic High
School
 Take 3 samples, with replacement, of
sizes 1, 5 and 10
 Calculate the sample mean
 Record results in class data chart
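The activity above can be sketched in Python. This is only an illustration: the sibling counts below are hypothetical placeholder values, not the real class data, and the seed is fixed so the run is repeatable.

```python
import random
from statistics import mean

# HYPOTHETICAL sibling counts, used only to illustrate the activity.
siblings = [0, 1, 1, 2, 2, 2, 3, 3, 4, 5]

rng = random.Random(225)  # fixed seed (an arbitrary choice) for repeatable output

sample_means = {}
for size in (1, 5, 10):
    # choices() draws WITH replacement, so the same student can be picked twice
    sample = rng.choices(siblings, k=size)
    sample_means[size] = mean(sample)
    print(f"n={size:2d}  sample={sample}  mean={sample_means[size]:.2f}")

# The parameter the sample means are trying to estimate
population_mean = mean(siblings)
print(f"population mean = {population_mean:.2f}")
```

Rerunning with different seeds shows how much a sample mean can move from one sample to the next.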
Challenge Question
 A randomized, double-blind study of 50 subjects
shows daily administration of Echinacea
supplements shortens the average duration of
an Upper Respiratory Infection (URI) from 14 to
13 days.
 Based on this study, is Echinacea an effective
treatment for URIs?
Why Do We Need Measures of
Variation?
 What is the average height of a male child?
 How many children are that tall?
 When is a child unusually tall or short?
Range
 Difference between the maximum and
minimum value
 Quick to Compute
 Not Comprehensive
Range = (maximum value) – (minimum value)
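A small sketch of why the range is quick but not comprehensive. The two data sets are made up for illustration; they share the same minimum and maximum, so the range cannot tell them apart, while the standard deviation (covered later in this lecture) can.

```python
from statistics import pstdev

# Two HYPOTHETICAL data sets with the same minimum and maximum
a = [1, 5, 5, 5, 9]   # most values sit at the center
b = [1, 1, 5, 9, 9]   # most values sit at the extremes

def value_range(values):
    """Range = (maximum value) - (minimum value)."""
    return max(values) - min(values)

range_a, range_b = value_range(a), value_range(b)

# Population standard deviation, treating each list as a whole population
spread_a, spread_b = pstdev(a), pstdev(b)

print(range_a, range_b)                       # identical ranges
print(round(spread_a, 2), round(spread_b, 2)) # different spreads
```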
Quartiles
 Often used in the education field
 Can be used with any data distribution
 Measures distance in relation to the
MEDIAN not MEAN
Quartiles
 Q1 (First Quartile) separates the bottom
25% of sorted values from the top 75%.
 Q2 (Second Quartile) same as the median;
separates the bottom 50% of sorted
values from the top 50%.
 Q3 (Third Quartile) separates the bottom
75% of sorted values from the top 25%.
Quartiles (2)
Q1, Q2, Q3 divide ranked scores into four equal parts
(minimum) | 25% | Q1 | 25% | Q2 (median) | 25% | Q3 | 25% | (maximum)
Quartile Statistics
 Interquartile Range (or IQR): Q3 - Q1
Example
 Given the following data calculate Q1,
Q2 and Q3
 4.2, 4.4, 5.1, 5.6, 6.0, 6.4, 6.8, 7.1, 7.4,
7.4, 7.9, 8.2, 8.2, 8.7, 9.1, 9.6, 9.6, 10.0,
10.5, 11.6
Example Continued
http://www.maths.murdoch.edu.au/units/statsnotes/samplestats/boxplot.html
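One way to work the example in Python. Textbooks and software use several quartile conventions that can give slightly different answers; this sketch assumes the "median of each half" convention, and the helper name quartiles is made up for illustration.

```python
from statistics import median

data = [4.2, 4.4, 5.1, 5.6, 6.0, 6.4, 6.8, 7.1, 7.4, 7.4,
        7.9, 8.2, 8.2, 8.7, 9.1, 9.6, 9.6, 10.0, 10.5, 11.6]

def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves convention (an assumption)."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]           # bottom half of the sorted values
    upper = ordered[half + n % 2:]   # top half (skips the middle value when n is odd)
    return median(lower), median(ordered), median(upper)

q1, q2, q3 = quartiles(data)
iqr = q3 - q1  # interquartile range: Q3 - Q1
print(f"Q1={q1:.2f}  Q2={q2:.2f}  Q3={q3:.2f}  IQR={iqr:.2f}")
```

Under this convention the 20 values give Q1 of about 6.2, Q2 of about 7.65 and Q3 of about 9.35.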
Standard Deviation for a Population
 Calculated by the following formula:
$\sigma = \sqrt{\dfrac{\sum (x - \mu)^2}{N}}$
where $\mu$ is the population mean and $N$ is the population size
 Used to show distance from the mean
 Tells how usual or unusual a measurement is
Standard Deviation for a
Sample
$s = \sqrt{\dfrac{\sum (x - \bar{x})^2}{n - 1}}$
Standard Deviation Important Properties
 Standard deviation is never negative (it is zero only when all values are equal)
 Increases dramatically with outliers
 The units of standard deviation s are
the same as the units of the mean
Calculating the Standard
Deviation of a SAMPLE
 Data points 1, 3, 5, 7, 9
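A step-by-step sketch of this calculation in Python, using the sample formula with n - 1 in the denominator. The variable names are illustrative, and the final line cross-checks the hand calculation against the standard library's statistics.stdev.

```python
from math import sqrt
from statistics import stdev

data = [1, 3, 5, 7, 9]
n = len(data)

x_bar = sum(data) / n                             # sample mean: 25 / 5 = 5
squared_devs = [(x - x_bar) ** 2 for x in data]   # 16, 4, 0, 4, 16
sample_variance = sum(squared_devs) / (n - 1)     # 40 / 4 = 10
s = sqrt(sample_variance)                         # square root of 10, about 3.16

print(f"mean={x_bar}  variance={sample_variance}  s={s:.3f}")

# Cross-check against the library's sample standard deviation
library_s = stdev(data)
```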
Variance
 A measure of variation equal to the
square of the standard deviation
 Sample Variance = $s^2$
 Population Variance = $\sigma^2$
