Variability
Quantitative Methods in HPELS
440:210
Agenda
Introduction
Frequency
Range
Interquartile range
Variance/SD of population
Variance/SD of sample
Selection
Introduction
Statistics of variability:
Describe how values are spread out
Describe how values cluster around the middle
Several statistics
Appropriate measurement depends on:
Scale of measurement
Distribution
Basic Concepts
Measures of variability:
Frequency
Range
Interquartile range
Variance and standard deviation
Each statistic has its advantages and
disadvantages
Frequency
Definition: The number (count) of times each value of a variable occurs
Scale of measurement:
Appropriate for all scales
Only statistic appropriate for nominal data
Statistical notation: f
Frequency
Advantages:
Ease of determination
Only statistic appropriate for nominal data
Disadvantages:
Terminal statistic
Calculation of the Frequency: Instat
Statistics tab
Summary tab
Group tab
Select group
Select column(s) of interest
OK
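Outside Instat, the same count can be sketched in a few lines of Python. The activity data below are hypothetical and are used only to illustrate tallying a nominal variable.

```python
from collections import Counter

# Hypothetical nominal data: preferred activity reported by ten students.
activities = ["run", "swim", "run", "bike", "swim",
              "run", "bike", "run", "swim", "run"]

# Counter tallies how many times each category occurs (the frequency, f).
frequencies = Counter(activities)

# List categories from most to least frequent.
for category, f in frequencies.most_common():
    print(f"{category}: f = {f}")
```

The most frequent category in such a table is the mode, which is why frequency and mode are reported together for nominal data.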
Range
Definition: The difference between the highest and lowest values in a distribution
Scale of measurement:
Ordinal, interval or ratio
Range
Advantages:
Ease of determination
Disadvantages:
Terminal statistic
Disregards all data except extreme scores
Calculation of the Range: Instat
Statistics tab
Summary tab
Describe tab
Calculates range automatically
OK
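As a minimal sketch of the definition (highest value minus lowest value), using hypothetical scores:

```python
# Hypothetical scores on an interval scale.
scores = [12, 7, 3, 18, 9]

# Range = highest value - lowest value; every other score is ignored.
score_range = max(scores) - min(scores)

print(f"highest = {max(scores)}, lowest = {min(scores)}, range = {score_range}")
```

Changing any score other than the highest or lowest leaves the result unchanged, which illustrates the disadvantage listed above.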
Interquartile Range
Definition: The difference between the 1st quartile and the 3rd quartile (Q3 – Q1)
Scale of measurement:
Ordinal, interval or ratio
Example: Figure 4.3, p 107
Interquartile Range
Advantages:
Ease of determination
More stable than range
Disadvantages:
Disregards all values except 1st and 3rd quartiles
Calculation of the Interquartile Range: Instat
Statistics tab
Summary tab
Describe tab
Choose additional statistics
Choose interquartile range
OK
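A rough Python equivalent, using hypothetical scores. Quartile conventions differ between textbooks and software, so the "inclusive" method chosen here is an assumption and may not match Instat or the textbook figure exactly.

```python
import statistics

# Hypothetical scores, already sorted for readability.
scores = [1, 2, 3, 4, 5, 6, 7, 8, 9]

# quantiles(..., n=4) returns the three cut points Q1, Q2 (median), Q3.
# method="inclusive" is one of several quartile conventions (an assumption here).
q1, q2, q3 = statistics.quantiles(scores, n=4, method="inclusive")

# Interquartile range = Q3 - Q1.
iqr = q3 - q1

print(f"Q1 = {q1}, Q3 = {q3}, IQR = {iqr}")
```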
Variance/SD Population
Variance:
The average squared distance/deviation of all raw scores from the mean
The standard deviation squared
Statistical notation: σ²
Scale of measurement:
Interval or ratio
Advantages:
Considers all data
Not a terminal statistic
Disadvantages:
Not appropriate for nominal or ordinal data
Sensitive to extreme outliers
Variance/SD Population
Standard deviation:
The average distance/deviation of all raw scores from the mean
The square root of the variance
Statistical notation: σ
Scale of measurement:
Interval or ratio
Advantages and disadvantages:
Similar to variance
Calculation of the Variance Population
Why square all values?
If all deviations from the mean are summed, the answer always = 0
Calculation of the Variance Population
Example: 1, 2, 3, 4, 5
Mean = 3
Deviations:
1 – 3 = -2
2 – 3 = -1
3 – 3 = 0
4 – 3 = 1
5 – 3 = 2
Sum of all deviations = 0
Squared deviations:
(1 – 3)² = (-2)² = 4
(2 – 3)² = (-1)² = 1
(3 – 3)² = (0)² = 0
(4 – 3)² = (1)² = 1
(5 – 3)² = (2)² = 4
Sum of all squared deviations = 10
Variance = Average squared deviation of all points
10/5 = 2
Calculation of the Variance Population
Step 1: Calculate deviation of each point from mean
Step 2: Square each deviation
Step 3: Sum all squared deviations
Step 4: Divide sum of squared deviations by N
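The four steps can be sketched in Python using the scores 1, 2, 3, 4, 5 from the worked example. The variable names are illustrative only.

```python
# Scores from the worked example, treated as a complete population.
scores = [1, 2, 3, 4, 5]
N = len(scores)
mean = sum(scores) / N

# Step 1: deviation of each score from the mean.
deviations = [x - mean for x in scores]

# Raw deviations always sum to zero, which is why they are squared.
sum_of_deviations = sum(deviations)

# Step 2: square each deviation.
squared_deviations = [d ** 2 for d in deviations]

# Step 3: sum the squared deviations (SS).
ss = sum(squared_deviations)

# Step 4: divide SS by N to get the population variance.
variance = ss / N

print(f"sum of deviations = {sum_of_deviations}")
print(f"SS = {ss}, variance = {variance}")
```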
Calculation of the Variance Population
σ² = SS / N, where N = number of scores and
SS = Σ(X – μ)²
Definitional formula (Example 4.3, p 112)
or
SS = ΣX² – (ΣX)² / N
Computational formula (Example 4.4, p 112)
With either formula, Step 4: Divide SS by N
Computation of the Standard Deviation Population
Take the square root of the variance
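As a check that the definitional and computational formulas for SS agree, here is a short sketch using the scores 1, 2, 3, 4, 5 from the worked example. The comparison against Python's `statistics.pvariance` and `statistics.pstdev` is added here for verification and is not part of the original slides.

```python
import math
import statistics

# Scores from the worked example, treated as a complete population.
scores = [1, 2, 3, 4, 5]
N = len(scores)
mu = sum(scores) / N

# Definitional formula: SS = sum of (X - mu)^2.
ss_definitional = sum((x - mu) ** 2 for x in scores)

# Computational formula: SS = sum of X^2 minus (sum of X)^2 / N.
ss_computational = sum(x ** 2 for x in scores) - sum(scores) ** 2 / N

# Population variance and standard deviation.
sigma_squared = ss_definitional / N
sigma = math.sqrt(sigma_squared)

print(f"SS (definitional) = {ss_definitional}")
print(f"SS (computational) = {ss_computational}")
print(f"variance = {sigma_squared}, SD = {sigma:.4f}")

# Cross-check against the standard library's population statistics.
assert math.isclose(sigma_squared, statistics.pvariance(scores))
assert math.isclose(sigma, statistics.pstdev(scores))
```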
Variance/SD Sample
Process is similar with two distinctions:
Statistical notation
Formula
Statistical Notation Distinctions
Population vs. Sample
σ² vs. s²
σ vs. s
μ vs. M
N vs. n
Formula Distinctions
Population vs. Sample
s² = SS / (n – 1), where
SS = Σ(X – M)²
Definitional formula
or
SS = ΣX² – (ΣX)² / n
Computational formula
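A minimal sketch of the sample formulas, reusing the scores 1, 2, 3, 4, 5 but now treating them as a sample (an assumption made only for illustration). The only change from the population calculation is the divisor.

```python
import math
import statistics

# The same five scores, now treated as a sample rather than a population.
sample = [1, 2, 3, 4, 5]
n = len(sample)
M = sum(sample) / n

# SS is computed exactly as before, with M in place of mu.
ss = sum((x - M) ** 2 for x in sample)

# Sample variance divides by (n - 1) instead of n.
s_squared = ss / (n - 1)
s = math.sqrt(s_squared)

print(f"SS = {ss}, s^2 = {s_squared}, s = {s:.4f}")

# statistics.variance and statistics.stdev also use the (n - 1) divisor.
assert math.isclose(s_squared, statistics.variance(sample))
assert math.isclose(s, statistics.stdev(sample))
```

With the same SS of 10, the sample variance is 10/4 = 2.5 rather than the population value of 10/5 = 2.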
Why n - 1?
N vs. (n – 1): First Reason
General underestimation of population variance
A sample variance (s²) calculated by dividing by n tends to underestimate the population variance (σ²)
Dividing by (n – 1) instead inflates s² to correct for this
Example 4.8, p 121
Actual population σ² = 14
Average biased s² = 63/9 = 7
Average unbiased s² = 126/9 = 14
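The totals above can be reproduced with a small enumeration. The slides do not list the population scores, so the population 0, 3, 9 and the sample size of 2 (drawn with replacement, giving 9 samples) are assumptions chosen because they yield σ² = 14 and the sums 63 and 126.

```python
from itertools import product

# ASSUMPTION: population scores are not given in the slides; 0, 3, 9 is a
# hypothetical population that reproduces the figures quoted above.
population = [0, 3, 9]


def sum_of_squares(values):
    """Return SS, the sum of squared deviations from the mean of `values`."""
    m = sum(values) / len(values)
    return sum((x - m) ** 2 for x in values)


# Population variance divides SS by N.
sigma_squared = sum_of_squares(population) / len(population)

# Every possible sample of n = 2 drawn with replacement (3 x 3 = 9 samples).
samples = list(product(population, repeat=2))

# Biased estimate divides by n; unbiased estimate divides by (n - 1).
biased = [sum_of_squares(s) / len(s) for s in samples]
unbiased = [sum_of_squares(s) / (len(s) - 1) for s in samples]

avg_biased = sum(biased) / len(samples)
avg_unbiased = sum(unbiased) / len(samples)

print(f"population variance = {sigma_squared}")
print(f"average biased s^2 = {sum(biased)}/{len(samples)} = {avg_biased}")
print(f"average unbiased s^2 = {sum(unbiased)}/{len(samples)} = {avg_unbiased}")
```

Under these assumptions, dividing by n averages to half the true variance, while dividing by (n – 1) averages to exactly the population value.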
N vs. (n – 1): Second Reason
Degrees of freedom (df)
df = number of scores “free” to vary
Example:
Assume n = 3, with M = 5
The sum of values = 15 (n*M)
Assume two of the values = 8, 3
The third value has to be 4
Two values are “free” to vary
df = (n – 1) = (3 – 1) = 2
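The degrees-of-freedom example can be written out as simple arithmetic. This sketch only restates the numbers above; the variable names are illustrative.

```python
# Values from the example: n = 3 scores with a sample mean of M = 5.
n = 3
M = 5

# The mean fixes the total: the scores must sum to n * M.
required_sum = n * M

# Two scores are chosen freely.
free_values = [8, 3]

# The last score is then forced; it is not free to vary.
last_value = required_sum - sum(free_values)

# Degrees of freedom = number of scores that were free to vary.
df = n - 1

print(f"required sum = {required_sum}, last value = {last_value}, df = {df}")
```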
Computation of the Standard Deviation of Sample: Instat
Statistics tab
Summary tab
Describe tab
Calculates standard deviation automatically
OK
Selection
When to use the frequency
Nominal data
With the mode
When to use the range or interquartile range
Ordinal data
With the median
When to use the variance/SD
Interval or ratio data
With the mean
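The selection rules can be summarized as a lookup table. The helper `suggest_statistics` below is hypothetical and simply encodes the pairings listed above.

```python
# Pairings of scale of measurement with (variability statistic, central tendency).
PAIRINGS = {
    "nominal": ("frequency", "mode"),
    "ordinal": ("range or interquartile range", "median"),
    "interval": ("variance/SD", "mean"),
    "ratio": ("variance/SD", "mean"),
}


def suggest_statistics(scale):
    """Return the (variability, central tendency) pair for a scale of measurement."""
    key = scale.strip().lower()
    if key not in PAIRINGS:
        raise ValueError(f"unknown scale of measurement: {scale!r}")
    return PAIRINGS[key]


for scale in ("nominal", "ordinal", "interval", "ratio"):
    variability, center = suggest_statistics(scale)
    print(f"{scale}: report {variability} with the {center}")
```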
Textbook Problem Assignment
Problems: 4, 6, 8, 14.