Transcript Lecture 3

CRIM 483
Measuring Variability
Variability


Variability refers to the spread or dispersion of scores
Variability captures the degree to which scores within a dataset
differ from one another
– High variability=large distance between scores

Score set 7,6,3,3,1
– Low variability=small distance between scores

Score set 4,2,3,3,1
– No variability=no distance between scores



Score set 4,4,4,4,4
Variability & mean are used together to describe the
characteristics of a distribution (sample) and show how
distributions differ from one another
There are 3 measures for variability: range, standard
deviation, and variance
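As a rough illustration, the three score sets above can be compared on all three measures. This is a minimal sketch using Python's standard `statistics` module; the helper name `summarize` and the labels are assumptions made for this example, and `statistics.stdev` / `statistics.variance` divide by (n-1).

```python
import statistics

# Score sets from the slide above.
score_sets = {
    "high": [7, 6, 3, 3, 1],
    "low": [4, 2, 3, 3, 1],
    "none": [4, 4, 4, 4, 4],
}

def summarize(scores):
    """Return (mean, range, sample SD, sample variance) for a list of scores."""
    return (
        statistics.mean(scores),
        max(scores) - min(scores),
        statistics.stdev(scores),      # divides by n-1
        statistics.variance(scores),   # divides by n-1
    )

summaries = {label: summarize(scores) for label, scores in score_sets.items()}
for label, (mean, rng, sd, var) in summaries.items():
    print(f"{label:>4}: mean={mean:.2f} range={rng} SD={sd:.2f} variance={var:.2f}")
```

All three measures shrink from the high-variability set to the no-variability set, where they are all 0.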
Range


The range is the most general measure of
variability
The formula: r=h-l
– r=range
– h=highest score
– l=lowest score


Calculation of range provides a general estimate
of how wide or how much scores differ from one
another
Examples:
– Highest age=35, lowest age=21

35-21=14 years difference between age scores in sample
– Highest age=50, lowest age=15

50-15=35 years difference between age scores in sample
– Which sample has the greater variability with regard to
age?
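The range formula r=h-l can be sketched in a few lines. Only the highest and lowest ages come from the examples above; the ages in between are hypothetical filler, and the function name `score_range` is an assumption for this sketch.

```python
def score_range(scores):
    """Range r = h - l: highest score minus lowest score."""
    return max(scores) - min(scores)

# Hypothetical samples: only the highest and lowest ages are from the lecture.
sample_a = [21, 24, 28, 30, 35]
sample_b = [15, 22, 31, 44, 50]

range_a = score_range(sample_a)  # 35 - 21
range_b = score_range(sample_b)  # 50 - 15
print(range_a, range_b)
```

Because the range uses only two scores, the middle values could change without affecting the result.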
Standard Deviation

Standard deviation (SD)=average amount of
variability from the mean in the set of scores
(average distance from the mean)
– Standard deviation is used most often to measure
variability
– Reported (as a rule) in combination with means
– The greater the SD, the larger the average distance between the
scores and the mean

Formula to calculate the SD
$$s = \sqrt{\frac{\sum (x - \text{mean})^2}{n - 1}}$$





s=standard deviation
∑=sigma (sum of)
x=individual score
Mean=mean of all scores
n=sample size
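The SD formula can be followed step by step in code. This is a minimal sketch that reuses the high-variability score set from earlier in the lecture; the function name `sample_sd` is an assumption for this example.

```python
import math

def sample_sd(scores):
    """Sample SD: square root of the summed squared deviations over (n - 1)."""
    n = len(scores)                                    # n = sample size
    mean = sum(scores) / n                             # mean of all scores
    squared_devs = [(x - mean) ** 2 for x in scores]   # (x - mean) squared
    return math.sqrt(sum(squared_devs) / (n - 1))

scores = [7, 6, 3, 3, 1]
sd = sample_sd(scores)
print(round(sd, 2))  # 2.45
```

Here the mean is 4, the squared deviations sum to 24, and 24/(5-1)=6, so s is the square root of 6.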
Clarification of Formula

Why not add up the deviations from the
mean?
– Sum of deviations from the mean is always
equal to zero (good way to check your work)

Why square the deviations?
– To get rid of the negative signs so that the
deviations no longer sum to 0

Why take the square root?
– To return to the same units that you started
with
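Each of the three answers above can be checked numerically. This sketch reuses the score set 7,6,3,3,1, whose mean is a whole number; with other data, floating-point rounding may leave the sum of deviations a tiny distance from exactly 0.

```python
scores = [7, 6, 3, 3, 1]
mean = sum(scores) / len(scores)            # 4.0

# Raw deviations cancel out: positives and negatives sum to zero.
deviations = [x - mean for x in scores]     # [3.0, 2.0, -1.0, -1.0, -3.0]
sum_dev = sum(deviations)                   # 0.0

# Squaring removes the negative signs, so the sum is no longer zero.
squared = [d ** 2 for d in deviations]      # [9.0, 4.0, 1.0, 1.0, 9.0]
sum_sq = sum(squared)                       # 24.0

# The square root returns the result to the original units.
sd = (sum_sq / (len(scores) - 1)) ** 0.5
print(sum_dev, sum_sq, round(sd, 2))
```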
Unbiased v. Biased Estimates

Unbiased
– You produce an unbiased estimate by dividing by (n-1) in the
SD formula
– Artificially forces the SD to be larger than it would be
otherwise
– Why? This produces a more conservative estimate that we can
feel more comfortable with–it is safer to overestimate than
underestimate

Biased
– You produce a biased estimate by dividing by (n) in the SD
formula
– Use biased estimate when you are merely describing your
sample and you have no intention of comparing it to the
population

Ultimately, the larger your sample size the less difference
there is between the unbiased and biased estimates (p. 40)
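The shrinking gap between the two estimates can be seen with a small sketch. The helper `sd_estimate` and its `unbiased` flag are hypothetical names for this example, and the larger sample is simply the small one repeated 20 times, an assumption made to keep the scores comparable.

```python
import math

def sd_estimate(scores, unbiased=True):
    """SD dividing by (n - 1) when unbiased is True, by n otherwise."""
    n = len(scores)
    mean = sum(scores) / n
    sum_sq = sum((x - mean) ** 2 for x in scores)
    return math.sqrt(sum_sq / (n - 1 if unbiased else n))

small = [7, 6, 3, 3, 1]   # n = 5
large = small * 20        # same scores repeated, n = 100

gap_small = sd_estimate(small) - sd_estimate(small, unbiased=False)
gap_large = sd_estimate(large) - sd_estimate(large, unbiased=False)
print(round(gap_small, 3), round(gap_large, 3))
```

In Python's standard library, `statistics.stdev` divides by (n-1) and `statistics.pstdev` divides by n, so they correspond to the two estimates described here.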
In Sum…
You must always compute the mean first
SDs play a critical role later when
comparing scores between groups
(e.g., do male and female attitudes
differ?)
Like means, SDs are sensitive to
extreme scores
Variance
Final method of measuring variability is
variance
Very similar to SD formula

– Formula to calculate the variance
$$s^2 = \frac{\sum (x - \text{mean})^2}{n - 1}$$

s^2=variance
∑=sigma (sum of)
x=individual score
Mean=mean of all scores
n=sample size
Variance is difficult to interpret and apply by itself
Variance has greater utility in the formulas of more
advanced statistics
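The relationship between the two formulas can be sketched as follows: the variance is the SD calculation without the final square root. The score set is reused from earlier in the lecture, and the comparison against the standard `statistics` module is included only as a check.

```python
import math
import statistics

scores = [7, 6, 3, 3, 1]
n = len(scores)
mean = sum(scores) / n

# Variance: summed squared deviations divided by (n - 1), no square root.
variance = sum((x - mean) ** 2 for x in scores) / (n - 1)   # 6.0

# Taking the square root of the variance gives the SD.
sd = math.sqrt(variance)

print(variance, round(sd, 2))
print(statistics.variance(scores), round(statistics.stdev(scores), 2))
```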
Standard Deviation v. Variance
Both measure variability, dispersion, or
spread
SD expresses variability in the original units
and variance expresses variability in squared
units

– Example from book re: circuit board assembly
8.6 boards assembled/hour on average
1.59=SD: on average, workers differ from the
mean by 1.59 boards
2.53=Variance: average squared difference from
the mean is 2.53 boards squared
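The two figures in the circuit board example are tied together by squaring. A quick check, using only the SD reported above (the variable names are assumptions for this sketch):

```python
sd_boards = 1.59                       # SD, in boards
variance_boards_sq = sd_boards ** 2    # variance, in boards squared
print(round(variance_boards_sq, 2))    # 2.53
```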
Differences in Variability