Measures of Dispersion

Download Report

Transcript Measures of Dispersion

Measures of Dispersion
Variance and Standard Deviation
Basic Assumptions about
Distributions
• We should be able to plot the number of times a specific
value occurs on a graph using a line chart or histogram
(interval/ratio data)
• Some distributions will be normal or bell-shaped.
• Some distributions will be bi-modal or will have data
points distributed irregularly.
• Some distributions will be skewed to the right or skewed
to the left.
• Theoretically, samples taken from one population, should
over time, approximate a normal distribution.
• We should have a normal distribution if we are to use
inferential statistics.
Other reasons to use Measures of
Dispersion
• To see if variables taken from two or more
samples are similar to one another.
• To see if a variable taken from a sample is
similar to the same variable taken from a
population – in other words is our sample
representative of people in the population
at least on that one variable.
Variation in Two Samples
Sample 1
Sample 2
1
2
2
3
4
4
3
3
5
7
5
6
7
9
9
10
Mo = 4
Mdn = 4
Mean = 4
Mo, 3, 9
Mdn = 6
Mean = 6
Sample 2
2.2
2.0
1.8
1.6
1.4
1.2
Count
1.0
.8
2
VAR00001
3
5
7
9
10
Normal Distributions are Bellshaped and have the same
number of measures on either
side of the mean.
Note: According to Montcalm & Royse only unimodal
distributions can be normal distributions.
Normal Distributions
• 50% of all scores are on either side of the mean.
• The distribution is symmetrical – same number
of scores fall above and below the mean.
• The mean is the midpoint of the distribution.
• Mean = median = mode
• The entire area under the bell-shaped curve =
100%.
A standard deviation is:
• The degree to which each of the scores in
a distribution vary from the mean. (x –
mean)
• Calculated by squaring the deviation of
each score from the mean.
• Based on first calculating a statistic called
the variance.
Formulas are:
• Variance = Sum of each deviation squared
divided by (n -1) where n is the number of
values in the distribution.
• Standard Deviation = the square root of
the sum of squares divided by (n – 1).
Using Sample 1 as an example
1
Total
(1-4) = -3
9 Mean = 4
2 (2-4) = -2
4
3
(3-4) = -1
1
4
(4-4) = 0
0 Variance
S.D.
4
(4-4) = 0
0 28/(8-1)
Sq Root 4
5
(5-4) = 1
1
6
(6 - 4) =
2
4
7 (7 = 4) = 3
9
0
28
4
2
Another variance/SD example
1
-5.00
25.00 Mean = 6
2
-4.00
16.00
4
-2.00
4.00
8
2.00
4.00
10
4.00
Variance =
16.00
90/(6-1)
11
5.00
25.00
0.00
90.00
Total
SD = sq root of
18
18.00
4.24
Other Important Terms in This
Chapter
• Mean squares – the average of squared
deviations from the mean in a set of numbers.
(Same as variance)
• Interquartile range – points in a set of numbers
that occur between 75% of the scores and 25%
of the scores – that is, where the middle 50% of
all scores lie (use cumulative percentages)
• Box plot – gives graphic information about
minimum, maximum, and quartile scores in a
distribution.
Box Plot
160000
140000
29
120000
32
18
343
446
103
34
106
454
431
100000
80000
60000
371
348
468
240
72
80
168
413
277
134
242
40000
20000
0
N =
216
Fem al e
Gender
258
M al e
Interquartile Range
Test Scores Frequency
100
3
Cumulative
Percent
25%
100%
90
3
25%
75%
80
3
25%
50%
70
1
8.3%
25%
60
2
16.7%
16.7%
12
100.0%
Total
Percent
This information is important to our
discussion of normal distributions
Central Limit Theorem (we will
discuss this in two weeks) specifies
that:
• 50% of all scores in a normal distribution are on
either side of the mean.
• 68.25% of all scores are one standard deviation
from the mean.
• 95.44% of all scores are two standard deviations
from the mean.
• 99.74% of all scores in a normal distribution are
within 3 standard deviations of the mean.
Therefore, we will be able to
• Predict what scores are contained within
one, two, or three standard deviations
from the mean in a normal distribution.
• Compare the distribution of scores in
samples.
• Compare the distribution of scores from
populations to samples.
To calculate measures of central
tendency and dispersion in SPSS
•
•
•
•
Select descriptive statistics
Select descriptives
Select your variables
Select options (mean, sd, etc.)
SPSS output
Descriptive Statistics
N
Educational Level (years)
Valid N (listwise)
474
474
Range Minimum Maximum
13
8
21
Mean
13.49
Std.
Deviation Variance
2.885
8.322