VARIABILITY
Measure of Variability
• A measure of variability is a summary of the
spread of performance.
• Suppose that 2 students took 10 quizzes in a
Preparatory School.
Measure of Variability
• The pass point of the
preparatory school is 70.
• Which student do you think
is more likely to score
above 70 in the final exam?
• Compare the two distributions:

Quiz    Stud 1   Stud 2
1       58       20
2       45       75
3       46       82
4       52       45
5       50       21
6       61       85
7       50       87
8       53       12
9       58       84
10      52       14
Mean    52.5     52.5
Measure of Variability
[Figure: histograms of the quiz scores (x-axis: score, 0 to 100; y-axis: frequency)
stud1: Mean = 52.50, Std. Dev. = 5.212, N = 10
stud2: Mean = 52.50, Std. Dev. = 33.07, N = 10]
Measure of Variability
• As you can see, the distributions of the two prep students'
scores differ, even though their means are identical.
– It seems that the first student will not succeed in the final
exam, since his/her score never exceeded the cut point
of 70.
• Maybe English is too hard for him/her.
– However, the second student scored above 70 in 5 quizzes
(50% of the tests).
• So, he/she is more likely to score above 70 in the final exam.
• Today, many online shopping sites show their customers'
ratings for products. Which would be more informative
for you: a measure of central tendency for the ratings,
or a measure of variability?
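The comparison of the two students can be checked with a short sketch. The variable names and the helper `passes` are illustrative, and "passing" is assumed here to mean scoring strictly above 70 (no quiz score equals 70, so the choice does not affect the result).

```python
from statistics import mean

# Quiz scores from the table above
stud1 = [58, 45, 46, 52, 50, 61, 50, 53, 58, 52]
stud2 = [20, 75, 82, 45, 21, 85, 87, 12, 84, 14]
PASS_POINT = 70


def passes(scores, cut=PASS_POINT):
    """Count the quizzes on which the score exceeds the cut point."""
    return sum(1 for s in scores if s > cut)


mean1, mean2 = mean(stud1), mean(stud2)
passes1, passes2 = passes(stud1), passes(stud2)

print("means:", mean1, mean2)                    # identical means
print("quizzes above 70:", passes1, passes2)     # very different spread
```

The means alone cannot separate the two students; the count of scores above the cut point can, because it depends on how spread out the scores are.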
The Measures of Variability
• Range
• The Inter-quartile Range
• Variance
• Standard Deviation
Range
• The range is the difference between the lowest and
highest values in a dataset.
– The range of the first student's scores is 61 – 45 = 16
– The range of the second student's scores is 87 – 12 = 75
• The range is based solely on the two most extreme
values within the dataset.
– Exceptionally high or low scores (outliers) will result in a
range that is not typical of the variability within the
dataset.
• To reduce the problems caused by outliers in a
dataset, the inter-quartile range can be calculated
instead of the range.
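A minimal sketch of the range calculation, and of how a single outlier distorts it. The function name `value_range` and the extra score of 100 are hypothetical, added only for illustration.

```python
stud1 = [58, 45, 46, 52, 50, 61, 50, 53, 58, 52]
stud2 = [20, 75, 82, 45, 21, 85, 87, 12, 84, 14]


def value_range(scores):
    """Range = highest value minus lowest value."""
    return max(scores) - min(scores)


range1 = value_range(stud1)
range2 = value_range(stud2)

# Hypothetical: the first student gets one unusually high score.
range1_with_outlier = value_range(stud1 + [100])

print(range1, range2, range1_with_outlier)
```

One added score is enough to change the range of the first student from 16 to 55, even though the other ten scores are unchanged.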
The Inter-quartile Range (IQR)
• The inter-quartile range is a
measure that indicates the extent to
which the central 50% of values
within the dataset are dispersed.
• To calculate the inter-quartile range,
we subtract the lower quartile
from the upper quartile
– IQR = Q3 – Q1 = P75 – P25
– For the first student's scores:
• Q1 = 48
• Q3 = 55.5
• IQR = Q3 – Q1 = 55.5 – 48 = 7.5
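Quartiles can be computed under several conventions, and they do not all give the same values. The sketch below assumes one particular convention that reproduces Q1 = 48 and Q3 = 55.5: linear interpolation at the 1-based position n × p in the sorted scores. The slides do not state which convention was used, so the helper `percentile` is an assumption, not the author's method.

```python
def percentile(scores, p):
    """Value at 1-based position n*p in the sorted scores,
    with linear interpolation between neighbours (assumed convention)."""
    ordered = sorted(scores)
    n = len(ordered)
    pos = n * p                 # fractional 1-based position
    lower = int(pos)            # whole part of the position
    frac = pos - lower          # fractional part
    if lower < 1:
        return ordered[0]
    if lower >= n:
        return ordered[-1]
    return ordered[lower - 1] + frac * (ordered[lower] - ordered[lower - 1])


stud1 = [58, 45, 46, 52, 50, 61, 50, 53, 58, 52]

q1 = percentile(stud1, 0.25)    # halfway between the 2nd and 3rd sorted scores
q3 = percentile(stud1, 0.75)    # halfway between the 7th and 8th sorted scores
iqr = q3 - q1

print(q1, q3, iqr)
```

Statistical packages often default to a different interpolation rule, so their quartiles for the same ten scores may differ slightly from the values on the slide.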
Variance
• Variance is based on
deviation scores. It
summarizes the amount
of deviation from the
mean as the average of
the squared deviations.
• Note: If you are
interested in inferential
statistics, then you
should divide the sum
of squared deviations by
n – 1, rather than n!
Quiz    Stud 1   Deviation            Squared deviation
1       58       58 – 52.5 = 5.5      30.25
2       45       45 – 52.5 = -7.5     56.25
3       46       46 – 52.5 = -6.5     42.25
4       52       52 – 52.5 = -0.5     0.25
5       50       50 – 52.5 = -2.5     6.25
6       61       61 – 52.5 = 8.5      72.25
7       50       50 – 52.5 = -2.5     6.25
8       53       53 – 52.5 = 0.5      0.25
9       58       58 – 52.5 = 5.5      30.25
10      52       52 – 52.5 = -0.5     0.25
Mean    52.5     0                    24.45
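The table can be reproduced step by step. This is a minimal sketch; the function name `variance` and its `sample` flag are illustrative. With `sample=False` it divides by n, as in the table; with `sample=True` it divides by n – 1, as the note recommends for inferential statistics.

```python
def variance(scores, sample=False):
    """Mean of squared deviations from the mean.
    Divides by n (descriptive) or by n - 1 (inferential) when sample=True."""
    n = len(scores)
    m = sum(scores) / n
    deviations = [x - m for x in scores]
    sum_of_squares = sum(d ** 2 for d in deviations)
    return sum_of_squares / (n - 1 if sample else n)


stud1 = [58, 45, 46, 52, 50, 61, 50, 53, 58, 52]

var_n = variance(stud1)                 # sum of squares 244.5 divided by 10
var_n_minus_1 = variance(stud1, True)   # sum of squares 244.5 divided by 9

print(round(var_n, 2), round(var_n_minus_1, 2))
```

Dividing by n – 1 always gives a slightly larger value than dividing by n; the difference shrinks as the number of scores grows.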
Standard Deviation
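• Variance is expressed in squared units of
measurement. For that reason, it is hard to
interpret as a descriptive statistic.
• The standard deviation (SD) is the square root
of the variance, so it is expressed in the
original units of the scores.
• Stud1
– Variance = 24.45
– SD = √24.45 = 4.94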
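The SD of 4.94 divides by n, while the histograms earlier report Std. Dev. = 5.212 for the same student. The difference appears to come from dividing by n – 1 instead. A short sketch using Python's standard `statistics` module, where `pstdev` divides by n and `stdev` divides by n – 1:

```python
from statistics import pstdev, stdev

stud1 = [58, 45, 46, 52, 50, 61, 50, 53, 58, 52]
stud2 = [20, 75, 82, 45, 21, 85, 87, 12, 84, 14]

sd1_population = pstdev(stud1)   # square root of (sum of squares / n)
sd1_sample = stdev(stud1)        # square root of (sum of squares / (n - 1))
sd2_sample = stdev(stud2)

print(round(sd1_population, 2))  # 4.94, the value on this slide
print(round(sd1_sample, 3))      # 5.212, the value under the stud1 histogram
print(round(sd2_sample, 2))      # 33.07, the value under the stud2 histogram
```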
Properties of Range
• Range is easy to compute and it is good for a
fast scan of the data
• Range is based on two extreme scores. So, it
does not say anything about the rest of the
scores
• Range has little use beyond the descriptive
level
Properties of IQR
• The semi-interquartile range is closely related
to the median. It is sensitive not to the exact
values of the scores, but to their ranks.
• Therefore, it is more resistant to the presence
of a few extreme scores (compared to
variance).
Properties of Standard Deviation
• The standard deviation, like the mean, is
responsive to the exact position/value of
every score in the distribution.
• As a result, it is more sensitive to extreme
values.
• The standard deviation is resistant to sampling
variation.
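The different sensitivity of the IQR and the standard deviation to extreme scores can be illustrated with a hypothetical change: the first student's best quiz score (61) is replaced by 100. The quartile helper assumes the same interpolation convention (position n × p) that reproduces Q1 = 48 and Q3 = 55.5; all names are illustrative.

```python
from statistics import pstdev


def percentile(scores, p):
    """Value at 1-based position n*p, linearly interpolated (assumed convention)."""
    ordered = sorted(scores)
    n = len(ordered)
    pos = n * p
    lower = int(pos)
    frac = pos - lower
    if lower < 1:
        return ordered[0]
    if lower >= n:
        return ordered[-1]
    return ordered[lower - 1] + frac * (ordered[lower] - ordered[lower - 1])


def iqr(scores):
    """Inter-quartile range: Q3 minus Q1."""
    return percentile(scores, 0.75) - percentile(scores, 0.25)


stud1 = [58, 45, 46, 52, 50, 61, 50, 53, 58, 52]
# Hypothetical data: the highest score becomes an extreme value.
stud1_extreme = [100 if s == 61 else s for s in stud1]

iqr_before, iqr_after = iqr(stud1), iqr(stud1_extreme)
sd_before, sd_after = pstdev(stud1), pstdev(stud1_extreme)

print("IQR:", iqr_before, "->", iqr_after)
print("SD: ", round(sd_before, 2), "->", round(sd_after, 2))
```

The IQR only looks at the middle of the ranked scores, so changing the top score leaves it untouched, while the standard deviation uses every value and grows noticeably.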
Standard Scores
• Look at your worksheets and compare Tarık’s
performance in the Sociology and Psychology exams.
What do you think? Is his performance the same in
both exams?
• In fact, it is not easy to compare two different
distributions. That is like trying to compare apples
and oranges.
• We have already studied one method that we can use
to compare Tarık’s performance in both exams.
Which one is it?
Standard Scores
• Percentile ranks are good for such
comparisons, but ranks are not scores. So, we
cannot use them in complicated
mathematical computations.
• The other way is to compute standardized
scores. In this way, we can make both
distributions the same in terms of their mean and
SD.
Standard Scores
• Computing standard scores is easy. All
we need to know is the mean and the SD of the
distribution.
• So, a standard score is
– z = (X – mean) / SD
• Now, compute z scores for each interval.
Remember, you need to use midpoints.
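The z-score formula can be sketched as follows. Since the worksheet data are not reproduced here, the example uses the first student's quiz scores instead, and it assumes the descriptive SD (dividing by n, SD = 4.94). The function name `z_score` is illustrative.

```python
from statistics import mean, pstdev


def z_score(x, mu, sd):
    """Standard score: distance from the mean in SD units."""
    return (x - mu) / sd


stud1 = [58, 45, 46, 52, 50, 61, 50, 53, 58, 52]
mu = mean(stud1)        # 52.5
sd = pstdev(stud1)      # descriptive SD (divides by n), about 4.94

z_scores = [z_score(x, mu, sd) for x in stud1]

for x, z in zip(stud1, z_scores):
    print(f"{x:>3}  z = {z:+.2f}")
```

Note the parentheses in `(x - mu) / sd`: the mean is subtracted first, and only then is the difference divided by the SD.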
Properties of Standard Scores
• The mean of z scores is always 0
• The standard deviation of z scores is always 1
• Even though the z transformation changes the mean
and standard deviation of the distribution, it
doesn’t change its shape
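These properties can be checked numerically on the two students' quiz scores. This is a sketch under the assumption that the descriptive SD (dividing by n) is used for standardizing; the helper names `standardize` and `rank_order` are illustrative. Floating-point arithmetic gives values extremely close to 0 and 1 rather than exactly 0 and 1.

```python
from statistics import mean, pstdev


def standardize(scores):
    """Convert every score to a z score using the distribution's own mean and SD."""
    mu = mean(scores)
    sd = pstdev(scores)
    return [(x - mu) / sd for x in scores]


def rank_order(values):
    """Indices of the values from lowest to highest (ties keep original order)."""
    return sorted(range(len(values)), key=values.__getitem__)


stud1 = [58, 45, 46, 52, 50, 61, 50, 53, 58, 52]
stud2 = [20, 75, 82, 45, 21, 85, 87, 12, 84, 14]

z1 = standardize(stud1)
z2 = standardize(stud2)

# Both standardized distributions share the same mean and SD.
print(round(mean(z1), 6), round(pstdev(z1), 6))
print(round(mean(z2), 6), round(pstdev(z2), 6))

# The ordering of the scores is preserved by the transformation.
print(rank_order(z1) == rank_order(stud1))
```

Because standardizing only shifts and rescales the scores, their relative positions stay the same; that is why the shape of the distribution is unchanged.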