Ways of Describing Data

Emily H. Wughalter, Ed.D.
Measurement and Evaluation in Kinesiology
Summer 2010
Symbols
• ΣX = sum of scores
• ΣX² = sum of the squared scores
• (ΣX)² = sum of the scores, squared
• n = number of scores in the sample
• N = number of scores in the population
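The sum of the squared scores and the square of the summed scores are easy to confuse. A minimal sketch, using three made-up scores that are not from the course data, shows that they are different quantities:

```python
# Hypothetical scores, chosen only to keep the arithmetic easy to follow.
scores = [1, 2, 3]

sum_x = sum(scores)                           # sum of scores: 1 + 2 + 3
sum_x_squared = sum(x ** 2 for x in scores)   # sum of the squared scores: 1 + 4 + 9
squared_sum_x = sum(scores) ** 2              # sum of the scores, squared: 6 * 6
n = len(scores)                               # number of scores in the sample

print(sum_x, sum_x_squared, squared_sum_x, n)  # prints: 6 14 36 3
```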
Describing Data
• Grouping information into meaningful
categories or themes to make sense of it
Frequency Distributions
• A frequency distribution is a method of
describing scores
X (midterm scores)    f
95                    1
88                    3
86                    5
85                    2
80                    2
Frequency Distributions
X (midterm scores)    f    cf
95                    1    13
88                    3    12
86                    5     9
85                    2     4
80                    2     2
X = the scores
f = frequency of the score
cf = cumulative frequency
n = number of participants (here, n = 13)
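The frequency and cumulative frequency columns can be built from the raw scores. The sketch below is one possible way to do it in Python; the list of raw scores is reconstructed from the table, and the variable names are illustrative only.

```python
from collections import Counter

# Raw scores implied by the table: 95 once, 88 three times, and so on.
scores = [95] + [88] * 3 + [86] * 5 + [85] * 2 + [80] * 2

frequency = Counter(scores)   # f: how many times each score occurs
n = len(scores)               # number of participants

# Accumulate from the lowest score upward to get cf for each score.
cumulative = {}
running_total = 0
for score in sorted(frequency):
    running_total += frequency[score]
    cumulative[score] = running_total

# Print from the highest score down, in the same order as the table.
for score in sorted(frequency, reverse=True):
    print(score, frequency[score], cumulative[score])
```

The cumulative frequency of the highest score always equals n.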
Frequency Distributions
X (midterm scores)    f    cf    fX     fX²
95                    1    13     95    9025
88                    3    12    264   23232
86                    5     9    430   36980
85                    2     4    170   14450
80                    2     2    160   12800
fX = the score multiplied by its frequency
fX² = the squared score multiplied by its frequency
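Adding up the last two columns of this table gives the sum of scores and the sum of the squared scores for all 13 participants. A short sketch of that arithmetic, with hypothetical variable names:

```python
# (score, frequency) pairs copied from the table.
table = [(95, 1), (88, 3), (86, 5), (85, 2), (80, 2)]

n = sum(f for _, f in table)                  # number of scores
sum_fx = sum(f * x for x, f in table)         # total of the score-times-frequency column
sum_fx2 = sum(f * x ** 2 for x, f in table)   # total of the frequency-times-squared-score column

print(n, sum_fx, sum_fx2)  # prints: 13 1119 96487
```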
Types of Curves
• A normal curve is a perfectly symmetrical,
bell-shaped curve. In a normal curve 50% of the
scores fall on one side of the center and 50% on
the other side.
• A skewed curve occurs when the scores trail
off in one direction, so that fewer scores fall
toward the left end or toward the right end of
the curve. A negatively skewed curve has
fewer scores in the negative (low) direction, so
its tail points to the left.
• A positively skewed curve has fewer scores
in the positive (high) direction, so its tail
points to the right.
• A leptokurtic curve is tall and peaked. It
occurs when many people earn the same score
or nearly the same score.
• A platykurtic curve is flat. It occurs when
many different scores have about the same
frequency of occurrence.
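The direction of skew can be checked numerically. The sketch below uses the moment-based skewness coefficient, which is one common convention and is not named in these slides; the three score lists are hypothetical.

```python
def skewness(scores):
    """Moment-based skewness: mean cubed deviation over the cubed standard deviation."""
    n = len(scores)
    mean = sum(scores) / n
    m2 = sum((x - mean) ** 2 for x in scores) / n   # mean squared deviation
    m3 = sum((x - mean) ** 3 for x in scores) / n   # mean cubed deviation
    return m3 / m2 ** 1.5

# Hypothetical score sets, made up for illustration.
symmetrical = [70, 80, 90, 100, 110]
negatively_skewed = [40, 85, 88, 90, 92]   # one low score trails off to the left
positively_skewed = [60, 62, 65, 68, 95]   # one high score trails off to the right

for name, data in [("symmetrical", symmetrical),
                   ("negative", negatively_skewed),
                   ("positive", positively_skewed)]:
    print(name, round(skewness(data), 2))
```

A value near zero suggests symmetry, a negative value a tail toward the low scores, and a positive value a tail toward the high scores.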
Measures of Central Tendency
• Points around which the scores tend to
cluster
3 Measures of Central Tendency
• Mode
• Median
• Mean
Mode
• The mode is the most frequently occurring
score.
• The mode is unstable: a change to a single
score in the distribution can change the mode.
Median
• The median is the 50th percentile score. It is
the score that 50% of the group falls below
and 50% falls above.
• A problem with the median is that it can be
affected by a change in a score; also, it is
based upon the position (rank order) of the
scores, not the value of the scores.
Mean
• The mean is the average score: the sum of
the scores divided by the number of scores. It
is generally regarded as the best measure of
central tendency.
• The mean is prized by researchers and
evaluators because it is widely used and
because the mean or average is well understood.
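As an illustration, the three measures of central tendency can be computed for the midterm scores in the frequency distribution shown earlier. This is a sketch using Python's standard `statistics` module; the "changed" list is a hypothetical variation added only to show how the median and the mean react to one altered score.

```python
import statistics

# Midterm scores from the frequency distribution: 95 once, 88 three times, and so on.
scores = [95] + [88] * 3 + [86] * 5 + [85] * 2 + [80] * 2

mode = statistics.mode(scores)       # most frequently occurring score
median = statistics.median(scores)   # middle score of the 13 ordered scores
mean = statistics.mean(scores)       # sum of scores divided by n

print(mode, median, round(mean, 2))  # prints: 86 86 86.08

# Hypothetical change: raise the top score from 95 to 100.
# The mean moves, but the median does not, because the median
# depends on rank order rather than on the value of every score.
changed = [100] + scores[1:]
print(statistics.median(changed), round(statistics.mean(changed), 2))  # prints: 86 86.46
```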
Measures of Variability
• Measures of variability indicate information
about the spread of the scores
3 Measures of Variability
• Range
• Semi-Interquartile Deviation (SID)
• Standard Deviation
3 Measures of Variability
• Range: associated and reported with the
mode
• Semi-Interquartile Deviation (SID):
associated and reported with the median
• Standard Deviation: associated and reported
with the mean
Example of Between and Within
Subject Variability
[Figure: chart of Exam 1, Exam 2, and Exam 3 scores; vertical axis 0 to 120, horizontal axis 1 to 29]
Range
• The range is the measure of variability
reported with the mode.
• It provides a description of the total spread
of the scores.
• The range can be calculated as
Range = (X high - X low) + 1
where X high is the highest score and X low is
the lowest score.
Problems with the Range
• A change in one score in the distribution
can change the range of scores.
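A brief sketch of the range formula applied to the midterm scores from the frequency distribution shown earlier. The function name is illustrative, and the "changed" list is a hypothetical variation that shows how one altered score shifts the range.

```python
def score_range(scores):
    """Range as defined in these slides: (highest score - lowest score) + 1."""
    return (max(scores) - min(scores)) + 1

# Midterm scores from the frequency distribution.
scores = [95] + [88] * 3 + [86] * 5 + [85] * 2 + [80] * 2
print(score_range(scores))   # prints: 16

# Hypothetical change: only the top score moves, from 95 to 100.
changed = [100] + scores[1:]
print(score_range(changed))  # prints: 21
```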
Semi-Interquartile Deviation
• The semi-interquartile deviation is reported
with the median.
• It provides a measure of variability for the
middle 50% of the scores.
Problems with the Semi-Interquartile Deviation
• The semi-interquartile deviation measures
the variability of only the middle 50% of the
scores.
• If the scores in the extremes are more
heterogeneous or more homogeneous than the
scores in the middle, this difference will not
be reflected in the semi-interquartile
deviation.
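The slides do not give a formula for the semi-interquartile deviation. It is commonly defined as half the distance between the third and first quartiles, SID = (Q3 - Q1) / 2, and the sketch below assumes that definition. Textbooks and software use slightly different rules for locating quartiles, so results can differ a little from hand calculations; this example uses the default method of Python's `statistics.quantiles` (Python 3.8 or later).

```python
import statistics

# Midterm scores from the frequency distribution shown earlier.
scores = [95] + [88] * 3 + [86] * 5 + [85] * 2 + [80] * 2

# quantiles(..., n=4) returns the three cut points Q1, Q2 (the median), and Q3.
q1, _, q3 = statistics.quantiles(scores, n=4)
sid = (q3 - q1) / 2
print(q1, q3, sid)  # Q1 = 85, Q3 = 88, SID = 1.5

# Hypothetical change in the extremes: top score 95 -> 100, one low score 80 -> 60.
changed = sorted(scores)
changed[0], changed[-1] = 60, 100
c1, _, c3 = statistics.quantiles(changed, n=4)
changed_sid = (c3 - c1) / 2
print(changed_sid)  # still 1.5: the extremes do not enter the calculation
```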
Standard Deviation
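• The standard deviation measures variability
and is reported with the mean.
• Conceptually, the standard deviation
represents the typical distance of the scores
from the mean. It is calculated as the square
root of the mean of the squared differences
between each score and the mean.
• The standard deviation reflects the
deviation of the entire set of scores, because
every score enters the calculation.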
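As a worked example, the standard deviation of the midterm scores can be computed from the sum of scores and the sum of the squared scores. The slides do not say whether n or n - 1 belongs in the denominator, so this sketch shows both versions and checks them against Python's `statistics` module; the variable names are illustrative.

```python
import math
import statistics

# (score, frequency) pairs from the frequency distribution shown earlier.
table = [(95, 1), (88, 3), (86, 5), (85, 2), (80, 2)]

n = sum(f for _, f in table)                 # 13 scores
sum_x = sum(f * x for x, f in table)         # sum of scores: 1119
sum_x2 = sum(f * x ** 2 for x, f in table)   # sum of the squared scores: 96487

# Raw-score formula with n in the denominator (describes these 13 scores only).
sd_n = math.sqrt(sum_x2 / n - (sum_x / n) ** 2)

# Same sums with n - 1 in the denominator (used when estimating from a sample).
sd_n_minus_1 = math.sqrt((sum_x2 - sum_x ** 2 / n) / (n - 1))

print(round(sd_n, 2), round(sd_n_minus_1, 2))  # prints: 3.58 3.73

# Cross-check against the standard library, using the expanded list of raw scores.
scores = [x for x, f in table for _ in range(f)]
print(round(statistics.pstdev(scores), 2), round(statistics.stdev(scores), 2))
```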