Lecture-3: Descriptive Statistics: Measures of Dispersion
WFM 5201: Data Management and Statistical Analysis © Dr. Akm Saiful Islam
WFM 5201: Data Management and
Statistical Analysis
Lecture-3: Descriptive Statistics
[Measures of Dispersion]
Akm Saiful Islam
Institute of Water and Flood Management (IWFM)
Bangladesh University of Engineering and Technology (BUET)
April, 2008
Descriptive Statistics
Measures of Central Tendency
Measures of Location
Measures of Dispersion
Measures of Symmetry
Measures of Peakedness
Measures of Variability or Dispersion
The dispersion of a distribution reveals how the
observations are spread out or scattered on
each side of the center.
Measuring the dispersion, scatter, or variation
of a distribution is as important as locating its
central tendency.
If the dispersion is small, it indicates high
uniformity of the observations in the distribution.
Absence of dispersion in the data indicates
perfect uniformity. This situation arises when all
observations in the distribution are identical.
If this were the case, description of any single
observation would suffice.
Purpose of Measuring Dispersion
A measure of dispersion serves two
purposes.
First, it is one of the most important quantities used
to characterize a frequency distribution.
Second, it affords a basis of comparison between
two or more frequency distributions.
The study of dispersion derives its importance from
the fact that different distributions may have exactly
the same average but differ substantially in
their variability.
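As a small illustration of this last point, the sketch below compares two made-up samples (the values are illustrative and not from the lecture) that share the same mean but differ in spread. It uses Python's standard `statistics` module.

```python
import statistics

# Two illustrative samples (hypothetical values) with the same mean
sample_a = [4, 5, 5, 5, 6]
sample_b = [1, 3, 5, 7, 9]

mean_a = statistics.mean(sample_a)
mean_b = statistics.mean(sample_b)

# Sample standard deviation (divisor n - 1)
sd_a = statistics.stdev(sample_a)
sd_b = statistics.stdev(sample_b)

print(f"mean A = {mean_a}, SD A = {sd_a:.2f}")
print(f"mean B = {mean_b}, SD B = {sd_b:.2f}")
```

Both means equal 5, yet the second sample is far more scattered, which only a measure of dispersion reveals.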
Measures of Dispersion
Range
Percentile range
Quartile deviation
Mean deviation
Variance and standard deviation
Relative measure of dispersion
Coefficient of variation
Coefficient of mean deviation
Coefficient of range
Coefficient of quartile deviation
Range
The simplest and crudest measure of
dispersion is the range. This is defined as
the difference between the largest and the
smallest values in the distribution. If
$x_1, x_2, \ldots, x_n$ are the values of the observations
in a sample, then the range (R) of the variable
X is given by:
$$R(x_1, x_2, \ldots, x_n) = \max\{x_1, x_2, \ldots, x_n\} - \min\{x_1, x_2, \ldots, x_n\}$$
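A minimal sketch of this definition in Python follows. The helper name `value_range` is hypothetical, and the data are the family sizes tabulated in Example-1 of this lecture.

```python
def value_range(values):
    """Return the range: largest value minus smallest value."""
    return max(values) - min(values)

# Family sizes from Example-1 of this lecture
family_sizes = [3, 3, 4, 4, 5, 5, 6, 6, 7, 7]
r = value_range(family_sizes)
print(r)  # 7 - 3 = 4
```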
Percentile Range
The difference between the 90th and the 10th percentiles.
It is established by excluding the highest
and the lowest 10 percent of the items,
and is the difference between the largest
and the smallest values of the remaining
80 percent of the items.
$$\text{Percentile range} = P_{90} - P_{10}$$
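The sketch below shows one way to compute the percentile range. It assumes the default ("exclusive") interpolation rule of `statistics.quantiles`; other percentile conventions give slightly different cut points. The data are made up for illustration and include one extreme value to show why trimming the tails matters.

```python
import statistics

def percentile_range(values):
    """Return (P10, P90, P90 - P10) for the given observations."""
    # quantiles(n=10) returns the nine cut points P10, P20, ..., P90
    deciles = statistics.quantiles(values, n=10)
    p10, p90 = deciles[0], deciles[-1]
    return p10, p90, p90 - p10

# Illustrative data (hypothetical): 1..19 plus one extreme value
data = list(range(1, 20)) + [100]
p10, p90, pr = percentile_range(data)

print(f"range = {max(data) - min(data)}")
print(f"P10 = {p10}, P90 = {p90}, percentile range = {pr:.1f}")
```

The single extreme value inflates the ordinary range to 99, while the percentile range stays close to the spread of the bulk of the data.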
Quartile Deviation
A measure similar to the special range is the interquartile range (Q). It is the difference between the third
quartile ($Q_3$) and the first quartile ($Q_1$). Thus
$$Q = Q_3 - Q_1$$
The interquartile range is frequently reduced to the
semi-interquartile range, known as the
quartile deviation (QD), by dividing it by 2. Thus
$$QD = \frac{Q_3 - Q_1}{2}$$
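As a rough sketch, the interquartile range and quartile deviation can be computed as below. The quartiles come from `statistics.quantiles` with its default ("exclusive") rule, which is an assumption; the lecture does not say which quartile convention it uses. The data are the family sizes from Example-1.

```python
import statistics

def quartile_deviation(values):
    """Return (Q1, Q3, interquartile range, quartile deviation)."""
    # quantiles(n=4) returns the three cut points Q1, Q2 (median), Q3
    q1, _median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    return q1, q3, iqr, iqr / 2

# Family sizes from Example-1 of this lecture
family_sizes = [3, 3, 4, 4, 5, 5, 6, 6, 7, 7]
q1, q3, iqr, qd = quartile_deviation(family_sizes)
print(f"Q1 = {q1}, Q3 = {q3}, Q = {iqr}, QD = {qd}")
```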
Mean Deviation
The mean deviation is the average of the absolute
deviations of individual observations from the
central value of a series. The average deviation
about the mean is
$$MD_{\bar{x}} = \frac{\sum_{i=1}^{k} f_i \, |x_i - \bar{x}|}{n}$$
where
k = number of classes
$x_i$ = midpoint of the i-th class
$f_i$ = frequency of the i-th class
n = total number of observations ($\sum f_i$)
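The formula for the mean deviation about the mean can be sketched for grouped data as follows. The function name is hypothetical, and the midpoints and frequencies are those of the frequency table in Example-2 of this lecture.

```python
def grouped_mean_deviation(midpoints, frequencies):
    """Mean deviation about the mean for grouped data."""
    n = sum(frequencies)  # total number of observations
    mean = sum(f * x for x, f in zip(midpoints, frequencies)) / n
    # Frequency-weighted sum of absolute deviations from the mean
    total_abs_dev = sum(f * abs(x - mean) for x, f in zip(midpoints, frequencies))
    return total_abs_dev / n

# Frequency table from Example-2 of this lecture
midpoints = [3, 5, 7, 8, 9]
frequencies = [2, 3, 2, 2, 1]
md = grouped_mean_deviation(midpoints, frequencies)
print(f"mean deviation = {md:.1f}")
```

Here the mean is 6 and the weighted absolute deviations sum to 18, so the mean deviation is 18/10 = 1.8.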
Standard Deviation
Standard deviation is the positive square root of
the mean-square deviations of the observations
from their arithmetic mean.
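Population:
$$\sigma = \sqrt{\frac{\sum (x_i - \mu)^2}{N}}$$
Sample:
$$s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{N - 1}}$$
$$SD = \sqrt{\text{variance}}$$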
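The two versions of the standard deviation differ only in the divisor. The sketch below writes both out by hand; the function names and the data values are illustrative assumptions, not part of the lecture.

```python
import math

def population_sd(values):
    """Square root of the mean squared deviation (divisor N)."""
    n = len(values)
    mu = sum(values) / n
    return math.sqrt(sum((x - mu) ** 2 for x in values) / n)

def sample_sd(values):
    """Same sum of squared deviations, but with divisor N - 1."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((x - mean) ** 2 for x in values) / (n - 1))

# Illustrative data (hypothetical values): mean 5, squared deviations sum to 32
data = [2, 4, 4, 4, 5, 5, 7, 9]
sigma = population_sd(data)
s = sample_sd(data)
print(f"population SD = {sigma:.3f}, sample SD = {s:.3f}")
```

The sample version is always the larger of the two, and the gap shrinks as the number of observations grows.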
Standard Deviation for Grouped Data
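The SD is:
$$s = \sqrt{\frac{\sum f_i (x_i - \bar{x})^2}{N}}$$
where
$$\bar{x} = \frac{\sum f_i x_i}{N}, \qquad N = \sum f_i$$
Simplified formula:
$$s = \sqrt{\frac{\sum f_i x_i^2}{N} - \left(\frac{\sum f_i x_i}{N}\right)^2}$$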
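The simplified formula follows from expanding the square in the definition. A short derivation is sketched here, assuming the divisor $N = \sum f_i$ and $\bar{x} = \sum f_i x_i / N$ as on this slide:

```latex
\begin{align*}
s^2 &= \frac{1}{N}\sum f_i (x_i - \bar{x})^2 \\
    &= \frac{1}{N}\left(\sum f_i x_i^2 - 2\bar{x}\sum f_i x_i + \bar{x}^2 \sum f_i\right) \\
    &= \frac{\sum f_i x_i^2}{N} - 2\bar{x}^2 + \bar{x}^2 \\
    &= \frac{\sum f_i x_i^2}{N} - \left(\frac{\sum f_i x_i}{N}\right)^2
\end{align*}
```

As a check with the totals of Example-2 ($\sum f_i x_i^2 = 400$, $\sum f_i x_i = 60$, $N = 10$): $400/10 - 6^2 = 4$, which equals $\sum f_i (x_i - \bar{x})^2 / N = 40/10$.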
Example-1: Find Standard
Deviation of Ungrouped Data
Family No.    1   2   3   4   5   6   7   8   9   10
Size ($x_i$)  3   3   4   4   5   5   6   6   7   7
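Here,
$$\bar{x} = \frac{\sum x_i}{n} = \frac{50}{10} = 5$$

Family No.   $x_i$   $x_i - \bar{x}$   $(x_i - \bar{x})^2$   $x_i^2$
1            3       -2                4                     9
2            3       -2                4                     9
3            4       -1                1                     16
4            4       -1                1                     16
5            5       0                 0                     25
6            5       0                 0                     25
7            6       1                 1                     36
8            6       1                 1                     36
9            7       2                 4                     49
10           7       2                 4                     49
Total        50      0                 20                    270

$$s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1} = \frac{20}{9} \approx 2.22, \qquad s = \sqrt{2.22} \approx 1.49$$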
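The working table of Example-1 can be reproduced with a few lines of Python. This is only a sketch: the shortcut based on the $x_i^2$ column, $(\sum x_i^2 - (\sum x_i)^2 / n) / (n - 1)$, is not written out in the lecture and is included here as an algebraically equivalent check.

```python
import math

# Family sizes from Example-1
family_sizes = [3, 3, 4, 4, 5, 5, 6, 6, 7, 7]
n = len(family_sizes)
mean = sum(family_sizes) / n  # 50 / 10 = 5

# Column totals of the working table
sum_sq_dev = sum((x - mean) ** 2 for x in family_sizes)  # 20
sum_sq = sum(x ** 2 for x in family_sizes)               # 270

# Sample variance from the deviations, and from the x_i^2 column
variance = sum_sq_dev / (n - 1)
shortcut = (sum_sq - sum(family_sizes) ** 2 / n) / (n - 1)

sd = math.sqrt(variance)
print(f"s^2 = {variance:.2f}, s = {sd:.2f}")
```

Carrying the unrounded variance 20/9 through the square root gives a standard deviation of about 1.49.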
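Example-2: Find Standard
Deviation of Grouped Data

$x_i$   $f_i$   $f_i x_i$   $f_i x_i^2$   $x_i - \bar{x}$   $(x_i - \bar{x})^2$   $f_i (x_i - \bar{x})^2$
3       2       6           18            -3                9                     18
5       3       15          75            -1                1                     3
7       2       14          98            1                 1                     2
8       2       16          128           2                 4                     8
9       1       9           81            3                 9                     9
Total   10      60          400           -                 -                     40

$$\bar{x} = \frac{\sum f_i x_i}{\sum f_i} = \frac{60}{10} = 6$$
$$s^2 = \frac{\sum f_i (x_i - \bar{x})^2}{n - 1} = \frac{40}{9} \approx 4.44, \qquad s = \sqrt{4.44} \approx 2.11$$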
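The same calculation can be sketched in Python. The variable names are illustrative, and the divisor n - 1 follows the worked example.

```python
import math

# Frequency table from Example-2
midpoints = [3, 5, 7, 8, 9]
frequencies = [2, 3, 2, 2, 1]

n = sum(frequencies)                                          # 10
sum_fx = sum(f * x for x, f in zip(midpoints, frequencies))   # 60
mean = sum_fx / n                                             # 6

# Last column of the working table: f_i * (x_i - mean)^2
sum_f_sq_dev = sum(f * (x - mean) ** 2 for x, f in zip(midpoints, frequencies))

variance = sum_f_sq_dev / (n - 1)  # 40 / 9
sd = math.sqrt(variance)
print(f"s^2 = {variance:.2f}, s = {sd:.2f}")
```

The value 4.44 is the variance; taking its square root gives the standard deviation, about 2.11.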
Relative Measures of Dispersion
To compare the extent of variation of different
distributions, whether their units of measurement
differ or are identical, it is necessary to
consider measures that express the
absolute dispersion in relative form.
These measures are usually expressed in the
form of coefficients and are pure numbers,
independent of the units of measurement.
Relative Measures of Dispersion
Coefficient of variation
Coefficient of mean deviation
Coefficient of range
Coefficient of quartile deviation
Coefficient of Variation
A coefficient of variation is computed as a
ratio of the standard deviation of the
distribution to the mean of the same
distribution.
$$CV = \frac{s_x}{\bar{x}}$$
Example-3: Comments on Children
in a Community
          Mean      SD       CV
Height    40 inch   5 inch   0.125
Weight    10 kg     2 kg     0.20
Since the coefficient of variation for weight
is greater than that of height, we would
tend to conclude that weight has more
variability than height in the population.
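The comparison in Example-3 can be checked with a short sketch. The function name is hypothetical; the means and standard deviations are those given in the example.

```python
def coefficient_of_variation(sd, mean):
    """Ratio of the standard deviation to the mean (a pure number)."""
    return sd / mean

# Values from Example-3: height in inches, weight in kg
cv_height = coefficient_of_variation(5, 40)
cv_weight = coefficient_of_variation(2, 10)

print(f"CV height = {cv_height}, CV weight = {cv_weight}")
print("weight is more variable" if cv_weight > cv_height else "height is more variable")
```

Because the units cancel in the ratio, inches and kilograms can be compared directly, which the raw standard deviations (5 inch versus 2 kg) would not allow.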
Coefficient of Mean Deviation
Another relative measure is the coefficient of mean
deviation. As the mean deviation can be computed from the
mean, median, mode, or any arbitrary value, a
general formula for computing the coefficient of mean
deviation may be put as follows:
$$\text{Coefficient of mean deviation} = \frac{\text{Mean deviation}}{\text{Mean}} \times 100$$
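A minimal sketch of this coefficient for ungrouped data follows, taking the mean deviation about the mean. The function name is hypothetical, and the data are the family sizes from Example-1.

```python
def coefficient_of_mean_deviation(values):
    """Mean deviation about the mean, as a percentage of the mean."""
    n = len(values)
    mean = sum(values) / n
    mean_deviation = sum(abs(x - mean) for x in values) / n
    return 100 * mean_deviation / mean

# Family sizes from Example-1 of this lecture
family_sizes = [3, 3, 4, 4, 5, 5, 6, 6, 7, 7]
cmd = coefficient_of_mean_deviation(family_sizes)
print(f"coefficient of mean deviation = {cmd:.1f}")
```

For these data the mean deviation is 12/10 = 1.2 and the mean is 5, so the coefficient is about 24.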
Coefficient of Range
The coefficient of range is a relative measure
corresponding to range and is obtained by the
following formula:
$$\text{Coefficient of range} = \frac{L - S}{L + S} \times 100$$
where, “L” and “S” are respectively the largest
and the smallest observations in the data set.
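The formula can be sketched as below. The function name is hypothetical, the data are the family sizes from Example-1, and the sketch assumes L + S is not zero.

```python
def coefficient_of_range(values):
    """100 * (L - S) / (L + S), with L the largest and S the smallest value."""
    largest, smallest = max(values), min(values)
    return 100 * (largest - smallest) / (largest + smallest)

# Family sizes from Example-1 of this lecture
family_sizes = [3, 3, 4, 4, 5, 5, 6, 6, 7, 7]
cr = coefficient_of_range(family_sizes)
print(cr)  # 100 * (7 - 3) / (7 + 3) = 40.0
```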
Coefficient of Quartile Deviation
The coefficient of quartile deviation is
computed from the first and the third
quartiles using the following formula:
$$\text{Coefficient of quartile deviation} = \frac{Q_3 - Q_1}{Q_3 + Q_1} \times 100$$
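A short sketch of this coefficient follows. The function takes the quartiles as inputs so that any quartile convention can be used; obtaining them from `statistics.quantiles` with its default ("exclusive") rule is an assumption, and the data are the family sizes from Example-1.

```python
import statistics

def coefficient_of_quartile_deviation(q1, q3):
    """100 * (Q3 - Q1) / (Q3 + Q1)."""
    return 100 * (q3 - q1) / (q3 + q1)

# Family sizes from Example-1 of this lecture
family_sizes = [3, 3, 4, 4, 5, 5, 6, 6, 7, 7]

# quantiles(n=4) returns Q1, the median, and Q3
q1, _median, q3 = statistics.quantiles(family_sizes, n=4)
cqd = coefficient_of_quartile_deviation(q1, q3)
print(f"Q1 = {q1}, Q3 = {q3}, coefficient = {cqd}")
```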
Assignment-1
Find the following measures of dispersion
for the data set given on the next page:
Range, percentile range, quartile range
Quartile deviation, mean deviation, standard
deviation
Coefficient of variation, coefficient of mean deviation,
coefficient of range, coefficient of quartile deviation
Data for Assignment-1
Marks     No. of students   Cumulative frequency
40-50     6                 6
50-60     11                17
60-70     19                36
70-80     17                53
80-90     13                66
90-100    4                 70
Total     70