Lecture-3: Descriptive Statistics: Measures of Dispersion


WFM 5201: Data Management and Statistical Analysis © Dr. Akm Saiful Islam
WFM 5201: Data Management and Statistical Analysis
Lecture-3: Descriptive Statistics [Measures of Dispersion]
Akm Saiful Islam
Institute of Water and Flood Management (IWFM)
Bangladesh University of Engineering and Technology (BUET)
April, 2008
Descriptive Statistics
• Measures of Central Tendency
• Measures of Location
• Measures of Dispersion
• Measures of Symmetry
• Measures of Peakedness

Measures of Variability or Dispersion
• The dispersion of a distribution reveals how the observations are spread out or scattered on each side of the center.
• Measuring the dispersion, scatter, or variation of a distribution is as important as locating its central tendency.
• Small dispersion indicates high uniformity of the observations in the distribution.
• Absence of dispersion in the data indicates perfect uniformity. This situation arises when all observations in the distribution are identical.
• If this were the case, a description of any single observation would suffice.
Purpose of Measuring Dispersion
• A measure of dispersion serves two purposes.
• First, it is one of the most important quantities used to characterize a frequency distribution.
• Second, it affords a basis of comparison between two or more frequency distributions.
• The study of dispersion derives its importance from the fact that different distributions may have exactly the same average but differ substantially in their variability.
Measures of Dispersion
• Range
• Percentile range
• Quartile deviation
• Mean deviation
• Variance and standard deviation
• Relative measures of dispersion
    • Coefficient of variation
    • Coefficient of mean deviation
    • Coefficient of range
    • Coefficient of quartile deviation
Range
The simplest and crudest measure of dispersion is the range. It is defined as the difference between the largest and the smallest values in the distribution. If $x_1, x_2, \ldots, x_n$ are the values of the observations in a sample, then the range (R) of the variable X is given by:
$$R(x_1, x_2, \ldots, x_n) = \max(x_1, x_2, \ldots, x_n) - \min(x_1, x_2, \ldots, x_n)$$
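The definition of the range can be sketched in a few lines of Python. This is a minimal illustration added for clarity, not part of the original slides; the function name `value_range` is hypothetical, and the sample values are the family sizes used in Example-1 of this lecture.

```python
def value_range(values):
    """Return R = max(values) - min(values) for a non-empty list of numbers."""
    if len(values) == 0:
        raise ValueError("the range is undefined for an empty sample")
    return max(values) - min(values)


# Family sizes from Example-1 of this lecture.
family_sizes = [3, 3, 4, 4, 5, 5, 6, 6, 7, 7]
print(value_range(family_sizes))  # 4
```

Because only the two extreme values enter the calculation, a single unusual observation can change the result substantially.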
Percentile Range
• The difference between the 90th and the 10th percentiles.
• It is established by excluding the highest and the lowest 10 percent of the items, and is the difference between the largest and the smallest values of the remaining 80 percent of the items.
$$P_{10}^{90} = P_{90} - P_{10}$$
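A possible way to compute the 10 to 90 percentile range in Python is sketched below. The slides do not say how percentiles should be interpolated, so the `"inclusive"` rule of the standard-library function `statistics.quantiles` (Python 3.8 or later) is an assumption here; other conventions can give slightly different values. The function name `percentile_range` and the data are hypothetical.

```python
import statistics


def percentile_range(values):
    """Return P90 - P10 using the 'inclusive' interpolation rule (an assumption)."""
    # quantiles(..., n=10) returns the 9 cut points P10, P20, ..., P90.
    deciles = statistics.quantiles(values, n=10, method="inclusive")
    p10, p90 = deciles[0], deciles[-1]
    return p90 - p10


# Hypothetical data: the integers 0 through 100.
data = list(range(101))
print(percentile_range(data))  # 80.0
```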
Quartile Deviation
A measure similar to the percentile range is the interquartile range (Q). It is the difference between the third quartile (Q3) and the first quartile (Q1). Thus
$$Q = Q_3 - Q_1$$
The interquartile range is frequently reduced to the semi-interquartile range, known as the quartile deviation (QD), by dividing it by 2. Thus
$$QD = \frac{Q_3 - Q_1}{2}$$
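The quartile deviation can be sketched in Python as follows. As with percentiles, several quartile conventions exist; this example assumes the `"inclusive"` rule of `statistics.quantiles`, which is a choice made here and not something the slides specify. The function name `quartile_deviation` is hypothetical; the data are the family sizes from Example-1 of this lecture.

```python
import statistics


def quartile_deviation(values):
    """Return QD = (Q3 - Q1) / 2 using 'inclusive' quartiles (an assumption)."""
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return (q3 - q1) / 2


# Family sizes from Example-1 of this lecture.
family_sizes = [3, 3, 4, 4, 5, 5, 6, 6, 7, 7]
print(quartile_deviation(family_sizes))  # 1.0
```

Under this convention Q1 = 4 and Q3 = 6, so the interquartile range is 2 and the quartile deviation is 1.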
Mean Deviation
The mean deviation is an average of the absolute deviations of individual observations from the central value of a series. The average deviation about the mean is
$$MD_{\bar{x}} = \frac{\sum_{i=1}^{k} f_i \, |x_i - \bar{x}|}{n}$$
where
k = number of classes
x_i = midpoint of the i-th class
f_i = frequency of the i-th class
n = total number of observations (the sum of the f_i)
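The grouped-data formula for the mean deviation about the mean can be sketched in Python as below. The function name `grouped_mean_deviation` is hypothetical; the values and frequencies are those of Example-2 in this lecture, used here only as convenient input.

```python
def grouped_mean_deviation(midpoints, frequencies):
    """Return sum(f_i * |x_i - mean|) / n for grouped data."""
    n = sum(frequencies)
    mean = sum(f * x for x, f in zip(midpoints, frequencies)) / n
    return sum(f * abs(x - mean) for x, f in zip(midpoints, frequencies)) / n


# Values and frequencies from Example-2 of this lecture.
x = [3, 5, 7, 8, 9]
f = [2, 3, 2, 2, 1]
print(grouped_mean_deviation(x, f))  # 1.8
```

Here the mean is 60 / 10 = 6 and the weighted absolute deviations sum to 18, giving 18 / 10 = 1.8.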
Standard Deviation
Standard deviation is the positive square root of the mean-square deviation of the observations from their arithmetic mean.
Population:
$$\sigma^2 = \frac{\sum (x_i - \mu)^2}{N}$$
Sample:
$$s^2 = \frac{\sum (x_i - \bar{x})^2}{N - 1}$$
$$SD = \sqrt{\text{variance}}$$
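The difference between the population and sample forms is only the divisor, which the following Python sketch makes explicit. The function names and the data are hypothetical and chosen for illustration; the sample form needs at least two observations.

```python
import math


def variance(values, sample=True):
    """Mean-square deviation about the mean: divide by N - 1 (sample) or N (population)."""
    n = len(values)
    mean = sum(values) / n
    squared_deviations = sum((x - mean) ** 2 for x in values)
    return squared_deviations / (n - 1 if sample else n)


def standard_deviation(values, sample=True):
    """Positive square root of the variance."""
    return math.sqrt(variance(values, sample))


# Hypothetical observations, chosen so the population SD is a round number.
data = [2, 4, 4, 4, 5, 5, 7, 9]
print(standard_deviation(data, sample=False))  # 2.0
print(standard_deviation(data, sample=True))   # about 2.138
```

The sample value is always the larger of the two, and the gap shrinks as the number of observations grows.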
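Standard Deviation for Grouped Data
SD is:
$$s = \sqrt{\frac{\sum f_i (x_i - \bar{x})^2}{N}}, \quad \text{where } \bar{x} = \frac{\sum f_i x_i}{\sum f_i}$$
Simplified formula:
$$s = \sqrt{\frac{\sum f x^2}{N} - \left(\frac{\sum f x}{N}\right)^2}$$
Here $N = \sum f_i$ is the total frequency. These formulas divide by N, whereas Example-1 and Example-2 in this lecture divide by n − 1.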
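To see why the definition and the simplified formula agree, the square inside the sum can be expanded. The short derivation below is a sketch added for clarity, not part of the original slides. It assumes $N = \sum f_i$ and $\bar{x} = \sum f_i x_i / N$, and it applies to the form that divides by N.

```latex
\begin{aligned}
\frac{\sum f_i (x_i - \bar{x})^2}{N}
  &= \frac{\sum f_i x_i^2}{N} - 2\bar{x}\,\frac{\sum f_i x_i}{N} + \bar{x}^2\,\frac{\sum f_i}{N} \\
  &= \frac{\sum f_i x_i^2}{N} - 2\bar{x}^2 + \bar{x}^2 \\
  &= \frac{\sum f_i x_i^2}{N} - \left(\frac{\sum f_i x_i}{N}\right)^2
\end{aligned}
```

Taking the positive square root of both sides gives the simplified formula for s.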
Example-1: Find Standard Deviation of Ungrouped Data
Family No.    1   2   3   4   5   6   7   8   9   10
Size (x_i)    3   3   4   4   5   5   6   6   7   7
Here, $\bar{x} = \frac{\sum x_i}{n} = \frac{50}{10} = 5$

Family No.   x_i   x_i - x̄   (x_i - x̄)^2   x_i^2
1            3     -2        4             9
2            3     -2        4             9
3            4     -1        1             16
4            4     -1        1             16
5            5     0         0             25
6            5     0         0             25
7            6     1         1             36
8            6     1         1             36
9            7     2         4             49
10           7     2         4             49
Total        50    0         20            270

$$s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1} = \frac{20}{9} \approx 2.22, \qquad s = \sqrt{2.22} \approx 1.49$$
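The arithmetic of Example-1 can be checked with a short Python sketch. It recomputes the column totals of the table and also evaluates the shortcut form that needs only the totals of the x_i and x_i^2 columns, which is presumably why the table carries an x_i^2 column. The variable names are illustrative. Rounding the variance to 2.2 before taking the square root would give 1.48 instead of 1.49.

```python
import math
import statistics

family_sizes = [3, 3, 4, 4, 5, 5, 6, 6, 7, 7]
n = len(family_sizes)

mean = sum(family_sizes) / n                               # 50 / 10 = 5.0
sum_sq_dev = sum((x - mean) ** 2 for x in family_sizes)    # 20.0
sum_sq = sum(x ** 2 for x in family_sizes)                 # 270

# Definition: divide the sum of squared deviations by n - 1.
s2_definition = sum_sq_dev / (n - 1)
# Shortcut: uses only the totals of the x_i and x_i^2 columns.
s2_shortcut = (sum_sq - sum(family_sizes) ** 2 / n) / (n - 1)

print(round(s2_definition, 2), round(math.sqrt(s2_definition), 2))   # 2.22 1.49
print(math.isclose(s2_shortcut, statistics.variance(family_sizes)))  # True
```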
Example-2: Find Standard Deviation of Grouped Data
x_i     f_i   f_i x_i   f_i x_i^2   x_i - x̄   (x_i - x̄)^2   f_i (x_i - x̄)^2
3       2     6         18          -3        9             18
5       3     15        75          -1        1             3
7       2     14        98          1         1             2
8       2     16        128         2         4             8
9       1     9         81          3         9             9
Total   10    60        400         -         -             40

$$\bar{x} = \frac{\sum f_i x_i}{\sum f_i} = \frac{60}{10} = 6$$
$$s^2 = \frac{\sum f_i (x_i - \bar{x})^2}{n - 1} = \frac{40}{9} \approx 4.44, \qquad s = \sqrt{4.44} \approx 2.11$$
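Example-2 can be verified in the same spirit. The sketch below recomputes the weighted totals from the table, divides by n − 1 as the example does, and cross-checks the result by expanding the frequency table into raw observations and passing them to the standard-library function `statistics.variance`. The variable names are illustrative.

```python
import math
import statistics

x = [3, 5, 7, 8, 9]   # values x_i
f = [2, 3, 2, 2, 1]   # frequencies f_i

n = sum(f)                                                            # 10
mean = sum(fi * xi for xi, fi in zip(x, f)) / n                       # 60 / 10 = 6.0
weighted_sq_dev = sum(fi * (xi - mean) ** 2 for xi, fi in zip(x, f))  # 40.0

s2 = weighted_sq_dev / (n - 1)
s = math.sqrt(s2)
print(round(s2, 2), round(s, 2))  # 4.44 2.11

# Cross-check: expand the table into raw observations.
raw = [xi for xi, fi in zip(x, f) for _ in range(fi)]
print(math.isclose(s2, statistics.variance(raw)))  # True
```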
Relative Measures of Dispersion
• To compare the extent of variation of different distributions, whether their units of measurement differ or are identical, it is necessary to consider other measures that express the absolute deviation in a relative form.
• These measures are usually expressed as coefficients and are pure numbers, independent of the units of measurement.
Relative Measures of Dispersion
• Coefficient of variation
• Coefficient of mean deviation
• Coefficient of range
• Coefficient of quartile deviation

Coefficient of Variation
A coefficient of variation is computed as the ratio of the standard deviation of a distribution to the mean of the same distribution.
$$CV = \frac{s_x}{\bar{x}}$$
Example-3: Comments on Children in a Community
        Height     Weight
Mean    40 inch    10 kg
SD      5 inch     2 kg
CV      0.125      0.20
Since the coefficient of variation for weight is greater than that for height, we would tend to conclude that weight has more variability than height in the population.
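The comparison in Example-3 can be reproduced with a small Python sketch. The function name `coefficient_of_variation` is hypothetical; the means and standard deviations are the ones given in the example.

```python
def coefficient_of_variation(sd, mean):
    """Return CV = sd / mean; the units cancel, leaving a pure number."""
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return sd / mean


cv_height = coefficient_of_variation(sd=5, mean=40)   # inches
cv_weight = coefficient_of_variation(sd=2, mean=10)   # kilograms
print(cv_height, cv_weight)    # 0.125 0.2
print(cv_weight > cv_height)   # True
```

Because each ratio divides a quantity by another in the same unit, the two results can be compared directly even though height is in inches and weight in kilograms.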
Coefficient of Mean Deviation
Another relative measure is the coefficient of mean deviation. As the mean deviation can be computed from the mean, median, mode, or any arbitrary value, a general formula for the coefficient of mean deviation may be written as follows:
$$\text{Coefficient of mean deviation} = \frac{\text{Mean deviation}}{\text{Mean}} \times 100$$
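As a sketch, the coefficient of mean deviation for ungrouped data can be computed as below. This version takes deviations about the arithmetic mean and divides by the mean; the function name is hypothetical, and the data are the family sizes from Example-1 of this lecture.

```python
def coefficient_of_mean_deviation(values):
    """Return 100 * (mean deviation about the mean) / mean for ungrouped data."""
    n = len(values)
    mean = sum(values) / n
    mean_deviation = sum(abs(x - mean) for x in values) / n
    return 100 * mean_deviation / mean


# Family sizes from Example-1 of this lecture.
family_sizes = [3, 3, 4, 4, 5, 5, 6, 6, 7, 7]
print(coefficient_of_mean_deviation(family_sizes))  # approximately 24
```

For these data the mean is 5 and the mean deviation is 12 / 10 = 1.2, so the coefficient is about 24 percent.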
Coefficient of Range
The coefficient of range is a relative measure corresponding to the range and is obtained by the following formula:
$$\text{Coefficient of range} = \frac{L - S}{L + S} \times 100$$
where "L" and "S" are, respectively, the largest and the smallest observations in the data set.
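A minimal Python sketch of the coefficient of range follows. It assumes the usual textbook form, 100 × (L − S) / (L + S); the function name is hypothetical, and the data are the family sizes from Example-1 of this lecture.

```python
def coefficient_of_range(values):
    """Return 100 * (L - S) / (L + S), with L the largest and S the smallest value."""
    largest, smallest = max(values), min(values)
    return 100 * (largest - smallest) / (largest + smallest)


# Family sizes from Example-1 of this lecture.
family_sizes = [3, 3, 4, 4, 5, 5, 6, 6, 7, 7]
print(coefficient_of_range(family_sizes))  # 40.0
```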
Coefficient of Quartile Deviation
The coefficient of quartile deviation is computed from the first and the third quartiles using the following formula:
$$\text{Coefficient of quartile deviation} = \frac{Q_3 - Q_1}{Q_3 + Q_1} \times 100$$
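The coefficient of quartile deviation can be sketched in Python as below, assuming the usual textbook form 100 × (Q3 − Q1) / (Q3 + Q1). The quartile convention is again an assumption: this example uses the `"inclusive"` rule of `statistics.quantiles`, and other rules may give slightly different quartiles. The function name is hypothetical.

```python
import statistics


def coefficient_of_quartile_deviation(values):
    """Return 100 * (Q3 - Q1) / (Q3 + Q1) using 'inclusive' quartiles (an assumption)."""
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return 100 * (q3 - q1) / (q3 + q1)


# Family sizes from Example-1 of this lecture.
family_sizes = [3, 3, 4, 4, 5, 5, 6, 6, 7, 7]
print(coefficient_of_quartile_deviation(family_sizes))  # 20.0
```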
Assignment-1
Find the following measures of dispersion from the data set given below:
• Range, percentile range, quartile range
• Quartile deviation, mean deviation, standard deviation
• Coefficient of variation, coefficient of mean deviation, coefficient of range, coefficient of quartile deviation
Data for Assignment-1
Marks     No. of students   Cumulative frequency
40-50     6                 6
50-60     11                17
60-70     19                36
70-80     17                53
80-90     13                66
90-100    4                 70
Total     70