Analyzing and Interpreting Data

Describing Data
Wahyu Wibowo
Central Tendency
A measure of central tendency is a value used to characterize the center of the set of values a variable contains.
It is useful for quantifying the middle, or central location, of a variable.
Central location for quantitative data: the mode, the median, and the mean.
Central location for qualitative data: the mode.
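
As a minimal sketch of the three measures named above, the following uses Python's standard `statistics` module. The sample values and the `colors` list are hypothetical and only serve as an illustration.

```python
import statistics

# Hypothetical sample of quantitative data (not taken from the slides).
values = [2, 3, 3, 5, 7, 8, 14]

mean = statistics.mean(values)      # arithmetic mean
median = statistics.median(values)  # middle value of the sorted data
mode = statistics.mode(values)      # most frequent value

# For qualitative data, only the mode is used here.
colors = ["red", "blue", "red", "green"]  # hypothetical categories
color_mode = statistics.mode(colors)

print(mean, median, mode, color_mode)
```

In this sample the mean lies above the median because the single large value pulls it upward.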

Dispersion Parameter
The boxplot provides an indication of the
value spread around the median.
 The field of statistics has developed
parameters to describe this spread, or
dispersion, using a single measure.

interquartile range
 range
 median absolute deviation
 standard deviation and variance
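
The dispersion parameters listed above could be computed roughly as follows. This is a sketch under assumptions: the sample is hypothetical, the quartiles use the `"inclusive"` convention of `statistics.quantiles` (other conventions give slightly different values), and the variance and standard deviation are the sample versions with an n - 1 denominator.

```python
import statistics

# Hypothetical sample (not taken from the slides).
values = [2, 3, 3, 5, 7, 8, 14]

# Range: distance between the largest and smallest value.
data_range = max(values) - min(values)

# Interquartile range: Q3 - Q1 (quartile convention is an assumption).
q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
iqr = q3 - q1

# Median absolute deviation: median of |x - median|.
med = statistics.median(values)
mad = statistics.median([abs(x - med) for x in values])

# Sample variance and sample standard deviation (n - 1 denominator).
variance = statistics.variance(values)
std_dev = statistics.stdev(values)

print(data_range, iqr, mad, variance, std_dev)
```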

Coefficient of Variation
If two variables are measured in different units, the values of the standard deviation cannot be used to compare their dispersion.
The coefficient of variation (CV) can be used to compare dispersions measured in different units.
It is equal to the quotient of the standard deviation and the absolute value of the mean.
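
A short sketch of the quotient described above. The function name `coefficient_of_variation` and the height data are hypothetical, and the sample standard deviation is assumed since the slides do not say which version is meant.

```python
import statistics

def coefficient_of_variation(values):
    """Standard deviation divided by the absolute value of the mean."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    # Assumption: sample standard deviation (n - 1 denominator).
    return statistics.stdev(values) / abs(mean)

# Hypothetical example: the same heights in centimetres and in metres.
heights_cm = [160.0, 170.0, 180.0]
heights_m = [1.6, 1.7, 1.8]

cv_cm = coefficient_of_variation(heights_cm)
cv_m = coefficient_of_variation(heights_m)
print(cv_cm, cv_m)
```

The standard deviations differ by a factor of 100, but the two CV values agree up to rounding, which is why the CV is suitable for comparing across units.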

Skewness
Skewness is a measure of distribution
asymmetry.
 Yule & Pearson express the difference
between median and mean as a degree of
deviation from symmetry

\[ \mathrm{Skew} = \frac{3(\bar{x} - x_{\mathrm{med}})}{s} \]

Values larger than 0 indicate a right-skewed distribution, values less than 0 indicate a left-skewed distribution, and values equal to 0 indicate a symmetric distribution.
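
The sign rule above can be checked with a small sketch of the skewness measure, three times the difference between mean and median divided by the standard deviation. The function name and the three samples are hypothetical, and the sample standard deviation is assumed for s.

```python
import statistics

def pearson_median_skewness(values):
    """Skew = 3 * (mean - median) / s, with s the sample standard deviation."""
    mean = statistics.mean(values)
    median = statistics.median(values)
    s = statistics.stdev(values)  # assumption: n - 1 denominator
    return 3 * (mean - median) / s

# Hypothetical samples illustrating the three cases.
right_skewed = [1, 2, 2, 3, 3, 3, 4, 15]
symmetric = [1, 2, 3, 4, 5]
left_skewed = [-15, -4, -3, -3, -3, -2, -2, -1]

for name, sample in [("right", right_skewed),
                     ("symmetric", symmetric),
                     ("left", left_skewed)]:
    print(name, pearson_median_skewness(sample))
```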

Kurtosis
Kurtosis describes how peaked or flat a distribution is and is used to help determine which of these forms is present.
It is based on the fourth central moment.
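
The slide breaks off after mentioning the fourth central moment. One common convention, assumed here and not confirmed by the slides, divides the fourth central moment by the squared second central moment, so that a normal distribution has a value of about 3. The function name and samples below are hypothetical.

```python
def kurtosis(values):
    """Fourth central moment divided by the squared second central moment.

    Assumption: population moments (denominator n), no bias correction.
    """
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m4 = sum((x - mean) ** 4 for x in values) / n  # fourth central moment
    return m4 / m2 ** 2

# Hypothetical samples: one flat, one peaked with a few far-out values.
flat = [1, 2, 3, 4, 5, 6]
peaked = [1, 3, 3, 3, 3, 3, 3, 5]

print(kurtosis(flat), kurtosis(peaked))
```

Under this convention, values below 3 suggest a flatter shape than the normal distribution and values above 3 a more peaked, heavier-tailed one.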

Distribution of Data

The distribution of data describes how the different values are distributed around the central location.

Boxplot
Box plots provide a succinct summary of
the overall frequency distribution of a
variable.
 Six values are usually displayed: the
lowest value, the lower quartile (Q1), the
median (Q2), the upper quartile (Q3), the
highest value, and the mean
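
The six values a boxplot displays can be collected with a sketch like the one below. The function name `boxplot_summary` and the sample are hypothetical, and the quartiles follow the `"inclusive"` convention of `statistics.quantiles`; plotting tools may use a different quartile rule.

```python
import statistics

def boxplot_summary(values):
    """Return the six values usually shown in a boxplot."""
    # Assumption: "inclusive" quartile convention.
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "min": min(values),
        "q1": q1,
        "median": q2,
        "q3": q3,
        "max": max(values),
        "mean": statistics.mean(values),
    }

# Hypothetical sample (not taken from the slides).
summary = boxplot_summary([2, 3, 3, 5, 7, 8, 14])
print(summary)
```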

Dotplot
Used to assess and compare distributions by plotting the values along a number line.
Dotplots are especially useful for comparing distributions.
The x-axis for a dotplot is divided into many small intervals, or bins. Data values falling within each bin are represented by dots.
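
The binning idea can be sketched as a text-only dotplot. This is a simplified illustration, not how any particular statistics package draws the plot: bins are listed top to bottom instead of along a horizontal axis, each value is drawn as an `o`, and the function name, bin width, and sample are hypothetical.

```python
import math
from collections import Counter

def dotplot_rows(values, bin_width):
    """One text row per bin: left bin edge, then one 'o' per value in the bin."""
    lo = min(values)
    # Index of the bin each value falls into, counted from the minimum.
    counts = Counter(math.floor((x - lo) / bin_width) for x in values)
    n_bins = max(counts) + 1
    rows = []
    for i in range(n_bins):
        left_edge = lo + i * bin_width
        rows.append(f"{left_edge:6.1f} | " + "o" * counts[i])
    return rows

# Hypothetical sample (not taken from the slides).
sample = [1.0, 1.2, 1.9, 2.5, 2.6, 2.7, 4.1]
rows = dotplot_rows(sample, bin_width=1.0)
print("\n".join(rows))
```

Empty bins are kept as empty rows so that gaps in the data stay visible.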

Stem and Leaf
Used to examine the shape and spread of sample data.
The display has three columns:
o The leaves (right)
o The stem (middle)
o Counts (left)
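
The three-column layout can be sketched for non-negative integers as follows. Assumptions: the stem is the tens part and the leaf is the last digit, and the count column shows the number of leaves in each row, which is a simplification (some statistical software shows cumulative counts instead). The function name and data are hypothetical.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Rows of 'count stem | leaves' for non-negative integers."""
    groups = defaultdict(list)
    for x in sorted(values):
        stem, leaf = divmod(x, 10)  # tens part and last digit
        groups[stem].append(leaf)
    rows = []
    # Include empty stems so gaps in the data remain visible.
    for stem in range(min(groups), max(groups) + 1):
        leaves = groups.get(stem, [])
        leaf_text = "".join(str(leaf) for leaf in leaves)
        rows.append(f"{len(leaves):>3} {stem:>3} | {leaf_text}")
    return rows

# Hypothetical sample (not taken from the slides).
rows = stem_and_leaf([12, 15, 21, 23, 23, 38, 41])
print("\n".join(rows))
```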

Histograms

The purpose of a histogram is to
graphically summarize the distribution of a
univariate data set.
The histogram graphically shows the
following:
1. center (i.e., the location) of the data;
2. spread (i.e., the scale) of the data;
3. skewness of the data;
4. presence of outliers; and
5. presence of multiple modes in the data
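
The counts behind a histogram can be sketched with equal-width bins as below. The function name, the number of bins, and the sample are hypothetical; real plotting libraries choose bin edges by their own rules.

```python
def histogram_counts(values, n_bins):
    """Count values in n_bins equal-width bins between the min and the max."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("all values are equal; bin width would be zero")
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for x in values:
        # The maximum value is placed in the last bin.
        i = min(int((x - lo) / width), n_bins - 1)
        counts[i] += 1
    return lo, width, counts

# Hypothetical sample with one value far from the rest.
data = [1, 2, 2, 3, 3, 3, 4, 4, 5, 12]
lo, width, counts = histogram_counts(data, n_bins=4)
for i, c in enumerate(counts):
    left, right = lo + i * width, lo + (i + 1) * width
    print(f"[{left:5.2f}, {right:5.2f}) " + "#" * c)
```

In this sample most values fall in the first bin, the bars thin out to the right, and the isolated last bar after an empty bin points to a possible outlier.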

References:
Exploratory Data Analysis in Business and Economics.
Toit, S.H.C., Steyn, A.G.W., Stumpf, R.H., Graphical Exploratory Data Analysis, Springer-Verlag.
