Analyzing and Interpreting Data
Describing Data
Wahyu Wibowo
Central Tendency
A measure of central tendency is a value used to characterize the center of the set of values a variable contains.
It is useful for quantifying the middle or central location of a variable.
Measures of central location for quantitative data: the mode, the median, and the mean.
Measure of central location for qualitative data: the mode.
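As a minimal sketch of the three measures, the snippet below uses Python's standard `statistics` module. The `scores` and `sizes` lists are made-up illustrative data, not values from the slides.

```python
import statistics

# Hypothetical samples: one quantitative variable, one qualitative variable.
scores = [2, 3, 3, 5, 7, 10]
sizes = ["M", "L", "M", "S", "M", "L"]

mean_score = statistics.mean(scores)      # arithmetic average
median_score = statistics.median(scores)  # middle of the sorted values
mode_score = statistics.mode(scores)      # most frequent value
mode_size = statistics.mode(sizes)        # the mode also works for categories

print(mean_score, median_score, mode_score, mode_size)
```

The mean and median are not meaningful for the `sizes` labels, which is why only the mode is listed for qualitative data.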
Dispersion Parameter
The boxplot provides an indication of the
value spread around the median.
The field of statistics has developed parameters that describe this spread, or dispersion, using a single measure.
Common dispersion parameters:
interquartile range
range
median absolute deviation
standard deviation and variance
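The listed parameters can be computed with the standard library, as in the sketch below. The sample is invented, the quartiles use the `inclusive` method (one of several conventions, chosen here as an assumption), and the variance is the sample variance with an n - 1 denominator.

```python
import statistics

data = [2, 4, 4, 5, 7, 9, 11]  # hypothetical sample

# Range: distance between the largest and smallest value.
data_range = max(data) - min(data)

# Interquartile range: distance between the upper and lower quartile.
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

# Median absolute deviation: median of the distances to the median.
med = statistics.median(data)
mad = statistics.median([abs(x - med) for x in data])

# Sample variance and standard deviation (n - 1 in the denominator).
variance = statistics.variance(data)
std_dev = statistics.stdev(data)

print(data_range, iqr, mad, variance, std_dev)
```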
Coefficient of Variation
If two variables are measured in different units, their standard deviations cannot be compared directly as measures of dispersion.
The coefficient of variation (CV) can be used to compare dispersions measured in different units.
It is equal to the quotient of the standard deviation and the absolute value of the mean.
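A small sketch of the quotient described above. The function name `coefficient_of_variation` and both samples are hypothetical, and the sample standard deviation is assumed.

```python
import statistics

def coefficient_of_variation(values):
    """Sample standard deviation divided by the absolute value of the mean."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return statistics.stdev(values) / abs(mean)

# Hypothetical measurements in different units.
heights_cm = [160, 170, 180]
weights_kg = [60, 70, 80]

cv_height = coefficient_of_variation(heights_cm)
cv_weight = coefficient_of_variation(weights_kg)
print(cv_height, cv_weight)
```

Both samples have the same standard deviation (10), yet the weights vary more relative to their mean, which is what the unit-free CV makes visible.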
Skewness
Skewness is a measure of distribution
asymmetry.
Yule & Pearson express the difference
between median and mean as a degree of
deviation from symmetry
$$\text{Skew} = \frac{3(\bar{x} - x_{\text{med}})}{s}$$
Values larger than 0 indicate a right-skewed distribution, values less than 0 indicate a left-skewed distribution, and a value of 0 indicates a symmetric distribution.
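The sign rule can be checked with a short sketch of the mean-minus-median measure. The function name and the three samples are made up for illustration, and the sample standard deviation is an assumption, since the slides do not say which version of s is meant.

```python
import statistics

def pearson_median_skewness(values):
    """3 * (mean - median) / standard deviation."""
    mean = statistics.mean(values)
    median = statistics.median(values)
    s = statistics.stdev(values)  # assumption: sample standard deviation
    return 3 * (mean - median) / s

right_skewed = [1, 2, 2, 3, 12]    # long tail toward large values
symmetric = [1, 2, 3, 4, 5]
left_skewed = [1, 10, 11, 11, 12]  # long tail toward small values

skew_right = pearson_median_skewness(right_skewed)
skew_sym = pearson_median_skewness(symmetric)
skew_left = pearson_median_skewness(left_skewed)
print(skew_right, skew_sym, skew_left)
```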
Kurtosis
Kurtosis helps determine the form of a distribution, that is, how peaked or flat it is.
It is defined as the fourth central moment divided by the fourth power of the standard deviation.
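One common way to turn the fourth central moment into a kurtosis value is to divide it by the squared second central moment. The sketch below assumes that standardized form with population moments (1/n denominators). The slide's own formula is not shown, so this is an illustration rather than the author's exact definition. Under this convention a normal distribution has kurtosis 3.

```python
def kurtosis(values):
    """Fourth central moment divided by the squared second central moment."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m4 = sum((x - mean) ** 4 for x in values) / n  # fourth central moment
    if m2 == 0:
        raise ValueError("kurtosis is undefined for constant data")
    return m4 / m2 ** 2

flat = [1, 2, 3, 4, 5]                        # evenly spread values
peaked = [-10, 0, 0, 0, 0, 0, 0, 0, 0, 10]    # sharp center, heavy tails

kurt_flat = kurtosis(flat)
kurt_peaked = kurtosis(peaked)
print(kurt_flat, kurt_peaked)
```

Values below 3 suggest a flatter form than the normal distribution, values above 3 a more peaked form with heavier tails.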
Distribution of data
Describes how the different values are distributed around the central location.
Boxplot
Box plots provide a succinct summary of
the overall frequency distribution of a
variable.
Six values are usually displayed: the
lowest value, the lower quartile (Q1), the
median (Q2), the upper quartile (Q3), the
highest value, and the mean
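The six displayed values can be computed directly, as in this sketch. The helper name `boxplot_values` and the sample are hypothetical, and the `inclusive` quartile method is an assumption. Plotting software may use a different quartile convention and give slightly different Q1 and Q3.

```python
import statistics

def boxplot_values(values):
    """Return the six values a box plot usually displays."""
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "min": min(values),
        "q1": q1,
        "median": q2,
        "q3": q3,
        "max": max(values),
        "mean": statistics.mean(values),
    }

summary = boxplot_values([2, 4, 4, 5, 7, 9, 11])
print(summary)
```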
Dotplot
Use a dotplot to assess a distribution by plotting its values along a number line.
Dotplots are especially useful for comparing distributions.
The x-axis for a dotplot is divided into many small intervals, or bins. Data values falling within each bin are represented by dots.
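The binning idea can be sketched in plain text. The helper `dotplot_rows` is hypothetical, assumes integer data and an integer bin width, and prints the number line vertically (one row per bin) instead of along the x-axis.

```python
from collections import Counter

def dotplot_rows(values, bin_width=1):
    """One text row per bin: the bin's left edge, then one dot per value."""
    counts = Counter((v // bin_width) * bin_width for v in values)
    rows = []
    left = min(counts)
    while left <= max(counts):
        # Empty bins are kept so that gaps in the data stay visible.
        rows.append(f"{left:>4} | " + "." * counts[left])
        left += bin_width
    return rows

rows = dotplot_rows([1, 1, 2, 4, 4, 4])
print("\n".join(rows))
```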
Stem and Leaf
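Use a stem-and-leaf display to examine the shape and spread of sample data.
The display has three columns:
o The leaves (right): the last digit of each value
o The stem (middle): the leading digits shared by the values in a row
o Counts (left)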
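A three-column display of this kind can be sketched as below. The helper `stem_and_leaf` is hypothetical and assumes non-negative integers, with the tens as stem and the units as leaf. The count column here is the number of leaves in each row. Some statistical packages print cumulative counts instead.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Rows of 'count  stem | leaves' for non-negative integers."""
    groups = defaultdict(list)
    for v in sorted(values):
        groups[v // 10].append(v % 10)  # stem = tens, leaf = units
    rows = []
    for stem in range(min(groups), max(groups) + 1):
        leaves = groups.get(stem, [])   # empty stems are kept as blank rows
        leaf_text = "".join(str(leaf) for leaf in leaves)
        rows.append(f"{len(leaves):>3}  {stem:>2} | {leaf_text}")
    return rows

display = stem_and_leaf([12, 15, 21, 23, 23, 38, 40])
print("\n".join(display))
```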
Histograms
The purpose of a histogram is to
graphically summarize the distribution of a
univariate data set.
The histogram graphically shows the
following:
1. center (i.e., the location) of the data;
2. spread (i.e., the scale) of the data;
3. skewness of the data;
4. presence of outliers; and
5. presence of multiple modes in the data.
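Behind the picture, a histogram is a table of counts per interval. The sketch below builds that table. The helper `histogram_counts` is hypothetical, assumes integer data and equal-width bins, and the sample is invented to include one outlier.

```python
def histogram_counts(values, bin_width):
    """Map each bin's left edge to the number of values falling in that bin."""
    first = (min(values) // bin_width) * bin_width
    last = (max(values) // bin_width) * bin_width
    # Start every bin at zero so empty bins appear in the result.
    counts = {left: 0 for left in range(first, last + 1, bin_width)}
    for v in values:
        counts[(v // bin_width) * bin_width] += 1
    return counts

counts = histogram_counts([1, 2, 2, 3, 3, 3, 4, 4, 5, 19], bin_width=5)
for left, count in counts.items():
    print(f"[{left:>2}, {left + 5:>2}) " + "#" * count)
```

In this example most values fall in the first bin (the center), the bars trail off to the right (right skew), and the empty bin before the last bar isolates the value 19 as a possible outlier.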