Descriptive Statistics: Numerical Methods


Chapter 3
Descriptive Statistics: Numerical
Methods
McGraw-Hill/Irwin
Copyright © 2014 by The McGraw-Hill Companies, Inc. All rights reserved.
Descriptive Statistics
3.1 Describing Central Tendency
3.2 Measures of Variation
3.3 Percentiles, Quartiles, and Box-and-Whiskers Displays
3.4 Covariance, Correlation, and the Least
Squares Line (Optional)
3.5 Weighted Means and Grouped Data
(Optional)
3.6 The Geometric Mean (Optional)
LO3-1: Compute and
interpret the mean,
median, and mode.
3.1 Describing Central Tendency

In addition to describing the shape of a distribution,
we want to describe the data set’s central tendency
◦ A measure of central tendency represents the center or
middle of the data
◦ The population mean (μ) is the average of the population
measurements


Population parameter: a number calculated from all
the population measurements that describes some
aspect of the population
Sample statistic: a number calculated using the
sample measurements that describes some aspect of
the sample
Measures of Central Tendency
Mean, μ: The average or expected value
Median, Md: The value of the middle point of the ordered measurements
Mode, Mo: The most frequent value
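As a minimal sketch of the three measures above, the following Python snippet uses the standard library `statistics` module. The data values are made up for illustration and do not come from the chapter.

```python
import statistics

# Hypothetical sample of measurements (illustrative values only).
data = [2, 3, 3, 5, 7, 10]

# Mean: sum of the measurements divided by how many there are.
mean_value = statistics.mean(data)

# Median: middle of the ordered measurements; with an even count,
# the average of the two middle values.
median_value = statistics.median(data)

# Mode: the most frequently occurring value.
mode_value = statistics.mode(data)

print("mean:", mean_value)
print("median:", median_value)
print("mode:", mode_value)
```

Here the mean (5) is pulled above the median (4) by the single large value 10, which is one reason to report more than one measure of central tendency.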
LO3-2: Compute and
interpret the range,
variance, and standard
deviation.


3.2 Measures of Variation
Knowing the measures of central tendency is
not enough
Both of the distributions in Figure 3.13 have
identical measures of central tendency
[Figure 3.13]
Measures of Variation
Range: Largest minus the smallest measurement
Variance: The average of the squared deviations of all the population measurements from the population mean
Standard Deviation: The square root of the population variance
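The population definitions above can be sketched in Python with the standard library. The data are hypothetical. Note that `statistics.pvariance` and `statistics.pstdev` divide by the population size N, matching the definitions here, while `statistics.variance` and `statistics.stdev` divide by n - 1 and are meant for samples.

```python
import statistics

# Hypothetical population of measurements (illustrative values only).
population = [2, 4, 4, 4, 5, 5, 7, 9]

# Range: largest measurement minus the smallest.
data_range = max(population) - min(population)

# Population variance: average squared deviation from the population mean.
variance = statistics.pvariance(population)

# Population standard deviation: square root of the population variance.
std_dev = statistics.pstdev(population)

print("range:", data_range)
print("variance:", variance)
print("standard deviation:", std_dev)
```

For this population the mean is 5, the squared deviations sum to 32, and dividing by N = 8 gives a variance of 4 and a standard deviation of 2.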
LO3-3: Use the
Empirical
Rule and Chebyshev’s
Theorem to describe
variation.
The Empirical Rule for Normal
Populations
If a population has mean µ and standard
deviation σ and is described by a normal
curve, then
◦ 68.26% of the population measurements lie
within one standard deviation of the mean:
[µ-σ, µ+σ]
◦ 95.44% lie within two standard deviations of
the mean: [µ-2σ, µ+2σ]
◦ 99.73% lie within three standard deviations
of the mean: [µ-3σ, µ+3σ]
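The Empirical Rule percentages can be checked numerically. The sketch below relies on the identity that, for a normal population, the fraction within k standard deviations of the mean is erf(k/√2); the helper name `normal_coverage` is an illustrative choice. Computed exactly, the first two values round to 68.27% and 95.45%; the figures 68.26% and 95.44% quoted above are what one gets by doubling four-decimal normal table areas (0.3413 and 0.4772).

```python
import math

def normal_coverage(k):
    """Fraction of a normal population within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {100 * normal_coverage(k):.2f}%")
```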

Chebyshev’s Theorem



Let µ and σ be a population’s mean and
standard deviation. Then, for any value k > 1,
at least 100(1 - 1/k²)% of the population
measurements lie in the interval [µ-kσ, µ+kσ]
Only practical for a non-mound-shaped
population distribution that is not very
skewed
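A short sketch of the Chebyshev bound follows. The function name `chebyshev_lower_bound` is hypothetical, chosen only for this illustration.

```python
def chebyshev_lower_bound(k):
    """Minimum percentage of measurements within k standard deviations of the mean.

    Chebyshev's Theorem applies to any population, but only for k > 1.
    """
    if k <= 1:
        raise ValueError("Chebyshev's Theorem requires k > 1")
    return 100 * (1 - 1 / k**2)

for k in (2, 3):
    print(f"k = {k}: at least {chebyshev_lower_bound(k):.2f}% within [mu - {k}*sigma, mu + {k}*sigma]")
```

For k = 2 the bound is 75%, and for k = 3 it is about 88.89%. Both are well below the Empirical Rule percentages, because the theorem makes no assumption about the shape of the population.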
z Scores

For any x in a population or sample, the associated z
score is
$$ z = \frac{x - \text{mean}}{\text{standard deviation}} $$

The z score is the number of standard deviations
that x is from the mean
◦ A positive z score is for x above (greater than) the mean
◦ A negative z score is for x below (less than) the mean
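The z score calculation can be sketched as below. The function name and the numbers (mean 50, standard deviation 10) are assumptions made for illustration.

```python
def z_score(x, mean, std_dev):
    """Number of standard deviations that x lies from the mean."""
    if std_dev <= 0:
        raise ValueError("standard deviation must be positive")
    return (x - mean) / std_dev

# Hypothetical population with mean 50 and standard deviation 10.
print(z_score(65, 50, 10))  # 1.5: x is above the mean
print(z_score(40, 50, 10))  # -1.0: x is below the mean
```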
LO3-4: Compute and
interpret percentiles,
quartiles, and box-and-whiskers displays.
3.3 Percentiles, Quartiles, and Box-and-Whiskers Displays
For a set of measurements arranged in increasing
order, the pth percentile is a value such that p
percent of the measurements fall at or below the
value and (100-p) percent of the measurements fall
at or above the value




The first quartile Q1 is the 25th percentile
The second quartile (median) is the 50th percentile
The third quartile Q3 is the 75th percentile
The interquartile range IQR is Q3 - Q1
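The sketch below computes quartiles and the IQR with `statistics.quantiles` from the Python standard library. Several conventions exist for locating percentiles in a finite data set, so the `method="inclusive"` choice here is an assumption and may give slightly different values from a hand calculation that follows another convention. The data are hypothetical.

```python
import statistics

# Hypothetical measurements (illustrative values only).
data = [1, 2, 3, 4, 5, 6, 7, 8, 9]

# n=4 splits the ordered data into four parts and returns the three
# cut points: Q1 (25th), median (50th), and Q3 (75th percentile).
q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")

# Interquartile range: spread of the middle 50% of the measurements.
iqr = q3 - q1

print("Q1:", q1, "median:", median, "Q3:", q3, "IQR:", iqr)
```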
LO3-5: Compute and
interpret covariance,
correlation, and the
least squares line
(Optional).
3.4 Covariance, Correlation, and the
Least Squares Line (Optional)


When points on a scatter plot seem to
fluctuate around a straight line, there is a
linear relationship between x and y
A measure of the strength of a linear
relationship is the covariance s_xy
$$ s_{xy} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{n - 1} $$
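As a minimal sketch, the sample covariance (sum of the products of paired deviations from the two sample means, divided by n - 1) can be computed directly. The function name and data are hypothetical.

```python
def sample_covariance(xs, ys):
    """Sample covariance of paired measurements, using an n - 1 divisor."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)
    if n < 2:
        raise ValueError("at least two paired measurements are required")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sum the products of paired deviations from the two sample means.
    cross_products = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return cross_products / (n - 1)

# Hypothetical data in which y rises exactly in step with x.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 6, 8, 10]
print(sample_covariance(xs, ys))  # 5.0
```

A positive result indicates that y tends to increase as x increases; reversing the order of `ys` would give a negative covariance of the same size.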
LO3-6: Compute and
interpret weighted
means and the mean
and standard deviation
of grouped data
(Optional).

3.5 Weighted Means and Grouped
Data (Optional)
Sometimes, some measurements are more important
than others
◦ Assign numerical “weights” to the data
  ◦ Weights measure relative importance of the value

Calculate weighted mean as
$$ \frac{\sum w_i x_i}{\sum w_i} $$
where w_i is the weight assigned to the ith
measurement x_i
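A short sketch of the weighted mean follows. The function name, values, and weights are hypothetical (for example, course scores weighted by credit hours).

```python
def weighted_mean(values, weights):
    """Sum of weight * value divided by the sum of the weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * x for w, x in zip(weights, values)) / total_weight

# Hypothetical scores with weights 3, 2, and 1.
scores = [90, 80, 70]
weights = [3, 2, 1]
print(round(weighted_mean(scores, weights), 2))  # 83.33
```

The unweighted mean of these scores is 80; the weighted mean is higher because the largest weight is attached to the highest score.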
LO3-7: Compute and
interpret the geometric
mean (Optional).
3.6 The Geometric Mean (Optional)



For rates of return of an investment, use the
geometric mean to give the correct wealth at
the end of the investment
Suppose the rates of return (expressed as
decimal fractions) are R1, R2, …, Rn for
periods 1, 2, …, n
The mean of all these returns is
calculated as the geometric mean:
$$ R_g = \sqrt[n]{(1 + R_1)(1 + R_2) \cdots (1 + R_n)} - 1 $$