Chapter 3 - McGraw Hill Higher Education
Chapter 3
Descriptive Statistics: Numerical Methods
McGraw-Hill/Irwin
Copyright © 2011 by The McGraw-Hill Companies, Inc. All rights reserved.
Descriptive Statistics
3.1 Describing Central Tendency
3.2 Measures of Variation
3.3 Percentiles, Quartiles, and Box-and-Whiskers Displays
3.4 Covariance, Correlation, and the Least Squares Line (Optional)
3.5 Weighted Means and Grouped Data (Optional)
3.6 The Geometric Mean (Optional)
3.1 Describing Central Tendency
LO 1: Compute and interpret the mean, median, and mode.
In addition to describing the shape of a distribution, we want to describe the data set’s central tendency
A measure of central tendency represents the center or middle of the data
Population mean (μ) is the average of the population measurements
Population parameter: a number calculated from all the population measurements that describes some aspect of the population
Sample statistic: a number calculated using the sample measurements that describes some aspect of the sample
LO1
Measures of Central Tendency
Mean: The average or expected value
Median, Md: The value of the middle point of the ordered measurements
Mode, Mo: The most frequent value
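As a rough illustration of the three measures, the sketch below uses Python's standard `statistics` module on a small made-up data set. The values in `data` are hypothetical and do not come from the slides.

```python
import statistics

# Hypothetical measurements, chosen only for illustration
data = [2, 3, 3, 5, 7, 10]

mean = statistics.mean(data)      # sum of the values divided by their count
median = statistics.median(data)  # middle of the ordered values; with an even
                                  # count, the average of the two middle values
mode = statistics.mode(data)      # the most frequent value

print("mean:", mean)
print("median:", median)
print("mode:", mode)
```

Here the mean (5) is pulled above the median (4.0) by the single large value 10, while the mode (3) reflects only the repeated value.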
3.2 Measures of Variation
LO 2: Compute and interpret the range, variance, and standard deviation.
Knowing the measures of central tendency is not enough
Both of the distributions below have identical measures of central tendency
LO2
Measures of Variation
Range: Largest minus the smallest measurement
Variance: The average of the squared deviations of all the population measurements from the population mean
Standard Deviation: The square root of the population variance
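The definitions above translate directly into a few lines of Python. This is a minimal sketch that treats a hypothetical list `data` as the entire population, so the squared deviations are divided by the number of measurements.

```python
import statistics

# Hypothetical population of measurements (not from the slides)
data = [2, 4, 4, 4, 5, 5, 7, 9]

# Range: largest minus smallest measurement
data_range = max(data) - min(data)

# Population variance: average squared deviation from the population mean
mu = statistics.mean(data)
variance = sum((x - mu) ** 2 for x in data) / len(data)

# Population standard deviation: square root of the population variance
std_dev = variance ** 0.5

print("range:", data_range)
print("variance:", variance)
print("standard deviation:", std_dev)
```

The standard library's `statistics.pvariance` and `statistics.pstdev` compute the same population quantities and can serve as a cross-check.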
LO 3: Use the Empirical Rule and Chebyshev’s Theorem to describe variation.
The Empirical Rule for Normal Populations
If a population has mean µ and standard deviation σ and is described by a normal curve, then
68.26% of the population measurements lie within one standard deviation of the mean: [µ-σ, µ+σ]
95.44% of the population measurements lie within two standard deviations of the mean: [µ-2σ, µ+2σ]
99.73% of the population measurements lie within three standard deviations of the mean: [µ-3σ, µ+3σ]
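The three percentages can be checked numerically. The sketch below uses `statistics.NormalDist` from the Python standard library; the helper name `share_within` is an illustrative assumption, not something from the slides.

```python
from statistics import NormalDist

# Standard normal curve: mean 0, standard deviation 1. The share within
# k standard deviations of the mean is the same for every normal population.
standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Fraction of a normal population within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {100 * share_within(k):.2f}%")
```

Direct computation gives roughly 68.27%, 95.45%, and 99.73%. The slightly lower 68.26% and 95.44% quoted above are what one obtains from normal tables rounded to four decimal places.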
LO3
Chebyshev’s Theorem
Let µ and σ be a population’s mean and standard deviation. Then for any value k > 1,
at least 100(1 - 1/k²)% of the population measurements lie in the interval [µ-kσ, µ+kσ]
Only practical for non-mound-shaped populations that are not very skewed
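To get a feel for the bound, the following sketch evaluates 100(1 - 1/k²)% for a few values of k. The function name `chebyshev_lower_bound` is hypothetical and used only for illustration.

```python
def chebyshev_lower_bound(k):
    """Minimum percentage of measurements within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("Chebyshev's Theorem requires k > 1")
    return 100 * (1 - 1 / k ** 2)

for k in (1.5, 2, 3):
    print(f"k = {k}: at least {chebyshev_lower_bound(k):.2f}%")
```

For k = 2 the guarantee is at least 75%, noticeably weaker than the 95.44% the Empirical Rule gives for a normal population. That is the price of a bound that holds for any population shape.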
LO3
z Scores
For any x in a population or sample, the associated z score is
$$z = \frac{x - \text{mean}}{\text{standard deviation}}$$
The z score is the number of standard deviations that x is from the mean
A positive z score is for x above (greater than) the mean
A negative z score is for x below (less than) the mean
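A short sketch of the z score calculation follows. The mean of 70 and standard deviation of 10 are made-up values chosen so the arithmetic is easy to follow.

```python
def z_score(x, mean, std_dev):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / std_dev

# Hypothetical population with mean 70 and standard deviation 10
above = z_score(85, 70, 10)  # x above the mean gives a positive z score
below = z_score(55, 70, 10)  # x below the mean gives a negative z score

print(above, below)
```

Both values are 1.5 standard deviations from the mean; the sign shows on which side.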
LO3
Coefficient of Variation
Measures the size of the standard deviation relative to the size of the mean
$$\text{Coefficient of variation} = \frac{\text{Standard deviation}}{\text{Mean}} \times 100\%$$
Used to:
Compare the relative variabilities of values about the mean
Compare the relative variability of populations or samples with different means and different standard deviations
Measure risk
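As one possible illustration of the risk comparison, the sketch below computes the coefficient of variation for two hypothetical investments. The fund names and figures are invented for this example.

```python
def coefficient_of_variation(std_dev, mean):
    """Standard deviation expressed as a percentage of the mean."""
    return 100 * std_dev / mean

# Hypothetical annual returns (in percent): mean and standard deviation
cv_fund_a = coefficient_of_variation(std_dev=2, mean=10)
cv_fund_b = coefficient_of_variation(std_dev=3, mean=20)

print("Fund A:", cv_fund_a, "%")
print("Fund B:", cv_fund_b, "%")
```

Fund B has the larger standard deviation in absolute terms, yet its variability relative to its mean is smaller (15% versus 20%), so by this measure it is the less risky of the two.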
3.3 Percentiles, Quartiles, and Box-and-Whiskers Displays
LO 4: Compute and interpret percentiles, quartiles, and box-and-whiskers displays.
For a set of measurements arranged in increasing order, the pth percentile is a value such that p percent of the measurements fall at or below the value and (100-p) percent of the measurements fall at or above the value
The first quartile Q1 is the 25th percentile
The second quartile (median) is the 50th percentile
The third quartile Q3 is the 75th percentile
The interquartile range IQR is Q3 - Q1
LO4
Calculating Percentiles
1. Arrange the measurements in increasing order
2. Calculate the index i = (p/100)n, where p is the percentile to find and n is the number of measurements
3. (a) If i is not an integer, round up: the next integer greater than i gives the position of the pth percentile
(b) If i is an integer, the pth percentile is the average of the measurements in positions i and i+1
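The three steps can be sketched in Python as follows. The function name `percentile` and the sample data are assumptions made for illustration, and the sketch restricts p to values strictly between 0 and 100 so that positions i and i+1 always exist. Statistical libraries often use different interpolation rules and may return slightly different values.

```python
import math

def percentile(measurements, p):
    """pth percentile using the index rule i = (p/100)n described above."""
    if not 0 < p < 100:
        raise ValueError("this sketch assumes 0 < p < 100")
    ordered = sorted(measurements)  # step 1: arrange in increasing order
    n = len(ordered)
    i = p * n / 100                 # step 2: index i = (p/100)n
    if not float(i).is_integer():
        # step 3(a): round up; positions are 1-based, Python lists are 0-based
        position = math.ceil(i)
        return ordered[position - 1]
    # step 3(b): average the measurements in positions i and i+1
    position = int(i)
    return (ordered[position - 1] + ordered[position]) / 2

# Hypothetical data set of 12 measurements
data = [7, 1, 3, 9, 5, 11, 13, 15, 17, 19, 21, 23]

q1 = percentile(data, 25)   # i = 3, so average the 3rd and 4th values
q3 = percentile(data, 75)   # i = 9, so average the 9th and 10th values
iqr = q3 - q1
p90 = percentile(data, 90)  # i = 10.8, so round up to the 11th value

print("Q1:", q1, "Q3:", q3, "IQR:", iqr, "90th percentile:", p90)
```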
3.4 Covariance, Correlation, and the Least Squares Line (Optional)
LO 5: Compute and interpret covariance, correlation, and the least squares line (optional).
When points on a scatter plot seem to fluctuate around a straight line, there is a linear relationship between x and y
A measure of the strength of a linear relationship is the covariance sxy
$$s_{xy} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{n - 1}$$
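The sample covariance formula can be sketched in a few lines of Python. The function name `sample_covariance` and the paired data are hypothetical.

```python
def sample_covariance(xs, ys):
    """Sample covariance: sum of (x - x_bar)(y - y_bar), divided by n - 1."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long samples with at least 2 points")
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    cross_products = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return cross_products / (n - 1)

# Hypothetical paired observations
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

s_xy = sample_covariance(xs, ys)
print("sample covariance:", s_xy)
```

A positive result, as here, indicates that y tends to increase as x increases; a negative result would indicate the opposite tendency.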
3.5 Weighted Means and Grouped Data (Optional)
LO 6: Compute and interpret weighted means and the mean and standard deviation of grouped data (optional).
Sometimes, some measurements are more important than others
Assign numerical “weights” to the data
Weights measure relative importance of the value
Calculate weighted mean as
$$\frac{\sum w_i x_i}{\sum w_i}$$
where wi is the weight assigned to the ith measurement xi
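A minimal sketch of the weighted mean follows. The scores and weights are invented for illustration, as is the function name `weighted_mean`.

```python
def weighted_mean(values, weights):
    """Sum of weight times value, divided by the sum of the weights."""
    if len(values) != len(weights):
        raise ValueError("each value needs exactly one weight")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * x for w, x in zip(weights, values)) / total_weight

# Hypothetical scores, where the last one counts the most
scores = [80, 90, 70]
weights = [2, 3, 5]

result = weighted_mean(scores, weights)
print("weighted mean:", result)
```

The unweighted mean of these scores is 80, while the weighted mean is 78.0 because the lowest score carries the largest weight.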
3.6 The Geometric Mean (Optional)
LO 7: Compute and interpret the geometric mean (optional).
For rates of return of an investment, use the geometric mean to give the correct wealth at the end of the investment
Suppose the rates of return (expressed as decimal fractions) are R1, R2, …, Rn for periods 1, 2, …, n
The mean of all these returns is calculated as the geometric mean:
$$R_g = \sqrt[n]{(1 + R_1)(1 + R_2) \cdots (1 + R_n)} - 1$$
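The sketch below shows why the geometric mean, rather than the arithmetic mean, reproduces ending wealth. The two-period return series is hypothetical, and the function name `geometric_mean_return` is an assumption for this example (it relies on `math.prod`, available in Python 3.8 and later).

```python
import math

def geometric_mean_return(returns):
    """nth root of the product of (1 + R_i), minus 1."""
    growth = math.prod(1 + r for r in returns)
    return growth ** (1 / len(returns)) - 1

# Hypothetical returns: a 100% gain followed by a 50% loss
returns = [1.0, -0.5]

r_g = geometric_mean_return(returns)
arithmetic = sum(returns) / len(returns)

start = 1000.0
end_actual = start * math.prod(1 + r for r in returns)
end_geometric = start * (1 + r_g) ** len(returns)
end_arithmetic = start * (1 + arithmetic) ** len(returns)

print("geometric mean return:", r_g)
print("arithmetic mean return:", arithmetic)
print("ending wealth (actual, geometric, arithmetic):",
      end_actual, end_geometric, end_arithmetic)
```

The investment ends exactly where it started, which the geometric mean of 0 reflects. Compounding the arithmetic mean of 25% would overstate the ending wealth.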