Comparing the Mean and Median
Chapter 5
Describing Distributions
Numerically
Describing a Quantitative Variable
using Percentiles
Percentile
– A given percent of the observations are
less than this value.
– Ex. 10th percentile - 10% of the
observations of the variable are less than
the 10th percentile.
– Ex. 90th percentile - 90% of the
observations of the variable are less than
the 90th percentile.
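The definition above can be checked by counting. The following is a minimal sketch, not a standard library routine: the helper name `percent_below` and the data (the whole numbers 1 through 100) are assumptions chosen only so the percentages come out evenly.

```python
def percent_below(observations, value):
    """Return the percent of observations strictly less than value."""
    count_below = sum(1 for x in observations if x < value)
    return 100 * count_below / len(observations)


# Hypothetical data: the whole numbers 1 through 100.
scores = list(range(1, 101))

# 10 of the 100 values are below 11, and 90 of the 100 are below 91.
print(percent_below(scores, 11))
print(percent_below(scores, 91))
```

With small data sets or repeated values the percentages will usually not land exactly on 10% or 90%, which is one reason software uses interpolation rules for percentiles.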
Important Percentiles
Minimum – 0th percentile
Q1 – 25th percentile (called the first
quartile)
Median – 50th percentile
Q3 – 75th percentile (called the third
quartile)
Maximum – 100th percentile
Median
50th percentile
– 50% of the observations are below the median
– 50% of the observations are above the median
Median is the ______________________
Measures the __________ of the
observations
Properties of the Median
Which observations affect the median?
73 is an outlier
– Does this observation affect the median?
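One way to explore the question is to recompute the median with the outlier removed or made more extreme. This sketch uses the Barry Bonds home run counts listed later in these slides; the value 173 is a hypothetical replacement used only for illustration.

```python
from statistics import median

# Home runs per season for Barry Bonds, as listed in these slides.
home_runs = [5, 16, 19, 24, 25, 25, 26, 33, 33, 34, 34,
             37, 37, 40, 42, 45, 45, 46, 46, 49, 73]

# Drop the 73-home-run season.
without_outlier = [x for x in home_runs if x != 73]

# Hypothetical change: make the largest season even more extreme.
more_extreme = [173 if x == 73 else x for x in home_runs]

median_all = median(home_runs)
median_without = median(without_outlier)
median_extreme = median(more_extreme)

print(median_all, median_without, median_extreme)
```

All three medians come out to 34, which suggests that only the middle of the ordered observations determines the median.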
Range
Measures spread (variability)
Minimum – 0th percentile
Maximum – 100th percentile
Range = Maximum – Minimum
Properties of the Range
Which observations affect the range?
73 is an outlier
– Does this observation affect the range?
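A quick check, assuming the range is computed as maximum minus minimum and using the Barry Bonds home run counts listed later in these slides. The helper name `data_range` is illustrative.

```python
# Home runs per season for Barry Bonds, as listed in these slides.
home_runs = [5, 16, 19, 24, 25, 25, 26, 33, 33, 34, 34,
             37, 37, 40, 42, 45, 45, 46, 46, 49, 73]


def data_range(observations):
    """Range = maximum - minimum."""
    return max(observations) - min(observations)


range_all = data_range(home_runs)
range_without = data_range([x for x in home_runs if x != 73])

print(range_all)      # 73 - 5
print(range_without)  # 49 - 5
```

Because the range depends only on the two most extreme observations, removing the 73 changes it from 68 to 44.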
IQR (Interquartile Range)
Measures spread (variability)
IQR = Q3 - Q1
Spread of the center 50% of the
observations
Finding Q1 and Q3
In general,
– Q1 is the median of the lower half of
the ordered observations.
– Q3 is the median of the upper half of
the ordered observations.
Actual calculations from textbook and R
may be slightly different.
IQR of Home Runs Per Season for
Barry Bonds
Order the home runs from smallest to largest
5 16 19 24 25 25 26 33 33 34 34
37 37 40 42 45 45 46 46 49 73
Lower Half
– 5 16 19 24 25 25 26 33 33 34 34
– Q1 = 25
Upper Half
– 34 37 37 40 42 45 45 46 46 49 73
– Q3 = 45
IQR = 45 – 25 = 20
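The calculation above can be sketched in code. This is one possible convention, not the only one: the helper `quartiles_by_halves` is a hypothetical name, and it assumes that when the number of observations is odd the overall median is included in both halves, as the slide does.

```python
from statistics import median

# Home runs per season for Barry Bonds, as listed above.
home_runs = [5, 16, 19, 24, 25, 25, 26, 33, 33, 34, 34,
             37, 37, 40, 42, 45, 45, 46, 46, 49, 73]


def quartiles_by_halves(observations):
    """Return (Q1, Q3) as the medians of the lower and upper halves.

    Assumption: for an odd number of observations, the overall median
    belongs to both halves.
    """
    ordered = sorted(observations)
    n = len(ordered)
    half = (n + 1) // 2        # size of each half
    lower = ordered[:half]
    upper = ordered[n - half:]
    return median(lower), median(upper)


q1, q3 = quartiles_by_halves(home_runs)
iqr = q3 - q1
print(q1, q3, iqr)
```

Other textbooks leave the median out of both halves, which can give slightly different quartiles for other data sets.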
Five Number Summary
– Min = 5
– Q1 = 25
– Median = 34
– Q3 = 45
– Max = 73
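As a sketch of how software might produce the five number summary, the example below uses Python's `statistics.quantiles` with `method="inclusive"` on the Barry Bonds home run counts. The choice of method is an assumption; as noted earlier, different textbooks and programs use slightly different quartile rules.

```python
from statistics import quantiles

# Home runs per season for Barry Bonds, as listed above.
home_runs = [5, 16, 19, 24, 25, 25, 26, 33, 33, 34, 34,
             37, 37, 40, 42, 45, 45, 46, 46, 49, 73]

ordered = sorted(home_runs)

# n=4 asks for the three cut points that split the data into quarters.
q1, med, q3 = quantiles(ordered, n=4, method="inclusive")

five_number = (ordered[0], q1, med, q3, ordered[-1])
print(five_number)
```

For these data the result agrees with the hand calculation: minimum 5, Q1 25, median 34, Q3 45, maximum 73.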
Graph of Five Number Summary
Boxplot
– Box ___________________________.
– Line in the box marks the ____________.
– Lines extend out from box to the most
extreme data point which is no more than
1.5 times the IQR from the box.
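The whisker rule can be worked through numerically. This sketch takes Q1 = 25 and Q3 = 45 from the Barry Bonds calculation above; the names `lower_fence` and `upper_fence` are illustrative labels for the cutoffs 1.5 times the IQR beyond the box.

```python
# Home runs per season for Barry Bonds, as listed above.
home_runs = [5, 16, 19, 24, 25, 25, 26, 33, 33, 34, 34,
             37, 37, 40, 42, 45, 45, 46, 46, 49, 73]

q1, q3 = 25, 45                 # quartiles from the slide above
iqr = q3 - q1

lower_fence = q1 - 1.5 * iqr    # 25 - 30
upper_fence = q3 + 1.5 * iqr    # 45 + 30

# Whiskers reach the most extreme data points inside the fences.
inside = [x for x in home_runs if lower_fence <= x <= upper_fence]
lower_whisker = min(inside)
upper_whisker = max(inside)

# Points beyond the fences would be plotted individually.
beyond = [x for x in home_runs if x < lower_fence or x > upper_fence]

print(lower_fence, upper_fence)
print(lower_whisker, upper_whisker, beyond)
```

With these quartiles the fences are -5 and 75, so the whiskers would extend to 5 and 73. The 73-home-run season sits just inside the upper fence, even though it stands well apart from the other seasons.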
[Figure: boxplot with parts labeled A through F; axis from 0 to 25]
Mean
Ordinary average
– Add up all observations.
– Divide by the number of observations.
Mean
Formula
– n observations
– y1, y2, y3, …, yn are the observations.
\[
\bar{y} = \frac{y_1 + y_2 + y_3 + \cdots + y_n}{n} = \frac{\sum_{i=1}^{n} y_i}{n}
\]
Properties of the Mean
What effect do the observations have
on the mean?
73 is an outlier. What effect does this
observation have on the mean?
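To see the effect, the mean can be computed with and without the outlier. This sketch uses the Barry Bonds home run counts from earlier in these slides and applies the definition directly: add up all observations, then divide by the number of observations.

```python
# Home runs per season for Barry Bonds, as listed in these slides.
home_runs = [5, 16, 19, 24, 25, 25, 26, 33, 33, 34, 34,
             37, 37, 40, 42, 45, 45, 46, 46, 49, 73]

without_outlier = [x for x in home_runs if x != 73]

mean_all = sum(home_runs) / len(home_runs)                    # 734 / 21
mean_without = sum(without_outlier) / len(without_outlier)    # 661 / 20

print(round(mean_all, 2))
print(round(mean_without, 2))
```

Every observation enters the sum, so the single 73 pulls the mean up by almost two home runs, from about 33.05 to about 34.95.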
Standard Deviation
Measures spread (variability)
“Average” spread from mean.
Denoted by letter s.
Standard Deviation
\[
s = \sqrt{\frac{(y_1 - \bar{y})^2 + (y_2 - \bar{y})^2 + \cdots + (y_n - \bar{y})^2}{n - 1}} = \sqrt{\frac{\sum_{i=1}^{n} (y_i - \bar{y})^2}{n - 1}}
\]
Standard Deviation
Usually calculate using computer or
calculator.
– Choose n-1 option on calculator.
Do once by hand
– Make a table.
Properties of s
s≥0
– s = 0 only when all observations are equal.
– s > 0 in all other cases.
s has the same units as the data.
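These properties can be illustrated with small made-up data sets. The numbers below are hypothetical, and the metres-to-centimetres conversion is just one way to show that s is expressed in the units of the data.

```python
from statistics import stdev

# Hypothetical measurements in metres.
all_equal = [7, 7, 7, 7]
varied = [5, 7, 7, 9]

s_equal = stdev(all_equal)    # no spread at all
s_varied = stdev(varied)      # some spread

# The same varied measurements expressed in centimetres.
varied_cm = [100 * x for x in varied]
s_varied_cm = stdev(varied_cm)

print(s_equal, s_varied, s_varied_cm)
```

Here s is 0 for the identical observations and positive for the varied ones, and converting the data to centimetres multiplies s by 100 as well.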
Properties of s
What effect do the observations have
on the value of s?
73 is an outlier. What effect does this
observation have on the value of s?
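A sketch of the comparison, using the Barry Bonds home run counts from earlier in these slides and Python's `statistics.stdev`, which divides by n - 1.

```python
from statistics import stdev

# Home runs per season for Barry Bonds, as listed in these slides.
home_runs = [5, 16, 19, 24, 25, 25, 26, 33, 33, 34, 34,
             37, 37, 40, 42, 45, 45, 46, 46, 49, 73]

without_outlier = [x for x in home_runs if x != 73]

s_all = stdev(home_runs)
s_without = stdev(without_outlier)

print(round(s_all, 2), round(s_without, 2))
```

Removing the single 73 drops s from roughly 14.3 to roughly 11.6, because squared deviations give observations far from the mean a large weight.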
Comparison of the Mean and Median
Median
Mean
Mean vs. Median
Mean and Median are generally similar
when
– Distribution is ________________
Mean and median are generally
different when either
– Distribution is ________________
– ___________ are present.
Influence of Outliers on the Mean
and Median
Small Example: Income in a small town
of 6 people
$25,000 $27,000 $29,000
$35,000 $37,000 $38,000
Mean income is about $31,833
Median income is $32,000
Influence of Outliers on the Mean
and Median
– Bill Gates moves to town.
$25,000 $27,000 $29,000
$35,000 $37,000 $38,000 $100,000,000
– The mean income is $14,313,000
– The median income is $35,000
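The town example can be reproduced directly. This sketch uses the incomes given above; the variable names are illustrative.

```python
from statistics import mean, median

# Incomes of the six original residents.
incomes = [25_000, 27_000, 29_000, 35_000, 37_000, 38_000]

# Incomes after the $100,000,000 earner moves to town.
with_newcomer = incomes + [100_000_000]

mean_before, median_before = mean(incomes), median(incomes)
mean_after, median_after = mean(with_newcomer), median(with_newcomer)

print(f"Before: mean ${mean_before:,.0f}, median ${median_before:,.0f}")
print(f"After:  mean ${mean_after:,.0f}, median ${median_after:,.0f}")
```

One new resident moves the mean from about $31,833 to $14,313,000, while the median only shifts from $32,000 to $35,000.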
Influence of Skewness on the Mean
and Median
The observations in the tail influence
the mean. These observations do not
influence the median.
– Skewed to the right (large values)
____________________
– Skewed to the left (small values)
____________________
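A small illustration with made-up data. Both lists below are hypothetical: the first has a tail of large values, and the second is its mirror image with a tail of small values.

```python
from statistics import mean, median

# Hypothetical data with a long tail of large values (skewed right).
right_skewed = [1, 2, 2, 3, 3, 3, 4, 10, 20]

# Mirror image: a long tail of small values (skewed left).
left_skewed = [21 - x for x in right_skewed]

mean_right, median_right = mean(right_skewed), median(right_skewed)
mean_left, median_left = mean(left_skewed), median(left_skewed)

print(round(mean_right, 2), median_right)   # mean above the median
print(round(mean_left, 2), median_left)     # mean below the median
```

In this example the tail pulls the mean toward it while the median stays with the bulk of the data, which is the typical pattern for skewed distributions.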
Final Word - Mean vs. Median
Always question when means are
reported for skewed data
– Income
– Housing prices
– Course grades
Which summaries are the best?
Five Number Summary
– ______________________
– ______________________
Mean and Standard Deviation
– ______________________
ALWAYS GET A PICTURE OF YOUR
DATA.