Transcript Summ_metric

Section 1 Topic 3
Summarising metric data:
Median, IQR, and boxplots





Can we describe a distribution with just
one or two numbers?
What is the median, how is it calculated
and what does it tell us?
What is the interquartile range, how is
it calculated and what does it tell us?
What is a five number summary?
What is a box plot and why is it useful?
Will less than the whole
picture do?
Summary Statistics
Measures of centre: Median, Mean
Measures of spread: Range, Interquartile Range, Standard Deviation
Median
Data set: 3 5 1 4 8
Firstly numerically order the data set: 1 3 4 5 8
50% lower than or equal to median
50% higher than or equal to median
Location of Median = (n+1)/2
= (5+1)/2
= 3rd observation
Notes p.97
For an odd number of data values the median will be one of the data values
1 3 4 5 8
Median = 4
For an even number of data values the median may not coincide with an actual data value
3 4 5 8
Location of Median = (4+1)/2
= 5/2
= 2.5th observation, midway between the 2nd and 3rd values
Median = (4+5)/2 = 4.5
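The location rule can be sketched in a few lines of Python. This is a minimal illustration, not part of the course notes; the function name `median_by_location` is hypothetical, and it assumes a non-empty list of numbers.

```python
def median_by_location(values):
    """Median via the (n + 1) / 2 location rule (illustrative helper)."""
    ordered = sorted(values)            # step 1: numerically order the data
    n = len(ordered)
    location = (n + 1) / 2              # 1-based position of the median
    if n % 2 == 1:
        # Odd n: the location is a whole number, so the median is a data value.
        return ordered[int(location) - 1]
    # Even n: the location ends in .5, so average the two neighbouring values.
    below = ordered[int(location) - 1]  # e.g. the 2nd value when location = 2.5
    above = ordered[int(location)]      # e.g. the 3rd value when location = 2.5
    return (below + above) / 2


print(median_by_location([3, 5, 1, 4, 8]))  # 4
print(median_by_location([3, 4, 5, 8]))     # 4.5
```

Both results match the slide examples: 4 for the five-value data set and 4.5 for the four-value data set.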
Limitations: Range

Depends on only two extreme values.

Data set 1: 5 6 7 8 9 10 11 12
Data set 2: 5 12 12 12 12 12 12 12
Range = 12 - 5 = 7 for both data sets
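As a quick check of this limitation, the sketch below computes the range for two data sets. The values are as read from the slide (the layout there is ambiguous, so treat the exact values as an assumption), and `data_range` is a hypothetical helper name.

```python
def data_range(values):
    """Range = largest value minus smallest value."""
    return max(values) - min(values)


data_set_1 = [5, 6, 7, 8, 9, 10, 11, 12]       # evenly spread out
data_set_2 = [5, 12, 12, 12, 12, 12, 12, 12]   # almost all values identical

print(data_range(data_set_1))  # 7
print(data_range(data_set_2))  # 7
```

The two data sets look very different, yet the range is the same because it uses only the minimum and maximum.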
Interquartile range
The interquartile range (IQR) is defined to be the
spread of the middle 50% of data values, so that
IQR = Q3 - Q1
Quartiles are the points that divide a
distribution into quarters
Q1 = 25% point
Q2 = 50% point (Median)
Q3 = 75% point
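A minimal sketch of computing quartiles and the IQR follows. Quartile conventions differ between textbooks and software; this sketch assumes the "median of each half" convention (the middle value is left out of both halves when n is odd), which may not be the one used in the course notes. The helper names `quartiles` and `iqr` are illustrative, and at least two data values are assumed.

```python
from statistics import median


def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves convention (assumption)."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower_half = ordered[:half]        # values below the median position
    upper_half = ordered[n - half:]    # values above the median position
    return median(lower_half), median(ordered), median(upper_half)


def iqr(values):
    """IQR = Q3 - Q1, the spread of the middle 50% of the data."""
    q1, _, q3 = quartiles(values)
    return q3 - q1


data = [5, 6, 7, 8, 9, 10, 11, 12]
print(quartiles(data))  # (6.5, 8.5, 10.5)
print(iqr(data))        # 4.0
```

Other conventions (for example, interpolation-based quantiles in statistical software) can give slightly different Q1 and Q3 for small data sets.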
Notes p.99
Why is the IQR more useful than the range?
The IQR describes the middle 50% of observations.
The upper 25% and lower 25% of observations are discarded.
The IQR is generally not affected by outliers.
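To illustrate the last point, the sketch below compares the range and the IQR before and after one extreme value is introduced. The data are hypothetical, `spread` is an illustrative helper, and quartiles are again taken as the medians of the lower and upper halves (an assumed convention).

```python
from statistics import median


def spread(values):
    """Return (range, IQR); quartiles by median-of-halves (assumption)."""
    ordered = sorted(values)
    half = len(ordered) // 2
    q1 = median(ordered[:half])
    q3 = median(ordered[len(ordered) - half:])
    return ordered[-1] - ordered[0], q3 - q1


typical = [5, 6, 7, 8, 9, 10, 11, 12]
with_outlier = [5, 6, 7, 8, 9, 10, 11, 120]   # hypothetical: last value is extreme

print(spread(typical))        # (7, 4.0)
print(spread(with_outlier))   # (115, 4.0)
```

The single extreme value changes the range from 7 to 115, while the IQR stays at 4.0 because the top and bottom quarters of the data are ignored.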
Picturing quartiles with a histogram
[Figure: histogram (vertical axis: Frequency) with Q1, Q2 and Q3 marked, splitting the data into the bottom 25%, the middle 50% (between Q1 and Q3) and the top 25%]
Notes p.97
Five number summary
Minimum value, Q1, Median, Q3, Maximum value
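A short sketch of assembling the five number summary is given below. The function name is illustrative, at least two data values are assumed, and the quartiles use the median-of-halves convention, which is an assumption rather than something the slides specify.

```python
from statistics import median


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum)."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    q1 = median(ordered[:half])        # median of the lower half
    q3 = median(ordered[n - half:])    # median of the upper half
    return ordered[0], q1, median(ordered), q3, ordered[-1]


print(five_number_summary([3, 5, 1, 4, 8]))  # (1, 2.0, 4, 6.5, 8)
```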
The Boxplot
Graphical representation of five number summary
Notes p.98
Constructing a Boxplot
Notes p.99
*Exercise 4
Notes p.103
Relating a boxplot to the shape of the distribution: Symmetric
[Figure: symmetric distribution with its box plot; Q1, M and Q3 marked]
For a symmetric distribution, the box plot is also symmetric. The median
is in the middle of the box and the whiskers are approximately equal in
length.
Notes p.104
Positively skewed distributions
[Figure: positively skewed distribution with its box plot; Q1, M and Q3 marked]
The box plot of a positively skewed distribution has the median off-centre
and to the left. The left hand whisker will be short, while the right hand
whisker will be long, reflecting the gradual tailing off of data values to the
right.
Negatively skewed distributions
[Figure: negatively skewed distribution with its box plot; Q1, M and Q3 marked]
The box plot of a negatively skewed distribution has the median off-centre
and to the right. The right hand whisker will be short, while the left hand
whisker will be long, reflecting the gradual tailing off of data values to the left.
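The three shape descriptions can be turned into a rough rule of thumb that compares the two whiskers and the two halves of the box. The sketch below is only an illustration: the helper names and the 25% `tolerance` are arbitrary assumptions, not a rule from the notes, and the input values are hypothetical.

```python
def compare(left, right, tolerance):
    """Return +1 if right is clearly longer, -1 if left is, 0 if roughly equal."""
    longest = max(left, right)
    if longest == 0 or abs(right - left) <= tolerance * longest:
        return 0
    return 1 if right > left else -1


def describe_shape(minimum, q1, med, q3, maximum, tolerance=0.25):
    """Guess the shape of a distribution from its five number summary."""
    whiskers = compare(q1 - minimum, maximum - q3, tolerance)  # whisker lengths
    box = compare(med - q1, q3 - med, tolerance)               # median position in box
    if whiskers == 0 and box == 0:
        return "roughly symmetric"
    if whiskers >= 0 and box >= 0:
        return "positively skewed"   # long right whisker, median to the left
    if whiskers <= 0 and box <= 0:
        return "negatively skewed"   # long left whisker, median to the right
    return "unclear"                 # whiskers and box disagree


print(describe_shape(1, 4, 5, 6, 9))    # roughly symmetric
print(describe_shape(1, 2, 3, 6, 15))   # positively skewed
print(describe_shape(1, 10, 13, 14, 15))  # negatively skewed
```

A five number summary hides a lot of detail, so a histogram is still the safer way to judge shape.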
Boxplot with outliers

Possible outliers are defined as any values outside the interval
(Q1 - 1.5 × IQR, Q3 + 1.5 × IQR)
We say "possible", since the point may just be part of the tail of the distribution, but we may not have enough data to be sure.
Notes p.101
Boxplot with outliers
Min = 38, Q1 = 63, M = 70, Q3 = 75, Max = 76
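Applying the 1.5 × IQR rule to this five number summary can be sketched as follows. The values are as read from the slide (Min 38, Q1 63, M 70, Q3 75, Max 76), and `outlier_fences` is a hypothetical helper name.

```python
def outlier_fences(q1, q3):
    """Return the interval (Q1 - 1.5 x IQR, Q3 + 1.5 x IQR)."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


minimum, q1, med, q3, maximum = 38, 63, 70, 75, 76   # values read from the slide
lower, upper = outlier_fences(q1, q3)

print(lower, upper)      # 45.0 93.0
print(minimum < lower)   # True: 38 lies below the lower limit
print(maximum > upper)   # False: 76 lies inside the interval
```

With IQR = 75 - 63 = 12, the interval is (45, 93), so the minimum of 38 is flagged as a possible outlier while the maximum of 76 is not.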
*Exercise 5
Notes p.107