Chapter 4 Displaying Quantitative Data
Dealing With a Lot of
Numbers...
When looking at large sets of quantitative
data, it can be difficult to get a sense of
what the numbers are telling us without
summarizing the numbers in some way.
In this chapter, we will concentrate on
graphical displays of quantitative data.
Percent of Population over 65
per state (1996)
13.0 14.3 12.5  5.2 12.8 12.6 13.2 18.5 15.2 14.4
 9.9 13.7 10.5 12.9 12.6 11.0 11.4 11.4 13.8 13.2
13.9 11.4 14.1 12.4 12.4 12.3 13.8 11.4 12.0 13.8
11.0 13.4 12.5 14.5 13.4 13.5 13.4 15.9 15.8 12.1
14.4 12.5 10.2  8.8 12.1 11.2 11.6 15.2 13.3 11.2
What do these data tell us?
Make a picture
Histogram
Stem-and-Leaf Display
Dot plot
First three things to do with data
Make a picture
Make a picture
Make a picture
Displaying Quantitative Data
Histogram
Give each graph a title
Give each one of the axes a label
Make it as neat as possible
• Computer
• Grid paper
Displaying Quantitative Data
Histogram
Divide data values into equal-width piles
(called bins)
Count number of values in each bin
Plot the bins on x-axis
Plot the bin counts on y-axis
Example – Population Over 65
Decide on bin values
Low value is 5.2 and high value is 18.5
Bins are 5.0 up to 6.0, 6.0 up to 7.0, etc.
Written as 5.0 ≤ X < 6.0, 6.0 ≤ X < 7.0
Count number of values in each bin
Bin 5.0 ≤ X < 6.0 has 1 value
Bin 6.0 ≤ X < 7.0 has 0 values
Bin 7.0 ≤ X < 8.0 has 0 values
Bin 8.0 ≤ X < 9.0 has 1 value
Continue counting values in each bin
Example – Population Over 65
Plot bins on x-axis
14 bins from 5.0 ≤ X < 6.0 to 18.0 ≤ X < 19.0
Plot bin counts on y-axis
Bin counts are:
1, 0, 0, 1, 1, 2, 9, 13, 13, 5, 4, 0, 0, 1
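The counting steps above can be sketched in a few lines of Python. This is only an illustration, not part of the original slides: the function name bin_counts and its parameters are made up for this example, and the data are the 50 values listed earlier.

```python
import math

# Percent of population over 65 per state (1996), copied from the slide.
PERCENT_OVER_65 = [
    13.0, 14.3, 12.5, 5.2, 12.8, 12.6, 13.2, 18.5, 15.2, 14.4,
    9.9, 13.7, 10.5, 12.9, 12.6, 11.0, 11.4, 11.4, 13.8, 13.2,
    13.9, 11.4, 14.1, 12.4, 12.4, 12.3, 13.8, 11.4, 12.0, 13.8,
    11.0, 13.4, 12.5, 14.5, 13.4, 13.5, 13.4, 15.9, 15.8, 12.1,
    14.4, 12.5, 10.2, 8.8, 12.1, 11.2, 11.6, 15.2, 13.3, 11.2,
]


def bin_counts(values, low, high, width=1.0):
    """Count values in equal-width bins: low <= x < low + width, and so on up to high."""
    n_bins = round((high - low) / width)
    counts = [0] * n_bins
    for x in values:
        index = math.floor((x - low) / width)
        # Values outside [low, high) are ignored in this sketch.
        if 0 <= index < n_bins:
            counts[index] += 1
    return counts


counts = bin_counts(PERCENT_OVER_65, low=5.0, high=19.0)

# Print a rough text histogram, one row per bin.
for i, count in enumerate(counts):
    left = 5.0 + i
    print(f"{left:4.1f} <= X < {left + 1:4.1f} | {'*' * count}")
```

The list counts reproduces the 14 bin counts given on the slide. A plotting library would normally draw the bars; the text rows here just keep the sketch dependency-free.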
Displaying Quantitative Data
Stem-and-Leaf Display
Picture of Distribution
Generally used for smaller data sets
Groups data into bins, as a histogram does
Keeps the original values (unlike a
histogram)
Two columns
• Left column: Stem
• Right column: Leaf
Displaying Quantitative Data
Stem-and-Leaf Display
Leaf
• Contains the last digit of the values
• Arranged in increasing order away from stem
Stem
• Contains the remaining digits of each value
• Arranged in increasing order from top to bottom
Example – Population Over 65
Leaf = tenths digit
Stem = tens and ones digits
Ex. 5 | 2
Ex. 10| 2 5
Ex. 14| 1 3 4 4 5
Percent of Population over Age 65 (by state) in 1996
5 | 2
6 |
7 |
8 | 8
9 | 9
10 | 2 5
11 | 0 0 2 2 4 4 4 4 6
12 | 0 1 1 3 4 4 5 5 5 6 6 8 9
13 | 0 2 2 3 4 4 4 5 7 8 8 8 9
14 | 1 3 4 4 5
15 | 2 2 8 9
16 |
17 |
18 | 5
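As a rough sketch of how such a display can be built (the function stem_and_leaf and its scale parameter are invented for this illustration, not taken from the slides): each value is turned into a whole number of tenths, the last digit becomes the leaf, and the remaining digits become the stem.

```python
from collections import defaultdict

# Percent of population over 65 per state (1996), copied from the slide.
PERCENT_OVER_65 = [
    13.0, 14.3, 12.5, 5.2, 12.8, 12.6, 13.2, 18.5, 15.2, 14.4,
    9.9, 13.7, 10.5, 12.9, 12.6, 11.0, 11.4, 11.4, 13.8, 13.2,
    13.9, 11.4, 14.1, 12.4, 12.4, 12.3, 13.8, 11.4, 12.0, 13.8,
    11.0, 13.4, 12.5, 14.5, 13.4, 13.5, 13.4, 15.9, 15.8, 12.1,
    14.4, 12.5, 10.2, 8.8, 12.1, 11.2, 11.6, 15.2, 13.3, 11.2,
]


def stem_and_leaf(values, scale=10):
    """Return display rows; the leaf is the last digit of round(value * scale)."""
    rows = defaultdict(list)
    for x in values:
        # round() avoids floating-point surprises such as 14.3 * 10 = 143.00000000000003.
        stem, leaf = divmod(round(x * scale), 10)
        rows[stem].append(leaf)
    lines = []
    # Walk every stem from smallest to largest so empty stems still get a row.
    for stem in range(min(rows), max(rows) + 1):
        leaves = " ".join(str(d) for d in sorted(rows.get(stem, [])))
        lines.append(f"{stem:>2} | {leaves}".rstrip())
    return lines


lines = stem_and_leaf(PERCENT_OVER_65)
print("\n".join(lines))
```

With scale=10 the leaf is the tenths digit, as in this example. For whole-number data such as home run counts, scale=1 would make the ones digit the leaf.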
Example – Frank Thomas
Career Home Runs (1990-2004)
4 7 15 18 24 28 29 32 35 38 40 40 41 42 43
0 | 4 7
1 | 5 8
2 | 4 8 9
3 | 2 5 8
4 | 0 0 1 2 3
Displaying Quantitative Data
Back-to-back Stem-and-Leaf Display
Used to compare the same variable for two groups
Stems in center column
Leaves for one group – right side
Leaves for other group – left side
Arrange leaves in increasing order,
AWAY FROM STEM!
Example – Compare Frank
Thomas to Ryne Sandberg
Career Home Runs for
Ryne Sandberg (1981-1997)
0 5 7 8 9 12 14 16 19 19 25 26 26 26 30 40
 Sandberg | Stem | Thomas
9 8 7 5 0 | 0 | 4 7
9 9 6 4 2 | 1 | 5 8
  6 6 6 5 | 2 | 4 8 9
        0 | 3 | 2 5 8
        0 | 4 | 0 0 1 2 3
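A back-to-back display like the one above can be sketched as follows. The function name back_to_back is made up for this illustration, and the sketch assumes whole-number data where the ones digit is the leaf.

```python
# Season home run totals copied from the slides.
THOMAS = [4, 7, 15, 18, 24, 28, 29, 32, 35, 38, 40, 40, 41, 42, 43]
SANDBERG = [0, 5, 7, 8, 9, 12, 14, 16, 19, 19, 25, 26, 26, 26, 30, 40]


def back_to_back(left_values, right_values):
    """Return rows of (left leaves, stem, right leaves) for whole-number data."""
    combined = left_values + right_values
    rows = []
    for stem in range(min(combined) // 10, max(combined) // 10 + 1):
        # Left leaves are listed in decreasing order so that, read outward
        # from the stem, they increase away from it.
        left = sorted((v % 10 for v in left_values if v // 10 == stem), reverse=True)
        right = sorted(v % 10 for v in right_values if v // 10 == stem)
        rows.append((left, stem, right))
    return rows


rows = back_to_back(SANDBERG, THOMAS)

# Right-align the left side so the stems line up in a center column.
width = max(len(left) for left, _, _ in rows) * 2
for left, stem, right in rows:
    left_text = " ".join(map(str, left))
    right_text = " ".join(map(str, right))
    print(f"{left_text:>{width}} | {stem} | {right_text}")
```

Both sides share one stem column, which is what makes the two players directly comparable row by row.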
Displaying Quantitative Data
If a large number of observations fall
in only a few stems, we can split
the stems.
Split each stem into two stems
First stem holds leaves 0 – 4.
Second stem holds leaves 5 – 9.
If you choose to split one stem you
MUST split them all!
Example – Population Over 65
12 | 0 1 1 3 4 4 5 5 5 6 6 8 9
13 | 0 2 2 3 4 4 4 5 7 8 8 8 9
Split into:
12 | 0 1 1 3 4 4
12 | 5 5 5 6 6 8 9
13 | 0 2 2 3 4 4 4
13 | 5 7 8 8 8 9
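The splitting rule can be sketched in code as well. The function split_stem_rows is a hypothetical name for this illustration, and the data are just the values with stems 12 and 13 from the population example.

```python
# Values with stems 12 and 13 from the population-over-65 data.
VALUES = [
    12.0, 12.1, 12.1, 12.3, 12.4, 12.4, 12.5, 12.5, 12.5, 12.6, 12.6, 12.8, 12.9,
    13.0, 13.2, 13.2, 13.3, 13.4, 13.4, 13.4, 13.5, 13.7, 13.8, 13.8, 13.8, 13.9,
]


def split_stem_rows(values, scale=10):
    """Stem-and-leaf rows with every stem split into leaves 0-4 and leaves 5-9."""
    rows = {}
    for x in values:
        stem, leaf = divmod(round(x * scale), 10)
        half = 0 if leaf <= 4 else 1  # 0 = first (low) stem, 1 = second (high) stem
        rows.setdefault((stem, half), []).append(leaf)
    low_stem = min(stem for stem, _ in rows)
    high_stem = max(stem for stem, _ in rows)
    lines = []
    for stem in range(low_stem, high_stem + 1):
        for half in (0, 1):  # every stem is split, even if one half is empty
            leaves = " ".join(str(d) for d in sorted(rows.get((stem, half), [])))
            lines.append(f"{stem:>2} | {leaves}".rstrip())
    return lines


lines = split_stem_rows(VALUES)
print("\n".join(lines))
```

Looping over both halves for every stem is how the sketch enforces the rule that splitting one stem means splitting them all.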
Looking at Distributions
Always report 3 things when
describing a distribution:
1. Shape
2. Center
3. Spread
Looking at Distributions
Shape
How many humps (called modes)?
• None = uniform
• One = unimodal
• Two = bimodal
• Three or more = multimodal
Unimodal vs Bimodal
[Figure: Histogram of Octane Rating; x-axis: Octane (86 to 96), y-axis: Frequency]
[Figure: Size of Diamonds (carats); x-axis: Size (carats), y-axis: Frequency]
Looking at Distributions
Shape
Is it symmetric?
• Symmetric = roughly equal on both sides
• Skewed = more values on one side
• Skewed right = tail stretches toward large values
• Skewed left = tail stretches toward small values
Are there any outliers?
• Interesting observations in data
• Can impact statistical methods
Examples of Skewness
Looking at Distributions
Center
A single number to describe the data
Can calculate different numbers for center
Looking at Distributions
Spread
Variation in the data values
• Smallest observation to the largest observation
• May take into account any outliers
• Later, spread will be a single number
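As a small illustration of center and spread, the sketch below computes two common candidates for the center (mean and median) and the smallest-to-largest spread for the population data. The slides do not say which number they use for center, so the choice of mean and median here is an assumption, and the variable names are made up for this example.

```python
import statistics

# Percent of population over 65 per state (1996), copied from the slide.
PERCENT_OVER_65 = [
    13.0, 14.3, 12.5, 5.2, 12.8, 12.6, 13.2, 18.5, 15.2, 14.4,
    9.9, 13.7, 10.5, 12.9, 12.6, 11.0, 11.4, 11.4, 13.8, 13.2,
    13.9, 11.4, 14.1, 12.4, 12.4, 12.3, 13.8, 11.4, 12.0, 13.8,
    11.0, 13.4, 12.5, 14.5, 13.4, 13.5, 13.4, 15.9, 15.8, 12.1,
    14.4, 12.5, 10.2, 8.8, 12.1, 11.2, 11.6, 15.2, 13.3, 11.2,
]

# Two different single-number summaries of the center.
mean = statistics.mean(PERCENT_OVER_65)
median = statistics.median(PERCENT_OVER_65)

# Spread as smallest observation to largest observation.
smallest = min(PERCENT_OVER_65)
largest = max(PERCENT_OVER_65)
value_range = largest - smallest

print(f"center: mean = {mean:.1f}, median = {median:.1f}")
print(f"spread: {smallest} to {largest} (range = {value_range:.1f})")
```

The smallest-to-largest range includes any outliers, so a description of spread may also note where almost all of the observations fall.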
Example – Population Over 65
Shape
Unimodal
Symmetric
Two outliers (5.2% and 18.5%)
Center - 12%
Spread - Almost all observations are
between 8% and 16%
Example – Frank Thomas
0 | 4 7
1 | 5 8
2 | 4 8 9
3 | 2 5 8
4 | 0 0 1 2 3
• Shape
– Unimodal
– Skewed left
– No outliers
• Center - 28
• Spread – between 4 and 43
Example – Compare Frank
Thomas to Ryne Sandberg
 Sandberg | Stem | Thomas
9 8 7 5 0 | 0 | 4 7
9 9 6 4 2 | 1 | 5 8
  6 6 6 5 | 2 | 4 8 9
        0 | 3 | 2 5 8
        0 | 4 | 0 0 1 2 3
• Shape
– Unimodal
– Skewed right
– No Outliers
• Center – 26
• Spread – between 0 and 40
• Both players have about the same
spread
• Thomas has more high values
What Do We Know?
Histograms, Stem-and-Leaf Displays, Back-to-Back Stem-and-Leaf Displays
When describing a display, always mention:
Shape: number of modes, symmetric or skewed
Spread
Center
Outliers (mention them if they exist; otherwise,
say there are no outliers)
What Do We Know? (cont.)
A graph is either symmetric or skewed,
not both!
If a graph is skewed, be sure to specify
the direction:
Skewed left or skewed right