Chapter Two Organizing and Summarizing Data

2.2 Organizing Quantitative Data I
The first step in summarizing quantitative data is to
determine whether the data is discrete or continuous.
If the data is discrete, the categories of data will
be the observations (as in qualitative data). If the
data is continuous, the categories of data (called
classes) must be created using intervals of numbers.
(I) Discrete Data
EXAMPLE 1 Constructing Frequency and Relative
Frequency Distributions from Discrete Data
The following data represent the number of
available cars in a household based on a random
sample of 50 households. Construct a frequency
and relative frequency distribution.
3 4 1 3 2 0 2 1 3 3
1 2 3 2 2 2 2 2 1 1
1 1 4 2 2 1 2 1 2 2
1 2 2 0 1 2 0 1 3 1
0 2 2 2 3 2 4 2 2 5
Data based on results reported by the United
States Bureau of the Census.
Frequency and Relative Frequency Table
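The table for this example can be rebuilt directly from the 50 observations listed above. The following is a minimal Python sketch, assuming the relative frequency of a value is its frequency divided by the number of observations; the function name `frequency_table` is made up for this illustration.

```python
from collections import Counter

# Number of available cars in each of the 50 sampled households (Example 1).
cars = [
    3, 4, 1, 3, 2, 0, 2, 1, 3, 3,
    1, 2, 3, 2, 2, 2, 2, 2, 1, 1,
    1, 1, 4, 2, 2, 1, 2, 1, 2, 2,
    1, 2, 2, 0, 1, 2, 0, 1, 3, 1,
    0, 2, 2, 2, 3, 2, 4, 2, 2, 5,
]


def frequency_table(data):
    """Return rows of (value, frequency, relative frequency), sorted by value."""
    counts = Counter(data)
    n = len(data)
    return [(value, counts[value], counts[value] / n) for value in sorted(counts)]


table = frequency_table(cars)

print(f"{'Cars':>4}  {'Frequency':>9}  {'Relative frequency':>18}")
for value, freq, rel in table:
    print(f"{value:>4}  {freq:>9}  {rel:>18.2f}")
```

Because the data is discrete, each distinct observation (0 through 5 cars) serves as its own category, and the relative frequencies add up to 1.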
A histogram is constructed by drawing a
rectangle for each class of data. The height
of each rectangle is the frequency or relative
frequency of the class. The rectangles should
all have the same width and should touch
each other.
EXAMPLE 2 Drawing a Histogram for Discrete Data
Draw a frequency and relative frequency
histogram for the “number of cars per household”
data.
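As a rough stand-in for the drawn histograms, the sketch below prints one bar per value of the "number of cars per household" data. It is only a text approximation under stated assumptions: the bars run horizontally rather than vertically, and the helper name `text_histogram` is invented for this example. The frequency and relative frequency versions have the same shape; only the label on each bar changes.

```python
from collections import Counter

# Number of available cars in each of the 50 sampled households (Example 1).
cars = [
    3, 4, 1, 3, 2, 0, 2, 1, 3, 3,
    1, 2, 3, 2, 2, 2, 2, 2, 1, 1,
    1, 1, 4, 2, 2, 1, 2, 1, 2, 2,
    1, 2, 2, 0, 1, 2, 0, 1, 3, 1,
    0, 2, 2, 2, 3, 2, 4, 2, 2, 5,
]


def text_histogram(data, relative=False):
    """Return one text bar per value; bar length equals the frequency."""
    counts = Counter(data)
    n = len(data)
    lines = []
    for value in range(min(counts), max(counts) + 1):
        freq = counts[value]  # Counter gives 0 for values that never occur
        label = f"{freq / n:.2f}" if relative else str(freq)
        lines.append(f"{value} | {'#' * freq} {label}")
    return lines


print("\n".join(text_histogram(cars)))                 # frequency histogram
print("\n".join(text_histogram(cars, relative=True)))  # relative frequency histogram
```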
(II) Continuous Data
Categories of data are created for
continuous data using intervals of
numbers called classes.
i) Grouping Data and Constructing Tables
Group the data into intervals
(or classes), then construct the frequency
distribution table or relative
frequency distribution table.
The following data represent the number of persons
aged 25 - 64 who are currently work disabled.
Age        Number (in thousands)
25 – 34    2,132
35 – 44    3,928
45 – 54    4,532
55 – 64    5,108
The lower class limit of a class is the smallest value
within the class, while the upper class limit of a class is
the largest value within the class. The lower class limit of
the first class is 25. The lower class limit of the second
class is 35. The upper class limit of the first class is 34.
The class width is the difference between consecutive
lower class limits. The class width of the data given
above is 35 - 25 = 10.
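The definitions of class limits and class width can be checked against the age classes in the table above. This is a small sketch; the variable names are chosen for illustration only.

```python
# Age classes from the work-disability table, as (lower limit, upper limit) pairs.
classes = [(25, 34), (35, 44), (45, 54), (55, 64)]

lower_limits = [low for low, _ in classes]
upper_limits = [high for _, high in classes]

# Class width: the difference between consecutive lower class limits.
widths = [b - a for a, b in zip(lower_limits, lower_limits[1:])]
class_width = widths[0]

print("Lower class limits:", lower_limits)
print("Upper class limits:", upper_limits)
print("Class width:", class_width)
```

Note that the width is not the upper limit minus the lower limit of one class (34 - 25 = 9); it is measured from one lower limit to the next.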
EXAMPLE Organizing Continuous Data into a
Frequency and Relative Frequency Distribution
The following data represent the time between eruptions
(in seconds) for a random sample of 45 eruptions at the
Old Faithful Geyser in California. Construct a frequency
and relative frequency distribution of the data.
Source: Ladonna Hansen, Park Curator
The smallest data value is 672 and the
largest data value is 738. We will
create the classes so that the lower
class limit of the first class is 670
and the class width is 10 and obtain
the following classes:
670 - 679
680 - 689
690 - 699
700 - 709
710 - 719
720 - 729
730 - 739
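The classes above follow mechanically from the first lower class limit, the class width, and the largest data value. A minimal sketch, assuming integer-valued data (so each upper limit is one less than the next lower limit); `make_classes` and `class_index` are hypothetical helper names.

```python
def make_classes(first_lower_limit, class_width, largest_value):
    """Return (lower, upper) class limits that cover values up to largest_value."""
    classes = []
    lower = first_lower_limit
    while lower <= largest_value:
        classes.append((lower, lower + class_width - 1))
        lower += class_width
    return classes


def class_index(value, first_lower_limit, class_width):
    """Return the position (0-based) of the class that contains value."""
    return (value - first_lower_limit) // class_width


classes = make_classes(670, 10, 738)
for low, high in classes:
    print(f"{low} - {high}")

# The smallest value (672) falls in the first class, the largest (738) in the last.
print(class_index(672, 670, 10), class_index(738, 670, 10))
```

Calling `make_classes(670, 5, 738)` instead gives the narrower classes used for a class width of 5.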
Frequency Table Using class width of 10
Frequency Table Using class width of 5
ii) Frequency or Relative Frequency Histogram
EXAMPLE Constructing a Frequency and Relative
Frequency Histogram for Continuous Data
Using class width of 10: Frequency Histogram
Using class width of 10: Relative Frequency Histogram
Using class width of 5: Frequency Histogram
Stem-and-Leaf Plot for Continuous Data
Construction of a Stem-and-Leaf Plot
Step 1: The stem of the graph will consist of the
leading digits. The leaf of the graph will be the
rightmost digit. The choice of the stem depends
upon the class width desired.
Step 2: Write the stems in a vertical column in
increasing order. Draw a vertical line to the right of
the stems.
Step 3: Write each leaf corresponding to the stems
to the right of the vertical line. The leaves must be
written in ascending order.
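The three steps can be sketched in Python as follows. This is an illustration under stated assumptions, not a reference implementation: it handles non-negative integers only, takes the last digit as the leaf, and uses made-up sample data. The function name `stem_and_leaf` is likewise hypothetical.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Build stem-and-leaf rows for non-negative integers."""
    leaves = defaultdict(list)
    # Step 1: the leading digits form the stem, the rightmost digit is the leaf.
    # Sorting first means the leaves of each stem end up in ascending order (Step 3).
    for value in sorted(values):
        stem, leaf = divmod(value, 10)
        leaves[stem].append(leaf)
    # Step 2: list the stems in increasing order, including stems with no leaves.
    rows = []
    for stem in range(min(leaves), max(leaves) + 1):
        row = "".join(str(leaf) for leaf in leaves.get(stem, []))
        rows.append(f"{stem} | {row}".rstrip())
    return rows


# Hypothetical data, for illustration only.
sample = [67, 72, 85, 75, 89, 89, 88, 90, 99, 41]
print("\n".join(stem_and_leaf(sample)))
```

Stems with no observations (here, stem 5) are still listed, so the plot keeps the same shape a histogram of the data would have.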
EXAMPLE Constructing a Stem-and-Leaf Diagram
The employment ratio is the ratio of employed
individuals to population. It is found by
dividing the number of employed individuals in a
population by the size of the population. The
following data represent the employment ratio
by state in the United States for 1999.
Construct a stem-and-leaf diagram.
We let the stem represent the integer portion of the
number and the leaf represent the decimal portion. For
example, the stem for Alabama will be 60 and the leaf
will be 3.
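For values reported to one decimal place, the split into stem and leaf can be sketched like this. The value 60.3 for Alabama is inferred from the stem and leaf given above; the function name `split_stem_leaf` is an assumption made for this example.

```python
def split_stem_leaf(value):
    """Split a value with one decimal place into (integer part, tenths digit)."""
    tenths = round(value * 10)  # 60.3 -> 603; rounding guards against float drift
    return divmod(tenths, 10)


stem, leaf = split_stem_leaf(60.3)  # Alabama
print(stem, "|", leaf)
```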
[Stem-and-leaf plot, leaves not yet ordered]
Stems: 52 through 73
Leaf rows, top to bottom (stems with no leaves omitted):
7, 05, 47, 301, 56, 897, 3426, 71034, 416166, 817250409, 38, 81, 69, 1031, 401, 0
[Stem-and-leaf plot, leaves written in ascending order]
Stems: 52 through 73
Leaf rows, top to bottom (stems with no leaves omitted):
7, 05, 47, 013, 56, 789, 2346, 01347, 114666, 001245789, 38, 18, 69, 0113, 014, 0
Advantage of Stem-and-Leaf Diagrams over
Histograms
Once a frequency distribution or histogram
of continuous data is created, the raw data
is lost (unless reported with the frequency
distribution). However, the raw data can be
retrieved from the stem-and-leaf plot.
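To make the point concrete, the sketch below rebuilds raw values from a stem-and-leaf plot in which each leaf is a tenths digit. The rows are hypothetical and are not taken from the employment-ratio plot; `values_from_plot` is an invented helper name.

```python
def values_from_plot(rows):
    """Rebuild raw values from a {stem: leaf string} mapping (one decimal digit per leaf)."""
    values = []
    for stem, leaves in rows.items():
        for leaf in leaves:
            values.append((stem * 10 + int(leaf)) / 10)
    return values


# Hypothetical rows, for illustration only.
plot = {60: "35", 61: "", 62: "048"}
print(values_from_plot(plot))
```

A frequency table for the same data would record only how many values fall in each class, so the individual values could not be recovered from it.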
If the value of a variable is measured at different
points in time, the data is referred to as time
series data.
A time series plot is obtained by plotting the time
at which a variable is measured on the horizontal
axis and the corresponding value of the variable
on the vertical axis. Lines are then drawn
connecting the points.
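The construction can be sketched without a plotting library: order the observations by time, then join each point to the next one. The observations below are hypothetical (time, value) pairs used only for illustration, and `connecting_segments` is a made-up name.

```python
def connecting_segments(observations):
    """Pair each (time, value) point with the next point in time."""
    points = sorted(observations)  # time goes on the horizontal axis, so order by time
    return list(zip(points, points[1:]))


# Hypothetical (time, value) observations, for illustration only.
observations = [(3, 11.0), (1, 10.0), (2, 12.5)]
segments = connecting_segments(observations)
for start, end in segments:
    print(start, "->", end)
```

Each pair is one line segment of the plot, which is why n observations produce n - 1 connecting lines.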
The following data represent the closing value of
the Dow Jones Industrial Average for the years
1980 - 2001.
Time Series Plot
Distribution Shapes
• Symmetric
   • Uniform
   • Normal
• Nonsymmetric
   • Skewed to the left
   • Skewed to the right
Distribution Shapes: Uniform
Distribution Shapes: Bell-Shaped
Distribution Shapes: Skewed Right
Distribution Shapes: Skewed Left
EXAMPLE
Identifying the Shape of the Distribution
Identify the shape of the following histogram, which
represents the time between eruptions at Old Faithful.
Answer: Slightly skewed left
Misleading Graphs
Distorted Vertical Scale
Characteristics of Good Graphics
• Clearly label the graphic and provide explanations, if
needed.
• Avoid distortion. Don’t lie about the data.
• Avoid three dimensions. Three-dimensional pie
charts may look nice, but they distract the reader and
often result in misinterpretation of the graphic.
• Do not use more than one design in the same
graphic. Sometimes, graphs use a different design in
a portion of the graphic in order to draw attention to
this area. Don’t use this technique. Let the numbers
speak for themselves.