Chapter 1&2 Power Point slides
Chapter 1
Section 1
Introduction to the
Practice of Statistics
Chapter 1 – Section 1
• The science of statistics is
– Collecting
– Organizing
– Summarizing
– Analyzing
information to draw conclusions or answer
questions
Chapter 1 – Section 1
• Organize and summarize the information
– Descriptive statistics (chapters 2 through 4)
• Draw conclusions or generalizations from the
information
– Inferential statistics (chapters 9 through 11)
Chapter 1 – Section 1
• A population
– Is the group to be studied
– Includes all of the individuals in the group
• A sample
– Is a subset of the population
– Is often used in analyses because getting
access to the entire population is impractical
Chapter 1 – Section 1
• Characteristics of the individuals under
study are called variables
– Some variables have values that are attributes or
characteristics … those are called qualitative or
categorical variables
– Some variables have values that are numeric
measurements … those are called quantitative
variables
• The suggested approaches to analyzing
problems vary by the type of variable
Chapter 1 – Section 1
• Examples of qualitative variables
– Gender
– Zip code
– Blood type
– States in the United States
– Brands of televisions
• Qualitative variables have category values
… those values cannot be added,
subtracted, etc.
Chapter 1 – Section 1
• Examples of quantitative variables
– Temperature
– Height and weight
– Sales of a product
– Number of children in a family
– Points achieved playing a video game
• Quantitative variables have numeric
values … those values can be added,
subtracted, etc.
Chapter 1 – Section 1
• Quantitative variables can be either
discrete or continuous
• Discrete variables
– Variables that have a finite or a countable number of
possibilities
– Frequently variables that are counts
• Continuous variables
– Variables that have an infinite but not countable
number of possibilities
– Frequently variables that are measurements
Chapter 1 – Section 1
• Examples of discrete variables
– The number of heads obtained in 5 coin flips
– The number of cars arriving at a McDonald’s between
12:00 and 1:00
– The number of students in class
– The number of points scored in a football game
• The possible values of discrete
variables can be listed
Chapter 1 – Section 1
• Examples of continuous variables
– The distance that a particular model car can drive on
a full tank of gas
– Heights of college students
Summary: Chapter 1 – Section 1
• The process of statistics is designed to
collect and analyze data to reach
conclusions
• Variables can be classified by their type of
data
– Qualitative or categorical variables
– Discrete quantitative variables
– Continuous quantitative variables
Chapter 2
Organizing and
Summarizing Data
Chapter 2 Sections
• Sections in Chapter 2
– Organizing Qualitative Data
– Organizing Quantitative Data
– Graphical Misrepresentations of Data
Chapter 2
Section 1
Organizing
Qualitative Data
Chapter 2 – Section 1
• Qualitative data values can be organized
by a frequency distribution
• A frequency distribution lists
– Each of the categories
– The frequency for each category
Chapter 2 – Section 1
• A simple data set is
blue, blue, green, red, red, blue, red, blue
• A frequency table for this qualitative data
is
Color     Frequency
Blue      4
Green     1
Red       3
• The most commonly occurring color is
blue
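The tally above can be reproduced with a short Python sketch. It uses the standard-library `collections.Counter`; the variable names are illustrative and not part of the slides.

```python
from collections import Counter

# The simple data set from the slide.
colors = ["blue", "blue", "green", "red", "red", "blue", "red", "blue"]

# Counter tallies how many times each category occurs.
frequencies = Counter(colors)

# Print the categories in alphabetical order, one row per category.
for color in sorted(frequencies):
    print(f"{color.capitalize():<8}{frequencies[color]}")

# most_common(1) returns the category with the highest frequency.
most_common_color, count = frequencies.most_common(1)[0]
print("Most common:", most_common_color)
```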
Chapter 2 – Section 1
• The relative frequencies are the
proportions (or percents) of the
observations out of the total
• A relative frequency distribution lists
– Each of the categories
– The relative frequency for each category
Chapter 2 – Section 1
• A relative frequency table for this
qualitative data is
Color     Relative Frequency
Blue      .500
Green     .125
Red       .375
• A relative frequency table can also be
constructed with percents (50%, 12.5%,
and 37.5% for the above table)
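As a minimal sketch, the relative frequencies can be computed by dividing each count by the total number of observations. The data set is the one from the slides; the variable names are illustrative.

```python
from collections import Counter

# The simple data set from the slides.
colors = ["blue", "blue", "green", "red", "red", "blue", "red", "blue"]

frequencies = Counter(colors)
total = sum(frequencies.values())  # number of observations

# Relative frequency = frequency of the category / total observations.
relative_frequencies = {color: n / total for color, n in frequencies.items()}

# Show each relative frequency as a proportion and as a percent.
for color in sorted(relative_frequencies):
    rel = relative_frequencies[color]
    print(f"{color.capitalize():<8}{rel:.3f}  ({rel:.1%})")
```

The relative frequencies always sum to 1 (or 100%), which is a quick check on the table.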
Chapter 2 – Section 1
• Bar graphs for our simple data (using
Excel)
– Frequency bar graph
– Relative frequency bar graph
Chapter 2 – Section 1
• A Pareto chart is a particular type of bar graph
• A Pareto chart differs from an ordinary bar graph only in
that the categories are arranged in order of decreasing frequency
– The category with the highest frequency is placed first
(on the extreme left)
– The second highest category is placed second
– Etc.
• Pareto charts are often used when there are many
categories but only the top few are of interest
Chapter 2 – Section 1
• A Pareto chart for our simple data (using
Excel)
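The slides build the chart in Excel. As a rough, text-only sketch, the Pareto ordering itself is just a sort by descending frequency. The counts are those of the simple color data; the `#` bars stand in for a real chart.

```python
# Frequencies of the simple color data set.
frequencies = {"Blue": 4, "Green": 1, "Red": 3}

# Pareto order: highest frequency first, then second highest, and so on.
pareto_order = sorted(frequencies.items(), key=lambda item: item[1], reverse=True)

# Text-only bar graph: one '#' per observation.
for color, count in pareto_order:
    print(f"{color:<6}{'#' * count} {count}")
```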
Chapter 2 – Section 1
• An example side-by-side bar graph
comparing educational attainment in 1990
versus 2003
Chapter 2 – Section 1
• An example of a pie chart
Chapter 2
Section 2
Organizing Quantitative
Data:
Chapter 2 – Section 2
• Consider the following data
• We would like to compute the frequencies
and the relative frequencies
Chapter 2 – Section 2
• The resulting frequencies and the relative
frequencies
Chapter 2 – Section 2
• Example of histograms for discrete data
– Frequencies
– Relative frequencies
Chapter 2 – Section 2
• Continuous data cannot be put directly into
frequency tables since they do not have
any obvious categories
• Categories are created using classes, or
intervals of numbers
• The continuous data is then put into the
classes
Chapter 2 – Section 2
• For ages of adults, a possible set of classes is
20 – 29
30 – 39
40 – 49
50 – 59
60 and older
• For the class 30 – 39
– 30 is the lower class limit
– 39 is the upper class limit
• The class width is the difference between consecutive
lower class limits
• For the class 30 – 39, the class width is
40 – 30 = 10
Chapter 2 – Section 2
• All the classes have the same widths,
except for the last class
• The class “60 and older” is an open-ended
class because it has no upper limit
• Classes with no lower limits are also called
open-ended classes
Chapter 2 – Section 2
• The classes and the number of values in
each can be put into a frequency table
Age             Number (frequency)
20 – 29         533
30 – 39         1147
40 – 49         1090
50 – 59         493
60 and older    110
• In this table, there are 1147 subjects
between 30 and 39 years old
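A sketch of how individual ages could be sorted into these classes follows. The ages are made up for illustration (the slides give only the final counts), and `class_label` is a hypothetical helper name.

```python
# Hypothetical ages, made up for illustration (not the data behind the table).
ages = [23, 31, 38, 45, 52, 67, 29, 40, 34, 71]

# Lower class limits from the slides; the last class is open-ended.
lower_limits = [20, 30, 40, 50, 60]

def class_label(age):
    """Return the label of the class that contains the given age."""
    if age < lower_limits[0]:
        raise ValueError(f"age {age} is below the lowest class")
    if age >= lower_limits[-1]:
        return f"{lower_limits[-1]} and older"
    for lower, next_lower in zip(lower_limits, lower_limits[1:]):
        if lower <= age < next_lower:
            return f"{lower} - {next_lower - 1}"

# Class width, computed as in the slides' example: 40 - 30 = 10.
class_width = lower_limits[2] - lower_limits[1]

# Tally the number of ages that fall in each class.
frequency = {}
for age in ages:
    label = class_label(age)
    frequency[label] = frequency.get(label, 0) + 1

for label, count in frequency.items():
    print(f"{label:<14}{count}")
```

Because each age is compared against consecutive lower limits, the classes cannot overlap and leave no gaps between them.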
Chapter 2 – Section 2
• Good practices for constructing tables for
continuous variables
– The classes should not overlap
– The classes should not have any gaps between them
– The classes should have the same width (except for
possible open-ended classes at the extreme low or
extreme high ends)
– The class boundaries should be “reasonable”
numbers
– The class width should be a “reasonable” number
Chapter 2 – Section 2
• Just as for discrete data, a histogram can
be created from the frequency table
• Instead of individual data values, the
categories are the classes – the intervals
of data
Chapter 2 – Section 2
• A stem-and-leaf plot is a different way to
represent data that is similar to a histogram
• To draw a stem-and-leaf plot, each data value
must be broken up into two components
– The stem consists of all the digits except for the
rightmost one
– The leaf consists of the rightmost digit
– For the number 173, for example, the stem would be
“17” and the leaf would be “3”
Chapter 2 – Section 2
• In the stem-and-leaf plot below
– The smallest value is 56
– The largest value is 180
– The second largest value is 178
Chapter 2 – Section 2
• To draw a stem-and-leaf plot
– Write all the values in ascending order
– Find the stems and write them vertically in ascending
order
– For each data value, write its leaf in the row next to its
stem
– The resulting leaves will also be in ascending order
• The list of stems with their corresponding leaves
is the stem-and-leaf plot
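The steps above can be sketched in Python as follows. The data values are hypothetical, the sketch assumes non-negative whole numbers, and for brevity it prints only stems that have at least one leaf.

```python
from collections import defaultdict

# Hypothetical data values, made up for illustration.
values = [173, 168, 159, 181, 173, 165, 152, 176]

def stem_and_leaf(data):
    """Map each stem (all digits but the last) to its leaves in ascending order."""
    plot = defaultdict(list)
    for value in sorted(data):          # write the values in ascending order
        stem, leaf = divmod(value, 10)  # e.g. 173 -> stem 17, leaf 3
        plot[stem].append(leaf)
    return dict(plot)

plot = stem_and_leaf(values)

# Write the stems vertically in ascending order, each followed by its leaves.
for stem in sorted(plot):
    leaves = " ".join(str(leaf) for leaf in plot[stem])
    print(f"{stem:>3} | {leaves}")
```

Since the values are sorted before they are split, the leaves in each row come out in ascending order automatically.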
Chapter 2 – Section 2
• Modifications to stem-and-leaf plots
– Sometimes there are too many values with
the same stem … we would need to split the
stems (such as having 10-14 in one stem and
15-19 in another)
– If we wanted to compare two sets of data, we
could draw two stem-and-leaf plots using the
same stem, with leaves going left (for one set
of data) and right (for the other set)
Chapter 2 – Section 2
• A dot plot is a graph where a dot is placed
over the observation each time it is
observed
• The following is an example of a dot plot
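A text-only sketch of the idea: one mark is stacked over a value each time that value is observed. The observations are made up for illustration, `dot_plot_rows` is a hypothetical helper, and the simple alignment assumes single-digit whole numbers.

```python
from collections import Counter

# Hypothetical observations, made up for illustration.
observations = [1, 2, 2, 3, 3, 3, 4, 4, 6]

def dot_plot_rows(data):
    """Return the rows of a text dot plot, top row first, axis last."""
    counts = Counter(data)
    axis = range(min(data), max(data) + 1)  # include values never observed
    height = max(counts.values())
    rows = []
    for level in range(height, 0, -1):
        # Mark a value on this row if it was observed at least `level` times.
        rows.append(" ".join("*" if counts[v] >= level else " " for v in axis))
    rows.append(" ".join(str(v) for v in axis))
    return rows

for row in dot_plot_rows(observations):
    print(row)
```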
Chapter 2 – Section 2
• A useful way to describe a variable is by
the shape of its distribution
• Some common distribution shapes are
– Uniform
– Bell-shaped (or normal)
– Skewed right
– Skewed left
Chapter 2 – Section 2
• A variable has a uniform distribution when
– Each of the values tends to occur with the
same frequency
– The histogram looks flat
Chapter 2 – Section 2
• A variable has a bell-shaped distribution
when
– Most of the values fall in the middle
– The frequencies tail off to the left and to the
right
– It is symmetric
Chapter 2 – Section 2
• A variable has a skewed right distribution
when
– The distribution is not symmetric
– The tail to the right is longer than the tail to the left
– The arrow from the middle to the long tail points right
Chapter 2 – Section 2
• A variable has a skewed left distribution when
– The distribution is not symmetric
– The tail to the left is longer than the tail to the right
– The arrow from the middle to the long tail points left
Summary: Chapter 2 – Section 2
• Quantitative data can be organized in
several ways
– Histograms based on data values are good
for discrete data
– Histograms based on classes (intervals) are
good for continuous data
– The shape of a distribution describes a
variable … histograms are useful for
identifying the shapes
Chapter 2
Section 3
Graphical
Misrepresentations
of Data
Chapter 2 – Section 3
• The two graphs show the same data … the
difference seems larger for the graph on the left
• The vertical scale is truncated on the left
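To see why truncating the vertical scale exaggerates a difference, compare how tall two bars are drawn under each scale. The numbers below are hypothetical (the slide's own values are not in the transcript), and `drawn_height_ratio` is an illustrative helper.

```python
def drawn_height_ratio(smaller, larger, axis_start=0.0):
    """Ratio of the drawn bar heights when the vertical axis starts at axis_start."""
    if axis_start >= smaller:
        raise ValueError("axis_start must be below both values")
    return (larger - axis_start) / (smaller - axis_start)

# Hypothetical values: 50 versus 55, a 10% difference.
full_scale = drawn_height_ratio(50, 55)       # axis starts at 0
truncated = drawn_height_ratio(50, 55, 45)    # axis starts at 45

print(f"Full scale: taller bar is {full_scale:.1f} times as tall")
print(f"Truncated:  taller bar is {truncated:.1f} times as tall")
```

With the axis starting at 45, the same 10% difference is drawn as a bar twice as tall.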
Chapter 2 – Section 3
• The gazebo on the right is twice as large
in each dimension as the one on the left
• However, its area is four times as large, so
it looks much more than twice as large
[Figure: the original gazebo (left) and the gazebo drawn “twice” as large (right)]
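The arithmetic behind this effect is short: scaling every dimension by the same factor scales area by the factor squared and volume by the factor cubed. A minimal sketch, assuming the picture is read either as a flat shape (area) or as a solid object (volume):

```python
scale = 2  # each dimension is doubled

area_ratio = scale ** 2    # how many times larger a flat picture appears
volume_ratio = scale ** 3  # how many times larger a solid object appears

print(f"Area is {area_ratio} times as large")
print(f"Volume is {volume_ratio} times as large")
```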
Summary: Chapter 2 – Section 1
• Qualitative data can be organized in
several ways
– Tables are useful for listing the data, its
frequencies, and its relative frequencies
– Charts such as bar graphs, Pareto charts, and
pie charts are useful visual methods for
organizing data
– Side-by-side bar graphs are useful for
comparing two sets of qualitative data