Data Distributions
Interactive Presentation
Data Collection and Frequency Tables
1. Why does sample size matter?
2. How could the way data is collected affect
answers to survey questions?
3. What are some ways to make random
selection and why is randomness desirable?
Vocabulary
• Data – facts or numbers that are
collected
Types of Data
• Categorical data – data that is a name or
category
• Numerical data – data that is a number
Vocabulary
Population – the entire group you want
to find information about
EXAMPLE:
• Sample – a group of people within a
population
EXAMPLE:
Sample or population?
REMEMBER!
• Sample statistics tend to become more
accurate as sample size INCREASES!
Vocabulary
• Survey – given to investigate behaviors
or opinions by questioning a sample from
the population
Vocabulary
• Census – a survey of an entire population
Vocabulary
• Parameter – a measured characteristic of
a population
EXAMPLE: A number that represents the
average shoe size of ALL 7th graders
• Statistic – a measured characteristic of a
sample
EXAMPLE: The average shoe size of our class
(the representative length from the sample)
Vocabulary Review
Definition
Facts or numbers that are collected
A measured characteristic of a sample
Data that is a number
Given to investigate opinions or behaviors
by questioning a group of people
A group of people within a population
Survey of an entire population
A measured characteristic of a population
Data that is a name or category
The group you want to find information
about
Discussion 1
• A school principal wants to know the
average amount of time it takes her
students to reach school each morning.
To find this out, she asked 20 students in
each grade “How long does it usually take
you to reach school in the morning?”
Explain how the words population,
sample, data, and survey fit this
situation.
Discussion 2
• An automotive shop has 25 workers. The
owner wants to reward his workers with a
company outing. He is considering a day
at a baseball game, a day at an
amusement park, or a dinner for the
workers at a restaurant. He decides to
conduct a survey so he can make the best
choice.
Formulate a single question he could
ask. Should he use a sample or a
census?
Frequency Table
• After you choose a question, you need to
collect and organize your data. A good
way to do this is to use a frequency table
(frequency distribution).
• Frequency distribution (frequency table)
– a table that organizes data to show how
many times each item or group of items
appears
Your Turn!
• Maxine took a census of all the
students in Ms. Alvarez’s class.
The data below show the
number of pets owned by each
student:
0, 1, 3, 2, 1, 4, 2, 1, 0, 3, 5, 2, 2, 1,
3, 2, 1, 4, 5, 0, 0, 1, 2, 1, 2
Organize the data in an ungrouped
frequency table. Use the data
to determine how many more
students have 1 pet than have
no pets.
# of pets | Tallies | Frequency
0 | |
1 | |
2 | |
3 | |
4 | |
5 | |
Questions:
1. _______ students have 1 pet.
2. _______ students have no pets.
3. How many more students have 1
pet than have no pets? _______
4. The data are organized in the
frequency table above. The data
show that _____ more students
have 1 pet than have no pets.
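The tallying in this exercise can also be checked with a short program. The sketch below is one possible way to do it in Python (the lesson itself uses paper tallies, so the language and the variable names are assumptions). It counts how many times each value appears in Maxine's data.

```python
from collections import Counter

# Number of pets owned by each student (Maxine's census data)
pets = [0, 1, 3, 2, 1, 4, 2, 1, 0, 3, 5, 2, 2, 1,
        3, 2, 1, 4, 5, 0, 0, 1, 2, 1, 2]

# Counter tallies how many times each value appears
frequency = Counter(pets)

print("# of pets | Frequency")
for value in sorted(frequency):
    print(f"{value} | {frequency[value]}")

# How many more students have 1 pet than have no pets
difference = frequency[1] - frequency[0]
print("Difference:", difference)
```

The frequencies should add up to the number of students in the class, which is a quick way to check the table.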
• Try another frequency table problem:
A survey of 200 people asked “On your dream vacation,
how would you get where you are going?” The results
are shown in the frequency table:
Transportation | Number of people
Airplane | 125
Automobile | 6
Boat | 42
Train | 27
1. What percent of those surveyed chose boat?
Challenge Question:
2. What percent did not choose airplane?
Answers:
1. 21% of those surveyed chose boat
2. 37.5% of those surveyed did not choose airplane
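The two percents come from the same rule: percent = part ÷ whole × 100. The sketch below works through both questions in Python (an assumed choice of language; the variable names are illustrative).

```python
# Survey results from the frequency table
results = {"Airplane": 125, "Automobile": 6, "Boat": 42, "Train": 27}

# The whole: everyone who was surveyed
total = sum(results.values())

# Question 1: the part is the people who chose boat
percent_boat = 100 * results["Boat"] / total

# Question 2: the part is everyone EXCEPT the airplane group
percent_not_airplane = 100 * (total - results["Airplane"]) / total

print(f"Boat: {percent_boat}%")
print(f"Not airplane: {percent_not_airplane}%")
```

Adding the automobile, boat, and train counts gives the same part for Question 2 as subtracting the airplane count from the total.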
Getting the Idea
• A frequency distribution presents data in a
table. It is easy to read the data in a
frequency distribution, but it is not easy to
get the “whole picture” from the list of
numbers. Graphs are used to show data.
We will show you a variety of graphs you
can use to display your data later on in this
unit!
Ticket-out-the-door
• The 2,000 members of
a club were mailed
postcards, asking them
to suggest locations for
next year’s annual
meeting. Only 150
returned the postcards.
How do the new ideas
from this lesson fit this
situation?
Frequency table
sample
1. The 2,000 members of the
club represent
_____________.
2. The 150 members who sent
back the postcards represent
the ___________.
3. What the members write on
the postcards is called
________.
4. The act of collecting the
information on the postcards is
called a _________.
5. A good way to organize this
data is to use a ________
__________.
• How can I describe and interpret a data
set in a meaningful way?
• VOCABULARY: central tendency, mean,
median, mode
Measures of central tendency:
1.Mean
2.Median
3.Mode
Vocabulary
These are
measures of
central tendency!
• Mean – the average (add up the values
and divide by the # of values)
• Median – the middle number in an ordered list of
numbers (Hint: write the numbers in order; if there
are two middle numbers, find their mean)
• Mode – the value that occurs most often
EXAMPLE 1
• Find the mean,
median, and mode of
the data in the table:
9, 8, 9, 8, 7, 8, 9, 10,
10, 7, 8, 9, 8, 8, 10, 8,
8, 9, 10, 8, 8, 10, 9, 9, 9
*Hint to help with
mean*
Use the frequency
column to find the
TOTAL number of
students
Score | Frequency
10 |
9 |
8 |
7 |
Example 1 Answers:
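One way to check Example 1 is sketched below in Python (an assumed language; the names are illustrative). It builds the frequency column from the listed scores, then finds the mean from the table by multiplying each score by its frequency, as the hint suggests.

```python
from collections import Counter
from statistics import median, mode

# Scores listed in Example 1
scores = [9, 8, 9, 8, 7, 8, 9, 10,
          10, 7, 8, 9, 8, 8, 10, 8,
          8, 9, 10, 8, 8, 10, 9, 9, 9]

# Frequency column: how many students earned each score
frequency = Counter(scores)
total_students = sum(frequency.values())

# Mean from the table: add (score x frequency), then divide by the total
table_mean = sum(score * count for score, count in frequency.items()) / total_students

# Median: middle value of the ordered list; mode: most frequent value
middle_score = median(scores)
most_common_score = mode(scores)

print(dict(sorted(frequency.items(), reverse=True)))
print(table_mean, middle_score, most_common_score)
```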
EXAMPLE 2
Zack wants to have a mean score of 80 on his
health quizzes. He scored 70, 75, 82, and
90 on his first four quizzes. What score
must he earn on his fifth quiz to have a
mean score of exactly 80 for all five
quizzes?
SMART STRATEGY:
Use what you know about MEAN!
Step 1: Find the sum of the 4 scores you know.
Step 2: Find the sum if Zack has a mean score of 80 on all 5 quizzes.
Step 3: What number would you need to add to 317 to get a sum of 400?
Step 4: Check your answer
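The four steps can be followed line by line in a short sketch. Python is an assumed choice here, and the variable names are made up for illustration.

```python
known_scores = [70, 75, 82, 90]   # Zack's first four quizzes
target_mean = 80
number_of_quizzes = 5

# Step 1: sum of the 4 scores you know
current_sum = sum(known_scores)

# Step 2: sum needed for a mean of 80 on all 5 quizzes
required_sum = target_mean * number_of_quizzes

# Step 3: the missing score is the gap between the two sums
needed_score = required_sum - current_sum

# Step 4: check by recomputing the mean with the fifth score included
check_mean = (current_sum + needed_score) / number_of_quizzes

print(current_sum, required_sum, needed_score, check_mean)
```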
Example 3
• This stem and leaf plot shows
the number of miles Jamal
biked per week for each of the
past 10 weeks:
Stem | Leaf
3 | 6 6
4 | 0 3 3 5 7 8 9
5 | 3
Key: 5 | 3 = 53 miles
This week, Jamal was ill so he
only biked 11 miles. How does
this change the median and
mean of the data?
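To see the effect of the 11-mile week, the mean and median can be computed before and after adding it. This is a rough sketch in Python (an assumed language); the values are read from the stem and leaf plot.

```python
from statistics import mean, median

# Miles per week read from the stem and leaf plot (10 weeks)
miles = [36, 36, 40, 43, 43, 45, 47, 48, 49, 53]

# Add the week Jamal was ill
miles_with_ill_week = miles + [11]

before = (mean(miles), median(miles))
after = (mean(miles_with_ill_week), median(miles_with_ill_week))

print("Before (mean, median):", before)
print("After  (mean, median):", after)
```

Comparing the two results shows which measure is pulled further by one unusually low value.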
Which would be the best measure
for each situation?
1. Would you use mean, median, or mode
to describe the typical selling price of a
bicycle?
2. Would you use mean, median, or mode
to determine the most popular toy sold at
a store?
MMMR Rap
• M to the M to the M to the R,
Remember this rhyme and
you’ll go far
• Mode, Median, Mean & Range,
Now singing this song might
feel strange.
• Mode, Mode now I’ve been
told, is the number you will see
the most
• Median now he’s the man, the
one in the middle, line HIM up
the best you can
• From small to large, small to
large remember this & you're in
charge
• Now mean mean you may
wonder, just add add add all
your numbers
• Then you just simply divide &
you’ll have one number to your
surprise
• Last but not least is our friend
the range
• He’s not the best & he’s kind of
strange
• You start with the high &
subtract the low, that’s the
range now that’s fo sho!
• EQ: What are measures of
variation?
VOCABULARY: Variation, Range,
Quartiles, Interquartile Range, outlier, 5
Number Summary
Vocabulary
• Variability – How a data set is spread out
• Range – The difference between the
greatest and least values in a data set
27, 39, 40, 22, 19,
41, 58, 40, 53, 49
*HINT: Largest Number – Smallest Number
58 – 19 = 39
• Quartile: The three numbers that split an
ordered data set in four equal groups
Lower Quartile (median of the lower half of the data)
The median of the data set
Upper Quartile (median of the upper half of the data)
• 5 Number Summary: the 5 numbers that
divide a set of data into 4 equal groups.
1. Minimum or Lower Extreme
2. Lower Quartile (Q1)
3. Median (Q2)
4. Upper Quartile (Q3)
5. Maximum or Upper Extreme
• Interquartile range: The difference
between the first and third quartiles. (Note
that the first and third quartiles are
sometimes called the lower and upper
quartiles.)
IQR = UQ - LQ
• Outlier – a number that is much greater
than or much less than the rest of the
numbers in a data set
EXAMPLE 1
Below are the weekly earnings for eight
Kroger Employees. Find all measures of
variation:
$260, $175, $215, $350, $320, $235, $240,
$280
You are looking for:
1. Lower extreme
2. Quartile 1
3. Median
4. Quartile 3
5. Upper Extreme
6. Range
7. IQR
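All seven measures can be found with the "median of each half" method from the vocabulary slides. The sketch below is one possible version in Python (an assumed language). The function name `five_number_summary` is made up for illustration, and leaving the middle value out of both halves when the count is odd is an assumption about the convention used in class.

```python
from statistics import median

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) using the median of each half."""
    ordered = sorted(values)
    half = len(ordered) // 2
    lower_half = ordered[:half]                  # values below the middle
    upper_half = ordered[len(ordered) - half:]   # values above the middle
    return (ordered[0], median(lower_half), median(ordered),
            median(upper_half), ordered[-1])

# Weekly earnings from Example 1
earnings = [260, 175, 215, 350, 320, 235, 240, 280]

low, q1, med, q3, high = five_number_summary(earnings)
data_range = high - low   # greatest value minus least value
iqr = q3 - q1             # IQR = UQ - LQ

print(low, q1, med, q3, high, data_range, iqr)
```

Quartile functions built into software sometimes follow a different convention, so their results may not match the hand method exactly.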
EQ: How can I use box-and-whisker plots to display
and analyze data?
• Box-and-Whisker Plot: a graph that displays the five number
summary of a data set, organized into quartiles
The box-and-whisker plot below shows the weights, in pounds, of the
dogs that were weighed this morning at a veterinarian’s office.
Approximately what percent of the dogs weighed less than 25
pounds?
[Box-and-whisker plot of dog weights: number line from 0 to 100 pounds, marked in tens]
1. The box-and-whisker plot shows that the lower quartile of the data is
_______ pounds.
2. The lower quartile is the median of the lower __________ of the
data set.
3. The quartiles divide the data into ____________.
4. What fraction of the data is less than the lower quartile? _______
5. What percent is equivalent to that fraction?
The double box-and-whisker plot below shows the number of points
scored in games by two basketball players on the same team. Find
the range and interquartile range for each player. Who was the
most consistent scorer?
Step 1: Put the data in order from least to greatest
Step 2: Find the median
Step 3: Find the Lower Quartile
Step 4: Find the Upper Quartile
Step 5: Draw a number line
Step 6: Place a point above the median, lower quartile,
and upper quartile
Step 7: Draw a box (with a vertical line thru the median)
Step 8: Place a point above the lower extreme
Step 9: Place a point above the upper extreme
Step 10: Draw the whiskers
• 15 shoppers rated a brand of paper towel
on a scale from 0-10
2, 6, 6, 6, 7, 8, 8, 8, 9, 10, 10
• How do I collect data on a population that
is too large to study?
• VOCABULARY: sample, population
Vocabulary
A sample is a _________ selected group
that is ___________ of the population.
If the sample is __________ of the
population, then the measures of central
tendency and of variation for the
________ and the __________ should be
similar.
The larger the _______ size, the more
accurate the _________.
Example 1
Owen took a random sample of 10 students who take piano
lessons at a music school and recorded their ages. The
director of the school took a census of all 30 students
who take piano lessons at the school and recorded their
ages. Owen’s sample data and the director’s census
data are shown below:
Owen’s Sample Data: 5, 12, 12, 12, 12, 13, 13, 17, 17, 25
Director’s Census Data: 5, 7, 8, 9, 10, 10, 11, 11, 11, 11,
12, 12, 12, 12, 12, 13, 13, 13, 13, 14, 15, 16, 17, 17, 17,
18, 19, 21, 21, 25, 30
Find and compare the mean, median, mode,
and range of the sample and the census.
Example 2
• Which of the two samples has measures
that are closer to those of the actual
population?
Example 3
The manager of an online bookstore kept track of the number of
books in each box that was shipped for 100 orders. His assistant
randomly selected two samples from his data and calculated the
mean and median for each:
Sample A: 4, 7, 9, 9, 10, 11, 12, 15, 20, 26
Sample B: 1, 4, 4, 9, 12
Which sample is more likely to have a mean and a median that
are good approximations of the actual mean (12.5) and the
actual median (11.5) of the population? Calculate the mean
and median of each sample to determine if your guess was
correct or not.
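After making a guess, the calculation for both samples can be checked with a short sketch. Python is an assumed choice; the dictionary layout and the "error" idea (distance from the population value) are illustrative, not part of the original lesson.

```python
from statistics import mean, median

# Actual population values given in the example
population_mean = 12.5
population_median = 11.5

samples = {
    "A": [4, 7, 9, 9, 10, 11, 12, 15, 20, 26],
    "B": [1, 4, 4, 9, 12],
}

summary = {}
for name, values in samples.items():
    summary[name] = {
        "mean": mean(values),
        "median": median(values),
        # Distance from the population value: smaller is a better approximation
        "mean_error": abs(mean(values) - population_mean),
        "median_error": abs(median(values) - population_median),
    }

for name, stats in summary.items():
    print(name, stats)
```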
• How can I best organize categorical and
numerical data?
• VOCABULARY: categorical data,
numerical data, line plot, pictograph
Vocabulary
• Categorical data – data that is a name or
category
• Numerical data – data that is a number
What are
some
examples?
Vocabulary
• Line plot– each data item is shown as a
mark above a number line; good for
showing numerical data
Class Example: How many brothers and sisters do you have?
• Pictograph – a graph that shows data using symbols or pictures
Example 2
• Jenny keeps statistics during basketball
practice. She recorded the number of free
throws each player on the team
successfully made out of 15 attempts. Her
data are listed below:
10, 14, 15, 12, 12, 9, 8, 14, 12, 5, 13, 10, 10,
12, 11
Create a line plot to display these data.
Then identify the mode.
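A line plot can be roughed out in plain text by stacking one X per data item above each number. The sketch below is one possible way to do that in Python (an assumed language); the helper name `line_plot_rows` is made up for illustration.

```python
from collections import Counter
from statistics import multimode

# Free throws made out of 15 attempts (Jenny's data)
free_throws = [10, 14, 15, 12, 12, 9, 8, 14, 12, 5, 13, 10, 10, 12, 11]
counts = Counter(free_throws)

def line_plot_rows(counts):
    """Build the rows of a text line plot, top row first, number line last."""
    numbers = range(min(counts), max(counts) + 1)
    tallest = max(counts.values())
    rows = []
    for level in range(tallest, 0, -1):
        # Put an X above a number only if it has at least `level` data items
        rows.append(" ".join(" X" if counts[n] >= level else "  " for n in numbers))
    rows.append(" ".join(f"{n:>2}" for n in numbers))
    return rows

for row in line_plot_rows(counts):
    print(row)

# The mode is the number with the tallest stack of X marks
modes = multimode(free_throws)
print("Mode:", modes)
```

`multimode` returns a list, so it would also report ties if two stacks were equally tall.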
Example 3
Leslie surveyed a sample of her classmates. She asked them to name
the number of different states they have lived in. She displayed the
results of her survey in the line plot below:
Identify any outliers for the data. Then find the
median and the range, with and without the
outlier(s). Does removing the outlier(s) change
those measures?
• How can I collect, organize, and analyze
data in a meaningful way?
• VOCABULARY: histogram, bar graph
• Bar graph – uses bars to display categorical data
• The bars have spaces between them
• All the bars are the same width
[Bar graph: Washington Warriors Victories; horizontal axis: Year; vertical axis: Number of Victories]
Steps to making a BAR GRAPH
1. Study your data from the frequency table and determine a scale
2. Draw and label the graph.
Day | Visitors
M | 115
T | 113
W | 133
R | 56
F | 84
Your turn to try …
• Using the
frequency table
below, draw a bar
graph
School Days Per Year
Country | School Days
Belgium | 175
Japan | 243
Nigeria | 190
S. Korea | 220
USA | 180
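The two bar graph steps (determine a scale, then draw and label) can be tried out in text form before drawing on paper. This is only a sketch in Python (an assumed language). The scale step of 50 days and the choice of one `#` for every 10 school days are assumptions, not values from the lesson.

```python
# School days per year from the frequency table
school_days = {"Belgium": 175, "Japan": 243, "Nigeria": 190,
               "S. Korea": 220, "USA": 180}

# Step 1: study the data and determine a scale that covers the largest value
step = 50                                  # assumed distance between scale marks
largest = max(school_days.values())
scale_top = -(-largest // step) * step     # round up to the next multiple of step
print("Scale: 0 to", scale_top, "in steps of", step)

# Step 2: draw and label the graph (one '#' for every 10 full days)
days_per_mark = 10
bars = {country: "#" * (days // days_per_mark)
        for country, days in school_days.items()}

for country, bar in bars.items():
    print(f"{country:<9}| {bar} {school_days[country]}")
```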
• Histogram – uses bars to show the frequency of
data within equal intervals
Since the
intervals leave no
gaps, the bars of
a histogram do
not have spaces
between them!!
Steps to Creating a Histogram
1. Draw and label the axes of your histogram
2. List the intervals from the frequency table on the horizontal axis
3. Use the totals from the table to set the scale on the vertical axis
4. Draw the bar for each interval
5. The bars should be touching, the same width, and shaded
• Example: Top 30 requested songs
Weeks | Frequency
1-5 | 4
6-10 | 11
11-15 | 9
16-20 | 4
21-25 | 0
26-30 | 2
Example 1: Double Bar Graph
The double bar graph shows the
number of tickets sold by four
theatres yesterday. What was
the mean number of tickets
sold by these theatres?
Example 2: Histogram
• The number of words that students in a typing class
can type in a minute are listed below. First make a
frequency table and then a histogram of the data.
• 25, 19, 23, 29, 34, 26, 30, 34, 33, 20, 35, 35, 25, 29, 36, 22, 34, 15
Question 1: What percent of students can type 30 or more
words per minute?
Question 2: How many students type 24 or fewer
words per minute?
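The frequency table for a histogram groups the data into equal intervals. The sketch below shows one way to build it in Python (an assumed language). The interval width of 5 starting at 15 is an assumption; the slide does not say which intervals to use.

```python
# Words per minute for the typing class
words_per_minute = [25, 19, 23, 29, 34, 26, 30, 34, 33,
                    20, 35, 35, 25, 29, 36, 22, 34, 15]

# Assumed equal intervals: 15-19, 20-24, 25-29, 30-34, 35-39
start, width = 15, 5
intervals = {}
for low in range(start, max(words_per_minute) + 1, width):
    high = low + width - 1
    # Count the data values that fall inside this interval
    intervals[f"{low}-{high}"] = sum(low <= w <= high for w in words_per_minute)

for label, count in intervals.items():
    print(f"{label} | {count}")

# Question 1: percent of students typing 30 or more words per minute
count_30_or_more = sum(w >= 30 for w in words_per_minute)
percent_30_or_more = 100 * count_30_or_more / len(words_per_minute)

# Question 2: number of students typing 24 or fewer words per minute
count_24_or_fewer = sum(w <= 24 for w in words_per_minute)

print(round(percent_30_or_more, 1), count_24_or_fewer)
```

With these intervals, both questions can also be answered straight from the table by adding the frequencies of the matching rows.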
EQ: How can I use line graphs and circle graphs to
display and analyze data?
• Line graph – a type of graph that shows change
over time using a line connecting data points
Shows trends
over time!!
People at the Sandwich Shop
During what time interval did the greatest number of people
come into the sandwich shop? By how much did it increase?
• Circle Graph – displays categories of data as parts of a whole
Shows Percents!
Example 1
As they exited the voting booths, 2,000 people were asked to
identify the mayoral candidate for whom they had voted. Of the
people surveyed, how many voted for Milton? How many voted
for Johnson? How many voted for Dunbar?
Example 2
Mandy asked a sample of students at her school to name their
favorite subject. Her results are shown above. If 12 students
chose social studies as their favorite subject, what is the
total number of students surveyed?
SMART STRATEGY: Set up a PROPORTION!
Making a Circle Graph
Type of Movie | Number of Students | Percent of Total (# of students/Total) | Degrees in a circle | Size of angle (Percent x 360)
Funny | | | |
Scary | | | |
Romantic | | | |
Action | | | |
Interactivate: Circle Graph
Ticket-out-the-door
• Create your own circle graph based on the survey
data below. *HINT: A total of 50 students were
surveyed!!
Favorite type of ice cream | Number of Students | Percent of Total (# of students/Total) | Degrees in a circle | Size of angle (Percent x 360)
Vanilla | 15 | | 360 |
Chocolate | 25 | | 360 |
Strawberry | 10 | | 360 |
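Each angle in a circle graph comes from the same two calculations: percent of total, then that percent of 360 degrees. The sketch below fills in those columns for the ice cream survey in Python (an assumed language; the names are illustrative).

```python
# Ice cream survey: number of students per flavor
ice_cream = {"Vanilla": 15, "Chocolate": 25, "Strawberry": 10}
total = sum(ice_cream.values())   # the hint says 50 students were surveyed

percents = {}
angles = {}
for flavor, count in ice_cream.items():
    percents[flavor] = 100 * count / total   # Percent of Total
    angles[flavor] = 360 * count / total     # Size of angle = Percent x 360
    print(f"{flavor} | {count} | {percents[flavor]}% | {angles[flavor]} degrees")

# The angles of all the sections should add up to a full circle
print("Total degrees:", sum(angles.values()))
```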
EQ: How can I use scatter plots to display and analyze
data?
• Scatter Plot: a
graph in which
ordered pairs of
data are plotted.
You can use a
scatter plot to
determine whether
a relationship, or
correlation, exists
between 2 sets of
data
HINT: LOOK FOR
TRENDS
(PATTERNS)!
As x increases, y
______________.
As x increases, y
______________.
As x increases, y
______________.
As x increases, y
______________.
As x increases, y
______________.
As x increases, y
______________.
Interactivate: Scatter Plot
Graphs that help us analyze data:
• Pictographs
• Histograms
• Bar graphs
• Line graphs
• Circle graphs
• Line plots
• Box-and-whisker plots
• Scatter plots
NLVM – Check it out!