Lesson 2.3 - James Rahn
A box plot gives you an idea of the overall
distribution of a data set, but in some cases
you might want to see other information and
details that a box plot doesn’t show.
A histogram is a graphical representation of a
data set, with columns to show how the data
are distributed across different intervals of
values.
Histograms give vivid pictures of distribution
features, such as clusters of values, or gaps
in data.
The columns of a histogram are called bins
and should not be confused with the bars
of a bar graph. Each bin covers an interval
of values. The bars of a bar graph
indicate categories: how many data
items either have the same value or share
a characteristic.
Shatevia took a random sample of 50
students who own MP3 players at her
high school and asked how many
songs they have stored. The two
graphs were constructed from the
data in the table.
a. What is the range of the data?
The range is 1013 - 765 = 248 songs.
b. What is the bin width of each graph?
Graph A bin width = 50 songs.
Graph B bin width = 10 songs.
c. How can you know if the graph accounts
for all 50 values?
Add the bin frequencies (the column
heights). The sum should equal 50.
d. Why are the columns shorter in Graph B?
With smaller bin widths, fewer data
values fall into each bin, so the
columns are usually shorter.
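The effect of bin width on column height, and the check that the frequencies add up to the sample size, can be tried out with a short Python sketch. The song counts below are made up for illustration (they are not Shatevia's data), and `bin_counts` is a hypothetical helper name.

```python
def bin_counts(values, start, width, n_bins):
    """Count how many values fall in each bin [start + i*width, start + (i+1)*width)."""
    counts = [0] * n_bins
    for v in values:
        i = int((v - start) // width)
        if 0 <= i < n_bins:
            counts[i] += 1
    return counts

# Hypothetical song counts, NOT the data from the lesson's table.
songs = [765, 790, 812, 833, 851, 868, 874, 902,
         915, 921, 948, 963, 977, 990, 1013]

wide = bin_counts(songs, start=750, width=50, n_bins=6)     # bins of 50 songs
narrow = bin_counts(songs, start=750, width=10, n_bins=30)  # bins of 10 songs

# Both histograms account for every value in the sample.
print(sum(wide), sum(narrow), len(songs))

# The tallest column is shorter when the bins are narrower.
print(max(wide), max(narrow))
```

With the same data, the narrow bins spread the values over more columns, so each column holds fewer of them.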
e. Which graph is better
at showing the overall
shape of the distribution?
What is that shape?
Graph A shows that the
distribution is skewed
left. This fact is harder
to see with all the ups
and downs in Graph B.
f. Which graph is better at
showing the gaps and
clusters in the data?
With more bins you can see gaps and clusters
in the data. A dot plot is like a histogram with
a very small bin width. Graph B is the better
graph for seeing gaps and clusters.
g. What percentage of the players
have fewer than 850 songs stored?
Add the bin frequencies for the bins below (to the left
of) 850 songs. There are 10 data values, so 10 out of 50,
or 20% of the sample, had fewer than 850 songs.
The percentile rank of a value is the percentage of
data values that are below the given value.
In the example, 850 songs has a percentile rank of
20 because this value is greater than 20% of the
values in the sample.
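The percentile rank definition above can be written as a small Python function. This is a minimal sketch; the function name `percentile_rank` and the short sample list are illustrative assumptions, not part of the lesson.

```python
def percentile_rank(values, x):
    """Percentage of data values that are strictly below x."""
    below = sum(1 for v in values if v < x)
    return 100 * below / len(values)

# Hypothetical sample of 10 values, for illustration only.
sample = [3, 5, 7, 8, 10, 12, 15, 18, 20, 25]
print(percentile_rank(sample, 8))   # 3 of the 10 values are below 8

# The lesson's own numbers: 10 of the 50 values are below 850 songs.
print(100 * 10 / 50)
```

When only a histogram is available, the count of values below a bin boundary comes from adding the bin frequencies to the left of that boundary, as in part g.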
The data used in this histogram have a mean of
34.05 and a standard deviation of 14.68.
Add the bin frequencies to find that there are 40 data values in all.
Approximate the percentile rank of a value two
standard deviations above the mean.
Two standard deviations above the mean is 34.05 + 2 × 14.68, or 63.41.
All of the data values in the ten bins up to the value of 60 are less
than 63.41. Adding the bin frequencies up to 60 gives 37, so 37/40, or
92.5%, of the data lie below 63.41. So 63.41 is approximately the 93rd
percentile.
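The arithmetic in this answer can be checked with a few lines of Python. The mean, standard deviation, and counts (37 below, 40 in all) are the ones stated in the lesson. The variable names are just for illustration.

```python
mean = 34.05
sd = 14.68

# Value two standard deviations above the mean.
cutoff = mean + 2 * sd
print(f"{cutoff:.2f}")

# From the histogram: 37 of the 40 values are in bins that end at or before 60.
count_below = 37
total = 40
percent_below = 100 * count_below / total
print(percent_below)
```

The result is only approximate because any values between 60 and 63.41 cannot be separated from the rest of their bin.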
Approximately what percentage of the data
values are within one standard deviation of the
mean?
One standard deviation above the mean is 34.05 + 14.68 = 48.73, and one
standard deviation below the mean is 34.05 - 14.68 = 19.37. This interval
includes at least those values in the bins from 20 to 45. So at least
25/40, or 62.5%, of the data lie within one standard deviation of the mean.
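Counting only the bins that fit entirely inside the interval can be sketched in Python. The bin width of 5 and the frequencies below are assumptions made up for this example. They are not read from the lesson's histogram and were chosen only so that the totals (40 values, 25 of them in the bins from 20 to 45) agree with the answer above. `count_inside` is a hypothetical helper name.

```python
def count_inside(bin_starts, width, counts, low, high):
    """Sum the frequencies of bins that lie entirely within [low, high]."""
    total = 0
    for start, c in zip(bin_starts, counts):
        if start >= low and start + width <= high:
            total += c
    return total

mean, sd = 34.05, 14.68
low, high = mean - sd, mean + sd   # about 19.37 and 48.73

# ASSUMED bins 10-15, 15-20, ..., 65-70 with made-up frequencies.
bin_starts = list(range(10, 70, 5))
counts = [3, 4, 5, 6, 6, 5, 3, 2, 2, 1, 2, 1]

inside = count_inside(bin_starts, 5, counts, low, high)
print(inside, sum(counts), 100 * inside / sum(counts))
```

Bins that straddle 19.37 or 48.73 are left out, which is why the count is a lower bound ("at least") rather than an exact figure.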
Teenagers require anywhere from
1800 to 3200 calories per day,
depending on their growth rate and
level of activity. The food you
consume as part of your diet should
include sufficient fiber, moderate
levels of carbohydrates and fat, and
as little sodium, saturated fat, and
cholesterol as possible. The table
shows the recommended amounts of
carbohydrates and fiber and the
maximum amounts of other
nutrients in a healthy 2500-calorie
diet.
So, how does fast food fit into a
healthy diet? Examine the information
about the nutritional content of fast-food
sandwiches. With your group,
study one of the nutritional
components (total calories, total fat,
saturated fat, cholesterol, sodium, or
total carbohydrate). Use box plots,
histograms, and the measures of
central tendency and spread to
compare the amount of that
component in the sandwiches. You
may want to divide your data so that
you can make comparisons between
different types of sandwiches or
between restaurants. As you do your
statistical analysis, discuss how these
fast-food items would affect a healthy
diet.