Examining Relationships

Section 3.1
Scatterplots
Activity
• Write down your math and verbal scores
for the SAT as an ordered pair (Math,
Verbal), fold the paper, and put it in the basket.
• We will tabulate the data as a class
• Plot the data
• Do you see a relationship between a
person’s math and verbal scores?
o Is there a discernible pattern?
o Can you describe the pattern?
o Would the pattern be different if we had used the
points (Verbal, Math)?
Introduction
• Most statistical studies involve more than one
variable
• This chapter concentrates on the relationships
among several variables for the same group of
individuals
• We have focused on quantitative variables up to
this point
• Categorical variables help to organize the data
when we have data on several variables
Questions to ask yourself when you
examine the relationship between two or
more variables
• What individuals do the data describe?
• What exactly are the variables? How are
they measured?
• Are all the variables quantitative or is at least
one a categorical variable?
• Do you want simply to explore the nature of
the relationship, or do you think that some of
the variables explain or even cause changes
in others?
o Response vs. explanatory variables
Variables
Response variable – measures an
outcome of a study (dependent variable);
plotted on the y-axis
Explanatory variable – attempts to explain
the observed outcomes (independent
variable); plotted on the x-axis
Always plot the explanatory variable (if
there is one) on the horizontal axis and the
response variable on the vertical axis
• In Algebra II we use the words independent
and dependent for scatterplots, but these
words have other, unrelated meanings in
statistics. We will use these words when
discussing probability.
• Prediction requires that we identify an
explanatory variable and a response variable
• It is easiest to identify explanatory and
response variables when we actually set
values of one variable in order to see how it
affects another variable
• EX: Alcohol has many effects on the body.
One effect is a drop in body temperature.
To study this effect, researchers give several
different amounts of alcohol to mice, then
measure the change in each mouse’s body
temp. in the 15 minutes after taking the
alcohol. Amount of alcohol is the
explanatory variable, and change in body
temp. is the response variable.
• Calling one variable explanatory and the
other response does not necessarily
mean that changes in one cause
changes in the other
• EX: Steve wants to know how the median
SAT Math and Verbal scores relate to each
other. He doesn’t think that either score
explains or causes the other.
• There may or may not be
explanatory and response
variables. Whether there are
depends on how you plan to use
the data.
o Alcohol causes the change in body
temp.
o There is no cause and effect
relationship between SAT math and
verbal scores.
Principles to guide examination
of data
1. Start with a graph
2. Look for a pattern and
deviations from pattern
3. Add numerical description of
specific aspects of data
4. Try to describe the overall
pattern if possible
Scatterplot
• Shows the relationship between two
quantitative variables measured on the
same individuals.
• The values of one variable appear on the
horizontal axis, and the values of the other
variable appear on the vertical axis.
• Each individual in the data appears as the
point in the plot fixed by the values of both
variables for that individual.
Tips for drawing
scatterplots
1. Scale the horizontal and vertical axes
o Intervals must be uniform
o If the scale doesn’t begin at zero at the
origin, use a double-hash symbol to
indicate the break
2. Label both axes
3. Try to use the whole grid
o Want to be able to see details
o Don’t compress the plot into one corner
of the grid
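The scaling tips above can be sketched in a few lines of Python. This is only an illustration: the helper name `uniform_ticks` and the interval width of 10 are assumptions, and the data are the femur and humerus lengths listed later in this section. The helper picks evenly spaced tick marks that just cover the data, so the points fill the grid instead of crowding one corner.

```python
import math

def uniform_ticks(values, step):
    """Return evenly spaced tick marks (width `step`) that just cover `values`.

    Hypothetical helper for illustration; `step` is chosen by the person
    drawing the plot.
    """
    low = math.floor(min(values) / step) * step    # last tick at or below the smallest value
    high = math.ceil(max(values) / step) * step    # first tick at or above the largest value
    count = round((high - low) / step)             # number of equal intervals
    return [low + i * step for i in range(count + 1)]

# Archaeopteryx bone lengths from the table later in this section
femur = [38, 56, 59, 64, 74]
humerus = [41, 63, 70, 72, 84]

x_ticks = uniform_ticks(femur, 10)
y_ticks = uniform_ticks(humerus, 10)
print("Femur (horizontal axis):", x_ticks)
print("Humerus (vertical axis):", y_ticks)
```

Neither scale starts at zero here, so a hand-drawn version of this plot would need the break symbol on both axes.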
Example 3.3 page 123
Alabama: 9% take the
SAT, and the average
SAT Math score is 555.
Interpreting Scatterplots
• Direction –
o Positive association (high/high, low/low)
o Negative association (high/low, low/high)
• Form
o Look for clusters
o Shape – linear, curved
• Strength – how closely the points follow a clear form
• Not all relationships are linear in form, and
some have no clear direction that we can
describe as positive or negative association
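One rough way to check direction numerically is to compare each point with the mean of each variable and count how many points are high/high or low/low versus high/low or low/high. The sketch below assumes that approach. The function name `direction` and the small negative example are made up for illustration, and the femur and humerus values are the Archaeopteryx lengths given later in this section. A check like this does not replace looking at the plot.

```python
def direction(xs, ys):
    """Classify the direction of association by counting points on each
    side of the two means (an illustrative check, not a formal test)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    same = 0       # high/high or low/low points
    opposite = 0   # high/low or low/high points
    for x, y in zip(xs, ys):
        product = (x - mean_x) * (y - mean_y)
        if product > 0:
            same += 1
        elif product < 0:
            opposite += 1
    if same > opposite:
        return "positive"
    if opposite > same:
        return "negative"
    return "no clear direction"

# Archaeopteryx bone lengths from later in this section
femur = [38, 56, 59, 64, 74]
humerus = [41, 63, 70, 72, 84]
print(direction(femur, humerus))

# Made-up values where y falls as x rises
print(direction([1, 2, 3, 4], [8, 6, 4, 2]))
```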
Outliers
• An individual observation that
falls outside the overall pattern
of the graph. An observation
can be an outlier in the x direction,
in the y direction, or in both
directions.
Interpreting Scatterplots
In the previous example:
Direction: the overall pattern moves from upper left
to lower right. We call this a negative association.
Form: The form is slightly curved and there are two
distinct clusters. What explains the clusters? (ACT
States)
Strength: The strength is determined by how closely
the points follow a clear form. The example is only
moderately strong.
Outliers: Do we see any deviations from the
pattern? There are no clear outliers.
Example 3.4 Page 127
Direction: Positive association
Form: Linear (the points lie in a straight-line pattern)
Strength: Strong relationship because the
points lie close to a line with little
scatter. We can use this to
predict gas consumption quite
accurately.
Outliers: There don’t appear to be any
outliers.
Health and Wealth
• The next slide shows a scatterplot of data
from the World Bank
o The response variable is life expectancy
at birth
o The explanatory variable is how rich a
country is, measured by GDP
• People in richer countries should live longer,
and the scatterplot shows this
• But
o Relationship has an odd shape
o There is a lot of variability at the very lowest
levels of GDP
o As GDP increases a bit, so does life
expectancy
o Beyond a GDP of about $15,000, there is no
more improvement in life expectancy
• It has an overall weak curved pattern
• There are outliers
Femur:    38  56  59  64  74
Humerus:  41  63  70  72  84
The table contains lengths of the leg (femur) and
upper arm (humerus) bones of Archaeopteryx, an
extinct animal connected to modern birds.
Debate exists as to whether the six known
Archaeopteryx fossil skeletons are really all
Archaeopteryx.
Graph the Data
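As a minimal sketch, the femur and humerus lengths can be graphed as a rough text scatterplot using only the Python standard library. The function name `text_scatterplot` and the grid size are assumptions made for illustration. Femur goes on the horizontal axis and humerus on the vertical axis, and the sketch assumes each variable takes at least two different values.

```python
femur = [38, 56, 59, 64, 74]
humerus = [41, 63, 70, 72, 84]

def text_scatterplot(xs, ys, width=40, height=12):
    """Return the lines of a crude character-grid scatterplot.

    Hypothetical helper: each (x, y) pair becomes one '*' in a grid that
    spans the range of the data, so the plot uses the whole grid.
    """
    x_lo, x_hi = min(xs), max(xs)
    y_lo, y_hi = min(ys), max(ys)
    grid = [[" "] * width for _ in range(height)]
    for x, y in zip(xs, ys):
        col = round((x - x_lo) / (x_hi - x_lo) * (width - 1))
        row = round((y_hi - y) / (y_hi - y_lo) * (height - 1))  # row 0 is the top
        grid[row][col] = "*"
    lines = ["|" + "".join(cells) for cells in grid]  # vertical axis
    lines.append("+" + "-" * width)                   # horizontal axis
    return lines

for line in text_scatterplot(femur, humerus):
    print(line)
print("horizontal: femur length, vertical: humerus length")
```

The five points rise from the lower left to the upper right, which is the pattern described in the interpretation that follows.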
Classifying fossils
Direction: The scatterplot shows a very strong, positive, straight-line
association
Form: Linear form.
Strength: The association is strong because the points lie very close to the line.
Outliers: There do not appear to be any.
The skeletons may come from young and old animals.
The Case of the Missing
M&Ms