Chapter 7: Scatterplots,
Associations, and Correlations
A.P. Statistics
Scatterplots
Scatterplots are the best
way to start observing
the relationship
between two
quantitative variables
Describing Scatterplots
• Direction:
Positive, negative, or none
• Form:
Linear, curved, clustered, etc.
• Strength:
At this point: strong, moderate, or weak
• Unusual Features:
Outliers, clusters, etc.
Variables
• Explanatory (or predictor) variable:
Attempts to explain the observed outcome
Placed on the x-axis
• Response variable:
Measures an outcome
Placed on the y-axis
Be logical when deciding which variable plays which role.
Correlation
• Measures the direction and strength of the
linear relationship between the two
quantitative variables
• Given as r
$$ r = \frac{1}{n-1} \sum \left( \frac{x - \bar{x}}{s_x} \right) \left( \frac{y - \bar{y}}{s_y} \right) = \frac{\sum z_x z_y}{n-1} $$
Correlation: Graphical
[Figure: scatterplots of the original data and the standardized data]
Correlation Conditions
• Quantitative Variables Condition
• Straight Enough Condition
• Outlier Condition
Report the correlation with and without the outlier
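One way to act on the Outlier Condition is sketched below: compute r twice, once with every point and once with the suspect point removed. The data are hypothetical (five points on a perfect line plus one outlier), and the helper name `correlation` is an assumption made for this example.

```python
from statistics import mean, stdev

def correlation(xs, ys):
    """Correlation r: sum of z-score products divided by n - 1."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)
    return sum((x - x_bar) / s_x * (y - y_bar) / s_y
               for x, y in zip(xs, ys)) / (n - 1)

# Hypothetical data: the last point (10, 0) is the outlier
x = [1, 2, 3, 4, 5, 10]
y = [2, 4, 6, 8, 10, 0]

r_with = correlation(x, y)                # all six points
r_without = correlation(x[:-1], y[:-1])   # outlier dropped

print(f"r with outlier:    {r_with:.2f}")
print(f"r without outlier: {r_without:.2f}")
```

In this made-up data set a single point changes r from a perfect positive correlation to a weak negative one, which is why both values are worth reporting.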
Correlation Properties
1. Correlation makes no distinction between
explanatory and response variables.
2. Correlation requires both variables to be
quantitative.
3. Because r uses standardized values of the
observations, r does not change when we
change the units of measurement of x, y, or
both.
Correlation Properties
4. Positive r indicates positive association
between the variables and a negative r
indicates negative association.
5. The correlation is always a number between
-1 and 1. The strength of the linear
relationship increases as r moves away from
0 towards either -1 or 1.
Correlation Properties
6. Correlation measures the strength of only a
linear relationship.
7. Like mean and standard deviation, r is not
resistant: it is strongly affected by a few
outliers.
8. Correlation has no units. It should not be
expressed as a percentage.
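Several of these properties can be checked numerically. The sketch below tests property 1 (swapping x and y), property 3 (changing units), and property 5 (r lies between -1 and 1) on hypothetical height and weight data; the variable names and values are invented for illustration.

```python
from statistics import mean, stdev

def correlation(xs, ys):
    """Correlation r: sum of z-score products divided by n - 1."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)
    return sum((x - x_bar) / s_x * (y - y_bar) / s_y
               for x, y in zip(xs, ys)) / (n - 1)

# Hypothetical heights (inches) and weights (pounds)
height_in = [60, 62, 65, 68, 70, 72]
weight_lb = [110, 125, 140, 150, 170, 175]

r = correlation(height_in, weight_lb)

# Property 1: swapping the roles of x and y leaves r unchanged
r_swapped = correlation(weight_lb, height_in)

# Property 3: converting to centimeters and kilograms leaves r unchanged
height_cm = [h * 2.54 for h in height_in]
weight_kg = [w * 0.4536 for w in weight_lb]
r_metric = correlation(height_cm, weight_kg)

same_when_swapped = abs(r - r_swapped) < 1e-9
same_in_metric = abs(r - r_metric) < 1e-9
in_range = -1 <= r <= 1  # Property 5

print(same_when_swapped, same_in_metric, in_range)  # prints True True True
```

The comparisons use a small tolerance because floating-point arithmetic can introduce tiny rounding differences even when the values are mathematically equal.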
Other Information
• Correlation is not a complete description of
two-variable data, even when the relationship
is linear.
• You should also give the means and standard
deviations of both x and y.
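A fuller numerical summary along these lines might look like the sketch below, which reports the mean and standard deviation of each variable alongside r. The function name `describe_pair` and the data are hypothetical.

```python
from statistics import mean, stdev

def describe_pair(xs, ys):
    """Return the means, standard deviations, and correlation of two variables."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)
    r = sum((x - x_bar) / s_x * (y - y_bar) / s_y
            for x, y in zip(xs, ys)) / (n - 1)
    return {"mean_x": x_bar, "sd_x": s_x, "mean_y": y_bar, "sd_y": s_y, "r": r}

# Hypothetical data, made up for this example
hours = [1, 2, 3, 4, 5]
score = [2, 4, 5, 4, 5]

summary = describe_pair(hours, score)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```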
Straightening Scatterplots
• If a relationship between two quantitative
variables is not linear, we can re-express it and
straighten the form. Then we can apply the
strengths of using correlation and the other
measures that come from a linear
relationship.
• We will go into more depth about this in
Chapter 10.
• See page 154 in your text for a graphic.
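As a rough illustration of re-expression, the sketch below takes hypothetical data that roughly doubles with each step in x and compares r before and after taking the base-10 logarithm of y. The logarithm is only one possible re-expression, chosen here as an assumption; the data are invented.

```python
import math
from statistics import mean, stdev

def correlation(xs, ys):
    """Correlation r: sum of z-score products divided by n - 1."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)
    return sum((x - x_bar) / s_x * (y - y_bar) / s_y
               for x, y in zip(xs, ys)) / (n - 1)

# Hypothetical curved data: y roughly doubles each time x increases by 1
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2.1, 3.9, 8.2, 15.8, 32.5, 63.0, 129.0, 255.0]

r_raw = correlation(x, y)             # curved form, so r understates the pattern

log_y = [math.log10(v) for v in y]    # re-express y on a log scale
r_log = correlation(x, log_y)         # form is now close to a straight line

print(f"r for (x, y):      {r_raw:.3f}")
print(f"r for (x, log y):  {r_log:.3f}")
```

A scatterplot of the re-expressed data should still be checked against the Straight Enough Condition before the new r is interpreted.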
Problems?
• Don’t say “correlation” when you mean
“association”.
• Don’t correlate categorical data.
• Be sure the association is linear.
• Beware of outliers.
• Don’t confuse correlation with causation.
• Watch out for lurking variables.
Correlation vs. Causation
• Scatterplots and correlations never prove
causation.
• The only thing that can show causation is a
randomized controlled experiment.