Correlation - THANGARAJ MATH

Correlation
The apparent relation between two
variables.
Trend
• A pattern of average behaviour that occurs
over time
Which of the scatter plots indicate the
strongest trends?
Which of the scatter plots do not
indicate a trend?
If a line of best fit were drawn on each
of the scatter plots that show a trend,
describe the slope of each line.
Which do you think show strong
positive correlation and which show
strong negative correlation?
Make your own scatter plot and
examine the trends
• http://staff.argyll.epsb.ca/jreed/math9/strand4/scatterPlot.htm
Correlation Coefficient, r
• A number from -1 to +1 that gives the strength
and direction of the linear relationship between two
variables.
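As a rough illustration of how r can be computed by hand, the sketch below implements the standard Pearson formula in plain Python. The function name `pearson_r` is an illustrative choice, and the data are the x and y values from the first Open Calc exercise later in this document.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products of deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Data from the first Open Calc exercise in this document
x = [-4, -3, 0, 5, 15, 19, 21, 28, 36]
y = [8, 5, 12, 15, 27, 30, 32, 38, 50]

r = pearson_r(x, y)
print(f"r = {r:.3f}")
```

The sign of the result gives the direction of the relationship, and its distance from 0 gives the strength.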
Positive Correlation
• If there is a positive correlation, the coefficient is
a number between 0 and 1. If there is no
relationship between the two variables, the
correlation coefficient is 0 or very close to 0
(the points look no more ordered than random
numbers). As the strength of the relationship
increases, the coefficient moves closer to 1.
A perfect positive linear fit gives a coefficient of
1.0. Thus the closer the coefficient is to 1,
the stronger the positive correlation.
Negative Correlation
• If there is a negative correlation, the coefficient is
a number between 0 and -1. If there is no
relationship between the two variables, the
correlation coefficient is 0 or very close to 0
(the points look no more ordered than random
numbers). As the strength of the relationship
increases, the coefficient moves closer to -1,
so it decreases rather than increases.
A perfect negative linear fit gives a coefficient of
-1.0. Thus the closer the coefficient is to -1,
the stronger the negative correlation.
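To see that the sign of r carries only the direction, the sketch below computes r for a small made-up data set and for its mirror image. The data values and the helper name `pearson_r` are hypothetical, chosen only for illustration.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    syy = sum((b - my) ** 2 for b in ys)
    return sxy / sqrt(sxx * syy)

x = [1, 2, 3, 4, 5]
y_up = [2, 4, 5, 4, 6]        # hypothetical data: y tends to rise with x
y_down = [-v for v in y_up]   # mirror image: y tends to fall with x

r_up = pearson_r(x, y_up)
r_down = pearson_r(x, y_down)
print(f"rising data:  r = {r_up:.3f}")
print(f"falling data: r = {r_down:.3f}")
```

Both values have the same size and opposite signs, so the two relationships are equally strong; only the direction differs.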
Coefficient of Determination, r²
• A number from 0 to +1 that gives the relative
strength of the relationship between two
variables.
• If r² = 0.44, this means 44% of the variation in
the dependent variable is explained by variation in
the independent variable.
Example
• r = 0.9
• This means there is a strong positive
correlation.
• Calculate r² and explain what it means.
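One way to check the arithmetic for this example is a short calculation, sketched here in Python. The variable names are illustrative only.

```python
r = 0.9
r_squared = r ** 2

print(f"r^2 = {r_squared:.2f}")  # prints "r^2 = 0.81"
# Read r^2 as the share of the variation in the dependent variable
# that is explained by the independent variable.
print(f"About {r_squared:.0%} of the variation is explained")
```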
Residual Value
• The vertical distance between a data point and
the line of best fit
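The definition can be sketched in a few lines of Python. The line y = 2x + 1 and the three points are made up for illustration and stand in for a real line of best fit; the function name `residuals` is also an assumption.

```python
def residuals(xs, ys, slope, intercept):
    """Observed y minus the y predicted by the line, for every point."""
    return [y - (slope * x + intercept) for x, y in zip(xs, ys)]

# Hypothetical line y = 2x + 1 and three made-up data points
xs = [1, 2, 3]
ys = [3.5, 4.5, 7.0]

res = residuals(xs, ys, slope=2, intercept=1)
print(res)  # [0.5, -0.5, 0.0]
```

A positive residual means the point lies above the line, a negative residual means it lies below, and 0 means the point is on the line.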
Residual Plot
To do: Make a scatter plot using Open Calc for the data
below
x   -4   -3    0    5   15   19   21   28   36
y    8    5   12   15   27   30   32   38   50
1. Enter data for x vertically in Column 1
2. Enter data for y vertically in Column 2
3. Highlight data and click INSERT and then CHART
4. Select XY(Scatter) to get a scatter plot
5. Click NEXT, NEXT, FINISH
To do: Make a line of best fit using Open Calc for the
data below
x   -4   -3    0    5   15   19   21   28   36
y    8    5   12   15   27   30   32   38   50
1. Highlight the graph
2. Click INSERT, TREND LINE
3. Choose LINEAR
4. Click SHOW EQUATION and SHOW COEFFICIENT before pressing OK
5. What is the equation?
6. What is R²?
7. What does R² tell you about the correlation?
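To cross-check the equation and R² that the spreadsheet displays, the least-squares line can also be computed directly. This is a minimal sketch using the textbook formulas, not a description of how Open Calc works internally; the function name `linear_fit` is an assumption.

```python
def linear_fit(xs, ys):
    """Least-squares slope, intercept and R^2 for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R^2 = 1 - (residual sum of squares) / (total sum of squares)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return slope, intercept, 1 - ss_res / ss_tot

# Data from the exercise above
x = [-4, -3, 0, 5, 15, 19, 21, 28, 36]
y = [8, 5, 12, 15, 27, 30, 32, 38, 50]

slope, intercept, r2 = linear_fit(x, y)
print(f"y = {slope:.3f}x + {intercept:.3f}")
print(f"R^2 = {r2:.3f}")
```

The printed values should agree with the trend line equation and coefficient shown on the chart, up to rounding.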
To do: Make a scatter plot using Open Calc for the data
below
x   -4   -2    0    2    4    6    8   10   12
y    8   10   12   14   17   18   20   21   24
1. Enter data for x vertically in Column 1
2. Enter data for y vertically in Column 2
3. Highlight data and click INSERT and then CHART
4. Select XY(Scatter) to get a scatter plot
5. Click NEXT, NEXT, FINISH
To do: Make a line of best fit using Open Calc for the
data below
x   -4   -2    0    2    4    6    8   10   12
y    8   10   12   14   17   18   20   21   24
1. Highlight the graph
2. Click INSERT, TREND LINE
3. Choose LINEAR
4. Click SHOW EQUATION and SHOW COEFFICIENT before pressing OK
5. What is the equation?
6. What is R²?
7. What does R² tell you about the correlation?
Compare Graph 1 and 2
• Which one of your two graphs shows the stronger
correlation? How can you tell?
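One possible way to check the comparison numerically is to compute R² for both data sets and put the values side by side. The sketch below uses the standard formula R² = Sxy² / (Sxx · Syy) for a straight-line fit; the function name `r_squared` and the labels `graph1` and `graph2` are illustrative assumptions.

```python
def r_squared(xs, ys):
    """Coefficient of determination for a straight-line fit of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy ** 2 / (sxx * syy)

# The two data sets from the exercises in this document
graph1 = ([-4, -3, 0, 5, 15, 19, 21, 28, 36],
          [8, 5, 12, 15, 27, 30, 32, 38, 50])
graph2 = ([-4, -2, 0, 2, 4, 6, 8, 10, 12],
          [8, 10, 12, 14, 17, 18, 20, 21, 24])

r2_graph1 = r_squared(*graph1)
r2_graph2 = r_squared(*graph2)
print(f"Graph 1: R^2 = {r2_graph1:.4f}")
print(f"Graph 2: R^2 = {r2_graph2:.4f}")
```

The graph whose R² is closer to 1 has its points clustered more tightly around the line of best fit.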