Transcript Correlation
Correlation
Overview and interpretation
Making a Scatterplot
Line up the data in columns
(eliminate missing data)
Plot each student’s score on
each variable
Adapted from Wiersma, W., & Jurs, S. G. (1990). Educational measurement and testing (2nd ed.).
Needham Heights, MA: Allyn and Bacon.
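The first step can be sketched in code. This is a minimal illustration, assuming each student's two scores are stored in one row and that `None` marks a missing score; the names, scores, and the helper `drop_incomplete` are made up for the example.

```python
# Hypothetical rows: (student, score on variable X, score on variable Y).
# None marks a missing score. All values are made up.
rows = [
    ("Ann", 12, 30),
    ("Bill", 15, 34),
    ("Cara", None, 28),
    ("Dev", 9, None),
    ("Eli", 18, 40),
]


def drop_incomplete(rows):
    """Keep only the students who have a score on both variables."""
    return [row for row in rows if row[1] is not None and row[2] is not None]


complete = drop_incomplete(rows)

# Each remaining row is one point on the scatterplot: (X score, Y score).
for name, x, y in complete:
    print(name, x, y)
```

Only students with both scores can be plotted, so Cara and Dev drop out before the scatterplot is drawn.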
Inspect the scatterplot
• The correlation coefficient (Pearson r) can
only be interpreted for linear relationships
These are all examples of linear relationships
The strength of the correlation varies
Shavelson, R. J. (1996). Statistical reasoning for the behavioral sciences (Third ed.). Needham
Heights, MA: Allyn & Bacon.
Inspect the scatterplot (2)
• If you see these types of distributions, you
are dealing with a curvilinear relationship
Inspect the scatterplot (3)
• Students who seem to be ‘out on their
own’ in the scatterplot are called outliers
• Including outliers in the calculation can
change the size, and even the sign, of the
correlation
outlier
Pearson r correlation coefficient
• Ranges from -1.0 (perfect inverse
correlation) to +1.0 (perfect correlation)
• The sign (+, -) shows the direction of the
relationship
• The number shows the strength of the
relationship (regardless of sign)
• A value of 0.0 indicates no linear relationship
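The two ends of the range can be checked numerically. The helper `pearson_r` below is an illustrative hand-written function, not something taken from the slides, and the data are made up.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson r computed from deviations about each mean (illustrative helper)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    cross = sum(a * b for a, b in zip(dx, dy))
    return cross / sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))


hours = [1, 2, 3, 4, 5]           # made-up data
rising = [10, 20, 30, 40, 50]     # goes up as hours go up
falling = [50, 40, 30, 20, 10]    # goes down as hours go up

r_up = pearson_r(hours, rising)     # perfect positive relationship
r_down = pearson_r(hours, falling)  # perfect inverse relationship
print(r_up, r_down)
```

The two results have the same size and opposite signs: the sign carries the direction, and the absolute value carries the strength.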
The formula
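One widely used version is the deviation-score form, given here as an assumption about which version the slide showed. X and Y are the two sets of scores, \bar{X} and \bar{Y} are their means, and the sums run over all N students.

```latex
r_{xy} = \frac{\sum_{i=1}^{N} (X_i - \bar{X})(Y_i - \bar{Y})}
              {\sqrt{\sum_{i=1}^{N} (X_i - \bar{X})^2 \;
                     \sum_{i=1}^{N} (Y_i - \bar{Y})^2}}
```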
Note that other equivalent formulas are also possible.
Assumptions of Pearson correlation
• Each pair of scores is independent
• Each set of scores is normally distributed
• The relationship between scores is linear
Interpreting correlation
• Correlation merely shows a relationship
between two variables, not the meaning of
the relationship
• Correlation is not causation
• Statistical significance does not imply
importance
• Statistical significance merely indicates
that the correlation strength is greater than
one would expect by chance
Statistical significance of r
Imagine a population with a zero correlation
(Cody & Smith, 1997)
Now, sample 10 points from this population
The resulting sample would probably have a non-zero correlation
Statistical significance
• If a correlation is much larger than what
one would expect by chance, it is
considered to be significant
• Significant does not mean important or
strong
• Significant merely means that the size of
the correlation coefficient is larger than
would be expected by a chance sampling
from a zero correlation population
Determining significance
• Most statistical software packages will
automatically flag significant correlations
• If checking by hand, compare the r value
with the appropriate table
– 2-tailed decision at alpha = .05 is common
• If the value of r is equal to or larger than
the value in the table, the correlation is
significant
This is the table from the back of a statistics book
Decide the level of certainty that you want
Find the number corresponding to your degrees of freedom (N – 2)
Check to see if your correlation coefficient is as large as or larger than the one in the table
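The table lookup can be sketched in code. Critical values of r can be derived from critical values of t through r = t / sqrt(t² + df), with df = N – 2. The sketch below assumes a two-tailed decision at alpha = .05 and includes only a few rows of the standard t table; `T_CRIT_05`, `critical_r`, and `is_significant` are hypothetical names.

```python
from math import sqrt

# Two-tailed critical t values at alpha = .05, indexed by df = N - 2.
# Standard t-table entries; only a few rows are included in this sketch.
T_CRIT_05 = {3: 3.182, 4: 2.776, 5: 2.571, 6: 2.447, 7: 2.365,
             8: 2.306, 9: 2.262, 10: 2.228}


def critical_r(df):
    """Smallest absolute r that is significant for the given df."""
    t = T_CRIT_05[df]
    return t / sqrt(t * t + df)


def is_significant(r, n):
    """Compare |r| with the critical value for df = n - 2."""
    return abs(r) >= critical_r(n - 2)


print(round(critical_r(8), 3))
print(is_significant(0.70, 10), is_significant(0.50, 10))
```

With N = 10, a correlation of .70 clears the cutoff and a correlation of .50 does not, even though .50 might look sizable on its own.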
Coefficient of determination
• The coefficient of determination (r²) is a
measure of the shared variance between
the two variables
(Shavelson, 1996)
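A quick calculation shows how fast shared variance shrinks as r gets smaller. The correlation values are made up, and `coefficient_of_determination` is a hypothetical helper name.

```python
def coefficient_of_determination(r):
    """Proportion of variance the two variables share (r squared)."""
    return r ** 2


for r in (0.3, 0.5, -0.8):  # made-up correlation values
    print(r, round(coefficient_of_determination(r), 2))
```

A correlation of .50 corresponds to only 25% shared variance, and the sign of r has no effect on the result.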
Potential problems in correlation analysis
• restriction of range
– correlation of TOEFL, GRE, etc. with grade
point average
• skewness
– test too easy or too difficult
• attribution of causality
– variables must be correlated to claim that they
are causally related, but correlation alone is
not sufficient to prove causality
Point-biserial correlation
• Used to correlate a dichotomous variable
with a continuous variable
• In testing, used to correlate a person’s
performance on an item (correct, incorrect)
with their total test score
• Used as an index of item discrimination
Point-biserial formula
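r_{pbi} = \frac{\bar{X}_p - \bar{X}_q}{S_t} \sqrt{p \, q}
where
\bar{X}_p = mean on the test for people who got the item correct
\bar{X}_q = mean on the test for people who got the item incorrect
S_t = standard deviation for the test
p = IF for the item (proportion who got the item correct)
q = 1 – IF for the item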
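The quantities labeled on this slide combine as (mean for correct – mean for incorrect) / standard deviation × sqrt(IF × (1 – IF)). The sketch below computes that with made-up data. It assumes the population standard deviation (dividing by N), the choice under which the result matches an ordinary Pearson r between the 0/1 item scores and the totals; the slide does not say whether N or N – 1 was intended.

```python
from math import sqrt


def point_biserial(item, totals):
    """Point-biserial r between 0/1 item scores and total test scores."""
    n = len(totals)
    right = [t for i, t in zip(item, totals) if i == 1]
    wrong = [t for i, t in zip(item, totals) if i == 0]
    p = len(right) / n   # IF: proportion who got the item correct
    q = 1 - p
    mean_all = sum(totals) / n
    # Population standard deviation (divide by N) of the total scores.
    sd = sqrt(sum((t - mean_all) ** 2 for t in totals) / n)
    mean_right = sum(right) / len(right)
    mean_wrong = sum(wrong) / len(wrong)
    return (mean_right - mean_wrong) / sd * sqrt(p * q)


# Made-up data: six test takers, one item (1 = correct, 0 = incorrect).
item = [1, 1, 1, 0, 0, 0]
totals = [9, 8, 7, 6, 4, 2]

r_pbi = point_biserial(item, totals)
print(round(r_pbi, 2))
```

People who answered this item correctly also scored higher overall, so the item discriminates in the expected direction.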
TAP output
                Number   Item   Disc.  # Correct    # Correct    Point   Adj.
Item     Key    Correct  Diff.  Index  in High Grp  in Low Grp   Biser.  Pt Bis
-------  -----  -------  -----  -----  -----------  -----------  ------  ------
Item 01  (2 )   22       0.44   0.72   14 (0.93)     3 (0.21)    0.64    0.60
Item 02  (4 )   29       0.58   0.58   13 (0.87)     4 (0.29)    0.51    0.47
Item 03  (4 )   35       0.70   0.71   15 (1.00)     4 (0.29)    0.52    0.48
Item 04  (3 )   26       0.52   0.72   14 (0.93)     3 (0.21)    0.63    0.59
Item 05  (2 )   37       0.74   0.50   15 (1.00)     7 (0.50)    0.38    0.34
Item 06  (1 )   19       0.38   0.72   13 (0.87)     2 (0.14)    0.59    0.55
Item 07  (3 )   36       0.72   0.43   14 (0.93)     7 (0.50)    0.34    0.28
Item 08  (4 )   23       0.46   0.79   15 (1.00)     3 (0.21)    0.63    0.59
Item 09  (4 )   23       0.46   0.79   14 (0.93)     2 (0.14)    0.61    0.56
Item 10  (4 )#  37       0.74   0.22   13 (0.87)     9 (0.64)    0.18    0.12
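Two of the columns can be recomputed from the counts in the output. The sketch assumes 50 examinees in total, 15 in the high group, and 14 in the low group; these sizes are inferred from the proportions in the output, not stated in it, and the function names are hypothetical.

```python
def item_difficulty(n_correct, n_examinees):
    """Proportion of all examinees answering the item correctly."""
    return n_correct / n_examinees


def discrimination_index(high_correct, high_n, low_correct, low_n):
    """Proportion correct in the high group minus that in the low group."""
    return high_correct / high_n - low_correct / low_n


# Assumed sizes, inferred from the proportions rather than stated in the output.
N_EXAMINEES, HIGH_N, LOW_N = 50, 15, 14

# Item 01: 22 correct overall, 14 in the high group, 3 in the low group.
diff_01 = item_difficulty(22, N_EXAMINEES)
disc_01 = discrimination_index(14, HIGH_N, 3, LOW_N)

# Item 10: 37 correct overall, 13 in the high group, 9 in the low group.
diff_10 = item_difficulty(37, N_EXAMINEES)
disc_10 = discrimination_index(13, HIGH_N, 9, LOW_N)

print(round(diff_01, 2), round(disc_01, 2))
print(round(diff_10, 2), round(disc_10, 2))
```

Under those assumed group sizes the results agree with the Item Diff. and Disc. Index columns for both items. Item 10, the flagged item, separates the high and low groups far less than Item 01, in line with its low point-biserial values.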