Lecture 11 Slides (correlation)


Bivariate Statistics
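Test by level of measurement of X and Y:
X Nominal, Y Nominal: χ² (chi-square)
X Nominal, Y Ordinal: Rank-sum, Kruskal-Wallis H
X Nominal, Y Interval: t-test, ANOVA
X Ordinal, Y Ordinal: Spearman rs (rho)
X Interval, Y Interval: Pearson r, Regression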
October 31
Sir Francis Galton
Karl Pearson
http://www.york.ac.uk/depts/maths/histstat/people/
Source: Raymond Fancher, Pioneers of Psychology. Norton, 1979.
A correlation coefficient is a numerical expression
of the degree of relationship between two
continuous variables.
[Figure: scatterplot of RDG (y-axis) against MATH (x-axis), both axes running from 30 to 80.]
Pearson’s r
$-1 \le r \le +1$
$-1 \le \rho \le +1$
[Figure: a population with mean µ, from which Samples A through E are drawn, each with its own size n, mean X̄ and standard deviation s.]
[Figure: a population with correlation ρXY, from which Samples A through E are drawn, each yielding its own sample correlation rXY.]
Pearson’s r is a function of the sum of the cross-product of z-scores for x and y.
$$r = \frac{\sum z_x z_y}{N}$$
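As a rough illustration of the z-score formula, the sketch below standardizes each variable and averages the cross-products. The function names and the five MATH/RDG score pairs are made up for the example and are not data from the lecture. It assumes the standard deviations are computed by dividing by N, which is what the formula needs in order to give r = 1 for a perfect linear relationship.

```python
import math

def z_scores(values):
    """Convert raw scores to z-scores using the population SD (divide by N)."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / sd for v in values]

def pearson_r(x, y):
    """Pearson's r as the mean cross-product of z-scores."""
    zx = z_scores(x)
    zy = z_scores(y)
    return sum(a * b for a, b in zip(zx, zy)) / len(x)

# Hypothetical scores, invented for this illustration only.
math_scores = [40, 50, 60, 70, 80]
rdg_scores = [45, 50, 65, 60, 80]

r = pearson_r(math_scores, rdg_scores)
print(round(r, 3))
```

Dividing by N - 1 in both the standard deviations and the final average would give the same r; mixing the two conventions would not.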
The familiar t distribution, at N-2 degrees of
freedom, can be used to test the probability that the
statistic r was drawn from a population with $\rho = 0$.
$H_0: \rho_{XY} = 0$
$H_1: \rho_{XY} \ne 0$
where
$$t = \frac{r\sqrt{N-2}}{\sqrt{1-r^2}}$$
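The test statistic can be computed directly from r and N. The sketch below is a minimal illustration; the function name and the example values (r = .50, N = 27) are assumptions chosen for the arithmetic, not figures from the lecture.

```python
import math

def t_for_r(r, n):
    """t statistic for testing H0: rho = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

# Hypothetical example: r = .50 observed in a sample of N = 27 (25 df).
t = t_for_r(0.5, 27)
print(round(t, 3))
```

The resulting t is then compared with the critical value from a t table at N-2 degrees of freedom (about 2.06 for a two-tailed test at the .05 level with 25 df).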
Some uses of r
• Association of two variables
• Reliability estimates
• Validity estimates
Factors that affect r
Non-linearity
Restriction of range / variability
Outliers
Reliability of measure / measurement error
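To make the restriction-of-range effect concrete, the sketch below computes r on a small synthetic data set and then again on a narrow slice of X. The data are constructed for the illustration (a linear trend plus a fixed repeating disturbance) and do not come from the lecture or from any study.

```python
import math

def pearson_r(x, y):
    """Pearson's r from sums of squares and cross-products."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Synthetic data: Y follows X with a repeating disturbance of +/-2.
x = list(range(20))
disturbance = [2, -2, -2, 2] * 5
y = [a + e for a, e in zip(x, disturbance)]

r_full = pearson_r(x, y)

# Keep only cases with X between 8 and 11 (a restricted range).
restricted = [(a, b) for a, b in zip(x, y) if 8 <= a <= 11]
x_r = [a for a, _ in restricted]
y_r = [b for _, b in restricted]
r_restricted = pearson_r(x_r, y_r)

print(round(r_full, 3), round(r_restricted, 3))
```

The relationship between X and Y is identical in both computations; only the spread of X differs, yet the correlation in the restricted slice is much weaker.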
[Figure: Johnson & Newport, scaled properly, with new ranges age <20 and >20. Scatterplot of English Proficiency (y-axis, 0 to 300) against Age of Arrival (x-axis, 0 to 40) for all subjects, annotated with r=-.87 and r=-.49.]
Spearman’s Rank Order Correlation rs
Point Biserial Correlation rpb
Pearson’s r can also be interpreted as how far individuals' Y scores tend to deviate from the mean of Y
for each standard deviation that their X scores deviate from the mean of X, with both expressed in standard deviation units.
Pearson’s r can also be interpreted as the expected value of zY given a value of zX.
The expected value of zY is r*zX.
If you are predicting zY from zX and the correlation is perfect (r=1.0), then the predicted
zY=zX. If the correlation is r=.5, then the predicted zY=.5zX.
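The prediction rule can be written as a one-line function. This is only a sketch; the function name and the example value zX = 2.0 are chosen for illustration.

```python
def predict_zy(zx, r):
    """Expected z-score on Y for a given z-score on X and correlation r."""
    return r * zx

# Someone 2 SDs above the mean on X:
perfect = predict_zy(2.0, 1.0)   # perfect correlation: predicted zY equals zX
moderate = predict_zy(2.0, 0.5)  # r = .5: predicted zY is half of zX
print(perfect, moderate)
```

Whenever r is less than 1 in absolute value, the predicted zY lies closer to the mean than zX does.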