Getting More out of Multiple Regression


Getting More out of Multiple Regression
Darren Campbell, PhD
Overview
• View on Teaching Statistics
• When to Apply
• How to Use & How to Interpret
Multiple Regression Techniques
1. Centring to remove group-difference confounds
2. Centring to interpret continuous interactions
3. Spline functions – piecewise polynomials
• Estimate separate slopes for each segment of the regression polynomial
Perks of Multiple Regression
1. Realistic – behaviour has many influences
2. Control over confounds
3. Test for relative importance
4. Identify interactions

Why Not Use ANOVAs?
• Not realistic: many behaviours / constructs are continuous (e.g., intelligence, personality)
• Loss of statistical power when scores are forced into categories
– scores within a category are assumed to be the same, plus error
– systematic patterns get mixed into the error term
What is Centring?
• Simple re-scaling of raw scores: Raw Score minus Some Constant value
• x1 – 5.1:
– 1 – 5.1 = -4.1
– 4 – 5.1 = -1.1
• x2 – 29.4:
– 30 – 29.4 = 0.6
– 35 – 29.4 = 5.6
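A minimal sketch of this re-scaling in Python (the constants 5.1 and 29.4 are the group means from the slides; the extra data values are made up for illustration):

```python
# Centring: subtract a constant (here, a mean) from each raw score.
import numpy as np

x1 = np.array([1.0, 4.0, 7.0, 9.0])        # raw scores for one group (illustrative)
x1_centred = x1 - 5.1                       # subtract that group's mean (5.1)

x2 = np.array([30.0, 35.0, 25.0, 28.0])     # raw scores for a second group (illustrative)
x2_centred = x2 - 29.4                      # subtract that group's mean (29.4)

print(x1_centred)   # e.g., 1 - 5.1 = -4.1, 4 - 5.1 = -1.1
print(x2_centred)   # e.g., 30 - 29.4 = 0.6, 35 - 29.4 = 5.6
```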
A Simple Case for Centring
• Babies: Cry & Fuss – parent-report diary measures
• Flail about – limb movement
• Are these 2 infant behaviours related? Emotional Responses & Emotion Regulation
A Simple Case for Centring

Age            Moves / Hr    Crying Hrs / Day
6 week olds    5.1           4.7
6 month olds   29.4          3.5
Full Sample    17.2          4.1

• Are these 2 infant behaviours related?
6 Week-Olds
[Scatterplot: Activity (limb movements) vs Hours of Crying for 6-week-old infants; r = +.47]
• Some infants cry more & move more; others cry less & move less
6 Month-Olds
[Scatterplot: Activity (limb movements) vs Hours of Crying for 6-month-old infants; r = +.38]
• Some infants cry more & move more; others cry less & move less
• What if we combine the two groups?
• Do we get a significant correlation? If so, what kind?
6 Week-Olds & 6-Month-Olds Combined
[Scatterplot: Activity (limb movements) vs Hours of Crying for the combined sample]
• Full sample r = -0.22
What happened with the Correlations?
• 6 Week-Olds: r = +.47
• 6 Month-Olds: r = +.38
• 6 Week & 6 Month-Olds combined: r = -0.22

Correlations = Grand Mean Centring
1) Mean deviations for each variable: X & Y
2) Rank order the mean deviations
3) Correlate the 2 rank orders of X & Y
The Disappearing Correlation Explained
• Grand mean centring led to:
– all the older infants being classified as high movers
– all the young infants being classified as low movers
– young infants who were high criers & high movers becoming high criers & low movers
• Large group differences in movement altered the detection of the within-group r's
• What should we do?
Solution: Create Group Mean Deviations
• Re-scale raw scores: Raw – Group Mean
• 6 week-olds: xs – 5.1
• 6 month-olds: xs – 29.4
Solution: Create Group Mean Deviations

Crying    Raw AL    Minus Group Mean    Group-Centred AL
5.7       1         -5.11               -4.11
6         4         -5.11               -1.11
2         5         -5.11               -0.11
0.5       30        -29.4               0.63
2.5       35        -29.4               5.63
2         34        -29.4               4.63
• Raw Scores
[Scatterplot: Activity (limb movements) vs Hours of Crying, 6 week-olds & 6-month-old infants, raw scores]
• Group-Centred Scores
[Scatterplot: Limb Movements / 48 Hrs (group-centred) vs Hours of Crying / 48 Hrs, with 6-week-olds and 6-month-olds plotted together]


• Group-mean-centred data: r = .41 for the full sample
• Multiple regression could also work on the uncentred variables: Crying = Group + Uncentred AL
• There is no Group x AL interaction – the relation is the same for both groups
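A hedged sketch of group-mean centring in Python, using simulated data loosely modelled on the infant example above (the means 5.1, 29.4, 4.7, and 3.5 come from the slides; everything else is made up, so the exact r values will differ):

```python
# Group-mean centring with pandas, comparing pooled vs group-centred correlations.
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 50
young = pd.DataFrame({"group": "6 weeks", "moves": rng.normal(5.1, 1.5, n)})
old = pd.DataFrame({"group": "6 months", "moves": rng.normal(29.4, 3.0, n)})
df = pd.concat([young, old], ignore_index=True)

# Within each group, crying rises with movement; older infants cry a bit less overall.
df["crying"] = (0.4 * (df["moves"] - df.groupby("group")["moves"].transform("mean"))
                + np.where(df["group"] == "6 weeks", 4.7, 3.5)
                + rng.normal(0, 0.8, 2 * n))

# Pooled (grand-mean) correlation is distorted by the large group difference in movement.
print("pooled r:", df["moves"].corr(df["crying"]).round(2))

# Group-mean centring removes the between-group difference in movement.
df["moves_gc"] = df["moves"] - df.groupby("group")["moves"].transform("mean")
print("group-centred r:", df["moves_gc"].corr(df["crying"]).round(2))
```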
Centring so far
1. Centring is magic
2. There are different types of centring, depending on the number used to re-scale the data:
• Grand mean – Pearson correlations
• Group means – infant limb movements
Regression Interactions & Centring
• Centring is great for interpreting interactions
• Regression interactions are trickier than ANOVA interactions:
– they do not have pre-defined levels or groups
– they are based on 2+ continuous variables
Multiple Regression – the Basics
The basic equation:
Y = a + b1*X1 + b2*X2 + b3*X3 + e
Outcome = Intercept + Beta1 * Predictor1 + Beta2 * Predictor2 + Beta3 * Predictor3 + Error
• a = the expected mean response of Y when all predictors are zero
• betas: for every 1-unit change in X, you get a beta-sized change in Y
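A minimal sketch of the basic equation fit with Python's statsmodels; the simulated data and coefficient values are illustrative, not from any dataset in this talk:

```python
# Fit Y = a + b1*X1 + b2*X2 + b3*X3 + e with ordinary least squares.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 200
X = rng.normal(size=(n, 3))                     # three predictors: X1, X2, X3
y = 2.0 + 1.5 * X[:, 0] - 0.7 * X[:, 1] + 0.3 * X[:, 2] + rng.normal(size=n)

X_design = sm.add_constant(X)                   # adds the intercept column (a)
model = sm.OLS(y, X_design).fit()
print(model.params)                             # intercept and the three b's
print(model.summary())
```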

Regression Interactions & Centring
• Reducing multicollinearity
– the interaction predictor = X1 * X2
– X1 & X2 values near 0 stay near 0, while high X1 & X2 values produce very high products
– so the interaction term is highly correlated with the original X1 & X2 variables
• Centring makes each predictor, X1 & X2:
– have more moderate values above and below zero (positive and negative numbers)
– which reduces the multiplicative exaggeration between X1 & X2 and the interaction product X1*X2 (see the numeric sketch below)
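A quick numeric sketch of this point, using simulated data (the means and SDs roughly echo the uncentred descriptives shown later in the talk; the exact correlations are illustrative):

```python
# How centring reduces the correlation between a predictor and its interaction product.
import numpy as np

rng = np.random.default_rng(2)
x1 = rng.normal(26.2, 14.5, 500)
x2 = rng.normal(24.8, 27.6, 500)

prod_raw = x1 * x2
print("uncentred r(x1, x1*x2):", np.corrcoef(x1, prod_raw)[0, 1].round(2))

x1c = x1 - x1.mean()
x2c = x2 - x2.mean()
prod_centred = x1c * x2c
print("centred r(x1c, x1c*x2c):", np.corrcoef(x1c, prod_centred)[0, 1].round(2))
```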
Centring to Reduce Multicollinearity
[Two scatterplots of the x1*x2 product against x1: one for the original variables, one for the centred variables]
Regression
• Y = a + b1*X1 + b2*X2 + b3*X1*X2 + e
• How does X2 relate to Y at different levels of X1?
• How does predictor 2 (shyness) relate to the outcome (social interactions) at different stress levels (X1)?
Uncentred Data: X1 mean = 26.2 (SD 14.5), X2 mean = 24.8 (SD 27.6)
Centred Data: X1 mean = 0.0 (SD 14.5), X2 mean = 0.0 (SD 27.6)

Correlation Matrix (uncentred):
        x2       x12      y
x1      0.58**   0.65**   0.14**
x2      --       0.96**   0.28**
x12     --       --       0.34**

Correlation Matrix (centred):
        x2c      x12c     y
x1c     0.58**   0.11     0.14*
x2c     --       0.66**   0.28**
x12c    --       --       0.34**

** p = .01, * p = .05
Regression Equation Results
• No Interaction: Y = b0 + b1 * X1 + b2 * X2
• Uncentred: Y = 1164.8 – 4 X1 + 20 X2 **
• Centred: Y = 1550.8 – 4 X1 + 20 X2 **
Regression Equation Results
• Interaction Term Included: Y = b0 + b1 * X1 + b2 * X2 + b3 * X1*X2
• Uncentred: Y = 1733 – 19.1 X1 – 31.7 X2 ** + 1.26 X1*X2
• Centred: Y = 1260 + 12.0 X1 + 1.1 X2 + 1.26 X1*X2
But what does it mean…
• How does X2 relate to Y at different levels of X1?
• How does predictor 2 (shyness) relate to the outcome (social interactions) at different stress levels (X1)?
Post Hocs
• Y = b0 + b1 * X1 + b2 * X2 + b3 * X1*X2
• Rearranged: Y = ( b0 + b1 * X1 ) + ( b2 + b3 * X1 ) * X2
• -1 SD below the X1 mean: re-centre as X – (-14.547663) = X + 14.547663
• +1 SD above the X1 mean: re-centre as X – 14.547663
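A hedged sketch of this post-hoc probing in Python: re-centre X1 at -1 SD and +1 SD, rebuild the product term, and refit, so that the X2 coefficient in each refit is the simple slope of X2 at that level of X1. The data are simulated; only the SD of 14.547663 is taken from the slides:

```python
# Simple-slope probing of a continuous interaction by re-centring X1.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
n = 200
x1 = rng.normal(0, 14.547663, n)          # mean-centred X1 (SD from the slides)
x2 = rng.normal(0, 27.6, n)
y = 1260 + 12.0 * x1 + 1.1 * x2 + 1.26 * x1 * x2 + rng.normal(0, 300, n)

def fit_at(x1_shift):
    """Refit the interaction model with X1 re-centred at x1_shift."""
    x1s = x1 - x1_shift
    X = sm.add_constant(np.column_stack([x1s, x2, x1s * x2]))
    return sm.OLS(y, X).fit()

sd = 14.547663
for label, shift in [("-1 SD", -sd), ("mean", 0.0), ("+1 SD", +sd)]:
    res = fit_at(shift)
    b2, p2 = res.params[2], res.pvalues[2]   # X2 simple slope and its p-value
    print(f"X1 at {label}: X2 slope = {b2:.2f}, p = {p2:.3f}")
```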
Scatterplots: Moving the Y Axis
[Three scatterplots of Movement Hrs/Day (AL) vs Crying Hrs/Day: AL mean-centred, AL re-centred 1 SD below the mean, and AL re-centred 1 SD above the mean]

• -1 SD Below the X1 Mean:
Y = 1085 – 19.1 X1 – 17.1 X2 + 1.26 X1*X2
t (1,196) = -1.40, p = .16
• Centred at the Mean:
Y = 1260 + 12.0 X1 + 1.1 X2 + 1.26 X1*X2
t (1,196) = 0.12, p = .88
• +1 SD Above the X1 Mean:
Y = 1435 – 19.1 X1 + 19.4 X2 ** + 1.26 X1*X2
t (1,196) = 3.66, p = .001
Regression Interaction Example
• Predicting inhibitory ability from motor activity & age
– Simon-Says-like games
– 4- to 6-year-olds & physical movement
• Move by Age interaction: F (1, 81) = 5.9, p < .02
– Young (-1.5 SD): movement beta significantly positive with inhibition
– Middle (Mean): movement beta p = .10 with inhibition
– Older (+1.5 SD): movement beta n.s. with inhibition

Polynomials, Centring, & Spline Functions
• Polynomial relations: quadratic, cubic, etc.
• Y = a + b1*X1 - b2*X1*X1 + e
[Plot: a quadratic (curvilinear) pattern]
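A minimal sketch of fitting such a quadratic in Python (simulated data; the inverted-U shape is only illustrative):

```python
# Fit a curvilinear relation: Y = a + b1*X + b2*X^2 + e.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(4)
x = rng.uniform(-10, 15, 300)
y = 50 + 20 * x - 2 * x**2 + rng.normal(0, 20, 300)

xc = x - x.mean()                        # centring X reduces the X / X^2 correlation
X = sm.add_constant(np.column_stack([xc, xc**2]))
res = sm.OLS(y, X).fit()
print(res.params)                        # intercept, linear, and quadratic terms
```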
Curvilinear Pattern
• The X² term assumes a symmetric pattern
• But it may not be symmetric ...
• e.g., Perceived Control (Y) slowly increases and then declines rapidly in old age
[Plots: a symmetric quadratic curve vs an asymmetric rise followed by a rapid decline]
This Brings Us to Spline Functions
• Split up predictor X into 2+ variables: XLow & XHigh
[Plot: a piecewise pattern with a change point partway along the X range]
• XLow = X – (-5), with values beyond the next change point set to zero
• Ditto for XHigh
• Re-run Y = a + b1*XLow - b2*XHigh + e
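A sketch of one common way to build the XLow / XHigh pieces and fit both slopes in Python. The knot at -5 echoes the slide, but the data and the exact coding of the pieces are illustrative, not necessarily identical to the slide's construction:

```python
# Piecewise-linear (spline) regression with one knot.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(5)
x = rng.uniform(-10, 15, 300)
# True pattern: shallow rise below the knot, steep fall above it.
y = np.where(x < -5, 10 + 2 * (x + 5), 10 - 8 * (x + 5)) + rng.normal(0, 5, 300)

knot = -5.0
x_low = np.minimum(x - knot, 0)     # varies below the knot, fixed at 0 above it
x_high = np.maximum(x - knot, 0)    # fixed at 0 below the knot, varies above it

X = sm.add_constant(np.column_stack([x_low, x_high]))
res = sm.OLS(y, X).fit()
print(res.params)                   # intercept, slope below the knot, slope above the knot
```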
Perks of Spline Functions
• Estimate the slope anywhere along the range of X
• The relation can be significant on one part of the range and n.s. on another
• Slopes can be steeper or shallower in different segments
Multiple Regression Techniques
1. Centring to remove group-difference confounds
2. Centring to interpret continuous interactions
3. Spline functions – a more precise understanding of polynomial patterns
Questions
• Alpha control procedures for spline functions
– Could it be argued that you are describing a pattern already identified?
– Conservatively, you could apply an alpha control procedure. I like the False Discovery Rate procedures.
– Replication is preferred, but not always possible.
Alpha Control Aside
• The source of Type 1 errors is typically poorly described.
• Typical account: if enough probability tests are run, the probability increases to the point where something becomes significant just by chance.
– But probability is linked to the representativeness of your data, and Type 1 error is a proxy for how likely your data are to be representative.
• My View: the real source of Type 1 errors is that if you
– divide up the data into enough subgroupings,
– eventually one of those subgroupings will differ because it is misrepresentative of reality.
Standardized vs Centred
• Centred: x – xM
• Standardized: (x – xM) / SDx
– Makes the variability of each predictor = 1
– Standardized Beta = raw b * SDx / SDy
– Similar to centring, but the different metric needs to be adjusted for interaction terms
• To get comparable results with an interaction term:
– standardization should be applied to X1 and X2 prior to computing the X1*X2 product, then use the "raw" coefficients
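A brief sketch of that ordering in Python: standardize X1 and X2 first, form the product from the z-scores, then read the "raw" coefficients from that model (simulated data, illustrative values):

```python
# Standardize the predictors before building the interaction product.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(6)
n = 200
x1 = rng.normal(26.2, 14.5, n)
x2 = rng.normal(24.8, 27.6, n)
y = 5 + 0.8 * x1 + 0.5 * x2 + 0.03 * x1 * x2 + rng.normal(0, 10, n)

z1 = (x1 - x1.mean()) / x1.std(ddof=1)     # standardize each predictor first
z2 = (x2 - x2.mean()) / x2.std(ddof=1)

X = sm.add_constant(np.column_stack([z1, z2, z1 * z2]))   # product of z-scores
res = sm.OLS(y, X).fit()
print(res.params)   # unstandardized coefficients of the standardized predictors
```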
Centring and Spline Functions
• Relatively simple procedures
• Old dogs in the statistics world, but new tricks for many
That's All Folks!