Getting More out of Multiple Regression

Darren Campbell, PhD
Overview
View on Teaching Statistics
• When to Apply
• How to Use &
• How to Interpret
Multiple Regression Techniques
1. Centring: removing group-difference confounds
2. Centring: interpreting continuous interactions
3. Spline functions – piecewise polynomials
• Estimate separate slopes for each segment of the regression polynomial
Perks of Multiple Regression
1. Realistic: many influences on behaviour
2. Control over confounds
3. Test for relative importance
4. Identify interactions

Why Not Use ANOVAs?
• Not realistic: many behaviours / constructs are continuous, e.g., intelligence, personality
• Loss of statistical power with categories
– scores within a category are assumed to be the same + error
– this mixes systematic patterns into the error term
What is Centring?
• Simple re-scaling of raw scores
• Raw score minus some constant value
• x1 – 5.1
  1 – 5.1 = -4.1
  4 – 5.1 = -1.1
• x2 – 29.4
  30 – 29.4 = 0.6
  35 – 29.4 = 5.6
A Simple Case for Centring
• Babies:
– Flail about – limb movement
– Cry & fuss – parent-report diary measures
• Are these 2 infant behaviours related?
– Emotional responses & emotion regulation
A Simple Case for Centring

Age             Moves / Hr   Crying Hrs/Day
6-week-olds        5.1           4.7
6-month-olds      29.4           3.5
Full sample       17.2           4.1

• Are these 2 infant behaviours related?
6 Week-Olds
[Scatterplot: 6-week-old infants. x-axis: Activity – limb movements (0 to 10); y-axis: Hours of Crying (0 to 9)]
• r = +.47
• some infants cry more & move more
• others cry less & move less
6 Month-Olds
[Scatterplot: 6-month-old infants. x-axis: Activity – limb movements (25 to 40); y-axis: Hours of Crying (0 to 7)]
• r = +.38
• some infants cry more & move more
• others cry less & move less
• What if we combine the two groups?
• Do we get a significant correlation? If so, what kind?
[Scatterplot: 6-week-olds & 6-month-old infants combined. x-axis: Activity – limb movements (0 to 40); y-axis: Hours of Crying (0 to 9)]
• Full sample r = -0.22
What happened with the Correlations?
• 6 Week-olds: r = +.47
• 6 Month-olds: r = +.38
• 6 Week & 6 Month-olds combined: r = -0.22

Correlations = Grand Mean Centring



1) Mean Deviations for each variable: X & Y
2) Rank Order Mean Deviations
3) Correlate 2 rank orders of X & Y
The Disappearing Correlation Explained
• Grand mean centring led to
– all the older infants being classified as high movers
– all the young infants being classified as low movers
– young high criers & high movers becoming high criers & low movers
• Large group differences in movement altered the detection of within-group r’s
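The reversal can be reproduced with a handful of numbers. The sketch below uses made-up values (not the study data), chosen so that the two groups differ greatly in movement but only a little in crying. The helper `pearson_r` is written out by hand to show that the correlation is built from deviations around each variable's grand mean.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation computed from deviations around each variable's mean."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Made-up illustration data (NOT the study data): limb movements and crying hours.
young_move = [2, 4, 5, 6, 8]
young_cry = [3.5, 4.5, 4.5, 5.0, 6.0]
old_move = [25, 28, 30, 31, 33]
old_cry = [2.5, 3.0, 3.5, 4.0, 4.5]

r_young = pearson_r(young_move, young_cry)
r_old = pearson_r(old_move, old_cry)
# Pooling the groups measures every infant against the grand means.
r_pooled = pearson_r(young_move + old_move, young_cry + old_cry)

print(f"young r = {r_young:+.2f}, old r = {r_old:+.2f}, pooled r = {r_pooled:+.2f}")
```

Within each made-up group the correlation is positive, while the pooled correlation is negative, because the pooled deviations are dominated by the between-group gap in movement.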

What should we do?
Solution: Create Group Mean Deviations
• Re-scale raw scores
• Raw – Group Mean
• 6 week-olds: xs – 5.1
• 6 month-olds: xs – 29.4

Solution: Create Group Mean Deviations

Crying   Raw AL   Group Means   Group Centred AL
5.7        1        -5.11          -4.11
6          4        -5.11          -1.11
2          5        -5.11          -0.11
0.5       30       -29.4            0.63
2.5       35       -29.4            5.63
2         34       -29.4            4.63
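As a rough check on the arithmetic, the group-centring step can be sketched in a few lines. The six rows are copied from the table; the group labels `"6wk"` and `"6mo"` are hypothetical names introduced here. The group means are the rounded slide values, so the older infants come out as 0.6, 5.6 and 4.6 rather than the 0.63, 5.63 and 4.63 shown, which were presumably computed from an unrounded mean.

```python
# Rows copied from the table: (group label, crying, raw AL).
# The labels "6wk" / "6mo" are hypothetical names used only in this sketch.
rows = [
    ("6wk", 5.7, 1), ("6wk", 6.0, 4), ("6wk", 2.0, 5),
    ("6mo", 0.5, 30), ("6mo", 2.5, 35), ("6mo", 2.0, 34),
]

# Group means as printed on the slides (means of the full groups, rounded).
group_mean_al = {"6wk": 5.11, "6mo": 29.4}

# Group centring: subtract each infant's own group mean from the raw score.
centred = [round(al - group_mean_al[group], 2) for group, _, al in rows]

for (group, cry, al), c in zip(rows, centred):
    print(f"{group}: crying = {cry}, raw AL = {al}, group-centred AL = {c}")
```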
• Raw Scores
[Scatterplot: 6-week-olds & 6-month-old infants. x-axis: Activity – limb movements (0 to 40); y-axis: Hours of Crying (0 to 9)]
• Group Centred Scores
[Scatterplot: 6 Weeks Old and 6 Months Old plotted together. x-axis: Limb Movements / 48 Hrs, group centred (-10 to 10); y-axis: Hours of Crying / 48 Hrs (0 to 9)]


• Group-mean-centred data: r = .41 in the full sample
• Multiple regression could also work on uncentred variables
– Crying = Group + Uncentred AL
– Not a Group x AL interaction: the relation is the same for both groups
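The equivalence between "Crying = Group + Uncentred AL" and group-mean centring can be sketched with a small hand-rolled least-squares fit. Everything here is an illustration under stated assumptions: the data are made up, the 0/1 group code is an arbitrary choice, and `solve` and `ols` are hypothetical helper names, not functions from any statistics package.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - known) / m[r][r]
    return x

def ols(columns, y):
    """Least-squares coefficients [intercept, b1, b2, ...] via the normal equations."""
    design = [[1.0] + list(vals) for vals in zip(*columns)]
    k = len(design[0])
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(design, y)) for i in range(k)]
    return solve(xtx, xty)

# Made-up illustration data (NOT the study data).
move = [2, 4, 5, 6, 8, 25, 28, 30, 31, 33]
cry = [3.5, 4.5, 4.5, 5.0, 6.0, 2.5, 3.0, 3.5, 4.0, 4.5]
group = [0] * 5 + [1] * 5  # 0 = 6-week-olds, 1 = 6-month-olds

# Crying = Group + Uncentred AL
b0, b_group, b_move = ols([group, move], cry)

# Crying = Group-centred AL
means = {g: sum(m for m, gg in zip(move, group) if gg == g) / group.count(g) for g in (0, 1)}
move_c = [m - means[g] for m, g in zip(move, group)]
_, b_move_c = ols([move_c], cry)

print(f"slope with group dummy = {b_move:.4f}, slope with group-centred AL = {b_move_c:.4f}")
```

Both fits give the same positive movement slope, which is the within-group relation that the pooled raw correlation hides.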
Centring so far
1. Centring is Magic
2. Different types of centring
• Depending on the number used to re-scale the data
– Grand mean – Pearson Correlations
– Group Means – Infant Limb Movements
Regression Interactions Centring
• Great for Interpreting Interactions
– trickier than for ANOVAs
– do not have pre-defined levels or groups
– based on 2+ continuous vars
Multiple Regression - the Basics
• The Basic Equation:
  Y = a + b1*X1 + b2*X2 + b3*X3 + e
  Outcome = Intercept + Beta1 * predictor1 + Beta2 * predictor2 + Beta3 * predictor3 + Error
• a = expected mean response of Y when every predictor equals 0
• betas: for every 1-unit change in X you get a beta-sized change in Y

Regression Interactions Centring
• Reducing multicollinearity
– interaction predictor = x1 * x2
– x1 & x2 numbers near 0 stay near 0, and high x1 & x2 numbers get really high
– the interaction term is highly correlated with the original x1 & x2 variables
• Centring makes each predictor, x1 & x2,
– have more moderate numbers above and below zero (positive and negative numbers)
– reduces the multiplicative exaggeration between x1 & x2 and the interaction product x1*x2
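A small sketch makes the multiplicative exaggeration visible. The data are a made-up balanced grid (every x1 value from 1 to 5 paired with every x2 value from 1 to 5), chosen so that x1 and x2 are themselves uncorrelated; `pearson_r` and `centre` are helper names introduced here. On this grid the raw product correlates about .67 with x1, and the centred product correlates zero with centred x1. Real data are rarely this tidy, so centring usually shrinks the overlap rather than removing it.

```python
from itertools import product
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation from mean deviations."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def centre(xs):
    """Subtract the mean from every score."""
    m = sum(xs) / len(xs)
    return [x - m for x in xs]

# Made-up balanced grid: all 25 combinations of x1 and x2 in 1..5.
pairs = list(product(range(1, 6), repeat=2))
x1 = [a for a, _ in pairs]
x2 = [b for _, b in pairs]

# Interaction predictor built from the raw scores.
r_raw = pearson_r(x1, [a * b for a, b in zip(x1, x2)])

# Interaction predictor built from the centred scores.
x1c, x2c = centre(x1), centre(x2)
r_centred = pearson_r(x1c, [a * b for a, b in zip(x1c, x2c)])

print(f"x1 with x1*x2, raw scores:     r = {r_raw:.2f}")
print(f"x1 with x1*x2, centred scores: r = {abs(r_centred):.2f}")
```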
Centring to reduce Multicollinearity
[Two scatterplots of x1 (x-axis) against the x1*x2 product (y-axis). Left panel: "X1 with X1*X2 multicollinearity, Original Variables". Right panel: "X1 with X1*X2 multicollinearity, Centred Variables"]
Regression
• Y = a + b1*X1 + b2*X2 + b3*X1*X2 + e
• How does X2 relate to Y at different levels of X1?
• How does predictor 2 (shyness) relate to the outcome (social interactions) at different stress levels (X1)?
Uncentred Data, Mean (SD)
• X1 = 26.2 (14.5)
• X2 = 24.8 (27.6)
Centred Data, Mean (SD)
• X1 = 0.0 (14.5)
• X2 = 0.0 (27.6)
Correlation Matrix (x12 = x1*x2 product; c = centred):

        x1     x2       x12      y
x1      --     0.58**   0.65**   0.14**
x2             --       0.96**   0.28**
x12                     --       0.34**

        x1c    x2c      x12c     y
x1c     --     0.58**   0.11     0.14*
x2c            --       0.66**   0.28**
x12c                    --       0.34**

** p = .01
* p = .05
Regression Equation Results
• No Interaction:
  Y = b0 + b1 * X1 + b2 * X2
• Uncentred:
  Y = 1164.8 – 4 X1 + 20 X2 **
• Centred:
  Y = 1550.8 – 4 X1 + 20 X2 **
Regression Equation Results
• Interaction Term Included:
  Y = b0 + b1 * X1 + b2 * X2 + b3 * X1*X2
• Uncentred:
  Y = 1733 – 19.1 X1 – 31.7 X2 ** + 1.26 X1*X2
• Centred:
  Y = 1260 + 12.0 X1 + 1.1 X2 + 1.26 X1*X2
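The two equations describe the same fitted surface, which can be checked by substituting the centred predictors back into the centred equation. The sketch below uses the rounded coefficients and means reported on the slides, so it recovers the uncentred equation only approximately (about 1737, -19.2 and -31.9 against the reported 1733, -19.1 and -31.7). The variable names are introduced here for illustration.

```python
# Centred-model coefficients and predictor means, as reported on the slides (rounded).
b0_c, b1_c, b2_c, b3 = 1260.0, 12.0, 1.1, 1.26
mean_x1, mean_x2 = 26.2, 24.8

# Substitute X1c = X1 - mean_x1 and X2c = X2 - mean_x2 into
#   Y = b0_c + b1_c*X1c + b2_c*X2c + b3*X1c*X2c
# and collect the terms in X1, X2 and X1*X2.
b1_u = b1_c - b3 * mean_x2  # X1 slope when raw X2 = 0
b2_u = b2_c - b3 * mean_x1  # X2 slope when raw X1 = 0
b0_u = b0_c - b1_c * mean_x1 - b2_c * mean_x2 + b3 * mean_x1 * mean_x2

print(f"uncentred: Y = {b0_u:.0f} {b1_u:+.1f} X1 {b2_u:+.1f} X2 {b3:+.2f} X1*X2")
```

The interaction coefficient is untouched by centring; only the lower-order terms move, because each one is the slope at the point where the other predictor equals zero.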
But what does it mean…
• How does X2 relate to Y at different levels of X1?
• How does predictor 2 (shyness) relate to the outcome (social interactions) at different stress levels (X1)?
Post Hocs
• Y = b0 + b1 * X1 + b2 * X2 + b3 * X1*X2
• Y = ( b0 + b1 * X1 ) + ( b2 + b3 * X1 ) * X2
• Re-centre X1 at 1 SD below the X1 mean: X – (-14.547663) = X + 14.547663
• & at 1 SD above the X1 mean: X – 14.547663
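The rearranged equation says that the X2 slope is b2 + b3*X1, so the slope at any level of X1 can be sketched directly from the centred coefficients. The numbers below are the rounded slide values, and `simple_slope` is a hypothetical helper name. With these inputs the X2 slope is about -17.2 at 1 SD below the mean and about 19.4 at 1 SD above, close to the -17.1 and 19.4 in the re-centred equations, with the small gap attributable to rounding. The significance tests still require re-running the regression on the re-centred X1, since this arithmetic gives no standard errors.

```python
# Centred-model coefficients and the SD of X1, as reported on the slides (rounded).
b0, b1, b2, b3 = 1260.0, 12.0, 1.1, 1.26
sd_x1 = 14.547663

def simple_slope(x1_value):
    """Intercept and X2 slope of Y = (b0 + b1*X1) + (b2 + b3*X1)*X2 at a fixed centred X1."""
    return b0 + b1 * x1_value, b2 + b3 * x1_value

low = simple_slope(-sd_x1)   # X1 held at 1 SD below its mean
mid = simple_slope(0.0)      # X1 held at its mean
high = simple_slope(sd_x1)   # X1 held at 1 SD above its mean

for label, (intercept, slope) in [("-1 SD", low), ("mean", mid), ("+1 SD", high)]:
    print(f"X1 at {label}: intercept = {intercept:.0f}, X2 slope = {slope:.1f}")
```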
Scatterplots: Moving the Y Axis
[Three scatterplots of Crying Hrs/Day (y-axis, 0 to 10) against Movement Hrs/Day (x-axis, -10 to 10). Panels: "AL Mean Centred", "AL -1SD Below Mean", "AL +1SD Above Mean"]

• -1 SD Below X1 Mean
  Y = 1085 – 19.1 X1 – 17.1 X2 + 1.26 X1*X2
  t (1,196) = -1.40, p = .16
• Centred:
  Y = 1260 + 12.0 X1 + 1.1 X2 + 1.26 X1*X2
  t (1,196) = 0.12, p = .88
• +1 SD Above X1 Mean
  Y = 1435 – 19.1 X1 + 19.4 X2 ** + 1.26 X1*X2
  t (1,196) = 3.66, p = .001
Regression Interaction Example
• Predicting inhibitory ability with motor activity & age
– Simon Says-like games
– 4 to 6 yr-olds & physical movement
• Move by Age interaction: F (1, 81) = 5.9, p < .02
– Young (-1.5 SD): move beta sig. + inhibition
– Middle (Mean): move beta p = .10 ~ inhibition
– Older (+1.5 SD): move beta n.s. inhibition

Polynomials, Centring, & Spline Functions
• Polynomial relations: quadratic, cubic, etc.
• Y = a + b1*X1 - b2*X1*X1 + e
[Plot: curvilinear relation; x-axis from -10 to 15, y-axis from -100 to 250]
Curvilinear Pattern
• Assume a symmetric pattern – X squared
• But, it may not be ...
• Perceived Control (Y) slowly increases & then declines rapidly in old age
[Two plots: one with x-axis from -10 to 15 and y-axis from -100 to 250; one with x-axis from 0 to 15 and y-axis from 0 to 500]
This Brings us to Spline Functions
• Split up predictor X into 2+ variables: XLow & XHigh
[Plot: x-axis from -10 to 20, y-axis from 0 to 250]
• XLow = X – (-5) & set values at the next change point to zero
• Ditto for XHigh
• Re-run Y = a + b1*XLow - b2*XHigh + e
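One common way to code such a split is sketched below. The slide's exact coding cannot be fully recovered from the transcript, so the choices here are assumptions: the knot at 5, the made-up noise-free outcome, and the coding in which XLow varies only below the knot and XHigh only above it. `solve` and `ols` are hypothetical helper names. The code estimates the sign of each slope directly, so the decline shows up as a negative b_high rather than as a subtracted term.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - known) / m[r][r]
    return x

def ols(columns, y):
    """Least-squares coefficients [intercept, b1, b2, ...] via the normal equations."""
    design = [[1.0] + list(vals) for vals in zip(*columns)]
    k = len(design[0])
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(design, y)) for i in range(k)]
    return solve(xtx, xty)

knot = 5.0  # hypothetical change point, chosen only for this illustration
x = [float(v) for v in range(-10, 16)]

# Centre X at the knot, then zero out the part that belongs to the other segment.
x_low = [min(v - knot, 0.0) for v in x]   # varies below the knot, 0 above it
x_high = [max(v - knot, 0.0) for v in x]  # 0 below the knot, varies above it

# Made-up, noise-free outcome: slow rise (+5 per unit), then rapid decline (-20 per unit).
y = [200.0 + 5.0 * lo - 20.0 * hi for lo, hi in zip(x_low, x_high)]

a, b_low, b_high = ols([x_low, x_high], y)
print(f"value at knot = {a:.1f}, slope below = {b_low:.1f}, slope above = {b_high:.1f}")
```

With this coding the intercept is the predicted value at the knot, and each slope has its own standard error in a real fit, which is what lets one segment be significant while the other is not.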
Perks of Spline Functions
• Estimate slope anywhere along the range
• Can be sig. on one part, n.s. on another
• Steeper or shallower
Multiple Regression Techniques
1. Centring: removing group-difference confounds
2. Centring: interpreting continuous interactions
3. Spline functions
• More precise understanding of polynomial patterns
Questions
• Alpha control procedures for spline functions
– Could it be argued that you are describing the pattern already identified?
– Conservatively, you could apply an alpha control
procedure. I like the False Discovery Rate
procedures.
– Replication is preferred, but not always possible.
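The slides do not say which False Discovery Rate procedure is meant. As one possibility, the sketch below implements the Benjamini-Hochberg step-up rule, a widely used FDR procedure; the function name and the FDR level q = .05 are assumptions. For input it reuses the three p-values reported for the ±1 SD post hoc tests, purely as convenient example numbers.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return one boolean per test: True where it is declared significant at FDR level q."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices from smallest p to largest
    # Find the largest rank k (1-based) whose p-value satisfies p_(k) <= (k / m) * q.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            cutoff_rank = rank
    # Every test ranked at or below the cutoff is declared significant.
    significant = [False] * m
    for idx in order[:cutoff_rank]:
        significant[idx] = True
    return significant

# p-values from the three post hoc tests on the slides: -1 SD, centred, +1 SD.
p_slopes = [0.16, 0.88, 0.001]
print(benjamini_hochberg(p_slopes))  # [False, False, True]
```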
Alpha Control Aside
• The source of Type 1 errors is typically poorly
described.
• Typical: If enough probability tests are run, the
probability will increase to the point where
something becomes significant just by chance.
– But, probability is linked to the representativeness of
your data and type 1 error is a proxy for the likelihood
of the representativeness of your data.
• My View: The real source of Type 1 errors is that
if you
– divide up the data into enough subgroupings
– eventually one of those subgroupings will differ
because it is misrepresentative of reality.
Standardized vs Centred
• Centred is x – xM
• Standardized is (x – xM) / SDx
– Makes variability for each predictor = 1
– Standardized Beta = raw b * SDx / SDy
– Similar to centring, but the different metric needs to be adjusted for interaction terms
• To get comparable results with interaction term
– Standardization should be applied to X1 and X2 prior
to the X1*X2 estimate then use “raw” coefficients
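The conversion formula can be checked on a toy single-predictor example: the raw slope rescaled by SDx / SDy should equal the slope obtained by z-scoring both variables first. The data are made up and the helper names are introduced here. The sketch covers only the single-predictor case; for an interaction model, the product term would be built from the z-scored X1 and X2, as the bullet describes.

```python
from math import sqrt

# Made-up illustration data.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]

def mean(values):
    return sum(values) / len(values)

def sd(values):
    """Sample standard deviation (n - 1 in the denominator)."""
    m = mean(values)
    return sqrt(sum((v - m) ** 2 for v in values) / (len(values) - 1))

mx, my = mean(x), mean(y)

# Raw slope of y on x from mean deviations.
raw_b = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)

# Standardized Beta = raw b * SDx / SDy
std_beta = raw_b * sd(x) / sd(y)

# Same quantity obtained by standardizing both variables before fitting.
zx = [(a - mx) / sd(x) for a in x]
zy = [(b - my) / sd(y) for b in y]
z_slope = sum(a * b for a, b in zip(zx, zy)) / sum(a * a for a in zx)

print(f"raw b = {raw_b:.3f}, standardized beta = {std_beta:.3f}, z-score slope = {z_slope:.3f}")
```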
Centring and Spline Functions
• Relatively simple procedures
• Old dogs in the Statistics World
– but new tricks for many
That’s All Folks!
